Identification and Validation of Novel Combinatorial Genetic Risk Factors for Endometriosis across Multiple UK and US Patient Cohorts

other preprint OA: green CC-BY-NC-4.0
AI-generated summary by claude@2026-06, 2026-06-13

Combinatorial analysis of UK and US cohorts identified 1,709 novel genetic risk factor signatures for endometriosis, revealing new biological pathways and candidate genes.

One-sentence paraphrase of the abstract; not a substitute for reading it. No clinical advice.

AI-generated deep summary by claude@2026-06, 2026-06-13 · read from full text

The paper used combinatorial genetic “disease signature” analysis to discover multilocus SNP-genotype combinations associated with endometriosis in a UK Biobank case-control cohort, using 35 meta-GWAS SNPs (including imputed loci) and applying filters to reduce population substructure, remove adenomyosis cases, and prune linkage disequilibrium. It identified 1,709 endometriosis risk signatures in UKB, then evaluated their enrichment and reproducibility in an independent, diverse-ancestry All of Us cohort, including stratified sub-cohorts by self-reported race/ethnicity, and compared reproducibility for meta-GWAS SNPs, signatures containing them, and fully novel signatures; it also removed low-recombination MHC and aimed to annotate novel genes for potential drug-repurposing targets. A key caveat explicitly noted is that UKB cases and controls differ in age distribution, raising the possibility of missed endometriosis diagnoses among older “controls,” which could dilute genetic associations and lower reproducibility. This paper is centrally about endometriosis—combinatorial genetic risk factors were identified in UK Biobank and validated for reproducibility in All of Us, with adenomyosis excluded from the discovery cases.

Read from the paper's body, not the abstract. Not a substitute for reading the paper. No clinical advice.

Abstract

BACKGROUND: Endometriosis affects about 10% of women, usually of reproductive age. It often has severe negative impacts on patients' quality of life, but the average time to a definitive diagnosis remains 7-9 years, and there are few effective therapeutic options. Relatively little is known about the genetic drivers of the disease even though its heritability is fairly high. A recent large genome-wide association study (GWAS) meta-analysis identified 42 genomic loci associated with risk of endometriosis, but together these explain only 5% of disease variance. METHODS: We used the PrecisionLife® combinatorial analytics platform to identify multi-SNP disease signatures significantly associated with endometriosis in a white European UK Biobank (UKB) cohort. We assessed the reproducibility of these multi-SNP disease signatures as well as 35 of the 42 meta-GWAS SNPs in a multi-ancestry American endometriosis cohort from All of Us (AoU) after controlling for population structure. RESULTS: We identified 1,709 disease signatures, comprising 2,957 unique SNPs in combinations of 2-5 SNPs, that were associated with increased prevalence of endometriosis in UKB. Pathways enriched in the disease signatures included cell adhesion, proliferation and migration, cytoskeleton remodeling, and angiogenesis, as well as biological processes involved in fibrosis and neuropathic pain. We observed a significant enrichment of these signatures (58-88%, p<0.04) that are also positively associated with endometriosis in the AoU cohort, including one 2-SNP signature that is individually significant. Reproducibility rates were greatest for higher frequency signatures, ranging from 80-88% for signatures with greater than 9% frequency (p<0.01) in AoU. Encouragingly, the disease signatures also show high reproducibility rates in non-white European AoU sub-cohorts (66-76%, p9%). 
Of these, 7 genes were previously identified in the endometriosis meta-GWAS study and 16 genes have a previous association with endometriosis. 75 novel genes were identified in this study. We characterized 9 novel genes that occur at the highest frequency in reproducing signatures and that do not contain any SNPs linked to known GWAS genes, providing new evidence for links between endometriosis and autophagy and macrophage biology. Reproducibility rates, ranging from 73% to 85%, are especially strong for the signatures that contain these 9 genes independently of any SNPs mapping to the meta-GWAS genes. CONCLUSION: Although using much smaller, less well-characterized datasets than the previous whole genome meta-GWAS study, combinatorial analysis has provided important new insights into the genetics and biology of endometriosis, including reproducible, biologically relevant genes that are overlooked by GWAS approaches. The 75 novel gene associations provide new insights and routes for study of the disease and potential new therapies. Several of the novel genes identified are credible targets for drug discovery, repurposing and/or repositioning. Using the disease signatures identified as genetic biomarkers in trials of candidate drugs targeting specific mechanisms will enable precision medicine-based approaches. We hope this will encourage new targeted therapy discovery efforts.

Results

Running combinatorial analysis on the UKB population and including the 35 meta-GWAS SNPs, we identified 1,759 multi-SNP signatures significantly associated with endometriosis (UKB disease signatures), containing 3,039 unique SNPs. 1,709 of these signatures could also be assessed in AoU. These include 196 combinations of two SNP-genotypes, 403 combinations of three SNP-genotypes, 440 combinations of four SNP-genotypes, and 670 combinations of five SNP-genotypes. Together, these 1,709 combinations include a total of 2,946 unique SNPs which mapped to 1,309 genes. Pathways enriched in the disease signatures included cell adhesion, proliferation and migration, cytoskeleton remodeling, and angiogenesis, as well as biological processes involved in fibrosis and neuropathic pain (Supplementary Figure 4). We then evaluated the frequency and disease association of these UKB disease signatures in AoU. The most common UKB disease signature occurs in 20.1% of AoU participants, and 20% of UKB disease signatures (342/1709) occur in fewer than 1% of AoU participants. Figure 1 presents a histogram of UKB disease signatures by frequency. When considering the full set of 1,709 disease signatures, none were significantly associated with increased disease risk in AoU after applying Bonferroni or Benjamini-Hochberg FDR adjustments. However, when we tested just the 196 disease signatures comprised of 2 SNP-genotypes, resulting in a less severe FDR correction, one signature was significantly associated with increased disease risk in AoU under both FDR adjustment approaches. This signature, which has a Benjamini-Hochberg adjusted p-value = 0.038 and logistic regression odds ratio = 1.21 in AoU, is comprised of the heterozygous genotypes for rs11751190 (G/A) and rs1888328 (T/G). 
The former SNP is located within an intron of the gene SYNE1 while the latter SNP is located in intronic regions of multiple lncRNAs. SYNE1 was also linked to a meta-GWAS SNP, but the SYNE1 SNP in this signature is not the same as the lead SYNE1 SNP in the meta-GWAS (rs71575922). No disease signatures were statistically significant when we individually assess 3-, 4-, or 5-SNP genotype signatures. Evaluating all 1,709 endometriosis disease signatures, we observed a significant enrichment of signatures that also have odds ratios > 1 in AoU (58.5%). This reproducibility rate trends higher when the signatures are filtered by increasingly greater frequency ( Figure 2 ). For example, reproducibility rates increased to 62.8% when assessing the 1,367 signatures with frequency >1%. The reproducibility rate further increased to 68.1% for the 521 signatures with frequency >4%, 80.1% for the 176 signatures with frequency >9%, and 88.3% for the 120 signatures with frequency >11%. These frequency cutoffs were chosen based on inflection points in reproducibility rates across 1% frequency intervals (e.g., signatures with frequency between 8%-9% vs. 9%-10%). We confirmed that all results are statistically significant, i.e., similar reproducibility rates are observed in fewer than 5% of random permutations ( p < 0.05, Table 1 ). UKB disease signatures that reproduce in AoU exhibit a range of odds ratios – across all the reproducing signatures regardless of frequency, 46% have odds ratios greater than 1.1 in AoU, 6% have odds ratios greater than 1.3, the maximum odds ratio is 2.37, and the mean odds ratio is 1.12 ( Table 2 and Supplementary Figure 3 ). We observed a narrower range of odds ratios and lower mean odds ratios after applying frequency filters, as expected given the trade-off between effect size and feature frequency in disease genetics ( Supplementary Figure 3 ). 
However, we continue to observe a relatively large proportion of reproducing signatures with moderate odds ratios (> 1.1). Among reproducing signatures that occur in more than 1% of individuals, the maximum odds ratio is 1.49, the mean odds ratio is 1.09, and the proportion of signatures with odds ratio greater than 1.1 is 40%. Among reproducing signatures that occur in more than 9% of individuals, the maximum odds ratio is 1.21, the mean odds ratio is 1.08, and 40% of signatures have odds ratios greater than 1.1. In comparison, the mean odds ratio for the set of meta-GWAS SNPs that reproduce in AoU (33 of 35) is 1.06 and just 15% have odds ratios greater than 1.1. Reproducibility rates for all combinatorial UKB disease signatures in self-identified ‘Black / African American’ and ‘Hispanic / Latino’ cohorts from AoU are similar to reproducibility rates for the whole cohort (Table 3). We also observe similar reproducibility rates across cohorts for signatures with greater than 1% frequency and greater than 4% frequency. Reproducibility rates for signatures with greater than 9% frequency are lower than the reproducibility rates for the full cohort but still demonstrate strong enrichment of reproducing signatures. 35 of the 42 meta-GWAS SNPs were present in the dataset used in our combinatorial analysis. The identified risk alleles for six of these SNPs (17%), annotated to the genes CDKN2B-AS1, FRMD7, LINC00629, WNT4, MLLT10, and KDR, are also significantly associated with increased risk of endometriosis in AoU based on GWAS results after applying a Bonferroni FDR correction (see Supplementary Table 7). An additional four SNPs, annotated to the genes CD109, ACTL9, GDAP1, and VEZT, significantly replicate in AoU if we apply a Benjamini-Hochberg FDR correction, corresponding to a cumulative 29% replication. 
The GWAS associations for 21 (60%) of the meta-GWAS SNPs in AoU are however not even nominally significant ( p > 0.05), including 9 SNPs (26%) with p -values > 0.5 in AoU. This lack of statistical replication likely reflects the much smaller sample size of the cohorts for AoU vs. the meta-GWAS, as 94% (33/35) of the meta-GWAS SNPs reproduce with odds ratios greater than 1. 20% of the reproducing UKB disease signatures (336/1709) contain at least one of the lead meta-GWAS SNPs, with 17 of the 35 meta-GWAS SNPs present in at least one disease signature ( Table 4 ). Disease signatures assigned to 11 of these 17 SNPs have reproducibility rates greater than 60% in the full AoU cohort, and the overall reproducibility rate for these 336 signatures is 74% (247/336). If we expand the definition of meta-GWAS signatures to include any signature containing a SNP that is annotated to one of the meta-GWAS genes, the number of meta-GWAS signatures increases to 644 (38% of signatures) with a reproducibility rate of 75%. The number of associated meta-GWAS genes increases by four, including two genes where the meta-GWAS SNP was not detected in the combinatorial analysis and two where the meta-GWAS SNP was absent from the dataset. The increase in meta-GWAS signatures is largely due to the inclusion of 261 signatures that contain one of three SNPs assigned to SYNE1 that differ from the lead meta-GWAS SNP. The 80% reproducibility rate for the 261 signatures associated with these alternative SYNE1 SNPs is higher than the 74% reproducibility rate for signatures associated with the lead SYNE1 meta-GWAS SNP. The combinatorial analysis identified novel associations between SNPs and endometriosis in two contexts: novel SNPs that occur in the same disease signature as at least one meta-GWAS SNP (‘meta-GWAS signatures’) and novel SNPs that occur in disease signatures containing no meta-GWAS SNPs (‘non-meta-GWAS signatures’). 
SNPs can also occur in a mix of meta-GWAS and non-meta-GWAS signatures. Higher frequency disease signatures exhibit the strongest evidence for a consistent link to endometriosis in the AoU replication analysis. A total of 195 unique SNPs mapping to 98 genes were identified among the 141 disease signatures that have frequency >9% and that also replicate in AoU (odds ratio > 1). Of these, 7 genes were previously identified in the endometriosis meta-GWAS study and 16 additional genes have a previous association with endometriosis in Open Targets 35 or Pharos 36 but were not identified by the meta-GWAS (see Supplementary Table 8). The remaining 75 genes represent novel disease-associated genes. Pathways enriched in the genes in high frequency disease signatures included cell-adhesion and proliferation mechanisms and cytoskeletal organization, along with neurological and pain processes (Supplementary Figure 5). Many of these novel genes occur in meta-GWAS signatures which exhibit strong reproducibility rates overall, but it is challenging to separate the contribution of meta-GWAS genes vs. novel genes to this reproducibility. To validate some of the novel genes associated with endometriosis, we focused on the set of 25 SNPs that were found in 25 or more disease signatures (termed ‘core’ SNPs). This criterion reflects the network geometry-based approach utilized in the PrecisionLife combinatorial analytics platform, which more strongly supports SNPs that are repeatedly linked to disease status in different combinatorial contexts. 20 of these 25 core SNPs were annotated to genes. 6 of those 20 genes overlap with genes identified by the meta-GWAS (the meta-GWAS SNPs annotated to WNT4/CDC42, CDKN2B-AS1, and FSHB/ARL14EP, plus 3 SNPs associated with SYNE1 including the lead meta-GWAS SNP). Reproducibility statistics for these genes are included in Table 4 and Supplementary Table 7. 
Of the remaining 14 annotated core SNPs, 9 most frequently occur (>50%) in non-meta-GWAS disease signatures (i.e., signatures that contain no component SNPs annotated to a meta-GWAS gene). We refer to these 9 SNPs as ‘novel core SNPs’. Importantly, all 9 novel core SNPs exhibit a strong enrichment of disease signatures that reproduce in AoU ( Table 5 ). Reproducibility rates for the novel core SNPs are especially strong for non-meta-GWAS signatures, ranging between 73% to 85%. Reproducibility of these non-meta-GWAS signatures unambiguously demonstrates the capability of combinatorial analytics to identify novel signal missed by GWAS. Five additional core SNPs predominantly co-occur with meta-GWAS SNPs (in >80% of signatures), and as such we cannot evaluate whether reproducibility or lack thereof is due to the core SNP or the meta-GWAS SNP. Of these meta-GWAS signatures, three show high rates of reproducibility, while two show no evidence of reproducibility ( Table 5 ).
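The frequency-filtered reproducibility rates reported in this section follow from a simple calculation: keep the signatures above a frequency cutoff, then take the fraction with odds ratio > 1. The sketch below illustrates this on made-up values; the function name and the toy `(frequency, odds_ratio)` pairs are hypothetical and are not data from the study. The final lines check that counts quoted in the text are consistent with the quoted percentages.

```python
def reproducibility_by_frequency(signatures, thresholds):
    """Return {threshold: (n_kept, reproducibility_rate)}.

    signatures: iterable of (frequency, odds_ratio) pairs measured in the
    validation cohort. A signature "reproduces" when its odds ratio is > 1.
    """
    results = {}
    for threshold in thresholds:
        kept = [odds for freq, odds in signatures if freq > threshold]
        rate = sum(odds > 1 for odds in kept) / len(kept) if kept else None
        results[threshold] = (len(kept), rate)
    return results


# Toy values for illustration only (not results from the paper).
toy_signatures = [
    (0.005, 0.90), (0.02, 1.30), (0.03, 0.95), (0.05, 1.10),
    (0.06, 1.05), (0.10, 1.08), (0.12, 1.15), (0.15, 0.99),
]
toy_rates = reproducibility_by_frequency(toy_signatures, [0.0, 0.01, 0.04, 0.09])

# Consistency checks on counts quoted in the text.
rate_above_9pct = 141 / 176       # reproducing signatures / signatures with frequency >9%
rate_lead_meta_gwas = 247 / 336   # reproducing / signatures containing a lead meta-GWAS SNP
print(f"{rate_above_9pct:.1%}", f"{rate_lead_meta_gwas:.0%}")
```

On the quoted counts, 141 of 176 gives the 80.1% reported for signatures with frequency >9%, and 247 of 336 rounds to the reported 74%.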

Materials

We used a cohort of endometriosis patients and healthy controls from UKB to identify disease signatures that are significantly associated with elevated prevalence of endometriosis in a case-control design. Cases were initially defined as any UKB participant with self-reported diagnosis (Data field 20002: data-code 1402) or ICD-10/9 codes (N80.*) of endometriosis. We further removed all patients with adenomyosis (ICD-10 code N80.0 - ‘endometriosis of uterus’). To minimize the risk of detecting false positive disease signatures that are indirectly associated with disease due to population substructure rather than disease biology, we filtered the patient cohort to include only patients identified as having ‘white British’ genetic ancestry. We were unable to perform combinatorial analyses on patients with alternative ancestries due to their low representation and small sample sizes in UKB. 196,188 potential controls matched the following criteria: (1) females with white British genetic ancestry; (2) no self-reported diagnosis (Data field 20002: data-code 1402) or ICD-10/9 codes (N80.*) for endometriosis. We randomly selected a subset of controls to produce a 1:10 case-control ratio relative to the number of UKB patients with all forms of endometriosis (N80.*). The bulk of the study was hypothesis-free. However, we also wanted to test the relative reproducibility of the loci identified by the Rahmioglu et al. (2023) meta-GWAS study 20. Of the 42 significant SNPs identified by the meta-GWAS, only 7 were present on the genotyping array used by UKB. To allow us to capture the effects of these missing loci in our combinatorial analysis, we incorporated genotypes for the missing meta-GWAS SNPs using imputed SNP genotype data provided by UKB. 
We would not typically conduct combinatorial analysis on imputed genotype data as the systematic genotyping errors associated with imputation compound when considering combinations of SNP genotypes, especially across ancestries. For example, a 2% average misgenotyping error rate for individual SNPs implies that nearly 10% of samples will likely be misgenotyped for at least one of the SNPs in a five-SNP disease signature. However, it was considered that the UKB imputation error rate is lowest for white British samples, and that inclusion of the 35 imputed meta-GWAS SNPs was very unlikely to result in compounded error in this study as they comprise a very small fraction (0.02%) of the total dataset. In any combinatorial analysis, only a small fraction of the total possible number of disease signatures can be evaluated due to computational and statistical power limitations, and it would be inefficient to devote limited resources to evaluating signatures that could not be further validated using the available datasets. The original intent of our study design was to validate the UKB disease signatures in the Copenhagen Hospital Biobank (CHB) 28 . We had therefore filtered the UKB dataset to only include SNPs that were also present in CHB, which removed 7 of the meta-GWAS SNPs. Unfortunately, this resource was ultimately not available to us for this validation study so instead we used the All of Us (AoU) resource, with the limitations that this imposed on the diminished overlap of SNP genotypes. We also removed SNPs in the low-recombination MHC region of chromosome 6 and conducted LD pruning in PLINK 1.9 (--indep-pairwise 50 2 0.2). This step maximizes opportunities for discovering unique signal by reducing the risk of the combinatorial analytics algorithm redundantly testing effectively equivalent disease signatures comprised of SNPs in linkage disequilibrium. Any meta-GWAS SNPs removed during pruning were added back to the dataset. 
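The compounding-error argument above can be checked with a short calculation. The sketch assumes genotyping errors are independent across the SNPs in a signature, which the text does not state explicitly, and takes 204,423 as the total SNP count of the discovery dataset; the function name is illustrative.

```python
def prob_any_misgenotyped(per_snp_error: float, n_snps: int) -> float:
    """Probability that at least one of n_snps genotype calls is wrong,
    assuming errors are independent across SNPs (an assumption of this sketch)."""
    return 1.0 - (1.0 - per_snp_error) ** n_snps


# 2% average per-SNP error rate in a five-SNP disease signature.
five_snp_error = prob_any_misgenotyped(0.02, 5)

# Share of the discovery dataset made up by the 35 imputed meta-GWAS SNPs.
imputed_fraction = 35 / 204_423

print(f"{five_snp_error:.1%}")     # 9.6%
print(f"{imputed_fraction:.2%}")   # 0.02%
```

Under these assumptions the five-SNP error probability is about 9.6%, consistent with the "nearly 10%" stated above, and the imputed SNPs account for roughly 0.02% of the dataset.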
The above steps resulted in a UKB cohort with 4,493 cases, 62,574 controls, and 204,423 SNPs used for disease signature discovery. The distribution of ages differs considerably between cases and controls, with a stronger skew towards older age UKB participants in controls ( Supplementary Figure 1 ). This finding implies that older UKB participants were less likely to have received an ICD-10 code for endometriosis and that a subset of the ‘controls’ in our dataset are likely misphenotyped and may represent missed diagnoses due to historic clinical practices. If that is the case, it would result in decreased associations between genetic variants and disease and reduced reproducibility of the results (see Limitations of the Analysis ). We ran a combinatorial analysis to identify disease signatures that are significantly enriched in endometriosis cases relative to controls in the UKB discovery cohort. The general methodology behind the hypothesis-free combinatorial analytics approach is described in more detail elsewhere 24 , 25 . Briefly, the PrecisionLife platform uses a fully deterministic approach to construct and rank disease signatures in ‘layers’ of increasing combinatorial complexity (i.e., adding new SNP-genotypes to the top-ranked disease signatures identified in previous layer). Statistical validation is performed by comparing the properties of disease signatures and their associated ‘networks’ identified by the analysis to the properties of signatures and networks identified in combinatorial analyses using randomly permuted versions of the dataset (i.e., ones constructed by randomly shuffling the case-control assignments to remove any biological link between genotype and phenotype). 1,000 cycles of fully random permutation are typically used. 
We identified a second cohort of female patients diagnosed with endometriosis, along with female controls matched for self-identified ethnicity / race in the All of Us Curated Data Repository (CDR) version 8 (accessed on the 24th of July 2025) 26. AoU CDR v8 contains data for over 865,000 American participants including 414,840 patients with whole genome sequencing (WGS) data and 354,400 patients with electronic health record data. Notably, it encompasses a diverse population with a wide age range of 18 to 90 years and substantial representation of non-European ancestry groups, which are frequently underrepresented in genomic research. Endometriosis cases were identified using the AoU Cohort Browser by selecting all females with WGS data who had a diagnosis of endometriosis based on ICD-9 codes 617.1–9 or ICD-10 codes N80.1–9 and N80.A–D (see Supplementary Table 1). These ICD code criteria are referred to as ‘Source Concepts’ in AoU. This resulted in 4,134 endometriosis cases. The AoU control cohort was generated by selecting females with WGS data who do not have any evidence of endometriosis, either based on ICD-9/ICD-10 codes (AoU Source Concepts) or related AoU Standard Concepts (see Supplementary Tables 2 and 3). We also excluded individuals who have a history of procedures such as laparoscopy or any symptomatic phenotypes potentially consistent with undiagnosed endometriosis such as pelvic pain, dysmenorrhea or infertility reported in EHR or surveys (see Supplementary Tables 2 and 3). We used the sex-imputation functionality of PLINK to confirm that all selected cases and controls were imputed as genetically female. Applying these criteria, our maximum control population included 191,331 individuals. As in UKB, we observed a skew towards higher age among controls than cases in AoU (Supplementary Figure 2). 
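As a rough illustration of the AoU case definition above, the following sketch tests whether a diagnosis code falls in the stated ranges (ICD-9 617.1 to 617.9; ICD-10 N80.1 to N80.9 and N80.A to N80.D). It is not the AoU Cohort Browser logic: the function name, the vocabulary labels `ICD9CM` and `ICD10CM`, and the choice to accept longer sub-codes by prefix are assumptions made for this example.

```python
import re

# Patterns derived from the code ranges quoted in the text. Prefix matching
# (so that longer sub-codes also match) is an assumption of this sketch.
ICD9_CASE = re.compile(r"^617\.[1-9]")
ICD10_CASE = re.compile(r"^N80\.[1-9A-D]")


def is_endometriosis_case_code(code: str, vocabulary: str) -> bool:
    """Return True if the code falls within the case-defining ranges.

    vocabulary: "ICD9CM" or "ICD10CM" (labels assumed for illustration).
    Codes ending in .0 (endometriosis of uterus) are outside both ranges.
    """
    code = code.strip().upper()
    if vocabulary == "ICD9CM":
        return bool(ICD9_CASE.match(code))
    if vocabulary == "ICD10CM":
        return bool(ICD10_CASE.match(code))
    return False


example_codes = [("N80.1", "ICD10CM"), ("N80.0", "ICD10CM"), ("617.3", "ICD9CM")]
flags = [is_endometriosis_case_code(code, vocab) for code, vocab in example_codes]
print(flags)  # [True, False, True]
```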
For each patient, we took whole genome sequence (WGS) data from AoU and extracted all SNP genotypes included in the filtered UKB genotype dataset. This resulted in 200,578 SNPs available for signature lookups. This lower SNP genotype coverage obviously further restricts the number of matching reproducible signatures that could be found, and this led to 50 signatures being lost from the reproducibility test. To reduce potential confounding effects arising from population substructure, we matched controls to the endometriosis cases based on self-reported race/ethnicity to generate a dataset with a case:control ratio of 1:4. We employed the probabilistic stratified sampling approach described in Sardell et al . (2025) 27 to select the subset of 16,536 controls that most closely matches the distribution of demographic subgroups in cases. The demographic breakdown of the AoU endometriosis dataset is included in Supplementary Table 4 . We used principal component analysis (PCA) to model any remaining population substructure within the AoU study cohort. Prior to performing PCA, we first removed all SNPs that are associated with the sex chromosomes, that fall within the MHC region on chromosome 6, or that have minor allele frequency less than 5%. We then conducted LD-pruning in PLINK 1.9 (--indep-pairwise 50 5 0.2) before generating genetic principal components (PCs) using the --pca command in PLINK 1.9. We selected the top 4 PCs for use in our analyses based on the variance explained by the associated eigenvalues ( Supplementary Table 5 ). We evaluated the degree to which the endometriosis disease signatures identified in a UKB cohort reproduce in AoU, using an approach similar to the one previously used to validate combinatorial disease signatures for long COVID 27 . 50 of the disease signatures identified in UKB could not be evaluated in AoU because one or more of their component SNP genotypes were not included in the latter dataset following QC. 
These were excluded from the analysis. First, the association between each disease signature and disease status in AoU was assessed via a logistic regression with absence/presence of the disease signature as the dependent variable alongside the top 4 genetic PCs as covariates. The genetic PCs were included as covariates to control for the effects of population substructure on signature frequency. Using the results of these logistic regressions, we first tested whether any individual signatures are significantly associated with increased risk of endometriosis in AoU, after applying Bonferroni 29 and Benjamini-Hochberg 30 false discovery rate (FDR) adjustments to account for testing of multiple signatures. We then repeated this analysis individually for the output of each ‘layer’ of the UKB combinatorial analysis, where ‘layer’ denotes the number of SNP-genotypes in a signature. We also assessed whether any of the ‘lead’ SNPs identified by the meta-GWAS 20 are significantly associated with endometriosis in our AoU cohort. Statistically validating replication of disease associations for individual signatures is challenging due to the limited statistical power associated with the relatively small size of the AoU validation cohort. Due to their nature multi-SNP disease signatures, especially with 4 or 5 SNPs, have much lower frequency than their individual component SNPs, requiring larger sized cohorts to achieve equivalent statistical power. This difference in statistical power is further magnified when the number of disease signatures from combinatorial analysis is orders of magnitude larger than the number of significant loci detected in the meta-GWAS, as is expected when the disease biology reflects a complex network of interacting genes. This necessitates a much more severe FDR adjustment when assessing statistical significance of the former relative to the latter. 
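To make the effect of the number of tests concrete, here is a minimal pure-Python sketch of the two adjustments named above. It uses the textbook definitions of Bonferroni and Benjamini-Hochberg adjusted p-values and is not the authors' code; all p-values shown are invented for illustration.

```python
def bonferroni(p_values, n_tests=None):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values for illustration only.
example_p = [0.0002, 0.01, 0.03, 0.04, 0.5]
bonf = bonferroni(example_p)
bh = benjamini_hochberg(example_p)

# The same hypothetical raw p-value under two different test counts:
# 196 two-SNP signatures versus the full set of 1,709 signatures.
p_raw = 1e-4
adj_196 = bonferroni([p_raw], n_tests=196)[0]
adj_1709 = bonferroni([p_raw], n_tests=1709)[0]
print(adj_196, adj_1709)
```

A hypothetical raw p-value of 1e-4 would pass a 0.05 threshold when corrected for 196 tests but not when corrected for 1,709, which is the sense in which analysing a single layer yields a less severe correction.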
Therefore, we also assessed the evidence for broad reproducibility of positive disease associations among the set of endometriosis signatures in AoU. This analysis tests whether the proportion of disease signatures that are positively correlated with endometriosis in AoU is significantly greater than expected for a set of signatures unlinked to disease. Such an enrichment suggests that the set of signatures is reflective of disease biology, even if most individual signatures cannot achieve a level of statistical validation due to the small sample size. First, we counted the fraction of signatures that have a positive coefficient (odds ratio > 1) in the logistic regression. We call this the ‘reproducibility rate’ for a set of signatures. To test whether the observed enrichment is statistically significant, we randomly reassigned case-control status to the patient cohort while maintaining a fixed case-control ratio, generated logistic regression results in the randomized cohort, and calculated its reproducibility rate. Repeating this process 100 times allowed us to generate a distribution of reproducibility rates under the null hypothesis where disease signatures are unlinked to case-control status. The p-value associated with the observed reproducibility data is the proportion of permutations in which the reproducibility rate of the randomized cohort is equal to or greater than the reproducibility rate observed in the AoU endometriosis cohort. This permutation-based approach for calculating p-values is most appropriate because the signatures are non-independent due to shared component SNP-genotypes and therefore violate the assumptions of independent observations required by standard statistical tests. Our previous analysis of long COVID 27 showed that the reproducibility rates for combinatorial disease signatures were positively correlated with the frequency of those signatures within the population. 
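The permutation procedure for the reproducibility rate can be sketched as follows. To stay self-contained, this sketch swaps the PC-adjusted logistic regression for an unadjusted odds ratio from a 2x2 table with a 0.5 continuity correction, so it illustrates the logic rather than reproducing the study's analysis. All function names and the toy cohort are hypothetical.

```python
import random


def odds_ratio(carriers, is_case):
    """Unadjusted odds ratio with a 0.5 continuity correction in every cell.
    (Stand-in for the covariate-adjusted logistic regression in the text.)"""
    a = b = c = d = 0.5
    for has_signature, case in zip(carriers, is_case):
        if has_signature and case:
            a += 1
        elif has_signature:
            b += 1
        elif case:
            c += 1
        else:
            d += 1
    return (a * d) / (b * c)


def reproducibility_rate(signatures, is_case):
    """Fraction of signatures with odds ratio > 1."""
    return sum(odds_ratio(s, is_case) > 1 for s in signatures) / len(signatures)


def permutation_p_value(signatures, is_case, n_permutations=100, seed=0):
    """Share of label permutations whose reproducibility rate is at least
    as large as the observed one."""
    rng = random.Random(seed)
    observed = reproducibility_rate(signatures, is_case)
    shuffled = list(is_case)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # shuffling labels keeps the case:control ratio fixed
        if reproducibility_rate(signatures, shuffled) >= observed:
            hits += 1
    return observed, hits / n_permutations


# Toy cohort: 20 cases, 80 controls; 10 signatures enriched in cases.
toy_is_case = [True] * 20 + [False] * 80
toy_signatures = []
for k in range(10):
    carriers = [False] * 100
    for i in range(k, k + 8):                      # 8 carrier cases
        carriers[i] = True
    for i in range(20 + 5 * k, 20 + 5 * k + 4):    # 4 carrier controls
        carriers[i] = True
    toy_signatures.append(carriers)

observed_rate, p_value = permutation_p_value(toy_signatures, toy_is_case)
print(observed_rate, p_value)
```

Because every signature is evaluated against the same shuffled labels within a permutation, the null distribution retains the correlation between signatures that share SNP-genotypes, which is the stated reason for preferring permutation over a standard test.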
This correlation likely reflects the reduced statistical power for rare signatures, which are more likely to occur at higher frequency in controls than cases due to random sampling even when they are biologically associated with disease. Therefore, we assessed the reproducibility rate for endometriosis disease signatures after applying a set of filters for signature frequency in the AoU cohort. We also tested whether the disease signatures identified in the white British UKB cohort reproduced among AoU participants who have self-reported ‘Black’ and/or ‘African American’ race/ethnicity as well as patients with self-reported ‘Hispanic’ and/or ‘Latino/a’ race/ethnicity. These cohorts are not fully mutually exclusive, as they represent responses to different questions within AoU. We again included genetic PCs as covariates to control for indirect relationships between signature frequency and disease prevalence resulting from population substructure. We conducted separate PCAs for each sub-cohort using the approach described above for the whole cohort. We then selected the first two PCs as covariates for each ancestry-specific analysis based on the variance explained by the PCA eigenvalues (see Supplementary Table 5). SNPs identified in disease signatures were mapped to genes using an annotation cascade process against the human reference genome (GRCh38). SNPs that lie within coding regions of gene(s) were assigned directly to the corresponding gene(s). Remaining SNPs that lie within 2 kb upstream or 0.5 kb downstream of any gene(s) were mapped to the closest gene(s) within this region 31. Additional gene assignments for identified SNPs were made using publicly available eQTL data and/or chromatin interaction data for ovary and uterine tissues 32, 33. SNPs located outside the upstream and downstream distance thresholds (2 kb and 0.5 kb respectively) of any gene that did not have any data in relevant tissues were not associated with any gene for biological interpretation. 
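A minimal sketch of such an annotation cascade is given below. It is a hedged illustration, not the pipeline used in the study: the `Gene` record, the function names, the strand-aware reading of "upstream" and "downstream", the use of whole gene spans for the first step, and the choice to add tissue-based links on top of positional assignments are all assumptions. Gene names and coordinates are hypothetical.

```python
from dataclasses import dataclass

UPSTREAM_BP = 2_000    # 2 kb upstream window
DOWNSTREAM_BP = 500    # 0.5 kb downstream window


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def flank_distance(gene, chrom, pos):
    """Distance to the gene if pos lies in its allowed flank, else None."""
    if gene.chrom != chrom:
        return None
    before, after = gene.start - pos, pos - gene.end
    upstream, downstream = (before, after) if gene.strand == "+" else (after, before)
    if 0 < upstream <= UPSTREAM_BP:
        return upstream
    if 0 < downstream <= DOWNSTREAM_BP:
        return downstream
    return None


def map_snp(chrom, pos, genes, tissue_links=()):
    """Step 1: genes containing the SNP. Step 2: otherwise, the closest gene(s)
    within the flank windows. Step 3: add genes linked by tissue eQTL or
    chromatin-interaction evidence. An empty set means 'unmapped'."""
    inside = {g.name for g in genes if g.chrom == chrom and g.start <= pos <= g.end}
    if inside:
        mapped = inside
    else:
        dists = {g.name: d for g in genes
                 if (d := flank_distance(g, chrom, pos)) is not None}
        nearest = min(dists.values(), default=None)
        mapped = {name for name, d in dists.items() if d == nearest}
    return mapped | set(tissue_links)


# Hypothetical genes and positions, for illustration only.
toy_genes = [
    Gene("GENE_A", "chr1", 10_000, 20_000, "+"),
    Gene("GENE_B", "chr1", 30_000, 40_000, "-"),
]
print(map_snp("chr1", 15_000, toy_genes))   # inside GENE_A
print(map_snp("chr1", 8_500, toy_genes))    # 1.5 kb upstream of GENE_A
print(map_snp("chr1", 21_000, toy_genes))   # 1 kb downstream of GENE_A: unmapped
```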
This SNP mapping approach is designed to identify the most likely biologically relevant gene(s) for novel target discovery and repurposing analyses. Meta-GWAS SNPs that were assigned a gene in the Rahmioglu et al. (2023) study but could not be mapped to any genes using this approach are listed in Supplementary Table 9 . The mapped genes were annotated using data from over 50 public data sources (see Supplementary Table 6 ), and their biological relevance analyzed to develop a deeper understanding of the gene and its potential mechanism of action link to endometriosis phenotypes. All genes were screened for matches against existing drugs in preclinical and clinical development using GlobalData 34 . This was used to identify, evaluate and prioritize existing endometriosis drugs and potential drug repurposing candidates that can be mapped to specific mechanistic patient subgroups identified in this analysis. For comparison, we also assessed the replicability and reproducibility of 35 of the 42 significant SNPs from the Rahmioglu et al . meta-GWAS 20 (hereafter referred to as ‘meta-GWAS SNPs’). We used the --logistic command in PLINK 1.9 to generate GWAS results for the AoU cohort, including as covariates the same top 4 genetic PCs used in the reproducibility study for the combinatorial signatures. Although the GWAS implementation in PLINK is not state-of-the-art, it offers an approximate estimate of the degree of reproducible signal from the meta-GWAS SNPs. A disease association for a GWAS SNP is considered to replicate if its logistic regression additive model p -value is statistically significant ( p < 0.05 after Bonferroni FDR adjustment for multiple tests). A disease association for a GWAS SNP is considered to reproduce if the coefficient from the logistic regression additive model is greater than 0 (i.e., odds ratio > 1).
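The distinction between "replicate" and "reproduce" drawn above can be expressed as a small classification rule. The sketch assumes 35 tests for the Bonferroni adjustment and applies the significance criterion to the p-value alone, as in the definition above; the function name and the example coefficients and p-values are invented.

```python
import math


def classify_gwas_snp(beta, p_value, n_tests=35, alpha=0.05):
    """Classify one SNP from its additive-model logistic regression output.

    reproduces: coefficient > 0, i.e. odds ratio > 1.
    replicates: Bonferroni-adjusted p-value below alpha.
    """
    adjusted_p = min(1.0, p_value * n_tests)
    return {
        "odds_ratio": math.exp(beta),
        "reproduces": beta > 0,
        "replicates": adjusted_p < alpha,
    }


# Invented values for illustration only.
strong = classify_gwas_snp(beta=0.08, p_value=0.001)   # adjusted p = 0.035
weak = classify_gwas_snp(beta=0.05, p_value=0.2)       # adjusted p = 1.0
opposite = classify_gwas_snp(beta=-0.01, p_value=0.6)
print(strong["reproduces"], strong["replicates"])   # True True
print(weak["reproduces"], weak["replicates"])       # True False
```

The second example shows how a SNP can reproduce in direction without replicating statistically, which is the situation the Results describe for most of the meta-GWAS SNPs in AoU.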

Discussion

Combinatorial analysis identified 1,709 combinatorial disease signatures that are associated with significantly increased prevalence of endometriosis in a UKB white British cohort. These UKB endometriosis disease signatures exhibit a significant enrichment of reproducing disease signal in AoU. That is, we observe significantly more UKB disease signatures that are also positively correlated with increased prevalence of endometriosis in AoU than expected by chance. Over a third of these disease signatures contain at least one SNP assigned to a gene identified in a large endometriosis meta-GWAS, and these exhibit high rates of reproducibility. The reproducibility rate increases after applying filters based on the frequency of disease signatures in AoU, a pattern previously observed in a reproducibility analysis for long COVID combinatorial disease signatures 27 . These UKB disease signatures include several wholly novel genes that were associated with endometriosis for the first time in both UKB and AoU, as well as many of the same SNPs and genes identified by a large meta-GWAS study 20 and many genes previously linked to endometriosis via non-genetic association studies. In particular, the finding of 75 novel gene associations that have high frequency and reproduce in an independent, ancestrally diverse dataset confirms that combinatorial analysis can identify biologically relevant genes that are overlooked by GWAS approaches even when GWAS meta-analyses use datasets that are an order of magnitude larger and of higher quality. The broader biological context captured by these combinatorial disease signatures provides the potential to improve our understanding of disease biology and build clinical development tools to support novel therapeutics programs. Identification of genes previously linked to endometriosis via non-genetic studies provides support for the biological validity of our results. 
For example, we identified disease signatures linked to serotransferrin (TF), an iron binding protein which transports Fe3+ ions from sites of absorption and heme degradation to sites where they are utilized or stored. Dysregulation of iron homeostasis is believed to be involved in endometriosis 37 . Elevated levels of iron have been demonstrated in endometriotic tissues 38 , 39 , 37 and these have been associated with inflammation, oxidative stress, cellular damage and ferroptosis 40 . TF levels are elevated in peritoneal fluid from women with endometriosis 41 , 42 and TF saturation is correlated with disease 43 . Populations of stromal cells and macrophages from endometrial tissues express elevated levels of the transferrin receptor 1, which is responsible for cellular uptake of TF and iron, resulting in increased iron loading 44 , 38 . Serotransferrin and anti-oxidants provide protection for mouse oocytes against the oxidative damage and dysmaturity induced by ovarian follicular fluid from endometriosis patients 45 , 46 . Our observations lend support to the importance of iron management in endometriosis and to greater attempts to explore this as an area for therapeutic manipulation. Additional examples of genes with previous non-genetic associations are listed in Supplementary Table 8. We identified 9 novel genes associated with the set of core SNP genotypes that feature most prominently in non-meta-GWAS disease signatures, all of which exhibit high degrees of reproducibility in AoU. This subset includes multiple genes associated with autophagy and macrophage biology as well as others that could hold significant potential for novel therapeutics and targeted drug repurposing/repositioning. Autophagy, a cellular process responsible for degradation and recycling of intracellular components 47 , has previously been implicated in the pathophysiology of endometriosis. 
Suppressed autophagic activity has been observed in ectopic endometrial tissue, facilitating the survival, immune evasion, and proliferation of ectopic cells (reviewed in 48 , 49 , 50 ). Among the novel core genes identified in our study, ATG16L1 and NEDD4L are of particular interest due to their central roles in autophagy. ATG16L1 encodes a core autophagy protein that forms a complex with ATG5 and ATG12, acting as an E3-like ligase essential for substrate recognition and autophagosome biogenesis 51 . While ATG16L1 has not previously been associated with endometriosis, knockdown of interacting autophagy genes ATG5 and ATG7 has been shown to impair decidualization in human endometrial stromal cells 52 , suggesting a potential role in endometrial function and disease. NEDD4L codes for an E3 ubiquitin ligase that regulates autophagy by targeting the proteins ULK1 and ASCT2 for ubiquitination and degradation 53 . Moreover, NEDD4L has been reported to inhibit the activity of mTOR, a key negative regulator of autophagy 54 . Although NEDD4L represents a novel association with endometriosis, a related E3 ligase NEDD4 has been shown to inhibit ferroptosis and promote endometrial lesion stromal cell survival by mediating the ubiquitination and degradation of PTGS2 55 . Notably, both ATG16L1 and NEDD4L have previously been implicated in inflammatory bowel disease (IBD) 56 , 57 . The SNP identified in ATG16L1 in our endometriosis study has demonstrated a strong association with Crohn’s disease in prior GWAS 56 . This finding is potentially interesting given the reported symptomatic overlap between Crohn’s disease and endometriosis 58 . Collectively, this evidence suggests that dysregulation of ATG16L1 and NEDD4L may contribute to impaired autophagic homeostasis in endometriosis, potentially supporting lesion development, inflammation, and disease progression. 
Macrophage dysfunction has been increasingly recognized as a contributor to the development and progression of endometriosis 59 . In patients, peritoneal macrophages produce more VEGF than in healthy individuals, and transcriptomic analyses highlight macrophage-derived cytokines as key drivers of inflammation in endometriosis 60 , 61 . In mouse models, macrophage depletion reduces lesion size, vascularization, and inflammatory pain, underscoring their central role in disease development 62 , 63 . During menstruation, macrophages clear endometrial debris via scavenger receptor–mediated phagocytosis 64 . Intriguingly, one of the novel core genes identified in our analysis, COLEC12 , encodes one of these scavenger receptors involved in host defense and phagocytosis. We also identified VSTM1 as a core gene, which encodes an immunoglobulin superfamily protein that enhances IL17A secretion by CD4+ T cells 65 . Although VSTM1 itself has not been studied in endometriosis, IL17A is known to drive pathogenic M2 macrophage polarization in lesions 66 . Both genes show macrophage-enriched expression (Human Protein Atlas 67 ), and while their altered expression in endometriosis remains unconfirmed, dysfunction in either may impair phagocytosis, sustain inflammation, or disrupt tissue remodeling, promoting lesion persistence. Alongside the less well-studied potentially druggable genes that were identified, the analyses found several genes whose protein products are the targets for well-established therapeutics that have not previously been explored in the treatment of endometriosis, i.e., drug repurposing opportunities. One example of these is the peptidase, angiotensin-converting enzyme (ACE). ACE plays a major role in the renin-angiotensin-aldosterone system 68 , which regulates sodium retention by the kidney and blood pressure by converting the peptide hormone angiotensin I (AT1) to angiotensin II (AT2), resulting in an increase of the vasoconstrictor activity 69 . 
Inhibitors of the enzyme have been in clinical use since the 1980s for a variety of indications including coronary artery disease, congestive heart failure, chronic kidney disease and myocardial infarction 70 . The product of ACE activity, AT2, enhances degradation of extracellular matrix proteins in cultures of human endometrial stromal cells, while also improving their viability and increasing proliferation and migration 71 . ACE inhibitors (captopril, ramipril) have been shown to reduce the growth of endometrial implants in rat models 72 , 73 . The expression of the receptors for AT2 (AGTR1, AGTR2) is also known to be altered in endometrial lesions 74 , 75 . A link between endometriosis and hypertension has previously been identified 76 , 77 , 78 , but the opportunity to exploit the existing therapeutic modulators of ACE or the angiotensin II receptor inhibitors has not been investigated. In total, this analysis generated 21 genes that are targeted by assets in clinical development with strong evidence of reproducibility in mixed genetic ancestry patient subgroups, and highly plausible mechanism of action hypotheses. As evinced by the ACE example, many of the repurposing candidates shortlisted have clear mechanistic roles in lesion development, inflammation, fibrosis and – in a few cases – endometriosis-associated infertility. Furthermore, 16 of these shortlisted candidates have also been found to be significant in at least one and often multiple other independent endometriosis populations through combinatorial analysis, providing further strength of genetic evidence for their role in driving endometriosis development. Further work is ongoing to prioritize the first set of drug repurposing candidates and design preliminary clinical efficacy studies to validate the disease modification potential of these targets in patients. 
We are currently prioritizing these 21 candidates based on stage of development, drug modality, directionality of modulation, and existing safety data. For each, we can demonstrate a clear rationale for how these drugs might benefit endometriosis patients and provide a method for matching specific patient subtypes with drugs that target their unique disease mechanisms. Neither UKB nor AoU provides an ideal dataset on which to perform this analysis. Both have relatively small case populations, they have low genotype coverage and overlap, and both datasets have material levels of phenotypic misclassification in both cases and controls as is common in complex diseases. In highly heterogeneous diseases, we have found that using stricter diagnostic criteria—despite risking the exclusion of some genuine cases—is generally preferable to including a larger number of poorly diagnosed (and potentially true negative) individuals in the case group. We also try to screen out undiagnosed (and potentially true positive) patients who may be considered for the control group due to low historic diagnosis rates. Unfortunately, larger and better characterized datasets were not available to us, which disadvantaged this study. Reliably identifying patients with endometriosis currently relies on surgical confirmation, and such data is not available in UKB or AoU. We relied instead on ICD-10 coding, which is known to be inconsistently and inaccurately applied. This was further complicated by the age distribution of the cases, which in the case of UKB can be decades past reproductive age. This means that older, potentially less reliable diagnostic criteria may have been in place for these individuals. Indeed, although there is no reason to expect that prevalence of endometriosis should be correlated with age, the age distribution of both UKB and AoU endometriosis cases is heavily skewed towards women with lower ages relative to controls ( Supplementary Figures 2 & 3 ). 
This is consistent with the observation that endometriosis has been poorly recognized historically, resulting in high prevalence of mis-/undiagnosed cases. As a result, a subset of the ‘controls’ in the UKB and AoU cohorts likely represent misphenotyped patients. Both of these factors (and the low SNP genotype coverage in AoU) work to reduce the number of UKB disease signatures found and the level of their reproducibility reported in AoU. Phenotypic misclassifications typically reduce observed effect sizes by artificially increasing the resemblance between cases and controls 79 . Such misclassification poses a challenge for reproducibility analyses, as signal dilution lowers the statistical power of results 80 . For example, it raises the likelihood that a biologically meaningful disease signature could display an odds ratio below 1 simply because of random sampling (as shown in Sardell et al. 2025 27 ). Accordingly, we anticipate that the high rate of phenotypic misclassification in the datasets will have suppressed the overall signature discovery and reproducibility rates we observed. The reproducibility measures reported here therefore likely represent a lower-bound estimate of the true rate. Using more accurate diagnostic protocols would enable more reliable comparisons across study cohorts. Statistically validating individual disease signatures in an independent dataset is challenging due to the small size of available cohorts, the large number of signatures, the relative rarity of individual signatures, and the complexity of disease biology. This is illustrated by the fact that fewer than 30% of meta-GWAS SNPs replicate significantly in AoU, despite higher frequencies, much less biological complexity, and orders of magnitude fewer features and consequent need for FDR adjustment to account for multiple tests relative to the replication analysis for disease signatures. 
In contrast, 94% of the meta-GWAS SNP risk alleles are also positively correlated with increased prevalence of endometriosis in AoU, suggesting that the lack of broad statistical replication is likely due to lack of statistical power provided by the relatively small cohort. The results of this study confirm that combinatorial analysis both recapitulates GWAS results and finds a broad range of novel disease biology. The most highly reproducible findings, including across multiple ancestries, come from disease signatures that have high frequency (e.g., above 4%). This is encouraging for clinical applications as signatures with higher frequency offer the potential for increased clinical utility across more patients. The identification of 75 reproducible novel genes shows that the approach can identify biologically and clinically relevant genes that are overlooked by GWAS and meta-GWAS, even when working at a considerable disadvantage due to a much smaller, less well-characterized dataset (4,463 cases with low coverage genotype data vs 60,674 cases using whole genome data). The potential utility of these findings is shown by the association of several genes novel to endometriosis that are druggable either as the basis of new therapeutic programs, or as rapid targeted clinical trials of drug repurposing/repositioning candidates. Where safe and well-tolerated compounds can be validated directly in humans as a proof-of-concept study, this offers the fastest route to bringing effective medicines to patients. These findings have significant potential to accelerate endometriosis drug development and to bring new therapeutic options to patients. In spite of the huge unmet medical need and the associated commercial potential, there has been very little progress in the development of novel endometriosis therapeutics. 
Aside from chronic underinvestment in women’s health, there are several technical and scientific reasons for this – the disease is complex (i.e., has multiple etiologies and influences), highly heterogeneous, multi-symptomatic, not correlated with a readily measurable biomarker, and there are no good animal models for the disease pathology. These all work to increase uncertainty and risk for proposed drug discovery projects. Identifying multiple novel genes and mechanisms that reproduce well across different populations offers new routes into the disease biology. Many of these mechanisms can already be modulated using known active compounds. These can be validated readily using existing drug development candidates (repositioning) or on-market/generic compounds (repurposing) in small proof-of-concept human studies, guided by genetic biomarkers used to recruit patients whose disease is driven by that mechanism. These would provide rapid and cost-effective clinical validation of the disease modifying potential of the novel targets and the ability to select responders. These same genetic/mechanistic biomarkers can be used during drug development to build a precision regulatory strategy, and to recruit likely responders into clinical trials, accelerating and derisking clinical development. The same tools may become complementary diagnostics to guide therapy selection in the clinic in due course. When fully developed and validated, such detailed genetic insights could also be used for rapid and non-invasive differential triage of patients presenting with deep pelvic pain to facilitate quick referral of women to achieve a definitive diagnosis, to evaluate patient prognosis for severity and progression to infertility, endometriomas, and other complications, and to help select the most appropriate therapies for an individual patient. 
A key challenge for GWAS results and related risk scores is that they are predominantly derived from European cohorts and often fail to transfer across alternative ancestries 81 , 82 , 83 , 84 . It is important in using genomic analysis to demonstrate that the results support health equity across as many patients as possible, including those with different ancestries. With the predominant focus usually being on achieving better representation in the population datasets, the choice of analytical methods is often overlooked as a critical factor in finding transferrable results that reproduce well in different patient subgroups 85 . Although the initial combinatorial analysis was performed on a white British only cohort, the reproducibility results from this study are broadly consistent across all AoU participants regardless of their self-reported ancestry. This observation is crucial as it implies that these disease signatures can be used to inform precision medicine healthcare without adversely affecting historically underrepresented populations and further increasing health disparities.

Conclusions

The high level of reproducibility of combinatorial disease signatures across different ancestries and both novel and known genes is encouraging for furthering the study of endometriosis and its clinical care. It offers much needed new insights into the disease that will be useful to build new diagnostic and therapeutic options for patients. By focusing on the sets of novel disease signatures with the highest case frequencies and largest combinatorial networks, we demonstrated broad reproducibility of disease signal among many genes not previously linked to endometriosis. This result highlights the promising potential for using combinatorial analysis to identify novel targets for drug discovery or repurposing and to provide targeted genetic/mechanism-based biomarkers as crucial clinical development tools in a complex heterogeneous disease that has resisted more traditional drug development approaches. Work in endometriosis would clearly benefit from using combinatorial analysis approaches on larger datasets with even wider population diversity, more secure diagnosis, more harmonized health/symptom surveys and deeper whole genome, longitudinal clinical, proteomic and metabolic data. We hope that such datasets will be made available for deeper study using state-of-the-art techniques. Much further work is needed in endometriosis; a rapid and accurate test for the disease would enable patients to access timely care and reduce the potential for years of misdiagnosis, suffering and clinical waste, but it also requires there to be a range of good precision therapeutic options once those women are diagnosed. Progress on both fronts will help strategically prioritize much needed women’s health research initiatives to make more rapid progress in addressing this massive global challenge and improving patients’ lives.

Introduction

Endometriosis is a debilitating and, in many cases, progressive chronic disease that impacts approximately 10% of reproductive age women 1 , 2 . It typically causes severe pain in afflicted patients and occurs in up to 30–50% of women suffering from infertility 3 . Despite its high prevalence and severe pathophysiology, the average time to diagnosis of endometriosis from onset of symptoms is almost 9 years in the UK, with many affected women scheduling 10+ additional medical consultations before receiving a diagnosis 4 . The long delay between onset of symptoms, diagnosis, and treatment occurs in part because the mechanisms underlying the development of endometriosis and its associated symptoms, such as chronic pain and infertility, are poorly understood 5 . The wide variety of symptoms experienced by those with endometriosis, including chronic pelvic pain, dysmenorrhea, infertility, gastrointestinal, and urological problems, contributes to a high rate of misdiagnosis and missed diagnosis of the disease 6 . Currently, the gold standard for endometriosis diagnosis is surgical confirmation of the presence of endometrial lesions (laparotomy/laparoscopy/transvaginal hydrolaparoscopy) 7 . However, these surgeries are invasive, painful and carry a degree of risk, especially with deep infiltrating disease 8 . They may fail to detect lesions, especially superficial peritoneal and filmy adhesion forms 9 . Even when lesions are detected, the severity of different symptoms experienced by patients does not always correlate with the extent of disease pathology observed 6 , 10 . While the sensitivity and accuracy of non-invasive imaging methods such as MRI and ultrasound for diagnosing the disease are improving, they continue to vary enormously depending on the technique, type, size, and location of lesions, and experience of the operator 11 . Successful treatment of endometriosis is also challenging. 
Invasive surgery to remove lesions and hormonal therapies (e.g., the contraceptive pill) often have limited symptomatic impact for endometriosis patients 12 , 13 . Hormone therapies also have significant side effects and are inappropriate for women aiming to conceive 14 , 15 . The pressing need for more effective treatment options and more accurate, ideally non-invasive, diagnostic tools for endometriosis highlights the importance of obtaining better insight into the biology of this poorly understood disease. A major challenge is that endometriosis is a multifactorial disease, believed to be caused by complex interactions between genetic, hormonal, and environmental factors 16 . Several hypotheses have been proposed to explain the development of endometriosis, including retrograde menstruation, alterations in the peritoneal fluid, and bacterial infection. However, many of these have contradicting evidence, and none are definitive 17 . Endometriosis is also often co-associated with other chronic disorders associated with pain, including migraine, irritable bowel syndrome, interstitial cystitis, and fibromyalgia 18 , 19 . This may be due to shared underlying biological mechanisms in pain hypersensitivity and inflammatory pathways. A recent genome-wide association study (GWAS) meta-analysis based on combined whole genome datasets comprising 60,674 endometriosis cases and 701,926 controls identified 42 genomic loci significantly associated with prevalence of endometriosis including several genes linked to pain perception and maintenance 20 . However, these signals explained only 5% of the disease variance, indicating that much remains unknown about the biology of this complex disease. Combinatorial analysis enables hypothesis-free identification of combinations of genetic variants (‘disease signatures’) that are significantly over- or under-enriched in patients with a disease or other specific phenotype. 
These studies can be of a case-control design or can use a quantitative trait approach and can find both disease risk and disease resilience factors. The signatures found capture both the linear and non-linear (e.g., epistatic) interactions between multiple genomic loci that reflect the complex disease biology involved in heterogeneous chronic diseases. Combinatorial disease risk signatures improve our understanding of complex diseases beyond the single SNP associations identified by GWAS 21 . They also create opportunities for clinically actionable diagnostic/triage tests and targeted trials of drug candidates that may provide clinical benefit to selected patient cohorts based on their specific disease drivers 22 . The PrecisionLife® combinatorial analytics platform has been used in multiple studies to identify key genes and genetic mechanisms associated with disease risk and resilience including amyotrophic lateral sclerosis (ALS) 23 , myalgic encephalomyelitis / chronic fatigue syndrome (ME/CFS) 24 , and long COVID 25 . A significantly enriched proportion of disease signatures from an original UK population long COVID study was shown to be positively correlated with increased risk of long COVID in an ancestrally heterogeneous American cohort from an All of Us (AoU) dataset 26 , with up to 83% reproducibility of signatures and 92% reproducibility of genes between the cohorts 27 . These results represented the first demonstration of reproducible genetic signals for both long COVID and ME/CFS, as GWAS approaches have to date offered no validated insights into the biology of these two complex diseases even in much larger populations. We first ran combinatorial analysis on a cohort of endometriosis patients and controls from a UK Biobank (UKB) dataset, specifically including 35 of the 42 meta-GWAS SNPs in the analysis. This analysis identified 1,709 disease signatures significantly associated with increased risk of endometriosis, each comprising between 2 and 5 SNP-genotypes. 
We then sought to demonstrate significant enrichment of reproducing disease associations for these signatures in an independent, diverse-ancestry American endometriosis cohort from All of Us (AoU) for which we had overlapping genotype data. We wished to evaluate the relative reproducibility of the disease associated signatures in sub-cohorts of self-reported Black/African American and Hispanic/Latino patients. We also sought to calculate the relative reproducibility statistics for meta-GWAS SNPs, combinatorial disease signatures containing meta-GWAS SNPs, and disease signatures made from entirely novel sets of SNPs that had not previously been associated with endometriosis by GWAS or in the literature. We evaluated reproducibility across multiple self-identified race/ethnicity cohorts to assess the applicability of disease signatures to diverse patient populations. We annotated novel genes from a biological perspective to inform the broader understanding of the biological drivers of endometriosis, and finally we used our knowledge of the modulators of these novel genes (with appropriate directionality) to identify potential drug repurposing candidates for the novel mechanisms.

License: CC-BY-NC-4.0 · commercial use OK · attribution required
Courtesy of the U.S. National Library of Medicine