ExGRS: exome-wide genetic risk score to predict high myopia across multi-ancestry populations

preprint OA: closed
Jianzhong Su, Jian Yuan, Ruowen Qiu, Yuhan Wang, Zhen Ji Chen, and 9 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4188555/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 30 Dec, 2024. Read the published version in Communications Medicine. Version 1 posted.

Abstract

High myopia (HM), characterized by severe myopic refractive error, is a leading cause of visual impairment and blindness globally. HM is a multifactorial ocular disease with high genetic heterogeneity. A genetic risk score (GRS) is useful for capturing genetic susceptibility to HM. Incorporating rare variants into GRS assessment presents methodological challenges but yields significant benefits.
This study enrolled two independent cohorts: 12,000 unrelated individuals of Han Chinese ancestry from the Myopia Associated Genetics and Intervention Consortium (MAGIC) and 8,682 individuals of European ancestry from the UK Biobank (UKB). Using whole-exome sequencing (WES) data, we first estimated the heritability of HM at 0.53 (standard error, 0.06) in the MAGIC cohort and 0.21 (standard error, 0.10) in the UKB cohort. In the MAGIC cohort, rare variants in low linkage disequilibrium (LD) with neighboring variants were enriched for heritability, particularly rare deleterious protein-altering variants. We therefore generated, optimized and validated an exome-wide genetic risk score (ExGRS) for HM prediction by combining rare risk genotypes with a common-variant GRS (cvGRS). ExGRS improved the AUC from 0.819 (cvGRS) to 0.856 for HM. Individuals with a top 5% ExGRS had a 15.57-fold (95% CI, 5.70–59.48) higher risk of developing HM than the remaining 95% of individuals in the MAGIC cohort, and a 2.03-fold (95% CI, 1.65–2.49) higher risk in UKB. Our study implies that rare variants are a major source of the missing heritability of HM in Han Chinese ancestry, and that ExGRS provides enhanced accuracy for HM prediction, shedding new light on research and clinical practice.

Subjects: Health sciences/Diseases/Eye diseases; Biological sciences/Genetics/Population genetics
Keywords: high myopia, whole-exome sequencing, heritability, genetic risk score

Introduction

High myopia (HM) is generally defined by a spherical equivalent (SE) of -6.00 diopters (D) or lower [1]. HM affects 2.8% of the general population and is a risk factor for developing pathologic myopia (PM) and its complications, most notably retinal degeneration or even detachment, which can cause severe visual acuity (VA) loss and even blindness [2,3]. HM occurs more commonly in Asian schoolchildren (6.8%–21.6%) [4,5] than in non-Asians (2.0%–2.3%) [6].
HM is a multifactorial eye disease with a high genetic susceptibility. Twin and family studies have demonstrated that HM has a high heritability [7,8]. Over the past decades, numerous genome-wide association studies (GWAS) of refractive error or myopia have revealed hundreds of candidate genetic factors across different ethnic populations [9–11]. However, each common variant uncovered by GWAS has a small effect size on its own, and even their additive effects explain only a limited fraction of myopia heritability (estimated heritability: 5.3% in Asians and 21.4% in Europeans) [11,12]. Whole-exome sequencing (WES) studies of HM trios or families have identified several novel mutations and genes in Asian populations, including SCO2 [13], BSG [14], CCDC102B [15], and LRPAP1 [16]. Moreover, our recent WES study also identified several HM-associated genes, including rare coding variants that were found to have larger effect sizes [17]. Hence, rare variants do contribute to the genetic architecture of HM, although the extent to which they account for its heritability remains unclear.

Polygenic risk scores (PRS) summarize the cumulative genetic effects of numerous disease-associated variants, providing an overall measure of an individual's genetic susceptibility to a particular disease [18,19]. No further clinical interventions or examinations are required: a test of blood or saliva samples can be used to predict a wide range of conditions [20]. In European populations, several large-scale studies have demonstrated the effectiveness of using the PRS to stratify myopia risk [10,11,21–24]. Currently, the best-performing PRSs for refractive error explain about 19% of the variance in the trait in individuals of European ancestry and about 6% in those of East Asian ancestry [23]. The best AUROC for HM is 0.783 and 0.672 in European and East Asian populations, respectively [23].
With most large-scale myopia GWASs performed primarily in European populations, it remains unclear whether these findings generalize to populations of non-European ancestry. Thus, in this study, we estimated the heritability explained by SNP-based genetic variance and by the gene-wise burden of rare alleles for HM, using WES data from a large sample of 12,000 unrelated Chinese individuals from the Myopia Associated Genetics and Intervention Consortium (MAGIC) and 8,682 Europeans from the UK Biobank (UKB). We constructed common-variant-based genetic risk score (cvGRS) and rare-variant-based genetic risk score (rvGRS) models and evaluated the performance of the two models for genetic risk prediction in a subset of MAGIC. We then proposed a method, the exome-wide genetic risk score (ExGRS), which combines cvGRS and rvGRS, and observed a further improvement in genetic risk prediction for HM. The ExGRS, which incorporates rare variants identified in HM-associated genes via burden tests, exhibits distinct advantages over the cvGRS; we also evaluated its portability across ancestry in the UKB European population.

Results

Polygenic architecture of rare to common coding variants

We used a dataset of 12,600 exomes of Han Chinese ancestry in the MAGIC project and 8,682 exomes of European ancestry in the UK Biobank (Supplementary Fig. 1). We analyzed variants observed at least three times in our dataset, which corresponds to a minor allele frequency (MAF) threshold of 0.01%. After quality control (QC), \(2.6 \times 10^{6}\) and \(2.2 \times 10^{6}\) variants were included in further analysis in MAGIC and UKB, respectively. First, based on common SNPs, the heritability (\({h}_{SNP}^{2}\)) of HM was estimated by the genomic-relatedness-based restricted maximum likelihood (GREML) approach implemented in the software package GCTA [25]. This analysis used a selected set of 43,367 and 66,091 HapMap 3 (HM3) SNPs from the MAGIC and UKB cohorts, respectively.
After correcting for the first 20 principal components (PCs) computed from HM3 SNPs, we estimated an \({h}_{SNP}^{2}\) of 0.31 (standard error, s.e. = 0.01) and 0.14 (s.e. = 0.02) for HM in the MAGIC and UKB cohorts, respectively (Supplementary Fig. 2). We then used variants with MAF > 0.01% to estimate and partition additive genetic variance. We grouped variants according to MAF and LD (Supplementary Fig. 3 and Supplementary Table 1), using the GREML-LDMS partitioning method with a median-based LD grouping strategy [26]. After correcting for the first 20 PCs, the estimated heritability based on WES data (\({h}_{WES}^{2}\)) was 1.76 (s.e. = 0.03) and 0.20 (s.e. = 0.10) for HM in the MAGIC and UKB cohorts, respectively (Supplementary Fig. 3), which suggested that \({h}_{SNP}^{2}\) in the MAGIC cohort may have been inflated by confounding factors such as population structure. To assess the contribution of uncaptured population structure, we used a linear model adjusted for PCs to test the association of rare variants (Supplementary Fig. 4) in both cohorts. We then used 160 PCs (that is, 20 PCs computed from each of the 8 MAF/LD bins) computed from independent variants in the GREML-LDMS analyses, which decreased \({h}_{WES}^{2}\) from 1.76 (s.e. = 0.03) to 0.53 (s.e. = 0.06) in the MAGIC cohort and increased \({h}_{WES}^{2}\) from 0.20 (s.e. = 0.10) to 0.21 (s.e. = 0.10) in UKB (Fig. 1 and Supplementary Fig. 2), suggesting the presence of population stratification effects not captured by the 20 common-variant PCs used in MAGIC above.

We also found that the difference in \({h}_{WES}^{2}\) for HM between the MAGIC and UKB cohorts is predominantly explained by rare variants, in particular those in low LD with nearby variants. For variants with MAF < 0.01, 0.33 of the phenotypic variance in the MAGIC cohort was accounted for by variants in the low-LD group, but only 0.10 by variants in the high-LD group.
However, in the UKB cohort, only 0.04 of the phenotypic variance was accounted for by variants in the low-LD group and 0.01 by those in the high-LD group. When all called SNPs in the MAGIC cohort were replaced with the overlapping variants found in both the MAGIC and UKB WES datasets, the estimated heritability decreased from 0.53 to 0.06 (Supplementary Fig. 5), with most of the difference coming from variants with 0.0001 < MAF < 0.01, almost all of which are EAS-specific (Supplementary Fig. 6). To further examine the relationship between SNP effect size and MAF, we plotted the cumulative genetic variance explained (\(h^{2}\)) against MAF. Under an evolutionarily neutral model, \(h^{2}\) is linearly proportional to MAF [27]. We found that the curves of cumulative genetic variance in the MAGIC and UKB cohorts deviated from the neutral model, which suggested that HM is under negative selection (Supplementary Fig. 7).

To investigate the contribution of low-LD variants with MAF < 0.01 to heritability, we partitioned the low-MAF, low-LD variant bins according to the putative effect of protein-coding variants using VEP [28]. Protein-coding variants include four annotations: (1) synonymous (Syn); (2) benign missense (B-mis); (3) damaging missense (D-mis); and (4) protein-truncating variants (PTVs) (Supplementary Table 2). The proportion of deleterious protein-altering variants, including PTVs and D-mis, differed across the LD and MAF groups, with an increasing trend from low- to high-MAF bins (Supplementary Fig. 8), which is consistent with purifying selection on this class of variants. Interestingly, the average variance explained per variant was larger for bins with PTVs (low-LD) than for bins with other protein-altering variants and non-protein-altering variants (low-LD) or high-LD variants (Fig. 2).
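As a minimal sketch of the per-variant comparison above: the Methods state that the variance explained per SNP is the bin's variance estimate divided by the number of variants in the bin, and that the s.e. is scaled the same way. The function name and the bin values below are hypothetical, chosen only for illustration; they are not results from this study.

```python
def per_variant_variance(bin_variance, bin_se, n_variants):
    """Scale a bin-level variance estimate and its s.e. by the bin's variant count.

    Follows the description in Methods: both the estimate and its
    standard error are divided by the number of variants in the bin.
    """
    if n_variants <= 0:
        raise ValueError("n_variants must be positive")
    return bin_variance / n_variants, bin_se / n_variants


# Hypothetical bins: (variance explained, s.e., number of variants).
# These numbers are illustrative assumptions, not values from the paper.
hypothetical_bins = {
    "PTV (low-LD)": (0.02, 0.005, 20_000),
    "Syn (low-LD)": (0.03, 0.010, 300_000),
}

for name, (variance, se, count) in hypothetical_bins.items():
    per_variant, per_variant_se = per_variant_variance(variance, se, count)
    print(f"{name}: {per_variant:.2e} (s.e. {per_variant_se:.2e}) per variant")
```

A bin can explain less variance in total and still rank higher per variant, which is the kind of pattern the comparison across annotation bins is meant to expose.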
To further validate the robustness of the estimates partitioned by functional genomic annotations, we quantified the heritability explained by the gene-wise burden of rare coding variants [29]. We found that HM in the MAGIC and UKB cohorts has a PTV burden heritability of 0.7% (s.e. = 0.15%) and 0.32% (s.e. = 0.25%), respectively (Supplementary Fig. 9). Burden heritability is concentrated among variants with the most severe predicted functional consequences: PTVs explain the majority of burden heritability, followed by D-mis, B-mis and Syn variants, which is also consistent with the GREML-LDMS assessment.

Derive genetic risk scores of common coding variants for HM

The genetic risk score (GRS) serves as a reliable measure of an individual's overall genetic susceptibility to disease and is an integral part of precision medicine [30]. The flowchart illustrating the study strategy is presented in Fig. 3. For HM in the MAGIC cohort, we created several candidate cvGRSs based on summary statistics from an ExWAS in 12,600 participants (6,300 cases and 6,300 controls) of Han Chinese ancestry [31]. Specifically, we derived 20 predictors based on a pruning and thresholding method, seven additional predictors using the LDpred2 algorithm [32] and one predictor using lassosum2 [33]. These scores were validated within MAGIC. We used a validation dataset of 5,400 participants in the MAGIC cohort to select the cvGRS with the best performance, defined as the maximum area under the receiver operating characteristic curve (AUC). The predictors had AUCs ranging from 0.598 to 0.895 in the validation set (Supplementary Table 3; Fig. 4a). The best model was based on the pruning and thresholding (P + T) method and involved 40,491 variants with nonzero weights selected based on \(r^{2}\) = 0.2 and P = 1.0 (Supplementary Fig. 10). In the validation dataset, the polygenic component of the score explained 4.9% of the variance (\(R^{2}\)), with 1 s.d. of the score increasing HM risk by 7-fold (odds ratio [OR] = 6.99, 95% confidence interval [CI] = 6.34–7.75, P < \(1.00 \times 10^{-300}\)) after controlling for age, sex and genetic ancestry.

GRS optimization by combining with rare variants

The second step in optimizing the GRS model was to test the independent contributions of rare variants. To identify genes underlying HM, we performed rare-variant burden tests for 12,600 individuals in the MAGIC cohort using four methods (i.e., Fisher's exact test [FET], Burden, SKAT and SKAT-O). Using a MAF threshold of 0.1%, we detected 651 gene-phenotype associations with PTVs and 1,481 associations with D-mis at a GC-corrected FET P value of 0.1. We observed a positive correlation between variant pathogenicity and the ORs of risk genes for HM under different cut-off P values (Supplementary Fig. 11).

Given the higher heritability and strong effect size of rare deleterious variants in the MAGIC cohort, we reasoned that a cvGRS combined with rare variants may effectively identify individuals at high risk for HM. Here we proposed a complementary rvGRS, based on a weighted sum of rare deleterious variants from HM-associated genes. To construct the model, we first fitted a logistic regression model to HM on the rare PTVs and D-mis in associated genes in the training subset of 12,600 individuals. We then evaluated the predictive power of the rvGRS in the 5,400 MAGIC individuals that had been withheld for validation. We observed the best performance of rvGRS for PTVs (AUC = 0.698) and D-mis (AUC = 0.772) based on HM-associated genes selected with an FET P value of 0.1 (Supplementary Figs. 12–14). Then, we compared the rare-variant association study (RVAS) and rvGRS results between PTVs and D-mis. For matched significance thresholds, only 4.3% of the HM-associated genes identified by RVAS overlapped between PTVs and D-mis (Supplementary Fig. 15).
We further stratified the population according to rvGRS decile for PTVs and found a striking gradient with respect to the rvGRS for D-mis (Supplementary Fig. 15). Therefore, we derived an rvGRS to predict HM by integrating HM-associated genes carrying PTVs and D-mis (AUC = 0.786) (Fig. 4b).

We assessed the predictive power of the rvGRS and the corresponding cvGRS, as well as a combination of the two methods, in the MAGIC validation dataset of 5,400 individuals. A higher cvGRS was observed in the top decile of the rvGRS (Fig. 4c). Although rvGRSs underperformed for average phenotype predictions, we found that they may outperform cvGRSs for identifying individuals at risk extremes (Fig. 4d). Therefore, we combined the rare- and common-variant GRS models into a unified model (exome-wide genetic risk score, ExGRS) and obtained a significant improvement in genetic risk prediction for HM. The unified ExGRS performed best, with a prediction AUC of 0.897, compared with 0.786 and 0.895 for the independent rare-variant and common-variant GRSs, respectively (Fig. 4e). Consistent with the AUC, the inclusion of the rvGRS enhanced HM risk prediction and improved case-control discrimination: the risk of HM for predicted cases was 5.73-fold higher than for predicted controls, which is higher than for cvGRS (4.99-fold) and rvGRS (2.40-fold) (Fig. 4f).

Portability of ExGRSs and validation in both independent cohorts

Having derived and validated a new polygenic predictor that considerably outperformed earlier scores, we explored the predictive power of the ExGRS for HM in an independent testing dataset of 1,219 Han Chinese individuals. We found that the ExGRS exhibited highly reproducible performance, with an AUC of 0.856 and an OR of 3.51 (95% CI: 3.05–4.07, P < \(1.31 \times 10^{-65}\)) (Fig. 5a and Table 1). The inclusion of the rvGRS risk genotype considerably enhanced HM risk prediction in the MAGIC cohort, substantially improving tail cutoff discrimination.
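The tail-cutoff comparison referred to here can be illustrated with a small sketch. It assumes the ExGRS is the per-individual sum of the two component scores, as stated in Methods, and it computes an unadjusted odds ratio from a 2×2 table, whereas the study fits logistic regression adjusted for age, sex and ancestry PCs. The function names and the Haldane–Anscombe 0.5 correction for empty cells are choices made for this illustration, not part of the described pipeline.

```python
def combine_scores(cvgrs, rvgrs):
    """Sum common- and rare-variant scores per individual (as described in Methods)."""
    if len(cvgrs) != len(rvgrs):
        raise ValueError("score lists must have the same length")
    return [c + r for c, r in zip(cvgrs, rvgrs)]


def tail_odds_ratio(scores, is_case, top_fraction):
    """Unadjusted odds ratio of being a case: top tail of the score versus the rest."""
    n = len(scores)
    if n != len(is_case) or not 0 < top_fraction < 1:
        raise ValueError("inputs must align and 0 < top_fraction < 1")
    k = max(1, round(n * top_fraction))
    # Indices of the k highest-scoring individuals.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    tail = set(ranked[:k])
    a = sum(1 for i in tail if is_case[i])                       # cases in tail
    b = len(tail) - a                                            # controls in tail
    c = sum(1 for i in range(n) if i not in tail and is_case[i]) # cases in rest
    d = (n - len(tail)) - c                                      # controls in rest
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction (an assumption of this sketch).
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


# Toy data: 20 individuals, the top 20% holds 3 cases and 1 control.
toy_scores = combine_scores(list(range(20)), [0.0] * 20)
toy_cases = [1, 1, 1, 1] + [0] * 12 + [1, 1, 1, 0]
toy_or = tail_odds_ratio(toy_scores, toy_cases, 0.20)
```

Because the sketch ignores covariates, it will not reproduce the adjusted estimates in Table 1; it only shows how a "top k% versus remainder" contrast is formed.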
Compared to the remaining 95% of individuals, the risk for HM among the top 5% of individuals was approximately 9.95-fold higher in the model without rvGRS and 15.57-fold higher in the model with rvGRS (Table 1). The effects of the GRS with and without the rvGRS in the MAGIC cohort are depicted in Fig. 5b.

Table 1. The performance metrics of the GRS in the testing cohorts.

Models | OR per s.d. (95% CI), P | AUC | PRS threshold | OR (95% CI), P | Prevalence of HM
cvGRS | 3.74 (3.19–4.44), \(3.27 \times 10^{-55}\) | 0.819 | Top 20% versus other 80% | 9.58 (6.43–14.68), \(5.70 \times 10^{-41}\) | 0.86
 | | | Top 10% versus other 90% | 10.93 (5.92–22.05), \(7.11 \times 10^{-23}\) | 0.90
 | | | Top 5% versus other 95% | 9.95 (4.24–28.50), \(1.94 \times 10^{-11}\) | 0.90
 | | | Top 2% versus other 98% | 7.19 (2.13–37.86), \(2.4 \times 10^{-4}\) | 0.87
 | | | Top 1% versus other 99% | 5.05 (1.07–47.61), 0.037 | 0.83
rvGRS | 2.24 (1.99–2.54), \(4.92 \times 10^{-38}\) | 0.759 | Top 20% versus other 80% | 9.21 (6.21–14.04), \(5.33 \times 10^{-40}\) | 0.86
 | | | Top 10% versus other 90% | 9.96 (5.50–19.54), \(7.25 \times 10^{-22}\) | 0.89
 | | | Top 5% versus other 95% | 9.95 (4.24–28.50), \(1.94 \times 10^{-11}\) | 0.90
 | | | Top 2% versus other 98% | 7.19 (2.13–37.86), \(2.4 \times 10^{-4}\) | 0.87
 | | | Top 1% versus other 99% | 11.15 (1.61–480.32), 0.006 | 0.91
ExGRS | 3.51 (3.05–4.07), \(1.31 \times 10^{-65}\) | 0.856 | Top 20% versus other 80% | 12.45 (8.07–19.86), \(3.58 \times 10^{-47}\) | 0.89
 | | | Top 10% versus other 90% | 15.13 (7.59–34.31), \(3.74 \times 10^{-26}\) | 0.92
 | | | Top 5% versus other 95% | 15.57 (5.70–59.48), \(1.47 \times 10^{-13}\) | 0.93
 | | | Top 2% versus other 98% | 7.19 (2.13–37.86), \(2.4 \times 10^{-4}\) | 0.87
 | | | Top 1% versus other 99% | 11.15 (1.61–480.32), 0.006 | 0.91

Next, we evaluated the robustness of ExGRS in the 8,682 UK Biobank European-ancestry individuals. Although we observed a significant between-population correlation of allelic effects (i.e., logOR) for variants clumped with different cut-off P values (Supplementary Fig. 16), we detected significant differences in the ExGRS across ancestries (Wilcoxon rank sum test, P < \(2.20 \times 10^{-16}\)). We then tested the final ExGRS in the UKB European cohort.
Predictive models based on the SNPs and HM-associated genes overlapping between MAGIC and UKB, fitted with age, sex and population structure, were predictive of HM (versus all non-HM controls) with an AUC of 0.657, similar to the 0.662 obtained with cvGRS only (Fig. 5c). Overall, these observations are consistent with the results above, which indicated that, in aggregate, rare variants explain less heritability than common variants in the UKB European population. The combined ExGRS model resulted in an OR per s.d. of 1.46 (95% CI = 1.41–1.52, P = \(2.35 \times 10^{-82}\)), which is lower than that of the cvGRS model (OR per s.d. = 1.78, 95% CI = 1.69–1.88, P = \(2.14 \times 10^{-105}\)) (Supplementary Table 4). In contrast to the MAGIC Han Chinese cohort, the inclusion of rvGRS in the ExGRS decreased risk prediction in the UKB European cohort (Fig. 5d). Therefore, the modeled risk in European-ancestry individuals was entirely attributable to the cvGRS.

Discussion

In our study, we estimated the heritability of HM captured by both rare and common variants in unrelated individuals from two distinct ancestry cohorts. We identified additional variance attributed to rare variants, particularly rare protein-altering variants in low LD with other genomic variants, beyond what was captured by common HapMap3 variants. Our estimates largely, though not entirely, recovered the heritability estimated from pedigree data, in particular for the Han Chinese ancestry cohort but less so for the European ancestry cohort. The remaining gap could be due to a combination of sampling variance and remaining causal variants that are not captured by the WES data. Based on the high heritability of HM, we described a systematic approach to derive and validate the ExGRS, incorporating information from rare to common genetic variants, to predict polygenic susceptibility to HM.
Our study demonstrated that the extreme tail of the ExGRS risk distribution (top 5%) conferred an approximately 15-fold increased risk for HM in the Han Chinese population. Additionally, we tested the ExGRS in participants across two ancestries and found that the top 5% of the ExGRS risk distribution conferred an approximately 2-fold increased risk for HM in European ancestry, which is lower than the 3.67-fold for cvGRS. Beyond enhanced disease screening of asymptomatic individuals, other potential applications of the ExGRS may include improved risk stratification of schoolchildren or enhanced assessment of early-onset myopia. Our results stress an urgent need to test the individual ExGRS in this setting to better assess its impact on the risk of pathological myopia as well as other HM complications. The ability to quantify inborn susceptibility using ExGRS is likely to be generalizable across a broad range of complex diseases, contingent upon the availability of a large discovery WES dataset, independent validation and testing datasets, and the heritability of a given disease explained by rare and common variants. Predictive power is likely to improve further in the coming years owing to larger WES and WGS discovery studies and improved computational algorithms that integrate functional genomics annotation, variant-variant interactions, and rare large-effect variants into the predictive model.

We note that the extremes of both the cvGRS and rvGRS distributions (top 5%) identically conferred a 10.0-fold greater risk than the remainder of the population. Consistent with the higher heritability of rare variants, a higher risk was observed in the rvGRS model (OR = 11.15, P = 0.006) than in the cvGRS model (OR = 5.05, P = 0.037) for the top 1% versus the bottom 99%. Although the combined ExGRS model substantially improved prediction performance, the model showed incomplete penetrance, in that not all carriers manifest HM.
This observation is consistent with recent PGS studies combining common and rare variants across a broad range of complex diseases, including coronary artery disease, atrial fibrillation, type 2 diabetes, inflammatory bowel disease, kidney disease and breast cancer [20,34–37]. Additional studies of large unascertained populations are needed to determine whether a larger effect size for rvGRS can be found among adults, and the extent to which a favorable polygenic background can explain the absence of HM noted among many mutation carriers.

Because our score is based on an ExWAS and RVAS for HM in Han Chinese ancestry, the allelic effect estimates are heavily biased toward the Han Chinese participants. We used variants with a concordant direction of effect between MAGIC and UKB to improve the trans-ethnic performance of the score, and further enhanced the model by including the rvGRS model. We demonstrated that rvGRS has an additive effect with cvGRS and significantly improves case-control discrimination in the Han Chinese cohort. However, since allele frequencies, linkage disequilibrium patterns, and effect sizes of polymorphisms vary by ancestry, the specific ExGRS here will not have optimal predictive power for European ethnic groups.

Although the average refractive error has increased substantially across multiple populations, the variability within a given population has also increased, suggesting that an increasingly myopiagenic environment may have led to a preferential “unmasking” of inherited susceptibility in those with the highest genetic risk [12,38,39]. For example, prior studies suggest that the effects of education, metabolism, near work and time outdoors on refractive error are most pronounced in individuals with a genetic predisposition [40–43]. The ability to identify high-risk individuals from the time of birth may facilitate targeted strategies for HM prevention with increased effect or cost-effectiveness.
The ExGRS permits identification of individuals who inherit high susceptibility, from birth and before clinical disease manifests. Careful study of individuals at the extremes of an ExGRS distribution might uncover new causal risk factors or underlying disease pathways. Similarly, clinical and multi-omic profiling of individuals at the extremes of an ExGRS distribution for HM may reveal the contributions and molecular correlates of pathways related to ocular development [44], neurotransmission [45], and scleral remodeling [46], and might enable the identification of clinically relevant subtypes of severe myopia that benefit most from a given pharmacologic or behavioral intervention.

Several important limitations of this work need to be discussed. First, we are most limited by the lack of large-scale WES for HM in multiethnic populations, as well as by the small size of existing cohorts that could be used to optimize performance in Europeans and Asians. The assumption of fixed allelic effects across different ancestry groups is likely inaccurate, because many disease-related lifestyle factors and environmental exposures related to ancestry could modify allelic effects. Accordingly, the overall tail discrimination of the score was lower in the European than in the Han Chinese ancestry cohort, with notably lower sensitivity for the top 5% GRS cutoff. Although it is not possible to overcome this limitation in the present study, our ExGRS approach could be refined by including larger WES studies for HM once they become available. Second, the performance comparisons between different ancestral groups could be biased by differences in genotyping platforms and in the ascertainment methods employed by various biobanks. For example, the UKB is a population-based cohort that recruited European participants in the 40–60 age group, while the MAGIC case-control cohorts were ascertained in schoolchildren.
The inclusion of older participants in the UKB testing cohort might lead to misclassification of some cases because of the age-related decline in refractive error, resulting in risk underestimation for cohorts consisting of older participants.

In summary, this study highlights the importance of rare variants, assessed using WES data, in addressing the current gap in heritability of various traits or diseases. We derived, optimized and validated a new tool, ExGRS, for HM prediction across ancestries. The variants captured by cvGRS and rvGRS had additive effects on HM, resulting in a nearly 15-fold increased risk for HM among individuals in the highest 5% of the risk score distribution. This result underscores the value of genetic risk scores that incorporate rare variants, which may provide higher prediction accuracy for many polygenic diseases. A potential implication of the ExGRS is that it can identify at-risk individuals before the disease or trait has manifested. With the cost of WES no longer prohibitive, a population-based genetic screening approach for common eye diseases may prove to be a cost-effective public health strategy. While our study marks the initial step in this direction, prospective studies are warranted to evaluate the performance of this approach in clinical practice and to analyze its cost-effectiveness.

Methods

Overview of the High Myopia Sequencing Consortium Cohort

The Myopia Associated Genetics and Intervention Consortium (MAGIC) is a large-scale genomic consortium integrating myopia cohorts and sequencing data from many investigators. Over the past several years, MAGIC has collected samples at the Eye Hospital of Wenzhou Medical University (Zhejiang Eye Hospital) through the Institute of Biomedical Big Data [4]. We recruited approximately ten thousand Chinese schoolchildren with high myopia, aged 6 to 18 years, from MAGIC.
The analysis presented here is based on 21,227 unrelated human samples collected from epidemiological studies of myopia.

UK Biobank (UKB) is a large-scale biomedical database and research resource containing genetic and health records from half a million individuals aged 40 to 69 years in the United Kingdom [47]. A total of 488,000 participants were genotyped for 805,426 markers on the UK BiLEVE Axiom array and the UK Biobank Axiom array. UKB measured the refractive error of 130,494 participants by non-cycloplegic autorefraction using a Tomey RC-5000 AutoRefractor Keratometer.

Quality Control

The sample quality control (QC) and variant QC procedures for the MAGIC and UKB cohorts from our previous study [31] were used in this study. We first selected the samples with phenotypes available and retained only the high-quality variants that passed a GATK Variant Quality Score Recalibration (VQSR) approach and were located outside of low-complexity regions. Genotypes with a genotype depth (DP) < 10 and genotype quality (GQ) 0.8 or 0.05, Hardy-Weinberg equilibrium (HWE) test P value < \(10^{-6}\) or a MAC < 3 using PLINK v.1.9 [48]. Only individuals of East Asian (EAS) and European ancestry, as classified by a random forest algorithm with 1000 Genomes data, were retained. At the end of all the QC steps, we retained 12,000 unrelated individuals of Han Chinese ancestry and 8,682 of European ancestry.

Variant Annotation

Variants were annotated with Ensembl’s Variant Effect Predictor (VEP v.99) for human genome assembly GRCh37. We used VEP [28] to generate additional bioinformatic predictions of variant deleteriousness (Supplementary Tables 5–6). Protein-coding variants were annotated into the following four classes: (1) synonymous; (2) benign missense; (3) damaging missense; and (4) protein-truncating variants (PTVs). Benign missense was predicted as “benign” and “tolerated” by PolyPhen-2 and SIFT, respectively, and combined annotation dependent depletion (CADD) score 15.
Finally, PTVs were classified as “frameshift_variant”, “splice_acceptor_variant”, “splice_donor_variant”, “stop_gained”, or “start_lost” variants.

Association Test

We conducted single-variant association analyses using MLMA-LOCO [25]. The test statistics obtained via linear regression were inflated because of the population differentiation caused by genetic drift. Post hoc correction approaches, such as “Genomic Control”, were used to correct the inflation [49]. For the exome-wide association study, we first tested each variant, regardless of allele frequency, for association with HM; we applied a significance level of P < \(4.3 \times 10^{-7}\) for all variants [50]. To determine whether a single gene was enriched in or depleted of rare protein-coding variants in HM cases, we performed four gene-level association tests, namely Fisher’s exact test, burden, SKAT and SKAT-O, with previously defined covariates (sample sex, PC1–PC10).

Heritability Estimation

In each WES dataset, we stratified SNPs into 4 MAF bins (0.0001 < MAF < 0.001, 0.001 < MAF < 0.01, 0.01 < MAF < 0.1 and 0.1 < MAF < 0.5). For each of the 22 autosomes, we calculated the LD score of each variant with the others in a sliding window of 10 Mb using GCTA software [25]. Each of the four MAF bins was divided into two further bins, one for variants with LD scores above the median value of the variants in the bin (high-LD bin) and one for variants with LD scores below the median (low-LD bin) (Supplementary Table 1). We then used GCTA to perform a GREML-LDMS analysis on HM in each dataset, with either 20 PCs calculated from HM3 SNPs or 160 PCs (20 PCs computed from each of the 8 MAF/LD bins) fitted as fixed covariates. Using variant annotations and the LD and MAF bins defined in the GREML-LDMS analysis of the WES data described above, we further separated the low-LD and high-LD variants with 0.0001 < MAF < 0.01 into four bins according to their predicted variant effects: PTV, D-mis, B-mis and Synonymous.
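The MAF/LD stratification described above can be sketched as follows. This is only an illustration of the binning logic, not the GCTA workflow itself: LD scores are taken as given, the function names are hypothetical, and two boundary conventions are assumptions of the sketch because the text does not specify them (MAF bins are treated as half-open intervals `(lo, hi]`, and a variant whose LD score equals the bin median is placed in the low-LD bin).

```python
from statistics import median

# MAF bin edges as listed in the text; treating each as (lo, hi] is an assumption.
MAF_EDGES = [(0.0001, 0.001), (0.001, 0.01), (0.01, 0.1), (0.1, 0.5)]


def maf_bin(maf):
    """Return the index of the MAF bin containing `maf`, or None if out of range."""
    for index, (low, high) in enumerate(MAF_EDGES):
        if low < maf <= high:
            return index
    return None


def assign_maf_ld_bins(variants):
    """Assign each variant to one of the 8 MAF/LD bins.

    `variants` is an iterable of (variant_id, maf, ld_score) tuples.
    Returns {variant_id: (maf_bin_index, "high" or "low")}.
    """
    by_maf = {}
    for variant_id, maf, ld_score in variants:
        index = maf_bin(maf)
        if index is not None:
            by_maf.setdefault(index, []).append((variant_id, ld_score))

    assignment = {}
    for index, members in by_maf.items():
        # Median LD score is computed within each MAF bin separately.
        bin_median = median(ld for _, ld in members)
        for variant_id, ld_score in members:
            ld_group = "high" if ld_score > bin_median else "low"
            assignment[variant_id] = (index, ld_group)
    return assignment


# Toy input: IDs, MAFs and LD scores are made up for illustration.
toy_variants = [
    ("v1", 0.0005, 1.0),
    ("v2", 0.0005, 3.0),
    ("v3", 0.05, 2.0),
    ("v4", 0.05, 8.0),
    ("v5", 0.00001, 1.0),  # below the lowest MAF edge, so it is dropped
]
toy_bins = assign_maf_ld_bins(toy_variants)
```

Splitting at the within-bin median keeps the two LD groups of each MAF bin roughly equal in size, which is what yields 4 × 2 = 8 bins and, with 20 PCs per bin, the 160 PCs mentioned in the text.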
We then ran a GREML-LDMS analysis with 8 genome-wide relationship matrices (GRMs), fitting the 160 PCs shown to capture the effect of population stratification as well as fixed covariates in MAGIC and UKB. To compute the variance explained per SNP, we divided the estimate of variance explained for each bin by the number of variants in the bin. The s.e. was obtained by dividing the s.e. of the estimated variance explained for the bin by the number of variants in the bin. We estimated burden heritability for rare variants using BHR (v.0.1.0), which is implemented in R; its source code is publicly available on GitHub (https://github.com/ajaynadig/bhr). To compute the effect-size variance explained per gene, we divided the estimate of burden heritability for each bin by the number of variants in the bin.

GRS Design

We derived cvGRS, rvGRS and ExGRS in the 12,000 unrelated individuals of Han Chinese ancestry from MAGIC. For cvGRS derivation, we first generated 20 pruning and thresholding (P+T) scores over a range of P value thresholds (1.0, 0.5, 0.05, 5 × 10⁻⁴ and 5 × 10⁻⁶) and r² thresholds (0.2, 0.4, 0.6 and 0.8). We also computed 7 candidate cvGRS using the LDpred2 algorithm [32] across the following range of rho (fraction of causal variants): 1.00, 1.00 × 10⁻¹, 1.00 × 10⁻², 1.00 × 10⁻³, 3.00 × 10⁻¹, 3.00 × 10⁻² and 3.00 × 10⁻³. Additionally, the lassosum2 algorithm [33] was used to generate a candidate GRS for HM. Each of the scores derived above was subsequently assessed for discrimination of HM cases from controls in the MAGIC validation dataset (2,697 cases and 2,703 controls) after adjustment for age, sex and 160 PCs of ancestry. The best-performing score was defined as the one with the maximal area under the receiver operating characteristic curve (AUC) and the largest fraction of variance explained. AUC confidence intervals were calculated using the 'pROC' package in R. We aimed to assess whether adding the rvGRS enhanced HM risk prediction.
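The candidate-score selection described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' code: the function names are hypothetical, the AUC is computed directly from the scores without the adjustment for age, sex and PCs described in the text, and the study itself used the pROC package in R.

```python
from itertools import product

# P value and r2 thresholds as listed in the text (5 x 4 = 20 P+T settings).
P_THRESHOLDS = [1.0, 0.5, 0.05, 5e-4, 5e-6]
R2_THRESHOLDS = [0.2, 0.4, 0.6, 0.8]


def pt_grid():
    """Enumerate every (P value threshold, r2 threshold) pair for P+T scoring."""
    return list(product(P_THRESHOLDS, R2_THRESHOLDS))


def auc(scores, labels):
    """Unadjusted AUC: probability that a random case outscores a random control.

    labels: 1 for HM case, 0 for control. Ties count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def best_candidate(candidates, labels):
    """Return the name of the candidate score with the highest AUC.

    candidates: dict mapping a candidate name to its per-individual scores.
    """
    return max(candidates, key=lambda name: auc(candidates[name], labels))
```

The pairwise comparison is quadratic in sample size and is used here only for clarity; a rank-based computation gives the same value and would be preferable for a validation set of several thousand individuals.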
We constructed the rvGRS from the results of the rare variant burden tests. These tests were conducted per gene, with separate thresholds for association with HM (P value) and for pathogenicity (PTV, D-mis, B-mis and synonymous variants) established for each gene in the training group. rvGRS models were constructed by fitting logistic regression models to HM on the rare variants (AF < 1%) in significantly associated genes. A unified GRS model, ExGRS, was also constructed, which summed the rare- and common-variant GRS models per individual. We tested the ExGRS for association with HM in this dataset.

Statistical Analysis within the Testing Dataset

For HM, the ExGRS with the best discriminative capacity in the validation dataset was calculated in the testing datasets of 1,219 participants in MAGIC and 8,682 participants in UKB. The proportion of the population and of HM individuals with a given magnitude of increased risk was determined by comparing progressively more extreme tails of the distribution with the remainder of the population. Logistic regression models were used to predict case-control status with adjustment for age, sex and PCs of ancestry using the glm function in R. We used the pROC R package to calculate the AUC. We also expressed the effect of the standardized risk score as ORs (with 95% CIs) per s.d. unit of the control standard-normalized risk score distribution in each of the testing cohorts. We examined risk score discrimination at tail cutoffs corresponding to the top 20, 10, 5, 2 and 1% of the GRS distribution by deriving the ORs of disease for each tail of the distribution compared with all other individuals in each cohort. Statistical analyses were conducted using R version 4.2.1 (The R Foundation).

Declarations

Data availability

Individual-level data are not publicly available due to ethical and legal restrictions related to the Wenzhou Medical University.
VCF files have been deposited in the Genome Variation Map (GVM) at the BIG Data Center (http://bigd.big.ac.cn/gsa), Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, and are publicly available as of the date of publication. Accession numbers are listed in the key resources table. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (U20A20364 and 81830027) and the Zhejiang Provincial Key Research and Development Program Grant (2021C03102) to J. Qu, and by the National Natural Science Foundation of China (82172882) to J. Su.

Consortia

The members of the Myopia Associated Genetics and Intervention Consortium (MAGIC) are Jianzhong Su, Jian Yuan, Liangde Xu, Shilai Xing, Yinghao Yao, Fukun Chen, Kai Li, Zhengbo Xue, Yaru Zhang, Ji Zhang, Hui Liu, Dandan Fan, Guosi Zhang, Hong Wang, Meng Zhou, Hao Chen, Fan Lyu, Gang An, Yuanchao Xue, Jian Yang, Jia Qu, Zhenhui Chen, Yunlong Ma, Yichun Xiong, Xinting Liu, Nan Wu, Jie Sun, Jinhua Bao, Liang Xu, Ling Li, Liang Ye, Jun Jiang, Xinjie Mao, Xinping Yu, Xiaoming Huang, Jingjing Xu, Miaomiao Li, Xuemei Zhang, Liang Hu, Zhuopao Zuo, Wanqing Jin, Jiawei Zhou, Yuwen Wang, Xue Li, Fang Hou, Yukuan Huang, Fei Qiu, Yijun Zhou, Na Gao, Xinyu Wang, Xinrui Shi, Yuchun Deng, Xiaoguang Yu, Yu Bai, Chenghao Li, Lu Chen, Ke Li, Lijun Dai, Xiangyi Yu, Peng Lin, Jingting Zhao, Congcong Yan, Siqi Bao, Zicheng Zhang, Fangjie Guo, Hongchen Han, Shen Wang, Haojun Sun, Siyi Jiang, Wei Dai, Hengte Kong, Xiaoyan Lu, Jing Li, Liansheng Li, Siyu Wang.

Author contributions

The study was conceived, designed and supervised by J.S., X.Y. and L.Q. Analysis of data was performed by J.Y., R.Q., H.S. and W.D. Patient sample recruitment was conducted by members of the Myopia Associated Genetics and Intervention Consortium. The manuscript was written by J.Y.
with contributions from all other authors.

Competing interests

The authors declare no competing financial interests.

References

1. Yu, X., Yuan, J., Chen, Z.J., Li, K., Yao, Y., Xing, S., Xue, Z., Zhang, Y., Peng, H., and An, G. (2023). Whole-Exome Sequencing Among School-Aged Children With High Myopia. JAMA Network Open 6, e2345821.
2. Morgan, I.G., Ohno-Matsui, K., and Saw, S.-M. (2012). Myopia. The Lancet 379, 1739–1748.
3. Saw, S.M., Gazzard, G., Shih-Yen, E.C., and Chua, W.H. (2005). Myopia and associated pathological complications. Ophthalmic and Physiological Optics 25, 381–391.
4. Xu, L., Ma, Y., Yuan, J., Zhang, Y., Wang, H., Zhang, G., Tu, C., Lu, X., Li, J., and Xiong, Y. (2021). COVID-19 Quarantine Reveals That Behavioral Changes Have an Effect on Myopia Progression. Ophthalmology.
5. You, Q.S., Wu, L.J., Duan, J.L., Luo, Y.X., Liu, L.J., Li, X., Gao, Q., Wang, W., Xu, L., and Jonas, J.B. (2014). Prevalence of myopia in school children in greater Beijing: the Beijing Childhood Eye Study. Acta ophthalmologica 92, e398–e406.
6. Wong, Y.-L., and Saw, S.-M. (2016). Epidemiology of pathologic myopia in Asia and worldwide. The Asia-Pacific Journal of Ophthalmology 5, 394–402.
7. Lopes, M.C., Andrew, T., Carbonaro, F., Spector, T.D., and Hammond, C.J. (2009). Estimating heritability and shared environmental effects for refractive error in twin and family studies. Investigative ophthalmology & visual science 50, 126–131.
8. Guggenheim, J.A., Kirov, G., and Hodson, S.A. (2000). The heritability of high myopia: a reanalysis of Goldschmidt's data. Journal of Medical Genetics 37, 227–231.
9. Verhoeven, V.J., Hysi, P.G., Wojciechowski, R., Fan, Q., Guggenheim, J.A., Höhn, R., MacGregor, S., Hewitt, A.W., Nag, A., and Cheng, C.-Y. (2013). Genome-wide meta-analyses of multiancestry cohorts identify multiple new susceptibility loci for refractive error and myopia. Nature genetics 45, 314–318.
10. Tedja, M., Wojciechowski, R., Hysi, P., Eriksson, N., Furlotte, N., Verhoeven, V., Iglesias, A., Meester-Smoor, M., Tompson, S., Fan, Q., et al. (2018). Genome-wide association meta-analysis highlights light-induced signaling as a driver for refractive error. Nature genetics 50, 834–848. doi:10.1038/s41588-018-0127-7.
11. Hysi, P., Choquet, H., Khawaja, A., Wojciechowski, R., Tedja, M., Yin, J., Simcoe, M., Patasova, K., Mahroo, O., Thai, K., et al. (2020). Meta-analysis of 542,934 subjects of European ancestry identifies new genes and mechanisms predisposing to refractive error and myopia. Nature genetics 52, 401–407. doi:10.1038/s41588-020-0599-0.
12. Morgan, I.G., Wu, P.-C., Ostrin, L.A., Tideman, J.W.L., Yam, J.C., Lan, W., Baraas, R.C., He, X., Sankaridurg, P., and Saw, S.-M. (2021). IMI risk factors for myopia. Investigative ophthalmology & visual science 62, 3.
13. Tran-Viet, K., Powell, C., Barathi, V., Klemm, T., Maurer-Stroh, S., Limviphuvadh, V., Soler, V., Ho, C., Yanovitch, T., Schneider, G., et al. (2013). Mutations in SCO2 are associated with autosomal-dominant high-grade myopia. American journal of human genetics 92, 820–826. doi:10.1016/j.ajhg.2013.04.005.
14. Jin, Z., Wu, J., Huang, X., Feng, C., Cai, X., Mao, J., Xiang, L., Wu, K., Xiao, X., Kloss, B., et al. (2017). Trio-based exome sequencing arrests de novo mutations in early-onset high myopia. Proceedings of the National Academy of Sciences of the United States of America 114, 4219–4224. doi:10.1073/pnas.1615970114.
15. Hosoda, Y., Yoshikawa, M., Miyake, M., Tabara, Y., Shimada, N., Zhao, W., Oishi, A., Nakanishi, H., Hata, M., Akagi, T., et al. (2018). CCDC102B confers risk of low vision and blindness in high myopia. Nature communications 9, 1782. doi:10.1038/s41467-018-03649-3.
16. Aldahmesh, M., Khan, A., Alkuraya, H., Adly, N., Anazi, S., Al-Saleh, A., Mohamed, J., Hijazi, H., Prabakaran, S., Tacke, M., et al. (2013). Mutations in LRPAP1 are associated with severe myopia in humans. American journal of human genetics 93, 313–320. doi:10.1016/j.ajhg.2013.06.002.
17. Su, J., Yuan, J., Xu, L., Xing, S., Sun, M., Yao, Y., Ma, Y., Chen, F., Jiang, L., and Li, K. (2022). Sequencing of 19,219 exomes identifies a low-frequency variant in FKBP5 promoter predisposing to high myopia in a Han Chinese population. medRxiv.
18. Hao, L., Kraft, P., Berriz, G.F., Hynes, E.D., Koch, C., Korategere V Kumar, P., Parpattedar, S.S., Steeves, M., Yu, W., and Antwi, A.A. (2022). Development of a clinical polygenic risk score assay and reporting workflow. Nature medicine 28, 1006–1013.
19. Wray, N.R., Lin, T., Austin, J., McGrath, J.J., Hickie, I.B., Murray, G.K., and Visscher, P.M. (2021). From basic science to clinical application of polygenic risk scores: a primer. JAMA psychiatry 78, 101–109.
20. Khera, A.V., Chaffin, M., Aragam, K.G., Haas, M.E., Roselli, C., Choi, S.H., Natarajan, P., Lander, E.S., Lubitz, S.A., and Ellinor, P.T. (2018). Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nature genetics 50, 1219–1224.
21. Mojarrad, N.G., Plotnikov, D., Williams, C., and Guggenheim, J.A. (2020). Association Between Polygenic Risk Score and Risk of Myopia. JAMA Ophthalmology 138, 7–13.
22. Tideman, J., Pärssinen, O., Haarman, A., Khawaja, A., Wedenoja, J., Williams, K., Biino, G., Ding, X., Kähönen, M., Lehtimäki, T., et al. (2021). Evaluation of Shared Genetic Susceptibility to High and Low Myopia and Hyperopia. JAMA ophthalmology 139, 601–609. doi:10.1001/jamaophthalmol.2021.0497.
23. Clark, R., Lee, S.S.-Y., Du, R., Wang, Y., Kneepkens, S.C., Charng, J., Huang, Y., Hunter, M.L., Jiang, C., and Tideman, J.W.L. (2023). A new polygenic score for refractive error improves detection of children at risk of high myopia but not the prediction of those at risk of myopic macular degeneration. EBioMedicine 91.
24. Kassam, I., Foo, L.-L., Lanca, C., Xu, L., Hoang, Q.V., Cheng, C.-Y., Hysi, P., and Saw, S.-M. (2022). The potential of current polygenic risk scores to predict high myopia and myopic macular degeneration in multi-ethnic Singapore adults. Ophthalmology.
25. Yang, J., Lee, S.H., Goddard, M.E., and Visscher, P.M. (2011). GCTA: a tool for genome-wide complex trait analysis. The American Journal of Human Genetics 88, 76–82.
26. Wainschtein, P., Jain, D., Zheng, Z., Cupples, L., Shadyab, A., McKnight, B., Shoemaker, B., Mitchell, B., Psaty, B., Kooperberg, C., et al. (2022). Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nature genetics 54, 263–273. doi:10.1038/s41588-021-00997-7.
27. Zeng, J., De Vlaming, R., Wu, Y., Robinson, M.R., Lloyd-Jones, L.R., Yengo, L., Yap, C.X., Xue, A., Sidorenko, J., and McRae, A.F. (2018). Signatures of negative selection in the genetic architecture of human complex traits. Nature genetics 50, 746–753.
28. McLaren, W., Gil, L., Hunt, S.E., Riat, H.S., Ritchie, G.R., Thormann, A., Flicek, P., and Cunningham, F. (2016). The ensembl variant effect predictor. Genome biology 17, 1–14.
29. Weiner, D.J., Nadig, A., Jagadeesh, K.A., Dey, K.K., Neale, B.M., Robinson, E.B., Karczewski, K.J., and O'Connor, L.J. (2023). Polygenic architecture of rare coding variation across 394,783 exomes. Nature 614, 492–499.
30. Torkamani, A., Wineinger, N.E., and Topol, E.J. (2018). The personal and clinical utility of polygenic risk scores. Nature Reviews Genetics 19, 581–590.
31. Su, J., Yuan, J., Xu, L., Xing, S., Sun, M., Yao, Y., Ma, Y., Chen, F., Jiang, L., and Li, K. (2023). Sequencing of 19,219 exomes identifies a low-frequency variant in FKBP5 promoter predisposing to high myopia in a Han Chinese population. Cell Reports 42.
32. Privé, F., Arbel, J., and Vilhjálmsson, B.J. (2020). LDpred2: better, faster, stronger. Bioinformatics 36, 5424–5431.
33. Mak, T.S.H., Porsch, R.M., Choi, S.W., Zhou, X., and Sham, P.C. (2017). Polygenic scores via penalized regression on summary statistics. Genetic epidemiology 41, 469–480.
34. Fiziev, P.P., McRae, J., Ulirsch, J.C., Dron, J.S., Hamp, T., Yang, Y., Wainschtein, P., Ni, Z., Schraiber, J.G., and Gao, H. (2023). Rare penetrant mutations confer severe risk of common diseases. Science 380, eabo1131.
35. Khan, A., Turchin, M.C., Patki, A., Srinivasasainagendra, V., Shang, N., Nadukuru, R., Jones, A.C., Malolepsza, E., Dikilitas, O., and Kullo, I.J. (2022). Genome-wide polygenic score to predict chronic kidney disease across ancestries. Nature Medicine 28, 1412–1420.
36. Dornbos, P., Koesterer, R., Ruttenburg, A., Nguyen, T., Cole, J.B., Consortium, A.-T.D.-G., Leong, A., Meigs, J.B., Florez, J.C., and Rotter, J.I. (2022). A combined polygenic score of 21,293 rare and 22 common variants improves diabetes diagnosis based on hemoglobin A1C levels. Nature genetics 54, 1609–1614.
37. Khera, A.V., Chaffin, M., Wade, K.H., Zahid, S., Brancale, J., Xia, R., Distefano, M., Senol-Cosar, O., Haas, M.E., and Bick, A. (2019). Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596.e589.
38. Morgan, I., and Rose, K. (2005). How genetic is school myopia? Progress in retinal and eye research 24, 1–38.
39. Morgan, I.G., French, A.N., Ashby, R.S., Guo, X., Ding, X., He, M., and Rose, K.A. (2018). The epidemics of myopia: aetiology and prevention. Progress in retinal and eye research 62, 134–149.
40. Fan, Q., Guo, X., Tideman, J.W.L., Williams, K.M., Yazar, S., Hosseini, S.M., Howe, L.D., Pourcain, B.S., Evans, D.M., and Timpson, N.J. (2016). Childhood gene-environment interactions and age-dependent effects of genetic variants associated with refractive error and myopia: The CREAM Consortium. Scientific reports 6, 25853.
41. Enthoven, C.A., Tideman, J.W.L., Polling, J.R., Tedja, M.S., Raat, H., Iglesias, A.I., Verhoeven, V.J., and Klaver, C.C. (2019). Interaction between lifestyle and genetic susceptibility in myopia: the Generation R study. European journal of epidemiology 34, 777–784.
42. Wojciechowski, R., Yee, S.S., Simpson, C.L., Bailey-Wilson, J.E., and Stambolian, D. (2013). Matrix metalloproteinases and educational attainment in refractive error: Evidence of gene–environment interactions in the Age-Related Eye Disease Study. Ophthalmology 120, 298–305.
43. Fan, Q., Verhoeven, V.J., Wojciechowski, R., Barathi, V.A., Hysi, P.G., Guggenheim, J.A., Höhn, R., Vitart, V., Khawaja, A.P., and Yamashiro, K. (2016). Meta-analysis of gene–environment-wide association scans accounting for education level identifies additional loci for refractive error. Nature communications 7, 11008.
44. Wallman, J., and Winawer, J. (2004). Homeostasis of eye growth and the question of myopia. Neuron 43, 447–468.
45. Troilo, D., Smith, E.L., Nickla, D.L., Ashby, R., Tkatchenko, A.V., Ostrin, L.A., Gawne, T.J., Pardue, M.T., Summers, J.A., and Kee, C.-s. (2019). IMI–Report on experimental models of emmetropization and myopia. Investigative ophthalmology & visual science 60, M31–M88.
46. Wu, H., Chen, W., Zhao, F., Zhou, Q., Reinach, P.S., Deng, L., Ma, L., Luo, S., Srinivasalu, N., and Pan, M. (2018). Scleral hypoxia is a target for myopia control. Proceedings of the National Academy of Sciences 115, E7091–E7100.
47. Bycroft, C., Freeman, C., Petkova, D., Band, G., Elliott, L.T., Sharp, K., Motyer, A., Vukcevic, D., Delaneau, O., O'Connell, J., et al. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209. doi:10.1038/s41586-018-0579-z.
48. Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A., Bender, D., Maller, J., Sklar, P., De Bakker, P.I., and Daly, M.J. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. The American journal of human genetics 81, 559–575.
49. Devlin, B., and Roeder, K. (1999). Genomic control for association studies. Biometrics 55, 997–1004.
50. Sveinbjornsson, G., Albrechtsen, A., Zink, F., Gudjonsson, S.A., Oddson, A., Másson, G., Holm, H., Kong, A., Thorsteinsdottir, U., and Sulem, P. (2016). Weighting sequence variants based on their annotation increases power of whole-genome association studies. Nature genetics 48, 314–317.

Additional Declarations

There is NO Competing Interest.

Supplementary Files

SMFigures2024Feb27.docx (Dataset 1)
SMTables2024Feb27.xlsx (Dataset 2)
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-4188555","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":303954153,"identity":"08301297-f286-4521-86f4-363a1621db49","order_by":0,"name":"Jianzhong Su","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAAsElEQVRIiWNgGAWjYHCCBIYPEIYB8VoYZ5CqhYGZhyQt8jMSHn+2bduW2MDevE2CoeYOYS0GNxISjHPbbic28Bwrk2A49owILRIJCcm524BaJHLMJBgbDhPlsITDliAt8m+I1MJwIyGxmRFsCw+RWgzOPEhm7P1327iNJ63YIuEYMQ5rz0n+8OPMbdl+9sMbb3yoIcZhDDwJYIoNRCQQo4GBgf0AcepGwSgYBaNg5AIA3Ok67fNCCGgAAAAASUVORK5CYII=","orcid":"https://orcid.org/0000-0003-1054-6042","institution":"Wenzhou Medical University","correspondingAuthor":true,"prefix":"","firstName":"Jianzhong","middleName":"","lastName":"Su","suffix":""},{"id":303954154,"identity":"642d4077-e1d1-4f0f-9c42-d64a4f0a9c36","order_by":1,"name":"Jian Yuan","email":"","orcid":"","institution":"Wenzhou Medical University","correspondingAuthor":false,"prefix":"","firstName":"Jian","middleName":"","lastName":"Yuan","suffix":""},{"id":303954155,"identity":"339b4edc-68ae-4479-94c0-9b3ee7817994","order_by":2,"name":"Ruowen Qiu","email":"","orcid":"","institution":"Wenzhou Medical University","correspondingAuthor":false,"prefix":"","firstName":"Ruowen","middleName":"","lastName":"Qiu","suffix":""},{"id":303954156,"identity":"c6a6b18d-310a-4d0f-8356-000dc5be8527","order_by":3,"name":"Yuhan Wang","email":"","orcid":"","institution":"Beijing Tongren Eye 
Center","correspondingAuthor":false,"prefix":"","firstName":"Yuhan","middleName":"","lastName":"Wang","suffix":""},{"id":303954157,"identity":"9cf49c95-3074-423e-ae7b-39f9ed5d0323","order_by":4,"name":"Zhen Ji Chen","email":"","orcid":"","institution":"Wenzhou Medical University","correspondingAuthor":false,"prefix":"","firstName":"Zhen","middleName":"Ji","lastName":"Chen","suffix":""},{"id":303954158,"identity":"ca9a87fc-9344-440e-ac21-119da0ef7cfa","order_by":5,"name":"Haojun Sun","email":"","orcid":"","institution":"Wenzhou Medical University","correspondingAuthor":false,"prefix":"","firstName":"Haojun","middleName":"","lastName":"Sun","suffix":""},{"id":303954159,"identity":"1a1e18d4-f068-482d-842f-ad20f9703d4a","order_by":6,"name":"Wei Dai","email":"","orcid":"","institution":"Monash University","correspondingAuthor":false,"prefix":"","firstName":"Wei","middleName":"","lastName":"Dai","suffix":""},{"id":303954160,"identity":"440a4c91-5fe6-4431-9fee-640f54dc2ac8","order_by":7,"name":"Yinghao Yao","email":"","orcid":"","institution":"Oujiang Laboratory","correspondingAuthor":false,"prefix":"","firstName":"Yinghao","middleName":"","lastName":"Yao","suffix":""},{"id":303954161,"identity":"2c2bdd9a-8855-460d-a16d-332ae02da14c","order_by":8,"name":"Ran Zhuo","email":"","orcid":"https://orcid.org/0000-0003-0546-5475","institution":"Wenzhou Medical University","correspondingAuthor":false,"prefix":"","firstName":"Ran","middleName":"","lastName":"Zhuo","suffix":""},{"id":303954162,"identity":"465f27ab-c6aa-49ea-80bf-23fa9370b935","order_by":9,"name":"Kai Li","email":"","orcid":"","institution":"Wenzhou Institute, University of Chinese Academy of Sciences","correspondingAuthor":false,"prefix":"","firstName":"Kai","middleName":"","lastName":"Li","suffix":""},{"id":303954163,"identity":"0e2641b7-1ac9-4678-9755-b3197e83d720","order_by":10,"name":"Shilai Xing","email":"","orcid":"","institution":"Berry Genomics 
Corporation","correspondingAuthor":false,"prefix":"","firstName":"Shilai","middleName":"","lastName":"Xing","suffix":""},{"id":303954164,"identity":"f2999728-2c23-4ad3-93c0-7566625d305e","order_by":11,"name":"Xiaoguang Yu","email":"","orcid":"","institution":"Institute of PSI Genomics","correspondingAuthor":false,"prefix":"","firstName":"Xiaoguang","middleName":"","lastName":"Yu","suffix":""},{"id":303954165,"identity":"812a25f0-921d-437c-8f85-029325ba7522","order_by":12,"name":"Liya Qiao","email":"","orcid":"","institution":"Beijing Tongren Eye Center","correspondingAuthor":false,"prefix":"","firstName":"Liya","middleName":"","lastName":"Qiao","suffix":""},{"id":303954166,"identity":"39d8b4f2-f030-466e-8a14-6cf08e3eddd4","order_by":13,"name":"Jia Qu","email":"","orcid":"","institution":"Wenzhou Medical University","correspondingAuthor":false,"prefix":"","firstName":"Jia","middleName":"","lastName":"Qu","suffix":""}],"badges":[],"createdAt":"2024-03-29 14:50:10","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4188555/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4188555/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1038/s43856-024-00718-1","type":"published","date":"2024-12-30T05:00:00+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":57720912,"identity":"45e42cb9-4ee1-4773-badc-c452ac99a5a2","added_by":"auto","created_at":"2024-06-04 18:55:19","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":444253,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eGREML-LDMS estimates from WES data stratified in 8 bins (2 LD bins for each of the 4 MAF bins) correcting for 160 PCs (20 * 8 bins) for MAGIC and UKB. 
\u003c/strong\u003e(A) Estimate for HM with \u003cem\u003eh\u003c/em\u003e\u003csup\u003e\u003cem\u003e2\u003c/em\u003e\u003c/sup\u003e\u003csub\u003e\u003cem\u003eWES\u003c/em\u003e\u003c/sub\u003e\u003csub\u003e \u003c/sub\u003eat 0.53 (s.e.=0.06) in MAGIC. (B) Estimate for HM with \u003cem\u003eh\u003c/em\u003e\u003csup\u003e\u003cem\u003e2\u003c/em\u003e\u003c/sup\u003e\u003csub\u003e\u003cem\u003eWES\u003c/em\u003e\u003c/sub\u003e\u003csub\u003e \u0026nbsp;\u003c/sub\u003eat 0.21 (s.e.=0.10) in UKB. The number of variants in each of the 4 MAF bins (twice the number in each LD bin) is, from the lowest to highest MAF bins, 539K, 86K, 36K and 39K, respectively (Supplementary Table1). Error bars indicate standard errors (SE).\u003c/p\u003e","description":"","filename":"figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/a86ecf4cc98460b89bac4698.jpg"},{"id":57720913,"identity":"0a560af5-3b4e-4d49-9db0-32542584bdb2","added_by":"auto","created_at":"2024-06-04 18:55:19","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":368976,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eVariance explained per variant (the estimate of genetic variance divided by the number of variants in each bin) from GREML-LDMS with rare variants partitioned into four categories according to the variant annotation. \u003c/strong\u003e(A) Variance explained per variant for Han Chinese individuals from MAGIC. (B) Variance explained per variant for European individuals from UKB. 
Error bars indicate standard errors (SE).\u003c/p\u003e","description":"","filename":"figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/d6e78a453da91a16d107944f.jpg"},{"id":57720895,"identity":"45a62cf3-f056-426a-bc24-317a3219f921","added_by":"auto","created_at":"2024-06-04 18:55:19","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":771022,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eOverview of the study design. \u003c/strong\u003eThe HM GRS was designed based on the MAGIC. Validation and optimization were performed in two stages using common variant GRS (optimization 1) and rare variant GRS (optimization 2). The optimal GRS for HM was chosen based on the AUC in the MAGIC validation dataset (n = 5,400 Han Chinese). ExGRS performance validation was conducted in two additional independent testing cohorts of diverse ancestries.\u003c/p\u003e","description":"","filename":"figure3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/3d20fec8f9d9658d3828af3f.jpg"},{"id":57720891,"identity":"47aa024a-f436-4605-b031-f61b1cc31353","added_by":"auto","created_at":"2024-06-04 18:55:19","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":663233,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCompare the GRS of common and rare variants and combine them into a unified ExGRS model. \u003c/strong\u003e(A) Receiver operating characteristic (ROC) curves for cvGRS model to detect HM in the schoolchildren from the MAGIC cohort. The solid black line represents chance-level prediction accuracy. (B) ROC curves for rvGRS model to detect HM. (C) cvGRS for each individual according to 10 groups of the validation dataset binned according to the quantiles of the rvGRS. (D) Enrichment of outlier GRS scores in individuals who are extreme HM prediction risk. 
GRS ordered from the 50% to the 100% percentile (x axis), and the y axis depicts the enrichment of HM for each of the percentile-defined subgroups in reference to the baseline population. (E) ROC curves for ExGRS model to detect HM. (F) Odds ratios for cvGRS, rvGRS and unified ExGRS model by comparing those in the high-risk group with the remainder of the population.\u003c/p\u003e","description":"","filename":"figure4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/266141e7805fbbea4602b0d9.jpg"},{"id":57720914,"identity":"f7fb7a9b-01ea-4af9-8ebf-aeaa3435af4b","added_by":"auto","created_at":"2024-06-04 18:55:20","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":805542,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eEffects of the ExGRS for HM in testing cohorts.\u003c/strong\u003e (A) ROC curves and corresponding areas under the ROC curve (AUCs) were used to assess the ability of the ExGRS to distinguish HM in MAGIC testing subgroup. (B) The x axis depicts each quantile of the ExGRS ordered from the first (Q1) to the last (Q5) quantile. The y axis depicts the ORs of HM for each of the quantile-defined subgroups in reference to the middle quantile (Q3) of cvGRS. The effect estimates (dots) and 95% CIs (vertical bars) were derived based on a fixed-effects in MAGIC testing cohorts. (C) ROC Curves for detecting HM using ExGRS in UKB testing subgroup of European ancestry. 
(D) The effects of the ExGRS in UKB.\u003c/p\u003e","description":"","filename":"figure5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/d7b52263da02e3c2f5c765ad.jpg"},{"id":72685473,"identity":"762de505-c056-4e4b-9906-b89753c1578d","added_by":"auto","created_at":"2024-12-31 08:09:14","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3795900,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/e81fdde3-da1c-47db-bd27-ec8b164e6493.pdf"},{"id":57720873,"identity":"cfe03b7c-d96f-4468-8ff3-d0634775a05d","added_by":"auto","created_at":"2024-06-04 18:55:18","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":8574367,"visible":true,"origin":"","legend":"Dataset 1","description":"","filename":"SMFigures2024Feb27.docx","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/11a9a6f276d4c5a80a1e31fd.docx"},{"id":57720890,"identity":"291f7675-e787-40df-881a-d0e2ac215e39","added_by":"auto","created_at":"2024-06-04 18:55:19","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":19079,"visible":true,"origin":"","legend":"\u003cp\u003eDataset 2\u003c/p\u003e","description":"","filename":"SMTables2024Feb27.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-4188555/v1/6faaae4dfe16a0fa12f3183a.xlsx"}],"financialInterests":"There is \u003cb\u003eNO\u003c/b\u003e Competing Interest.","formattedTitle":"ExGRS: exome-wide genetic risk score to predict high myopia across multi-ancestry populations","fulltext":[{"header":"Introduction","content":"\u003cp\u003eIndividuals with high myopia (HM), generally defined by a spherical equivalent (SE) of -6.00 diopters (D) or lower\u003csup\u003e1\u003c/sup\u003e. 
HM affects 2.8% of the general population and is a risk factor for developing pathologic myopia (PM) and its complications, most notably retinal degeneration or even detachment, which can cause severe visual acuity (VA) loss and even blindness\u003csup\u003e2,3\u003c/sup\u003e. HM commonly occurs in Asian schoolchildren (6.8%-21.6%)\u003csup\u003e4,5\u003c/sup\u003e than in non-Asians (2.0%-2.3%)\u003csup\u003e6\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eHM is a multifactorial eye disease with a high genetic susceptibility. Twin and family studies have demonstrated that HM has a high heritability\u003csup\u003e7,8\u003c/sup\u003e. Over past decades, amounts of genome-wide association studies (GWAS) of refractive error or myopia have revealed more than hundreds of candidate genetic factors across different ethnic populations\u003csup\u003e9\u0026ndash;11\u003c/sup\u003e. However, the common variant uncovered by GWAS has a small effect size independently; even the additive effects can only explain a limited fraction of myopia heritability (estimated heritability: 5.3% in Asians and 21.4% in Europeans)\u003csup\u003e11,12\u003c/sup\u003e. Whole-exome sequencing (WES) studies of HM trios or families have identified several novel mutations and genes in the Asian populations, i.e., \u003cem\u003eSCO2\u003c/em\u003e\u003csup\u003e13\u003c/sup\u003e, \u003cem\u003eBSG\u003c/em\u003e\u003csup\u003e14\u003c/sup\u003e, \u003cem\u003eCCDC102B\u003c/em\u003e\u003csup\u003e15\u003c/sup\u003e, and \u003cem\u003eLRPAP1\u003c/em\u003e\u003csup\u003e16\u003c/sup\u003e. Moreover, our recent WES study has also identified several HM-associated genes, including rare coding variants, which were found to have larger effect sizes\u003csup\u003e17\u003c/sup\u003e. 
Hence, rare variants indeed contribute to the genetic architecture of HM, although the extent to which they account for its heritability remains unclear, leaving ample room for further investigation.\u003c/p\u003e \u003cp\u003ePolygenic risk scores (PRS) summarize the cumulative genetic effects of numerous disease-associated variants, providing an overall measure of genetic susceptibility to a particular disease for an individual\u003csup\u003e18,19\u003c/sup\u003e. No further clinical interventions or examinations are required; a test of blood or saliva samples can be used to predict a wide range of conditions\u003csup\u003e20\u003c/sup\u003e. In European populations, several large-scale studies have demonstrated the effectiveness of using the PRS to stratify myopia risk\u003csup\u003e10,11,21\u0026ndash;24\u003c/sup\u003e. Currently, the best-performing PRSs for refractive error explain about 19% of the variance in the trait in individuals of European ancestry and about 6% in those of East Asian ancestry\u003csup\u003e23\u003c/sup\u003e. The best AUROCs for HM are 0.783 and 0.672 in European and East Asian populations, respectively\u003csup\u003e23\u003c/sup\u003e. Because most large-scale myopia GWASs have been performed primarily in European populations, it remains unclear whether these findings generalize to diverse populations of non-European ancestry.\u003c/p\u003e \u003cp\u003eThus, in this study, we estimated the heritability explained by SNP-based genetic variance and the gene-wise burden of rare alleles for HM using WES data in a large sample of 12,000 unrelated Chinese individuals from the Myopia Associated Genetics and Intervention Consortium (MAGIC) and 8,682 Europeans from the UK Biobank (UKB) program. We constructed common-variant-based genetic risk score (cvGRS) and rare-variant-based genetic risk score (rvGRS) models and evaluated the performance of the two models for genetic risk prediction in a subset of MAGIC. 
We proposed a method, the exome-wide genetic risk score (ExGRS), which combines the cvGRS and rvGRS, and observed further improvement in genetic risk prediction for HM. We demonstrated the creation of the ExGRS, which exhibits distinct advantages over the cvGRS by incorporating rare variants identified in HM-associated genes via burden tests, and also evaluated its portability across ancestries in the UKB European population.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003ePolygenic architecture of rare to common coding variants\u003c/h2\u003e \u003cp\u003eWe used a dataset of 12,600 exomes of Han Chinese ancestry in the MAGIC project and 8,682 exomes of European ancestry in the UK Biobank (Supplementary Fig.\u0026nbsp;1). We analyzed variants observed at least three times in our dataset, which corresponds to a minor allele frequency (MAF) threshold of 0.01%. After quality control (QC), 2.6 \u0026times; 10\u003csup\u003e6\u003c/sup\u003e and 2.2 \u0026times; 10\u003csup\u003e6\u003c/sup\u003e variants were included in the further analysis in MAGIC and UKB, respectively. First, based on common SNPs, the estimated heritability (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{SNP}^{2}\\)\u003c/span\u003e\u003c/span\u003e) of HM was calculated by the genomic-relatedness-based restricted maximum likelihood (GREML) approach implemented in the software package GCTA\u003csup\u003e25\u003c/sup\u003e. This analysis utilized a selected set of 43,367 and 66,091 HapMap 3 (HM3) SNPs from the MAGIC and UKB cohorts, respectively. 
After correcting for the first 20 principal components (PCs) computed from HM3 SNPs, we estimated an \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{SNP}^{2}\\)\u003c/span\u003e\u003c/span\u003e of 0.31 (standard error, \u003cem\u003es.e.\u003c/em\u003e = 0.01) and 0.14 (\u003cem\u003es.e.\u003c/em\u003e = 0.02) for HM in the MAGIC and UKB cohorts, respectively (Supplementary Fig.\u0026nbsp;2). We then used variants with MAF\u0026thinsp;\u0026gt;\u0026thinsp;0.01% to estimate and partition additive genetic variances. We grouped variants according to MAF and LD (Supplementary Fig.\u0026nbsp;3 and Supplementary Table\u0026nbsp;1), using the GREML-LDMS partitioning method with a median-based LD grouping strategy\u003csup\u003e26\u003c/sup\u003e. After correcting for the first 20 PCs, we found that the estimated heritability based on WES data (\u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{WES}^{2}\\)\u003c/span\u003e\u003c/span\u003e) was 1.76 (s.e. = 0.03) and 0.20 (s.e. = 0.10) for HM in the MAGIC and UKB cohorts, respectively (Supplementary Fig.\u0026nbsp;3), which suggested that \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{SNP}^{2}\\)\u003c/span\u003e\u003c/span\u003e in the MAGIC cohort may have been inflated by confounding factors such as population structure.\u003c/p\u003e \u003cp\u003eTo assess the contribution of uncaptured population structure, we used a linear model adjusted for PCs to test the association of rare variants (Supplementary Fig.\u0026nbsp;4) in both cohorts. 
We then used 160 PCs (that is, 20 PCs computed from each of the 8 MAF/LD bins) computed from independent variants in the GREML-LDMS analyses, which decreased \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{WES}^{2}\\)\u003c/span\u003e\u003c/span\u003e from 1.76 (s.e. = 0.03) to 0.53 (s.e. = 0.06) in the MAGIC cohort and increased \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{WES}^{2}\\)\u003c/span\u003e\u003c/span\u003e from 0.20 (s.e. = 0.10) to 0.21 (s.e. = 0.10) in UKB (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e and Supplementary Fig.\u0026nbsp;2), suggesting the presence of population stratification effects not captured by the 20 common-variant PCs used in the MAGIC analysis above. We also found that the difference in \u003cspan class=\"InlineEquation\"\u003e\u003cspan class=\"mathinline\"\u003e\\({h}_{WES}^{2}\\)\u003c/span\u003e\u003c/span\u003e for HM between the MAGIC and UKB cohorts is predominantly explained by rare variants, in particular those in low LD with nearby variants. For the variants with MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.01, 0.33 of the phenotypic variance in the MAGIC cohort was accounted for by variants in the low-LD group, but only 0.10 by variants in the high-LD group. However, in the UKB cohort, only 0.04 of the phenotypic variance was accounted for by variants in the low-LD group and 0.01 by those in the high-LD group. When all called SNPs in the MAGIC cohort were replaced with the overlapping variants found in both the MAGIC and UKB WES datasets, the estimated heritability decreased from 0.53 to 0.06 (Supplementary Fig.\u0026nbsp;5), with most of the difference coming from variants with 0.0001\u0026thinsp;\u0026lt;\u0026thinsp;MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.01, almost all of which are EAS-specific (Supplementary Fig.\u0026nbsp;6). 
To further examine the relationship between SNP effect size and MAF, we plotted the cumulative genetic variance explained (\u003cem\u003eh\u003c/em\u003e\u003csup\u003e\u003cem\u003e2\u003c/em\u003e\u003c/sup\u003e) against MAF. Under an evolutionarily neutral model, \u003cem\u003eh\u003c/em\u003e\u003csup\u003e\u003cem\u003e2\u003c/em\u003e\u003c/sup\u003e is linearly proportional to MAF\u003csup\u003e27\u003c/sup\u003e. We found that the curves of cumulative genetic variance in the MAGIC and UKB cohorts deviated from the neutral model, which suggested that HM is under negative selection (Supplementary Fig.\u0026nbsp;7).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTo investigate the contribution of low-LD variants with MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.01 to heritability, we partitioned the low-MAF and low-LD variant bins according to the putative effect of protein-coding variants using VEP\u003csup\u003e28\u003c/sup\u003e. Protein-coding variants include four annotations: (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e) synonymous (Syn); (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e) benign missense (B-mis); (\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e) damaging missense (D-mis); and (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) protein-truncating variants (PTVs) (Supplementary Table\u0026nbsp;2). The proportion of deleterious protein-altering variants, including PTVs and D-mis, differed across the LD and MAF groups, with an increasing trend from low- to high-MAF bins (Supplementary Fig.\u0026nbsp;8), which is consistent with purifying selection on this class of variants. 
Interestingly, the average variance explained per variant was larger for bins with PTVs (low-LD) compared with bins with other protein-altering variants and non-protein-altering variants (low-LD) or high-LD variants (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). To further validate the robustness of the estimates partitioned by functional genomic annotations, we quantified the heritability explained by the gene-wise burden of rare coding variants\u003csup\u003e29\u003c/sup\u003e. We found that HM in the MAGIC and UKB cohorts had a PTV burden heritability of 0.7% (s.e. = 0.15%) and 0.32% (s.e. = 0.25%), respectively (Supplementary Fig.\u0026nbsp;9). Burden heritability concentrates among variants with the most severe predicted functional consequences: PTVs explain the majority of burden heritability, followed by D-mis, B-mis and Syn variants, which is also consistent with the GREML-LDMS assessment.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eDerive genetic risk scores of common coding variants for HM\u003c/h2\u003e \u003cp\u003eThe genetic risk score (GRS) serves as a reliable measure of an individual\u0026rsquo;s overall genetic susceptibility to disease and is an integral part of precision medicine\u003csup\u003e30\u003c/sup\u003e. The flowchart illustrating the study strategy is presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. For HM in the MAGIC cohort, we created several candidate cvGRSs based on summary statistics from an exome-wide association study (ExWAS) in 12,600 participants (6,300 cases and 6,300 controls) of Han Chinese ancestry\u003csup\u003e31\u003c/sup\u003e. 
Specifically, we derived 20 predictors based on a pruning and thresholding method, seven additional predictors using the LDPred2 algorithm\u003csup\u003e32\u003c/sup\u003e and one predictor using Lassosum2\u003csup\u003e33\u003c/sup\u003e. These scores were validated within the MAGIC cohort. We used a validation dataset of 5,400 participants in the MAGIC cohort to select the cvGRS with the best performance, defined as the maximum area under the receiver operating characteristic curve (AUC). The predictors had AUCs ranging from 0.598 to 0.895 in the validation set (Supplementary Table\u0026nbsp;3; Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ea). The best model was based on the pruning and thresholding (P\u0026thinsp;+\u0026thinsp;T) method and involved 40,491 variants with nonzero weights selected based on \u003cem\u003er\u003c/em\u003e\u003csup\u003e2\u003c/sup\u003e\u0026thinsp;=\u0026thinsp;0.2 and \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.0 (Supplementary Fig.\u0026nbsp;10). In the validation dataset, the polygenic component of the score explained 4.9% of the variance (R\u003csup\u003e2\u003c/sup\u003e), with each 1-s.d. increase in the score raising the odds of HM approximately 7-fold (odds ratio [OR]\u0026thinsp;=\u0026thinsp;6.99, 95% confidence interval [CI]\u0026thinsp;=\u0026thinsp;6.34\u0026ndash;7.75, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;1.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;300\u003c/sup\u003e) after controlling for age, sex and genetic ancestry.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eGRS optimization by combining with rare variants\u003c/h2\u003e \u003cp\u003eThe second step in optimizing the GRS model was to test the independent contributions of rare variants. 
To identify genes underlying HM, we performed rare-variant burden tests for 12,600 individuals in the MAGIC cohort using four methods (i.e., Fisher's exact test [FET], Burden, SKAT and SKAT-O). Using a MAF threshold of 0.1%, we detected 651 gene-phenotype associations with PTVs and 1,481 associations with D-mis at a GC-corrected FET \u003cem\u003eP\u003c/em\u003e-value of 0.1. We observed a positive correlation between variant pathogenicity and ORs of risk genes for HM under different cut-off \u003cem\u003eP\u003c/em\u003e-values (Supplementary Fig.\u0026nbsp;11). Given the higher heritability and strong effect size of rare deleterious variants in the MAGIC cohort, we reasoned that combining the cvGRS with rare variants may effectively identify individuals at high risk for HM. Here we proposed a complementary rvGRS, based on a weighted sum of rare deleterious variants from HM-associated genes. To construct the model, we first fitted a logistic regression model to HM on the rare PTVs and D-mis in associated genes in the training subset of 12,600 individuals. We then evaluated the predictive power of the rvGRS on the 5,400 MAGIC individuals that had been withheld for validation. We observed the best performance of the rvGRS for PTVs (AUC\u0026thinsp;=\u0026thinsp;0.698) and D-mis (AUC\u0026thinsp;=\u0026thinsp;0.772) based on HM-associated genes selected with a FET \u003cem\u003eP\u003c/em\u003e-value of 0.1 (Supplementary Figs.\u0026nbsp;12\u0026ndash;14). Then, we compared rare-variant association study (RVAS) and rvGRS results between PTVs and D-mis. For matched significance thresholds, only 4.3% of the HM-associated genes identified by RVAS overlapped between PTVs and D-mis (Supplementary Fig.\u0026nbsp;15). We further stratified the population according to rvGRS decile for PTVs and found a striking gradient with respect to the rvGRS for D-mis (Supplementary Fig.\u0026nbsp;15). 
Therefore, we derived an rvGRS to predict HM by integrating HM-associated genes carrying PTVs and D-mis (AUC\u0026thinsp;=\u0026thinsp;0.786) (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eb).\u003c/p\u003e \u003cp\u003eWe assessed the predictive power of the rvGRS and the corresponding cvGRS, as well as a combination of the two, on the MAGIC validation dataset of 5,400 individuals. A higher cvGRS was observed in the top decile of the rvGRS (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ec). Although the rvGRS underperformed for average phenotype prediction, we found that it may outperform the cvGRS for identifying individuals at risk extremes (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ed). Therefore, we combined the rare- and common-variant GRS models into a unified model (exome-wide genetic risk score, ExGRS) and obtained a significant improvement in genetic risk prediction for HM. The unified ExGRS performed best with a prediction AUC of 0.897, compared with 0.786 and 0.895 for the rare-variant and common-variant GRSs alone, respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ee). 
Consistent with the AUC, the inclusion of the rvGRS enhanced HM risk prediction and improved case-control discrimination: the risk of HM for predicted cases was 5.73-fold higher than for the predicted controls, which is higher than for the cvGRS (4.99-fold) and the rvGRS (2.40-fold) (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003ef).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003ePortability of ExGRSs and validation in both independent cohorts\u003c/h2\u003e \u003cp\u003eHaving derived and validated a new polygenic predictor that considerably outperformed earlier scores, we explored the predictive power of the ExGRS on HM in an independent testing dataset of 1,219 Han Chinese individuals. We found that the ExGRS exhibited highly reproducible performance, with an AUC of 0.856 and an OR per s.d. of 3.51 (95% CI: 3.05\u0026ndash;4.07, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;1.31 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;65\u003c/sup\u003e) (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ea and Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The inclusion of the rvGRS risk genotype considerably enhanced HM risk prediction in the MAGIC cohort, substantially improving tail cutoff discrimination. Compared to the remaining 95% of individuals, the risk for HM among the top 5% of individuals was approximately 9.95-fold higher in the model without rvGRS and 15.57-fold higher in the model with rvGRS (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). 
The effects of the GRS with and without the rvGRS in the MAGIC cohort are depicted in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003eb.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eThe performance metrics of the GRS in the testing cohorts.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\"\u0026times;\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModels\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOR per s.d. 
(95% CI), \u003cem\u003eP\u003c/em\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAUC\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003ePRS threshold\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eOR (95% CI), \u003cem\u003eP\u003c/em\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003ePrevalence of HM\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ecvGRS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026times;\" colname=\"c2\"\u003e \u003cp\u003e3.74 (3.19\u0026ndash;4.44), 3.27\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;55\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.819\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 20% versus other 80%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.58 (6.43\u0026ndash;14.68), 5.70\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;41\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.86\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 10% versus other 90%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e10.93 (5.92\u0026ndash;22.05), 7.11\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;23\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e \u003cp\u003e0.90\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 5% versus other 95%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.95 (4.24\u0026ndash;28.50), 1.94\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;11\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.90\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 2% versus other 98%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e7.19 (2.13\u0026ndash;37.86), 2.4\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 1% versus other 99%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e5.05 (1.07\u0026ndash;47.61), 0.037\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.83\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e 
\u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ervGRS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026times;\" colname=\"c2\"\u003e \u003cp\u003e2.24 (1.99\u0026ndash;2.54), 4.92\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;38\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.759\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 20% versus other 80%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.21 (6.21\u0026ndash;14.04), 5.33\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;40\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.86\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 10% versus other 90%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.96 (5.50-19.54), 7.25\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;22\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.89\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 5% versus other 95%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e9.95 (4.24\u0026ndash;28.50), 
1.94\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;11\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.90\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 2% versus other 98%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e7.19 (2.13\u0026ndash;37.86), 2.4\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 1% versus other 99%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e11.15 (1.61\u0026ndash;480.32), 0.006\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.91\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eExGRS\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\"\u0026times;\" colname=\"c2\"\u003e \u003cp\u003e3.51 (3.05\u0026ndash;4.07), 1.31\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;65\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.856\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 20% versus other 80%\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e12.45 (8.07\u0026ndash;19.86), 3.58\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;47\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.89\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 10% versus other 90%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e15.13 (7.59\u0026ndash;34.31), 3.74\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;26\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.92\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 5% versus other 95%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e15.57 (5.70\u0026ndash;59.48), 1.47\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;13\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.93\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 2% versus other 98%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c5\"\u003e \u003cp\u003e7.19 (2.13\u0026ndash;37.86), 2.4\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.87\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTop 1% versus other 99%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e11.15 (1.61\u0026ndash;480.32), 0.006\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.91\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eNext, we evaluated the robustness of the ExGRS in the 8,682 UK Biobank European-ancestry individuals. Although there was a significant between-population correlation of allelic effects (i.e., logOR) for variants clumped with different cut-off \u003cem\u003eP-\u003c/em\u003evalues (Supplementary Fig.\u0026nbsp;16), we detected significant differences in the ExGRS across ancestries (Wilcoxon rank sum test, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;2.20\u0026times;10\u003csup\u003e\u0026minus;\u0026thinsp;16\u003c/sup\u003e). We then tested the final ExGRS in the UKB European cohort. Predictive models based on the SNPs and HM-associated genes overlapping between MAGIC and UKB, fitted with age, sex and population structure, were predictive of HM (versus all non-HM controls) with an AUC of 0.657, similar to the 0.662 obtained with the cvGRS only (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ec). 
Overall, these observations are consistent with the results above, which indicated that, in aggregate, rare variants explain less genetic heritability than common variants in the UKB European population. The combined ExGRS model resulted in OR per s.d. = 1.46, 95% CI\u0026thinsp;=\u0026thinsp;1.41\u0026ndash;1.52 and \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2.35 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;82\u003c/sup\u003e, which is lower than for the cvGRS model (OR per s.d. = 1.78, 95% CI\u0026thinsp;=\u0026thinsp;1.69\u0026ndash;1.88, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2.14 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;105\u003c/sup\u003e) (Supplementary Table\u0026nbsp;4). In contrast to the MAGIC Han Chinese ancestry cohort, the inclusion of the rvGRS in the ExGRS decreased risk prediction performance in the UKB European cohort (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003ed). Therefore, the modeled risk in European-ancestry individuals was entirely attributable to the cvGRS.\u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn our study, we estimated the heritability of HM captured by both rare and common variants in unrelated individuals from two distinct ancestry cohorts. We identified additional variance attributable to rare variants, particularly rare protein-altering variants in low LD with other genomic variants, beyond what was captured by common HapMap3 variants. Our estimates largely, though not entirely, recovered the heritability estimated from pedigree data, in particular for the Han Chinese ancestry cohort but less so for the European ancestry cohort. The remaining gap could be due to a combination of sampling variance and remaining causal variants that are not captured by the WES data. 
Based on the high heritability of HM, we described a systematic approach to derive and validate the ExGRS, incorporating information from rare to common genetic variants, to predict polygenic susceptibility to HM. Our study demonstrated that the extreme tail of the ExGRS risk distribution (top 5%) conferred an approximately 15-fold increased risk for HM in the Han Chinese population. Additionally, we tested the ExGRS in participants across two ancestries and found that the top 5% of the ExGRS risk distribution conferred an approximately 2-fold increased risk for HM in European ancestry, which is lower than the 3.67-fold increase for cvGRS.\u003c/p\u003e \u003cp\u003eBeyond enhanced disease screening of asymptomatic individuals, other potential applications of the ExGRS may include improved risk stratification of potential schoolchildren or enhanced assessment of early-onset myopia. Our results stress an urgent need to test the individual ExGRS in this setting to better assess its impact on the risk of pathological myopia as well as other HM complications. The ability to quantify inborn susceptibility using ExGRS is likely to be generalizable across a broad range of complex diseases, contingent upon the availability of a large discovery WES dataset, independent validation and testing datasets, and the heritability of a given disease explained by rare and common variants. Predictive power is likely to further improve in the coming years due to larger WES and WGS discovery studies and improved computational algorithms that integrate functional genomics annotation, variant-variant interactions, and rare large-effect variants into the predictive model.\u003c/p\u003e \u003cp\u003eWe note that the extremes of both the cvGRS and rvGRS distributions (top 5%) each conferred a 10.0-fold greater predisposition to HM than the remainder of the population. 
Consistent with the higher heritability of rare variants, a higher risk was observed in the rvGRS model (OR\u0026thinsp;=\u0026thinsp;11.15, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.006) than in the cvGRS model (OR\u0026thinsp;=\u0026thinsp;5.05, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.037) for the top 1% versus the other 99%. Although the combined ExGRS model substantially improved the prediction performance, this model showed incomplete penetrance, in that not all carriers manifest HM. This observation is consistent with recent PGS studies combining common and rare variants across a broad range of complex diseases, including coronary artery disease, atrial fibrillation, type 2 diabetes, inflammatory bowel disease, kidney disease and breast cancer\u003csup\u003e20,34\u0026ndash;37\u003c/sup\u003e. Additional studies of large unascertained populations are needed to determine whether a larger effect size for rvGRS can be found among adults, and the extent to which a favorable polygenic background can explain the absence of HM noted among many mutation carriers. Because our score is based on an ExWAS and RVAS for HM in Han Chinese ancestry, the allelic effect estimates are heavily biased toward the Han Chinese participants. We used variants with concordant direction-of-effect between MAGIC and UKB to improve the trans-ethnic performance of the score, and further enhanced the model by including the rvGRS model. We demonstrated that rvGRS has an additive effect with cvGRS and significantly improves case-control discrimination in the Han Chinese cohort. 
However, since allele frequencies, linkage disequilibrium patterns, and effect sizes of polymorphisms vary by ancestry, the specific ExGRS here will not have optimal predictive power for European ethnic groups.\u003c/p\u003e \u003cp\u003eAlthough the average refractive error has increased substantially across multiple populations, the variability within a given population has also increased, suggesting that an increasingly myopiagenic environment may have led to a preferential \u0026ldquo;unmasking\u0026rdquo; of inherited susceptibility in those with the highest genetic risk\u003csup\u003e12,38,39\u003c/sup\u003e. For example, prior studies suggest that the effects of education, metabolism, near work and time outdoors on refractive error are most pronounced in individuals with a genetic predisposition\u003csup\u003e40\u0026ndash;43\u003c/sup\u003e. The ability to identify high-risk individuals from the time of birth may facilitate targeted strategies for HM prevention with increased effect or cost-effectiveness. The ExGRS permits identification of individuals who inherit high susceptibility, from birth and before clinical disease manifests itself. Careful study of individuals at the extremes of an ExGRS distribution might uncover new causal risk factors or underlying disease pathways. Similarly, clinical and multi-omic profiling of individuals at the extremes of an ExGRS distribution for HM may reveal the contributions and molecular correlates of pathways related to ocular development\u003csup\u003e44\u003c/sup\u003e, neurotransmission\u003csup\u003e45\u003c/sup\u003e, and scleral remodeling\u003csup\u003e46\u003c/sup\u003e and might enable the identification of clinically relevant subtypes of severe myopia that most benefit from a given pharmacologic or behavioral intervention.\u003c/p\u003e \u003cp\u003eSeveral important limitations of this work need to be discussed. 
First, we are most limited by the lack of large-scale WES for HM in multiethnic populations, as well as the small size of existing cohorts that could be used to optimize performance in Europeans and Asians. The assumption of fixed allelic effects across different ancestry groups is likely inaccurate because many disease-related lifestyle factors and environmental exposures related to ancestry could modify allelic effects. Accordingly, the overall tail discrimination of the score was lower in European than in Han Chinese ancestry cohorts, with notably lower sensitivity for the top 5% GRS cutoff. Although it is not possible to overcome this limitation in the present study, our ExGRS approach could be refined by including larger WES studies for HM once available in the future. Second, the performance comparisons between different ancestral groups could be biased by differences in genotyping platforms and the ascertainment methods employed by various biobanks. For example, the UKB represents a population-based cohort recruiting European participants in the 40\u0026ndash;69 age group, while the MAGIC case-control cohorts are ascertained in schoolchildren. The inclusion of older participants in UKB testing cohorts might lead to misclassification of some cases due to the age-related refractive error decline, resulting in risk underestimation for cohorts consisting of older participants.\u003c/p\u003e \u003cp\u003eIn summary, this study highlighted the importance of rare variants in addressing the current gap in heritability of various traits or diseases, by using WES data. In this study, we derived, optimized and validated a new tool, ExGRS, for HM prediction across ancestries. The variants uncovered by cvGRS and rvGRS had additive effects on HM, resulting in nearly a 15-fold increased risk for HM among individuals in the highest 5% of the risk score distribution. 
This result underscores the value of genetic risk scores that incorporate rare variants, which may provide higher prediction accuracy for many polygenic diseases. A potential implication of the ExGRS is that it can identify at-risk individuals before the disease or trait has manifested. With the cost of WES no longer prohibitive, a population-based genetic screening approach for common eye diseases may prove to be a cost-effective public health strategy. While our study marks the initial step in this direction, prospective studies are warranted to evaluate the performance of this approach in clinical practice and analyze its cost-effectiveness.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eOverview of the High Myopia Sequencing Consortium Cohort\u003c/h2\u003e \u003cp\u003eThe Myopia Associated Genetics and Intervention Consortium (MAGIC) is a large-scale genomic consortium integrating myopia cohorts and sequencing data from many investigators. Over the past several years, MAGIC has collected samples at the Eye Hospital of Wenzhou Medical University (Zhejiang Eye Hospital) through the Institute of Biomedical Big Data\u003csup\u003e4\u003c/sup\u003e. We recruited approximately ten thousand Chinese schoolchildren with high myopia aged from 6 to 18 years from MAGIC. The analysis presented here is based on 21,227 unrelated human samples collected from epidemiological studies of myopia.\u003c/p\u003e \u003cp\u003eUK Biobank (UKB) is a large-scale biomedical database and research resource, containing genetic and health records from half a million individuals aged 40 to 69 years in the United Kingdom\u003csup\u003e47\u003c/sup\u003e. A total of 488,000 participants were genotyped for 805,426 markers on the UK BiLEVE Axiom array and UK Biobank Axiom array. 
UKB measured refractive error of 130,494 participants by non-cycloplegic autorefraction using a Tomey RC-5000 AutoRefractor Keratometer.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eQuality Control\u003c/h2\u003e \u003cp\u003eThe sample quality control (QC) and variant QC procedures from our previous study\u003csup\u003e31\u003c/sup\u003e were applied to the MAGIC and UKB cohorts in this study. We first selected the samples with available phenotypes and retained only high-quality variants that passed GATK Variant Quality Score Recalibration (VQSR) and were located outside of low-complexity regions. Genotypes with a genotype depth (DP)\u0026thinsp;\u0026lt;\u0026thinsp;10 and genotype quality (GQ)\u0026thinsp;\u0026lt;\u0026thinsp;20 and heterozygous genotype calls with an allele balance\u0026thinsp;\u0026gt;\u0026thinsp;0.8 or \u0026lt;\u0026thinsp;0.2 were set as missing. We then excluded variants with genotype missingness rate\u0026thinsp;\u0026gt;\u0026thinsp;0.05, Hardy-Weinberg equilibrium (HWE) test \u003cem\u003eP\u003c/em\u003e value\u0026thinsp;\u0026lt;\u0026thinsp;10\u003csup\u003e\u0026minus;\u0026thinsp;6\u003c/sup\u003e or a minor allele count (MAC)\u0026thinsp;\u0026lt;\u0026thinsp;3 using PLINK v.1.9\u003csup\u003e48\u003c/sup\u003e. Only individuals of East Asian (EAS) and European ancestry, as classified by a random forest algorithm with 1000 Genomes data, were retained. At the end of all the QC steps, we retained 12,000 unrelated individuals of Han Chinese ancestry and 8,682 of European ancestry.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eVariant Annotation\u003c/h2\u003e \u003cp\u003eThe annotation of variants was performed with Ensembl\u0026rsquo;s Variant Effect Predictor (VEP v.99) for human genome assembly GRCh37. 
We used the VEP\u003csup\u003e28\u003c/sup\u003e to generate additional bioinformatic predictions of variant deleteriousness (Supplementary Tables\u0026nbsp;5\u0026ndash;6). Protein-coding variants were annotated into the following four classes: (\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e) synonymous; (\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e) benign missense; (\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e) damaging missense; and (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e) protein-truncating variants (PTVs). Benign missense variants were those predicted as \u0026ldquo;benign\u0026rdquo; and \u0026ldquo;tolerated\u0026rdquo; by PolyPhen-2 and SIFT, respectively, with a combined annotation dependent depletion (CADD) score\u0026thinsp;\u0026lt;\u0026thinsp;15. Damaging missense variants were those predicted as \u0026ldquo;probably damaging\u0026rdquo; and \u0026ldquo;deleterious\u0026rdquo; by PolyPhen-2 and SIFT, respectively, with CADD\u0026thinsp;\u0026gt;\u0026thinsp;15. Finally, PTVs were classified as \u0026ldquo;frameshift_variant\u0026rdquo;, \u0026ldquo;splice_acceptor_variant\u0026rdquo;, \u0026ldquo;splice_donor_variant\u0026rdquo;, \u0026ldquo;stop_gained\u0026rdquo;, or \u0026ldquo;start_lost\u0026rdquo; variants.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eAssociation Test\u003c/h2\u003e \u003cp\u003eWe conducted single-variant association analyses using MLMA-LOCO\u003csup\u003e25\u003c/sup\u003e. The test statistics obtained via linear regression were inflated because of the population differentiation caused by genetic drift. Post hoc correction approaches, such as \u0026ldquo;Genomic Control\u0026rdquo;, were used to correct the inflation\u003csup\u003e49\u003c/sup\u003e. 
For the exome-wide association study, we first tested each variant, regardless of allele frequency, for HM associations; we applied a significance level of \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;4.3 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;7\u003c/sup\u003e for all variants\u003csup\u003e50\u003c/sup\u003e. To determine whether a single gene was enriched in or depleted of rare protein-coding variants in HM cases, we performed four gene-level association tests including Fisher\u0026rsquo;s exact test, burden, SKAT and SKAT-O, with previously defined covariates (sample sex, PC1-PC10).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eHeritability Estimation\u003c/h2\u003e \u003cp\u003eIn each WES dataset, we stratified SNPs into 4 MAF bins (0.0001\u0026thinsp;\u0026lt;\u0026thinsp;MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.0010, 0.001\u0026thinsp;\u0026lt;\u0026thinsp;MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.010, 0.01\u0026thinsp;\u0026lt;\u0026thinsp;MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.10 and 0.1\u0026thinsp;\u0026lt;\u0026thinsp;MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.5). For each of the 22 autosomes, we calculated the LD score of each variant with the others on a sliding window of 10 Mb using GCTA software\u003csup\u003e25\u003c/sup\u003e. Each of the four MAF bins was divided into two more bins, one for variants with LD scores above the median value of the variants in the bin (high-LD bin) and one for variants with LD score below the median (low-LD bin) (Supplementary Table\u0026nbsp;1). 
We then used GCTA to perform a GREML-LDMS analysis on HM in each dataset with either 20 PCs calculated from HM3 SNPs or 160 PCs (20 PCs computed from each of the 8 MAF/LD bins) fitted as fixed covariates.\u003c/p\u003e \u003cp\u003eUsing variant annotations and the LD and MAF bins defined from the GREML-LDMS analysis on the WES data mentioned above, we further separated the low-LD and high-LD variants in the 0.0001\u0026thinsp;\u0026lt;\u0026thinsp;MAF\u0026thinsp;\u0026lt;\u0026thinsp;0.01 range into four bins according to their predicted variant effects: PTV, D-mis, B-mis and Synonymous. We then ran a GREML-LDMS analysis with 8 genetic relationship matrices (GRMs), fitting the 160 PCs shown to capture the effect of population stratification as well as fixed covariates in MAGIC and UKB. To compute the variance explained per SNP, we divided the estimate of variance explained for each bin by the number of variants in the bin. The s.e. was obtained by dividing the s.e. of the estimated variance explained for the bin by the number of variants in the bin. We estimated burden heritability for rare variants using BHR (v.0.1.0), which is implemented in R and whose source code is publicly available on GitHub (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://github.com/ajaynadig/bhr\u003c/span\u003e\u003cspan address=\"https://github.com/ajaynadig/bhr\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). To compute the effect-size variance explained per gene, we divided the estimate of burden heritability for each bin by the number of variants in the bin.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eGRS Design\u003c/h2\u003e \u003cp\u003eWe derived cvGRS, rvGRS and ExGRS in the 12,000 unrelated individuals of Han Chinese ancestry from MAGIC. 
For cvGRS derivation, we first generated 20 pruning and thresholding (P\u0026thinsp;+\u0026thinsp;T) scores over a range of \u003cem\u003eP\u003c/em\u003e value (1.0, 0.5, 0.05, 5 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e and 5 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;6\u003c/sup\u003e) and r\u003csup\u003e2\u003c/sup\u003e (0.2, 0.4, 0.6, and 0.8) thresholds. We also computed 7 candidate cvGRS using the LDpred2 algorithm\u003csup\u003e32\u003c/sup\u003e across the following range of rho (fraction of causal variants): 1.00, 1.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;1\u003c/sup\u003e, 1.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;2\u003c/sup\u003e, 1.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e, 3.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;1\u003c/sup\u003e, 3.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;2\u003c/sup\u003e and 3.00 \u0026times; 10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e. Additionally, the lassosum2 computational algorithm\u003csup\u003e33\u003c/sup\u003e was used to generate a candidate GRS for HM. Each of the scores derived above was subsequently assessed for discrimination of HM cases from controls in the MAGIC validation dataset (2,697 cases and 2,703 controls) after adjustment for age, sex and 160 PCs of ancestry. The score with the best performance was defined by the maximal area under the receiver operating characteristic curve (AUC) and the largest fraction of variance explained. AUC confidence intervals were calculated using the \u0026lsquo;pROC\u0026rsquo; package within R.\u003c/p\u003e \u003cp\u003eWe aimed to assess whether adding the rvGRS enhanced HM risk prediction. We constructed the rvGRS from the results of the rare variant burden tests. 
These were conducted per gene, and each gene had separate thresholds for association with HM (\u003cem\u003eP\u003c/em\u003e value) and pathogenicity (PTV, D-mis, B-mis and Synonymous variant) established in the training group. rvGRS models were constructed by fitting logistic regression models to HM on the rare variants (AF\u0026thinsp;\u0026lt;\u0026thinsp;1%) in significantly associated genes. A unified GRS model, ExGRS, was also constructed, which summed the rare- and common-variant GRS models per individual. We tested the ExGRS for association with HM in this dataset.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e \u003ch2\u003eStatistical Analysis within the Testing Dataset\u003c/h2\u003e \u003cp\u003eFor HM, the ExGRS with the best discriminative capacity in the testing dataset was calculated in the testing dataset of 1,219 participants in MAGIC and 8,682 participants in UKB. The proportion of the population and of HM individuals with a given magnitude of increased risk was determined by comparing progressively more extreme tails of the distribution with the remainder of the population. Logistic regression models were used for predicting case-control status with adjustment for age, sex, and PCs of ancestry using the glm function in R. We used the pROC R package to calculate the AUC. We also expressed the effect of the standardized risk score as ORs (with 95% CIs) per s.d. unit of the control standard-normalized risk score distribution in each of the testing cohorts. We examined the risk score discrimination at tail cutoffs corresponding to the top 20, 10, 5, 2 and 1% of the GRS distribution by deriving the ORs of disease for each tail of the distribution compared to all other individuals in each cohort. 
Statistical analyses were conducted using R version 4.2.1 software (The R Foundation).\u003c/p\u003e "},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eData availability\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIndividual-level data are not publicly available due to ethical and legal restrictions related to the Wenzhou Medical University. VCF files have been deposited to Genome Variation Map (GVM) in BIG Data Center (http://bigd.big.ac.cn/gsa), Beijing Institute of Genomics (BIG), Chinese Academy of Sciences and are publicly available as of the date of publication.\u0026nbsp;Accession numbers are listed in the key resources table.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThis paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the National Natural Science Foundation of China (U20A20364 and 81830027) and the Zhejiang Provincial Key Research and Development Program Grant (2021C03102) to J. Qu; the National Natural Science Foundation of China (82172882) to J. 
Su.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsortia\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe members of the Myopia Associated Genetics and Intervention Consortium (MAGIC) are Jianzhong Su, Jian Yuan, Liangde Xu, Shilai Xing, Yinghao Yao, Fukun Chen, Kai Li, Zhengbo Xue, Yaru Zhang, Ji Zhang, Hui Liu, Dandan Fan, Guosi Zhang, Hong Wang, Meng Zhou, Hao Chen, Fan Lyu, Gang An, Yuanchao Xue, Jian Yang, Jia Qu, Zhenhui Chen, Yunlong Ma, Yichun Xiong,\u0026nbsp;Xinting Liu, Nan Wu, Jie Sun, Jinhua Bao, Liang Xu, Ling Li, Liang Ye, Jun Jiang, Xinjie Mao, Xinping Yu, Xiaoming Huang, Jingjing Xu, Miaomiao Li, Xuemei Zhang, Liang Hu, Zhuopao Zuo, Wanqing Jin, Jiawei Zhou, Yuwen Wang, Xue Li, Fang Hou, Yukuan Huang, Fei Qiu, Yijun Zhou, Na Gao, Xinyu Wang, Xinrui Shi, Yuchun Deng, Xiaoguang Yu, Yu Bai, Chenghao Li, Lu Chen, Ke Li, Lijun Dai, Xiangyi Yu, Peng Lin, Jingting Zhao, Congcong Yan, Siqi Bao, Zicheng Zhang, Fangjie Guo, Hongchen Han, Shen Wang, Haojun Sun, Siyi Jiang, Wei Dai, Hengte Kong, Xiaoyan Lu, Jing Li, Liansheng Li, Siyu Wang.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe study was conceived, designed and supervised by J.S., X.Y. and L.Q. Analysis of data was performed by J.Y., R.Q., H.S. and W.D. Patient sample recruitment was conducted by members of the Myopia Associated Genetics and Intervention Consortium. The manuscript was written by J.Y. with contributions from all other authors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing financial interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eYu, X., Yuan, J., Chen, Z.J., Li, K., Yao, Y., Xing, S., Xue, Z., Zhang, Y., Peng, H., and An, G. (2023). Whole-Exome Sequencing Among School-Aged Children With High Myopia. 
JAMA Network Open \u003cem\u003e6\u003c/em\u003e, e2345821-e2345821.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMorgan, I.G., Ohno-Matsui, K., and Saw, S.-M. (2012). Myopia. The Lancet \u003cem\u003e379\u003c/em\u003e, 1739\u0026ndash;1748.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaw, S.M., Gazzard, G., Shih-Yen, E.C., and Chua, W.H. (2005). Myopia and associated pathological complications. Ophthalmic and Physiological Optics \u003cem\u003e25\u003c/em\u003e, 381\u0026ndash;391.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXu, L., Ma, Y., Yuan, J., Zhang, Y., Wang, H., Zhang, G., Tu, C., Lu, X., Li, J., and Xiong, Y. (2021). COVID-19 Quarantine Reveals That Behavioral Changes Have an Effect on Myopia Progression. Ophthalmology.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYou, Q.S., Wu, L.J., Duan, J.L., Luo, Y.X., Liu, L.J., Li, X., Gao, Q., Wang, W., Xu, L., and Jonas, J.B. (2014). Prevalence of myopia in school children in greater Beijing: the Beijing Childhood Eye Study. Acta ophthalmologica \u003cem\u003e92\u003c/em\u003e, e398-e406.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWong, Y.-L., and Saw, S.-M. (2016). Epidemiology of pathologic myopia in Asia and worldwide. The Asia-Pacific Journal of Ophthalmology \u003cem\u003e5\u003c/em\u003e, 394\u0026ndash;402.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLopes, M.C., Andrew, T., Carbonaro, F., Spector, T.D., and Hammond, C.J. (2009). Estimating heritability and shared environmental effects for refractive error in twin and family studies. Investigative ophthalmology \u0026amp; visual science \u003cem\u003e50\u003c/em\u003e, 126\u0026ndash;131.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuggenheim, J.A., Kirov, G., and Hodson, S.A. (2000). The heritability of high myopia: a reanalysis of Goldschmidt's data. 
Journal of Medical Genetics \u003cem\u003e37\u003c/em\u003e, 227\u0026ndash;231.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVerhoeven, V.J., Hysi, P.G., Wojciechowski, R., Fan, Q., Guggenheim, J.A., H\u0026ouml;hn, R., MacGregor, S., Hewitt, A.W., Nag, A., and Cheng, C.-Y. (2013). Genome-wide meta-analyses of multiancestry cohorts identify multiple new susceptibility loci for refractive error and myopia. Nature genetics \u003cem\u003e45\u003c/em\u003e, 314\u0026ndash;318.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTedja, M., Wojciechowski, R., Hysi, P., Eriksson, N., Furlotte, N., Verhoeven, V., Iglesias, A., Meester-Smoor, M., Tompson, S., Fan, Q., et al. (2018). Genome-wide association meta-analysis highlights light-induced signaling as a driver for refractive error. Nature genetics \u003cem\u003e50\u003c/em\u003e, 834\u0026ndash;848. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41588-018-0127-7\u003c/span\u003e\u003cspan address=\"10.1038/s41588-018-0127-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHysi, P., Choquet, H., Khawaja, A., Wojciechowski, R., Tedja, M., Yin, J., Simcoe, M., Patasova, K., Mahroo, O., Thai, K., et al. (2020). Meta-analysis of 542,934 subjects of European ancestry identifies new genes and mechanisms predisposing to refractive error and myopia. Nature genetics \u003cem\u003e52\u003c/em\u003e, 401\u0026ndash;407. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41588-020-0599-0\u003c/span\u003e\u003cspan address=\"10.1038/s41588-020-0599-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMorgan, I.G., Wu, P.-C., Ostrin, L.A., Tideman, J.W.L., Yam, J.C., Lan, W., Baraas, R.C., He, X., Sankaridurg, P., and Saw, S.-M. (2021). 
IMI risk factors for myopia. Investigative ophthalmology \u0026amp; visual science \u003cem\u003e62\u003c/em\u003e, 3\u0026ndash;3.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTran-Viet, K., Powell, C., Barathi, V., Klemm, T., Maurer-Stroh, S., Limviphuvadh, V., Soler, V., Ho, C., Yanovitch, T., Schneider, G., et al. (2013). Mutations in SCO2 are associated with autosomal-dominant high-grade myopia. American journal of human genetics \u003cem\u003e92\u003c/em\u003e, 820\u0026ndash;826. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ajhg.2013.04.005\u003c/span\u003e\u003cspan address=\"10.1016/j.ajhg.2013.04.005\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJin, Z., Wu, J., Huang, X., Feng, C., Cai, X., Mao, J., Xiang, L., Wu, K., Xiao, X., Kloss, B., et al. (2017). Trio-based exome sequencing arrests de novo mutations in early-onset high myopia. Proceedings of the National Academy of Sciences of the United States of America \u003cem\u003e114\u003c/em\u003e, 4219\u0026ndash;4224. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1073/pnas.1615970114\u003c/span\u003e\u003cspan address=\"10.1073/pnas.1615970114\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHosoda, Y., Yoshikawa, M., Miyake, M., Tabara, Y., Shimada, N., Zhao, W., Oishi, A., Nakanishi, H., Hata, M., Akagi, T., et al. (2018). CCDC102B confers risk of low vision and blindness in high myopia. Nature communications \u003cem\u003e9\u003c/em\u003e, 1782. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41467-018-03649-3\u003c/span\u003e\u003cspan address=\"10.1038/s41467-018-03649-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAldahmesh, M., Khan, A., Alkuraya, H., Adly, N., Anazi, S., Al-Saleh, A., Mohamed, J., Hijazi, H., Prabakaran, S., Tacke, M., et al. (2013). Mutations in LRPAP1 are associated with severe myopia in humans. American journal of human genetics \u003cem\u003e93\u003c/em\u003e, 313\u0026ndash;320. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ajhg.2013.06.002\u003c/span\u003e\u003cspan address=\"10.1016/j.ajhg.2013.06.002\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSu, J., Yuan, J., Xu, L., Xing, S., Sun, M., Yao, Y., Ma, Y., Chen, F., Jiang, L., and Li, K. (2022). Sequencing of 19,219 exomes identifies a low-frequency variant in FKBP5 promoter predisposing to high myopia in a Han Chinese population. medRxiv.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHao, L., Kraft, P., Berriz, G.F., Hynes, E.D., Koch, C., Korategere V Kumar, P., Parpattedar, S.S., Steeves, M., Yu, W., and Antwi, A.A. (2022). Development of a clinical polygenic risk score assay and reporting workflow. Nature medicine \u003cem\u003e28\u003c/em\u003e, 1006\u0026ndash;1013.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWray, N.R., Lin, T., Austin, J., McGrath, J.J., Hickie, I.B., Murray, G.K., and Visscher, P.M. (2021). From basic science to clinical application of polygenic risk scores: a primer. 
JAMA psychiatry \u003cem\u003e78\u003c/em\u003e, 101\u0026ndash;109.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhera, A.V., Chaffin, M., Aragam, K.G., Haas, M.E., Roselli, C., Choi, S.H., Natarajan, P., Lander, E.S., Lubitz, S.A., and Ellinor, P.T. (2018). Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nature genetics \u003cem\u003e50\u003c/em\u003e, 1219\u0026ndash;1224.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMojarrad, N.G., Plotnikov, D., Williams, C., and Guggenheim, J.A. (2020). Association Between Polygenic Risk Score and Risk of Myopia. JAMA Ophthalmology \u003cem\u003e138\u003c/em\u003e, 7\u0026ndash;13.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTideman, J., P\u0026auml;rssinen, O., Haarman, A., Khawaja, A., Wedenoja, J., Williams, K., Biino, G., Ding, X., K\u0026auml;h\u0026ouml;nen, M., Lehtim\u0026auml;ki, T., et al. (2021). Evaluation of Shared Genetic Susceptibility to High and Low Myopia and Hyperopia. JAMA ophthalmology \u003cem\u003e139\u003c/em\u003e, 601\u0026ndash;609. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1001/jamaophthalmol.2021.0497\u003c/span\u003e\u003cspan address=\"10.1001/jamaophthalmol.2021.0497\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eClark, R., Lee, S.S.-Y., Du, R., Wang, Y., Kneepkens, S.C., Charng, J., Huang, Y., Hunter, M.L., Jiang, C., and Tideman, J.W.L. (2023). A new polygenic score for refractive error improves detection of children at risk of high myopia but not the prediction of those at risk of myopic macular degeneration. EBioMedicine \u003cem\u003e91\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKassam, I., Foo, L.-L., Lanca, C., Xu, L., Hoang, Q.V., Cheng, C.-Y., Hysi, P., and Saw, S.-M. (2022). 
The potential of current polygenic risk scores to predict high myopia and myopic macular degeneration in multi-ethnic Singapore adults. Ophthalmology.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYang, J., Lee, S.H., Goddard, M.E., and Visscher, P.M. (2011). GCTA: a tool for genome-wide complex trait analysis. The American Journal of Human Genetics \u003cem\u003e88\u003c/em\u003e, 76\u0026ndash;82.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWainschtein, P., Jain, D., Zheng, Z., Cupples, L., Shadyab, A., McKnight, B., Shoemaker, B., Mitchell, B., Psaty, B., Kooperberg, C., et al. (2022). Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nature genetics \u003cem\u003e54\u003c/em\u003e, 263\u0026ndash;273. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41588-021-00997-7\u003c/span\u003e\u003cspan address=\"10.1038/s41588-021-00997-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZeng, J., De Vlaming, R., Wu, Y., Robinson, M.R., Lloyd-Jones, L.R., Yengo, L., Yap, C.X., Xue, A., Sidorenko, J., and McRae, A.F. (2018). Signatures of negative selection in the genetic architecture of human complex traits. Nature genetics \u003cem\u003e50\u003c/em\u003e, 746\u0026ndash;753.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcLaren, W., Gil, L., Hunt, S.E., Riat, H.S., Ritchie, G.R., Thormann, A., Flicek, P., and Cunningham, F. (2016). The ensembl variant effect predictor. Genome biology \u003cem\u003e17\u003c/em\u003e, 1\u0026ndash;14.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWeiner, D.J., Nadig, A., Jagadeesh, K.A., Dey, K.K., Neale, B.M., Robinson, E.B., Karczewski, K.J., and O\u0026rsquo;Connor, L.J. (2023). Polygenic architecture of rare coding variation across 394,783 exomes. 
Nature \u003cem\u003e614\u003c/em\u003e, 492\u0026ndash;499.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTorkamani, A., Wineinger, N.E., and Topol, E.J. (2018). The personal and clinical utility of polygenic risk scores. Nature Reviews Genetics \u003cem\u003e19\u003c/em\u003e, 581\u0026ndash;590.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSu, J., Yuan, J., Xu, L., Xing, S., Sun, M., Yao, Y., Ma, Y., Chen, F., Jiang, L., and Li, K. (2023). Sequencing of 19,219 exomes identifies a low-frequency variant in FKBP5 promoter predisposing to high myopia in a Han Chinese population. Cell Reports \u003cem\u003e42\u003c/em\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePriv\u0026eacute;, F., Arbel, J., and Vilhj\u0026aacute;lmsson, B.J. (2020). LDpred2: better, faster, stronger. Bioinformatics \u003cem\u003e36\u003c/em\u003e, 5424\u0026ndash;5431.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMak, T.S.H., Porsch, R.M., Choi, S.W., Zhou, X., and Sham, P.C. (2017). Polygenic scores via penalized regression on summary statistics. Genetic epidemiology \u003cem\u003e41\u003c/em\u003e, 469\u0026ndash;480.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFiziev, P.P., McRae, J., Ulirsch, J.C., Dron, J.S., Hamp, T., Yang, Y., Wainschtein, P., Ni, Z., Schraiber, J.G., and Gao, H. (2023). Rare penetrant mutations confer severe risk of common diseases. Science \u003cem\u003e380\u003c/em\u003e, eabo1131.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhan, A., Turchin, M.C., Patki, A., Srinivasasainagendra, V., Shang, N., Nadukuru, R., Jones, A.C., Malolepsza, E., Dikilitas, O., and Kullo, I.J. (2022). Genome-wide polygenic score to predict chronic kidney disease across ancestries. 
Nature Medicine \u003cem\u003e28\u003c/em\u003e, 1412\u0026ndash;1420.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDornbos, P., Koesterer, R., Ruttenburg, A., Nguyen, T., Cole, J.B., Consortium, A.-T.D.-G., Leong, A., Meigs, J.B., Florez, J.C., and Rotter, J.I. (2022). A combined polygenic score of 21,293 rare and 22 common variants improves diabetes diagnosis based on hemoglobin A1C levels. Nature genetics \u003cem\u003e54\u003c/em\u003e, 1609\u0026ndash;1614.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhera, A.V., Chaffin, M., Wade, K.H., Zahid, S., Brancale, J., Xia, R., Distefano, M., Senol-Cosar, O., Haas, M.E., and Bick, A. (2019). Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell \u003cem\u003e177\u003c/em\u003e, 587\u0026ndash;596. e589.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMorgan, I., and Rose, K. (2005). How genetic is school myopia? Progress in retinal and eye research \u003cem\u003e24\u003c/em\u003e, 1\u0026ndash;38.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMorgan, I.G., French, A.N., Ashby, R.S., Guo, X., Ding, X., He, M., and Rose, K.A. (2018). The epidemics of myopia: aetiology and prevention. Progress in retinal and eye research \u003cem\u003e62\u003c/em\u003e, 134\u0026ndash;149.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan, Q., Guo, X., Tideman, J.W.L., Williams, K.M., Yazar, S., Hosseini, S.M., Howe, L.D., Pourcain, B.S., Evans, D.M., and Timpson, N.J. (2016). Childhood gene-environment interactions and age-dependent effects of genetic variants associated with refractive error and myopia: The CREAM Consortium. Scientific reports \u003cem\u003e6\u003c/em\u003e, 25853.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEnthoven, C.A., Tideman, J.W.L., Polling, J.R., Tedja, M.S., Raat, H., Iglesias, A.I., Verhoeven, V.J., and Klaver, C.C. (2019). 
Interaction between lifestyle and genetic susceptibility in myopia: the Generation R study. European journal of epidemiology \u003cem\u003e34\u003c/em\u003e, 777\u0026ndash;784.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWojciechowski, R., Yee, S.S., Simpson, C.L., Bailey-Wilson, J.E., and Stambolian, D. (2013). Matrix metalloproteinases and educational attainment in refractive error: Evidence of gene\u0026ndash;environment interactions in the Age-Related Eye Disease Study. Ophthalmology \u003cem\u003e120\u003c/em\u003e, 298\u0026ndash;305.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFan, Q., Verhoeven, V.J., Wojciechowski, R., Barathi, V.A., Hysi, P.G., Guggenheim, J.A., H\u0026ouml;hn, R., Vitart, V., Khawaja, A.P., and Yamashiro, K. (2016). Meta-analysis of gene\u0026ndash;environment-wide association scans accounting for education level identifies additional loci for refractive error. Nature communications \u003cem\u003e7\u003c/em\u003e, 11008.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWallman, J., and Winawer, J. (2004). Homeostasis of eye growth and the question of myopia. Neuron \u003cem\u003e43\u003c/em\u003e, 447\u0026ndash;468.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTroilo, D., Smith, E.L., Nickla, D.L., Ashby, R., Tkatchenko, A.V., Ostrin, L.A., Gawne, T.J., Pardue, M.T., Summers, J.A., and Kee, C.-s. (2019). IMI\u0026ndash;Report on experimental models of emmetropization and myopia. Investigative ophthalmology \u0026amp; visual science \u003cem\u003e60\u003c/em\u003e, M31-M88.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu, H., Chen, W., Zhao, F., Zhou, Q., Reinach, P.S., Deng, L., Ma, L., Luo, S., Srinivasalu, N., and Pan, M. (2018). Scleral hypoxia is a target for myopia control. 
Proceedings of the National Academy of Sciences \u003cem\u003e115\u003c/em\u003e, E7091-E7100.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBycroft, C., Freeman, C., Petkova, D., Band, G., Elliott, L.T., Sharp, K., Motyer, A., Vukcevic, D., Delaneau, O., O'Connell, J., et al. (2018). The UK Biobank resource with deep phenotyping and genomic data. Nature \u003cem\u003e562\u003c/em\u003e, 203\u0026ndash;209. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1038/s41586-018-0579-z\u003c/span\u003e\u003cspan address=\"10.1038/s41586-018-0579-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePurcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A., Bender, D., Maller, J., Sklar, P., De Bakker, P.I., and Daly, M.J. (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. The American journal of human genetics \u003cem\u003e81\u003c/em\u003e, 559\u0026ndash;575.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDevlin, B., and Roeder, K. (1999). Genomic control for association studies. Biometrics \u003cem\u003e55\u003c/em\u003e, 997\u0026ndash;1004.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSveinbjornsson, G., Albrechtsen, A., Zink, F., Gudjonsson, S.A., Oddson, A., M\u0026aacute;sson, G., Holm, H., Kong, A., Thorsteinsdottir, U., and Sulem, P. (2016). Weighting sequence variants based on their annotation increases power of whole-genome association studies. 
Nature genetics \u003cem\u003e48\u003c/em\u003e, 314\u0026ndash;317.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"nature-portfolio","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"","title":"Nature Portfolio","twitterHandle":"","acdcEnabled":false,"dfaEnabled":false,"editorialSystem":"ejp","reportingPortfolio":"","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"high myopia, whole-exome sequencing, heritability, genetic risk score","lastPublishedDoi":"10.21203/rs.3.rs-4188555/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4188555/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eHigh myopia (HM), characterized by severe myopic refractive error, stands as a leading cause to visual impairment and blindness globally. HM is a multifactorial ocular disease and presents high heterogeneity in genetics. Employing a genetic risk score (GRS) is useful for capturing genetic susceptibility to HM. Incorporating rare variations into GRS assessment, though presents methodological challenges, yields significant benefits. This study enrolled two independent cohorts: 12,000 unrelated individuals of Han Chinese ancestry from Myopia Associated Genetics and Intervention Consortium (MAGIC) and 8,682 individuals of European ancestry from UK Biobank (UKB). 
Using whole-exome sequencing (WES) data, we first estimated the heritability of HM at 0.53 (standard error, 0.06) in the MAGIC cohort and 0.21 (standard error, 0.10) in the UKB cohort. In the MAGIC cohort, rare variants in low linkage disequilibrium (LD) with neighboring variants were enriched for heritability, particularly rare deleterious protein-altering variants. We therefore generated, optimized and validated an exome-wide genetic risk score (ExGRS) for HM prediction by combining rare risk genotypes with a common-variant GRS (cvGRS). ExGRS improved the AUC for HM from 0.819 (cvGRS) to 0.856. Individuals in the top 5% of ExGRS had a 15.57-fold (95% CI, 5.70-59.48) higher risk of developing HM than the remaining 95% of individuals in the MAGIC cohort, and a 2.03-fold (95% CI, 1.65-2.49) higher risk in UKB. Our study implies that rare variants are a major source of the missing heritability of HM in Han Chinese ancestry. ExGRS also provides enhanced accuracy for HM prediction, shedding new light on research and clinical practice.\u003c/p\u003e","manuscriptTitle":"ExGRS: exome-wide genetic risk score to predict high myopia across multi-ancestry populations","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-06-04 18:55:12","doi":"10.21203/rs.3.rs-4188555/v1","editorialEvents":[],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"communications-medicine","isNatureJournal":true,"hasQc":false,"allowDirectSubmit":false,"externalIdentity":"commsmed","sideBox":"Learn more about [Communications Medicine](http://www.nature.com/commsmed)","snPcode":"43856","submissionUrl":"https://mts-commsmed.nature.com/cgi-bin/main.plex","title":"Communications Medicine","twitterHandle":"@commsmedicine","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"ejp","reportingPortfolio":"Communications 
Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"02800efb-8e7f-4499-929d-3e840cb7cd2d","owner":[],"postedDate":"June 4th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":32090271,"name":"Health sciences/Diseases/Eye diseases"},{"id":32090272,"name":"Biological sciences/Genetics/Population genetics"}],"tags":[],"updatedAt":"2024-12-31T08:09:03+00:00","versionOfRecord":{"articleIdentity":"rs-4188555","link":"https://doi.org/10.1038/s43856-024-00718-1","journal":{"identity":"communications-medicine","isVorOnly":false,"title":"Communications Medicine"},"publishedOn":"2024-12-30 05:00:00","publishedOnDateReadable":"December 30th, 2024"},"versionCreatedAt":"2024-06-04 18:55:12","video":"","vorDoi":"10.1038/s43856-024-00718-1","vorDoiUrl":"https://doi.org/10.1038/s43856-024-00718-1","workflowStages":[]},"version":"v1","identity":"rs-4188555","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4188555","identity":"rs-4188555","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

