Expanding the Genetic Landscape of Craniofacial Anomalies Through Transcriptome-Wide Association Studies

Research Square. Research Article

Elly Brokamp, Alexandra Scalici, Tyne Miller-Fleming, David Wu, and 4 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7645057/v1

This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Background

Craniofacial anomalies are common congenital anomalies that significantly contribute to infant mortality and life-long health problems. Studies of craniofacial anomalies have identified several genetic causes, but focus on rare, Mendelian presentations. Despite this, current diagnostic genetic testing only identifies a causal genomic variant in ~25% of affected individuals. This low diagnostic yield for Mendelian conditions may relate to oligogenic and polygenic risks for craniofacial anomalies.
In this study we sought to use large electronic health record systems, which include many patients with craniofacial anomalies, to determine whether we could identify patterns of genetic associations with craniofacial anomalies and known associated genes.

Methods

We performed transcriptome-wide association studies that evaluated the association between genetically predicted gene expression and craniofacial anomalies in two cohorts: Vanderbilt University Medical Center’s BioVU and the Electronic Medical Records and Genomics Network (eMERGE). Using a list of 391 previously identified craniofacial anomaly-associated genes, we determined whether there was a greater proportion of significant associations with these genes than with others. We also evaluated whether these genes were associated with other congenital anomalies.

Results

We determined that the predicted expression of 12 (3.1%) of the known craniofacial anomaly genes was associated with craniofacial anomalies (p < 0.05) in BioVU and 18 (4.6%) in eMERGE. In both cohorts, the majority of significant genes and those demonstrating the strongest significance were not previously associated with craniofacial anomalies. In total, we identified 53 genes not previously associated with craniofacial anomalies. Interestingly, fewer than 15% of the known craniofacial associated genes were associated with craniofacial anomalies (p < 0.05), while 262 (76.8%) were associated with congenital anomalies of the heart, 133 (39.0%) with anomalies of the nervous system, and 142 (41.6%) with anomalies of the urinary system.

Conclusions

Our results support that both rare and common variation in Mendelian disease-associated genes may contribute to craniofacial anomalies and that these genes are broadly involved in congenital anomaly development.
Keywords: Craniofacial anomalies, transcriptome-wide association studies, congenital anomalies, electronic health records

Background

Craniofacial anomalies (CFAs) are a common group of congenital anomalies (CAs) caused by the abnormal development of skull and/or facial bones. The most common CFAs, cleft lip with or without cleft palate (CL/P) and cleft palate alone, make up one third of all CAs in the United States. They occur in ~16 in 10,000 births within the United States and are a major contributor to infant mortality. 1 Likewise, craniosynostosis, another common CFA, is estimated to affect 1 in 2100–2500 births and can result in abnormal brain growth causing neurological dysfunction. 2 The frequency of CFAs, their contribution to infant mortality, and their associated life-long health problems make understanding the genetic underpinnings of CFA risk and presentation essential.

Despite a great deal of research into the genetic etiology of CFAs, clinical genetic testing for CFAs has a relatively low diagnostic yield. There is little understanding of what drives the variable expression and incomplete penetrance of CFA syndromes. Combined diagnostic approaches of karyotype, chromosomal microarray, and exome sequencing can detect the genetic cause in 22.5% of individuals with orofacial clefts. 3 , 4 Similarly, comprehensive clinical diagnostic genetic testing of individuals with craniosynostosis can detect a genetic cause in about a quarter (25%) of affected individuals. 5 These results suggest that, at best, for every four patients with a CFA who access comprehensive genetic testing, one will receive a diagnostic result for the genetic cause of their condition. Even when a diagnostic test identifies a causal variant in a Mendelian gene, there is often incomplete penetrance and/or variable expressivity, suggesting the presence of complex genomic and environmental interactions that impact phenotypic manifestations.
As sequencing technologies have advanced and sample sizes have increased, several studies have examined how polygenic variation impacts the penetrance of rare variants associated with CFAs such as cleft lip and palate. 6 , 7 To better understand how polygenicity impacts CFAs, large sample sizes are essential. Resources such as Vanderbilt University Medical Center’s BioVU and the Electronic Medical Records and Genomics (eMERGE) network cohort, which link genotype data to electronic health record (EHR) data, could contribute to these efforts. 8 , 9 We therefore undertook the first use of these resources to evaluate the genetically predicted changes in gene expression of CFA-associated Mendelian genes as well as the genetically predicted gene expression of individuals with CFAs.

Methods

Identification of Individuals with Craniofacial and Other Congenital Anomalies

We used the phecodeX version of phecodes, which are systematic groupings of International Classification of Diseases (ICD) billing codes, and specifically the CM (congenital malformation) chapter of phecodes, to identify individuals with CAs. 10 The Tongue, Mouth, and Pharynx parent code (CM_754) and the Skull, Face, and Jaw parent code (CM_755) within the CM chapter were used to identify individuals with CFAs. Figure 1A demonstrates how ICD9 and ICD10 billing codes are collapsed into one phecode. All available billing codes from an individual’s entire medical history are used in the construction of phecodes. We mapped the CM phecodes from the ICD code data using the PheWAS R package. 11

Identification in Vanderbilt University Medical Center

Vanderbilt University Medical Center’s EHR system provides a unique resource with a wealth of de-identified health information in the synthetic derivative (SD). The SD contains ~3.5 million individuals, with ~300k individuals having genetic information in BioVU, VUMC’s de-identified EHR-linked biobank.
We identified 6,131 individuals with CAs and CFAs from Vanderbilt University Medical Center (VUMC)’s de-identified EHR-linked DNA biobank, BioVU. 8 , 12 This work is deemed non-human subjects work by the VUMC IRB and received all necessary approvals. We restricted the cases and controls used in analyses to a medical home population (n = 1,275,576), a high medical use population defined as individuals with an ICD-9 or ICD-10 billing code collected from at least three unique visit dates over three or more years. This medical home definition ensures a study population with substantial phenotypic information within their EHR. All included participants in BioVU were genotyped using the Illumina MEGAEX Array, and the included population was restricted to participants of European genetic ancestry based on clustering in principal component analysis (PCA) using genetic data from the 1000 Genomes Project as reference populations, as previously described. 13 This restriction was done to minimize confounding and maximize sample size. Future work to include individuals of non-European genetic ancestry is ongoing.

Identification in eMERGE

We identified individuals with CFAs from the Electronic Medical Records and Genomics (eMERGE) Network, which combines DNA biorepositories with EHR data. 9 , 14 The eMERGE GWAS cohort includes 64,536 individuals from five institutions (VUMC, Columbia University Irving Medical Center, Northwestern Medical Center, Mass General Brigham, and Cincinnati Children’s Hospital Medical Center (CCHMC)). Because 75% of CFA cases in eMERGE were from CCHMC, we restricted our secondary analysis to participants of European genetic ancestry from CCHMC in the eMERGE GWAS cohort. ICD billing codes for each individual are provided as part of the data available to eMERGE consortium researchers. We mapped the CM phecodes from the ICD code data using the PheWAS R package as previously described. 11

Identifying Genes with Known CFA Associations

We curated a list of genes associated with CFAs, using previously published genome-wide association study (GWAS) results and the Concert Genetics’ registry of clinical genetic tests. 15 , 16 This testing registry aggregates all clinically available genetic panels, lists all the genes included in each panel, and allows for comparison of included genes between different companies’ panels. These panels reflect a largely comprehensive list of genes with a known monogenic CFA association. A certified genetic counselor searched Concert Genetics’ Test Registry using the search term “craniofacial” and compiled a list of all the unique genes that are offered on the resulting Craniofacial Panel Tests. These panels included both syndromic and non-syndromic CFA genes. Supplemental Table 1 lists the curated known CFA-associated genes.

Defining CFA cases and controls

To identify individuals in BioVU and eMERGE who have a CFA, we used phecodes from the “Tongue Mouth and Pharynx” and the “Skull Face and Jaw” phecodeX chapters. 10 We identified individuals who had at least two instances of a CFA phecode of the same parent code and/or family head code (Supplemental Table 2). Figure 1B demonstrates the breakdown of the parent code, family head code, and specific code portions of a phecode. We defined controls as individuals who did not have a single congenital anomaly phecode in their record.

Defining CA cases and controls

To define whether individuals in BioVU and eMERGE had CAs, we used phecodes from the congenital malformations phecodeX chapter. 10 For our cases, we identified individuals with at least two instances of a specific CA phecode (Fig. 1B, Supplemental Table 3). Controls were defined as individuals with no CA phecodes in their record.
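The CFA case and control definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the `(phecode, visit_date)` event layout, the reading of "instance" as a distinct code-and-date pair, and the example codes are all hypothetical; only the parent codes CM_754 and CM_755, the two-instance rule, and the "no CA phecode" control rule come from the text.

```python
from collections import Counter

# Parent codes named in the Methods for craniofacial anomalies (CFAs).
CFA_PARENT_CODES = ("CM_754", "CM_755")


def parent_code(phecode):
    """Return the parent portion of a phecode, e.g. 'CM_755.12' -> 'CM_755'."""
    return phecode.split(".", 1)[0]


def classify_cfa(events, min_instances=2):
    """Classify one individual from (phecode, visit_date) events.

    Assumption: an 'instance' is a distinct (phecode, date) pair.
    Returns 'case', 'control', or 'excluded' (neither definition is met).
    """
    counts = Counter()
    has_any_ca = False
    for code, _date in set(events):  # drop exact duplicate rows
        if code.startswith("CM_"):   # any congenital malformation phecode
            has_any_ca = True
            if parent_code(code) in CFA_PARENT_CODES:
                counts[parent_code(code)] += 1
    if any(n >= min_instances for n in counts.values()):
        return "case"
    if not has_any_ca:
        return "control"
    return "excluded"


# Illustrative records only; the specific child codes are made up.
same_parent = [("CM_755.1", "2020-01-01"), ("CM_755.2", "2021-03-04")]
split_parents = [("CM_754.1", "2020-01-01"), ("CM_755.2", "2021-03-04")]
no_ca_codes = [("CV_401", "2019-06-30")]

print(classify_cfa(same_parent))    # case
print(classify_cfa(split_parents))  # excluded
print(classify_cfa(no_ca_codes))    # control
```

Individuals with a single CFA code, or with only non-craniofacial CA codes, fall into neither group under these rules, which is why the sketch returns a third label rather than forcing a binary outcome.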
Transcriptome Wide Association Study of Individuals with CFAs

Using machine learning models such as PrediXcan, UTMOST, and Joint Tissue Imputation (JTI), we calculated genetically predicted gene expression (GPGE) using GTEx (version 8) as a reference population in both BioVU and eMERGE. 17 , 18 , 19 , 20 Using the single best performing model out of these three models for each gene-tissue pair (r² > 0.01), we conducted a transcriptome-wide association study (TWAS). We tested the association of CFA status with GPGE in a logistic regression model adjusting for age, sex, number of visits, and the first ten principal components of ancestry.

Association of Craniofacial Anomaly Genes with Other Congenital Anomalies

To determine how variation in GPGE of known CFA genes may increase risk for other CAs, we conducted gene-based phenotype-wide association studies (PheWAS) of the known CFA genes. 21 , 22 , 23 Of the 391 known CFA genes, we had quality prediction for 341 (r² > 0.1). We tested whether a diagnosis of a CA is associated with GPGE of a known CFA gene, adjusting for age, sex, the first ten principal components of ancestry, and number of visits in a logistic regression model.

Results

Identification of individuals with craniofacial and other congenital anomalies within the EHR

Within the entire VUMC medical home population, including individuals without genotype data, there are 19,509 individuals with a CFA. About a fifth of these individuals, 4,051 (21.1%), have a second CA in another organ system (Fig. 1C). At VUMC, there are 694 individuals of European ancestry with a CFA who have available genotype information (248 individuals with a “Tongue, Mouth, and Pharynx” phecode CA and 446 individuals with a “Skull, Face, and Jaw” phecode CA).
From the eMERGE GWAS cohort individuals of European ancestry at CCHMC, there are 384 individuals with a CFA (113 individuals with a Tongue, Mouth, and Pharynx phecode CA and 322 individuals with a Skull, Face, and Jaw phecode CA) (Table 1).

Table 1 Demographics of craniofacial cases and controls

                      Craniofacial anomalies   Controls        Total
BioVU (n)             635                      42,810          43,445
  EHR-reported sex
    Male              352 (55.4)               17,497 (40.9)   17,849 (41.1)
    Female            283 (44.6)               25,313 (59.1)   25,596 (58.9)
  Age, years          27.1 (23.4)              58.0 (21.1)     57.5 (21.5)
  Number of visits    126.0 (135.0)            61.4 (65.5)     62.4 (67.5)
eMERGE (n)            384                      3,575           3,959
  EHR-reported sex
    Male              159 (41.4)               2,014 (56.4)    2,239 (56.6)
    Female            225 (58.6)               1,559 (43.6)    1,718 (43.4)
  Age, years          17.2 (5.3)               25.2 (7.9)      24.4 (8.1)

Data are presented as number (%) for categorical or median (standard deviation) for continuous variables.

TWAS of individuals with CFAs identifies genes not previously associated with CFAs

Our TWAS of CFAs in both BioVU and eMERGE did not identify any statistically significant associations that passed the highly conservative Bonferroni multiple testing correction (0.05 divided by the number of genes with GPGE per tissue). When using a less stringent p-value threshold, we identified 1,261 genes in BioVU and 1,260 genes in eMERGE that were significantly associated (p < 0.05). We compared these genes in both study populations to our curated list of known CFA genes and found that the majority of our curated CFA genes (93.6%) demonstrate no level of significant association, even at a permissive p < 0.05 level, with CFAs in either BioVU or eMERGE (Fig. 2A). In total, fewer than 1% of significant genes in either BioVU or eMERGE were part of the curated previously known CFA-associated gene list. This included 11 significant genes from BioVU (0.90%) and 14 (1%) from eMERGE (Table 2). No significant gene with a known CFA association was shared between BioVU and eMERGE.
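The thresholding and cross-cohort comparison described in this section can be illustrated with a short Python sketch. It is a hedged illustration only: the function names are hypothetical, the gene labels are placeholders, and the p-values are made up rather than taken from the study. It shows how a Bonferroni cutoff (0.05 divided by the number of tests) compares with the permissive p < 0.05 cutoff, and how genes significant in both cohorts but absent from a known-gene list might be extracted.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff: alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_genes(pvalues, threshold):
    """Genes whose p-value falls strictly below the threshold."""
    return {gene for gene, p in pvalues.items() if p < threshold}


def shared_novel_genes(cohort_a, cohort_b, known_genes, threshold):
    """Genes significant in both cohorts that are not on the known list."""
    shared = significant_genes(cohort_a, threshold) & significant_genes(cohort_b, threshold)
    return shared - set(known_genes)


# Made-up p-values for placeholder genes; these are NOT results from the study.
biovu = {"GENE_A": 0.003, "GENE_B": 0.04, "GENE_C": 0.20, "GENE_D": 0.01}
emerge = {"GENE_A": 0.02, "GENE_B": 0.30, "GENE_C": 0.01, "GENE_D": 0.049}
known = {"GENE_D"}

strict = bonferroni_threshold(len(biovu))  # 0.05 / 4 tests in this toy example

print(sorted(shared_novel_genes(biovu, emerge, known, 0.05)))    # ['GENE_A']
print(sorted(shared_novel_genes(biovu, emerge, known, strict)))  # []
```

In the toy data, one placeholder gene survives the permissive cutoff in both cohorts while none survives the Bonferroni cutoff, mirroring the qualitative pattern reported above.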
Table 2 Number of significant results in BioVU and eMERGE craniofacial (CFA) transcriptome-wide association study (TWAS)

Study site   Any gene (p < 0.05)   Any gene (p < 0.001)   Known CFA gene (p < 0.05)   Known CFA gene (p < 0.001)
BioVU        1,261                 231                    11                          1
eMERGE       1,260                 257                    14                          4

Additionally, we compared the 1,261 and 1,260 significant associations (p < 0.05) in each study population and found 53 significant gene associations that were shared between the BioVU and eMERGE TWAS results (Fig. 2B). These 53 genes were not part of the curated gene list and were not identified previously as associated with CFAs or craniofacial structure. By using a more stringent p-value threshold (p < 0.001), we identified 231 genes in BioVU and 257 genes in eMERGE associated with CFAs (Table 2). From this more stringent cutoff we identified two genes, VAV1 (*164875) (BioVU p = 0.009; eMERGE p = 0.006) and CYP3A7 (*605340) (BioVU p = 0.009; eMERGE p = 0.009), that are shared between both cohorts (Fig. 2B). Neither of these genes has previously been associated with CFAs or any human disease, but both have been implicated in development. 24 , 25

CA-wide association study illustrates that known CFA genes are associated with a broad range of CAs across multiple organ systems

Because CFAs often co-occur with other CAs, we sought to evaluate whether the curated CFA-associated gene list demonstrated more significant associations with other CAs. To do this we tested for the association between the GPGE of 341 CFA-associated genes from the curated list and any CA phecode in both the BioVU and eMERGE study populations. If common variation in these genes, which were identified from clinical testing panels for CFAs, were specific to craniofacial development, we would expect to find an enrichment of craniofacial phenotypes in the skull, face, and jaw as well as the mouth, tongue, and pharynx phecodeX chapters. While we identified a few significant associations with these phecodes at the least stringent significance threshold (p < 0.05) (Fig. 3A and B), the most significant associations for any of the CFA-associated genes were identified in other organ system CAs. This analysis illustrates that a broad range of phenotypes spanning multiple organ systems are associated with common variation in CFA-associated genes (Fig. 4).

Among our gene-based results, we found that the GPGE of GLI2 (*165230) is associated with 18 different CA phecodes in BioVU (p < 0.05). These include significant associations with CA phecodes from four organ systems (13 heart, 1 eye, 1 musculoskeletal, and 1 respiratory), as well as 2 situs inversus CA phecodes (Fig. 5, Table 3).

Table 3 Congenital anomaly phecodes associated with genetically predicted expression of GLI2

Organ system      CA phecode    Beta    Standard error   p-value
Eye               CM_751.113    0.495   0.240            0.039
Heart             CM_763.212    0.331   0.090            2.291 x 10^-4
Heart             CM_763.2      0.145   0.040            2.505 x 10^-4
Heart             CM_763.232    0.186   0.053            5.033 x 10^-4
Heart             CM_763.23     0.158   0.047            7.654 x 10^-4
Heart             CM_763.21     0.234   0.071            9.777 x 10^-4
Heart             CM_763.11     0.622   0.216            4.059 x 10^-3
Heart             CM_763.15     0.187   0.067            0.006
Heart             CM_763.36     0.259   0.102            0.011
Heart             CM_763.231    0.204   0.086            0.018
Heart             CM_763.152    0.256   0.115            0.026
Heart             CM_763.1      0.095   0.043            0.028
Heart             CM_763.8      0.484   0.242            0.046
Heart             CM_763.14     0.138   0.070            0.049
Musculoskeletal   CM_770.1      0.380   0.192            0.048
Respiratory       CM_762.3      0.377   0.155            0.015
Situs Inversus    CM_774.2      0.672   0.186            3.003 x 10^-4
Situs Inversus    CM_774        0.451   0.150            0.003

Discussion

Because most previous genetic studies of CFAs focus on rare variant/monogenic causes of disease, this work investigated the GPGE of individuals with CFAs and how the GPGE of known CFA genes relates to CAs. We performed a TWAS for individuals with a CFA and a CA-wide PheWAS for the GPGE of genes with a known CFA association.
Overall, the results of these analyses suggest that in addition to rare variants, polygenic variation impacting gene expression may contribute to many CAs and may play a role in the penetrance and expressivity of CA syndromes.

As sample sizes for genetic studies have increased, so has our understanding of the complexity of the genetic architecture driving phenotypes such as CFAs, through our expanded understanding of how common variation contributes to the diversity of facial morphology and shape. 15 , 26 , 27 Studies of CFAs and polygenic architecture have identified shared genetic features between cleft lip and palate and the size of facial features, as well as common variation that affects the penetrance of known cleft lip and palate variants in PDGFRA (*173490). 6 , 7 Additionally, researchers have proposed that Mendelian CFAs are extreme phenotypes on a continuum of phenotypic variation in facial morphology and that integrating common variation into the study of these phenotypes is essential to understanding their genetic drivers. 27 , 28 All of these studies leverage GWAS and polygenic scores to examine the contributions of common variation to CFAs. However, using an approach that utilizes GPGE allows us to capture the predicted effects of common variants on gene expression in a TWAS to identify genes whose altered GPGE is associated with CFA status. By conducting a gene-based analysis of CFAs, we can obtain more biologically interpretable results for our CFA associations. Out of the 391 genes known to cause CFAs in a monogenic fashion, over 90% are not significantly associated with CFAs in a transcriptome-wide fashion (p < 0.01), highlighting the gap in knowledge when studying CFAs that present in a Mendelian or syndromic fashion.
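For reference, the per-gene association test described in the Methods (a logistic regression of CFA status on GPGE, adjusted for age, sex, number of visits, and the first ten principal components of ancestry) can be written as the following model. The notation is an assumption introduced here for illustration and does not appear in the original text: i indexes individuals, g indexes a gene-tissue pair, and the beta and gamma coefficients are generic regression parameters. The block assumes the amsmath package for \operatorname.

```latex
\[
\operatorname{logit}\Pr(\mathrm{CFA}_i = 1)
  = \beta_0
  + \beta_g\,\mathrm{GPGE}_{ig}
  + \beta_{\mathrm{age}}\,\mathrm{age}_i
  + \beta_{\mathrm{sex}}\,\mathrm{sex}_i
  + \beta_{\mathrm{vis}}\,\mathrm{visits}_i
  + \sum_{k=1}^{10} \gamma_k\,\mathrm{PC}_{ik}
\]
```

Under this reading, a gene is counted as associated with CFAs when the estimate of the GPGE coefficient differs from zero at the chosen p-value threshold.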
The two genes whose GPGE showed a significant association (p < 0.01) with CFAs in both cohorts, VAV1 (*164875) and CYP3A7 (*605340), have not yet been associated with any human disease, but do have known roles in fetal development. VAV1 (*164875) is a proto-oncogene that is involved with hematopoiesis and T and B cell signaling. The well-established relationship between disrupted oncoprotein signaling, cancer, and congenital anomalies suggests that differential expression of the oncogene VAV1 (*164875) could be driving the development of some CFAs. 29 , 30,24 Additional research on the differential expression of proto-oncogenes in individuals with congenital anomalies could give additional insight into why those with congenital anomalies are more likely to develop cancer. CYP3A7 (*605340) encodes a cytochrome P450 “super family” enzyme, CYP3A7 (*605340), which metabolizes a diverse array of endogenous and exogenous substances, including prescription medications that many pregnant individuals need to maintain their own health such as carbamazepine, diltiazem, caffeine, and nifedipine. CYP3A7 (*605340) is primarily expressed in fetal liver tissue, being detected as early as 50 days of gestation and decreasing in expression until 24 months of age postnatally. 31 , 32 Additionally, CYP3A7 (*605340) plays a key role in the production of a critical pregnancy hormone, estriol, which has been shown to be an important epigenetic modifier in mice fetuses. Variable expression of CYP3A7 (*605340) could have dramatic effects on fetal development, and further research can assess the complex interactions of environmental risks and genetic predispositions to CFAs and other CAs. The observations that only 10% of the previously identified CFA associated genes had a significant GPGE association with CFAs and the strong overlap (21.1%) of individuals with CFAs having other CAs drove us to analyze what CAs were significantly associated with the GPGE of the curated CFA genes. 
These known CFA genes show more significant GPGE associations with other organ system CAs than with CFAs themselves, suggesting that there are shared genetic and environmental susceptibilities across CAs. The many genetic syndromes that contain multiple CAs support the idea of shared risk factors for many types of CAs. Taken together, these results suggest that polygenic variation in CFA-associated genes may relate to developmental changes more broadly and is not necessarily restricted to craniofacial development.

Many genetic syndromes that cause multiple types of CAs demonstrate variable expression and incomplete penetrance, yet the factors causing variable expression and incomplete penetrance are not well understood. For example, ~65% of individuals with 22q11.2 deletion syndrome have a congenital heart defect (CHD) and ~67% have a palate abnormality. 33 , 34 As another example, we noted that the predicted expression of GLI2 (*165230) is significantly (p < 0.05) associated with many congenital heart defects (CHDs), such as congenital pulmonary valve stenosis (p = 0.0002), congenital malformations of heart valves (p = 0.0002), congenital insufficiency of the aortic valve (p = 0.0005), and ten others (Table 3). GLI2 (*165230) is classified as a known CFA gene due to its association with two congenital malformation syndromes, Culler-Jones syndrome and Holoprosencephaly 9. 35 Both syndromes can present with several congenital anomalies, such as cleft lip/palate, microcephaly, and polydactyly, but to date CHDs are not associated with either syndrome. 36 , 37 Yet GLI2 (*165230) does have a well-established role in cardiomyogenesis, and there is a group of individuals with CHDs who have GLI2 (*165230) missense variants shown to dysregulate sonic hedgehog signaling, which is crucial for fetal development. 38 , 39 , 40 The expression of GLI2 (*165230) in the developing heart suggests that CHDs could be a phenotypic expansion for the two GLI2-related congenital anomaly syndromes and warrants examination of its role in cardiac development. Differences in gene expression could be one factor causing the variable expressivity and reduced penetrance that are characteristic of many congenital anomaly syndromes.

One of the main limitations of studying CFAs at biobank scale is that they have a relatively low prevalence and are caused by large-effect rare variants. Both issues affect our statistical power to detect phenotype associations. One way that our analysis attempted to address this limitation is by conducting gene-based analyses and defining our CFA phenotype across multiple phecodes that describe different types of CFAs. Despite using a less stringent p-value threshold, conducting gene-based analyses such as TWAS provides a more interpretable biological unit than single variant analyses. While the analyses in this paper are underpowered, they still provide biologically and clinically meaningful results. 22 , 23 Throughout this study we leveraged a set of known CFA genes compiled from clinical diagnostic testing panels for CFAs and from a GWAS of face shape, which was curated and reviewed by a certified genetic counselor. While we tried to make this gene set as comprehensive as possible, we are limited by which genes currently have a well-established association with CFAs.

The main goal of the work in this study is to better understand the genetic drivers of CFAs. Overall, our results support that both rare and common genetic variants in CFA Mendelian genes may contribute to a variety of CAs and highlight the complexities of the CA phenotypes, suggesting there are shared underlying genetic and environmental risk factors. Further research of CAs through GPGE could help better explain variable presentations and penetrance of CA syndromes.
Conclusions

TWAS of individuals with CFAs in the BioVU and eMERGE cohorts identified relatively few previously identified CFA-associated genes. The two genes whose GPGE had the strongest association in both cohorts have potential roles in the complex genetic drivers of CAs. The predicted expression of genes that have a known Mendelian association with CFAs is more often significantly associated with other types of CAs in the BioVU and eMERGE cohorts. For example, the predicted expression of GLI2, which is associated with a syndrome that can include CFAs, is significantly associated with several CHDs. The results of both analyses suggest that there are overlapping polygenic causes of many types of CAs and that further research may help explain the variability in how CA syndromes can present.

Abbreviations

Craniofacial anomalies (CFAs)
Congenital anomalies (CAs)
Cleft lip and/or palate (CL/P)
Electronic health records (EHR)
International Classification of Diseases (ICD)
Congenital malformation (CM)
Synthetic derivative (SD)
Vanderbilt University Medical Center (VUMC)
Institutional Review Board (IRB)
Principal component analysis (PCA)
Cincinnati Children’s Hospital Medical Center (CCHMC)
Genome-wide association study (GWAS)
Transcriptome-wide association study (TWAS)
Genetically predicted gene expression (GPGE)
Phenotype-wide association study (PheWAS)
Congenital heart defect (CHD)

Declarations

Ethics approval and consent to participate

The data for this study are de-identified and deemed non-human subjects. All approvals were obtained for data use, including those from Vanderbilt University Medical Center’s institutional review board and eMERGE project proposal protocols.

Consent for publication

All authors consent to the publication of the results and manuscript materials.
Availability of data and materials

All data from the primary population at Vanderbilt University Medical Center will be made available by request to the corresponding author pending institutional approval. Requests for data derived from the secondary population, eMERGE, will be handled per consortium requirements.

Competing interests

The authors declare no competing interests.

Funding

The dataset used for the clinical analyses was obtained from the Vanderbilt University Medical Center Synthetic Derivative, which is supported by institutional funding, the 1S10RR025141-01 instrumentation award, and by the CTSA grant UL1TR000445 from the National Center for Advancing Translational Sciences/National Institutes of Health. The phase of the eMERGE Network used in this study was initiated and funded by the NHGRI through the following grants: U01HG008657 (Group Health Cooperative/University of Washington); U01HG008685 (Brigham and Women’s Hospital); U01HG00672 (Vanderbilt University Medical Center); U01HG008666 (Cincinnati Children’s Hospital Medical Center); U01HG006379 (Mayo Clinic); U01HG008679 (Geisinger Clinic); U01HG008680 (Columbia University Health Sciences); U01HG008684 (Children’s Hospital of Philadelphia); U01HG008673 (Northwestern University); U01HG008701 (Vanderbilt University Medical Center serving as the Coordinating Center); U01HG008676 (Partners Healthcare/Broad Institute); U01HG008664 (Baylor College of Medicine); and U54MD007593 (Meharry Medical College). Individual support for this work includes E.B. (NIH NCATS NOT-OD-22-108 CTTSA TL1 5TL1TR002244-07), A.S. (5T32GM080178-15 and 1T32GM145734-01), and M.M.S. (K12HD043483 and P50HD106446).
Authors contributions Conceptualization: E.B., A.S, N.J.C, M.M.S; Data Curation: E.B., A.S, D.W., T.M.F., M.M.S; Formal analysis: E.B., A.S.; Funding acquisition: E.B., A.S, N.J.C, M.M.S; Investigation: E.B., A.S., M.M.S.; Methodology: E.B., A.S, N.J.C, M.M.S; Project Administration: E.B., A.S; Resources: N.J.C, M.M.S; Software: E.B., T.M.F, A.S., M.M.S; Supervision: N.J.C, M.M.S; Validation: E.B., A.S, M.M.S; Visualization: E.B., A.S., M.M.S; Writing- original draft: E.B., A.S., M.M.S.; Writing- review and editing: E.B., A.S, T.M.F, D.W, W.C, M.W., N.J.C., M.M.S Acknowledgements Not applicable References Mai CT, Isenburg JL, Canfield MA, et al. National population-based estimates for major birth defects, 2010-2014. Birth Defects Res . 2019;111(18):1420-1435. doi:10.1002/bdr2.1589 Boulet SL, Rasmussen SA, Honein MA. A population-based study of craniosynostosis in metropolitan Atlanta, 1989-2003. Am J Med Genet A . 2008;146A(8):984-991. doi:10.1002/ajmg.a.32208 Jin H, Yingqiu C, Zequn L, et al. Chromosomal microarray analysis in the prenatal diagnosis of orofacial clefts: Experience from a single medical center in mainland China. Medicine (Baltimore) . 2018;97(34):e12057. doi:10.1097/MD.0000000000012057 Yan S, Fu F, Li R, et al. Exome sequencing improves genetic diagnosis of congenital orofacial clefts. Front Genet . 2023;14:1252823. doi:10.3389/fgene.2023.1252823 Wilkie AOM, Johnson D, Wall SA. Clinical genetics of craniosynostosis. Curr Opin Pediatr . 2017;29(6):622-628. doi:10.1097/MOP.0000000000000542 Yu Y, Alvarado R, Petty LE, et al. Polygenic risk impacts PDGFRA mutation penetrance in non-syndromic cleft lip and palate. Hum Mol Genet . 2022;31(14):2348. doi:10.1093/hmg/ddac037 Howe LJ, Lee MK, Sharp GC, et al. Investigating the shared genetics of non-syndromic cleft lip/palate and facial morphology. PLOS Genet . 2018;14(8):e1007501. doi:10.1371/journal.pgen.1007501 Roden DM, Pulley JM, Basford MA, et al. 
Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clin Pharmacol Ther. 2008;84(3):362-369. doi:10.1038/clpt.2008.89
McCarty CA, Chisholm RL, Chute CG, et al. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics. 2011;4:13. doi:10.1186/1755-8794-4-13
Shuey MM, Stead WW, Aka I, et al. Next-generation phenotyping: introducing phecodeX for enhanced discovery research in medical phenomics. Bioinforma Oxf Engl. 2023;39(11):btad655. doi:10.1093/bioinformatics/btad655
Carroll RJ, Bastarache L, Denny JC. R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinforma Oxf Engl. 2014;30(16):2375-2376. doi:10.1093/bioinformatics/btu197
Danciu I, Cowan JD, Basford M, et al. Secondary use of clinical data: the Vanderbilt approach. J Biomed Inform. 2014;52:28-35. doi:10.1016/j.jbi.2014.02.003
Dennis JK, Sealock JM, Straub P, et al. Clinical laboratory test-wide association scan of polygenic scores identifies biomarkers of complex disease. Genome Med. 2021;13(1):6. doi:10.1186/s13073-020-00820-8
Gottesman O, Kuivaniemi H, Tromp G, et al. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. Genet Med Off J Am Coll Med Genet. 2013;15(10):761-771. doi:10.1038/gim.2013.72
Shaffer JR, Orlova E, Lee MK, et al. Genome-Wide Association Study Reveals Multiple Loci Influencing Normal Human Facial Morphology. PLoS Genet. 2016;12(8):e1006149. doi:10.1371/journal.pgen.1006149
Payment Accuracy for Precision Lab Diagnostics. Concert. Accessed February 16, 2025. https://www.concert.co/
Gamazon ER, Wheeler HE, Shah KP, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat Genet. 2015;47(9):1091-1098. doi:10.1038/ng.3367
Hu Y, Li M, Lu Q, et al. A statistical framework for cross-tissue transcriptome-wide association analysis. Nat Genet.
2019;51(3):568-576. doi:10.1038/s41588-019-0345-7
Zhou D, Jiang Y, Zhong X, Cox NJ, Liu C, Gamazon ER. A unified framework for joint-tissue transcriptome-wide association and Mendelian randomization analysis. Nat Genet. 2020;52(11):1239-1246. doi:10.1038/s41588-020-0706-2
GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science. 2020;369(6509):1318-1330. doi:10.1126/science.aaz1776
Unlu G, Gamazon ER, Qi X, et al. GRIK5 Genetically Regulated Expression Associated with Eye and Vascular Phenomes: Discovery through Iteration among Biobanks, Electronic Health Records, and Zebrafish. Am J Hum Genet. 2019;104(3):503-519. doi:10.1016/j.ajhg.2019.01.017
Unlu G, Qi X, Gamazon ER, et al. Phenome-based approach identifies RIC1-linked Mendelian syndrome through zebrafish models, biobank associations and clinical studies. Nat Med. 2020;26(1):98-109. doi:10.1038/s41591-019-0705-y
Zhong X, Yin Z, Jia G, et al. Electronic health record phenotypes associated with genetically regulated expression of CFTR and application to cystic fibrosis. Genet Med Off J Am Coll Med Genet. 2020;22(7):1191-1200. doi:10.1038/s41436-020-0786-5
Katzav S, Martin-Zanca D, Barbacid M. vav, a novel human oncogene derived from a locus ubiquitously expressed in hematopoietic cells. EMBO J. 1989;8(8):2283-2290. doi:10.1002/j.1460-2075.1989.tb08354.x
Li H, Lampe JN. Neonatal cytochrome P450 CYP3A7: A comprehensive review of its role in development, disease, and xenobiotic metabolism. Arch Biochem Biophys. 2019;673:108078. doi:10.1016/j.abb.2019.108078
Decoding the Human Face: Progress and Challenges in Understanding the Genetics of Craniofacial Morphology - PubMed. Accessed February 20, 2025. https://pubmed.ncbi.nlm.nih.gov/35483406/
Goovaerts S, Hoskens H, Eller RJ, et al. Joint multi-ancestry and admixed GWAS reveals the complex genetics behind human cranial vault shape. Nat Commun. 2023;14(1):7436.
doi:10.1038/s41467-023-43237-8
Vanneste M, Hoskens H, Goovaerts S, et al. Syndrome-informed phenotyping identifies a polygenic background for achondroplasia-like facial variation in the general population. Nat Commun. 2024;15(1):10458. doi:10.1038/s41467-024-54839-1
Castel P, Rauen KA, McCormick F. The duality of human oncoproteins: drivers of cancer and congenital disorders. Nat Rev Cancer. 2020;20(7):383-397. doi:10.1038/s41568-020-0256-z
Zhang R, Alt FW, Davidson L, Orkin SH, Swat W. Defective signalling through the T- and B-cell antigen receptors in lymphoid cells lacking the vav proto-oncogene. Nature. 1995;374(6521):470-473. doi:10.1038/374470a0
Liu J, Kandel SE, Lampe JN, Scott EE. Human cytochrome P450 3A7 binding four copies of its native substrate dehydroepiandrosterone 3-sulfate. J Biol Chem. 2023;299(8):104993. doi:10.1016/j.jbc.2023.104993
Williams JA, Ring BJ, Cantrell VE, et al. Comparative metabolic capabilities of CYP3A4, CYP3A5, and CYP3A7. Drug Metab Dispos Biol Fate Chem. 2002;30(8):883-891. doi:10.1124/dmd.30.8.883
Campbell IM, Sheppard SE, Crowley TB, et al. What is new with 22q? An update from the 22q and You Center at the Children’s Hospital of Philadelphia. Am J Med Genet A. 2018;176(10):2058-2069. doi:10.1002/ajmg.a.40637
Jackson O, Crowley TB, Sharkus R, et al. Palatal evaluation and treatment in 22q11.2 deletion syndrome. Am J Med Genet A. 2019;179(7):1184-1195. doi:10.1002/ajmg.a.61152
Bertolacini C, Ribeiro‐Bicudo L, Petrin A, Richieri‐Costa A, Murray J. Clinical findings in patients with GLI2 mutations – phenotypic variability. Clin Genet. 2012;81(1):70-75. doi:10.1111/j.1399-0004.2010.01606.x
Bertolacini C, Ribeiro‐Bicudo L, Petrin A, Richieri‐Costa A, Murray J. Clinical findings in patients with GLI2 mutations – phenotypic variability. Clin Genet. 2012;81(1):70-75. doi:10.1111/j.1399-0004.2010.01606.x
Blaas H‐GK, Eriksson AG, Salvesen KÅ, et al.
Brains and faces in holoprosencephaly: pre‐ and postnatal description of 30 cases. Ultrasound Obstet Gynecol. 2002;19(1):24-38. doi:10.1046/j.0960-7692.2001.00154.x
Fair JV, Voronova A, Bosiljcic N, Rajgara R, Blais A, Skerjanc IS. BRG1 interacts with GLI2 and binds Mef2c gene in a hedgehog signalling dependent manner during in vitro cardiomyogenesis. BMC Dev Biol. 2016;16(1):27. doi:10.1186/s12861-016-0127-8
Waldron CJ, Kelly LA, Stan N, et al. The HH-GLI2-CKS1B network regulates the proliferation-to-maturation transition of cardiomyocytes. Stem Cells Transl Med. 2024;13(7):678-692. doi:10.1093/stcltm/szae032
Voronova A, Al Madhoun A, Fischer A, Shelton M, Karamboulas C, Skerjanc IS. Gli2 and MEF2C activate each other’s expression and function synergistically during cardiomyogenesis in vitro. Nucleic Acids Res. 2012;40(8):3329-3347. doi:10.1093/nar/gkr1232

Additional Declarations
No competing interests reported.

Supplementary Files
17Sept2025SupplementalTables.xlsx
© Research Square 2026 | ISSN 2693-5015 (online)
{"props":{"pageProps":{"initialData":{"identity":"rs-7645057","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":525627470,"identity":"937bcec4-a343-43d8-85b0-f74897fb3501","order_by":0,"name":"Elly Brokamp","email":"","orcid":"","institution":"Vanderbilt University Medical Center","correspondingAuthor":false,"prefix":"","firstName":"Elly","middleName":"","lastName":"Brokamp","suffix":""},{"id":525627471,"identity":"ec7371fa-5f31-4078-8fd4-802cf1cb97f8","order_by":1,"name":"Alexandra Scalici","email":"","orcid":"","institution":"Vanderbilt University Medical Center","correspondingAuthor":false,"prefix":"","firstName":"Alexandra","middleName":"","lastName":"Scalici","suffix":""},{"id":525627472,"identity":"64b6c5c9-5929-4fae-ba5f-2b26c017b622","order_by":2,"name":"Tyne Miller-Fleming","email":"","orcid":"","institution":"Vanderbilt University Medical Center","correspondingAuthor":false,"prefix":"","firstName":"Tyne","middleName":"","lastName":"Miller-Fleming","suffix":""},{"id":525627473,"identity":"88621f44-4de8-427f-add8-d3465a5e8919","order_by":3,"name":"David Wu","email":"","orcid":"","institution":"Vanderbilt University Medical Center","correspondingAuthor":false,"prefix":"","firstName":"David","middleName":"","lastName":"Wu","suffix":""},{"id":525627474,"identity":"61f82b32-d17c-4fc9-af29-198e5bec8517","order_by":4,"name":"Wendy K. 
Chung","email":"","orcid":"","institution":"Boston Children’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Wendy","middleName":"K.","lastName":"Chung","suffix":""},{"id":525627475,"identity":"5d479c85-a9b3-4f75-b0f7-c1aa0c2a95b2","order_by":5,"name":"Monica H. Wojcik","email":"","orcid":"","institution":"Boston Children’s Hospital","correspondingAuthor":false,"prefix":"","firstName":"Monica","middleName":"H.","lastName":"Wojcik","suffix":""},{"id":525627476,"identity":"e926f20f-25eb-4748-822c-a75f0f0423d3","order_by":6,"name":"Nancy J. Cox","email":"","orcid":"","institution":"Vanderbilt University Medical Center","correspondingAuthor":false,"prefix":"","firstName":"Nancy","middleName":"J.","lastName":"Cox","suffix":""},{"id":525627481,"identity":"805464b1-a90b-4174-8bb2-d537b4a2b107","order_by":7,"name":"Megan M. Shuey","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA/ElEQVRIiWNgGAWjYFADCQgpR7IWC2OStVQkNhBSqNvefvFzAcM2BvnZPYaPeRgk0jfcyE78wFBxzw6XXrMzZ4qlZzDcZjC4c8bYGKgld8ON3M0SDGeKk3FquZGTIM0D0iKRYyYN0jJzRu42Bsa2hGRcDjO7/yb5N0iL/Iwc898gh0kS1HKD/RjYFoYbOWbMQC0J/BIQLXY4tZzJYbPmMbjNY3AjrVhyjoGEYT/P280SCWcSEnBqOX788W2eitty8jOSN354U1Enz8aeu/HDh4oEe1xaGBh4DBgYDBh4QEwmEBsMgFbgiSD2B3Am4w8kcTy2jIJRMApGwQgDAK/XT9mgs0H8AAAAAElFTkSuQmCC","orcid":"","institution":"Vanderbilt University Medical Center","correspondingAuthor":true,"prefix":"","firstName":"Megan","middleName":"M.","lastName":"Shuey","suffix":""}],"badges":[],"createdAt":"2025-09-18 04:23:21","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7645057/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7645057/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":93773544,"identity":"551266d7-5637-48b4-9065-013f3e83dc7c","added_by":"auto","created_at":"2025-10-17 
12:23:05","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":2514283,"visible":true,"origin":"","legend":"","description":"","filename":"17September25maindocument.docx","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/4e3a1aa83671c02cce3aa82d.docx"},{"id":93773547,"identity":"02811c55-242a-4e1e-922f-004c26dfde69","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"json","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":10227,"visible":true,"origin":"","legend":"","description":"","filename":"0cb9c3b429854a0b824a9690353bda01.json","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/47ed56e89a127be89d6a4356.json"},{"id":93773548,"identity":"7a12eef3-67a4-46ba-9c20-68be19041ca0","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":24981,"visible":true,"origin":"","legend":"","description":"","filename":"17Sept2025SupplementalTables.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/8966ad4405700de991504424.xlsx"},{"id":93773531,"identity":"93cba594-7602-4ac3-9090-9afef85dffcc","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"xml","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":121643,"visible":true,"origin":"","legend":"","description":"","filename":"0cb9c3b429854a0b824a9690353bda011enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/b2126c8738255288cd9c57b2.xml"},{"id":93775427,"identity":"d9adf11a-37fa-4d73-91c3-0ecee8797ec2","added_by":"auto","created_at":"2025-10-17 
12:31:05","extension":"png","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":216224,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/bacf788d4f066ceb0a2670c1.png"},{"id":93773532,"identity":"1c7538bc-f42d-406c-b901-d6a534e3f212","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":607782,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/37db7a67fe759b157f1c92ad.png"},{"id":93773540,"identity":"860bb219-595a-4356-a55d-78df551a10ac","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":770010,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/e89b807841ac9f9907958602.png"},{"id":93773541,"identity":"961bc7a1-8bee-48a6-a795-dcf9f4c50185","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"jpeg","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":561709,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/11c4c8de3d6e2b63a9868b49.jpeg"},{"id":93775426,"identity":"1531af5b-7d7e-43fa-8a7e-40a3f7fd15a2","added_by":"auto","created_at":"2025-10-17 
12:31:05","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":431245,"visible":true,"origin":"","legend":"","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/944750eaea922231b43dd0d1.png"},{"id":93775428,"identity":"2ff69fd2-61d8-4a93-b93e-81eae523fb80","added_by":"auto","created_at":"2025-10-17 12:31:05","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":47403,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/ae856130732c77148612192d.png"},{"id":93773543,"identity":"6d9d75e1-1c50-4cf1-be72-54abad4030bf","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":99626,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/3688ed4458454c817484dcda.png"},{"id":93775430,"identity":"8c98a67d-4781-454a-9fe8-a92b20a42e5e","added_by":"auto","created_at":"2025-10-17 12:31:05","extension":"png","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":131408,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/e88173ee62c1d0ca1a19974a.png"},{"id":93777120,"identity":"2810a310-5bcf-4497-81ea-90d75716fa4d","added_by":"auto","created_at":"2025-10-17 
12:39:05","extension":"png","order_by":12,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":119178,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/ec7ad9f06a458833d7fe847e.png"},{"id":93773539,"identity":"4f29fffe-2141-43cf-91de-36447aeebbfe","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":13,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":81275,"visible":true,"origin":"","legend":"","description":"","filename":"Onlinefloatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/b9c4efb1530e7e8c2cb75d1c.png"},{"id":93773536,"identity":"a30b6f92-e909-4580-a57c-62b62a695878","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"xml","order_by":14,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":121157,"visible":true,"origin":"","legend":"","description":"","filename":"0cb9c3b429854a0b824a9690353bda011structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/9c0ad569d2109036543c51d7.xml"},{"id":93773550,"identity":"3faa6e76-795a-4d30-b824-f9069efe6fc7","added_by":"auto","created_at":"2025-10-17 12:23:06","extension":"html","order_by":15,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":130963,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/173672030f92dfa31821138a.html"},{"id":93773530,"identity":"4cfa76c5-b05c-4805-b523-1201a5e903bd","added_by":"auto","created_at":"2025-10-17 12:23:04","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":140449,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCraniofacial Anomaly Phenotyping. 
\u003c/strong\u003e(\u003cstrong\u003eA\u003c/strong\u003e) Demonstrates the phecodeX data structure, e.g., how related International Classification of Diseases (ICD) versions 9 and 10 billing codes are collapsed into a single phecode. Seven ICD9 codes and nine ICD10 codes are collapsed into the single cleft palate with cleft lip phecode (CM_754.11). (\u003cstrong\u003eB\u003c/strong\u003e) Demonstrates the different components of a phecode: the parent code, the family head code, and the entire specific code. Cases for the TWAS were classified as having two or more instances of a phecode within the same family head code. Cases for the congenital anomaly-wide phenotype association were classified as having two or more instances of the same specific code. (\u003cstrong\u003eC\u003c/strong\u003e) In VUMC, 21.1% of individuals with craniofacial anomalies have another congenital anomaly. This bar chart shows the number of individuals with another congenital anomaly on the y-axis and is grouped by the specific organ system of the other congenital anomaly.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/08bb72c50edf93bbdeddbc9b.png"},{"id":93773538,"identity":"b6db7292-a2df-4e9c-8644-7dbd85ae47d5","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":455040,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eBioVU and eMERGE Craniofacial Anomaly (CFA) transcriptome-wide association study (TWAS) results. \u003c/strong\u003eTWAS analyses identify genes not previously associated with CFAs.\u003cstrong\u003e (A)\u003c/strong\u003e Miami-style plot of genes with GPGE associated with CFAs in BioVU and eMERGE. The dotted red lines indicate a p-value of 0.05 and the orange points highlight our curated list of known CFA genes. 
\u003cstrong\u003e(B) \u003c/strong\u003eA heatmap of the 53 genes that were significant (p\u0026lt;0.05) in both BioVU and eMERGE cohorts. The asterisk (*) indicates the two genes associated with CFAs in both BioVU and eMERGE with a p\u0026lt;0.01.\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/b7851ba55b83219b9bb6f759.png"},{"id":93773549,"identity":"fd81dc26-3606-4382-8a14-3a5f9d8c12cb","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":566395,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eKnown Craniofacial Anomaly (CFA) genes associated with CFA phecodes in BioVU. \u003c/strong\u003eWe demonstrate significant associations between known genes and specific phecodes based on the specific phecodeX chapters\u003cstrong\u003e (A) \u003c/strong\u003eSkull, Face, and Jaw and (\u003cstrong\u003eB)\u003c/strong\u003e Tongue, Mouth, and Pharynx chapter. The blue dotted line indicates a p-value of 0.05. Direction of the triangle aligns with the direction of the effect.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/e4c101df2c7f92cf8206aaae.png"},{"id":93773537,"identity":"c6eeccf2-c74e-4953-9e0b-ec801315b84d","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":327068,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAssociation study of genetically predicted gene expression of known craniofacial anomaly (CFA) genes and other congenital anomalies (CA). \u003c/strong\u003eWe demonstrate significant associations between known CFA genes and specific phecodes from congenital malformation phecodeX chapters\u003cstrong\u003e \u003c/strong\u003ebased on organ system. 
The plot is the known CFA gene associations with other congenital anomalies in BioVU. The blue dotted line indicates a p-value of 0.05. Direction of the triangle aligns with the direction of the effect.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/662e061300abefe780bb8f7b.png"},{"id":93773534,"identity":"2cb95f7a-411a-46b7-862f-9a386f6a526e","added_by":"auto","created_at":"2025-10-17 12:23:05","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":343350,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAssociation study of genetically predicted gene expression (GPGE) of known craniofacial anomaly (CFA) genes and other congenital anomalies (CA). \u003c/strong\u003eThe GPGE of \u003cem\u003eGLI2\u003c/em\u003e (*165230) is associated with an enrichment of cardiac CAs. The blue dotted line indicates a p-value of 0.05. Direction of the triangle aligns with the direction of the effect.\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/4da1e37463e2c472b66cf633.png"},{"id":96453284,"identity":"8a589695-f1e4-409c-bf5a-ad3392979a73","added_by":"auto","created_at":"2025-11-21 09:59:05","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2916793,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/8af9f140-37ea-4ad0-bda6-49062a6f0499.pdf"},{"id":93773528,"identity":"7bbd272b-75de-49fa-a7ac-710ab5a128e4","added_by":"auto","created_at":"2025-10-17 
12:23:04","extension":"xlsx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":24981,"visible":true,"origin":"","legend":"","description":"","filename":"17Sept2025SupplementalTables.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7645057/v1/7516436b524b62b884088c20.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Expanding the Genetic Landscape of Craniofacial Anomalies Through Transcriptome-Wide Association Studies","fulltext":[{"header":"Background","content":"\u003cp\u003eCraniofacial anomalies (CFAs) are a common group of congenital anomalies (CAs) caused by the abnormal development of the skull and/or facial bones. The most common CFAs, cleft lip with or without cleft palate (CL/P) and cleft palate alone, make up one third of all CAs in the United States. They occur in ~\u0026thinsp;16 in 10,000 births within the United States and are a major contributor to infant mortality.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e Likewise, craniosynostosis, another common CFA, is estimated to affect 1 in 2100\u0026ndash;2500 births and can result in abnormal brain growth causing neurological dysfunction.\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e The frequency of CFAs, their contribution to infant mortality, and their associated life-long health problems make it essential to understand the genetic underpinnings of CFA risk and presentation.\u003c/p\u003e\u003cp\u003eDespite a great deal of research into the genetic etiology of CFAs, clinical genetic testing for CFAs has a relatively low diagnostic yield. There is little understanding of what drives the variable expression and incomplete penetrance of CFA syndromes. 
Combined diagnostic approaches of karyotype, chromosomal microarray, and exome sequencing can detect a genetic cause in 22.5% of individuals with orofacial clefts.\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e Similarly, comprehensive clinical diagnostic genetic testing of individuals with craniosynostosis can detect a genetic cause in about a quarter (25%) of affected individuals.\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e These results suggest that, at best, for every four patients with a CFA who access comprehensive genetic testing, one will receive a diagnostic result for the genetic cause of their condition. Even when a diagnostic test identifies a causal variant in a Mendelian gene, there is often incomplete penetrance and/or variable expressivity, suggesting the presence of complex genomic and environmental interactions that impact phenotypic manifestations. As sequencing technologies have advanced and sample sizes have increased, several studies have examined how polygenic variation impacts the penetrance of rare variants associated with CFAs such as cleft lip and palate.\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003eTo better understand how polygenicity impacts CFAs, large sample sizes are essential. 
Resources such as Vanderbilt University Medical Center\u0026rsquo;s BioVU and the Electronic Medical Records and Genomics (eMERGE) network cohort, which link genotype data to electronic health record (EHR) data, could contribute to these efforts.\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e We therefore undertook the first use of these resources to evaluate the genetically predicted changes in gene expression of CFA-associated Mendelian genes as well as the genetically predicted gene expression of individuals with CFAs.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eIdentification of Individuals with Craniofacial and Other Congenital Anomalies\u003c/h2\u003e\u003cp\u003eTo identify individuals with CAs, we used the CM (congenital malformation) chapter of phecodeX, a version of phecodes, which are systematic groupings of International Classification of Diseases (ICD) billing codes.\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e The Tongue, Mouth, and Pharynx parent code (CM_754) and the Skull, Face, and Jaw parent code (CM_755) within the CM chapter were used to identify individuals with CFAs. Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eA demonstrates how ICD9 and ICD10 billing codes are collapsed into one phecode. All available billing codes from an individual\u0026rsquo;s entire medical history are used in the construction of phecodes. 
We mapped the CM phecodes from the ICD code data using the PheWAS R package.\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eIdentification in Vanderbilt University Medical Center\u003c/h3\u003e\n\u003cp\u003eVanderbilt University Medical Center (VUMC)\u0026rsquo;s EHR system provides a unique resource with a wealth of de-identified health information in the synthetic derivative (SD). The SD contains\u0026thinsp;~\u0026thinsp;3.5\u0026nbsp;million individuals, with ~\u0026thinsp;300k individuals having genetic information in BioVU, VUMC\u0026rsquo;s de-identified EHR-linked DNA biobank. We identified 6,131 individuals with CAs and CFAs in BioVU.\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e This work is deemed non-human subjects work by the VUMC IRB and received all necessary approvals. We restricted our cases and controls used in analyses to a medical home population (n\u0026thinsp;=\u0026thinsp;1,275,576), a high medical use population defined as individuals with ICD-9 or ICD-10 billing codes collected on at least three unique visit dates over three or more years. This medical home definition ensures a study population with substantial phenotypic information within their EHR. 
All included BioVU participants were genotyped using the Illumina MEGAEX Array, and the included population was restricted to participants of European genetic ancestry based on clustering in principal component analysis (PCA) using genetic data from the 1000 Genomes Project as reference populations, as previously described.\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e This restriction was done to minimize confounding results and maximize sample size. Future work to include individuals of non-European genetic ancestry is ongoing.\u003c/p\u003e\n\u003ch3\u003eIdentification in eMERGE\u003c/h3\u003e\n\u003cp\u003eWe identified individuals with CFAs from the Electronic Medical Records and Genomics (eMERGE) Network, which combines DNA biorepositories with EHR data.\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e The eMERGE GWAS cohort includes 64,536 individuals from five institutions (VUMC, Columbia University Irving Medical Center, Northwestern Medical Center, Mass General Brigham, and Cincinnati Children\u0026rsquo;s Hospital Medical Center (CCHMC)). Because 75% of CFA cases in eMERGE were from CCHMC, we restricted our secondary analysis to participants of European genetic ancestry from CCHMC in the eMERGE GWAS cohort. ICD billing codes for each individual are provided as part of available data to eMERGE consortium researchers. 
We mapped the CM phecodes from the ICD code data using the PheWAS R package as previously described.\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\n\u003ch3\u003eIdentifying Genes with Known CFA Associations\u003c/h3\u003e\n\u003cp\u003eWe curated a list of genes associated with CFAs, using previously published genome-wide association study (GWAS) results and the Concert Genetics registry of clinical genetic tests.\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e,\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e This testing registry aggregates all clinically available genetic panels, lists all the genes included in each panel, and allows for comparison of included genes between different companies\u0026rsquo; panels. These panels reflect a largely comprehensive list of genes with a known monogenic CFA association. A certified genetic counselor searched Concert Genetics\u0026rsquo; Test Registry using the search term \u0026ldquo;craniofacial\u0026rdquo; and compiled a list of all the unique genes that are offered on the resulting Craniofacial Panel Tests. These panels included both syndromic and non-syndromic CFA genes. Supplemental Table\u0026nbsp;1 lists the curated known CFA-associated genes.\u003c/p\u003e\n\u003ch3\u003eDefining CFA cases and controls\u003c/h3\u003e\n\u003cp\u003eTo identify individuals in BioVU and eMERGE that have a CFA, we used phecodes from the \u0026ldquo;Tongue Mouth and Pharynx\u0026rdquo; and the \u0026ldquo;Skull Face and Jaw\u0026rdquo; phecodeX chapters.\u003csup\u003e10\u003c/sup\u003e We identified individuals that had at least two instances of a CFA phecode of the same parent code and/or family head code (Supplemental Table\u0026nbsp;2). 
Figure\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB demonstrates the breakdown of the parent code, family head code, and specific code portions of a phecode. We defined controls as individuals who did not have a single congenital anomaly phecode in their record.\u003c/p\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003eDefining CA cases and controls\u003c/h2\u003e\u003cp\u003eTo define whether individuals in BioVU and eMERGE had CAs we used phecodes from the congenital malformations phecode X Chap.\u0026nbsp;1\u003csup\u003e0\u003c/sup\u003e For our cases, we identified individuals with at least two instances of a specific CA phecode (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eB, Supplemental Table\u0026nbsp;3). Controls were defined as individuals with no CA phecodes in their record.\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eTranscriptome Wide Association Study of Individuals with CFAs\u003c/h3\u003e\n\u003cp\u003eUsing machine learning models such as PrediXcan, UTMOST, and Joint Tissue Imputation (JTI), we calculated genetically predicted gene expression (GPGE) using GTEx (version 8) as a reference population in both BioVU and eMERGE.\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e,\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e,\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e,\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e Using the single best performing model out of these three models for each gene tissue pair (r\u003csup\u003e2\u003c/sup\u003e\u0026thinsp;\u0026gt;\u0026thinsp;0.01), we conducted a transcriptome-wide association study (TWAS). 
We tested the association of CFA status with GPGE in a logistic regression model adjusting for age, sex, number of visits, and the first ten principal components of ancestry.\u003c/p\u003e\n\u003ch3\u003eAssociation of Craniofacial Anomaly Genes with Other Congenital Anomalies\u003c/h3\u003e\n\u003cp\u003eTo determine how variation in GPGE of known CFA genes may increase risk for other CAs, we conducted gene-based phenotype-wide association studies (PheWAS) of the known CFA genes.\u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e,\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e,\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e Of the 391 known CFA genes, we had quality prediction for 341 (r\u003csup\u003e2\u003c/sup\u003e\u0026thinsp;\u0026gt;\u0026thinsp;0.1). We tested whether a diagnosis of a CA is associated with GPGE of a known CFA gene in a logistic regression model adjusting for age, sex, the first ten principal components of ancestry, and number of visits.\u003c/p\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\u003ch2\u003eIdentification of individuals with craniofacial and other congenital anomalies within the EHR\u003c/h2\u003e\u003cp\u003eWithin the entire VUMC medical home population, including individuals without genotype data, there are 19,509 individuals with a CFA. About a fifth of these individuals, 4,051 (21.1%), have a second CA in another organ system (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003eC). At VUMC, there are 694 individuals of European ancestry with a CFA who have available genotype information (248 individuals with a \u0026ldquo;Tongue, Mouth, and Pharynx\u0026rdquo; phecode CA and 446 individuals with a \u0026ldquo;Skull, Face, and Jaw\u0026rdquo; phecode CA). 
From the eMERGE GWAS cohort individuals of European ancestry at CCHMC, there are 384 individuals with a CFA (113 individuals with a Tongue, Mouth, and Pharynx phecode CA and 322 individuals with a Skull, Face, and Jaw phecode CA) (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eDemographics of craniofacial cases and controls\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"4\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCraniofacial anomalies\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eControls\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eTotal\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBioVU (n)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003e635\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003e42,810\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003e43,445\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c1\"\u003e\u003cp\u003eEHR-reported sex\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e352 (55.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e17,497 (40.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e17,849 (41.1%)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFemale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e283 (44.6)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e25,313 (59.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e25,596 (58.9%)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAge, years\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e27.1 (23.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e58.0 (21.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e57.5 (21.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eNumber of visits\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e126.0 (135.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e61.4 (65.5)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e62.4 (67.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd 
align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eeMERGE (n)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003e384\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cb\u003e3,575\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e\u003cb\u003e3,959\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEHR-reported sex\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e159 (41.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2,014 (56.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e2,239 (56.6)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eFemale\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e225 (58.6)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e1,559 (43.6)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1,718 (43.4)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAge, years\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e17.2 (5.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e25.2 (7.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e24.4 (8.1)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eData is presented as number (%) for categorical or median (standard deviation) for continuous variables.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\u003ch2\u003eTWAS of individuals with CFAs identifies genes not previously associated with CFAs\u003c/h2\u003e\u003cp\u003eOur TWAS of CFAs in both BioVU and eMERGE did not identify any statistically significant associations that passed the highly conservative Bonferroni multiple testing correction (0.05 divided by the number of genes with GPGE per tissue). When using a less stringent p-value threshold, we identified 1,261 genes in BioVU and 1,260 gene in eMERGE that were significantly associated (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05). We compared these genes in both study populations to our curated list of known CFA genes and found that the majority of our curated CFA genes (93.6%) demonstrate no level of significant association, even at a permissive p\u0026thinsp;\u0026lt;\u0026thinsp;0.05 level, with CFAs in either BioVU and eMERGE (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eA). In total, fewer than 1% of significant genes in either BioVU or eMERGE were part of the curated previously known CFA-associated gene list. This included 11 significant genes from BioVU (0.90%) and 14 (1%) from eMERGE (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
There was no overlap between BioVU and eMERGE in the significant genes known to be associated with a CFA.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eNumber of significant results in BioVU and eMERGE craniofacial (CFA) transcriptome-wide association study (TWAS)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eStudy site\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eAny gene\u003c/p\u003e\u003cp\u003e(p\u0026thinsp;\u0026lt;\u0026thinsp;0.05)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eAny gene\u003c/p\u003e\u003cp\u003e(p\u0026thinsp;\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eKnown CFA gene (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eKnown CFA gene (p\u0026thinsp;\u0026lt;\u0026thinsp;0.001)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBioVU\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1,261\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e231\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eeMERGE\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e\u003cp\u003e1,260\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e257\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e4\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eAdditionally, we compared the 1,261 and 1,260 significant associations (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) in each study population and found 53 significant gene associations that were shared between the BioVU and eMERGE TWAS results (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). These 53 genes were not part of the curated gene list and were not identified previously as associated with CFAs or craniofacial structure. By using a more stringent p-value threshold (p\u0026thinsp;\u0026lt;\u0026thinsp;0.001), we identified 231 genes in BioVU and 257 genes in eMERGE associated with CFAs (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). 
From this more stringent cutoff we identified two genes, \u003cem\u003eVAV1\u003c/em\u003e (*164875) (BioVU p\u0026thinsp;=\u0026thinsp;0.009; eMERGE p\u0026thinsp;=\u0026thinsp;0.006) and \u003cem\u003eCYP3A7\u003c/em\u003e (*605340) (BioVU p\u0026thinsp;=\u0026thinsp;0.009; eMERGE p\u0026thinsp;=\u0026thinsp;0.009), that are shared between the two cohorts (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eB). Neither of these genes has previously been associated with CFAs or any human disease, but both have been implicated in development.\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e,\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003e\u003cem\u003eCA-wide association study illustrates that known CFA genes are associated with a broad range of CAs across multiple organ systems\u003c/em\u003e\u003c/p\u003e\u003cp\u003eBecause CFAs often co-occur with other CAs, we sought to evaluate whether the curated CFA-associated gene list demonstrated more significant associations with other CAs. To do this, we tested for the association between the GPGE of 341 CFA-associated genes from the curated list and any CA phecode in both the BioVU and eMERGE study populations. If common variation in these genes identified from clinical testing panels for CFAs were specific to craniofacial development, we would expect to find an enrichment of craniofacial phenotypes in the skull, face, and jaw as well as the tongue, mouth, and pharynx phecodeX chapters. While we identified a few significant associations with these phecodes at the least stringent significance threshold (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA and B), the most significant associations for any of the CFA-associated genes were identified in other organ system CAs. 
This analysis illustrates that a broad range of phenotypes spanning multiple organ systems are associated with common variation in CFA-associated genes (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). Among our gene-based results, we found that the GPGE of \u003cem\u003eGLI2\u003c/em\u003e (*165230) is associated with 18 different CA phecodes in BioVU (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05). These include significant associations with CA phecodes from four organ systems (13 heart, 1 eye, 1 musculoskeletal, and 1 respiratory) as well as 2 situs inversus CA phecodes (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eCongenital anomaly phecodes associated with genetically predicted expression of \u003cem\u003eGLI2\u003c/em\u003e\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eOrgan system\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eCA phecode\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eBeta\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eStandard error\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003ep-value\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eEye\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_751.113\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.495\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.240\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.039\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.212\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.331\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.090\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.291x10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.145\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.040\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2.505 
x10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.232\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.186\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.053\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e5.033x10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.23\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.158\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.047\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e7.654 x10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.21\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.234\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.071\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e9.777 x10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eCM_763.11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.622\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.216\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e4.059x10\u003csup\u003e\u0026minus;\u0026thinsp;3\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.187\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.067\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.006\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.36\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.259\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.102\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.011\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.231\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.204\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.086\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.018\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd 
align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.152\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.256\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.115\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.026\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.095\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.043\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.028\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.484\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.242\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.046\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eHeart\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_763.14\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.138\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.070\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003e0.049\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eMusculoskeletal\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_770.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.380\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.192\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.048\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eRespiratory\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_762.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.377\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.155\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.015\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSitus Inversus\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_774.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e0.672\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.186\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e3.003 x10\u003csup\u003e\u0026minus;\u0026thinsp;4\u003c/sup\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eSitus Inversus\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCM_774\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c3\"\u003e\u003cp\u003e0.451\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e0.150\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.003\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eBecause most previous genetic studies of CFAs focus on rare variant/ monogenic causes of disease, this work investigated the GPGE of individuals with CFAs and how the GPGE of known CFA genes relate to CAs. We perform a TWAS for individuals with a CFA and a CA-wide PheWAS for the GPGE of genes with a known CFA association. Overall, the results of these analyses suggest that in addition to rare variants, polygenic variation impacting gene expression may contribute to many CAs and may play a role in the penetrance and expressivity of CA syndromes.\u003c/p\u003e\u003cp\u003eAs sample sizes for genetic studies have increased, so has our understanding of the complexity of the genetic architecture driving phenotypes such as CFAs through our expanded understanding of how common variation contributes to the diversity of facial morphology and shape.\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e,\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e,\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e Studies of CFAs and polygenic architecture have identified that there are shared genetic features between cleft lip and palate and the size of facial features as well as common variation that affects the penetrance of known cleft clip and palate variants in \u003cem\u003ePDGFRA\u003c/em\u003e (*173490).\u003csup\u003e6,7\u003c/sup\u003e Additionally, researchers have proposed that Mendelian CFAs are extreme phenotypes on a continuum of phenotypic 
variation in facial morphology and that integrating common variation into the study of these phenotypes is essential to understanding their genetic drivers.\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e,\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e All of these studies leverage GWAS and polygenic scores to examine the contributions of common variation to CFAs. However, an approach that utilizes GPGE allows us to capture the predicted effects of common variants on gene expression in a TWAS and to identify genes whose altered GPGE is associated with CFA status. By conducting a gene-based analysis of CFAs, we can obtain more biologically interpretable results for our CFA associations.\u003c/p\u003e\u003cp\u003eOut of the 391 genes known to cause CFAs in a monogenic fashion, over 90% are not significantly associated with CFAs in a transcriptome-wide fashion (p\u0026thinsp;\u0026lt;\u0026thinsp;0.01), highlighting the gap in knowledge when studying CFAs that present in a Mendelian or syndromic fashion. The two genes whose GPGE showed a significant association (p\u0026thinsp;\u0026lt;\u0026thinsp;0.01) with CFAs in both cohorts, \u003cem\u003eVAV1\u003c/em\u003e (*164875) and \u003cem\u003eCYP3A7\u003c/em\u003e (*605340), have not yet been associated with any human disease, but do have known roles in fetal development. \u003cem\u003eVAV1\u003c/em\u003e (*164875) is a proto-oncogene that is involved in hematopoiesis and T and B cell signaling. 
The well-established relationship between disrupted oncoprotein signaling, cancer, and congenital anomalies suggests that differential expression of the oncogene \u003cem\u003eVAV1\u003c/em\u003e (*164875) could be driving the development of some CFAs.\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e, 30,24\u003c/sup\u003e Further research on the differential expression of proto-oncogenes in individuals with congenital anomalies could give additional insight into why those with congenital anomalies are more likely to develop cancer. \u003cem\u003eCYP3A7\u003c/em\u003e (*605340) encodes a cytochrome P450 \u0026ldquo;super family\u0026rdquo; enzyme, \u003cem\u003eCYP3A7\u003c/em\u003e (*605340), which metabolizes a diverse array of endogenous and exogenous substances, including medications that many pregnant individuals need to maintain their own health, such as carbamazepine, diltiazem, caffeine, and nifedipine. \u003cem\u003eCYP3A7\u003c/em\u003e (*605340) is primarily expressed in fetal liver tissue; it is detected as early as 50 days of gestation, and its expression decreases until 24 months of age.\u003csup\u003e\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e,\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e Additionally, \u003cem\u003eCYP3A7\u003c/em\u003e (*605340) plays a key role in the production of a critical pregnancy hormone, estriol, which has been shown to be an important epigenetic modifier in mouse fetuses. 
Variable expression of \u003cem\u003eCYP3A7\u003c/em\u003e (*605340) could have dramatic effects on fetal development, and further research can assess the complex interactions of environmental risks and genetic predispositions to CFAs and other CAs.\u003c/p\u003e\u003cp\u003eThe observations that only 10% of the previously identified CFA-associated genes had a significant GPGE association with CFAs and that 21.1% of individuals with CFAs also had other CAs led us to analyze which CAs were significantly associated with the GPGE of the curated CFA genes. These known CFA genes show more significant GPGE associations with CAs of other organ systems than with CFAs themselves, suggesting that there are shared genetic and environmental susceptibilities across CAs. The many genetic syndromes that include multiple CAs support the idea of shared risk factors for many types of CAs. Taken together, these results suggest that polygenic variation in CFA-associated genes may relate to developmental changes more broadly and is not necessarily restricted to craniofacial development.\u003c/p\u003e\u003cp\u003eMany genetic syndromes that cause multiple types of CAs demonstrate variable expression and incomplete penetrance, yet the factors causing variable expression and incomplete penetrance are not well understood. 
For example, ~\u0026thinsp;65% of individuals with 22q11.2 deletion syndrome have a congenital heart defect (CHD) and ~\u0026thinsp;67% have a palate abnormality.\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e,\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e As another example, we noted that the predicted expression of \u003cem\u003eGLI2\u003c/em\u003e (*165230) is significantly (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) associated with many congenital heart defects (CHDs), such as congenital pulmonary valve stenosis (p\u0026thinsp;=\u0026thinsp;0.0002), congenital malformations of heart valves (p\u0026thinsp;=\u0026thinsp;0.0002), congenital insufficiency of the aortic valve (p\u0026thinsp;=\u0026thinsp;0.0005), and ten others (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). \u003cem\u003eGLI2\u003c/em\u003e (*165230) is classified as a known CFA gene due to its association with two congenital malformation syndromes, Culler-Jones syndrome and Holoprosencephaly 9.\u003csup\u003e35\u003c/sup\u003e Both syndromes can present with several congenital anomalies, such as cleft lip/palate, microcephaly, and polydactyly, but to date CHDs are not associated with either syndrome.\u003csup\u003e\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e,\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e\u003c/sup\u003e Yet \u003cem\u003eGLI2\u003c/em\u003e (*165230) does have a well-established role in cardiomyogenesis, and there is a group of individuals with CHDs who have \u003cem\u003eGLI2\u003c/em\u003e (*165230) missense variants shown to dysregulate sonic hedgehog signaling, which is crucial for fetal development.\u003csup\u003e\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e,\u003cspan citationid=\"CR39\" 
class=\"CitationRef\"\u003e39\u003c/span\u003e,\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e The expression of \u003cem\u003eGLI2\u003c/em\u003e (*165230) in the developing heart suggests that CHDs could be a phenotypic expansion for the two \u003cem\u003eGLI2\u003c/em\u003e-related congenital anomaly syndromes and warrants examination of its role in cardiac development. Differences in gene expression could be one factor causing the variable expressivity and reduced penetrance that are characteristic of many congenital anomaly syndromes.\u003c/p\u003e\u003cp\u003eOne of the main limitations of studying CFAs at biobank scale is that they have a relatively low prevalence and are caused by large-effect rare variants. Both issues affect our statistical power to detect phenotype associations. One way that our analysis attempted to address this limitation is by conducting gene-based analyses and defining our CFA phenotype across multiple phecodes that describe different types of CFAs. Despite using a less stringent p-value threshold, conducting gene-based analyses such as TWAS provides a more interpretable biological unit than single variant analyses. While the analyses in this paper are underpowered, they still provide biologically and clinically meaningful results.\u003csup\u003e\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e,\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003eThroughout this study we leveraged a set of known CFA genes that was compiled from clinical diagnostic testing for CFAs and a GWAS of face shape, and then curated and reviewed by a certified genetic counselor. While we tried to make this gene set as comprehensive as possible, we are limited to genes that currently have a well-established association with CFAs. 
The main goal of this study is to better understand the genetic drivers of CFAs.\u003c/p\u003e\u003cp\u003eOverall, our results support that both rare and common genetic variants in CFA Mendelian genes may contribute to a variety of CAs and highlight the complexities of the CA phenotypes, suggesting there are shared underlying genetic and environmental risk factors. Further research on CAs through GPGE could help better explain variable presentations and penetrance of CA syndromes.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eTWAS of individuals with CFAs in the BioVU and eMERGE cohorts identified relatively few previously identified CFA-associated genes. The two genes whose GPGE had the strongest association in both cohorts have potential roles in the complex genetic drivers of CAs. The predicted expression of genes that have a known Mendelian association with CFAs is more often significantly associated with other types of CAs in the BioVU and eMERGE cohorts. For example, the predicted expression of \u003cem\u003eGLI2\u003c/em\u003e, which is associated with a syndrome that can include CFAs, is significantly associated with several CHDs. 
The results of both analyses suggest that there are overlapping polygenic causes of many types of CAs and that, with further research, these may help explain the variability in how CA syndromes can present.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eCraniofacial anomalies (CFAs)\u003c/p\u003e\n\u003cp\u003eCongenital anomalies (CAs)\u003c/p\u003e\n\u003cp\u003eCleft lip and/or palate (CL/P)\u003c/p\u003e\n\u003cp\u003eElectronic health records (EHR)\u003c/p\u003e\n\u003cp\u003eInternational Classification of Diseases (ICD)\u003c/p\u003e\n\u003cp\u003eCongenital malformation (CM)\u003c/p\u003e\n\u003cp\u003eSynthetic derivative (SD)\u003c/p\u003e\n\u003cp\u003eVanderbilt University Medical Center (VUMC)\u003c/p\u003e\n\u003cp\u003eInstitutional Review Board (IRB)\u003c/p\u003e\n\u003cp\u003ePrincipal component analyses (PCAs)\u003c/p\u003e\n\u003cp\u003eCincinnati Children\u0026rsquo;s Hospital Medical Center (CCHMC)\u003c/p\u003e\n\u003cp\u003eGenome-wide association study (GWAS)\u003c/p\u003e\n\u003cp\u003eTranscriptome-wide association study (TWAS)\u003c/p\u003e\n\u003cp\u003eGenetically predicted gene expression (GPGE)\u003c/p\u003e\n\u003cp\u003ePhenome-wide association study (PheWAS)\u003c/p\u003e\n\u003cp\u003eCongenital heart defect (CHD)\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cem\u003eEthics approval and consent to participate\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe data for this study are deidentified and were deemed non-human-subjects research. 
All approvals were obtained for data use including those from Vanderbilt University Medical Center\u0026rsquo;s institutional review board and eMERGE project proposal protocols.\u003c/p\u003e\n\u003cp\u003e\u0026nbsp;\u003cem\u003eConsent for publication\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eAll authors consent to the publication of the results and manuscript materials.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003cem\u003eAvailability of data and materials\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eAll data from the primary population at Vanderbilt University Medical Center will be made available by request to the corresponding author pending Institutional approval. Requests for data derived from the secondary population, eMERGE, will be handled per consortium requirements.\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003cem\u003eCompeting interests\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u0026nbsp;\u003cem\u003eFunding\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eThe dataset used for the clinical analyses was obtained from the Vanderbilt University Medical Center Synthetic Derivative, which is supported by institutional funding, the 1S10RR025141-01 instrumentation award, and by the CTSA grant UL1TR000445 from National Center for Advancing Translational Sciences/National Institutes of Health. 
The phase of the eMERGE Network used in this study was initiated and funded by the NHGRI through the following grants: U01HG008657 (Group Health Cooperative/University of Washington); U01HG008685 (Brigham and Women\u0026rsquo;s Hospital); U01HG00672 (Vanderbilt University Medical Center); U01HG008666 (Cincinnati Children\u0026rsquo;s Hospital Medical Center); U01HG006379 (Mayo Clinic); U01HG008679 (Geisinger Clinic); U01HG008680 (Columbia University Health Sciences); U01HG008684 (Children\u0026rsquo;s Hospital of Philadelphia); U01HG008673 (Northwestern University); U01HG008701 (Vanderbilt University Medical Center serving as the Coordinating Center); U01HG008676 (Partners Healthcare/Broad Institute); U01HG008664 (Baylor College of Medicine); and U54MD007593 (Meharry Medical College). Individual support for this work includes E.B (NIHNCATS NOT-OD-22-108 CTTSA TL1 5TL1TR002244-07), A.S. (5T32GM080178-15 and 1T32GM145734-01), and M.M.S (K12HD043483 and P50HD106446).\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003cem\u003eAuthors\u0026rsquo; contributions\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eConceptualization: E.B., A.S, N.J.C, M.M.S; Data Curation: E.B., A.S, D.W., T.M.F., M.M.S; Formal analysis: E.B., A.S.; Funding acquisition: E.B., A.S, N.J.C, M.M.S; Investigation: E.B., A.S., M.M.S.; Methodology: E.B., A.S, N.J.C, M.M.S; Project Administration: E.B., A.S; Resources: N.J.C, M.M.S; Software: E.B., T.M.F, A.S., M.M.S; Supervision: N.J.C, M.M.S; \u0026nbsp;Validation: E.B., A.S, M.M.S; Visualization: E.B., A.S., M.M.S; Writing- original draft: E.B., A.S., M.M.S.; Writing- review and editing: E.B., A.S, T.M.F, D.W, W.C, M.W., N.J.C., M.M.S\u003c/p\u003e\n\u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003cem\u003eAcknowledgements\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eMai CT, Isenburg JL, Canfield MA, et al. 
National population-based estimates for major birth defects, 2010-2014. \u003cem\u003eBirth Defects Res\u003c/em\u003e. 2019;111(18):1420-1435. doi:10.1002/bdr2.1589\u003c/li\u003e\n\u003cli\u003eBoulet SL, Rasmussen SA, Honein MA. A population-based study of craniosynostosis in metropolitan Atlanta, 1989-2003. \u003cem\u003eAm J Med Genet A\u003c/em\u003e. 2008;146A(8):984-991. doi:10.1002/ajmg.a.32208\u003c/li\u003e\n\u003cli\u003eJin H, Yingqiu C, Zequn L, et al. Chromosomal microarray analysis in the prenatal diagnosis of orofacial clefts: Experience from a single medical center in mainland China. \u003cem\u003eMedicine (Baltimore)\u003c/em\u003e. 2018;97(34):e12057. doi:10.1097/MD.0000000000012057\u003c/li\u003e\n\u003cli\u003eYan S, Fu F, Li R, et al. Exome sequencing improves genetic diagnosis of congenital orofacial clefts. \u003cem\u003eFront Genet\u003c/em\u003e. 2023;14:1252823. doi:10.3389/fgene.2023.1252823\u003c/li\u003e\n\u003cli\u003eWilkie AOM, Johnson D, Wall SA. Clinical genetics of craniosynostosis. \u003cem\u003eCurr Opin Pediatr\u003c/em\u003e. 2017;29(6):622-628. doi:10.1097/MOP.0000000000000542\u003c/li\u003e\n\u003cli\u003eYu Y, Alvarado R, Petty LE, et al. Polygenic risk impacts PDGFRA mutation penetrance in non-syndromic cleft lip and palate. \u003cem\u003eHum Mol Genet\u003c/em\u003e. 2022;31(14):2348. doi:10.1093/hmg/ddac037\u003c/li\u003e\n\u003cli\u003eHowe LJ, Lee MK, Sharp GC, et al. Investigating the shared genetics of non-syndromic cleft lip/palate and facial morphology. \u003cem\u003ePLOS Genet\u003c/em\u003e. 2018;14(8):e1007501. doi:10.1371/journal.pgen.1007501\u003c/li\u003e\n\u003cli\u003eRoden DM, Pulley JM, Basford MA, et al. Development of a large-scale de-identified DNA biobank to enable personalized medicine. \u003cem\u003eClin Pharmacol Ther\u003c/em\u003e. 2008;84(3):362-369. doi:10.1038/clpt.2008.89\u003c/li\u003e\n\u003cli\u003eMcCarty CA, Chisholm RL, Chute CG, et al. 
The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. \u003cem\u003eBMC Med Genomics\u003c/em\u003e. 2011;4:13. doi:10.1186/1755-8794-4-13\u003c/li\u003e\n\u003cli\u003eShuey MM, Stead WW, Aka I, et al. Next-generation phenotyping: introducing phecodeX for enhanced discovery research in medical phenomics. \u003cem\u003eBioinforma Oxf Engl\u003c/em\u003e. 2023;39(11):btad655. doi:10.1093/bioinformatics/btad655\u003c/li\u003e\n\u003cli\u003eCarroll RJ, Bastarache L, Denny JC. R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. \u003cem\u003eBioinforma Oxf Engl\u003c/em\u003e. 2014;30(16):2375-2376. doi:10.1093/bioinformatics/btu197\u003c/li\u003e\n\u003cli\u003eDanciu I, Cowan JD, Basford M, et al. Secondary use of clinical data: the Vanderbilt approach. \u003cem\u003eJ Biomed Inform\u003c/em\u003e. 2014;52:28-35. doi:10.1016/j.jbi.2014.02.003\u003c/li\u003e\n\u003cli\u003eDennis JK, Sealock JM, Straub P, et al. Clinical laboratory test-wide association scan of polygenic scores identifies biomarkers of complex disease. \u003cem\u003eGenome Med\u003c/em\u003e. 2021;13(1):6. doi:10.1186/s13073-020-00820-8\u003c/li\u003e\n\u003cli\u003eGottesman O, Kuivaniemi H, Tromp G, et al. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. \u003cem\u003eGenet Med Off J Am Coll Med Genet\u003c/em\u003e. 2013;15(10):761-771. doi:10.1038/gim.2013.72\u003c/li\u003e\n\u003cli\u003eShaffer JR, Orlova E, Lee MK, et al. Genome-Wide Association Study Reveals Multiple Loci Influencing Normal Human Facial Morphology. \u003cem\u003ePLoS Genet\u003c/em\u003e. 2016;12(8):e1006149. doi:10.1371/journal.pgen.1006149\u003c/li\u003e\n\u003cli\u003ePayment Accuracy for Precision Lab Diagnostics. Concert. Accessed February 16, 2025. https://www.concert.co/\u003c/li\u003e\n\u003cli\u003eGamazon ER, Wheeler HE, Shah KP, et al. 
A gene-based association method for mapping traits using reference transcriptome data. \u003cem\u003eNat Genet\u003c/em\u003e. 2015;47(9):1091-1098. doi:10.1038/ng.3367\u003c/li\u003e\n\u003cli\u003eHu Y, Li M, Lu Q, et al. A statistical framework for cross-tissue transcriptome-wide association analysis. \u003cem\u003eNat Genet\u003c/em\u003e. 2019;51(3):568-576. doi:10.1038/s41588-019-0345-7\u003c/li\u003e\n\u003cli\u003eZhou D, Jiang Y, Zhong X, Cox NJ, Liu C, Gamazon ER. A unified framework for joint-tissue transcriptome-wide association and Mendelian randomization analysis. \u003cem\u003eNat Genet\u003c/em\u003e. 2020;52(11):1239-1246. doi:10.1038/s41588-020-0706-2\u003c/li\u003e\n\u003cli\u003eGTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. \u003cem\u003eScience\u003c/em\u003e. 2020;369(6509):1318-1330. doi:10.1126/science.aaz1776\u003c/li\u003e\n\u003cli\u003eUnlu G, Gamazon ER, Qi X, et al. GRIK5 Genetically Regulated Expression Associated with Eye and Vascular Phenomes: Discovery through Iteration among Biobanks, Electronic Health Records, and Zebrafish. \u003cem\u003eAm J Hum Genet\u003c/em\u003e. 2019;104(3):503-519. doi:10.1016/j.ajhg.2019.01.017\u003c/li\u003e\n\u003cli\u003eUnlu G, Qi X, Gamazon ER, et al. Phenome-based approach identifies RIC1-linked Mendelian syndrome through zebrafish models, biobank associations and clinical studies. \u003cem\u003eNat Med\u003c/em\u003e. 2020;26(1):98-109. doi:10.1038/s41591-019-0705-y\u003c/li\u003e\n\u003cli\u003eZhong X, Yin Z, Jia G, et al. Electronic health record phenotypes associated with genetically regulated expression of CFTR and application to cystic fibrosis. \u003cem\u003eGenet Med Off J Am Coll Med Genet\u003c/em\u003e. 2020;22(7):1191-1200. doi:10.1038/s41436-020-0786-5\u003c/li\u003e\n\u003cli\u003eKatzav S, Martin-Zanca D, Barbacid M. vav, a novel human oncogene derived from a locus ubiquitously expressed in hematopoietic cells. 
\u003cem\u003eEMBO J\u003c/em\u003e. 1989;8(8):2283-2290. doi:10.1002/j.1460-2075.1989.tb08354.x\u003c/li\u003e\n\u003cli\u003eLi H, Lampe JN. Neonatal cytochrome P450 CYP3A7: A comprehensive review of its role in development, disease, and xenobiotic metabolism. \u003cem\u003eArch Biochem Biophys\u003c/em\u003e. 2019;673:108078. doi:10.1016/j.abb.2019.108078\u003c/li\u003e\n\u003cli\u003eDecoding the Human Face: Progress and Challenges in Understanding the Genetics of Craniofacial Morphology - PubMed. Accessed February 20, 2025. https://pubmed.ncbi.nlm.nih.gov/35483406/\u003c/li\u003e\n\u003cli\u003eGoovaerts S, Hoskens H, Eller RJ, et al. Joint multi-ancestry and admixed GWAS reveals the complex genetics behind human cranial vault shape. \u003cem\u003eNat Commun\u003c/em\u003e. 2023;14(1):7436. doi:10.1038/s41467-023-43237-8\u003c/li\u003e\n\u003cli\u003eVanneste M, Hoskens H, Goovaerts S, et al. Syndrome-informed phenotyping identifies a polygenic background for achondroplasia-like facial variation in the general population. \u003cem\u003eNat Commun\u003c/em\u003e. 2024;15(1):10458. doi:10.1038/s41467-024-54839-1\u003c/li\u003e\n\u003cli\u003eCastel P, Rauen KA, McCormick F. The duality of human oncoproteins: drivers of cancer and congenital disorders. \u003cem\u003eNat Rev Cancer\u003c/em\u003e. 2020;20(7):383-397. doi:10.1038/s41568-020-0256-z\u003c/li\u003e\n\u003cli\u003eZhang R, Alt FW, Davidson L, Orkin SH, Swat W. Defective signalling through the T- and B-cell antigen receptors in lymphoid cells lacking the vav proto-oncogene. \u003cem\u003eNature\u003c/em\u003e. 1995;374(6521):470-473. doi:10.1038/374470a0\u003c/li\u003e\n\u003cli\u003eLiu J, Kandel SE, Lampe JN, Scott EE. Human cytochrome P450 3A7 binding four copies of its native substrate dehydroepiandrosterone 3-sulfate. \u003cem\u003eJ Biol Chem\u003c/em\u003e. 2023;299(8):104993. doi:10.1016/j.jbc.2023.104993\u003c/li\u003e\n\u003cli\u003eWilliams JA, Ring BJ, Cantrell VE, et al. 
Comparative metabolic capabilities of CYP3A4, CYP3A5, and CYP3A7. \u003cem\u003eDrug Metab Dispos Biol Fate Chem\u003c/em\u003e. 2002;30(8):883-891. doi:10.1124/dmd.30.8.883\u003c/li\u003e\n\u003cli\u003eCampbell IM, Sheppard SE, Crowley TB, et al. What is new with 22q? An update from the 22q and You Center at the Children\u0026rsquo;s Hospital of Philadelphia. \u003cem\u003eAm J Med Genet A\u003c/em\u003e. 2018;176(10):2058-2069. doi:10.1002/ajmg.a.40637\u003c/li\u003e\n\u003cli\u003eJackson O, Crowley TB, Sharkus R, et al. Palatal evaluation and treatment in 22q11.2 deletion syndrome. \u003cem\u003eAm J Med Genet A\u003c/em\u003e. 2019;179(7):1184-1195. doi:10.1002/ajmg.a.61152\u003c/li\u003e\n\u003cli\u003eBertolacini C, Ribeiro‐Bicudo L, Petrin A, Richieri‐Costa A, Murray J. Clinical findings in patients with \u003cem\u003eGLI2\u003c/em\u003e mutations \u0026ndash; phenotypic variability. \u003cem\u003eClin Genet\u003c/em\u003e. 2012;81(1):70-75. doi:10.1111/j.1399-0004.2010.01606.x\u003c/li\u003e\n\u003cli\u003eBertolacini C, Ribeiro‐Bicudo L, Petrin A, Richieri‐Costa A, Murray J. Clinical findings in patients with \u003cem\u003eGLI2\u003c/em\u003e mutations \u0026ndash; phenotypic variability. \u003cem\u003eClin Genet\u003c/em\u003e. 2012;81(1):70-75. doi:10.1111/j.1399-0004.2010.01606.x\u003c/li\u003e\n\u003cli\u003eBlaas H ‐G. K, Eriksson AG, Salvesen K\u0026Aring;, et al. Brains and faces in holoprosencephaly: pre‐ and postnatal description of 30 cases. \u003cem\u003eUltrasound Obstet Gynecol\u003c/em\u003e. 2002;19(1):24-38. doi:10.1046/j.0960-7692.2001.00154.x\u003c/li\u003e\n\u003cli\u003eFair JV, Voronova A, Bosiljcic N, Rajgara R, Blais A, Skerjanc IS. BRG1 interacts with GLI2 and binds Mef2c gene in a hedgehog signalling dependent manner during in vitro cardiomyogenesis. \u003cem\u003eBMC Dev Biol\u003c/em\u003e. 2016;16(1):27. doi:10.1186/s12861-016-0127-8\u003c/li\u003e\n\u003cli\u003eWaldron CJ, Kelly LA, Stan N, et al. 
The HH-GLI2-CKS1B network regulates the proliferation-to-maturation transition of cardiomyocytes. \u003cem\u003eStem Cells Transl Med\u003c/em\u003e. 2024;13(7):678-692. doi:10.1093/stcltm/szae032\u003c/li\u003e\n\u003cli\u003eVoronova A, Al Madhoun A, Fischer A, Shelton M, Karamboulas C, Skerjanc IS. Gli2 and MEF2C activate each other\u0026rsquo;s expression and function synergistically during cardiomyogenesis in vitro. \u003cem\u003eNucleic Acids Res\u003c/em\u003e. 2012;40(8):3329-3347. doi:10.1093/nar/gkr1232\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Craniofacial anomalies, transcriptome-wide association studies, congenital anomalies, electronic health records","lastPublishedDoi":"10.21203/rs.3.rs-7645057/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7645057/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003eCraniofacial anomalies are common congenital anomalies that significantly contribute to infant mortality and life-long health problems. 
Studies of craniofacial anomalies have identified several genetic causes, but focus on rare, Mendelian presentations. Despite this, current diagnostic genetic testing only identifies a causal genomic variant in ~\u0026thinsp;25% of affected individuals. This low diagnostic yield for Mendelian conditions may relate to oligogenic and polygenic risks for craniofacial anomalies. In this study we sought to use large electronic health record systems including many patients with craniofacial anomalies to determine whether we could identify patterns of genetic associations with craniofacial anomalies and known associated genes.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e\u003cp\u003eWe performed transcriptome-wide association studies that evaluated the association between genetically predicted gene expression and craniofacial anomalies in two cohorts: Vanderbilt University Medical Center\u0026rsquo;s BioVU and Electronic Medical Records and Genomics Network (eMERGE). Using a list of 391 previously identified craniofacial anomaly-associated genes we determined whether there was a greater proportion of significant associations with these genes than others. We also evaluated whether these genes were associated with other congenital anomalies.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eWe determined the predicted expression of 12 (3.1%) of the known craniofacial anomaly genes were associated with craniofacial anomalies (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) in BioVU and 18 (4.6%) in eMERGE. In both cohorts, the majority of significant genes and those demonstrating the strongest significance were not previously associated with craniofacial anomalies. In total, we identified 53 genes not previously associated with craniofacial anomalies. 
Interestingly fewer than 15% of the known craniofacial associated genes were associated with craniofacial anomalies (p\u0026thinsp;\u0026lt;\u0026thinsp;0.05) while 262 (76.8%) were associated with congenital anomalies of the heart, 133 (39.0%) anomalies of the nervous system and 142 (41.6%) of the urinary system.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e\u003cp\u003eOur results support that both rare and common variation in Mendelian disease-associated genes may contribute to craniofacial anomalies and are broadly involved in congenital anomaly development.\u003c/p\u003e","manuscriptTitle":"Expanding the Genetic Landscape of Craniofacial Anomalies Through Transcriptome-Wide Association Studies","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-17 12:22:57","doi":"10.21203/rs.3.rs-7645057/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"5cb620f0-abc8-445a-a2f2-37701c1a22f0","owner":[],"postedDate":"October 17th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-12-23T12:09:17+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-17 
12:22:57","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7645057","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7645057","identity":"rs-7645057","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00