Functionally informed cis and trans proteome-wide association studies prioritize disease-critical genes | Research Square

Article

Kangcheng Hou (Harvard School of Public Health), Ali Pazokitoroudi (Harvard School of Public Health), Benjamin Strober (Boston Children's Hospital), Xilin Jiang (Cambridge Baker Systems Genomics Initiative, Baker Heart & Diabetes Institute), Alkes Price (Harvard University)

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9477232/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted May 4th, 2026

Abstract

Proteome-wide association studies (PWAS) typically link genetically predicted protein levels to disease using cis-pQTLs, which can be limited by low cis-heritability for disease-critical genes under negative selection and by tagging due to co-regulation among nearby genes. Trans-pQTLs provide complementary information when large sample sizes are available to detect weak polygenic effects, enabling associations between trans-predicted protein levels and disease.
We developed PolyPWAS, a functionally informed, summary statistics-based framework for associating both cis- and trans-predicted protein levels to disease. PolyPWAS integrates 96 functional annotations with proteome-wide pleiotropy to improve protein prediction, while correcting for PCs of predicted protein levels to limit tagging effects. We applied PolyPWAS to 2.8K plasma proteins measured in 34K UKB-PPP participants, analyzing GWAS summary statistics for 88 diseases and complex traits (average N=336K). Trans-predicted protein levels explained 21% of disease heritability (vs. 9.6% for cis-predicted protein levels), leveraging a 24% relative improvement in trans-prediction accuracy from functional priors. Trans-PWAS identified more significant protein-disease associations (and more conditionally significant associations) than cis-PWAS. Cis and trans associations showed only modest excess overlap (1.18, 95% CI: 1.11-1.26). Accordingly, combining evidence from cis and trans associations improved disease gene prioritization evaluated using gene sets from rare variant association studies (+11% relative improvement) and PoPS (+7.0% relative improvement) relative to cis-only approaches. PWAS associations to disease replicated across protein level cohorts, with strong UKB-PPP/deCODE concordance after adjusting for cohort-specific prediction accuracy. We provide examples where trans-regulatory effects link multiple disease-critical genes, underscoring the importance of integrating cis- and trans-regulatory effects to map protein-mediated disease biology.

Biological sciences/Genetics/Gene regulation
Biological sciences/Computational biology and bioinformatics

Additional Declarations
There is NO Competing Interest.
Supplementary Files
supptables.xlsx: Supplementary Tables
nrreportingsummary.pdf: Reporting summary
supp.pdf: Supplementary Figures

© Research Square 2026 | ISSN 2693-5015 (online)