A k-mer-based GWAS approach empowering gene mining in polyploids

Article

Xingtan Zhang, Shuai Chen, Xinlong Liu, Shenyang Qu, Yuhan Song, and 10 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7347406/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review. Version 1.

Abstract
Genome-wide association studies (GWAS) serve as a cornerstone for deciphering the genetic architecture of complex traits. However, conventional GWAS tools, predominantly optimized for diploid species, encounter substantial limitations when applied to complex polyploids due to challenges such as genotyping complexity, multi-allelic variant interpretation, and allele dosage ambiguity. Here, we present KMERIA, a k-mer-based framework specifically engineered to address these challenges, enabling efficient genotyping and robust association mapping in complex polyploid genomes.
Rigorous benchmarking with simulated and empirical datasets demonstrates that KMERIA surpasses existing methods in both genotyping accuracy and statistical power. To demonstrate its utility in high-ploidy systems, we deployed KMERIA in an autopolyploid natural population of 290 wild sugarcane accessions (Saccharum spontaneum) spanning diverse ploidy levels. To assess biases inherent to linear reference genomes in capturing allelic diversity and structural variations, we constructed a graph-based pan-genome integrating structural variations and haplotype diversity across 15 S. spontaneum accessions. Integrating KMERIA with the graph pan-genome revealed a novel sucrose biosynthesis gene (SsMGT) and novel tillering regulators (SsERF14, SsNGA5, SsNAC, SsARF8, SsLOG, SsSCR) in S. spontaneum, including the functionally validated SsNGA5. These discoveries not only elucidate the genetic basis of yield potential in S. spontaneum but also provide actionable targets for sugarcane breeding. Collectively, KMERIA bridges a critical methodological gap in polyploid genomics, while the integration of graph pan-genomes provides a robust framework for deciphering genotype-phenotype relationships in crops with complex genome architectures.

Subject areas: Biological sciences/Genetics/Population genetics; Biological sciences/Genetics/Plant genetics
Keywords: GWAS, Polyploid, K-mer, Sugarcane

Additional Declarations
Yes, there is a potential competing interest. A patent on the KMERIA method has been filed by Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, with S.C., S.Q., X.Z. and M.Z. as inventors. The remaining authors declare no competing interests.
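As a rough illustration of the general idea behind k-mer-based association summarized in the abstract, the sketch below counts k-mers per sample and correlates each k-mer's count with a trait value. It is not KMERIA's algorithm, which this page does not describe. All function names (`count_kmers`, `kmer_profile`, `pearson`, `kmer_trait_correlations`) and the toy data are hypothetical. The sketch also makes simplifying assumptions: it does not merge reverse complements, does not normalize for sequencing depth, and does not correct for population structure or test significance.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer in a single sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_profile(reads, k):
    """Sum k-mer counts over all reads of one sample."""
    total = Counter()
    for read in reads:
        total.update(count_kmers(read, k))
    return total


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def kmer_trait_correlations(samples, phenotypes, k):
    """Correlate each k-mer's per-sample count with the trait.

    samples:    dict mapping sample name -> list of read strings
    phenotypes: dict mapping sample name -> numeric trait value
    """
    names = sorted(samples)
    profiles = {name: kmer_profile(samples[name], k) for name in names}
    all_kmers = set().union(*profiles.values())
    trait = [phenotypes[name] for name in names]
    # A Counter returns 0 for k-mers absent from a sample.
    return {
        kmer: pearson([profiles[name][kmer] for name in names], trait)
        for kmer in all_kmers
    }


# Toy data (hypothetical): the k-mer count rises with the trait value,
# loosely mimicking a dosage-like signal.
toy_samples = {
    "s1": ["ACGT"],
    "s2": ["ACGT", "ACGT"],
    "s3": ["ACGT", "ACGT", "ACGT"],
}
toy_traits = {"s1": 1.0, "s2": 2.0, "s3": 3.0}
toy_result = kmer_trait_correlations(toy_samples, toy_traits, k=4)
```

Using raw counts rather than presence/absence is a deliberate choice here: in a polyploid, the number of copies of an allele can vary between individuals, and counts retain that information in a crude form.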
Supplementary Files
2.SupplementaryNotesTablesandFigures.pdf (Supplementary Information)

Posted: November 6th, 2025

Authors and affiliations
Xingtan Zhang (corresponding author; https://orcid.org/0000-0002-5207-0882), Agricultural Genomics Institute at Shenzhen
Shuai Chen (https://orcid.org/0000-0002-6861-2682), FAFU and UIUC-SIB Joint Center for Genomics and Biotechnology, National Sugarcane Engineering Technology Research Center, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, F…
Xinlong Liu, State Key Laboratory for Tropical Crop Breeding, Sugarcane Research Institute, Yunnan Academy of Agricultural Sciences/Yunnan Key Laboratory of Sugarcane Genetic Improvement
Shenyang Qu, Agricultural Genomics Institute at Shenzhen
Yuhan Song, Agricultural Genomics Institute at Shenzhen
Kun Chai, Chinese Academy of Agricultural Sciences
Hongbo Liu, State Key Laboratory for Tropical Crop Breeding, Sugarcane Research Institute, Yunnan Academy of Agricultural Sciences
Yuebin Zhang, State Key Laboratory for Tropical Crop Breeding, Sugarcane Research Institute, Yunnan Academy of Agricultural Sciences
Zhongqiang Xia (https://orcid.org/0000-0003-1759-4143), Agricultural Genomics Institute at Shenzhen
Xiaofeng Li (https://orcid.org/0000-0001-6033-9979), Agricultural Genomics Institute at Shenzhen
Jungang Wang, State Key Laboratory of Tropical Crop Breeding, Institute of Tropical Bioscience and Biotechnology, Chinese Academy of Tropical Agricultural Sciences
Muqing Zhang (https://orcid.org/0000-0003-3138-3422), Guangxi University
Hongbo Li (https://orcid.org/0000-0003-1579-4600), College of Horticulture Science and Engineering, Shandong Agricultural University
Guo-Bo Chen (https://orcid.org/0000-0001-5475-8237), People’s Hospital of Hangzhou Medical College
Chris Maliepaard (https://orcid.org/0000-0002-7319-5270), Wageningen University and Research Centre