TinkerHap - A Novel Read-Based Phasing Algorithm with Integrated Multi-Method Support for Enhanced Accuracy

preprint OA: closed CC-BY-4.0
Uri Hartmann, Eran Shaham, Dafna Nathan, Ilana Blech, Danny Zeevi

Research Square. This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6322826/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Phasing, the assignment of alleles to their respective parental chromosomes, is fundamental to studying genetic variation and identifying disease-causing variants. Traditional approaches, including statistical, pedigree-based, and read-based phasing, face challenges such as limited accuracy for rare variants, reliance on external reference panels, and constraints in regions with sparse genetic variation.
To address these limitations, we developed TinkerHap, a novel and unique phasing algorithm that integrates a read-based phaser, based on a pairwise distance-based unsupervised classification, with external phased data, such as statistical or pedigree phasing. We evaluated TinkerHap’s performance against other phasing algorithms using 1,040 parent-offspring trios from the UK Biobank (Illumina short-reads) and GIAB Ashkenazi trio (PacBio long-reads). TinkerHap’s read-based phaser alone achieved higher phasing accuracies than all other algorithms with 95.1% for short-reads (second best: 94.8%) and 97.5% for long-reads (second best: 95.5%). Its hybrid approach further enhanced short-read performance to 96.3% accuracy and was able to phase 99.5% of all heterozygous sites. TinkerHap also extended haplotype block sizes to a median of 79,449 base-pairs for long-reads (second best: 68,303 bp) and demonstrated higher accuracy for both SNPs and indels. This combination of a robust read-based algorithm and hybrid strategy makes TinkerHap a uniquely powerful tool for genomic analyses.

Subject areas
Biological sciences/Computational biology and bioinformatics/Genome informatics
Biological sciences/Genetics/Haplotypes
Biological sciences/Genetics/Genomics

Additional Declarations
There is NO Competing Interest.
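The abstract names the core idea of the read-based phaser (unsupervised classification of reads from pairwise distances) without giving details. The sketch below is only an illustration of that general idea, not TinkerHap's actual implementation. It assumes each read is reduced to a mapping from heterozygous site position to the observed allele (0 or 1), scores a pair of reads as mismatches minus matches over shared sites, and assigns reads greedily to one of two groups. The names `read_distance`, `phase_reads`, and `consensus`, the scoring rule, and the greedy strategy are all hypothetical choices made for this example. The hybrid step that uses external phased data is not covered.

```python
from collections import Counter


def read_distance(a, b):
    """Mismatches minus matches over heterozygous sites covered by both reads.

    Assumed scoring rule for illustration; positive means the reads disagree.
    """
    shared = a.keys() & b.keys()
    return sum(1 if a[pos] != b[pos] else -1 for pos in shared)


def phase_reads(reads):
    """Greedily split reads into two groups using pairwise distances.

    `reads` is a list of dicts mapping site position -> allele (0 or 1).
    Returns two lists of read indices, one per putative haplotype.
    """
    groups = ([], [])
    for i, read in enumerate(reads):
        # Total distance from this read to the reads already in each group.
        d0 = sum(read_distance(read, reads[j]) for j in groups[0])
        d1 = sum(read_distance(read, reads[j]) for j in groups[1])
        # Join the group the read agrees with more (ties go to group 0).
        groups[0 if d0 <= d1 else 1].append(i)
    return groups


def consensus(reads, group):
    """Majority allele per site among the reads of one group."""
    votes = {}
    for i in group:
        for pos, allele in reads[i].items():
            votes.setdefault(pos, Counter())[allele] += 1
    return {pos: c.most_common(1)[0][0] for pos, c in sorted(votes.items())}


if __name__ == "__main__":
    # Toy reads over three heterozygous sites (positions are made up).
    toy_reads = [
        {100: 0, 200: 1},
        {100: 1, 200: 0},
        {200: 1, 300: 1},
        {200: 0, 300: 0},
        {100: 0, 300: 1},
    ]
    g0, g1 = phase_reads(toy_reads)
    print(consensus(toy_reads, g0))
    print(consensus(toy_reads, g1))
```

In this toy input, reads that overlap at a shared site link neighbouring sites together, which is what lets a read-based phaser extend a haplotype block beyond the span of a single read. A real phaser would also need to handle sequencing errors, base qualities, and regions where no read connects adjacent sites.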
Affiliation (all authors): Jerusalem Multidisciplinary College
Corresponding author: Uri Hartmann (ORCID: https://orcid.org/0009-0002-0668-0249)
Posted: April 3rd, 2025
© Research Square 2026 | ISSN 2693-5015 (online)

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00
unpaywall
last seen: 2026-05-23T02:00:01.238055+00:00
License: CC-BY-4.0