DITTO: An explainable machine-learning model for transcript-specific variant pathogenicity prediction

Tarun Karthik Kumar Mamidi, Brandon Michael Wilk, Manavalan Gajapathy, and Elizabeth Anabel Worthey (corresponding author)
University of Alabama-Birmingham School of Medicine

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-8930088/v1
This work is licensed under a CC BY 4.0 License
Status: Under Revision (Version 1)

Abstract

Accurate classification of genetic variants is critical for medical decision-making, for understanding disease mechanisms, and for enabling therapeutic discovery. While numerous methods exist, they often address limited variant types, lack transcript awareness, and operate as opaque black boxes. To address these issues, we developed DITTO, a unified, explainable, transcript-aware machine-learning method for pathogenicity prediction. It integrates diverse genomic features, including conservation scores and population frequencies, to train a single, explainable neural network model. DITTO outperforms existing methods across standard benchmarks, achieving a 99% F1 score in classifying pathogenic and benign variants. DITTO is publicly available at https://github.com/uab-cgds-worthey/DITTO

Subject areas: Biological sciences/Computational biology and bioinformatics; Biological sciences/Genetics
Keywords: rare disease, genomics, explainable, machine learning, pathogenic, variant consequence, classification, prioritization

Additional Declarations: No competing interests reported.
Supplementary Files: supptables.xlsx

Editorial timeline
Editorial decision: Revision requested 15 May, 2026
Reviews received at journal 14 May, 2026
Reviewers agreed at journal 07 May, 2026
Reviews received at journal 06 May, 2026
Reviewers agreed at journal 06 May, 2026
Reviewers agreed at journal 05 May, 2026
Reviewers agreed at journal 05 May, 2026
Reviewers invited by journal 27 Feb, 2026
Editor invited by journal 27 Feb, 2026
Editor assigned by journal 23 Feb, 2026
Submission checks completed at journal 23 Feb, 2026
First submitted to journal (Scientific Reports) 20 Feb, 2026