Using Machine Learning Models to Predict the Impact of Template Mismatches on Polymerase Chain Reaction (PCR) Assay Performance

Brittany Knight, Taylor Otwell, Michael P. Coryell, Jennifer Stone, and 6 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-4830873/v1
This work is licensed under a CC BY 4.0 License
Status: Published. Journal publication published 09 May, 2025. Read the published version in Scientific Reports.
Version 1 posted September 3rd, 2024

Abstract
Molecular assays are critical tools for the diagnosis of infectious diseases. These assays were extremely valuable during the COVID-19 pandemic, when they guided both patient management and infection control strategies. Sustained transmission and unhindered proliferation of the virus during the pandemic resulted in many variants with unique mutations.
Some of these mutations could lead to signature erosion, where tests developed using the genetic sequence of an earlier version of the pathogen may produce false negative results when used to detect novel variants. In this study, we assessed the performance changes of 15 molecular assay designs when challenged with a variety of mutations that fall within the targeted region. Using data generated from this study, we trained seven different machine learning models and assessed how well each predicts whether a specific set of mutations will result in a significant change in the performance of a specific test design. The best-performing model demonstrated acceptable performance, with a sensitivity of 82% and a specificity of 87% when assessed using 10-fold cross-validation. Our findings highlight the potential of using machine learning models to predict the impact of emerging mutations on the performance of specific molecular test designs.

Subject areas: Health sciences/Biomarkers/Diagnostic markers; Biological sciences/Biological techniques; Biological sciences/Biotechnology
Keywords: Signature Erosion, qPCR performance, in silico prediction, false negative result, supervised learning

Additional Declarations
No competing interests reported.
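The evaluation summarized in the abstract (sensitivity and specificity under 10-fold cross-validation) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the single "mismatch count" feature, the toy labels, and the threshold classifier are hypothetical stand-ins and are not the study's data, features, or any of its seven models. The sketch only shows how out-of-fold predictions are pooled and scored.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k (train, test) pairs, round-robin."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def fit_threshold(x_train: Sequence[int], y_train: Sequence[int]) -> int:
    """Hypothetical stand-in model: pick the cutoff with the best training accuracy."""
    def n_correct(t: int) -> int:
        return sum(1 for x, y in zip(x_train, y_train) if (1 if x >= t else 0) == y)
    return max(sorted(set(x_train)), key=n_correct)


def cross_validated_predictions(x: Sequence[int], y: Sequence[int], k: int = 10) -> List[int]:
    """Fit on k-1 folds, predict the held-out fold, and pool the predictions."""
    preds = [0] * len(x)
    for train, test in k_fold_indices(len(x), k):
        cutoff = fit_threshold([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = 1 if x[i] >= cutoff else 0
    return preds


# Toy data (hypothetical): label 1 means "significant performance change".
mismatch_counts = [0, 1, 2, 3, 4, 5] * 5
labels = [1 if c >= 3 else 0 for c in mismatch_counts]

pooled = cross_validated_predictions(mismatch_counts, labels, k=10)
sens, spec = sensitivity_specificity(labels, pooled)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Pooling the out-of-fold predictions before scoring gives one sensitivity and one specificity for the whole dataset. Averaging per-fold values is a common alternative, and the abstract does not say which the authors used.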
Supplementary Files
SupplementaryFile1.xlsx
SupplementaryFile2.xlsx

Editorial decision: Revision requested, 04 Mar, 2025
Reviewers agreed at journal: 29 Jan, 2025
Reviews received at journal: 12 Sep, 2024
Reviewers agreed at journal: 27 Aug, 2024
Reviewers agreed at journal: 27 Aug, 2024
Reviewers agreed at journal: 27 Aug, 2024
Reviewers invited by journal: 27 Aug, 2024
Editor assigned by journal: 27 Aug, 2024
Editor invited by journal: 07 Aug, 2024
Submission checks completed at journal: 07 Aug, 2024
First submitted to journal: 30 Jul, 2024
Authors
Brittany Knight, MRIGlobal
Taylor Otwell, MRIGlobal
Michael P. Coryell, U.S. Food and Drug Administration
Jennifer Stone, MRIGlobal
Phillip Davis, MRIGlobal
Bryan Necciai, Enabling Biotechnologies JPL-EB
Paul E. Carlson, U.S. Food and Drug Administration
Shanmuga Sozhamannan, Enabling Biotechnologies JPL-EB
Alyxandria M. Schubert, U.S. Food and Drug Administration
Yi H. Yan, U.S. Food and Drug Administration (corresponding author)

Published version: Scientific Reports, May 9th, 2025, https://doi.org/10.1038/s41598-025-98444-8
License: CC BY 4.0, https://creativecommons.org/licenses/by/4.0/