Automated Prediction of Radiological Protocols Using Retrieval Augmented Generation

Research Article

Conrad T. Testagrose, Panagiotis Korfiatis, Ph.D., Timothy L. Kline, Ph.D., and 7 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7623430/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Radiological protocol selection is a critical but time-consuming step in clinical workflow, requiring radiologists to match patient indications with the appropriate MRI or CT protocol. Manual selection can be prone to delays or potential errors, and automated approaches must contend with substantial data imbalance, site-specific variation, and evolving nomenclature.
We investigated whether a large language model (LLM) can support reliable protocol selection at scale and whether retrieval-augmented generation (RAG) offers operational advantages over direct fine-tuning. Using 498,228 patient reports collected across three Mayo Clinic sites (Arizona, Florida, and Rochester) spanning six radiological divisions, we trained site-specific Llama 3.2 3B models for use with and without retrieval augmentation. Division-scoped Facebook AI Similarity Search (FAISS) indexes constructed from procedure and diagnosis text were used to supply contextual evidence in the RAG framework.

Both fine-tuned and RAG-augmented models achieved strong performance across sites, with F1 scores of 0.88–0.90. RAG matched or modestly trailed direct fine-tuning overall but delivered consistent gains in specific divisions (e.g., musculoskeletal imaging). Importantly, the RAG model introduced abstention behavior (Not Enough Information), which concentrated in linguistically diverse divisions and provided an interpretable signal of uncertainty.

These findings suggest that RAG-based models are viable for division-scoped protocol selection and offer practical advantages. Retrieval indexes can be refreshed far more easily and with fewer resources than retraining LLMs, enabling continual adaptation to evolving clinical workflows. Future prospective deployment will evaluate real-time accuracy, agreement with practitioners, and the role of abstention as a safety mechanism in clinical decision support.

Subject areas: Artificial Intelligence and Machine Learning; Nuclear Medicine & Medical Imaging; Medical Informatics

Keywords: Deep Learning, Large Language Models, Radiology, Retrieval Augmented Generation

Additional Declarations

The authors declare no competing interests.
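To make the idea of division-scoped retrieval with an abstention outcome more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It rests on several assumptions that are not in the abstract: a simple token-overlap (Jaccard) similarity stands in for the FAISS embedding search, a score-weighted vote over retrieved neighbors stands in for the fine-tuned Llama 3.2 3B model, and abstention is triggered by a hypothetical similarity threshold (`min_score`). The division names, indications, and protocol labels are invented for illustration.

```python
from collections import Counter

ABSTAIN = "Not Enough Information"  # abstention label named in the abstract


def tokens(text):
    """Lowercased whitespace tokens as a set (assumed stand-in for an embedding)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Token-overlap similarity in [0, 1]; a simplification of vector search."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


class DivisionScopedIndex:
    """Keeps one small retrieval store per radiological division."""

    def __init__(self):
        self._by_division = {}

    def add(self, division, indication, protocol):
        self._by_division.setdefault(division, []).append((indication, protocol))

    def retrieve(self, division, query, k=3):
        """Return the top-k (score, protocol) pairs from one division only."""
        entries = self._by_division.get(division, [])
        scored = [(jaccard(query, ind), prot) for ind, prot in entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def predict_protocol(index, division, query, k=3, min_score=0.2):
    """Score-weighted vote over retrieved neighbors; abstain if evidence is weak.

    min_score is a hypothetical threshold chosen for this sketch.
    """
    hits = [(s, p) for s, p in index.retrieve(division, query, k) if s >= min_score]
    if not hits:
        return ABSTAIN
    votes = Counter()
    for score, protocol in hits:
        votes[protocol] += score
    return votes.most_common(1)[0][0]


# Hypothetical example data (not taken from the study).
index = DivisionScopedIndex()
index.add("musculoskeletal", "knee pain after injury", "MRI KNEE WITHOUT CONTRAST")
index.add("musculoskeletal", "chronic knee pain", "MRI KNEE WITHOUT CONTRAST")
index.add("musculoskeletal", "shoulder pain rotator cuff tear", "MRI SHOULDER WITHOUT CONTRAST")
index.add("neuro", "headache with visual changes", "MRI BRAIN WITH AND WITHOUT CONTRAST")

print(predict_protocol(index, "musculoskeletal", "knee pain"))
print(predict_protocol(index, "neuro", "knee pain"))
```

In this sketch, refreshing the retrieval store is just another call to `add`, which loosely mirrors the abstract's point that indexes can be updated without retraining a model. A query sent to a division with no sufficiently similar entries falls through to the abstention label rather than forcing a prediction.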
Authors and affiliations

Conrad T. Testagrose, Mayo Clinic Florida (corresponding author)
Panagiotis Korfiatis, Ph.D., Mayo Clinic Rochester
Timothy L. Kline, Ph.D., Mayo Clinic Rochester
Justin D. Benfield, Mayo Clinic Rochester
Cole J. Cook, Ph.D., Mayo Clinic Rochester
Peggy S. Merkel, Mayo Clinic Rochester
Mutlu Demirer, Ph.D., Mayo Clinic Florida
Richard D. White, M.D., Mayo Clinic Jacksonville
Candice W. Bolan, M.D., Mayo Clinic Florida
Barbaros S. Erdal, Ph.D., Mayo Clinic Florida

Posted on Research Square (ISSN 2693-5015, online) on September 17th, 2025.
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)