Automatic Processing of Gastrointestinal Endoscopy Referrals and Patient Communication Using Large Language Models

Research Article

Yuri Gorelik, Eyas Awawdeh, Ariel Gralnek, Amir Klein
Rambam Healthcare Campus

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7860274/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 03 Feb, 2026. Read the published version in BMC Gastroenterology: https://doi.org/10.1186/s12876-026-04636-5
Version 1 posted 12

Abstract

Background
Open-access endoscopy relies on referrals that are vetted manually, a resource-consuming process that is prone to potential biases. We assessed whether large language models (LLMs) can provide accurate recommendations on gastrointestinal endoscopy referrals.

Methods
We extracted 200 multilingual endoscopy referrals. We evaluated OpenAI’s o3 and Google’s Gemini 2.5-pro.
A prompt was developed and tuned on a set of 20 referrals and tested on the remaining 180 referrals. Eight variables were tested: procedure type; indication; need for an anesthesiologist; withdrawal of anti-aggregants, of anti-coagulants, and of GLP-1 agonists (three separate variables); implantable electronic devices; and need for intensified preparation. Accuracy and F1 scores were analyzed using bootstrapping, and the models were compared with McNemar’s test. Confusion matrices were calculated. Additionally, o3 generated patient-specific visual timelines.

Results
Among 200 referrals, 88 (44%) were referred for colonoscopy and 53 (26.5%) for gastroscopy; 65 (32.5%) required an anesthesiologist and 65 (32.5%) required intensified preparation. o3 achieved 0.91–1.00 accuracy across all eight variables, whereas Gemini ranged from 0.89 to 0.99. Confusion-matrix analysis confirmed high precision and specificity for both models (≥ 0.90 and ≥ 0.92, respectively). o3 generated accurate, patient-specific visual timelines for sampled cases.

Conclusion
LLMs are highly accurate in processing endoscopy referrals and can generate patient-specific instructions, offering a solution to streamline open-access endoscopy.

Large-Language Models Endoscopy Referrals Artificial Intelligence

Figures
Figure 1. Radar plot of recommendation accuracy of o3 and Gemini 2.5-pro.
Figure 2. Examples of visual, patient-specific, LLM-generated patient instructions in a timeline format. Overall, despite variability in aesthetics and choice of illustrations, all timelines included all crucial instructions, such as the time of omission of relevant medications and details on intensified bowel preparation.

INTRODUCTION

In open-access endoscopy, endoscopic procedures are scheduled based on a written referral letter without a prior consultation or clinic visit. This is a common practice worldwide, which ensures timely and efficient completion of procedures [1].

The endoscopy referral letter should include the indication for the procedure, the patient’s medical background, and a list of medications. To ensure the safety and quality of the procedure, a physician must review all referrals and consider factors that affect patient and procedure safety and efficacy, such as the need for an anesthesiologist, use and omission of medications, and type of bowel preparation.
Additionally, concerns have been raised regarding the potential overuse and inappropriate use of endoscopy, particularly open-access endoscopy [2].

Large language models (LLMs) can process medical information in multiple forms and languages and have shown substantial promise in medicine, and in gastroenterology specifically [3]. Several studies have suggested that LLMs can be used to evaluate medical documents and to identify and extract selected features [4,5].

In this study, we aimed to evaluate LLMs as tools for reviewing gastrointestinal endoscopy referrals and for providing accurate pre-endoscopy recommendations and visual patient instructions.

METHODS

Design

This was a proof-of-concept study conducted at the gastroenterology institute of a tertiary care hospital. We evaluated 200 referrals for gastrointestinal endoscopy from various healthcare providers, each using different electronic healthcare software and generating a different referral format. Referrals included structured data, such as demographic details, coded diagnoses, medication information, and allergies and sensitivities, and free-text data, usually describing patient history, physical examination, and a summary of the procedure indication. Referrals were written in Hebrew and English.

The evaluated models were OpenAI’s o3 and Google’s Gemini 2.5-pro. The models were tasked with assessing eight parameters of referral management. Use of o3 via the Microsoft Azure platform and of Gemini 2.5 via the Google Vertex platform complies with the Health Insurance Portability and Accountability Act (HIPAA) and has received local health ministry authorization for use with patient data. This study was approved by the institutional review board (RMB-D-0568-23).

Variables and Outcomes

The variables we included were: procedure (esophagogastroduodenoscopy [EGD], colonoscopy, or both); correct identification of the indication; need for an anesthesiologist; identification of anti-aggregants, anti-coagulants, and GLP-1RAs, with a recommendation on the timing of omission (a separate variable for each class); identification of a cardiac implantable electronic device; and need for an intensified bowel preparation. The recommendations were based on our local practices and on established society guidelines [6–8].

Two expert gastroenterologists reviewed all referrals and provided answers and recommendations on all eight aspects. We assigned the first 20 referrals to a training set for prompt development, which proceeded through iterative prompt tuning, mostly with a chain-of-thought approach [9]. The remaining 180 referrals were used to test the accuracy of recommendations and to calculate performance metrics.

The outcomes were the accuracy of each model compared with the gold-standard physician recommendations and the comparison of accuracies between the models for each parameter. For each binary recommendation variable, we generated confusion matrices comparing LLM output with expert reference answers.

Finally, 20 randomly selected referrals were used to test a request to prepare a custom patient letter and visual timeline. These visual instructions were created using o3, and we qualitatively assessed whether they gave accurate, patient-specific instructions.

Statistical Analysis

All statistical analyses and plot generation were performed using R Statistical Software (v4.5.0; R Core Team). The baseline distribution of parameters is presented using medians and interquartile ranges. Accuracy rates and 95% confidence intervals (CI) of each model for each parameter were calculated using bootstrapping (1,000 resamples).
To account for class imbalance in several variables, we also reported the F1 score, the harmonic mean of precision and recall. F1 scores were computed for each variable, 95% percentile-bootstrap confidence intervals were derived from 1,000 resamples, and the two models were compared by bootstrapping the paired difference in F1 to obtain confidence intervals and two-sided p-values. McNemar’s test was used to compare the accuracies of the two models for each variable. For confusion matrices, we derived accuracy, precision (positive predictive value), recall (sensitivity), and specificity.

RESULTS

Baseline variables of the referrals and their distributions are presented in Table 1. Of these referrals, 88 (44.0%) were for colonoscopy alone and 53 (26.5%) for gastroscopy alone. According to gold-standard physician recommendations, anesthesiologist assistance was required in 65 (32.5%) procedures, and intensified bowel preparation was recommended in 65 (32.5%) cases (Table 1).

Table 1. Distribution of variables in the endoscopy referrals (according to gold-standard physician recommendations)

Parameter                        N (%), (n = 200)
Procedure
  Gastroscopy                    53 (26.5)
  Colonoscopy                    88 (44)
  Both                           59 (29.5)
Anesthesiologist required        65 (32.5)
Anti-aggregant use               41 (20.5)
Anti-coagulant use               15 (7.5)
CIED                             5 (2.5)
GLP-1RA                          21 (10.5)
Intensified bowel preparation    65 (32.5)

CIED, cardiac implantable electronic device; GLP-1RA, glucagon-like peptide-1 receptor agonist

Prompt development started from the full guidelines and basic instructions. Adding instructions such as identifying the patient’s body mass index, and identifying drug allergies so that they were not treated as drugs in use, improved accuracy. Final prompts included explanations, decision rules for each parameter, and a specified output format. All prompts are available in the Supplemental material.
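
The interval estimates and the model comparison described under Statistical Analysis can be illustrated with a minimal Python sketch. The study itself used R and its code is not reproduced here; the function names, the 0/1 correctness vectors, and the use of the exact (binomial) form of McNemar's test are assumptions made for illustration only.

```python
import random
from math import comb


def bootstrap_accuracy_ci(correct, n_resamples=1000, alpha=0.05, seed=0):
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    `correct` holds one 0/1 flag per referral (1 = model matched the reference).
    """
    rng = random.Random(seed)
    n = len(correct)
    resampled = []
    for _ in range(n_resamples):
        # Draw n referrals with replacement and record the resampled accuracy.
        hits = sum(correct[rng.randrange(n)] for _ in range(n))
        resampled.append(hits / n)
    resampled.sort()
    lower = resampled[int(n_resamples * alpha / 2)]
    upper = resampled[int(n_resamples * (1 - alpha / 2)) - 1]
    return sum(correct) / n, lower, upper


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired 0/1 correctness flags.

    Returns (b, c, p_value), where b counts referrals only model A got right
    and c counts referrals only model B got right. Assumption: the exact
    binomial variant is used; the paper does not state which variant it applied.
    """
    b = sum(1 for a, g in zip(correct_a, correct_b) if a and not g)
    c = sum(1 for a, g in zip(correct_a, correct_b) if g and not a)
    n = b + c
    if n == 0:
        return b, c, 1.0  # No discordant pairs: no evidence of a difference.
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical test set of 180 referrals; these are not the study data.
    model_a = [1] * 170 + [0] * 10
    model_b = [1] * 166 + [0] * 14
    print(bootstrap_accuracy_ci(model_a))
    print(mcnemar_exact(model_a, model_b))
```

Only the discordant pairs enter McNemar's test, which is why two models with similar accuracy on the same referrals can be compared without assuming their errors are independent.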

Accuracy rates for o3 and Gemini 2.5-pro, respectively, were very high for all variables: procedure, 0.99 (0.97–1.00) and 0.99 (0.97–1.00); indication, 0.91 (0.87–0.94) and 0.89 (0.85–0.94); need for an anesthesiologist, 0.96 (0.92–0.98) and 0.94 (0.91–0.97); anti-aggregant management and recommended withdrawal, 0.98 (0.97–1.00) and 0.99 (0.97–1.00); anti-coagulant management and recommended withdrawal interval, 1.00 (1.00–1.00) and 0.99 (0.97–1.00); GLP-1RA management, 0.99 (0.97–1.00) and 0.99 (0.98–1.00); cardiac implantable electronic device (CIED), 1.00 (1.00–1.00) and 0.99 (0.98–1.00); need for an intensified bowel preparation, 0.95 (0.92–0.98) and 0.92 (0.88–0.96). Comparison of accuracy rates between the models did not show a significant difference for any variable. These accuracy rates and comparisons are presented in Fig. 1.

F1 scores demonstrated robust performance for the binary variables. For o3 versus Gemini 2.5-pro, respectively, the F1 values and 95% CI were: anesthesiologist, 0.95 (0.91–0.98) vs 0.93 (0.89–0.97); anti-aggregant identification, 0.97 (0.94–1.00) vs 0.98 (0.95–1.00); anti-coagulant management, 1.00 (1.00–1.00) vs 0.64 (0.60–1.00); GLP-1 receptor agonist management, 0.97 (0.93–1.00) vs 0.99 (0.95–1.00); CIED recognition, 1.00 (1.00–1.00) vs 0.95 (0.83–1.00); and intensified bowel preparation requirement, 0.94 (0.91–0.98) vs 0.91 (0.87–0.95). There were no significant differences in F1 scores between the two models for the binary variables.

Diagnostic metrics of the binary variables are presented in Table 2. Across the binary variables, o3 achieved an accuracy ranging from 0.95 to 1.00, with perfect performance for all anticoagulant-related decisions and CIED recognition. Gemini showed a comparable accuracy profile, ranging from 0.89 to 0.95. Both models maintained very high specificity (≥ 0.91 for all variables), and precision exceeded 0.95 for all variables (except for GLP-1 discontinuation by o3 [0.90]).
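
The diagnostic metrics reported for the binary variables all follow from the four cells of a confusion matrix. The sketch below, in Python, shows one way to derive them; the function names and the toy labels are hypothetical and are not taken from the study, which performed its analysis in R.

```python
def confusion_counts(reference, predicted):
    """Count (tp, fp, tn, fn) from paired 0/1 labels.

    `reference` is the expert (gold-standard) label, `predicted` the model label.
    """
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        if pred and ref:
            tp += 1
        elif pred and not ref:
            fp += 1
        elif ref:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn):
    """Derive accuracy, precision, recall, specificity, and F1 from the counts."""

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": ratio(tp, tp + fp),      # positive predictive value
        "recall": ratio(tp, tp + fn),         # sensitivity
        "specificity": ratio(tn, tn + fp),
        # Harmonic mean of precision and recall: 2PR / (P + R).
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not the study data.
    reference = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted = [1, 1, 0, 1, 0, 0, 0, 1]
    print(binary_metrics(*confusion_counts(reference, predicted)))
```

Because F1 ignores true negatives, it is less flattered than accuracy when the positive class is rare, as for anti-coagulant use or CIED in Table 1.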

Table 2. Separate performance metrics for each of the binary variables

Variable                  Sensitivity      Specificity      Precision        Accuracy
                          o3     Gemini    o3     Gemini    o3     Gemini    o3     Gemini
Anesthesiologist          0.98   0.93      0.92   0.97      0.96   0.98      0.96   0.94
Anti-aggregants           0.94   0.97      0.99   0.99      0.97   0.97      0.98   0.99
Anti-coagulants           1      0.93      1      0.98      1      1         1      0.99
GLP-1RA                   1      1         0.99   0.99      0.91   0.95      0.99   0.99
Intensified preparation   0.95   0.93      0.95   0.92      0.9    0.85      0.95   0.92

GLP-1RA, glucagon-like peptide-1 receptor agonist; CIED, cardiac implantable electronic device

We randomly selected 20 referrals and asked o3 to create a patient-specific timeline and then to generate a visual version of it. All images included crucial details, such as instructions on the timing of medication omissions, details on intensified bowel preparation, and a reminder to bring breathing-assistance devices (Fig. 2).

DISCUSSION

In the current study, LLMs achieved near-perfect accuracy and F1 scores in open-access endoscopy referral management. Furthermore, o3 was able to generate simplified, patient-specific visual timelines to aid patient communication.

Referral management in healthcare, particularly in gastroenterology, is a labor-intensive task that typically requires medical experts to review each referral individually. Previous studies have attempted to automate this process using artificial intelligence methods. Campbell et al. evaluated the feasibility of an automated electronic health record (EHR)-based alert system to identify parameters requiring modifications to patient management before endoscopy, such as elevated body mass index or use of positive airway pressure devices, similar to what we investigated in our study [10]. However, relying on recorded and indexed EHR data can be problematic because such data may not always be comprehensive or accessible in all settings.
Other studies have assessed natural language processing tools to evaluate and prioritize patient referrals, but they did not include endoscopy referrals and focused primarily on prioritization [11]. In contrast, our study is novel in showing that LLMs can accurately assess referrals of varying formats, encompassing multiple languages as well as structured and unstructured data, and can generate accurate visual summaries of the resulting recommendations.

Our study has several limitations. First, although we analyzed referrals from multiple providers, all were from the same region and referred to a single center, which limits generalizability. Additionally, an important component of referral management, namely the review of the appropriateness of procedure indications, was not evaluated in this study, so the automation of the referral process currently lacks this critical dimension.

In conclusion, LLMs can be employed to evaluate open-access referrals for gastrointestinal endoscopy and to provide highly accurate pre-endoscopy recommendations and, potentially, simple patient communication. Future research should focus on the real-time use and integration of LLMs into EHRs.

Declarations

ETHICS
Human Ethics and Consent to Participate declarations: not applicable.
This study was approved by the institutional review board (RMB-D-0568-23).

FUNDING
This study was not supported by any funding.

Author Contribution
YG: study concept, study design, analysis, manuscript preparation; EA: data curation, manuscript review; AG: data curation, manuscript review; AK: study supervision, study design, interpretation of results, manuscript review.

References

1. Chandrasekhara V, Eloubeidi MA, Bruining DH, et al. Open-access endoscopy. Gastrointest Endosc 2015;81:1326-1329.
2. Shaheen NJ, Fennerty MB, Bergman JJ. Less Is More: A Minimalist Approach to Endoscopy. Gastroenterology 2018;154:1993-2003.
3. Shahab O, El Kurdi B, Shaukat A, et al. Large language models: a primer and gastroenterology applications. Therap Adv Gastroenterol 2024;17.
4. Wiest IC, Ferber D, Zhu J, et al. Privacy-preserving large language models for structured medical information retrieval. npj Digital Medicine 2024;7:1-9.
5. Wang L, Ma Y, Bi W, et al. An Entity Extraction Pipeline for Medical Text Records Using Large Language Models: Analytical Study. J Med Internet Res 2024;26:e54580.
6. Abraham NS, Barkun AN, Sauer BG, et al. American College of Gastroenterology-Canadian Association of Gastroenterology Clinical Practice Guideline: Management of Anticoagulants and Antiplatelets during Acute Gastrointestinal Bleeding and the Periendoscopic Period. American Journal of Gastroenterology 2022;117:542-558.
7. Sidhu R, Turnbull D, Haboubi H, et al. British Society of Gastroenterology guidelines on sedation in gastrointestinal endoscopy. Gut 2024;73:1-27.
8. Kindel TL, Wang AY, Wadhwa A, et al. Multisociety Clinical Practice Guidance for the Safe Use of Glucagon-like Peptide-1 Receptor Agonists in the Perioperative Period. Clinical Gastroenterology and Hepatology 2024;0.
9. Zaghir J, Naguib M, Bjelogrlic M, Névéol A, Tannier X, Lovis C. Prompt Engineering Paradigms for Medical Applications: Scoping Review. J Med Internet Res 2024;26:e60501.
10. Campbell EJ, Krishnaraj A, Harris M, et al. Automated before-procedure electronic health record screening to assess appropriateness for GI endoscopy and sedation. Gastrointest Endosc 2012;76:786-792.
11. Abdel-Hafez A, Jones M, Ebrahimabadi M, et al. Artificial intelligence in medical referrals triage based on Clinical Prioritization Criteria. Front Digit Health 2023;5:1192975.

Additional Declarations
No competing interests reported.

Supplementary Files
supplementarymaterial.docx

Version 1 posted
Editorial decision: Revision requested, 19 Dec, 2025
Reviews received at journal, 17 Dec, 2025
Reviewers agreed at journal, 14 Dec, 2025
Reviews received at journal, 30 Nov, 2025
Reviewers agreed at journal, 29 Nov, 2025
Reviewers agreed at journal, 14 Nov, 2025
Reviewers agreed at journal, 12 Nov, 2025
Reviewers invited by journal, 12 Nov, 2025
Editor invited by journal, 17 Oct, 2025
Editor assigned by journal, 16 Oct, 2025
Submission checks completed at journal, 16 Oct, 2025
First submitted to journal, 14 Oct, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7860274","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":549030267,"identity":"363070d9-1681-4640-8b30-b5c9bc01ad81","order_by":0,"name":"Yuri Gorelik","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABMUlEQVRIie3SMUvEMBQH8CeFdgnOL+S4+hFSBDnw0NEP4RIp1MWg4FJQMEWoy4GrfpMbI4Xe0sVNUbDlwEnh0KUgiNHzQL3WWTB/eATe45eXIQA2Nn83PYKeAlgwBWiqBD0diL02ggSJ/kKEmBHeSgBRNBGYJ1x7l+M4xg49f6yerodrm8COLyZb9S34GwMOZRMh+0FRIGFse5nJIpSqk4coxD2YNm96mCERTVIkXRYBk6kjFe5wQzIIziLRTl4NobnzItMjQ3af69+Jl9NEmYeh65ot2fsW92OLj6FuIlQTJ1A5EjqI3FVZjGSK0UpPRBnhJNO6gSxqr6rUYX8dR7lzI4cH8hTD8dWkn3X9kySp6niOLN09/LjH/TwJ1zD7A9/iK69saE9HLQMbGxub/5Y3KiVsDeoVkY4AAAAASUVORK5CYII=","orcid":"","institution":"Rambam Healthcare Campus","correspondingAuthor":true,"prefix":"","firstName":"Yuri","middleName":"","lastName":"Gorelik","suffix":""},{"id":549030268,"identity":"eaa4c162-fd2c-4e68-931e-3ea37cb8a478","order_by":1,"name":"Eyas Awawdeh","email":"","orcid":"","institution":"Rambam Healthcare Campus","correspondingAuthor":false,"prefix":"","firstName":"Eyas","middleName":"","lastName":"Awawdeh","suffix":""},{"id":549030269,"identity":"6d2d5bf1-27f4-4e8e-bd54-d1df572d535c","order_by":2,"name":"Ariel Gralnek","email":"","orcid":"","institution":"Rambam Healthcare Campus","correspondingAuthor":false,"prefix":"","firstName":"Ariel","middleName":"","lastName":"Gralnek","suffix":""},{"id":549030270,"identity":"4d6e7d5b-57cd-4240-bbde-ac80c1ec2ca3","order_by":3,"name":"Amir 
Klein","email":"","orcid":"","institution":"Rambam Healthcare Campus","correspondingAuthor":false,"prefix":"","firstName":"Amir","middleName":"","lastName":"Klein","suffix":""}],"badges":[],"createdAt":"2025-10-14 15:38:18","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7860274/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7860274/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12876-026-04636-5","type":"published","date":"2026-02-03T15:57:11+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":96708683,"identity":"76715946-0235-4064-8018-603aa547072f","added_by":"auto","created_at":"2025-11-25 10:05:05","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":89853,"visible":true,"origin":"","legend":"","description":"","filename":"Automaticreferralprocessingmanuscript.docx","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/a42a6a6295712c634b8ae58c.docx"},{"id":96616002,"identity":"3a7a8990-bddb-4949-b8bb-1362a04b1aed","added_by":"auto","created_at":"2025-11-24 10:21:32","extension":"png","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":262869,"visible":true,"origin":"","legend":"","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/8483cb5fa51aac551719cc49.png"},{"id":96616006,"identity":"d24bced3-83d0-400e-b5ad-a78fcdbae605","added_by":"auto","created_at":"2025-11-24 10:21:33","extension":"jpg","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1860208,"visible":true,"origin":"","legend":"","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/1ef0f08209e60fc70843111c.jpg"},{"id":96616016,"identity":"6c3c8966-fe2c-4bc8-8df3-fd8e77570278","added_by":"auto","created_at":"2025-11-24 
10:21:33","extension":"json","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":5744,"visible":true,"origin":"","legend":"","description":"","filename":"c6c26c091d5e4edfa180c33dc309b84c.json","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/6ab25a770f034075a823bb11.json"},{"id":96616008,"identity":"e28d079f-01b8-43f5-9791-97a957bffa33","added_by":"auto","created_at":"2025-11-24 10:21:33","extension":"docx","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":22693,"visible":true,"origin":"","legend":"","description":"","filename":"supplementarymaterial.docx","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/1c0bfc35e425c7f9ceb9108f.docx"},{"id":96616009,"identity":"3959a219-e3fc-43b0-86fc-18187bebeb4c","added_by":"auto","created_at":"2025-11-24 10:21:33","extension":"xml","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":43287,"visible":true,"origin":"","legend":"","description":"","filename":"c6c26c091d5e4edfa180c33dc309b84c1enriched.xml","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/f5506591cbe398a293296822.xml"},{"id":96708389,"identity":"64af0660-98b6-4448-93ef-5cd7e996cec3","added_by":"auto","created_at":"2025-11-25 10:01:39","extension":"png","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":262869,"visible":true,"origin":"","legend":"","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/c5f9bf3a7abdd7b06621c5ee.png"},{"id":96616010,"identity":"5d0699bb-7173-41d1-b38b-4b6e49bd1f4a","added_by":"auto","created_at":"2025-11-24 
10:21:33","extension":"jpg","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1860208,"visible":true,"origin":"","legend":"","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/9700fc883770f5ca1d99e269.jpg"},{"id":96708480,"identity":"f179dbf6-7ae7-49b5-9516-da0258ab1935","added_by":"auto","created_at":"2025-11-25 10:03:32","extension":"png","order_by":8,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":46977,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure1.png","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/22082177a1e3e47fd6cf44c7.png"},{"id":96616014,"identity":"3c5ca899-d7a0-465f-ad8f-c7348ef02d75","added_by":"auto","created_at":"2025-11-24 10:21:33","extension":"png","order_by":9,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":1478374,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure2.png","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/fe3107345407847eef1c6ccb.png"},{"id":96616013,"identity":"843efef9-3343-4229-bece-db39f34d7aa6","added_by":"auto","created_at":"2025-11-24 10:21:33","extension":"xml","order_by":10,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":42044,"visible":true,"origin":"","legend":"","description":"","filename":"c6c26c091d5e4edfa180c33dc309b84c1structuring.xml","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/aae7d3d385ecbc729ebf4cb2.xml"},{"id":96616015,"identity":"8541394a-6a0a-4025-b2c3-ecb75f9b5ada","added_by":"auto","created_at":"2025-11-24 
10:21:33","extension":"html","order_by":11,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":48022,"visible":true,"origin":"","legend":"","description":"","filename":"earlyproof.html","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/e55516002511a29052117d5c.html"},{"id":96616003,"identity":"0c37a3f9-adb2-4f46-9b0c-3c46b82b28b1","added_by":"auto","created_at":"2025-11-24 10:21:32","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":262869,"visible":true,"origin":"","legend":"\u003cp\u003eRadar plot of recommendation accuracy of o3 and Gemini 2.5-pro.\u003c/p\u003e","description":"","filename":"Figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/14460e8001f65f57560509ac.png"},{"id":96708626,"identity":"b1862359-7d27-4520-9314-55f224835b98","added_by":"auto","created_at":"2025-11-25 10:04:52","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":1860208,"visible":true,"origin":"","legend":"\u003cp\u003eExamples of visual patient specific LLM generated patient instructions with a timeline format. 
Overall, although variability in aesthetics and choice of illustrations, all timelines included all crucial instructions such as time of omission of relevant medications and details on intensified bowel preparation.\u003c/p\u003e","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/7ccd4c46df6ddad09645fbb7.jpg"},{"id":102233966,"identity":"398ba34a-efb7-4935-8a47-3883ba9c832d","added_by":"auto","created_at":"2026-02-09 16:00:58","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2508727,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/eb93c406-1ffd-45dc-8937-96e4cc29fbc2.pdf"},{"id":96616005,"identity":"43a51b7f-743d-4a5b-8f68-5c2820e433e7","added_by":"auto","created_at":"2025-11-24 10:21:33","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":22693,"visible":true,"origin":"","legend":"","description":"","filename":"supplementarymaterial.docx","url":"https://assets-eu.researchsquare.com/files/rs-7860274/v1/bc0a3a9a574c104c41712509.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Automatic Processing of Gastrointestinal Endoscopy Referrals and Patient Communication Using Large Language Models","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eIn open-access endoscopy, endoscopic procedures are scheduled based on a written referral letter without a prior consultation or clinic visit. This is a common, practice world-wide, which ensures timely and efficient completion of procedures.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003eThe endoscopy referral letter should include the indication for the procedure, the patient\u0026rsquo;s medical background and a list of medications. 
To ensure safety and quality of the procedure, a physician must review all referrals and consider factors that affect patient and procedure safety and efficacy such as the need for an anesthesiologist, use and omission of medications and type of bowel preparation. Additionally, concerns have been raised regarding the potential overuse and inappropriate use of endoscopy, particularly open-access endoscopy.\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003eLarge language models (LLMs) are capable of processing medical information in multiple forms and languages and have shown substantial promise in medicine and gastroenterology specifically.\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e Several studies suggested that LLMs can be used to evaluate medical documents and identify and extract selected features.\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e,\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003eIn this study, we aimed to evaluate LLMs as tools for reviewing gastrointestinal endoscopy referrals, providing accurate pre-endoscopy recommendations and visual patient instructions.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eDesign\u003c/h2\u003e\u003cp\u003eThis was a proof of study held at a gastroenterology institute of a tertiary care hospital. We evaluated 200 referrals for gastrointestinal endoscopy from various healthcare providers, each using different electronic healthcare software and generating different referral format. 
Referrals include structured data such as demographic details, coded diagnoses, medication information, allergies and sensitivities, and free-text data usually describing patient history, physical examination and procedure indication summary. Referrals include both Hebrew and English.\u003c/p\u003e\u003cp\u003eThe evaluated models were OpenAI\u0026rsquo;s o3 and Google\u0026rsquo;s Gemini 2.5-pro. The models were aimed to assess eight parameters of referral management. The model usage of o3 via the Microsoft Azure platform and Gemini 2.5 via the Google Vertex platform comply with the Health Insurance Portability and Accountability Act (HIPAA) and have received local health ministry authorization for use with patient data. This study was approved by the institutional review board (RMB-D-0568-23).\u003c/p\u003e\u003c/div\u003e\n\u003ch3\u003eVariables and Outcomes\u003c/h3\u003e\n\u003cp\u003eThe variables we included were: procedure (esophagogastriduodenoscopy [EGD], colonoscopy, or both); correct identification of the indication; need for an anesthesiologist; identification of anti-aggregants, anti-coagulants, and GLP-1RAs, and recommendation of timing of omission (a separate variable for each class); identification of cardiac implantable electronic device; and, need for an intensified bowel preparation. The recommendations were based on our local practices and on established societal guidelines.\u003csup\u003e\u003cspan additionalcitationids=\"CR7\" citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e\u003cp\u003eTwo expert gastroenterologists reviewed all referrals and provided answers and recommendations on all eight aspects. 
We used the first 20 referrals as a training set for prompt development, which was performed through iterative prompt tuning, mostly with a chain-of-thought approach.\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e The remaining 180 referrals were used to test the accuracy of recommendations and calculate performance metrics.\u003c/p\u003e\u003cp\u003eThe outcomes were the accuracy of each model compared to the gold standard physician recommendations and the comparison of accuracies between the models for each parameter. For each binary recommendation variable, we generated confusion matrices comparing LLM output with expert reference answers.\u003c/p\u003e\u003cp\u003eFinally, 20 randomly selected referrals were used to test a request to prepare a custom patient letter and visual timeline. These visual instructions were created using o3, and we qualitatively assessed whether they provided accurate, patient-specific instructions.\u003c/p\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003eStatistical Analysis\u003c/h2\u003e\u003cp\u003eAll statistical analysis and plot generation were performed using R Statistical Software (v4.5.0; R Core Team). Baseline distribution of parameters is presented using medians and inter-quartile ranges. Accuracy rates and 95% confidence intervals (CI) of each model for each parameter were calculated using bootstrapping (1,000 resamples). To account for class imbalance in several variables, we also reported the F1 score, the harmonic mean of precision and recall. F1 scores were computed for each variable, 95% percentile‑bootstrap confidence intervals were derived from 1,000 resamples, and the two models were compared by bootstrapping the paired difference in F1 to obtain confidence intervals and two‑sided p‑values. McNemar\u0026rsquo;s test was used to compare the accuracies of the two models for each variable. 
For confusion matrices we derived accuracy, precision (positive-predictive value), recall (sensitivity), and specificity.\u003c/p\u003e\u003c/div\u003e"},{"header":"RESULTS","content":"\u003cp\u003eBaseline variables of the referrals and their distributions are presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Of these referrals, 88 (44.0%) were for colonoscopy alone and 53 (26.5%) for gastroscopy alone. According to gold-standard physician recommendations, anesthesiologist assistance was required in 65 (32.5%) procedures, and intensified bowel preparation was recommended in 65 (32.5%) cases (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eDistribution of variables in the endoscopy referrals (according to gold standard physician recommendations)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"2\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eParameter\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eN (%), (n\u0026thinsp;=\u0026thinsp;200)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eProcedure\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGastroscopy\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c2\"\u003e\u003cp\u003e53 (26.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eColonoscopy\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e88 (44)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBoth\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e59 (29.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnesthesiologist Required\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e65 (32.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnti-aggregant use\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e41 (20.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnti-coagulant use\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e15 (7.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCIED\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e5 (2.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGLP-1RA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e21 (10.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIntensified bowel preparation\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e65 (32.5)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"2\"\u003eCIED, 
cardiac implantable electronic device; GLP-1RA, glucagon-like peptide-1 receptor agonist\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003ePrompt development started from the guidelines and basic instructions. Adding instructions such as identifying patient body mass index, and identifying drug allergies so that they were not treated as drugs in use, improved accuracy. Final prompts included explanations, decision rules for each parameter, and a specified output format. All prompts are available in the \u003cb\u003eSupplemental material\u003c/b\u003e.\u003c/p\u003e\u003cp\u003eAccuracy rates for o3 and Gemini 2.5-pro, respectively, were very high for all variables: procedure, 0.99 (0.97\u0026ndash;1.00) and 0.99 (0.97\u0026ndash;1.00); indication, 0.91 (0.87\u0026ndash;0.94) and 0.89 (0.85\u0026ndash;0.94); need for an anesthesiologist, 0.96 (0.92\u0026ndash;0.98) and 0.94 (0.91\u0026ndash;0.97); anti-aggregant management and recommended withdrawal, 0.98 (0.97\u0026ndash;1.00) and 0.99 (0.97\u0026ndash;1.00); anti-coagulant management and recommended withdrawal interval, 1.00 (1.00\u0026ndash;1.00) and 0.99 (0.97\u0026ndash;1.00); GLP-1RA management, 0.99 (0.97\u0026ndash;1.00) and 0.99 (0.98\u0026ndash;1.00); cardiac implantable electronic device (CIED), 1.00 (1.00\u0026ndash;1.00) and 0.99 (0.98\u0026ndash;1.00); and need for an intensified bowel preparation, 0.95 (0.92\u0026ndash;0.98) and 0.92 (0.88\u0026ndash;0.96). Comparison of accuracy rates between the models did not show a significant difference for any variable. These accuracy rates and comparisons are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eF1 scores demonstrated robust performance for the binary variables. 
The F1 values and 95% CIs for o3 versus Gemini 2.5‑pro were: anesthesiologist 0.95 (0.91\u0026ndash;0.98) vs 0.93 (0.89\u0026ndash;0.97); anti‑aggregant identification 0.97 (0.94\u0026ndash;1.00) vs 0.98 (0.95\u0026ndash;1.00); anti‑coagulant management 1.00 (1.00\u0026ndash;1.00) vs 0.64 (0.60\u0026ndash;1.00); GLP‑1 receptor‑agonist management 0.97 (0.93\u0026ndash;1.00) vs 0.99 (0.95\u0026ndash;1.00); CIED recognition 1.00 (1.00\u0026ndash;1.00) vs 0.95 (0.83\u0026ndash;1.00); and intensified bowel preparation requirement 0.94 (0.91\u0026ndash;0.98) vs 0.91 (0.87\u0026ndash;0.95). There were no significant differences in F1 scores between the two models for the binary variables.\u003c/p\u003e\u003cp\u003eDiagnostic metrics of the binary variables are presented in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. Across the binary variables, o3 achieved an accuracy ranging from 0.95 to 1.00, with perfect performance for all anticoagulant-related decisions and CIED recognition. Gemini showed a comparable accuracy profile, ranging from 0.89 to 0.95. 
Both models maintained very high specificity (\u0026ge;\u0026thinsp;0.91 for all variables), and precision exceeded 0.95 for all variables (except for GLP-1 discontinuation by o3 [0.90]).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eSeparate performance metrics for each variable (of the binary variables)\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"9\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eVariable\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e\u003cp\u003eSensitivity\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"2\" nameend=\"c5\" namest=\"c4\"\u003e\u003cp\u003eSpecificity\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"2\" nameend=\"c7\" namest=\"c6\"\u003e\u003cp\u003ePrecision\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"2\" nameend=\"c9\" 
namest=\"c8\"\u003e\u003cp\u003eAccuracy\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eo3\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cb\u003eGemini\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e\u003cb\u003eo3\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003eGemini\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e\u003cb\u003eo3\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e\u003cb\u003eGemini\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u003cb\u003eo3\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c9\"\u003e\u003cp\u003e\u003cb\u003eGemini\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnesthesiologist\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.96\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e0.96\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c9\"\u003e\u003cp\u003e0.94\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnti-aggregants\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.94\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.97\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e0.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c9\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAnti-coagulants\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.98\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c9\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGLP-1RA\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.91\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c9\"\u003e\u003cp\u003e0.99\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eIntensified preparation\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e0.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e0.95\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c9\"\u003e\u003cp\u003e0.92\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"9\"\u003eGLP-1RA, glucagon-like peptide-1 receptor agonist; CIED, cardiac implantable electronic device\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eWe randomly selected 20 referrals and asked the LLM to create a patient-specific timeline, followed by generation of a visual timeline. 
All images included crucial details such as the timing of medication omissions, details of intensified bowel preparation, and a reminder to bring breathing-assistance devices (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eIn the current study, LLMs achieved near-perfect accuracy and F1 scores in open-access endoscopy referral management. Furthermore, o3 was able to generate simplified, patient-specific visual timelines to aid in patient communication. Referral management in healthcare, particularly in gastroenterology, is a labor-intensive task that typically requires medical experts to individually review each referral. Previous studies have attempted to automate this process using artificial intelligence methods. Campbell et al. evaluated the feasibility of an automated electronic health record (EHR)-based alert system to identify parameters requiring modifications to patient management before endoscopy, such as elevated body mass index or use of positive airway pressure devices, similar to what we investigated in our study.\u003csup\u003e10\u003c/sup\u003e However, relying on recorded and indexed EHR data can be problematic because it may not always be comprehensive or accessible in all settings. Other studies have assessed natural language processing tools to evaluate and prioritize patient referrals, but they did not include endoscopy referrals and were focused primarily on prioritization.\u003csup\u003e11\u003c/sup\u003e In contrast, our study is novel in showing that LLMs can accurately assess referrals of varying formats, encompassing multiple languages and both structured and unstructured data, and can generate accurate visual summaries of the provided recommendations.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eOur study has several limitations. 
First, although we analyzed referrals from multiple providers, all were from the same region and referred to a single center, which limits generalizability. Additionally, an important component of referral management, namely the review of the appropriateness of procedure indications, was not evaluated in this study, meaning the automation of the referral process currently lacks this critical dimension.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eIn conclusion, LLMs can be employed to evaluate open-access referrals for gastrointestinal endoscopy and provide highly accurate pre-endoscopy recommendations and, potentially, simple patient communication. Future research should focus on the real-time use and integration of LLMs into EHRs.\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eETHICS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eHuman Ethics and Consent to Participate declarations: not applicable.\u003c/p\u003e\n\u003cp\u003eThis study was approved by the institutional review board (RMB-D-0568-23).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFUNDING\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was not supported by any funding.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eYG: study concept, study design, analysis, manuscript preparation; EA: data curation, manuscript review; AG: data curation, manuscript review; AK: study supervision, study design, interpretation of results, manuscript review.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eChandrasekhara V, Eloubeidi MA, Bruining DH, et al. Open-access endoscopy. Gastrointest Endosc 2015;81:1326-1329. \u003c/li\u003e\n\u003cli\u003eShaheen NJ, Fennerty MB, Bergman JJ. Less Is More: A Minimalist Approach to Endoscopy. Gastroenterology 2018;154:1993-2003. \u003c/li\u003e\n\u003cli\u003eShahab O, El Kurdi B, Shaukat A, et al. 
Large language models: a primer and gastroenterology applications. Therap Adv Gastroenterol 2024;17. \u003c/li\u003e\n\u003cli\u003eWiest IC, Ferber D, Zhu J, et al. Privacy-preserving large language models for structured medical information retrieval. npj Digital Medicine 2024;7:1-9. \u003c/li\u003e\n\u003cli\u003eWang L, Ma Y, Bi W, et al. An Entity Extraction Pipeline for Medical Text Records Using Large Language Models: Analytical Study. J Med Internet Res 2024;26:e54580. \u003c/li\u003e\n\u003cli\u003eAbraham NS, Barkun AN, Sauer BG, et al. American College of Gastroenterology-Canadian Association of Gastroenterology Clinical Practice Guideline: Management of Anticoagulants and Antiplatelets during Acute Gastrointestinal Bleeding and the Periendoscopic Period. American Journal of Gastroenterology 2022;117:542-558. \u003c/li\u003e\n\u003cli\u003eSidhu R, Turnbull D, Haboubi H, et al. British Society of Gastroenterology guidelines on sedation in gastrointestinal endoscopy. Gut 2024;73:1-27. \u003c/li\u003e\n\u003cli\u003eKindel TL, Wang AY, Wadhwa A, et al. Multisociety Clinical Practice Guidance for the Safe Use of Glucagon-like Peptide-1 Receptor Agonists in the Perioperative Period. Clinical Gastroenterology and Hepatology 2024;0. \u003c/li\u003e\n\u003cli\u003eZaghir J, Naguib M, Bjelogrlic M, N\u0026eacute;v\u0026eacute;ol A, Tannier X, Lovis C. Prompt Engineering Paradigms for Medical Applications: Scoping Review. J Med Internet Res 2024;26:e60501. \u003c/li\u003e\n\u003cli\u003eCampbell EJ, Krishnaraj A, Harris M, et al. Automated before-procedure electronic health record screening to assess appropriateness for GI endoscopy and sedation. Gastrointest Endosc 2012;76:786-792. \u003c/li\u003e\n\u003cli\u003eAbdel-Hafez A, Jones M, Ebrahimabadi M, et al. Artificial intelligence in medical referrals triage based on Clinical Prioritization Criteria. 
Front Digit Health 2023;5:1192975.\u003cbr\u003e \u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"bmc-gastroenterology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bmge","sideBox":"Learn more about [BMC Gastroenterology](http://bmcgastroenterol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bmge/default.aspx","title":"BMC Gastroenterology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Large-Language Models, Endoscopy Referrals, Artificial Intelligence","lastPublishedDoi":"10.21203/rs.3.rs-7860274/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7860274/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003eOpen-access endoscopy relies on referrals that are manually vetted, which is a resource-consuming process with potential for bias. We assessed whether large language models (LLMs) can provide accurate recommendations on gastrointestinal endoscopy referrals.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e\u003cp\u003eWe extracted 200 multilingual endoscopy referrals. We evaluated OpenAI\u0026rsquo;s o3 and Google\u0026rsquo;s Gemini 2.5-pro. A prompt was developed and tuned on a set of 20 referrals and tested on the remaining 180 referrals. 
Eight variables were tested: procedure type, indication, need for an anesthesiologist, withdrawal of anti-aggregants, anti-coagulants, and GLP-1 agonists, implantable electronic devices, and need for intensified preparation. Accuracy and F1 scores were analyzed using bootstrapping, and the models were compared with McNemar\u0026rsquo;s test. Confusion matrices were calculated. Additionally, o3 generated patient-specific visual timelines.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eAmong 200 referrals, 88 (44%) were for colonoscopy alone and 53 (26.5%) for gastroscopy alone; 65 (32.5%) required an anesthesiologist and 65 (32.5%) required intensified preparation. o3 achieved 0.91\u0026ndash;1.00 accuracy across all eight variables, whereas Gemini ranged from 0.89 to 0.99. Confusion-matrix analysis confirmed high precision and specificity for both models (\u0026ge;\u0026thinsp;0.90 and \u0026ge;\u0026thinsp;0.92, respectively). o3 generated accurate, patient-specific visual timelines for sampled cases.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e\u003cp\u003eLLMs are highly accurate in processing endoscopy referrals and can generate patient-specific instructions, offering a solution to streamline open-access endoscopy.\u003c/p\u003e","manuscriptTitle":"Automatic Processing of Gastrointestinal Endoscopy Referrals and Patient Communication Using Large Language Models","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-11-24 10:21:28","doi":"10.21203/rs.3.rs-7860274/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision 
requested","date":"2025-12-19T07:59:20+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-12-17T21:07:24+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"78897670715236449363569897070810338244","date":"2025-12-14T23:43:46+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-30T19:42:14+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"93614482985160636091174650758756021639","date":"2025-11-29T08:08:34+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"11044524694649124888387790160471742384","date":"2025-11-14T16:05:49+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"337484308140429257078581359716169986968","date":"2025-11-12T19:40:20+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-11-12T05:58:09+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-10-17T09:49:29+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-10-17T02:12:35+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-10-17T02:10:49+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Gastroenterology","date":"2025-10-14T15:26:26+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"bmc-gastroenterology","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"bmge","sideBox":"Learn more about [BMC Gastroenterology](http://bmcgastroenterol.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/bmge/default.aspx","title":"BMC Gastroenterology","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC 
Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"1b1f90ee-bbd6-4d3a-9f0c-1c5a6e96f304","owner":[],"postedDate":"November 24th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2026-02-09T16:00:16+00:00","versionOfRecord":{"articleIdentity":"rs-7860274","link":"https://doi.org/10.1186/s12876-026-04636-5","journal":{"identity":"bmc-gastroenterology","isVorOnly":false,"title":"BMC Gastroenterology"},"publishedOn":"2026-02-03 15:57:11","publishedOnDateReadable":"February 3rd, 2026"},"versionCreatedAt":"2025-11-24 10:21:28","video":"","vorDoi":"10.1186/s12876-026-04636-5","vorDoiUrl":"https://doi.org/10.1186/s12876-026-04636-5","workflowStages":[]},"version":"v1","identity":"rs-7860274","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7860274","identity":"rs-7860274","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
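
The statistical analysis described in the Methods (confusion-matrix metrics, percentile-bootstrap confidence intervals from 1,000 resamples, and McNemar's test for the paired comparison of two models) can be illustrated with a minimal sketch. This is not the authors' code: the paper reports that the analysis was run in R, whereas the sketch below uses plain Python. The function names, the toy labels, and the choice of the exact (binomial) form of McNemar's test are all assumptions made for illustration only.

```python
import random
from math import comb


def confusion(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of cases where the prediction matches the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # 0.0 when undefined (assumption)


def bootstrap_ci(metric, y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * (n_resamples - 1))]
    hi = stats[int((1 - alpha / 2) * (n_resamples - 1))]
    return lo, hi


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value computed from the discordant pairs."""
    triples = list(zip(y_true, pred_a, pred_b))
    only_a = sum(1 for t, a, b in triples if a == t and b != t)  # A right, B wrong
    only_b = sum(1 for t, a, b in triples if a != t and b == t)  # B right, A wrong
    n = only_a + only_b
    if n == 0:
        return 1.0  # the models never disagree on correctness
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy data: reference answers and two models' binary outputs.
reference = [1, 1, 1, 0, 0, 0, 0, 1]
model_a = [1, 1, 1, 0, 0, 0, 0, 1]
model_b = [1, 1, 0, 0, 0, 1, 0, 1]

print("F1 (model B):", f1_score(reference, model_b))
print("95% CI:", bootstrap_ci(f1_score, reference, model_b))
print("McNemar p:", mcnemar_exact(reference, model_a, model_b))
```

Resampling whole cases, rather than predictions and references separately, keeps each reference answer paired with its prediction. The same idea extends to the paired F1 comparison the Methods describe, by drawing one set of indices per resample and applying it to both models.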
