Comparing ChatGPT Responses with AHA Guidelines for Assessing Unruptured Intracranial Aneurysms: Establishment of a Simple Rating System

doi:10.21203/rs.3.rs-3897237/v1

2024 · doi:10.21203/rs.3.rs-3897237/v1

preprint OA: closed CC-BY-4.0

Authors: Yu Chang, Po-Hsuan Lee, Chi-Chen Huang, Chia-En Wong, Pang-Shuo Perng, Jung-Shun Lee, Liang-Chao Wang, Chih-Yuan Huang
Affiliation: National Cheng Kung University Hospital, National Cheng Kung University

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-3897237/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Introduction
Efficient diagnosis and intervention for unruptured intracranial aneurysms (UIAs) are crucial for favorable outcomes. Our study aimed to evaluate the accuracy of Chat Generative Pre-trained Transformer (ChatGPT) and its alignment with established medical standards by systematically comparing its responses with the American Heart Association (AHA) guidelines for the management of UIAs. This initiative bridges advanced artificial intelligence (AI) technology and medical practice norms, and contributes to the discussion on the role of AI in the dissemination of medical information.

Methods
We systematically assessed ChatGPT 3.5's responses by posing clinical questions aligned with the AHA guidelines and rating each response on a scale of 1 to 5 for agreement with the guidelines and comprehensiveness. This method allowed us to objectively gauge ChatGPT's alignment with the AHA guidelines.

Results
We posed ten clinical questions related to UIAs. ChatGPT's responses achieved a 5-point rating for four questions, a 3-point rating for a further four questions, and a 2-point rating for the remaining two.

Conclusions
By establishing a scoring system, we assessed the accuracy of ChatGPT responses to questions related to UIAs. ChatGPT provided excellent responses on screening, risk factors, and diagnostic tools. However, there is room for improvement in its responses on rupture risk and management.

Subject areas: Biological sciences/Neuroscience; Health sciences/Anatomy; Health sciences/Medical research
Keywords: artificial intelligence, intracranial aneurysm, AHA guideline

Figure 1. Example of ChatGPT's response to clinical questions.

Introduction

Efficient diagnosis and timely intervention for unruptured intracranial aneurysms (UIAs) yield favorable outcomes [1,2]. Exploration of this condition, focusing on screening, characteristics, risks, and management, is of paramount significance in the field of neurosurgery [3–5]. The 2015 American Heart Association (AHA) guidelines offer comprehensive recommendations for medical professionals navigating this complex landscape [6].

Chat Generative Pre-trained Transformer (ChatGPT) is a form of artificial intelligence (AI) capable of addressing a wide range of inquiries, including intricate medical concerns [7–9]. In this study, we posed targeted questions about UIAs to ChatGPT. The primary goal was to identify the differences between the ChatGPT responses and the well-established guidelines provided by the AHA. By systematically comparing the information generated by ChatGPT with the AHA's evidence-based recommendations, we aimed to critically evaluate the AI's accuracy and alignment with recognized medical standards.

This research serves as a bridge between advanced AI technology and the norms of established medical practice. Through a comparative analysis of ChatGPT responses and the AHA guidelines, we contribute to the ongoing discourse on the role of AI in disseminating medical information. Our efforts align with the broader objective of ensuring that advanced AI systems integrate with existing medical knowledge, thereby providing precise, dependable, and consistent information to both medical professionals and the public. This convergence of AI and medical expertise holds promise for enhancing healthcare practices and outcomes.

Methods

In this collaborative study, we assessed the responses generated by ChatGPT 3.5 using the AHA guidelines for the management of patients with UIAs as the reference. Because this research did not involve human subjects, Institutional Review Board approval was not required. We posed specific clinical questions aligned with the AHA guidelines to ChatGPT and subsequently rated the quality of each response on a scale from 1 to 5. The rating was based on the extent of agreement with the guidelines and the overall comprehensiveness of the information provided. This systematic evaluation allowed us to objectively gauge the alignment between ChatGPT responses and the AHA guidelines.

The evaluation scale was as follows:

5 points: Responses that not only fully addressed all elements outlined in the AHA guidelines but also included additional relevant information, thereby enhancing the overall depth and breadth of the response.

4 points: Responses that closely adhered to the AHA guidelines without any omissions, although they might have lacked supplementary information.

3 points: Responses that covered some of the content specified in the AHA guidelines but showed omissions or incomplete coverage.

2 points: Responses that incorporated elements from the guidelines but also included subjective judgments that did not entirely align with the guideline recommendations.

1 point: Responses that significantly deviated from the AHA guidelines, indicating a substantial failure to adhere to the recommended clinical approach.

Two authors conducted the evaluation independently, each providing their own ratings. When their ratings differed, a third investigator was consulted to provide the final rating.

Figure 1 shows an example of ChatGPT's response to one of our clinical questions.

Results

We posed ten clinical questions related to UIAs. ChatGPT's responses achieved a 5-point rating for four questions. A further four questions were rated 3 points, and the remaining two questions received a score of 2. The specific details and evaluations are presented in Table 1. ChatGPT's responses are summarized in Table S1 in the supplemental file.

Table 1. Rating and comment for ChatGPT's response

Q1: Who should be considered for intracranial aneurysm screening?
Rating: 5
Comment: ChatGPT provided information aligned with the AHA guideline. Moreover, ChatGPT also suggested that patients with symptoms possibly associated with IA should undergo screening.

Q2: What are the risk factors associated with the growth or rupture of unruptured intracranial aneurysms?
Rating: 5
Comment: ChatGPT reported several possible risk factors, including smoking and hypertension, which were also reported in the AHA guideline. It also reported additional risk factors.

Q3: What diagnostic tests are recommended for detecting intracranial aneurysm?
Rating: 5
Comment: ChatGPT reported several tools for IA diagnosis and mentioned DSA as the gold standard, which aligns with the AHA guideline. ChatGPT also mentioned CT perfusion and transcranial Doppler.

Q4: What is the likelihood of an intracranial aneurysm rupturing?
Rating: 3
Comment: ChatGPT reported that IAs < 7 mm are at low risk of rupture. However, ChatGPT did not report the rupture rates of IAs of other sizes.

Q5: Who should be considered for treatment of an unruptured intracranial aneurysm?
Rating: 3
Comment: ChatGPT reported several conditions in which aneurysms might require treatment, including documented enlargement during follow-up and family history, in line with the AHA guideline. However, the AHA guideline mentioned that a history of aSAH might be considered an independent risk factor for future hemorrhage secondary to a different small unruptured aneurysm, which was not mentioned by ChatGPT.

Q6: When considering the treatment of unruptured intracranial aneurysms, which patients are appropriate candidates for surgical clipping as opposed to endovascular treatment?
Rating: 2
Comment: ChatGPT mentioned that size, location, and shape should be taken into account when considering surgical clipping as the mode of treatment, in line with the AHA guideline. However, ChatGPT mentioned that surgical clipping might be preferred for relatively large aneurysms, especially those > 7 mm in diameter, which is overconclusive.

Q7: When considering the treatment of unruptured intracranial aneurysms, which patients are appropriate candidates for endovascular treatment opposed to surgical clipping?
Rating: 2
Comment: The AHA guideline did not specify which patients were suitable for endovascular treatment. However, ChatGPT mentioned that endovascular treatment could be favored for relatively small aneurysms, typically < 7 mm in diameter, which is overconclusive.

Q8: Which treatment option is more suitable for unruptured intracranial aneurysms: surgical clipping or endovascular treatment?
Rating: 3
Comment: The AHA guideline did not clearly report the efficacy of endovascular versus surgical treatment of UIAs. ChatGPT answered that both treatment methods had their advantages and considerations. However, the AHA guideline mentioned that, in selected cases, endovascular coiling was associated with a reduction in procedural morbidity and mortality compared with surgical clipping but had a higher overall recurrence risk; this was not mentioned by ChatGPT.

Q9: At what circumstances, observation is a reasonable alternative to treatment of UIA?
Rating: 5
Comment: The AHA guideline stated that observation was a reasonable option for individuals aged over 65 years and those with associated medical comorbidities who had small asymptomatic UIAs and a low risk of hemorrhage based on factors such as location, size, morphology, family history, and other relevant factors. This aligns with the information provided by ChatGPT. ChatGPT also added that serial imaging could assist in the evaluation.

Q10: How frequently should a patient with an untreated unruptured intracranial aneurysm undergo follow-up appointments?
Rating: 3
Comment: ChatGPT reported that follow-up every six months might be reasonable, whereas the AHA guideline reported that the first follow-up study should be at 6 to 12 months after initial discovery, followed by subsequent yearly or alternate yearly follow-up.
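
As a minimal sketch, and not part of the authors' method, the tallies reported in the Results can be reproduced from the ratings listed in Table 1. The names `TABLE_1_RATINGS`, `final_rating`, and `tally` are hypothetical and introduced only for illustration. `final_rating` assumes that, when the two independent raters disagree, the third investigator's score is simply adopted as final; the Methods describe this step without further detail.

```python
from collections import Counter

# Final ratings transcribed from Table 1 (question label -> score on the 1-5 scale).
TABLE_1_RATINGS = {
    "Q1": 5, "Q2": 5, "Q3": 5, "Q4": 3, "Q5": 3,
    "Q6": 2, "Q7": 2, "Q8": 3, "Q9": 5, "Q10": 3,
}


def final_rating(rater_a: int, rater_b: int, adjudicator: int) -> int:
    """Hypothetical helper: keep the shared score when the two independent
    raters agree; otherwise defer to the third investigator (an assumption
    about how the adjudication step was applied)."""
    for score in (rater_a, rater_b, adjudicator):
        if score not in range(1, 6):
            raise ValueError(f"rating must be between 1 and 5, got {score}")
    return rater_a if rater_a == rater_b else adjudicator


def tally(ratings: dict[str, int]) -> dict[int, int]:
    """Count how many questions received each score, highest score first."""
    counts = Counter(ratings.values())
    return {score: counts[score] for score in sorted(counts, reverse=True)}


if __name__ == "__main__":
    # Four questions scored 5, four scored 3, and two scored 2.
    print(tally(TABLE_1_RATINGS))
```

Keeping the ratings in a single mapping makes it easy to check that the counts quoted in the text (four, four, and two) add up to the ten questions in Table 1.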

Discussion

The primary significance of our research lies in characterizing the disparities between ChatGPT responses and the established guidelines for UIAs. Despite the availability of guidelines, clinical practitioners still seek insights from ChatGPT. In addition, our questions were formulated in a colloquial manner to simulate the inquiries that the general public might pose to ChatGPT on this subject. Furthermore, we devised a systematic framework to evaluate AI responses. This framework not only served our current study but also has the potential to efficiently evaluate the quality of AI-generated responses in other contexts.

Duey et al. analyzed thromboembolic prophylaxis in spine surgery and compared the responses generated by ChatGPT with the North American Spine Society guidelines [10]. They found that ChatGPT could provide recommendations for thromboembolic prophylaxis with reasonable accuracy. Notably, they categorized the ChatGPT responses along four dimensions: accuracy, overconclusiveness, supplementary information, and incompleteness. However, this categorization may be better suited to clinical questions about whether a particular intervention should be performed. In our study, we investigated various issues related to UIAs, and not all ChatGPT responses may be amenable to analysis along these four dimensions. Consequently, we adopted a rating score that offers a simpler and clearer way to determine the appropriateness of AI-generated answers. This rating system could potentially be applied in the future to evaluate ChatGPT responses to questions about other medical conditions. Choosing the 2015 guidelines for comparison is reasonable, given that ChatGPT is based on information predating 2021.

ChatGPT excelled in the first three questions, providing responses on screening, risk factors, and diagnostic tools that each earned a 5-point rating. However, with respect to rupture rates, which vary by anatomical location [11], ChatGPT did not address the concept of anatomical location variance. Understanding this variance is crucial for neurosurgeons and patients, and it also affects management. We hypothesized that inquirers from another medical specialty, or laypersons unfamiliar with anatomical location variance, might ask about rupture risk without taking the anatomical location into account. ChatGPT's omission of this variance led to a 3-point rating. Nevertheless, it did offer a commonly recognized cutoff value for small aneurysms [12,13]. Regarding early management, the ChatGPT response acknowledged the importance of enlargement as a key indication, which is consistent with the AHA guidelines. However, owing to minor omissions, it received a 3-point rating.

The treatment approaches for UIAs have undergone substantial transformations, primarily driven by the growing prominence of endovascular therapy [14]. Notably, between 2001 and 2008, the number of patients receiving endovascular coiling for UIAs surpassed the number undergoing surgical clipping (34,054 versus 29,866 cases), according to data from the National Inpatient Sample [15]. The International Study of Unruptured Intracranial Aneurysms (ISUIA) [16] provided essential natural history data on UIAs and information related to their treatment. The ISUIA, encompassing 4,060 eligible patients, included 1,917 individuals who underwent open surgical treatment for UIAs and 451 who received endovascular coiling. The findings indicated that surgical clipping resulted in higher one-year morbidity and mortality than endovascular coiling and that the endovascular approach was less influenced by patient age, potentially making it a preferable choice, particularly for older individuals. Nonetheless, the long-term effectiveness of aneurysm occlusion using endovascular coiling can pose challenges, particularly in younger patients and those with aneurysms located at bifurcation sites, such as the middle cerebral artery bifurcation and the basilar artery terminus [17]. Although the recurrence rate is relatively low, regular surveillance is essential to monitor these cases over time [18].

Despite increased awareness of the advantages and disadvantages of both treatments, the decision between surgical clipping and endovascular treatment of UIAs is highly intricate and involves numerous considerations. Factors such as the surgeon's technique and experience also play a role, and even the AHA guidelines do not provide definitive directives [19]. Consequently, enabling ChatGPT to offer comprehensive responses on such matters is challenging. Within the management segment, ChatGPT presented content lacking evidence-based support, leading to a 2-point rating. However, this is consistent with the statement that the decision between surgical clipping and endovascular treatment is complex and should involve collaboration with a multidisciplinary team of neurosurgeons, interventional neuroradiologists, and other specialists. This approach prevents other physicians or the general public from prematurely assuming a preferred treatment without robust evaluation.

Conclusions

By establishing a scoring system, we assessed the accuracy of ChatGPT responses to questions related to unruptured intracranial aneurysms. ChatGPT provided excellent responses on screening, risk factors, and diagnostic tools.
However, there is room for improvement in its responses on rupture risk and management. With future guideline updates, AI is expected to provide better responses and analyses.

Abbreviations

AHA, American Heart Association
AI, artificial intelligence
ChatGPT, Chat Generative Pre-trained Transformer
UIA, unruptured intracranial aneurysm

Declarations

Funding
Not applicable.

Author Contribution
Y.C. and C.Y.H. wrote the main manuscript text. P.H.L. was responsible for experimental design and execution, ensuring the reliability of the conducted experiments. C.C.H. and C.E.W. conducted an extensive literature review and organized data. P.S.P., J.S.L., and L.C.W. critically reviewed the study design and manuscript. All authors reviewed the manuscript to ensure its quality and accuracy.

Acknowledgement
Not applicable.

Availability of Data and Materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing Interests
All authors declare no competing interests.

References

1. Brown RD Jr, Broderick JP. Unruptured intracranial aneurysms: epidemiology, natural history, management options, and familial screening. Lancet Neurol. 2014;13(4):393–404.
2. Etminan N, Rinkel GJ. Unruptured intracranial aneurysms: development, rupture and preventive management. Nat Rev Neurol. 2016;12(12):699–713.
3. Pontes FGB, da Silva EM, Baptista-Silva JC, Vasconcelos V. Treatments for unruptured intracranial aneurysms. Cochrane Database Syst Rev. 2021;5(5):CD013312.
4. Güresir E, Vatter H, Schuss P, et al. Natural history of small unruptured anterior circulation aneurysms: a prospective cohort study. Stroke. 2013;44(11):3027–3031.
5. Takao H, Nojo T, Ohtomo K. Screening for familial intracranial aneurysms: decision and cost-effectiveness analysis. Acad Radiol. 2008;15(4):462–471.
6. Thompson BG, Brown RD Jr, Amin-Hanjani S, et al. Guidelines for the Management of Patients With Unruptured Intracranial Aneurysms: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association. Stroke. 2015;46(8):2368–2400.
7. Will ChatGPT transform healthcare? Nat Med. 2023;29(3):505–506.
8. Cascella M, Montomoli J, Bellini V, Bignami E. Evaluating the Feasibility of ChatGPT in Healthcare: An Analysis of Multiple Clinical and Research Scenarios. J Med Syst. 2023;47(1):33.
9. Scimeca M, Bonfiglio R. Dignity of Science and the use of ChatGPT as a co-author. ESMO Open. 2023;8(5):101621.
10. Duey AH, Nietsch KS, Zaidat B, et al. Thromboembolic prophylaxis in spine surgery: an analysis of ChatGPT recommendations. Spine J. 2023.
11. Rinkel GJ, Djibuti M, Algra A, van Gijn J. Prevalence and risk of rupture of intracranial aneurysms: a systematic review. Stroke. 1998;29(1):251–256.
12. Broderick JP, Brown RD Jr, Sauerbeck L, et al. Greater rupture risk for familial as compared to sporadic unruptured intracranial aneurysms. Stroke. 2009;40(6):1952–1957.
13. Mackey J, Brown RD Jr, Moomaw CJ, et al. Unruptured intracranial aneurysms in the Familial Intracranial Aneurysm and International Study of Unruptured Intracranial Aneurysms cohorts: differences in multiplicity and location. J Neurosurg. 2012;117(1):60–64.
14. Pierot L, Wakhloo AK. Endovascular treatment of intracranial aneurysms: current status. Stroke. 2013;44(7):2046–2054.
15. Brinjikji W, Rabinstein AA, Lanzino G, Kallmes DF, Cloft HJ. Effect of age on outcomes of treatment of unruptured cerebral aneurysms: a study of the National Inpatient Sample 2001–2008. Stroke. 2011;42(5):1320–1324.
16. Unruptured intracranial aneurysms–risk of rupture and risks of surgical intervention. N Engl J Med. 1998;339(24):1725–1733.
17. Schaafsma JD, Sprengers ME, van Rooij WJ, et al. Long-term recurrent subarachnoid hemorrhage after adequate coiling versus clipping of ruptured intracranial aneurysms. Stroke. 2009;40(5):1758–1763.
18. Ringer AJ, Rodriguez-Mercado R, Veznedaroglu E, et al. Defining the risk of retreatment for aneurysm recurrence or residual after initial treatment by endovascular coiling: a multicenter study. Neurosurgery. 2009;65(2):311–315; discussion 315.
19. Darsaut TE, Estrade L, Jamali S, Bojanowski MW, Chagnon M, Raymond J. Uncertainty and agreement in the management of unruptured intracranial aneurysms. J Neurosurg. 2014;120(3):618–623.

Additional Declarations
No competing interests reported.

Supplementary Files
TableS1.docx
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-3897237","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":271404603,"identity":"f66f1981-5013-43f6-91c2-3a75bf8ffc67","order_by":0,"name":"Yu Chang","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":false,"prefix":"","firstName":"Yu","middleName":"","lastName":"Chang","suffix":""},{"id":271404604,"identity":"9463006b-449a-4899-aa30-aef7f9c67445","order_by":1,"name":"Po-Hsuan Lee","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":false,"prefix":"","firstName":"Po-Hsuan","middleName":"","lastName":"Lee","suffix":""},{"id":271404605,"identity":"39f7b618-2038-466e-8bf1-62d462c24e87","order_by":2,"name":"Chi-Chen Huang","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":false,"prefix":"","firstName":"Chi-Chen","middleName":"","lastName":"Huang","suffix":""},{"id":271404606,"identity":"9b816a35-8254-4c14-afff-a5c819da6574","order_by":3,"name":"Chia-En Wong","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":false,"prefix":"","firstName":"Chia-En","middleName":"","lastName":"Wong","suffix":""},{"id":271404607,"identity":"09966758-126b-438d-80d3-52e27e6d57e4","order_by":4,"name":"Pang-Shuo Perng","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung 
University","correspondingAuthor":false,"prefix":"","firstName":"Pang-Shuo","middleName":"","lastName":"Perng","suffix":""},{"id":271404608,"identity":"ff081d48-c9de-453f-89d5-69c133a02a07","order_by":5,"name":"Jung-Shun Lee","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":false,"prefix":"","firstName":"Jung-Shun","middleName":"","lastName":"Lee","suffix":""},{"id":271404609,"identity":"8ca7207e-7e12-4aa2-a888-211ba0661129","order_by":6,"name":"Liang-Chao Wang","email":"","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":false,"prefix":"","firstName":"Liang-Chao","middleName":"","lastName":"Wang","suffix":""},{"id":271404610,"identity":"63b5e6fa-419e-47aa-bad8-60e9424fdc4b","order_by":7,"name":"Chih-Yuan Huang","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA+UlEQVRIiWNgGAWjYJACCSCS42dvPvjgAwNDArFaLIwle44lG85A0iJBQEtFosGNHDNpHmK0yLv3PrzxcYdEguQMoBbbNrs8fvYGxg8fcxjqDA5g12J45rix5cwzEnn8PM+KrXPbkoslew4wS87cxiCBU8uMNDZp3jaJYsn25I23c9uYEzfcSGBj5gVqMSOgJXHDgQQDacu2esJa5CVgWk6kGEkzth0mrMWA5xiz5cw2CUgg95w7njiz52Az0C8Skvtx2dLexnjjY1sdJCp/lFUn9gMZHz5us+GXbMBhC4pRjGxgEqQWd0zKoxr1B6fCUTAKRsEoGMEAANxQXSCnYZdfAAAAAElFTkSuQmCC","orcid":"","institution":"National Cheng Kung University Hospital, National Cheng Kung University","correspondingAuthor":true,"prefix":"","firstName":"Chih-Yuan","middleName":"","lastName":"Huang","suffix":""}],"badges":[],"createdAt":"2024-01-25 13:05:29","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-3897237/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-3897237/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":50809030,"identity":"528234a2-e8dc-41b7-b385-7f254a04af6d","added_by":"auto","created_at":"2024-02-07 
17:03:52","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":348556,"visible":true,"origin":"","legend":"\u003cp\u003eExample of ChatGPT’s response to clinical questions\u003c/p\u003e","description":"","filename":"figure1.png","url":"https://assets-eu.researchsquare.com/files/rs-3897237/v1/1869fe147be3034f85c293ca.png"},{"id":77841277,"identity":"157c1ce6-c3dd-4223-b551-45c0d1fda176","added_by":"auto","created_at":"2025-03-06 04:53:56","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":804448,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-3897237/v1/99461481-900a-498b-8295-515f14d33562.pdf"},{"id":50809031,"identity":"99a5aa1e-03b0-48eb-a06b-57a6f3ead91c","added_by":"auto","created_at":"2024-02-07 17:03:53","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":30111,"visible":true,"origin":"","legend":"","description":"","filename":"TableS1.docx","url":"https://assets-eu.researchsquare.com/files/rs-3897237/v1/23bb1133c2eb33764afaeea4.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Comparing ChatGPT Responses with AHA Guidelines for Assessing Unruptured Intracranial Aneurysms: Establishment of a Simple Rating System","fulltext":[{"header":"Introduction","content":"\u003cp\u003eEfficient diagnosis and timely intervention for unruptured intracranial aneurysms (UIAs) yield favorable outcomes\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. 
Exploration of this condition, focusing on screening, characteristics, risks, and management is of paramount significance in the field of neurosurgery \u003csup\u003e\u003cspan additionalcitationids=\"CR4\" citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. The 2015 American Heart Association (AHA) guidelines offer comprehensive recommendations for medical professionals in navigating this complex landscape\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eWith its robust capabilities, Chat Generative Pre-trained Transformer (ChatGPT) emerges as an impressive form of artificial intelligence (AI), proficient in addressing a wide range of inquiries, including intricate medical concerns\u003csup\u003e\u003cspan additionalcitationids=\"CR8\" citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e. In our research, we embarked on inputting targeted questions about UIAs into ChatGPT. The primary goal was to discern the nuances between the ChatGPT responses and the well-established guidelines provided by the AHA. By systematically comparing ChatGPT's generated information with the AHA's evidence-based directives, our aim was to critically evaluate AI's accuracy and alignment with recognized medical standards.\u003c/p\u003e \u003cp\u003eThis research initiative serves as a bridge between advanced AI technology and the steadfast norms of established medical practice. Through a meticulous comparative analysis of ChatGPT responses and the AHA guidelines, we contribute substantively to the ongoing discourse on the role of AI in disseminating medical information. 
Our efforts harmonize with the broader objective of ensuring that advanced AI systems seamlessly integrate with existing medical knowledge, thereby providing a vital resource for precise, dependable, and consistent information to both medical professionals and the public. This convergence of AI and medical expertise holds promise for enhancing healthcare practices and outcomes.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003eIn this collaborative study, we adopted a methodical approach to assess the responses generated by ChatGPT 3.5 using the AHA guidelines regarding recommendations for the management of patients with UIAs as the reference. Because this research does not involve human subjects, an Institutional Review Board approval is not required. Our method involved posing specific clinical questions to ChatGPT which were aligned with the AHA guidelines; we subsequently evaluated the quality of the responses using a scale ranging from 1 to 5. This assessment was based on the extent of agreement with the guidelines and overall comprehensiveness of the information provided. 
Using this systematic evaluation method, we were able to objectively gauge the alignment between ChatGPT responses and the established medical guidelines outlined by the AHA.\u003c/p\u003e\n\u003cp\u003eThe evaluation scale was as follows:\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e5 points\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe assigned 5 points to responses that not only fully addressed all elements outlined in the AHA guidelines, but also included additional relevant information, thereby enhancing the overall depth and breadth of the response.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4 points\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eResponses earning 4 points closely adhered to the AHA guidelines without any omissions, although they might have lacked supplementary information.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3 points\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eResponses scoring 3 points covered some of the content specified in the AHA guidelines, yet could display certain omissions or incompleteness in their coverage.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2 points\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAnswers receiving 2 points incorporated elements from the guidelines, but also featured a mixture of subjective judgments that did not entirely align with the guideline recommendations.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e1 point\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe lowest score of 1 point was assigned to responses that significantly deviated from the AHA guidelines, indicating a substantial failure to adhere to the recommended clinical approach.\u003c/p\u003e\n\u003cp\u003eThe evaluation process was independently conducted by two authors, each of whom provided their own ratings. 
In cases in which opinions on the evaluation differed, a third investigator was consulted to provide a final rating.\u003c/p\u003e\n\u003cp\u003eFigure \u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e shows an example of ChatGPT\u0026apos;s response to our clinical question.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eWe introduced a set of ten clinical questions related to UIAs. Within this set, ChatGPT's responses achieved a 5-point rating for four questions. A further four questions were rated 3 points, and the remaining two questions received a score of 2. The specific details and evaluations are presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. ChatGPT's responses are summarized in Table\u0026nbsp;\u003cb\u003eS1\u003c/b\u003e in the supplemental file.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eRating and comment for ChatGPT\u0026rsquo;s response\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"1\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ1: Who should be considered for intracranial aneurysm screening?\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT provided information aligned with the AHA guideline. 
Moreover, ChatGPT also suggested that patients with symptoms possibly associated with IA should undergo screening.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ2: What are the risk factors associated with the growth or rupture of unruptured intracranial aneurysms?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT reported several possible risk factors, including smoking and hypertension, which were also reported in the AHA guideline. Moreover, there were additional risk factors reported.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ3: What diagnostic tests are recommended for detecting intracranial aneurysm?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT reported several tools for IA diagnosis and mentioned DSA as the gold standard, which aligns with the AHA guideline. 
Moreover, CT Perfusion and transcranial doppler were also mentioned by ChatGPT.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ4: What is the likelihood of an intracranial aneurysm rupturing?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT reported that IA\u0026thinsp;\u0026lt;\u0026thinsp;7 mm is at low risk of rupture. However, ChatGPT did not report further analysis of the rupture rate of IA with other sizes.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ5: Who should be considered for treatment of an unruptured intracranial aneurysm?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT reported several potential conditions in which aneurysms required treatment, including documented enlargement during follow-up and family history aligned with the AHA guideline. 
However, the AHA guideline mentioned that a history of aSAH might be considered an independent risk factor for future hemorrhage secondary to a different small unruptured aneurysm, which was not mentioned by ChatGPT.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ6: When considering the treatment of unruptured intracranial aneurysms, which patients are appropriate candidates for surgical clipping as opposed to endovascular treatment?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT mentioned that size, location, and shape should be taken into account when considering surgical clipping as the mode of treatment, aligned with the AHA guideline. However, ChatGPT mentioned that surgical clipping might be preferred for relatively large aneurysms, especially those\u0026thinsp;\u0026gt;\u0026thinsp;7 mm in diameter, which is overconclusive.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ7: When considering the treatment of unruptured intracranial aneurysms, which patients are appropriate candidates for endovascular treatment opposed to surgical clipping?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 2\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eThe AHA guideline did not mention what type of patients were suitable for endovascular treatment. 
However, ChatGPT mentioned that endovascular treatment could be favored for relatively small aneurysms, typically\u0026thinsp;\u0026lt;\u0026thinsp;7 mm in diameter, which is overconclusive.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ8: Which treatment option is more suitable for unruptured intracranial aneurysms: surgical clipping or endovascular treatment?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eThe AHA guideline did not clearly report the efficacy of endovascular versus surgical treatment of UIA. ChatGPT answered that both treatment methods had their advantages and considerations; however, AHA mentioned that, in selected cases, endovascular coiling was associated with a reduction in procedural morbidity and mortality compared with surgical clipping, but that it had a higher overall recurrence risk; this was not mentioned by ChatGPT.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ9: At what circumstances, observation is a reasonable alternative to treatment of UIA?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eThe AHA guidelines stated that observation was a reasonable option for individuals aged over 65 years and those with associated medical comorbidities who had small asymptomatic UIAs and a low risk of hemorrhage based on factors such as location, size, morphology, family history, and other relevant factors. 
This aligns with the information provided by ChatGPT. Furthermore, ChatGPT also added that serial imaging could assist in the evaluation.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eQ10: How frequently should a patient with an untreated unruptured intracranial aneurysm undergo follow-up appointments?\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRating: 3\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eComment:\u003c/p\u003e \u003cp\u003eChatGPT reported that follow-up every six months might be reasonable whereas the AHA guideline reported that the first follow-up study should be at 6 to 12 months after initial discovery, followed by subsequent yearly or alternate yearly follow-up.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003e The primary significance of our research lies in comprehending the disparities between ChatGPT responses and the established guidelines for UIAs. Despite the presence of guidelines, clinical practitioners still seek insights from ChatGPT. Beyond this, our questions were formulated in a more colloquial manner to simulate the types of inquiries that the general public might pose to ChatGPT regarding this subject. Furthermore, we devised a systematic framework to evaluate AI responses. This system not only served the purpose for our current study but also has the potential to efficiently evaluate the quality of AI responses in diverse contexts. This approach could be used to assess AI-generated responses across a range of future issues.\u003c/p\u003e \u003cp\u003eThe study conducted by Duey et al. 
analyzed thromboembolic prophylaxis in spinal surgery and compared the responses generated by ChatGPT with the North American Spine Society guidelines\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. They found that ChatGPT could provide recommendations for thromboembolic prophylaxis with reasonable accuracy. It is noteworthy that they categorized ChatGPT responses into four distinct aspects for analysis: accuracy, overconclusiveness, supplementary information, and incompleteness. However, this categorization approach may be more suitable for evaluating clinical questions related to whether or not a particular intervention should be performed. In our study, we investigated various issues related to UIAs. It is possible that not all ChatGPT responses are amenable to analysis using these four dimensions. Consequently, we adopted a rating score that offers a simpler and clearer way to determine the appropriateness of AI-generated answers. This rating system could potentially be applied in the future to evaluate ChatGPT responses to questions pertaining to other medical conditions.\u003c/p\u003e \u003cp\u003eChoosing the 2015 guidelines for comparison is logical, given that ChatGPT is based on information predating 2021. Upon careful examination of our findings, it was evident that ChatGPT excelled in the initial three questions, providing exceptional responses for topics such as screening, risk factors, and diagnostic tools, each earning a 5-point rating. However, with respect to rupture rates, which involve varying risks based on different anatomical locations\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e, ChatGPT did not address the concept of anatomical location variance. Understanding this variance is crucial for neurosurgeons and patients, and it also impacts management. 
We hypothesized that if the inquirers were from another medical specialty or laypersons without the concept of anatomical location variance, they might inquire about the risk while not taking into account the impact of the anatomical location. ChatGPT's omission of this variance led to a 3-point rating. Nevertheless, it did offer a commonly recognized cutoff value for small aneurysms\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e,\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e. Regarding early management, the ChatGPT response acknowledged the importance of enlargement as a key indication, which is consistent with the AHA guidelines. However, due to minor omissions, it received a 3-point rating.\u003c/p\u003e \u003cp\u003eThe treatment approaches for UIAs have undergone substantial transformations, primarily driven by the growing prominence of endovascular therapy\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Notably, between 2001 and 2008, the number of patients receiving endovascular coiling for UIAs surpassed those undergoing surgical clipping, with 34,054 cases compared to 29,866, respectively, as indicated by data from the National Inpatient Sample\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. The International Study of Unruptured Intracranial Aneurysms (ISUIA)\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e provided essential natural history data on UIAs and information related to their treatment. The ISUIA, encompassing 4,060 eligible patients, included 1,917 individuals who underwent surgical treatment for UIAs and 451 who received endovascular coiling. 
The findings indicated that surgical clipping resulted in a higher one-year morbidity and mortality rate than endovascular coiling and that the endovascular approach was less influenced by patient age, potentially making it a preferable choice, particularly for older individuals. Nonetheless, the long-term effectiveness of aneurysm occlusion using endovascular coiling can pose challenges, particularly in younger patients and those with aneurysms located at bifurcation sites, such as the middle cerebral artery bifurcation and basilar artery termini\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. Although the recurrence rate is relatively low, it is essential to maintain a regular surveillance regimen to monitor these cases over time\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Despite increased awareness of the advantages and disadvantages of both treatments, the decision-making process regarding surgical clipping or endovascular treatment of UIAs is highly intricate and involves numerous considerations. Factors such as the surgeon's technique and experience also play a role, and even the AHA guidelines do not provide definitive directives\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. Consequently, enabling ChatGPT to offer comprehensive responses on such matters is challenging. Even within the management segment, ChatGPT may present content lacking evidence-based support, leading to a 2-point rating. However, this is consistent with the statement that the decision between surgical clipping and endovascular treatment is complex and should involve collaboration with a multidisciplinary team of neurosurgeons, interventional neuroradiologists, and other specialists. 
This approach prevents other physicians or the general public from prematurely assuming the preferred treatment without robust evaluation.\u003c/p\u003e"},{"header":"Conclusions","content":"\u003cp\u003eBy establishing a scoring system, we assessed the accuracy of ChatGPT responses to questions related to UIAs. ChatGPT provided excellent responses on screening, risk factors, and diagnostic tools. However, there is room for improvement in terms of the rupture risk and management. With future guideline updates, AI is expected to provide better responses and analyses.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cp\u003eAHA, American Heart Association\u003c/p\u003e\u003cp\u003eAI, artificial intelligence\u003c/p\u003e\u003cp\u003eChatGPT, Chat Generative Pre-trained Transformer\u003c/p\u003e\u003cp\u003eUIA, unruptured intracranial aneurysm\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eFunding\u003c/h2\u003e \u003cp\u003eNot applicable\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eY.C. and C.Y.H. wrote the main manuscript text. P.H.L. was responsible for experimental design and execution, ensuring the reliability of the conducted experiments. C.C.H. and C.E.W. conducted an extensive literature review and organized data. P.S.P., J.S.L. and L.C.W. critically reviewed the study design and manuscript. 
All authors reviewed the manuscript to ensure its quality and accuracy.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e \u003cp\u003eNot applicable\u003c/p\u003e\u003ch2\u003eAvailability of Data and Materials\u003c/h2\u003e \u003cp\u003eThe datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003ch2\u003eCompeting interests\u003c/h2\u003e\n\u003cp\u003eAll authors declare no competing interests.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eBrown RD, Jr., Broderick JP. Unruptured intracranial aneurysms: epidemiology, natural history, management options, and familial screening. Lancet Neurol. 2014;13(4):393\u0026ndash;404.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEtminan N, Rinkel GJ. Unruptured intracranial aneurysms: development, rupture and preventive management. Nat Rev Neurol. 2016;12(12):699\u0026ndash;713.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePontes FGB, da Silva EM, Baptista-Silva JC, Vasconcelos V. Treatments for unruptured intracranial aneurysms. Cochrane Database Syst Rev. 2021;5(5):Cd013312.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eG\u0026uuml;resir E, Vatter H, Schuss P, et al. Natural history of small unruptured anterior circulation aneurysms: a prospective cohort study. Stroke. 2013;44(11):3027\u0026ndash;3031.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTakao H, Nojo T, Ohtomo K. Screening for familial intracranial aneurysms: decision and cost-effectiveness analysis. Acad Radiol. 2008;15(4):462\u0026ndash;471.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThompson BG, Brown RD, Jr., Amin-Hanjani S, et al. Guidelines for the Management of Patients With Unruptured Intracranial Aneurysms: A Guideline for Healthcare Professionals From the American Heart Association/American Stroke Association. Stroke. 
2015;46(8):2368\u0026ndash;2400.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWill ChatGPT transform healthcare? Nat Med. 2023;29(3):505\u0026ndash;506.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCascella M, Montomoli J, Bellini V, Bignami E. Evaluating the Feasibility of ChatGPT in Healthcare: An Analysis of Multiple Clinical and Research Scenarios. J Med Syst. 2023;47(1):33.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eScimeca M, Bonfiglio R. Dignity of Science and the use of ChatGPT as a co-author. ESMO Open. 2023;8(5):101621.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDuey AH, Nietsch KS, Zaidat B, et al. Thromboembolic prophylaxis in spine surgery: an analysis of ChatGPT recommendations. Spine J. 2023.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRinkel GJ, Djibuti M, Algra A, van Gijn J. Prevalence and risk of rupture of intracranial aneurysms: a systematic review. Stroke. 1998;29(1):251\u0026ndash;256.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBroderick JP, Brown RD, Jr., Sauerbeck L, et al. Greater rupture risk for familial as compared to sporadic unruptured intracranial aneurysms. Stroke. 2009;40(6):1952\u0026ndash;1957.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMackey J, Brown RD, Jr., Moomaw CJ, et al. Unruptured intracranial aneurysms in the Familial Intracranial Aneurysm and International Study of Unruptured Intracranial Aneurysms cohorts: differences in multiplicity and location. J Neurosurg. 2012;117(1):60\u0026ndash;64.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePierot L, Wakhloo AK. Endovascular treatment of intracranial aneurysms: current status. Stroke. 2013;44(7):2046\u0026ndash;2054.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBrinjikji W, Rabinstein AA, Lanzino G, Kallmes DF, Cloft HJ. 
Effect of age on outcomes of treatment of unruptured cerebral aneurysms: a study of the National Inpatient Sample 2001\u0026ndash;2008. Stroke. 2011;42(5):1320\u0026ndash;1324.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUnruptured intracranial aneurysms\u0026ndash;risk of rupture and risks of surgical intervention. N Engl J Med. 1998;339(24):1725\u0026ndash;1733.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchaafsma JD, Sprengers ME, van Rooij WJ, et al. Long-term recurrent subarachnoid hemorrhage after adequate coiling versus clipping of ruptured intracranial aneurysms. Stroke. 2009;40(5):1758\u0026ndash;1763.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRinger AJ, Rodriguez-Mercado R, Veznedaroglu E, et al. Defining the risk of retreatment for aneurysm recurrence or residual after initial treatment by endovascular coiling: a multicenter study. Neurosurgery. 2009;65(2):311\u0026ndash;315; discussion 315.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDarsaut TE, Estrade L, Jamali S, Bojanowski MW, Chagnon M, Raymond J. Uncertainty and agreement in the management of unruptured intracranial aneurysms. J Neurosurg. 
2014;120(3):618\u0026ndash;623.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"artificial intelligence, intracranial aneurysm, AHA guideline","lastPublishedDoi":"10.21203/rs.3.rs-3897237/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3897237/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eIntroduction\u003c/h2\u003e \u003cp\u003eEfficient diagnosis and intervention for unruptured intracranial aneurysms (UIAs) are crucial for favorable outcomes. Our study aimed to evaluate the accuracy and alignment of Chat Generative Pre-trained Transformer (ChatGPT) with established medical standards by systematically evaluating its responses using the American Heart Association (AHA) guidelines for the management of UIAs as a reference. 
This initiative bridges advanced artificial intelligence (AI) technology and medical practice norms, and contributes to the discussion on the role of AI in the dissemination of medical information.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e \u003cp\u003e In our collaborative study, we systematically assessed ChatGPT 3.5's responses by posing clinical questions aligned with AHA guidelines and evaluating them on a 1 to 5 scale for agreement and comprehensiveness. This method allowed us to objectively gauge ChatGPT's alignment with AHA medical guidelines.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eWe introduced a set of ten clinical questions related to UIAs. Within this set, ChatGPT's responses achieved a 5-point rating for four questions. A further four questions were rated 3 points, and the remaining two questions received a score of 2.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eBy establishing a scoring system, we assessed the accuracy of ChatGPT responses to questions related to UIAs. It provides excellent results for screening, risk factors, and as a diagnostic tool. 
However, there is room for improvement in terms of the rupture risk and management.\u003c/p\u003e","manuscriptTitle":"Comparing ChatGPT Responses with AHA Guidelines for Assessing Unruptured Intracranial Aneurysms: Establishment of a Simple Rating System","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-02-07 17:03:48","doi":"10.21203/rs.3.rs-3897237/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"46efcc1d-5eb9-428f-a472-9cf99869bd38","owner":[],"postedDate":"February 7th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":28612920,"name":"Biological sciences/Neuroscience"},{"id":28612921,"name":"Health sciences/Anatomy"},{"id":28612922,"name":"Health sciences/Medical research"}],"tags":[],"updatedAt":"2025-03-06T04:53:35+00:00","versionOfRecord":[],"versionCreatedAt":"2024-02-07 17:03:48","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3897237","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3897237","identity":"rs-3897237","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}


Source provenance

europepmc: last seen: 2026-05-20T01:45:00.602351+00:00
unpaywall: last seen: 2026-05-23T02:00:01.238055+00:00

License: CC-BY-4.0