Comparison of the accuracy and reliability of ChatGPT-4o and Gemini in answering HIV-related questions | Research Square

Research Article

Muhammet Salih Tarhan, Meryem Sahin Ozdemir

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7541042/v1
This work is licensed under a CC BY 4.0 License
Status: Posted. Version 1 posted.

Abstract

Background: This is the first study to evaluate the accuracy and reliability of the ChatGPT and Gemini chatbots on HIV.

Methods: A total of 156 questions about HIV in three categories (CDC, guideline, and social media) were posed to both ChatGPT and Gemini. The chatbots' answers were scored on a scale of 1 to 4 (1 = completely wrong, 4 = completely correct) by two infectious disease experts. The reproducibility of both chatbots was also analysed.
Results: The mean score of the answers generated for all questions was 3.69 ± 0.72 for ChatGPT and 3.55 ± 0.81 for Gemini (p = 0.051). The rate of completely correct answers was 81.4% for ChatGPT and 71.8% for Gemini (p = 0.045). ChatGPT answered guideline questions with lower accuracy than CDC questions (47.9% vs. 97.1%, p < 0.001) and social media questions (47.9% vs. 94.9%, p < 0.001). Similarly, Gemini answered guideline questions with lower accuracy than CDC questions (35.4% vs. 88.4%, p < 0.001) and social media questions (35.4% vs. 87.2%, p < 0.001). By topic, both chatbots had their lowest accuracy rate on “Prevention and Treatment” questions (67.2% for ChatGPT, 54.7% for Gemini). The reproducibility of the answers was 94.8% for ChatGPT and 90.3% for Gemini.

Conclusion: ChatGPT and Gemini answered the CDC and social media questions with high accuracy. However, both chatbots need improvement on guideline questions and questions on “Prevention and Treatment”. Therefore, these applications need to be improved before they are used by healthcare professionals.

Keywords: AIDS, Artificial intelligence, ChatGPT, Gemini, HIV

Figures: Figure 1

Background

As artificial intelligence (AI) applications become more widespread, their roles in social life are also expanding. One of these areas is health services, where AI has the potential to bring about significant developments [ 1 ]. The objective of AI applications is to provide reliable information by utilizing extensive data sets and continuously evolving algorithms [ 2 ]. The ability to rapidly access accurate information about diseases and preventive measures without consulting healthcare professionals is a significant benefit. The increasing reliance on AI is prompting people to pose a multitude of queries to these systems and to modify their behaviors in accordance with the responses they receive.
ChatGPT (Chat Generative Pre-trained Transformer) and Gemini (Google, Mountain View, California, USA) are among the most frequently utilized text-based AI applications [ 3 , 4 ]. Both applications draw on many sources to answer health-related questions and relay this information to their users according to their algorithms. However, the quality and adequacy of their answers remain under debate.

Human Immunodeficiency Virus (HIV), identified approximately 43 years ago, has caused an ongoing pandemic that has killed more than 40 million people and currently affects 40 million people worldwide. While life expectancy for people living with HIV has significantly improved due to highly effective antiretroviral therapy, there has been little progress toward eradicating the virus [ 5 ]. This suggests that HIV will continue to be a significant public health problem. People living with HIV may have many questions about different aspects of the disease over this long course. In addition, people seeking to protect themselves against HIV look for information on many topics, particularly transmission routes and protective measures. Those unable to access healthcare professionals for such questions may instead turn to artificial intelligence. It is, therefore, crucial to verify the reliability of the responses provided by AI. However, to the best of our knowledge, no study has been conducted on the accuracy and reliability of AI chatbots' knowledge about HIV. In this study, we aimed to assess the reliability of responses from two commonly used AI applications to questions about HIV.

Methods

In this study, ChatGPT 4o and Google Gemini 1.5 Flash were used to answer the questions. A total of 156 HIV-related questions were posed to both chatbots between 15 November 2024 and 30 November 2024. The questions were categorized into three main groups.
In the first group, the questions were prepared from the Centers for Disease Control and Prevention (CDC) “questions and answers for the public”. In the second group, the questions were prepared from the European AIDS Clinical Society (EACS) guideline version 12.1 [ 6 ] and the United States Department of Health and Human Services (DHHS) panel on antiretroviral guidelines for adults and adolescents [ 7 ]. In the third group, the questions were prepared from frequently asked questions on social media platforms (Google, X [formerly named Twitter], Facebook, YouTube). The terms “human immunodeficiency virus”, “HIV”, “acquired immunodeficiency syndrome”, and “AIDS” were searched online. Questions about personal information, repetitive questions, questions with unclear answers, unrealistic questions, and questions with grammatical errors were excluded. Of the 156 questions, 44.2% (n = 69) were prepared from the CDC, 30.8% (n = 48) from guidelines, and 25.0% (n = 39) from social media. The questions were also categorized into four topics: “General information”, “Transmission”, “Diagnosis” and “Prevention and Treatment”.

Two specialists in infectious diseases and clinical microbiology (M.S.T. and M.S.O.) assessed the answers separately. A third infectious diseases and clinical microbiology specialist (Y.E.O.) reviewed responses on which the two specialists did not agree. Responses on which the three specialists still disagreed were evaluated jointly, and the final score was determined by complete agreement. The reviewers used a scoring system from 1 to 4 points to assess the quality and reliability of the answers given by ChatGPT and Gemini. The scoring according to the adequacy and quality of the answers was as follows:

1 point: An answer that was completely incorrect or irrelevant.
2 points: An answer that was partially correct but included some misleading information.
3 points: An answer that was generally correct but did not have sufficient detail.
4 points: An answer that was completely correct and sufficient.

The scores given by the reviewers to the answers of ChatGPT and Gemini are listed in Supplementary Table 1 and Supplementary Table 2, respectively.

The consistency of the answers given by each chatbot to the same question at different times and on different computers was also evaluated. Repeatability was evaluated as positive if the answers given when the same question was asked on different computers received the same score. If the answers to the same question did not receive the same score, repeatability was evaluated as negative. In this case, only the first response from the chatbots was scored.

Ethics committee approval was not required because this study did not include patient data.

Statistical analysis

Statistical Package for Social Sciences (SPSS) version 25.0 was used for statistical analysis. Categorical variables were presented as numbers (n) and percentages (%), and continuous variables were presented as mean ± standard deviation (SD). The chi-square test was used to compare categorical variables. Normally distributed continuous variables were compared with the t-test; otherwise, the Mann-Whitney U test was used. Spearman’s correlation analysis was applied to investigate the relationship between the responses generated by ChatGPT and Gemini. To assess inter-rater reliability, Cohen's kappa coefficient was calculated. A p value < 0.05 was considered statistically significant.

Results

The 156 questions included in the study were classified into four categories according to their topics: “General information” (n = 43, 27.6%), “Transmission” (n = 31, 19.9%), “Diagnosis” (n = 18, 11.5%), and “Prevention and Treatment” (n = 64, 41.0%). The mean score of the answers generated by ChatGPT for all questions was 3.69 ± 0.72.
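As a rough consistency check on how the summary statistics follow from the 1 to 4 scores, the sketch below recomputes the overall mean ± SD and the share of completely correct answers from the per-score counts reported in Table 1. It assumes the sample standard deviation (n − 1 denominator), which the paper does not specify, and the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Per-score answer counts for all 156 questions, taken from Table 1.
SCORE_COUNTS = {
    "ChatGPT": {1: 5, 2: 9, 3: 15, 4: 127},
    "Gemini": {1: 5, 2: 16, 3: 23, 4: 112},
}


def summarize(counts):
    """Return (mean, sample SD, % of 4-point answers) for a score histogram."""
    # Expand the histogram back into one score per question.
    scores = [score for score, n in counts.items() for _ in range(n)]
    pct_correct = 100 * counts[4] / len(scores)
    # stdev() uses the n - 1 denominator (an assumption about the paper's SD).
    return mean(scores), stdev(scores), pct_correct


for name, counts in SCORE_COUNTS.items():
    m, sd, pct = summarize(counts)
    print(f"{name}: {m:.2f} ± {sd:.2f}, completely correct {pct:.1f}%")
```

Under these assumptions the counts give 3.69 ± 0.72 (81.4%) for ChatGPT and 3.55 ± 0.81 (71.8%) for Gemini, in line with the reported values.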
The mean score of the answers to the guideline questions was significantly lower in comparison to both the CDC questions (3.08 ± 1.05 vs. 3.97 ± 0.17, p < 0.001) and the social media questions (3.08 ± 1.05 vs. 3.95 ± 0.22, p = 0.001) (Fig. 1 ). The mean score of the answers to the “Prevention and Treatment” questions was significantly lower than those for the “General information” (3.38 ± 1.00 vs. 3.88 ± 0.32, p = 0.006), “Transmission” (3.38 ± 1.00 vs. 3.97 ± 0.18, p = 0.001) and “Diagnosis” (3.38 ± 1.00 vs. 3.89 ± 0.32, p = 0.049) questions. No significant differences were found between other topics except for “Prevention and Treatment” questions (p > 0.05). ChatGPT answered 81.4% of all questions completely correctly, 9.6% correctly but inadequately, 5.8% misleadingly, and 3.2% completely incorrectly. The distribution of the scores given to the answers according to the main groups and the topics is shown in Table 1 . The rate of completely correct answers was lower for the guideline questions than for the CDC questions (47.9% vs. 97.1%, p < 0.001, OR = 37.0, 95% CI:8.00-166.7) and the social media questions (47.9% vs. 94.9%, p < 0.001, OR = 20.1, 95% CI:4.35-93.0) (Fig. 1 ). The rate of completely correct answers to the questions about “Prevention and Treatment” was significantly lower than to the questions about “General information” (67.2% vs. 88.4%, p = 0.012, OR = 3.72, 95% CI:1.28–10.8) and “Transmission” (67.2% vs. 96.8%, p = 0.001, OR = 3.90, 95% CI:0.82–18.5). No significant difference was found between the other topics in terms of completely correct answer rates (p > 0.05). 
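The odds ratios above can be approximated from the counts in Table 1. The following is a minimal sketch, assuming the confidence interval is a Woolf (log-based) interval; the paper does not state which method SPSS was asked to use, and the function name `odds_ratio_ci` is hypothetical. It is applied to ChatGPT's completely correct answers on CDC questions (67 of 69) versus guideline questions (23 of 48).

```python
import math


def odds_ratio_ci(hits_a, n_a, hits_b, n_b, z=1.959964):
    """Odds ratio of group A vs. group B with a Woolf (log-based) 95% CI.

    Assumes no cell of the 2x2 table is zero; a zero cell would need a
    continuity correction, which this sketch does not apply.
    """
    miss_a, miss_b = n_a - hits_a, n_b - hits_b
    odds_ratio = (hits_a * miss_b) / (miss_a * hits_b)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / hits_a + 1 / miss_a + 1 / hits_b + 1 / miss_b)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# ChatGPT, completely correct answers: CDC (67/69) vs. guidelines (23/48).
or_value, ci_low, ci_high = odds_ratio_ci(67, 69, 23, 48)
print(f"OR = {or_value:.1f}, 95% CI {ci_low:.2f} to {ci_high:.1f}")
```

With these counts the sketch gives an odds ratio of about 36.4 with an interval of roughly 8.0 to 166, close to the reported OR = 37.0 (95% CI 8.00 to 166.7); the small gap may reflect rounding in the original analysis.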
Table 1. Distribution of the scores of the answers of both AIs according to the main groups and the topics

| Chatbot | Main group | Topic | Mean Score ± SD | 1-point n (%) | 2-points n (%) | 3-points n (%) | 4-points n (%) | Total n (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ChatGPT | CDC Questions | General Information | 4.00 ± 0.0 | 0 (0) | 0 (0) | 0 (0) | 20 (100) | 20 (100) |
| ChatGPT | CDC Questions | Transmission | 4.00 ± 0.0 | 0 (0) | 0 (0) | 0 (0) | 25 (100) | 25 (100) |
| ChatGPT | CDC Questions | Diagnosis | 3.85 ± 0.38 | 0 (0) | 0 (0) | 2 (15.4) | 11 (84.6) | 13 (100) |
| ChatGPT | CDC Questions | Prevention and Treatment | 4.00 ± 0.0 | 0 (0) | 0 (0) | 0 (0) | 11 (100) | 11 (100) |
| ChatGPT | CDC Questions | Total | 3.97 ± 0.17 | 0 (0) | 0 (0) | 2 (2.9) | 67 (97.1) | 69 (100) |
| ChatGPT | Guidelines Questions | General Information | 3.60 ± 0.52 | 0 (0) | 0 (0) | 4 (40) | 6 (60) | 10 (100) |
| ChatGPT | Guidelines Questions | Prevention and Treatment | 2.95 ± 1.11 | 5 (13.2) | 9 (23.7) | 7 (18.4) | 17 (44.7) | 38 (100) |
| ChatGPT | Guidelines Questions | Total | 3.08 ± 1.05 | 5 (10.4) | 9 (18.8) | 11 (22.9) | 23 (47.9) | 48 (100) |
| ChatGPT | Social Media Questions | General Information | 3.92 ± 0.28 | 0 (0) | 0 (0) | 1 (7.7) | 12 (92.3) | 13 (100) |
| ChatGPT | Social Media Questions | Transmission | 3.83 ± 0.41 | 0 (0) | 0 (0) | 1 (16.7) | 5 (83.3) | 6 (100) |
| ChatGPT | Social Media Questions | Diagnosis | 4.00 ± 0.0 | 0 (0) | 0 (0) | 0 (0) | 5 (100) | 5 (100) |
| ChatGPT | Social Media Questions | Prevention and Treatment | 4.00 ± 0.0 | 0 (0) | 0 (0) | 0 (0) | 15 (100) | 15 (100) |
| ChatGPT | Social Media Questions | Total | 3.95 ± 0.22 | 0 (0) | 0 (0) | 2 (5.1) | 37 (94.9) | 39 (100) |
| ChatGPT | All Questions | Total | 3.69 ± 0.72 | 5 (3.2) | 9 (5.8) | 15 (9.6) | 127 (81.4) | 156 (100) |
| Gemini | CDC Questions | General Information | 3.95 ± 0.22 | 0 (0) | 0 (0) | 1 (5.0) | 19 (95.0) | 20 (100) |
| Gemini | CDC Questions | Transmission | 3.84 ± 0.47 | 0 (0) | 1 (4.0) | 2 (8.0) | 22 (88.0) | 25 (100) |
| Gemini | CDC Questions | Diagnosis | 3.69 ± 0.63 | 0 (0) | 1 (7.7) | 2 (15.4) | 10 (76.9) | 13 (100) |
| Gemini | CDC Questions | Prevention and Treatment | 3.91 ± 0.30 | 0 (0) | 0 (0) | 1 (9.1) | 10 (90.9) | 11 (100) |
| Gemini | CDC Questions | Total | 3.86 ± 0.43 | 0 (0) | 2 (2.9) | 6 (8.7) | 61 (88.4) | 69 (100) |
| Gemini | Guidelines Questions | General Information | 3.60 ± 0.52 | 0 (0) | 0 (0) | 4 (40) | 6 (60) | 10 (100) |
| Gemini | Guidelines Questions | Prevention and Treatment | 2.68 ± 1.04 | 5 (13.2) | 13 (34.2) | 9 (23.7) | 11 (28.9) | 38 (100) |
| Gemini | Guidelines Questions | Total | 2.88 ± 1.02 | 5 (10.4) | 13 (27.1) | 13 (27.1) | 17 (35.4) | 48 (100) |
| Gemini | Social Media Questions | General Information | 3.85 ± 0.38 | 0 (0) | 0 (0) | 2 (15.4) | 11 (84.6) | 13 (100) |
| Gemini | Social Media Questions | Transmission | 4.00 ± 0.0 | 0 (0) | 0 (0) | 0 (0) | 6 (100) | 6 (100) |
| Gemini | Social Media Questions | Diagnosis | 3.40 ± 0.89 | 0 (0) | 1 (20.0) | 1 (20.0) | 3 (60.0) | 5 (100) |
| Gemini | Social Media Questions | Prevention and Treatment | 3.93 ± 0.26 | 0 (0) | 0 (0) | 1 (6.7) | 14 (93.3) | 15 (100) |
| Gemini | Social Media Questions | Total | 3.85 ± 0.43 | 0 (0) | 1 (2.6) | 4 (10.3) | 34 (87.2) | 39 (100) |
| Gemini | All Questions | Total | 3.55 ± 0.81 | 5 (3.2) | 16 (10.3) | 23 (14.7) | 112 (71.8) | 156 (100) |

CDC: Centers for Disease Control and Prevention

The mean score of the answers generated by Gemini for all questions was 3.55 ± 0.81. The mean score for the guideline questions was found to be significantly lower than that for the CDC questions (2.88 ± 1.02 vs. 3.86 ± 0.43, p < 0.001) and the social media questions (2.88 ± 1.02 vs. 3.85 ± 0.43, p < 0.001) (Fig. 1 ). The mean score of the answers to the “Prevention and Treatment” questions was significantly lower than those for the “General information” (3.19 ± 1.02 vs. 3.84 ± 0.37, p < 0.001) and “Transmission” (3.19 ± 1.02 vs. 3.87 ± 0.43, p 0.05).

Gemini answered 71.8% of all questions completely correctly, 14.7% correctly but inadequately, 10.3% misleadingly, and 3.2% completely incorrectly (Table 1 ). The rate of completely correct answers was lower for the guideline questions than for CDC questions (35.4% vs. 88.4%, p < 0.001, OR = 13.9, 95% CI:5.41–35.7) and the social media questions (35.4% vs. 87.2%, p < 0.001, OR = 12.4, 95% CI:4.09–37.6) (Fig. 1 ). The rate of completely correct answers to the questions on “Prevention and Treatment” was significantly lower than to the questions on “General information” (54.7% vs. 83.7%, p = 0.002, OR = 4.26, 95% CI:1.65-11.0) and “Transmission” (54.7% vs. 90.3%, p = 0.001, OR = 7.75, 95% CI:0.69–6.76). There was no significant difference between the other topics in terms of the rate of completely correct answers (p > 0.05).

The mean score of ChatGPT's answers was higher than that of Gemini for all HIV-related questions, although this was not statistically significant (3.69 ± 0.72 vs. 3.55 ± 0.81, p = 0.051). ChatGPT answered the CDC questions with a higher score than Gemini (3.97 ± 0.17 vs. 3.86 ± 0.43 p = 0.048).
According to other main groups and the topics, there was no statistically significant difference between ChatGPT and Gemini (Table 2 ).

Table 2. Comparison of the mean scores of the answers of both AIs according to the main groups and the topics

| | n | ChatGPT Mean Score ± SD | Gemini Mean Score ± SD | p |
| --- | --- | --- | --- | --- |
| All Questions | 156 | 3.69 ± 0.72 | 3.55 ± 0.81 | 0.051 |
| Main Groups: CDC Questions | 69 | 3.97 ± 0.17 | 3.86 ± 0.43 | 0.048 |
| Main Groups: Guidelines Questions | 48 | 3.08 ± 1.05 | 2.88 ± 1.02 | 0.272 |
| Main Groups: Social Media Questions | 39 | 3.95 ± 0.22 | 3.85 ± 0.43 | 0.230 |
| Topics: General Information | 43 | 3.88 ± 0.32 | 3.84 ± 0.37 | 0.536 |
| Topics: Transmission | 31 | 3.97 ± 0.18 | 3.87 ± 0.43 | 0.298 |
| Topics: Diagnosis | 18 | 3.89 ± 0.32 | 3.61 ± 0.70 | 0.183 |
| Topics: Prevention and Treatment | 64 | 3.38 ± 1.00 | 3.19 ± 1.02 | 0.206 |

CDC: Centers for Disease Control and Prevention

ChatGPT had a higher rate than Gemini in giving completely correct answers to all questions (81.4% vs. 71.8%, p = 0.045, OR = 1.72, 95% CI:1.01–2.93). Also, in CDC questions, ChatGPT had a higher rate of correct answers than Gemini (97.1% vs. 88.4%, p = 0.049, OR = 4.39, 95% CI:0.90–21.3). There was no significant difference between the two AIs in terms of completely correct answers according to the other main groups and the topics (Table 3 ).

Table 3. Comparison of completely correct answer rates of both AIs according to the main groups and the topics

| | n | ChatGPT Completely Correct Answer Rate (%) | Gemini Completely Correct Answer Rate (%) | OR | 95% CI | p |
| --- | --- | --- | --- | --- | --- | --- |
| All Questions | 156 | 81.4 | 71.8 | 1.72 | 1.01–2.93 | 0.045 |
| Main Groups: CDC Questions | 69 | 97.1 | 88.4 | 4.39 | 0.90–21.3 | 0.049 |
| Main Groups: Guidelines Questions | 48 | 47.9 | 35.4 | 1.68 | 0.74–3.80 | 0.214 |
| Main Groups: Social Media Questions | 39 | 94.9 | 87.2 | 2.72 | 0.49–14.93 | 0.235 |
| Topics: General Information | 43 | 88.4 | 83.7 | 1.48 | 0.43–5.08 | 0.534 |
| Topics: Transmission | 31 | 96.8 | 90.3 | 3.22 | 0.32–32.3 | 0.301 |
| Topics: Diagnosis | 18 | 88.9 | 72.2 | 6.15 | 0.051–18.5 | 0.206 |
| Topics: Prevention and Treatment | 64 | 67.2 | 54.7 | 1.70 | 0.83–3.47 | 0.147 |

CDC: Centers for Disease Control and Prevention

There was a high positive correlation between ChatGPT and Gemini answers (r = 0.680, p < 0.001).
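The p value for the overall comparison of completely correct answer rates (81.4% vs. 71.8%) can be approximated from the Table 1 counts. This is a sketch only: it assumes an uncorrected Pearson chi-square test on a 2x2 table, which may not match the exact SPSS procedure the authors ran, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Completely correct vs. other answers over all 156 questions (counts from Table 1):
# ChatGPT 127 of 156, Gemini 112 of 156.
chi2, p = chi_square_2x2(127, 156 - 127, 112, 156 - 112)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under this assumption the statistic is about 4.02 and the p value rounds to 0.045, which agrees with the value reported in Table 3.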
According to Cohen's kappa test, there was substantial inter-rater agreement for ChatGPT (κ = 0.674) and Gemini (κ = 0.768). The reproducibility rate for ChatGPT was 94.8%, while that for Gemini was 90.3% (p = 0.135).

Discussion

Although AI chatbots provide easy access to information on many topics, a significant disadvantage is that they can also provide false information. Providing false information about HIV can negatively affect public health and lead to inappropriate approaches among healthcare professionals. To our knowledge, no study has evaluated the reliability of AI chatbots regarding HIV, a pandemic that significantly affects public health. In this study, we evaluated the knowledge levels of the ChatGPT and Gemini chatbots about HIV. ChatGPT provided a higher rate of completely correct answers to HIV-related questions than Gemini (81.4% vs. 71.8%). Both AI applications responded to the CDC and the social media questions with high accuracy rates. However, the lowest accuracy rates were found in the guideline questions (ChatGPT 47.9%, Gemini 35.4%) and questions on “Prevention and Treatment” (ChatGPT 67.2%, Gemini 54.7%).

The information provided by AI applications in healthcare still needs improvement, but many promising results exist. In a meta-analysis evaluating 45 studies on the performance of different versions of ChatGPT in medical licensing exams, ChatGPT-4 achieved an overall accuracy rate of 81%, passed medical exams in 26 out of 29 cases, and exceeded the average scores of medical students in 13 out of 17 cases [ 8 ]. In a study of 82 clinical cases, ChatGPT-4 demonstrated a high level of agreement (95.9%) with physicians in assessing whether differential diagnosis lists contained the correct diagnosis [ 9 ].
A systematic review of 44 studies on the utilization of ChatGPT in radiology demonstrated that ChatGPT exhibits a high degree of accuracy in this field [ 10 ]. One study demonstrated that AI applications can accurately predict pathological diagnoses of neurodegenerative diseases based on clinical information [ 11 ]. In addition, AI chatbots were shown to provide generally accurate information for frequently asked questions about cataracts [ 12 ], urological malignancies [ 13 ], kidney stones [ 14 ] and amyloidosis [ 15 ].

Nevertheless, research has also demonstrated the limitations of AI applications. In a study in which 316 questions prepared for otolaryngology residents were presented to three different AI models, senior residents answered more accurately than all AI applications. ChatGPT-4 and Gemini performed similarly to junior residents, while Bing lagged behind junior residents [ 16 ]. Similarly, ChatGPT and Gemini have shown limited performance in examinations in nephrology [ 17 ]. In a study evaluating ChatGPT's approach to managing bloodstream infections in 44 cases, 16% of the approaches were considered harmful [ 18 ]. In a study evaluating the responses of four distinct AI chatbots to common emergency care questions, all responses contained small to moderate levels of potentially dangerous information [ 19 ]. In a study on the simplification of radiology reports by ChatGPT, participating radiologists found, in approximately one-third of all simplified reports, errors that could mislead patients and potentially cause physical and/or psychological harm [ 20 ]. In our study, ChatGPT and Gemini provided completely wrong or misleading answers to 9.0% and 13.5% of the questions, respectively. Answers containing incorrect information are an important limitation of AI applications, and improvements are needed in this regard.
While questions about diseases that are frequently asked by the public and on social media platforms are generally answered with high accuracy by AI chatbots, their performance decreases on more complex issues such as guideline questions and questions about treatment. In a study where ChatGPT was asked 200 social media and guideline questions on many topics related to infectious diseases, ChatGPT answered 92% of social media platform questions and 69% of guideline questions completely correctly (p = 0.001) [ 21 ]. Similarly, ChatGPT answered social media platform questions about cervical cancer with significantly higher accuracy than guideline questions (p < 0.001) [ 22 ]. In another study, ChatGPT provided adequate answers to more than 80% of frequently asked questions about osteoporosis, while it provided adequate answers to 61.3% of questions prepared from the Turkish National Osteoporosis Guide [ 23 ]. In a study in which common dermatological cases were presented, ChatGPT showed lower performance in diagnosing complex cases such as cutaneous neoplasms [ 24 ]. In our study, both AI chatbots responded with significantly lower accuracy to guideline questions and questions on “Prevention and Treatment”.

The performances of different AI applications in healthcare have been compared in several studies. In many studies, different versions of ChatGPT (3.5, 4o, and 4.0) demonstrated better performance than Gemini [ 25 – 28 ]. In some studies, ChatGPT-4 performed better than Gemini and ChatGPT-3.5, while ChatGPT-3.5 and Gemini showed similar performance [ 17 , 29 , 30 ]. Similarly, in another study comparing ChatGPT-3.5 and Gemini, both AI chatbots showed similar performance [ 31 ]. In our study, the completely correct response rate was significantly higher for ChatGPT-4o than for Gemini. However, there was no significant difference between the mean scores of the responses produced by the two chatbots.
The reproducibility of the information provided by AI chatbots is as important as reliability and accuracy, because it is very difficult to generalize study findings on platforms where repeatability is low. In existing studies in the literature, reproducibility rates have been reported to be between 70% and 100% for ChatGPT [ 22 , 23 , 32 , 33 ] and between 50% and 92% for Gemini [ 34 , 35 ]. In our study, the repeatability rates of the answers were found to be similarly high for ChatGPT and Gemini, 94.8% and 90.3%, respectively.

To our knowledge, this is the first study to evaluate the accuracy and reliability of AI chatbots in response to HIV-related questions. However, our study had several limitations. Firstly, the questions used in this study represent only a fraction of the questions that could be posed within this field. Secondly, the assessments of the experts who evaluated the responses may have been subjective, which could have affected the scoring. Thirdly, this study did not assess the comprehensibility of the responses provided by AI chatbots.

Conclusion

AI chatbots have great potential to help patients and healthcare professionals access health information quickly. In particular, they seem to be a reliable source from which the public can obtain easy and accurate information about diseases. However, caution should be exercised regarding the reliability of information provided by AI applications on complex issues of interest to healthcare professionals, such as the treatment and management of diseases. In addition, it would be useful to integrate specific guidelines for each health field into the algorithms of AI applications to increase accuracy and reliability.
Abbreviations

AI: artificial intelligence
CDC: Centers for Disease Control and Prevention
ChatGPT: Chat Generative Pre-trained Transformer
DHHS: United States Department of Health and Human Services
EACS: European AIDS Clinical Society
HIV: Human Immunodeficiency Virus

Declarations

Ethics approval and consent to participate: Not applicable.
Consent for publication: Not applicable.
Availability of data and materials: The datasets analysed during the current study are available from the corresponding author on reasonable request.
Competing interests: The authors declare that they have no competing interests.
Funding: This study was not funded by any person or organisation.
Authors’ contributions: MST and MSO contributed to the design of the concept. MST and MSO collected the data for the study. MST contributed to the analysis of the data. MST and MSO contributed to the interpretation of the data, revision of the manuscript, and review. All authors read and approved the final manuscript.
Acknowledgements: Not applicable.

References

1. Obermeyer Z, Emanuel EJ. Predicting the Future — Big Data, Machine Learning, and Clinical Medicine. N Engl J Med. 2016. https://doi.org/10.1056/NEJMp1606181 .
2. Rashid AB, Kausik MAK. AI revolutionizing industries worldwide: A comprehensive overview of its diverse applications. Hybrid Adv. 2024. https://doi.org/10.1016/J.HYBADV.2024.100277 .
3. Introducing GPT-4o and more tools to ChatGPT free users | OpenAI [Internet]. 2025. https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/ . Accessed 10 Jun 2025.
4. Gemini [Internet]. 2025. https://gemini.google.com/app . Accessed 10 Jun 2025.
5. Trickey A, Sabin CA, Burkholder G, Crane H, d’Arminio Monforte A, Egger M, et al. Life expectancy after 2015 of adults with HIV on long-term antiretroviral therapy in Europe and North America: a collaborative analysis of cohort studies. Lancet HIV. 2023. https://doi.org/10.1016/S2352-3018(23)00028-0 .
6. EACS Guidelines. 2024 — EACS Guidelines [Internet]. https://eacs.sanfordguide.com/ . Accessed 12 Nov 2024.
7. HIV/AIDS Treatment Guidelines | Clinicalinfo.HIV.gov [Internet]. https://clinicalinfo.hiv.gov/en/guidelines . Accessed 12 Nov 2024.
8. Liu M, Okuhara T, Chang XY, Shirabe R, Nishiie Y, Okada H, et al. Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis. J Med Internet Res. 2024. https://doi.org/10.2196/60807 .
9. Mizuta K, Hirosawa T, Harada Y, Shimizu T. Can ChatGPT-4 evaluate whether a differential diagnosis list contains the correct diagnosis as accurately as a physician? Diagnosis. 2024. https://doi.org/10.1515/dx-2024-0027 .
10. Keshavarz P, Bagherieh S, Nabipoorashrafi SA, Chalian H, Rahsepar AA, Kim GHJ, et al. ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives. Diagn Interv Imaging. 2024. https://doi.org/10.1016/j.diii.2024.04.003 .
11. Koga S, Martin NB, Dickson DW. Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders. Brain Pathology. 2023. https://doi.org/10.1111/bpa.13207 .
12. Yılmaz İE, Doğan L. Talking technology: exploring chatbots as a tool for cataract patient education. Clin Exp Optom. 2024. https://doi.org/10.1080/08164622.2023.2298812 .
13. Musheyev D, Pan A, Loeb S, Kabarriti AE. How Well Do Artificial Intelligence Chatbots Respond to the Top Search Queries About Urological Malignancies? Eur Urol. 2024. https://doi.org/10.1016/j.eururo.2023.07.004 .
14. Musheyev D, Pan A, Kabarriti AE, Loeb S, Borin JF. Quality of Information About Kidney Stones from Artificial Intelligence Chatbots. J Endourol. 2024. https://doi.org/10.1089/end.2023.0484 .
15. King RC, Samaan JS, Yeo YH, Peng Y, Kunkel DC, Habib AA, et al. A Multidisciplinary Assessment of ChatGPT’s Knowledge of Amyloidosis: Observational Study. JMIR Cardio. 2024. https://doi.org/10.2196/53421 .
16. Mete U. Evaluating the Performance of ChatGPT, Gemini, and Bing Compared with Resident Surgeons in the Otorhinolaryngology In-service Training Examination. Turk Arch Otorhinolaryngol. 2024. https://doi.org/10.4274/tao.2024.3.5 .
17. Noda R, Izaki Y, Kitano F, Komatsu J, Ichikawa D, Shibagaki Y. Performance of ChatGPT and Bard in self-assessment questions for nephrology board renewal. Clin Exp Nephrol. 2024. https://doi.org/10.1007/s10157-023-02451-w .
18. Maillard A, Micheli G, Lefevre L, Guyonnet C, Poyart C, Canouï E, et al. Can Chatbot Artificial Intelligence Replace Infectious Diseases Physicians in the Management of Bloodstream Infections? A Prospective Cohort Study. Clin Infect Dis. 2024. https://doi.org/10.1093/cid/ciad632 .
19. Yau JYS, Saadat S, Hsu E, Murphy LSL, Roh JS, Suchard J, et al. Accuracy of Prospective Assessments of 4 Large Language Model Chatbot Responses to Patient Questions About Emergency Care: Experimental Comparative Study. J Med Internet Res. 2024. https://doi.org/10.2196/60291 .
20. Jeblick K, Schachtner B, Dexl J, Mittermeier A, Stüber AT, Topalis J, et al. ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports. Eur Radiol. 2023. https://doi.org/10.1007/s00330-023-10213-1 .
21. Tunçer G, Güçlü KG. How Reliable is ChatGPT as a Novel Consultant in Infectious Diseases and Clinical Microbiology? Infectious Diseases & Clinical Microbiology. 2024. https://doi.org/10.36519/idcm.2024.286 .
22. Yurtcu E, Ozvural S, Keyif B. Analyzing the performance of ChatGPT in answering inquiries about cervical cancer. Int J Gynecol Obstet. 2024. https://doi.org/10.1002/ijgo.15861 .
23. Cinar C. Analyzing the Performance of ChatGPT About Osteoporosis. Cureus. 2023. https://doi.org/10.7759/cureus.45890 .
24. Goktas P, Grzybowski A. Assessing the Impact of ChatGPT in Dermatology: A Comprehensive Rapid Review. J Clin Med. 2024. https://doi.org/10.3390/jcm13195909 .
25. Carlà MM, Gambini G, Baldascino A, Giannuzzi F, Boselli F, Crincoli E, et al. Exploring AI-chatbots’ capability to suggest surgical planning in ophthalmology: ChatGPT versus Google Gemini analysis of retinal detachment cases. Br J Ophthalmol. 2024. https://doi.org/10.1136/bjo-2023-325143 .
26. Strzalkowski P, Strzalkowska A, Chhablani J, Pfau K, Errera MH, Roth M, et al. Evaluation of the accuracy and readability of ChatGPT-4 and Google Gemini in providing information on retinal detachment: a multicenter expert comparative study. Int J Retina Vitreous. 2024. https://doi.org/10.1186/s40942-024-00579-9 .
27. Rossettini G, Rodeghiero L, Corradi F, Cook C, Pillastrini P, Turolla A, et al. Comparative accuracy of ChatGPT-4, Microsoft Copilot and Google Gemini in the Italian entrance test for healthcare sciences degrees: a cross-sectional study. BMC Med Educ. 2024. https://doi.org/10.1186/s12909-024-05630-9 .
28. Abdul Sami M, Abdul Samad M, Parekh K, Suthar PP. Comparative Accuracy of ChatGPT 4.0 and Google Gemini in Answering Pediatric Radiology Text-Based Questions. Cureus. 2024. https://doi.org/10.7759/cureus.70897 .
29. Toyama Y, Harigai A, Abe M, Nagano M, Kawabata M, Seki Y, et al. Performance evaluation of ChatGPT, GPT-4, and Bard on the official board examination of the Japan Radiology Society. Jpn J Radiol. 2024. https://doi.org/10.1007/s11604-023-01491-2 .
30. Khan AA, Yunus R, Sohail M, Rehman TA, Saeed S, Bu Y, et al. Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models. J Cardiothorac Vasc Anesth. 2024. https://doi.org/10.1053/j.jvca.2024.01.032 .
31. Doğan L, Özçakmakcı GB, Yılmaz İE. The Performance of Chatbots and the AAPOS Website as a Tool for Amblyopia Education. J Pediatr Ophthalmol Strabismus. 2024. https://doi.org/10.3928/01913913-20240409-01 .
32. Talyshinskii A, Juliebø-Jones P, Zeeshan Hameed BM, Naik N, Adhikari K, Zhanbyrbekuly U, et al. ChatGPT as a Clinical Decision Maker for Urolithiasis: Compliance with the Current European Association of Urology Guidelines. Eur Urol Open Sci. 2024. https://doi.org/10.1016/j.euros.2024.08.015 .
33. Ozgor BY, Simavi MA. Accuracy and reproducibility of ChatGPT’s free version answers about endometriosis. Int J Gynecol Obstet. 2024. https://doi.org/10.1002/ijgo.15309 .
34. Iannantuono GM, Bracken-Clarke D, Karzai F, Choo-Wosoba H, Gulley JL, Floudas CS. Comparison of Large Language Models in Answering Immuno-Oncology Questions: A Cross-Sectional Study. Oncologist. 2024. https://doi.org/10.1093/oncolo/oyae009 .
35. Sahin Ozdemir M, Ozdemir YE. Comparison of the performances between ChatGPT and Gemini in answering questions on viral hepatitis. Sci Rep. 2025. https://doi.org/10.1038/s41598-024-83575-1 .

Additional Declarations

No competing interests reported.

Supplementary Files

SupplementaryTable1.docx: Supplementary Table 1. Scores for answers generated by ChatGPT according to the reviewers
SupplementaryTable2.docx: Supplementary Table 2. Scores for answers generated by Gemini according to the reviewers
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7541042","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":510828685,"identity":"9d3fce85-5821-4d10-a9d6-5bf6c70340b5","order_by":0,"name":"Muhammet Salih Tarhan","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABIklEQVRIiWNgGAWjYFACHhiDueEwiDJgYGB8ABLnI6BFAqgQqCUBrIXZACTORowWZqgWNgmQEC4tujNyD366mWNTJ99+sPFw4Q+7fHP2s8cqv+bYybAxMD98dANTi9mNvGTp3G1pEgZnEhsOz0hIttzZk5d2W3ZbMtBhbMbGOdi05BgAtRyWMGAAauFJYDYwOJBjdltyGzNQCw+bNHYtxr9zt/2XkO9/CNJSb2Bw/o1ZseS2enxazIC2HJBguAG25bCBAVCE8eO2w7i1nHljZp27LVlyww2QLWnHgVreGEszbjvOw8aMwy/Hc4xv526z45fvTz78mcemGuiwHMOPP7dV2/OzNz98jEULdsAMjixmYpWDAOMPUlSPglEwCkbBcAcASpFlUC9sg5cAAAAASUVORK5CYII=","orcid":"","institution":"Mardin Training and Research Hospital","correspondingAuthor":true,"prefix":"","firstName":"Muhammet","middleName":"Salih","lastName":"Tarhan","suffix":""},{"id":510828686,"identity":"351c6312-c9e5-4721-b48c-68b6dfffa086","order_by":1,"name":"Meryem Sahin Ozdemir","email":"","orcid":"","institution":"Basaksehir Cam \u0026 Sakura City Hospital","correspondingAuthor":false,"prefix":"","firstName":"Meryem","middleName":"Sahin","lastName":"Ozdemir","suffix":""}],"badges":[],"createdAt":"2025-09-05 
05:53:22","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7541042/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7541042/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":90805137,"identity":"c50252ed-225c-4d4e-91e0-0a32ea6a904b","added_by":"auto","created_at":"2025-09-08 10:42:39","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":109622,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ea \u003c/strong\u003eComparisons of ChatGPT's completely correct answer rates between main groups. \u003cstrong\u003eb\u003c/strong\u003e Comparison of the mean scores of the ChatGPT answers between the main groups. \u003cstrong\u003ec\u003c/strong\u003e Comparisons of Gemini's completely correct answer rates between main groups. \u003cstrong\u003ed\u003c/strong\u003e Comparison of the mean scores of the Gemini answers between the main groups.\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7541042/v1/3b0e08b5a2055fce978e7bf8.jpeg"},{"id":90806090,"identity":"ac73fc2c-09be-4962-a9d0-abeb8d369e77","added_by":"auto","created_at":"2025-09-08 10:58:42","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1222905,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7541042/v1/7e6d7803-9606-48d4-9d2c-f3ef0438f5ec.pdf"},{"id":90803454,"identity":"3efe67bb-2560-4e49-9594-705f90929584","added_by":"auto","created_at":"2025-09-08 10:34:39","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":47170,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSupplementary Table 1. 
\u003c/strong\u003eScores for answers generated by ChatGPT according to the reviewers\u003c/p\u003e","description":"","filename":"SupplementaryTable1.docx","url":"https://assets-eu.researchsquare.com/files/rs-7541042/v1/2135643f3e85c818a0b5be51.docx"},{"id":90802936,"identity":"5a628048-d32e-4590-bd57-dbbf17d1f17d","added_by":"auto","created_at":"2025-09-08 10:26:40","extension":"docx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":52125,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSupplementary Table 2.\u003c/strong\u003e Scores for answers generated by Gemini according to the reviewers\u003c/p\u003e","description":"","filename":"SupplementaryTable2.docx","url":"https://assets-eu.researchsquare.com/files/rs-7541042/v1/781ab308fe1e38a93b702faf.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Comparison of the accuracy and reliability of ChatGPT-4o and Gemini in answering HIV-related questions","fulltext":[{"header":"Background","content":"\u003cp\u003eAs artificial intelligence (AI) applications become more widespread, their roles in social life are also expanding. One of these areas is health services. Nevertheless, AI has the potential to bring about significant developments in the health field [\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e]. The objective of AI applications is to provide reliable information by utilizing extensive data sets and continuously evolving algorithms [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e]. The ability to rapidly access accurate information about diseases and preventive measures without needing consultation with healthcare professionals is a significant benefit. The increasing reliance on AI is prompting people to pose a multitude of queries to these systems and modify their behaviors in accordance with the responses they receive. 
ChatGPT (Chat Generative Pre-trained Transformer) and Gemini (Google, Mountain View, California, USA) are among the most frequently utilized text-based AI applications [\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e]. Both AI applications collect data from many sources for health-related questions and transfer this information to their users according to their algorithms. However, the quality and adequacy of the answers to the questions is an ongoing debate.\u003c/p\u003e\u003cp\u003eHuman Immunodeficiency Virus (HIV) is an ongoing pandemic, identified approximately 43 years ago, has killed more than 40\u0026nbsp;million people, and currently affects 40\u0026nbsp;million people worldwide. While life expectancy for people living with HIV has significantly improved due to highly effective antiretroviral therapy, there has been little progress toward eradicating the virus [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. This suggests that HIV will continue to be a significant public health problem. People living with HIV may have many questions about different topics regarding the disease during this long association. In addition, people seeking to protect themselves against HIV try to get information about many topics, particularly transmission methods and protective measures. Those unable to access healthcare professionals for such questions may instead turn to artificial intelligence. It is, therefore, crucial to verify the reliability of the responses provided by AI. However, to the best of our knowledge, no study has been conducted on the accuracy and reliability of AI chatbots' knowledge about HIV. 
In this study, we aimed to assess the reliability of responses from two commonly used AI applications regarding questions about HIV.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003eIn this study, ChatGPT 4o and Google Gemini 1.5 Flash were used to answer the questions. A total of 156 HIV-related questions were asked of both AI between 15 November, 2024 and 30 November, 2024. The questions were categorized into three main groups. In the first group, the questions were prepared from the Centers for Disease Control and Prevention (CDC) \u0026ldquo;questions and answers for the public\u0026rdquo;. In the second group, the questions were prepared from the European AIDS Clinical Society (EACS) guideline version 12.1 [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e] and the United States Department of Health and Human Services (DHHS) panel on antiretroviral guidelines for adults and adolescents [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. In the third group, the questions were prepared from frequently asked questions on social media platforms (Google, X [formerly named Twitter], Facebook, YouTube). The terms \u0026ldquo;human immunodeficiency virus\u0026rdquo;, \u0026ldquo;HIV\u0026rdquo;, \u0026ldquo;acquired immunodeficiency syndrome\u0026rdquo;, and \u0026ldquo;AIDS\u0026rdquo; were searched online. In this study, questions about personal information, repetitive questions, questions with unclear answers, unrealistic questions, and questions with grammatical errors were excluded. Of the questions, 42.3% (n:69) were prepared from the CDC, 29.5% (n:48) were prepared from guidelines, and 28.2% (n:46) were prepared from social media. 
The questions were categorized into four subgroups: \u0026ldquo;General information\u0026rdquo;, \u0026ldquo;Transmission\u0026rdquo;, \u0026ldquo;Diagnosis\u0026rdquo; and \u0026ldquo;Prevention and Treatment\u0026rdquo;.\u003c/p\u003e\u003cp\u003eTwo specialists in infectious diseases and clinical microbiology (M.S.T. and M.S.O.) assessed the answers separately. A third infectious diseases and clinical microbiology specialist (Y.E.O.) reviewed responses that did not agree between the two specialists. The scores for the responses that disagreed with the three specialists were evaluated jointly. The final score was determined by complete agreement. The reviewers used a scoring system from 1 to 4 points to assess the quality and reliability of the answers given by ChatGPT and Gemini. The scoring according to the adequacy and quality of the answers was as follows:\u003c/p\u003e\u003cp\u003e1 point: An answer that was completely incorrect or irrelevant.\u003c/p\u003e\u003cp\u003e2 points: An answer that was partially correct but includes some misleading information.\u003c/p\u003e\u003cp\u003e3 points: An answer that was generally correct but did not have sufficient detail.\u003c/p\u003e\u003cp\u003e4 points: An answer that was completely correct and sufficient.\u003c/p\u003e\u003cp\u003eThe scores given by the reviewers based on ChatGPT and Gemini's answers are listed in Supplementary Table\u0026nbsp;1 and Supplementary Table\u0026nbsp;2, respectively.\u003c/p\u003e\u003cp\u003eThe consistency of the answers given by AI to the same question at different times and on different computers was also evaluated. Repeatability was evaluated as positive if the answers given when the same question was asked on different computers received the same score. If the answer to the same question did not have the same score, repeatability was evaluated as negative. In this case, only the first response from the chatbots was scored. 
Ethics committee approval was not required because this study did not include patient data.\u003c/p\u003e\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003eStatistical analysis\u003c/h2\u003e\u003cp\u003eStatistical Package for Social Sciences (SPSS) version 25.0 was used for statistical analysis. Categorical variables were presented as numbers (n) and percentages (%), continuous variables were presented as mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation (sd). Chi-square test was used to compare categorical variables. If continuous parameters were normally distributed, they were compared with \"t-test\", if not normally distributed, they were compared with \"Mann-Whitney U\" test. Spearman\u0026rsquo;s correlation analysis was applied to investigate the relationship between the responses generated by ChatGPT and Gemini. To assess inter-rater reliability, Cohen's kappa coefficient was calculated. A result with a p value\u0026thinsp;\u0026lt;\u0026thinsp;0.05 was considered statistically significant.\u003c/p\u003e\u003c/div\u003e"},{"header":"Results","content":"\u003cp\u003eThe 156 questions included in the study were classified into four categories according to their topics: \u0026ldquo;General information\u0026rdquo; (n\u0026thinsp;=\u0026thinsp;43, 27.6%), \u0026ldquo;Transmission\u0026rdquo; (n\u0026thinsp;=\u0026thinsp;31, 19.9%), \u0026ldquo;Diagnosis\u0026rdquo; (n\u0026thinsp;=\u0026thinsp;18, 11.5%), and \u0026ldquo;Prevention and Treatment\u0026rdquo; (n\u0026thinsp;=\u0026thinsp;64, 41.0%).\u003c/p\u003e\u003cp\u003eThe mean score of the answers generated by ChatGPT for all questions was 3.69\u0026thinsp;\u0026plusmn;\u0026thinsp;0.72. The mean score of the answers to the guideline questions was significantly lower in comparison to both the CDC questions (3.08\u0026thinsp;\u0026plusmn;\u0026thinsp;1.05 vs. 
3.97\u0026thinsp;\u0026plusmn;\u0026thinsp;0.17, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) and the social media questions (3.08\u0026thinsp;\u0026plusmn;\u0026thinsp;1.05 vs. 3.95\u0026thinsp;\u0026plusmn;\u0026thinsp;0.22, p\u0026thinsp;=\u0026thinsp;0.001) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The mean score of the answers to the \u0026ldquo;Prevention and Treatment\u0026rdquo; questions was significantly lower than those for the \u0026ldquo;General information\u0026rdquo; (3.38\u0026thinsp;\u0026plusmn;\u0026thinsp;1.00 vs. 3.88\u0026thinsp;\u0026plusmn;\u0026thinsp;0.32, p\u0026thinsp;=\u0026thinsp;0.006), \u0026ldquo;Transmission\u0026rdquo; (3.38\u0026thinsp;\u0026plusmn;\u0026thinsp;1.00 vs. 3.97\u0026thinsp;\u0026plusmn;\u0026thinsp;0.18, p\u0026thinsp;=\u0026thinsp;0.001) and \u0026ldquo;Diagnosis\u0026rdquo; (3.38\u0026thinsp;\u0026plusmn;\u0026thinsp;1.00 vs. 3.89\u0026thinsp;\u0026plusmn;\u0026thinsp;0.32, p\u0026thinsp;=\u0026thinsp;0.049) questions. No significant differences were found between other topics except for \u0026ldquo;Prevention and Treatment\u0026rdquo; questions (p\u0026thinsp;\u0026gt;\u0026thinsp;0.05).\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eChatGPT answered 81.4% of all questions completely correctly, 9.6% correctly but inadequately, 5.8% misleadingly, and 3.2% completely incorrectly. The distribution of the scores given to the answers according to the main groups and the topics is shown in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. The rate of completely correct answers was lower for the guideline questions than for the CDC questions (47.9% vs. 97.1%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001, OR\u0026thinsp;=\u0026thinsp;37.0, 95% CI:8.00-166.7) and the social media questions (47.9% vs. 
94.9%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001, OR\u0026thinsp;=\u0026thinsp;20.1, 95% CI:4.35-93.0) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The rate of completely correct answers to the questions about \u0026ldquo;Prevention and Treatment\u0026rdquo; was significantly lower than to the questions about \u0026ldquo;General information\u0026rdquo; (67.2% vs. 88.4%, p\u0026thinsp;=\u0026thinsp;0.012, OR\u0026thinsp;=\u0026thinsp;3.72, 95% CI:1.28\u0026ndash;10.8) and \u0026ldquo;Transmission\u0026rdquo; (67.2% vs. 96.8%, p\u0026thinsp;=\u0026thinsp;0.001, OR\u0026thinsp;=\u0026thinsp;3.90, 95% CI:0.82\u0026ndash;18.5). No significant difference was found between the other topics in terms of completely correct answer rates (p\u0026thinsp;\u0026gt;\u0026thinsp;0.05).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eDistribution of the scores of the answers of both AIs according to the main groups and the topics\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"8\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" 
colnum=\"8\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eMean Score \u003c/p\u003e\u003cp\u003e\u0026plusmn; SD\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003e1-point\u003c/p\u003e\u003cp\u003en (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2-points\u003c/p\u003e\u003cp\u003en (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003e3-points\u003c/p\u003e\u003cp\u003en (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\"\u003e\u003cp\u003e4-points\u003c/p\u003e\u003cp\u003en (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c8\"\u003e\u003cp\u003eTotal \u003c/p\u003e\u003cp\u003en (%)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eTopics\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"6\" nameend=\"c8\" namest=\"c3\"\u003e\u003cp\u003eCDC Questions\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\" morerows=\"15\" rowspan=\"16\"\u003e\u003cp\u003e\u003cb\u003eChatGPT\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c6\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e20 (100)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e20 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTransmission\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e25 (100)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e25 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eDiagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.85\u0026thinsp;\u0026plusmn;\u0026thinsp;0.38\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2 (15.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e11 (84.6)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e13 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c3\"\u003e\u003cp\u003e4.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e11 (100)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e11 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTotal\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.97\u0026thinsp;\u0026plusmn;\u0026thinsp;0.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2 (2.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e67 (97.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e69 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colspan=\"6\" nameend=\"c8\" namest=\"c3\"\u003e\u003cp\u003e\u003cb\u003eGuidelines Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.60\u0026thinsp;\u0026plusmn;\u0026thinsp;0.52\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e4 (40)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e6 60)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e10 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2.95\u0026thinsp;\u0026plusmn;\u0026thinsp;1.11\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e5 (13.2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e9 (23.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e7 (18.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e17 (44.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e38 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTotal\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.08\u0026thinsp;\u0026plusmn;\u0026thinsp;1.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e5 (10.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e9 (18.8)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e11 (22.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e23 (47.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e48 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" 
colspan=\"6\" nameend=\"c8\" namest=\"c3\"\u003e\u003cp\u003e\u003cb\u003eSocial Media Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.92\u0026thinsp;\u0026plusmn;\u0026thinsp;0.28\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1 (7.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e12 (92.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e13 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTransmission\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.83\u0026thinsp;\u0026plusmn;\u0026thinsp;0.41\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1 (16.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e5 (83.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e6 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eDiagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 
(0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e5 (100)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e5 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e15 (100)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e15 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTotal\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.95\u0026thinsp;\u0026plusmn;\u0026thinsp;0.22\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2 (5.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e37 (94.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e39 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eAll Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cb\u003e3.69\u0026thinsp;\u0026plusmn;\u0026thinsp;0.72\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e\u003cb\u003e5 (3.2)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e9 (5.8)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e\u003cb\u003e15 (9.6)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e\u003cb\u003e127 (81.4)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u003cb\u003e156 (100)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\" morerows=\"16\" rowspan=\"17\"\u003e\u003cp\u003e\u003cb\u003eGemini\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colspan=\"6\" nameend=\"c8\" namest=\"c3\"\u003e\u003cp\u003e\u003cb\u003eCDC Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.95\u0026thinsp;\u0026plusmn;\u0026thinsp;0.22\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1 (5.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e19 (95.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e20 
(100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTransmission\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.84\u0026thinsp;\u0026plusmn;\u0026thinsp;0.47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1 (4.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2 (8.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e22 (88.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e25 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eDiagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.69\u0026thinsp;\u0026plusmn;\u0026thinsp;0.63\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1 (7.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2 (15.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e10 (76.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e13 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.91\u0026thinsp;\u0026plusmn;\u0026thinsp;0.30\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 
(0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1 (9.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e10 (90.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e11 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTotal\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.86\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e2 (2.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e6 (8.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e61 (88.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e69 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colspan=\"6\" nameend=\"c8\" namest=\"c3\"\u003e\u003cp\u003e\u003cb\u003eGuidelines Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.60\u0026thinsp;\u0026plusmn;\u0026thinsp;0.52\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e4 (40)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e6 (60)\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c8\"\u003e\u003cp\u003e10 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2.68\u0026thinsp;\u0026plusmn;\u0026thinsp;1.04\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e5 (13.2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e13 (34.2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e9 (23.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e11 (28.9)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e38 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTotal\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e2.88\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e5 (10.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e13 (27.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e13 (27.1)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e17 (35.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e48 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colspan=\"6\" nameend=\"c8\" namest=\"c3\"\u003e\u003cp\u003e\u003cb\u003eSocial Media Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral 
Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.85\u0026thinsp;\u0026plusmn;\u0026thinsp;0.38\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e2 (15.4)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e11 (84.6)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e13 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTransmission\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e4.00\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e6 (100)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e6 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eDiagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.40\u0026thinsp;\u0026plusmn;\u0026thinsp;0.89\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1 (20.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1 (20.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c7\"\u003e\u003cp\u003e3 (60.0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e5 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.93\u0026thinsp;\u0026plusmn;\u0026thinsp;0.26\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e1 (6.7)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e14 (93.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e15 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTotal\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e3.85\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0 (0)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e1 (2.6)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e4 (10.3)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e34 (87.2)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e39 (100)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eAll Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e\u003cb\u003e3.55\u0026thinsp;\u0026plusmn;\u0026thinsp;0.81\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c4\"\u003e\u003cp\u003e\u003cb\u003e5 (3.2)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e\u003cb\u003e16 (10.3)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e\u003cb\u003e23 (14.7)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c7\"\u003e\u003cp\u003e\u003cb\u003e112 (71.8)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c8\"\u003e\u003cp\u003e\u003cb\u003e156 (100)\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"8\"\u003eCDC: Centers for Disease Control and Prevention\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThe mean score of the answers generated by Gemini for all questions was 3.55\u0026thinsp;\u0026plusmn;\u0026thinsp;0.81. The mean score for the guideline questions was found to be significantly lower than that for the CDC questions (2.88\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02 vs. 3.86\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) and the social media questions (2.88\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02 vs. 3.85\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The mean score of the answers to the \u0026ldquo;Prevention and Treatment\u0026rdquo; questions was significantly lower than those for the \u0026ldquo;General information\u0026rdquo; (3.19\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02 vs. 3.84\u0026thinsp;\u0026plusmn;\u0026thinsp;0.37, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) and \u0026ldquo;Transmission\u0026rdquo; (3.19\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02 vs. 
3.87\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) questions. There was no significant difference in mean scores between the other topics (p\u0026thinsp;\u0026gt;\u0026thinsp;0.05).\u003c/p\u003e\u003cp\u003eGemini answered 71.8% of all questions completely correctly, 14.7% correctly but inadequately, 10.3% misleadingly, and 3.2% completely incorrectly (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The rate of completely correct answers was lower for the guideline questions than for the CDC questions (35.4% vs. 88.4%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001, OR\u0026thinsp;=\u0026thinsp;13.9, 95% CI:5.41\u0026ndash;35.7) and the social media questions (35.4% vs. 87.2%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001, OR\u0026thinsp;=\u0026thinsp;12.4, 95% CI:4.09\u0026ndash;37.6) (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The rate of completely correct answers to the questions on \u0026ldquo;Prevention and Treatment\u0026rdquo; was significantly lower than to the questions on \u0026ldquo;General information\u0026rdquo; (54.7% vs. 83.7%, p\u0026thinsp;=\u0026thinsp;0.002, OR\u0026thinsp;=\u0026thinsp;4.26, 95% CI:1.65\u0026ndash;11.0) and \u0026ldquo;Transmission\u0026rdquo; (54.7% vs. 90.3%, p\u0026thinsp;=\u0026thinsp;0.001, OR\u0026thinsp;=\u0026thinsp;7.75, 95% CI:0.69\u0026ndash;6.76). There was no significant difference between the other topics in terms of the rate of completely correct answers (p\u0026thinsp;\u0026gt;\u0026thinsp;0.05).\u003c/p\u003e\u003cp\u003eAcross all HIV-related questions, the mean score of ChatGPT's answers was higher than that of Gemini's, although the difference was not statistically significant (3.69\u0026thinsp;\u0026plusmn;\u0026thinsp;0.72 vs. 3.55\u0026thinsp;\u0026plusmn;\u0026thinsp;0.81, p\u0026thinsp;=\u0026thinsp;0.051). 
ChatGPT scored higher than Gemini on the CDC questions (3.97\u0026thinsp;\u0026plusmn;\u0026thinsp;0.17 vs. 3.86\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43, p\u0026thinsp;=\u0026thinsp;0.048). For the other main groups and the topics, there was no statistically significant difference between ChatGPT and Gemini (Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eComparison of the mean scores of the answers of both AIs according to the main groups and the topics\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\"\u0026plusmn;\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\" morerows=\"1\" rowspan=\"2\"\u003e\u003cp\u003en\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"2\" nameend=\"c5\" namest=\"c4\"\u003e\u003cp\u003eMean Score\u0026thinsp;\u0026plusmn;\u0026thinsp;SD\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\" morerows=\"1\" 
rowspan=\"2\"\u003e\u003cp\u003ep\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eChatGPT\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eGemini\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eAll Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e156\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.69\u0026thinsp;\u0026plusmn;\u0026thinsp;0.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.55\u0026thinsp;\u0026plusmn;\u0026thinsp;0.81\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.051\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\" morerows=\"2\" rowspan=\"3\"\u003e\u003cp\u003e\u003cb\u003eMain Groups\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eCDC Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e69\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.97\u0026thinsp;\u0026plusmn;\u0026thinsp;0.17\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.86\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e\u003cp\u003e\u003cb\u003e0.048\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGuidelines Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.08\u0026thinsp;\u0026plusmn;\u0026thinsp;1.05\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e2.88\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.272\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eSocial Media Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e39\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.95\u0026thinsp;\u0026plusmn;\u0026thinsp;0.22\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.85\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.230\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e\u003cp\u003e\u003cb\u003eTopics\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" 
colname=\"c4\"\u003e\u003cp\u003e3.88\u0026thinsp;\u0026plusmn;\u0026thinsp;0.32\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.84\u0026thinsp;\u0026plusmn;\u0026thinsp;0.37\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.536\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTransmission\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e31\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.97\u0026thinsp;\u0026plusmn;\u0026thinsp;0.18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.87\u0026thinsp;\u0026plusmn;\u0026thinsp;0.43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.298\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eDiagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.89\u0026thinsp;\u0026plusmn;\u0026thinsp;0.32\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.61\u0026thinsp;\u0026plusmn;\u0026thinsp;0.70\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.183\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e64\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" 
char=\"\u0026plusmn;\" colname=\"c4\"\u003e\u003cp\u003e3.38\u0026thinsp;\u0026plusmn;\u0026thinsp;1.00\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\"\u0026plusmn;\" colname=\"c5\"\u003e\u003cp\u003e3.19\u0026thinsp;\u0026plusmn;\u0026thinsp;1.02\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e0.206\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"6\"\u003eCDC: Centers for Disease Control and Prevention\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eChatGPT gave completely correct answers to a higher proportion of all questions than Gemini (81.4% vs. 71.8%, p\u0026thinsp;=\u0026thinsp;0.045, OR\u0026thinsp;=\u0026thinsp;1.72, 95% CI:1.01\u0026ndash;2.93). ChatGPT also had a higher rate of completely correct answers than Gemini for the CDC questions (97.1% vs. 88.4%, p\u0026thinsp;=\u0026thinsp;0.049, OR\u0026thinsp;=\u0026thinsp;4.39, 95% CI:0.90\u0026ndash;21.3). 
There was no significant difference between the two AIs in terms of completely correct answers according to the other main groups and the topics (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eComparison of completely correct answer rates of both AIs according to the main groups and the topics\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"8\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e\u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\" morerows=\"1\" rowspan=\"2\"\u003e\u003cp\u003en\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colspan=\"2\" nameend=\"c5\" namest=\"c4\"\u003e\u003cp\u003eCompletely Correct \u003c/p\u003e\u003cp\u003eAnswer Rates (%)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\" 
morerows=\"1\" rowspan=\"2\"\u003e\u003cp\u003eOR\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c7\" morerows=\"1\" rowspan=\"2\"\u003e\u003cp\u003e95% CI\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c8\" morerows=\"1\" rowspan=\"2\"\u003e\u003cp\u003ep\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eChatGPT\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eGemini\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eAll Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e156\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e81.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e71.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e1.01\u0026ndash;2.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e\u003cb\u003e0.045\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\" morerows=\"2\" rowspan=\"3\"\u003e\u003cp\u003e\u003cb\u003eMain Groups\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eCDC Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e69\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e97.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e88.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e4.39\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.90\u0026ndash;21.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e\u003cb\u003e0.049\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGuidelines Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e47.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e35.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1.68\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.74\u0026ndash;3.80\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e0.214\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eSocial Media Questions\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e39\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e94.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e87.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e2.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" 
colname=\"c7\"\u003e\u003cp\u003e0.49\u0026ndash;14.93\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e0.235\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e\u003cp\u003e\u003cb\u003eTopics\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eGeneral Information\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e43\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e88.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e83.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1.48\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.43\u0026ndash;5.08\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e0.534\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eTransmission\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e31\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e96.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e90.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e3.22\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.32\u0026ndash;32.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e0.301\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003eDiagnosis\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e18\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e88.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e72.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e6.15\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.051\u0026ndash;18.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e0.206\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e\u003cb\u003ePrevention and Treatment\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e\u003cp\u003e64\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e\u003cp\u003e67.2\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e\u003cp\u003e54.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e\u003cp\u003e1.70\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e\u003cp\u003e0.83\u0026ndash;3.47\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"char\" char=\".\" colname=\"c8\"\u003e\u003cp\u003e0.147\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"8\"\u003eCDC: Centers for Disease Control and Prevention\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003eThere was a high positive correlation between ChatGPT and Gemini answers (r\u0026thinsp;=\u0026thinsp;0.680, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001). 
According to Cohen's kappa test, there was substantial inter-rater agreement for ChatGPT (κ\u0026thinsp;=\u0026thinsp;0.674) and Gemini (κ\u0026thinsp;=\u0026thinsp;0.768). The reproducibility rate for ChatGPT was 94.8%, while that for Gemini was 90.3% (p\u0026thinsp;=\u0026thinsp;0.135).\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eAlthough AI chatbots provide easy access to information on many topics, a significant disadvantage is that they can also provide false information. Providing false information about HIV can negatively affect public health and lead to inappropriate approaches among healthcare professionals. To our knowledge, no study has evaluated the reliability of AI chatbots regarding the HIV pandemic, which significantly affects public health. In this study, we evaluated the knowledge levels of the ChatGPT and Gemini chatbots about HIV. ChatGPT provided a higher rate of completely correct answers to HIV-related questions than Gemini (81.4% vs. 71.8%). Both AI applications responded to the CDC and social media questions with high accuracy rates. However, the lowest accuracy rates were found in the guideline questions (ChatGPT 47.9%, Gemini 35.4%) and questions on \u0026ldquo;Prevention and Treatment\u0026rdquo; (ChatGPT 67.2%, Gemini 54.7%).\u003c/p\u003e\u003cp\u003eThe information provided by AI applications in healthcare still needs improvement, but many promising results exist. In a meta-analysis evaluating 45 studies on the performance of different versions of ChatGPT in medical licensing exams, ChatGPT-4 achieved an overall accuracy rate of 81%, passed medical exams in 26 out of 29 cases, and exceeded the average scores of medical students in 13 out of 17 cases [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e]. 
In a study that used 82 clinical cases to evaluate ChatGPT for differential diagnosis, ChatGPT-4 demonstrated a high level of agreement (95.9%) with physicians in assessing whether a definitive diagnosis should be included in differential diagnosis lists [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. A systematic review of 44 studies on the utilization of ChatGPT in radiology demonstrated that ChatGPT exhibits a high degree of accuracy in this field [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. Another study demonstrated that AI applications can accurately predict pathological diagnoses of neurodegenerative diseases based on clinical information [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e]. In addition, AI chatbots were shown to provide generally accurate information for frequently asked questions about cataracts [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e], urological malignancies [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e], kidney stones [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] and amyloidosis [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eNevertheless, research has also demonstrated the limitations of AI applications. In a study in which 316 questions prepared for otolaryngology residents were presented to three different AI models, senior residents answered more accurately than all AI applications. ChatGPT-4 and Gemini performed similarly to junior residents, while Bing lagged behind junior residents [\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e]. Similarly, ChatGPT and Gemini have shown limited performance in examinations in nephrology [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e]. 
In a study evaluating ChatGPT's approach to managing bloodstream infections in 44 cases, 16% of the approaches were considered harmful [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. In a study evaluating the responses of four distinct AI chatbots to common emergency care questions, all responses contained small to moderate levels of potentially dangerous information [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. In a study in which ChatGPT was used to simplify radiology reports, participating radiologists found errors in approximately one-third of the simplified reports that could lead patients to incorrect conclusions and potentially cause physical and/or psychological harm [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. In our study, ChatGPT and Gemini provided completely wrong or misleading answers to 9.0% and 13.5% of the questions, respectively. Answers containing incorrect information are one of the important limitations of AI applications, and improvements are needed in this regard.\u003c/p\u003e\u003cp\u003eWhile questions about diseases frequently asked by the public and on social media platforms are generally answered with high accuracy by AI chatbots, their performance decreases on more complex issues such as guideline questions and questions about treatment. In a study where ChatGPT was asked 200 questions from social media platforms and guideline questions on many topics related to infectious diseases, ChatGPT answered 92% of social media platform questions and 69% of guideline questions completely correctly (p\u0026thinsp;=\u0026thinsp;0.001) [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. 
Similarly, ChatGPT answered social media platform questions about cervical cancer with significantly higher accuracy than guideline questions (p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e]. In another study, ChatGPT provided adequate answers to more than 80% of frequently asked questions about osteoporosis, while it provided adequate answers to 61.3% of questions prepared from the Turkish National Osteoporosis Guide [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. In a study in which common dermatological cases were presented, ChatGPT showed lower performance in diagnosing complex cases such as cutaneous neoplasms [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. In our study, both AI chatbots responded with significantly lower accuracy to guideline questions and questions on \u0026ldquo;Prevention and Treatment\u0026rdquo;.\u003c/p\u003e\u003cp\u003eThe performances of different AI applications in healthcare have been compared in several studies. In many studies, different versions of ChatGPT (3.5, 4o, and 4.0) demonstrated better performance than Gemini [\u003cspan additionalcitationids=\"CR26 CR27\" citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. In some studies, ChatGPT-4 performed better than Gemini and ChatGPT-3.5, while ChatGPT-3.5 and Gemini showed similar performance [\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e, \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. Similarly, in another study comparing ChatGPT-3.5 and Gemini, both AI chatbots showed similar performance [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. 
In our study, the rate of completely correct responses was significantly higher for ChatGPT-4o than for Gemini. However, there was no significant difference between the mean scores of the responses produced by the two chatbots.\u003c/p\u003e\u003cp\u003eThe reproducibility of the information provided by AI chatbots is as important as reliability and accuracy, because it is very difficult to generalize study findings on platforms where reproducibility is low. In the literature, reproducibility rates for ChatGPT have been reported to be between 70% and 100% [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e, \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e, \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e], and for Gemini between 50% and 92% [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e]. In our study, the reproducibility rates of the answers were similarly high for ChatGPT and Gemini, at 94.8% and 90.3%, respectively.\u003c/p\u003e\u003cp\u003eTo our knowledge, this is the first study to evaluate the accuracy and reliability of AI chatbots in response to HIV-related questions. However, our study had several limitations. Firstly, the questions used in this study represent only a fraction of the questions that could be posed in this field. Secondly, the experts' evaluations of the responses may have been subjective, which could have affected the scoring. 
Thirdly, this study did not assess the comprehensibility of the responses provided by AI chatbots.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eAI chatbots have great potential to give patients and healthcare professionals quick access to health information. In particular, they appear to be a reliable source from which the public can obtain easy and accurate information about diseases. However, caution should be exercised regarding the reliability of information provided by AI applications on complex issues of interest to healthcare professionals, such as the treatment and management of diseases. In addition, it would be useful to integrate specific guidelines for each health field into the algorithms of AI applications to increase accuracy and reliability.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003eAI\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eartificial intelligence\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003eCDC\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eCenters for Disease Control and Prevention\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003eChatGPT\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eChat Generative Pre-trained Transformer\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003eDHHS\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eUnited States Department of Health and Human Services\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003eEACS\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eEuropean AIDS Clinical 
Society\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003eHIV\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eHuman Immunodeficiency Virus\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003eEthics approval and consent to participate:\u0026nbsp;Not applicable.\u003c/p\u003e\n\u003cp\u003eConsent for publication: Not applicable.\u003c/p\u003e\n\u003cp\u003eAvailability of data and materials: The datasets analysed during the current study are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003eCompeting interests: The authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003eFunding: This study was not funded by any person or organisation.\u003c/p\u003e\n\u003cp\u003eAuthors\u0026rsquo; contributions: MST and MSO contributed to the design of the concept. MST and MSO collected the data for the study. MST contributed to the analysis of the data. MST and MSO contributed to the interpretation of the data, revision of the manuscript, and review. All authors read and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003eAcknowledgements: Not applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eObermeyer Z, Emanuel EJ. Predicting the Future \u0026mdash; Big Data, Machine Learning, and Clinical Medicine. N Engl J Med. 2016. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1056/NEJMp1606181\u003c/span\u003e\u003cspan address=\"10.1056/NEJMp1606181\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRashid A, Bin, Kausik MAK. AI revolutionizing industries worldwide: A comprehensive overview of its diverse applications. Hybrid Adv. 2024. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/J.HYBADV.2024.100277\u003c/span\u003e\u003cspan address=\"10.1016/J.HYBADV.2024.100277\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIntroducing GPT-4o and more tools to ChatGPT free users | OpenAI [Internet]. 2025. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/\u003c/span\u003e\u003cspan address=\"https://openai.com/index/gpt-4o-and-more-tools-to-chatgpt-free/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Accessed 10 Jun 2025.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGemini [Internet]. 2025. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://gemini.google.com/app\u003c/span\u003e\u003cspan address=\"https://gemini.google.com/app\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Accessed 10 Jun 2025.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTrickey A, Sabin CA, Burkholder G, Crane H, d\u0026rsquo;Arminio Monforte A, Egger M, et al. Life expectancy after 2015 of adults with HIV on long-term antiretroviral therapy in Europe and North America: a collaborative analysis of cohort studies. Lancet HIV. 2023. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/S2352-3018(23)00028-0\u003c/span\u003e\u003cspan address=\"10.1016/S2352-3018(23)00028-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEACS Guidelines. 2024 \u0026mdash; EACS Guidelines [Internet]. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://eacs.sanfordguide.com/\u003c/span\u003e\u003cspan address=\"https://eacs.sanfordguide.com/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Accessed 12 Nov 2024.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHIV/AIDS Treatment Guidelines. | Clinicalinfo.HIV.gov [Internet]. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://clinicalinfo.hiv.gov/en/guidelines\u003c/span\u003e\u003cspan address=\"https://clinicalinfo.hiv.gov/en/guidelines\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Accessed 12 Nov 2024.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu M, Okuhara T, Chang XY, Shirabe R, Nishiie Y, Okada H, et al. Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis. J Med Internet Res. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2196/60807\u003c/span\u003e\u003cspan address=\"10.2196/60807\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMizuta K, Hirosawa T, Harada Y, Shimizu T. Can ChatGPT-4 evaluate whether a differential diagnosis list contains the correct diagnosis as accurately as a physician? Diagnosis. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1515/dx-2024-0027\u003c/span\u003e\u003cspan address=\"10.1515/dx-2024-0027\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKeshavarz P, Bagherieh S, Nabipoorashrafi SA, Chalian H, Rahsepar AA, Kim GHJ, et al. ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives. Diagn Interv Imaging. 2024. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.diii.2024.04.003\u003c/span\u003e\u003cspan address=\"10.1016/j.diii.2024.04.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKoga S, Martin NB, Dickson DW. Evaluating the performance of large language models: ChatGPT and Google Bard in generating differential diagnoses in clinicopathological conferences of neurodegenerative disorders. Brain Pathology. 2023; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/bpa.13207\u003c/span\u003e\u003cspan address=\"10.1111/bpa.13207\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYılmaz İE, Doğan L. Talking technology: exploring chatbots as a tool for cataract patient education. Clin Exp Optom. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/08164622.2023.2298812\u003c/span\u003e\u003cspan address=\"10.1080/08164622.2023.2298812\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMusheyev D, Pan A, Loeb S, Kabarriti AE. How Well Do Artificial Intelligence Chatbots Respond to the Top Search Queries About Urological Malignancies? Eur Urol. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.eururo.2023.07.004\u003c/span\u003e\u003cspan address=\"10.1016/j.eururo.2023.07.004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMusheyev D, Pan A, Kabarriti AE, Loeb S, Borin JF. Quality of Information About Kidney Stones from Artificial Intelligence Chatbots. J Endourol. 2024. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1089/end.2023.0484\u003c/span\u003e\u003cspan address=\"10.1089/end.2023.0484\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKing RC, Samaan JS, Yeo YH, Peng Y, Kunkel DC, Habib AA, et al. A Multidisciplinary Assessment of ChatGPT\u0026rsquo;s Knowledge of Amyloidosis: Observational Study. JMIR Cardio. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2196/53421\u003c/span\u003e\u003cspan address=\"10.2196/53421\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMete U. Evaluating the Performance of ChatGPT, Gemini, and Bing Compared with Resident Surgeons in the Otorhinolaryngology In-service Training Examination. Turk Arch Otorhinolaryngol. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.4274/tao.2024.3.5\u003c/span\u003e\u003cspan address=\"10.4274/tao.2024.3.5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNoda R, Izaki Y, Kitano F, Komatsu J, Ichikawa D, Shibagaki Y. Performance of ChatGPT and Bard in self-assessment questions for nephrology board renewal. Clin Exp Nephrol. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10157-023-02451-w\u003c/span\u003e\u003cspan address=\"10.1007/s10157-023-02451-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaillard A, Micheli G, Lefevre L, Guyonnet C, Poyart C, Canou\u0026iuml; E, et al. 
Can Chatbot Artificial Intelligence Replace Infectious Diseases Physicians in the Management of Bloodstream Infections? A Prospective Cohort Study. Clin Infect Dis. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/cid/ciad632\u003c/span\u003e\u003cspan address=\"10.1093/cid/ciad632\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYau JYS, Saadat S, Hsu E, Murphy LSL, Roh JS, Suchard J, et al. Accuracy of Prospective Assessments of 4 Large Language Model Chatbot Responses to Patient Questions About Emergency Care: Experimental Comparative Study. J Med Internet Res. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2196/60291\u003c/span\u003e\u003cspan address=\"10.2196/60291\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJeblick K, Schachtner B, Dexl J, Mittermeier A, St\u0026uuml;ber AT, Topalis J, et al. ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports. Eur Radiol. 2023. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00330-023-10213-1\u003c/span\u003e\u003cspan address=\"10.1007/s00330-023-10213-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTun\u0026ccedil;er G, G\u0026uuml;\u0026ccedil;l\u0026uuml; KG. How Reliable is ChatGPT as a Novel Consultant in Infectious Diseases and Clinical Microbiology? Infectious Diseases \u0026amp; Clinical Microbiology. 
2024; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.36519/idcm.2024.286\u003c/span\u003e\u003cspan address=\"10.36519/idcm.2024.286\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eYurtcu E, Ozvural S, Keyif B. Analyzing the performance of ChatGPT in answering inquiries about cervical cancer. Int J Gynecol Obstet. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/ijgo.15861\u003c/span\u003e\u003cspan address=\"10.1002/ijgo.15861\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCinar C. Analyzing the Performance of ChatGPT About Osteoporosis. Cureus. 2023. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7759/cureus.45890\u003c/span\u003e\u003cspan address=\"10.7759/cureus.45890\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGoktas P, Grzybowski A. Assessing the Impact of ChatGPT in Dermatology: A Comprehensive Rapid Review. J Clin Med. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/jcm13195909\u003c/span\u003e\u003cspan address=\"10.3390/jcm13195909\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCarl\u0026agrave; MM, Gambini G, Baldascino A, Giannuzzi F, Boselli F, Crincoli E, et al. Exploring AI-chatbots\u0026rsquo; capability to suggest surgical planning in ophthalmology: ChatGPT versus Google Gemini analysis of retinal detachment cases. Br J Ophthalmol. 2024. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1136/bjo-2023-325143\u003c/span\u003e\u003cspan address=\"10.1136/bjo-2023-325143\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eStrzalkowski P, Strzalkowska A, Chhablani J, Pfau K, Errera MH, Roth M, et al. Evaluation of the accuracy and readability of ChatGPT-4 and Google Gemini in providing information on retinal detachment: a multicenter expert comparative study. Int J Retina Vitreous. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s40942-024-00579-9\u003c/span\u003e\u003cspan address=\"10.1186/s40942-024-00579-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRossettini G, Rodeghiero L, Corradi F, Cook C, Pillastrini P, Turolla A, et al. Comparative accuracy of ChatGPT-4, Microsoft Copilot and Google Gemini in the Italian entrance test for healthcare sciences degrees: a cross-sectional study. BMC Med Educ. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s12909-024-05630-9\u003c/span\u003e\u003cspan address=\"10.1186/s12909-024-05630-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAbdul Sami M, Abdul Samad M, Parekh K, Suthar PP. Comparative Accuracy of ChatGPT 4.0 and Google Gemini in Answering Pediatric Radiology Text-Based Questions. Cureus. 2024. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7759/cureus.70897\u003c/span\u003e\u003cspan address=\"10.7759/cureus.70897\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eToyama Y, Harigai A, Abe M, Nagano M, Kawabata M, Seki Y, et al. Performance evaluation of ChatGPT, GPT-4, and Bard on the official board examination of the Japan Radiology Society. Jpn J Radiol. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11604-023-01491-2\u003c/span\u003e\u003cspan address=\"10.1007/s11604-023-01491-2\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKhan AA, Yunus R, Sohail M, Rehman TA, Saeed S, Bu Y, et al. Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models. J Cardiothorac Vasc Anesth. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1053/j.jvca.2024.01.032\u003c/span\u003e\u003cspan address=\"10.1053/j.jvca.2024.01.032\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDoğan L, \u0026Ouml;z\u0026ccedil;akmakcı GB, Yılmaz ĬE. The Performance of Chatbots and the AAPOS Website as a Tool for Amblyopia Education. J Pediatr Ophthalmol Strabismus. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3928/01913913-20240409-01\u003c/span\u003e\u003cspan address=\"10.3928/01913913-20240409-01\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTalyshinskii A, Julieb\u0026oslash;-Jones P, Zeeshan Hameed BM, Naik N, Adhikari K, Zhanbyrbekuly U, et al. 
ChatGPT as a Clinical Decision Maker for Urolithiasis: Compliance with the Current European Association of Urology Guidelines. Eur Urol Open Sci. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.euros.2024.08.015\u003c/span\u003e\u003cspan address=\"10.1016/j.euros.2024.08.015\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOzgor BY, Simavi MA. Accuracy and reproducibility of ChatGPT\u0026rsquo;s free version answers about endometriosis. Int J Gynecol Obstet. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1002/ijgo.15309\u003c/span\u003e\u003cspan address=\"10.1002/ijgo.15309\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eIannantuono GM, Bracken-Clarke D, Karzai F, Choo-Wosoba H, Gulley JL, Floudas CS. Comparison of Large Language Models in Answering Immuno-Oncology Questions: A Cross-Sectional Study. Oncologist. 2024. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/oncolo/oyae009\u003c/span\u003e\u003cspan address=\"10.1093/oncolo/oyae009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSahin Ozdemir M, Ozdemir YE. Comparison of the performances between ChatGPT and Gemini in answering questions on viral hepatitis. Sci Rep. 2025. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-024-83575-1\u003c/span\u003e\u003cspan address=\"10.1038/s41598-024-83575-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"AIDS, Artificial intelligence, ChatGPT, Gemini, HIV","lastPublishedDoi":"10.21203/rs.3.rs-7541042/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7541042/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003eThis is the first study to evaluate the accuracy and reliability of the ChatGPT and Gemini chatbots on HIV.\u003c/p\u003e\u003ch2\u003eMethods\u003c/h2\u003e\u003cp\u003eA total of 156 questions about HIV in 3 different categories (CDC, guideline and social media) were posed to both ChatGPT and Gemini. The chatbots' answers were scored on a scale of 1 to 4 (1\u0026thinsp;=\u0026thinsp;completely wrong, 4\u0026thinsp;=\u0026thinsp;completely correct) by two different infectious disease experts. The reproducibility of both chatbots was also analysed.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eThe mean score of the answers generated for all questions was 3.69\u0026thinsp;\u0026plusmn;\u0026thinsp;0.72 for ChatGPT and 3.55\u0026thinsp;\u0026plusmn;\u0026thinsp;0.81 for Gemini (p\u0026thinsp;=\u0026thinsp;0.051). The rate of completely correct answers was 81.4% for ChatGPT and 71.8% for Gemini (p\u0026thinsp;=\u0026thinsp;0.045). ChatGPT answered guideline questions with lower accuracy than CDC questions (47.9% vs. 97.1%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) and social media questions (47.9% vs. 94.9%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001). Similarly, Gemini answered guideline questions with lower accuracy than CDC questions (35.4% vs. 
88.4%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001) and social media questions (35.4% vs. 87.2%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001). By topic, the lowest accuracy rate for both chatbots was in \u0026lsquo;Prevention and Treatment\u0026rsquo; (67.2% for ChatGPT, 54.7% for Gemini). The reproducibility of the answers was 94.8% for ChatGPT and 90.3% for Gemini.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e\u003cp\u003eChatGPT and Gemini answered CDC and social media questions with high accuracy. However, both chatbots need improvement for guideline questions and questions on \u0026ldquo;Prevention and Treatment\u0026rdquo;. Therefore, these applications need to be improved for the use of healthcare professionals.\u003c/p\u003e","manuscriptTitle":"Comparison of the accuracy and reliability of ChatGPT-4o and Gemini in answering HIV-related questions","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-09-08 10:26:35","doi":"10.21203/rs.3.rs-7541042/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"4023e975-5cb9-49de-b9e0-09f7e44d2e9b","owner":[],"postedDate":"September 8th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-09-15T23:08:12+00:00","versionOfRecord":[],"versionCreatedAt":"2025-09-08 10:26:35","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7541042","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7541042","identity":"rs-7541042","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}