Artificial Intelligence and Dental Professionals’ Performance in the Identification of Dental Implant Systems: The Impact of Deep Learning Algorithms – A Diagnostic Accuracy Study

Research Article | Research Square

Oğuz Alp KÖSE, Musa Şamil AKYIL

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7355779/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1)

Abstract

Background: This study aimed to compare the diagnostic performance of two deep learning models (YOLOv8 and YOLOv10) in identifying dental implant systems (DISs) on panoramic radiographs and to assess the impact of AI assistance on the diagnostic accuracy of dental professionals.
Methods: Panoramic radiographs of patients who underwent implant treatment at Aydın Adnan Menderes University Faculty of Dentistry between January 2014 and April 2024 were retrospectively screened. A total of 380 radiographs containing 1,143 implant images representing five DISs (NucleOSS T4, NucleOSS T6, Dentium Superline, Nobel Replace Tapered, and NTA Regular) were included. Images were annotated using Roboflow Annotate and underwent preprocessing and data augmentation. The final dataset comprised 810 images split into 80% training, 10% validation, and 10% test. YOLOv8 and YOLOv10 object detection algorithms were fine-tuned for DIS identification. Performance was assessed using confusion matrices, precision, recall, F1 score, and mean Average Precision (mAP). Twelve dental professionals with at least one year of implantology experience were included. Participants completed multiple-choice forms with and without AI assistance. Precision, recall, F1 score, and true positive (TP) rates were calculated to compare performance. Statistical analyses included the Shapiro–Wilk and Wilcoxon signed-rank tests.

Results: Precision, recall, F1 score, and mAP for YOLOv8 were 0.94, 0.94, 0.94, and 0.96, respectively, while YOLOv10 achieved 0.94, 0.95, 0.95, and 0.97. Although overall accuracy was similar, YOLOv10 showed a narrower range of variation. Except for the recall of NucleOSS T4, AI assistance significantly improved all metrics across all DIS groups (p<0.05). The overall diagnostic performance of participants was lower than that of the AI models.

Conclusions: Deep learning-based AI models showed high accuracy in classifying dental implant systems on panoramic radiographs, with YOLOv10 outperforming YOLOv8 in consistency. AI assistance improved participants’ diagnostic performance across all implant categories but did not exceed the standalone AI model, underscoring the impact of human–AI interaction and automation bias.
Future work should focus on dataset diversity, multi-modal imaging, and clinician training to optimize AI integration in clinical practice.

Trial registration: not applicable.

Keywords: Artificial Intelligence, Deep Learning, Dental Implants, Dentists

BACKGROUND

Dental implants are among the most commonly utilized treatment modalities for replacing missing teeth and are regarded as a reliable option in contemporary restorative dentistry (1). Although implant-supported treatments generally exhibit high long-term success rates, clinicians frequently encounter failures involving both implants and superstructures. Mechanical complications, such as implant fracture, loss of retention, screw loosening or fracture, and prosthetic material failure, are widely reported (2). Globally, hundreds of manufacturers produce thousands of distinct dental implant systems (DISs), contributing to substantial diversity within the field of implant dentistry (3). These systems vary considerably in their internal configurations, abutment designs, superstructures, and screw components, with such differences largely determined by the specific manufacturer and model. Accurate identification and classification of the existing implant system in the oral cavity is therefore critical for ensuring appropriate clinical management in the event of such complications (4).

Artificial intelligence (AI) is a rapidly advancing field that mimics human behavior to perform specific tasks in medicine and dentistry (5). In particular, convolutional neural networks (CNNs), a fundamental architecture within deep learning, are specifically designed to address tasks such as image processing, video analysis, object detection, and image classification (6). CNNs have been widely adopted in dentistry from diagnosis to treatment planning, implementation, and prediction of outcomes.
Numerous studies have demonstrated that AI applications can improve diagnostic precision and therapeutic success (5, 7, 8).

Several meta-analyses and systematic reviews have explored the use of object detection algorithms for identifying dental implant systems in panoramic and periapical radiographs (9-11). While the outcomes of these studies are promising, further research is warranted to enhance the effectiveness of computer-aided systems, improve clinical accuracy, and optimize treatment outcomes. Since real-world clinical settings involve complex decision-making processes, evaluations that do not account for clinician involvement may offer a limited representation of practical use. Although AI systems are increasingly adopted in dental practice, the clinician retains primary responsibility for diagnostic and treatment decisions. Assessing the interaction between AI systems and dental professionals is therefore essential for evaluating their clinical utility.

To address this gap, the present study aims to compare the diagnostic performance of two deep learning models (YOLOv8 and YOLOv10) in identifying DISs on panoramic radiographs and to investigate the effect of AI assistance on the diagnostic accuracy of dental professionals. The null hypotheses of the study were as follows:

H₁: There is no difference between the AI models in identifying DISs.
H₂: There is no significant difference in dental professionals’ diagnostic performance in identifying DISs with and without AI assistance.

METHODS

Trial Design

This retrospective observational diagnostic accuracy study aimed to compare the performance of two deep learning models (YOLOv8 and YOLOv10) in identifying dental implant systems on panoramic radiographs and to evaluate the effect of AI assistance on the diagnostic accuracy of dental professionals. The study was approved by the Ethics Committee of the Faculty of Dentistry, Aydın Adnan Menderes University (protocol number: 2024/02).
Dataset Preparation

Panoramic radiographs obtained from patients who underwent dental implant treatment at the Faculty of Dentistry, Aydın Adnan Menderes University between January 2014 and April 2024 were retrospectively reviewed. Radiographs were excluded if the corresponding implant system had too few cases to be included as a separate group in the analysis, if they exhibited severe patient positioning errors or imaging artifacts, or if they showed any pathological condition in the implant region. The final dataset comprised 380 radiographs containing 1143 implant instances. Table 1 summarizes the DISs included in the study, along with the corresponding number of images and implant instances for each system.

Table 1. Numbers of panoramic images and instances of implant systems used in this study.

| Implant System | Image (n) | Instance (n) |
|---|---|---|
| NucleOSS T4 | 87 | 240 |
| NucleOSS T6 | 74 | 239 |
| Dentium | 73 | 242 |
| Nobel | 93 | 208 |
| NTA | 57 | 214 |
| Total | 380* | 1143 |

*The sum of the images in individual groups exceeds the total number of images because some radiographs contain multiple implant systems, resulting in their inclusion in more than one category.

Five dental implant systems from four manufacturers (NucleOSS T4 Standard and NucleOSS T6 Standard [NucleOSS Implants, İzmir, Türkiye], Dentium SuperLine [Dentium Co., Ltd., Seoul, South Korea], NobelReplace Tapered [Nobel Biocare AB, Göteborg, Sweden], and NTA Regular [Pilatus Swiss Dental GmbH, Egolzwil, Switzerland]) were annotated by a single researcher using Roboflow Annotate (Roboflow Inc., Des Moines, IA, USA). The annotated images were split into training (60%), validation (20%), and test (20%) sets via the Roboflow platform. All images were resized to 640×640 pixels without cropping. Data augmentation, including ±15° rotation and ±25% brightness adjustment, was applied to the training set. After augmentation, the final dataset included 810 images (80% training, 10% validation, and 10% test).
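The 60/20/20 split and the 80/10/10 composition of the final dataset can be reconciled if augmentation enlarged only the training set. The following is a minimal sketch of that arithmetic in Python. It assumes, and the text does not state this, that every training image ended up as three images after augmentation; the function name `infer_split` and the multiplier are hypothetical and serve only as an illustration.

```python
# Illustrative arithmetic only. The augmentation multiplier (3) is an
# assumption, not a value reported in the study.

def infer_split(original_total, final_total, train_multiplier):
    """Infer the pre-augmentation training count, assuming that only
    training images were multiplied by `train_multiplier`."""
    extra_images = final_total - original_total
    # Each training image contributes (multiplier - 1) additional images.
    train = extra_images // (train_multiplier - 1)
    held_out = original_total - train  # validation + test, left untouched
    return train, held_out


MULTIPLIER = 3  # hypothetical
train, held_out = infer_split(original_total=380, final_total=810,
                              train_multiplier=MULTIPLIER)
train_after = train * MULTIPLIER

share_before = train / 380        # training share before augmentation
share_after = train_after / 810   # training share after augmentation

print(train, held_out, train_after)
print(round(share_before, 2), round(share_after, 2))
```

Under that assumption, about 215 of the 380 radiographs (roughly 57%) would have been training images before augmentation and 645 of 810 (roughly 80%) afterwards, which is in line with the percentages reported above.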
Training of Artificial Intelligence Algorithms

Two object detection models, the large variants of YOLOv8 and YOLOv10 (both pretrained on the Microsoft COCO dataset), were fine-tuned for DIS identification. Training was conducted on the Google Colab Pro platform (Google LLC, Mountain View, CA, USA) using an NVIDIA T4 GPU. Each model was trained for 200 epochs with a batch size of 32.

Assessment of AI Model Performance

Model performance was evaluated using the test dataset. Detections with an Intersection over Union (IoU) greater than 0.5 were considered correct. Confusion matrices were constructed to report true positives (TP), false positives (FP), and false negatives (FN); true negatives (TN) were excluded, as they are not defined in object detection tasks. Class-wise precision, recall, and F1 scores were calculated. The mean Average Precision at an IoU threshold of 0.5 (mAP50) was used as the primary metric to assess overall detection performance.

Evaluation of the Performance of Dental Professionals

Twelve eligible dental professionals from the departments of Periodontology and Prosthodontics, all of whom had at least one year of clinical experience in implantology, were included in the study. No participants from the Department of Oral and Maxillofacial Surgery were included, as no eligible personnel were available during the study period. The distribution of specialties among participants was not balanced.

A total of 233 cropped implant images from the test set were used to generate two multiple-choice forms: one without AI assistance and one with AI assistance. Both forms were text-based and administered via Google Forms (Google LLC, Mountain View, CA, USA). In the AI-assisted form, YOLOv10 predictions were shown below each question. Question order was randomized for each participant; class options remained fixed. Participants initially completed the form without AI assistance.
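The detection criteria described under Assessment of AI Model Performance (an IoU threshold of 0.5, followed by precision, recall, and F1 computed from TP, FP, and FN counts) can be illustrated with a short Python sketch. The box coordinates and counts below are made up for illustration and are not taken from the study, and the helper names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so that non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Class-wise metrics from detection counts (true negatives are not used)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical predicted and ground-truth boxes for one implant.
predicted = (10, 10, 50, 50)
ground_truth = (12, 12, 52, 52)
overlap = iou(predicted, ground_truth)
is_correct = overlap > 0.5  # threshold used in the text

# Hypothetical counts for one class.
precision, recall, f1 = precision_recall_f1(tp=45, fp=5, fn=3)
print(round(overlap, 3), is_correct)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

A full evaluation would also need a matching step between predictions and ground-truth boxes and an average over confidence thresholds to obtain AP; those parts are omitted here.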
After a two-week interval, the same participants completed the AI-assisted form. Responses from the first session were not accessible during the second. The interval was intended to minimize recall bias and to evaluate the independent effect of AI assistance on decision-making.

Each participant’s responses were compared with the ground truth class labels to generate separate confusion matrices for the AI-assisted and unassisted conditions. Class-specific precision, recall, and F1-scores were calculated at both individual and group levels. Since each participant provided only one response per class, the values for precision, recall, and F1-score were identical, effectively representing the correct classification rate (i.e., true positive ratio). This metric was therefore used as the overall performance indicator.

Statistical Analysis

Statistical analyses were performed using IBM SPSS Statistics 25.0. The normality of metric distributions was tested using the Shapiro–Wilk test. Since normality could not be assumed, non-parametric tests were used. The Wilcoxon signed-rank test was used to compare performance between AI-assisted and unassisted conditions. A p-value below 0.05 was considered statistically significant.

RESULTS

Evaluation of the Performance of Artificial Intelligence Algorithms

Normalized confusion matrices for the YOLOv8 and YOLOv10 algorithms are shown in Figure 1. YOLOv8 achieved high TP rates in the NucleOSS T6, Dentium SuperLine, and Nobel Replace Tapered classes. However, its lowest performance was observed in the NTA Regular group (77%), which was frequently misclassified as NucleOSS T4 (10%). In contrast, YOLOv10 demonstrated a more balanced and consistent classification performance across all implant classes, with notably reduced inter-class confusion, particularly improving the recognition of the NTA group (94%).

Table 2. Performance metrics of YOLOv8 and YOLOv10 models for each implant class and overall.
| Class | Precision (YOLOv8) | Precision (YOLOv10) | Recall (YOLOv8) | Recall (YOLOv10) | F1-Score (YOLOv8) | F1-Score (YOLOv10) | AP (YOLOv8) | AP (YOLOv10) |
|---|---|---|---|---|---|---|---|---|
| T4 | 0.89 | 0.96 | 0.93 | 0.90 | 0.91 | 0.93 | 0.94 | 0.95 |
| T6 | 0.96 | 0.94 | 0.98 | 0.98 | 0.97 | 0.96 | 0.99 | 0.98 |
| DENTIUM | 0.94 | 0.92 | 0.98 | 0.96 | 0.96 | 0.94 | 0.99 | 0.97 |
| NOBEL | 0.98 | 0.99 | 1.00 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| NTA | 0.90 | 0.91 | 0.77 | 0.94 | 0.83 | 0.92 | 0.88 | 0.96 |
| Overall | 0.94 | 0.94 | 0.94 | 0.95 | 0.94 | 0.95 | 0.96 | 0.97 |

AP: Average Precision.

Both YOLOv8 and YOLOv10 models demonstrated high overall performance in terms of precision, recall, F1-score, and AP values (Table 2). However, YOLOv10 exhibited more consistent results across classes, as indicated by its narrower range of recall (0.90–0.99) and F1-score (0.92–0.99) values, compared to YOLOv8 (recall: 0.77–1.00; F1-score: 0.83–0.99). This suggests greater robustness in class-level predictions. In terms of average precision (AP), YOLOv10 also outperformed YOLOv8 slightly, with mAP50 scores of 0.97 and 0.96, respectively. YOLOv10 achieved its lowest AP in the T4 class (0.95) and the highest in the NobelReplace Tapered class (0.99), while YOLOv8 reached its highest AP (0.99) in the T6, Dentium, and Nobel classes and its lowest in the NTA class (0.88). These results suggest that although both models were effective, YOLOv10 offered a more balanced classification performance across implant systems, whereas YOLOv8 exhibited strong results in certain classes but lower reliability in others, particularly the NTA class.

Evaluation of the Performance of Dental Professionals

Confusion matrices are presented in Figure 2. In the unassisted condition, true positive (TP) rates ranged from 0.34 to 0.88, with the most frequent misclassifications occurring in the Dentium SuperLine and Nobel Replace Tapered classes. Dentium SuperLine was frequently misclassified as NTA Regular, whereas Nobel Replace Tapered was commonly confused with NucleOSS T4 and T6. In contrast, the AI-assisted condition achieved TP rates ranging from 0.84 to 0.95, with a marked reduction in misclassifications across all classes.

Table 3. Comparison of precision, recall, and F1-scores with and without Artificial Intelligence assistance.

| Class | Precision (Unassisted) | Precision (Assisted) | p | Recall (Unassisted) | Recall (Assisted) | p | F1-Score (Unassisted) | F1-Score (Assisted) | p |
|---|---|---|---|---|---|---|---|---|---|
| T4 | 0.74 | 0.92 | 0.002 | 0.88 | 0.94 | 0.058 | 0.80 | 0.93 | 0.002 |
| T6 | 0.62 | 0.90 | 0.002 | 0.79 | 0.95 | 0.003 | 0.69 | 0.93 | 0.002 |
| DENTIUM | 0.59 | 0.88 | 0.002 | 0.45 | 0.84 | 0.002 | 0.51 | 0.86 | 0.002 |
| NOBEL | 0.56 | 0.94 | 0.005 | 0.34 | 0.87 | 0.002 | 0.42 | 0.90 | 0.002 |
| NTA | 0.48 | 0.80 | 0.002 | 0.60 | 0.86 | 0.002 | 0.53 | 0.83 | 0.002 |
| Overall | 0.69 | 0.81 | 0.002 | 0.69 | 0.81 | 0.002 | 0.69 | 0.81 | 0.002 |

The data were analyzed using the Wilcoxon signed-rank test; p-values below 0.05 were considered statistically significant. All p-values except that for recall in the T4 class (p = 0.058) met this threshold. For the overall row, since each participant provided only one response per class, the values for precision, recall, and F1-score were identical.

Precision, recall, and F1-score values were consistently higher in the AI-assisted group, with narrower inter-class variability (Table 3). Statistically significant improvements were observed in all three metrics across all classes (p < 0.05), except for recall in the NucleOSS T4 group, where the increase did not reach statistical significance (p = 0.058). Overall classification performance also improved significantly with AI assistance, with the mean true positive rate increasing from 0.69 to 0.81 (p = 0.002).

DISCUSSION

The aim of this study was to comparatively assess the effectiveness of AI systems in the identification of DISs and to examine their impact on diagnostic performance. The first null hypothesis, which stated that there would be no significant difference in classification performance among AI models, was partially rejected. Although the overall accuracy values were similar, the YOLOv10 model exhibited a narrower range of class-specific accuracy scores, indicating more consistent performance.
The second null hypothesis, which proposed that AI assistance would not influence diagnostic performance, was rejected, as AI assistance led to measurable improvements across all implant system categories.

Recent meta-analyses and systematic reviews have demonstrated that deep learning-based AI algorithms offer high accuracy in the identification of DISs (9-11). In a meta-analysis conducted by Dashti et al. (11), the average diagnostic accuracy of deep learning models for DIS identification was reported as 95.63%. Similarly, Ibraheem (12) stated that the majority of studies in the literature achieved accuracy rates above 90%. These findings highlight the strong potential of AI models in identifying DISs based on radiographic images.

Previous studies have most commonly employed panoramic and periapical radiographs as imaging modalities. While some research has suggested superior performance with periapical images, others have reported comparable or even better outcomes using panoramic radiographs or a combination of both (12, 13). However, there is still no consensus in the literature regarding the superiority of a particular modality. In the present study, only panoramic radiographs were included because the dataset was limited to this type of image.

The findings of this study were consistent with previous research demonstrating the effectiveness of YOLO-based object detection algorithms in classifying DISs. Hassan et al. (14) reported that the YOLOv8m-seg model achieved high performance in periapical radiographs, with an F1 score of 95% and a mAP of 97.2%. Kong et al. (15) showed that the YOLOv7 model outperformed YOLOv5 in classifying implant designs. Additionally, studies evaluating YOLOv3 have reported accuracy rates up to 96.7%, with mAP and IoU metrics exceeding 0.70 (16, 17). In the present study, both YOLOv8 and YOLOv10 models demonstrated high success; however, YOLOv10 stood out with a narrower TP range (0.90–0.99) and more stable metric distributions.
Therefore, YOLOv10 was selected as the reference model in the analysis of dental professionals’ performance.

Several studies have evaluated the performance of AI and dental professionals in identifying DISs (18-21). While most of these investigations focused on comparing AI and human performance separately (18-20), all consistently reported that deep learning (DL) algorithms significantly outperformed dental professionals in terms of diagnostic accuracy. To date, however, only one study has specifically examined the collaborative effect of human-AI interaction (21). In the study conducted by Lee et al. (21), AI-assisted board-certified periodontists achieved higher diagnostic accuracy (88.56%) compared to the standalone performance of the DL algorithm (80.56%). Conversely, general dentists and periodontal residents in the same study failed to surpass the independent performance of the AI model, suggesting that the efficacy of AI assistance may be contingent upon the clinician’s level of expertise and experience.

In alignment with the existing literature, the present study also demonstrated that AI assistance led to significant improvements in diagnostic performance across all implant system categories. Participants exhibited consistent enhancements in all evaluated metrics, indicating that AI support contributed positively to the classification process irrespective of implant type. Notably, the most pronounced gains were observed in diagnostically challenging categories, such as Dentium SuperLine and Nobel Replace Tapered, underscoring the potential utility of AI in complex clinical scenarios. Despite these enhancements, however, the overall diagnostic accuracy of the AI-assisted group remained inferior to that of the AI algorithm operating autonomously.
This finding suggests that, although AI systems are capable of producing stable and high-accuracy classifications in isolation, their integration into human decision-making processes does not inherently lead to superior outcomes and, under certain conditions, may even reduce performance due to suboptimal human-AI interaction. This discrepancy may be attributed to inconsistencies in how participants interpret and apply AI-generated outputs during clinical evaluations. Some individuals may disregard, underutilize, or misinterpret the algorithm’s predictions due to reliance on prior clinical experience or the influence of cognitive biases such as anchoring or confirmation bias. Furthermore, a critical observation in the present study was the occurrence of automation bias, wherein participants altered initially correct responses after exposure to incorrect AI suggestions. In such cases, the AI system, rather than serving as a supportive second opinion, paradoxically diverted clinicians from accurate decisions, thereby diminishing the overall effectiveness of the collaboration. These findings underscore a fundamental challenge in clinical AI integration: while AI models offer robust standalone performance, their true clinical utility depends on appropriate interpretation and calibrated reliance by human users.

Several limitations should be considered. First, the relatively small sample size and single-center design may restrict the generalizability of the findings. Additionally, the limited number of images per implant class may affect the model’s classification performance, especially for rarely encountered systems. The exclusive use of panoramic radiographs also limits the ability to compare performance across other imaging modalities such as periapical radiographs or cone-beam computed tomography. Future studies involving larger, multicenter datasets and diverse radiographic techniques would enhance the generalizability and applicability of the findings.
Furthermore, one of the most commonly cited limitations in the literature is the lack of annotated implant image data. Deep learning models require large, balanced, and diverse datasets for effective training. Although data augmentation techniques were employed in this study, future research may benefit from the use of synthetic data generation methods such as Generative Adversarial Networks. These approaches could increase the number of images of rare implant types and contribute to more balanced model performance, ultimately improving the utility of clinical decision support systems.

CONCLUSION

This study confirmed the strong potential of deep learning-based AI models in the classification of DISs using panoramic radiographs. Among the evaluated models, YOLOv10 demonstrated superior and more consistent performance, leading to its selection as the reference model for observer comparison. The integration of AI support significantly improved the diagnostic performance of participants across all implant categories. However, despite these improvements, AI-assisted participants did not surpass the standalone AI model. This outcome highlights the critical role of human-AI interaction dynamics and the potential for automation bias, particularly when clinicians rely excessively on incorrect AI suggestions.

These findings underscore the importance of not only developing high-performing AI systems but also promoting clinical awareness and education on how to effectively interpret and utilize AI-generated outputs. Enhancing clinicians’ digital literacy and decision-making strategies in the presence of AI may help mitigate common pitfalls, such as automation bias, and unlock the full potential of human-AI collaboration. Future work should focus on expanding dataset diversity, evaluating multi-modal imaging inputs, and establishing evidence-based protocols for effective AI integration in routine clinical practice.
LIST OF ABBREVIATIONS

AI: Artificial Intelligence
AP: Average Precision
CNN: Convolutional Neural Network
DIS: Dental Implant System
DL: Deep Learning
FN: False Negative
FP: False Positive
IoU: Intersection over Union
mAP: Mean Average Precision
mAP50: Mean Average Precision at an IoU threshold of 0.5
TN: True Negative
TP: True Positive

Declarations

Ethics approval and consent to participate
This study was approved by the Non-Interventional Clinical Research Ethics Committee of the Faculty of Dentistry, Aydın Adnan Menderes University (Decision No: 2024/02, 19 July 2024). The requirement for informed consent from patients was waived by the ethics committee due to the retrospective nature of the study, which used only anonymized panoramic radiographs without collecting any personal patient information. Informed consent was obtained from all participants who voluntarily participated in the study.

Consent for publication
Not applicable. This manuscript does not contain any individual person’s data in any form (including individual details, images, or videos).

Trial registration
Not applicable.

Availability of data and materials
The datasets generated and/or analysed during the current study are not publicly available due to restrictions related to patient privacy but are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Authors' contributions
Köse OA contributed to the conception, design, data acquisition and interpretation of the study and drafted the manuscript. Akyıl MŞ contributed to the conception, design, and project management, and also drafted the manuscript. Both authors critically revised the manuscript, approved the final version, and agree to be accountable for all aspects of the work.
Acknowledgements
We would like to express our sincere gratitude to Prof. Dr. Timur Köse for his valuable support in performing the statistical analyses for this study, and to Emre Kutlu Köse, M.Sc., for his assistance in the training of the artificial intelligence models.

Footnotes
Not applicable.

References

1. Warreth A, Ibieyou N, O'Leary R, Cremonese M, Abdulrahim M. Dental implants: an overview. Dent Update. 2017;44:596-620.
2. Sailer I, Karasan D, Todorovic A, Ligoutsikou M, Pjetursson BE. Prosthetic failures in dental implant therapy. Periodontol 2000. 2022;88(1):130-44.
3. Misch CM. Editorial: the global dental implant market: everything has a price. Int J Oral Implantol (Berl). 2020;13(4):311-2.
4. Leblebicioglu Kurtulus I, Lubbad M, Yilmaz OMD, Kilic K, Karaboga D, Basturk A, et al. A robust deep learning model for the classification of dental implant brands. J Stomatol Oral Maxillofac Surg. 2024;125(12 Suppl 2):101818.
5. Chakravorty S, Aulakh BK, Shil M, Nepale M, Puthenkandathil R, Syed W. Role of artificial intelligence (AI) in dentistry: a literature review. J Pharm Bioallied Sci. 2024;16(Suppl 1):S14-S6.
6. Yamashita R, Nishio M, Do RK, Togashi K. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018;9(4):611-29.
7. Bonny T, Al Nassan W, Obaideen K, Al Mallahi MN, Mohammad Y, El-Damanhoury HM. Contemporary role and applications of artificial intelligence in dentistry. F1000Res. 2023;12:1179.
8. Panahi O, Jabbarzadeh M. The expanding role of artificial intelligence in modern dentistry. 2025.
9. Alqutaibi AY, Algabri RS, Elawady D, Ibrahim WI. Advancements in artificial intelligence algorithms for dental implant identification: a systematic review with meta-analysis. J Prosthet Dent. 2023.
10. Chaurasia A, Namachivayam A, Koca-Ünsal RB, Lee JH. Deep-learning performance in identifying and classifying dental implant systems from dental imaging: a systematic review and meta-analysis. J Periodontal Implant Sci. 2023;54(1):3.
11. Dashti M, Londono J, Ghasemi S, Tabatabaei S, Hashemi S, Baghaei K, et al. Evaluation of accuracy of deep learning and conventional neural network algorithms in detection of dental implant type using intraoral radiographic images: a systematic review and meta-analysis. J Prosthet Dent. 2025;133(1):137-46.
12. Ibraheem WI. Accuracy of artificial intelligence models in dental implant fixture identification and classification from radiographs: a systematic review. Diagnostics (Basel). 2024;14(8):806.
13. Park W, Huh JK, Lee JH. Automated deep learning for classification of dental implant radiographs using a large multi-center dataset. Sci Rep. 2023;13(1):4862.
14. Hassan NA, Kamel AE, Omran AE, Gad MW, Ashraf NM, Ahmed OM, et al. Automated identification of dental implants: a new, fast and accurate artificial intelligence system. Eur J Prosthodont Restor Dent. 2023.
15. Kong HJ, Yoo JY, Lee JH, Eom SH, Kim JH. Performance evaluation of deep learning models for the classification and identification of dental implants. J Prosthet Dent. 2023.
16. Kim HS, Ha EG, Kim YH, Jeon KJ, Lee C, Han SS. Transfer learning in a deep convolutional neural network for implant fixture classification: a pilot study. Imaging Sci Dent. 2022;52(2):219-24.
17. Takahashi T, Nozaki K, Gonda T, Mameno T, Wada M, Ikebe K. Identification of dental implants using deep learning—pilot study. Int J Implant Dent. 2020;6:4.
18. Park W, Schwendicke F, Krois J, Huh JK, Lee JH. Identification of dental implant systems using a large-scale multicenter data set. J Dent Res. 2023;102(7):727-33.
19. Lee JH, Kim YT, Lee JB, Jeong SN. A performance comparison between automated deep learning and dental professionals in classification of dental implant systems from dental imaging: a multi-center study. Diagnostics (Basel). 2020;10(11):910.
20. Lee JH, Jeong SN. Efficacy of deep convolutional neural network algorithm for the identification and classification of dental implant systems, using panoramic and periapical radiographs: a pilot study. Medicine (Baltimore). 2020;99(26):e20787.
21. Lee JH, Kim YT, Lee JB, Jeong SN. Deep learning improves implant classification by dental professionals: a multi-center evaluation of accuracy and efficiency. J Periodontal Implant Sci. 2022;52(3):220-9.

Additional Declarations
No competing interests reported.
AKYIL","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA6ElEQVRIiWNgGAWjYJCCAw8YEhjY2JkPMPCAucRoSQBpYWZLgGlhbCCoB6SFgZnHgDgt8u2nEw8k1KQl9jHzfPzwto1Bju9GAvvjCjxaDM7kbjiQcCwnsY2Zd7Pk3DYGY8kbCYyNZ/BpYQBpYasAadkgzdvGkLgBpAWfy+T73wK1/ANp4Xn8G6ilnqAWhhtAWxLbQA7jYQPZkmBASIvBDaAtiX1pxm3MbGaWc85JGM4887BxJn6H5W7+8OFbsuz89ubHN96U2cjzHU8+8BGvw6DAEapIAoiJiEkQsCdK1SgYBaNgFIxMAADFKlWx69qp6QAAAABJRU5ErkJggg==","orcid":"","institution":"Aydin Adnan Menderes University","correspondingAuthor":true,"prefix":"","firstName":"Musa","middleName":"Şamil","lastName":"AKYIL","suffix":""}],"badges":[],"createdAt":"2025-08-12 12:38:14","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7355779/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7355779/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":93338187,"identity":"937bb82d-e366-4c35-80ee-4bd328afe1f1","added_by":"auto","created_at":"2025-10-12 14:19:20","extension":"jpg","order_by":0,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":205150,"visible":true,"origin":"","legend":"","description":"","filename":"Figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/98f9ef401f4303164ff24aed.jpg"},{"id":93336031,"identity":"75dac31b-053a-4308-ae58-ac5769be2501","added_by":"auto","created_at":"2025-10-12 14:03:20","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":86872,"visible":true,"origin":"","legend":"","description":"","filename":"BMCmaintextrevison2.docx","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/e16705fe63733550db12abe4.docx"},{"id":93336029,"identity":"d44914bf-5340-4efa-8a74-eb09cf67b788","added_by":"auto","created_at":"2025-10-12 
14:03:20","extension":"jpg","order_by":2,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":216897,"visible":true,"origin":"","legend":"","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/c4f8ce29519fe8242eff580d.jpg"},{"id":93337835,"identity":"bf783274-b426-47f2-b7d4-f460d821ea63","added_by":"auto","created_at":"2025-10-12 14:11:20","extension":"json","order_by":3,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":5788,"visible":true,"origin":"","legend":"","description":"","filename":"3fde13a66eca48a68b61c834e7fa654c.json","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/adac25f20e20bdf8bcee4b48.json"},{"id":93336028,"identity":"43abb68e-8765-4c4e-abcc-14db1964cab9","added_by":"auto","created_at":"2025-10-12 14:03:20","extension":"jpg","order_by":4,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":205150,"visible":true,"origin":"","legend":"","description":"","filename":"Figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/c08f655693951d0879f87458.jpg"},{"id":93337837,"identity":"3bc9f194-f2b7-486a-afc4-06fa6b4d8b0e","added_by":"auto","created_at":"2025-10-12 14:11:20","extension":"jpg","order_by":5,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":216897,"visible":true,"origin":"","legend":"","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/cfffc1521926aaee9d2877ae.jpg"},{"id":93336025,"identity":"8bac01e3-83d1-4045-b7dc-b237a989de98","added_by":"auto","created_at":"2025-10-12 
14:03:20","extension":"png","order_by":6,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":44912,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure1.png","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/06c4024d28688183f7d5470c.png"},{"id":93338188,"identity":"41a98525-a61b-4803-8f8f-6911aea627a6","added_by":"auto","created_at":"2025-10-12 14:19:20","extension":"png","order_by":7,"title":"","display":"","copyAsset":false,"role":"acdc-reference","size":47240,"visible":true,"origin":"","legend":"","description":"","filename":"OnlineFigure2.png","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/451d6290b68de3d380bfc8d5.png"},{"id":93336022,"identity":"3bfaacbc-5199-4c88-a95d-2ee64d56c6a7","added_by":"auto","created_at":"2025-10-12 14:03:20","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":205150,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eNormalized confusion matrices for YOLOv8 and YOLOv10.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNormalized confusion matrices showing the performance of YOLOv8 (left) and YOLOv10 (right) models in identifying dental implant systems. Each matrix visualizes the proportion of true positives (diagonal), false positives (off-diagonal column values), and false negatives (off-diagonal row values) for each class. 
Values represent normalized prediction frequencies per true class.\u003c/p\u003e","description":"","filename":"Figure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/b3ccd3057edb21fb1ce749c2.jpg"},{"id":93337832,"identity":"2851c243-0783-415f-874d-f978543493e1","added_by":"auto","created_at":"2025-10-12 14:11:20","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":216897,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eConfusion matrices for dental professionals with and without AI assistance.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConfusion matrices showing classification performance of dental professionals without AI assistance (left) and with AI assistance (right). True label classes are shown on the vertical axis and predicted classes on the horizontal axis. Values represent normalized true positive rates per class.\u003c/p\u003e","description":"","filename":"Figure2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/b5bb52eef44f499e56f005ec.jpg"},{"id":93340766,"identity":"8cc04fb3-752c-4fe8-bd2c-ff8b19ec9126","added_by":"auto","created_at":"2025-10-12 14:35:22","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1218321,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7355779/v1/c16a5dba-5ded-45ea-b464-270c7ae3e682.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Artificial Intelligence and Dental Professionals’ Performance in the Identification of Dental Implant Systems: The Impact of Deep Learning Algorithms – A Diagnostic Accuracy Study","fulltext":[{"header":"BACKGROUND","content":"\u003cp\u003eDental implants are among the most commonly utilized treatment modalities for replacing missing teeth and are regarded as a reliable option in contemporary restorative dentistry 
(1). Although implant-supported treatments generally exhibit high long-term success rates, clinicians frequently encounter failures involving both implants and superstructures. Mechanical complications, such as implant fracture, loss of retention, screw loosening or fracture, and prosthetic material failure, are widely reported (2). Globally, hundreds of manufacturers produce thousands of distinct dental implant systems (DIS), contributing to substantial diversity within the field of implant dentistry (3). These systems exhibit considerable variation in their internal configurations, abutment designs, superstructures, and screw components, with such differences largely determined by the specific manufacturer and model. Accurate identification and classification of the existing implant system in the oral cavity is therefore critical for ensuring appropriate clinical management in the event of such complications (4).\u003c/p\u003e\n\u003cp\u003eArtificial intelligence (AI) is a rapidly advancing field that mimics human behavior to perform specific tasks in medicine and dentistry (5). In particular, convolutional neural networks (CNNs), a fundamental architecture within deep learning, are specifically designed to address tasks such as image processing, video analysis, object detection, and image classification (6). CNNs have been widely adopted in dentistry from diagnosis to treatment planning, implementation, and prediction of outcomes. Numerous studies have demonstrated that AI applications can improve diagnostic precision and therapeutic success (5, 7, 8).\u003c/p\u003e\n\u003cp\u003eSeveral meta-analyses and systematic reviews have explored the use of object detection algorithms for identifying dental implant systems in panoramic and periapical radiographs (9-11). While the outcomes of these studies are promising, further research is warranted to enhance the effectiveness of computer-aided systems, improve clinical accuracy, and optimize treatment outcomes. 
Since real-world clinical settings involve complex decision-making processes, evaluations that do not account for clinician involvement may offer a limited representation of practical use. Although AI systems are increasingly adopted in dental practice, the clinician retains primary responsibility for diagnostic and treatment decisions. Assessing the interaction between AI systems and dental professionals is therefore essential for evaluating their clinical utility. To address this gap, the present study aims to compare the diagnostic performance of two deep learning models (YOLOv8 and YOLOv10) in identifying DIS on panoramic radiographs and to investigate the effect of AI assistance on the diagnostic accuracy of dental professionals.\u003c/p\u003e\n\u003cp\u003eThe null hypotheses of the study were as follows:\u003c/p\u003e\n\u003cp\u003eH₁: There is no difference between the AI models in identifying DISs.\u003c/p\u003e\n\u003cp\u003eH₂: There is no significant difference in dental professionals’ diagnostic performance with and without AI assistance in DISs.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cp\u003e\u003cstrong\u003eTrial Design\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis retrospective observational diagnostic accuracy study aimed to compare the performance of two deep learning models (YOLOv8 and YOLOv10) in identifying dental implant systems on panoramic radiographs and to evaluate the effect of AI assistance on the diagnostic accuracy of dental professionals. The study was approved by the Ethics Committee of the Faculty of Dentistry, Aydın Adnan Menderes University (protocol number: 2024/02).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDataset Preparation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003ePanoramic radiographs obtained from patients who underwent dental implant treatment at the Faculty of Dentistry, Aydın Adnan Menderes University between January 2014 and April 2024 were retrospectively reviewed. 
Radiographs were excluded when the corresponding implant system had too few cases to form a separate group in the analysis, or when the radiograph exhibited severe patient positioning errors or imaging artifacts or showed any pathological condition in the implant region. The final dataset comprised 380 radiographs containing 1143 implant instances. Table 1 summarizes the DISs included in the study, along with the corresponding numbers of images and implant instances for each system.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e1\u003c/strong\u003e\u003cstrong\u003e.\u003c/strong\u003e Numbers of panoramic images and instances of implant systems used in this study.\u003c/p\u003e\n\u003ctable border=\"0\" cellspacing=\"0\" cellpadding=\"0\" width=\"102%\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eI\u003c/strong\u003e\u003cstrong\u003emplant System\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eI\u003c/strong\u003e\u003cstrong\u003emage (n)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eInstance (n)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNucleOSS T4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e87\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e240\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNucleOSS T6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e74\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e239\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003eDentium\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e73\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e242\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNobel\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e208\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNTA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e57\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e214\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eTotal\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e380*\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1143\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e*The sum of the images in individual groups exceeds the total number of images because some radiographs contain multiple implant systems, resulting in their inclusion in more than one category.\u003c/p\u003e\n\u003cp\u003eFive\u0026nbsp;dental implant systems from four manufacturers (NucleOSS T4 Standard and NucleOSS T6 Standard [NucleOSS Implants, İzmir, Türkiye], Dentium SuperLine [Dentium Co., Ltd., Seoul, South Korea], NobelReplace Tapered [Nobel Biocare AB, Göteborg, Sweden], and NTA Regular [Pilatus Swiss Dental GmbH, Egolzwil, Switzerland]) were annotated by a single researcher using Roboflow Annotate (Roboflow Inc., Des Moines, IA, USA). The annotated images were split into training (60%), validation (20%), and test (20%) sets via the Roboflow platform. All images were resized to 640×640 pixels without cropping. 
Data augmentation, including ±15° rotation and ±25% brightness adjustment, was applied to the training set. The final dataset included 810 images (80% training, 10% validation, and 10% test).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTraining of Artificial Intelligence Algorithms\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTwo pretrained object detection models, large models of YOLOv8 and YOLOv10\u0026nbsp;(both pretrained on the Microsoft COCO dataset), were fine-tuned to improve performance in DIS identification. Training was conducted on the Google Colab Pro platform (Google LLC, Mountain View, CA, USA) using an NVIDIA T4 GPU. Each model was trained for 200 epochs with a batch size of 32.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAssessment of AI Model Performance\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eModel performance was evaluated using the test dataset. Detections with an Intersection over Union (IoU) greater than 0.5 were considered correct. Confusion matrices were constructed to report true positives (TP), false positives (FP), and false negatives (FN); true negatives (TN) were excluded, as they are not defined in object detection tasks. Class-wise precision, recall, and F1 scores were calculated. The mean Average Precision at an IoU threshold of 0.5 (mAP50) was used as the primary metric to assess overall detection performance.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEvaluation of the Performance of Dental Professionals\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eTwelve eligible dental professionals from the departments of Periodontology and Prosthodontics, all of whom had at least one year of clinical experience in implantology, were included in the study. No participants from the Department of Oral and Maxillofacial Surgery were included, as no eligible personnel were available during the study period. 
The distribution of specialties among participants was not balanced.\u003c/p\u003e\n\u003cp\u003eA total of 233 cropped implant images from the test set were used to generate two multiple-choice forms: one without AI assistance and one with AI assistance. Both forms were text-based and administered via Google Forms (Google LLC, Mountain View, California, USA). In the AI-assisted form, YOLOv10 predictions were shown below each question. Question order was randomized for each participant; class options remained fixed.\u003c/p\u003e\n\u003cp\u003eParticipants initially completed the form without AI assistance. After a two-week interval, the same participants completed the AI-assisted form. Responses from the first session were not accessible during the second. The interval was intended to minimize recall bias and to evaluate the independent effect of AI assistance on decision-making.\u003c/p\u003e\n\u003cp\u003eEach participant’s responses were compared with the ground truth class labels to generate separate confusion matrices for the AI-assisted and unassisted conditions. Class-specific precision, recall, and F1-scores were calculated at both individual and group levels. Since each participant provided only one response per class, the values for precision, recall, and F1-score were identical, effectively representing the correct classification rate (i.e., true positive ratio). This metric was therefore used as the overall performance indicator.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStatistical Analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eStatistical analyses were performed using IBM SPSS Statistics 25.0. The normality of metric distributions was tested using the Shapiro-Wilk test. Since normality was not assumed, non-parametric tests were used. The Wilcoxon signed-rank test was used to compare performance between AI-assisted and unassisted conditions. 
A p-value \u0026lt;0.05 was considered statistically significant.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003e\u003cstrong\u003eEvaluation of the Performance of Artificial Intelligence Algorithms\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNormalized confusion matrices for the YOLOv8 and YOLOv10 algorithms are shown in Figure 1. YOLOv8 achieved high TP rates in the NucleOSS T6, Dentium SuperLine, and NobelReplace Tapered classes. However, its lowest performance was observed in the NTA Regular group (77%), which was frequently misclassified as NucleOSS T4 (10%). In contrast, YOLOv10 demonstrated a more balanced and consistent classification performance across all implant classes, with notably reduced inter-class confusion, particularly improving the recognition of the NTA group (94%).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e2\u003c/strong\u003e\u003cstrong\u003e.\u003c/strong\u003e\u0026nbsp;Performance metrics of YOLOv8 and YOLOv10 models for each implant class and overall.\u003c/p\u003e\n\u003cdiv\u003e\n \u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"103%\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eClass\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eRecall\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eF1-Score\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003eAP\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003eYOLOv8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eYOLOv10\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eT4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.89\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.91\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.95\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eT6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.97\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"top\"\u003e\n \u003cp\u003e0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eDENTIUM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.92\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.97\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNOBEL\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.98\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.99\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eNTA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.91\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003e0.77\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.83\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.92\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.88\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eOverall\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.95\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.95\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.96\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.97\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eAP: Average Precision.\u003c/p\u003e\n\u003cp\u003eBoth YOLOv8 and YOLOv10 models demonstrated high overall performance in terms of precision, recall, F1-score, and AP values (Table 2). However, YOLOv10 exhibited more consistent results across classes, as indicated by its narrower ranges of recall (0.90–0.99) and F1-score (0.92–0.99) values compared with YOLOv8 (recall: 0.77–1.00; F1-score: 0.83–0.99). This suggests greater robustness in class-level predictions. In terms of average precision (AP), YOLOv10 also outperformed YOLOv8 slightly, with mAP50 scores of 0.97 and 0.96, respectively. 
YOLOv10 achieved its lowest AP in the T4 class (0.95) and its highest in the NobelReplace Tapered class (0.99), while YOLOv8 reached its highest AP (0.99) in the T6, Dentium, and Nobel classes and its lowest in the NTA class (0.88). These results suggest that although both models were effective, YOLOv10 offered a more balanced classification performance across implant systems, whereas YOLOv8 exhibited strong results in certain classes but lower reliability in others, particularly the NTA class.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEvaluation of the Performance of Dental Professionals\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eConfusion matrices are presented in Figure 2. In the unassisted condition, true positive (TP) rates ranged from 0.34 to 0.88, with the most frequent misclassifications occurring in the Dentium SuperLine and NobelReplace Tapered classes. Dentium SuperLine was frequently misclassified as NTA Regular, whereas NobelReplace Tapered was commonly confused with NucleOSS T4 and T6. In contrast, the AI-assisted condition achieved TP rates ranging from 0.84 to 0.95, with a marked reduction in misclassifications across all classes.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003e3\u003c/strong\u003e\u003cstrong\u003e.\u0026nbsp;\u003c/strong\u003eComparison of precision, recall, and F1-scores with and without Artificial Intelligence assistance.\u003c/p\u003e\n\u003cdiv align=\"center\"\u003e\n \u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"700\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"2\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eClass\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"3\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003ePrecision\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"3\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eRecall\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n 
\u003ctd colspan=\"3\" valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eF1-Score\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eUnassisted\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAssisted\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003ep\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eUnassisted\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAssisted\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003ep\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eUnassisted\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003eAssisted\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003ep\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eT4\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.74\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.92\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.88\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.058\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.80\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eT6\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.79\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.95\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.003\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.69\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.93\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eDENTIUM\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.59\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.88\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.45\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.84\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.51\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n 
\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eNOBEL\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.56\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.94\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.005\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.34\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.87\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.42\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.90\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eNTA\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.48\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.80\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.60\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.86\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.53\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.83\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n 
\u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003eOverall\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.69\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.69\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.69\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e0.81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\"\u003e\n \u003cp\u003e\u003cstrong\u003e0.002\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n\u003c/div\u003e\n\u003cp\u003eThe data were analyzed using the Wilcoxon signed-rank test; p \u0026lt; 0.05 was considered statistically significant. Statistically significant p-values are shown in bold. Since each participant provided only one response per class, the overall values for precision, recall, and F1-score were identical.\u003c/p\u003e\n\u003cp\u003ePrecision, recall, and F1-score values were consistently higher in the AI-assisted group, with narrower inter-class variability (Table 3). 
Statistically significant improvements were observed in all three metrics across all classes (p \u0026lt; 0.05), except for recall in the NucleOSS T4 group, where the increase did not reach statistical significance (p = 0.058).\u003c/p\u003e\n\u003cp\u003eOverall classification performance also improved significantly with AI assistance, with the mean true positive rate increasing from 0.69 to 0.81 (p = 0.002).\u003c/p\u003e"},{"header":"DISCUSSION","content":"\u003cp\u003eThe aim of this study was to comparatively assess the effectiveness of AI systems in the identification of DISs and to examine their impact on diagnostic performance. The first null hypothesis, which stated that there would be no significant difference in classification performance among AI models, was partially rejected. Although the overall accuracy values were similar, the YOLOv10 model exhibited a narrower range of class-specific accuracy scores, indicating more consistent performance. The second null hypothesis, which proposed that AI assistance would not influence diagnostic performance, was rejected, as AI assistance led to measurable improvements across all implant system categories.\u003c/p\u003e\n\u003cp\u003eRecent meta-analyses and systematic reviews have demonstrated that deep learning-based AI algorithms offer high accuracy in the identification of DIS (9-11). In a meta-analysis conducted by Dashti et al. (11), the average diagnostic accuracy of deep learning models for DIS identification was reported as 95.63%. Similarly, Ibraheem (12) stated that the majority of studies in the literature achieved accuracy rates above 90%. These findings highlight the strong potential of AI models in identifying DIS based on radiographic images.\u003c/p\u003e\n\u003cp\u003ePrevious studies have most commonly employed panoramic and periapical radiographs as imaging modalities. 
While some research has suggested superior performance with periapical images, others have reported comparable or even better outcomes using panoramic radiographs or a combination of both (12, 13). However, there is still no consensus in the literature regarding the superiority of a particular modality. In the present study, only panoramic radiographs were included because the dataset was limited to this image type.\u003c/p\u003e\n\u003cp\u003eThe findings of this study were consistent with previous research demonstrating the effectiveness of YOLO-based object detection algorithms in classifying DISs. Hassan et al. (14) reported that the YOLOv8m-seg model achieved high performance on periapical radiographs, with an F1 score of 95% and a mAP of 97.2%. Kong et al. (15) showed that the YOLOv7 model outperformed YOLOv5 in classifying implant designs. Additionally, studies evaluating YOLOv3 have reported accuracy rates up to 96.7%, with mAP and IoU metrics exceeding 0.70 (16, 17). In the present study, both YOLOv8 and YOLOv10 models performed well; however, YOLOv10 stood out with a narrower TP range (0.90–0.99) and more stable metric distributions. Therefore, YOLOv10 was selected as the reference model for the analysis of dental professionals’ performance.\u003c/p\u003e\n\u003cp\u003eSeveral studies have evaluated the performance of AI and dental professionals in identifying DISs (18-21). While most of these investigations focused on comparing AI and human performance separately (18-20), all consistently reported that deep learning (DL) algorithms significantly outperformed dental professionals in terms of diagnostic accuracy. To date, however, only one study has specifically examined the collaborative effect of human-AI interaction (21). In the study conducted by Lee et al. (21), AI-assisted board-certified periodontists achieved higher diagnostic accuracy (88.56%) than the standalone DL algorithm (80.56%). 
Conversely, general dentists and periodontal residents in the same study failed to surpass the independent performance of the AI model, suggesting that the efficacy of AI assistance may be contingent upon the clinician’s level of expertise and experience.\u003c/p\u003e\n\u003cp\u003eIn alignment with the existing literature, the present study also demonstrated that AI assistance led to significant improvements in diagnostic performance across all implant system categories. Participants exhibited consistent enhancements in all evaluated metrics, indicating that AI support contributed positively to the classification process irrespective of implant type. Notably, the most pronounced gains were observed in diagnostically challenging categories, such as Dentium Superline and Nobel Replace Tapered, underscoring the potential utility of AI in complex clinical scenarios. Despite these enhancements, however, the overall diagnostic accuracy of the AI-assisted group remained inferior to that of the AI algorithm operating autonomously. This finding suggests that, although AI systems can produce stable, high-accuracy classifications in isolation, their integration into human decision-making processes does not inherently lead to superior outcomes and, under certain conditions, may even reduce performance due to suboptimal human-AI interaction.\u003c/p\u003e\n\u003cp\u003eThis discrepancy may be attributed to inconsistencies in how participants interpret and apply AI-generated outputs during clinical evaluations. Some individuals may disregard, underutilize, or misinterpret the algorithm’s predictions due to reliance on prior clinical experience or the influence of cognitive biases such as anchoring or confirmation bias. Furthermore, a critical observation in the present study was the occurrence of automation bias, wherein participants altered initially correct responses after exposure to incorrect AI suggestions. 
In such cases, the AI system, rather than serving as a supportive second opinion, paradoxically diverted clinicians from accurate decisions, thereby diminishing the overall effectiveness of the collaboration. These findings underscore a fundamental challenge in clinical AI integration: while AI models offer robust standalone performance, their true clinical utility depends on appropriate interpretation and calibrated reliance by human users.\u003c/p\u003e\n\u003cp\u003eSeveral limitations should be considered. First, the relatively small sample size and single-center design may restrict the generalizability of the findings. Additionally, the limited number of images per implant class may affect the model’s classification performance, especially for rarely encountered systems. The exclusive use of panoramic radiographs also limits the ability to compare performance across other imaging modalities such as periapical radiographs or cone-beam computed tomography. Future studies involving larger, multicenter datasets and diverse radiographic techniques would enhance the generalizability and applicability of the findings.\u003c/p\u003e\n\u003cp\u003eFurthermore, one of the most commonly cited limitations in the literature is the lack of annotated implant image data. Deep learning models require large, balanced, and diverse datasets for effective training. Although data augmentation techniques were employed in this study, future research may benefit from the use of synthetic data generation methods such as generative adversarial networks. These approaches could increase the number of images of rare implant types and contribute to more balanced model performance, ultimately improving the utility of clinical decision support systems.\u003c/p\u003e"},{"header":"CONCLUSION","content":"\u003cp\u003eThis study confirmed the strong potential of deep learning-based AI models in the classification of DISs using panoramic radiographs. 
Among the evaluated models, YOLOv10 demonstrated more consistent class-level performance, leading to its selection as the reference model for observer comparison. The integration of AI support significantly improved the diagnostic performance of participants across all implant categories. However, despite these improvements, AI-assisted participants did not surpass the standalone AI model. This outcome highlights the critical role of human-AI interaction dynamics and the potential for automation bias, particularly when clinicians rely excessively on incorrect AI suggestions.\u003c/p\u003e\n\u003cp\u003eThese findings underscore the importance of not only developing high-performing AI systems but also promoting clinical awareness and education on how to effectively interpret and utilize AI-generated outputs. Enhancing clinicians’ digital literacy and decision-making strategies in the presence of AI may help mitigate common pitfalls such as automation bias and unlock the full potential of human-AI collaboration. 
Future work should focus on expanding dataset diversity, evaluating multi-modal imaging inputs, and establishing evidence-based protocols for effective AI integration in routine clinical practice.\u003c/p\u003e"},{"header":"LIST OF ABBREVIATIONS","content":"\u003cp\u003e\u003cstrong\u003eAI\u003c/strong\u003e: Artificial Intelligence\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAP\u003c/strong\u003e: Average Precision\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCNN\u003c/strong\u003e: Convolutional Neural Network\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDIS\u003c/strong\u003e: Dental Implant System\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDL\u003c/strong\u003e: Deep Learning\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFN\u003c/strong\u003e: False Negative\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFP\u003c/strong\u003e: False Positive\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eIoU\u003c/strong\u003e: Intersection over Union\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003emAP\u003c/strong\u003e: Mean Average Precision\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003emAP50\u003c/strong\u003e: Mean Average Precision at an IoU threshold of 0.5\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTN\u003c/strong\u003e: True Negative\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTP\u003c/strong\u003e: True Positive\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was approved by the Non-Interventional Clinical Research Ethics Committee of the Faculty of Dentistry, Aydın Adnan Menderes University (Decision No: 2024/02, 19 July 2024). The requirement for informed consent from patients was waived by the ethics committee due to the retrospective nature of the study, which used only anonymized panoramic radiographs without collecting any personal patient information. 
Informed consent was obtained from all participants who voluntarily participated in the study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable. This manuscript does not contain any individual person’s data in any form (including individual details, images, or videos).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTrial registration\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets generated and/or analysed during the current study are not publicly available due to restrictions related to patient privacy but are available from the corresponding author on reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors' contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eKöse OA contributed to the conception, design, data acquisition and interpretation of the study and drafted the manuscript.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eAkyıl MŞ contributed to the conception, design, and project management, and also drafted the manuscript.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eBoth authors critically revised the manuscript, approved the final version, and agree to be accountable for all aspects of the work.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe would like to express our sincere gratitude to Prof. Dr. 
Timur Köse for his valuable support in performing the statistical analyses for this study, and to Emre Kutlu Köse, M.Sc., for his assistance in the training of the artificial intelligence models.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFootnotes\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNot applicable.\u0026nbsp;\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eWarreth A, Ibieyou N, O'Leary R, Cremonese M, Abdulrahim M. Dental implants: an overview. Dent Update. 2017;44:596-620.\u003c/li\u003e\n \u003cli\u003eSailer I, Karasan D, Todorovic A, Ligoutsikou M, Pjetursson BE. Prosthetic failures in dental implant therapy. Periodontol 2000. 2022;88(1):130-44.\u003c/li\u003e\n \u003cli\u003eMisch CM. Editorial: the global dental implant market: everything has a price. Int J Oral Implantol (Berl). 2020;13(4):311-2.\u003c/li\u003e\n \u003cli\u003eLeblebicioglu Kurtulus I, Lubbad M, Yilmaz OMD, Kilic K, Karaboga D, Basturk A, et al. A robust deep learning model for the classification of dental implant brands. J Stomatol Oral Maxillofac Surg. 2024;125(12 Suppl 2):101818.\u003c/li\u003e\n \u003cli\u003eChakravorty S, Aulakh BK, Shil M, Nepale M, Puthenkandathil R, Syed W. Role of artificial intelligence (AI) in dentistry: a literature review. J Pharm Bioallied Sci. 2024;16(Suppl 1):S14-S6.\u003c/li\u003e\n \u003cli\u003eYamashita R, Nishio M, Do RK, Togashi K. Convolutional neural networks: an overview and application in radiology. Insights Imaging. 2018;9(4):611-29.\u003c/li\u003e\n \u003cli\u003eBonny T, Al Nassan W, Obaideen K, Al Mallahi MN, Mohammad Y, El-Damanhoury HM. Contemporary role and applications of artificial intelligence in dentistry. F1000Res. 2023;12:1179.\u003c/li\u003e\n \u003cli\u003ePanahi O, Jabbarzadeh M. The expanding role of artificial intelligence in modern dentistry. 2025.\u003c/li\u003e\n \u003cli\u003eAlqutaibi AY, Algabri RS, Elawady D, Ibrahim WI. 
Advancements in artificial intelligence algorithms for dental implant identification: a systematic review with meta-analysis. J Prosthet Dent. 2023.\u003c/li\u003e\n \u003cli\u003eChaurasia A, Namachivayam A, Koca-Ünsal RB, Lee JH. Deep-learning performance in identifying and classifying dental implant systems from dental imaging: a systematic review and meta-analysis. J Periodontal Implant Sci. 2023;54(1):3.\u003c/li\u003e\n \u003cli\u003eDashti M, Londono J, Ghasemi S, Tabatabaei S, Hashemi S, Baghaei K, et al. Evaluation of accuracy of deep learning and conventional neural network algorithms in detection of dental implant type using intraoral radiographic images: a systematic review and meta-analysis. J Prosthet Dent. 2025;133(1):137-46.\u003c/li\u003e\n \u003cli\u003eIbraheem WI. Accuracy of artificial intelligence models in dental implant fixture identification and classification from radiographs: a systematic review. Diagnostics (Basel). 2024;14(8):806.\u003c/li\u003e\n \u003cli\u003ePark W, Huh JK, Lee JH. Automated deep learning for classification of dental implant radiographs using a large multi-center dataset. Sci Rep. 2023;13(1):4862.\u003c/li\u003e\n \u003cli\u003eHassan NA, Kamel AE, Omran AE, Gad MW, Ashraf NM, Ahmed OM, et al. Automated identification of dental implants: a new, fast and accurate artificial intelligence system. Eur J Prosthodont Restor Dent. 2023.\u003c/li\u003e\n \u003cli\u003eKong HJ, Yoo JY, Lee JH, Eom SH, Kim JH. Performance evaluation of deep learning models for the classification and identification of dental implants. J Prosthet Dent. 2023.\u003c/li\u003e\n \u003cli\u003eKim HS, Ha EG, Kim YH, Jeon KJ, Lee C, Han SS. Transfer learning in a deep convolutional neural network for implant fixture classification: a pilot study. Imaging Sci Dent. 2022;52(2):219-24.\u003c/li\u003e\n \u003cli\u003eTakahashi T, Nozaki K, Gonda T, Mameno T, Wada M, Ikebe K. Identification of dental implants using deep learning—pilot study. 
Int J Implant Dent. 2020;6:4.\u003c/li\u003e\n\u003c/ol\u003e\n\u003col start=\"18\"\u003e\n \u003cli\u003ePark W, Schwendicke F, Krois J, Huh JK, Lee JH. Identification of dental implant systems using a large-scale multicenter data set. J Dent Res. 2023;102(7):727-33.\u003c/li\u003e\n \u003cli\u003eLee JH, Kim YT, Lee JB, Jeong SN. A performance comparison between automated deep learning and dental professionals in classification of dental implant systems from dental imaging: a multi-center study. Diagnostics (Basel). 2020;10(11):910.\u003c/li\u003e\n \u003cli\u003eLee JH, Jeong SN. Efficacy of deep convolutional neural network algorithm for the identification and classification of dental implant systems, using panoramic and periapical radiographs: a pilot study. Medicine (Baltimore). 2020;99(26):e20787.\u003c/li\u003e\n \u003cli\u003eLee JH, Kim YT, Lee JB, Jeong SN. Deep learning improves implant classification by dental professionals: a multi-center evaluation of accuracy and efficiency. J Periodontal Implant Sci. 2022;52(3):220-9.\u0026nbsp;\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Artificial Intelligence, Deep Learning, Dental Implants, Dentists","lastPublishedDoi":"10.21203/rs.3.rs-7355779/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7355779/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cstrong\u003eBackground\u003c/strong\u003e: This study aimed to compare the diagnostic performance of two deep learning models (YOLOv8 and YOLOv10) in identifying dental implant systems (DISs) on panoramic radiographs and to assess the impact of AI assistance on the diagnostic accuracy of dental professionals.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eMethods:\u003c/strong\u003ePanoramic radiographs of patients who underwent implant treatment at Aydın Adnan Menderes University Faculty of Dentistry between January 2014 and April 2024 were retrospectively screened. A total of 380 radiographs, containing 1,143 implant images representing five DISs (NucleOSS T4, NucleOSS T6, Dentium Superline, Nobel Replace Tapered, and NTA Regular) were included. Images were annotated using Roboflow Annotate and underwent preprocessing and data augmentation. The final dataset comprised 810 images split into 80% training, 10% validation, and 10% test. YOLOv8 and YOLOv10 object detection algorithms were fine-tuned for DIS identification. Performance was assessed using confusion matrices, precision, recall, F1 score, and mean Average Precision (mAP). Twelve dental professionals with at least one year of implantology experience were included. 
Multiple-choice forms, with and without AI assistance, were completed by participants. Precision, recall, F1 score, and true positive (TP) rates were calculated to compare performance. Statistical analyses included the Shapiro–Wilk and Wilcoxon signed-rank tests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eResults\u003c/strong\u003e: Precision, recall, F1 score, and mAP for YOLOv8 were 0.94, 0.94, 0.94, and 0.96, respectively, while YOLOv10 achieved 0.94, 0.95, 0.95, and 0.97. Although overall accuracy was similar, YOLOv10 showed a narrower range of variation. Except for the recall of NucleOSS T4, AI significantly improved all metrics across all DIS groups (p\u0026lt;0.05). The overall diagnostic performance of participants was lower than that of the AI models.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConclusions:\u003cbr\u003e\n \u003c/strong\u003eDeep learning-based AI models showed high accuracy in classifying dental implant systems on panoramic radiographs, with YOLOv10 outperforming YOLOv8 in consistency. AI assistance improved participants’ diagnostic performance across all implant categories but did not exceed the standalone AI model, underscoring the impact of human–AI interaction and automation bias. Future work should focus on dataset diversity, multi-modal imaging, and clinician training to optimize AI integration in clinical practice.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTrial registration:\u003c/strong\u003e not applicable.\u003c/p\u003e","manuscriptTitle":"Artificial Intelligence and Dental Professionals’ Performance in the Identification of Dental Implant Systems: The Impact of Deep Learning Algorithms – A Diagnostic Accuracy Study","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-10-12 14:03:15","doi":"10.21203/rs.3.rs-7355779/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"6fdbfff5-7b93-44a7-b794-b75c8b7e3e1b","owner":[],"postedDate":"October 12th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-10-12T14:03:18+00:00","versionOfRecord":[],"versionCreatedAt":"2025-10-12 14:03:15","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-7355779","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7355779","identity":"rs-7355779","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}