AI for Classifying Oral Cancer and Precursor Lesions Using Visible-Light Photography

doi:10.21203/rs.3.rs-8865303/v1

2026 · doi:10.21203/rs.3.rs-8865303/v1

preprint OA: closed

Systematic Review

Charles Goodmaker, Rishi Bhandari, Anwar Tappuni, Tuan Pham

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-8865303/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted (Version 1)

Abstract

Artificial intelligence shows promise for oral cancer detection, yet clinical translation remains limited. This scoping review examined 134 studies (2015–2025) investigating AI applications for oral lesion classification using visible-light clinical photography. Searches across Scopus, Web of Science, Embase, and PubMed followed PRISMA-ScR guidelines. Methodological limitations were evident across studies: 25.4% utilised a single 131-image Kaggle dataset without ground-truth histological labelling, 99.3% employed supervised learning, and 8.2% performed external validation.
Binary classification tasks predominated (59.7%), while dysplasia grading was seldom explored (10.4%). Convolutional neural network architectures, such as ResNet, dominated study designs. Critical gaps include limited multi-modal and multi-model integration, the absence of ordinal classification approaches that reflect disease progression, the underexplored potential of novel deep-learning architectures such as graph-based mechanisms, and limited use of frontier techniques such as synthetic image generation to address data scarcity.

Keywords: Dentistry, Artificial intelligence, oral cancer, classification, clinical photography, scoping review

Introduction

Oral cancer carries a significant morbidity and mortality burden. Five-year survival rates have remained relatively static at around 50% for most countries 1 . The global incidence rate of oral cancer is rising by 2.7% per year; in 2009, there were 275,000 cases worldwide and 6,236 in the UK, making it the sixth most prevalent cancer globally 1 , 2 . In England alone, in 2022/23, there were 275,354 head and neck cancer referrals to two-week-wait pathways 3 , of which a significant portion were likely specific to the oral cavity. These incidence figures do not capture the full healthcare burden from diagnosis through treatment and long-term follow-up.

Oral cancer management, from initial precursor lesion identification to complete treatment, requires substantial healthcare resources. One systematic review found that oral leukoplakia (OL) progresses to oral squamous cell carcinoma (OSCC) in 7.2% of cases on average 4 . This transformation rate conceals the substantial level of care required across large patient populations to diagnose each cancer case, including repeated biopsies and follow-ups of oral potentially malignant disorders (OPMDs). It also underlines the potential impact that automated oral lesion classification and risk assessment could have on healthcare systems.
OPMDs are a set of oral conditions which have a higher rate of malignant transformation to OSCC, compared to benign lesions. These include oral leukoplakia, oral erythroplakia, oral submucous fibrosis (OSMF), and oral lichen planus (OLP). OSCC, the most common form of oral malignancy, does not, therefore, represent a binary benign-versus-malignant classification but rather exists along a pathophysiological spectrum 5 . Further, histopathology captures a single timepoint in this disease continuum, which both complicates lesion monitoring and creates opportunities for early intervention before progression to advanced stages. Crucially, many precursor lesions appear similar on visual inspection alone, despite varying malignant potential, with one study citing only 25% of lesions have a provisional clinical diagnosis which matches histological outcomes 6 . As a result, histopathological biopsy remains the gold standard; determining lesion type by observing architectural, cellular, genetic and biochemical transformations within tissues 7 . This diagnostic challenge necessitates the integration of technological approaches that can enhance detection and risk stratification of not only benign and malignant but also precursor oral lesions (representative examples of clinically distinct oral lesions spanning benign, potentially malignant, and malignant presentations are shown in Fig. 1 ). Significant disparities exist in both oral cancer incidence and survival across demographic groups, with social deprivation strongly associated with higher disease rates 1 , 8 . Established risk factors include tobacco smoking, alcohol consumption, betel quid chewing, and increasing age 9 , 10 . Microbiological factors such as Candida infection and human papillomavirus (HPV), along with broader oral microbiome composition, have also been implicated in cancer development 11 , 12 . 
These risk patterns, combined with unequal access to specialist diagnostic services, underscore the need for accessible diagnostic tools that can improve early detection across diverse healthcare settings.

Artificial intelligence (AI) has shown promise in parallel specialities such as dermatology, where the first class-IIa medical device for skin lesion classification is currently addressing high-volume referral burdens on UK clinical pathways 13 . Similarly, the challenge of classifying oral lesions requires consideration of clinical utility, explainability, and integration with existing workflows.

Artificial intelligence encompasses computational systems capable of performing tasks that typically require human intelligence. Within AI, machine learning (ML) represents algorithms that learn patterns from data without explicit programming 14 . Deep learning, a subset of machine learning, employs multi-layered artificial neural networks that can automatically extract hierarchical features from raw data 15 . In medical imaging, computer vision applies these algorithms to interpret visual information from images, enabling automated lesion detection and classification 16 . Advances in deep learning architectures have led to the development of convolutional neural networks (CNNs), which have become the dominant approach for medical image classification, using specialised layers to detect spatial patterns at multiple scales 17 , 18 . More recently, vision transformers have emerged as an alternative architecture that processes images as sequences of patches, enabling modelling of long-range dependencies across the entire image 19 . These models have demonstrated competitive performance on medical imaging tasks, though their application in oral pathology remains limited 20 . Translating these capabilities into practice requires readily accessible data and compatibility with existing workflows.
Therefore, this review focuses on visible-light intraoral clinical photography, a data modality routinely collected during standard management of oral lesions in the UK. By aligning with real-world clinical workflows, this review aims to uncover research gaps that can inform deployable clinical decision support systems.

Objectives

To map the evidence landscape of artificial intelligence in the classification and detection of oral cancer and its precursor lesions.
To focus on realistic and available data subtypes: visible-light photography and textual (structured or unstructured) clinical data.
To identify gaps in research and the barriers preventing scaled usage in healthcare.

Methodology

This study followed the PRISMA-ScR (scoping review extension) guidelines 21 ; a review protocol was not registered. Searches were conducted in Scopus, Web of Science, Embase, and PubMed from 2015–2025 using structured Boolean strings combining terms for (1) oral lesions and cancers, (2) AI and machine learning, and (3) visible-light or intraoral imaging modalities (supplementary table 1 provides the full search terms). Radiographic and histopathology-only studies were excluded. The search yielded 2,540 records, which were imported into Rayyan AI 22 review software and reduced to 1,940 after deduplication, then to 292 following title–abstract screening. Zotero 23 (v7.0.30) reference manager was used during a second-stage review in which full article texts were screened manually, resulting in 134 articles for inclusion. Figure 2 outlines the selection process and Table 1 outlines the inclusion/exclusion criteria. Conference papers were included because they often report work in emerging research fields, which is highly relevant to this scoping review.

Table 1. Scoping review inclusion and exclusion criteria.
Study Objective
  Inclusion: Studies that classify, detect, or predict oral potentially malignant disorders (OPMDs), dysplasia, or oral cancer.
  Exclusion: Studies focused solely on segmentation, feature extraction, or image enhancement without diagnostic or classification intent.

Study Type & Design
  Inclusion: Original research articles using Artificial Intelligence (AI), Machine Learning (ML), or Deep Learning (DL). Peer-reviewed journal articles or conference papers published from 2015–2025.
  Exclusion: Systematic reviews, meta-analyses, narrative reviews, editorials, or letters without new experimental results. Descriptive clinical studies or rule-based expert systems without ML/DL.

Population
  Inclusion: Human subjects with oral lesions or oral cancer.
  Exclusion: Studies using only animal data. Studies involving non-oral regions (e.g., skin, larynx, systemic cancers).

Data Modality
  Inclusion: Must use clinical photographic images (RGB / visible light) taken intraorally (e.g., DSLR, smartphone, or intraoral camera).
  Exclusion: Studies using only non-visible-light modalities (e.g., autofluorescence, OCT, NIR, hyperspectral, radiography, histopathology, MRI). Studies based purely on textual EHR, genomic data, or pathology slides.

Multimodal Inputs
  Inclusion: Studies incorporating non-image data (e.g., patient demographics, clinical history, lesion site, textual metadata) are included.
  Exclusion: N/A

AI Methodology
  Inclusion: Any algorithmic approach including traditional ML, CNNs, Vision Transformers, multimodal fusion, or hybrid systems (image + text/metadata).
  Exclusion: N/A

Reporting Standards
  Inclusion: Quantitative performance metrics must be provided (e.g., accuracy, sensitivity, specificity, AUC, F1).
  Exclusion: Studies with unavailable performance metrics, unclear image modality, or no dataset description.

Language & Access
  Inclusion: English-language publications with full text available (preprints acceptable if methodologically detailed).
  Exclusion: Non-English publications.
Dataset quality indicators included size, source, ground-truth labelling, and histopathological confirmation because data quality fundamentally determines whether reported performance will generalise to clinical settings, as established by TRIPOD-AI reporting standards 24 . Histopathological confirmation was specifically tracked as the clinical gold standard for oral lesion diagnosis 25 . Clinical tasks were categorised to assess alignment between research focus and clinical utility: binary classification (malignant vs. benign) versus multi-class classification including dysplasia grading (mild, moderate and severe), which influences clinical management 5 . Architectural AI design was categorised by model families (CNNs, vision transformers, graph neural networks) because architecture determines model capabilities and therefore clinical utility 26 . Validation was assessed by distinguishing internal validation from external validation on independent datasets collected at different institutions, which represents the gold standard for clinical generalisability 27 . Multimodal integration approaches combining imaging with clinical data were documented because this mirrors clinical decision-making processes 28 , while explainability methods were recorded as interpretability correlates with trust and regulatory approval 29 , 30 . All extracted variables were recorded in a Microsoft Excel file, where frequencies and graphs were also generated.

Results

In this scoping review, 134 studies published between 2015 and 2025 31–164 were included. Several excluded studies were relevant to oral cancer classification tasks, but either used a different modality type or covered tasks such as detection and segmentation without specifically classifying lesions; these are summarised in Table 2 .

Table 2. Excluded categories relevant to oral cancer classification tasks.
Hyperspectral imaging
  Description: Requires specialist imaging spectrometer capturing light across hundreds of narrow wavelength bands rather than standard RGB. Promising for histopathology slide analysis with limited in vivo application for tissue and lesion classification.
  References: 209–211

Autofluorescence imaging
  Description: Exploits tissue autofluorescence properties that change during malignant transformation due to alterations in fluorophores (e.g., collagen) and tissue architecture. Often combined with white light in dual-modality approaches. Requires specialist equipment not standard in UK oral health clinics.
  References: 212

Confocal microscopy
  Description: Laser scanning microscopy enabling high-resolution, non-invasive optical sectioning of tissue in vivo, providing near-histological images without biopsy. Requires specialist equipment with limited clinical accessibility.
  References: 213

Histopathological imaging
  Description: Microscopic analysis of tissue specimens focusing on cellular and architectural features. A substantial parallel field operating at microscopic rather than macroscopic clinical photography scale, with transferable methodological findings.
  References: 214–217

Tabular data only
  Description: Studies analysing only structured clinical variables without image data, including EHR demographics, risk factors, and histology. Novel approaches include CNN processing of tabular-to-image transformed data.
  References: 218,219

Radiological integration
  Description: Studies combining features from CT, MRI, or PET imaging alongside clinical photographs. Excluded as radiological imaging is not routinely integrated with clinical photography workflows in primary oral assessment.
  References: 220–222

Lesion localisation (detection and segmentation)
  Description: Studies performing bounding box detection or pixel-level segmentation without diagnostic classification output. Demonstrates object detection (YOLO, Faster R-CNN) and spatial reasoning architectures (U-Net, Mask R-CNN, transformers) useful as preprocessing for classification pipelines.
  References: 186,223–233

Datasets and Modalities

Many computer vision (CV) studies rely on open-source datasets, the quantity and quality of which are very limited in comparison to parallel fields such as dermatology. For example, one open-source dataset, Oral Cancer (Lips and Tongue) Images or OCI 165 , contains 131 images without specified histological ground-truth labelling and served as a data source in 25.4% of studies. Some studies have, however, managed to assemble relatively large private datasets; the largest reported number of images was 44,409 32 . The majority of studies (84.3%, n = 113) rely solely or primarily on clinical photographs, whereas 15.7% ( n = 21) of studies leverage a multi-modal approach (Table 3 ).

Table 3. Breakdown of additional data types in multimodal studies.

Additional data                  n     %
Clinical history/risk factors    9     6.7%
Demographics (age, sex)          8     6.0%
Histopathology images            8     6.0%
Lesion metadata (site/type)      5     3.7%
Radiological (CT/MRI)            4     3.0%
Other imaging (fluorescence)     4     3.0%
Text/NLP/EHR                     3     2.2%

Table 4. AI explainability approaches across the cohort of included studies.

Method             n     %
Any XAI reported   48    35.8%
No XAI reported    86    64.2%
Grad-CAM           30    22.4%
Attention maps     5     3.7%
Saliency maps      3     2.2%
SHAP               3     2.2%
LIME               1     0.7%

Where reported, the most common approach to managing multi-modal data in model training was early fusion ( n = 9). Feature-level fusion ( n = 7) and late fusion ( n = 2) were also mentioned among the cohort of papers.

Learning Paradigms & Tasks

Machine learning approaches can be categorised into supervised, unsupervised, semi-supervised, self-supervised and reinforcement learning 166 . These paradigms differ in how much labelled data a model sees during training, ranging from fully labelled data, through a smaller number of labelled images, to no labels at all.
This review found one study employing a semi-supervised training method, proposing a multi-scale random cropping self-training framework for oral cancer versus leukoplakia classification 71 , with an accuracy of 71.7%. Another study utilised a supervised “few-shot” approach to overcome data scarcity within the domain, using a Siamese network which produced matching scores for images across classes 135 and achieved 92% accuracy. Other approaches tended to focus on data augmentation methods (63.4%, n = 85). Further, one study was identified as having used synthetic oral lesion images created through Generative Adversarial Networks, achieving a diagnostic accuracy of 97% 114 . Within supervised learning, binary classification was the most common task (59.7%, n = 80; Fig. 3 ), followed by multiclass classification (41.8%, n = 56), with smaller proportions additionally exploring object detection (15.7%, n = 21) and semantic segmentation (6.7%, n = 9).

Preprocessing and Validation

Among the included studies, 63.4% ( n = 85) used data augmentation techniques such as image flipping (45.5%, n = 61), rotation (44.8%, n = 60), translation (17.9%, n = 24), scaling (13.4%, n = 18) and brightness alterations (13.4%, n = 18). The review found 4.5% ( n = 6) of studies employing Gaussian noise, an approach where random pixel-level variation is added to images to simulate real-world image degradation. Additionally, 41.0% ( n = 55) employed segmentation techniques to isolate regions of interest. Where specified, segmentation was of the Region of Interest (ROI) type and was primarily performed manually (22.4%, n = 30). A smaller number of studies employed automated segmentation pipelines with U-Net and YOLO models (3.7%, n = 5, and 2.2%, n = 3, respectively).

This review found that 91.0% ( n = 122) of studies used internal validation methods. Internal validation was defined as the use of random image-level or patient-level splits within a single dataset.
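The difference between image-level and patient-level splits matters because several photographs of the same patient can otherwise land on both sides of a split. The following is a minimal sketch of a patient-level K-fold split, not a method taken from any included study; the function name `patient_level_folds` and the toy image and patient identifiers are hypothetical.

```python
import random
from collections import defaultdict


def patient_level_folds(image_to_patient, k=5, seed=0):
    """Split image IDs into k folds so that no patient spans two folds."""
    # Group image IDs by the patient they belong to.
    by_patient = defaultdict(list)
    for image_id, patient_id in image_to_patient.items():
        by_patient[patient_id].append(image_id)

    # Shuffle patients (not images) reproducibly.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Deal patients round-robin into k folds; all of a patient's images follow.
    folds = [[] for _ in range(k)]
    for i, patient_id in enumerate(patients):
        folds[i % k].extend(by_patient[patient_id])
    return folds


# Hypothetical toy data: 12 images from 6 patients (2 images each).
image_to_patient = {f"img{i:02d}": f"patient{i // 2}" for i in range(12)}
folds = patient_level_folds(image_to_patient, k=3)

for held_out in range(len(folds)):
    test_ids = folds[held_out]
    train_ids = [img for j, f in enumerate(folds) if j != held_out for img in f]
    test_patients = {image_to_patient[i] for i in test_ids}
    train_patients = {image_to_patient[i] for i in train_ids}
    # No patient appears on both sides of the split.
    assert test_patients.isdisjoint(train_patients)
```

A random image-level split would shuffle the twelve image IDs directly, which gives no such guarantee.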
Further, 16.4% ( n = 22) employed a K-fold cross-validation approach to model development. Of all included studies, 8.2% ( n = 11) utilised external validation datasets and 4.1% ( n = 6) included multi-centre data, defined as data from ≥ 2 institutions. External validation was defined as testing on a dataset collected at a different institution or time period.

Model Architectures & Performance

Model architecture refers to the structural design of neural networks, determining how information flows through computational layers. Studies primarily employed pre-trained CNN architectures, leveraging transfer learning, a training strategy whereby models pre-trained on large-scale image datasets, such as ImageNet 167 , are fine-tuned on domain-specific medical images. Among CNN architectures, the ResNet family was most frequently employed, with ResNet-50 being the single most common model subtype (Fig. 4 ).

Several CNN-based studies demonstrated strong performance. Fu et al. (2020) achieved an area under the curve (AUC) of 0.983 on internal validation using DenseNet, with external validation supporting generalisability (AUC 0.935) 32 . Another study reported an AUC of 1.0 for OSCC classification using DenseNet-169 44 . Earlier work demonstrated the feasibility of transfer learning with fine-tuned architectures for oral lesion detection 56 . Promising results for mobile-based oral cancer classification using smartphone-acquired images were also observed 125 , highlighting potential for point-of-care screening.

More recently, vision transformer architectures have emerged in oral cancer clinical image classification 92 , with 14 studies deploying transformer-based architectures. A key study compared Vision Transformer (ViT) and Swin Transformer performance on mobile-acquired images 70 , while another directly compared vision transformers against CNNs for oral cancer lesion classification 54 .
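To make the AUC figures quoted above concrete, the sketch below computes AUC from its pairwise definition: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The labels and scores are invented for illustration and do not come from any included study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical test set: 3 malignant (1) and 3 benign (0) images with model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 pairs are ordered correctly
```

Under this definition an AUC of 1.0 means every positive case in that particular test set scored above every negative case; it says nothing by itself about how large or representative the test set was.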
Explainability Approaches

Explainable AI (XAI) 168 refers to methods designed to communicate the rationale behind model predictions; 35.8% ( n = 48) of studies described at least one XAI method. Saliency maps are a popular XAI method, showing important image regions identified by a model 169 . A specific form is Gradient-weighted Class Activation Mapping (Grad-CAM) 170 , which uses the gradients flowing into the final convolutional layer to generate a class-discriminative map. Grad-CAM was the predominant approach in the dataset (22.4%, n = 30) and was extensively employed to check that models focused on clinically relevant lesion areas, such as the tongue borders or ulcerated regions. Attention maps 171 were utilised (3.7%, n = 5) in newer transformer-based and hybrid architectures to dynamically weigh the importance of specific spatial features or multimodal inputs. Use of advanced model-agnostic quantitative frameworks like SHAP 172 or LIME 168 was rare. One study leveraged Case-Based Reasoning 108 to provide more physician-aligned decision support and explainability of model predictions.

Limitations

This scoping review has several limitations that should be considered when interpreting its findings. First, the search was limited to four databases (Scopus, Web of Science, Embase, and PubMed) and English-language publications, potentially missing relevant studies indexed elsewhere or published in other languages. Second, conference papers were included to capture emerging research, which may introduce findings that have not undergone full peer review. Third, formal quality assessment and meta-analysis were not performed. Given substantial heterogeneity in study designs, datasets, and evaluation protocols, pooled results may not be fully representative.
Finally, publication bias likely inflates reported performance, as studies with negative or null findings may be underrepresented. Discussion The field of oral cancer classification using AI has extended rapidly over the past decade. However, compared to parallel areas of clinical image analysis, real-world deployment is lagging. In dermatology, for example, advances in vision models have started to make their way into real world clinical workflows 173 . The paucity of large, open-source well-curated datasets 174 likely contributes to this mis-match. In dermatology, datasets such as the HAM10000 – a collection of over 10,000 annotated dermatoscopy images 175 and the International Skin Imaging Collaboration (ISIC) 2024 challenge dataset of 400,000 skin lesion image crops are readily available to researchers and developers alike. Assembling large oral image databases, however, is limited by slightly different challenges. Anatomical constraints can make adequate photography difficult without mirrors and specialist equipment. This is compounded by data privacy concerns – with the face, teeth and oral features considered partly, if not fully identifiable (an issue which is less of a concern for cropped lesions of the skin or other anatomical structures). Indeed, researchers have specifically investigated ways to resolve privacy issues through techniques like federated learning 176 and the generation of synthetic image datasets using a GAN framework 114 to address data scarcity. Next steps could include exploring diffusion-based generative models. Unlike GANs, diffusion models produce higher-fidelity images with greater diversity and fewer mode collapse issues. Recent work demonstrates that classifiers trained on synthetic diffusion-generated medical images can approach or match performance of those trained on real data. For oral cancer, this approach could enable creation of large, diverse, privacy-compliant training datasets. 
Within this environment of relative data scarcity, many studies have adopted binary classification tasks; determining malignant from benign lesions. However, real world oral lesions exist on a pathophysiological spectrum of benign to dysplastic to malignant 25 . Only 10.4% of studies have investigated the power of AI models to discriminate varying grades of dysplasia through to carcinoma. One paper took a robust approach of attempting dysplasia sub-grade prediction – collapsing the WHO classification down to a dichotomous outcome of high vs lower-risk dysplasia from images 177 , 178 . Moreover, only 6.9% of studies investigate three-way classification tasks (benign, OPMD/dysplasia and malignant). This may partly explain why reported performance gains have not consistently translated into deployed clinical tools – because classifying benign and malignant may not meet utility thresholds for clinicians, as perhaps would models which can detect hard-to-spot edge cases; such as high-risk dysplastic white patches in the oral cavity. More broadly, multimodal data integration will likely be pivotal in building models which can generate safe and stable risk stratification 92 , 113 , 122 , 129 , 152 , 160 ; accounting for limited clinical image datapoints they may be able to build richer representations of patient cases. A leading study in the field utilised smartphone clinical photographs fused with patient age, sex, and habit data using early feature concatenation, demonstrating improved performance over image-only models 129 . Another study used a heterogeneous graph framework integrating lesion photographs with structured clinical variables including demographics, risk factors, lesion characteristics, oral epithelial dysplasia grade, and longitudinal follow-up, to enable both diagnostic classification and time-dependent malignant transformation risk prediction 131 . 
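The early feature concatenation described above can be contrasted with late fusion in a few lines. This is a sketch under stated assumptions rather than the pipeline of any cited study: the image feature vector, the clinical encoding, the classifier weights and the two per-modality probabilities are all hypothetical placeholder values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode_clinical(age, sex, tobacco):
    """Encode clinical variables numerically: scaled age plus two 0/1 flags."""
    return [age / 100.0, 1.0 if sex == "M" else 0.0, 1.0 if tobacco else 0.0]


def early_fusion_input(image_features, clinical_features):
    """Early fusion: concatenate both modalities before a single classifier."""
    return list(image_features) + list(clinical_features)


def linear_classifier(features, weights, bias):
    """One logistic unit standing in for a trained classification head."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def late_fusion(prob_image, prob_clinical, w_image=0.5):
    """Late fusion: combine the outputs of two separately trained models."""
    return w_image * prob_image + (1.0 - w_image) * prob_clinical


# Hypothetical inputs: a 4-dimensional image embedding and three clinical variables.
image_features = [0.2, -1.3, 0.7, 0.05]
clinical = encode_clinical(age=62, sex="M", tobacco=True)

# Early fusion: one 7-dimensional vector, one classifier (placeholder weights).
fused = early_fusion_input(image_features, clinical)
weights = [0.5, -0.2, 0.8, 0.1, 1.0, 0.3, 0.6]
p_early = linear_classifier(fused, weights, bias=-1.0)

# Late fusion: average two model outputs (placeholder probabilities).
p_late = late_fusion(prob_image=0.70, prob_clinical=0.40)
```

In early fusion the classifier can learn interactions between image and clinical features, whereas late fusion only sees each model's final output; which works better for oral lesion data is an empirical question the sketch does not answer.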
Similarly, multi-model ensemble methods, whereby several different AI models are combined within a single classification workflow, have performed very well in dermatology image classification tasks, often outperforming single-model variants in international competitions 180 . Interestingly, no studies explored self-supervised learning approaches such as contrastive learning or masked image modelling, despite their demonstrated effectiveness in radiology 181 . Self-supervised pretraining methods - including contrastive learning (SimCLR, MoCo) 182 , 183 , masked autoencoders, and self-distillation approaches (DINO) - leverage large quantities of unlabelled images to learn visual representations. Such techniques could be applied to the potentially massive imaging datasets held in dental primary and secondary care whilst mitigating significant annotation burdens for specialist clinicians.

Convolutional neural network architectures dominate the field, with ResNet-50 being the most frequently employed model. While CNNs excel at local feature extraction, they may be suboptimal for tasks requiring spatial reasoning across tissue regions 184 . Graph neural networks, which can model relationships between tissue components and could potentially capture field cancerisation patterns through spatial connectivity modelling, remain essentially unexplored in oral lesion classification despite promising applications in other medical imaging domains 185 . Vision transformer architectures have started to be investigated in this field (2024–2025), demonstrating competitive or superior performance in preliminary studies 42 , 54 , 70 , 92 , 186 . However, the small number of transformer-based studies precludes robust comparative conclusions. The field would benefit from systematic head-to-head evaluations across standardised benchmarks to establish whether transformer architectures offer genuine advantages for oral lesion classification.
The emergence of vision-language foundation models represents a paradigm shift. These models, pre-trained on billions of image-text pairs, demonstrate competitive zero-shot performance on medical imaging tasks without domain-specific training 187 , 188 . Early studies evaluating multimodal large language models on oral lesion image classification show promise 94 , 189 , though systematic benchmarking against purpose-built classifiers remains limited. Real-world deployment requires seamless integration with existing clinical information systems, decision support that aligns with clinical workflows, and outputs that support rather than replace clinical judgement. Currently, no studies have evaluated clinical workflow feasibility, time-to-decision impacts, or clinician acceptance of AI-assisted diagnosis. Further, real-world adoption will likely be heavily dependent on robust explainability frameworks 30 . Gradient-weighted Class Activation Mapping (Grad-CAM) was the predominant explainability approach used throughout this scoping review (22.4%), followed by attention maps (3.7%) and saliency maps (2.2%). Advanced model-agnostic interpretability methods remained rare, with SHAP appearing in only 2.2% and LIME in 0.7% of studies. The integration of these approaches may reflect the rising saliency of explainability, but a wider plethora of techniques and intrinsic incorporation into model workflows remains open for exploration. Temporal risk stratification and longitudinal monitoring represent underexplored applications with substantial clinical value. Oral lesions require extended surveillance periods, with transformation to malignancy occurring over months to years 190 . AI systems capable of integrating sequential imaging data to model disease progression and predict transformation risk could substantially improve clinical management of oral potentially malignant disorders. 
Finally, uncertainty-aware models that provide calibrated confidence estimates would enable appropriate referral decisions and human-AI collaboration 191 . Current approaches largely provide point predictions without uncertainty quantification, limiting their utility for clinical decision-making where understanding prediction confidence is essential. Future Directions and Recommendations To accelerate the safe and effective use of AI for classifying oral cancer and precursor lesions from visible light photography, several methodological and translational priorities should be addressed. First, curation of large, open, and well-annotated datasets must be a central goal. Datasets should capture real-world heterogeneity in acquisition devices, illumination conditions, anatomical sites, and disease spectra (normal mucosa, potentially malignant disorders, dysplasia grades, and invasive cancer). Standardised annotation protocols—ideally including lesion boundaries, uncertainty labels, and longitudinal outcomes—would enable robust benchmarking, reproducibility, and fair comparison across models. Second, advanced image segmentation should be treated as a foundational step rather than an optional pre-processing task. Accurate delineation of lesions and surrounding mucosa can reduce background bias, improve feature localisation, and enhance interpretability. Contemporary approaches such as attention-guided segmentation 192 , weakly supervised 193 and self-supervised 194 segmentation, and multi-scale encoder–decoder architectures 195 can leverage limited pixel-level labels while remaining robust to annotation noise. Joint optimisation of segmentation and classification in end-to-end frameworks may further improve diagnostic performance. Third, data augmentation and synthesis require more principled development beyond basic geometric or photometric transformations. 
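For reference, the "basic geometric or photometric transformations" mentioned here, which are also the ones most often reported by the included studies, amount to operations like the following. This is a minimal sketch on a tiny greyscale image stored as nested lists; real pipelines operate on RGB arrays with an imaging library, and the probabilities, brightness range and noise level below are arbitrary assumptions.

```python
import random


def horizontal_flip(image):
    """Geometric: mirror each row left-to-right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Photometric: scale pixel intensities and clip to the 0-255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel, then clip to 0-255."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]


def augment(image, rng):
    """Apply a random flip, a random brightness change and pixel noise."""
    out = image
    if rng.random() < 0.5:                                # flip half the time
        out = horizontal_flip(out)
    out = adjust_brightness(out, rng.uniform(0.8, 1.2))   # +/- 20% brightness
    out = add_gaussian_noise(out, sigma=5.0, rng=rng)
    return out


# Hypothetical 2x3 greyscale image with intensities in 0-255.
image = [[10, 50, 90], [130, 170, 210]]
rng = random.Random(42)
augmented = augment(image, rng)
```

None of these operations knows anything about oral anatomy or clinical lighting, which is the gap the recommendation is pointing at.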
Domain-aware 196 and attribute-aware 197 augmentation, accounting for colour constancy, illumination variability, specular highlights, and anatomical plausibility, is critical in oral imaging. Generative models, including diffusion-based synthesis and conditional generative adversarial networks 198, offer promising avenues to enrich minority classes, simulate rare precursor lesions, and improve model generalisation, provided that safeguards against artefact learning and distribution shift are rigorously applied.

Fourth, class imbalance remains a defining challenge, as early-stage dysplasia and certain precursor lesions are under-represented relative to advanced disease or normal tissue. Addressing this requires a combination of strategies, including cost-sensitive 199, focal 200 and asymmetric 201 loss functions, informed resampling 202, and uncertainty-aware training 203. Importantly, evaluation metrics should prioritise clinical relevance, such as sensitivity for high-risk lesions and balanced accuracy, rather than overall accuracy alone.

Fifth, model fusion and ensemble learning should be systematically explored. Combining complementary representations, such as convolutional features, transformer-based global context, and graph-based relational modelling of lesion morphology, can improve robustness and reduce variance. Hybrid fusion strategies, including late-decision fusion 28, 204 tailored to majority and minority classes, may be particularly effective in clinically imbalanced settings and better reflect multi-factorial clinician reasoning.

Sixth, XAI is essential for clinical adoption. Models should provide transparent and reliable explanations at both the pixel and concept levels, highlighting diagnostically meaningful regions and features rather than spurious correlations.
Techniques such as attention visualisation 205, counterfactual explanations 206, and concept-based attribution 207 can support clinician trust, facilitate error analysis, and enable regulatory scrutiny. XAI should be evaluated not only for visual plausibility but also for clinical validity and consistency across patient subgroups.

Seventh, vision–language models (VLMs) 208 represent a promising direction for future research in oral cancer and precursor lesion classification from visible-light photography. By jointly learning from images and textual information, VLMs would enable the integration of visual lesion characteristics with clinical descriptors such as lesion morphology, anatomical site, risk factors, and provisional diagnoses. This multimodal representation may be particularly advantageous for early-stage and potentially malignant disorders, where visual features are subtle and subject to high inter-observer variability. Moreover, VLMs support weakly supervised, prompt-based, and few-shot learning paradigms, potentially reducing dependence on large, fully annotated datasets. Importantly, language grounding also offers a pathway toward more interpretable and clinically meaningful explanations, complementing pixel-level saliency methods. Future work should focus on domain-adapted VLM architectures tailored to intraoral imaging, robust handling of noisy or biased clinical text, and rigorous external validation to ensure generalisability and clinical trust.

Finally, future research should emphasise generalisation, fairness, and deployment readiness. External validation across institutions, devices, and populations is critical to mitigate bias and ensure equity. Prospective studies, human–AI interaction experiments, and integration with clinical workflows will be necessary to move from proof-of-concept models toward real-world impact.
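Two of the recommendations above, focal loss for class imbalance and clinically oriented evaluation metrics, can be illustrated with a minimal sketch. It assumes a binary task (high-risk versus low-risk lesion), a classifier that outputs a probability for the high-risk class, and the commonly used focal-loss settings gamma = 2 and alpha = 0.25, which are assumptions here and would need tuning for oral imaging data. The function names and the toy labels are hypothetical and do not come from any reviewed study.

```python
import math


def binary_focal_loss(p_positive, label, gamma=2.0, alpha=0.25):
    """Focal loss for a single image.

    p_positive: predicted probability of the positive (high-risk) class.
    label:      1 for high-risk, 0 for low-risk.
    gamma:      focusing parameter; 0 gives alpha-weighted cross-entropy.
    alpha:      weight of the positive class (assumed default, tune per dataset).
    """
    p_t = p_positive if label == 1 else 1.0 - p_positive
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of confidently correct predictions,
    # so hard and rare examples contribute more to training.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and balanced accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical screening set: 90 low-risk and 10 high-risk images, with a
# classifier that detects only 2 of the 10 high-risk lesions.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 90 + [1] * 2 + [0] * 8
metrics = screening_metrics(y_true, y_pred)
print(metrics)

# A confidently correct prediction is penalised far less than a missed lesion.
print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
```

In this toy example overall accuracy is 0.92 even though sensitivity for high-risk lesions is only 0.2 and balanced accuracy is 0.6, which is the kind of discrepancy that reporting accuracy alone would hide.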
Collectively, addressing these issues will help ensure that AI systems for oral cancer and precursor lesion classification are accurate, interpretable, and clinically meaningful.

Conclusion

This scoping review of 134 studies reveals a field that has achieved technically impressive results on narrow benchmarks but faces fundamental challenges for clinical translation. Current research is characterised by reliance on limited, inadequately validated datasets; near-exclusive focus on supervised learning with binary classification tasks; minimal external validation; and explainability implementations that serve publication rather than clinical needs. The critical gaps identified, particularly the absence of dysplasia grading research, lack of ordinal classification approaches reflecting disease progression, limited multimodal integration, and unexplored potential of graph-based architectures for spatial reasoning, represent priorities for future research. Addressing these gaps will require collaborative efforts to develop larger, well-annotated datasets with histological validation, standardised evaluation protocols enabling meaningful comparison across studies, and clinical workflow integration studies that move beyond technical accuracy to assess real-world impact.
Abbreviations

AI: Artificial Intelligence
AUC: Area Under the Curve
CNN: Convolutional Neural Network
CV: Computer Vision
EHR: Electronic Health Record
GAN: Generative Adversarial Network
GNN: Graph Neural Network
Grad-CAM: Gradient-weighted Class Activation Mapping
HPV: Human Papillomavirus
IQR: Interquartile Range
LIME: Local Interpretable Model-agnostic Explanations
ML: Machine Learning
OCI: Oral Cancer (Lips and Tongue) Images dataset
OED: Oral Epithelial Dysplasia
OL: Oral Leukoplakia
OLP: Oral Lichen Planus
OPMD: Oral Potentially Malignant Disorder
OSCC: Oral Squamous Cell Carcinoma
OSMF: Oral Submucous Fibrosis
PRISMA-ScR: Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews
ResNet: Residual Network
ROI: Region of Interest
SD: Standard Deviation
SHAP: SHapley Additive exPlanations
UK: United Kingdom
VGG: Visual Geometry Group
ViT: Vision Transformer
WHO: World Health Organization
XAI: Explainable AI

Declarations

Conflicts of Interest

None to declare.

Funding

There was no funding for this study.

References

Warnakulasuriya S (2009) Global epidemiology of oral and oropharyngeal cancer. Oral Oncol 45:309–316 Bray F et al (2024) Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 74:229–263 CWT USC referrals dashboard. https://nhsd-ndrs.shinyapps.io/cwt_referral_conversion_detection/ Guan J-Y et al (2023) Malignant transformation rate of oral leukoplakia in the past 20 years: A systematic review and meta-analysis. J Oral Pathol Med 52:691–700 Warnakulasuriya S, Johnson NW, van der Waal I (2007) Nomenclature and classification of potentially malignant disorders of the oral mucosa. J Oral Pathol Med 36:575–580 Coppola N et al (2021) Referral Patterns in Oral Medicine: A Retrospective Analysis of an Oral Medicine University Center in Southern Italy.
Int J Environ Res Public Health 18:12161 Walsh T et al (2021) Diagnostic tests for oral cancer and potentially malignant disorders in patients presenting with clinically evident lesions. Cochrane Database Syst. Rev. (2021) Conway DI et al (2008) Socioeconomic inequalities and oral cancer risk: A systematic review and meta-analysis of case‐control studies. Int J Cancer 122:2811–2819 Gandini S et al (2008) Tobacco smoking and cancer: a meta-analysis. Int J Cancer 122:155–164 Bagnardi V et al (2015) Alcohol consumption and site-specific cancer risk: a comprehensive dose-response meta-analysis. Br J Cancer 112:580–593 Kreimer AR, Clifford GM, Boyle P, Franceschi S (2005) Human Papillomavirus Types in Head and Neck Squamous Cell Carcinomas Worldwide: A Systematic Review. Cancer Epidemiol Biomarkers Prev 14:467–475 Perera M, Al-hebshi NN, Speicher DJ, Perera I, Johnson NW (2016) Emerging role of bacteria in oral carcinogenesis: a review with special reference to perio-pathogenic bacteria. J Oral Microbiol 8:32762 Marsden H et al (2024) Accuracy of an artificial intelligence as a medical device as part of a UK-based skin cancer teledermatology service. Front Med 11:1302363 Samuel AL (1959) Some Studies in Machine Learning Using the Game of Checkers. IBM J Res Dev 3:210–229 LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444 Shen D, Wu G, Suk H-I (2017) Deep Learning in Medical Image Analysis. Annu Rev Biomed Eng 19:221–248 Krizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60:84–90 Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 Dosovitskiy A et al (2020) An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. 
Preprint at https://doi.org/10.48550/ARXIV.2010.11929 Takahashi S et al (2024) Comparison of Vision Transformers and Convolutional Neural Networks in Medical Image Analysis: A Systematic Review. J Med Syst 48:84 Arksey H, O’Malley L (2005) Scoping studies: towards a methodological framework. Int J Soc Res Methodol 8:19–32 Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A (2016) Rayyan—a web and mobile app for systematic reviews. Syst Rev 5:210 Zotero | Your personal research assistant. https://www.zotero.org/ Collins GS et al (2024) TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 385:e078378 Muller S, Tilakaratne WM (2022) Update from the 5th Edition of the World Health Organization Classification of Head and Neck Tumors: Tumours of the Oral Cavity and Mobile Tongue. Head Neck Pathol 16:54–62 Khalifa M, Albadawy M (2024) AI in diagnostic imaging: Revolutionising accuracy and efficiency. Comput Methods Programs Biomed Update 5:100146 Collins GS et al (2014) External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol 14:40 Huang S-C, Pareek A, Seyyedi S, Banerjee I, Lungren MP (2020) Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines. Npj Digit Med 3:136 US Food and Drug Administration (2025) Artificial intelligence and machine learning (AI/ML)-enabled medical devices. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices Cinà G, Röber TE, Goedhart R, Birbil Şİ (2025) Why we do need explainable AI for healthcare.
Diagn Progn Res 9:24 Sharma D, Kudva V, Patil V, Kudva A, Bhat RS (2022) A Convolutional Neural Network Based Deep Learning Algorithm for Identification of Oral Precancerous and Cancerous Lesion and Differentiation from Normal Mucosa: A Retrospective Study. Eng Sci 18:278–287 Fu Q et al (2020) A deep learning algorithm for detection of oral cavity squamous cell carcinoma from photographic images: A retrospective study. eClinicalMedicine 27:100558 J A et al (2024) A Deep Learning System to Predict Epithelial Dysplasia in Oral Leukoplakia. J Dent Res 103:1218–1226 Upadhyay D, Manwal M, Kukreja V, Sharma RA, Fine (2024) -Tuned Yolov5 and Exception Model for Oral Cancer Detection. https://doi.org/10.1109/INCET61516.2024.10592942 doi:10.1109/INCET61516.2024.10592942 Saini A, Guleria K, Sharma SA, Pre-trained (2023) MobileNetV2 Model for Oral Cancer Classification. https://doi.org/10.1109/IDICAIEI58380.2023.10406692 doi:10.1109/IDICAIEI58380.2023.10406692 Qu JA, Remote Network (2024) Transmission Diagnosis Method for Oral Cancer Based on 6G and Rough Set Theory Hierarchical Diagnosis. Wirel Pers Commun. https://doi.org/10.1007/s11277-024-11242-9 Baliarsingh SK, Dev PP, Bandyopadhyay A, Dash AK, Pradhan R (2024) A Smartphone-based Deep Learning Framework for Early Detection of Oral Cancer Signs. 181–186. 10.1109/ESIC60604.2024.10481662 Liu P, Bagi K (2025) A tailored deep learning approach for early detection of oral cancer using a 19-layer CNN on clinical lip and tongue images. Sci Rep 15 Pradhan P (2025) Accuracy of ChatGPT 3.5, 4.0, 4o and Gemini in diagnosing oral potentially malignant lesions based on clinical case reports and image recognition. Med Oral Patol Oral Cir Bucal 30:e224–e231 Kabir MF, Ahmad MY, Uddin R, Cordero M, Kant S (2025) Accurate and lightweight oral cancer detection using SE-MobileViT on clinically validated image dataset. 
Discov Artif Intell 5 Hadjouni M et al (2023) Advanced Meta-Heuristic Algorithm Based on Particle Swarm and Al-Biruni Earth Radius Optimization Methods for Oral Cancer Detection. Ieee Access. https://doi.org/10.1109/access.2023.3253430 Vinayahalingam S et al (2024) Advancements in diagnosing oral potentially malignant disorders: leveraging Vision transformers for multi-class detection. Clin ORAL Investig. 28 Talwar V et al (2023) AI-Assisted Screening of Oral Potentially Malignant Disorders Using Smartphone-Based Photographic Images. Cancers 15 Warin K et al (2022) AI-based Analysis of Oral Lesions Using Novel Deep Convolutional Neural Networks for Early Detection of Oral Cancer. PLoS ONE. https://doi.org/10.1371/journal.pone.0273508 Rai V et al (2024) AI-Driven Smartphone Screening for Early Detection of Oral Potentially Malignant Disorders. https://doi.org/10.1109/ICONSTEM60960.2024.10568597 doi:10.1109/ICONSTEM60960.2024.10568597 Santhiya M, Sindhuja M, Jegatha R, Manikandan J (2023) An Effective Automated Framework for Oral Cancer Detection by Enhanced. https://doi.org/10.1109/ICoAC59537.2023.10249983 . Convolutional Neural Networks Nanditha BR, Kiran GA, Chandrashekar HS, Dinesh MS, Murali S (2021) An Ensemble Deep Neural Network Approach for Oral Cancer Screening. Int J ONLINE Biomed Eng 17:121–134 Aftab J et al (2025) Artificial intelligence based classification and prediction of medical imaging using a novel framework of inverted and self-attention deep neural network architecture. Sci Rep 15 Schmidl B et al (2025) Artificial intelligence for image recognition in diagnosing oral and oropharyngeal cancer and leukoplakia. Sci Rep 15 Surenthar M, Gunaseelan R, Balasubramaniam A, Elakiya J (2025) Artificial Intelligence in the Screening of Oral Cancer: A Cross–Sectional Study on a Novel App–Based Approach for Primary Health Care Settings. 
J Indian Acad Oral Med Radiol 37:215–220 Ramesh E, Ganesan A, Lakshmi KC, Natarajan PM (2025) Artificial intelligence—based diagnosis of oral leukoplakia using deep convolutional neural networks Xception and MobileNet-v2. Front Oral Health 6:1414524 Alabdan R, Alruban A, Mustafa Hilal AM, Motwakel A (2023) Artificial-Intelligence-Based Decision Making for Oral Potentially Malignant Disorder Diagnosis in Internet of Medical Things Environment. Healthc Switz 11 Patel A et al (2024) Attention-guided convolutional network for bias-mitigated and interpretable oral lesion classification. Sci Rep 14 Chilet-Martos E, Vila-Francés J, Bagán-Sebastián JV, Vives-Gilabert Y (2025) Automated classification of oral cancer lesions: Vision transformers vs radiomics. Comput Biol Med 189 C S-S et al (2025) Automated classification of oral potentially malignant disorders and oral squamous cell carcinoma using a convolutional neural network framework: a cross-sectional study. Lancet Reg Health Am 47:101138 Welikala RA et al (2020) Automated detection and classification of oral lesions using deep learning for early detection of oral cancer. IEEE Access 8:132677–132693 Tanriver G, Soluk Tekkesin M, Ergen O (2021) Automated detection and classification of oral lesions using deep learning to detect oral potentially malignant disorders. Cancers 13:2766 Shamim MZM et al (2022) Automated Detection of Oral Pre-Cancerous Tongue Lesions Using Deep Learning for Early Diagnosis of Oral Cavity Cancer. Comput J 65:91–104 Manikandan J, Krishna BV, Varun N, Vishal V, Yugant S (2023) Automated Framework for Effective Identification of Oral Cancer Using Improved. https://doi.org/10.1109/ICONSTEM56934.2023.10142794 . Convolutional Neural Network Warin K, Limprasert W, Suebnukarn S, Jinaporntham S, Jantana P (2021) Automatic Classification and Detection of Oral Cancer in Photographic Images Using Deep Learning Algorithms. J Oral Pathol Med. 
https://doi.org/10.1111/jop.13227 Song B et al (2018) Automatic classification of dual-modalilty, smartphone-based oral dysplasia and malignancy images using deep learning. Biomed Opt Express 9:5318–5329 Begum SH, Vidyullatha P (2023) Automatic Detection and Classification of Oral Cancer from Photographic Images Using Attention Maps and Deep Learning. Int J Intell Syst Appl Eng 11:221–229 H L, H, C., L, W., J, S., J L (2021) Automatic detection of oral cancer in smartphone-based images using deep learning for early diagnosis. J Biomed Opt 26 Sundari TS, Maheshwari M (2025) Automatic oral cancer detection using deep learning techniques. Biomed Signal Process Control 106 Song B et al (2021) Bayesian deep learning for reliable oral cancer image classification. Biomed Opt Express 12:6422–6430 Islam MM, Alam KMR, Uddin J, Ashraf I, Samad MA (2023) Benign and Malignant Oral Lesion Image Classification Using Fine-Tuned Transfer Learning Techniques. Diagnostics 13:3360 Vijaya J, Rishabh K, Parashar P, CanScan (2023) Non-Invasive Techniques for Oral Cancer Detection. https://doi.org/10.1109/ELEXCOM58812.2023.10370254 doi:10.1109/ELEXCOM58812.2023.10370254 al-Ali A et al (2025) CLASEG: advanced multiclassification and segmentation for differential diagnosis of oral lesions using deep learning. Sci Rep 15 Song B et al (2021) Classification of imbalanced oral cancer image data from high-risk population. J Biomed Opt 26 Song B et al (2024) Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer. CANCERS 16 Hamada I et al (2025) Classification of Oral Cancer and Leukoplakia Using Oral Images and Deep Learning with Multi-Scale Random Crop Self-Training. Int Conf Pattern Recognit Appl Methods 1:780–787 Goswami B, Bhuyan MK, Alfarhood S, Safran M (2024) Classification of Oral Cancer Into Pre-Cancerous Stages From White Light Images Using LightGBM Algorithm. 
IEEE Access 12:31626–31639 Goswami B, Neogi S, Nagar SR, Punjabi N, Gudi R (2025) Classification of Oral Potentially Malignant Disorders Using Multimodal Feature Integration. Proc. - Int. Symp. Biomed. Imaging https://doi.org/10.1109/ISBI60581.2025.10980715 doi:10.1109/ISBI60581.2025.10980715 Welikala RA et al (2021) Clinically Guided Trainable Soft Attention for Early Detection of Oral Cancer. Lect Notes Comput Sci 13052:226–236 Rathi S, Puranik A, Pratham S, Kulkarni V, Chincholkar H (2024) Comparative Analysis of CNN Architectures for Enhancing Oral Cancer Detection Using Advanced Image Processing Techniques. https://doi.org/10.1109/ICCUBEA61740.2024.10774894 doi:10.1109/ICCUBEA61740.2024.10774894 Bourass Y, Zouaki H, Bahri A (2016) Computer-Aided diagnostics of facial and oral cancer. https://doi.org/10.1109/ICoCS.2015.7483252 doi:10.1109/ICoCS.2015.7483252 Shruthi K et al (2024) Convolutional Neural Network For Detection Of Oral Cavity Leading To Oral Cancer From Photographic Images. Int J Comput Digit Syst 15:865–877 Wei X et al (2024) Convolutional neural network for oral cancer detection combined with improved tunicate swarm algorithm to detect oral cancer. Sci Rep 14 Camalan S et al (2021) Convolutional Neural Network-Based Clinical Predictors of Oral Dysplasia: Class Activation Map Analysis of Deep Learning Results. CANCERS 13 Lim JH et al (2021) D’OraCa: Deep Learning-Based Classification of Oral Lesions with Mouth Landmark Guidance for Early Detection of Oral Cancer. Lect Notes Comput Sci 12722:408–422 Lee Y-H et al (2025) DCNN models with post-hoc interpretability for the automated detection of glossitis and OSCC on the tongue. Sci Rep 15 Sankaradass V, Devasenan R, Manish M, Gurunamasivayam VK, M., Govindasamy C (2025) Deep Learning Algorithms in Oral Lesion Diagnosis: Innovations in Image-Based optimization for Cancer Detection and. https://doi.org/10.1109/ICDSAAI65575.2025.11011565 . 
Differential Diagnosis Heo J et al (2022) Deep learning model for tongue cancer diagnosis using endoscopic images. Sci Rep 12 Su A-Y, Wu M-L, Wu Y-H (2025) Deep learning system for the differential diagnosis of oral mucosal lesions through clinical photographic imaging. J Dent Sci 20:54–60 Ormeño-Arriagada P, Navarro E, Taramasco C, Gatica G, Vasconez JP (2025) Deep Learning Techniques for Oral Cancer Detection: Enhancing Clinical Diagnosis by ResNet and DenseNet Performance. Commun Comput Inf Sci 2236:59–72 Alzahrani AA et al (2025) Deep structured learning with vision intelligence for oral carcinoma lesion segmentation and classification using medical imaging. Sci Rep 15 Marzouk R et al (2022) Deep Transfer Learning Driven Oral Cancer Detection and Classification Model. Comput Mater Contin 73:3905–3920 Kumar A, Sharma N (2023) Detection and Classification of Oral Cancer Using Machine Learning Models. 522–528. 10.1109/ICTACS59847.2023.10390071 Kavyashree C, Vimala HS, J J, Detection (2024) and segmentation of oral lesion using Mask R-CNN. https://doi.org/10.1109/ACOIT62457.2024.10939183 doi:10.1109/ACOIT62457.2024.10939183 Song H-J et al (2023) Detection of Abnormal Changes on the Dorsal Tongue Surface Using Deep Learning. Med -Lith 59 La Mantia G et al (2024) Detection of Elementary White Mucosal Lesions by an AI System: A Pilot Study. Oral 4:557–566 Flügge T et al (2023) Detection of oral squamous cell carcinoma in clinical photographs using a vision transformer. Sci Rep 13 Kumar S, Pratap A, Saxena I (2025) Development of Oral Cancer Detection Technique: A Comprehensive Approach Using CNNs and TensorFlow Lite. 934–938. 10.1109/CICTN64563.2025.10932593 LA V et al (2025) Diagnostic Performance of ChatGPT-4o in Analyzing Oral Mucosal Lesions: A Comparative Study with Experts. Med Kaunas Lith 61 al MADO (2024) et. Diagnóstico de cáncer oral mediante algoritmos de aprendizaje profundo. Ingenius Rev Cienc Tecnol. 
https://doi.org/10.17163/ings.n32.2024.06 Jurczyszyn K, Kozakiewicz M (2019) Differential diagnosis of leukoplakia versus lichen planus of the oral mucosa based on digital texture analysis in intraoral photography. Adv Clin Exp Med 28:1469–1476 Gürses BO et al (2025) Differentiation of benign and malignant oral lesions through surface texture analysis and SVM modeling. Clin Oral Investig 29 Pahadiya P, Vijay R, Gupta KK, Saxena S, Shahapurkar T (2023) Digital Image Based Segmentation and Classification of Tongue Cancer Using CNN. Wirel Pers Commun 132:609–627 Wang W, Liu Y, Wu J (2023) Early diagnosis of oral cancer using a hybrid arrangement of deep belief networkand combined group teaching algorithm. Sci Rep 13 Lee S-J et al (2024) Enhancing deep learning classification performance of tongue lesions in imbalanced data: mosaic-based soft labeling with curriculum learning. BMC Oral Health 24 da Silva AVB et al (2024) Enhancing Explainability in Oral Cancer Detection with Grad-CAM Visualizations. Lect Notes Comput Sci 14813:151–164 Shuaib M et al (2025) Enhancing Oral Cancer Diagnosis through Attention Mechanisms and Explainable AI: A VGG-19 with CBAM Integration. https://doi.org/10.1109/CONIT65521.2025.11166859 Attarde AV et al (2024) Enhancing Oral Cancer Screening with Deep Learning Algorithms. https://doi.org/10.1109/ICCSC62048.2024.10830423 Chakraborty P, Saha N, Nath S, Das K (2025) Enhancing Oral Disease Classification using Dilated Convolutional Neural Network. 127–130. 10.1109/IEEECONF64992.2025.10963041 Sharma A, Gupta S, Abbas HM (2025) Evaluating Deep Neural Networks for Oral Cancer Prediction: A Study Using ResNet50 and DenseNet121. https://doi.org/10.1109/OTCON65728.2025.11070632 doi:10.1109/OTCON65728.2025.11070632 G K, İŞ HY, B., F, N. P., Ö Ç (2025) Evaluation of the Detectability of Oral Potentially Malignant Diseases with a Deep Learning Approach: A Retrospective Pilot Study. J Imaging Inf Med. 
https://doi.org/10.1007/s10278-025-01665-6 Yadav DP, Sharma B, Noonia A, Mehbodniya A (2025) Explainable label guided lightweight network with axial transformer encoder for early detection of oral cancer. Sci Rep 15 Cimino MGCA et al (2025) Explainable screening of oral cancer via deep learning and case-based reasoning. Smart Health 35 Welikala RA et al (2020) Fine-Tuning Deep Learning Architectures for Early Detection of Oral Cancer. Lect Notes Comput Sci 12508:25–31 Rabinovici-Cohen S et al (2024) From Pixels to Diagnosis: Algorithmic Analysis of Clinical Oral Photos for Early Detection of Oral Squamous Cell Carcinoma. CANCERS 16 Shamim MZM (2022) Hardware Deployable Edge-AI Solution for Prescreening of Oral Tongue Lesions Using TinyML on Embedded Devices. IEEE Embed Syst Lett 14:183–186 Reddy MR, Saritha KN, Reddy A, Nagaratnamaiah PA, C., Sandhya TK (2025) Hybrid Deep Learning Framework for Real-Time Oral Cancer Detection and Prevention Using Multi-Model CNN Integration. https://doi.org/10.1109/ISAC364032.2025.11156758 doi:10.1109/ISAC364032.2025.11156758 Parola M et al (2023) Image-Based Screening of Oral Cancer via Deep Ensemble Architecture. 1572–1578. 10.1109/SSCI52147.2023.10371865 Lee YO, Kim J, Lee JW (2025) Improving Diagnostic Accuracy for Oral Cancer with inpainting Synthesis Lesions Generated Using Diffusion Models. 10.48550/arXiv.2508.06151 Parola M et al (2025) Improving oral cancer classification via segment-driven photographic deep learning imaging. https://doi.org/10.1109/CISMCompanion65074.2025.11032552 doi:10.1109/CISMCompanion65074.2025.11032552 Karthikeyan B et al (2024) Design and Development of an Oral Cancer Identification Methodology based on Improved Neural Classification Scheme. 411–416. 10.1109/ICSCSA64454.2024.00072 Bordoloi D, Joshi K, Kukreja V, Sharma R (2024) Innovative Approaches in Oncology: YOLOv5 and EfficientNet for Improved. https://doi.org/10.1109/APCIT62007.2024.10673457 . 
Oral Cancer Diagnosis Kavyashree C, Vimala HS (2025) Instance Segmentation of Oral Cancer Images with Fusion of Swin Transformer and Mask RCNN. J Innov Image Process 7:695–706 Alanazi AA, Khayyat MM, Khayyat MM, Elnaim BM (2022) & Abdel-khalek, S. Intelligent Deep Learning Enabled Oral Squamous Cell Carcinoma Detection and Classification Using Biomedical Images. Comput. Intell. Neurosci. (2022) Chen R, Wang Q, Huang X (2024) Intelligent deep learning supports biomedical image detection and classification of oral cancer. Technol Health Care 32:S465–S475 Figuera K et al (2022) Interpretable deep learning approach for oral cancer classification using guided attention inference network. J Biomed Opt 27:015001 Dinesh Y, Ramalingam K, Ramani P, Deepak RM (2023) Machine Learning in the Detection of Oral Lesions With Clinical Intraoral Images. Cureus https://doi.org/10.7759/cureus.44018 doi:10.7759/cureus.44018 Schwärzler J et al (2025) Machine learning versus clinicians for detection and classification of oral mucosal lesions. J Dent 161 Liyanage V, Tao M, Park S, Wang JS, K. N., Azimi S (2023) Malignant and non-malignant oral lesions classification and diagnosis with deep neural networks. J Dent 137 Song B et al (2021) Mobile-based oral cancer classification for point-of-care screening. J Biomed Opt 26 Rashid J et al (2024) Mouth and oral disease classification using InceptionResNetV2 method. Multimed Tools Appl 83:33903–33921 Özen BB, Karadaş F, Ba Alawi A (2024) Multi-Model Stacking Ensemble Approach for Improving Oral Cancer Diagnosis. https://doi.org/10.1109/SIU61531.2024.10600983 doi:10.1109/SIU61531.2024.10600983 Redondo A et al (2026) Multiclass classification of oral mucosal lesions by deep learning from clinical images without performing any restrictions. Biomed Signal Process Control 111 Devindi GAI et al (2024) Multimodal Deep Convolutional Neural Network Pipeline for AI-Assisted Early Detection of Oral Cancer. 
IEEE Access 12:124375–124390 Zhang J, Tian Y, Song B, Lee E-J (2025) Multimodal Learning for Enhanced Detection in Oral Cancer Screening. Proc. - Int. Symp. Biomed. Imaging https://doi.org/10.1109/ISBI60581.2025.10981235 doi:10.1109/ISBI60581.2025.10981235 Li J et al (2025) Next-generation AI framework for comprehensive oral leukoplakia evaluation and management. Npj Digit Med 8 Razmjouei P et al (2025) NFR-EDL: Non-linear fuzzy rank-based ensemble deep learning for accurate diagnosis of oral and dental diseases using RGB color photography. Comput Biol Med 192 Huang Q, Ding H, Razmjooy N (2023) Optimal deep learning neural network using ISSA for diagnosing the oral cancer. Biomed Signal Process Control 84 Sekaran R, Manikandan M, Suliman W, Ravi V (2024) Optimizing Oral Cancer Diagnosis with Advanced Deep Learning Approaches. https://doi.org/10.1109/ISAECT64333.2024.10799545 doi:10.1109/ISAECT64333.2024.10799545 Rajan A, Oviya IR (2024) Oral Cancer Classification Using Few-Shot Learning with CNN and Siamese Networks. https://doi.org/10.1109/INDICON63790.2024.10958513 doi:10.1109/INDICON63790.2024.10958513 Khan SUR, Asif S (2024) Oral cancer detection using feature-level fusion and novel self-attention mechanisms. Biomed Signal Process Control 95:106437 Swamikannan LD et al (2024) Oral Cancer Detection Using Mobile Vision Technology. https://doi.org/10.1109/BHI62660.2024.10913489 doi:10.1109/BHI62660.2024.10913489 Chai Y, Chai X, Zhang L, Ye G (2025) & Rashid Sheykhahmad, F. R. Oral cancer detection via Vanilla CNN optimized by improved artificial protozoa optimizer. Sci Rep 15 Saxena K et al (2025) Oral Cancer Detection with Customized Deep Neural Network Based Transfer Learning Technique: A Comprehensive 2-D Image Analysis. https://doi.org/10.1109/APSIT63993.2025.11086279 Medapati MP, Ahmed A, Subash K, K., Aeron A (2024) Oral Cancer Detections and Classification Using Region Based Convolutional Neural Network. 
https://doi.org/10.1109/TQCEBT59414.2024.10545203 Zhang L, Shi R, Yousefi N (2024) Oral cancer diagnosis based on gated recurrent unit networks optimized by an improved version of Northern Goshawk optimization algorithm. Heliyon 10 Teo AHA, Goh CP (2024) Oral Disease Image Detection System Using Transfer Learning. 194–198. https://doi.org/10.1109/ICDXA61007.2024.10470514 Singh R, Sharma N, Rajput K, Singh M (2024) Oral Lesions Classification Using EfficientNet Transfer Learning Model. https://doi.org/10.1109/WCONF61366.2024.10692037 Nanditha BR, Kiran GA, Chandrashekar HS, Dinesh MS, Murali S (2020) Oral Malignancy Detection Using Color Features from Digital True Color Images. Int J Online Biomed Eng 16:95–106 Xie F et al (2024) Oral mucosal disease recognition based on dynamic self-attention and feature discriminant loss. Oral Dis 30:3094–3107 Bhopal M, Ranjan R (2023) Oral Tumor Detection based on Convolution Neural Network. https://doi.org/10.1109/INCOFT60753.2023.10425572 Divya S, Oviya IR, PrasannaKumar R (2024) Oralnet: A Deep Learning Model for Automated Oral Cancer Detection. https://doi.org/10.1109/INDICON63790.2024.10958407 Asif S, Wang VY, Xu D (2025) OralTransNet: A novel hybrid model integrating transformer attention and CNN features for accurate diagnosis of mouth and oral diseases. Eng Appl Artif Intell 159 Kumar M et al (2025) Performance Evaluation of Large Language Models in Detecting Buccal Mucosal Lesions Using Smartphone-Based Imaging. J Pioneer Med Sci 14:102–107 Khovidhunkit S-OP et al (2025) Performance of deep learning models for the classification and object detection of different oral white lesions using photographic images. Sci Rep 15 Uthoff RD et al (2018) Point-of-care, smartphone-based, dual-modality, dual-view, oral cancer screening device with neural network classification for low-resource communities.
PLoS ONE 13:e0207493 Wuttisarnwattana P et al (2024) Precise Identification of Oral Cancer Lesions Using Artificial Intelligence. Stud Health Technol Inf 316:1096–1097 Nooralli IM, Patil SB, Kulkarni G (2024) Predictive Analytics for Tongue Diseases: A Comparative Study of Deep Learning Models. https://doi.org/10.1109/INNOVA63080.2024.10847037 doi:10.1109/INNOVA63080.2024.10847037 Dissorn P et al (2023) Preprocessing Technique for Oral Lesion Classification using U-NET Segmentation. https://doi.org/10.1109/BMEiCON60347.2023.10322048 doi:10.1109/BMEiCON60347.2023.10322048 Pradeep Singh SM et al (2023) Real Time Oral Cavity Detection Leading to Oral Cancer using CNN. https://doi.org/10.1109/NMITCON58196.2023.10275851 Al Duhayyim M et al (2023) Sailfish Optimization with Deep Learning Based Oral Cancer Classification Model. Comput Syst Sci Eng 45:753–767 Desai KM et al (2025) Screening of oral potentially malignant disorders and oral cancer using deep learning models. Sci Rep 15 Zahran FM, el-Din YS, Azab NA (2025) The Application of Artificial-based Models to Classify Oral Cavity Findings Based on Clinical Image Analysis. Adv Dent J 7:627–641 Phosri K et al (2022) The Comparison of Deep Learning Model Efficiency for Classification of Oral White Lesions. 235–238. 10.1109/ITC-CSCC55581.2022.9894916 Parola M et al (2024) Towards explainable oral cancer recognition: Screening on imperfect images via Informed Deep Learning and Case-Based Reasoning. Comput Med Imaging Graph. 117 Araújo ALD et al (2025) Two-step pipeline for oral diseases detection and classification: a deep learning approach. Front Oral Health 6:1659323 Gomes R et al (2023) Use of Artificial Intelligence in the Classification of Elementary Oral Lesions from Clinical Images. Int J Environ Res Public Health 20:3894 Ye Y-J, Han Y, Liu Y, Guo Z-L, Huang M-W (2024) Utilizing deep learning for automated detection of oral lesions: A multicenter study. Oral Oncol. 
155 Vemulapalli L, Kola AVS, Ravuri CC, Kanagala A (2025) White Light Medical Image Based Oral Cancer Diagnosis Using an Ensemble Deep Learning Model. 1132–1137. 10.1109/ICCSAI64074.2025.11063769 Oral Cancer (Lips and Tongue) images. https://www.kaggle.com/datasets/shivam17299/oral-cancer-lips-and-tongue-images Sarker IH (2021) Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput Sci 2:160 Fei-Fei L Knowledge transfer in learning to recognize visual objects classes Ribeiro MT, Singh S, Guestrin C (2016) ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. Preprint at https://doi.org/10.48550/arXiv.1602.04938 Simonyan K, Vedaldi A, Zisserman A (2014) Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. Preprint at https://doi.org/10.48550/arXiv.1312.6034 Selvaraju RR et al (2020) Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Int J Comput Vis 128:336–359 al SHB (2023) et. Automatic Detection and Classification of Oral Cancer from Photographic Images Using Attention Maps and Deep Learning. Int J Intell Syst Appl Eng Lundberg S, Lee S-I (2017) A Unified Approach to Interpreting Model Predictions. Preprint at https://doi.org/10.48550/arXiv.1705.07874 Brown AC, Salmon PJM, Leffell DJ, Ko JM, Grant-Kels JM (2022) Artificial intelligence in the detection of skin cancer. J Am Acad Dermatol 87:1336–1342 Sengupta N, Sarode S, Sarode G, Ghone U (2022) Scarcity of publicly available oral cancer image datasets for machine learning research. Oral Oncol 126 Tschandl P, ViDIR Group (2018) &. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. 683406, 1403566547, 1366522108, 10808743, 421046860, 976058, 830369, 129493 Harvard Dataverse https://doi.org/10.7910/DVN/DBW86T Firdaus N, Raza Z (2025) Enhancing Privacy in Oral Cancer Detection through Federated Learning: A Cross-Institutional Study. 
Procedia Comput Sci 260:1113–1120 Shephard AJ et al (2025) Development and validation of an artificial intelligence-based pipeline for predicting oral epithelial dysplasia malignant transformation. Commun Med 5 Ferrer-Sánchez A, Bagan J, Vila-Francés J, Magdalena-Benedito R, Bagan-Debon L (2022) Prediction of the risk of cancer and the grade of dysplasia in leukoplakia lesions using deep learning. Oral Oncol 132:105967 Schouten D et al (2025) Navigating the landscape of multimodal AI in medicine: A scoping review on technical challenges and clinical applications. Med Image Anal 105:103621 International Skin Imaging Collaboration (2024) SLICE-3D 2024 Permissive Challenge Dataset. International Skin Imaging Collaboration https://doi.org/10.34970/2024-SLICE-3D-PERMISSIVE Huang S-C et al (2023) Self-supervised learning for medical image classification: a systematic review and implementation guidelines. Npj Digit Med 6:74 Chen T, Kornblith S, Norouzi M, Hinton G (2020) A Simple Framework for Contrastive Learning of Visual Representations. Preprint at https://doi.org/10.48550/arXiv.2002.05709 He K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum Contrast for Unsupervised Visual Representation Learning. Preprint at https://doi.org/10.48550/arXiv.1911.05722 Mienye ID, Viriri S (2025) Graph Neural Networks in Medical Imaging: Methods, Applications and Future Directions. Information 16:1051 Kiechle J et al (2024) Graph Neural Networks: A suitable Alternative to MLPs in Latent 3D Medical Image. org/10.48550/arXiv.2407.17219 . Classification? Preprint at https://doi. Goswami B et al (2025) Detection of Oral Potentially Malignant Lesions Through Transformer-Based Segmentation Models. Lect Notes Comput Sci 15305:318–332 Zhang Y, Jiang H, Miura Y, Manning CD, Langlotz CP (2022) Contrastive Learning of Medical Visual Representations from Paired Images and Text. 
Preprint at https://doi.org/10.48550/arXiv.2010.00747 Tiu E et al (2022) Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. Nat Biomed Eng 6:1399–1406 Zhang J, Du B, Miao Y, Sun D, Cao X (2025) OralGPT: A Two-Stage Vision-Language Model for Oral Mucosal Disease Diagnosis and Description. Preprint at https://doi.org/10.48550/arXiv.2510.13911 Bukovszky B et al (2023) Malignant Transformation and Long-Term Outcome of Oral and Laryngeal Leukoplakia. J Clin Med 12:4255 Loftus TJ et al (2022) Uncertainty-aware deep learning in healthcare: A scoping review. PLOS Digit Health 1:e0000085 Tripathi PC, Bag S (2023) An Attention-Guided CNN Framework for Segmentation and Grading of Glioma Using 3D MRI Scans. IEEE/ACM Trans Comput Biol Bioinform 20:1890–1904 Patel G, Dolz J (2022) Weakly supervised segmentation with cross-modality equivariant constraints. Med Image Anal 77:102374 Ferreira DL, Lau C, Salaymang Z, Arnaout R (2025) Self-supervised learning for label-free segmentation in cardiac ultrasound. Nat Commun 16:4070 Tripathi M, Kongprawechnon W, Kondo TA (2025) Highly Robust Encoder–Decoder Network with Multi-Scale Feature Enhancement and Attention Gate for the Reduction of Mixed Gaussian and Salt-and-Pepper Noise in Digital Images. J Imaging 11:51 Michel N, Negrel R, Chierchia G, Bercher J-F (2023) Domain-Aware Augmentations for Unsupervised Online General Continual Learning. Preprint at https://doi.org/10.48550/ARXIV.2309.06896 Wagata K, Huang C, Nihey F, Kosaka Y, Nakahara K (2025) Attribute-Aware Adversarial Domain Augmentation for Zero-Shot Medical Domain Adaptation. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. IEEE Eng. Med. Biol. Soc. Annu. Int. Conf. 1–7 (2025) Shen Z, Mao M, Fan PA Primary Comparison of Diffusion Models and Generative Adversarial Networks for Image Synthesis. 
in Proceedings of the (2024) 7th International Conference on Machine Learning and Machine Intelligence (MLMI) 225–234 (ACM, Osaka Japan, 2024). 10.1145/3696271.3696307 Araf I, Idri A, Chairi I (2024) Cost-sensitive learning for imbalanced medical data: a review. Artif Intell Rev 57:80 Yeung M, Sala E, Schönlieb C-B, Rundo L (2022) Unified Focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medical image segmentation. Comput Med Imaging Graph 95:102026 Vito V, Stefanus LY (2022) An Asymmetric Contrastive Loss for Handling Imbalanced Datasets. Entropy 24:1303 Welvaars K et al (2023) Implications of resampling data to address the class imbalance problem (IRCIP): an evaluation of impact on performance between classification algorithms in medical data. JAMIA Open 6:ooad033 Dashti A et al (2025) Uncertainty-Aware Deep Neural Network Training for Imbalanced Geochemical Data Distributions. Nat Resour Res. https://doi.org/10.1007/s11053-025-10568-w Gadzicki K, Khamsehashari R, Zetzsche C Early vs Late Fusion in Multimodal Convolutional Neural Networks. in (2020) IEEE 23rd International Conference on Information Fusion (FUSION) 1–6 (IEEE, Rustenburg, South Africa, 2020). 10.23919/FUSION45008.2020.9190246 Liu G, Zhang J, Chan AB, Hsiao JH (2024) Human attention guided explainable artificial intelligence for computer vision models. Neural Netw 177:106392 Delaney E, Pakrashi A, Greene D, Keane MT (2023) Counterfactual explanations for misclassified images: How human and machine explanations differ. Artif Intell 324:103995 Pastor E, Poeta E, Panisson A, Perotti A, Ciravegna G (2025) Beyond Input Attribution: A Hands-On Tutorial to Concept-Based Explainable AI and Mechanistic Interpretability. in Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 6247–6248Association for Computing Machinery, New York, NY, USA. 
10.1145/3711896.3737606 Danish S et al (2026) A comprehensive survey of Vision–Language Models: Pretrained models, fine-tuning, prompt engineering, adapters, and benchmark datasets. Inf Fusion 126:103623 Ayyapa V et al (2024) Non-Invasive Oral Cancer Detection Using Hyperspectral Imaging and Advanced Spectral Unmixing Models. https://doi.org/10.1109/ICEC59683.2024.10837017 doi:10.1109/ICEC59683.2024.10837017 Thiem DGE et al (2021) Hyperspectral imaging and artificial intelligence to detect oral malignancy - part 1-automated tissue classification of oral muscle, fat and mucosa using a light-weight 6-layer deep neural network. HEAD FACE Med 17 Caughlin K et al (2024) Contrastive Clustering-Based Patient Normalization to Improve Automated In Vivo Oral Cancer Diagnosis from Multispectral Autofluorescence Lifetime Images. Cancers 16 Marsden M et al (2021) Intraoperative Margin Assessment in Oral and Oropharyngeal Cancer Using Label-Free Fluorescence Lifetime Imaging and Machine Learning. IEEE Trans Biomed Eng 68:857–868 Ramani RS et al (2025) Convolutional neural networks for accurate real-time diagnosis of oral epithelial dysplasia and oral squamous cell carcinoma using high-resolution in vivo confocal microscopy. Sci Rep 15:2555 Abd El-Aziz AAA, Mahmood MA, Abd El-Ghany SA (2025) Enhancing Early Detection of Oral Squamous Cell Carcinoma: A Deep Learning Approach with LRT-Enhanced EfficientNet-B3 for Accurate and Efficient Histopathological Diagnosis. Diagnostics 15 Chantapakul W et al (2025) Detection of Architectural Dysplastic Features From Histopathological Imagery of Oral Mucosa Using Neural Networks. Bioengineering https://doi.org/10.3390/bioengineering12030216 doi:10.3390/bioengineering12030216 de Lima LM et al (2023) Importance of complementary data to histopathological image analysis of oral leukoplakia and carcinoma using deep neural networks. 
IN℡LIGENT Med 3:258–266 Deo BS, Pal M, Panigrahi PK, Pradhan A (2025) An ensemble deep learning model with empirical wavelet transform feature for oral cancer histopathological image classification. Int J DATA Sci Anal 20:1005–1022 Adeoye J, Su Y (2025) Deep learning with data transformation improves cancer risk prediction in oral precancerous conditions. Intell Med 5:141–150 Xue Z, Liang Z, Rajaraman S, Marini N, Antani S (2025) Detecting Oral Cancer Using Tabular Deep Learning. https://doi.org/10.1109/COINS65080.2025.11125786 doi:10.1109/COINS65080.2025.11125786 Deshpande PR, Pansare BS, Pawar HY, Dash S (2025) Exploring the Use of Neural Networks for Early Detection of Oral Cancer and Other Dental Pathologies. 764–769. 10.1109/ICOECA66273.2025.00136 Pavani V et al (2025) An Advanced Imaging and Machine Learning Algorithm for Enhanced Oral Cancer Detection. 285–294. 10.1109/ICMLAS64557.2025.10967776 Shaheer KM et al (2024) Oral Cancer Analysis for Early Detection using Deep Learning. 317–321. 10.1109/ICC-ROBINS60238.2024.10533923 Huang S-Y, Chiou C-Y, Tan Y-S, Chen C-Y, Chung P-C (2022) Deep Oral Cancer Lesion Segmentation with Heterogeneous Features. https://doi.org/10.1109/RASSE54974.2022.9989871 doi:10.1109/RASSE54974.2022.9989871 L L et al (2024) Development of an oral cancer detection system through deep learning. BMC Oral Health 24:1468 Piazza C et al (2021) Deep Learning for Automatic Segmentation of Oral and Oropharyngeal Cancer Using Narrow Band Imaging: Preliminary Experience in a Clinical Perspective. Front Oncol. 11 Parola M et al (2025) Oral Cancer Recognition on Photographic Images Via Deep Learning Semantic Segmentation. https://doi.org/10.1109/CIHMCompanion65205.2025.11002690 doi:10.1109/CIHMCompanion65205.2025.11002690 Zhang R et al (2024) Research and Application of Deep Learning Models with Multi-Scale Feature Fusion for Lesion Segmentation in Oral Mucosal Diseases. 
Bioengineering 11 Song B et al (2022) Exploring uncertainty measures in convolutional neural network for semantic segmentation of oral cancer images. J Biomed Opt 27 Thakuria T et al (2025) Smartphone-Based Oral Lesion Image Segmentation Using Deep Learning. J Imaging Inf Med. https://doi.org/10.1007/s10278-025-01455-0 Hsu Y et al (2025) Oral mucosal lesions triage via YOLOv7 models. J Formos Med Assoc 124:621–627 Keser G, Pekiner F, Bayrakdar İŞ, Celik Ö, Orhan K (2024) A deep learning approach to detection of oral cancer lesions from intra oral patient images: A preliminary retrospective study. J Stomatol Oral Maxillofac Surg 125 Laila S, Ema RR, Galib SM, Jamil ARM (2024) & Shahab Uddin, A. F. M. S. A Novel Method to Detect Oral Carcinoma Using Box Annotation Based on YOLO Model. https://doi.org/10.1109/iCACCESS61735.2024.10499560 doi:10.1109/iCACCESS61735.2024.10499560 Birur P et al (2022) Field validation of deep learning based Point-of-Care device for early detection of oral malignant and potentially malignant disorders. Sci Rep 12 Haddaway NR, Page MJ, Pritchard CC, McGuinness LA (2022) PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis. Campbell Syst Rev 18:e1230 Additional Declarations The authors declare no competing interests. Supplementary Files PRISMAScRFillableChecklist10Sept20191.docx Meta Analyses for Scoping Reviews (PRISMA-ScR) Checklist Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. 
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-8865303","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Systematic Review","associatedPublications":[],"authors":[{"id":590485360,"identity":"13eeacad-f67a-47b5-97e5-96d6d4fc5e84","order_by":0,"name":"Charles Goodmaker","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA1klEQVRIiWNgGAWjYNACNgYGCQYGxgeMDSRqYTaAaGEmXgubBFFa+BvYLz74ULZNXrK991l14Y46BnP2/gN4tUgc4Ck2nHHutuFsnuNmt2eeOcxg2XMYvy0GDDxp0rxttxnnSaSx3eZtO8BgcCOZOC328+SfsRXzttUxGNx/TEgL+zGQlsTZEmxszLxtzEBbCHhf4jAPM8gvyTN70piBeg/zGJxJNsCrhb+9/SEwxG7bzjh+jPEz0GFyBscPPsBvDTMPqpk8+JWDATsBM0fBKBgFo2AUAABbwkEEXmCfwAAAAABJRU5ErkJggg==","orcid":"","institution":"NHS","correspondingAuthor":true,"prefix":"","firstName":"Charles","middleName":"","lastName":"Goodmaker","suffix":""},{"id":590487372,"identity":"68796c60-7b6a-4c1f-8557-0877bbeb813b","order_by":1,"name":"Rishi Bhandari","email":"","orcid":"","institution":"NHS","correspondingAuthor":false,"prefix":"","firstName":"Rishi","middleName":"","lastName":"Bhandari","suffix":""},{"id":590487373,"identity":"32f7f959-b907-4ef7-a117-2f9dc7d50c5a","order_by":2,"name":"Anwar Tappuni","email":"","orcid":"","institution":"Queen Mary University of 
London","correspondingAuthor":false,"prefix":"","firstName":"Anwar","middleName":"","lastName":"Tappuni","suffix":""},{"id":590487374,"identity":"2e28238e-8fdc-4e3b-b421-e50951f18806","order_by":3,"name":"Tuan Pham","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA3ElEQVRIiWNgGAWjYPCCAwwM7A1AugDMY8arlgeuhQeIGQxI0iKRQKQWe4nkZw8Yd9yR1535eOtmHgMGef4GHmMDvLZIpJkbMJ55ZrjtdlrZbaAWwxkHeIwT8GtJMJNgbDvMuO12jhlIC+MGBh7jA/i1pH8DabHfdvMMWIs9EVpywLYkbrvBA9aSCNKC32Fn3pRJJJ45nLztTFrZzTkGEskzDrMV4/U+e3v6NomPOw7bbjt+eNuNNxU2tv3tzZsl8GkBg8QGMAUyW4JQrEABI0LLKBgFo2AUjAJMAABUMEaKxGFtpgAAAABJRU5ErkJggg==","orcid":"https://orcid.org/0000-0002-4255-5130","institution":"Queen Mary University of London","correspondingAuthor":true,"prefix":"","firstName":"Tuan","middleName":"","lastName":"Pham","suffix":""}],"badges":[],"createdAt":"2026-02-12 19:57:07","currentVersionCode":1,"declarations":{"humanSubjects":true,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":true,"humanSubjectConsent":true,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":false,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-8865303/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-8865303/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":102739680,"identity":"5e93b0df-610c-4c92-a1f0-cc4a173142f1","added_by":"auto","created_at":"2026-02-16 07:11:21","extension":"jpeg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":509737,"visible":true,"origin":"","legend":"\u003cp\u003eClinical photographs showing: (a) a mucocele, a benign lower-lip lesion; (b) a non-homogeneous red and white patch with a high risk of dysplasia; and (c) oral squamous cell carcinoma 
(OSCC).\u003c/p\u003e","description":"","filename":"floatimage1.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8865303/v1/9e85cdf069acb3e12a9fa1dd.jpeg"},{"id":102739691,"identity":"3377442f-8f83-44dd-9cb8-a298b3e652cb","added_by":"auto","created_at":"2026-02-16 07:11:22","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":96349,"visible":true,"origin":"","legend":"\u003cp\u003ePRISMA diagram generated using PRISMA-Flow-Diagram Tool \u003csup\u003e234\u003c/sup\u003e\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-8865303/v1/6c25341e6d710a0b4cf1ee71.png"},{"id":102739681,"identity":"5b072b41-436a-43bb-87ae-8894fd1bffa6","added_by":"auto","created_at":"2026-02-16 07:11:21","extension":"jpeg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":204641,"visible":true,"origin":"","legend":"\u003cp\u003eBar charts (\u003cem\u003en\u003c/em\u003e=134) showing approaches to data scarcity (a), and learning tasks incorporated into study designs (b).\u003c/p\u003e","description":"","filename":"floatimage3.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-8865303/v1/89d745656a84652c3cb066d8.jpeg"},{"id":102739652,"identity":"af779e24-8383-4c31-aa73-b4f05dfdc8fe","added_by":"auto","created_at":"2026-02-16 07:11:08","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":166270,"visible":true,"origin":"","legend":"\u003cp\u003eA bar chart showing the frequency-distribution of model architectures used across oral cancer classification studies. 
Note: Studies may use multiple architectures, ViT and Swin Transformers are combined into one group.\u003c/p\u003e","description":"","filename":"floatimage4.png","url":"https://assets-eu.researchsquare.com/files/rs-8865303/v1/e7989812c1d84d069e22d32e.png"},{"id":102739654,"identity":"dc043b97-9ef3-419b-b3b9-74f314866347","added_by":"auto","created_at":"2026-02-16 07:11:10","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":110523,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eMeta Analyses for Scoping Reviews (PRISMA-ScR) Checklist\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"PRISMAScRFillableChecklist10Sept20191.docx","url":"https://assets-eu.researchsquare.com/files/rs-8865303/v1/5bfff9e4295a6b94399b291b.docx"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003eAI for Classifying Oral Cancer and Precursor Lesions Using Visible-Light Photography\u003c/p\u003e","fulltext":[{"header":"Introduction","content":"\u003cp\u003eOral cancer carries significant morbidity and mortality burden. Five year survival rates have remained relatively static at around 50% for most countries\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e. The global incidence rate of oral cancer is rising 2.7% per year; in 2009, there were 275,000 cases worldwide and 6236 in the UK, making it the sixth most prevalent cancer globally\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. 
In England alone, in 2022/23, there were 275,354 head and neck cancer referrals to 2 week wait pathways \u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e, of which a significant portion were likely specific to the oral cavity.\u003c/p\u003e \u003cp\u003eThese incidence figures do not capture the full healthcare burden from diagnosis through treatment and long-term follow-up. Oral cancer management, from initial precursor lesion identification to complete treatment, requires substantial healthcare resources. One systematic review found that oral leukoplakia (OL) progresses to oral squamous cell carcinoma (OSCC) in 7.2% of cases on average\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e. Within this transformation rate hides the substantial level of care required across large patient populations to diagnose each cancer case, including repeated biopsies and follow-ups of oral potentially malignant disorders (OPMDs). Fundamentally, it underpins the potential impact that automated oral lesion classification and risk assessment could have on healthcare systems.\u003c/p\u003e \u003cp\u003eOPMDs are a set of oral conditions which have a higher rate of malignant transformation to OSCC, compared to benign lesions. These include oral leukoplakia, oral erythroplakia, oral submucous fibrosis (OSMF), and oral lichen planus (OLP). OSCC, the most common form of oral malignancy, does not, therefore, represent a binary benign-versus-malignant classification but rather exists along a pathophysiological spectrum\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e. Further, histopathology captures a single timepoint in this disease continuum, which both complicates lesion monitoring and creates opportunities for early intervention before progression to advanced stages. 
Crucially, many precursor lesions appear similar on visual inspection alone, despite varying malignant potential, with one study citing only 25% of lesions have a provisional clinical diagnosis which matches histological outcomes\u003csup\u003e\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e\u003c/sup\u003e. As a result, histopathological biopsy remains the gold standard; determining lesion type by observing architectural, cellular, genetic and biochemical transformations within tissues\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e. This diagnostic challenge necessitates the integration of technological approaches that can enhance detection and risk stratification of not only benign and malignant but also precursor oral lesions (representative examples of clinically distinct oral lesions spanning benign, potentially malignant, and malignant presentations are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eSignificant disparities exist in both oral cancer incidence and survival across demographic groups, with social deprivation strongly associated with higher disease rates\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e,\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e\u003c/sup\u003e. Established risk factors include tobacco smoking, alcohol consumption, betel quid chewing, and increasing age\u003csup\u003e\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e. 
Microbiological factors such as Candida infection and human papillomavirus (HPV), along with broader oral microbiome composition, have also been implicated in cancer development\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e,\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e. These risk patterns, combined with unequal access to specialist diagnostic services, underscore the need for accessible diagnostic tools that can improve early detection across diverse healthcare settings.\u003c/p\u003e \u003cp\u003eArtificial intelligence (AI) has been shown as a promising approach in parallel specialities, such as dermatology, where the first class-IIa medical device for skin lesion classification is currently addressing high volume referral burdens on UK clinical pathways\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e. Similarly, the challenge of classifying oral lesions requires consideration of clinical utility, explainability, and integration with existing workflows. Artificial intelligence encompasses computational systems capable of performing tasks that typically require human intelligence. Within AI, machine learning (ML) represents algorithms that learn patterns from data without explicit programming\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e\u003c/sup\u003e. Deep learning, a subset of machine learning, employs multi-layered artificial neural networks that can automatically extract hierarchical features from raw data\u003csup\u003e\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e. 
In medical imaging, computer vision applies these algorithms to interpret visual information from images, enabling automated lesion detection and classification\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eAdvances in deep learning architectures have led to the development of convolutional neural networks (CNNs), which have become the dominant approach for medical image classification, using specialised layers to detect spatial patterns at multiple scales \u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e,\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. More recently, vision transformers have emerged as an alternative architecture that processes images as sequences of patches, enabling modelling of long-range dependencies across the entire image\u003csup\u003e\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u003c/sup\u003e. These models have demonstrated competitive performance on medical imaging tasks, though their application in oral pathology remains limited\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eTranslating these capabilities to into practice requires readily accessible data and compatibility with existing workflows. Therefore, this review focuses on visible-light intraoral clinical photography; a data modality routinely collected during standard management of oral lesions in the UK. 
By aligning with real-world clinical workflows, this review aims to uncover research gaps that can inform deployable clinical decision support systems.\u003c/p\u003e \u003cp\u003e \u003cb\u003eObjectives\u003c/b\u003e \u003c/p\u003e \u003cp\u003e \u003col\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eTo map the evidence landscape of artificial intelligence in the classification and detection of oral cancer and its precursor lesions.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eTo focus on realistic and available data subtypes; visible light photography and textual (structured or unstructured) clinical data.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003cspan\u003e \u003cli\u003e \u003cp\u003eTo identify gaps in research and the barriers withholding scaled usage in healthcare.\u003c/p\u003e \u003c/li\u003e \u003c/span\u003e \u003c/ol\u003e \u003c/p\u003e"},{"header":"Methodology","content":"\u003cp\u003eThis study followed the PRISMA-ScR (scoping review extension) guidelines \u003csup\u003e\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u003c/sup\u003e, a review protocol was not registered for this study. Searches were conducted in Scopus, Web of Science, Embase, and PubMed from 2015\u0026ndash;2025 using structured Boolean strings combining terms for (1) oral lesions and cancers, (2) AI and machine learning, and (3) visible-light or intraoral imaging modalities (supplementary table 1 provides full search-terms). Radiographic and histopathology-only studies were excluded.\u003c/p\u003e \u003cp\u003eThe search yielded 2,540 records, which were imported into Rayyan AI\u003csup\u003e22\u003c/sup\u003e review software and reduced to 1,940 after deduplication, then down to 292 following title\u0026ndash;abstract screening. 
Zotero\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e (v7.0.30) reference manager was used during a second stage review process, filtering whole article texts manually, resulting in 134 articles for inclusion. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e outlines the selection process and Table \u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e outlines the inclusion/exclusion criteria. Studies from conference papers were included as they can offer papers in emerging research fields which is highly relevant this this scoping review.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eScoping review inclusion and exclusion criteria.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCategory\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eInclusion Criteria\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eExclusion Criteria\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eStudy Objective\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eStudies that 
classify, detect, or predict oral potentially malignant disorders (OPMDs), dysplasia, or oral cancer.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eStudies focused solely on segmentation, feature extraction, or image enhancement without diagnostic or classification intent.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eStudy Type \u0026amp; Design\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOriginal research articles using Artificial Intelligence (AI), Machine Learning (ML), or Deep Learning (DL). Peer-reviewed journal articles or conference papers published from 2015\u0026ndash;2025.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSystematic reviews, meta-analyses, narrative reviews, editorials, or letters without new experimental results. Descriptive clinical studies or rule-based expert systems without ML/DL.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003ePopulation\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eHuman subjects with oral lesions or oral cancer.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eStudies using only animal data. 
Studies involving non-oral regions (e.g., skin, larynx, systemic cancers).\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eData Modality\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMust use clinical photographic images (RGB / visible light) taken intraorally (e.g., DSLR, smartphone, or intraoral camera).\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eStudies using only non-visible-light modalities (e.g., autofluorescence, OCT, NIR, hyperspectral, radiography, histopathology, MRI). Studies based purely on textual EHR, genomic data, or pathology slides.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMultimodal Inputs\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eStudies incorporating non-image data (e.g., patient demographics, clinical history, lesion site, textual metadata) are included.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eAI Methodology\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAny algorithmic approach including traditional ML, CNNs, Vision Transformers, multimodal fusion, or hybrid systems (image\u0026thinsp;+\u0026thinsp;text/metadata).\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eReporting Standards\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eQuantitative performance 
metrics must be provided (e.g., accuracy, sensitivity, specificity, AUC, F1).\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eStudies with unavailable performance metrics, unclear image modality, or no dataset description.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eLanguage \u0026amp; Access\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eEnglish-language publications with full text available (preprints acceptable if methodologically detailed).\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eNon-English publications.\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eDataset quality indicators included size, source, ground-truth labelling, and histopathological confirmation because data quality fundamentally determines whether reported performance will generalise to clinical settings, as established by TRIPOD-AI reporting standards\u003csup\u003e\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u003c/sup\u003e. Histopathological confirmation was specifically tracked as the clinical gold standard for oral lesion diagnosis\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. Clinical tasks were categorised to assess alignment between research focus and clinical utility: binary classification (malignant vs. 
benign) versus multi-class classification including dysplasia grading (mild, moderate and severe), which influences clinical management\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eArchitectural AI design was categorised by model families (CNNs, vision transformers, graph neural networks) because architecture determines model capabilities and therefore clinical utility\u003csup\u003e\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e. Validation was assessed by distinguishing internal validation from external validation on independent datasets collected at different institutions, which represents the gold standard for clinical generalisability\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e. Multimodal integration approaches combining imaging with clinical data were documented because this mirrors clinical decision-making processes\u003csup\u003e\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e\u003c/sup\u003e, while explainability methods were recorded because interpretability correlates with trust and regulatory approval\u003csup\u003e\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e,\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. All extracted variables were recorded in a Microsoft Excel file, which was also used to generate frequencies and graphs.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003eIn this scoping review, 134 studies published between 2015 and 2025 \u003csup\u003e31\u0026ndash;164\u003c/sup\u003e were included. 
Several excluded studies were relevant to oral cancer classification tasks, but either used a different modality type or covered tasks such as detection and segmentation without specifically classifying lesions; these are summarised in Table \u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eExcluded categories relevant to oral cancer classification tasks.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCategory\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDescription\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eReferences\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHyperspectral imaging\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRequires specialist imaging spectrometer capturing light across hundreds of narrow wavelength bands rather than standard RGB. 
Promising for histopathology slide analysis with limited in vivo application for tissue and lesion classification.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e209\u0026ndash;211\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAutofluorescence imaging\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eExploits tissue autofluorescence properties that change during malignant transformation due to alterations in fluorophores (e.g., collagen) and tissue architecture. Often combined with white light in dual-modality approaches. Requires specialist equipment not standard in UK oral health clinics.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e212\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eConfocal microscopy\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLaser scanning microscopy enabling high-resolution, non-invasive optical sectioning of tissue in vivo, providing near-histological images without biopsy. Requires specialist equipment with limited clinical accessibility.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e213\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHistopathological imaging\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMicroscopic analysis of tissue specimens focusing on cellular and architectural features. 
A substantial parallel field operating at microscopic rather than macroscopic clinical photography scale, with transferable methodological findings.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e214\u0026ndash;217\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTabular data only\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eStudies analysing only structured clinical variables without image data, including EHR demographics, risk factors, and histology. Novel approaches include CNN processing of tabular-to-image transformed data.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e218,219\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRadiological integration\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eStudies combining features from CT, MRI, or PET imaging alongside clinical photographs. Excluded as radiological imaging is not routinely integrated with clinical photography workflows in primary oral assessment.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e220\u0026ndash;222\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLesion localisation (detection and segmentation)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eStudies performing bounding box detection or pixel-level segmentation without diagnostic classification output. 
Demonstrates object detection (YOLO, Faster R-CNN) and spatial reasoning architectures (U-Net, Mask R-CNN, transformers) useful as preprocessing for classification pipelines.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003csup\u003e186,223\u0026ndash;233\u003c/sup\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e\n\u003ch3\u003eDatasets and Modalities\u003c/h3\u003e\n\u003cp\u003eMany CV studies rely on open-source datasets, whose quantity and quality are very limited in comparison to parallel fields such as dermatology. For example, one open-source dataset, Oral Cancer (Lips and Tongue) Images (OCI)\u003csup\u003e\u003cspan citationid=\"CR165\" class=\"CitationRef\"\u003e165\u003c/span\u003e\u003c/sup\u003e, contains 131 images without specified histological ground-truth labelling and served as a data source in 25.4% of studies. Some studies have, however, assembled relatively large private datasets; the largest reported contained 44,409 images\u003csup\u003e32\u003c/sup\u003e. 
The majority of studies (84.3%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;113) rely solely or primarily on clinical photographs, whereas 15.7% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;21) studies leverage a multi-modal approach (Table \u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eTabulated breakdown of multimodal data types\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAdditional Data\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eValue\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e%\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eClinical history/risk factors\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDemographics (age, sex)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e8\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eHistopathology images\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e6.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLesion metadata (site/type)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eRadiological (CT/MRI)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eOther imaging (fluorescence)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eText/NLP/EHR\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv 
class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eA table showing AI explainability approaches across the cohort of included studies.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMethod\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eValue\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e%\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAny XAI reported\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e48\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e35.8%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eNo XAI reported\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e86\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e64.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGrad-CAM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c3\"\u003e \u003cp\u003e22.4%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAttention maps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e3.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSaliency maps\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSHAP\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e2.2%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLIME\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003eWhere reported, the most common approach to managing multi-modal data in model training was to deploy early-fusion strategies (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;9). 
Feature-level fusion (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;7) and late-fusion (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2) were also mentioned among the cohort of papers.\u003c/p\u003e\n\u003ch3\u003eLearning Paradigms \u0026 Tasks\u003c/h3\u003e\n\u003cp\u003eMachine learning approaches can be categorised into supervised, unsupervised, semi-supervised, self-supervised and reinforcement learning\u003csup\u003e\u003cspan citationid=\"CR166\" class=\"CitationRef\"\u003e166\u003c/span\u003e\u003c/sup\u003e. They differ largely in how much labelled data a model sees during training: fully labelled data, a smaller labelled subset alongside unlabelled images, or no labels at all. This review found one study employing a semi-supervised training method, which proposed a multi-scale random cropping self-training framework for oral cancer versus leukoplakia classification \u003csup\u003e\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e\u003c/sup\u003e, with an accuracy of 71.7%.\u003c/p\u003e \u003cp\u003eAnother study utilised a supervised \u0026ldquo;few-shot\u0026rdquo; approach to overcome data scarcity within the domain, using a Siamese network that produced matching scores for images across classes\u003csup\u003e\u003cspan citationid=\"CR135\" class=\"CitationRef\"\u003e135\u003c/span\u003e\u003c/sup\u003e and achieved 92% accuracy. Other approaches tended to focus on data augmentation methods (63.4%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;85). 
Further, one study used synthetic oral lesion images created with Generative Adversarial Networks, achieving a diagnostic accuracy of 97% \u003csup\u003e114\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eWithin supervised learning, binary classification was the most common task (59.7%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;80; Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e), followed by multiclass classification (41.8%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;56), with smaller proportions additionally exploring object detection (15.7%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;21) and semantic segmentation (6.7%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;9).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003ePreprocessing and Validation\u003c/h3\u003e\n\u003cp\u003eAmong the included studies, 63.4% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;85) used data augmentation techniques such as image flipping (45.5%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;61), rotation (44.8%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;60), translation (17.9%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;24), scaling (13.4%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;18) and brightness alterations (13.4%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;18). A further 4.5% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;6) of studies employed Gaussian noise, an approach in which random pixel-level variation is added to images to simulate real-world image degradation.\u003c/p\u003e \u003cp\u003eAdditionally, 41.0% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;55) employed segmentation techniques to isolate regions of interest. Where specified, segmentation of the region of interest (ROI) was primarily performed manually (22.4%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;30). 
A smaller number of studies employed automated segmentation pipelines with U-Net and YOLO models (3.7%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;5, and 2.2%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;3, respectively).\u003c/p\u003e \u003cp\u003eThis review found that 91.0% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;122) of studies used internal validation methods for model training. Internal validation was defined as the use of random image-level or patient-level splits within a single dataset. Further, 16.4% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;22) employed a K-fold cross-validation approach to model development. Of all included studies, 8.2% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;11) utilised external validation datasets and 4.1% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;6) included multi-centre data, defined as data from \u0026ge;\u0026thinsp;2 institutions. External validation was defined as testing on a dataset collected at a different institution or time period.\u003c/p\u003e\n\u003ch3\u003eModel Architectures \u0026 Performance\u003c/h3\u003e\n\u003cp\u003eModel architecture refers to the structural design of neural networks, determining how information flows through computational layers. Studies primarily employed pre-trained CNN architectures, leveraging transfer learning, a training strategy whereby models pre-trained on large-scale image datasets, such as ImageNet\u003csup\u003e\u003cspan citationid=\"CR167\" class=\"CitationRef\"\u003e167\u003c/span\u003e\u003c/sup\u003e, are fine-tuned on domain-specific medical images. 
Among CNN architectures, the ResNet family was most frequently employed, with ResNet-50 being the single most common model subtype (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eSeveral CNN-based studies reported strong performance. Fu et al. (2020) achieved an area under the curve (AUC) of 0.983 on internal validation using DenseNet, with external validation confirming generalisability (AUC 0.935)\u003csup\u003e\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e\u003c/sup\u003e. Another study reported an AUC of 1.0 for OSCC classification using DenseNet-169\u003csup\u003e44\u003c/sup\u003e. Earlier work demonstrated the feasibility of transfer learning with fine-tuned architectures for oral lesion detection\u003csup\u003e\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e\u003c/sup\u003e. Promising results for mobile-based oral cancer classification using smartphone-acquired images were also observed\u003csup\u003e\u003cspan citationid=\"CR125\" class=\"CitationRef\"\u003e125\u003c/span\u003e\u003c/sup\u003e, highlighting potential for point-of-care screening.\u003c/p\u003e \u003cp\u003eMore recently, vision transformer architectures have emerged in oral cancer clinical image classification\u003csup\u003e\u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e92\u003c/span\u003e\u003c/sup\u003e, with 14 studies deploying transformer-based architectures. 
A key study compared Vision Transformer (ViT) and Swin Transformer performance on mobile-acquired images\u003csup\u003e\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e\u003c/sup\u003e, while another directly compared vision transformers against CNNs for oral cancer lesion classification\u003csup\u003e\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003eExplainability Approaches\u003c/h2\u003e \u003cp\u003eExplainable AI (XAI)\u003csup\u003e\u003cspan citationid=\"CR168\" class=\"CitationRef\"\u003e168\u003c/span\u003e\u003c/sup\u003e methods are designed to communicate the rationale behind model predictions; 35.8% (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;48) of studies described at least one XAI method. Saliency maps are a popular XAI method, showing the image regions a model identifies as important\u003csup\u003e\u003cspan citationid=\"CR169\" class=\"CitationRef\"\u003e169\u003c/span\u003e\u003c/sup\u003e. One specific form is Gradient-weighted Class Activation Mapping (Grad-CAM)\u003csup\u003e\u003cspan citationid=\"CR170\" class=\"CitationRef\"\u003e170\u003c/span\u003e\u003c/sup\u003e, which uses the gradients flowing into the final convolutional layer to generate a class-discriminative map. Grad-CAM was the predominant approach among the included studies (22.4%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;30). 
This technique was extensively employed to check that models focused on clinically relevant lesion areas, such as the tongue borders or ulcerated regions.\u003c/p\u003e \u003cp\u003eAttention maps\u003csup\u003e\u003cspan citationid=\"CR171\" class=\"CitationRef\"\u003e171\u003c/span\u003e\u003c/sup\u003e were utilised (3.7%, \u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;5) in newer transformer-based and hybrid architectures to dynamically weigh the importance of specific spatial features or multimodal inputs. Use of advanced model-agnostic quantitative frameworks like SHAP\u003csup\u003e\u003cspan citationid=\"CR172\" class=\"CitationRef\"\u003e172\u003c/span\u003e\u003c/sup\u003e or LIME\u003csup\u003e\u003cspan citationid=\"CR168\" class=\"CitationRef\"\u003e168\u003c/span\u003e\u003c/sup\u003e was rare. One study leveraged Case-Based Reasoning\u003csup\u003e\u003cspan citationid=\"CR108\" class=\"CitationRef\"\u003e108\u003c/span\u003e\u003c/sup\u003e to provide more physician-aligned decision support and explainability of model predictions.\u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eLimitations\u003c/h3\u003e\n\u003cp\u003eThis scoping review has several limitations that should be considered when interpreting its findings. First, the search was limited to four databases (Scopus, Web of Science, Embase, and PubMed) and English-language publications, potentially missing relevant studies indexed elsewhere or published in other languages. Second, conference papers were included to capture emerging research, which may introduce findings that have not undergone full peer review. Third, formal quality assessment and meta-analysis were not performed. Given substantial heterogeneity in study designs, datasets, and evaluation protocols, pooled results may not be fully representative. 
Finally, publication bias likely inflates reported performance, as studies with negative or null findings may be underrepresented.\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe field of oral cancer classification using AI has expanded rapidly over the past decade. However, compared to parallel areas of clinical image analysis, real-world deployment is lagging. In dermatology, for example, advances in vision models have started to make their way into real-world clinical workflows\u003csup\u003e\u003cspan citationid=\"CR173\" class=\"CitationRef\"\u003e173\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThe paucity of large, open-source, well-curated datasets\u003csup\u003e\u003cspan citationid=\"CR174\" class=\"CitationRef\"\u003e174\u003c/span\u003e\u003c/sup\u003e likely contributes to this mismatch. In dermatology, datasets such as HAM10000, a collection of over 10,000 annotated dermatoscopy images\u003csup\u003e\u003cspan citationid=\"CR175\" class=\"CitationRef\"\u003e175\u003c/span\u003e\u003c/sup\u003e, and the International Skin Imaging Collaboration (ISIC) 2024 challenge dataset of 400,000 skin lesion image crops are readily available to researchers and developers alike.\u003c/p\u003e \u003cp\u003eAssembling large oral image databases, however, faces somewhat different challenges. Anatomical constraints can make adequate photography difficult without mirrors and specialist equipment. 
This is compounded by data privacy concerns, with the face, teeth and oral features considered partly, if not fully, identifiable (an issue which is less of a concern for cropped lesions of the skin or other anatomical structures).\u003c/p\u003e \u003cp\u003eIndeed, researchers have specifically investigated ways to resolve privacy issues through techniques like federated learning\u003csup\u003e\u003cspan citationid=\"CR176\" class=\"CitationRef\"\u003e176\u003c/span\u003e\u003c/sup\u003e and the generation of synthetic image datasets using a GAN framework\u003csup\u003e\u003cspan citationid=\"CR114\" class=\"CitationRef\"\u003e114\u003c/span\u003e\u003c/sup\u003e to address data scarcity. Next steps could include exploring diffusion-based generative models. Compared with GANs, diffusion models tend to produce higher-fidelity images with greater diversity and fewer mode-collapse issues. Recent work demonstrates that classifiers trained on synthetic diffusion-generated medical images can approach or match the performance of those trained on real data. For oral cancer, this approach could enable creation of large, diverse, privacy-compliant training datasets.\u003c/p\u003e \u003cp\u003eWithin this environment of relative data scarcity, many studies have adopted binary classification tasks, distinguishing malignant from benign lesions. However, real-world oral lesions exist on a pathophysiological spectrum from benign to dysplastic to malignant\u003csup\u003e\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e\u003c/sup\u003e. Only 10.4% of studies have investigated the power of AI models to discriminate varying grades of dysplasia through to carcinoma. 
One paper attempted dysplasia sub-grade prediction from images, collapsing the WHO classification into a dichotomous outcome of high- versus lower-risk dysplasia\u003csup\u003e\u003cspan citationid=\"CR177\" class=\"CitationRef\"\u003e177\u003c/span\u003e,\u003cspan citationid=\"CR178\" class=\"CitationRef\"\u003e178\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eMoreover, only 6.9% of studies investigated three-way classification tasks (benign, OPMD/dysplasia and malignant). This may partly explain why reported performance gains have not consistently translated into deployed clinical tools: separating benign from malignant lesions may not meet utility thresholds for clinicians, whereas models that detect hard-to-spot edge cases, such as high-risk dysplastic white patches in the oral cavity, might.\u003c/p\u003e \u003cp\u003eMore broadly, multimodal data integration will likely be pivotal in building models that can generate safe and stable risk stratification\u003csup\u003e\u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e92\u003c/span\u003e,\u003cspan citationid=\"CR113\" class=\"CitationRef\"\u003e113\u003c/span\u003e,\u003cspan citationid=\"CR122\" class=\"CitationRef\"\u003e122\u003c/span\u003e,\u003cspan citationid=\"CR129\" class=\"CitationRef\"\u003e129\u003c/span\u003e,\u003cspan citationid=\"CR152\" class=\"CitationRef\"\u003e152\u003c/span\u003e,\u003cspan citationid=\"CR160\" class=\"CitationRef\"\u003e160\u003c/span\u003e\u003c/sup\u003e; where clinical image datapoints are limited, such models may be able to build richer representations of patient cases. A leading study in the field fused smartphone clinical photographs with patient age, sex, and habit data using early feature concatenation, demonstrating improved performance over image-only models\u003csup\u003e\u003cspan citationid=\"CR129\" class=\"CitationRef\"\u003e129\u003c/span\u003e\u003c/sup\u003e. 
Another study used a heterogeneous graph framework integrating lesion photographs with structured clinical variables including demographics, risk factors, lesion characteristics, oral epithelial dysplasia grade, and longitudinal follow-up, to enable both diagnostic classification and time-dependent malignant transformation risk prediction\u003csup\u003e\u003cspan citationid=\"CR131\" class=\"CitationRef\"\u003e131\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eSimilarly, multi-model ensemble methods, in which several different AI models are combined within a single classification workflow, have performed very well in dermatology image classification tasks, often outperforming single-model variants in international competitions\u003csup\u003e\u003cspan citationid=\"CR180\" class=\"CitationRef\"\u003e180\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eInterestingly, no studies explored self-supervised learning approaches such as contrastive learning or masked image modelling, despite their demonstrated effectiveness in radiology\u003csup\u003e\u003cspan citationid=\"CR181\" class=\"CitationRef\"\u003e181\u003c/span\u003e\u003c/sup\u003e. Self-supervised pretraining methods, including contrastive learning (SimCLR, MoCo)\u003csup\u003e\u003cspan citationid=\"CR182\" class=\"CitationRef\"\u003e182\u003c/span\u003e,\u003cspan citationid=\"CR183\" class=\"CitationRef\"\u003e183\u003c/span\u003e\u003c/sup\u003e, masked autoencoders, and self-distillation approaches (DINO), leverage large quantities of unlabelled images to learn visual representations. Such techniques could be applied to the potentially massive imaging datasets held in dental primary and secondary care whilst reducing the annotation burden on specialist clinicians.\u003c/p\u003e \u003cp\u003eConvolutional neural network architectures dominate the field, with ResNet-50 being the most frequently employed model. 
While CNNs excel at local feature extraction, they may be suboptimal for tasks requiring spatial reasoning across tissue regions\u003csup\u003e\u003cspan citationid=\"CR184\" class=\"CitationRef\"\u003e184\u003c/span\u003e\u003c/sup\u003e. Graph neural networks, which can model relationships between tissue components and could potentially capture field cancerisation patterns through spatial connectivity modelling, remain essentially unexplored in oral lesion classification despite promising applications in other medical imaging domains\u003csup\u003e\u003cspan citationid=\"CR185\" class=\"CitationRef\"\u003e185\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eVision transformer architectures have started to be investigated in this field (2024\u0026ndash;2025), demonstrating competitive or superior performance in preliminary studies\u003csup\u003e\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e,\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e,\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e,\u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e92\u003c/span\u003e,\u003cspan citationid=\"CR186\" class=\"CitationRef\"\u003e186\u003c/span\u003e\u003c/sup\u003e. However, the small number of transformer-based studies precludes robust comparative conclusions. The field would benefit from systematic head-to-head evaluations across standardised benchmarks to establish whether transformer architectures offer genuine advantages for oral lesion classification.\u003c/p\u003e \u003cp\u003eThe emergence of vision-language foundation models represents a paradigm shift. 
These models, pre-trained on billions of image-text pairs, demonstrate competitive zero-shot performance on medical imaging tasks without domain-specific training\u003csup\u003e\u003cspan citationid=\"CR187\" class=\"CitationRef\"\u003e187\u003c/span\u003e,\u003cspan citationid=\"CR188\" class=\"CitationRef\"\u003e188\u003c/span\u003e\u003c/sup\u003e. Early studies evaluating multimodal large language models on oral lesion image classification show promise\u003csup\u003e\u003cspan citationid=\"CR94\" class=\"CitationRef\"\u003e94\u003c/span\u003e,\u003cspan citationid=\"CR189\" class=\"CitationRef\"\u003e189\u003c/span\u003e\u003c/sup\u003e, though systematic benchmarking against purpose-built classifiers remains limited.\u003c/p\u003e \u003cp\u003eReal-world deployment requires seamless integration with existing clinical information systems, decision support that aligns with clinical workflows, and outputs that support rather than replace clinical judgement. Currently, no studies have evaluated clinical workflow feasibility, time-to-decision impacts, or clinician acceptance of AI-assisted diagnosis. Further, real-world adoption will likely be heavily dependent on robust explainability frameworks\u003csup\u003e\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e\u003c/sup\u003e. Gradient-weighted Class Activation Mapping (Grad-CAM) was the predominant explainability approach used throughout this scoping review (22.4%), followed by attention maps (3.7%) and saliency maps (2.2%). Advanced model-agnostic interpretability methods remained rare, with SHAP appearing in only 2.2% and LIME in 0.7% of studies. 
The uptake of these approaches may reflect the rising salience of explainability, but a wider range of techniques, and explainability built intrinsically into model workflows, remain open for exploration.\u003c/p\u003e \u003cp\u003eTemporal risk stratification and longitudinal monitoring represent underexplored applications with substantial clinical value. Oral lesions require extended surveillance periods, with transformation to malignancy occurring over months to years\u003csup\u003e\u003cspan citationid=\"CR190\" class=\"CitationRef\"\u003e190\u003c/span\u003e\u003c/sup\u003e. AI systems capable of integrating sequential imaging data to model disease progression and predict transformation risk could substantially improve clinical management of oral potentially malignant disorders.\u003c/p\u003e \u003cp\u003eFinally, uncertainty-aware models that provide calibrated confidence estimates would enable appropriate referral decisions and human-AI collaboration\u003csup\u003e\u003cspan citationid=\"CR191\" class=\"CitationRef\"\u003e191\u003c/span\u003e\u003c/sup\u003e. Current approaches largely provide point predictions without uncertainty quantification, limiting their utility for clinical decision-making where understanding prediction confidence is essential.\u003c/p\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eFuture Directions and Recommendations\u003c/h2\u003e \u003cp\u003eTo accelerate the safe and effective use of AI for classifying oral cancer and precursor lesions from visible light photography, several methodological and translational priorities should be addressed. First, curation of large, open, and well-annotated datasets must be a central goal. Datasets should capture real-world heterogeneity in acquisition devices, illumination conditions, anatomical sites, and disease spectra (normal mucosa, potentially malignant disorders, dysplasia grades, and invasive cancer). 
Standardised annotation protocols\u0026mdash;ideally including lesion boundaries, uncertainty labels, and longitudinal outcomes\u0026mdash;would enable robust benchmarking, reproducibility, and fair comparison across models.\u003c/p\u003e \u003cp\u003eSecond, advanced image segmentation should be treated as a foundational step rather than an optional pre-processing task. Accurate delineation of lesions and surrounding mucosa can reduce background bias, improve feature localisation, and enhance interpretability. Contemporary approaches such as attention-guided segmentation\u003csup\u003e\u003cspan citationid=\"CR192\" class=\"CitationRef\"\u003e192\u003c/span\u003e\u003c/sup\u003e, weakly supervised\u003csup\u003e\u003cspan citationid=\"CR193\" class=\"CitationRef\"\u003e193\u003c/span\u003e\u003c/sup\u003e and self-supervised\u003csup\u003e\u003cspan citationid=\"CR194\" class=\"CitationRef\"\u003e194\u003c/span\u003e\u003c/sup\u003e segmentation, and multi-scale encoder\u0026ndash;decoder architectures\u003csup\u003e\u003cspan citationid=\"CR195\" class=\"CitationRef\"\u003e195\u003c/span\u003e\u003c/sup\u003e can leverage limited pixel-level labels while remaining robust to annotation noise. Joint optimisation of segmentation and classification in end-to-end frameworks may further improve diagnostic performance.\u003c/p\u003e \u003cp\u003eThird, data augmentation and synthesis require more principled development beyond basic geometric or photometric transformations. Domain-aware\u003csup\u003e\u003cspan citationid=\"CR196\" class=\"CitationRef\"\u003e196\u003c/span\u003e\u003c/sup\u003e and attribute-aware\u003csup\u003e\u003cspan citationid=\"CR197\" class=\"CitationRef\"\u003e197\u003c/span\u003e\u003c/sup\u003e augmentation\u0026mdash;accounting for colour constancy, illumination variability, specular highlights, and anatomical plausibility\u0026mdash;is critical in oral imaging. 
Generative models, including diffusion-based synthesis and conditional generative adversarial networks\u003csup\u003e\u003cspan citationid=\"CR198\" class=\"CitationRef\"\u003e198\u003c/span\u003e\u003c/sup\u003e, offer promising avenues to enrich minority classes, simulate rare precursor lesions, and improve model generalisation, provided that safeguards against artefact learning and distribution shift are rigorously applied.\u003c/p\u003e \u003cp\u003eFourth, class imbalance remains a defining challenge, as early-stage dysplasia and certain precursor lesions are under-represented relative to advanced disease or normal tissue. Addressing this requires a combination of strategies, including cost-sensitive\u003csup\u003e\u003cspan citationid=\"CR199\" class=\"CitationRef\"\u003e199\u003c/span\u003e\u003c/sup\u003e, focal\u003csup\u003e\u003cspan citationid=\"CR200\" class=\"CitationRef\"\u003e200\u003c/span\u003e\u003c/sup\u003e and asymmetric\u003csup\u003e\u003cspan citationid=\"CR201\" class=\"CitationRef\"\u003e201\u003c/span\u003e\u003c/sup\u003e loss functions, informed resampling\u003csup\u003e\u003cspan citationid=\"CR202\" class=\"CitationRef\"\u003e202\u003c/span\u003e\u003c/sup\u003e, and uncertainty-aware training\u003csup\u003e\u003cspan citationid=\"CR203\" class=\"CitationRef\"\u003e203\u003c/span\u003e\u003c/sup\u003e. Importantly, evaluation metrics should prioritise clinical relevance\u0026mdash;such as sensitivity for high-risk lesions and balanced accuracy\u0026mdash;rather than overall accuracy alone.\u003c/p\u003e \u003cp\u003eFifth, model fusion and ensemble learning should be systematically explored. Combining complementary representations\u0026mdash;such as convolutional features, transformer-based global context, and graph-based relational modelling of lesion morphology\u0026mdash;can improve robustness and reduce variance. 
Hybrid fusion strategies, including late-decision fusion\u003csup\u003e28,204\u003c/sup\u003e tailored to majority and minority classes, may be particularly effective in clinically imbalanced settings and better reflect multi-factorial clinician reasoning.\u003c/p\u003e \u003cp\u003eSixth, XAI is essential for clinical adoption. Models should provide transparent and reliable explanations at both the pixel and concept levels, highlighting diagnostically meaningful regions and features rather than spurious correlations. Techniques such as attention visualisation\u003csup\u003e\u003cspan citationid=\"CR205\" class=\"CitationRef\"\u003e205\u003c/span\u003e\u003c/sup\u003e, counterfactual explanations\u003csup\u003e\u003cspan citationid=\"CR206\" class=\"CitationRef\"\u003e206\u003c/span\u003e\u003c/sup\u003e, and concept-based attribution\u003csup\u003e\u003cspan citationid=\"CR207\" class=\"CitationRef\"\u003e207\u003c/span\u003e\u003c/sup\u003e can support clinician trust, facilitate error analysis, and enable regulatory scrutiny. XAI should be evaluated not only for visual plausibility but also for clinical validity and consistency across patient subgroups.\u003c/p\u003e \u003cp\u003eSeventh, vision\u0026ndash;language models (VLMs)\u003csup\u003e\u003cspan citationid=\"CR208\" class=\"CitationRef\"\u003e208\u003c/span\u003e\u003c/sup\u003e represent a promising direction for future research in oral cancer and precursor lesion classification from visible light photography. By jointly learning from images and textual information, VLMs would enable the integration of visual lesion characteristics with clinical descriptors such as lesion morphology, anatomical site, risk factors, and provisional diagnoses. This multimodal representation may be particularly advantageous for early-stage and potentially malignant disorders, where visual features are subtle and subject to high inter-observer variability. 
Moreover, VLMs support weakly supervised, prompt-based, and few-shot learning paradigms, potentially reducing dependence on large, fully annotated datasets. Importantly, language grounding also offers a pathway toward more interpretable and clinically meaningful explanations, complementing pixel-level saliency methods. Future work should focus on domain-adapted VLM architectures tailored to intraoral imaging, robust handling of noisy or biased clinical text, and rigorous external validation to ensure generalisability and clinical trust.\u003c/p\u003e \u003cp\u003eFinally, future research should emphasise generalisation, fairness, and deployment readiness. External validation across institutions, devices, and populations is critical to mitigate bias and ensure equity. Prospective studies, human\u0026ndash;AI interaction experiments, and integration with clinical workflows will be necessary to move from proof-of-concept models toward real-world impact. Collectively, addressing these issues will help ensure that AI systems for oral cancer and precursor lesion classification are accurate, interpretable, and clinically meaningful.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusion","content":"\u003cp\u003eThis scoping review of 134 studies reveals a field that has achieved technically impressive results on narrow benchmarks but faces fundamental challenges for clinical translation. 
Current research is characterised by reliance on limited, inadequately validated datasets; near-exclusive focus on supervised learning with binary classification tasks; minimal external validation; and explainability implementations that serve publication rather than clinical needs.\u003c/p\u003e \u003cp\u003eThe critical gaps identified, particularly the scarcity of dysplasia grading research, lack of ordinal classification approaches reflecting disease progression, limited multimodal integration, and unexplored potential of graph-based architectures for spatial reasoning, represent priorities for future research. Addressing these gaps will require collaborative efforts to develop larger, well-annotated datasets with histological validation, standardised evaluation protocols enabling meaningful comparison across studies, and clinical workflow integration studies that move beyond technical accuracy to assess real-world impact.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eAI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArtificial Intelligence\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eAUC\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArea Under the Curve\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eCNN\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eConvolutional Neural Network\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eCV\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eComputer Vision\u003c/p\u003e \u003c/div\u003e 
\u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eEHR\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eElectronic Health Record\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eGAN\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eGenerative Adversarial Network\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eGNN\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eGraph Neural Network\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eGrad-CAM\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eGradient-weighted Class Activation Mapping\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eHPV\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eHuman Papillomavirus\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eIQR\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eInterquartile Range\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eLIME\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eLocal Interpretable Model-agnostic Explanations\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eML\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMachine 
Learning\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOCI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Cancer (Lips and Tongue) Images dataset\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOED\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Epithelial Dysplasia\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOL\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Leukoplakia\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOLP\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Lichen Planus\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOPMD\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Potentially Malignant Disorder\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOSCC\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Squamous Cell Carcinoma\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eOSMF\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eOral Submucous Fibrosis\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003ePRISMA-ScR\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003e
Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eResNet\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eResidual Network\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eROI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eRegion of Interest\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eSD\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eStandard Deviation\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eSHAP\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSHapley Additive exPlanations\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eUK\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eUnited Kingdom\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eVGG\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eVisual Geometry Group\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eViT\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eVision Transformer\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eWHO\u003c/b\u003e\u003c/div\u003e 
\u003cdiv class=\"Description\"\u003e \u003cp\u003eWorld Health Organization\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cb\u003eXAI\u003c/b\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eExplainable AI\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e \u003ch2\u003eConflicts of Interest\u003c/h2\u003e \u003cp\u003eNone to declare\u003c/p\u003e \u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e \u003cp\u003eThere was no funding for this study.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eWarnakulasuriya S (2009) Global epidemiology of oral and oropharyngeal cancer. Oral Oncol 45:309\u0026ndash;316\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBray F et al (2024) Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 74:229\u0026ndash;263\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCWT USC referrals dashboard \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://nhsd-ndrs.shinyapps.io/cwt_referral_conversion_detection/\u003c/span\u003e\u003cspan address=\"https://nhsd-ndrs.shinyapps.io/cwt_referral_conversion_detection/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGuan J-Y et al (2023) Malignant transformation rate of oral leukoplakia in the past 20 years: A systematic review and meta-analysis. J Oral Pathol Med 52:691\u0026ndash;700\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWarnakulasuriya S, Johnson NW, Van Der Waal I (2007) Nomenclature and classification of potentially malignant disorders of the oral mucosa. 
J Oral Pathol Med 36:575\u0026ndash;580\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCoppola N et al (2021) Referral Patterns in Oral Medicine: A Retrospective Analysis of an Oral Medicine University Center in Southern Italy. Int J Environ Res Public Health 18:12161\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWalsh T et al (2021) Diagnostic tests for oral cancer and potentially malignant disorders in patients presenting with clinically evident lesions. \u003cem\u003eCochrane Database Syst. Rev.\u003c/em\u003e (2021)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eConway DI et al (2008) Socioeconomic inequalities and oral cancer risk: A systematic review and meta-analysis of case‐control studies. Int J Cancer 122:2811\u0026ndash;2819\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGandini S et al (2008) Tobacco smoking and cancer: a meta-analysis. Int J Cancer 122:155\u0026ndash;164\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBagnardi V et al (2015) Alcohol consumption and site-specific cancer risk: a comprehensive dose-response meta-analysis. Br J Cancer 112:580\u0026ndash;593\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKreimer AR, Clifford GM, Boyle P, Franceschi S (2005) Human Papillomavirus Types in Head and Neck Squamous Cell Carcinomas Worldwide: A Systematic Review. Cancer Epidemiol Biomarkers Prev 14:467\u0026ndash;475\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePerera M, Al-hebshi NN, Speicher DJ, Perera I, Johnson NW (2016) Emerging role of bacteria in oral carcinogenesis: a review with special reference to perio-pathogenic bacteria. J Oral Microbiol 8:32762\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarsden H et al (2024) Accuracy of an artificial intelligence as a medical device as part of a UK-based skin cancer teledermatology service. 
Front Med 11:1302363\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSamuel AL (1959) Some Studies in Machine Learning Using the Game of Checkers. IBM J Res Dev 3:210\u0026ndash;229\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436\u0026ndash;444\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShen D, Wu G, Suk H-I (2017) Deep Learning in Medical Image Analysis. Annu Rev Biomed Eng 19:221\u0026ndash;248\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKrizhevsky A, Sutskever I, Hinton GE (2017) ImageNet classification with deep convolutional neural networks. Commun ACM 60:84\u0026ndash;90\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. \u003cem\u003eProc. IEEE\u003c/em\u003e 86, 2278\u0026ndash;2324\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDosovitskiy A et al (2020) An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/ARXIV.2010.11929\u003c/span\u003e\u003cspan address=\"10.48550/ARXIV.2010.11929\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTakahashi S et al (2024) Comparison of Vision Transformers and Convolutional Neural Networks in Medical Image Analysis: A Systematic Review. J Med Syst 48:84\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArksey H, O\u0026rsquo;Malley L (2005) Scoping studies: towards a methodological framework. Int J Soc Res Methodol 8:19\u0026ndash;32\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOuzzani M, Hammady H, Fedorowicz Z, Elmagarmid A (2016) Rayyan\u0026mdash;a web and mobile app for systematic reviews. 
Syst Rev 5:210\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZotero | Your personal research assistant. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.zotero.org/\u003c/span\u003e\u003cspan address=\"https://www.zotero.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCollins GS et al (2024) TRIPOD\u0026thinsp;+\u0026thinsp;AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 385:e078378\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMuller S, Tilakaratne WM (2022) Update from the 5th Edition of the World Health Organization Classification of Head and Neck Tumors: Tumours of the Oral Cavity and Mobile Tongue. Head Neck Pathol 16:54\u0026ndash;62\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhalifa M, Albadawy M (2024) AI in diagnostic imaging: Revolutionising accuracy and efficiency. Comput Methods Programs Biomed Update 5:100146\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCollins GS et al (2014) External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol 14:40\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang S-C, Pareek A, Seyyedi S, Banerjee I, Lungren MP (2020) Fusion of medical imaging and electronic health records using deep learning: a systematic review and implementation guidelines. Npj Digit Med 3:136\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUS Food and Drug Administration (2025) Artificial intelligence and machine learning (AI/ML)-enabled medical devices. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices\u003c/span\u003e\u003cspan address=\"https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCin\u0026agrave; G, R\u0026ouml;ber TE, Goedhart R, Birbil Şİ (2025) Why we do need explainable AI for healthcare. Diagn Progn Res 9:24\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma D, Kudva V, Patil V, Kudva A, Bhat RS (2022) A Convolutional Neural Network Based Deep Learning Algorithm for Identification of Oral Precancerous and Cancerous Lesion and Differentiation from Normal Mucosa: A Retrospective Study. Eng Sci 18:278\u0026ndash;287\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFu Q et al (2020) A deep learning algorithm for detection of oral cavity squamous cell carcinoma from photographic images: A retrospective study. eClinicalMedicine 27:100558\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJ A et al (2024) A Deep Learning System to Predict Epithelial Dysplasia in Oral Leukoplakia. J Dent Res 103:1218\u0026ndash;1226\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUpadhyay D, Manwal M, Kukreja V, Sharma R (2024) A Fine-Tuned Yolov5 and Exception Model for Oral Cancer Detection. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/INCET61516.2024.10592942\u003c/span\u003e\u003cspan address=\"10.1109/INCET61516.2024.10592942\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/INCET61516.2024.10592942\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaini A, Guleria K, Sharma S (2023) A Pre-trained MobileNetV2 Model for Oral Cancer Classification. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/IDICAIEI58380.2023.10406692\u003c/span\u003e\u003cspan address=\"10.1109/IDICAIEI58380.2023.10406692\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/IDICAIEI58380.2023.10406692\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eQu J (2024) A Remote Network Transmission Diagnosis Method for Oral Cancer Based on 6G and Rough Set Theory Hierarchical Diagnosis. Wirel Pers Commun. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11277-024-11242-9\u003c/span\u003e\u003cspan address=\"10.1007/s11277-024-11242-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBaliarsingh SK, Dev PP, Bandyopadhyay A, Dash AK, Pradhan R (2024) A Smartphone-based Deep Learning Framework for Early Detection of Oral Cancer Signs. 181\u0026ndash;186. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ESIC60604.2024.10481662\u003c/span\u003e\u003cspan address=\"10.1109/ESIC60604.2024.10481662\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu P, Bagi K (2025) A tailored deep learning approach for early detection of oral cancer using a 19-layer CNN on clinical lip and tongue images. 
Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePradhan P (2025) Accuracy of ChatGPT 3.5, 4.0, 4o and Gemini in diagnosing oral potentially malignant lesions based on clinical case reports and image recognition. Med Oral Patol Oral Cir Bucal 30:e224\u0026ndash;e231\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKabir MF, Ahmad MY, Uddin R, Cordero M, Kant S (2025) Accurate and lightweight oral cancer detection using SE-MobileViT on clinically validated image dataset. Discov Artif Intell 5\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHadjouni M et al (2023) Advanced Meta-Heuristic Algorithm Based on Particle Swarm and Al-Biruni Earth Radius Optimization Methods for Oral Cancer Detection. IEEE Access. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/access.2023.3253430\u003c/span\u003e\u003cspan address=\"10.1109/access.2023.3253430\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVinayahalingam S et al (2024) Advancements in diagnosing oral potentially malignant disorders: leveraging Vision transformers for multi-class detection. Clin Oral Investig 28\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTalwar V et al (2023) AI-Assisted Screening of Oral Potentially Malignant Disorders Using Smartphone-Based Photographic Images. \u003cem\u003eCancers\u003c/em\u003e 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWarin K et al (2022) AI-based Analysis of Oral Lesions Using Novel Deep Convolutional Neural Networks for Early Detection of Oral Cancer. PLoS ONE. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0273508\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0273508\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRai V et al (2024) AI-Driven Smartphone Screening for Early Detection of Oral Potentially Malignant Disorders. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICONSTEM60960.2024.10568597\u003c/span\u003e\u003cspan address=\"10.1109/ICONSTEM60960.2024.10568597\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ICONSTEM60960.2024.10568597\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSanthiya M, Sindhuja M, Jegatha R, Manikandan J (2023) An Effective Automated Framework for Oral Cancer Detection by Enhanced Convolutional Neural Networks. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICoAC59537.2023.10249983\u003c/span\u003e\u003cspan address=\"10.1109/ICoAC59537.2023.10249983\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNanditha BR, Kiran GA, Chandrashekar HS, Dinesh MS, Murali S (2021) An Ensemble Deep Neural Network Approach for Oral Cancer Screening. Int J Online Biomed Eng 17:121\u0026ndash;134\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAftab J et al (2025) Artificial intelligence based classification and prediction of medical imaging using a novel framework of inverted and self-attention deep neural network architecture. Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchmidl B et al (2025) Artificial intelligence for image recognition in diagnosing oral and oropharyngeal cancer and leukoplakia. 
Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSurenthar M, Gunaseelan R, Balasubramaniam A, Elakiya J (2025) Artificial Intelligence in the Screening of Oral Cancer: A Cross\u0026ndash;Sectional Study on a Novel App\u0026ndash;Based Approach for Primary Health Care Settings. J Indian Acad Oral Med Radiol 37:215\u0026ndash;220\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRamesh E, Ganesan A, Lakshmi KC, Natarajan PM (2025) Artificial intelligence\u0026mdash;based diagnosis of oral leukoplakia using deep convolutional neural networks Xception and MobileNet-v2. Front Oral Health 6:1414524\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlabdan R, Alruban A, Mustafa Hilal AM, Motwakel A (2023) Artificial-Intelligence-Based Decision Making for Oral Potentially Malignant Disorder Diagnosis in Internet of Medical Things Environment. Healthc Switz 11\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePatel A et al (2024) Attention-guided convolutional network for bias-mitigated and interpretable oral lesion classification. Sci Rep 14\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChilet-Martos E, Vila-Franc\u0026eacute;s J, Bag\u0026aacute;n-Sebasti\u0026aacute;n JV, Vives-Gilabert Y (2025) Automated classification of oral cancer lesions: Vision transformers vs radiomics. Comput Biol Med 189\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eC S-S et al (2025) Automated classification of oral potentially malignant disorders and oral squamous cell carcinoma using a convolutional neural network framework: a cross-sectional study. Lancet Reg Health Am 47:101138\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWelikala RA et al (2020) Automated detection and classification of oral lesions using deep learning for early detection of oral cancer. 
IEEE Access 8:132677\u0026ndash;132693\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTanriver G, Soluk Tekkesin M, Ergen O (2021) Automated detection and classification of oral lesions using deep learning to detect oral potentially malignant disorders. Cancers 13:2766\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShamim MZM et al (2022) Automated Detection of Oral Pre-Cancerous Tongue Lesions Using Deep Learning for Early Diagnosis of Oral Cavity Cancer. Comput J 65:91\u0026ndash;104\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eManikandan J, Krishna BV, Varun N, Vishal V, Yugant S (2023) Automated Framework for Effective Identification of Oral Cancer Using Improved Convolutional Neural Network. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICONSTEM56934.2023.10142794\u003c/span\u003e\u003cspan address=\"10.1109/ICONSTEM56934.2023.10142794\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWarin K, Limprasert W, Suebnukarn S, Jinaporntham S, Jantana P (2021) Automatic Classification and Detection of Oral Cancer in Photographic Images Using Deep Learning Algorithms. J Oral Pathol Med. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/jop.13227\u003c/span\u003e\u003cspan address=\"10.1111/jop.13227\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong B et al (2018) Automatic classification of dual-modalilty, smartphone-based oral dysplasia and malignancy images using deep learning. Biomed Opt Express 9:5318\u0026ndash;5329\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBegum SH, Vidyullatha P (2023) Automatic Detection and Classification of Oral Cancer from Photographic Images Using Attention Maps and Deep Learning. 
Int J Intell Syst Appl Eng 11:221\u0026ndash;229\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLin H, Chen H, Weng L, Shao J, Lin J (2021) Automatic detection of oral cancer in smartphone-based images using deep learning for early diagnosis. J Biomed Opt 26\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSundari TS, Maheshwari M (2025) Automatic oral cancer detection using deep learning techniques. Biomed Signal Process Control 106\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong B et al (2021) Bayesian deep learning for reliable oral cancer image classification. Biomed Opt Express 12:6422\u0026ndash;6430\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eIslam MM, Alam KMR, Uddin J, Ashraf I, Samad MA (2023) Benign and Malignant Oral Lesion Image Classification Using Fine-Tuned Transfer Learning Techniques. Diagnostics 13:3360\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVijaya J, Rishabh K, Parashar P (2023) CanScan: Non-Invasive Techniques for Oral Cancer Detection. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ELEXCOM58812.2023.10370254\u003c/span\u003e\u003cspan address=\"10.1109/ELEXCOM58812.2023.10370254\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ELEXCOM58812.2023.10370254\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eal-Ali A et al (2025) CLASEG: advanced multiclassification and segmentation for differential diagnosis of oral lesions using deep learning. Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong B et al (2021) Classification of imbalanced oral cancer image data from high-risk population. J Biomed Opt 26\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong B et al (2024) Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer. 
\u003cem\u003eCANCERS\u003c/em\u003e 16\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHamada I et al (2025) Classification of Oral Cancer and Leukoplakia Using Oral Images and Deep Learning with Multi-Scale Random Crop Self-Training. Int Conf Pattern Recognit Appl Methods 1:780\u0026ndash;787\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoswami B, Bhuyan MK, Alfarhood S, Safran M (2024) Classification of Oral Cancer Into Pre-Cancerous Stages From White Light Images Using LightGBM Algorithm. IEEE Access 12:31626\u0026ndash;31639\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoswami B, Neogi S, Nagar SR, Punjabi N, Gudi R (2025) Classification of Oral Potentially Malignant Disorders Using Multimodal Feature Integration. \u003cem\u003eProc. - Int. Symp. Biomed. Imaging\u003c/em\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ISBI60581.2025.10980715\u003c/span\u003e\u003cspan address=\"10.1109/ISBI60581.2025.10980715\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ISBI60581.2025.10980715\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWelikala RA et al (2021) Clinically Guided Trainable Soft Attention for Early Detection of Oral Cancer. Lect Notes Comput Sci 13052:226\u0026ndash;236\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRathi S, Puranik A, Pratham S, Kulkarni V, Chincholkar H (2024) Comparative Analysis of CNN Architectures for Enhancing Oral Cancer Detection Using Advanced Image Processing Techniques. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICCUBEA61740.2024.10774894\u003c/span\u003e\u003cspan address=\"10.1109/ICCUBEA61740.2024.10774894\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ICCUBEA61740.2024.10774894\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBourass Y, Zouaki H, Bahri A (2016) Computer-Aided diagnostics of facial and oral cancer. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICoCS.2015.7483252\u003c/span\u003e\u003cspan address=\"10.1109/ICoCS.2015.7483252\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ICoCS.2015.7483252\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShruthi K et al (2024) Convolutional Neural Network For Detection Of Oral Cavity Leading To Oral Cancer From Photographic Images. Int J Comput Digit Syst 15:865\u0026ndash;877\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWei X et al (2024) Convolutional neural network for oral cancer detection combined with improved tunicate swarm algorithm to detect oral cancer. Sci Rep 14\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCamalan S et al (2021) Convolutional Neural Network-Based Clinical Predictors of Oral Dysplasia: Class Activation Map Analysis of Deep Learning Results. \u003cem\u003eCANCERS\u003c/em\u003e 13\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLim JH et al (2021) D\u0026rsquo;OraCa: Deep Learning-Based Classification of Oral Lesions with Mouth Landmark Guidance for Early Detection of Oral Cancer. Lect Notes Comput Sci 12722:408\u0026ndash;422\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee Y-H et al (2025) DCNN models with post-hoc interpretability for the automated detection of glossitis and OSCC on the tongue. 
Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSankaradass V, Devasenan R, Manish M, Gurunamasivayam VK, M., Govindasamy C (2025) Deep Learning Algorithms in Oral Lesion Diagnosis: Innovations in Image-Based optimization for Cancer Detection and Differential Diagnosis. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICDSAAI65575.2025.11011565\u003c/span\u003e\u003cspan address=\"10.1109/ICDSAAI65575.2025.11011565\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeo J et al (2022) Deep learning model for tongue cancer diagnosis using endoscopic images. Sci Rep 12\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSu A-Y, Wu M-L, Wu Y-H (2025) Deep learning system for the differential diagnosis of oral mucosal lesions through clinical photographic imaging. J Dent Sci 20:54\u0026ndash;60\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOrme\u0026ntilde;o-Arriagada P, Navarro E, Taramasco C, Gatica G, Vasconez JP (2025) Deep Learning Techniques for Oral Cancer Detection: Enhancing Clinical Diagnosis by ResNet and DenseNet Performance. Commun Comput Inf Sci 2236:59\u0026ndash;72\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlzahrani AA et al (2025) Deep structured learning with vision intelligence for oral carcinoma lesion segmentation and classification using medical imaging. Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarzouk R et al (2022) Deep Transfer Learning Driven Oral Cancer Detection and Classification Model. Comput Mater Contin 73:3905\u0026ndash;3920\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKumar A, Sharma N (2023) Detection and Classification of Oral Cancer Using Machine Learning Models. 522\u0026ndash;528. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICTACS59847.2023.10390071\u003c/span\u003e\u003cspan address=\"10.1109/ICTACS59847.2023.10390071\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKavyashree C, Vimala HS, J J (2024) Detection and segmentation of oral lesion using Mask R-CNN. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ACOIT62457.2024.10939183\u003c/span\u003e\u003cspan address=\"10.1109/ACOIT62457.2024.10939183\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ACOIT62457.2024.10939183\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong H-J et al (2023) Detection of Abnormal Changes on the Dorsal Tongue Surface Using Deep Learning. Med -Lith 59\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLa Mantia G et al (2024) Detection of Elementary White Mucosal Lesions by an AI System: A Pilot Study. Oral 4:557\u0026ndash;566\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFl\u0026uuml;gge T et al (2023) Detection of oral squamous cell carcinoma in clinical photographs using a vision transformer. Sci Rep 13\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKumar S, Pratap A, Saxena I (2025) Development of Oral Cancer Detection Technique: A Comprehensive Approach Using CNNs and TensorFlow Lite. 934\u0026ndash;938. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/CICTN64563.2025.10932593\u003c/span\u003e\u003cspan address=\"10.1109/CICTN64563.2025.10932593\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLA V et al (2025) Diagnostic Performance of ChatGPT-4o in Analyzing Oral Mucosal Lesions: A Comparative Study with Experts. 
Med Kaunas Lith 61\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMADO et al (2024) Diagn\u0026oacute;stico de c\u0026aacute;ncer oral mediante algoritmos de aprendizaje profundo. Ingenius Rev Cienc Tecnol. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.17163/ings.n32.2024.06\u003c/span\u003e\u003cspan address=\"10.17163/ings.n32.2024.06\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJurczyszyn K, Kozakiewicz M (2019) Differential diagnosis of leukoplakia versus lichen planus of the oral mucosa based on digital texture analysis in intraoral photography. Adv Clin Exp Med 28:1469\u0026ndash;1476\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eG\u0026uuml;rses BO et al (2025) Differentiation of benign and malignant oral lesions through surface texture analysis and SVM modeling. Clin Oral Investig 29\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePahadiya P, Vijay R, Gupta KK, Saxena S, Shahapurkar T (2023) Digital Image Based Segmentation and Classification of Tongue Cancer Using CNN. Wirel Pers Commun 132:609\u0026ndash;627\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang W, Liu Y, Wu J (2023) Early diagnosis of oral cancer using a hybrid arrangement of deep belief network and combined group teaching algorithm. Sci Rep 13\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee S-J et al (2024) Enhancing deep learning classification performance of tongue lesions in imbalanced data: mosaic-based soft labeling with curriculum learning. BMC Oral Health 24\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eda Silva AVB et al (2024) Enhancing Explainability in Oral Cancer Detection with Grad-CAM Visualizations. 
Lect Notes Comput Sci 14813:151\u0026ndash;164\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShuaib M et al (2025) Enhancing Oral Cancer Diagnosis through Attention Mechanisms and Explainable AI: A VGG-19 with CBAM Integration. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/CONIT65521.2025.11166859\u003c/span\u003e\u003cspan address=\"10.1109/CONIT65521.2025.11166859\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAttarde AV et al (2024) Enhancing Oral Cancer Screening with Deep Learning Algorithms. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICCSC62048.2024.10830423\u003c/span\u003e\u003cspan address=\"10.1109/ICCSC62048.2024.10830423\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChakraborty P, Saha N, Nath S, Das K (2025) Enhancing Oral Disease Classification using Dilated Convolutional Neural Network. 127\u0026ndash;130. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/IEEECONF64992.2025.10963041\u003c/span\u003e\u003cspan address=\"10.1109/IEEECONF64992.2025.10963041\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSharma A, Gupta S, Abbas HM (2025) Evaluating Deep Neural Networks for Oral Cancer Prediction: A Study Using ResNet50 and DenseNet121. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/OTCON65728.2025.11070632\u003c/span\u003e\u003cspan address=\"10.1109/OTCON65728.2025.11070632\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/OTCON65728.2025.11070632\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eG K, İŞ HY, B., F, N. P., \u0026Ouml; \u0026Ccedil; (2025) Evaluation of the Detectability of Oral Potentially Malignant Diseases with a Deep Learning Approach: A Retrospective Pilot Study. J Imaging Inf Med. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10278-025-01665-6\u003c/span\u003e\u003cspan address=\"10.1007/s10278-025-01665-6\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYadav DP, Sharma B, Noonia A, Mehbodniya A (2025) Explainable label guided lightweight network with axial transformer encoder for early detection of oral cancer. Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCimino MGCA et al (2025) Explainable screening of oral cancer via deep learning and case-based reasoning. Smart Health 35\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWelikala RA et al (2020) Fine-Tuning Deep Learning Architectures for Early Detection of Oral Cancer. Lect Notes Comput Sci 12508:25\u0026ndash;31\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRabinovici-Cohen S et al (2024) From Pixels to Diagnosis: Algorithmic Analysis of Clinical Oral Photos for Early Detection of Oral Squamous Cell Carcinoma. \u003cem\u003eCANCERS\u003c/em\u003e 16\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShamim MZM (2022) Hardware Deployable Edge-AI Solution for Prescreening of Oral Tongue Lesions Using TinyML on Embedded Devices. 
IEEE Embed Syst Lett 14:183\u0026ndash;186\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eReddy MR, Saritha KN, Reddy A, Nagaratnamaiah PA, C., Sandhya TK (2025) Hybrid Deep Learning Framework for Real-Time Oral Cancer Detection and Prevention Using Multi-Model CNN Integration. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ISAC364032.2025.11156758\u003c/span\u003e\u003cspan address=\"10.1109/ISAC364032.2025.11156758\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ISAC364032.2025.11156758\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eParola M et al (2023) Image-Based Screening of Oral Cancer via Deep Ensemble Architecture. 1572\u0026ndash;1578. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/SSCI52147.2023.10371865\u003c/span\u003e\u003cspan address=\"10.1109/SSCI52147.2023.10371865\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee YO, Kim J, Lee JW (2025) Improving Diagnostic Accuracy for Oral Cancer with inpainting Synthesis Lesions Generated Using Diffusion Models. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.48550/arXiv.2508.06151\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2508.06151\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eParola M et al (2025) Improving oral cancer classification via segment-driven photographic deep learning imaging. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/CISMCompanion65074.2025.11032552\u003c/span\u003e\u003cspan address=\"10.1109/CISMCompanion65074.2025.11032552\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/CISMCompanion65074.2025.11032552\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKarthikeyan B et al (2024) Design and Development of an Oral Cancer Identification Methodology based on Improved Neural Classification Scheme. 411\u0026ndash;416. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICSCSA64454.2024.00072\u003c/span\u003e\u003cspan address=\"10.1109/ICSCSA64454.2024.00072\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBordoloi D, Joshi K, Kukreja V, Sharma R (2024) Innovative Approaches in Oncology: YOLOv5 and EfficientNet for Improved Oral Cancer Diagnosis. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/APCIT62007.2024.10673457\u003c/span\u003e\u003cspan address=\"10.1109/APCIT62007.2024.10673457\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKavyashree C, Vimala HS (2025) Instance Segmentation of Oral Cancer Images with Fusion of Swin Transformer and Mask RCNN. J Innov Image Process 7:695\u0026ndash;706\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAlanazi AA, Khayyat MM, Khayyat MM, Elnaim BM, Abdel-khalek S (2022) Intelligent Deep Learning Enabled Oral Squamous Cell Carcinoma Detection and Classification Using Biomedical Images. \u003cem\u003eComput. Intell. 
Neurosci.\u003c/em\u003e (2022)\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen R, Wang Q, Huang X (2024) Intelligent deep learning supports biomedical image detection and classification of oral cancer. Technol Health Care 32:S465\u0026ndash;S475\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFiguera K et al (2022) Interpretable deep learning approach for oral cancer classification using guided attention inference network. J Biomed Opt 27:015001\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDinesh Y, Ramalingam K, Ramani P, Deepak RM (2023) Machine Learning in the Detection of Oral Lesions With Clinical Intraoral Images. \u003cem\u003eCureus\u003c/em\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7759/cureus.44018\u003c/span\u003e\u003cspan address=\"10.7759/cureus.44018\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.7759/cureus.44018\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchw\u0026auml;rzler J et al (2025) Machine learning versus clinicians for detection and classification of oral mucosal lesions. J Dent 161\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiyanage V, Tao M, Park S, Wang JS, K. N., Azimi S (2023) Malignant and non-malignant oral lesions classification and diagnosis with deep neural networks. J Dent 137\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong B et al (2021) Mobile-based oral cancer classification for point-of-care screening. J Biomed Opt 26\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRashid J et al (2024) Mouth and oral disease classification using InceptionResNetV2 method. Multimed Tools Appl 83:33903\u0026ndash;33921\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003e\u0026Ouml;zen BB, Karadaş F, Ba Alawi A (2024) Multi-Model Stacking Ensemble Approach for Improving Oral Cancer Diagnosis. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/SIU61531.2024.10600983\u003c/span\u003e\u003cspan address=\"10.1109/SIU61531.2024.10600983\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/SIU61531.2024.10600983\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRedondo A et al (2026) Multiclass classification of oral mucosal lesions by deep learning from clinical images without performing any restrictions. Biomed Signal Process Control 111\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDevindi GAI et al (2024) Multimodal Deep Convolutional Neural Network Pipeline for AI-Assisted Early Detection of Oral Cancer. IEEE Access 12:124375\u0026ndash;124390\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang J, Tian Y, Song B, Lee E-J (2025) Multimodal Learning for Enhanced Detection in Oral Cancer Screening. \u003cem\u003eProc. - Int. Symp. Biomed. Imaging\u003c/em\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ISBI60581.2025.10981235\u003c/span\u003e\u003cspan address=\"10.1109/ISBI60581.2025.10981235\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ISBI60581.2025.10981235\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLi J et al (2025) Next-generation AI framework for comprehensive oral leukoplakia evaluation and management. Npj Digit Med 8\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRazmjouei P et al (2025) NFR-EDL: Non-linear fuzzy rank-based ensemble deep learning for accurate diagnosis of oral and dental diseases using RGB color photography. Comput Biol Med 192\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang Q, Ding H, Razmjooy N (2023) Optimal deep learning neural network using ISSA for diagnosing the oral cancer. 
Biomed Signal Process Control 84\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSekaran R, Manikandan M, Suliman W, Ravi V (2024) Optimizing Oral Cancer Diagnosis with Advanced Deep Learning Approaches. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ISAECT64333.2024.10799545\u003c/span\u003e\u003cspan address=\"10.1109/ISAECT64333.2024.10799545\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ISAECT64333.2024.10799545\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRajan A, Oviya IR (2024) Oral Cancer Classification Using Few-Shot Learning with CNN and Siamese Networks. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/INDICON63790.2024.10958513\u003c/span\u003e\u003cspan address=\"10.1109/INDICON63790.2024.10958513\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/INDICON63790.2024.10958513\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhan SUR, Asif S (2024) Oral cancer detection using feature-level fusion and novel self-attention mechanisms. Biomed Signal Process Control 95:106437\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSwamikannan LD et al (2024) Oral Cancer Detection Using Mobile Vision Technology. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/BHI62660.2024.10913489\u003c/span\u003e\u003cspan address=\"10.1109/BHI62660.2024.10913489\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/BHI62660.2024.10913489\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChai Y, Chai X, Zhang L, Ye G, Rashid Sheykhahmad FR (2025) Oral cancer detection via Vanilla CNN optimized by improved artificial protozoa optimizer. 
Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSaxena K et al (2025) Oral Cancer Detection with Customized Deep Neural Network Based Transfer Learning Technique: A Comprehensive 2-D Image Analysis. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/APSIT63993.2025.11086279\u003c/span\u003e\u003cspan address=\"10.1109/APSIT63993.2025.11086279\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMedapati MP, Ahmed A, Subash K, K., Aeron A (2024) Oral Cancer Detections and Classification Using Region Based Convolutional Neural Network. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/TQCEBT59414.2024.10545203\u003c/span\u003e\u003cspan address=\"10.1109/TQCEBT59414.2024.10545203\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/TQCEBT59414.2024.10545203\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang L, Shi R, Yousefi N (2024) Oral cancer diagnosis based on gated recurrent unit networks optimized by an improved version of Northern Goshawk optimization algorithm. \u003cem\u003eHeliyon\u003c/em\u003e 10\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTeo AHA, Goh CP (2024) Oral Disease Image Detection System Using Transfer Learning. 194\u0026ndash;198. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICDXA61007.2024.10470514\u003c/span\u003e\u003cspan address=\"10.1109/ICDXA61007.2024.10470514\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSingh R, Sharma N, Rajput K, Singh M (2024) Oral Lesions Classification Using EfficientNet Transfer Learning Model. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/WCONF61366.2024.10692037\u003c/span\u003e\u003cspan address=\"10.1109/WCONF61366.2024.10692037\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/WCONF61366.2024.10692037\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNanditha BR, Kiran GA, Chandrashekar HS, Dinesh MS, Murali S (2020) Oral Malignancy Detection Using Color Features from Digital True Color Images. Int J Online Biomed Eng 16:95\u0026ndash;106\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXie F et al (2024) Oral mucosal disease recognition based on dynamic self-attention and feature discriminant loss. Oral Dis 30:3094\u0026ndash;3107\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBhopal M, Ranjan R (2023) Oral Tumor Detection based on Convolution Neural Network. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/INCOFT60753.2023.10425572\u003c/span\u003e\u003cspan address=\"10.1109/INCOFT60753.2023.10425572\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/INCOFT60753.2023.10425572\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDivya S, Oviya IR, PrasannaKumar R (2024) Oralnet: A Deep Learning Model for Automated Oral Cancer Detection. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/INDICON63790.2024.10958407\u003c/span\u003e\u003cspan address=\"10.1109/INDICON63790.2024.10958407\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAsif S, Wang VY, Xu D (2025) OralTransNet: A novel hybrid model integrating transformer attention and CNN features for accurate diagnosis of mouth and oral diseases. 
Eng Appl Artif Intell 159\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKumar M et al (2025) Performance Evaluation of Large Language Models in Detecting Buccal Mucosal Lesions Using Smartphone-Based Imaging. J Pioneer Med Sci 14:102\u0026ndash;107\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhovidhunkit S-OP et al (2025) Performance of deep learning models for the classification and object detection of different oral white lesions using photographic images. Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUthoff RD et al (2018) Point-of-care, smartphone-based, dual-modality, dual-view, oral cancer screening device with neural network classification for low-resource communities. PLoS ONE 13:e0207493\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWuttisarnwattana P et al (2024) Precise Identification of Oral Cancer Lesions Using Artificial Intelligence. Stud Health Technol Inf 316:1096\u0026ndash;1097\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNooralli IM, Patil SB, Kulkarni G (2024) Predictive Analytics for Tongue Diseases: A Comparative Study of Deep Learning Models. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/INNOVA63080.2024.10847037\u003c/span\u003e\u003cspan address=\"10.1109/INNOVA63080.2024.10847037\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/INNOVA63080.2024.10847037\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDissorn P et al (2023) Preprocessing Technique for Oral Lesion Classification using U-NET Segmentation. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/BMEiCON60347.2023.10322048\u003c/span\u003e\u003cspan address=\"10.1109/BMEiCON60347.2023.10322048\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/BMEiCON60347.2023.10322048\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePradeep Singh SM et al (2023) Real Time Oral Cavity Detection Leading to Oral Cancer using CNN. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/NMITCON58196.2023.10275851\u003c/span\u003e\u003cspan address=\"10.1109/NMITCON58196.2023.10275851\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAl Duhayyim M et al (2023) Sailfish Optimization with Deep Learning Based Oral Cancer Classification Model. Comput Syst Sci Eng 45:753\u0026ndash;767\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDesai KM et al (2025) Screening of oral potentially malignant disorders and oral cancer using deep learning models. Sci Rep 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZahran FM, el-Din YS, Azab NA (2025) The Application of Artificial-based Models to Classify Oral Cavity Findings Based on Clinical Image Analysis. Adv Dent J 7:627\u0026ndash;641\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePhosri K et al (2022) The Comparison of Deep Learning Model Efficiency for Classification of Oral White Lesions. 235\u0026ndash;238. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ITC-CSCC55581.2022.9894916\u003c/span\u003e\u003cspan address=\"10.1109/ITC-CSCC55581.2022.9894916\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eParola M et al (2024) Towards explainable oral cancer recognition: Screening on imperfect images via Informed Deep Learning and Case-Based Reasoning. Comput Med Imaging Graph. 117\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAra\u0026uacute;jo ALD et al (2025) Two-step pipeline for oral diseases detection and classification: a deep learning approach. Front Oral Health 6:1659323\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGomes R et al (2023) Use of Artificial Intelligence in the Classification of Elementary Oral Lesions from Clinical Images. Int J Environ Res Public Health 20:3894\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYe Y-J, Han Y, Liu Y, Guo Z-L, Huang M-W (2024) Utilizing deep learning for automated detection of oral lesions: A multicenter study. Oral Oncol. 155\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVemulapalli L, Kola AVS, Ravuri CC, Kanagala A (2025) White Light Medical Image Based Oral Cancer Diagnosis Using an Ensemble Deep Learning Model. 1132\u0026ndash;1137. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICCSAI64074.2025.11063769\u003c/span\u003e\u003cspan address=\"10.1109/ICCSAI64074.2025.11063769\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOral Cancer (Lips and Tongue) images. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.kaggle.com/datasets/shivam17299/oral-cancer-lips-and-tongue-images\u003c/span\u003e\u003cspan address=\"https://www.kaggle.com/datasets/shivam17299/oral-cancer-lips-and-tongue-images\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSarker IH (2021) Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput Sci 2:160\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFei-Fei L Knowledge transfer in learning to recognize visual objects classes\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRibeiro MT, Singh S, Guestrin C (2016) \u0026lsquo;Why Should I Trust You?\u0026rsquo;: Explaining the Predictions of Any Classifier. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.1602.04938\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1602.04938\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSimonyan K, Vedaldi A, Zisserman A (2014) Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.1312.6034\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1312.6034\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSelvaraju RR et al (2020) Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Int J Comput Vis 128:336\u0026ndash;359\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eal SHB (2023) et. 
Automatic Detection and Classification of Oral Cancer from Photographic Images Using Attention Maps and Deep Learning. Int J Intell Syst Appl Eng\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLundberg S, Lee S-I (2017) A Unified Approach to Interpreting Model Predictions. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.1705.07874\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1705.07874\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBrown AC, Salmon PJM, Leffell DJ, Ko JM, Grant-Kels JM (2022) Artificial intelligence in the detection of skin cancer. J Am Acad Dermatol 87:1336\u0026ndash;1342\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSengupta N, Sarode S, Sarode G, Ghone U (2022) Scarcity of publicly available oral cancer image datasets for machine learning research. Oral Oncol 126\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTschandl P, ViDIR Group (2018) The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Harvard Dataverse \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7910/DVN/DBW86T\u003c/span\u003e\u003cspan address=\"10.7910/DVN/DBW86T\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFirdaus N, Raza Z (2025) Enhancing Privacy in Oral Cancer Detection through Federated Learning: A Cross-Institutional Study. 
Procedia Comput Sci 260:1113\u0026ndash;1120\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShephard AJ et al (2025) Development and validation of an artificial intelligence-based pipeline for predicting oral epithelial dysplasia malignant transformation. Commun Med 5\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFerrer-S\u0026aacute;nchez A, Bagan J, Vila-Franc\u0026eacute;s J, Magdalena-Benedito R, Bagan-Debon L (2022) Prediction of the risk of cancer and the grade of dysplasia in leukoplakia lesions using deep learning. Oral Oncol 132:105967\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSchouten D et al (2025) Navigating the landscape of multimodal AI in medicine: A scoping review on technical challenges and clinical applications. Med Image Anal 105:103621\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInternational Skin Imaging Collaboration (2024) SLICE-3D 2024 Permissive Challenge Dataset. International Skin Imaging Collaboration https://doi.org/10.34970/2024-SLICE-3D-PERMISSIVE\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang S-C et al (2023) Self-supervised learning for medical image classification: a systematic review and implementation guidelines. Npj Digit Med 6:74\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChen T, Kornblith S, Norouzi M, Hinton G (2020) A Simple Framework for Contrastive Learning of Visual Representations. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.2002.05709\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2002.05709\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHe K, Fan H, Wu Y, Xie S, Girshick R (2020) Momentum Contrast for Unsupervised Visual Representation Learning. 
Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.1911.05722\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.1911.05722\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMienye ID, Viriri S (2025) Graph Neural Networks in Medical Imaging: Methods, Applications and Future Directions. Information 16:1051\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKiechle J et al (2024) Graph Neural Networks: A suitable Alternative to MLPs in Latent 3D Medical Image Classification? Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.2407.17219\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2407.17219\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGoswami B et al (2025) Detection of Oral Potentially Malignant Lesions Through Transformer-Based Segmentation Models. Lect Notes Comput Sci 15305:318\u0026ndash;332\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang Y, Jiang H, Miura Y, Manning CD, Langlotz CP (2022) Contrastive Learning of Medical Visual Representations from Paired Images and Text. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.2010.00747\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2010.00747\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTiu E et al (2022) Expert-level detection of pathologies from unannotated chest X-ray images via self-supervised learning. 
Nat Biomed Eng 6:1399\u0026ndash;1406\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang J, Du B, Miao Y, Sun D, Cao X (2025) OralGPT: A Two-Stage Vision-Language Model for Oral Mucosal Disease Diagnosis and Description. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/arXiv.2510.13911\u003c/span\u003e\u003cspan address=\"10.48550/arXiv.2510.13911\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBukovszky B et al (2023) Malignant Transformation and Long-Term Outcome of Oral and Laryngeal Leukoplakia. J Clin Med 12:4255\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLoftus TJ et al (2022) Uncertainty-aware deep learning in healthcare: A scoping review. PLOS Digit Health 1:e0000085\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTripathi PC, Bag S (2023) An Attention-Guided CNN Framework for Segmentation and Grading of Glioma Using 3D MRI Scans. IEEE/ACM Trans Comput Biol Bioinform 20:1890\u0026ndash;1904\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePatel G, Dolz J (2022) Weakly supervised segmentation with cross-modality equivariant constraints. Med Image Anal 77:102374\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFerreira DL, Lau C, Salaymang Z, Arnaout R (2025) Self-supervised learning for label-free segmentation in cardiac ultrasound. Nat Commun 16:4070\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTripathi M, Kongprawechnon W, Kondo T (2025) A Highly Robust Encoder\u0026ndash;Decoder Network with Multi-Scale Feature Enhancement and Attention Gate for the Reduction of Mixed Gaussian and Salt-and-Pepper Noise in Digital Images. 
J Imaging 11:51\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMichel N, Negrel R, Chierchia G, Bercher J-F (2023) Domain-Aware Augmentations for Unsupervised Online General Continual Learning. Preprint at \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.48550/ARXIV.2309.06896\u003c/span\u003e\u003cspan address=\"10.48550/ARXIV.2309.06896\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWagata K, Huang C, Nihey F, Kosaka Y, Nakahara K (2025) Attribute-Aware Adversarial Domain Augmentation for Zero-Shot Medical Domain Adaptation. \u003cem\u003eAnnu. Int. Conf. IEEE Eng. Med. Biol. Soc.\u003c/em\u003e 1\u0026ndash;7\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShen Z, Mao M, Fan P (2024) A Primary Comparison of Diffusion Models and Generative Adversarial Networks for Image Synthesis. in \u003cem\u003eProceedings of the 7th International Conference on Machine Learning and Machine Intelligence (MLMI)\u003c/em\u003e 225\u0026ndash;234 (ACM, Osaka Japan, 2024). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3696271.3696307\u003c/span\u003e\u003cspan address=\"10.1145/3696271.3696307\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAraf I, Idri A, Chairi I (2024) Cost-sensitive learning for imbalanced medical data: a review. Artif Intell Rev 57:80\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYeung M, Sala E, Sch\u0026ouml;nlieb C-B, Rundo L (2022) Unified Focal loss: Generalising Dice and cross entropy-based losses to handle class imbalanced medical image segmentation. 
Comput Med Imaging Graph 95:102026\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eVito V, Stefanus LY (2022) An Asymmetric Contrastive Loss for Handling Imbalanced Datasets. Entropy 24:1303\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWelvaars K et al (2023) Implications of resampling data to address the class imbalance problem (IRCIP): an evaluation of impact on performance between classification algorithms in medical data. JAMIA Open 6:ooad033\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDashti A et al (2025) Uncertainty-Aware Deep Neural Network Training for Imbalanced Geochemical Data Distributions. Nat Resour Res. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11053-025-10568-w\u003c/span\u003e\u003cspan address=\"10.1007/s11053-025-10568-w\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGadzicki K, Khamsehashari R, Zetzsche C (2020) Early vs Late Fusion in Multimodal Convolutional Neural Networks. in \u003cem\u003eIEEE 23rd International Conference on Information Fusion (FUSION)\u003c/em\u003e 1\u0026ndash;6 (IEEE, Rustenburg, South Africa, 2020). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.23919/FUSION45008.2020.9190246\u003c/span\u003e\u003cspan address=\"10.23919/FUSION45008.2020.9190246\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLiu G, Zhang J, Chan AB, Hsiao JH (2024) Human attention guided explainable artificial intelligence for computer vision models. Neural Netw 177:106392\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDelaney E, Pakrashi A, Greene D, Keane MT (2023) Counterfactual explanations for misclassified images: How human and machine explanations differ. 
Artif Intell 324:103995\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePastor E, Poeta E, Panisson A, Perotti A, Ciravegna G (2025) Beyond Input Attribution: A Hands-On Tutorial to Concept-Based Explainable AI and Mechanistic Interpretability. in \u003cem\u003eProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2\u003c/em\u003e 6247\u0026ndash;6248 (Association for Computing Machinery, New York, NY, USA). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1145/3711896.3737606\u003c/span\u003e\u003cspan address=\"10.1145/3711896.3737606\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDanish S et al (2026) A comprehensive survey of Vision\u0026ndash;Language Models: Pretrained models, fine-tuning, prompt engineering, adapters, and benchmark datasets. Inf Fusion 126:103623\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAyyapa V et al (2024) Non-Invasive Oral Cancer Detection Using Hyperspectral Imaging and Advanced Spectral Unmixing Models. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/ICEC59683.2024.10837017\u003c/span\u003e\u003cspan address=\"10.1109/ICEC59683.2024.10837017\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/ICEC59683.2024.10837017\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThiem DGE et al (2021) Hyperspectral imaging and artificial intelligence to detect oral malignancy - part 1-automated tissue classification of oral muscle, fat and mucosa using a light-weight 6-layer deep neural network. Head Face Med 17\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCaughlin K et al (2024) Contrastive Clustering-Based Patient Normalization to Improve Automated In Vivo Oral Cancer Diagnosis from Multispectral Autofluorescence Lifetime Images. 
\u003cem\u003eCancers\u003c/em\u003e 16\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMarsden M et al (2021) Intraoperative Margin Assessment in Oral and Oropharyngeal Cancer Using Label-Free Fluorescence Lifetime Imaging and Machine Learning. IEEE Trans Biomed Eng 68:857\u0026ndash;868\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRamani RS et al (2025) Convolutional neural networks for accurate real-time diagnosis of oral epithelial dysplasia and oral squamous cell carcinoma using high-resolution in vivo confocal microscopy. Sci Rep 15:2555\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAbd El-Aziz AAA, Mahmood MA, Abd El-Ghany SA (2025) Enhancing Early Detection of Oral Squamous Cell Carcinoma: A Deep Learning Approach with LRT-Enhanced EfficientNet-B3 for Accurate and Efficient Histopathological Diagnosis. \u003cem\u003eDiagnostics\u003c/em\u003e 15\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChantapakul W et al (2025) Detection of Architectural Dysplastic Features From Histopathological Imagery of Oral Mucosa Using Neural Networks. \u003cem\u003eBioengineering\u003c/em\u003e \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/bioengineering12030216\u003c/span\u003e\u003cspan address=\"10.3390/bioengineering12030216\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.3390/bioengineering12030216\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ede Lima LM et al (2023) Importance of complementary data to histopathological image analysis of oral leukoplakia and carcinoma using deep neural networks. Intell Med 3:258\u0026ndash;266\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeo BS, Pal M, Panigrahi PK, Pradhan A (2025) An ensemble deep learning model with empirical wavelet transform feature for oral cancer histopathological image classification. 
Int J Data Sci Anal 20:1005\u0026ndash;1022\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAdeoye J, Su Y (2025) Deep learning with data transformation improves cancer risk prediction in oral precancerous conditions. Intell Med 5:141\u0026ndash;150\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXue Z, Liang Z, Rajaraman S, Marini N, Antani S (2025) Detecting Oral Cancer Using Tabular Deep Learning. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/COINS65080.2025.11125786\u003c/span\u003e\u003cspan address=\"10.1109/COINS65080.2025.11125786\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/COINS65080.2025.11125786\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDeshpande PR, Pansare BS, Pawar HY, Dash S (2025) Exploring the Use of Neural Networks for Early Detection of Oral Cancer and Other Dental Pathologies. 764\u0026ndash;769. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICOECA66273.2025.00136\u003c/span\u003e\u003cspan address=\"10.1109/ICOECA66273.2025.00136\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePavani V et al (2025) An Advanced Imaging and Machine Learning Algorithm for Enhanced Oral Cancer Detection. 285\u0026ndash;294. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICMLAS64557.2025.10967776\u003c/span\u003e\u003cspan address=\"10.1109/ICMLAS64557.2025.10967776\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShaheer KM et al (2024) Oral Cancer Analysis for Early Detection using Deep Learning. 317\u0026ndash;321. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1109/ICC-ROBINS60238.2024.10533923\u003c/span\u003e\u003cspan address=\"10.1109/ICC-ROBINS60238.2024.10533923\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang S-Y, Chiou C-Y, Tan Y-S, Chen C-Y, Chung P-C (2022) Deep Oral Cancer Lesion Segmentation with Heterogeneous Features. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/RASSE54974.2022.9989871\u003c/span\u003e\u003cspan address=\"10.1109/RASSE54974.2022.9989871\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/RASSE54974.2022.9989871\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eL L et al (2024) Development of an oral cancer detection system through deep learning. BMC Oral Health 24:1468\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePiazza C et al (2021) Deep Learning for Automatic Segmentation of Oral and Oropharyngeal Cancer Using Narrow Band Imaging: Preliminary Experience in a Clinical Perspective. Front Oncol. 11\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eParola M et al (2025) Oral Cancer Recognition on Photographic Images Via Deep Learning Semantic Segmentation. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/CIHMCompanion65205.2025.11002690\u003c/span\u003e\u003cspan address=\"10.1109/CIHMCompanion65205.2025.11002690\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/CIHMCompanion65205.2025.11002690\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang R et al (2024) Research and Application of Deep Learning Models with Multi-Scale Feature Fusion for Lesion Segmentation in Oral Mucosal Diseases. 
\u003cem\u003eBioengineering\u003c/em\u003e 11\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSong B et al (2022) Exploring uncertainty measures in convolutional neural network for semantic segmentation of oral cancer images. J Biomed Opt 27\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eThakuria T et al (2025) Smartphone-Based Oral Lesion Image Segmentation Using Deep Learning. J Imaging Inf Med. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10278-025-01455-0\u003c/span\u003e\u003cspan address=\"10.1007/s10278-025-01455-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHsu Y et al (2025) Oral mucosal lesions triage via YOLOv7 models. J Formos Med Assoc 124:621\u0026ndash;627\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKeser G, Pekiner F, Bayrakdar İŞ, Celik \u0026Ouml;, Orhan K (2024) A deep learning approach to detection of oral cancer lesions from intra oral patient images: A preliminary retrospective study. J Stomatol Oral Maxillofac Surg 125\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLaila S, Ema RR, Galib SM, Jamil ARM, Shahab Uddin AFMS (2024) A Novel Method to Detect Oral Carcinoma Using Box Annotation Based on YOLO Model. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/iCACCESS61735.2024.10499560\u003c/span\u003e\u003cspan address=\"10.1109/iCACCESS61735.2024.10499560\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e doi:10.1109/iCACCESS61735.2024.10499560\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBirur P et al (2022) Field validation of deep learning based Point-of-Care device for early detection of oral malignant and potentially malignant disorders. 
Sci Rep 12\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHaddaway NR, Page MJ, Pritchard CC, McGuinness LA (2022) PRISMA2020: An R package and Shiny app for producing PRISMA 2020-compliant flow diagrams, with interactivity for optimised digital transparency and Open Synthesis. Campbell Syst Rev 18:e1230\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"Queen Mary University of London","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Artificial intelligence, oral cancer, classification, clinical photography, scoping review","lastPublishedDoi":"10.21203/rs.3.rs-8865303/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-8865303/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eArtificial intelligence shows promise for oral cancer detection, yet clinical translation remains limited. This scoping review examined 134 studies (2015\u0026ndash;2025) investigating AI applications for oral lesion classification using visible-light clinical photography. Searches across Scopus, Web of Science, Embase, and PubMed followed PRISMA-ScR guidelines. 
Methodological limitations exist among studies; 25.4% utilised a single 131-image Kaggle dataset without ground-truth histological labelling, 99.3% employed supervised learning, and 8.2% performed external validation. Binary classification tasks predominated (59.7%), while dysplasia grading was seldom explored (10.4%). Convolutional neural network architectures, such as ResNet, dominated study designs. Critical gaps include limited multi-modal and multi-model integration, absence of ordinal classification approaches - reflecting disease progression, and underexplored potential of novel deep-learning architectures such as graph-based mechanisms, and use of frontier techniques to address data scarcity such as synthetic image generation.\u003c/p\u003e","manuscriptTitle":"AI for Classifying Oral Cancer and Precursor Lesions Using Visible-Light Photography","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-02-16 07:09:13","doi":"10.21203/rs.3.rs-8865303/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"6d9ee9e3-d661-4acc-b5f6-9ee596428680","owner":[],"postedDate":"February 16th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":62838329,"name":"Dentistry"}],"tags":[],"updatedAt":"2026-02-16T07:09:13+00:00","versionOfRecord":[],"versionCreatedAt":"2026-02-16 
07:09:13","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-8865303","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-8865303","identity":"rs-8865303","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

