DLNDD: An Explainable Deep Learning Framework for the Early Detection and Classification of Rare Diseases

Mian Muhammad Hamza

Research Article (Research Square). This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9305966/v1
This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Background
Rare Neurological Diseases (RNDs) represent a significant global health burden, affecting over 300 million patients. The diagnostic journey for RNDs is often a prolonged process, spanning from the initial manifestation of symptoms to severe disease stages. Limited labeled clinical data and high phenotypic heterogeneity present inherent challenges for both clinicians and automated systems.
The aim of this research is to develop a multi-dataset architecture that uses advanced preprocessing and a hybrid dataset to achieve diagnostic accuracy exceeding 95% while integrating advanced interpretability techniques.

Method
This study used four publicly available datasets, namely MRI images (A), symptom-based text (B, CSV 1), genetic data (C, CSV 2), and rare disease metadata (D, CSV 3), to support early RND diagnosis. A VGG-16 CNN architecture with Contrast Limited Adaptive Histogram Equalization (CLAHE) preprocessing was employed for the MRI dataset, which consists of five disease classes. For the tabular modalities, ten classifiers including MLP, SVM-RBF, Random Forest, and XGBoost were benchmarked. To ensure clinical transparency, Explainable AI (XAI) techniques, such as SHAP, LIME, and Grad-CAM, were integrated into the framework.

Results
The proposed DLNDD framework demonstrated ~100% accuracy on MRI images using the VGG-16 model. For tabular data, Logistic Regression achieved 98.89% accuracy on symptoms, Random Forest reached 96.67% on genetics, and XGBoost achieved 84.47% accuracy on Orphanet metadata.

Conclusions
The novel Deep Learning based Neurological Diseases Detection (DLNDD) framework showed that modality-specific models combined with XAI yield clinically meaningful findings even under rare disease data constraints. DLNDD outperformed previous studies in terms of accuracy and clinical interpretability. It also provides a replicable blueprint for multimodal rare disease autonomous systems and points toward federated, fusion-ready architectures.

Keywords: Rare Neurological Diseases (RNDs), deep learning, machine learning, MRI, CLAHE, XAI

1. Introduction and Review of Related Literature

Rare Neurological Diseases (RNDs) are a heterogeneous group of diseases that affect more than 300 million people worldwide.
The European Union (EU) defines RNDs as disorders that affect fewer than 1 in every 2,000 persons, yet they often remain undiagnosed. The average time to diagnosis is five to seven years. This long diagnostic delay arises from several factors, such as the lack of adequate clinical data and the high phenotypic heterogeneity of RNDs, which make early detection difficult for clinicians (Löhmus et al., 2023) [4]. It is estimated that almost 80% of RNDs have a genetic basis, and neurological symptoms are among the most common and disabling symptoms of these disorders (Germain et al., 2025) [3]. The high heterogeneity of RNDs makes it impossible to establish specific diagnostic patterns from clinical presentation or radiographic results alone. This further complicates early detection, which is why many patients receive several misdiagnoses before an accurate diagnosis. The resulting delay in therapeutic intervention negatively affects quality of life (Shazeeb et al., 2025) [8].

Recent developments in machine learning (ML) have had a substantial impact on the diagnosis and study of RNDs, especially when genomics, transcriptomics, and proteomics data are combined. Alganmi (2024) [1] conducted an extensive survey of ML techniques for combining omics data on rare neurological diseases, highlighting new biomarkers that can improve diagnostic accuracy and treatment outcomes. Dong et al. (2022) [2] demonstrated the effectiveness of these approaches in identifying genetic variants linked to rare diseases, such as sickle cell anemia and cystic fibrosis, and the suitability of ensemble methods for diverse and complex genetic data. That paper highlighted the strength of such algorithms in handling noisy or incomplete data, which is a typical situation in rare disease studies.
Furthermore, the integration of multi-omics data has shown potential to improve diagnostic accuracy. Germain et al. (2025) [3] used artificial intelligence (AI) to combine genetic and clinical data to diagnose Fabry disease, demonstrating that AI-based methods can significantly increase diagnostic accuracy when multiple data sources are used together. Multi-omics data integration is also limited by data availability; hence, data augmentation and transfer learning are critical. Löhmus et al. (2023) [4] discussed the importance of combining genomic, transcriptomic, and clinical data to overcome data scarcity and thus build more valid diagnostic models of RNDs.

Deep learning (DL) models, specifically convolutional neural networks (CNNs), have become an essential part of medical image analysis, including magnetic resonance imaging (MRI) analysis for RND diagnosis. Syed et al. (2021) [5] examined CNNs for classifying MRI scans of Alzheimer's disease and showed that transfer learning from ImageNet allows deep models to generalize effectively even with limited data, a situation often observed in imaging studies of rare diseases. To address data scarcity, Uppalapati et al. (2025) [6] proposed TinyViT-Batten, a few-shot vision transformer (ViT) that detects Batten disease on pediatric MRI images. The model uses explainable attention, which enables it to learn from small datasets and produce results that clinicians can interpret easily, which is essential in rare disease classification. Combining DL with retrieval-augmented architectures can also increase diagnostic accuracy for rare diseases. A framework suggested by Kim et al.
(2025) [7], called RADAR, combines real-time search of clinical literature with DL-based image analysis, allowing diagnosis without further fine-tuning on rare disease data, and is a promising direction for AI-assisted diagnostics.

Multimodal data synthesis is a critical step in the development of rare disease diagnostics. The simultaneous study of medical imaging, clinical symptoms, and genetic data has been shown to improve diagnostic results. Shazeeb et al. (2025) [8] studied multimodal neuroimaging methods, i.e., susceptibility-weighted imaging (SWI) and diffusion-tensor imaging (DTI), to diagnose rare neurodegenerative disorders and found that combining multiple imaging modalities can provide a more comprehensive diagnostic framework. The Human Phenotype Ontology (HPO) has increasingly been used to improve diagnostic yield by connecting phenotypic descriptions with genetic data. Köhler et al. (2021) [9] demonstrated that deep phenotyping using HPO, in conjunction with genomic data, increases diagnostic performance in diseases with similar symptomatology, which is particularly common in rare diseases with complex genetic models. Moreover, Germain et al. (2025) [3] applied AI methods to combine clinical, genetic, and imaging data to diagnose Fabry disease and found that combining heterogeneous data types enhances diagnostic processes and improves the knowledge of rare diseases.

As AI becomes increasingly integrated into clinical processes, explainability has become a necessity for the uptake of AI models in healthcare. Tjoa and Guan (2021) [10] reviewed Grad-CAM, a visualization method that creates heatmaps showing the areas of activation in images, thus helping clinicians understand which areas of the MRI affect model predictions.
This approach is useful in the diagnosis of rare diseases, in which subtle differences in images can be difficult even for experts to interpret. Another popular explainability technique is SHapley Additive exPlanations (SHAP), which can be applied to both tabular and medical imaging data when classifying rare diseases. Arrieta et al. (2020) [11] emphasized the significance of SHAP in providing transparency in model predictions, allowing clinicians to understand the rationale behind AI-based decisions, an essential requirement in the diagnosis of rare diseases, where expert validation is regularly requested.

Another new area of research is the intersection of AI with knowledge graph-based systems, including RDBridge, which encode clinical, genetic, and phenotypic data into structured knowledge bases that can be used by AI models. Xing et al. (2023) [12] emphasized the potential of knowledge graphs to support AI systems in reasoning about intricate, infrequent conditions, which in turn provides a solid basis for future diagnostic processes.

Table 1 Summary of Reviewed Literature: Methodology and Domain.
Study | Method(s) | Domain
Alganmi (2024) | ML + Omics Review | Rare Neurological Diseases
Germain et al. (2025) | AI Review (Fabry) | Fabry Disease / Rare
Uppalapati et al. (2025) | TinyViT + Few-Shot | Batten Disease (MRI)
Kim et al. (2025) | RADAR RAG Agents | Rare Brain (MRI)
Shazeeb et al. (2025) | Neuroimaging Editorial | Rare Diseases (General)
Tjoa & Guan (2021) | XAI Medical Imaging | Medical Image Classification
Arrieta et al. (2020) | XAI Taxonomy | Clinical AI
Dong et al. (2022) | Ensemble ML, Genomics | Genetic Rare Diseases
Köhler et al. (2021) | HPO / Deep Phenotyping | Rare Diseases (General)
Syed et al. (2021) | Deep CNN Transfer Learning | Alzheimer's MRI
Ma et al. (2021) | Multimodal Anomaly Detection | Rare Brain Phenotypes
Reuter et al. (2024) | Transformer, Phenotyping | Rare Disease Cohorts

1.1.
Challenges (CH)

CH 1 The problem is the lack of a unified multimodal framework for rare neurological disease diagnosis. This challenge is addressed by Research Questions 3 and 4, and by Objectives 2 and 4. The contribution of this study is the development of a multimodal deep learning and machine learning framework that can handle heterogeneous data types (CB1).

CH 2 The issue is the limited integration of explainability methods in existing studies. This is explored through Research Questions 1 and 2 and is addressed by Objective 3. The study contributes by integrating explainable AI (XAI) techniques, such as SHAP, LIME, and Grad-CAM, to provide clinically meaningful interpretations of model predictions (CB2).

CH 3 The main problem is the absence of hybrid datasets that combine MRI, symptom, genetic, and metadata sources. This challenge relates to Research Question 3 and is targeted by Objectives 1 and 2. The contribution is the evaluation of a hybrid dataset, demonstrating performance across multiple rare disease modalities (CB3).

CH 4 No prior research provides cross-modality benchmarking for rare neurological disease diagnosis. This is examined through Research Questions 3 and 4 and is addressed by Objective 4. The study contributes by performing cross-modality performance analysis to identify the relative importance of each data modality (CB4).

CH 5 Existing studies show weak real-world validation, reducing clinical trust in AI models. This challenge is explored through Research Question 1 and addressed by Objectives 3 and 4. The study contributes by providing clinically interpretable results validated through XAI techniques, enhancing reliability for potential deployment (CB5).

1.2.
Problem Statement (PS)

The problem statement summarizes the gaps and limitations in previous related research. The PS of DLNDD is defined as follows.

PS Despite many findings on MRI images with deep learning, tabular ML for genetic and symptom data, rare disease knowledge graphs, and XAI methods, no prior study has used a hybrid dataset covering four heterogeneous data sources (images, symptom text, genetic data, and administrative disease metadata) within a unified evaluation and explainability framework targeting rare neurological conditions.

1.3. Objectives (OBs)

Objectives are the aims or targets to be achieved in the research.

OB 1 Apply CLAHE-enhanced VGG16 transfer learning to the five-class brain MRI image dataset (A) for early detection of rare neurological diseases with more than 95% accuracy.

OB 2 Benchmark ten ML classifiers on the three tabular datasets B, C, and D, namely symptom-to-disease text (CSV1), genetic disease structured data (CSV2), and Orphanet rare disease metadata (CSV3), with stratified three-way splitting and comprehensive metric reporting.

OB 3 Integrate SHAP, LIME, and Grad-CAM explainability on the MRI data (A) and SHAP on the tabular datasets (B, C, and D) to evaluate the clinical coherence of feature attributions.

OB 4 Conduct a cross-modality comparison of the best models to quantify the relative diagnostic information content of each modality and to provide a blueprint for future multimodal fusion systems.

1.4. Research Questions (RQs)

Research questions ask how the objectives will be achieved. The RQs of this study are stated below as RQ 1 to RQ 4.

RQ 1: How do explainability methods like SHAP, LIME, and Grad-CAM contribute to the clinical interpretability of deep learning and machine learning models in rare neurological disease diagnosis?
RQ 2: How consistent are the feature attributions across multiple explainability frameworks (SHAP, LIME, Grad-CAM) for the MRI, symptom text, genetic, and rare disease metadata modalities?

RQ 3: How do different clinical data modalities (MRI, symptom text, genetic data, and rare disease metadata) contribute to the overall diagnostic accuracy for rare neurological diseases?

RQ 4: What is the relative importance of each modality (MRI dataset vs. tabular data) in predicting rare neurological diseases, and how can they be fused to improve diagnostic outcomes?

1.5. Contributions (CBs)

Contributions are the new additions this research makes to the topic. The CBs of this research are defined below as CB 1 to CB 5.

CB 1 Development of a multi-dataset deep learning and machine learning framework that integrates heterogeneous data types, including MRI images, symptom text, genetic data, and rare disease metadata, within a unified evaluation pipeline.

CB 2 Integration of state-of-the-art explainable AI (XAI) techniques, such as SHAP, LIME, and Grad-CAM, to enhance the clinical interpretability of both deep learning and machine learning models.

CB 3 Demonstration of the effectiveness of a hybrid dataset approach, enabling accurate and robust evaluation across four distinct modalities despite the inherent scarcity of rare disease data.

CB 4 Comprehensive cross-modality benchmarking, quantifying the relative diagnostic contribution of each data type and providing insights for future multimodal fusion systems.

CB 5 Assurance of clinical reliability by producing interpretable and actionable results that align with pathological features, supporting potential translation of AI models into real-world clinical settings.

Table 2 Relationships of CH, PS, OBs, RQs, and CBs.
Challenge | PS | RQs | OBs | CBs
Data heterogeneity (MRI, symptoms, genetics, metadata) | Lack of unified multimodal framework | RQ3, RQ4 | OB2, OB4 | CB1: DL/ML framework achieved 95%/90%+ accuracies.
Low interpretability in AI models | Limited explainability integration | RQ1, RQ2 | OB3 | CB2: XAI integration (SHAP, LIME, Grad-CAM).
Limited rare disease data | No hybrid dataset usage | RQ3 | OB1, OB2 | CB3: Hybrid dataset evaluation.
Modality comparison gap | No cross-modality benchmarking | RQ3, RQ4 | OB4 | CB4: Cross-modality performance analysis.
Clinical reliability issues | Weak real-world validation | RQ1 | OB3, OB4 | CB5: Clinically interpretable results.

2. Methodology

The DLNDD study was structured into four branches, each dedicated to a specific clinical data modality. The deep learning (DL) branch processed brain magnetic resonance imaging (MRI) data using CLAHE and fine-tuned a Visual Geometry Group 16 (VGG16) model through ImageNet-based transfer learning. Three machine learning (ML) branches handled tabular data, including symptom text (CSV1), genetic features (CSV2), and rare disease metadata (CSV3), applying encoding, scaling, and multiple classifiers. All branches converged in a final stage, where performance was evaluated using accuracy, macro F1-score, Area Under the Curve (AUC), Cohen’s kappa, and Matthews Correlation Coefficient (MCC), with explainability provided via Gradient-weighted Class Activation Mapping (Grad-CAM), SHapley Additive exPlanations (SHAP), and Local Interpretable Model-agnostic Explanations (LIME).

2.1. Dataset Description and Split

The Rare Neurological Diseases MRI Curated Edition dataset (AhsanNeural, 2026), sourced from Kaggle, was used in the deep learning branch. It contained 1,400 annotated brain MRI images across five rare diseases: Fukuyama Muscular Dystrophy (FMD), Hallervorden–Spatz Disease (HSD/PKAN), Moyamoya Disease (MMD), Pachygyria–Cerebellar Hypoplasia (PCH), and Walker–Warburg Syndrome (WWS), with 280 images per class, split overall into 980 training, 210 validation, and 210 test images. The Symptom2Disease dataset (CSV1) linked symptom descriptions to 24 diseases, with 180 test samples and TF-IDF features.
The Genetic Disease Prediction dataset (CSV2) included clinical and biomarker data for five diseases, with 150 test samples. CSV3, from Orphanet (2026), provided rare disease metadata, with 1,648 test samples.

Table 3 Dataset Description and Preprocessing Specifications.
Modality | Dataset | N Classes | Input Type | Train / Val / Test
DL (MRI) | Rare Neuro MRI Curated | 5 neurological conds. | 224×224 RGB + CLAHE | 980 / 210 / 210
CSV1 (Symptoms) | Symptom2Disease | 24 diseases | TF-IDF text (3,000 feats) | 840 / 180 / 180
CSV2 (Genetic) | Genetic Disease Prediction | 5 genetic diseases | Structured clinical features | 700 / 150 / 150
CSV3 (Rare Meta) | Orphanet 2026 Complete | 6 disorder types | Metadata + name keywords | 7,693 / 1,648 / 1,648

2.2. Preprocessing on MRI Dataset

After the dataset was loaded and split, the MRI images were preprocessed with a strictly standardized procedure before model training. The images were resized to 224 × 224 pixels to match the input size of the VGG-16 model, and grayscale images were replicated across the color planes to ensure compatibility with the three-channel RGB format. CLAHE, an adaptive local contrast-enhancement method, was then applied: it divides the image into a regular grid of small tiles, usually 8 by 8, and applies histogram equalization to each tile, using a user-specified clip limit to prevent over-enhancement of noise in low-contrast areas, as presented in Fig. 3.

2.3. Model Architecture

The deep learning model was built on the VGG16 architecture, a 16-layer convolutional neural network pretrained on the large-scale ImageNet dataset and designed for visual recognition. VGG16 has 13 convolutional layers arranged in five blocks (block1 to block5), each followed by a max-pooling operation, and finally three fully connected layers.
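As a rough check on the layout just described, the sketch below tallies the convolutional part of the network from the standard VGG16 design (3×3 kernels, channel widths 64, 128, 256, 512, 512, size-preserving convolutions). The block configuration is the published VGG16 design rather than a detail stated in this paper, the names are illustrative, and the fully connected layers are left out.

```python
# Standard VGG16 convolutional layout: (number of 3x3 conv layers, output channels) per block.
# Assumption: canonical VGG16 design, not taken from the DLNDD code.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def conv_base_summary(input_size=224, in_channels=3, kernel=3):
    """Return (conv_layer_count, parameter_count, final_feature_map_shape)."""
    layers = 0
    params = 0
    size = input_size
    channels = in_channels
    for n_convs, out_channels in VGG16_BLOCKS:
        for _ in range(n_convs):
            # kernel weights (k * k * in * out) plus one bias per output channel
            params += (kernel * kernel * channels + 1) * out_channels
            channels = out_channels
            layers += 1
        # each block ends with a 2x2 max-pooling that halves height and width
        size //= 2
    return layers, params, (size, size, channels)


layers, params, shape = conv_base_summary()
print(layers, params, shape)
```

With a 224 × 224 input, the five pooling steps reduce the feature map to 7 × 7 with 512 channels, which is what a replacement classification head would receive.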
The VGG-16 base was initialized with ImageNet weights (frozen), and the original classification head was replaced with a custom head designed for the five-class RND task: GlobalAveragePooling2D → Dense (512, ReLU) → BatchNormalization → Dropout (0.4) → Dense (5, Softmax), as shown in Fig. 4. The model has about 14.98 million parameters in total.

A standardized machine-learning pipeline was applied uniformly to each of the three tabular modalities. Feature preprocessing included: (1) encoding of categorical variables with the scikit-learn LabelEncoder; (2) median imputation of missing values with SimpleImputer; and (3) feature scaling with StandardScaler. In CSV1, TF-IDF vectorization of the raw symptom text (maximum 3,000 features, n-gram range (1, 2), and sub-linear term-frequency scaling) was followed by scaling, as shown in Fig. 5. Each modality was tested on 10 classifiers: Logistic Regression, Multilayer Perceptron (MLP) with two hidden layers and ReLU activation, Extra Trees Classifier, Support Vector Machine with radial basis function kernel (SVM-RBF), Random Forest with 100 estimators, Gradient Boosting with 100 estimators, XGBoost with 100 estimators, LightGBM with 100 estimators, K-Nearest Neighbors with k = 5, and AdaBoost. For CSV3 alone, an ensemble of Bagging Random Forests (Bagging RF) was also evaluated. Each model was trained on the 70% training split, and model selection was based on the macro-F1 score calculated on the 15% validation split. The final reported measures were based on the held-out 15% test partition, thus providing an objective evaluation, as shown in Fig. 4.

2.4. Evaluation Metrics and Libraries

· Libraries: TensorFlow, Keras, SHAP, LIME, XGBoost.
· Metrics: Accuracy, Precision, Recall, F1 score, MCC, AUC, Confusion Matrix, and Cohen’s Kappa.

3. Results and Discussions

3.1. Training parameters and setup

The model was trained on Kaggle, a cloud computing platform, with 2 NVIDIA GPUs.
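The parameter total of the architecture in Section 2.3 can be reproduced by hand. The sketch below is only an arithmetic check under stated assumptions: the convolutional base is taken to have the standard 14,714,688 parameters of VGG16 without its top classifier, batch normalization is assumed to follow the Keras convention of four values per feature (two trainable, two running statistics), and all function names are illustrative.

```python
# Assumption: standard VGG16 convolutional base (no top classifier).
VGG16_BASE_PARAMS = 14_714_688


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def head_params(features=512, hidden=512, classes=5):
    """Parameters of GAP -> Dense(512) -> BatchNorm -> Dropout -> Dense(5).

    Global average pooling and dropout add no parameters.
    Returns (trainable, non_trainable) counts for the head.
    """
    dense1 = dense_params(features, hidden)
    bn_trainable = 2 * hidden      # gamma and beta
    bn_non_trainable = 2 * hidden  # moving mean and moving variance
    dense2 = dense_params(hidden, classes)
    return dense1 + bn_trainable + dense2, bn_non_trainable


trainable_head, non_trainable_head = head_params()
total = VGG16_BASE_PARAMS + trainable_head + non_trainable_head
print(trainable_head, non_trainable_head, total)
```

Under these assumptions the total comes to 14,981,957. If the base is kept frozen, only the head's roughly 0.27 million trainable parameters are updated during training.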
Training time was 496.5 seconds (~8.3 minutes) and the model has 14,981,957 parameters.

Table 4 Training Configuration
Sr. | Parameter | Value / Description
01 | Optimizer | Adam
02 | Initial Learning Rate | 1 × 10⁻⁴
03 | Loss Function | Categorical Cross-Entropy
04 | Batch Size | 32
05 | Epochs | 25
06 | Total Training Time | 496 seconds (≈ 8.3 minutes) on GPU
07 | Best Epoch | Epoch 16
09 | EarlyStopping | Patience = 8, Monitor = val_loss, Restore Best Weights = True
10 | ReduceLROnPlateau | Factor = 0.4, Patience = 3, Minimum LR = 1 × 10⁻⁶

3.2. DL VGG-16 Performance Analysis with Quantitative Method

Figure 5 presents the VGG16 training dynamics across the 25 epochs configured in Table 4, in four panels: training and validation accuracy, training and validation loss, learning rate schedule, and the generalization gap over time. Training accuracy rose rapidly from 71.1% at epoch 0 to 95.7% by epoch 3, reflecting the rapid transfer of ImageNet-derived features to the MRI domain. Validation accuracy was volatile in early epochs (61.9% → 76.7% → 95.2%) as the model adapted from natural-image to medical MRI feature distributions, but stabilized above 97% from epoch 9 onward. The learning rate reduction at epoch 14 (from 1×10⁻⁴ to 4×10⁻⁵) produced a clear improvement in validation loss from 0.852 to 0.057, as presented in Fig. 5, supporting the value of adaptive learning rate scheduling. Training loss at the final epoch was 0.0055 and validation loss was 0.0078, a generalization gap of < 0.003, indicating negligible overfitting. The minimum validation loss of 0.0054 was achieved at epoch 19, consistent with the best model checkpoint.

On the held-out test set, the VGG16 model achieved perfect classification across all five rare neurological disease classes: Accuracy, F1-Macro, Precision, Recall, AUC, Cohen's Kappa, MCC, and Balanced Accuracy were all ~100%, as shown in Fig. 6.
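Several of the metrics listed above can be derived directly from a confusion matrix. As a minimal sketch (the function name is illustrative and not taken from the study's code; the 42-per-class test count assumes the 210 test images are spread evenly over five classes), the following computes accuracy, macro F1, Cohen's kappa, and the multiclass MCC, and applies them to a fully diagonal matrix:

```python
from math import sqrt


def metrics_from_confusion(cm):
    """cm[i][j] = number of samples with true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    true_counts = [sum(cm[i]) for i in range(k)]
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]

    accuracy = correct / total

    # Macro F1: unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)
    f1_scores = []
    for i in range(k):
        tp = cm[i][i]
        fp = pred_counts[i] - tp
        fn = true_counts[i] - tp
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / k

    # Cohen's kappa: observed agreement corrected for chance agreement
    chance = sum(t * p for t, p in zip(true_counts, pred_counts))
    expected = chance / total ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0

    # Multiclass Matthews correlation coefficient
    denom = sqrt((total ** 2 - sum(p * p for p in pred_counts))
                 * (total ** 2 - sum(t * t for t in true_counts)))
    mcc = (correct * total - chance) / denom if denom else 0.0

    return {"accuracy": accuracy, "macro_f1": macro_f1, "kappa": kappa, "mcc": mcc}


# Hypothetical diagonal matrix: 5 classes, 42 correctly classified test images each.
diagonal = [[42 if i == j else 0 for j in range(5)] for i in range(5)]
print(metrics_from_confusion(diagonal))
```

For a diagonal matrix every metric evaluates to 1.0. A matrix with equal counts in every cell gives kappa and MCC of 0, which is why these two are useful complements to plain accuracy.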
The VGG16 confusion matrix is entirely diagonal: no instance of any of the five disease classes was misclassified as any other. This result indicates that the model learned discriminative representations for all five conditions on this test set. Per-class metrics are uniformly perfect: precision = recall = F1 = 1.000 for Fukuyama Muscular Dystrophy, Hallervorden-Spatz Disease, Moyamoya Disease, Pachygyria-Cerebellar Hypoplasia, and Walker-Warburg Syndrome.

3.3. Machine Learning Algorithms Performance Analysis

On CSV1, Logistic Regression achieved 98.89% test accuracy and 98.84% macro F1 across 24 classes, with AUC = 99.97%, Kappa = 97.10%, and MCC = 97.10%. Extra Trees (98.33%) and MLP (95.00%) followed. Most classes showed perfect scores, with minor drops for Migraine and Allergy (recall = 85.70%) due to symptom overlap, while Diabetes had slightly lower precision (0.889), as shown in Fig. 8 and Table 5. For CSV2, Random Forest achieved 96.67% accuracy and 96.68% F1 (AUC = 95.01%, as shown in Fig. 10; Kappa ≈ 89%). Multiple models showed identical validation performance, indicating a clear decision boundary. Per-class F1 ranged from 93.30% to ~100%, as shown in Fig. 9. On CSV3, XGBoost achieved 84.47% accuracy and 73.99% F1. Performance varied widely, with strong results for Clinical Subtype (~100%) but lower results for Morphological Anomaly (0.404), mainly due to class imbalance and semantic overlap, as illustrated in Fig. 7 and Fig. 8. SHAP and LIME gave good interpretability results, as shown in Fig. 12.

Table 5 Comparison of All Four Datasets by Accuracy, F1 score, AUC, Kappa, MCC, and Train Time.
Modality | Best Model | Accuracy | F1-Macro | AUC | Kappa | MCC | Train Time
DL (MRI Images) | VGG16 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 496.5 s
CSV1 (Symptoms) | Logistic Regression | 0.9889 | 0.9884 | 0.9997 | 0.9710 | 0.9710 | 1.73 s
CSV2 (Genetics) | Random Forest | 0.9667 | 0.9668 | 0.9893 | 0.8917 | 0.8919 | 0.85 s
CSV3 (Rare Meta) | XGBoost | 0.8447 | 0.7399 | 0.9607 | 0.7545 | 0.7576 | 6.98 s

3.4.
Interpretability with XAI

The figure presents the full six-method XAI evaluation panel for a Pachygyria-Cerebellar Hypoplasia test case, classified with ~100% confidence by VGG16. The Grad-CAM activation pattern is clinically consistent with pachygyria: the attentional focus of the model is localized to the cortical mantle, where the reduced and shallow gyral folding pattern that characterizes pachygyria is most visually apparent. This interpretation is supported by the occlusion sensitivity mapping, which shows that the greatest drop in classification confidence occurs when the central cortical region is occluded, suggesting that the spatial attribution is not due to spurious background shortcuts. The distributed cortical pattern identified by the SHAP Gradient Explainer (cyan, bottom-right), illustrated in Fig. 11, is consistent with the typical smooth-to-moderately pachygyric cortical surface. Across the six methods, there is strong agreement that the model identifies Pachygyria-Cerebellar Hypoplasia by focusing on cortical thickness and gyral patterning, which are the main pathological features of this disorder.

Table 6 Comparison with State-of-the-Art Architectures.
Author / Study | Modality | Accuracy (Reported) | DLNDD Improvement
Syed et al. (2021) | MRI | 98.35% | +1.65%
Uppalapati et al. (2025) | MRI | ~94.50% | +5.50%
Dong et al. (2022) | Genetics | ~92.00% | +4.67%
Sabzeghabaiean (2025) | MRI | 95.00% | +5.00%
DLNDD (Proposed Study) | Multimodal (MRI) | ~100% | Benchmark
DLNDD (Proposed Study) | Symptom Text | 98.89% | Benchmark
DLNDD (Proposed Study) | Genetic Data | 96.67% | Benchmark

3.5. Limitations

· The MRI dataset was derived from a single curated source, leading to homogeneous imaging conditions that may limit generalizability to real-world, multi-institutional MRI data.
· The perfect VGG16 performance (1.000) may reflect overfitting due to the small test size (~42 images per class), reducing the reliability of the reported metrics.
· The tabular datasets (Symptom2Disease, Genetic Disease Prediction, Orphanet) are benchmark datasets and do not fully capture real-world clinical data complexity, such as free-text ambiguity and variable data quality.
· All four modalities were processed independently rather than with a unified multimodal fusion model, limiting the ability to exploit cross-modal relationships.
· SHAP analysis of CSV2 showed that most features had minimal contribution, suggesting possible data quality limitations or a lack of strong biological signal in the feature set.

4. Recommendations and Conclusions

4.1. Recommendations

· Future work should prioritize multi-site MRI validation using independently collected datasets from rare disease registries such as European Reference Networks (ERNs) and the NIH Undiagnosed Diseases Network. Federated learning offers a practical response to data scarcity by enabling model training across distributed hospital datasets without centralizing sensitive data, which supports compliance with GDPR and HIPAA while aiming for performance comparable to centralized approaches.
· For CSV3, class imbalance should be addressed using techniques such as SMOTE, class-weighted loss, and cost-sensitive learning. Additionally, transformer-based models (e.g., BioBERT, PubMedBERT) with self-supervised pretraining on Orphanet text could improve semantic understanding and the classification of challenging classes like Morphological Anomaly and Clinical Group.

4.2. Conclusion

This study developed and evaluated a multimodal DL/ML framework for rare neurological disease classification across four clinical data modalities. The optimized VGG16 model, combined with CLAHE and ImageNet transfer learning, achieved perfect performance (Accuracy, F1, AUC, Kappa, MCC = 1.000) on a five-class MRI dataset.
In the ML branches, Logistic Regression achieved 98.89% accuracy on symptom-based classification, Random Forest reached 96.67% on genetic data, and XGBoost achieved 84.47% accuracy (AUC = 0.9559) on Orphanet metadata. Overall, all modalities demonstrated strong performance, with macro-AUC values exceeding 96%.

Declarations

Funding Statement: No funding was received for the research presented in this manuscript.
Data Availability: Data are available on request.
Contribution Statement:
Ethical Statement: All four datasets are available on Kaggle and are permitted for research use. In addition, no human or animal subjects were involved in experiments in this research.
Conflict of Interest Statement: The author declares that there is no conflict of interest for the research presented in this manuscript.

References

Alganmi, N. (2024). A comprehensive review of the impact of machine learning and omics on rare neurological diseases. Biomedical Informatics, 4(1), 73. https://doi.org/10.3390/biomedinformatics4010073
Dong, C., Guo, W., Li, Y., Wang, Y., Li, J., & Zhou, J. (2022). Ensemble machine learning approaches for the identification of rare genetic disease variants. Frontiers in Genetics, 13, 862830. https://doi.org/10.3389/fgene.2022.862830
Germain, D. P., Gruson, D., Malcles, M., & Garcelon, N. (2025). Applying artificial intelligence to rare diseases: A literature review highlighting lessons from Fabry disease. Orphanet Journal of Rare Diseases, 20, 186. https://doi.org/10.1186/s13023-025-03655-x
Löhmus, M., Labuhn, A., Bravo, A., & Saez-Rodriguez, J. (2023). Artificial intelligence in rare disease diagnostics: From symptom recognition to multi-omics integration. Journal of Rare Diseases, 2(1), 12. https://doi.org/10.1186/s43058-023-00012-3
Syed, A. N., Anwar, S. M., Liaqat, R., Majid, M., Iqbal, J., & Bagci, U. (2021). Deep convolutional neural network-based classification of Alzheimer's disease using MRI data. arXiv:2101.02876.
https://arxiv.org/abs/2101.02876 Uppalapati, K., Yimenicioglu, B., Abdulkareem, S., Eftekhari, A., Uppalapati, B., & Kamath, V. (2025). TinyViT-Batten: Few-shot vision transformer with explainable attention for early Batten-disease detection on pediatric MRI. arXiv:2510.09649 . https://arxiv.org/abs/2510.09649 Kim, H. Y., Li, J., Solana, A. B., Pirkl, C. M., Wiestler, B., Schnabel, J. A., & Bercea, C. I. (2025). Learning to reason about rare diseases through retrieval-augmented agents. arXiv:2511.04720 . https://arxiv.org/abs/2511.04720 Shazeeb, M. S., Acosta, M. T., & Tifft, C. J. (2025). Editorial: Role of neuroimaging in the diagnosis and treatment of rare diseases. Frontiers in Neuroimaging , 4, 1566484. https://doi.org/10.3389/fnimg.2025.1566484 Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Babbi, G., Bello, S. M., Cedar, N., & others. (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Research , 49(D1), D1207–D1217. https://doi.org/10.1093/nar/gkaa1043 Tjoa, E., & Guan, C. (2021). A survey on explainability of supervised machine learning. IEEE Transactions on Neural Networks and Learning Systems , 32(10), 4233–4246. https://doi.org/10.1109/TNNLS.2020.3027314 Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion , 58, 82–115. https://doi.org/10.1016/j.inffus.2019.12.012 Xing, H., Zhang, D., Cai, P., Zhang, R., & Hu, Q. N. (2023). RDBridge: A knowledge graph of rare diseases based on large-scale text mining. Bioinformatics , 39(7), btad440. https://doi.org/10.1093/bioinformatics/btad440 AhsanNeural. (2026). Rare neurological diseases MRI curated edition [Dataset]. Kaggle. 
https://www.kaggle.com/datasets/ahsanneural/rare-neurological-diseases-mri-curated-edition AhsanNeural. (2026). Rare diseases Orphadata 2026 complete [Dataset]. Kaggle. https://www.kaggle.com/datasets/ahsanneural/rare-diseases-orphadata-2026 Alganmi, N. (2024). A comprehensive review of the impact of machine learning and omics on rare neurological diseases. Biomedical Informatics , 4(1), 73. https://doi.org/10.3390/biomedinformatics4010073 Additional Declarations No competing interests reported. Cite Share Download PDF Status: Posted Version 1 posted You are reading this latest preprint version Research Square lets you share your work early, gain feedback from the community, and start making changes to your manuscript prior to peer review in a journal. As a division of Research Square Company, we’re committed to making research communication faster, fairer, and more useful. We do this by developing innovative software and high quality services for the global research community. Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. 
\u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"2. Methodology","content":"\u003cp\u003eThe DLNDD study was structured into four branches, each handling a specific clinical data modality. The deep learning (DL) branch processed brain magnetic resonance imaging (MRI) data using CLAHE and fine-tuned a Visual Geometry Group 16 (VGG16) model through ImageNet-based transfer learning. Three machine learning (ML) branches handled tabular data, including symptom text (CSV1), genetic features (CSV2), and rare disease metadata (CSV3), applying encoding, scaling, and multiple classifiers. All branches converged in a final stage, where performance was evaluated using accuracy, macro F1-score, Area Under the Curve (AUC), Cohen\u0026rsquo;s kappa, and Matthews Correlation Coefficient (MCC), with explainability provided via Gradient-weighted Class Activation Mapping (Grad-CAM), SHapley Additive exPlanations (SHAP), and Local Interpretable Model-agnostic Explanations (LIME).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e \u003ch2\u003e2.1. Dataset Description and Split\u003c/h2\u003e \u003cp\u003eThe Rare Neurological Diseases MRI Curated Edition dataset (AhsanNeural, 2026), sourced from Kaggle, was used in the deep learning branch. It contained 1,400 annotated brain MRI images across five rare diseases: Fukuyama Muscular Dystrophy (FMD), Hallervorden\u0026ndash;Spatz Disease (HSD/PKAN), Moyamoya Disease (MMD), Pachygyria\u0026ndash;Cerebellar Hypoplasia (PCH), and Walker\u0026ndash;Warburg Syndrome (WWS). Each class contained 280 images, and the full set was split into 980 training, 210 validation, and 210 test images.\u003c/p\u003e \u003cp\u003eThe Symptom2Disease dataset (CSV1) linked symptom descriptions to 24 diseases; its text was represented with TF-IDF features, and 180 samples were held out for testing. The Genetic Disease Prediction dataset (CSV2) included clinical and biomarker data for five diseases, with 150 test samples. 
CSV3, from Orphanet (2026), provided rare disease metadata, with 1,648 test samples.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDataset Description and Preprocessing Specifications.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModality\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eDataset\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eN Classes\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eInput Type\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eTrain / Val / Test\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDL (MRI)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRare Neuro MRI Curated\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e5 neurological conds.\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e224\u0026times;224 RGB\u0026thinsp;+\u0026thinsp;CLAHE\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e980 / 210 / 210\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCSV1 (Symptoms)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSymptom2Disease\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e24 diseases\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTF-IDF text (3,000 feats)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e840 / 180 / 180\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCSV2 (Genetic)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGenetic Disease Prediction\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e5 genetic diseases\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eStructured clinical features\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e700 / 150 / 150\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCSV3 (Rare Meta)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOrphanet 2026 Complete\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e6 disorder types\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eMetadata\u0026thinsp;+\u0026thinsp;name keywords\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e7,693 / 1,648 / 1,648\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e 
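\u003cp\u003eThe 70 / 15 / 15 partition sizes in Table 3 can be reproduced with a per-class (stratified) split. The following is a minimal sketch, not the study\u0026rsquo;s actual code: the function name, the random seed, and the use of stratification are assumptions made for illustration, and the label list is synthetic (five classes of 280 images, mirroring the MRI dataset).\u003c/p\u003e

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.70, val_frac=0.15, seed=42):
    """Return (train, val, test) index lists that preserve per-class proportions.

    Hypothetical helper for illustration; the remainder after the train and
    validation shares is assigned to the test partition.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * train_frac)
        n_val = round(len(indices) * val_frac)
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test


# Synthetic labels (assumption): 5 classes x 280 images = 1,400 images.
classes = ['FMD', 'HSD', 'MMD', 'PCH', 'WWS']
labels = [c for c in classes for _ in range(280)]

train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

\u003cp\u003eWith 280 images per class, each class contributes 196 training, 42 validation, and 42 test images, which sums to the 980 / 210 / 210 split listed for the MRI modality.\u003c/p\u003e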
\u003c/div\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003e2.2. Preprocessing on MRI Dataset\u003c/h2\u003e \u003cp\u003eAfter the dataset was loaded and split, the MRI images were preprocessed with a standardized procedure before model training. The images were resized to 224 \u0026times; 224 pixels to match the input size of the VGG-16 model, and grayscale images were replicated across the color planes to ensure compatibility with the three-channel RGB format. Contrast was enhanced with CLAHE, an adaptive, locally contrast-enhancing method that divides the image into a regular grid of small tiles, usually 8 by 8, and applies histogram equalization to each tile, using a user-specified clip limit to prevent over-enhancement of noise in low-contrast areas, as presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003e2.3. Model Architecture\u003c/h2\u003e \u003cp\u003eThe deep learning model was built on the VGG16 architecture, a 16-layer convolutional neural network designed for visual recognition and pre-trained on the large-scale ImageNet dataset. VGG16 has 13 convolutional layers arranged in five blocks (block1 to block5), each ending with a max-pooling operation, followed by three fully connected layers. The VGG-16 base was initialized with ImageNet weights (frozen), and the original classification head was replaced with a custom head designed for the five-class RND task: GlobalAveragePooling2D \u0026rarr; Dense (512, ReLU) \u0026rarr; BatchNormalization \u0026rarr; Dropout (0.4) \u0026rarr; Dense (5, Softmax), as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e. 
The model has about 14.98\u0026nbsp;million parameters in total.\u003c/p\u003e \u003cp\u003eA standardized machine-learning pipeline was applied uniformly to each of the three tabular modalities. Feature preprocessing included: (1) encoding of categorical variables with scikit-learn LabelEncoder; (2) imputation of missing values with the median using SimpleImputer; and (3) scaling of features with StandardScaler. In CSV1, TF-IDF vectorization (maximum features 3000, n-gram range (1,2), and sub-linear term-frequency scaling) of raw symptom text was followed by scaling, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e. Each modality was tested on 10 classifiers: Logistic Regression, Multilayer Perceptron (MLP) with two hidden layers using the ReLU activation function, Extra Trees Classifier, Support Vector Machines with radial basis function kernel (SVM-RBF), Random Forest with 100 estimators, Gradient Boosting with 100 estimators, XGBoost with 100 estimators, LightGBM with 100 estimators, K-Nearest Neighbors with k\u0026thinsp;=\u0026thinsp;5, and AdaBoost. For CSV3 alone, an ensemble of Bagging Random Forests (Bagging RF) was also evaluated. Each model was trained on the 70% training split, and model selection used the macro-F1 score calculated on the 15% validation split. The final reported metrics were computed on the held-out 15% test partition, providing an unbiased evaluation, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e.\u003c/p\u003e \u003ch2\u003e2.4. Evaluation Metrics and Libraries\u003c/h2\u003e\n\u003cp\u003e\u0026middot; \u003cstrong\u003eLibraries:\u003c/strong\u003e TensorFlow, Keras, SHAP, LIME, and 
XGBoost.\u003c/p\u003e\n\u003cp\u003e\u0026middot; \u003cstrong\u003eMetrics:\u003c/strong\u003e Accuracy, Precision, Recall, F1 score, MCC, AUC, Confusion Matrix, and Cohen\u0026rsquo;s Kappa.\u003c/p\u003e"},{"header":"3. Results and Discussions","content":"\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003e3.1. Training parameters and setup\u003c/h2\u003e \u003cp\u003eThe model was trained on the Kaggle cloud computing platform with two NVIDIA GPUs. Training time was 496.5 seconds (~\u0026thinsp;8.3 minutes), and the model has 14,981,957 parameters.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eTraining Configuration\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSr.\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eParameter\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eValue / Description\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e01\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eOptimizer\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eAdam\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e02\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eInitial Learning Rate\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e1 \u0026times; 10⁻⁴\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e03\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eLoss Function\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eCategorical Cross-Entropy\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e04\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eBatch Size\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e32\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c2\"\u003e \u003cp\u003e05\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEpochs\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e25\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e06\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eTotal Training Time\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e496 seconds (\u0026asymp;\u0026thinsp;8.3 minutes) on GPU\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e07\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eBest Epoch\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eEpoch 16\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e09\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eEarlyStopping\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003ePatience\u0026thinsp;=\u0026thinsp;8, Monitor\u0026thinsp;=\u0026thinsp;val_loss,\u003c/p\u003e \u003cp\u003eRestore Best Weights\u0026thinsp;=\u0026thinsp;True\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd 
align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e10\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eReduceLROnPlateau\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eFactor\u0026thinsp;=\u0026thinsp;0.4, Patience\u0026thinsp;=\u0026thinsp;3, Minimum LR\u0026thinsp;=\u0026thinsp;1 \u0026times; 10⁻⁶\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003e3.2. DL VGG-16 Performance Analysis with Quantitative Method\u003c/h2\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e presents the VGG16 training dynamics across the 25 epochs configured in Table\u0026nbsp;\u003cspan refid=\"Tab4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, in four panels: training and validation accuracy, training and validation loss, learning rate schedule, and the generalization gap over time. Training accuracy rose rapidly from 71.1% at epoch 0 to 95.7% by epoch 3, reflecting the rapid transfer of ImageNet-derived features to the MRI domain. Validation accuracy was volatile in early epochs (61.9%\u0026rarr;76.7%\u0026rarr;95.2%) as the model adapted from natural-image to medical MRI feature distributions, but stabilized above 97% from epoch 9 onward. The learning rate reduction at epoch 14 (from 1\u0026times;10⁻⁴ to 4\u0026times;10⁻⁵) was followed by a clear improvement in validation loss, from 0.852 to 0.057, as presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, which highlights the benefit of adaptive learning rate scheduling. 
Training loss at the final epoch was 0.0055, and validation loss was 0.0078, with a generalization gap of \u0026lt;\u0026thinsp;0.003, indicating negligible overfitting. The minimum validation loss of 0.0054 was achieved at epoch 19, consistent with the best model checkpoint.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eOn the held-out test set, the VGG16 model achieved perfect classification across all five rare neurological disease classes: Accuracy\u0026thinsp;\u0026asymp;\u0026thinsp;100%, F1-Macro\u0026thinsp;\u0026asymp;\u0026thinsp;100%, Precision\u0026thinsp;\u0026asymp;\u0026thinsp;100%, Recall\u0026thinsp;\u0026asymp;\u0026thinsp;100%, AUC\u0026thinsp;\u0026asymp;\u0026thinsp;100%, Cohen's Kappa\u0026thinsp;\u0026asymp;\u0026thinsp;100%, MCC\u0026thinsp;\u0026asymp;\u0026thinsp;100%, and Balanced Accuracy\u0026thinsp;\u0026asymp;\u0026thinsp;100%, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe VGG16 confusion matrix is entirely diagonal: no instance of any of the five disease classes was misclassified as another. This result indicates that the model learned representations that separate all five conditions on this test set. Per-class metrics are uniformly perfect: precision\u0026thinsp;=\u0026thinsp;recall\u0026thinsp;=\u0026thinsp;F1\u0026thinsp;=\u0026thinsp;1.000 for Fukuyama Muscular Dystrophy, Hallervorden-Spatz Disease, Moyamoya Disease, Pachygyria-Cerebellar Hypoplasia, and Walker-Warburg Syndrome.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003e3.3. Machine Learning Algorithms Performance Analysis\u003c/h2\u003e \u003cp\u003eOn CSV1, Logistic Regression achieved 98.89% test accuracy and 98.84% macro F1 across 24 classes, with AUC\u0026thinsp;=\u0026thinsp;99.97%, Kappa\u0026thinsp;=\u0026thinsp;97.10%, and MCC\u0026thinsp;=\u0026thinsp;97.10%. 
Extra Trees (98.33%) and MLP (95.00%) followed. Most classes showed perfect scores, with minor drops in recall for Migraine and Allergy (85.70%) attributed to symptom overlap, while Diabetes had slightly lower precision (0.889), as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e \u003cb\u003eand\u003c/b\u003e Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eFor CSV2, Random Forest achieved 96.67% accuracy and 96.68% macro F1 (AUC\u0026thinsp;=\u0026thinsp;95.01%, Kappa\u0026thinsp;\u0026asymp;\u0026thinsp;89%), as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig10\" class=\"InternalRef\"\u003e10\u003c/span\u003e. Multiple models showed identical validation performance, suggesting that the classes are well separated. Per-class F1 ranged from 93.30% to ~\u0026thinsp;100%, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e.\u003c/p\u003e \u003cp\u003eOn CSV3, XGBoost achieved 84.47% accuracy and 73.99% macro F1. Performance varied widely across classes, with strong results for Clinical Subtype (~\u0026thinsp;100%) but weaker results for Morphological Anomaly (0.404), mainly due to class imbalance and semantic overlap, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e \u003cb\u003eand\u003c/b\u003e Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e. 
The SHAP and LIME explanations for the tabular models are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig12\" class=\"InternalRef\"\u003e12\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparison of All Four Datasets by Accuracy, F1 Score, AUC, Kappa, MCC, and Training Time.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"8\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eModality\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eBest Model\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAccuracy\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eF1-Macro\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e 
\u003cp\u003eAUC\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eKappa\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eMCC\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003eTrain Time\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDL (MRI Images)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eVGG16\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e1.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e1.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e1.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e1.0000\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e496.5 s\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCSV1 (Symptoms)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLogistic Regression\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9889\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9884\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9997\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.9710\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.9710\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e1.73 s\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCSV2 (Genetics)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eRandom Forest\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.9667\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.9668\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9893\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.8917\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.8919\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0.85 s\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCSV3 (Rare Meta)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eXGBoost\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.8447\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e0.7399\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e0.9607\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e0.7545\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c7\"\u003e \u003cp\u003e0.7576\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e6.98 s\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec15\" class=\"Section2\"\u003e 
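\u003cp\u003eThe Kappa and MCC columns of Table 5 can both be derived from a confusion matrix. The sketch below is illustrative only and is not the study\u0026rsquo;s code: the function name is hypothetical, and the example matrix is synthetic (a purely diagonal 5 \u0026times; 5 matrix with 42 test images per class, as an entirely diagonal MRI confusion matrix would look).\u003c/p\u003e

```python
import math


def kappa_and_mcc(cm):
    """Cohen's kappa and multiclass MCC from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    Hypothetical helper for illustration.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    true_totals = [sum(cm[i]) for i in range(k)]
    pred_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    cross = sum(t * p for t, p in zip(true_totals, pred_totals))

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_o = correct / n
    p_e = cross / (n * n)
    kappa = (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0

    # Multiclass MCC (the generalization of the binary formula).
    cov_tp = correct * n - cross
    cov_tt = n * n - sum(t * t for t in true_totals)
    cov_pp = n * n - sum(p * p for p in pred_totals)
    mcc = cov_tp / math.sqrt(cov_tt * cov_pp) if cov_tt and cov_pp else 0.0
    return kappa, mcc


# Synthetic example (assumption): 5 classes, 42 test images each, no errors.
perfect = [[42 if i == j else 0 for j in range(5)] for i in range(5)]
print(kappa_and_mcc(perfect))
```

\u003cp\u003eFor a purely diagonal matrix both statistics equal 1. Because both correct for chance agreement, they can fall noticeably below accuracy once errors are present, which is the pattern seen in the CSV2 and CSV3 rows of Table 5.\u003c/p\u003e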
\u003ch2\u003e3.4. Interpretability with XAI\u003c/h2\u003e \u003cp\u003eFigure \u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e presents the full six-method XAI evaluation panel for a Pachygyria-Cerebellar Hypoplasia test case, classified with ~\u0026thinsp;100% confidence by VGG16. The Grad-CAM activation pattern is clinically consistent with pachygyria: the attentional focus of the model is localized to the cortical mantle, where the reduced and shallow gyral folding that characterizes pachygyria is most visible. This interpretation is supported by the occlusion sensitivity mapping, which shows that the greatest drop in classification confidence occurs when the central cortical region is occluded, indicating that the spatial attribution does not rely on spurious background shortcuts.\u003c/p\u003e \u003cp\u003eThe SHAP Gradient Explainer (cyan, bottom-right) identifies a distributed cortical pattern that is consistent with the typical smooth-to-moderately pachygyric cortical surface, as illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig11\" class=\"InternalRef\"\u003e11\u003c/span\u003e. 
All six methods agree that the model identifies Pachygyria-Cerebellar Hypoplasia by focusing on cortical thickness and gyral patterning, the main pathological features of this disorder.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab6\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparison with State-of-the-Art Architectures.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAuthor / Study\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eModality\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAccuracy (Reported)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eDLNDD Improvement\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSyed et al. 
(2021)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMRI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e98.35%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u0026thinsp;1.65%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eUppalapati et al. (2025)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMRI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e~\u0026thinsp;94.50%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u0026thinsp;5.50%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eDong et al. (2022)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGenetics\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e~\u0026thinsp;92.00%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u0026thinsp;4.67%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSabzeghabaiean (2025)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMRI\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e95.00%\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u0026thinsp;5.00%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eDLNDD (Proposed Study)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eMultimodal (MRI)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e~\u0026thinsp;100%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003eBenchmark\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eDLNDD (Proposed Study)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSymptom Text\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e98.89%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003eBenchmark\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eDLNDD (Proposed Study)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eGenetic Data\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e\u003cb\u003e96.67%\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e\u003cb\u003eBenchmark\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003ch2\u003e\u003cstrong\u003e3.3.\u0026nbsp; \u0026nbsp;\u0026nbsp;Limitations\u0026nbsp;\u003c/strong\u003e\u003c/h2\u003e\n\u003cp\u003e\u0026middot; The MRI dataset was derived from a single curated source, leading to homogeneous imaging conditions that may limit generalizability to real-world, multi-institutional MRI data.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u0026middot; \u003cem\u003eThe perfect VGG16 performance (1.000) may reflect overfitting due to the small test size (~42 images per class), reducing reliability of reported 
metrics.\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n\u003cp\u003e\u0026middot; The tabular datasets (Symptom2Disease, Genetic Disease Prediction, Orphanet) are benchmark datasets and do not fully capture the complexity of real-world clinical data, such as free-text ambiguity and variable data quality.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u0026middot; All four modalities were processed independently rather than through a unified multimodal fusion model, which limits the ability to exploit cross-modal relationships.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u0026middot; SHAP analysis of CSV2 showed that most features contributed minimally, suggesting possible data quality limitations or a lack of strong biological signal in the feature set.\u003c/p\u003e"},{"header":"4. Recommendations and Conclusions","content":"\u003ch2\u003e\u003cstrong\u003e4.1.\u0026nbsp;\u003c/strong\u003e\u003cstrong\u003eRecommendations\u003c/strong\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/h2\u003e\n\u003cp\u003e\u0026middot; Future work should prioritize multi-site MRI validation using independently collected datasets from rare disease registries such as the European Reference Networks (ERNs) and the NIH Undiagnosed Diseases Network. Federated learning offers a practical response to data scarcity: it enables model training across distributed hospital datasets without centralizing sensitive data, which supports compliance with GDPR and HIPAA, and it can maintain performance comparable to centralized approaches.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u0026middot; For CSV3, class imbalance should be addressed using techniques such as SMOTE, class-weighted loss, and cost-sensitive learning. 
Additionally, transformer-based models (e.g., BioBERT, PubMedBERT) with self-supervised pretraining on Orphanet text may improve semantic understanding and the classification of challenging classes such as Morphological Anomaly and Clinical Group.\u003c/p\u003e\u003ch2\u003e\u003cstrong\u003e4.2.\u0026nbsp;Conclusion\u003c/strong\u003e\u003c/h2\u003e\n\u003cp\u003eThis study developed and evaluated a multimodal DL/ML framework for rare neurological disease classification across four clinical data modalities. The optimized VGG16 model, combined with CLAHE and ImageNet transfer learning, achieved perfect performance (Accuracy, F1, AUC, Kappa, MCC = 1.000) on a five-class MRI dataset. In the ML branches, Logistic Regression achieved 98.89% accuracy on symptom-based classification, Random Forest reached 96.67% on genetic data, and XGBoost achieved 84.47% accuracy (AUC = 0.9559) on Orphanet metadata. Overall, all modalities demonstrated strong performance, with macro-AUC values exceeding 96%.\u003cbr clear=\"all\"\u003e\u0026nbsp;\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eFunding Statement:\u003c/strong\u003e No funding was received for the research presented in this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability:\u0026nbsp;\u003c/strong\u003eData are available on request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eContribution Statement:\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthical Statement:\u003c/strong\u003e All four datasets are publicly available on Kaggle and may be used for research. In addition, no human or animal subjects were involved in experiments in this research.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConflict of Interest Statement:\u003c/strong\u003e The author(s) declare that there is no conflict of interest regarding the research presented in this manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAlganmi, 
N. (2024). A comprehensive review of the impact of machine learning and omics on rare neurological diseases. \u003cem\u003eBiomedical Informatics\u003c/em\u003e, 4(1), 73. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/biomedinformatics4010073\u003c/span\u003e\u003cspan address=\"10.3390/biomedinformatics4010073\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDong, C., Guo, W., Li, Y., Wang, Y., Li, J., \u0026amp; Zhou, J. (2022). Ensemble machine learning approaches for the identification of rare genetic disease variants. \u003cem\u003eFrontiers in Genetics\u003c/em\u003e, 13, 862830. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fgene.2022.862830\u003c/span\u003e\u003cspan address=\"10.3389/fgene.2022.862830\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGermain, D. P., Gruson, D., Malcles, M., \u0026amp; Garcelon, N. (2025). Applying artificial intelligence to rare diseases: A literature review highlighting lessons from Fabry disease. \u003cem\u003eOrphanet Journal of Rare Diseases\u003c/em\u003e, 20, 186. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s13023-025-03655-x\u003c/span\u003e\u003cspan address=\"10.1186/s13023-025-03655-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eL\u0026ouml;hmus, M., Labuhn, A., Bravo, A., \u0026amp; Saez-Rodriguez, J. (2023). Artificial intelligence in rare disease diagnostics: From symptom recognition to multi-omics integration. \u003cem\u003eJournal of Rare Diseases\u003c/em\u003e, 2(1), 12. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/s43058-023-00012-3\u003c/span\u003e\u003cspan address=\"10.1186/s43058-023-00012-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSyed, A. N., Anwar, S. M., Liaqat, R., Majid, M., Iqbal, J., \u0026amp; Bagci, U. (2021). Deep convolutional neural network-based classification of Alzheimer's disease using MRI data. \u003cem\u003earXiv:2101.02876\u003c/em\u003e. https://arxiv.org/abs/2101.02876\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eUppalapati, K., Yimenicioglu, B., Abdulkareem, S., Eftekhari, A., Uppalapati, B., \u0026amp; Kamath, V. (2025). TinyViT-Batten: Few-shot vision transformer with explainable attention for early Batten-disease detection on pediatric MRI. \u003cem\u003earXiv:2510.09649\u003c/em\u003e. https://arxiv.org/abs/2510.09649\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim, H. Y., Li, J., Solana, A. B., Pirkl, C. M., Wiestler, B., Schnabel, J. A., \u0026amp; Bercea, C. I. (2025). Learning to reason about rare diseases through retrieval-augmented agents. \u003cem\u003earXiv:2511.04720\u003c/em\u003e. https://arxiv.org/abs/2511.04720\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShazeeb, M. S., Acosta, M. T., \u0026amp; Tifft, C. J. (2025). Editorial: Role of neuroimaging in the diagnosis and treatment of rare diseases. \u003cem\u003eFrontiers in Neuroimaging\u003c/em\u003e, 4, 1566484. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fnimg.2025.1566484\u003c/span\u003e\u003cspan address=\"10.3389/fnimg.2025.1566484\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eK\u0026ouml;hler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. 
A., Danis, D., Babbi, G., Bello, S. M., Cedar, N., \u0026amp; others. (2021). The Human Phenotype Ontology in 2021. \u003cem\u003eNucleic Acids Research\u003c/em\u003e, 49(D1), D1207\u0026ndash;D1217. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/nar/gkaa1043\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkaa1043\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTjoa, E., \u0026amp; Guan, C. (2021). A survey on explainability of supervised machine learning. \u003cem\u003eIEEE Transactions on Neural Networks and Learning Systems\u003c/em\u003e, 32(10), 4233\u0026ndash;4246. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1109/TNNLS.2020.3027314\u003c/span\u003e\u003cspan address=\"10.1109/TNNLS.2020.3027314\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArrieta, A. B., D\u0026iacute;az-Rodr\u0026iacute;guez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., \u0026amp; Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. \u003cem\u003eInformation Fusion\u003c/em\u003e, 58, 82\u0026ndash;115. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.inffus.2019.12.012\u003c/span\u003e\u003cspan address=\"10.1016/j.inffus.2019.12.012\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXing, H., Zhang, D., Cai, P., Zhang, R., \u0026amp; Hu, Q. N. (2023). RDBridge: A knowledge graph of rare diseases based on large-scale text mining. \u003cem\u003eBioinformatics\u003c/em\u003e, 39(7), btad440. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1093/bioinformatics/btad440\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btad440\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAhsanNeural. (2026). \u003cem\u003eRare neurological diseases MRI curated edition\u003c/em\u003e [Dataset]. Kaggle. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.kaggle.com/datasets/ahsanneural/rare-neurological-diseases-mri-curated-edition\u003c/span\u003e\u003cspan address=\"https://www.kaggle.com/datasets/ahsanneural/rare-neurological-diseases-mri-curated-edition\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAhsanNeural. (2026). \u003cem\u003eRare diseases Orphadata 2026 complete\u003c/em\u003e [Dataset]. Kaggle. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.kaggle.com/datasets/ahsanneural/rare-diseases-orphadata-2026\u003c/span\u003e\u003cspan address=\"https://www.kaggle.com/datasets/ahsanneural/rare-diseases-orphadata-2026\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e
\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Rare Neurological Diseases (RNDs), deep learning, machine learning, MRI, CLAHE, machine learning, XAI","lastPublishedDoi":"10.21203/rs.3.rs-9305966/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9305966/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e \u003cp\u003eRare Neurological Diseases (RNDs) represent a significant global health burden, affecting over 300\u0026nbsp;million patients. The diagnostic journey for RNDs is often a prolonged process, spanning from the initial manifestation of symptoms to severe disease stages. Limited labeled clinical data and high phenotypic heterogeneity present inherent challenges for both clinicians and automated systems. The aim of this research is to develop a multi-dataset architecture utilizing advanced preprocessing and a hybrid dataset to achieve diagnostic accuracy exceeding 95% while integrating advanced interpretability techniques.\u003c/p\u003e\u003ch2\u003eMethod\u003c/h2\u003e \u003cp\u003eThis study utilized four publicly available datasets MRI images (A), symptom-based text (B, CSV 1), genetic data (C, CSV 2), and rare disease metadata (D, CSV 3) to support early RND diagnosis. A CNN VGG-16 architecture with Contrast Limited Adaptive Histogram Equalization (CLAHE) preprocessing was employed for the MRI dataset, which consists of five severity stages. For tabular modalities, ten classifiers including MLP, SVM-RBF, Random Forest, and XGBoost were benchmarked. 
To ensure clinical transparency, Explainable AI (XAI) techniques, such as SHAP, LIME, and Grad-CAM, were integrated into the framework.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e \u003cp\u003eThe proposed DLNDD framework demonstrated\u0026thinsp;~\u0026thinsp;100% accuracy on MRI images using the VGG-16 model. For tabular data, Logistic Regression achieved 98.89% accuracy on symptoms, Random Forest reached 96.67% on genetics, and XGBoost achieved 84.47% accuracy on Orphanet metadata.\u003c/p\u003e\u003ch2\u003eConclusions\u003c/h2\u003e \u003cp\u003eThe novel Deep Learning based Neurological Diseases Detection (DLNDD) illustrated that modality-specific, XAI achieved clinical meaningful findings even under rare diseases data constraints. The DLNDD outperformed previous studies in terms of accuracy and clinical interpretability. It also provides a replicable footprint for multimodal rare disease autonomous systems and points toward federated, fusion-ready architectures.\u003c/p\u003e","manuscriptTitle":"DLNDD: An Explainable Deep Learning Framework for the Early Detection and Classification of Rare Diseases","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-04-10 16:58:39","doi":"10.21203/rs.3.rs-9305966/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"4f5f99ae-b6d6-49f5-9a67-7bfaec677cbd","owner":[],"postedDate":"April 10th, 2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2026-05-02T07:09:21+00:00","versionOfRecord":[],"versionCreatedAt":"2026-04-10 16:58:39","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9305966","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9305966","identity":"rs-9305966","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
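The occlusion sensitivity check described in Section 3.4 (the largest confidence drop occurs when the central cortical region is masked) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `occlusion_sensitivity` and `toy_confidence` are hypothetical names, the 6x6 image is synthetic, and the toy scoring function merely stands in for the VGG16 class confidence. Patch size, stride, and fill value are likewise illustrative choices that the paper does not report.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_sensitivity(
    image: Image,
    predict: Callable[[Image], float],
    patch: int = 2,
    stride: int = 2,
    fill: float = 0.0,
) -> List[List[float]]:
    """Return a grid of confidence drops, one per occluded patch position."""
    baseline = predict(image)
    height, width = len(image), len(image[0])
    heatmap = []
    for top in range(0, height - patch + 1, stride):
        row = []
        for left in range(0, width - patch + 1, stride):
            # Copy the image and mask one patch with the fill value.
            occluded = [r[:] for r in image]
            for y in range(top, top + patch):
                for x in range(left, left + patch):
                    occluded[y][x] = fill
            # A large positive drop means the model relied on this region.
            row.append(baseline - predict(occluded))
        heatmap.append(row)
    return heatmap


def toy_confidence(image: Image) -> float:
    """Hypothetical stand-in for a classifier: mean of the central 2x2 pixels."""
    centre = [image[y][x] for y in (2, 3) for x in (2, 3)]
    return sum(centre) / len(centre)


# Synthetic 6x6 image of ones; only the central patch matters to the toy model.
image = [[1.0] * 6 for _ in range(6)]
heatmap = occlusion_sensitivity(image, toy_confidence)
for row in heatmap:
    print(row)
```

With a real model, `predict` would wrap the network's softmax output for the predicted class. A heatmap whose peak sits over the anatomy of interest rather than the background is the kind of evidence the section appeals to.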