Multimodal Deep Learning for Classifying Diabetes: Analyzing Carotid Ultrasound Images from UK and Taiwan Biobanks and Their Cardiovascular Disease Associations

doi:10.21203/rs.3.rs-3855322/v1


2024 · doi:10.21203/rs.3.rs-3855322/v1

preprint OA: closed CC-BY-4.0



Ren-Hua Chung, Djeane Onthoni, Hong-Ming Lin, Guo-Hung Li, Yu-Ping Hsiao, and 4 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-3855322/v1
This work is licensed under a CC BY 4.0 License.
Status: Posted (Version 1).

Abstract

Objective
Clinical evidence has shown that carotid intima-media thickness (CIMT) is a robust biomarker of the extent of atherosclerosis, which in turn increases the risk of cardiovascular disease (CVD). Additionally, diabetes mellitus (DM) is linked to the acceleration of atherosclerosis.
Thus, CIMT, as measured by carotid ultrasound (US), is significantly associated with both DM and CVD. This study examines whether US image features beyond CIMT can improve DM classification, and how they relate to CVD risk. Specifically, we aimed to determine whether these US image features contribute to DM classification in conjunction with traditional predictors such as age, sex, CIMT, and body mass index (BMI). Additionally, we evaluated the relationship between the probabilities derived from the DM classification model and the prevalence and incidence of CVD in DM patients.

Materials and Methods
Using carotid US image data from the UK Biobank (UKB) and Taiwan Biobank (TWB), we developed a custom multimodal DM classification model based on a Convolutional Neural Network (CNN) and trained it on UKB data. We assessed the model's performance by comparing it with traditional models that incorporate only clinical features (age, sex, CIMT, BMI). The same comparative analysis was performed on the TWB data. Logistic regression was used to analyze the associations between the DM classification model's probability outcomes and CVD status.

Results
Our performance evaluation across both the UKB and TWB datasets revealed that the multimodal DM classification model, which considers both image and clinical features (age, sex, CIMT, BMI), outperformed models that rely solely on clinical features. This was evidenced by an improved average precision of 0.762, recall of 0.655, specificity of 0.79, and accuracy of 0.721. Furthermore, in the UKB dataset, we identified a statistically significant association between the probabilities derived from the DM model and CVD status in DM patients, both prevalent (P-value: 0.006) and incident (P-value: 0.058), particularly on the left side.

Conclusions
The study provides robust evidence that carotid US image features, in addition to traditional parameters like CIMT, significantly enhance the capability of the multimodal DM classification model. The probability outcomes from this model could serve as a promising biomarker for assessing CVD risk in DM patients, offering a novel approach in the medical imaging field.

Subject areas: Health sciences/Medical research/Experimental models of disease; Health sciences/Biomarkers/Predictive markers; Health sciences/Diseases/Endocrine system and metabolic diseases/Diabetes/Type 2 diabetes

Keywords: Diabetes mellitus, Cardiovascular disease, Carotid intima-media thickness, Carotid ultrasound, Deep learning, Taiwan biobank, UK biobank

1. Introduction

Diabetes mellitus (DM) is a chronic disease affecting millions of people worldwide [1]. It is categorized into three types: Type 1, Type 2, and Gestational diabetes. DM is a long-term condition that often leads to the development of other chronic diseases, including cardiovascular disease (CVD). DM promotes the progression of atherosclerosis, a significant contributor to the heightened risk of CVD. In clinical practice, the advancement of atherosclerosis can be assessed using carotid ultrasound (US). Carotid intima-media thickness (CIMT), measured through carotid US, has been recognized as a surrogate marker for CVD in diabetic patients [2]. CIMT is also notably associated with DM, underscoring its role as a marker of the relationship between vascular health and diabetes [3]. These findings affirm a substantial relationship between DM and CVD. Recent clinical studies have directly examined carotid US in patients with DM, revealing its importance as a prognostic tool for diabetes, especially for assessing CVD risk [4]. With routine carotid ultrasound, carotid atherosclerosis can be detected and treated early, reducing the risk of cardiovascular disease [5].
It is important to note that the presence of carotid atherosclerosis on carotid ultrasound is more informative than CIMT alone. Thus, it has been recommended that CVD risk be assessed by combining carotid ultrasound results such as CIMT with other carotid atherosclerosis factors and additional risk assessment [5]. However, it has not yet been investigated whether carotid US can be used directly as an imaging biomarker to classify patients into those with DM and those without, and subsequently to identify patients at high risk of CVD early.

In the era of AI for clinical applications, numerous studies have applied Deep Learning (DL) and Machine Learning (ML) techniques to tasks such as classification [6], prediction [7], detection [8], and segmentation [9]. For DM classification, a few studies have applied various medical imaging techniques. For example, some authors have contended that Body Mass Index (BMI) has limitations in assessing Type 2 Diabetes (T2D). These authors therefore sought to enhance early T2D detection by feeding frontal chest radiographs and Electronic Health Record (EHR) data into a DL model based on a ResNet-34 Convolutional Neural Network (CNN), considering six diseases: T2D, congestive heart failure, cardiac arrhythmias, morbid obesity, chronic obstructive pulmonary disease, and vascular disease [10]. With a similar objective, other authors used neck-to-knee Dixon MRI and applied a 3D CNN [11]. Another study analyzed the impact of T2D on the human brain, where it can cause irreversible damage to brain tissue, using T1-weighted structural MRI and an 11-layer 3D CNN [12].

As our focus is on DM associated with CVD, we found that frontal chest radiographs may not be suitable because they are limited in assessing key factors in CVD, such as atherosclerosis. Meanwhile, MRI is a time-consuming and costly imaging technique. Additionally, we observed a lack of studies aimed at improving DM classification models by incorporating carotid US along with conventional DM predictors in association with CVD. Therefore, this study has two primary objectives. First, we aim to explore whether the inclusion of carotid US, in conjunction with conventional predictors such as age, sex, CIMT, and BMI, can enhance the performance of a multimodal DM classification model for distinguishing between individuals with DM and those without. Second, we aim to determine whether the probability outcomes generated by our CNN classification model differ significantly between two groups: DM patients with CVD (DM-CVD) and DM patients without CVD (DM-non-CVD). To achieve the first objective, we used the UK Biobank (UKB) dataset to design a multimodal DM classification model based on a CNN. For the second, we employed logistic regression to examine the association between the carotid ultrasound image probability outcomes for DM and CVD status. Additionally, we validated the multimodal DM classification model using cross-validation with the Taiwan Biobank (TWB) dataset.

2. Materials and methods

In our experimental design, we incorporated data from the UKB and the TWB. In this section, we describe the two sources of image data, the image and scalar data processing, and the outcome definitions. We then explain our proposed multimodal diabetes mellitus classification model and the statistical analysis.

2.1 Image data acquisition

All images in UKB and TWB are in DICOM format, with the Common Carotid Artery (CCA) being the primary focus.
The UKB dataset comprises approximately 500,000 participants [13]. For our study, we focused on the data from the first imaging visit (i.e., instance 2) and its corresponding date of attending the assessment centre (data-field 53), resulting in a dataset of 19,911 patients. For carotid US, we considered pairs of left (data-field 20222) and right (data-field 20223) images. In TWB [14], there are 46,561 follow-up participants, of whom 25,731 have undergone imaging tests. Of these, a total of 25,587 participants have received carotid US on both the left and right sides, referred to as Vertebral Artery (VAS) or Vascular US. Unlike in the UKB, the left and right sides are not separated in TWB, and all sides are included within each patient's DICOM files.

2.2 Image data preprocessing

The image processing techniques were applied separately to the left-side and right-side DICOM files. We employed two data processing methods for the carotid US images. In UKB, the carotid US procedures were conducted using a 2D scan along the short axis (transverse plane) and the long axis (longitudinal plane). In the long-axis view, CIMT was measured at two pre-defined angles on each side: 150 and 120 degrees on the right, and 210 and 240 degrees on the left [15]. Accordingly, four types of images can be found on the left and right side of each participant through image processing: long-axis, short-axis, CIMT150 (right: 150 and left: 210 degrees), and CIMT120 (right: 120 and left: 240 degrees). Specifically for UKB, we used the existing preprocessing techniques proposed in [16]. For each side, we cropped all DICOM files using pre-defined top-left and bottom-right coordinates. The files are then named based on factors such as color ranges, angles, and the presence of check symbols within the DICOM files. This involves comparing the average pixel value at specific, predefined coordinates to pre-established thresholds.
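The cropping and threshold check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the preprocessing code from [16]: the function names, the toy frame, the corner coordinates, and the threshold of 128 are all hypothetical, and a real pipeline would read pixel arrays from the DICOM files rather than nested lists.

```python
from statistics import mean


def crop(image, top_left, bottom_right):
    """Return the sub-image between two (row, col) corners.

    The bottom-right corner is exclusive, as in Python slicing.
    """
    (r0, c0), (r1, c1) = top_left, bottom_right
    return [row[c0:c1] for row in image[r0:r1]]


def region_mean(image, top_left, bottom_right):
    """Average pixel value inside a rectangular region."""
    region = crop(image, top_left, bottom_right)
    return mean(pixel for row in region for pixel in row)


def has_marker(image, top_left, bottom_right, threshold):
    """True when the region's mean intensity exceeds the threshold."""
    return region_mean(image, top_left, bottom_right) > threshold


# Toy 4x6 grayscale frame (hypothetical): a bright 2x2 "check symbol"
# in the top-right corner and the scan content in the bottom-left.
frame = [
    [0, 0, 0, 0, 255, 255],
    [0, 0, 0, 0, 255, 255],
    [10, 20, 30, 40, 0, 0],
    [10, 20, 30, 40, 0, 0],
]

scan_area = crop(frame, (2, 0), (4, 4))
marker_present = has_marker(frame, (0, 4), (2, 6), threshold=128)
```

Keeping the region and threshold as parameters reflects the text's description of predefined coordinates and thresholds; the actual values would come from the published preprocessing method.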
This naming process depends on the side and the image category; in this study, we used the CIMT150 category for both the left and right sides. The selection of the CIMT150 (150 degrees) category is based on our empirical study of stroke cases. We conducted multiple experiments using various CNN architectures, among which Inception ResNet V2 emerged as superior to the others. In those experiments, the CIMT150 (150 degrees) category gave the strongest testing results in terms of accuracy, precision, recall, and specificity. Finally, the corresponding image is stored in JPEG format.

As with the UKB, we preprocessed the entire DICOM dataset in the TWB and differentiated between the left and right sides, even though the specific angle of the carotid ultrasound is not specified in the TWB. Because no established carotid ultrasound preprocessing techniques exist for TWB, we developed our own, consisting of two phases with distinct objectives. The first is to obtain images without any annotations, and the second is to distinguish between the left and right CCA. Details of our carotid ultrasound processing methods for TWB can be found in Supplementary Materials Section S1.

2.3 Scalar data preprocessing and outcomes definition

It has been reported that age is a crucial factor for CVD patients with or without DM [17]. Similarly, age and sex differences have been reported in DM risk factors, complications, and treatments [18]. For this reason, we selected Age and Sex as scalar data. Additionally, we extracted the CIMT value derived from carotid US, as a robust direct biomarker for CVD and an indirect one for DM. In the UKB, we calculated age based on the patient's year and month of birth (data-field 34 and 52) and determined sex based on sex information and genetic sex (data-field 31 and 22001).
Where the genetic sex was missing, we filled it in with the value stored in the sex information field. Additionally, as one of the DM predictors, BMI (data-field 21001) was included in the analysis. To match the selected image category, we only considered CIMT measured at 150 degrees, extracting the minimum (data-field 22673), mean (data-field 22674), and maximum (data-field 22675) values. Moreover, to align with the image data, all UKB scalar data are taken from instance 2. In the TWB, we extracted each patient's Age, Sex, and BMI from the image and follow-up information. Unlike in UKB, a single CIMT value is available for each of the left and right sides.

In the UKB, for DM outcomes, we used the data field "diabetes diagnosed by doctor" (data-field 2443), which consists of binary values '0' and '1' for 'no' and 'yes', respectively. We filtered and extracted CVD outcomes from the EHR using ICD9 and ICD10 codes [6]. The filtering considered the period from 5 years before to 3 years after instance 2. In TWB, we defined the DM outcomes from the follow-up report 5 years prior to the image acquisition date. The details of the total extracted images, scalar data, and the division of training and testing data can be found in Supplementary Materials Section S2.

2.4 Multimodal diabetes mellitus classification model

The overview of our multimodal DM classification model is illustrated in Fig. 1. Our model comprises three pipelines: training, validation, and prediction. For training, we designed a CNN architecture with a custom loss. The specifics of our CNN block can be found in Table 1. Using an image data training-testing ratio of 80:20, we trained on the set of DM cases and matched non-DM controls. After completing the training, we conducted validation using a testing set of DM cases and randomly selected non-DM controls.

Table 1 Our designed CNN block details.
| Block | Values |
| --- | --- |
| Input image dimension | 3 x 300 x 300 |
| Convolution 2D | Output features = 32, Kernel size = 3, Stride = 1, Padding = 1 |
| Activation function | ReLU |
| Batch normalization 2D | Number of features = 32 |
| Dropout 2D | Dropout rate = 0.5 |
| Fully connected layer | Output features = 1024 |
| Dropout 2D | 0.5 |
| Epoch | 10 |
| Batch size | 2 |
| Optimizer (Learning rate) | Adam (0.0001) |
| Data augmentation | Random Horizontal Flip, Random Vertical Flip and Normalization |

Evaluation was performed using Cross-Entropy Loss (CEL) and L1 regularization loss. During validation, scalar data including Age, Sex, CIMT, and BMI were also loaded and concatenated with the imaging features extracted from the testing set of DM cases and random non-DM controls. Three CIMT values were extracted for UKB, and two CIMT values were available for TWB. In total, therefore, we had 1030 and 1029 features as input for the final prediction for UKB and TWB, respectively (1024 image features plus Age, Sex, the CIMT values, and BMI). Before applying the final prediction block using logistic regression, we randomly split the merged features, consisting of both image and scalar features, into an 80:20 training-testing ratio.

To gain a deeper understanding of which patterns in the images carry important signals for the multimodal DM classification model, we visualized the features extracted by the Convolution 2D layer by retrieving the weight data and obtaining the feature maps. We then reshaped the 32 feature maps into the dimensions of the input image (32 x 300 x 300). Importance scores of the 32 feature maps were calculated as the mean activation for the correct class minus the mean activation for the incorrect class. This allowed us to track the best feature map indices and visualize them using the "Inferno" colormap built into Matplotlib. The hardware and software specifications used in this experiment can be found in Supplementary Materials Section S3.
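The feature counts and the importance score described in this subsection can be illustrated with a short sketch. It is one possible reading of the text, not the study's code: the function names, the toy scalar values, and the tiny two-element "feature maps" are hypothetical, and real maps would be 300 x 300 activations taken from the trained convolution layer.

```python
from statistics import mean

N_IMAGE_FEATURES = 1024  # output size of the fully connected layer (Table 1)


def merge_features(image_features, age, sex, cimt_values, bmi):
    """Concatenate image features with the scalar predictors."""
    if len(image_features) != N_IMAGE_FEATURES:
        raise ValueError("expected 1024 image features")
    return list(image_features) + [age, sex, *cimt_values, bmi]


def importance_scores(correct_maps, incorrect_maps):
    """Score each feature map as mean activation for the correct class
    minus mean activation for the incorrect class.

    Each argument is a list of maps; each map is a flat list of activations.
    """
    return [mean(c) - mean(i) for c, i in zip(correct_maps, incorrect_maps)]


def best_map_index(scores):
    """Index of the feature map with the highest importance score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical participants: UKB has three CIMT values (min, mean, max),
# TWB has two (left, right).
image_features = [0.0] * N_IMAGE_FEATURES
ukb_vector = merge_features(image_features, 63, 1, [0.55, 0.68, 0.81], 27.4)
twb_vector = merge_features(image_features, 58, 0, [0.60, 0.62], 24.9)

# Three toy feature maps instead of 32, two activations each.
correct = [[0.2, 0.4], [0.9, 0.7], [0.1, 0.1]]
incorrect = [[0.3, 0.3], [0.2, 0.2], [0.5, 0.1]]
scores = importance_scores(correct, incorrect)
best = best_map_index(scores)
```

The two merged vectors have 1030 and 1029 entries, matching the totals reported for UKB and TWB, and the second toy map is selected because its activation differs most between the two classes.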

2.5 Statistical analysis

To test whether the probability outcomes from our CNN model are significantly associated with CVD status in DM patients, we filtered and extracted the CVD status from the EHR using ICD9 and ICD10 codes [19]. To identify prevalent CVD cases, the filtering considered the 5 years prior to the first imaging visit. From this, we extracted probabilistic outcomes for both the DM-CVD group and the DM-non-CVD group. Associations of the probabilistic outcomes with CVD status were assessed using logistic regression, with Age and Sex as covariates. In a similar manner, we also analyzed DM patients without CVD at the time of their initial imaging. Incident CVD cases were identified over the 3-year period following the first imaging visit, and associations between the probabilistic outcomes and incident CVD status were again evaluated.

3. Results

In this section, we present the results of our multimodal DM classification model for the UKB and TWB. Additionally, we compare our results with existing well-known CNN architectures, VGG-16 [20] and ResNet-18 [21], previously applied in Diabetic Retinopathy and DM prediction models, respectively. We analyze the extracted image feature maps that contribute to our DM classification model, discussing the patterns found in each class. We also present the statistical analysis results that elucidate the association between DM and CVD status.

3.1 Results on UK Biobank

a. Multimodal diabetes mellitus classification model results

In our multimodal DM classification model, we employed a case and control matching approach for training on images, with a specific focus on extracting features essential for predicting DM directly from image data. This method aligns with the preference for balanced classes in CNN models.
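A case and control matching step of this kind can be sketched as below. The matching variables are not stated in this section, so exact 1:1 matching on sex and 10-year age band is purely an assumption made for illustration; the function name, the record layout, and the toy participants are likewise hypothetical.

```python
import random
from collections import defaultdict


def match_controls(cases, controls, key, seed=0):
    """Pair each case with one unused control that shares its matching key.

    Cases with no remaining eligible control are left unmatched, so the
    returned pairs always give balanced classes.
    """
    rng = random.Random(seed)
    pool = defaultdict(list)
    for control in controls:
        pool[key(control)].append(control)
    for bucket in pool.values():
        rng.shuffle(bucket)  # random choice among eligible controls

    pairs = []
    for case in cases:
        bucket = pool.get(key(case))
        if bucket:
            pairs.append((case, bucket.pop()))
    return pairs


def age_band_and_sex(person):
    """Assumed matching key: 10-year age band plus sex."""
    return (person["age"] // 10, person["sex"])


cases = [
    {"id": 1, "age": 61, "sex": "F"},
    {"id": 2, "age": 47, "sex": "M"},
    {"id": 3, "age": 72, "sex": "F"},
]
controls = [
    {"id": 10, "age": 64, "sex": "F"},
    {"id": 11, "age": 45, "sex": "M"},
    {"id": 12, "age": 66, "sex": "F"},
    {"id": 13, "age": 33, "sex": "M"},
]

pairs = match_controls(cases, controls, key=age_band_and_sex)
```

In this toy example the third case has no control in its age band and is dropped, which shows the usual cost of exact matching: balance is gained at the expense of sample size.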
For the testing phase, we instead used randomly selected cases and controls. This was done so that the features extracted from the image data could be combined with scalar features whose distribution mirrors that of the general population for key variables such as age, sex, and BMI. The rationale is to evaluate the predictive performance of these scalar features in assessing the risk of DM across the broader population. Additionally, we identified some data points in the training and testing datasets with missing CIMT and BMI values, with a maximum of 1.89% and 0.04% missing values in UKB and TWB, respectively. As the percentage of missing values was low, we set each missing value to zero and kept the patient in the analysis, given the availability of the other scalar values and images.

To achieve the best performance for each feature combination and to compare the different predictors for DM classification, we calculated precision, recall, specificity, accuracy, and AUC values using either the best cut-off value derived from Youden's J statistic or the conventional default threshold of 0.5. This allowed us to observe how well each combination of features predicts DM. We conducted the experiment using different combinations of features and evaluated them using four main metrics. Additionally, we considered the average value obtained by summing the precision, recall, specificity, and accuracy and dividing the total by 4. This allows us to assess the overall effectiveness of the model across the metrics. As presented in Table 2, the combination of Image, Age, Sex, CIMT and BMI yields the highest average performance, 0.735, surpassing the other feature combinations.

Table 2 Multimodal DM classification model results on UKB.
| Features (optimized cutoff) | Total features | Precision | Recall | Specificity | Accuracy | Average of overall metrics |
| --- | --- | --- | --- | --- | --- | --- |
| Age + Sex (0.392) | 2 | 0.66 | 0.789 | 0.594 | 0.691 | 0.683 |
| Image + Age + Sex (0.566) | 1026 | 0.77 | 0.626 | 0.8 | 0.716 | 0.728 |
| CIMT (0.535) | 3 | 0.6 | 0.389 | 0.74 | 0.564 | 0.573 |
| BMI (0.5) | 1 | 0.647 | 0.664 | 0.637 | 0.651 | 0.649 |
| Age + Sex + CIMT (0.5) | 5 | 0.7 | 0.681 | 0.713 | 0.697 | 0.697 |
| Image + Age + Sex + CIMT (0.555) | 1029 | 0.741 | 0.613 | 0.78 | 0.695 | 0.707 |
| Age + Sex + CIMT + BMI (0.454) | 6 | 0.693 | 0.756 | 0.664 | 0.71 | 0.7 |
| Image + Age + Sex + CIMT + BMI (0.584) | 1030 | 0.789 | 0.6 | 0.835 | 0.716 | 0.735 |

To show the trade-off between sensitivity and specificity for the three models that include image features (Image + Age + Sex, Image + Age + Sex + CIMT, and Image + Age + Sex + CIMT + BMI), Fig. 2 presents the Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) values. As depicted in Fig. 2 (c), the best AUC, 0.67, is found for the model combining Image + Age + Sex + CIMT + BMI features.

We also compared our multimodal DM classification model with existing CNN models. For these architectures, we kept their original input dimensions of 224 x 224 x 3, with 4096 and 512 image features for VGG-16 and ResNet-18, respectively. Table 3 presents the results of the comparison. Our proposed model performed best on several key metrics: it achieved a precision of 0.789, specificity of 0.835, and accuracy of 0.716. These values are higher than those for VGG-16 and ResNet-18, while the recall of our model is slightly lower than that of the two models. Our model's AUC of 0.67 is also higher by 0.08 compared to VGG-16 (AUC: 0.59) and by 0.16 compared to ResNet-18 (AUC: 0.51), as shown in Supplementary Materials Section S4 Figure S2.

Table 3 Multimodal DM classification models comparison results on UKB.
| Features: Image + Age + Sex + CIMT + BMI (optimized cutoff) | Total features | Precision | Recall | Specificity | Accuracy | Average of overall metrics |
| --- | --- | --- | --- | --- | --- | --- |
| Our proposed model (0.584) | 1030 | 0.789 | 0.6 | 0.835 | 0.716 | 0.735 |
| VGG-16 (0.546) | 4102 | 0.687 | 0.611 | 0.736 | 0.675 | 0.677 |
| ResNet-18 (0.432) | 518 | 0.7 | 0.697 | 0.596 | 0.655 | 0.662 |

b. Feature maps and statistical analysis results

To gain insights into the extracted features, we visualized the feature maps generated by the convolution layer. For each correctly predicted class, we selected the best of the 32 feature maps by highlighting the feature map corresponding to the correct class and attenuating the feature map corresponding to the incorrect class. Figure 3 illustrates an example of the highlighted area, showing the regions that contribute most to the non-DM and DM classes. The most important regions for the non-DM class are in the walls of the arteries, while for the DM class the important regions are located in the arterial lumen and in other parts outside the arteries.

We applied logistic regression to analyze the left and right carotid US data separately across two distinct cohorts. The purpose of this regression analysis was to assess whether the probability of DM, as determined by our multimodal DM classification model, is associated with either prevalent CVD or incident CVD in patients diagnosed with DM. This allowed us to explore the relationship between our model's DM predictions and the occurrence of cardiovascular complications in these patients. Table 4 presents our findings: among the 923 DM patients, of whom 80 had prevalent CVD in the 5 years before the imaging visit, the association for the left carotid US is more significant than for the right carotid US. The same trend is observed in the cohort of 843 DM patients, of whom 27 had incident CVD in the 3 years after the imaging visit, where the left-side P-value is 0.058.
Although each cohort contains few CVD cases, these results still suggest that the probability outcomes differ between the DM-CVD and DM-non-CVD groups.

Table 4 Combined cohort group results on UKB.

| Cohorts | Combined groups | Total patients | Total images | Left (P-value) | Right (P-value) |
| --- | --- | --- | --- | --- | --- |
| Retrospective, 5 years before | DM-CVD (80), DM-non-CVD (843) | 923 | 1,846 | 0.006 | 0.458 |
| Prospective, 3 years after | DM-CVD (27), DM-non-CVD (816) | 843 | 1,686 | 0.058 | 0.71 |

3.2 Results on Taiwan Biobank

a. Multimodal diabetes mellitus classification model results

We conducted the same experiment on the TWB dataset. As shown in Table 5, and similar to the pattern observed in the UKB, adding image features in the three models Image + Age + Sex, Image + Age + Sex + CIMT, and Image + Age + Sex + CIMT + BMI results in higher performance than the corresponding feature sets without images. All metrics increase in the models Image + Age + Sex + CIMT and Image + Age + Sex + CIMT + BMI; the exception is Image + Age + Sex, whose recall of 0.725 is lower than the Age + Sex recall of 0.774.

Table 5 Multimodal DM classification model results on TWB.

| Features (optimized cutoff) | Total features | Precision | Recall | Specificity | Accuracy | Average of overall metrics |
| --- | --- | --- | --- | --- | --- | --- |
| Age + Sex (0.437) | 2 | 0.62 | 0.774 | 0.525 | 0.65 | 0.642 |
| Image + Age + Sex (0.5) | 1026 | 0.728 | 0.725 | 0.725 | 0.725 | 0.726 |
| CIMT (0.5) | 2 | 0.657 | 0.6 | 0.682 | 0.646 | 0.646 |
| BMI (0.471) | 1 | 0.613 | 0.711 | 0.552 | 0.631 | 0.626 |
| Age + Sex + CIMT (0.5) | 4 | 0.657 | 0.673 | 0.65 | 0.661 | 0.66 |
| Image + Age + Sex + CIMT (0.5) | 1028 | 0.748 | 0.734 | 0.75 | 0.742 | 0.743 |
| Age + Sex + CIMT + BMI (0.5) | 5 | 0.677 | 0.7 | 0.663 | 0.685 | 0.681 |
| Image + Age + Sex + CIMT + BMI (0.5) | 1029 | 0.738 | 0.71 | 0.745 | 0.727 | 0.73 |

A notable observation in the TWB dataset was the strong performance of the Image + Age + Sex + CIMT model across all metrics, surpassing both the Image + Age + Sex and Image + Age + Sex + CIMT + BMI models.
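The "optimized cutoff" and "Average of overall metrics" columns used in these tables can be illustrated with a short sketch. It assumes the standard definition of Youden's J (sensitivity + specificity - 1) and the averaging rule stated in Section 3.1; the function names and the toy labels and scores are hypothetical, and the search over observed scores is only one simple way to pick a cutoff.

```python
def confusion_counts(labels, scores, cutoff):
    """Count TP, FP, TN, FN when predicting DM for scores >= cutoff."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= cutoff
        if predicted and label == 1:
            tp += 1
        elif predicted:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_cutoff(labels, scores):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    Candidate cutoffs are the observed scores; ties keep the lowest cutoff.
    Both classes must be present, otherwise a rate is undefined.
    """
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp, fp, tn, fn = confusion_counts(labels, scores, cutoff)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def average_of_metrics(precision, recall, specificity, accuracy):
    """Sum of the four metrics divided by 4."""
    return (precision + recall + specificity + accuracy) / 4


# Toy predictions (hypothetical): 1 = DM, 0 = non-DM.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
scores = [0.1, 0.2, 0.35, 0.4, 0.55, 0.6, 0.7, 0.9]
cutoff, j_stat = youden_cutoff(labels, scores)

# Recompute the summary column for two image-based rows of Table 5.
avg_with_cimt = average_of_metrics(0.748, 0.734, 0.75, 0.742)
avg_with_cimt_bmi = average_of_metrics(0.738, 0.71, 0.745, 0.727)
```

Recomputing the averages from the four reported metrics reproduces the tabulated values to within rounding, which makes the comparison between the CIMT model and the CIMT + BMI model easy to check.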
The stronger performance of the Image + Age + Sex + CIMT model suggests that, in the TWB dataset, including CIMT features provides more predictive power than including only Age + Sex or adding BMI, although its recall (0.734) remains slightly lower than that of Age + Sex alone (0.774). This difference in performance could be attributed to the way CIMT values are represented in the TWB dataset, where they are actual measurements of left and right CIMT, in contrast to the UKB dataset, which uses minimum, maximum, and mean values of CIMT. Figure 4 shows the ROC curves of the different models in the TWB dataset. Adding BMI does not change the AUC: both Image + Age + Sex + CIMT and Image + Age + Sex + CIMT + BMI have an AUC of 0.80.

We also compared our model with existing CNN models. As shown in Table 6, we found the same trend in TWB as in UKB: the overall performance of our multimodal DM classification model still exceeds that of VGG-16. Interestingly, ResNet-18 achieves similar performance on the TWB dataset, with an average of 0.73, though our proposed model's AUC is slightly better, at 0.80 versus 0.79 for ResNet-18, as shown in Supplementary Materials Section S5, subsection A, Figure S3.

Table 6 Multimodal DM classification models comparison results on TWB.

| Features: Image + Age + Sex + CIMT + BMI (optimized cutoff) | Total features | Precision | Recall | Specificity | Accuracy | Average of overall metrics |
| --- | --- | --- | --- | --- | --- | --- |
| Our proposed model (0.5) | 1029 | 0.738 | 0.71 | 0.745 | 0.727 | 0.73 |
| VGG-16 (0.5) | 4101 | 0.62 | 0.641 | 0.599 | 0.62 | 0.62 |
| ResNet-18 (0.5) | 517 | 0.74 | 0.747 | 0.716 | 0.732 | 0.733 |

b. Feature maps and statistical analysis results

To understand the image features of our model, we visualized the extracted features and performed statistical analysis on the TWB dataset. As illustrated in Supplementary Materials Section S5 subsection B Figure S4, a pattern similar to that observed in the UKB emerges.
Although the activation appears less pronounced than in the UKB, the non-DM class shows high activation, appearing in a slightly orange hue, on the walls of the arteries and in other areas outside the arteries. Our model replicates the patterns identified in the UKB dataset, where, in the DM class, strong activation is primarily located within the lumen area. Given the close association between the lumen area and the presence of atherosclerosis, it is reasonable for the CNN to capture significant features with high activations in that region. This visualization helps in understanding how our CNN feature maps capture information relevant to the performance of our multimodal DM classification model.

As shown in Supplementary Materials Section S5 subsection C Table S2, the probability of DM extracted from our model is less significant when comparing the CVD and non-CVD groups in both cohorts. In contrast to the findings in the UKB, the non-significant P-values indicate insufficient evidence of a statistically significant difference in the probability of DM within the TWB dataset. Unlike the left-side results in the UKB, the retrospective and prospective analyses in TWB yield P-values of 0.315 and 0.154, respectively.

4. Discussion

In this study, we developed a multimodal DM classification model using a CNN and logistic regression to analyze a combination of carotid US images, Age, Sex, CIMT, and BMI across the UKB and TWB datasets. Our experiments revealed distinct patterns in the UKB and TWB. For UKB, the model with all features (Image + Age + Sex + CIMT + BMI) showed the best overall performance, although its recall was slightly lower than that of models using images with other feature combinations.
This is primarily attributed to the higher recall of Age + Sex, which was 0.789, as opposed to the lower recall achieved by CIMT and BMI individually, which were 0.389 and 0.664, respectively. When considering the optimal models, we observed that our model outperformed VGG-16 and ResNet-18. Adding image features improved all metrics except recall. Our image features captured important patterns for each class. In the non-DM class, the important features tend to have high activations on the walls of the arteries, whereas in the DM class, the important features appear in the lumen area and in other areas outside the lumen, showing similar patterns. These findings bear on the statistical analysis between the DM-CVD and DM-non-CVD groups. Based on logistic regression, we found an association between the DM prediction probability and prevalent and incident CVD, with P-values of 0.006 and 0.058, respectively, on the left side of the arteries. The different anatomical origins of the left and right arteries, which stem from the arch of the aorta and the innominate artery, respectively, contribute to the difference. The left arteries are found to be thicker than the right and are correlated with blood biochemical indices such as blood glucose level, total cholesterol, and low-density lipoprotein cholesterol, which are closely related to CVD risk factors [22]. A thicker artery wall may indicate more advanced plaque development [23], and it has been noted that unilateral plaque usually occurs on the left artery, making the left side vulnerable.

In contrast, the TWB results showed a slightly different trend. Prediction models that included CIMT along with Image, Age, and Sex, and excluded BMI, outperformed other models, achieving higher Precision (0.748), Recall (0.734), Specificity (0.75), and Accuracy (0.742).
The average score across all metrics for this model was 0.743, surpassing Image + Age + Sex (average 0.726) and Image + Age + Sex + CIMT + BMI (average 0.73). Therefore, the inclusion of CIMT had more predictive power than just adding BMI or Age + Sex, despite both models achieving an AUC of 0.8. The nature of the CIMT measurements in TWB, which are actual measurements on both sides, partly explains this difference from UKB's mean, maximum, and minimum values approach. Surprisingly, although our model slightly outperformed VGG-16 and reached an AUC of 0.8 compared with 0.79 for ResNet-18, ResNet-18 performed better in terms of precision, recall, and accuracy. This could be attributed to the skip connections present in ResNet-18 but absent from both VGG-16 and our model. Nevertheless, even as a simple yet efficient model, our approach still achieves a slightly better AUC than VGG-16 and ResNet-18. In terms of feature visualization, the extracted feature patterns were consistent with those observed in UKB, suggesting that our CNN model successfully identified important features for the DM and non-DM classes. This is indicative of the model's significant predictive power, combined with conventional features, and its statistical association with CVD status.

Despite our study's contributions, there are some limitations. We focused only on the CIMT150 image category and on CIMT measured at 150 degrees. Additionally, we have not yet applied a feature selection method, which could help select the optimal image features. In future work, the current multimodal DM classification model could be enhanced by integrating richer data, specifically through the aggregation of EHR datasets. This approach is particularly promising for addressing complex diseases or exploring associations with diabetes-related complications.

5. Conclusion

In conclusion, we proposed a multimodal DM classification model utilizing carotid US images and scalar features such as Age, Sex, CIMT, and BMI. The model, based on a CNN and logistic regression, demonstrated improved overall metrics in both the UKB and TWB datasets when image features were incorporated. The designed CNN captured distinct patterns in the images, identifying important features for the DM class in the lumen area, whereas for the non-DM class crucial features were observed along the wall of the lumen area. These consistent findings across datasets suggest that carotid US image features can be a promising predictor for identifying DM outcomes associated with both CVD and non-CVD patients. In UKB, our statistical analysis revealed an association between the model-derived DM probabilities and CVD status, particularly on the left side, with retrospective and prospective P-values of 0.006 and 0.058, respectively; only the former falls below the conventional 0.05 threshold. This highlights the potential of carotid US as a predictive tool for DM in relation to CVD.

Declarations

FUNDING
This study was supported by grants PH-112-PP-10 and PH-112-GP-04 from the National Health Research Institutes, Taiwan.

AUTHOR CONTRIBUTIONS
DDO, HYC and RHC designed the study. DDO, HML, GHL, and YPH contributed to data acquisition. DDO, HML, GHL, YPH, YSZ, AIO, and YHL performed the analyses. All authors helped interpret the analysis results and approved the final manuscript.

ETHICS APPROVAL
Written informed consent was obtained from all participants in the Taiwan Biobank and UK Biobank. Research in this study was approved by the Institutional Review Board of the National Health Research Institutes in Taiwan (reference number: EC1091202-E).

DATA AVAILABILITY
The UK Biobank data can be applied for through the UK Biobank. Similarly, the Taiwan Biobank data can be obtained through application to the Taiwan Biobank.
CODE AVAILABILITY
The underlying code for this study is not publicly available but may be made available to qualified researchers on reasonable request from the corresponding author.

COMPETING INTERESTS
None.

ACKNOWLEDGEMENTS
We thank the participants of the Taiwan Biobank and UK Biobank studies. This research was conducted using the UK Biobank Resource under Application Number 82617.

References

1. WHO. Diabetes, (2023).
2. AlGhibiwi, H. K. et al. The Association between Cardiovascular Risk Factors and Carotid Intima-Media Thickness in 42,726 Adults in UK Biobank: A Cross-Sectional Study. Journal of Cardiovascular Development and Disease 10, 358 (2023).
3. Sibal, L., Agarwal, S. C. & Home, P. D. Carotid intima-media thickness as a surrogate marker of cardiovascular disease in diabetes. Diabetes, metabolic syndrome and obesity: targets and therapy, 23-34 (2011).
4. Hoke, M. et al. Carotid ultrasound investigation as a prognostic tool for patients with diabetes mellitus. Cardiovascular diabetology 18, 1-8 (2019).
5. Li, H., Xu, X., Luo, B. & Zhang, Y. The predictive value of carotid ultrasonography with cardiovascular risk factors—A “SPIDER” promoting atherosclerosis. Frontiers in Cardiovascular Medicine 8, 706490 (2021).
6. Feng, X., Cai, Y. & Xin, R. Optimizing diabetes classification with a machine learning-based framework. BMC Bioinformatics 24, 428 (2023). https://doi.org/10.1186/s12859-023-05467-x
7. Chien, S.-C. et al. Predicting long-term care service demands for cancer patients: A machine learning approach. Cancers 15, 4598 (2023).
8. Onthoni, D. D., Sheng, T.-W., Sahoo, P. K., Wang, L.-J. & Gupta, P. Deep learning assisted localization of polycystic kidney on contrast-enhanced CT images. Diagnostics 10, 1113 (2020).
9. Bokhorst, J.-M. et al. Deep learning for multi-class semantic segmentation enables colorectal cancer detection and classification in digital pathology images. Scientific Reports 13, 8398 (2023).
10. Pyrros, A. et al. (2023).
11. Wachinger, C., Wolf, T. N. & Polsterl, S. Deep learning for the prediction of type 2 diabetes mellitus from neck-to-knee Dixon MRI in the UK biobank. Heliyon 9, e22239 (2023). https://doi.org/10.1016/j.heliyon.2023.e22239
12. Tan, X. et al. Convolutional Neural Networks for Classification of T2DM Cognitive Impairment Based on Whole Brain Structural Features. Front Neurosci 16, 926486 (2022). https://doi.org/10.3389/fnins.2022.926486
13. Palmer, L. J. UK Biobank: bank on it. The Lancet 369, 1980-1982 (2007).
14. Biobank, T. Statistics, (2023).
15. Biobank, U. Imaging Modality: Carotid Ultrasound, (
16. Le Goallec, A., Collin, S., Diai, S., Vincent, T. & Patel, C. J. Predicting arterial age using carotid ultrasound images, pulse wave analysis records, cardiovascular biomarkers and deep learning. medRxiv, 2021.06.17.21259120 (2021).
17. Booth, G. L., Kapral, M. K., Fung, K. & Tu, J. V. Relation between age and cardiovascular disease in men and women with diabetes compared with non-diabetic people: a population-based retrospective cohort study. The Lancet 368, 29-36 (2006).
18. Kautzky-Willer, A., Leutner, M. & Harreiter, J. Sex differences in type 2 diabetes. Diabetologia 66, 986-1002 (2023).
19. Wan, E. Y. F. et al. Blood pressure and risk of cardiovascular disease in UK biobank: a mendelian randomization study. Hypertension 77, 367-375 (2021).
20. Mohanty, C. et al. Using Deep Learning Architectures for Detection and Classification of Diabetic Retinopathy. Sensors 23, 5726 (2023).
21. Aslan, M. F. & Sabanci, K. A novel proposal for deep learning-based diabetes prediction: Converting clinical data to image data. Diagnostics 13, 796 (2023).
22. Luo, X., Yang, Y., Cao, T. & Li, Z. Differences in left and right carotid intima-media thickness and the associated risk factors. Clin Radiol 66, 393-398 (2011). https://doi.org/10.1016/j.crad.2010.12.002
23. Selwaness, M. et al. Atherosclerotic Plaque in the Left Carotid Artery Is More Vulnerable Than in the Right. Stroke 45, 3226-3230 (2014). https://doi.org/10.1161/Strokeaha.114.005202

Supplementary Files
SUPPLEMENTARYMATERIAL.docx (Supplementary Material)
Disease Associations","fulltext":[{"header":"1. Introduction","content":"\u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eDiabetes mellitus (DM) is a chronic disease affecting millions of people worldwide \u003csup\u003e1\u003c/sup\u003e. It is categorized into three types: Type 1, Type 2, and Gestational diabetes. DM is a long-term condition that often leads to the development of other chronic diseases, including cardiovascular disease (CVD). DM promotes the progression of atherosclerosis, a significant factor contributing to the heightened risk of CVD. In clinical practice, the advancement of atherosclerosis can be assessed using carotid ultrasound (US). Carotid intima-media thickness (CIMT), measured through carotid US, has been recognized as a surrogate marker for CVD in diabetic patients \u003csup\u003e2\u003c/sup\u003e. CIMT also exhibits a noteworthy association with DM, underscoring its role as a significant marker in assessing the relationship between vascular health and diabetes \u003csup\u003e3\u003c/sup\u003e. These findings affirm a substantial relationship between DM and CVD.\u003c/p\u003e \u003cp\u003eRecent clinical studies have directly examined carotid US in patients with DM, revealing its importance as a prognostic tool for diabetes, especially for assessing CVD risk \u003csup\u003e4\u003c/sup\u003e. By undergoing routine carotid ultrasound, carotid atherosclerosis can be detected and treated early, reducing the risk of cardiovascular disease \u003csup\u003e5\u003c/sup\u003e. It is important to note that the presence of carotid atherosclerosis in carotid ultrasound is more significant than CIMT alone. Thus, it has been recommended that the assessment for CVD risk can be done by considering the results from carotid ultrasound such as CIMT, combined with other carotid atherosclerosis factors and additional risk assessment \u003csup\u003e5\u003c/sup\u003e. 
However, there is currently a lack of investigation into whether carotid US can be directly utilized as an imaging biomarker to classify patients into those with DM and those without, and subsequently used to predict early high-risk patients for CVD.\u003c/p\u003e \u003cp\u003eIn the era of AI for clinical applications, numerous works have been established utilizing Deep Learning (DL) and Machine Learning (ML) techniques, specifically for various tasks such as classification \u003csup\u003e6\u003c/sup\u003e, prediction \u003csup\u003e7\u003c/sup\u003e, detection \u003csup\u003e8\u003c/sup\u003e, segmentation \u003csup\u003e9\u003c/sup\u003e, etc. In the case of DM classifications, a few studies have applied various medical imaging techniques. For example, some authors have contended that Body Mass Index (BMI) has limitations in assessing Type 2 Diabetes (T2D). Consequently, these authors sought to enhance early T2D detection by incorporating Electronic Health Record (EHR) data, specifically considering six different diseases such as T2D, Congestive heart failure, Cardiac arrhythmias, Morbid obesity, Chronic obstructive pulmonary disease, and Vascular disease, and frontal chest radiographs into their DL model, employing a ResNet-34 Convolutional Neural Network (CNN)-based approach \u003csup\u003e10\u003c/sup\u003e. With a similar objective, the authors utilized neck-to-knee Dixon MRI and applied a 3D CNN \u003csup\u003e11\u003c/sup\u003e. On the other hand, authors analyzed the impact of T2D on human brain, where it can cause irreversible damage to the brain tissue. Thus, authors utilized T1-weighted structural MRI and constructed an 11-layer 3D CNN \u003csup\u003e12\u003c/sup\u003e. As our focus is on DM associated with CVD, we found that frontal chest radiographs may not be suitable due to limitations in assessing key factors in CVD, such as atherosclerosis. Meanwhile, MRI is a time-consuming and costly imaging technique. 
Additionally, we observed a lack of studies aimed at improving DM classification models by incorporating carotid US along with conventional DM predictors in association with CVD.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003eTherefore, this study has two primary objectives. Firstly, we aim to explore whether the inclusion of carotid US, in conjunction with conventional predictors such as age, sex, CIMT, and BMI, can enhance the performance of a multimodal DM classification model for distinguishing between individuals with DM and those without. The secondary goal is to demonstrate whether the probability outcomes generated by our designed CNN classification model significantly differ between two groups: DM patients with CVD (DM-CVD) and DM patients without CVD (DM-non-CVD).\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003eTo achieve the first objective, we utilized the UK Biobank (UKB) dataset to design a multimodal DM classification model employing a CNN deep learning-based approach. Secondly, we employed logistic regression to examine the association of carotid ultrasound image probability outcomes for DM with both CVD and non-CVD statuses. Additionally, we validated the designed multimodal DM classification model using cross-validation with the Taiwan Biobank (TWB) dataset.\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e"},{"header":"2. Materials and methods","content":"\u003cp\u003eIn our experimental design, we incorporated data from the UKB and the TWB. Accordingly, in this section, we describe the two sources of image data acquisition, image and scalar data processing, and outcome definitions. 
In detail, we explain our proposed multimodal diabetes mellitus classification model and the statistical analysis.\u003c/p\u003e\n\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\n\u003ch2\u003e2.1 Image data acquisition\u003c/h2\u003e\n\u003cdiv class=\"BlockQuote\"\u003e\n\u003cp\u003eAll images in UKB and TWB are in DICOM format, with Common Carotid Artery (CCA) being the primary focus. The UKB dataset comprises approximately 500,000 participants \u003csup\u003e13\u003c/sup\u003e. For our study, our focus was on the data from the first imaging visit date of attending assessment (i.e., instance 2) and its corresponding imaging visit date (data-field 53), resulting in a dataset of 19,911 patients. In terms of carotid US, we considered pairs of left (data-field 20222) and right (data-field 20223) images. In TWB \u003csup\u003e14\u003c/sup\u003e, there are 46,561 follow-up participants, with 25,731 participants having undergone imaging tests. Out of these, a total of 25,587 participants have received carotid US on both left and right sides, referred to as Vertebral Artery (VAS) or Vascular US. Unlike in the UKB, in TWB, the left and right sides are not separated, and all sides are included within each patient\u0026rsquo;s DICOM files.\u003c/p\u003e\n\u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\n\u003ch2\u003e2.2 Image data preprocessing\u003c/h2\u003e\n\u003cdiv class=\"BlockQuote\"\u003e\n\u003cp\u003eThe image processing techniques were applied separately to the left and right side of DICOM files. We employed two data processing methods for carotid US image. Initially in UKB the carotid US procedures were conducted using a 2D scan along short-axis (transverse plane) and long-axis (longitudinal plane). In the long-axis, the CIMT was measured at pre-defined two angles (150 and 120 degrees) on the left and two angles (210 and 240 degrees) on the right sides \u003csup\u003e15\u003c/sup\u003e. 
Accordingly, four types of images can be found on the left and right side in each participant through image processing methods: long-axis, short-axis, CIMT150 (right: 150 and left: 210 degrees), and CIMT120 (right: 120 and left: 240 degrees). Specifically for UKB, we utilized the existing preprocessing techniques proposed in \u003csup\u003e16\u003c/sup\u003e. For every side, we performed cropping on all DICOM files using pre-defined top-left and bottom-right coordinates. The files are then named based on factors such as color ranges, angles, and the presence of check symbols within the DICOM files. This involves comparing the average pixel value in specific, predefined coordinates to pre-established thresholds. The process is contingent on whether the image corresponds to the sides and image categories which in this case, image CIMT150 for both left and right sides. The selection of image CIMT150 (150 degrees) category is based on our empirical study of stroke cases. We conducted multiple experiments using various CNN architectures, among which Inception ResNet V2 emerged as superior to other architectures. According to our experiments, image CIMT150 (150 degrees) category exhibits significant testing results in terms of accuracy, precision, recall, and specificity. Finally, the corresponding image is stored in JPEG format.\u003c/p\u003e\n\u003cp\u003eSimilar to the UKB, we underwent preprocessing of the entire DICOM dataset in the TWB and differentiated between the left and right sides, even though the specific degree of carotid ultrasound has not been specified in the TWB. Due to the absence of established carotid ultrasound preprocessing techniques in TWB, we developed our own data preprocessing, consisting of two phases designed for distinct objectives. The first goal is to obtain images without any annotations, and the second goal is to distinguish between the left and right CCA. 
Details of our designed carotid ultrasound processing methods for TWB can be found in Supplementary Materials Section S1.\u003c/p\u003e\n\u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\n\u003ch2\u003e2.3 Scalar data preprocessing and outcomes definition\u003c/h2\u003e\n\u003cdiv class=\"BlockQuote\"\u003e\n\u003cp\u003eIt has been reported that age is a crucial factor for CVD patients with or without DM \u003csup\u003e17\u003c/sup\u003e. Similarly, along with Sex factor, Age and Sex differences encompass DM factors, complications, and treatments \u003csup\u003e18\u003c/sup\u003e. For this reason, we selected Age and Sex as the scalar data. Additionally, as a robust direct biomarker for CVD and an indirect for DM, CIMT value derived from carotid US is extracted. In the UKB, we calculated age based on the patient's year and month of birth (data-field 34 and 52) and determined sex based on sex information and genetic sex (data-field 31 and 22001). In cases where the genetic sex had a missing value, we filled it in with the available sex value stored in the sex information. Additionally, as one of the DM predictors, BMI (data-field 21001) was included in the analysis. To match the selected image category, we only considered CIMT measured at 150 degrees, extracting the minimum (data-field 22673), mean (data-field 22674), and maximum (data-field 22675) values. Moreover, to align with image data, all UKB scalar data are derived by considering instance 2. In the TWB, we extracted patient\u0026rsquo;s Age, Sex, and BMI from the image and follow-up information. Unlike UKB, a single value for both left and right CIMT is available.\u003c/p\u003e\n\u003cp\u003eIn the UKB, for DM outcomes, we utilized the data field \u0026ldquo;diabetes diagnosed by doctor\u0026rdquo; (data-field 2443), which consisted of binary values '0' and '1' for 'no' and 'yes,' respectively. 
We filtered and extracted CVD outcomes from the EHR using ICD9 and ICD10 codes [\u003cspan class=\"CitationRef\"\u003e6\u003c/span\u003e]. The data filtering was conducted by considering the period of 5 years prior to, and 3 years after instance 2. In TWB, we defined the DM outcomes from the follow-up report 5 years prior to the image taken date. The details of the total extracted images, scalar data, and the division of training and testing data can be found in Supplementary Materials Section S2.\u003c/p\u003e\n\u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\n\u003ch2\u003e2.4 Multimodal diabetes mellitus classification model\u003c/h2\u003e\n\u003cdiv class=\"BlockQuote\"\u003e\n\u003cp\u003eThe overview of our multimodal DM classification model is illustrated in Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e. Our model comprises three pipelines: training, validation, and prediction. For the training model, we designed a CNN architecture with a custom loss. The specifics of our designed CNN block can be found in Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e. Using the image data training-testing ratio of 80:20, we fed the DM cases and non-DM matched controls training set. 
After completing the training, we conducted validation using DM cases and non-DM control random testing set.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv class=\"gridtable\"\u003e\n\u003ctable id=\"Tab1\" border=\"1\"\u003e\u003ccaption\u003e\n\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\n\u003cdiv class=\"CaptionContent\"\u003e\n\u003cp\u003eOur designed CNN block details.\u003c/p\u003e\n\u003c/div\u003e\n\u003c/caption\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth align=\"left\"\u003e\n\u003cp\u003eBlock\u003c/p\u003e\n\u003c/th\u003e\n\u003cth align=\"left\"\u003e\n\u003cp\u003eValues\u003c/p\u003e\n\u003c/th\u003e\n\u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eInput image dimension\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003e3 x 300 x 300\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eConvolution 2D\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eOutput features\u0026thinsp;=\u0026thinsp;32, Kernel size\u0026thinsp;=\u0026thinsp;3, Stride\u0026thinsp;=\u0026thinsp;1, Padding\u0026thinsp;=\u0026thinsp;1\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eActivation function\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eReLU\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eBatch normalization 2D\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eNumber of features\u0026thinsp;=\u0026thinsp;32\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eDropout 2D\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eDropout 
rate\u0026thinsp;=\u0026thinsp;0.5\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eFully connected layer\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eOutput features\u0026thinsp;=\u0026thinsp;1024\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eDropout 2D\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003e0.5\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eEpoch\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003e10\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eBatch size\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003e2\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eOptimizer (Learning rate)\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eAdam (0.0001)\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eData augmentation\u003c/p\u003e\n\u003c/td\u003e\n\u003ctd align=\"left\"\u003e\n\u003cp\u003eRandom Horizontal Flip, Random Vertical Flip and Normalization\u003c/p\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n\u003cdiv class=\"BlockQuote\"\u003e\n\u003cp\u003eEvaluation was performed using Cross-Entropy Loss (CEL) and L1 regularization loss. During validation, scalar data including Age, Sex, CIMT, and BMI were also loaded and concatenated with the extracted imaging features from the DM cases and non-DM control random testing set. Three values of CIMT were extracted for UKB, and two values of CIMT were found for TWB. 
Therefore, in total, we had 1030 and 1029 input features for the final prediction for UKB and TWB, respectively (1024 image features plus Age, Sex, BMI, and the three or two CIMT values). Before applying the final prediction block, which uses logistic regression, we randomly split the merged image and scalar features into training and testing sets at an 80:20 ratio.\u003c/p\u003e\n\u003c/div\u003e\n\u003cp\u003eTo gain a deeper understanding of which image patterns carry signals that contribute to the multimodal DM classification model, we visualized the features extracted by the Convolution 2D layer by retrieving the weight data and obtaining the feature maps. We then reshaped the 32 feature maps to the dimensions of the input image (32 x 300 x 300). The importance score of each of the 32 feature maps was calculated as the mean activation for the correct class minus the mean activation for the incorrect class. This process allowed us to identify the indices of the best feature maps and visualize them using the \"Inferno\" colormap built into Matplotlib. The hardware and software specifications used in this experiment are detailed in Supplementary Materials Section S3.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\n\u003ch2\u003e2.5 Statistical analysis\u003c/h2\u003e\n\u003cdiv class=\"BlockQuote\"\u003e\n\u003cp\u003eTo gather evidence for a statistically significant association between the probability outcomes of our designed CNN model and CVD status in DM patients, we extracted the CVD status from the EHR using ICD9 and ICD10 codes \u003csup\u003e19\u003c/sup\u003e. To identify prevalent CVD cases, we considered records from the 5 years prior to the date of the first imaging visit. From this, we extracted probabilistic outcomes for both the DM-CVD group and the DM-non-CVD group. Associations of the probabilistic outcomes with CVD status were assessed using logistic regression, incorporating Age and Sex as covariates. 
Furthermore, we analyzed DM patients without CVD at the time of their initial imaging in the same manner. Incident CVD cases were identified over the 3-year period following the first imaging visit. Again, associations between the probabilistic outcomes and incident CVD status were evaluated.\u003c/p\u003e\n\u003c/div\u003e\n\u003c/div\u003e"},{"header":"3. Results","content":"\u003cp\u003eIn this section, we present the results of our designed multimodal DM classification model for the UKB and TWB. Additionally, we compare our results with well-known existing CNN architectures, VGG-16 \u003csup\u003e20\u003c/sup\u003e and ResNet-18 \u003csup\u003e21\u003c/sup\u003e, previously applied in Diabetic Retinopathy and DM prediction models, respectively. We analyze the extracted image feature maps that contribute to our designed DM classification model and discuss the patterns found in each class. Moreover, we present the statistical analysis results that elucidate the association between DM and CVD status.\u003c/p\u003e\n\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\n \u003ch2\u003e3.1 Results on UK Biobank\u003c/h2\u003e\n \u003cp\u003e\u003cstrong\u003ea. Multimodal diabetes mellitus classification model results\u003c/strong\u003e\u003c/p\u003e\n \u003cdiv class=\"BlockQuote\"\u003e\n \u003cp\u003eIn our multimodal DM classification model, we employed a case-control matching approach for training on images, with a specific focus on extracting features essential for predicting DM directly from image data. This method aligns with the preference for balanced classes in CNN models. Subsequently, for the testing phase, we used a case and control random approach. This was implemented to combine the features extracted from the image data with scalar features, thereby mirroring the general population\u0026apos;s distribution of key scalar features like age, sex, and BMI. 
The rationale behind this approach is to evaluate the predictive performance of these scalar features (age, sex, and BMI) in assessing the risk of DM across the broader population. Additionally, we identified some data points in the training and testing datasets with missing CIMT and BMI values, with at most 1.89% and 0.04% of values missing in UKB and TWB, respectively. As the percentage of missing values was low, we set each missing value to zero and retained the patient in the analysis, given that the other scalar values and images were available.\u003c/p\u003e\n \u003cp\u003eTo achieve the best performance for different feature combinations in each model and to compare the performance of different predictors for DM classification, we calculated precision, recall, specificity, accuracy, and AUC values using either the best cut-off value derived from Youden\u0026rsquo;s J statistic or the conventional default threshold of 0.5. This allowed us to observe how well each combination of features predicts DM.\u003c/p\u003e\n \u003cp\u003eWe conducted the experiment using five different combinations of features and evaluated them using four main metrics. Additionally, we considered the average value obtained by summing the precision, recall, specificity, and accuracy, and dividing the total by 4. This average allows us to assess the overall effectiveness of the model across the evaluation metrics. 
As presented in Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e, we observe that the combination of Image, Age, Sex, CIMT and BMI yields a higher average performance of 0.735, surpassing the performance of other combined features.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab2\" border=\"1\"\u003e\n \u003ccaption\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eMultimodal DM classification model results on UKB.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eFeatures\u003c/p\u003e\n \u003cp\u003e(optimized cutoff)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTotal features\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eRecall\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eSpecificity\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAccuracy\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAverage of overall metrics\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u0026thinsp;+\u0026thinsp;Sex (0.392)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.66\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.789\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.594\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.691\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"char\"\u003e\n \u003cp\u003e0.683\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImage\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex (0.566)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1026\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.77\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.626\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.716\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.728\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCIMT (0.535)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.389\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.74\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.564\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.573\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBMI (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.647\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.664\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.637\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.651\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e0.649\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.681\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.713\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.697\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.697\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImage\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT (0.555)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1029\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.741\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.613\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.78\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.695\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.707\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI (0.454)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.693\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.756\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e0.664\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.71\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.7\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImage\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI (0.584)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1030\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.789\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.835\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.716\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.735\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003eTo illustrate the trade-off between sensitivity and specificity for the three models that include image features (Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex, Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT, and Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI), Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e shows the Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) values. 
As depicted in Fig.\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e2\u003c/span\u003e (c), the best AUC of 0.67 is obtained by the model combining Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI features.\u003c/p\u003e\n \u003cp\u003eWe also compared our multimodal DM classification model with existing CNN models. For the existing models, we maintained their original input image dimensions of 224 x 224 x 3, with 4096 and 512 image features for VGG-16 and ResNet-18, respectively. Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e presents the results of the comparison. Our proposed model demonstrated superior performance in several key metrics: it achieved a precision of 0.789, specificity of 0.835, and accuracy of 0.716. These values are higher than those for VGG-16 and ResNet-18, while the recall of our model is lower than that of the other two models. 
We also observed that our model\u0026apos;s AUC:0.67 is higher by 0.08 compared to VGG-16 (AUC: 0.59) and 0.16 compared to ResNet-18 (AUC: 0.51), as shown in Supplementary Materials Section S4 Figure S2.\u003c/p\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab3\" border=\"1\"\u003e\n \u003ccaption\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eMultimodal DM classification models comparison results on UKB.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eFeatures: Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI\u003c/p\u003e\n \u003cp\u003e(optimized cutoff)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTotal features\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eRecall\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eSpecificity\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAccuracy\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAverage of overall metrics\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOur proposed model (0.584)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1030\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.789\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.835\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e0.716\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.735\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eVGG-16 (0.546)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4102\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.687\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.611\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.736\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.675\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.677\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eResNet-18 (0.432)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e518\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.697\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.596\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.655\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.662\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003e\u003cstrong\u003eb. Feature maps and statistical analysis results\u003c/strong\u003e\u003c/p\u003e\n \u003cdiv class=\"BlockQuote\"\u003e\n \u003cp\u003eTo gain insights into the extracted features, we visualized the feature maps generated by the convolution layer. 
For the correctly predicted class, we selected the best of the 32 feature maps by highlighting the feature map corresponding to the correct class and attenuating the one corresponding to the incorrect class. Figure \u003cspan class=\"InternalRef\"\u003e3\u003c/span\u003e illustrates an example of the highlighted areas that contribute most to the non-DM and DM classes. The most important regions for the non-DM class are in the walls of the arteries, while for the DM class they are located in the arterial lumen and in other areas outside the arteries.\u003c/p\u003e\n \u003cp\u003eWe applied logistic regression to analyze the left and right carotid US data separately across two distinct cohorts. The purpose of this regression analysis was to assess whether the probability of DM, as determined by our multimodal DM classification model, is associated with either prevalent CVD or incident CVD in patients diagnosed with DM. This approach allowed us to explore the relationship between our model\u0026apos;s DM predictions and the occurrence of cardiovascular complications in these patients. Table \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e presents our findings: among the 923 DM patients, 80 of whom had prevalent CVD in the 5 years before the imaging visit, the association for the left carotid US is more significant than that for the right carotid US. A similar trend is observed in the cohort of 843 DM patients, 27 of whom developed incident CVD in the 3 years after the imaging visit, where the P-value remains at 0.057. 
Although each cohort contains few CVD cases, these results still suggest a difference in the probability outcomes between the DM-CVD and DM-non-CVD groups.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab4\" border=\"1\"\u003e\n \u003ccaption\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eResults for the combined cohort groups on UKB.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCohorts\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCombined groups\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTotal patients\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTotal images\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eLeft (P-value)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eRight (P-value)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eRetrospective\u003c/p\u003e\n \u003cp\u003e5 years before\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026bull; DM-CVD (80)\u003c/p\u003e\n \u003cp\u003e\u0026bull; DM-non-CVD (843)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e923\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1,846\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.006\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.458\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eProspective\u003c/p\u003e\n \u003cp\u003e3 years 
after\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u0026bull; DM-CVD (27)\u003c/p\u003e\n \u003cp\u003e\u0026bull; DM-non-CVD (816)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e843\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1,686\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.058\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.71\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\n \u003ch2\u003e3.2 Results on Taiwan Biobank\u003c/h2\u003e\n \u003cp\u003e\u003cstrong\u003ea. Multimodal diabetes mellitus classification model results\u003c/strong\u003e\u003c/p\u003e\n \u003cdiv class=\"BlockQuote\"\u003e\n \u003cp\u003eIn the same manner, we conducted the experiment on the TWB dataset. As shown in Table \u003cspan class=\"InternalRef\"\u003e5\u003c/span\u003e, similar to the pattern observed in the UKB, it can be noted that adding image features into three models: Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex, Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT, and Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI, results in higher performance compared to other features without images. 
All metrics are increasing in the models Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT and Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI, except Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex, where the recall of 0.725 is less than Age\u0026thinsp;+\u0026thinsp;Sex recall, which is 0.774.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab5\" border=\"1\"\u003e\n \u003ccaption\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eMultimodal DM classification model results on TWB.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eFeatures\u003c/p\u003e\n \u003cp\u003e(optimized cutoff)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTotal features\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eRecall\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eSpecificity\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAccuracy\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAverage of overall metrics\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u0026thinsp;+\u0026thinsp;Sex (0.437)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.774\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"char\"\u003e\n \u003cp\u003e0.525\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.65\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.642\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImage\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1026\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.728\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.725\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.725\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.725\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.726\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCIMT (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.657\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.682\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.646\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.646\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBMI (0.471)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.613\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.711\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e0.552\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.631\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.626\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.657\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.673\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.65\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.661\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.66\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImage\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1028\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.748\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.734\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.75\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.742\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.743\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.677\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.663\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.685\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.681\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eImage\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1029\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.738\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.71\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.745\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.727\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.73\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003eA notable observation in the TWB dataset was the exceptional performance of the Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT model across all metrics, surpassing both the Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex and Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI models. This suggests that in the TWB dataset, the inclusion of CIMT features provides higher predictive power compared to the inclusion of only Age\u0026thinsp;+\u0026thinsp;Sex and BMI, although it is slightly lower in recall. 
This difference in performance could be attributed to the way CIMT values are represented in the TWB dataset, where they are given as actual measurements of left and right CIMT, in contrast to the UKB dataset, which uses minimum, maximum, and mean values of CIMT.\u003c/p\u003e\n \u003cp\u003eFigure \u003cspan class=\"InternalRef\"\u003e4\u003c/span\u003e shows the ROC curves of different models in the TWB dataset. Adding BMI does not change the AUC: the model with BMI and the Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT model both have an AUC of 0.80. Moreover, a comparison with existing CNN models was conducted. As shown in Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e, we found the same trend in TWB as in UKB: the overall performance of our multimodal DM classification model still exceeds that of VGG-16. Interestingly, ResNet-18 achieves a similar performance on the TWB dataset, with an average of 0.73, though the AUC of our proposed model is slightly better (0.80 versus 0.79 for ResNet-18), as shown in Supplementary Materials Section S5, subsection A, Figure S3.\u003c/p\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab6\" border=\"1\"\u003e\n \u003ccaption\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eMultimodal DM classification models comparison results on TWB.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eFeatures: Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI (optimized cutoff)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTotal features\u003c/p\u003e\n 
\u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003ePrecision\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eRecall\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eSpecificity\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAccuracy\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAverage overall metrics\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOur proposed model (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1029\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.738\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.71\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.745\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.727\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.73\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eVGG-16 (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4101\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.641\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.599\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eResNet-18 (0.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e517\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.74\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.747\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.716\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.732\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.733\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003e\u003cstrong\u003eb. Feature maps and statistical analysis results\u003c/strong\u003e\u003c/p\u003e\n \u003cp\u003eTo comprehend the image features of our model, we visualized the extracted features and performed statistical analysis on the TWB dataset. As illustrated in Supplementary Materials Section S5 subsection B Figure S4, a pattern like that observed in the UKB emerges. Although the activation appears to be less pronounced than in the UKB, it is evident that, in the non-DM class, there is high activation, manifesting slightly in an orange hue, on the walls of the arteries and other areas outside the arteries. Our model replicates the patterns identified in the UKB dataset, where, in the DM class, robust activation is primarily located within the lumen area. Given the close association between the lumen area and the presence of atherosclerosis, it is reasonable for the CNN to capture significant features with high activations in that region. This visualization proves valuable in understanding how our CNN feature maps capture information crucial for assessing the performance of our multimodal DM classification model.\u003c/p\u003e\n \u003cp\u003eAs depicted in Supplementary Materials Section S5 subsection C Table S2, the analysis indicates that the probability of diabetes mellitus (DM), as extracted from our designed model, exhibits less significance when comparing the CVD and non-CVD groups in both studies. 
In contrast to the findings in the UKB, the non-significant P-values suggest insufficient evidence to conclude a statistically significant difference in the probability of DM within the TWB dataset. Unlike the left-side results in the UKB, the retrospective and prospective analyses in TWB yield P-values of 0.315 and 0.154, respectively.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eIn this study, we developed a multimodal DM classification model using a CNN and logistic regression to analyze a combination of carotid US images, Age, Sex, CIMT, and BMI across the UKB and TWB datasets. Our experiments revealed distinct patterns in the UKB and TWB. For UKB, the model with all features (Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI) showed the best overall performance, although its recall was slightly lower than that of models using image and other feature combinations. This is primarily attributed to the higher recall of Age\u0026thinsp;+\u0026thinsp;Sex, which was 0.789, as opposed to the lower recall achieved by CIMT and BMI individually, which was 0.389 and 0.664, respectively. When considering the optimal models, we observed that our model outperformed VGG-16 and ResNet-18. Adding image features significantly improved all metrics except recall.\u003c/p\u003e \u003cp\u003eOur image features captured important patterns for each class. In the non-DM class, the important features tend to have high activations on the wall of the arteries, whereas in the DM class, the important features appear in the lumen area and other areas outside the lumen, and these patterns were similar in both datasets. These findings have a significant impact on the statistical analysis between the DM-CVD and DM-non-CVD groups. 
Using logistic regression, we found a significant association between the DM prediction probability and both prevalent and incident CVD on the left side of the arteries, with P-values of 0.006 and 0.058, respectively. The left and right carotid arteries differ in anatomical origin, arising from the arch of the aorta and the innominate artery, respectively, which contributes to this difference. The left arteries are found to be thicker than the right and are correlated with blood biochemical indices such as blood glucose level, total cholesterol, and low-density lipoprotein cholesterol, which are closely related to CVD risk factors \u003csup\u003e22\u003c/sup\u003e. A thicker artery wall may indicate more advanced plaque development \u003csup\u003e23\u003c/sup\u003e, and it has been noted that unilateral plaque usually occurs on the left artery, making the left side vulnerable.\u003c/p\u003e \u003cp\u003eIn contrast, the TWB results showed a slightly different trend. The prediction model that included CIMT along with Image, Age, and Sex, but excluded BMI, outperformed the other models, achieving higher Precision (0.748), Recall (0.734), Specificity (0.75), and Accuracy (0.742). The average score across all metrics for this model was 0.743, surpassing Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex (average 0.726) and Image\u0026thinsp;+\u0026thinsp;Age\u0026thinsp;+\u0026thinsp;Sex\u0026thinsp;+\u0026thinsp;CIMT\u0026thinsp;+\u0026thinsp;BMI (average 0.73). Therefore, the inclusion of CIMT had more predictive power than just adding BMI or Age\u0026thinsp;+\u0026thinsp;Sex, even though the models with and without BMI both achieved an AUC of 0.8. 
The nature of the CIMT measurements in TWB, which are actual measurements on both sides, partly explains this difference from UKB, which reports mean, maximum, and minimum values.\u003c/p\u003e \u003cp\u003eSurprisingly, although our model slightly outperformed ResNet-18 in AUC (0.8 compared to 0.79), ResNet-18 exhibited better performance in terms of precision, recall, and accuracy. This advantage could be attributed to the skip connections present in ResNet-18 but absent from both VGG-16 and our model. Nevertheless, even with a simple yet efficient model, our approach still demonstrates slightly better AUC performance in comparison to VGG-16 and ResNet-18.\u003c/p\u003e \u003cp\u003eIn terms of feature visualization, the extracted feature patterns were consistent with those observed in UKB, suggesting that our CNN model successfully identified important features for the DM and non-DM classes. This is indicative of the model's significant predictive power, combined with conventional features, and its statistical association with CVD status.\u003c/p\u003e \u003cp\u003eDespite our study\u0026rsquo;s contributions, there are some limitations. We focused only on the CIMT150 image category and on CIMT measured at 150 degrees. Additionally, we have not yet applied a feature selection method, which could help select the most informative image features. In future work, the current multimodal DM classification model could be enhanced by integrating richer data, specifically through the aggregation of EHR datasets. This approach is particularly promising for addressing complex diseases or exploring associations with diabetes-related complications.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e"},{"header":"5. 
Conclusion","content":"\u003cp\u003eIn conclusion, we proposed a multimodal DM classification model utilizing carotid US images and scalar features such as Age, Sex, CIMT, and BMI. The model, based on CNN deep learning and logistic regression, demonstrated improved overall metrics performance in both UKB and TWB datasets when image features were incorporated. The designed CNN model captured unique patterns in the images, identifying important features for the DM class in the lumen area, whereas for the non-DM class, crucial features were observed in the arterial wall. These consistent findings across different datasets suggest that carotid US image features can be a promising predictor for identifying DM outcomes associated with both CVD and non-CVD patients. In UKB, our statistical analysis revealed a significant association between DM probabilities derived from the model and CVD status, particularly on the left side, with retrospective and prospective P-values of 0.006 and 0.058, respectively. This highlights the potential of carotid US as a predictive tool for DM in relation to CVD.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eFUNDING\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was supported by grants PH-112-PP-10 and PH-112-GP-04 from the National Health Research Institutes, Taiwan.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAUTHOR CONTRIBUTIONS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDDO, HYC and RHC designed the study. DDO, HML, GHL, and YPH contributed to data acquisition. DDO, HML, GHL, YPH, YSZ, AIO, and YHL performed the analyses. All authors helped interpret the analysis results and approved the final manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eETHICS APPROVAL\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWritten informed consent was obtained from all participants in the Taiwan Biobank and UK Biobank. 
Research in this study was approved by the Institutional Review Board of the National Health Research Institutes in Taiwan (reference number: EC1091202-E).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eDATA AVAILABILITY\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe UK Biobank data can be applied for through the UK Biobank. Similarly, the Taiwan Biobank data can be obtained through application to the Taiwan Biobank.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCODE AVAILABILITY\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe underlying code for this study is not publicly available but may be made available to qualified researchers on reasonable request from the corresponding author.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCOMPETING INTERESTS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eNone exists.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eACKNOWLEDGEMENTS\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank the participants from the Taiwan Biobank and UK Biobank studies. This research was conducted using the UK Biobank Resource under Application Number 82617.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eWHO. \u003cem\u003eDiabetes\u003c/em\u003e, \u0026lt;https://www.who.int/health-topics/diabetes#tab=tab_1\u0026gt; (2023).\u003c/li\u003e\n\u003cli\u003eAlGhibiwi, H. K.\u003cem\u003e et al.\u003c/em\u003e The Association between Cardiovascular Risk Factors and Carotid Intima-Media Thickness in 42,726 Adults in UK Biobank: A Cross-Sectional Study. \u003cem\u003eJournal of Cardiovascular Development and Disease\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 358 (2023). \u003c/li\u003e\n\u003cli\u003eSibal, L., Agarwal, S. C. \u0026amp; Home, P. D. Carotid intima-media thickness as a surrogate marker of cardiovascular disease in diabetes. \u003cem\u003eDiabetes, metabolic syndrome and obesity: targets and therapy\u003c/em\u003e, 23-34 (2011). 
\u003c/li\u003e\n\u003cli\u003eHoke, M.\u003cem\u003e et al.\u003c/em\u003e Carotid ultrasound investigation as a prognostic tool for patients with diabetes mellitus. \u003cem\u003eCardiovascular diabetology\u003c/em\u003e \u003cstrong\u003e18\u003c/strong\u003e, 1-8 (2019). \u003c/li\u003e\n\u003cli\u003eLi, H., Xu, X., Luo, B. \u0026amp; Zhang, Y. The predictive value of carotid ultrasonography with cardiovascular risk factors\u0026mdash;A \u0026ldquo;SPIDER\u0026rdquo; promoting atherosclerosis. \u003cem\u003eFrontiers in Cardiovascular Medicine\u003c/em\u003e \u003cstrong\u003e8\u003c/strong\u003e, 706490 (2021). \u003c/li\u003e\n\u003cli\u003eFeng, X., Cai, Y. \u0026amp; Xin, R. Optimizing diabetes classification with a machine learning-based framework. \u003cem\u003eBMC Bioinformatics\u003c/em\u003e \u003cstrong\u003e24\u003c/strong\u003e, 428 (2023). https://doi.org:10.1186/s12859-023-05467-x\u003c/li\u003e\n\u003cli\u003eChien, S.-C.\u003cem\u003e et al.\u003c/em\u003e Predicting long-term care service demands for cancer patients: A machine learning approach. \u003cem\u003eCancers\u003c/em\u003e \u003cstrong\u003e15\u003c/strong\u003e, 4598 (2023). \u003c/li\u003e\n\u003cli\u003eOnthoni, D. D., Sheng, T.-W., Sahoo, P. K., Wang, L.-J. \u0026amp; Gupta, P. Deep learning assisted localization of polycystic kidney on contrast-enhanced CT images. \u003cem\u003eDiagnostics\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 1113 (2020). \u003c/li\u003e\n\u003cli\u003eBokhorst, J.-M.\u003cem\u003e et al.\u003c/em\u003e Deep learning for multi-class semantic segmentation enables colorectal cancer detection and classification in digital pathology images. \u003cem\u003eScientific Reports\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 8398 (2023). \u003c/li\u003e\n\u003cli\u003ePyrros, A.\u003cem\u003e et al.\u003c/em\u003e (2023).\u003c/li\u003e\n\u003cli\u003eWachinger, C., Wolf, T. N. \u0026amp; Polsterl, S. 
Deep learning for the prediction of type 2 diabetes mellitus from neck-to-knee Dixon MRI in the UK biobank. \u003cem\u003eHeliyon\u003c/em\u003e \u003cstrong\u003e9\u003c/strong\u003e, e22239 (2023). https://doi.org:10.1016/j.heliyon.2023.e22239\u003c/li\u003e\n\u003cli\u003eTan, X.\u003cem\u003e et al.\u003c/em\u003e Convolutional Neural Networks for Classification of T2DM Cognitive Impairment Based on Whole Brain Structural Features. \u003cem\u003eFront Neurosci\u003c/em\u003e \u003cstrong\u003e16\u003c/strong\u003e, 926486 (2022). https://doi.org:10.3389/fnins.2022.926486\u003c/li\u003e\n\u003cli\u003ePalmer, L. J. UK Biobank: bank on it. \u003cem\u003eThe Lancet\u003c/em\u003e \u003cstrong\u003e369\u003c/strong\u003e, 1980-1982 (2007). \u003c/li\u003e\n\u003cli\u003eBiobank, T. \u003cem\u003eStatistics\u003c/em\u003e, \u0026lt;https://www.biobank.org.tw/statistics.php\u0026gt; (2023).\u003c/li\u003e\n\u003cli\u003eBiobank, U. \u003cem\u003eImaging Modality: Carotid Ultrasound\u003c/em\u003e, \u0026lt;https://biobank.ndph.ox.ac.uk/showcase/showcase/docs/carult_explan_doc.pdf\u0026gt; (\u003c/li\u003e\n\u003cli\u003eLe Goallec, A., Collin, S., Diai, S., Vincent, T. \u0026amp; Patel, C. J. Predicting arterial age using carotid ultrasound images, pulse wave analysis records, cardiovascular biomarkers and deep learning. \u003cem\u003emedRxiv\u003c/em\u003e, 2021.2006. 2017.21259120 (2021). \u003c/li\u003e\n\u003cli\u003eBooth, G. L., Kapral, M. K., Fung, K. \u0026amp; Tu, J. V. Relation between age and cardiovascular disease in men and women with diabetes compared with non-diabetic people: a population-based retrospective cohort study. \u003cem\u003eThe Lancet\u003c/em\u003e \u003cstrong\u003e368\u003c/strong\u003e, 29-36 (2006). \u003c/li\u003e\n\u003cli\u003eKautzky-Willer, A., Leutner, M. \u0026amp; Harreiter, J. Sex differences in type 2 diabetes. \u003cem\u003eDiabetologia\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 986-1002 (2023). 
\u003c/li\u003e\n\u003cli\u003eWan, E. Y. F.\u003cem\u003e et al.\u003c/em\u003e Blood pressure and risk of cardiovascular disease in UK biobank: a mendelian randomization study. \u003cem\u003eHypertension\u003c/em\u003e \u003cstrong\u003e77\u003c/strong\u003e, 367-375 (2021). \u003c/li\u003e\n\u003cli\u003eMohanty, C.\u003cem\u003e et al.\u003c/em\u003e Using Deep Learning Architectures for Detection and Classification of Diabetic Retinopathy. \u003cem\u003eSensors\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 5726 (2023). \u003c/li\u003e\n\u003cli\u003eAslan, M. F. \u0026amp; Sabanci, K. A novel proposal for deep learning-based diabetes prediction: Converting clinical data to image data. \u003cem\u003eDiagnostics\u003c/em\u003e \u003cstrong\u003e13\u003c/strong\u003e, 796 (2023). \u003c/li\u003e\n\u003cli\u003eLuo, X., Yang, Y., Cao, T. \u0026amp; Li, Z. Differences in left and right carotid intima-media thickness and the associated risk factors. \u003cem\u003eClin Radiol\u003c/em\u003e \u003cstrong\u003e66\u003c/strong\u003e, 393-398 (2011). https://doi.org:10.1016/j.crad.2010.12.002\u003c/li\u003e\n\u003cli\u003eSelwaness, M.\u003cem\u003e et al.\u003c/em\u003e Atherosclerotic Plaque in the Left Carotid Artery Is More Vulnerable Than in the Right. \u003cem\u003eStroke\u003c/em\u003e \u003cstrong\u003e45\u003c/strong\u003e, 3226-3230 (2014). 
https://doi.org:10.1161/Strokeaha.114.005202\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Diabetes mellitus, Cardiovascular disease, Carotid intima-media thickness, Carotid ultrasound, Deep learning, Taiwan biobank, UK biobank","lastPublishedDoi":"10.21203/rs.3.rs-3855322/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3855322/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003e\u003cb\u003eObjective\u003c/b\u003e\u003c/p\u003e \u003cp\u003eClinical evidence has shown that carotid intima-media thickness (CIMT) is a robust biomarker for determining the thickness of atherosclerosis, which in turn increases the risk of cardiovascular disease (CVD). Additionally, diabetes mellitus (DM) is linked to the acceleration of atherosclerosis. Thus, as measured by carotid ultrasound (US), CIMT exhibits a significant association with both DM and CVD. This study examines the potential of US image features, beyond CIMT, in enhancing DM classification and their subsequent association with CVD risks. 
Specifically, we aimed to determine if these US image features could contribute to DM classification in conjunction with traditional predictors such as age, sex, CIMT, and body mass index (BMI). Additionally, we evaluated the relationship between the probabilities derived from the DM classification model and the prevalence and incidence of CVD in DM patients.\u003c/p\u003e\u003cp\u003e\u003cb\u003eMaterials and Methods\u003c/b\u003e\u003c/p\u003e \u003cp\u003eUtilizing carotid US image data from the UK Biobank (UKB) and Taiwan Biobank (TWB), we developed and trained a custom multimodal DM classification model. This model employed a Convolutional Neural Network (CNN) deep learning approach, using data from the UKB. We assessed the model's performance by comparing it with traditional models that incorporate only clinical features (age, sex, CIMT, BMI). The same comparative analysis was performed on the TWB data. Logistic regression was utilized to analyze the associations between the DM classification model's probability outcomes and CVD status.\u003c/p\u003e\u003cp\u003e\u003cb\u003eResults\u003c/b\u003e\u003c/p\u003e \u003cp\u003eOur comprehensive performance evaluation across both the UKB and TWB datasets revealed that the multimodal DM classification model, which considers both image and clinical features (Age, Sex, CIMT, BMI), outperformed models that rely solely on clinical features. This was evidenced by an improved average precision of 0.762, recall of 0.655, specificity of 0.79, and accuracy of 0.721. 
Furthermore, in the UKB dataset, we identified a statistically significant association between the probabilities derived from the DM model and CVD status in DM patients, both prevalent (P-value: 0.006) and incident (P-value: 0.058), particularly on the left side.\u003c/p\u003e\u003cp\u003e\u003cb\u003eConclusions\u003c/b\u003e\u003c/p\u003e \u003cp\u003eThe study provides robust evidence that carotid US image features, in addition to traditional parameters like CIMT, significantly enhance the capability of the multimodal DM classification model. The probability outcomes from this model could serve as a promising biomarker for assessing CVD risk in DM patients, offering a novel approach in the medical imaging field.\u003c/p\u003e","manuscriptTitle":"Multimodal Deep Learning for Classifying Diabetes: Analyzing Carotid Ultrasound Images from UK and Taiwan Biobanks and Their Cardiovascular Disease Associations","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-01-29 21:47:21","doi":"10.21203/rs.3.rs-3855322/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"a1eb64f1-ce72-4412-8d77-436dde4358cb","owner":[],"postedDate":"January 29th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":28352055,"name":"Health sciences/Medical research/Experimental models of disease"},{"id":28352056,"name":"Health sciences/Biomarkers/Predictive markers"},{"id":28352057,"name":"Health 
sciences/Diseases/Endocrine system and metabolic diseases/Diabetes/Type 2 diabetes"}],"tags":[],"updatedAt":"2024-06-06T17:55:41+00:00","versionOfRecord":[],"versionCreatedAt":"2024-01-29 21:47:21","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3855322","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3855322","identity":"rs-3855322","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
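
The "Average overall metrics" column of Table 6 is not defined in the text reproduced here. A minimal sketch, assuming it is the unweighted mean of precision, recall, specificity, and accuracy, reproduces the reported values to within rounding. The dictionary name `TABLE_6` and the helper `average_overall` are hypothetical and introduced only for illustration; the numbers are copied from Table 6.

```python
from statistics import mean

# Values copied from Table 6 (TWB, cutoff 0.5), in the order:
# precision, recall, specificity, accuracy.
TABLE_6 = {
    "Our proposed model": (0.738, 0.71, 0.745, 0.727),
    "VGG-16": (0.62, 0.641, 0.599, 0.62),
    "ResNet-18": (0.74, 0.747, 0.716, 0.732),
}


def average_overall(metrics):
    """Unweighted mean of the four metrics.

    Assumption: this is how the 'Average overall metrics' column is
    computed; the paper text does not state the definition explicitly.
    """
    return mean(metrics)


if __name__ == "__main__":
    for name, metrics in TABLE_6.items():
        print(f"{name}: {average_overall(metrics):.4f}")
```

Under this assumption, the near-tie between the proposed model and ResNet-18 comes from offsetting differences: ResNet-18 has the higher recall, while the proposed model has the higher specificity.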



License: CC-BY-4.0