Multi-center evaluation of radiomics and deep learning to stratify malignancy risk of IPMNs | Research Square

Andrea M. Bejar, María Jaramillo Gonzalez, Ziliang Hong, Gorkem Durak, and 26 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-6622868/v1
This work is licensed under a CC BY 4.0 License
Status: Posted (Version 1)

Abstract

Distinguishing high-risk intraductal papillary mucinous neoplasms (IPMNs), pancreatic cysts requiring surgery, from low-risk lesions remains a clinical challenge, often resulting in unnecessary procedures due to limited specificity of current methods. While radiomics and deep learning (DL) have been explored for pancreatic cancer, cyst-level malignancy risk stratification of IPMNs remains untapped.
We conducted a multi-institutional study (seven centers, 359 T2W MRI images) to assess the feasibility of AI for predicting IPMN dysplasia grade using cyst-level image features. We developed and compared 2D and 3D radiomics-only, deep learning (DL)-only, and radiomics-DL fusion models, using expert radiologist scoring as a baseline reference. Model performance was evaluated using held-out test data. The radiomics-DL fusion model showed the highest discriminatory ability on the test set (AUC 0.692), outperforming the radiomics-only model (AUC 0.665). Expert accuracy varied widely (37.4%–66.7%). The fusion model integrating deep learning and radiomics features from routine T2W MRI (AUC: 0.692) demonstrates potential for objective, cyst-level risk stratification of IPMNs in a multi-center cohort, outperforming both radiomics-only models and expert radiologists. While performance requires improvement for standalone clinical use, this approach offers a scalable, non-invasive method to potentially improve diagnostic accuracy and reduce unnecessary surgical interventions.

Biological sciences/Cancer/Cysts
Health sciences/Diseases/Gastrointestinal diseases/Gastrointestinal cancer/Pancreatic cancer

INTRODUCTION

The increasing detection of pancreatic cysts has become a significant clinical challenge 1–3, imposing a substantial burden on patients (due to invasive procedures and surgical risks) and healthcare systems (due to cost of surveillance and interventions). Intraductal papillary mucinous neoplasms (IPMNs), a potentially premalignant cyst subtype, constitute a substantial proportion (estimated 50–80%) of these incidental lesions 1–3. Despite their premalignant potential, the risk of malignant transformation in IPMNs remains poorly defined.
Resection studies report a wide range of malignancy rates of 1–38% for branch duct (BD) IPMN and 33–85% for main duct (MD) IPMN; figures that likely overestimate the true rate of progression 2 , 3 . The most recent international consensus guidelines, the 2023 Kyoto criteria , represent the current standard for IPMN management 2 . However, the limitations in accurately diagnosing pancreatic cystic disease and assessing the risk of malignancy in pancreatic cysts according to existing guidelines continue to impose a substantial burden on patients and healthcare systems. These shortcomings frequently lead to invasive diagnostic procedures and high-risk surgical resections, especially for lesions ultimately deemed low-grade 4 – 9 . Magnetic resonance imaging (MRI) and endoscopic ultrasound (EUS) with fine needle aspiration (FNA) are primary modalities for IPMN diagnosis and characterization 2 . EUS-FNA is particularly valuable for evaluating pancreatic cysts and assessing IPMN dysplasia grade. Accurate IPMN characterization prior to invasive procedures is necessary to lower patient burden and cost. While EUS-FNA procedures carry a low complication rate (up to 3%) and rare mortality, the diagnostic sensitivity of EUS-FNA histopathology remains limited, ranging from 4.8–61.6% 10–12 . Thus, even after this invasive, operator-dependent, and costly procedure, malignancy cannot be reliably excluded, leaving patients at risk of bleeding, pancreatitis, and infection 2 , 6 , 10 , 11 . A leading indication for the surgical resection of cystic lesions is a concern for malignancy 13 . However, pancreatic resections are major surgeries with significant morbidity and mortality rates and as a result it is critical to diagnose suspicious lesions prior to surgery 14 , 15 . Radiomics, leveraging high-throughput quantitative image analysis, enables the extraction and analysis of quantitative features imperceptible to the human eye 16 . 
Deep learning (DL), a neural network-based advanced artificial intelligence (AI) technique, utilizes convolution to effectively extract and discern complex imaging patterns 17 . While radiomics and DL have advanced pancreatic tumor detection and segmentation in computed tomography (CT) and MRI, their application to characterization of premalignant lesions including IPMNs remains nascent 18 . Recent studies propose radiomics, DL, or fused models for IPMN diagnosis and classification 19 – 23 ; however, critical barriers persist. First, the pancreas’ retroperitoneal anatomy and heterogeneous parenchyma complicate image analysis 18 . Second, IPMNs exhibit marked variability in morphology and texture, even within individual cysts 18 . Third, DL demands large, diverse datasets, yet pancreatic MRI—the optimal and preferred modality for cyst characterization—remains scarce and protocol-dependent 3 , 18 , 24 – 26 . Prior studies using radiomics/DL have focused predominantly on tumor detection or whole-pancreas analysis, potentially overlooking crucial information within the heterogeneous cyst itself 18 . Many analyses exclude higher-risk MD/mixed-IPMNs, limiting applicability to the full spectrum of disease. Furthermore, smaller single-center cohorts (< 150 patients) limit generalizability 19 – 22 . Our work addresses these gaps by performing a large, multicenter evaluation focused specifically on cyst regional level features from T2W MRI, including MD- and mixed-type IPMNs, to predict dysplasia grade using 2D and 3D radiomics, DL, and fusion approaches. RESULTS In this section, we evaluate the development and performance of three advanced machine learning algorithms for the stratification of IPMN dysplasia grade in MRI: 1) radiomics-only; 2) DL-only; 3) radiomics-DL fusion. Each approach was assessed using rigorous validation protocols across our multicenter dataset. 
Evaluation of dataset heterogeneity using UMAP

Uniform manifold approximation and projection (UMAP) was used to explore the high-dimensional multi-institutional MRI data through the normalized image quality indicators; the resulting projection is shown in Fig. 1. UMAP revealed distinct clusters of scans that were associated with the seven contributing centers and directly related to the voxel height of the MRI images. One cluster comprised scans primarily from the Mayo Clinic Florida (MCF) center (blue), characterized by a mean voxel height of 4 mm. Another cluster included scans from the Northwestern Memorial Hospital (NMH) and New York University (NYU) clinical centers (green and red), with respective mean voxel heights of 5.5 mm and 5 mm. A separate cluster consisted of Erasmus Medical Center (EMC) scans (pink), notable for a 7.3 mm mean voxel height and acquisition exclusively on a 1.5T MRI magnet. Scans from the Mayo Clinic Arizona (MCA), Allegheny Health Network (AHN), and Istanbul University (IU) Hospital centers (orange, purple, and brown), exhibiting a 7 mm mean voxel height, were distributed outside these primary clusters. These findings underscore significant variations in image quality across participating centers.

Manual segmentation of index cystic lesions

Across three readers (abbreviated as GK, AMB, and HEA), the mean intraobserver Dice similarity coefficient (DSC) was 80%, with a 95th-percentile Hausdorff distance (HD95) of 6.63 mm. The mean interobserver DSC was 75%, with an HD95 of 7.2 mm. These DSC and HD95 values indicate high segmentation consistency and support the reliability of our reference standard segmentations. Representative T2-weighted (T2W) MRI images and corresponding segmentations of main-duct (MD) and branch-duct (BD) IPMN across varying dysplasia grades are shown in Fig. 2.

Visual scoring and risk prediction

Diagnostic performance metrics are summarized in Table 1.
Rater 3 achieved the highest sensitivity (72.3%, 95% CI: 64.1–79.5%), while Rater 1 demonstrated the highest specificity (64.5%, 95% CI: 57.8–70.9%). In cases with concordant majority readings (n = 337), positive percent agreement was calculated. Pairwise comparisons of overall accuracy demonstrated significant heterogeneity, with the highest concordance observed between Raters 1 and 3 (80.1%, p < 0.001), and the lowest between Raters 1 and 2 (48.1%, p = 0.042). Cohen's kappa coefficient analysis indicated fair to moderate agreement between raters, with κ of 0.33 to 0.67, as detailed in Table 2. The inter-observer variability demonstrated statistically significant heterogeneity (Cochran's Q test, p < 0.001), underscoring the subjective nature of visual assessment in this context.

Table 1 Comparison of rater performance in visual scoring of IPMN by the imaging features of the Kyoto Criteria 2. Sensitivity and specificity of each rater (n = 347), and a pooled sensitivity and specificity for subjects that received the same score by a majority of the raters (n = 337).

| Rater | Sens (%) | Spec (%) | Acc (%) |
|---|---|---|---|
| Rater 1 | 68.9 | 64.5 | 66.7 |
| Rater 2 | 42.2 | 32.6 | 37.4 |
| Rater 3 | 72.3 | 41.8 | 57.1 |
| Majority | 70.6 | 63.2 | 66.9 |

Table 2 Inter-rater comparison of accuracy and agreement in visual scoring of IPMN by the imaging features of the Kyoto Criteria.

| Comparison | Acc (95% CI) | Weighted Kappa (95% CI) |
|---|---|---|
| Rater 1 vs 2 | 48.1 (42.7, 53.5) | 0.33 (0.27, 0.39) |
| Rater 2 vs 3 | 61.7 (56.3, 66.8) | 0.67 (0.59, 0.74) |
| Rater 1 vs 3 | 80.1 (75.5, 84.1) | 0.47 (0.39, 0.53) |

Prediction using 2D and 3D radiomic features

The 2D radiomic analysis yielded a mean AUC of 66.4% with a mean accuracy of 65.9%. The 3D analysis yielded a mean AUC of 66.5% and a mean accuracy of 66.1% (Table 3). The corresponding ROC curves are shown in Fig. 3. Bar plots displaying the individual and mean testing set AUC, accuracy, and F1 are shown in Fig. 4 and Fig. 5, respectively.
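As a point of reference for the metrics reported in this subsection and in Table 3 (AUC, accuracy, sensitivity, specificity, PPV, and F1), the following is a minimal sketch of how they can be computed from binary labels and predicted scores. It is not the authors' evaluation code: the function name `binary_metrics`, the 0.5 decision threshold, and the encoding of High Risk as 1 are assumptions made for illustration. The AUC is computed as the fraction of (positive, negative) pairs that are ranked correctly, with ties counted as one half.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Compute AUC, accuracy, sensitivity, specificity, PPV and F1.

    y_true  : iterable of 0/1 labels (1 = High Risk; assumed encoding)
    y_score : iterable of predicted scores for the positive class
    threshold : assumed cut-off used to turn scores into class labels
    """
    y_true = list(y_true)
    y_score = list(y_score)
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    # Rank-based AUC: probability that a random positive outscores a
    # random negative, counting ties as 0.5.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)

    return {
        "auc": ratio(wins, len(pos) * len(neg)),
        "acc": ratio(tp + tn, len(y_true)),
        "sens": ratio(tp, tp + fn),
        "spec": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for test sets of the size used here (29 to 71 scans per trial); a sorted-rank formulation would be preferable for much larger cohorts.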
Considering the expert-judgment results from the previous section (radiologists' scoring based on the imaging features of the Kyoto guidelines), the radiomics results were superior to those of two of the three radiologists and on par with the majority-vote results. It should also be noted that the radiologists used both T1W and T2W scans together with the Kyoto guidelines to stratify the cysts, whereas our DL and radiomics analyses used only T2W scans, which makes the machine-generated results all the more promising.

Table 3 Radiomics-only results (Random Forest, %) for 2D and 3D analysis for each trial set.

| Metric | 2D T1 | 2D T2 | 2D T3 | 2D T4 | 2D Average | 3D T1 | 3D T2 | 3D T3 | 3D T4 | 3D Average |
|---|---|---|---|---|---|---|---|---|---|---|
| Features | 28 | 30 | 20 | 30 | | 28 | 30 | 30 | 30 | |
| Testing: F1 | 52.2 ± 6.0 | 60.4 ± 5.5 | 72.4 ± 4.1 | 61.8 ± 3.9 | 61.7 ± 5.0 | 60.3 ± 3.4 | 57.4 ± 4.5 | 66.5 ± 4.5 | 60.2 ± 3.6 | 61.1 ± 4.0 |
| Testing: Spec | 46.2 ± 13 | 62.8 ± 8.7 | 75.0 ± 4.1 | 67.8 ± 6.6 | 63.1 ± 8.8 | 58.2 ± 5.7 | 54.5 ± 13 | 72.9 ± 6.9 | 70.6 ± 7.2 | 64.1 ± 8.5 |
| Testing: Sens | 62.9 ± 13 | 65.0 ± 8.7 | 78.2 ± 7.5 | 77.0 ± 9.3 | 71.0 ± 9.0 | 68.7 ± 5.3 | 64.7 ± 12 | 70.4 ± 9.1 | 71.1 ± 7.0 | 68.7 ± 8.8 |
| Testing: PPV | 45.9 ± 6.9 | 57.0 ± 5.5 | 67.7 ± 3.2 | 52.1 ± 4.5 | 55.7 ± 5.2 | 53.9 ± 3.4 | 52.5 ± 6.1 | 63.8 ± 5.1 | 52.7 ± 5.1 | 55.7 ± 5.0 |
| Testing: Acc | 53.1 ± 5.3 | 63.7 ± 4.5 | 76.3 ± 2.9 | 70.6 ± 3.4 | 65.9 ± 4.1 | 62.6 ± 3.6 | 58.9 ± 3.8 | 71.9 ± 3.6 | 70.8 ± 4.0 | 66.1 ± 3.8 |
| Testing: AUC | 46.7 ± 5.4 | 61.9 ± 4.2 | 77.0 ± 3.5 | 79.6 ± 2.5 | 66.4 ± 4.0 | 59.2 ± 2.5 | 58.8 ± 2.9 | 73.5 ± 3.3 | 74.6 ± 2.3 | 66.5 ± 2.8 |
| Cross-validation: F1 | 62.9 ± 2.1 | 63.0 ± 1.9 | 62.2 ± 1.8 | 62.8 ± 2.0 | 62.7 ± 1.9 | 63.9 ± 1.7 | 62.9 ± 1.9 | 59.8 ± 2.2 | 62.9 ± 2.3 | 62.4 ± 2.0 |
| Cross-validation: Spec | 68.9 ± 4.7 | 69.7 ± 4.5 | 68.8 ± 4.9 | 68.1 ± 4.8 | 68.9 ± 4.7 | 69.8 ± 4.3 | 68.3 ± 4.6 | 67.0 ± 4.5 | 68.2 ± 4.2 | 68.3 ± 4.4 |
| Cross-validation: Sens | 67.7 ± 4.2 | 67.5 ± 4.4 | 66.6 ± 4.5 | 66.1 ± 4.7 | 67.0 ± 4.5 | 68.7 ± 4.2 | 68.3 ± 4.1 | 64.2 ± 4.6 | 66.4 ± 5.2 | 66.9 ± 4.5 |
| Cross-validation: PPV | 58.9 ± 2.9 | 59.3 ± 2.6 | 58.6 ± 2.8 | 60.0 ± 2.7 | 59.2 ± 2.7 | 59.9 ± 2.6 | 58.5 ± 3.0 | 56.2 ± 2.7 | 60.1 ± 2.1 | 58.7 ± 2.6 |
| Cross-validation: Acc | 68.4 ± 2.1 | 68.9 ± 1.8 | 67.9 ± 1.9 | 67.3 ± 1.8 | 68.1 ± 1.9 | 69.3 ± 1.7 | 68.3 ± 2.0 | 65.9 ± 2.0 | 67.4 ± 1.5 | 67.7 ± 1.8 |
| Cross-validation: AUC | 72.6 ± 1.5 | 72.7 ± 1.2 | 71.8 ± 1.6 | 71.2 ± 1.5 | 72.1 ± 1.5 | 73.5 ± 1.3 | 72.2 ± 1.7 | 69.0 ± 1.8 | 71.5 ± 1.5 | 71.6 ± 1.6 |

Comparing the performance of six DL architectures in predicting IPMN dysplasia grade

Among the CNNs tested, DenseNet121 27 demonstrated the highest AUC at 73.3% (Table 4). In comparison, ResNet-34 28 achieved a slightly lower AUC of 73.1%. Lightweight models, such as EfficientNet-B0 29 and ShuffleNet-V2 30, exhibited lower AUC values of 68.1% and 66.1%, respectively.

Table 4 Deep learning results and standard deviation of IPMN cyst malignancy risk stratification in 5-fold cross-validation 27–31.

| Model | AUC (%) | Acc (%) | Sens (%) | Spec (%) |
|---|---|---|---|---|
| DenseNet121 | 73.3 ± 7.9 | 68.0 ± 7.7 | 46.5 ± 23 | 82.7 ± 6.5 |
| Mobilenetv2 | 73.0 ± 2.9 | 66.6 ± 2.3 | N/A | N/A |
| ResNet34 | 73.1 ± 4.7 | 68.5 ± 4.4 | N/A | N/A |
| ResNet50 | 71.8 ± 6.3 | 66.0 ± 6.1 | N/A | N/A |
| ShuffleNet-V2 | 66.6 ± 5.7 | 61.3 ± 1.7 | N/A | N/A |
| EfficientNet-B0 | 68.1 ± 1.1 | 65.6 ± 6.0 | N/A | N/A |

Evaluation of 2D and 3D radiomics-DL fusion algorithms

Using 2D radiomic features, the fusion model achieved a weighted average AUC of 74.3% and an accuracy of 71.0% in cross-validation. In independent testing, this 2D feature fusion model yielded an AUC of 69.2% and an accuracy of 61.6% (Table 5). When trained with 3D radiomic features, the fusion model demonstrated a weighted average AUC of 73.4% and an accuracy of 98.4% in cross-validation, and an AUC of 68.3% and accuracy of 62.7% in independent testing.

Table 5 Radiomics-deep learning fusion algorithm results. 2D and 3D radiomic features were fed into DenseNet121 in 5-fold cross-validation on 4 different trials 27.
| Metric (%) | 2D T1 | 2D T2 | 2D T3 | 2D T4 | 2D Weighted Average | 3D T1 | 3D T2 | 3D T3 | 3D T4 | 3D Weighted Average |
|---|---|---|---|---|---|---|---|---|---|---|
| Cross-validation: Spec | 60.9 ± 9.6 | 62.9 ± 14.3 | 77.6 ± 17.3 | 72.2 ± 11.6 | 68.4 ± 7.8 | 100.0 ± 0.0 | 98.4 ± 2.0 | 98.3 ± 2.3 | 96.3 ± 4.6 | 98.3 ± 1.5 |
| Cross-validation: Sens | 81.2 ± 10.2 | 78.5 ± 12.3 | 63.9 ± 13.1 | 77.6 ± 17.4 | 75.3 ± 7.8 | 98.5 ± 1.9 | 100.0 ± 0.0 | 98.5 ± 3.0 | 95.1 ± 3.6 | 98.0 ± 2.1 |
| Cross-validation: Acc | 68.7 ± 2.6 | 68.7 ± 5.9 | 71.5 ± 4.9 | 74.9 ± 3.2 | 71.0 ± 3.0 | 99.4 ± 0.8 | 99.1 ± 1.2 | 98.5 ± 1.4 | 96.5 ± 1.1 | 98.4 ± 1.3 |
| Cross-validation: AUC | 73.3 ± 2.9 | 74.6 ± 4.7 | 73.9 ± 2.6 | 75.6 ± 4.2 | 74.3 ± 3.6 | 77.1 ± 3.2 | 74.0 ± 6.1 | 70.0 ± 4.8 | 72.5 ± 2.2 | 73.4 ± 4.1 |
| Testing: Spec | 66.7 ± 12.9 | 66.7 ± 24.2 | 54.3 ± 24.6 | 75.5 ± 16.9 | 67.8 ± 7.9 | 63.3 ± 4.1 | 62.7 ± 13.1 | 35.7 ± 10.1 | 77.3 ± 5.0 | 63.3 ± 15.4 |
| Testing: Sens | 35.3 ± 9.8 | 46.0 ± 28.4 | 77.1 ± 10.2 | 63.7 ± 13.9 | 57.9 ± 14.4 | 56.5 ± 6.0 | 57.0 ± 14.0 | 81.0 ± 9.0 | 60.4 ± 7.6 | 63.2 ± 9.1 |
| Testing: Acc | 48.3 ± 2.2 | 54.9 ± 6.9 | 68.0 ± 7.1 | 67.3 ± 4.9 | 61.6 ± 7.9 | 59.3 ± 3.4 | 59.4 ± 3.3 | 62.9 ± 7.3 | 65.6 ± 3.8 | 62.7 ± 2.8 |
| Testing: AUC | 47.1 ± 1.7 | 61.7 ± 3.8 | 77.5 ± 5.5 | 79.0 ± 1.6 | 69.2 ± 2.9 | 59.3 ± 3.6 | 63.3 ± 2.3 | 67.6 ± 4.7 | 74.7 ± 2.2 | 68.3 ± 2.9 |

DISCUSSIONS

In this large, multicenter study focused on cyst regional-level IPMN analysis from T2W MRI scans of 359 subjects, we demonstrated the feasibility of using radiomics and DL approaches for malignancy risk stratification of IPMN lesions. In visual scoring, the current standard in clinics, raters had minimal to moderate agreement, with weighted kappa scores of 0.33–0.67 32. The visual scoring accuracies of the majority cases and of Rater 1 were similar to the testing accuracies of the radiomics-only algorithms, and higher than the testing accuracies of the fusion algorithms and the accuracies of Rater 2 and Rater 3. Our DenseNet121 deep learning model achieved the highest performance (AUC 73.3%, accuracy 68.0%), followed closely by our radiomics-deep learning fusion algorithm using 2D radiomic features (AUC 69.2%, accuracy 61.6% in testing; AUC 74.3%, accuracy 71.0% in cross-validation).
This performance effectively balanced parameter efficiency and predictive power. The lightweight models EfficientNet-B0 and ShuffleNet-V2 exhibited lower AUC values, underscoring the trade-off between model complexity and predictive accuracy across diverse architectures. The DL-radiomics fusion algorithm using 2D radiomic features attained a weighted average AUC of 69.2% and accuracy of 61.6% in testing, and a weighted average AUC of 74.3% and accuracy of 71.0% in cross-validation. The radiomics-only analysis using 3D features followed, with an AUC of 66.5% and accuracy of 66.1% in testing. Comparable performance was observed between algorithms utilizing 2D versus 3D radiomic features, indicating the potential utility of computationally efficient 2D methods. Importantly, these advanced methods demonstrated performance that matched or exceeded expert radiologist assessment, highlighting their potential to augment clinical decision-making in IPMN management.

In our earlier work (Yao et al. 2023), we classified IPMN malignancy risk using advanced analysis techniques coupled with an automatic whole-pancreas segmentation algorithm in 246 T1W and T2W MRI scans from five centers 23. In that work, we developed three algorithms incorporating clinical features (age, gender, BMI, diabetes mellitus, and chronic pancreatitis): radiomics-only, DL-only, and DL-radiomics fusion, using four CNNs and a Vision Transformer (ViT). Those algorithms stratified cases as healthy (n = 70), low-grade risk (n = 85), and high-grade risk (n = 91). Our current results are therefore not entirely comparable with Yao et al. 2023 23, because we moved from three-class to two-class classification by focusing only on cystic cases. In our earlier results, we found a mean HD95 of 26.08 mm and a DSC of 70.11% for our automatic segmentations, which might have introduced additional errors in the radiomics and DL analyses.
In the present study, by contrast, we used manual segmentations (i.e., ground truths) produced by interdisciplinary experts and then reviewed by expert radiologists to ensure accuracy; hence, we minimized segmentation-induced errors in the radiomics and DL analyses. Another key difference is that our earlier study did not include cyst type, and all of its experiments were conducted on a much smaller cohort.

Cui et al. 2021 conducted a study to develop a nomogram to predict the pathological grade of BD-IPMN 33. The nomogram incorporated clinical features (sex, symptoms, age, CA19-9, and CEA) and radiomic features derived from manually segmented cysts. Their dataset included T2W, T1W, and contrast-enhanced T1W scans from 202 patients collected at three centers. Their data were classified by dysplasia grade as low or high, and 24.8% of their BD-IPMN cases had high-grade dysplasia. Using radiomic-only features, they reported specificity, sensitivity, and AUC of 81.6%, 70.0%, and 81.1%, respectively, on validation. Once radiomic and clinical features were combined, their nomogram achieved specificity, sensitivity, and AUC of 79.0%, 90.0%, and 88.4% on validation. The main difference between the two studies is the type of IPMN included: they analyzed only BD-IPMN, while we included MD-IPMN, BD-IPMN, and mixed types. While promising, nomograms may perform poorly when applied to populations different from their development cohort, limiting their generalizability across diverse clinical settings 34. Additionally, the focus on only BD-IPMN could lead to selection bias in their study because BD-IPMN has a lower risk of malignancy. Their proportion of high-grade dysplasia cases is lower than ours and may not be representative of a real-world IPMN cohort, which we tried to approximate.
Furthermore, the authors included an additional contrast-enhanced T1W sequence in their analysis, while we confined ourselves to conventional T1W and T2W sequences. In comparing our results, their radiomics-only analysis outperformed ours in AUC and specificity. This is likely because their analysis included several clinical features, such as patient symptoms and tumor markers, which are known to be predictive of higher-risk IPMN 2. We are aware of the significance of clinical features in predicting IPMN malignancy risk and plan to incorporate them into our future analyses. Despite this, our radiomics-only analysis had similar sensitivity to theirs. This held despite our use of data from more centers and may reflect our larger dataset. Overall, our study was more strictly an image-analysis study than theirs; it could provide a more robust malignancy risk prediction method, with the promise of even better predictions once clinical markers are combined with imaging.

To our knowledge, most studies that have used radiomics to classify IPMN are CT-based 35–38. MRI is preferred over CT for IPMN classification and monitoring because it involves no radiation exposure, has higher contrast resolution, and is better at assessing tissue and cysts 3, 39. Furthermore, ours is the most comprehensive study on IPMN malignancy risk stratification that utilizes cyst masks in MRI 20, 33, 40. Cheng et al. 2022 found superior performance of an MRI radiomics algorithm compared with CT in predicting IPMN malignant potential 20. Among studies that have utilized MRI, two analyzed only BD-IPMN and the remainder did not specify IPMN type 20, 23, 33, 40. We found that 38.5% of our Mixed/MD-IPMN and 78.2% of BD-IPMN lesions were Low-Risk.
This suggests that many pancreatic resections are unnecessarily performed because a lesion is a MD-IPMN, without any further analysis to stratify lesions that may actually be at risk of malignancy. MD-IPMN is frequently surgically resected in patients that do not have contraindications to surgery, as it has a higher risk of malignant transformation than BD-IPMN 2 , 3 . Studies that only include BD-IPMN are excluding an important and under-investigated subtype. We included MD- and mixed-IPMN in our advanced algorithm training to approximate a more real-world cohort. Our study has several limitations that should be considered. First, its retrospective design inherently limits causal inference and introduces potential biases in the historical data collection. Second, the data collected over two decades contributed to variations, including differences in scan quality and uncertainties regarding the accurate grading of dysplasia. The experience levels of operators and pathologists varied across cases, potentially affecting the reliability of dysplasia assessments. Additionally, there was no standardized protocol for selecting cases for EUS-FNA, which may introduce bias since some patients might have undergone EUS for reasons unrelated to the malignancy risk of cystic lesions. Consequently, cytology might have been obtained from cysts that were not classified as high-risk based on imaging. Moreover, the appearance of cysts may have changed in MRI images taken after the EUS procedure, which could complicate image analysis. Despite the risk of cyst appearance changes following EUS, we have found our results to be reliable using segmentations of visible cystic lesions. Acknowledging these concerns, we thoroughly reviewed the dataset to ensure its suitability for the study. Third, for the BD-IPMN group, we exclusively analyzed data from the sampled cysts. 
This intentional selection introduced some selection bias; however, focusing on patients at higher risk for malignancy was crucial. Consequently, our observed rates of malignancy risk for BD-IPMNs are similar to or higher than those reported in the broader literature, which frequently includes milder cases 2, 3. Fourth, our dataset was collected from seven institutions using various brands of MRI scanners and field strengths (1.5T and 3T) with differing image acquisition protocols. This variability poses analytical challenges and ultimately affects the algorithm's performance. At the same time, training on a diverse and heterogeneous multicenter dataset should strengthen the algorithm's robustness and stability across different environments and enhance its applicability in real-world clinical settings, where imaging protocols frequently vary. Fifth, our image analysis was limited to T2W MRI sequences due to data availability constraints. We plan to include and analyze additional MRI sequences in our future studies. Lastly, radiologist raters utilized only T1W and T2W sequences for expert risk assessment; however, these sequences alone are insufficient for a thorough visual evaluation and do not fully represent real-life assessments. Moreover, the radiologist raters lacked access to previous scans, clinical information, and other critical MRI sequences (such as diffusion sequences) that are valuable for accurately estimating risk. These factors could affect the accuracy of visual scoring compared to standard comprehensive imaging analyses.

These limitations point to several promising directions for future research. Prospective validation studies with standardized imaging protocols would strengthen evidence for clinical translation. Integration of clinical parameters (age, symptoms, tumor markers) and additional MRI sequences (contrast-enhanced, diffusion-weighted) could further improve model performance.
Development of ensemble approaches that combine imaging features with other biomarkers (cyst fluid analysis, circulating markers) might provide more comprehensive risk assessment. Finally, extending these methods to predict long-term outcomes rather than cross-sectional histopathology would better align with the clinical goal of identifying lesions likely to progress to malignancy.

In conclusion, our multicenter, pancreatic cyst-focused study demonstrates the feasibility and potential clinical utility of radiomics and deep learning for IPMN risk stratification using routinely acquired T2W MRI scans. While predictive performance requires further enhancement, potentially through integration of clinical data and additional imaging sequences, our advanced machine learning models achieved performance comparable to, and in some cases better than, that of expert radiologists in this challenging cohort, offering greater objectivity and reproducibility than visual assessment. Given that current international consensus guidelines lack optimal specificity for identifying low-risk IPMNs without invasive procedures, computational tools like ours represent a valuable step toward more precise patient selection for intervention versus surveillance.

Hence, our findings have immediate clinical relevance. The fusion model's performance, comparable to that of expert radiologists, suggests potential for integration into clinical workflows as a decision support tool. By providing objective risk stratification of IPMNs, our approach could reduce the high rates of unnecessary surgical resections of low-risk lesions, particularly for MD-IPMNs, which are often resected based solely on morphology. Implementation could take the form of a software plugin for radiology workstations, offering real-time risk assessment during routine reads without disrupting workflow. Cost-effectiveness analyses and prospective validation would be logical next steps toward clinical translation.
METHODS

Data collection and subject selection

Our retrospective study (overview in Fig. 6) was approved by an Institutional Review Board (IRB), and all images were de-identified prior to use in accordance with ethical standards. We collected 746 T2W MRI scans from patients over 18 years of age undergoing assessment for pancreatic cystic lesions between March 2004 and June 2024. Scans were collected from seven centers: Allegheny Health Network (AHN), Erasmus Medical Center (EMC), Istanbul University (IU) Hospital, Mayo Clinic Florida (MCF), Mayo Clinic Arizona (MCA), Northwestern Memorial Hospital (NMH), and New York University Langone Hospital (NYU) (Fig. 6-A). From an initial cohort of 746 subjects, 359 met the inclusion criteria and were selected for analysis (Fig. 7). The selected cohort had a mean age of 67.2 ± 10.8 years and was 53% female. Images were acquired on Siemens, Philips, or GE scanners with either 1.5 T or 3 T field strength. After collection, the selected images were converted to Neuroimaging Informatics Technology Initiative (NIfTI) format for analysis. We selected axial, non-fat-suppressed T2W scans; slice thicknesses of the original DICOM files were between 3 and 8 mm, and voxel heights of the converted NIfTI files were 3–15.9 mm. This dataset includes abdominal MRIs of subjects with pancreatic cysts that were selected from an extended version of the PanSegNet dataset by our multicenter group (Zhang et al., 2025 25).

We evaluated the radiologic and histopathology results of all subjects prior to inclusion in our study (Fig. 7). On radiologic evaluation, 216 scans were excluded due to the absence of a pancreatic cyst, the presence of a different histopathology, or an unavailable radiologic result. The remaining 530 were further evaluated histopathologically via EUS-FNA or surgical resection. Among this cohort, 171 either did not undergo intervention or had histopathology findings negative for IPMN, leaving 359 patients for our study.
Subject classification

Dysplasia grades for BD-IPMN were determined via histopathology from EUS-FNA or surgical resection. All MD or mixed IPMN were surgically resected and histopathologically evaluated. Subjects were then grouped based on dysplasia grade: lesions with low-grade dysplasia (LGD) as Low Risk, and lesions with high-grade dysplasia (HGD) and/or invasive carcinoma (IC) as High Risk (Table 6). The Low-Risk group was composed of 217 scans, which included 78.2% of the BD-IPMN cases and 38.5% of the MD or mixed-type IPMN cases. The High-Risk group included 142 subjects in total, 75 with HGD and 67 with IC, comprising 21.7% of the BD-IPMN cases (43 of 198) and 61.5% of the MD or mixed-type IPMN cases. Nearly one-third of Low-Risk patients (62 of 217) had MD or mixed IPMN and were therefore unnecessarily resected, because current guidelines largely suggest surgical resection for MD lesions 2.

Table 6 Breakdown of IPMN subtype in the Low and High-Risk groups.

| Group | BD-IPMN (%) | Mixed & MD-IPMN (%) |
|---|---|---|
| Low Risk | 155 (78.2) | 62 (38.5) |
| High Risk | 43 (21.7) | 99 (61.5) |
| Total | 198 | 161 |

Image quality assessment

To carefully investigate the retrospective data accrued over a 20-year time span (2004–2024) across different institutions, quality indicators were calculated to evaluate center variabilities caused by imaging devices and acquisition protocols. A total of 21 image-quality indicators, including statistical values of intensities (e.g., mean, range, variance) and second-order statistics or filter-based measures (e.g., contrast per pixel, entropy focus criterion, and signal-to-noise ratios), were calculated using the open-source MRQy tool 41. Then, to visualize the quality indicators, the features were projected into a 2D plot using Uniform Manifold Approximation and Projection (UMAP) for dimension reduction. Before UMAP projection, each feature was normalized across the dataset using three different methods: z-score, min-max, and data whitening.
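The three normalizations named above can be sketched with NumPy as follows. This is an illustrative sketch, not the study's code: the function names are hypothetical, the input is assumed to be a scans-by-indicators matrix (for example 359 rows by 21 columns), and because the text does not say which whitening variant was used, PCA whitening is assumed here.

```python
import numpy as np

def zscore_normalize(X):
    """Center each column to mean 0 and scale it to unit standard deviation."""
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd = np.where(sd == 0, 1.0, sd)  # leave constant columns at zero
    return (X - mu) / sd

def minmax_normalize(X):
    """Rescale each column linearly to the range [0, 1]."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span = np.where(span == 0, 1.0, span)  # avoid division by zero
    return (X - lo) / span

def pca_whiten(X, eps=1e-8):
    """PCA whitening (assumed variant): decorrelate the columns and give
    each resulting component unit variance."""
    Xc = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(Xc, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate onto the principal axes, then rescale each axis.
    return (Xc @ eigvecs) / np.sqrt(eigvals + eps)

if __name__ == "__main__":
    # Synthetic stand-in for a (scans x quality indicators) matrix.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(359, 21)) * rng.uniform(0.5, 10.0, size=21)
    for name, fn in [("z-score", zscore_normalize),
                     ("min-max", minmax_normalize),
                     ("whitening", pca_whiten)]:
        print(name, fn(X).shape)
```

Each normalized matrix would then be passed to a UMAP implementation (for example the umap-learn package) to obtain the 2D embedding; that step is omitted here to keep the sketch dependency-free beyond NumPy.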
Manual segmentation
The index lesion for each MRI scan was segmented manually and reviewed by an interdisciplinary team of radiologists (GD, an abdominal radiologist with seven years of experience; and FB, a general radiologist with four years of abdominal radiology experience) and students (AMB and HEA, third-year medical students; and ZSJ, a fourth-year undergraduate student) using ITK-SNAP (Version 4.2.0) 42. All segmentations were reviewed by three expert abdominal radiologists [GD, FB, YBT] to ensure accuracy and consistency. For subjects with BD-IPMN, the index lesion was defined as the cyst that was sampled in EUS-FNA or that was surgically resected. For mixed and MD-IPMN subjects, all regions with cystic involvement were surgically resected. Therefore, approximate cyst boundaries were discussed among the abdominal radiologists and agreed by consensus prior to segmentation. Concomitant cystic lesions were not included in the analyses.
Interobserver and intraobserver agreement were assessed to evaluate the quality and reproducibility of the image segmentations. Agreement was calculated using the Dice Similarity Coefficient (DSC) and the 95th-percentile Hausdorff distance (HD95). Higher DSC and lower HD95 values indicate closer agreement between two segmentations. Thirty randomly selected MRI scans were segmented by a separate radiologist and compared to the corresponding reference segmentations to assess interobserver agreement. To determine intraobserver agreement, 20 randomly selected MRI scans were segmented a second time after a wash-out period of two weeks.
Radiomics and DL methods are discussed below. The centers were grouped into four trial sets: in each trial, data from one or two centers were held out for testing, while data from the remaining centers were used for cross-validation (Table 7). To simulate a realistic scenario, the test set in each trial consists of all data from the held-out center or centers, so the test set size differs between trials.
These sets are referred to as Trial 1 (T1), Trial 2 (T2), Trial 3 (T3), and Trial 4 (T4). The groupings across trials were chosen so that low- and high-risk studies have balanced representation in the training and test sets. To ensure robust and comprehensive validation, the models from each trial were evaluated on test data from different centers.
Table 7 Centers used for testing and cross-validation of the four trial sets used in the radiomics, DL, and radiomics-DL fusion analyses.
Trial   Cross Validation (N)                      Testing (N)
T1      EMC, IU, MCF, NMH, & NYU (330)            AHN & MCA (29)
T2      AHN, IU, MCA, MCF, NMH, & NYU (323)       EMC (36)
T3      AHN, EMC, MCA, MCF, NMH, & NYU (324)      IU (35)
T4      AHN, EMC, IU, MCA, MCF, & NMH (288)       NYU (71)
Radiomics-only analysis
We conducted a 2D and a 3D radiomics analysis (Fig. 6-B). Images were resampled with linear interpolation to an isotropic voxel size of 1 mm³ (3D analysis) or an isotropic pixel size of 1 mm² (2D analysis). N4 bias field correction was applied to reduce low-frequency variations in the acquired signals 43. Intensity values were normalized using min-max normalization 44. Radiomic features were then extracted from the preprocessed images using in-house software and the collageradiomics package developed with Python 45 – 48. For the 3D analysis, 763 radiomic features were extracted from the entire volume of the cyst. For the 2D analysis, 447 features were extracted from axial slices. Features were extracted from six radiomic families. The Raw (original intensity values) and Gray (image filtered with median, mean, standard deviation, and range filters) families were used to capture spatial properties of pixel intensities. The Gradient (image filtered with Sobel and gradient-like kernel filters) and Laws (image filtered with specific local masks) families were used to capture edge-related features 49 – 51.
Additionally, the Haralick and CoLlAGe feature families were used to characterize gray-level co-occurrence matrices (GLCMs) 46, 47. The window sizes used to construct the GLCMs were w = 3×3, 5×5, and 7×7, and the number of gray levels was set to 4, 8, 16, 32, and 64 (see the detailed description of the radiomic families in Table S1 of the supplementary material). The features were calculated within the entire region of interest (the cyst), and each feature was then summarized by four statistical measures: median, standard deviation, skewness, and kurtosis. Using the training set of each trial, a Spearman correlation threshold of 0.6 was applied to remove highly correlated features. Then, a 5-fold cross-validation scheme with 50 iterations was applied to select the best features, using the minimum Redundancy Maximum Relevance (mRMR) algorithm and training a Random Forest (RF) model 52. After cross-validation, features selected in at least 70% of the iterations were retained. Finally, an RF was trained on the entire training set and tested on the hold-out set.
Deep learning-only analysis
The DL experiment was done in two parts (Fig. 6-C). First, to select the best-performing model robustly, we applied 5-fold cross-validation on the entire dataset from all centers. We assessed six convolutional neural networks (CNNs): EfficientNet-B0 29, MobileNet-V2 31, ResNet-34 28, ResNet-50 28, ShuffleNet-V2 30, and DenseNet-121 27. ROIs were cropped based on the whole-pancreas segmentation published in Zhang et al., 2025 25. Images were shuffled and resized to 96×96×96 voxels for training. Models were trained using stochastic gradient descent (SGD) with a momentum of 0.9 and a batch size of 2 for a total of 200 epochs. The initial learning rate was 0.001 and was decreased by a factor of 10 every 30 epochs.
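The learning-rate schedule described above (initial rate 0.001, divided by 10 every 30 epochs, 200 epochs in total) can be written out as a small function. This is a sketch under stated assumptions rather than the authors' training code: the function name `step_lr` is hypothetical, and counting epochs from 0, so that the first drop happens at epoch 30, is an assumption the text does not settle.

```python
def step_lr(epoch, base_lr=0.001, step=30, factor=10.0):
    """Learning rate for a 0-indexed epoch under step decay.

    The rate is divided by `factor` once for every `step` completed epochs.
    Epoch indexing from 0 is an assumption made for this illustration.
    """
    return base_lr / (factor ** (epoch // step))


# Learning rate for each of the 200 training epochs.
schedule = [step_lr(epoch) for epoch in range(200)]
```

Under these assumptions the rate has been reduced six times by the last epoch, so the final epochs make only very small updates to the weights.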
In the second part, we split the dataset by center according to the four trial sets described in Table 7. The best-performing DL model from part one was then trained on each set using the same parameters.
Radiomics-deep learning fusion algorithm
Our radiomics-DL fusion algorithm fuses the decision probabilities of the radiomics Random Forest classifier with those of the best-performing CNN in the DL-only analysis, DenseNet-121 27 (Fig. 6-C). Radiomic features were refined by applying 5-fold cross-validation and selecting features with a Spearman correlation coefficient below 0.6 to minimize redundancy. Both the radiomics-based Random Forest model and the deep learning model were retrained on the same training set, which was split consistently to ensure comparability. After training, the predicted probabilities from both models were fused, and the combined output was evaluated on the validation and test sets.
For decision-level fusion, the probability outputs of the DenseNet-121 and Random Forest models were combined. Inspired by our earlier work (Yao et al., 2023 23), we selected the fusion hyperparameters by grid search with five-fold cross-validation 18. The fusion method has two hyperparameters: the threshold t and the weight k. If the radiomics prediction exceeded the threshold t, the final model output was based solely on the radiomics prediction, and the DL prediction was discarded. Otherwise, the fusion output was a weighted combination of the two predictions, with a weight of 1 − k for the radiomics-based model and k for the DL model. The fusion pipeline was run twice, using either 2D or 3D radiomic features. Weighted averages were calculated because the number of subjects differs between centers. A visualization of our fusion pipeline is provided in Fig. 6-C.
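The threshold-and-weight rule described above can be expressed as a short function. This is a minimal sketch, not the authors' released code: the name `fuse`, the argument names, and the example values of t and k are hypothetical, and both inputs are assumed to be predicted probabilities of the High-Risk class for a single case.

```python
def fuse(p_rad, p_dl, t, k):
    """Fuse radiomics and DL probabilities for one case.

    p_rad: probability from the radiomics Random Forest model.
    p_dl:  probability from the DL model.
    t:     threshold above which the radiomics prediction is used alone.
    k:     weight of the DL model; the radiomics model gets 1 - k.
    """
    if p_rad > t:
        # Radiomics prediction exceeds the threshold: discard the DL output.
        return p_rad
    # Otherwise blend the two predictions.
    return (1.0 - k) * p_rad + k * p_dl


# Hypothetical hyperparameters, for illustration only.
confident = fuse(p_rad=0.9, p_dl=0.2, t=0.8, k=0.5)
blended = fuse(p_rad=0.4, p_dl=0.8, t=0.8, k=0.25)
```

In the first call the radiomics probability exceeds t, so it is returned unchanged; in the second the output is the weighted combination 0.75 × 0.4 + 0.25 × 0.8. A grid search over candidate (t, k) pairs would call such a function on each validation fold and keep the best-scoring pair.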
Radiologist Visual Scoring
Images were visually scored by three independent expert radiologists [GD, FB, YBT] using the imaging features of the Kyoto Criteria 2. Cysts were labeled as no risk, low risk, or high risk according to the radiological assessment. To emulate real-life initial evaluation of a cystic lesion, the radiologists were not told that the cysts were confirmed IPMNs. Additionally, the radiologists were blinded to the subjects' clinical information, used only T2W and contrast-enhanced T1W sequences, and did not have access to previous imaging. Thirteen cases from the study cohort were excluded from this analysis because T1W images were not available (n = 347). Pairwise weighted kappa statistics were calculated to evaluate agreement between raters. Sensitivity and specificity were calculated to evaluate how accurately the radiologists identified high-risk lesions.
Declarations
Author Contributions: Conceptualization, AMB, GD, and UB; methodology, AMB, GD, and UB; software, ZH, and MJG; validation, ZH, MJG, and LZ; formal analysis, ZH, HP, MJG, and LZ; investigation, AMB, ZH, MJG, and ZSJ; resources, CS, GDK, YV, EA, PT, AM, MSE, ZX, SJ, IGS, MJB, CH, TG, and CB; data curation, EK, GD, AMB and HEA; writing (original draft preparation), AMB, ZH, and MJG; writing (review and editing), AMB, UB, and GD; visualization, AMB, ZH, MJG, and HEA; supervision, GD, UB, RNK, FHM; project administration, UB, RNK, FHM, GD, MBW, and PT; funding acquisition, UB, MBW, and RNK. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement: The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Northwestern University (protocol code STU00214545, approved on 04/15/2021).
Informed Consent Statement: Patient consent was waived due to the retrospective and blinded nature of the study.
Acknowledgments: This study was funded by: NIH R01-CA246704, R01-CA240639, U01-CA268808, R01-HL171376, U01 DK127384-02S1, U01CA248226, R01CA277728, R01CA264017, and the WARF Accelerator Oncology Diagnostics Award.
Competing Interests: Authors declare no conflict of interest except the following ones: Dr. Ulas Bagci acknowledges: Ther-AI LLC. Dr. Pallavi Tiwari is an equity holder in LivAI Inc. and serves as a scientific consultant for Johnson & Johnson. Dr. Rajesh N. Keswani acknowledges: Boston Scientific - consultant; Olympus - consultant; Medtronic - consultant and research support. Dr. Michael B. Wallace acknowledges Boston Scientific, ClearNote Health, Cosmo Pharmaceuticals, Endostart, Endiatix, Fujifilm, Medtronic, Surgical Automations, Ohelio Ltd, Venn Bioscience, Virgo Inc., Surgical Automation, and Microtek. Dr. Marco J. Bruno acknowledges: Boston Scientific - consultant, support for industry and investigator-initiated studies; Cook Medical - consultant, support for industry and investigator-initiated studies; Pentax Medical - consultant, support for investigator-initiated studies; Mylan - support for investigator initiated studies; AMBU - consultant, support for investigator initiated studies; ChiRoStim - support for investigator-initiated studies.
Data Availability Statement: Our MRI and corresponding excel file for risk status of the patients are available at OSF server (NIH supported data sharing platform) at https://osf.io/74vfs/.
Code Availability Statement: The underlying code for this study is available in GitHub and can be accessed via this link: https://github.com/Zilian4/IPMN-Radiomics-Plus-Deeplearning.
References
1. Schweber, A. B., Agarunov, E., Brooks, C., Hur, C. & Gonda, T. A. Prevalence, incidence, and risk of progression of asymptomatic pancreatic cysts in large sample real-world data. Pancreas 50, 1287–1292 (2021).
2. Ohtsuka, T. et al. International evidence-based Kyoto guidelines for the management of intraductal papillary mucinous neoplasm of the pancreas. Pancreatology 24, 225–270 (2023).
3. Gonda, T. A., Cahen, D. L. & Farrell, J. J. Pancreatic Cysts. N. Engl. J. Med. 391, 832–843 (2024).
4. Heckler, M. et al. The Sendai and Fukuoka consensus criteria for the management of branch duct IPMN-A meta-analysis on their accuracy. Pancreatology 17, 255–262 (2017).
5. Yu, S. et al. Validation of the 2012 Fukuoka consensus guideline for intraductal papillary mucinous neoplasm of the pancreas from a single institution experience. Pancreas 46, 936–942 (2017).
6. Romutis, S. & Brand, R. Burden of new pancreatic cyst diagnosis. Gastrointestinal Endoscopy Clinics 33, 487–495 (2023).
7. Robles, E. P.-C. et al. Accuracy of 2012 International Consensus Guidelines for the prediction of malignancy of branch-duct intraductal papillary mucinous neoplasms of the pancreas. United European Gastroenterology Journal 4, 580–586 (2016).
8. Bulcke, A. V. et al. Evaluating the accuracy of three international guidelines in identifying the risk of malignancy in pancreatic cysts: a retrospective analysis of a surgical treated population. Acta gastro-enterologica Belgica 84, 443–450 (2021).
9. Maggi, G. et al. Pancreatic cystic neoplasms: What is the most cost-effective follow-up strategy? Endoscopic Ultrasound 7, 319–322 (2018).
10. Eloubeidi, M. A. et al. Acute pancreatitis after EUS-guided FNA of solid pancreatic masses: a pooled analysis from EUS centers in the United States. Gastrointest. Endosc. 60, 385–389 (2004).
11. Polkowski, M. et al. Learning, techniques, and complications of endoscopic ultrasound (EUS)-guided sampling in gastroenterology: European Society of Gastrointestinal Endoscopy (ESGE) Technical Guideline. Endoscopy 44, 190–206 (2012).
12. Tacelli, M. et al. Diagnostic performance of endoscopic ultrasound through-the‐needle microforceps biopsy of pancreatic cystic lesions: Systematic review with meta‐analysis. Dig. Endosc. 32, 1018–1030 (2020).
13. De Pretis, N. et al. Pancreatic cysts: diagnostic accuracy and risk of inappropriate resections. Pancreatology 17, 267–272 (2017).
14. Loos, M. et al. Categorization of differing types of total pancreatectomy. JAMA surgery 157, 120–128 (2022).
15. Collaborative, P. o. Pancreatic surgery outcomes: multicentre prospective snapshot study in 67 countries. Br. J. Surg. 111, znad330 (2024).
16. Gillies, R. J., Kinahan, P. E. & Hricak, H. Radiomics: images are more than pictures, they are data. Radiology 278, 563–577 (2016).
17. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
18. Yao, L. et al. A review of deep learning and radiomics approaches for pancreatic cancer diagnosis from medical imaging. Curr. Opin. Gastroenterol. 39, 436–447 (2023).
19. Corral, J. E. et al. Deep learning to classify intraductal papillary mucinous neoplasms using magnetic resonance imaging. Pancreas 48, 805–810 (2019).
20. Cheng, S. et al. Radiomics analysis for predicting malignant potential of intraductal papillary mucinous neoplasms of the pancreas: comparison of CT and MRI. Acad. Radiol. 29, 367–375 (2022).
21. LaLonde, R. et al. in International Conference on Medical Image Computing and Computer-Assisted Intervention. 101–109 (Springer).
22. Salanitri, F. P. et al. in 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). 475–479 (IEEE).
23. Yao, L. et al. in International Workshop on Machine Learning in Medical Imaging. 134–143 (Springer).
24. Zhang, Z., Yao, L., Keles, E., Velichko, Y. & Bagci, U. Deep learning algorithms for pancreas segmentation from radiology scans: A review. Advances in Clinical Radiology 5, 31–52 (2023).
25. Zhang, Z. et al. Large-scale multi-center CT and MRI segmentation of pancreas with deep learning. Med. Image Anal. 99, 103382 (2025).
26. Suman, G. et al. Quality gaps in public pancreas imaging datasets: Implications & challenges for AI applications. Pancreatology 21, 1001–1008 (2021).
27. Huang, G., Liu, Z., Van Der Maaten, L. & Weinberger, K. Q. in Proceedings of the IEEE conference on computer vision and pattern recognition. 4700–4708.
28. He, K., Zhang, X., Ren, S. & Sun, J. in Proceedings of the IEEE conference on computer vision and pattern recognition. 770–778.
29. Tan, M. Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.11946, 6105–6114 (2019).
30. Ma, N., Zhang, X., Zheng, H.-T. & Sun, J. in Proceedings of the European conference on computer vision (ECCV). 116–131.
31. Howard, A. G. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017).
32. McHugh, M. L. Interrater reliability: the kappa statistic. Biochemia medica 22, 276–282 (2012).
33. Cui, S. et al. Radiomic nomogram based on MRI to predict grade of branching type intraductal papillary mucinous neoplasms of the pancreas: a multicenter study. Cancer Imaging 21, 1–13 (2021).
34. Balachandran, V. P., Gonen, M., Smith, J. J. & DeMatteo, R. P. Nomograms in oncology: more than meets the eye. The lancet oncology 16, e173-e180 (2015).
35. Lee, D. Y. et al. Radiomics model versus 2017 revised international consensus guidelines for predicting malignant intraductal papillary mucinous neoplasms. Eur. Radiol. 34, 1222–1231 (2024).
36. Tobaly, D. et al. CT-based radiomics analysis to predict malignancy in patients with intraductal papillary mucinous neoplasm (IPMN) of the pancreas. Cancers (Basel) 12, 3089 (2020).
37. Permuth, J. B. et al. Combining radiomic features with a miRNA classifier may improve prediction of malignant pathology for pancreatic intraductal papillary mucinous neoplasms. Oncotarget 7, 85785 (2016).
38. Lou, F. et al. Comprehensive analysis of clinical data and radiomic features from contrast enhanced CT for differentiating benign and malignant pancreatic intraductal papillary mucinous neoplasms. Sci. Rep. 14, 17218 (2024).
39. Pozzi-Mucelli, R. M. et al. Pancreatic MRI for the surveillance of cystic neoplasms: comparison of a short with a comprehensive imaging protocol. Eur. Radiol. 27, 41–50 (2017).
40. Flammia, F. et al. Branch duct-intraductal papillary mucinous neoplasms (BD-IPMNs): An MRI-based radiomic model to determine the malignant degeneration potential. Radiol. Med. 128, 383–392 (2023).
41. Sadri, A. R. et al. MRQy - An open-source tool for quality control of MR imaging data. Med. Phys. 47, 6029–6038 (2020).
42. Yushkevich, P. A. et al. User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage 31, 1116–1128 (2006).
43. Tustison, N. J. et al. N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging 29, 1310–1320 (2010).
44. Bagci, U., Udupa, J. K. & Bai, L. in Medical Imaging 2010: Visualization, Image-Guided Procedures, and Modeling. 602–613 (SPIE).
45. Inc, T. MATLAB version: 9.13.0 (R2022b). The MathWorks Inc (2022).
46. Prasanna, P., Tiwari, P. & Madabhushi, A. Co-occurrence of local anisotropic gradient orientations (CoLlAGe): a new radiomics descriptor. Sci. Rep. 6, 1–14 (2016).
47. Haralick, R. M., Shanmugam, K. & Dinstein, I. H. Textural features for image classification. IEEE Transactions on systems, man, and cybernetics, 610–621 (1973).
48. Python version: 3.8.2. Python Software Foundation (2023).
49. Mohammad, E. J., Taha, R. Y. & Mazher, H. A. Design and fundamentals of Sobel Edge Detection of an image. J Multidiscip Eng Sci Technol (JMEST) ISSN 9, 2458–9403 (2022).
50. Sobel, I. & Feldman, G. A 3x3 isotropic gradient operator for image processing. a talk at the Stanford Artificial Project in 1968, 271–272 (1968).
51. Laws, K. I. Textured image segmentation. (1981).
52. Peng, H., Long, F. & Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on pattern analysis and machine intelligence 27, 1226–1238 (2005).
Additional Declarations
No competing interests reported.
Supplementary Files
SupplementaryInformation.docx
Keswani","email":"","orcid":"","institution":"Northwestern University","correspondingAuthor":false,"prefix":"","firstName":"Rajesh","middleName":"N.","lastName":"Keswani","suffix":""},{"id":462482459,"identity":"0cb0cfe7-81bd-4ab2-9337-a56906505b96","order_by":28,"name":"Pallavi Tiwari","email":"","orcid":"","institution":"University of Wisconsin-Madison","correspondingAuthor":false,"prefix":"","firstName":"Pallavi","middleName":"","lastName":"Tiwari","suffix":""},{"id":462482460,"identity":"59890bdf-c575-4d04-aba7-1d1206df1305","order_by":29,"name":"Ulas Bagci","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABH0lEQVRIiWNgGAWjYDACdiBmbAASzMwHQPwEJLkEDNVgwAzXwgZSYZAA4xOhhYHHgDgtBoeZH39g3GFn19/O8+3Bxz1/8gyOtz8HMu4w8LPnGGDXwmYmwXgmOXnGYd7thjOeGRQbnDlj2Djj2TMGyZ43OLQwmDEwtjEnGzDzbpPmOWCQuO1GDmMzz4HDDAY3cNnC/vkDY1s9UAvPM4iW+88fNv8BarHHqYXHQIKx7bAdUAsb1BYGw2YGkC0S2LVIHuYpk0g8czxBAugpyRkHjBP3n8kxnNlz4DCPxJlnBdi08B1v3/zh445qe/7+w88kPhyQS5zZfvzBhx8HDsvxtydvwBrKIJDAwJDYgC7Ig1M5FNgTUjAKRsEoGAUjGAAAX2ho5S6SKWkAAAAASUVORK5CYII=","orcid":"","institution":"Northwestern University","correspondingAuthor":true,"prefix":"","firstName":"Ulas","middleName":"","lastName":"Bagci","suffix":""}],"badges":[],"createdAt":"2025-05-08 18:08:08","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-6622868/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-6622868/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":83675520,"identity":"fb2b5998-4747-4f71-8b34-b4ce3eb7a731","added_by":"auto","created_at":"2025-05-30 14:47:55","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":41818,"visible":true,"origin":"","legend":"\u003cp\u003eUMAP of quality indicators (projected into x and y axes from 21 quality indicators) per center using different normalization methods. 
Centers: Mayo Clinic Florida (MCF), Mayo Clinic Arizona (MCA), Northwestern Memorial Hospitals (NMH), New York University (NYU), Allegheny Health Network (AHN), Istanbul University (IU) Hospital, and Erasmus Medical Center (EMC).\u003c/p\u003e","description":"","filename":"Picture1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/774f831d4ff0db87fe1bd0df.jpg"},{"id":83675526,"identity":"e1b2469e-7e61-4dfe-9038-80d23501062b","added_by":"auto","created_at":"2025-05-30 14:47:55","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":201298,"visible":true,"origin":"","legend":"\u003cp\u003eRepresentative T2-weighted (T2W) images and reference segmentations of high-grade and low-grade IPMNs. The first row shows T2W MRI images, and the second row shows reference segmentations of high-grade and low-grade, main-duct (MD)-IPMN and branch-duct (BD)-IPMN cases.\u003c/p\u003e","description":"","filename":"Picture2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/9c4ffd206bf7a7aee103aebd.jpg"},{"id":83675522,"identity":"8f2c6b99-62ee-4b53-bd1a-b441c89669ca","added_by":"auto","created_at":"2025-05-30 14:47:55","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":26980,"visible":true,"origin":"","legend":"\u003cp\u003eReceiver operating characteristic curve of 2D and 3D radiomics predictions distinguishing between Low and High-Risk groups in the testing set.\u003c/p\u003e","description":"","filename":"Picture3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/78a4bb90f8ca8f3531929dc8.jpg"},{"id":83676392,"identity":"1c84246d-9a52-4e48-9441-ace24864fa1a","added_by":"auto","created_at":"2025-05-30 14:55:55","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":20703,"visible":true,"origin":"","legend":"\u003cp\u003eLow vs High Risk classification comparison in cross-validation 
set’s mean (%) AUC, accuracy, and F1 for 2D and 3D radiomic analyses. Mean values are shown above the error bars, which indicate standard deviation.\u003c/p\u003e","description":"","filename":"Picture4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/dd77a0ec1fff2d393499a0db.jpg"},{"id":83675525,"identity":"64599bb3-a09f-466a-94f8-387e1e78e8c6","added_by":"auto","created_at":"2025-05-30 14:47:55","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":112224,"visible":true,"origin":"","legend":"\u003cp\u003eLow vs High Risk classification comparison of mean (%) AUC, accuracy, and F1 between cross-validation and testing trials for 2D and 3D radiomic analyses. Values above each bar represent the mean value. Error bars show standard deviation.\u003c/p\u003e","description":"","filename":"Picture5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/8c3cd73bc38b9d876abca4d2.jpg"},{"id":83675523,"identity":"fe67c5fb-09bf-4df0-8d7d-f930a7fae18c","added_by":"auto","created_at":"2025-05-30 14:47:55","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":284837,"visible":true,"origin":"","legend":"\u003cp\u003eDiagram of patient selection, data set curation, and radiomics and deep learning (DL) pipelines. (A) 746 patients were selected and underwent MRI at seven centers across three countries; images were then preprocessed and manually segmented. (B) 2D and 3D radiomic features were extracted and classified using a random forest algorithm. 
(C) A DL-only analysis was conducted, then we developed a radiomics-DL fusion algorithm\u003csup\u003e27\u003c/sup\u003e.\u003c/p\u003e","description":"","filename":"Picture6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/f97214c67aa88ec229b1ba14.jpg"},{"id":83675528,"identity":"36f35ced-54a8-44d2-93b4-3098053b2a4b","added_by":"auto","created_at":"2025-05-30 14:47:55","extension":"jpg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":165686,"visible":true,"origin":"","legend":"\u003cp\u003eFlowchart of the subject selection and classification.\u003c/p\u003e","description":"","filename":"Picture7.jpg","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/77126aeb1ae71ed1ec405f02.jpg"},{"id":84717334,"identity":"dd790148-9624-49ca-bf13-d77ce43ad095","added_by":"auto","created_at":"2025-06-16 14:32:07","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2500388,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/2d29f094-2cd3-4e5c-b5bd-af738e8d2642.pdf"},{"id":83676391,"identity":"5dd865eb-9673-405f-9b2b-c220acd1265f","added_by":"auto","created_at":"2025-05-30 14:55:55","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":16614,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryInformation.docx","url":"https://assets-eu.researchsquare.com/files/rs-6622868/v1/39732166a0b3cd035b85b9fc.docx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Multi-center evaluation of radiomics and deep learning to stratify malignancy risk of IPMNs","fulltext":[{"header":"INTRODUCTION","content":"\u003cp\u003eThe increasing detection of \u003cem\u003epancreatic cysts\u003c/em\u003e has become a significant clinical challenge\u003csup\u003e\u003cspan 
additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e, imposing a substantial burden on patients (due to invasive procedures and surgical risks) and healthcare systems (due to cost of surveillance and interventions). Intraductal papillary mucinous neoplasms (IPMNs), a potentially premalignant cyst subtype, constitute a substantial proportion (estimated 50\u0026ndash;80%) of these incidental lesions\u003csup\u003e\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. Despite their premalignant potential, the risk of malignant transformation in IPMNs remains poorly defined. Resection studies report a wide range of malignancy rates of 1\u0026ndash;38% for branch duct (BD) IPMN and 33\u0026ndash;85% for main duct (MD) IPMN; figures that likely overestimate the true rate of progression\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. The most recent international consensus guidelines, the \u003cem\u003e2023 Kyoto criteria\u003c/em\u003e, represent the current standard for IPMN management\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. However, the limitations in accurately diagnosing pancreatic cystic disease and assessing the risk of malignancy in pancreatic cysts according to existing guidelines continue to impose a substantial burden on patients and healthcare systems. 
These shortcomings frequently lead to invasive diagnostic procedures and high-risk surgical resections, especially for lesions ultimately deemed low-grade\u003csup\u003e\u003cspan additionalcitationids=\"CR5 CR6 CR7 CR8\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eMagnetic resonance imaging (MRI) and endoscopic ultrasound (EUS) with fine needle aspiration (FNA) are primary modalities for IPMN diagnosis and characterization\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. EUS-FNA is particularly valuable for evaluating pancreatic cysts and assessing IPMN dysplasia grade. Accurate IPMN characterization prior to invasive procedures is necessary to lower patient burden and cost. While EUS-FNA procedures carry a low complication rate (up to 3%) and rare mortality, the diagnostic sensitivity of EUS-FNA histopathology remains limited, ranging from 4.8\u0026ndash;61.6%\u003csup\u003e10\u0026ndash;12\u003c/sup\u003e. Thus, even after this invasive, operator-dependent, and costly procedure, malignancy cannot be reliably excluded, leaving patients at risk of bleeding, pancreatitis, and infection\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e,\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e,\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e. A leading indication for the surgical resection of cystic lesions is a concern for malignancy\u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e. 
However, pancreatic resections are major surgeries with significant morbidity and mortality rates and as a result it is critical to diagnose suspicious lesions prior to surgery\u003csup\u003e\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e,\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eRadiomics, leveraging high-throughput quantitative image analysis, enables the extraction and analysis of quantitative features imperceptible to the human eye\u003csup\u003e\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u003c/sup\u003e. Deep learning (DL), a neural network-based advanced artificial intelligence (AI) technique, utilizes convolution to effectively extract and discern complex imaging patterns\u003csup\u003e\u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e\u003c/sup\u003e. While radiomics and DL have advanced pancreatic tumor detection and segmentation in computed tomography (CT) and MRI, their application to characterization of premalignant lesions including IPMNs remains nascent\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Recent studies propose radiomics, DL, or fused models for IPMN diagnosis and classification\u003csup\u003e\u003cspan additionalcitationids=\"CR20 CR21 CR22\" citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e; however, critical barriers persist. First, the pancreas\u0026rsquo; retroperitoneal anatomy and heterogeneous parenchyma complicate image analysis\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. 
Second, IPMNs exhibit marked variability in morphology and texture, even within individual cysts\u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Third, DL demands large, diverse datasets, yet pancreatic MRI\u0026mdash;the optimal and preferred modality for cyst characterization\u0026mdash;remains scarce and protocol-dependent\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e,\u003cspan additionalcitationids=\"CR25\" citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e\u003c/sup\u003e. Prior studies using radiomics/DL have focused predominantly on tumor detection or whole-pancreas analysis, potentially overlooking crucial information within the heterogeneous cyst itself \u003csup\u003e\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Many analyses exclude higher-risk MD/mixed-IPMNs, limiting applicability to the full spectrum of disease. Furthermore, smaller single-center cohorts (\u0026lt;\u0026thinsp;150 patients) limit generalizability\u003csup\u003e\u003cspan additionalcitationids=\"CR20 CR21\" citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e\u003c/sup\u003e. 
Our work addresses these gaps by performing a large, multicenter evaluation focused specifically on cyst-level regional features from T2W MRI, including MD- and mixed-type IPMNs, to predict dysplasia grade using 2D and 3D radiomics, DL, and fusion approaches.\u003c/p\u003e"},{"header":"RESULTS","content":"\u003cp\u003eIn this section, we evaluate the development and performance of three advanced machine learning algorithms for the stratification of IPMN dysplasia grade in MRI: 1) radiomics-only; 2) DL-only; 3) radiomics-DL fusion. Each approach was assessed using rigorous validation protocols across our multicenter dataset.\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eEvaluation of dataset heterogeneity using UMAP\u003c/h2\u003e \u003cp\u003eA UMAP (uniform manifold approximation and projection) embedding of the normalized image quality indicators was used to explore the high-dimensional representation of the multi-institutional MRI data (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). UMAP revealed distinct clusters of scans that were associated with the acquiring center (seven in total) and directly related to the voxel height of the MRI images. One cluster comprised scans primarily from the Mayo Clinic Florida (MCF) center (blue), characterized by a mean voxel height of 4 mm. Another cluster included scans from the Northwestern Memorial Hospitals (NMH) and New York University (NYU) clinical centers (green and red), with respective mean voxel heights of 5.5 mm and 5 mm. A separate cluster consisted of Erasmus Medical Center (EMC) scans (pink), notable for a 7.3 mm mean voxel height and acquisition exclusively on a 1.5T MRI magnet. Scans from the Mayo Clinic Arizona (MCA), Allegheny Health Network (AHN), and Istanbul University (IU) Hospital centers (orange, purple, and brown), exhibiting a 7 mm mean voxel height, were distributed outside these primary clusters. 
These findings underscore significant variations in image quality across participating centers.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eManual segmentation of index cystic lesions\u003c/h3\u003e\n\u003cp\u003eThe intraobserver mean Dice similarity coefficient (DSC) across three readers (abbreviated as GK, AMB, and HEA) was 80%, with a 95th-percentile Hausdorff distance (HD95) of 6.63 mm. The interobserver mean DSC was 75%, with an HD95 of 7.2 mm. These DSC and HD95 values indicate high segmentation consistency and support the reliability of our reference standard segmentations. Representative T2-weighted (T2W) MRI images and corresponding segmentations of main-duct (MD) and branch-duct (BD)-IPMN across varying dysplasia grades are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eVisual scoring and risk prediction\u003c/h3\u003e\n\u003cp\u003eDiagnostic performance metrics are summarized in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. Rater 3 achieved the highest sensitivity (72.3%, 95% CI: 64.1\u0026ndash;79.5%), while Rater 1 demonstrated the highest specificity (64.5%, 95% CI: 57.8\u0026ndash;70.9%). In cases with concordant majority readings (n\u0026thinsp;=\u0026thinsp;337), positive percent agreement was calculated. Pairwise comparisons of overall accuracy demonstrated significant heterogeneity, with the highest concordance observed between Raters 1 and 3 (80.1%, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001), and the lowest between Raters 1 and 2 (48.1%, p\u0026thinsp;=\u0026thinsp;0.042). Weighted Cohen's kappa analysis indicated fair to moderate agreement between raters, with κ ranging from 0.33 to 0.67, as detailed in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. 
The inter-observer variability demonstrated statistically significant heterogeneity (Cochran's Q test, p\u0026thinsp;\u0026lt;\u0026thinsp;0.001), underscoring the subjective nature of visual assessment in this context.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eComparison of rater performance in visual scoring of IPMN by the imaging features of the Kyoto Criteria\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. Sensitivity and specificity of each rater (n\u0026thinsp;=\u0026thinsp;347), and a pooled sensitivity and specificity for subjects that received the same score by a majority of the raters (n\u0026thinsp;=\u0026thinsp;337).\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"4\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eSens (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSpec (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAcc (%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRater 1\u003c/b\u003e\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e68.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e64.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e66.7\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRater 2\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e42.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e32.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e37.4\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRater 3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e72.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e41.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e57.1\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMajority\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e70.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e63.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c4\"\u003e \u003cp\u003e66.9\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv 
class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eInter-rater comparison of accuracy and agreement in visual scoring of IPMN by the imaging features of the Kyoto Criteria.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"3\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAcc (95% CI)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eWeighted Kappa (95% CI)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRater 1 vs 2\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e48.1 (42.7, 53.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.33 (0.27, 0.39)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRater 2 vs 3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e61.7 (56.3, 66.8)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.67 (0.59, 0.74)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eRater 1 vs 3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c2\"\u003e \u003cp\u003e80.1 (75.5, 
84.1)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c3\"\u003e \u003cp\u003e0.47 (0.39, 0.53)\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e\n\u003ch3\u003ePrediction using 2D and 3D radiomic features\u003c/h3\u003e\n\u003cp\u003eOn the testing set, the 2D radiomic analysis yielded a mean AUC of 66.4% and a mean accuracy of 65.9%, and the 3D analysis yielded a mean AUC of 66.5% and a mean accuracy of 66.1% (Table\u0026nbsp;\u003cspan refid=\"Tab3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). The corresponding ROC curves are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e. Bar plots of the mean AUC, accuracy, and F1 are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e (cross-validation set) and Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e (cross-validation versus testing).\u003c/p\u003e \u003cp\u003e Compared with the expert-judgment results in the previous section (radiologists\u0026rsquo; scoring based on the imaging features of the Kyoto guidelines), the radiomics models outperformed two of the three radiologists and performed on par with the majority vote. 
Notably, the radiologists used both T1W and T2W scans together with the Kyoto guidelines to stratify the cysts, whereas our DL and radiomics analyses used only T2W images, suggesting that the machine-generated results are promising and potentially superior.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab3\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 3\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eRadiomics-only results for 2D and 3D analysis for each trial set.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"13\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c11\" colnum=\"11\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c12\" colnum=\"12\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c13\" colnum=\"13\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\" morerows=\"5\" 
rowspan=\"6\"\u003e \u003cp\u003eTesting - Random Forest (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eF1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e52.2\u0026thinsp;\u0026plusmn;\u0026thinsp;6.0\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e60.4\u0026thinsp;\u0026plusmn;\u0026thinsp;5.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e72.4\u0026thinsp;\u0026plusmn;\u0026thinsp;4.1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e61.8\u0026thinsp;\u0026plusmn;\u0026thinsp;3.9\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e61.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.0\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e60.3\u0026thinsp;\u0026plusmn;\u0026thinsp;3.4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e57.4\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e66.5\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e60.2\u0026thinsp;\u0026plusmn;\u0026thinsp;3.6\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e \u003cp\u003e61.1\u0026thinsp;\u0026plusmn;\u0026thinsp;4.0\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSpec\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e46.2\u0026thinsp;\u0026plusmn;\u0026thinsp;13\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e 
\u003cp\u003e62.8\u0026thinsp;\u0026plusmn;\u0026thinsp;8.7\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e75.0\u0026thinsp;\u0026plusmn;\u0026thinsp;4.1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e67.8\u0026thinsp;\u0026plusmn;\u0026thinsp;6.6\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e63.1\u0026thinsp;\u0026plusmn;\u0026thinsp;8.8\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e58.2\u0026thinsp;\u0026plusmn;\u0026thinsp;5.7\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e54.5\u0026thinsp;\u0026plusmn;\u0026thinsp;13\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e72.9\u0026thinsp;\u0026plusmn;\u0026thinsp;6.9\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e70.6\u0026thinsp;\u0026plusmn;\u0026thinsp;7.2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e64.1\u0026thinsp;\u0026plusmn;\u0026thinsp;8.5\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSens\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e62.9\u0026thinsp;\u0026plusmn;\u0026thinsp;13\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e65.0\u0026thinsp;\u0026plusmn;\u0026thinsp;8.7\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e78.2\u0026thinsp;\u0026plusmn;\u0026thinsp;7.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e 
\u003cp\u003e77.0\u0026thinsp;\u0026plusmn;\u0026thinsp;9.3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e71.0\u0026thinsp;\u0026plusmn;\u0026thinsp;9.0\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e68.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e64.7\u0026thinsp;\u0026plusmn;\u0026thinsp;12\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e70.4\u0026thinsp;\u0026plusmn;\u0026thinsp;9.1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e71.1\u0026thinsp;\u0026plusmn;\u0026thinsp;7.0\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e68.7\u0026thinsp;\u0026plusmn;\u0026thinsp;8.8\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003ePPV\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e45.9\u0026thinsp;\u0026plusmn;\u0026thinsp;6.9\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e57.0\u0026thinsp;\u0026plusmn;\u0026thinsp;5.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e67.7\u0026thinsp;\u0026plusmn;\u0026thinsp;3.2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e52.1\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e55.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.2\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e 
\u003cp\u003e53.9\u0026thinsp;\u0026plusmn;\u0026thinsp;3.4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e52.5\u0026thinsp;\u0026plusmn;\u0026thinsp;6.1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e63.8\u0026thinsp;\u0026plusmn;\u0026thinsp;5.1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e52.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.1\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e55.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.0\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAcc\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e53.1\u0026thinsp;\u0026plusmn;\u0026thinsp;5.3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e63.7\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e76.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e70.6\u0026thinsp;\u0026plusmn;\u0026thinsp;3.4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e65.9\u0026thinsp;\u0026plusmn;\u0026thinsp;4.1\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e62.6\u0026thinsp;\u0026plusmn;\u0026thinsp;3.6\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e58.9\u0026thinsp;\u0026plusmn;\u0026thinsp;3.8\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e 
\u003cp\u003e71.9\u0026thinsp;\u0026plusmn;\u0026thinsp;3.6\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e70.8\u0026thinsp;\u0026plusmn;\u0026thinsp;4.0\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e66.1\u0026thinsp;\u0026plusmn;\u0026thinsp;3.8\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAUC\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003e46.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.4\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003e61.9\u0026thinsp;\u0026plusmn;\u0026thinsp;4.2\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003e77.0\u0026thinsp;\u0026plusmn;\u0026thinsp;3.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003e79.6\u0026thinsp;\u0026plusmn;\u0026thinsp;2.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e66.4\u0026thinsp;\u0026plusmn;\u0026thinsp;4.0\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e59.2\u0026thinsp;\u0026plusmn;\u0026thinsp;2.5\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c9\"\u003e \u003cp\u003e58.8\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c10\"\u003e \u003cp\u003e73.5\u0026thinsp;\u0026plusmn;\u0026thinsp;3.3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c11\"\u003e \u003cp\u003e74.6\u0026thinsp;\u0026plusmn;\u0026thinsp;2.3\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c12\"\u003e 
\u003cp\u003e\u003cb\u003e66.5\u0026thinsp;\u0026plusmn;\u0026thinsp;2.8\u003c/b\u003e\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"5\" rowspan=\"6\"\u003e \u003cp\u003e\u003cb\u003eCross-Validation - Random Forest (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eF1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e62.9\u0026thinsp;\u0026plusmn;\u0026thinsp;2.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e63.0\u0026thinsp;\u0026plusmn;\u0026thinsp;1.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e62.2\u0026thinsp;\u0026plusmn;\u0026thinsp;1.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e62.8\u0026thinsp;\u0026plusmn;\u0026thinsp;2.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e62.7\u0026thinsp;\u0026plusmn;\u0026thinsp;1.9\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e63.9\u0026thinsp;\u0026plusmn;\u0026thinsp;1.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e62.9\u0026thinsp;\u0026plusmn;\u0026thinsp;1.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e59.8\u0026thinsp;\u0026plusmn;\u0026thinsp;2.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e62.9\u0026thinsp;\u0026plusmn;\u0026thinsp;2.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e62.4\u0026thinsp;\u0026plusmn;\u0026thinsp;2.0\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e 
\u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSpec\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e68.9\u0026thinsp;\u0026plusmn;\u0026thinsp;4.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e69.7\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e68.8\u0026thinsp;\u0026plusmn;\u0026thinsp;4.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e68.1\u0026thinsp;\u0026plusmn;\u0026thinsp;4.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e68.9\u0026thinsp;\u0026plusmn;\u0026thinsp;4.7\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e69.8\u0026thinsp;\u0026plusmn;\u0026thinsp;4.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e68.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e67.0\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e68.2\u0026thinsp;\u0026plusmn;\u0026thinsp;4.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e68.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.4\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSens\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e 
\u003cp\u003e67.7\u0026thinsp;\u0026plusmn;\u0026thinsp;4.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e67.5\u0026thinsp;\u0026plusmn;\u0026thinsp;4.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e66.6\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e66.1\u0026thinsp;\u0026plusmn;\u0026thinsp;4.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e67.0\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e68.7\u0026thinsp;\u0026plusmn;\u0026thinsp;4.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e68.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e64.2\u0026thinsp;\u0026plusmn;\u0026thinsp;4.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e66.4\u0026thinsp;\u0026plusmn;\u0026thinsp;5.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e66.9\u0026thinsp;\u0026plusmn;\u0026thinsp;4.5\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003ePPV\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e58.9\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e59.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e 
\u003cp\u003e58.6\u0026thinsp;\u0026plusmn;\u0026thinsp;2.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e60.0\u0026thinsp;\u0026plusmn;\u0026thinsp;2.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e59.2\u0026thinsp;\u0026plusmn;\u0026thinsp;2.7\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e59.9\u0026thinsp;\u0026plusmn;\u0026thinsp;2.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e58.5\u0026thinsp;\u0026plusmn;\u0026thinsp;3.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e56.2\u0026thinsp;\u0026plusmn;\u0026thinsp;2.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e60.1\u0026thinsp;\u0026plusmn;\u0026thinsp;2.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e58.7\u0026thinsp;\u0026plusmn;\u0026thinsp;2.6\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAcc\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e68.4\u0026thinsp;\u0026plusmn;\u0026thinsp;2.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e68.9\u0026thinsp;\u0026plusmn;\u0026thinsp;1.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e67.9\u0026thinsp;\u0026plusmn;\u0026thinsp;1.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e67.3\u0026thinsp;\u0026plusmn;\u0026thinsp;1.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e 
\u003cp\u003e\u003cb\u003e68.1\u0026thinsp;\u0026plusmn;\u0026thinsp;1.9\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e69.3\u0026thinsp;\u0026plusmn;\u0026thinsp;1.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e68.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e65.9\u0026thinsp;\u0026plusmn;\u0026thinsp;2.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e67.4\u0026thinsp;\u0026plusmn;\u0026thinsp;1.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e67.7\u0026thinsp;\u0026plusmn;\u0026thinsp;1.8\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAUC\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e72.6\u0026thinsp;\u0026plusmn;\u0026thinsp;1.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e72.7\u0026thinsp;\u0026plusmn;\u0026thinsp;1.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e71.8\u0026thinsp;\u0026plusmn;\u0026thinsp;1.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e71.2\u0026thinsp;\u0026plusmn;\u0026thinsp;1.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e72.1\u0026thinsp;\u0026plusmn;\u0026thinsp;1.5\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e73.5\u0026thinsp;\u0026plusmn;\u0026thinsp;1.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e 
\u003cp\u003e72.2\u0026thinsp;\u0026plusmn;\u0026thinsp;1.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e69.0\u0026thinsp;\u0026plusmn;\u0026thinsp;1.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e71.5\u0026thinsp;\u0026plusmn;\u0026thinsp;1.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e71.6\u0026thinsp;\u0026plusmn;\u0026thinsp;1.6\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eFeatures\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e20\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003e\u003cb\u003eAverage\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e28\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e30\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\" morerows=\"1\" rowspan=\"2\"\u003e \u003cp\u003e\u003cb\u003eAverage\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" 
namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eTrial\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eT2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eT4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003eT2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003eT4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"1\" nameend=\"c13\" namest=\"c13\"\u003e\u0026nbsp;\u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colspan=\"2\" nameend=\"c2\" namest=\"c1\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colspan=\"5\" nameend=\"c7\" namest=\"c3\"\u003e \u003cp\u003e\u003cb\u003e2D Radiomics\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"6\" nameend=\"c13\" namest=\"c8\"\u003e \u003cp\u003e\u003cb\u003e3D Radiomics\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e\n\u003ch3\u003eComparing the performance of six DL architectures in predicting IPMN dysplasia grade\u003c/h3\u003e\n\u003cp\u003eAmong the CNNs tested, DenseNet121\u003csup\u003e27\u003c/sup\u003e demonstrated the highest AUC at 73.3% (Table\u0026nbsp;\u003cspan refid=\"Tab4\" 
class=\"InternalRef\"\u003e4\u003c/span\u003e). In comparison, ResNet-34\u003csup\u003e28\u003c/sup\u003e achieved a slightly lower AUC of 73.1%. Lightweight models, such as EfficientNet-B0\u003csup\u003e29\u003c/sup\u003e and ShuffleNet-V2\u003csup\u003e30\u003c/sup\u003e, exhibited lower AUC values of 68.1% and 66.1%, respectively.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab4\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 4\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eDeep learning results (mean\u0026thinsp;\u0026plusmn;\u0026thinsp;standard deviation) for IPMN cyst malignancy risk stratification in 5-fold cross-validation\u003csup\u003e\u003cspan additionalcitationids=\"CR28 CR29 CR30\" citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"5\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAUC (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAcc (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eSens (%)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSpec 
(%)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eDenseNet121\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e73.3\u0026thinsp;\u0026plusmn;\u0026thinsp;7.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e68.0\u0026thinsp;\u0026plusmn;\u0026thinsp;7.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e46.5\u0026thinsp;\u0026plusmn;\u0026thinsp;23\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e82.7\u0026thinsp;\u0026plusmn;\u0026thinsp;6.5\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eMobileNetV2\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e73.0\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e66.6\u0026thinsp;\u0026plusmn;\u0026thinsp;2.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eResNet34\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e73.1\u0026thinsp;\u0026plusmn;\u0026thinsp;4.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e68.5\u0026thinsp;\u0026plusmn;\u0026thinsp;4.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eResNet50\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e71.8\u0026thinsp;\u0026plusmn;\u0026thinsp;6.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e66.0\u0026thinsp;\u0026plusmn;\u0026thinsp;6.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eShuffleNet-V2\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e66.6\u0026thinsp;\u0026plusmn;\u0026thinsp;5.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e61.3\u0026thinsp;\u0026plusmn;\u0026thinsp;1.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003eEfficientNet-B0\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e68.1\u0026thinsp;\u0026plusmn;\u0026thinsp;1.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e65.6\u0026thinsp;\u0026plusmn;\u0026thinsp;6.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eN/A\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cdiv id=\"Sec8\" class=\"Section2\"\u003e 
\u003ch2\u003eEvaluation of 2D and 3D radiomics-DL fusion algorithms\u003c/h2\u003e \u003cp\u003eUsing 2D radiomic features, the fusion model achieved a weighted average AUC of 74.3% and an accuracy of 71.0% in cross-validation. In independent testing, this 2D fusion model yielded an AUC of 69.2% and an accuracy of 61.6% (Table\u0026nbsp;\u003cspan refid=\"Tab5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). When trained with 3D radiomic features, the fusion model achieved a weighted average AUC of 73.4% and an accuracy of 98.4% in cross-validation, and an AUC of 68.3% and an accuracy of 62.7% in independent testing.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab5\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 5\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eRadiomics-deep learning fusion algorithm results. 2D and 3D radiomic features were fed into DenseNet121 in 5-fold cross-validation across 4 different trials\u003csup\u003e\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"12\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e 
\u003cdiv align=\"left\" class=\"colspec\" colname=\"c9\" colnum=\"9\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c10\" colnum=\"10\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c11\" colnum=\"11\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c12\" colnum=\"12\"\u003e\u003c/div\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003e\u003cb\u003eCross-Validation (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSpec\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e60.9\u0026thinsp;\u0026plusmn;\u0026thinsp;9.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e62.9\u0026thinsp;\u0026plusmn;\u0026thinsp;14.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e77.6\u0026thinsp;\u0026plusmn;\u0026thinsp;17.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e72.2\u0026thinsp;\u0026plusmn;\u0026thinsp;11.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e68.4\u0026thinsp;\u0026plusmn;\u0026thinsp;7.8\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e100.0\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e98.4\u0026thinsp;\u0026plusmn;\u0026thinsp;2.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e98.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e96.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e98.3\u0026thinsp;\u0026plusmn;\u0026thinsp;1.5\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSens\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e81.2\u0026thinsp;\u0026plusmn;\u0026thinsp;10.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e78.5\u0026thinsp;\u0026plusmn;\u0026thinsp;12.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e63.9\u0026thinsp;\u0026plusmn;\u0026thinsp;13.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e77.6\u0026thinsp;\u0026plusmn;\u0026thinsp;17.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e75.3\u0026thinsp;\u0026plusmn;\u0026thinsp;7.8\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e98.5\u0026thinsp;\u0026plusmn;\u0026thinsp;1.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e100.0\u0026thinsp;\u0026plusmn;\u0026thinsp;0.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e98.5\u0026thinsp;\u0026plusmn;\u0026thinsp;3.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e95.1\u0026thinsp;\u0026plusmn;\u0026thinsp;3.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e98.0\u0026thinsp;\u0026plusmn;\u0026thinsp;2.1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAcc\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e68.7\u0026thinsp;\u0026plusmn;\u0026thinsp;2.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e68.7\u0026thinsp;\u0026plusmn;\u0026thinsp;5.9\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c5\"\u003e \u003cp\u003e71.5\u0026thinsp;\u0026plusmn;\u0026thinsp;4.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e74.9\u0026thinsp;\u0026plusmn;\u0026thinsp;3.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e71.0\u0026thinsp;\u0026plusmn;\u0026thinsp;3.0\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e99.4\u0026thinsp;\u0026plusmn;\u0026thinsp;0.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e99.1\u0026thinsp;\u0026plusmn;\u0026thinsp;1.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e98.5\u0026thinsp;\u0026plusmn;\u0026thinsp;1.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e96.5\u0026thinsp;\u0026plusmn;\u0026thinsp;1.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e98.4\u0026thinsp;\u0026plusmn;\u0026thinsp;1.3\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAUC\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e73.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e74.6\u0026thinsp;\u0026plusmn;\u0026thinsp;4.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e73.9\u0026thinsp;\u0026plusmn;\u0026thinsp;2.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e75.6\u0026thinsp;\u0026plusmn;\u0026thinsp;4.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e74.3\u0026thinsp;\u0026plusmn;\u0026thinsp;3.6\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c8\"\u003e \u003cp\u003e77.1\u0026thinsp;\u0026plusmn;\u0026thinsp;3.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e74.0\u0026thinsp;\u0026plusmn;\u0026thinsp;6.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e70.0\u0026thinsp;\u0026plusmn;\u0026thinsp;4.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e72.5\u0026thinsp;\u0026plusmn;\u0026thinsp;2.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e73.4\u0026thinsp;\u0026plusmn;\u0026thinsp;4.1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"3\" rowspan=\"4\"\u003e \u003cp\u003e\u003cb\u003eTesting (%)\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSpec\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e66.7\u0026thinsp;\u0026plusmn;\u0026thinsp;12.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e66.7\u0026thinsp;\u0026plusmn;\u0026thinsp;24.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e54.3\u0026thinsp;\u0026plusmn;\u0026thinsp;24.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e75.5\u0026thinsp;\u0026plusmn;\u0026thinsp;16.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e67.8\u0026thinsp;\u0026plusmn;\u0026thinsp;7.9\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e63.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e62.7\u0026thinsp;\u0026plusmn;\u0026thinsp;13.1\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c10\"\u003e \u003cp\u003e35.7\u0026thinsp;\u0026plusmn;\u0026thinsp;10.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e77.3\u0026thinsp;\u0026plusmn;\u0026thinsp;5.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e63.3\u0026thinsp;\u0026plusmn;\u0026thinsp;15.4\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eSens\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e35.3\u0026thinsp;\u0026plusmn;\u0026thinsp;9.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e46.0\u0026thinsp;\u0026plusmn;\u0026thinsp;28.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e77.1\u0026thinsp;\u0026plusmn;\u0026thinsp;10.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e63.7\u0026thinsp;\u0026plusmn;\u0026thinsp;13.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e57.9\u0026thinsp;\u0026plusmn;\u0026thinsp;14.4\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e56.5\u0026thinsp;\u0026plusmn;\u0026thinsp;6.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e57.0\u0026thinsp;\u0026plusmn;\u0026thinsp;14.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e81.0\u0026thinsp;\u0026plusmn;\u0026thinsp;9.0\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e60.4\u0026thinsp;\u0026plusmn;\u0026thinsp;7.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e63.2\u0026thinsp;\u0026plusmn;\u0026thinsp;9.1\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAcc\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e48.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e54.9\u0026thinsp;\u0026plusmn;\u0026thinsp;6.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e68.0\u0026thinsp;\u0026plusmn;\u0026thinsp;7.1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e67.3\u0026thinsp;\u0026plusmn;\u0026thinsp;4.9\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e61.6\u0026thinsp;\u0026plusmn;\u0026thinsp;7.9\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e59.3\u0026thinsp;\u0026plusmn;\u0026thinsp;3.4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e59.4\u0026thinsp;\u0026plusmn;\u0026thinsp;3.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e62.9\u0026thinsp;\u0026plusmn;\u0026thinsp;7.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e65.6\u0026thinsp;\u0026plusmn;\u0026thinsp;3.8\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e62.7\u0026thinsp;\u0026plusmn;\u0026thinsp;2.8\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eAUC\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e47.1\u0026thinsp;\u0026plusmn;\u0026thinsp;1.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e61.7\u0026thinsp;\u0026plusmn;\u0026thinsp;3.8\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c5\"\u003e \u003cp\u003e77.5\u0026thinsp;\u0026plusmn;\u0026thinsp;5.5\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e79.0\u0026thinsp;\u0026plusmn;\u0026thinsp;1.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003e69.2\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e59.3\u0026thinsp;\u0026plusmn;\u0026thinsp;3.6\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c9\"\u003e \u003cp\u003e63.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003e67.6\u0026thinsp;\u0026plusmn;\u0026thinsp;4.7\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003e74.7\u0026thinsp;\u0026plusmn;\u0026thinsp;2.2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003e68.3\u0026thinsp;\u0026plusmn;\u0026thinsp;2.9\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\" morerows=\"1\" rowspan=\"2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e\u003cb\u003eTrial\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eT2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eT4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e\u003cb\u003eWeighted Average\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003eT1\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c9\"\u003e \u003cp\u003eT2\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c10\"\u003e \u003cp\u003eT3\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c11\"\u003e \u003cp\u003eT4\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c12\"\u003e \u003cp\u003e\u003cb\u003eWeighted Average\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e\u0026nbsp;\u003c/td\u003e \u003ctd align=\"left\" colspan=\"5\" nameend=\"c7\" namest=\"c3\"\u003e \u003cp\u003e\u003cb\u003e2D Radiomic Features\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colspan=\"5\" nameend=\"c12\" namest=\"c8\"\u003e \u003cp\u003e\u003cb\u003e3D Radiomic Features\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"DISCUSSIONS","content":"\u003cp\u003eIn this large, multicenter study focused on cyst-level IPMN analysis from T2W MRI scans of 359 subjects, we demonstrated the feasibility of using radiomics and DL approaches for malignancy risk stratification of IPMN lesions. Visual scoring, the current clinical standard, showed minimal to moderate inter-rater agreement, with weighted Kappa scores of 0.33\u0026ndash;0.67\u003csup\u003e32\u003c/sup\u003e. The visual scoring accuracies for the majority cases and Rater 1 were similar to the accuracies of the radiomics-only algorithms on testing, and higher than the accuracies of the fusion algorithms on testing, Rater 2, and Rater 3. Our DenseNet121 deep learning model achieved the highest performance (AUC 73.3%, accuracy 68.0%), followed closely by our radiomics-deep learning fusion algorithm using 2D radiomic features (AUC 69.2%, accuracy 61.6% in testing; AUC 74.3%, accuracy 71.0% in cross-validation). This performance effectively balanced parameter efficiency and predictive power. 
Lightweight models, EfficientNet-B0 and ShuffleNet-V2, exhibited lower AUC values, underscoring the trade-off between model complexity and predictive accuracy across diverse architectures. The DL-radiomics fusion algorithm, utilizing 2D radiomic features, attained a weighted average AUC of 69.2% and accuracy of 61.6% in testing, and a weighted average AUC of 74.3% and accuracy of 71.0% in cross-validation. Radiomics-only analyses, employing 3D features, followed with an AUC of 66.5% and an accuracy of 66.1% on testing. Comparable performance was observed between algorithms utilizing 2D versus 3D radiomic features, indicating the potential utility of computationally efficient 2D methods. Importantly, these advanced methods demonstrated performance that matched or exceeded expert radiologist assessment, highlighting their potential to augment clinical decision-making in IPMN management.\u003c/p\u003e \u003cp\u003eIn our earlier work (Yao et al. 2023), we classified IPMN malignancy risk using advanced analysis techniques coupled with an automatic whole pancreas segmentation algorithm in 246 T1W and T2W MRI scans from five centers\u003csup\u003e\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e\u003c/sup\u003e. In that work, we developed three algorithms that incorporated clinical features (age, gender, BMI, diabetes mellitus, and chronic pancreatitis) to accomplish this task: radiomics-only, DL-only, and DL-radiomics fusion, using four CNNs and a Vision Transformer (ViT). Our algorithms stratified cases as healthy (n\u0026thinsp;=\u0026thinsp;70), low-grade risk (n\u0026thinsp;=\u0026thinsp;85), and high-grade risk (n\u0026thinsp;=\u0026thinsp;91). Our current results are therefore not entirely comparable with those of Yao et al. 2023\u003csup\u003e23\u003c/sup\u003e, because we switched from three-class to two-class classification by focusing only on cystic cases. 
In our earlier results, the automatic segmentations had a mean HD95 of 26.08 mm and a DSC of 70.11, which might have introduced additional errors into the radiomics and DL analyses. Herein, by contrast, we used manual segmentations (i.e., ground truths) produced by interdisciplinary experts and then reviewed by expert radiologists to ensure accuracy; hence, we minimized segmentation-induced errors in the radiomics and DL analyses. Another key difference is that our earlier study did not include cyst type, and all of its experiments were conducted on a much smaller cohort.\u003c/p\u003e \u003cp\u003eCui et al. 2021 developed a nomogram to predict the pathological grade of BD-IPMN\u003csup\u003e\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e\u003c/sup\u003e. The nomogram incorporated clinical features (sex, symptoms, age, CA19-9, and CEA) and radiomic features derived from manually segmented cysts. Their dataset included T2W, T1W, and contrast-enhanced T1W scans of 202 patients collected from three centers. Their data were classified by dysplasia grade as low or high, and 24.8% of their BD-IPMN cases had high-grade dysplasia. Using radiomic features only, they reported specificity, sensitivity, and AUC of 81.6%, 70.0%, and 81.1%, respectively, on validation. Once radiomic and clinical features were combined, their nomogram achieved specificity, sensitivity, and AUC of 79.0%, 90.0%, and 88.4% on validation. The main difference between our studies is the type of IPMN cysts included: they analyzed only BD-IPMN, while we included MD-IPMN, BD-IPMN, and mixed types. While promising, nomograms may perform poorly when applied to populations different from their development cohort, limiting their generalizability across diverse clinical settings\u003csup\u003e\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e\u003c/sup\u003e. 
Additionally, their exclusive focus on BD-IPMN could introduce selection bias, because BD-IPMN has a lower risk of malignancy. Their proportion of high-grade dysplasia cases is lower than ours and may not be representative of the real-world IPMN cohort that we tried to approximate. Furthermore, the authors included contrast-enhanced T1W sequences in their analysis, while we confined ourselves to conventional T1W and T2W sequences. Their radiomics-only analysis outperformed ours in AUC and specificity. This is likely because their analysis included several clinical features, such as patient symptoms and tumor markers, which are known to be predictive of higher-risk IPMN\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. We are aware of the significance of clinical features in predicting IPMN malignancy risk and plan to incorporate them into our future analyses. Despite this, our radiomics-only analysis had sensitivity similar to theirs. This held despite our use of data from multiple centers and may be due to our larger dataset. Overall, our study was conducted in more of a medical image analysis setting than theirs and could provide a more robust malignancy risk prediction method, with the promise of even better predictions once other clinical markers are combined with imaging.\u003c/p\u003e \u003cp\u003eTo our knowledge, the majority of studies that have used radiomics to classify IPMN are CT-based\u003csup\u003e\u003cspan additionalcitationids=\"CR36 CR37\" citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e\u003c/sup\u003e. 
MRI is preferred over CT for IPMN classification and monitoring because it involves no radiation exposure, has higher contrast resolution, and is better at assessing tissue and cysts\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e\u003c/sup\u003e. Furthermore, to our knowledge, ours is the most comprehensive study on IPMN malignancy risk stratification that utilizes cyst masks in MRI\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e,\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e,\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e. Cheng et al. 2022 found superior performance of an MRI radiomics algorithm when compared to CT in predicting IPMN malignant potential\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e\u003c/sup\u003e. Among studies that have utilized MRI, two analyzed only BD-IPMN and the remainder did not specify IPMN type\u003csup\u003e\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e,\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e,\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e,\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e\u003c/sup\u003e. We found that 38.5% of our Mixed/MD-IPMN and 78.2% of BD-IPMN lesions were Low-Risk. This suggests that many pancreatic resections are performed unnecessarily simply because a lesion is an MD-IPMN, without any further analysis to identify lesions that are actually at risk of malignancy. 
MD-IPMN is frequently surgically resected in patients that do not have contraindications to surgery, as it has a higher risk of malignant transformation than BD-IPMN\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e. Studies that only include BD-IPMN are excluding an important and under-investigated subtype. We included MD- and mixed-IPMN in our advanced algorithm training to approximate a more real-world cohort.\u003c/p\u003e \u003cp\u003eOur study has several limitations that should be considered. First, its retrospective design inherently limits causal inference and introduces potential biases in the historical data collection. Second, the data collected over two decades contributed to variations, including differences in scan quality and uncertainties regarding the accurate grading of dysplasia. The experience levels of operators and pathologists varied across cases, potentially affecting the reliability of dysplasia assessments. Additionally, there was no standardized protocol for selecting cases for EUS-FNA, which may introduce bias since some patients might have undergone EUS for reasons unrelated to the malignancy risk of cystic lesions. Consequently, cytology might have been obtained from cysts that were not classified as high-risk based on imaging. Moreover, the appearance of cysts may have changed in MRI images taken after the EUS procedure, which could complicate image analysis. Despite the risk of cyst appearance changes following EUS, we have found our results to be reliable using segmentations of visible cystic lesions. Acknowledging these concerns, we thoroughly reviewed the dataset to ensure its suitability for the study.\u003c/p\u003e \u003cp\u003eThird, for the BD-IPMN group, we exclusively analyzed data from the sampled cysts. 
This intentional selection introduced some selection bias; however, focusing on patients at higher risk for malignancy was crucial. Consequently, our observed rates of malignancy risk for BD-IPMNs are similar to or higher than those reported in the broader literature, which frequently includes milder cases\u003csup\u003e\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e,\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eFourth, our dataset was collected from seven institutions using various brands of MRI scanners and field strengths (1.5T and 3T) with differing image acquisition protocols. This variability poses analytical challenges and can affect the algorithm's performance. At the same time, training on a diverse, heterogeneous multicenter dataset may strengthen the algorithm's robustness, support its stability across different environments, and enhance its applicability in real-world clinical settings, where imaging protocols frequently vary.\u003c/p\u003e \u003cp\u003eFifth, our image analysis was limited to T2W MRI sequences due to data availability constraints. We plan to include and analyze additional MRI sequences in our future studies. Lastly, radiologist raters utilized only T1W and T2W sequences for expert risk assessment; however, these sequences alone are insufficient for a thorough visual evaluation and do not fully represent real-life assessments. Moreover, the radiologist raters lacked access to previous scans, clinical information, and other critical MRI sequences, such as diffusion sequences, that are valuable for accurately estimating risk. These factors could affect the accuracy of visual scoring compared to standard comprehensive imaging analyses.\u003c/p\u003e \u003cp\u003eThese limitations point to several promising directions for future research. 
Prospective validation studies with standardized imaging protocols would strengthen evidence for clinical translation. Integration of clinical parameters (age, symptoms, tumor markers) and additional MRI sequences (contrast-enhanced, diffusion-weighted) could further improve model performance. Development of ensemble approaches that combine imaging features with other biomarkers (cyst fluid analysis, circulating markers) might provide more comprehensive risk assessment. Finally, extending these methods to predict long-term outcomes rather than cross-sectional histopathology would better align with the clinical goal of identifying lesions likely to progress to malignancy.\u003c/p\u003e \u003cp\u003eIn conclusion, our multicenter, pancreatic cyst-focused study demonstrates the feasibility and potential clinical utility of radiomics and deep learning for IPMN risk stratification using routinely acquired T2W MRI scans. While predictive performance requires further enhancement, potentially through integration of clinical data and additional imaging sequences, our advanced machine learning models achieved performance comparable to, and in some cases better than, that of expert radiologists in this challenging cohort, offering greater objectivity and reproducibility compared to visual assessment. Given that current international consensus guidelines lack optimal specificity for identifying low-risk IPMNs without invasive procedures, computational tools like ours represent a valuable step toward more precise patient selection for intervention versus surveillance. Hence, our findings have immediate clinical relevance. The fusion model's performance, comparable to that of expert radiologists, suggests potential for integration into clinical workflows as a decision support tool. By providing objective risk stratification of IPMNs, our approach could reduce the high rates of unnecessary surgical resections of low-risk lesions, particularly for MD-IPMNs, which are often resected based solely on morphology. 
Implementation could take the form of a software plugin for radiology workstations, offering real-time risk assessment during routine reads without disrupting workflow. Cost-effectiveness analyses and prospective validation would be logical next steps toward clinical translation.\u003c/p\u003e"},{"header":"METHODS","content":"\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e\n \u003ch2\u003eData collection and subject selection\u003c/h2\u003e\n \u003cp\u003eOur retrospective study (overview in Fig. \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e) was approved by an Institutional Review Board (IRB) and all images were de-identified prior to usage in accordance with ethical standards. We collected 746 T2W MRI scans from patients over 18 years of age undergoing assessment for pancreatic cystic lesions between March 2004 and June 2024. Scans were collected from seven centers: Allegheny Health Network (AHN), Erasmus Medical Center (EMC), Istanbul University (IU) Hospital, Mayo Clinic Florida (MCF), Mayo Clinic Arizona (MCA), Northwestern Memorial Hospital (NMH), and New York University Langone Hospital (NYU) (Fig. \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e-A). From an initial cohort of 746 subjects, 359 met the inclusion criteria and were selected for analysis (Fig. \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e). The selected cohort had a mean age of 67.2\u0026thinsp;\u0026plusmn;\u0026thinsp;10.8 years and was 53% female. Images were acquired on Siemens, Philips, or GE scanners with either 1.5 T or 3 T field strength. After collection, images were converted to Neuroimaging Informatics Technology Initiative (NIfTI) format for analysis. We selected axial, non-fat-suppressed T2W scans; slice thicknesses of the original DICOM files were 3\u0026ndash;8 mm and voxel heights of the converted NIfTI files were 3\u0026ndash;15.9 mm. 
This dataset includes abdominal MRIs of subjects with pancreatic cysts that were selected from an extended version of the \u003cem\u003ePanSegNet dataset\u003c/em\u003e by our multicenter group (Zhang et al., 2025\u003csup\u003e25\u003c/sup\u003e).\u003c/p\u003e\n \u003cp\u003eWe evaluated the radiologic and histopathology results of all subjects prior to inclusion in our study (Fig. \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e). On radiologic evaluation, 216 scans were excluded due to the absence of a pancreatic cyst, the presence of a different histopathology, or an unavailable radiologic result. The remaining 530 were further evaluated histopathologically via EUS-FNA or surgical resection. Among this cohort, 171 either did not undergo intervention or had histopathology findings negative for IPMN, leaving 359 patients for our study.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\" class=\"Section2\"\u003e\n \u003ch2\u003eSubject classification\u003c/h2\u003e\n \u003cp\u003eDysplasia grades for BD-IPMN were determined via histopathology from EUS-FNA or surgical resection. All MD or Mixed IPMN were surgically resected and histopathologically evaluated. Subjects were then grouped based on dysplasia grade: lesions with low grade dysplasia (LGD) as Low Risk, and lesions with high grade dysplasia (HGD) and/or invasive carcinoma (IC) as High Risk (Table \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e). The Low-Risk group was composed of 217 scans, which included 78.2% of the BD-IPMN cases and 38.5% of the MD or mixed-type IPMN cases. The High-Risk group included 142 subjects in total, 75 with HGD and 67 with IC. The High-Risk group included 21.7% of the BD-IPMN cases (43 of 198) and 61.5% of the MD or mixed-type IPMN cases. 
Nearly one-third of Low-Risk patients (62 of 217) had MD or mixed IPMN and were therefore resected unnecessarily, as current guidelines largely recommend surgical resection for MD lesions\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e\n \u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab6\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 6\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eBreakdown of IPMN subtype in the Low and High-Risk groups.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"3\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\u0026nbsp;\u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eBD-IPMN (%)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eMixed \u0026amp; MD-IPMN (%)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eLow Risk\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e155 (78.2)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e62 (38.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eHigh Risk\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e43 (21.7)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e99 (61.5)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eTotal\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e198\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e161\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec13\" class=\"Section2\"\u003e\n \u003ch2\u003eImage quality assessment\u003c/h2\u003e\n \u003cp\u003eBecause the retrospective data were accrued over a 20-year span (2004\u0026ndash;2024) at different institutions, quality indicators were calculated to evaluate inter-center variability caused by imaging devices and acquisition protocols. A total of 21 image-quality indicators, including statistical values of intensities (e.g. mean, range, variance) and second-order statistics or filter-based measures (e.g. contrast per pixel, entropy focus criterion, and signal-to-noise ratios), were calculated using the open-source MRQy tool\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e41\u003c/span\u003e\u003c/sup\u003e. Then, to visualize the quality indicators, the features were projected onto a 2D plot using Uniform Manifold Approximation and Projection (UMAP) for Dimension Reduction. Before UMAP projection, each feature was normalized across the dataset using three different methods: z-score, min-max, and data whitening.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\n \u003ch2\u003eManual segmentation\u003c/h2\u003e\n \u003cp\u003eThe index lesion for each MRI scan was segmented manually and reviewed by an interdisciplinary team of radiologists (GD, an abdominal radiologist with seven years of experience; and FB, a general radiologist with four years of abdominal radiology experience) and students (AMB and HEA, third-year medical students; and ZSJ, a fourth-year undergraduate student) using ITK-SNAP (Version 4.2.0)\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e42\u003c/span\u003e\u003c/sup\u003e. All segmentations were reviewed by three expert abdominal radiologists [GD, FB, YBT] to ensure accuracy and consistency. 
For subjects with BD-IPMN, the index lesion was defined as the cyst that was sampled in EUS-FNA or that was surgically resected. For mixed and MD-IPMN subjects, all regions with cystic involvement were surgically resected. Therefore, approximate cyst boundaries were discussed among abdominal radiologists and decided by consensus prior to segmentation. Concomitant cystic lesions were not included in the analyses.\u003c/p\u003e\n \u003cp\u003eInterobserver and intraobserver agreements were assessed to evaluate the quality and reproducibility of image segmentations. These were calculated using the Dice Similarity Coefficient (DSC) and the 95th-percentile Hausdorff distance (HD95). Higher DSC and lower HD95 scores indicate higher levels of agreement between the two segmentations. Thirty randomly selected MRI scans were segmented by a separate radiologist and compared to the corresponding reference segmentations to assess interobserver agreement. To determine intraobserver agreement, 20 randomly selected MRI scans were segmented a second time after a wash-out period of two weeks.\u003c/p\u003e\n \u003cp\u003eRadiomics and DL methods are discussed below. The data were grouped by center into four trial sets: in each trial, the data from one or two centers were held out for testing, while the data from the remaining centers were used for cross-validation (Table \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e). To simulate a real-world scenario, the test set in each trial consists of all data from the held-out center(s), so the amount of test data differs between trials. These sets are referred to as Trial 1 (T1), Trial 2 (T2), Trial 3 (T3), and Trial 4 (T4). The groupings of studies across trials were chosen such that low- and high-risk studies were represented in a balanced way across the training and test sets. 
To ensure robust and comprehensive validation, models from each trial were evaluated using a different center\u0026rsquo;s data for testing.\u003c/p\u003e\n \u003cdiv class=\"gridtable\"\u003e\u0026nbsp;\u003ctable id=\"Tab8\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 7\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003eCenters used for testing and cross validation of the four trial sets used in the radiomics, DL, and radiomics-DL fusion analyses.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"3\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTrial\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCross Validation (N)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eTesting (N)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eT1\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEMC, IU, MCF, NMH, \u0026amp; NYU (330)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAHN \u0026amp; MCA (29)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eT2\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAHN, IU, MCA, MCF, NMH, \u0026amp; NYU (323)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEMC (36)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eT3\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAHN, EMC, MCA, MCF, NMH, \u0026amp; NYU 
(324)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eIU (35)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eT4\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAHN, EMC, IU, MCA, MCF, \u0026amp; NMH (288)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNYU (71)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec15\" class=\"Section2\"\u003e\n \u003ch2\u003eRadiomics-only analysis\u003c/h2\u003e\n \u003cp\u003eWe conducted a 2D and 3D radiomics analysis (Fig. \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e-B). Images were resized to achieve an isotropic voxel size of 1 mm\u003csup\u003e3\u003c/sup\u003e (3D analysis) or an isotropic pixel size of 1 mm\u003csup\u003e2\u003c/sup\u003e (2D analysis), using linear interpolation. An N4 bias field correction was applied to reduce low-frequency variations in the acquired signals\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e43\u003c/span\u003e\u003c/sup\u003e. Intensity values were normalized using the min-max normalization technique\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e44\u003c/span\u003e\u003c/sup\u003e. Then, radiomic features were extracted from the preprocessed images using in-house software and the package \u003cem\u003ecollageradiomics\u003c/em\u003e developed with Python\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e45\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e48\u003c/span\u003e\u003c/sup\u003e. For the 3D analysis, 763 radiomic features were extracted within the entire volume of the cyst. 
For the 2D analysis, 447 features were extracted from axial plane slices. Features were extracted from six radiomic families. The Raw (original intensity values) and Gray (image filtered with median, mean, standard deviation, and range filters) families were used to capture spatial properties of pixel intensities. The Gradient (image filtered with Sobel and gradient-like kernel filters) and Laws\u0026rsquo; (image filtered with local specific masks) families were used to capture edge-related features\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e49\u003c/span\u003e\u0026ndash;\u003cspan class=\"CitationRef\"\u003e51\u003c/span\u003e\u003c/sup\u003e. Additionally, the Haralick and CoLlAGe feature families were used to characterize the gray-level co-occurrence matrices (GLCMs)\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e46\u003c/span\u003e,\u003cspan class=\"CitationRef\"\u003e47\u003c/span\u003e\u003c/sup\u003e. The window sizes used to construct the GLCMs were w\u0026thinsp;=\u0026thinsp;3\u0026times;3, w\u0026thinsp;=\u0026thinsp;5\u0026times;5, and w\u0026thinsp;=\u0026thinsp;7\u0026times;7, and the number of gray levels was set to 4, 8, 16, 32, and 64 (see the detailed description of radiomic families in Table S1 of the supplementary material). The features were calculated inside the entire region of interest (the cyst), and each feature was then represented by four statistical measures: median, standard deviation, skewness, and kurtosis. Using the training set of each trial, a Spearman correlation threshold of 0.6 was applied to remove the most correlated features. Then, a 5-fold cross-validation scheme with 50 iterations was used to select the best features by applying the Maximum Relevance Minimum Redundancy (mRMR) algorithm and training a Random Forest (RF) model\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e52\u003c/span\u003e\u003c/sup\u003e. After cross-validation, features selected in at least 70% of the iterations were retained. 
Finally, an RF was trained on the entire training set and tested on the held-out set.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\n \u003ch2\u003eDeep learning-only analysis\u003c/h2\u003e\n \u003cp\u003eThe DL experiment was conducted in two parts (Fig. \u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e-C). First, we applied 5-fold cross-validation on the entire dataset from all centers to select the best-performing model. We assessed the performance of six convolutional neural networks (CNNs): EfficientNet-B0\u003csup\u003e29\u003c/sup\u003e, MobileNet-V2\u003csup\u003e31\u003c/sup\u003e, ResNet-34\u003csup\u003e28\u003c/sup\u003e, ResNet-50\u003csup\u003e28\u003c/sup\u003e, ShuffleNet-V2\u003csup\u003e30\u003c/sup\u003e, and DenseNet-121\u003csup\u003e27\u003c/sup\u003e. ROIs were cropped based on the whole-pancreas segmentation published in Zhang et al., 2025\u003csup\u003e25\u003c/sup\u003e. Images were shuffled and resized to 96\u0026times;96\u0026times;96 voxels for training. Models were trained using stochastic gradient descent (SGD) with a momentum of 0.9 and a batch size of 2, for a total of 200 epochs. The initial learning rate was set to 0.001 and decreased by a factor of 10 every 30 epochs. A 5-fold cross-validation process was applied to enhance the robustness of the results. In the second part, we split the dataset by center, based on the four trial sets described in Table \u003cspan class=\"InternalRef\"\u003e7\u003c/span\u003e. Each set was then used to train the best-performing DL model from part one using the same parameters.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\n \u003ch2\u003eRadiomics-deep learning fusion algorithm\u003c/h2\u003e\n \u003cp\u003eOur radiomics-DL fusion algorithm was developed by fusing the decision probabilities of the radiomics Random Forest classifier with those of the best-performing CNN in the DL-only analysis, DenseNet-121\u003csup\u003e27\u003c/sup\u003e (Fig. 
\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e-C). Radiomic feature refinement was performed by applying 5-fold cross-validation and selecting radiomic features with a Spearman correlation coefficient below 0.6 to minimize redundancy. Both the radiomics-based Random Forest model and the deep learning model were retrained on the same training set, which was split consistently to ensure comparability. After training, the predicted probabilities from both models were fused, and the combined output was evaluated on the validation and test sets. For decision-level fusion, the probability outputs of the DenseNet-121 and Random Forest models were combined. Inspired by our earlier work (Yao et al., 2023\u003csup\u003e23\u003c/sup\u003e), we applied an exact sample to the fusion method and found the best hyperparameters with a grid search using 5-fold cross-validation\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e18\u003c/span\u003e\u003c/sup\u003e. Two hyperparameters were introduced in the fusion method: the threshold \u003cem\u003et\u003c/em\u003e and the weight \u003cem\u003ek\u003c/em\u003e. If the radiomics prediction exceeded the threshold \u003cem\u003et\u003c/em\u003e, the final model output was based solely on the radiomics prediction, and the DL prediction was discarded. Otherwise, the fusion output was a weighted combination of the predictions from the radiomics-based model and the DL model, with the weight of the radiomics-based model set to \u003cem\u003e1-k\u003c/em\u003e and the weight of the DL model set to \u003cem\u003ek\u003c/em\u003e. The fusion pipeline was run twice, using either 2D or 3D radiomic features. Weighted averages were calculated because of differences in the number of subjects represented in each center. A visualization of our fusion pipeline is provided in Fig. 
\u003cspan class=\"InternalRef\"\u003e6\u003c/span\u003e-C.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec18\" class=\"Section2\"\u003e\n \u003ch2\u003eRadiologist Visual Scoring\u003c/h2\u003e\n \u003cp\u003eImages were visually scored by three independent, expert radiologists [GD, FB, YBT] using the imaging features of the Kyoto Criteria\u003csup\u003e\u003cspan class=\"CitationRef\"\u003e2\u003c/span\u003e\u003c/sup\u003e. Cysts were labeled as no risk, low risk, or high risk according to radiological assessment. To emulate real-life initial evaluation of a cystic lesion, the radiologists were not told that the cysts were confirmed IPMNs. Additionally, the radiologists were blinded to the subject\u0026rsquo;s clinical information, used only T2W and contrast-enhanced T1W sequences, and did not have access to previous imaging. Thirteen cases from the study cohort were excluded from this analysis because T1W images were not available (n\u0026thinsp;=\u0026thinsp;347). Pairwise weighted kappa statistics were calculated to evaluate agreement between raters. Sensitivity and specificity were calculated to evaluate how accurately the radiologists identified high-risk lesions.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAuthor Contributions:\u003c/strong\u003e Conceptualization, AMB, GD, and UB; methodology, AMB, GD, and UB; software, ZH and MJG; validation, ZH, MJG, and LZ; formal analysis, ZH, HP, MJG, and LZ; investigation, AMB, ZH, MJG, and ZSJ; resources, CS, GDK, YV, EA, PT, AM, MSE, ZX, SJ, IGS, MJB, CH, TG, and CB; data curation, EK, GD, AMB, and HEA; writing\u0026mdash;original draft preparation, AMB, ZH, and MJG; writing\u0026mdash;review and editing, AMB, UB, and GD; visualization, AMB, ZH, MJG, and HEA; supervision, GD, UB, RNK, and FHM; project administration, UB, RNK, FHM, GD, MBW, and PT; funding acquisition, UB, MBW, and RNK. 
All authors have read and agreed to the published version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInstitutional Review Board Statement:\u0026nbsp;\u003c/strong\u003eThe study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Northwestern University (protocol code STU00214545, approved on 04/15/2021).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eInformed Consent Statement:\u0026nbsp;\u003c/strong\u003ePatient consent was waived due to the retrospective and blinded nature of the study.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAcknowledgments:\u003c/strong\u003e This study was funded by: NIH R01-CA246704, R01-CA240639, U01-CA268808, R01-HL171376, U01 DK127384-02S1, U01CA248226, R01CA277728, R01CA264017, and the WARF Accelerator Oncology Diagnostics Award.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests:\u003c/strong\u003e The authors declare no conflicts of interest except the following:\u003c/p\u003e\n\u003cp\u003eDr. Ulas Bagci acknowledges: Ther-AI LLC.\u003c/p\u003e\n\u003cp\u003eDr. Pallavi Tiwari is an equity holder in LivAI Inc. and serves as a scientific consultant for Johnson \u0026amp; Johnson.\u003c/p\u003e\n\u003cp\u003eDr. Rajesh N. Keswani acknowledges: Boston Scientific - consultant; Olympus - consultant; Medtronic - consultant and research support.\u003c/p\u003e\n\u003cp\u003eDr. Michael B. Wallace acknowledges Boston Scientific, ClearNote Health, Cosmo Pharmaceuticals, Endostart, Endiatix, Fujifilm, Medtronic, Surgical Automations, Ohelio Ltd, Venn Bioscience, Virgo Inc., Surgical Automation, and Microtek.\u003c/p\u003e\n\u003cp\u003eDr. Marco J. 
Bruno acknowledges: Boston Scientific - consultant, support for industry and investigator-initiated studies; Cook Medical - consultant, support for industry and investigator-initiated studies; Pentax Medical - consultant, support for investigator-initiated studies; Mylan - support for investigator-initiated studies; AMBU - consultant, support for investigator-initiated studies; ChiRoStim - support for investigator-initiated studies.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData Availability Statement:\u003c/strong\u003e Our MRI data and the corresponding Excel file with the risk status of the patients are available on the OSF server (an NIH-supported data-sharing platform) at https://osf.io/74vfs/.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCode Availability Statement:\u003c/strong\u003e The underlying code for this study is available on GitHub at https://github.com/Zilian4/IPMN-Radiomics-Plus-Deeplearning.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eSchweber, A. B., Agarunov, E., Brooks, C., Hur, C. \u0026amp; Gonda, T. A. Prevalence, incidence, and risk of progression of asymptomatic pancreatic cysts in large sample real-world data. Pancreas 50, 1287\u0026ndash;1292 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOhtsuka, T. \u003cem\u003eet al.\u003c/em\u003e International evidence-based Kyoto guidelines for the management of intraductal papillary mucinous neoplasm of the pancreas. Pancreatology 24, 225\u0026ndash;270 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGonda, T. A., Cahen, D. L. \u0026amp; Farrell, J. J. Pancreatic Cysts. N. Engl. J. Med. 391, 832\u0026ndash;843 (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHeckler, M. \u003cem\u003eet al.\u003c/em\u003e The Sendai and Fukuoka consensus criteria for the management of branch duct IPMN-A meta-analysis on their accuracy. 
Pancreatology 17, 255\u0026ndash;262 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYu, S. \u003cem\u003eet al.\u003c/em\u003e Validation of the 2012 Fukuoka consensus guideline for intraductal papillary mucinous neoplasm of the pancreas from a single institution experience. Pancreas 46, 936\u0026ndash;942 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRomutis, S. \u0026amp; Brand, R. Burden of new pancreatic cyst diagnosis. Gastrointestinal Endoscopy Clinics 33, 487\u0026ndash;495 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRobles, E. P.-C. \u003cem\u003eet al.\u003c/em\u003e Accuracy of 2012 International Consensus Guidelines for the prediction of malignancy of branch-duct intraductal papillary mucinous neoplasms of the pancreas. United European Gastroenterology Journal 4, 580\u0026ndash;586 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBulcke, A. V. \u003cem\u003eet al.\u003c/em\u003e Evaluating the accuracy of three international guidelines in identifying the risk of malignancy in pancreatic cysts: a retrospective analysis of a surgical treated population. Acta gastro-enterologica Belgica 84, 443\u0026ndash;450 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMaggi, G. \u003cem\u003eet al.\u003c/em\u003e Pancreatic cystic neoplasms: What is the most cost-effective follow-up strategy? Endoscopic Ultrasound 7, 319\u0026ndash;322 (2018).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eEloubeidi, M. A. \u003cem\u003eet al.\u003c/em\u003e Acute pancreatitis after EUS-guided FNA of solid pancreatic masses: a pooled analysis from EUS centers in the United States. Gastrointest. Endosc. 60, 385\u0026ndash;389 (2004).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePolkowski, M. 
\u003cem\u003eet al.\u003c/em\u003e Learning, techniques, and complications of endoscopic ultrasound (EUS)-guided sampling in gastroenterology: European Society of Gastrointestinal Endoscopy (ESGE) Technical Guideline. Endoscopy 44, 190\u0026ndash;206 (2012).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTacelli, M. \u003cem\u003eet al.\u003c/em\u003e Diagnostic performance of endoscopic ultrasound through-the‐needle microforceps biopsy of pancreatic cystic lesions: Systematic review with meta‐analysis. Dig. Endosc. 32, 1018\u0026ndash;1030 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDe Pretis, N. \u003cem\u003eet al.\u003c/em\u003e Pancreatic cysts: diagnostic accuracy and risk of inappropriate resections. Pancreatology 17, 267\u0026ndash;272 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLoos, M. \u003cem\u003eet al.\u003c/em\u003e Categorization of differing types of total pancreatectomy. JAMA surgery 157, 120\u0026ndash;128 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCollaborative, P. o. Pancreatic surgery outcomes: multicentre prospective snapshot study in 67 countries. Br. J. Surg. 111, znad330 (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eGillies, R. J., Kinahan, P. E. \u0026amp; Hricak, H. Radiomics: images are more than pictures, they are data. Radiology 278, 563\u0026ndash;577 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLeCun, Y., Bengio, Y. \u0026amp; Hinton, G. Deep learning. Nature 521, 436\u0026ndash;444 (2015).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYao, L. \u003cem\u003eet al.\u003c/em\u003e A review of deep learning and radiomics approaches for pancreatic cancer diagnosis from medical imaging. Curr. Opin. Gastroenterol. 39, 436\u0026ndash;447 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCorral, J. E. 
\u003cem\u003eet al.\u003c/em\u003e Deep learning to classify intraductal papillary mucinous neoplasms using magnetic resonance imaging. Pancreas 48, 805\u0026ndash;810 (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCheng, S. \u003cem\u003eet al.\u003c/em\u003e Radiomics analysis for predicting malignant potential of intraductal papillary mucinous neoplasms of the pancreas: comparison of CT and MRI. Acad. Radiol. 29, 367\u0026ndash;375 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLaLonde, R. \u003cem\u003eet al.\u003c/em\u003e in \u003cem\u003eInternational Conference on Medical Image Computing and Computer-Assisted Intervention.\u003c/em\u003e 101\u0026ndash;109 (Springer).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSalanitri, F. P. \u003cem\u003eet al.\u003c/em\u003e in 2022 \u003cem\u003e44th Annual International Conference of the IEEE Engineering in Medicine \u0026amp; Biology Society (EMBC).\u003c/em\u003e 475\u0026ndash;479 (IEEE).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYao, L. \u003cem\u003eet al.\u003c/em\u003e in \u003cem\u003eInternational Workshop on Machine Learning in Medical Imaging.\u003c/em\u003e 134\u0026ndash;143 (Springer).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang, Z., Yao, L., Keles, E., Velichko, Y. \u0026amp; Bagci, U. Deep learning algorithms for pancreas segmentation from radiology scans: A review. Advances in Clinical Radiology 5, 31\u0026ndash;52 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang, Z. \u003cem\u003eet al.\u003c/em\u003e Large-scale multi-center CT and MRI segmentation of pancreas with deep learning. Med. Image Anal. 99, 103382 (2025).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSuman, G. \u003cem\u003eet al.\u003c/em\u003e Quality gaps in public pancreas imaging datasets: Implications \u0026amp; challenges for AI applications. 
Pancreatology 21, 1001\u0026ndash;1008 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHuang, G., Liu, Z., Van Der Maaten, L. \u0026amp; Weinberger, K. Q. in \u003cem\u003eProceedings of the IEEE conference on computer vision and pattern recognition.\u003c/em\u003e 4700\u0026ndash;4708.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHe, K., Zhang, X., Ren, S. \u0026amp; Sun, J. in \u003cem\u003eProceedings of the IEEE conference on computer vision and pattern recognition.\u003c/em\u003e 770\u0026ndash;778.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTan, M. Efficientnet: Rethinking model scaling for convolutional neural networks. arXiv preprint arXiv:1905.\u003cem\u003e11946\u003c/em\u003e, 6105\u0026ndash;6114 (2019).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMa, N., Zhang, X., Zheng, H.-T. \u0026amp; Sun, J. in \u003cem\u003eProceedings of the European conference on computer vision (ECCV).\u003c/em\u003e 116\u0026ndash;131.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHoward, A. G. Mobilenets: Efficient convolutional neural networks for mobile vision applications. \u003cem\u003earXiv preprint arXiv:1704.04861\u003c/em\u003e (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMcHugh, M. L. Interrater reliability: the kappa statistic. Biochemia medica 22, 276\u0026ndash;282 (2012).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCui, S. \u003cem\u003eet al.\u003c/em\u003e Radiomic nomogram based on MRI to predict grade of branching type intraductal papillary mucinous neoplasms of the pancreas: a multicenter study. Cancer Imaging 21, 1\u0026ndash;13 (2021).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBalachandran, V. P., Gonen, M., Smith, J. J. \u0026amp; DeMatteo, R. P. Nomograms in oncology: more than meets the eye. 
The lancet oncology 16, e173-e180 (2015).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee, D. Y. \u003cem\u003eet al.\u003c/em\u003e Radiomics model versus 2017 revised international consensus guidelines for predicting malignant intraductal papillary mucinous neoplasms. Eur. Radiol. 34, 1222\u0026ndash;1231 (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTobaly, D. \u003cem\u003eet al.\u003c/em\u003e CT-based radiomics analysis to predict malignancy in patients with intraductal papillary mucinous neoplasm (IPMN) of the pancreas. Cancers (Basel) 12, 3089 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePermuth, J. B. \u003cem\u003eet al.\u003c/em\u003e Combining radiomic features with a miRNA classifier may improve prediction of malignant pathology for pancreatic intraductal papillary mucinous neoplasms. Oncotarget 7, 85785 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLou, F. \u003cem\u003eet al.\u003c/em\u003e Comprehensive analysis of clinical data and radiomic features from contrast enhanced CT for differentiating benign and malignant pancreatic intraductal papillary mucinous neoplasms. Sci. Rep. 14, 17218 (2024).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePozzi-Mucelli, R. M. \u003cem\u003eet al.\u003c/em\u003e Pancreatic MRI for the surveillance of cystic neoplasms: comparison of a short with a comprehensive imaging protocol. Eur. Radiol. 27, 41\u0026ndash;50 (2017).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFlammia, F. \u003cem\u003eet al.\u003c/em\u003e Branch duct-intraductal papillary mucinous neoplasms (BD-IPMNs): An MRI-based radiomic model to determine the malignant degeneration potential. Radiol. Med. 128, 383\u0026ndash;392 (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSadri, A. R. \u003cem\u003eet al.\u003c/em\u003e MRQy\u0026mdash;An open-source tool for quality control of MR imaging data. Med. Phys. 
47, 6029\u0026ndash;6038 (2020).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYushkevich, P. A. \u003cem\u003eet al.\u003c/em\u003e User-guided 3D active contour segmentation of anatomical structures: significantly improved efficiency and reliability. Neuroimage 31, 1116\u0026ndash;1128 (2006).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTustison, N. J. \u003cem\u003eet al.\u003c/em\u003e N4ITK: improved N3 bias correction. IEEE Trans. Med. Imaging 29, 1310\u0026ndash;1320 (2010).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBagci, U., Udupa, J. K. \u0026amp; Bai, L. in \u003cem\u003eMedical Imaging 2010: Visualization, Image-Guided Procedures, and Modeling.\u003c/em\u003e 602\u0026ndash;613 (SPIE).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eInc, T. MATLAB version: 9.13. 0 (R2022b). \u003cem\u003eThe MathWorks Inc\u003c/em\u003e (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePrasanna, P., Tiwari, P. \u0026amp; Madabhushi, A. Co-occurrence of local anisotropic gradient orientations (CoLlAGe): a new radiomics descriptor. Sci. Rep. 6, 1\u0026ndash;14 (2016).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHaralick, R. M., Shanmugam, K. \u0026amp; Dinstein, I. H. Textural features for image classification. IEEE Transactions on systems, man, and cybernetics, 610\u0026ndash;621 (1973).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePython version: 3.8.2. \u003cem\u003ePython Software Foundation\u003c/em\u003e (2023).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMohammad, E. J., Taha, R. Y. \u0026amp; Mazher, H. A. Design and fundamentals of Sobel Edge Detection of an image. J Multidiscip Eng Sci Technol (JMEST) ISSN 9, 2458\u0026ndash;9403 (2022).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSobel, I. \u0026amp; Feldman, G. A 3x3 isotropic gradient operator for image processing. 
\u003cem\u003ea talk at the Stanford Artificial Project in\u003c/em\u003e 1968, 271\u0026ndash;272 (1968).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLaws, K. I. Textured image segmentation. (1981).\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePeng, H., Long, F. \u0026amp; Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on pattern analysis and machine intelligence 27, 1226\u0026ndash;1238 (2005).\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"","lastPublishedDoi":"10.21203/rs.3.rs-6622868/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6622868/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eDistinguishing high-risk intraductal papillary mucinous neoplasms (IPMNs), pancreatic cysts requiring surgery, from low-risk lesions remains a clinical challenge, often resulting in unnecessary procedures due to limited specificity of current methods. While radiomics and deep learning (DL) have been explored for pancreatic cancer, cyst-level malignancy risk stratification of IPMNs remains untapped. We conducted a multi-institutional study (seven centers, 359 T2W MRI images) to assess the feasibility of AI for predicting IPMN dysplasia grade using cyst-level image features. We developed and compared 2D and 3D radiomics-only, deep learning (DL)-only, and radiomics-DL fusion models, using expert radiologist scoring as a baseline reference. Model performance was evaluated using held-out test data. The radiomics-DL fusion model showed the highest discriminatory ability on the test set (AUC 0.692), outperforming the radiomics-only model (AUC 0.665). Expert accuracy varied widely (37.4%-66.7%). The fusion model integrating deep learning and radiomics features from routine T2W MRI (AUC: 0.692) demonstrates potential for objective, cyst-level risk stratification of IPMNs in a multi-center cohort, outperforming both radiomics-only models and expert radiologists. 
While performance requires improvement for standalone clinical use, this approach offers a scalable, non-invasive method to potentially improve diagnostic accuracy and reduce unnecessary surgical interventions.\u003c/p\u003e","manuscriptTitle":"Multi-center evaluation of radiomics and deep learning to stratify malignancy risk of IPMNs","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-05-30 14:47:50","doi":"10.21203/rs.3.rs-6622868/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"4d3e730e-6360-4b00-a051-1ec62b2822da","owner":[],"postedDate":"May 30th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":49106121,"name":"Biological sciences/Cancer/Cysts"},{"id":49106122,"name":"Health sciences/Diseases/Gastrointestinal diseases/Gastrointestinal cancer/Pancreatic cancer"}],"tags":[],"updatedAt":"2025-06-16T14:23:49+00:00","versionOfRecord":[],"versionCreatedAt":"2025-05-30 14:47:50","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6622868","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6622868","identity":"rs-6622868","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}