Development and validation of a deep learning model for differentiating cytomegalovirus and herpes simplex virus esophagitis using endoscopic images

doi:10.21203/rs.3.rs-9113001/v1


2026 · doi:10.21203/rs.3.rs-9113001/v1

preprint OA: closed


Research Square · Article

Ji Eun Kim, Yeong Chan Lee, Young Eun Oh, Tae Se Kim, Hyuk Lee, and 4 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-9113001/v1. This work is licensed under a CC BY 4.0 License. Status: Under Review. Version 1.

Abstract

Background: Cytomegalovirus (CMV) and herpes simplex virus (HSV) are the most common causes of infectious esophagitis in immunocompromised patients. However, their endoscopic features frequently overlap, making real-time etiologic differentiation difficult and often requiring delayed confirmation by immunohistochemistry.
Methods: We developed and validated a deep learning model to distinguish CMV from HSV esophagitis using endoscopic images from biopsy-proven cases at a tertiary referral center. The model was trained with domain-specific pretraining and a curriculum learning strategy that sequentially introduced cases according to diagnostic difficulty, mimicking clinical learning processes. Diagnostic performance was evaluated using an independent test set and compared with experienced endoscopists.

Results: The curriculum learning–based model demonstrated improved classification performance over conventional training approaches, achieving an AUROC of 0.783 (95% confidence interval 0.680–0.885). Its sensitivity and specificity were comparable to those of expert endoscopists. Model visualization showed attention to clinically relevant mucosal abnormalities.

Conclusion: Deep learning analysis of routine endoscopic images can assist in differentiating CMV and HSV esophagitis. A curriculum learning strategy may enhance clinical applicability by improving performance in visually ambiguous conditions, potentially supporting earlier therapeutic decision-making while awaiting histopathologic confirmation.

Subject areas: Biological sciences/Computational biology and bioinformatics; Health sciences/Diseases; Health sciences/Gastroenterology; Health sciences/Medical research

Keywords: Cytomegalovirus esophagitis; Herpes simplex virus esophagitis; Deep learning; Endoscopic imaging; Computer-aided diagnosis

Introduction

Cytomegalovirus (CMV) and herpes simplex virus (HSV) represent the most common etiologies of infectious esophagitis, particularly in immunocompromised hosts[1-4]. Although endoscopy is routinely performed when viral esophagitis is suspected, the substantial overlap in endoscopic morphology, including erosions, punched-out ulcers, and longitudinal defects, makes reliable discrimination between CMV and HSV challenging in real time[5-7].
As a result, definitive diagnosis relies on immunohistochemical (IHC) staining of biopsy specimens[8, 9]. However, IHC processing requires additional time, and treatment cannot be confidently initiated until results are confirmed[10]. This delay is clinically meaningful because inappropriate or deferred antiviral therapy can lead to worsening esophageal injury, uncontrolled symptoms, and, in severe cases, systemic complications in vulnerable patients.

Recent advances in artificial intelligence (AI) have demonstrated promising performance in viral detection across gastrointestinal imaging, from CMV identification on digital pathology slides to machine-learning models distinguishing CMV from HSV esophagitis using handcrafted features[11-13]. However, these approaches do not resolve the central diagnostic challenge encountered at the time of endoscopy. Clinicians must frequently decide whether to initiate antiviral therapy before histopathologic confirmation, yet the endoscopic appearances of CMV and HSV esophagitis overlap substantially, and no validated AI tool currently offers reliable, point-of-care differentiation that can function as a consistent central reader.

Against this backdrop, we sought to develop a deep learning system capable of assisting real-time etiologic classification of viral esophagitis using biopsy-proven CMV and HSV cases from a large tertiary center, supplemented by publicly available datasets to enhance robustness. By evaluating its performance relative to experienced endoscopists and contemporary AI models, we aimed to determine whether such a system could meaningfully support clinical decision-making during the narrow window in which treatment strategies are first considered. This study highlights the potential role of AI-assisted differentiation in mitigating diagnostic delays and improving the timely initiation of appropriate therapy for patients with suspected viral esophagitis.

Methods

1. Study population and data preparation

1) Study population

The study population comprised patients who underwent esophagogastroduodenoscopy (EGD) at Samsung Medical Center (SMC), a tertiary academic referral hospital in Seoul, South Korea, between January 1, 2012, and December 31, 2021. Eligible patients were diagnosed with either CMV esophagitis or HSV esophagitis, confirmed by tissue biopsy with IHC staining. Only patients with histopathologically proven CMV or HSV infection were included. Clinical, endoscopic, and pathological variables were extracted from the institutional electronic medical record system. To ensure phenotypic consistency, cases were selected only when the endoscopic report explicitly included the term esophagitis. Patients who did not undergo IHC evaluation or whose pathology results were inconclusive were excluded from the analysis. In total, 170 patients were used to train and validate our model. Demographic and clinical variables, including age, sex, and comorbidities, were collected for all eligible patients.

The study protocol was reviewed and approved by the SMC Institutional Review Board (IRB No. 2023-03-079) and conducted in accordance with the principles of the Declaration of Helsinki. We used de-identified patient data, and the requirement for informed consent was waived in accordance with SMC IRB rules.

2) Target dataset

We collected 470 endoscopic images from the 170 patients with CMV or HSV esophagitis at SMC, with a CMV-to-HSV class imbalance of approximately 1:3. The dataset was divided into a cross-validation set (369 images) and an independent hold-out set (101 images). Each fold in the cross-validation set included approximately 295 images for training and 74 images for validation (Table 1).
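A split of this kind can be sketched in a few lines. The example below is a minimal illustration, not the authors' procedure: it assumes each image carries a patient identifier (the IDs shown are hypothetical) and assigns whole patients to the hold-out partition until a target share of images is reached, so that no patient appears on both sides, as required in the data processing section. The study's hold-out share was 101 of 470 images (about 21%); the fraction used here is arbitrary.

```python
import random
from collections import defaultdict


def split_by_patient(image_patient_ids, holdout_fraction=0.2, seed=0):
    """Assign every image of a patient to the same partition.

    image_patient_ids[i] is the (hypothetical) patient ID of image i.
    Returns (cv_indices, holdout_indices).
    """
    by_patient = defaultdict(list)
    for idx, pid in enumerate(image_patient_ids):
        by_patient[pid].append(idx)

    patients = sorted(by_patient)  # deterministic order before shuffling
    random.Random(seed).shuffle(patients)

    target = holdout_fraction * len(image_patient_ids)
    cv, holdout = [], []
    for pid in patients:
        # Fill the hold-out set patient by patient until it reaches the target.
        bucket = holdout if len(holdout) < target else cv
        bucket.extend(by_patient[pid])
    return sorted(cv), sorted(holdout)


# Toy data: ten images from six made-up patients.
ids = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p5", "p6"]
cv_idx, holdout_idx = split_by_patient(ids, holdout_fraction=0.3)
cv_patients = {ids[i] for i in cv_idx}
holdout_patients = {ids[i] for i in holdout_idx}
print("cross-validation images:", cv_idx)
print("hold-out images:", holdout_idx)
```

Because whole patients are moved at once, the hold-out share can overshoot the target slightly; with few patients per class, a stratified variant that also balances CMV and HSV would be a natural refinement.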
3) Public dataset for pretraining

We obtained 105,183 endoscopic images, including images of the upper gastrointestinal (GI) tract, from two public datasets: GastroVision[14] (2,314 labeled upper GI images) and HyperKvasir[15] (3,452 labeled upper GI images and 99,417 unlabeled endoscopic images). From these datasets, we used the 5,766 labeled upper GI images and the 99,417 unlabeled images for self-supervised learning (Table 2). The labeled images covered normal and pathological conditions of the upper GI anatomy: normal esophagus (139 images), esophagitis (770 images), Barrett's esophagus (189 images), normal stomach (969 images), pylorus (1,392 images), and others (2,307 images).

4) Data processing and augmentation

There was no subject overlap between the cross-validation set and the hold-out set. Images were resized to 608×608 pixels after center-cropping and padding the original images. All images were then normalized using ImageNet color statistics (RGB mean = [0.485, 0.456, 0.406] and standard deviation = [0.229, 0.224, 0.225]) and augmented with random horizontal and vertical flips and color jittering with brightness and contrast variations of 0.1.

2. Model architecture

We employed FocalNet-Base as our foundation architecture, a vision transformer variant that incorporates focal modulation mechanisms for enhanced feature representation learning. Focal modulation hierarchically captures multi-scale spatial context with depth-wise convolution and gated aggregation, then injects the resulting modulator into each query[16]. It models spatially context-aware interactions without the computationally intensive explicit query–key matching. Our model processes 608×608 input images with 4×4 patches, resulting in 152×152 patch tokens.
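The normalization and patch arithmetic described above can be checked with a short sketch. It assumes, as is conventional with ImageNet statistics but not stated in the text, that 8-bit pixel values are first scaled to [0, 1]; the function names are illustrative only.

```python
# ImageNet channel statistics quoted in the data processing section.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb_uint8):
    """Scale an 8-bit RGB pixel to [0, 1] (assumed), then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb_uint8, IMAGENET_MEAN, IMAGENET_STD)
    )


def patch_grid(image_size, patch_size):
    """Return (patches per side, total patches) for non-overlapping square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by patch size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


per_side, total = patch_grid(608, 4)
print(per_side, total)  # 152 23104

# A pixel close to the ImageNet mean colour maps to values near zero.
print(normalize_pixel((124, 116, 104)))
```

A 608×608 input with 4×4 patches therefore yields a 152×152 grid, or 23,104 tokens at the first stage.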
The architecture consists of hierarchical feature extraction with progressive expansion of hidden dimensions across transformer layers that include a focal modulation layer, culminating in a binary prediction head to classify CMV or HSV. The model was trained using the Adam optimizer with cross-entropy loss.

We developed a convolutional neural network (CNN) as a comparative model for binary classification of CMV and HSV. The network architecture consisted of four sequential convolutional blocks, each comprising a convolutional layer (with 32, 64, 128, and 256 filters, respectively), batch normalization, ReLU activation, and 2×2 max-pooling. Following the convolutional blocks, we applied adaptive average pooling to generate fixed-size feature maps, followed by a fully connected layer with dropout (p=0.3) for regularization. This model was trained from scratch for 50 epochs with a learning rate of 1×10⁻³ and a batch size of 32, using the Adam optimizer with cross-entropy loss.

We further validated our curriculum learning approach with a GastroNet-5M foundation model (GNFM) [20] for gastrointestinal endoscopy. GNFM was pretrained on about 5 million endoscopic images (224×224 pixels) of the gastrointestinal tract in GastroNet-5M. The architecture employed a ResNet50[21] with the original classification head replaced by a custom head comprising dropout layers (p=0.3 and 0.2) and fully connected layers for binary classification.

3. Training

The overall training workflow is illustrated in Figure 1. We first conducted domain-specific self-supervised pretraining using masked image modeling, followed by fine-grained tuning that integrated automated hyperparameter search and a curriculum learning strategy to maximize classification performance.

1) Self-supervised pretraining

We implemented masked image modeling (MIM) [17] for domain-specific pretraining using 105,183 publicly available endoscopic images.
MIM is a self-supervised learning strategy that learns rich feature representations by reconstructing partially masked patches in an image[17]. We randomly masked 60% of the 8×8 pixel regions in each endoscopic image. FocalNet was then trained to reconstruct the masked regions, enabling it to learn meaningful visual representations relevant to GI regions. Training was conducted for 50 epochs with a learning rate of 1×10⁻⁴, a batch size of 32, and gradient clipping at a maximum norm of 1.0.

2) Automated hyperparameter optimization

We used an automatic hyperparameter optimization (HPO) framework[18] to search for optimal training configurations when fine-tuning the pretrained FocalNet for each cross-validation fold. The search space included learning rates (1×10⁻⁷ to 1×10⁻⁶), training epochs (20 to 30), and batch sizes (4, 8, or 16). HPO was conducted with the Tree-Structured Parzen Estimator[19], a Bayesian optimization method that explores the search space randomly for the first few trials and subsequently samples more frequently from promising hyperparameter regions. This automatic optimization was performed over 30 trials per fold with the objective of maximizing the area under the receiver operating characteristic curve (AUROC).

3) Curriculum learning with confidence-based stratification

We used a curriculum learning approach to further train FocalNet and GNFM. We generated pseudo-labels for difficulty from the training set. Within each fold, training samples were stratified into four difficulty levels, separately for CMV and HSV:
(1) Very Easy: correct predictions with high confidence (≥median confidence);
(2) Easy: correct predictions with low confidence (<median confidence);
(3) Hard: incorrect predictions with low confidence;
(4) Very Hard: incorrect predictions with high confidence.
The optimal threshold for confidence assessment was determined using Youden's J statistic to maximize sensitivity and specificity.
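The stratification rule above can be sketched as follows. This is an illustrative reading rather than the authors' code, and it rests on three assumptions the text does not spell out: the Youden-optimal cut-off is used as the decision threshold for class assignment, confidence is the predicted probability of the assigned class, and the median confidence is computed separately within each true class. The probabilities and labels are made up.

```python
from statistics import median


def youden_threshold(probs, labels):
    """Return the cut-off on probs that maximizes J = sensitivity + specificity - 1."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_j, best_t = -1.0, 0.5
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= t)
        tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < t)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_j, best_t = j, t
    return best_t


def stratify(probs, labels):
    """Label each sample as very_easy, easy, hard, or very_hard."""
    t = youden_threshold(probs, labels)
    preds = [1 if p >= t else 0 for p in probs]
    # Assumption: confidence is the probability of the predicted class.
    conf = [p if pred == 1 else 1.0 - p for p, pred in zip(probs, preds)]
    levels = [None] * len(probs)
    for cls in (0, 1):  # median confidence is taken per true class (assumption)
        idx = [i for i, y in enumerate(labels) if y == cls]
        cut = median(conf[i] for i in idx)
        for i in idx:
            correct = preds[i] == labels[i]
            high = conf[i] >= cut
            if correct:
                levels[i] = "very_easy" if high else "easy"
            else:
                levels[i] = "very_hard" if high else "hard"
    return levels


# Hypothetical positive-class probabilities for ten training images.
probs = [0.9, 0.8, 0.7, 0.05, 0.1, 0.2, 0.3, 0.6, 0.72, 0.15]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
levels = stratify(probs, labels)
for p, y, level in zip(probs, labels, levels):
    print(p, y, level)
```

In this toy set the confidently misclassified positive (probability 0.05) lands in the Very Hard group, while the negative scored at 0.72 is misclassified with below-median confidence and lands in Hard.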
Curriculum learning proceeded through four stages that cumulatively incorporated more difficult samples. Stage 1 trained on Very Easy samples only, Stage 2 added Easy samples, Stage 3 added Hard samples, and Stage 4 used all samples. Each curriculum stage was trained for 20 epochs and evaluated on the validation set.

All experiments were implemented using the Hugging Face Transformers library (version 4.51.3) with a PyTorch (version 2.5.1) backend.

4. Evaluation

After fold-specific fine-tuning, the five fold models were combined as an ensemble to improve generalization. For the hold-out test set, final predictions were obtained by averaging the models' output probabilities. Performance was evaluated using AUROC, accuracy, sensitivity, specificity, F1-score, and positive predictive value. Ninety-five percent confidence intervals (CI) for AUROC were calculated with 2,000 bootstrap replicates. To compare performance with clinicians, nine endoscopists independently classified the hold-out images as CMV or HSV, and consensus scores were determined by vote counts. Model interpretability was qualitatively assessed using gradient-weighted class activation mapping (Grad-CAM), which highlights the regions contributing to predictions [20].

We additionally evaluated generative pretrained transformer 4o (GPT-4o; OpenAI). GPT-4o classified the hold-out images as CMV or HSV using a predefined instruction (Supplementary Figure 1). To mitigate hallucinated outputs, it was instructed to respond “difficult to differentiate” when a definitive classification was not possible.

Statistical Analysis

Demographic statistics are described as mean ± standard deviation (SD) for continuous variables and as number (percentage) for categorical variables. We conducted t-tests and chi-squared tests for continuous and categorical variables, respectively, to compare the CMV and HSV groups.
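The ensemble averaging and bootstrap interval described under Evaluation can be illustrated with a minimal, dependency-free sketch. The percentile method, the handling of single-class resamples, and all numbers below are assumptions for illustration; the text specifies only probability averaging across the five fold models and 2,000 bootstrap replicates.

```python
import random


def ensemble_mean(fold_probs):
    """Average per-image probabilities across models (one list per fold model)."""
    n_models = len(fold_probs)
    return [sum(col) / n_models for col in zip(*fold_probs)]


def auroc(scores, labels):
    """AUROC as the chance that a positive outranks a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC, resampling images with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up outputs of five fold models on eight hold-out images.
fold_probs = [
    [0.8, 0.6, 0.7, 0.3, 0.4, 0.2, 0.6, 0.1],
    [0.7, 0.5, 0.8, 0.4, 0.3, 0.3, 0.5, 0.2],
    [0.9, 0.4, 0.6, 0.2, 0.5, 0.1, 0.7, 0.2],
    [0.8, 0.7, 0.7, 0.3, 0.4, 0.2, 0.4, 0.3],
    [0.6, 0.6, 0.9, 0.5, 0.2, 0.3, 0.6, 0.1],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

scores = ensemble_mean(fold_probs)
point = auroc(scores, labels)
lower, upper = bootstrap_ci(scores, labels)
print(f"AUROC {point:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

With only eight images the interval is very wide; the sketch shows the mechanics, not the behaviour expected on a 101-image hold-out set.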
All statistical comparisons used two-sided tests with a significance threshold of α=0.05 and were performed with R software (version 4.1.0; R Foundation for Statistical Computing, Vienna, Austria).

Results

Baseline characteristics

In this study, 43 patients with CMV and 127 patients with HSV were included for development and validation of the model. Table 3 shows the baseline demographic characteristics of the study participants. Mean age did not differ significantly between the two groups (58.1 for CMV and 60.4 for HSV; p-value=0.296). About 70% of patients with CMV and 69% of patients with HSV were male (p-value=1.000). The proportions of comorbidities also did not differ significantly between patients with CMV and HSV. The difference in mortality within 1 year after diagnosis did not reach statistical significance (46.5% for CMV and 26.8% for HSV; p-value=0.060).

The pretrained model can reconstruct the masked region in an in-house endoscopic image

We examined whether the model pretrained on the large public dataset could reconstruct masked endoscopic images. The reconstruction loss on the pretraining dataset converged to 0.075, and the loss on the SMC dataset remained similarly low at 0.082 (Figure 2a). Representative examples are shown in Figure 2b. The model recovered mucosal folds, vascular patterns, lesions, and luminal contours, indicating that it had learned transferable, structure-aware representations of the mucosa.

Fine-grained tuning improved model performance

We observed a gradual improvement in our model through fine-grained tuning (Figure 3a). A vanilla FocalNet (AUROC of 0.701; 95% CI 0.586-0.816), which was trained with HPO but without pretraining, outperformed a convolutional neural network (AUROC of 0.660; 95% CI 0.524-0.796). Both the vanilla model and the model applying pretraining and HPO (AUROC of 0.734; 95% CI 0.634-0.833) performed below the endoscopists' consensus (AUROC of 0.737; 95% CI 0.612-0.862).
However, further training with curriculum learning yielded the best performance (AUROC of 0.783; 95% CI 0.680-0.885). GNFM with curriculum learning also showed improved performance (AUROC of 0.774; 95% CI 0.669-0.879) over baseline fine-tuning (AUROC of 0.747; 95% CI 0.641-0.853). The sensitivity and specificity of the curriculum-trained model were comparable to those of each of the nine endoscopists (Figure 3b and Table 4). By contrast, GPT-4o showed substantially lower performance, with an accuracy of 0.485, sensitivity of 0.500, and specificity of 0.440. GPT-4o was also unable to assign 24 images to a specific disease class. We identified a consistent trend of performance gains across folds as the curriculum advanced (Supplementary Table 1). Grad-CAM heatmaps highlighted the class-discriminative regions for each prediction, which largely overlapped with abnormal mucosa (Figure 4).

Discussion

In this study, we demonstrated that deep learning analysis of routine endoscopic images can aid in differentiating CMV from HSV esophagitis, a distinction that is often challenging based on endoscopic appearance alone. Our findings suggest that artificial intelligence may serve as a useful adjunct to support early etiologic assessment during endoscopy.

CMV and HSV esophagitis share substantial clinical and endoscopic overlap, making differentiation challenging even for experienced endoscopists. Both commonly present with odynophagia, chest pain, and ulcerative lesions, while classic features (such as linear ulcers in CMV or vesicular lesions in HSV) are often absent in practice. Because these infections primarily affect immunocompromised patients, delayed diagnosis may result in prolonged hospitalization and increased morbidity. Although histopathologic confirmation with immunohistochemical staining remains the diagnostic standard, it is time-consuming and limited by sampling error from the patchy distribution of viral inclusions, often requiring multiple biopsies that may not be feasible in unstable patients.
These limitations underscore the need for adjunctive tools capable of rapid, image-based differentiation. In this context, artificial intelligence may support objective assessment during endoscopy. The improved performance of our final model appears to be driven more by the curriculum learning strategy than by foundation model pretraining alone. Presenting training images in order of increasing difficulty may parallel the stepwise development of clinical expertise and enhance recognition of subtle mucosal differences in visually heterogeneous settings. This effect was most pronounced in the FocalNet-based model, where curriculum learning produced the greatest gain, whereas pretraining without structured progression yielded less consistent improvement. These findings highlight the importance of task-specific training design in viral esophagitis, where morphologic distinctions are often subtle. In practice, such image-based inference is not intended to replace histopathology but to provide an earlier probabilistic assessment that may help guide initial antiviral selection while confirmatory testing is pending. Recent advances in gastrointestinal imaging have demonstrated the potential of deep learning to support the diagnosis of infectious and inflammatory diseases[21, 22]. Deep learning models have shown high accuracy in identifying Helicobacter pylori infection, grading gastritis, and recognizing inflammatory changes in conditions such as eosinophilic esophagitis[23, 24]. These successes suggest that endoscopic manifestations of mucosal inflammation can be effectively captured by image-based neural networks, supporting the feasibility of extending similar approaches to viral esophagitis. Our findings therefore align with a broader body of evidence indicating that AI can augment diagnostic assessment in gastrointestinal diseases[25]. However, despite these advances, pathogen-level differentiation has rarely been addressed in prior work. 
An earlier Scientific Reports study from Asan Medical Center applied machine-learning methods to viral esophagitis using manually defined regions of interest (ROIs) and reported near-perfect classification performance[12]. However, such exceptionally high accuracy is difficult to reconcile with the substantial visual overlap between CMV and HSV esophagitis and may reflect methodological constraints inherent to small datasets and ROI-dependent pipelines. Manual ROI selection can inadvertently introduce bias or information leakage by focusing on conspicuous lesion components while excluding surrounding mucosa, thereby overestimating model performance and limiting real-world applicability. In contrast, our model processes full-frame endoscopic images and was evaluated using a strictly separated hold-out test set, reducing the risk of overfitting and providing a more reliable estimate of generalizability. By combining domain-adapted pretraining and curriculum learning, our framework offers a structurally more robust and scalable approach than earlier ROI-based methods.

This study has several limitations. First, CMV and HSV esophagitis are uncommon even in high-volume tertiary centers, resulting in a limited sample size that may restrict generalizability. Although we addressed this through domain-adapted pretraining, curriculum-based fine-tuning, and data augmentation, larger multicenter datasets are needed to confirm clinical applicability. Second, the retrospective design and variability in image acquisition may introduce selection and spectrum biases. Third, the model generates image-level predictions without incorporating clinical or laboratory data, which could improve performance within multimodal frameworks. Future studies should expand dataset diversity and prospectively evaluate real-time integration into clinical workflows.
Our findings show that deep learning can assist in differentiating CMV and HSV esophagitis, a task that remains challenging even for experienced clinicians. Using domain-adapted pretraining and full-frame image analysis, our model provides an early, noninvasive estimate of pathogen likelihood to help guide antiviral selection while awaiting histopathologic confirmation. Given that viral esophagitis often affects immunocompromised patients, rapid image-based inference may offer meaningful clinical value. Future studies should validate this approach across diverse institutions, incorporate multimodal clinical data, and assess real-time integration into endoscopic workflows to support clinically reliable AI-assisted decision-making.

Declarations

CONFLICT OF INTEREST
No potential conflict of interest relevant to this article was reported.

AUTHORS' CONTRIBUTIONS
Study concept and design: K.J.E., M.Y.W. Data acquisition: L.Y.C. Data analysis and drafting of manuscript: L.Y.C. Critical revision for intellectual content: O.Y.E., K.T.S., L.H., M.B.H., L.J.H., R.P.L. All authors read and approved the final manuscript.

FUNDING INFORMATION
None

DATA AVAILABILITY STATEMENT
The data underlying this article cannot be shared publicly, given the privacy expectations of the individuals who participated in the study. The data will be shared upon reasonable request to the corresponding author.

ETHICS STATEMENT
The study protocol was reviewed and approved by the Institutional Review Board of Samsung Medical Center (IRB No. 2023-03-079).

References

1. Li, L. and R.C.C., Cytomegalovirus Esophagitis (Archived). StatPearls Publishing, 2023.
2. Gupta, M. and M.S., Cytomegalovirus Infections. StatPearls Publishing, 2025.
3. Kim, N.Y. and J. Lee, Infective Esophagitis. Korean J Helicobacter Up Gastrointest Res, 2025. 25(2): p. 108–116.
4. Ali, A.A., et al., Cytomegalovirus Esophagitis in an Immunocompromised Patient. Cureus, 2023. 15(9): p. e45634.
5. Li, X., et al., Advances and Challenges in Cytomegalovirus Detection Methods for Liver Transplant Donors. Diagnostics (Basel), 2023. 13(21).
6. Coisel, Y., et al., Cytomegalovirus and herpes simplex virus effect on the prognosis of mechanically ventilated patients suspected to have ventilator-associated pneumonia. PLoS One, 2012. 7(12): p. e51340.
7. Hasan, M.R., et al., Analytical methods for detection of human cytomegalovirus clinched biosensor a cutting-edge diagnostic tool. Biomedical Engineering Advances, 2021. 1.
8. Desai, N., S. Albahra, E. Lucas, A.G. Singal, S.T.G. Hammer, and P. Gopal, Clinical and Histopathologic Features Can Help Target Immunohistochemical Stain Use in the Diagnosis of Viral Esophagitis. 2021.
9. Yeh, P.J., et al., Risk Factors, Clinical and Endoscopic Features, and Clinical Outcomes in Patients with Cytomegalovirus Esophagitis. J Clin Med, 2022. 11(6).
10. Juric-Sekhar, G., et al., Cytomegalovirus (CMV) in gastrointestinal mucosal biopsies: should a pathologist perform CMV immunohistochemistry if the clinician requests it? Hum Pathol, 2017. 60: p. 11–15.
11. Post, C.S., et al., Utility of Machine Learning to Detect Cytomegalovirus in Digital Hematoxylin and Eosin-Stained Slides. Lab Invest, 2023. 103(10): p. 100225.
12. Lee, J.S., et al., Machine learning approach for differentiating cytomegalovirus esophagitis from herpes simplex virus esophagitis. Sci Rep, 2021. 11(1): p. 3672.
13. Kim, J.H., et al., Enhancing the Predictions of Cytomegalovirus Infection in Severe Ulcerative Colitis Using a Deep Learning Ensemble Model: Development and Validation Study. JMIR Med Inform, 2025. 13: p. e64987.
14. Jha, D., et al., GastroVision: A Multi-class Endoscopy Image Dataset for Computer Aided Gastrointestinal Disease Detection. 2023.
15. Borgli, H., et al., HyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy. Sci Data, 2020. 7(1): p. 283.
16. Yang, J., C. Li, X. Dai, L. Yuan, and J. Gao, Focal Modulation Networks. Advances in Neural Information Processing Systems, 2022.
17. Xie, Z., et al., SimMIM: a Simple Framework for Masked Image Modeling.
18. Akiba, T., et al., Optuna: A Next-generation Hyperparameter Optimization Framework. 2019.
19. Watanabe, S., Tree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance. 2025.
20. Selvaraju, R.R., et al., Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. 2019.
21. Esteva, A., et al., A guide to deep learning in healthcare. Nat Med, 2019. 25(1): p. 24–29.
22. Messmann, H., et al., Expected value of artificial intelligence in gastrointestinal endoscopy: European Society of Gastrointestinal Endoscopy (ESGE) Position Statement. Endoscopy, 2022. 54(12): p. 1211–1231.
23. Goncalves, W.G.E., et al., DeepHP: A New Gastric Mucosa Histopathology Dataset for Helicobacter pylori Infection Diagnosis. Int J Mol Sci, 2022. 23(23).
24. Gong, E.J., C.S. Bang, and J.J. Lee, Computer-aided diagnosis in real-time endoscopy for all stages of gastric carcinogenesis: Development and validation study. United European Gastroenterol J, 2024. 12(4): p. 487–495.
25. Cao, J.S., et al., Artificial intelligence in gastroenterology and hepatology: Status and challenges. World J Gastroenterol, 2021. 27(16): p. 1664–1690.

Tables

Table 1. Number of images for cross-validation and hold-out sets.

| Set | Split | CMV | HSV | Total |
|---|---|---|---|---|
| Cross-validation, Fold 1 | Training | 60 | 225 | 285 |
| Cross-validation, Fold 1 | Validation | 29 | 55 | 84 |
| Cross-validation, Fold 2 | Training | 81 | 224 | 305 |
| Cross-validation, Fold 2 | Validation | 8 | 56 | 64 |
| Cross-validation, Fold 3 | Training | 68 | 237 | 305 |
| Cross-validation, Fold 3 | Validation | 21 | 43 | 64 |
| Cross-validation, Fold 4 | Training | 78 | 221 | 299 |
| Cross-validation, Fold 4 | Validation | 11 | 59 | 70 |
| Cross-validation, Fold 5 | Training | 69 | 213 | 282 |
| Cross-validation, Fold 5 | Validation | 20 | 67 | 87 |
| Hold-out | | 25 | 76 | 101 |

Abbreviations: CMV, cytomegalovirus; HSV, herpes simplex virus.

Table 2. The number of images in GI classes of HyperKvasir and GastroVision.

| Source | Class | Images |
|---|---|---|
| GastroVision | Normal Stomach | 969 |
| GastroVision | Pylorus | 393 |
| GastroVision | GE Junction Normal Z-line | 329 |
| GastroVision | Duodenal Bulb | 205 |
| GastroVision | Normal Esophagus | 139 |
| GastroVision | Esophagitis | 107 |
| GastroVision | Barrett's Esophagus | 95 |
| GastroVision | Gastric Polyps | 65 |
| GastroVision | Esophageal Varices | 7 |
| GastroVision | Ulcer | 5 |
| HyperKvasir | Pylorus | 999 |
| HyperKvasir | Z-line | 932 |
| HyperKvasir | Retroflex Stomach | 764 |
| HyperKvasir | Esophagitis | 663 |
| HyperKvasir | Barrett's Esophagus | 94 |
| HyperKvasir | Unlabeled | 99,417 |

Table 3. Baseline characteristics for the patients with CMV and HSV.

| Variable | CMV (N = 43) | HSV (N = 127) | p-value |
|---|---|---|---|
| Age | 58.1 ± 14.5 | 60.4 ± 11.9 | 0.296 |
| Sex | | | 1.000 |
| Male | 30 (69.8%) | 87 (68.5%) | |
| Female | 13 (30.2%) | 40 (31.5%) | |
| Upper GI cancer | 11 (25.6%) | 19 (15.0%) | 0.178 |
| Other cancer | 22 (51.2%) | 71 (55.9%) | 0.717 |
| Inflammatory disease | 3 (7.0%) | 8 (6.3%) | 1.000 |
| Autoimmune disease | 2 (4.7%) | 7 (5.5%) | 1.000 |
| Metabolic disease | 11 (25.6%) | 37 (29.1%) | 0.802 |
| Cardiovascular disease | 15 (34.9%) | 45 (35.4%) | 1.000 |
| Infectious disease | 13 (30.2%) | 33 (26.0%) | 0.731 |
| Other disease | 18 (41.9%) | 70 (55.1%) | 0.184 |
| Transplantation | 14 (32.6%) | 42 (33.1%) | 1.000 |
| Mortality within 1 year after CMV/HSV diagnosis | | | 0.060* |
| Unknown | 2 (4.7%) | 16 (12.6%) | |
| No | 21 (48.8%) | 77 (60.6%) | |
| Yes | 20 (46.5%) | 34 (26.8%) | |

Continuous variables are presented as mean ± standard deviation and categorical variables as N (%). Abbreviations: CMV, cytomegalovirus; HSV, herpes simplex virus; GI, gastrointestinal. *The statistical test for mortality rate was conducted without the ‘Unknown’ group.

Table 4.
Model performances with a hold-out set.

| Model | AUROC | Accuracy | Sensitivity | Specificity | F1 score | Precision |
|---|---|---|---|---|---|---|
| CNN | 0.660 | 0.723 | 0.776 | 0.560 | 0.808 | 0.843 |
| FocalNet +HPO | 0.701 | 0.653 | 0.605 | 0.800 | 0.724 | 0.902 |
| FocalNet +Pretrained +HPO | 0.734 | 0.584 | 0.461 | 0.960 | 0.625 | 0.972 |
| FocalNet +Pretrained +HPO +Curriculum learning | 0.783 | 0.772 | 0.776 | 0.760 | 0.837 | 0.908 |
| GNFM +Fine-tuning | 0.747 | 0.663 | 0.605 | 0.840 | 0.730 | 0.920 |
| GNFM +Fine-tuning +Curriculum learning | 0.774 | 0.782 | 0.842 | 0.600 | 0.853 | 0.865 |
| Endoscopists' consensus | 0.737 | 0.798 | 0.798 | 0.542 | 0.868 | 0.857 |
| Endoscopist 1 | - | 0.596 | 0.560 | 0.708 | 0.677 | 0.857 |
| Endoscopist 2 | - | 0.758 | 0.867 | 0.417 | 0.844 | 0.823 |
| Endoscopist 3 | - | 0.444 | 0.280 | 0.958 | 0.433 | 0.955 |
| Endoscopist 4 | - | 0.636 | 0.733 | 0.333 | 0.753 | 0.775 |
| Endoscopist 5 | - | 0.747 | 0.800 | 0.583 | 0.828 | 0.857 |
| Endoscopist 6 | - | 0.707 | 0.867 | 0.208 | 0.818 | 0.774 |
| Endoscopist 7 | - | 0.697 | 0.747 | 0.542 | 0.789 | 0.836 |
| Endoscopist 8 | - | 0.545 | 0.440 | 0.875 | 0.595 | 0.917 |
| Endoscopist 9 | - | 0.535 | 0.667 | 0.125 | 0.685 | 0.704 |
| GPT-4o | - | 0.485 | 0.500 | 0.440 | 0.517 | 0.535 |

Abbreviations: CNN, convolutional neural network; GNFM, GastroNet-5M-pretrained foundation model; HPO, hyperparameter optimization; GPT, generative pretrained transformer; AUROC, area under the receiver operating characteristic curve.

Additional Declarations
No competing interests reported.

Supplementary Files
SupplementaryTable1.docx
supplefigure1.jpg

Status: Under Review
Version 1 posted
Reviewers agreed at journal: 26 Apr, 2026
Reviewers invited by journal: 21 Apr, 2026
Editor invited by journal: 16 Apr, 2026
Editor assigned by journal: 14 Mar, 2026
Submission checks completed at journal: 14 Mar, 2026
First submitted to journal: 13 Mar, 2026
{"props":{"pageProps":{"initialData":{"identity":"rs-9113001","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":629767180,"identity":"e82deda2-b84a-4117-9f7d-bfa64e5df637","order_by":0,"name":"Ji Eun Kim","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Ji","middleName":"Eun","lastName":"Kim","suffix":""},{"id":629767181,"identity":"b7c73647-fd31-405d-b0f3-17713c48e96e","order_by":1,"name":"Yeong Chan Lee","email":"","orcid":"","institution":"Ajou University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Yeong","middleName":"Chan","lastName":"Lee","suffix":""},{"id":629767182,"identity":"38a19dbf-b0ac-48c5-bc80-cf90f591ced8","order_by":2,"name":"Young Eun Oh","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Young","middleName":"Eun","lastName":"Oh","suffix":""},{"id":629767183,"identity":"a1f8abca-25e6-496e-8acb-bb36a0b45028","order_by":3,"name":"Tae Se Kim","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of 
Medicine","correspondingAuthor":false,"prefix":"","firstName":"Tae","middleName":"Se","lastName":"Kim","suffix":""},{"id":629767184,"identity":"ee805fe9-460c-45f5-85b2-e86c6b052486","order_by":4,"name":"Hyuk Lee","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Hyuk","middleName":"","lastName":"Lee","suffix":""},{"id":629767185,"identity":"58ffbc6e-d086-4b99-966c-e197f5b5b444","order_by":5,"name":"Byung-Hoon Min","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Byung-Hoon","middleName":"","lastName":"Min","suffix":""},{"id":629767186,"identity":"b07a81bc-8db8-4f9e-be41-c8b8d3d97fe3","order_by":6,"name":"Jun Haeng Lee","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Jun","middleName":"Haeng","lastName":"Lee","suffix":""},{"id":629767187,"identity":"1379f352-d1c5-4311-b78d-0e1d6216b06f","order_by":7,"name":"Poong-Lyul Rhee","email":"","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan University School of Medicine","correspondingAuthor":false,"prefix":"","firstName":"Poong-Lyul","middleName":"","lastName":"Rhee","suffix":""},{"id":629767188,"identity":"8b7653f1-88de-41ec-b286-4e581797bad6","order_by":8,"name":"Yang Won Min","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA2ElEQVRIiWNgGAWjYFAD9gYgYWBBihaeAyAtEqRokUgAk4QVGhw/e/g1T8UdOX7J51c3/CiQYOBv707Ar+VMXpo1z5lnxpKzc8pu9gAdJnHm7Ab8Wg7kmBnnth1O3HA7J+0GD1CLgUQuAS3n34C11O+/eSbt5h+itNzIMX4M1JJgIMF+7DZRtkjeeGPG/OfMYcMZZ3LYbssYSPAQ9Avf+RzjjzMqDsvztx9/dvPNHxs5/vZe/FoUDjCwQeOCxwBM4lUOAvINDMwfIEz2BwRVj4JRMApGwcgEAPV/TEQhRocnAAAAAElFTkSuQmCC","orcid":"","institution":"Samsung Medical Center, Sungkyunkwan 
University School of Medicine","correspondingAuthor":true,"prefix":"","firstName":"Yang","middleName":"Won","lastName":"Min","suffix":""}],"badges":[],"createdAt":"2026-03-13 09:38:31","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9113001/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9113001/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":108492790,"identity":"ea8fb16e-e871-425f-9753-32963bd32478","added_by":"auto","created_at":"2026-05-05 09:58:39","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":178174,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eStudy overview.\u003c/strong\u003e\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/accf897eba318a9e04b7dddf.png"},{"id":108388814,"identity":"3fcf6952-bad7-4df3-8150-0795a07864c9","added_by":"auto","created_at":"2026-05-04 06:43:52","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":566222,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSelf-supervised masked reconstruction performance of the pretrained model. \u003c/strong\u003e(a) Reconstruction loss during masked image modeling. During pretraining, the model’s reconstruction loss rapidly declined and reached 0.075 (blue curve). Evaluated directly on SMC endoscopic images, without further fine-tuning, the same model achieved a comparable loss of 0.082 (red dashed line). (b) Reconstruction examples in the in-house dataset. For each case, the original endoscopic image (left) is heavily masked in the central region (middle), and the pretrained model’s pixel-level reconstruction is shown on the right. 
Abbreviation: SMC, Samsung Medical Center\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/04dcf920638404c590a20752.png"},{"id":108492428,"identity":"4bd56eca-7f7b-4057-908b-b0e69defa3d8","added_by":"auto","created_at":"2026-05-05 09:57:45","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":62548,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eArea under the receiver operating characteristic with a holdout set \u003c/strong\u003e(a) Area under the receiver operating characteristic curve (AUROC) for each classifier. Error bars denote the 95% confidence interval. (b) Receiver operating characteristic (ROC) curve (red line) of the final model on the hold-out test set, with points representing each individual endoscopist (gray circles) and GPT-4o (gray triangle). Abbreviation: CNN, convolutional neural network; GNFN, GastroNet-5M-pretrained foundation model; HPO, hyperparameter optimization;\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/774f1fd32d964585763fbead.png"},{"id":108388817,"identity":"d35f372d-09fc-402d-bacb-9ed19cab18aa","added_by":"auto","created_at":"2026-05-04 06:43:52","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":496248,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eExamples of true negatives (CMV) and true positives (HSV) of a hold-out set and their interpretation using Grad-CAM. 
\u003c/strong\u003eAbbreviation: CMV, cytomegalovirus; herpes simplex virus, HSV.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/96d940d684b579b521901bb2.png"},{"id":108804830,"identity":"5ffe63bb-a879-4ed3-a1b5-b06063e34b68","added_by":"auto","created_at":"2026-05-08 15:23:46","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1994344,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/cb008b55-7aa2-41bd-bd53-a04a34c9c619.pdf"},{"id":108388811,"identity":"e10a07d8-4c19-4384-9e07-462e4b818e0e","added_by":"auto","created_at":"2026-05-04 06:43:52","extension":"docx","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":18341,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable1.docx","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/5cf92e9514061b4145b03ffd.docx"},{"id":108492399,"identity":"356d1a24-e02a-4424-b2a2-e0b67eabdfcb","added_by":"auto","created_at":"2026-05-05 09:57:40","extension":"jpg","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":170907,"visible":true,"origin":"","legend":"","description":"","filename":"supplefigure1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-9113001/v1/5652b4b8157ea83f9e991edf.jpg"}],"financialInterests":"No competing interests reported.","formattedTitle":"Development and validation of a deep learning model for differentiating cytomegalovirus and herpes simplex virus esophagitis using endoscopic images","fulltext":[{"header":"Introduction","content":"\u003cp\u003eCytomegalovirus (CMV) and herpes simplex virus (HSV) represent the most common etiologies of infectious esophagitis, particularly in immunocompromised hosts[1-4]. 
Although endoscopy is routinely performed when viral esophagitis is suspected, the substantial overlap in endoscopic morphology including erosions, punched-out ulcers, and longitudinal defects makes reliable discrimination between CMV and HSV challenging in real time[5-7]. As a result, definitive diagnosis relies on immunohistochemical (IHC) staining of biopsy specimens[8, 9]. However, IHC processing requires additional time, and treatment cannot be confidently initiated until results are confirmed[10]. This delay is clinically meaningful because inappropriate or deferred antiviral therapy can lead to worsening esophageal injury, uncontrolled symptoms, and, in severe cases, systemic complications in vulnerable patients.\u003c/p\u003e\n\u003cp\u003eRecent advances in artificial intelligence (AI) have demonstrated promising performance in viral detection across gastrointestinal imaging from CMV identification on digital pathology slides to machine-learning models distinguishing CMV from HSV esophagitis using handcrafted features[11-13]. However, these approaches do not resolve the central diagnostic challenge encountered at the time of endoscopy. Clinicians must frequently decide whether to initiate antiviral therapy before histopathologic confirmation, yet the endoscopic appearances of CMV and HSV esophagitis overlap substantially, and no validated AI tool currently offers reliable, point-of-care differentiation that can function as a consistent central reader.\u003c/p\u003e\n\u003cp\u003eAgainst this backdrop, we sought to develop a deep learning system capable of assisting real-time etiologic classification of viral esophagitis using biopsy proven CMV and HSV cases from a large tertiary center, supplemented by publicly available datasets to enhance robustness. 
By evaluating its performance relative to experienced endoscopists and contemporary AI models, we aimed to determine whether such a system could meaningfully support clinical decision-making during the narrow window in which treatment strategies are first considered. This study highlights the potential role of AI-assisted differentiation in mitigating diagnostic delays and improving the timely initiation of appropriate therapy for patients with suspected viral esophagitis.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cstrong\u003e1.\u0026nbsp; \u0026nbsp;Study population and data preparation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e1)\u0026nbsp; \u0026nbsp;Study population\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe study population comprised patients who underwent esophagogastroduodenoscopy (EGD) at Samsung Medical Center (SMC), a tertiary academic referral hospital in Seoul, South Korea, between January 1, 2012, and December 31, 2021. Eligible patients were diagnosed with either CMV esophagitis or HSV esophagitis, confirmed by tissue biopsy with IHC staining. Only patients with histopathologically proven CMV or HSV infection were included. Clinical, endoscopic, and pathological variables were extracted from the institutional electronic medical record system. To ensure phenotypic consistency, cases were selected only when the endoscopic report explicitly included the term esophagitis. Patients who did not undergo IHC evaluation or whose pathology results were inconclusive were excluded from the analysis. In total, 170 patients were included to train and validate our model. Demographic and clinical variables including age, sex, and comorbidities were collected for all eligible patients. The study protocol was reviewed and approved by the SMC Institutional Review Board (IRB No. 2023-03-079) and conducted in accordance with the principles of the Declaration of Helsinki. 
This study used de-identified patient data, and the requirement for informed consent was waived according to the rules of the SMC IRB.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2)\u0026nbsp; \u0026nbsp;Target dataset\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe collected 470 endoscopic images from the 170 patients with CMV or HSV at SMC, representing a class imbalance ratio of 1:3 (CMV:HSV). The dataset was divided into a cross-validation set (369 images) and an independent hold-out set (101 images). Each fold in the cross-validation set included approximately 295 images for training and 74 images for validation (\u003cstrong\u003eTable 1\u003c/strong\u003e).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3)\u0026nbsp; \u0026nbsp;Public dataset for pretraining\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe obtained 105,183 endoscopic images, including images of the upper gastrointestinal (GI) tract, from two public datasets: GastroVision[14] (2,314 labeled upper GI images) and HyperKvasir[15] (3,452 labeled upper GI images and 99,417 unlabeled endoscopic images). From these datasets, we used the 5,766 labeled upper GI images and the 99,417 unlabeled images for self-supervised learning (\u003cstrong\u003eTable 2\u003c/strong\u003e). The labeled images included normal and pathological conditions of upper GI anatomy: normal esophagus (139 images), esophagitis (770 images), Barrett’s esophagus (189 images), normal stomach (969 images), pylorus (1,392 images), and others (2,307 images).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4)\u0026nbsp; \u0026nbsp;Data processing and augmentation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThere was no subject overlap between the cross-validation set and the hold-out set. Images were resized to 608×608 pixels after center-cropping and padding the original images. 
All images were then normalized using ImageNet color statistics (RGB mean = [0.485, 0.456, 0.406] and standard deviation = [0.229, 0.224, 0.225]) and augmented with random horizontal and vertical flips and color jittering with brightness and contrast variations of 0.1.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2.\u0026nbsp; \u0026nbsp;Model architecture\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe employed FocalNet-Base as our foundation architecture, a vision transformer variant that incorporates focal modulation mechanisms for enhanced feature representation learning. Focal modulation hierarchically captures multi-scale spatial context with depth-wise convolution and gated aggregation, then injects the resulting modulator into each query[16]. It can capture spatially context-aware interactions without explicit query–key matching, which requires computationally intensive operations. Our model processes 608×608 input images with 4×4 patches, resulting in 152×152 patch tokens. The architecture consists of hierarchical feature extraction with progressive expansion of hidden dimensions across transformer layers including a focal modulation layer, culminating in a binary prediction head to classify CMV or HSV. The model was trained using the Adam optimizer with cross-entropy loss.\u003c/p\u003e\n\u003cp\u003eWe developed a convolutional neural network (CNN) as a comparative model for binary classification of CMV and HSV. The network architecture consisted of four sequential convolutional blocks, each comprising a convolutional layer (with 32, 64, 128, and 256 filters respectively), batch normalization, ReLU activation, and 2×2 max-pooling. Following the convolutional blocks, we applied adaptive average pooling to generate fixed-size feature maps, followed by a fully connected layer with dropout (p=0.3) for regularization. 
This model was trained from scratch for 50 epochs with a learning rate of 1×10\u003csup\u003e-3\u003c/sup\u003e and batch size of 32 using the Adam optimizer with cross-entropy loss.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eWe further validated our curriculum learning approach with a GastroNet-5M foundation model (GNFM) [20] in gastrointestinal endoscopy. GNFM was pretrained with about 5 million endoscopic images (224×224 pixels) of the gastrointestinal tract in GastroNet-5M. The architecture employed a ResNet50[21] with the original classification head replaced by a custom head comprising dropout layers (p=0.3 and 0.2) and fully connected layers for binary classification.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3.\u0026nbsp; \u0026nbsp;Training\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe overall training workflow is illustrated in \u003cstrong\u003eFigure 1\u003c/strong\u003e. We first conducted domain-specific self-supervised pretraining using masked image modeling, followed by fine-grained tuning that integrated automated hyperparameter search and a curriculum learning strategy to maximize classification performance.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e1)\u0026nbsp; \u0026nbsp;Self-supervised pretraining\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe implemented masked image modeling (MIM) [17] for domain-specific pretraining using 105,183 publicly available endoscopic images. MIM is a self-supervised learning strategy that learns rich feature representations by reconstructing partially masked patches in an image[17]. We randomly masked 60% of 8×8 pixel regions in an endoscopic image. Then, FocalNet was trained to reconstruct the masked regions, enabling learning of meaningful visual representations relevant to GI regions. 
Training was conducted for 50 epochs with a learning rate of 1×10\u003csup\u003e-4\u003c/sup\u003e, batch size of 32, and gradient clipping at a maximum norm of 1.0.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e2)\u0026nbsp; \u0026nbsp;Automated hyperparameter optimization\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe used an automatic hyperparameter optimization (HPO) framework[18] to search for optimal training configurations to fine-tune the pretrained FocalNet for each cross-validation fold. The search space included learning rates (1×10\u003csup\u003e-7\u003c/sup\u003e to 1×10\u003csup\u003e-6\u003c/sup\u003e), training epochs (20 to 30), and batch sizes (4, 8, or 16). HPO was conducted using the Tree-Structured Parzen Estimator [19], a Bayesian optimization method that explores the search space randomly for the initial few trials and subsequently samples more frequently from promising hyperparameter regions. This automatic optimization was performed over 30 trials per fold with the objective of maximizing area under the receiver operating characteristic curve (AUROC).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e3)\u0026nbsp; \u0026nbsp;Curriculum learning with confidence-based stratification\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe used a curriculum learning approach to further train FocalNet and GNFM. We generated pseudo-labels for difficulty from the training set. Samples in the training set per fold were stratified into four difficulty levels for CMV and HSV, respectively: (1) Very Easy: correct predictions with high confidence (≥median confidence), (2) Easy: correct predictions with low confidence (\u0026lt;median confidence), (3) Hard: incorrect predictions with low confidence, and (4) Very Hard: incorrect predictions with high confidence. The optimal threshold for confidence assessment was determined using Youden's J statistic to maximize sensitivity and specificity. 
Curriculum learning proceeded progressively through four stages, incorporating cumulatively more difficult samples. Stage 1 trained on Very Easy samples only, Stage 2 added Easy samples, Stage 3 included Hard samples, and Stage 4 encompassed all samples. Each curriculum stage trained for 20 epochs with evaluation on the validation set. All experiments were implemented using the Hugging Face Transformers (version 4.51.3) library with PyTorch (version 2.5.1) backend.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003e4.\u0026nbsp; \u0026nbsp;Evaluation\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAfter fold-specific fine-tuning, an ensemble of five models was applied to improve generalization. For the hold-out test set, final predictions were obtained by averaging output probabilities. Performance was evaluated using AUROC, accuracy, sensitivity, specificity, F1-score, and positive predictive value. Ninety-five percent confidence intervals (CI) for AUROC were calculated with 2,000 bootstrap replicates. To compare performance with clinicians, nine endoscopists independently classified hold-out images as CMV or HSV, and consensus scores were determined by vote counts. Model interpretability was qualitatively assessed using gradient-weighted class activation mapping (Grad-CAM), which highlights regions contributing to predictions [20].\u003c/p\u003e\n\u003cp\u003e\u0026nbsp;We additionally evaluated generative pretrained transformer 4o (GPT-4o; OpenAI). GPT-4o classified CMV or HSV from hold-out images using a predefined instruction (Supplementary Figure 1). To mitigate hallucinated outputs, it was instructed to respond “difficult to differentiate” when a definitive classification was not possible.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStatistical Analysis\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eDemographic statistics are described as mean ± standard deviation (SD) for continuous variables and as number (percentage) for categorical variables. 
We conducted t-tests and chi-squared tests for continuous variables and categorical variables, respectively, to compare CMV and HSV groups. All statistical comparisons used two-sided tests with significance threshold α=0.05 and were performed with R software (version 4.1.0; R Foundation for Statistical Computing, Vienna, Austria).\u0026nbsp;\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cstrong\u003eBaseline characteristics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIn this study, 42 CMV cases and 125 HSV cases were included for development and validation of the model. \u003cstrong\u003eTable 3\u003c/strong\u003e shows the demographic characteristics of the study participants at baseline. The mean age did not differ significantly between the two groups (58.3 for CMV and 59.6 for HSV; p-value=0.587). About 71% of patients with CMV and 67% of patients with HSV were male (p-value=0.765). The proportions of comorbidities were also not significantly different between patients with CMV and HSV. The difference in mortality rates within 1 year after diagnosis did not reach statistical significance (46.5% for CMV and 26.8% for HSV, p-value=0.060).\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eThe pretrained model can reconstruct the masked region in an in-house endoscopic image\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe examined whether the model pretrained on the large dataset could reconstruct masked endoscopic images. The reconstruction loss with the pretraining dataset converged to 0.075 and the loss with the SMC dataset remained similarly low at 0.082 (\u003cstrong\u003eFigure 2a\u003c/strong\u003e). Representative examples are shown in \u003cstrong\u003eFigure 2b\u003c/strong\u003e. 
The model recovered mucosal folds, vascular patterns, lesions and luminal contours, indicating that it had learned transferable, structure-aware representations of the mucosa.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFine-grained tuning improved model performance\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe observed a gradual improvement for our model through fine-grained tuning (\u003cstrong\u003eFigure 3a\u003c/strong\u003e). A vanilla FocalNet (AUROC of 0.701; 95% CI 0.586-0.816), which was only trained with HPO but without pretraining, outperformed a convolutional neural network (AUROC of 0.660; 95% CI 0.524-0.796). The model applying both pretraining and HPO achieved an AUROC of 0.734 (95% CI 0.634-0.833), which, like that of the vanilla model, was lower than the endoscopists’ consensus (AUROC of 0.737; 95% CI 0.612-0.862). However, further training with curriculum learning yielded the best performance (AUROC of 0.783; 95% CI 0.680-0.885). GNFM with curriculum learning also showed improved performance (AUROC of 0.774; 95% CI 0.669-0.879) over baseline fine-tuning (AUROC of 0.747; 95% CI 0.641-0.853). Its sensitivity and specificity were comparable to those of each of the nine endoscopists (\u003cstrong\u003eFigure 3b\u003c/strong\u003e and\u003cstrong\u003e\u0026nbsp;Table 4\u003c/strong\u003e). By contrast, GPT-4o showed substantially lower performance with accuracy of 0.485, sensitivity of 0.500, and specificity of 0.440. In addition, GPT-4o could not assign 24 images to a specific disease class. We identified a consistent trend of performance gains across folds as the curriculum advanced (\u003cstrong\u003eSupplementary Table 1\u003c/strong\u003e). 
Grad-CAM heatmaps highlight the class-discriminative regions for the prediction, which largely overlap with abnormal mucosa (\u003cstrong\u003eFigure 4\u003c/strong\u003e).\u003c/p\u003e"},{"header":"Discussions","content":"\u003cp\u003eIn this study, we demonstrated that deep learning analysis of routine endoscopic images can aid in differentiating CMV from HSV esophagitis, a distinction that is often challenging based on endoscopic appearance alone. Our findings suggest that artificial intelligence may serve as a useful adjunct to support early etiologic assessment during endoscopy.\u003c/p\u003e\n\u003cp\u003eCMV and HSV esophagitis share substantial clinical and endoscopic overlap, making differentiation challenging even for experienced endoscopists. Both commonly present with odynophagia, chest pain, and ulcerative lesions, while classic features—such as linear ulcers in CMV or vesicular lesions in HSV—are often absent in practice. Because these infections primarily affect immunocompromised patients, delayed diagnosis may result in prolonged hospitalization and increased morbidity. Although histopathologic confirmation with immunohistochemical staining remains the diagnostic standard, it is time-consuming and limited by sampling error from the patchy distribution of viral inclusions, often requiring multiple biopsies that may not be feasible in unstable patients. These limitations underscore the need for adjunctive tools capable of rapid, image-based differentiation. In this context, artificial intelligence may support objective assessment during endoscopy.\u003c/p\u003e\n\u003cp\u003eThe improved performance of our final model appears to be driven more by the curriculum learning strategy than by foundation model pretraining alone. Presenting training images in order of increasing difficulty may parallel the stepwise development of clinical expertise and enhance recognition of subtle mucosal differences in visually heterogeneous settings. 
This effect was most pronounced in the FocalNet-based model, where curriculum learning produced the greatest gain, whereas pretraining without structured progression yielded less consistent improvement. These findings highlight the importance of task-specific training design in viral esophagitis, where morphologic distinctions are often subtle. In practice, such image-based inference is not intended to replace histopathology but to provide an earlier probabilistic assessment that may help guide initial antiviral selection while confirmatory testing is pending.\u003c/p\u003e\n\u003cp\u003eRecent advances in gastrointestinal imaging have demonstrated the potential of deep learning to support the diagnosis of infectious and inflammatory diseases[21, 22]. Deep learning models have shown high accuracy in identifying \u003cem\u003eHelicobacter pylori\u003c/em\u003e infection, grading gastritis, and recognizing inflammatory changes in conditions such as eosinophilic esophagitis[23, 24]. These successes suggest that endoscopic manifestations of mucosal inflammation can be effectively captured by image-based neural networks, supporting the feasibility of extending similar approaches to viral esophagitis. Our findings therefore align with a broader body of evidence indicating that AI can augment diagnostic assessment in gastrointestinal diseases[25].\u003c/p\u003e\n\u003cp\u003eHowever, despite these advances, pathogen-level differentiation has rarely been addressed in prior work. An earlier Scientific Reports study from Asan Medical Center applied machine-learning methods to viral esophagitis using manually defined regions of interest (ROIs) and reported near-perfect classification performance[12]. However, such exceptionally high accuracy is difficult to reconcile with the substantial visual overlap between CMV and HSV esophagitis and may reflect methodological constraints inherent to small datasets and ROI-dependent pipelines. 
Manual ROI selection can inadvertently introduce bias or information leakage by focusing on conspicuous lesion components while excluding surrounding mucosa, thereby overestimating model performance and limiting real-world applicability. In contrast, our model processes full-frame endoscopic images and was evaluated using a strictly separated hold-out test set, reducing the risk of overfitting and providing a more reliable estimate of generalizability. By combining domain-adapted pretraining and curriculum learning, our framework offers a structurally more robust and scalable approach than earlier ROI-based methods.\u003c/p\u003e\n\u003cp\u003eThis study has several limitations. First, CMV and HSV esophagitis are uncommon even in high-volume tertiary centers, resulting in a limited sample size that may restrict generalizability. Although we addressed this through domain-adapted pretraining, curriculum-based fine-tuning, and data augmentation, larger multicenter datasets are needed to confirm clinical applicability. Second, the retrospective design and variability in image acquisition may introduce selection and spectrum biases. Third, the model generates image-level predictions without incorporating clinical or laboratory data, which could improve performance within multimodal frameworks. Future studies should expand dataset diversity and prospectively evaluate real-time integration into clinical workflows.\u003c/p\u003e\n\u003cp\u003eOur findings show that deep learning can assist in differentiating CMV and HSV esophagitis, a task that remains challenging even for experienced clinicians. Using domain-adapted pretraining and full-frame image analysis, our model provides an early, noninvasive estimate of pathogen likelihood to help guide antiviral selection while awaiting histopathologic confirmation. Given that viral esophagitis often affects immunocompromised patients, rapid image-based inference may offer meaningful clinical value. 
Future studies should validate this approach across diverse institutions, incorporate multimodal clinical data, and assess real-time integration into endoscopic workflows to support clinically reliable AI-assisted decision-making.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eCONFLICT OF INTEREST\u003c/p\u003e\n\u003cp\u003eNo potential conflict of interest relevant to this article was reported.\u003c/p\u003e\n\u003cp\u003eAUTHORS’ CONTRIBUTIONS\u003c/p\u003e\n\u003cp\u003eStudy concept and design: K.J.E., M.Y.W. Data acquisition: L.Y.C. Data analysis and drafting of manuscript: L.Y.C. Critical revision for intellectual content: O.Y.E., K.T.S., L.H., M.B.H., L.J.H., R.P.L. All authors read and approved the final manuscript.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eFUNDING INFORMATION\u003c/p\u003e\n\u003cp\u003eNone\u003c/p\u003e\n\u003cp\u003eDATA AVAILABILITY STATEMENT\u003c/p\u003e\n\u003cp\u003eThe data underlying this article cannot be shared publicly, given the privacy expectations of the individuals who participated in the study. The data will be shared upon reasonable request to the corresponding author.\u003c/p\u003e\n\u003cp\u003eETHICS STATEMENT\u003c/p\u003e\n\u003cp\u003eThe study protocol was reviewed and approved by the Institutional Review Board of Samsung Medical Center (IRB No. 2023-03-079).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eLanxin Li 1, R.C.C., \u003cem\u003eCytomegalovirus Esophagitis(Archived).\u003c/em\u003e StatPearls Publishing, 2023.\u003c/li\u003e\n \u003cli\u003eMohit Gupta 1, M.S., \u003cem\u003eCytomegalovirus Infections.\u003c/em\u003e StatPearls Publishing, 2025.\u003c/li\u003e\n \u003cli\u003eKim, N.Y. and J. Lee, \u003cem\u003eInfective Esophagitis.\u003c/em\u003e Korean J Helicobacter Up Gastrointest Res, 2025. \u003cstrong\u003e25\u003c/strong\u003e(2): p. 
108\u0026ndash;116.\u003c/li\u003e\n \u003cli\u003eAli, A.A., et al., \u003cem\u003eCytomegalovirus Esophagitis in an Immunocompromised Patient.\u003c/em\u003e Cureus, 2023. \u003cstrong\u003e15\u003c/strong\u003e(9): p. e45634.\u003c/li\u003e\n \u003cli\u003eLi, X., et al., \u003cem\u003eAdvances and Challenges in Cytomegalovirus Detection Methods for Liver Transplant Donors.\u003c/em\u003e Diagnostics (Basel), 2023. \u003cstrong\u003e13\u003c/strong\u003e(21).\u003c/li\u003e\n \u003cli\u003eCoisel, Y., et al., \u003cem\u003eCytomegalovirus and herpes simplex virus effect on the prognosis of mechanically ventilated patients suspected to have ventilator-associated pneumonia.\u003c/em\u003e PLoS One, 2012. \u003cstrong\u003e7\u003c/strong\u003e(12): p. e51340.\u003c/li\u003e\n \u003cli\u003eHasan, M.R., et al., \u003cem\u003eAnalytical methods for detection of human cytomegalovirus clinched biosensor a cutting-edge diagnostic tool.\u003c/em\u003e Biomedical Engineering Advances, 2021. \u003cstrong\u003e1\u003c/strong\u003e.\u003c/li\u003e\n \u003cli\u003eDesai, N., et al., \u003cem\u003eClinical and Histopathologic Features Can Help Target Immunohistochemical Stain Use in the Diagnosis of Viral Esophagitis.\u003c/em\u003e 2021.\u003c/li\u003e\n \u003cli\u003eYeh, P.J., et al., \u003cem\u003eRisk Factors, Clinical and Endoscopic Features, and Clinical Outcomes in Patients with Cytomegalovirus Esophagitis.\u003c/em\u003e J Clin Med, 2022. \u003cstrong\u003e11\u003c/strong\u003e(6).\u003c/li\u003e\n \u003cli\u003eJuric-Sekhar, G., et al., \u003cem\u003eCytomegalovirus (CMV) in gastrointestinal mucosal biopsies: should a pathologist perform CMV immunohistochemistry if the clinician requests it?\u003c/em\u003e Hum Pathol, 2017. \u003cstrong\u003e60\u003c/strong\u003e: p.
11\u0026ndash;15.\u003c/li\u003e\n \u003cli\u003ePost, C.S., et al., \u003cem\u003eUtility of Machine Learning to Detect Cytomegalovirus in Digital Hematoxylin and Eosin-Stained Slides.\u003c/em\u003e Lab Invest, 2023. \u003cstrong\u003e103\u003c/strong\u003e(10): p. 100225.\u003c/li\u003e\n \u003cli\u003eLee, J.S., et al., \u003cem\u003eMachine learning approach for differentiating cytomegalovirus esophagitis from herpes simplex virus esophagitis.\u003c/em\u003e Sci Rep, 2021. \u003cstrong\u003e11\u003c/strong\u003e(1): p. 3672.\u003c/li\u003e\n \u003cli\u003eKim, J.H., et al., \u003cem\u003eEnhancing the Predictions of Cytomegalovirus Infection in Severe Ulcerative Colitis Using a Deep Learning Ensemble Model: Development and Validation Study.\u003c/em\u003e JMIR Med Inform, 2025. \u003cstrong\u003e13\u003c/strong\u003e: p. e64987.\u003c/li\u003e\n \u003cli\u003eJha, D., et al., \u003cem\u003eGastroVision: A Multi-class Endoscopy Image Dataset for Computer Aided Gastrointestinal Disease Detection.\u003c/em\u003e 2023.\u003c/li\u003e\n \u003cli\u003eBorgli, H., et al., \u003cem\u003eHyperKvasir, a comprehensive multi-class image and video dataset for gastrointestinal endoscopy.\u003c/em\u003e Sci Data, 2020. \u003cstrong\u003e7\u003c/strong\u003e(1): p. 283.\u003c/li\u003e\n
 \u003cli\u003eYang, J., et al., \u003cem\u003eFocal Modulation Networks.\u003c/em\u003e Advances in Neural Information Processing Systems, 2022.\u003c/li\u003e\n \u003cli\u003eXie, Z., et al., \u003cem\u003eSimMIM: a Simple Framework for Masked Image Modeling.\u003c/em\u003e\u003c/li\u003e\n \u003cli\u003eAkiba, T., et al., \u003cem\u003eOptuna: A Next-generation Hyperparameter Optimization Framework.\u003c/em\u003e 2019.\u003c/li\u003e\n \u003cli\u003eWatanabe, S., \u003cem\u003eTree-Structured Parzen Estimator: Understanding Its Algorithm Components and Their Roles for Better Empirical Performance.\u003c/em\u003e 2025.\u003c/li\u003e\n \u003cli\u003eSelvaraju, R.R., et al., \u003cem\u003eGrad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization.\u003c/em\u003e 2019.\u003c/li\u003e\n \u003cli\u003eEsteva, A., et al., \u003cem\u003eA guide to deep learning in healthcare.\u003c/em\u003e Nat Med, 2019. \u003cstrong\u003e25\u003c/strong\u003e(1): p. 24\u0026ndash;29.\u003c/li\u003e\n \u003cli\u003eMessmann, H., et al., \u003cem\u003eExpected value of artificial intelligence in gastrointestinal endoscopy: European Society of Gastrointestinal Endoscopy (ESGE) Position Statement.\u003c/em\u003e Endoscopy, 2022. \u003cstrong\u003e54\u003c/strong\u003e(12): p. 1211\u0026ndash;1231.\u003c/li\u003e\n \u003cli\u003eGoncalves, W.G.E., et al., \u003cem\u003eDeepHP: A New Gastric Mucosa Histopathology Dataset for Helicobacter pylori Infection Diagnosis.\u003c/em\u003e Int J Mol Sci, 2022. \u003cstrong\u003e23\u003c/strong\u003e(23).\u003c/li\u003e\n \u003cli\u003eGong, E.J., C.S. Bang, and J.J. Lee, \u003cem\u003eComputer-aided diagnosis in real-time endoscopy for all stages of gastric carcinogenesis: Development and validation study.\u003c/em\u003e United European Gastroenterol J, 2024. \u003cstrong\u003e12\u003c/strong\u003e(4): p.
487\u0026ndash;495.\u003c/li\u003e\n \u003cli\u003eCao, J.S., et al., \u003cem\u003eArtificial intelligence in gastroenterology and hepatology: Status and challenges.\u003c/em\u003e World J Gastroenterol, 2021. \u003cstrong\u003e27\u003c/strong\u003e(16): p. 1664\u0026ndash;1690.\u003c/li\u003e\n\u003c/ol\u003e"},{"header":"Tables","content":"\u003cp\u003e\u003cstrong\u003eTable 1. Number of images for cross-validation and hold-out sets.\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"100%\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 12px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 10px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCMV\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eHSV\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTotal\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd rowspan=\"10\" style=\"width: 12px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCross-validation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" rowspan=\"2\" style=\"width: 10px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFold 1\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTraining\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e60\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e225\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" 
style=\"width: 29px;\"\u003e\n \u003cp\u003e285\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eValidation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e55\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e84\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" rowspan=\"2\" style=\"width: 10px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFold 2\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTraining\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e81\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e224\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e305\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eValidation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e56\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e64\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" rowspan=\"2\" style=\"width: 10px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFold 3\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003eTraining\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e68\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e237\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e305\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eValidation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e21\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e43\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e64\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" rowspan=\"2\" style=\"width: 10px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFold 4\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTraining\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e78\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e221\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e299\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eValidation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e59\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n 
\u003cp\u003e70\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" rowspan=\"2\" style=\"width: 10px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eFold 5\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTraining\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e69\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e213\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e282\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 20px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eValidation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e67\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e87\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"3\" style=\"width: 43px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eHold-out\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 14px;\"\u003e\n \u003cp\u003e25\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 12px;\"\u003e\n \u003cp\u003e76\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 29px;\"\u003e\n \u003cp\u003e101\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eAbbreviations: CMV, cytomegalovirus; HSV, herpes simplex virus.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 2.
The number of images in GI classes of HyperKvasir and GastroVision\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"100%\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 27px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSource\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eClass\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eImages\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" rowspan=\"10\" style=\"width: 27px;\"\u003e\n \u003cp\u003eGastroVision\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eNormal Stomach\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e969\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003ePylorus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e393\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eGE Junction Normal Z-line\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e329\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eDuodenal Bulb\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e205\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eNormal Esophagus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n 
\u003cp\u003e139\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eEsophagitis\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e107\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eBarrett\u0026apos;s Esophagus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e95\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eGastric Polyps\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e65\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eEsophageal Varices\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e7\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eUlcer\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" rowspan=\"6\" style=\"width: 27px;\"\u003e\n \u003cp\u003eHyperKvasir\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003ePylorus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e999\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eZ-line\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e932\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width:
28px;\"\u003e\n \u003cp\u003eRetroflex Stomach\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e764\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eEsophagitis\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e663\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eBarrett\u0026apos;s Esophagus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e94\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 28px;\"\u003e\n \u003cp\u003eUnlabeled\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 43px;\"\u003e\n \u003cp\u003e99,417\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003e\u003cstrong\u003eTable 3.
Baseline characteristics for the patients with CMV and HSV.\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"620\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003e\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCMV\u003cbr\u003e\u0026nbsp;(N = 43)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eHSV\u003cbr\u003e\u0026nbsp;(N = 127)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ep-value\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAge\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e58.1 \u0026plusmn; 14.5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e60.4 \u0026plusmn; 11.9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.296\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSex\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e1.000\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003eMale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e30 (69.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 
155px;\"\u003e\n \u003cp\u003e87 (68.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003eFemale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e13 (30.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e40 (31.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eUpper GI cancer\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e11 (25.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e19 (15.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.178\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eOther cancer\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e22 (51.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e71 (55.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.717\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eInflammatory disease\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e3 (7.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e8 (6.3%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e1.000\u003c/p\u003e\n 
\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAutoimmune disease\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e2 (4.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e7 (5.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e1.000\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMetabolic disease\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e11 (25.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e37 (29.1%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.802\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eCardiovascular disease\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e15 (34.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e45 (35.4%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e1.000\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eInfectious disease\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e13 (30.2%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e33 (26.0%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.731\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n 
\u003cp\u003e\u003cstrong\u003eOther disease\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e18 (41.9%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e70 (55.1%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.184\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eTransplantation\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e14 (32.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e42 (33.1%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e1.000\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eMortality within 1 year after CMV/HSV diagnosis\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e0.060*\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u0026nbsp;Unknown\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e2 (4.7%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e16 (12.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u0026nbsp;No\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e21 
(48.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e77 (60.6%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd style=\"width: 180px;\"\u003e\n \u003cp\u003e\u0026nbsp;Yes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 130px;\"\u003e\n \u003cp\u003e20 (46.5%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e34 (26.8%)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd style=\"width: 155px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eContinuous variables are presented as mean \u0026plusmn; standard deviation and categorical variables as N (%). Abbreviations: CMV, cytomegalovirus; HSV, herpes simplex virus; GI, gastrointestinal. *The statistical test for mortality was conducted excluding the \u0026lsquo;Unknown\u0026rsquo; group.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTable 4.
Model performances with a hold-out set\u003c/strong\u003e\u003c/p\u003e\n\u003ctable border=\"1\" cellspacing=\"0\" cellpadding=\"0\" width=\"602\"\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\u003cbr\u003e\u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAUROC\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eAccuracy\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSensitivity\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eSpecificity\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e\u003cstrong\u003eF1 score\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e\u003cstrong\u003ePrecision\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eCNN\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.660\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.723\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.776\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.560\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.808\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.843\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 
151px;\"\u003e\n \u003cp\u003eFocalNet\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+HPO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.701\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.653\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.605\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.800\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.724\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.902\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eFocalNet\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+Pretrained\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+HPO\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.734\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.584\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.461\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.960\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.625\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.972\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eFocalNet\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+Pretrained\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+HPO\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+Curriculum learning\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n 
\u003cp\u003e0.783\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.772\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.776\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.760\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.837\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.908\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eGNFM\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+Fine-tuning\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.747\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.663\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.605\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.840\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.730\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.920\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eGNFM\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+Fine-tuning\u003c/p\u003e\n \u003cp\u003e\u0026nbsp;+Curriculum learning\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.774\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.782\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.842\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" 
style=\"width: 85px;\"\u003e\n \u003cp\u003e0.600\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.853\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.865\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist\u0026apos;s consensus\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.737\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.798\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.798\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.542\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.868\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.857\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.596\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.560\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.708\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.677\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.857\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 2\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.758\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.867\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.417\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.844\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.823\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.444\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.280\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.958\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.433\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.955\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.636\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.733\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.333\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.753\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.775\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.747\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.800\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.583\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.828\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.857\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 6\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.707\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.867\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.208\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.818\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.774\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 7\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.697\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.747\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.542\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.789\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.836\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 8\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.545\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.440\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.875\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.595\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.917\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eEndoscopist 9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.535\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.667\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.125\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.685\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.704\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd nowrap=\"\" style=\"width: 151px;\"\u003e\n \u003cp\u003eGPT-4o\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 76px;\"\u003e\n \u003cp\u003e0.485\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.500\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 85px;\"\u003e\n \u003cp\u003e0.440\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 54px;\"\u003e\n \u003cp\u003e0.517\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd nowrap=\"\" style=\"width: 75px;\"\u003e\n \u003cp\u003e0.535\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n\u003c/table\u003e\n\u003cp\u003eAbbreviations: CNN, convolutional neural network; GNFM, GastroNet-5M-pretrained foundation model; HPO, hyperparameter optimization; GPT, generative pretrained transformer; AUROC, area under the receiver operating characteristic curve\u003c/p\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific 
Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Cytomegalovirus esophagitis, Herpes simplex virus esophagitis, Deep learning, Endoscopic imaging, Computer-aided diagnosis","lastPublishedDoi":"10.21203/rs.3.rs-9113001/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-9113001/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eBackground: Cytomegalovirus (CMV) and herpes simplex virus (HSV) are the most common causes of infectious esophagitis in immuno-compromised patients. However, their endoscopic features frequently overlap, making real-time etiologic differentiation difficult and often requiring delayed confirmation by immunohistochemistry.\u003c/p\u003e\n\u003cp\u003eMethods: We developed and validated a deep learning model to distinguish CMV from HSV esophagitis using endoscopic images from biopsy-proven cases at a tertiary referral center. The model was trained with domain-specific pretraining and a curriculum learning strategy that sequentially introduced cases according to diagnostic difficulty, mimicking clinical learning processes. Diagnostic performance was evaluated using an independent test set and compared with experienced endoscopists.\u003c/p\u003e\n\u003cp\u003eResults: The curriculum learning–based model demonstrated improved classification performance over conventional training approaches, achieving an AUROC of 0.783 (95% confidence interval 0.680–0.885). Its sensitivity and specificity were comparable to those of expert endoscopists. Model visualization showed attention to clinically relevant mucosal abnormalities.\u003c/p\u003e\n\u003cp\u003eConclusion: Deep learning analysis of routine endoscopic images can assist in differentiating CMV and HSV esophagitis. 
A curriculum learning strategy may enhance clinical applicability by improving performance in visually ambiguous conditions, potentially supporting earlier therapeutic decision-making while awaiting histopathologic confirmation.\u003c/p\u003e","manuscriptTitle":"Development and validation of a deep learning model for differentiating cytomegalovirus and herpes simplex virus esophagitis using endoscopic images","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2026-05-04 06:43:48","doi":"10.21203/rs.3.rs-9113001/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"reviewerAgreed","content":"261188718450729335144126738487169720576","date":"2026-04-26T14:23:47+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2026-04-21T11:30:09+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2026-04-16T19:05:28+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2026-03-14T14:15:18+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2026-03-14T14:14:39+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2026-03-13T09:28:07+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"eb5e2504-57d2-4fda-bb0a-78841ff1fafd","owner":[],"postedDate":"May 4th, 
2026","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"under-review","subjectAreas":[{"id":67029269,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":67029270,"name":"Health sciences/Diseases"},{"id":67029271,"name":"Health sciences/Gastroenterology"},{"id":67029272,"name":"Health sciences/Medical research"}],"tags":[],"updatedAt":"2026-05-04T06:43:48+00:00","versionOfRecord":[],"versionCreatedAt":"2026-05-04 06:43:48","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-9113001","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-9113001","identity":"rs-9113001","version":["v1"]},"buildId":"XKTyCvWXoU3ODBz1xrDgd","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
