Prediction of the gastric precancerous risk based on deep learning of multimodal medical images

doi:10.21203/rs.3.rs-4747833/v1

2024 · doi:10.21203/rs.3.rs-4747833/v1

preprint OA: closed CC-BY-4.0

Research Article

Changzheng Ma, Peng Zhang, Shiyu Du, Shao Li

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-4747833/v1

This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

Effective early warning of diverse gastritis lesions, including precancerous lesions of gastric cancer (PLGC) and Non-PLGC, and of their progression risks is pivotal for the early prevention of gastric cancer. An attention-based model (Attention-GT) was constructed. For the first time, it integrated multimodal features, namely gastroscopic images, tongue images, and clinicopathological indicators (age, gender, Hp), to assist in distinguishing diverse gastritis lesions and progression risks. A longitudinal cohort of 384 participants with gastritis (206 Non-PLGC and 178 PLGC) was constructed.
These two baseline groups were each subdivided into progressive (Pro) and Non-Pro groups based on a mean follow-up of 3.3 years. The Attention-GT model performed well in distinguishing diverse gastritis lesions and progression risks. The AUC of Attention-GT in distinguishing PLGC was 0.83, significantly higher than that of clinicopathological indicators (AUC = 0.72, p < 0.01). Importantly, for patients with Non-PLGC baseline lesions, the AUC of Attention-GT in distinguishing the Pro group was 0.84, significantly higher than that of clinicopathological indicators (AUC = 0.67, p < 0.01), demonstrating the value of fusing gastroscopic and tongue images in predicting the progression risk of gastritis. Finally, morphological features related to diverse gastritis lesions and to progression risk were identified in both gastroscopic and tongue images through interpretability analysis. Collectively, our study demonstrates the value of integrating multimodal medical image data in assisting the prediction of diverse gastritis lesions and progression risks, paving a new way for early gastric cancer risk prediction.

Subject: Gastroenterology & Hepatology

Keywords: gastric precancerous diseases, progression prediction, gastroscopic images, multimodal fusion, deep learning

Introduction

Gastric cancer (GC) ranks as the fourth leading cause of cancer-related deaths globally [1]. Its development is often preceded by precursor lesions such as atrophic gastritis, intestinal metaplasia (IM), and dysplasia, with the incidence of GC increasing from 0.1% to 6% as these lesions progress [2–6]. The large inter-individual difference in the risk of progression poses a challenge to the prevention and control of gastric cancer. Early prediction of diverse gastritis lesions and their progression risks in patients is crucial for GC prevention [7–9].
Recent studies highlight the significance of multimodal and multi-factor elements in the development of precancerous lesions of gastric cancer (PLGC), including IM and dysplasia [10–12]. Among various diagnostic tools, gastroscopic and tongue images have emerged as effective indicators for GC diagnosis and prevention [13, 14]. Gastroscopic images are pivotal for accurate diagnosis and precise lesion localization in gastric diseases, while tongue image features, such as color, shape, and coating thickness, are increasingly recognized for their diagnostic value in line with Traditional Chinese Medicine (TCM) practices [15–17]. However, the extent to which the fusion of gastroscopic and tongue images enhances gastric disease management, particularly the prediction of diverse gastritis lesions and their progression risks, remains unclear. Computational methods that integrate these multimodal images for disease prediction are therefore needed.

Advancements in artificial intelligence (AI), particularly deep learning models, have shown promise in precise cancer screening and treatment [18, 19]. These models effectively decode large-scale medical images into quantitative features, with studies suggesting that morphological features from multimodal images are complementary [20]. Multimodal learning is thus becoming one of the most challenging and attractive directions in medical image analysis with AI models [21–23]. Attention-based models, which focus on critical information and filter out irrelevant data, have become increasingly relevant in multimodal image data fusion [24–26]. These approaches often outperform unimodal methods in screening, risk stratification, and treatment, indicating the potential of attention-based AI models in integrating multimodal information for disease prediction [26–28].
In our study, we developed an intelligent model integrating gastroscopic images, tongue images, and clinicopathological indicators to predict diverse gastritis lesions and their progression risks, evaluated it on a longitudinal cohort of 384 patients, and identified the morphological features involved and their associations. Our research contributes to the development of AI-based methodologies for early GC risk screening.

Methods

Data collection and grouping

The study was conducted at China-Japan Friendship Hospital between 2015 and 2022. The experimental protocol was designed following the ethical guidelines of the Declaration of Helsinki and was approved by the Human Ethics Committee of the Institutional Review Board of China-Japan Friendship Hospital. Patients were enrolled according to the criteria in Supplementary Table 1.

According to the Correa model, the evolution of GC comprises five stepwise stages: superficial gastritis, atrophic gastritis, IM, dysplasia, and GC [2]. Among them, PLGC refers to pathological findings of IM and dysplasia, while Non-PLGC encompasses superficial gastritis and atrophic gastritis [10, 11]. Patients were divided into PLGC and Non-PLGC groups according to baseline pathological findings, and each group was further divided into progressive (Pro) and Non-Pro groups after a mean follow-up of 3.3 years. Patients in the Pro group had a follow-up pathological stage higher than baseline, whereas those in the Non-Pro group had a follow-up pathological stage lower than or equal to baseline.

The follow-up cohort comprised a total of 384 participants (Supplementary Table 2). Among these participants, 178 were enrolled with a baseline diagnosis of PLGC and the others with Non-PLGC (Fig. 1).
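The grouping rules above can be written as a short sketch. It assumes an ordinal encoding of the Correa stages; the `Stage` enum, the two function names, and the handling of a missing follow-up are illustrative choices, not taken from the study's code.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Correa stages in the order given in the text (assumed ordinal scale)."""
    SUPERFICIAL_GASTRITIS = 0
    ATROPHIC_GASTRITIS = 1
    INTESTINAL_METAPLASIA = 2
    DYSPLASIA = 3
    GASTRIC_CANCER = 4


def baseline_group(baseline: Stage) -> str:
    """PLGC = IM or dysplasia; Non-PLGC = superficial or atrophic gastritis."""
    if baseline in (Stage.INTESTINAL_METAPLASIA, Stage.DYSPLASIA):
        return "PLGC"
    if baseline in (Stage.SUPERFICIAL_GASTRITIS, Stage.ATROPHIC_GASTRITIS):
        return "Non-PLGC"
    raise ValueError("baseline stage is outside the gastritis cohort")


def progression_group(baseline: Stage, follow_up: Optional[Stage]) -> Optional[str]:
    """Pro if the follow-up stage is higher than baseline, otherwise Non-Pro.

    Returns None when no follow-up pathology is available (assumption).
    """
    if follow_up is None:
        return None
    return "Pro" if follow_up > baseline else "Non-Pro"


# Hypothetical patients: (baseline stage, follow-up stage or None).
patients = [
    (Stage.ATROPHIC_GASTRITIS, Stage.INTESTINAL_METAPLASIA),
    (Stage.INTESTINAL_METAPLASIA, Stage.INTESTINAL_METAPLASIA),
    (Stage.SUPERFICIAL_GASTRITIS, None),
]
labels = [(baseline_group(b), progression_group(b, f)) for b, f in patients]
```

Here the first patient is labeled Non-PLGC and Pro, the second PLGC and Non-Pro, and the third has no progression label because the follow-up stage is missing.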
Subsequently, 306 of the 384 participants remained after the follow-up survey, which had an average interval of 3.3 years; 112 of them were categorized into the Pro group and the others into the Non-Pro group (Supplementary Table 3).

Gastroscopy and histological examination

Gastroscopic examinations were performed by two experienced gastroenterologists using video endoscopes. The biopsy samples obtained during gastroscopy were evaluated according to the criteria of the Updated Sydney System [2, 29]. To determine the infection status of Helicobacter pylori (Hp), an enzyme-linked immunosorbent assay was performed on plasma samples to detect Hp-specific IgG antibodies [30, 31].

Tongue image acquisition and pre-processing

A tongue image collector was used to capture images of the tongue. Before imaging, the patient was asked to rinse the mouth with water to avoid affecting the natural color of the tongue and coating. During imaging, the patient kept the head fixed in the shooting area and extended the tongue without moving. The tongue image collector ensured consistent lighting, shooting angles, and distances, giving clear and undistorted images. After imaging, a TCM doctor examined the pictures to ensure they clearly displayed the entire tongue, including the tip, edges, and root. This process maximized the consistency of the tongue images and limited the influence of environmental factors and human error.

Development of the Attention-GT model

To address the challenges of multimodal fusion in this study, the Attention-GT (Gastroscopic and Tongue) model based on the attention mechanism was constructed (Fig. 2). Compared to traditional deep learning models, the Attention-GT model can enhance feature representation and adaptively focus on and integrate information from different modalities, thereby improving the model's understanding and representation of multimodal data.
Furthermore, it can generate visual attention maps, improving interpretability.

First, in the Attention-GT model, a tongue image and multiple gastroscopic images from a single patient were transformed into feature vectors by a pre-trained ResNet50 model [32]. Each image is transformed into an independent vector in the feature matrix. Subsequently, these feature vectors were vertically concatenated with the Class token vector to form a matrix F (Eq. 1). The Class token vector is a trainable vector that provides a summary representation of the fused features for making predictions.

$$F = \text{Concat}\left(\text{Class}_{\text{token}}, \text{Features}_{gas}, \text{Feature}_{tongue}, \dots\right) \tag{1}$$

Next, following the context gating (CG) method, the preprocessed feature matrix F is fed into an attention-based module to aggregate high-level representations from low-level representations [33, 34]. In Eq. 2, the F matrix passes through two separate fully connected layers (W1, W2), activated by the Sigmoid and Tanh functions respectively. A Hadamard product of the two parts gives a T matrix. In Eq. 3, the T matrix passes through a fully connected layer (W3) and is activated by the SoftMax function to obtain a matrix representing the weights of the image feature vectors. This matrix is multiplied by the F matrix to obtain the Context (C), a vector representing the weighted sum of all the features. In Eq. 4, C is passed through a fully connected layer (W4) and then combined with the F matrix by a Hadamard product, resulting in a new F matrix. In this process, the fully connected layers are independently trainable, and the dropout rate is set at 0.2.
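The gating steps just described (Eqs. 1 to 4) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: random vectors stand in for the ResNet50 features, the weights are untrained, dropout is omitted, and the row-vector convention (rows of F multiplied by each weight matrix) together with all sizes and names is an assumption made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply matrices stored as lists of rows: (m x k) times (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def context_gating(class_token, image_features, w1, w2, w3, w4):
    """One pass of Eqs. 1-4 without dropout. Returns (new F, attention weights)."""
    # Eq. 1: stack the class token on top of the per-image feature vectors.
    f = [class_token] + image_features                     # (n + 1) x d
    # Eq. 2: two linear maps, passed through Sigmoid and Tanh, multiplied elementwise.
    lin_a, lin_b = matmul(f, w1), matmul(f, w2)            # (n + 1) x h each
    t = [[sigmoid(a) * math.tanh(b) for a, b in zip(ra, rb)]
         for ra, rb in zip(lin_a, lin_b)]
    # Eq. 3: one score per row, SoftMax over rows, weighted sum of the rows of F.
    weights = softmax([row[0] for row in matmul(t, w3)])   # length n + 1
    dim = len(class_token)
    context = [sum(w * row[j] for w, row in zip(weights, f)) for j in range(dim)]
    # Eq. 4: project the context and gate every row of F with it.
    gate = matmul([context], w4)[0]                        # length d
    f_new = [[g * x for g, x in zip(gate, row)] for row in f]
    return f_new, weights


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


d, h = 8, 4                            # toy sizes chosen for readability
class_token = rand_matrix(1, d)[0]
gastroscopic = rand_matrix(3, d)       # stand-ins for three gastroscopic images
tongue = rand_matrix(1, d)             # stand-in for one tongue image
w1, w2 = rand_matrix(d, h), rand_matrix(d, h)
w3, w4 = rand_matrix(h, 1), rand_matrix(d, d)

fused, weights = context_gating(class_token, gastroscopic + tongue, w1, w2, w3, w4)
class_row = fused[0]                   # the class-token row used for prediction
```

The `weights` list holds one attention weight per row of F (class token plus four images) and sums to one, which is what makes per-image contributions inspectable.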
These processes allow for nonlinear interactions between the features, use activation functions to determine which input features to retain, and help to enhance and concentrate the relevant information within the features.

$$T = \text{Sigmoid}\left(W_1 * F\right) \circ \text{Tanh}\left(W_2 * F\right) \tag{2}$$

$$C = \text{SoftMax}\left(W_3 * T\right) * F \tag{3}$$

$$F = W_4 * C \circ F \tag{4}$$

Finally, in Eq. 5, the Class token vector in F passes through a Multi-Layer Perceptron (MLP) to obtain the final output.

$$\text{Class} = \text{MLP}\left(\text{Class}_{\text{token}}\right) \tag{5}$$

Overall, the attention module is a key component of the multimodal fusion approach. It fuses features from different modalities and captures the internal correlation between gastroscopic images and tongue images, so as to achieve more accurate feature extraction and information mining and improve risk prediction.

Baseline model in model evaluation

In this study, by comparing the Attention-GT model with the baseline models, we verified that our model had a better multimodal fusion effect, thereby improving PLGC screening and risk prediction. The baseline models are intermediate and late fusion methods. These are two common multimodal fusion methods, characterized by fusion in the middle and late stages of feature extraction. The Attention-GT method is an early fusion method, which can better extract the correlation between the modalities. In addition, we verified that multimodal fusion can effectively improve PLGC screening and risk prediction by comparing multimodal fusion with unimodal features.
The unimodal features refer to clinicopathological indicators, tongue images, and gastroscopic images, respectively.

Statistical analysis and visualization

In this study, comparative experiments against multimodal fusion baselines and unimodal models were conducted to validate the effectiveness of the Attention-GT model. The multimodal fusion comparison refers to the comparison of the Attention-GT model with baseline models, such as intermediate and late fusion methods. The unimodal comparison involved comparing the different independent features. All statistical analyses in this study were conducted using Python (version 3.7.0). For model validation, a five-fold cross-validation strategy was employed. Model performance was assessed with accuracy, sensitivity, specificity, F1 score, receiver operating characteristic (ROC) curve, and area under the curve (AUC). Together, these metrics give a comprehensive assessment of the models' predictive performance. The visualization and interpretability of morphological features in gastroscopic and tongue images were achieved using Grad-CAM, with the focused areas of attention displayed as heatmaps [35].

Results

Improved performance in predicting PLGC

First, the prediction performance of the Attention-GT model for PLGC in terms of multimodal integration was evaluated. In addition, model interpretability analysis was conducted to dissect the PLGC-related morphological features from both gastroscopic and tongue images. ROC curve analysis revealed that Attention-GT achieved an AUC of 0.83 for PLGC prediction, significantly superior to other multimodal models (P < 0.05) (Fig. 3).
In detail, the Attention-GT model achieved an AUC of 0.83 and performed significantly better than the baseline multimodal fusion models described in the Methods section, namely the intermediate and late fusion methods (Fig. 3A). Similarly, it demonstrated significant advantages over baseline models using unimodal features, based on clinicopathological indicators, tongue images, and gastroscopic images, respectively (Fig. 3B). Accuracy, sensitivity, specificity, and F1 score are shown in Supplementary Table 4. Collectively, these results demonstrated that multimodal fusion by the Attention-GT model might be of great benefit for PLGC prediction.

Improved performance in predicting progression risks

Given the significant associations between pathological lesions and morphological features, we anticipated that integrating multimodal images could enhance the prediction of progression risks in our established follow-up cohort. As expected, among patients with Non-PLGC at baseline, Attention-GT achieved an AUC of 0.84 in distinguishing Pro patients from Non-Pro ones, significantly superior to other multimodal models (P < 0.05) (Fig. 4). This indicated the potential of Attention-GT in predicting pathological lesion progression. In detail, the Attention-GT model performed significantly better than the baseline multimodal fusion models (Fig. 4A). Similarly, it demonstrated significant advantages over baseline models using unimodal features (Fig. 4B). Accuracy, sensitivity, specificity, and F1 score are given in Supplementary Table 5. Collectively, these results demonstrated that multimodal fusion by the Attention-GT model might be of great benefit for predicting disease progression among Non-PLGC patients.
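The metrics reported in this section can be computed from true labels and predicted scores as in the following sketch. The 0.5 decision threshold, the function names, and the toy scores are assumptions for illustration. The AUC uses the standard rank (Mann-Whitney) formulation, which may differ in detail from the routine the authors used.

```python
def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity and F1 at a fixed score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(labels, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = Pro, 0 = Non-Pro; the scores are made up.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
metrics = confusion_metrics(labels, scores)
area = auc(labels, scores)
```

With five-fold cross-validation, a function like this would be applied to the held-out predictions of each fold and the results pooled or averaged.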
Interestingly, among patients with PLGC at baseline, gastroscopic images were significantly effective in predicting progression risks (Supplementary Fig. 1). In detail, the Attention-GT model based on gastroscopic images achieved an AUC of 0.74, which was significantly superior (P < 0.05) to the baseline models based on clinicopathological indicators. Accuracy, sensitivity, specificity, and F1 score are given in Supplementary Table 6. However, the AUC of tongue images was 0.54, showing no significant predictive effect, so the fusion of tongue and gastroscopic images did not perform well. Overall, in the investigation of disease progression among PLGC patients, our findings indicated that gastroscopic images outperformed the clinicopathological indicators and tongue images for prediction of progression risks. In general, the fusion of gastroscopic and tongue images can significantly improve the prediction of progression risks in PLGC patients, and our proposed Attention-GT model performed better in modal fusion than the baseline models.

Morphological risk features from gastroscopic and tongue images

Morphological features involved in the prediction of PLGC from both gastroscopic and tongue images were also dissected. As shown in Fig. 5, the output attention heat maps highlighted the regions of potential risk. The main features of the identified high-risk areas in gastroscopic images were small gray-white raised patches surrounded by pink areas, with an irregular and uneven surface, consistent with the foci labeled by experienced experts [36]. The key features of tongue images were primarily observed in the central region of the tongue, including morphological and coating characteristics, which were consistent with our previous findings (Fig. 5) [14, 37]. These results might provide objective morphological indicators for the accurate clinical prediction of PLGC.
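The study reports using Grad-CAM for its heat maps. A minimal sketch of the standard Grad-CAM computation follows, assuming the activations of a convolutional layer and the gradients of the class score with respect to them have already been extracted from the network. The function name and the toy 2×2 values are illustrative only.

```python
def grad_cam(activations, gradients):
    """Grad-CAM heat map from per-channel activations and gradients.

    Both inputs are nested lists shaped [channels][height][width].
    """
    n_ch = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weight = spatial average of that channel's gradients.
    alphas = [sum(sum(row) for row in grad) / (height * width) for grad in gradients]
    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = [[max(0.0, sum(alphas[k] * activations[k][i][j] for k in range(n_ch)))
            for j in range(width)]
           for i in range(height)]
    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy example: channel 0 supports the class, channel 1 argues against it.
activations = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
heatmap = grad_cam(activations, gradients)
```

In this toy case only the top-left cell stays active, because the bottom-right activation belongs to a channel with negative weight and is removed by the ReLU.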
Associations of morphological features involved in prediction of PLGC and their progression risks

Leveraging the advantage of the Attention-GT model in attention visualization, the associations of morphological features involved in predicting diverse gastritis lesions and their progression risks were further investigated through the attention heatmaps on the original images. For gastroscopic images, the attention regions for the two tasks exhibited some similarities but also slight differences (Fig. 6). In Fig. 6A, the attention areas highly overlapped, while the attention areas of high progression risk had a relatively broader range. They primarily consisted of small gray-white raised patches surrounded by pink areas, with an irregular and uneven surface, in line with the feature descriptions in the literature [38]. This suggested that the features of the two tasks were correlated. In Fig. 6B, by contrast, the attention areas showed no significant correlation. The primary feature observed in attention areas of high progression risk was the presence of irregular white patches, similar to the gastroscopic features of patients with atypical hyperplasia in clinical diagnosis [36, 39]. The differences in attention areas between the two tasks suggested that, compared with the screening of PLGC, the risk prediction of PLGC has specific risk characteristics. For disease progression risk assessment, it may be advantageous to collect samples from multiple points during gastroscopy to enhance the effectiveness of risk prediction. These findings might provide novel insights and lay the foundation for constructing more accurate and reliable prediction models.
Multimodal morphological feature associations between gastroscopic and tongue images

Although the morphological features from unimodal tongue or gastroscopic images have been extensively investigated, the associations between the two modalities remain unclear. Thus, the associations of morphological features between gastroscopic and tongue images were investigated for each of the two tasks. Correlation analysis for paired gastroscopic and tongue images was conducted on the prediction of progression risks for Non-PLGC patients (Fig. 7). The scatterplot illustrated the correlation between paired gastroscopic and tongue image prediction scores for different patients. Blue dots represented Pro patients, while orange dots represented Non-Pro patients. Numbers indicated Pro prediction scores based on gastroscopic or tongue images. Patients in the red circle were predicted as negative by tongue images and positive by gastroscopic images, and those in the green circle were the opposite. The three sets of heatmaps represented patient samples within the circles of the three colors. Taking one of the blue dots within the red circle as an example, the prediction score based on unimodal features from gastroscopic images was 0.99, and the corresponding heat map indicated clear morphological feature regions. Conversely, the prediction score based on unimodal features from tongue images was 0.09, and the heat map did not exhibit discernible feature regions. These findings further emphasized the distinct contributions of gastroscopic and tongue images in the prediction of progression risks. The correlation analysis for PLGC prediction is shown in Supplementary Fig. 2. This analysis provided evidence for the improvement of prediction accuracy through multimodal fusion of gastroscopic and tongue images.
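A paired-score analysis of this kind can be sketched as follows. Only the first patient's scores (0.99 from gastroscopy, 0.09 from tongue) come from the text; the other values, the 0.5 cut-off, the use of Pearson correlation, and the function names are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (norm_x * norm_y)


def discordant_pairs(gastro_scores, tongue_scores, threshold=0.5):
    """Indices of patients on whom the two unimodal predictions disagree."""
    gastro_only, tongue_only = [], []
    for i, (g, t) in enumerate(zip(gastro_scores, tongue_scores)):
        if g >= threshold and t < threshold:
            gastro_only.append(i)      # positive by gastroscopy, negative by tongue
        elif t >= threshold and g < threshold:
            tongue_only.append(i)      # positive by tongue, negative by gastroscopy
    return gastro_only, tongue_only


# Patient 0 uses the scores quoted in the text; the rest are hypothetical.
gastro = [0.99, 0.20, 0.85, 0.10]
tongue = [0.09, 0.80, 0.90, 0.15]
r = pearson(gastro, tongue)
gastro_only, tongue_only = discordant_pairs(gastro, tongue)
```

Patients in `gastro_only` correspond to the red-circle case described above and those in `tongue_only` to the green-circle case; such discordant patients are where a fused model has the most room to improve on either modality alone.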
The correlation between gastroscopic and tongue images from the TCM theory

Furthermore, we attempted to interpret the correlation between gastroscopic and tongue images from the TCM theory of cold and heat [40–43]. The concept of cold and heat is a core aspect of the TCM theoretical framework, used to describe the physiological and pathological states of the human body and reflecting the balance of Yin and Yang within. The appearance of the tongue is one of the key indicators for determining cold and heat conditions. A cold pattern is characterized by features such as a white, thick coating and a white tongue color. Conversely, a heat pattern is indicated by a thin coating and a red tongue color. Studying the correlation between tongue diagnosis and gastroscopy from the perspective of TCM's cold and heat concepts may provide new insights for medical practice.

Based on the TCM classification of tongue images into cold and heat categories, differences between the two groups of gastroscopic images were found (Fig. 8). Tongue images of patients with heat syndrome tended to exhibit a deeper red tongue color and thinner tongue coating compared to those with cold syndrome. Correspondingly, in the gastroscopic images of these patients, blood vessels appeared more prominent and abundant. However, these findings still require further in-depth research to establish their reliability. On the one hand, this provided supporting evidence for the correlation between the two types of images. On the other hand, it suggested that variations in cold and heat constitutions may correspond to different disease states in the stomach. Future research and medical practice can further investigate and consider these differences.

Discussion

The Attention-GT model, constructed on the attention module, demonstrated significant advantages over traditional methods in addressing the challenges of multimodal fusion in this study.
The main strength of this model lies in its enhanced feature representation, its adaptive focus on and integration of information from different modalities, and its interpretability and visualization capabilities. Because it can align and attend to information across modalities adaptively, the model also exhibited excellent scalability. In future research, the introduction of pathological images and molecular omics data could potentially improve the effectiveness of risk prediction.

The attention heat maps highlighted areas of the gastroscopic images deemed to be of potential high risk. These heat maps provided a visual reference for identifying regions of interest that may be associated with the risk of disease progression. They can serve as valuable tools for subsequent studies, offering insights into areas that require further investigation. Additionally, they can assist clinicians in diagnosis by providing visual cues for areas that may require closer examination.

The study of the correlation between gastroscopic and tongue images based on TCM's theory of cold and heat provided some new insights. By combining TCM theory with modern medical imaging, researchers can explore new perspectives and approaches to understanding diseases and patients' conditions. This integrated approach can offer more information and tools for medical practice. However, its reliability still needs further validation and in-depth research.

The limitations of this work lie primarily in the amount of data included. The number of participants is limited because of the long follow-up required for disease warning studies. The main research object in this study is medical images, and microscopic risk characteristics at the molecular level are not included.
We preliminarily found synergy between tongue images and gastroscopic images in the screening and early warning of stomach diseases, and tried to explain it from various perspectives. However, the current conclusions are not yet deep enough. Further exploration of the relevance of gastroscopic and tongue images in predicting disease progression, and of its potential mechanism, is of great significance. The scale of the experiment needs to be expanded. Including more participants may improve the training and prediction of the model, which is a focus of subsequent research. In addition, to further improve PLGC risk prediction, introducing molecular characteristics into the model is another focus of future research.

Conclusion

In this study, we investigated the use of gastroscopic and tongue images in predicting diverse gastritis lesions and their progression risks. The effectiveness of the proposed Attention-GT model, as well as of the fusion of these imaging modalities, was compared with the baseline models. Additionally, through interpretability analysis, we uncovered multimodal morphological features and their associations involved in the two tasks, providing potentially novel indicators for early GC prevention. Overall, the findings of this study suggested that the fusion of gastroscopic and tongue images using the Attention-GT model can enhance the prediction of diverse gastritis lesions and their progression risks.

Declarations

Ethics approval and consent to participate

The experimental protocol was established according to the ethical guidelines of the Helsinki Declaration and was approved by the Human Ethics Committee of the Institutional Review Board of China-Japan Friendship Hospital (protocol code 2023-KY-174).

Consent for publication

Not applicable.
Availability of data and materials The datasets generated and/or analyzed during the current study are not publicly available due to patient privacy but are available from the corresponding author upon reasonable request. Competing Interests The authors declare that they have no competing interests. Funding Funding for this study was provided by the National Natural Science Foundation of China, China [T2341008] References Sung, H.; Ferlay, J.; Siegel, R. L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F., Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 2021, 71 (3), 209-249. Dixon, M. F.; Genta, R. M.; Yardley, J. H.; Correa, P., Classification and grading of gastritis. The updated Sydney System. International Workshop on the Histopathology of Gastritis, Houston 1994. The American journal of surgical pathology 1996, 20 (10), 1161-1181. Piazuelo, M. B.; Bravo, L. E.; Mera, R. M.; Camargo, M. C.; Bravo, J. C.; Delgado, A. G.; Washington, M. K.; Rosero, A.; Garcia, L. S.; Realpe, J. L.; Cifuentes, S. P.; Morgan, D. R.; Peek, R. M., Jr.; Correa, P.; Wilson, K. T., The Colombian Chemoprevention Trial: 20-Year Follow-Up of a Cohort of Patients With Gastric Precancerous Lesions. Gastroenterology 2021, 160 (4), 1106-1117 e3. Kang, J. Y.; Finlayson, C.; Maxwell, J. D.; Neild, P., Risk of gastric carcinoma in patients with atrophic gastritis and intestinal metaplasia. Gut 2002, 51 (6), 899. de Vries, A. C.; van Grieken, N. C.; Looman, C. W.; Casparie, M. K.; de Vries, E.; Meijer, G. A.; Kuipers, E. J., Gastric cancer risk in patients with premalignant gastric lesions: a nationwide cohort study in the Netherlands. Gastroenterology 2008, 134 (4), 945-52. Rugge, M.; Meggio, A.; Pravadelli, C.; Barbareschi, M.; Fassan, M.; Gentilini, M.; Zorzi, M.; Pretis, G.; Graham, D. Y.; Genta, R. 
M., Gastritis staging in the endoscopic follow-up for the secondary prevention of gastric cancer: a 5-year prospective study of 1755 patients. Gut 2019, 68 (1), 11-17. Huang, H. L.; Leung, C. Y.; Saito, E.; Katanoda, K.; Hur, C.; Kong, C. Y.; Nomura, S.; Shibuya, K., Effect and cost-effectiveness of national gastric cancer screening in Japan: a microsimulation modeling study. BMC Med 2020, 18 (1), 257. Suh, Y. S.; Lee, J.; Woo, H.; Shin, D.; Kong, S. H.; Lee, H. J.; Shin, A.; Yang, H. K., National cancer screening program for gastric cancer in Korea: Nationwide treatment benefit and cost. Cancer 2020, 126 (9), 1929-1939. Zhang, P.; Wang, B.; Li, S., Network-based cancer precision prevention with artificial intelligence and multi-omics. Sci Bull (Beijing) 2023, 68 (12), 1219-1222. Schlemper, R. J.; Riddell, R. H.; Kato, Y.; Borchard, F.; Cooper, H. S.; Dawsey, S. M.; Dixon, M. F.; Fenoglio-Preiser, C. M.; Fléjou, J. F.; Geboes, K.; Hattori, T.; Hirota, T.; Itabashi, M.; Iwafuchi, M.; Iwashita, A.; Kim, Y. I.; Kirchner, T.; Klimpfinger, M.; Koike, M.; Lauwers, G. Y.; Lewin, K. J.; Oberhuber, G.; Offner, F.; Price, A. B.; Rubio, C. A.; Shimizu, M.; Shimoda, T.; Sipponen, P.; Solcia, E.; Stolte, M.; Watanabe, H.; Yamabe, H., The Vienna classification of gastrointestinal epithelial neoplasia. Gut 2000, 47 (2), 251-255. Song, H.; Ekheden, I. G.; Zheng, Z.; Ericsson, J.; Nyren, O.; Ye, W., Incidence of gastric cancer among patients with gastric precancerous lesions: observational cohort study in a low risk Western population. BMJ 2015, 351 , h3867. Tan, P.; Yeoh, K. G., Genetics and Molecular Pathogenesis of Gastric Adenocarcinoma. Gastroenterology 2015, 149 (5), 1153-1162 e3. Ma, L.; Su, X.; Ma, L.; Gao, X.; Sun, M., Deep learning for classification and localization of early gastric cancer in endoscopic images. Biomedical Signal Processing and Control 2023, 79 , 104200. 
14. Yuan, L.; Yang, L.; Zhang, S.; Xu, Z.; Qin, J.; Shi, Y.; Yu, P.; Wang, Y.; Bao, Z.; Xia, Y.; Sun, J.; He, W.; Chen, T.; Chen, X.; Hu, C.; Zhang, Y.; Dong, C.; Zhao, P.; Wang, Y.; Jiang, N.; Lv, B.; Xue, Y.; Jiao, B.; Gao, H.; Chai, K.; Li, J.; Wang, H.; Wang, X.; Guan, X.; Liu, X.; Zhao, G.; Zheng, Z.; Yan, J.; Yu, H.; Chen, L.; Ye, Z.; You, H.; Bao, Y.; Cheng, X.; Zhao, P.; Wang, L.; Zeng, W.; Tian, Y.; Chen, M.; You, Y.; Yuan, G.; Ruan, H.; Gao, X.; Xu, J.; Xu, H.; Du, L.; Zhang, S.; Fu, H.; Cheng, X., Development of a tongue image-based machine learning tool for the diagnosis of gastric cancer: a prospective multicentre clinical cohort study. EClinicalMedicine 2023, 57, 101834.
15. Shang, Z.; Du, Z. G.; Guan, B.; Ji, X. Y.; Chen, L. C.; Wang, Y. J.; Ma, Y., Correlation analysis between characteristics under gastroscope and image information of tongue in patients with chronic gastritis. J Tradit Chin Med 2022, 42 (1), 102-107.
16. Gholami, E.; Kamel Tabbakh, S.; Kheirabadi, M., Increasing the accuracy in the diagnosis of stomach cancer based on color and lint features of tongue. Biomedical Signal Processing and Control 2021, 69, 102782.
17. Li, S.; Wang, R.; Zhang, Y.; Zhang, X.; Layon, A. J.; Li, Y.; Chen, M., Symptom combinations associated with outcome and therapeutic effects in a cohort of cases with SARS. Am J Chin Med 2006, 34 (6), 937-47.
18. Liu, T.; Huang, J.; Liao, T.; Pu, R.; Liu, S.; Peng, Y., A Hybrid Deep Learning Model for Predicting Molecular Subtypes of Human Breast Cancer Using Multimodal Data. IRBM 2022, 43 (1), 62-74.
19. Chen, R. J.; Lu, M. Y.; Williamson, D. F. K.; Chen, T. Y.; Lipkova, J.; Noor, Z.; Shaban, M.; Shady, M.; Williams, M.; Joo, B.; Mahmood, F., Pan-cancer integrative histology-genomic analysis via multimodal deep learning. Cancer Cell 2022, 40 (8), 865-878.e6.
20. Sun, C.; Wang, A.; Zhou, Y.; Chen, P.; Wang, X.; Huang, J.; Gao, J.; Wang, X.; Shu, L.; Lu, J.; Dai, W.; Bu, Z.; Ji, J.; He, J., Spatially resolved multi-omics highlights cell-specific metabolic remodeling and interactions in gastric cancer. Nat Commun 2023, 14 (1), 2692.
21. Azam, M. A.; Khan, K. B.; Salahuddin, S.; Rehman, E.; Khan, S. A.; Khan, M. A.; Kadry, S.; Gandomi, A. H., A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics. Computers in Biology and Medicine 2022, 144, 105253.
22. Kline, A.; Wang, H.; Li, Y.; Dennis, S.; Hutch, M.; Xu, Z.; Wang, F.; Cheng, F.; Luo, Y., Multimodal machine learning in precision health: A scoping review. NPJ Digit Med 2022, 5 (1), 171.
23. Boehm, K. M.; Khosravi, P.; Vanguri, R.; Gao, J.; Shah, S. P., Harnessing multimodal data integration to advance precision oncology. Nat Rev Cancer 2022, 22 (2), 114-126.
24. Lara Ramírez, J.; Contreras, V.; Otálora Montenegro, J.; Müller, H.; González, F., Multimodal Latent Semantic Alignment for Automated Prostate Tissue Classification and Retrieval. 2020.
25. Yan, R.; Zhang, F.; Rao, X.; Lv, Z.; Li, J.; Zhang, L.; Liang, S.; Li, Y.; Ren, F.; Zheng, C.; Liang, J., Richer fusion network for breast cancer classification based on multimodal data. BMC Medical Informatics and Decision Making 2021, 21 (1), 134.
26. Ou, C.; Zhou, S.; Yang, R.; Jiang, W.; He, H.; Gan, W.; Chen, W.; Qin, X.; Luo, W.; Pi, X.; Li, J., A deep learning based multimodal fusion model for skin lesion diagnosis using smartphone collected clinical images and metadata. Front Surg 2022, 9, 1029991.
27. Singh, L. K.; Khanna, M.; Pooja, A novel multimodality based dual fusion integrated approach for efficient and early prediction of glaucoma. Biomedical Signal Processing and Control 2022, 73, 103468.
28. Cai, Q.; Wang, H.; Li, Z.; Liu, X., A Survey on Multimodal Data-Driven Smart Healthcare Systems: Approaches and Applications. IEEE Access 2019, PP, 1-1.
29. You, W. C.; Blot, W. J.; Li, J. Y.; Chang, Y. S.; Jin, M. L.; Kneller, R.; Zhang, L.; Han, Z. X.; Zeng, X. R.; Liu, W. D.; et al., Precancerous gastric lesions in a population at high risk of stomach cancer. Cancer Res 1993, 53 (6), 1317-21.
30. Zhang, L.; Blot, W. J.; You, W. C.; Chang, Y. S.; Kneller, R. W.; Jin, M. L.; Li, J. Y.; Zhao, L.; Liu, W. D.; Zhang, J. S.; Ma, J. L.; Samloff, I. M.; Correa, P.; Blaser, M. J.; Xu, G. W.; Fraumeni, J. F., Jr., Helicobacter pylori antibodies in relation to precancerous gastric lesions in a high-risk Chinese population. Cancer Epidemiol Biomarkers Prev 1996, 5 (8), 627-30.
31. Li, S.; Lu, A. P.; Zhang, L.; Li, Y. D., Anti-Helicobacter pylori immunoglobulin G (IgG) and IgA antibody responses and the value of clinical presentations in diagnosis of H. pylori infection in patients with precancerous lesions. World J Gastroenterol 2003, 9 (4), 755-8.
32. He, K.; Zhang, X.; Ren, S.; Sun, J., Deep Residual Learning for Image Recognition. IEEE 2016.
33. Lu, M. Y.; Williamson, D. F. K.; Chen, T. Y.; Chen, R. J.; Barbieri, M.; Mahmood, F., Data-efficient and weakly supervised computational pathology on whole-slide images. Nat Biomed Eng 2021, 5 (6), 555-570.
34. Miech, A.; Laptev, I.; Sivic, J., Learnable pooling with context gating for video classification. arXiv preprint arXiv:1706.06905, 2017.
35. Selvaraju, R. R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D., Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In 2017 IEEE International Conference on Computer Vision (ICCV), 22-29 Oct. 2017; pp 618-626.
36. Cummins, G.; Cox, B. F.; Ciuti, G.; Anbarasan, T.; Desmulliez, M. P. Y.; Cochran, S.; Steele, R.; Plevris, J. N.; Koulaouzidis, A., Gastrointestinal diagnosis using non-white light imaging capsule endoscopy. Nat Rev Gastroenterol Hepatol 2019, 16 (7), 429-447.
37. Ma, C.; Zhang, P.; Du, S.; Li, Y.; Li, S., Construction of Tongue Image-Based Machine Learning Model for Screening Patients with Gastric Precancerous Lesions. J Pers Med 2023, 13 (2), 271.
38. Young, E.; Philpott, H.; Singh, R., Endoscopic diagnosis and treatment of gastric dysplasia and early cancer: Current evidence and what the future may hold. World J Gastroenterol 2021, 27 (31), 5126-5151.
39. Hoffman, A.; Manner, H.; Rey, J. W.; Kiesslich, R., A guide to multimodal endoscopy imaging for gastrointestinal malignancy - an early indicator. Nat Rev Gastroenterol Hepatol 2017, 14 (7), 421-434.
40. Li, R.; Ma, T.; Gu, J.; Liang, X.; Li, S., Imbalanced network biomarkers for traditional Chinese medicine Syndrome in gastritis patients. Sci Rep 2013, 3, 1543.
41. Wang, Z. Y.; Wang, X.; Zhang, D. Y.; Hu, Y. J.; Li, S., [Traditional Chinese medicine network pharmacology: development in new era under guidance of network pharmacology evaluation method guidance]. Zhongguo Zhong Yao Za Zhi 2022, 47 (1), 7-17.
42. Zhou, W.; Yang, K.; Zeng, J.; Lai, X.; Wang, X.; Ji, C.; Li, Y.; Zhang, P.; Li, S., FordNet: Recommending traditional Chinese medicine formula via deep neural network integrating phenotype and molecule. Pharmacol Res 2021, 173, 105752.
43. Zhang, Y. Q.; Li, S., [Several advances in network pharmacology and modern research on traditional Chinese medicine]. Chinese Journal of Pharmacology and Toxicology 2015, 29 (6), 883-892. (In Chinese.)

Additional Declarations
The authors declare no competing interests.

Supplementary Files
appendix2.docx
Our growing team is made up of researchers and industry professionals working together to solve the most critical problems facing scientific publishing. Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-4747833","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":327535801,"identity":"3f129c2c-38ea-44ca-90bb-f024c0fed66d","order_by":0,"name":"Changzheng Ma","email":"","orcid":"","institution":"Institute of TCM-X/MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRist/Department of Automation, Tsinghua University, 100084 Beijing, China","correspondingAuthor":false,"prefix":"","firstName":"Changzheng","middleName":"","lastName":"Ma","suffix":""},{"id":327535802,"identity":"48ff0970-a33e-4db0-ab29-ece8a5900a2a","order_by":1,"name":"Peng Zhang","email":"","orcid":"","institution":"Institute of TCM-X/MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRist/Department of Automation, Tsinghua University, 100084 Beijing, China","correspondingAuthor":false,"prefix":"","firstName":"Peng","middleName":"","lastName":"Zhang","suffix":""},{"id":327535803,"identity":"879c5304-dd8b-4b22-9db2-7de9a5d68f70","order_by":2,"name":"Shiyu Du","email":"","orcid":"","institution":"Department of Gastroenterology, China-Japan Friendship Hospital, Chaoyang District, Beijing 100029, China","correspondingAuthor":false,"prefix":"","firstName":"Shiyu","middleName":"","lastName":"Du","suffix":""},{"id":327535804,"identity":"cde0b9b2-53aa-4860-bd5a-faf89aa80a01","order_by":3,"name":"Shao 
Li","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA7ElEQVRIiWNgGAWjYDACCSB+YGDDwMAMRiCQQISWBIM0CVK1MBwGUURqkZ/d/PBBQsH5Ovl23sOvC9vsGPjZcwwYfu7ArcXgzjFjgwSD2xIGh/nSrGe2JTNI9rwxYOw9g0eLRIKZBFgLM4+ZMW8bM4PBjRwDZsY2PA6bkf4NqOWchHwzWEs9gz0hLQw3ckC2HJBgOMxj/Ji37TDQXgJagM4oBvolWXLDYR4zZp5zx3kkzjwrONiL32EbH3z4Y8cv33/G+DNPWbUcf3vyxgc/8TkMCbCBIocHxDpAnAZgTH4gVuUoGAWjYBSMLAAAevxIx4gxCo0AAAAASUVORK5CYII=","orcid":"","institution":"Institute of TCM-X/MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRist/Department of Automation, Tsinghua University, 100084 Beijing, China","correspondingAuthor":true,"prefix":"","firstName":"Shao","middleName":"","lastName":"Li","suffix":""}],"badges":[],"createdAt":"2024-07-16 07:28:33","currentVersionCode":1,"declarations":{"humanSubjects":true,"vertebrateSubjects":false,"conflictsOfInterestStatement":false,"humanSubjectEthicalGuidelines":true,"humanSubjectConsent":true,"humanSubjectClinicalTrial":false,"humanSubjectCaseReport":true,"vertebrateSubjectEthicalGuidelines":false},"doi":"10.21203/rs.3.rs-4747833/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4747833/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":60597525,"identity":"9a0a70be-b5bd-4ff6-9a69-8c096b64244f","added_by":"auto","created_at":"2024-07-18 15:50:04","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":47151,"visible":true,"origin":"","legend":"\u003cp\u003eThe study design for patient enrollment and follow-up.\u003c/p\u003e","description":"","filename":"1.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/bf63ad994df95855306cb769.png"},{"id":60597527,"identity":"e6c8f2c9-abe4-4cc2-91ea-c4326e3da3ad","added_by":"auto","created_at":"2024-07-18 15:50:04","extension":"png","order_by":2,"title":"Figure 
2","display":"","copyAsset":false,"role":"figure","size":145050,"visible":true,"origin":"","legend":"\u003cp\u003eSchematic diagram illustrating the model construction and evaluation in terms of prediction of diverse gastritis lesions and their progression risks.\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/2cb8cf37c9479a0f14234222.png"},{"id":60597530,"identity":"0efbe668-5113-4c5e-9642-fbe0aed49ef0","added_by":"auto","created_at":"2024-07-18 15:50:05","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":155642,"visible":true,"origin":"","legend":"\u003cp\u003eComparison and visualization of PLGC prediction. A. Schematic diagram and comparison of the Attention-GT with the baseline multimodal fusion models. B. Comparison of the Attention-GT with the baseline models based on unimodal features.\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/53090ad8c47afa25cc1ad0ad.png"},{"id":60597531,"identity":"9bc9b49a-4545-4ce0-995b-c0a462aeebd3","added_by":"auto","created_at":"2024-07-18 15:50:05","extension":"png","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":64490,"visible":true,"origin":"","legend":"\u003cp\u003eComparison of prediction of progression risks in Non-PLGC patients. A. Comparison on multimodal fusion. B. Comparison on unimodal data.\u003c/p\u003e","description":"","filename":"4.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/44cbf99b276fc54cedae036e.png"},{"id":60597529,"identity":"27785b32-1434-41a7-8099-c8af6901a87d","added_by":"auto","created_at":"2024-07-18 15:50:05","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":449133,"visible":true,"origin":"","legend":"\u003cp\u003eAttention visualization for gastroscopic and tongue images. Left: the original images. 
Middle: the manually labeled risk feature region. Right: the attention heat maps.\u003c/p\u003e","description":"","filename":"5.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/5303cfafbeea6515dea765a7.png"},{"id":60597533,"identity":"865aae09-a234-49a5-b630-a1547e198b43","added_by":"auto","created_at":"2024-07-18 15:50:06","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":415449,"visible":true,"origin":"","legend":"\u003cp\u003eMorphological features association between prediction of diverse gastritis lesions and their progression risks. A. High overlap of the attention areas. B. Low overlap of the attention areas.\u003c/p\u003e","description":"","filename":"6.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/094f59be931f593f59bc6f9c.png"},{"id":60597528,"identity":"2d2940dd-8ed2-41a3-b530-b380d9b9008e","added_by":"auto","created_at":"2024-07-18 15:50:05","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":399824,"visible":true,"origin":"","legend":"\u003cp\u003eCorrelation analysis for paired gastroscopic and tongue images of patients on prediction of progression risks for Non-PLGC patients. Scatterplot: the correlation between paired gastroscopic and tongue image prediction scores for different patients. 
Numbers: prediction scores.\u003c/p\u003e","description":"","filename":"7.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/37039f3a5fe0d0a4faf9f3f5.png"},{"id":60597526,"identity":"1ce450eb-c023-4d43-be56-be22eb88971c","added_by":"auto","created_at":"2024-07-18 15:50:04","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":294087,"visible":true,"origin":"","legend":"\u003cp\u003eThe correlation between gastroscopic and tongue images from the TCM theory of cold and heat syndrome.\u003c/p\u003e","description":"","filename":"8.png","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/b03ef702c51e8b2af12f45d3.png"},{"id":60597538,"identity":"de05e05c-6bf9-477d-af51-ffbe11656e62","added_by":"auto","created_at":"2024-07-18 15:50:13","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":3017343,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/c8f4c3b8-0b43-43a6-9105-25a36ae31c27.pdf"},{"id":60597532,"identity":"2689396b-8609-4668-8bac-652299424538","added_by":"auto","created_at":"2024-07-18 15:50:05","extension":"docx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":2016155,"visible":true,"origin":"","legend":"","description":"","filename":"appendix2.docx","url":"https://assets-eu.researchsquare.com/files/rs-4747833/v1/941080a28abb97850db3e726.docx"}],"financialInterests":"The authors declare no competing interests.","formattedTitle":"\u003cp\u003ePrediction of the gastric precancerous risk based on deep learning of multimodal medical images\u003c/p\u003e","fulltext":[{"header":"Introduction","content":"\u003cp\u003eGastric cancer (GC) ranks as the fourth leading cause of cancer-related deaths globally\u003csup\u003e1\u003c/sup\u003e. 
Its development is often preceded by precursor lesions such as atrophic gastritis, intestinal metaplasia (IM), and dysplasia, with the incidence of GC increasing from 0.1\u0026ndash;6% as these lesions progress\u003csup\u003e2\u0026ndash;6\u003c/sup\u003e. The great inter-individual difference in the risk of progression poses a challenge to the prevention and control of gastric cancer. Early prediction of diverse gastritis lesions and their progression risks in patients is crucial for GC prevention\u003csup\u003e7\u0026ndash;9\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eRecent studies highlight the significance of multimodal and multi-factor elements in the development of precancerous lesions of gastric cancer (PLGC), including IM and dysplasia\u003csup\u003e10, 11 12\u003c/sup\u003e. Among various diagnostic tools, gastroscopic and tongue images have emerged as effective indicators for GC diagnosis and prevention\u003csup\u003e13, 14\u003c/sup\u003e. Gastroscopic images are pivotal for accurate diagnoses and precise lesion localization in gastric diseases, while tongue image features, such as color, shape, and coating thickness, are increasingly recognized for their diagnostic value in line with Traditional Chinese Medicine (TCM) practices\u003csup\u003e15\u0026ndash;17\u003c/sup\u003e. However, the extent to which the fusion of gastroscopic and tongue images enhances gastric disease management, particularly in prediction of diverse gastritis lesions and their progression risks, remains unclear. The development of computational methods to integrate these multimodal images for disease prediction is imperative.\u003c/p\u003e \u003cp\u003eAdvancements in artificial intelligence (AI), particularly deep learning models, have shown promise in precise cancer screening and treatment\u003csup\u003e18, 19\u003c/sup\u003e. 
These models effectively decode large-scale medical images into quantitative features, with studies suggesting that morphological features from multimodal images are complementary\u003csup\u003e20\u003c/sup\u003e. It thus indicates that multimodal learning is becoming one of the most challenging and attractive perspectives in the field of medial image analysis with AI models\u003csup\u003e21\u0026ndash;23\u003c/sup\u003e. Attention-based models, focusing on critical information and filtering out irrelevant data, have become increasingly relevant in multimodal image data fusion\u003csup\u003e24\u0026ndash;26\u003c/sup\u003e. These approaches often outperform unimodal methods in screening, risk stratification, and treatment, indicating the potential of attention-based AI models in integrating multimodal information for disease prediction \u003csup\u003e26\u0026ndash;28\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eIn our study, we developed an intelligent model integrating gastroscopic images, tongue images, and clinicopathological indicators for prediction of diverse gastritis lesions and their progression risks, evaluated on a longitudinal cohort of 384 patients, and revealed morphological features, further elucidating their associations. Our research contributes to the development of AI-based methodologies for early GC risk screening.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eData collection and grouping\u003c/h2\u003e \u003cp\u003eThe study was conducted at China-Japan Friendship Hospital between 2015 and 2022. The experimental protocol was designed following the ethical guidelines stated in the \"Declaration of Helsinki\" and was approved by the Human Ethics Committee of the Institution Review Board of China-Japan Friendship Hospital. 
Patients were enrolled according to the criteria demonstrated in Supplementary Table\u0026nbsp;1.\u003c/p\u003e \u003cp\u003eAccording to the Correa model, the evolution process of GC includes five step-like stages: superficial gastritis, atrophic gastritis, IM, and dysplasia to GC\u003csup\u003e2\u003c/sup\u003e. Among them, PLGC refers to pathological findings of IM and dysplasia while Non-PLGC encompasses superficial gastritis and atrophic gastritis \u003csup\u003e10, 11\u003c/sup\u003e. Patients were divided into PLGC and Non-PLGC groups according to baseline pathological findings, and then were further divided into progressive (Pro) and Non-Pro groups within each group after a mean follow-up of 3.3 years. Patients in the Pro group had a follow-up pathological test stage higher than baseline, whereas those in the Non-Pro group had a follow-up pathological stage lower than or equal to baseline.\u003c/p\u003e \u003cp\u003eThe follow-up cohort comprised a total of 384 participants in our study (Supplementary Table\u0026nbsp;2). Among these participants, 178 ones were enrolled with baseline diagnosis as PLGC whereas the others as Non-PLGC (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). Subsequently, 306 of 384 participants were left after the follow-up survey with an average interval of 3.3 years, where 112 ones were categorized into the Pro group while the others into the Non-Pro group (Supplementary Table\u0026nbsp;3).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eGastroscopy and histological examination\u003c/h2\u003e \u003cp\u003eGastroscopic examinations were performed by two experienced gastroenterologists and conducted using video endoscopes. The biopsy samples obtained during the procedure of gastroscopic examinations were evaluated according to the criteria outlined in the Updated Sydney System\u003csup\u003e2, 29\u003c/sup\u003e. 
To determine the infection status of Helicobacter pylori (Hp), an enzyme-linked immunosorbent assay was performed on plasma samples to detect Hp-specific IgG antibodies\u003csup\u003e30, 31\u003c/sup\u003e.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eTongue images acquisition and pre-processing\u003c/h2\u003e \u003cp\u003eThe tongue image collector was used to capture images of the tongue. Before taking pictures, the patient was asked to clean mouth with water to avoid affecting the natural color of the tongue and coating. During the shooting process, the patient kept head fixed in the shooting area and extended tongue without moving. The use of the tongue image collector ensured consistent lighting, shooting angles, and distances, guaranteeing clear and undistorted images. After shooting, a TCM doctor examined the pictures to ensure they clearly displayed the entire tongue, including the tip, edges, and root. This process maximized the consistency of the tongue images, avoiding the influence of environmental factors and human error.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eDevelopment of Attention-GT model\u003c/h2\u003e \u003cp\u003eTo address the challenges of multimodal fusion in this study, the Attention-GT (Gastroscopic and Tongue) model based on the attention mechanism was constructed (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Attention-GT model, compared to traditional deep learning models, can enhance feature representation and adaptively focus on and integrate information from different modalities, thereby enhancing the understanding and representation of multimodal data by the model. 
Furthermore, it can generate visual attention maps, improving interpretability.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eFirst, in the Attention-GT model, a tongue image and multi gastroscopic images from a single patient were transformed into feature vectors by a pre-trained ResNet50 model \u003csup\u003e32\u003c/sup\u003e. Each image is transformed into an independent vector in the feature matrix. Subsequently, these feature vectors were vertically concatenated with the Class\u003csub\u003etoken\u003c/sub\u003e vector to form a matrix F (Eq.\u0026nbsp;\u003cspan refid=\"Equ1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The Class\u003csub\u003etoken\u003c/sub\u003e vector is a trainable vector providing a summary representation of the fused features to make predictions.\u003cdiv id=\"Equ1\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ1\" name=\"EquationSource\"\u003e\n$$\\:\\text{F}=\\text{C}\\text{o}\\text{n}\\text{c}\\text{a}\\text{t}\\left({\\text{C}\\text{l}\\text{a}\\text{s}\\text{s}}_{\\text{t}\\text{o}\\text{k}\\text{e}\\text{n}},{\\text{F}\\text{e}\\text{a}\\text{t}\\text{u}\\text{r}\\text{e}\\text{s}}_{gas},{\\text{F}\\text{e}\\text{a}\\text{t}\\text{u}\\text{r}\\text{e}}_{tongue},\\dots\\:\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e1\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eNext, refer to the context gating (CG) method, the preprocessed feature matrix F is fed into an attention-based module to aggregate high-level representations from low-level representations\u003csup\u003e33, 34\u003c/sup\u003e. In Eq.\u0026nbsp;\u003cspan refid=\"Equ2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, the F matrix passes through two separate fully connected layers (W1, W2) and is activated by the Sigmoid and Tanh functions. A Hadamard product is performed between the two parts resulting in a T matrix. 
In Eq.\u0026nbsp;\u003cspan refid=\"Equ3\" class=\"InternalRef\"\u003e3\u003c/span\u003e, the T matrix passes through a fully connected layer (W3) and is activated by the SoftMax function to obtain a matrix representing the weights of various image feature vectors. The matrix is multiplied by the F matrix to obtain the Context (C), which is a vector representing the weighted sum of all the features. In Eq.\u0026nbsp;\u003cspan refid=\"Equ4\" class=\"InternalRef\"\u003e4\u003c/span\u003e, the C is passed through a fully connected layer (W4) and then subjected to a Hadamard product with the F matrix, resulting in a new F matrix. In the aforementioned process, the fully connected layers are independently trainable, and the dropout rate is set at 0.2. These processes allow for nonlinear interactions between the features and employ an activation function to determine which input features to retain and help to enhance and concentrate the relevant information within the features.\u003cdiv id=\"Equ2\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ2\" name=\"EquationSource\"\u003e\n$$\\:\\text{T}=\\text{S}\\text{i}\\text{g}\\text{m}\\text{o}\\text{i}\\text{d}\\left({\\text{W}}_{1}\\text{*}\\text{F}\\right)\\circ\\:\\text{T}\\text{a}\\text{n}\\text{h}\\left({\\text{W}}_{2}\\text{*}F\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e2\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ3\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ3\" name=\"EquationSource\"\u003e\n$$\\:\\text{C}=\\text{S}\\text{o}\\text{f}\\text{t}\\text{M}\\text{a}\\text{x}\\left({\\text{W}}_{3}\\text{*}\\text{T}\\right)\\text{*}\\text{F}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e3\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Equ4\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ4\" 
name=\"EquationSource\"\u003e\n$$\\:F={\\text{W}}_{4}\\text{*}\\text{C}\\circ\\:\\text{F}$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e4\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eFinally, in Eq.\u0026nbsp;\u003cspan refid=\"Equ5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, the Class\u003csub\u003etoken\u003c/sub\u003e vector in F passes through a Multi-Layer Perceptron (MLP) to obtain the final output.\u003cdiv id=\"Equ5\" class=\"Equation\"\u003e\u003cdiv format=\"TEX\" class=\"mathdisplay\" id=\"FileID_Equ5\" name=\"EquationSource\"\u003e\n$$\\:\\text{C}\\text{l}\\text{a}\\text{s}\\text{s}=\\text{M}\\text{L}\\text{P}\\left({\\text{C}\\text{l}\\text{a}\\text{s}\\text{s}}_{\\text{t}\\text{o}\\text{k}\\text{e}\\text{n}}\\right)$$\u003c/div\u003e\u003cdiv class=\"EquationNumber\"\u003e5\u003c/div\u003e\u003c/div\u003e\u003c/p\u003e \u003cp\u003eOverall, the attention module in the multimodal fusion approach is a key component. It can fuse features from different modes, capture the internal correlation between gastroscopy images and tongue images, so as to achieve more accurate feature extraction and information mining, and improve the risk prediction effect.\u003c/p\u003e \u003cp\u003e \u003cb\u003eBaseline model in model evaluation.\u003c/b\u003e \u003c/p\u003e \u003cp\u003eIn this study, by comparing the Attention-GT model and the baseline model, we verified that the model we constructed had better multi-modal fusion effect, so as to improve the efficiency of PLGC screening and risk prediction. The baseline model refers to intermediate and late fusion methods. These two methods are common multimodal fusion methods, which are characterized by fusion in the middle and late stages of feature extraction. The Attention-GT method is an early fusion method, which can better extract the correlation between the modes. 
In addition, we verified that multimodal fusion can effectively improve the efficiency of PLGC screening and risk prediction by comparing multimodal fusion with unimodal features. The unimodal features refer to clinicopathological indicators, tongue images, and gastroscopic images respectively.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eStatistical analysis and visualization\u003c/h2\u003e \u003cp\u003eIn this study, comparative experiments from multimodal fusion and unimodal were conducted to validate the effectiveness of the Attention-GT model. Multimodal fusion comparison refers to the comparison of the Attention-GT model with baseline models, such as intermediate and late fusion methods. Then, unimodal comparison involved comparing between different independent features.\u003c/p\u003e \u003cp\u003eAll statistical analysis procedures in this study were conducted using Python (Version: 3.7.0). For model validation, a five-fold cross-validation strategy was employed. To assess the models' performance, a range of evaluation metrics were employed. These metrics encompass accuracy, sensitivity, specificity, F1 score, receiver operating characteristic (ROC) curve, and area under the curve (AUC). 
By leveraging these evaluation metrics, the study aims to holistically gauge the models' performance and proffer a dependable assessment of their predictive prowess.\u003c/p\u003e \u003cp\u003eThe visualization and interpretability of morphological features in gastroscopic and tongue images were achieved using the Grad-CAM model, with the focused areas of attention displayed in the form of heatmaps\u003csup\u003e35\u003c/sup\u003e.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eImprovement the performance of prediction of PLGC\u003c/h2\u003e \u003cp\u003eFirst, the prediction performance of the Attention-GT model for the PLGC in terms of multi-modal integration was evaluated. In addition, model interpretability analysis was conducted to dissect the PLGC-related morphological features from both gastroscopic and tongue images.\u003c/p\u003e \u003cp\u003eBy performing ROC curves analysis, it was revealed that the Attention-GT achieved an AUC of 0.83 for PLGC prediction, significantly superior to other multimodal models (P\u0026thinsp;\u0026lt;\u0026thinsp;0.05) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn detail, the Attention-GT model achieved an AUC of 0.83 and showed significantly better performance than the baseline multimodal fusion models mentioned in the \u003cspan refid=\"Sec2\" class=\"InternalRef\"\u003emethods\u003c/span\u003e section, which were intermediate and late fusion methods (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eA). Similarly, it demonstrated significant advantages over baseline models utilizing unimodal features, with the baseline models based on clinicopathological indicators, tongue images, and gastroscopic images respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eB). 
Accuracy, sensitivity, specificity, and F1 score are shown in Supplementary Table\u0026nbsp;4. Collectively, these results suggested that multimodal fusion by the Attention-GT model may be of great benefit for PLGC prediction.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eImproved performance in the prediction of progression risks\u003c/h2\u003e \u003cp\u003eGiven the significant associations between pathological lesions and morphological features, we anticipated that integrating multimodal images could improve the prediction of pathological progression risks in our established follow-up cohort.\u003c/p\u003e \u003cp\u003eAs expected, among patients whose baseline pathology was Non-PLGC, Attention-GT achieved an AUC of 0.84 in distinguishing Pro patients from Non-Pro patients, significantly higher than that of the other multimodal models (P\u0026thinsp;\u0026lt;\u0026thinsp;0.05) (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). This indicated the potential of Attention-GT for predicting pathological lesion progression.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIn detail, the Attention-GT model performed significantly better than the baseline multimodal fusion models (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eA). It likewise showed significant advantages over the baseline models using unimodal features (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003eB). Accuracy, sensitivity, specificity, and F1 score are given in Supplementary Table\u0026nbsp;5. 
Collectively, these results suggested that multimodal fusion by the Attention-GT model may be of great benefit for predicting disease progression among Non-PLGC patients.\u003c/p\u003e \u003cp\u003eInterestingly, among patients whose baseline pathology was PLGC, gastroscopic images were significantly effective in predicting progression risks (Supplementary Fig.\u0026nbsp;1). In detail, the Attention-GT model based on gastroscopic images achieved an AUC of 0.74, significantly higher (P\u0026thinsp;\u0026lt;\u0026thinsp;0.05) than that of the baseline models based on clinicopathological indicators. Accuracy, sensitivity, specificity, and F1 score are given in Supplementary Table\u0026nbsp;6. By contrast, tongue images yielded an AUC of only 0.54 and showed no significant predictive effect, so fusing tongue and gastroscopic images did not perform well in this group. Overall, in the investigation of disease progression among PLGC patients, our findings indicated that gastroscopic images outperformed clinicopathological indicators and tongue images for the prediction of progression risks.\u003c/p\u003e \u003cp\u003eIn general, the fusion of gastroscopic and tongue images can significantly improve the prediction of progression risks in PLGC patients, and our proposed Attention-GT model performed better in modal fusion than the baseline models.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eMorphological risk features from gastroscopic and tongue images\u003c/h2\u003e \u003cp\u003eMorphological features involved in the prediction of PLGC in both gastroscopic and tongue images were also dissected. As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e, the output attention heat maps highlighted the regions of potential risk. 
The main features of the identified high-risk areas in gastroscopic images were small gray-white raised patches surrounded by pink areas, with an irregular and uneven surface, consistent with the foci labeled by experienced experts\u003csup\u003e36\u003c/sup\u003e. The key features of tongue images were primarily observed in the central region of the tongue and included morphological and coating characteristics, consistent with our previous findings (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e)\u003csup\u003e14, 37\u003c/sup\u003e. These results might provide objective morphological indicators for the accurate clinical prediction of PLGC.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eAssociations of morphological features involved in prediction of PLGC and their progression risks\u003c/h2\u003e \u003cp\u003eLeveraging the attention visualization of the Attention-GT model, we further investigated the associations between the morphological features involved in the prediction of diverse gastritis lesions and those involved in the prediction of their progression risks, based on the attention heatmaps over the original images.\u003c/p\u003e \u003cp\u003eRegarding gastroscopic images, the attention regions for predicting diverse gastritis lesions and for predicting their progression risks exhibited some similarities but also slight differences (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). The attention areas overlapped substantially, while those of high progression risk covered a relatively broader range (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eA). They primarily consisted of small gray-white raised patches surrounded by pink areas, with an irregular and uneven surface, which was in line with the feature descriptions in the literature\u003csup\u003e38\u003c/sup\u003e. 
This suggested that the features used in the two tasks were correlated.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe attention areas showed a lack of significant correlation in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003eB. The primary feature observed in the attention areas of high progression risk was the presence of irregular white patches, which was similar to the gastroscopic features of patients with atypical hyperplasia in clinical diagnosis\u003csup\u003e36, 39\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eThe differences in attention areas between the two tasks suggested that, compared with the screening of PLGC, the risk prediction of PLGC relies on specific risk characteristics. For disease progression risk assessment, it may be advantageous to collect samples from multiple points during gastroscopy to enhance the effectiveness of risk prediction. These findings might provide novel insights and lay the foundation for constructing more accurate and reliable prediction models.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec13\" class=\"Section2\"\u003e \u003ch2\u003eMultimodal morphological features associations between gastroscopic and tongue images\u003c/h2\u003e \u003cp\u003eAlthough the morphological features of unimodal tongue or gastroscopic images have been extensively investigated, the associations between features across the two modalities remain unclear. Thus, the associations of morphological features between gastroscopic and tongue images were investigated for each of the two tasks.\u003c/p\u003e \u003cp\u003eA correlation analysis of paired gastroscopic and tongue images was conducted for the prediction of progression risks in Non-PLGC patients (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). The scatterplot illustrated the correlation between the paired gastroscopic and tongue image prediction scores of individual patients. 
The blue dots represented Pro patients, and the orange dots represented Non-Pro patients. Numbers indicated Pro prediction scores based on gastroscopic or tongue images. Patients in the red circle were predicted as negative by tongue images and positive by gastroscopic images, whereas those in the green circle were the opposite. The three sets of heatmaps represented patient samples within the circles of the three colors.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eTaking one of the blue dots within the red circle as an example, the prediction score based on unimodal features from gastroscopic images was 0.99, and the corresponding heat map indicated clear morphological feature regions. Conversely, the prediction score based on unimodal features from tongue images was 0.09, and the heat map did not exhibit discernible feature regions.\u003c/p\u003e \u003cp\u003eThese findings further emphasized the distinct contributions of gastroscopic and tongue images to the prediction of progression risks. The correlation analysis for PLGC prediction is shown in Supplementary Fig.\u0026nbsp;2. This analysis provided evidence that multimodal fusion of gastroscopic and tongue images improves prediction accuracy.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec14\" class=\"Section2\"\u003e \u003ch2\u003eThe correlation between gastroscopic and tongue images from the perspective of TCM theory\u003c/h2\u003e \u003cp\u003eFurthermore, we attempted to interpret the correlation between gastroscopic and tongue images using the TCM theory of cold and heat\u003csup\u003e40\u0026ndash;43\u003c/sup\u003e. The concept of cold and heat is a core aspect of the theoretical framework of TCM; it is used to describe the physiological and pathological states of the human body and reflects the balance of Yin and Yang within. The appearance of the tongue is one of the key indicators for determining cold and heat conditions. 
A cold pattern in the tongue is characterized by features such as a white, thick coating and a white tongue color. Conversely, a heat pattern is indicated by a thin coating and a red tongue color. Studying the correlation between tongue diagnosis and gastroscopy from the perspective of TCM's cold and heat concepts may provide new insights for medical practice.\u003c/p\u003e \u003cp\u003eWhen tongue images were classified into cold and heat categories according to TCM, differences between the gastroscopic images of the two groups were found (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e). Tongue images of patients with heat syndrome tended to exhibit a deeper red tongue color and a thinner tongue coating than those of patients with cold syndrome. Correspondingly, in the gastroscopic images of these patients, blood vessels appeared more prominent and abundant. On the one hand, this observation provided supporting evidence for the correlation between the two types of images. On the other hand, it suggested that variations in cold and heat constitutions may correspond to different disease states in the stomach. Future research and medical practice could further investigate and consider these differences. However, these findings still require further in-depth research to establish their reliability.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003c/div\u003e"},{"header":"Discussion","content":"\u003cp\u003eThe Attention-GT model, built on an attention module, demonstrated significant advantages over traditional methods in addressing the challenges of multimodal fusion in this study. The main strengths of the model are its enhanced feature representation, its ability to adaptively attend to and integrate information from different modalities, and its improved interpretability and visualization. 
Because it adaptively aligns and attends to information across modalities, the model is also readily scalable. In future research, the introduction of pathological images and molecular omics data could potentially improve the effectiveness of risk prediction.\u003c/p\u003e \u003cp\u003eThe attention heat maps highlighted areas of the gastroscopic images that were deemed to be of potentially high risk. These heat maps provided a visual reference for identifying regions of interest that may be associated with the risk of disease progression. They can serve as valuable tools for subsequent studies, offering insights into areas that require further investigation. Additionally, they can assist clinicians in diagnosis by providing visual cues to areas that may require closer examination.\u003c/p\u003e \u003cp\u003eThe study of the correlation between gastroscopic and tongue images based on TCM's theory of cold and heat provided some new insights. By combining TCM theory with modern medical imaging, researchers can explore new perspectives and approaches to understanding diseases and patients' conditions. This integrated approach can offer more information and tools for medical practice. However, its reliability still needs further validation and in-depth research.\u003c/p\u003e \u003cp\u003eThe limitations of this work lie primarily in the amount of data included. The number of participants in this study is limited because of the long follow-up required for disease warning studies. This study focuses on medical images and does not include microscopic risk characteristics at the molecular level. We preliminarily found synergy between tongue images and gastroscopic images in the screening and early warning of stomach diseases, and we also tried to explain it from various perspectives. 
However, the current conclusions remain preliminary.\u003c/p\u003e \u003cp\u003eFurther exploration of the relevance of gastroscopic and tongue images in predicting disease progression, and of the potential mechanism, is of great significance. The scale of the experiment needs to be expanded: including more participants may improve the training and prediction of the model, which is a focus of subsequent research. In addition, to further improve PLGC risk prediction, introducing molecular characteristics into the model is another focus of future research.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eIn this study, we investigated the use of gastroscopic and tongue images in the prediction of diverse gastritis lesions and their progression risks. The effectiveness of the proposed Attention-GT model, as well as of the fusion of these imaging modalities, was compared with that of the baseline models. Additionally, through interpretability analysis, we uncovered multimodal morphological features and their associations involved in the two tasks, providing potentially novel indicators for the early prevention of GC. 
Overall, the findings of this study suggested that the fusion of gastroscopic and tongue images using the Attention-GT model can enhance the prediction of diverse gastritis lesions and their progression risks.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch3\u003eEthics approval and consent to participate\u003c/h3\u003e\n\u003cp\u003eThe experimental protocol was established according to the ethical guidelines of the Helsinki Declaration and was approved by the Human Ethics Committee of Institution Review Board of China-Japan Friendship Hospital\u0026nbsp;(protocol code 2023-KY-174).\u003c/p\u003e\n\u003ch3\u003eConsent for publication\u003c/h3\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003ch3\u003eAvailability of data and materials\u003c/h3\u003e\n\u003cp\u003eThe datasets generated and/or analyzed during the current study are not publicly available due to patient privacy but are available from the corresponding author upon reasonable request.\u003c/p\u003e\n\u003ch3\u003eCompeting Interests\u003c/h3\u003e\n\u003cp\u003eThe authors declare that they have no competing interests.\u003c/p\u003e\n\u003ch3\u003eFunding\u003c/h3\u003e\n\u003cp\u003eFunding for this study was provided by the National Natural Science Foundation of China, China [T2341008].\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n \u003cli\u003eSung, H.; Ferlay, J.; Siegel, R. L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F., Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. \u003cem\u003eCA Cancer J Clin\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e71\u003c/em\u003e (3), 209-249.\u003c/li\u003e\n \u003cli\u003eDixon, M. F.; Genta, R. M.; Yardley, J. H.; Correa, P., Classification and grading of gastritis. The updated Sydney System. International Workshop on the Histopathology of Gastritis, Houston 1994. 
\u003cem\u003eThe American journal of surgical pathology\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e1996,\u003c/strong\u003e \u003cem\u003e20\u003c/em\u003e (10), 1161-1181.\u003c/li\u003e\n \u003cli\u003ePiazuelo, M. B.; Bravo, L. E.; Mera, R. M.; Camargo, M. C.; Bravo, J. C.; Delgado, A. G.; Washington, M. K.; Rosero, A.; Garcia, L. S.; Realpe, J. L.; Cifuentes, S. P.; Morgan, D. R.; Peek, R. M., Jr.; Correa, P.; Wilson, K. T., The Colombian Chemoprevention Trial: 20-Year Follow-Up of a Cohort of Patients With Gastric Precancerous Lesions. \u003cem\u003eGastroenterology\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e160\u003c/em\u003e (4), 1106-1117 e3.\u003c/li\u003e\n \u003cli\u003eKang, J. Y.; Finlayson, C.; Maxwell, J. D.; Neild, P., Risk of gastric carcinoma in patients with atrophic gastritis and intestinal metaplasia. \u003cem\u003eGut\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2002,\u003c/strong\u003e \u003cem\u003e51\u003c/em\u003e (6), 899.\u003c/li\u003e\n \u003cli\u003ede Vries, A. C.; van Grieken, N. C.; Looman, C. W.; Casparie, M. K.; de Vries, E.; Meijer, G. A.; Kuipers, E. J., Gastric cancer risk in patients with premalignant gastric lesions: a nationwide cohort study in the Netherlands. \u003cem\u003eGastroenterology\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2008,\u003c/strong\u003e \u003cem\u003e134\u003c/em\u003e (4), 945-52.\u003c/li\u003e\n \u003cli\u003eRugge, M.; Meggio, A.; Pravadelli, C.; Barbareschi, M.; Fassan, M.; Gentilini, M.; Zorzi, M.; Pretis, G.; Graham, D. Y.; Genta, R. M., Gastritis staging in the endoscopic follow-up for the secondary prevention of gastric cancer: a 5-year prospective study of 1755 patients. \u003cem\u003eGut\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2019,\u003c/strong\u003e \u003cem\u003e68\u003c/em\u003e (1), 11-17.\u003c/li\u003e\n \u003cli\u003eHuang, H. L.; Leung, C. Y.; Saito, E.; Katanoda, K.; Hur, C.; Kong, C. 
Y.; Nomura, S.; Shibuya, K., Effect and cost-effectiveness of national gastric cancer screening in Japan: a microsimulation modeling study. \u003cem\u003eBMC Med\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2020,\u003c/strong\u003e \u003cem\u003e18\u003c/em\u003e (1), 257.\u003c/li\u003e\n \u003cli\u003eSuh, Y. S.; Lee, J.; Woo, H.; Shin, D.; Kong, S. H.; Lee, H. J.; Shin, A.; Yang, H. K., National cancer screening program for gastric cancer in Korea: Nationwide treatment benefit and cost. \u003cem\u003eCancer\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2020,\u003c/strong\u003e \u003cem\u003e126\u003c/em\u003e (9), 1929-1939.\u003c/li\u003e\n \u003cli\u003eZhang, P.; Wang, B.; Li, S., Network-based cancer precision prevention with artificial intelligence and multi-omics. \u003cem\u003eSci Bull (Beijing)\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2023,\u003c/strong\u003e \u003cem\u003e68\u003c/em\u003e (12), 1219-1222.\u003c/li\u003e\n \u003cli\u003eSchlemper, R. J.; Riddell, R. H.; Kato, Y.; Borchard, F.; Cooper, H. S.; Dawsey, S. M.; Dixon, M. F.; Fenoglio-Preiser, C. M.; Fl\u0026eacute;jou, J. F.; Geboes, K.; Hattori, T.; Hirota, T.; Itabashi, M.; Iwafuchi, M.; Iwashita, A.; Kim, Y. I.; Kirchner, T.; Klimpfinger, M.; Koike, M.; Lauwers, G. Y.; Lewin, K. J.; Oberhuber, G.; Offner, F.; Price, A. B.; Rubio, C. A.; Shimizu, M.; Shimoda, T.; Sipponen, P.; Solcia, E.; Stolte, M.; Watanabe, H.; Yamabe, H., The Vienna classification of gastrointestinal epithelial neoplasia. \u003cem\u003eGut\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2000,\u003c/strong\u003e \u003cem\u003e47\u003c/em\u003e (2), 251-255.\u003c/li\u003e\n \u003cli\u003eSong, H.; Ekheden, I. G.; Zheng, Z.; Ericsson, J.; Nyren, O.; Ye, W., Incidence of gastric cancer among patients with gastric precancerous lesions: observational cohort study in a low risk Western population. 
\u003cem\u003eBMJ\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2015,\u003c/strong\u003e \u003cem\u003e351\u003c/em\u003e, h3867.\u003c/li\u003e\n \u003cli\u003eTan, P.; Yeoh, K. G., Genetics and Molecular Pathogenesis of Gastric Adenocarcinoma. \u003cem\u003eGastroenterology\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2015,\u003c/strong\u003e \u003cem\u003e149\u003c/em\u003e (5), 1153-1162 e3.\u003c/li\u003e\n \u003cli\u003eMa, L.; Su, X.; Ma, L.; Gao, X.; Sun, M., Deep learning for classification and localization of early gastric cancer in endoscopic images. \u003cem\u003eBiomedical Signal Processing and Control\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2023,\u003c/strong\u003e \u003cem\u003e79\u003c/em\u003e, 104200.\u003c/li\u003e\n \u003cli\u003eYuan, L.; Yang, L.; Zhang, S.; Xu, Z.; Qin, J.; Shi, Y.; Yu, P.; Wang, Y.; Bao, Z.; Xia, Y.; Sun, J.; He, W.; Chen, T.; Chen, X.; Hu, C.; Zhang, Y.; Dong, C.; Zhao, P.; Wang, Y.; Jiang, N.; Lv, B.; Xue, Y.; Jiao, B.; Gao, H.; Chai, K.; Li, J.; Wang, H.; Wang, X.; Guan, X.; Liu, X.; Zhao, G.; Zheng, Z.; Yan, J.; Yu, H.; Chen, L.; Ye, Z.; You, H.; Bao, Y.; Cheng, X.; Zhao, P.; Wang, L.; Zeng, W.; Tian, Y.; Chen, M.; You, Y.; Yuan, G.; Ruan, H.; Gao, X.; Xu, J.; Xu, H.; Du, L.; Zhang, S.; Fu, H.; Cheng, X., Development of a tongue image-based machine learning tool for the diagnosis of gastric cancer: a prospective multicentre clinical cohort study. \u003cem\u003eEClinicalMedicine\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2023,\u003c/strong\u003e \u003cem\u003e57\u003c/em\u003e, 101834.\u003c/li\u003e\n \u003cli\u003eShang, Z.; Du, Z. G.; Guan, B.; Ji, X. Y.; Chen, L. C.; Wang, Y. J.; Ma, Y., Correlation analysis between characteristics under gastroscope and image information of tongue in patients with chronic gastritis. \u003cem\u003eJ Tradit Chin Med\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e42\u003c/em\u003e (1), 102-107.\u003c/li\u003e\n \u003cli\u003eGholami, E. a. K. 
T., Seyed and Kheirabadi, Maryam, Increasing the accuracy in the diagnosis of stomach cancer based on color and lint features of tongue. \u003cem\u003eBiomedical Signal Processing and Control\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e69\u003c/em\u003e, 102782.\u003c/li\u003e\n \u003cli\u003eLi, S.; Wang, R.; Zhang, Y.; Zhang, X.; Layon, A. J.; Li, Y.; Chen, M., Symptom combinations associated with outcome and therapeutic effects in a cohort of cases with SARS. \u003cem\u003eAm J Chin Med\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2006,\u003c/strong\u003e \u003cem\u003e34\u003c/em\u003e (6), 937-47.\u003c/li\u003e\n \u003cli\u003eLiu, T.; Huang, J.; Liao, T.; Pu, R.; Liu, S.; Peng, Y., A Hybrid Deep Learning Model for Predicting Molecular Subtypes of Human Breast Cancer Using Multimodal Data. \u003cem\u003eIRBM\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e43\u003c/em\u003e (1), 62-74.\u003c/li\u003e\n \u003cli\u003eChen, R. J.; Lu, M. Y.; Williamson, D. F. K.; Chen, T. Y.; Lipkova, J.; Noor, Z.; Shaban, M.; Shady, M.; Williams, M.; Joo, B.; Mahmood, F., Pan-cancer integrative histology-genomic analysis via multimodal deep learning. \u003cem\u003eCancer Cell\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e40\u003c/em\u003e (8), 865-878 e6.\u003c/li\u003e\n \u003cli\u003eSun, C.; Wang, A.; Zhou, Y.; Chen, P.; Wang, X.; Huang, J.; Gao, J.; Wang, X.; Shu, L.; Lu, J.; Dai, W.; Bu, Z.; Ji, J.; He, J., Spatially resolved multi-omics highlights cell-specific metabolic remodeling and interactions in gastric cancer. \u003cem\u003eNat Commun\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2023,\u003c/strong\u003e \u003cem\u003e14\u003c/em\u003e (1), 2692.\u003c/li\u003e\n \u003cli\u003eAzam, M. A.; Khan, K. B.; Salahuddin, S.; Rehman, E.; Khan, S. A.; Khan, M. A.; Kadry, S.; Gandomi, A. 
H., A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics. \u003cem\u003eComputers in Biology and Medicine\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e144\u003c/em\u003e, 105253.\u003c/li\u003e\n \u003cli\u003eKline, A.; Wang, H.; Li, Y.; Dennis, S.; Hutch, M.; Xu, Z.; Wang, F.; Cheng, F.; Luo, Y., Multimodal machine learning in precision health: A scoping review. \u003cem\u003eNPJ Digit Med\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e5\u003c/em\u003e (1), 171.\u003c/li\u003e\n \u003cli\u003eBoehm, K. M.; Khosravi, P.; Vanguri, R.; Gao, J.; Shah, S. P., Harnessing multimodal data integration to advance precision oncology. \u003cem\u003eNat Rev Cancer\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e22\u003c/em\u003e (2), 114-126.\u003c/li\u003e\n \u003cli\u003eLara Ram\u0026iacute;rez, J.; Contreras, V.; Ot\u0026aacute;lora Montenegro, J.; M\u0026uuml;ller, H.; Gonz\u0026aacute;lez, F., \u003cem\u003eMultimodal Latent Semantic Alignment for Automated Prostate Tissue Classification and Retrieval\u003c/em\u003e. 2020.\u003c/li\u003e\n \u003cli\u003eYan, R.; Zhang, F.; Rao, X.; Lv, Z.; Li, J.; Zhang, L.; Liang, S.; Li, Y.; Ren, F.; Zheng, C.; Liang, J., Richer fusion network for breast cancer classification based on multimodal data. \u003cem\u003eBMC Medical Informatics and Decision Making\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e21\u003c/em\u003e (1), 134.\u003c/li\u003e\n \u003cli\u003eOu, C.; Zhou, S.; Yang, R.; Jiang, W.; He, H.; Gan, W.; Chen, W.; Qin, X.; Luo, W.; Pi, X.; Li, J., A deep learning based multimodal fusion model for skin lesion diagnosis using smartphone collected clinical images and metadata. 
\u003cem\u003eFront Surg\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e9\u003c/em\u003e, 1029991.\u003c/li\u003e\n \u003cli\u003eSingh, L. K.; Khanna, M.; Pooja, A novel multimodality based dual fusion integrated approach for efficient and early prediction of glaucoma. \u003cem\u003eBiomedical Signal Processing and Control\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e73\u003c/em\u003e, 103468.\u003c/li\u003e\n \u003cli\u003eCai, Q.; Wang, H.; Li, Z.; Liu, X., A Survey on Multimodal Data-Driven Smart Healthcare Systems: Approaches and Applications. \u003cem\u003eIEEE Access\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2019,\u003c/strong\u003e \u003cem\u003ePP\u003c/em\u003e, 1-1.\u003c/li\u003e\n \u003cli\u003eYou, W. C.; Blot, W. J.; Li, J. Y.; Chang, Y. S.; Jin, M. L.; Kneller, R.; Zhang, L.; Han, Z. X.; Zeng, X. R.; Liu, W. D.; et al., Precancerous gastric lesions in a population at high risk of stomach cancer. \u003cem\u003eCancer Res\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e1993,\u003c/strong\u003e \u003cem\u003e53\u003c/em\u003e (6), 1317-21.\u003c/li\u003e\n \u003cli\u003eZhang, L.; Blot, W. J.; You, W. C.; Chang, Y. S.; Kneller, R. W.; Jin, M. L.; Li, J. Y.; Zhao, L.; Liu, W. D.; Zhang, J. S.; Ma, J. L.; Samloff, I. M.; Correa, P.; Blaser, M. J.; Xu, G. W.; Fraumeni, J. F., Jr., Helicobacter pylori antibodies in relation to precancerous gastric lesions in a high-risk Chinese population. \u003cem\u003eCancer Epidemiol Biomarkers Prev\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e1996,\u003c/strong\u003e \u003cem\u003e5\u003c/em\u003e (8), 627-30.\u003c/li\u003e\n \u003cli\u003eLi, S.; Lu, A. P.; Zhang, L.; Li, Y. D., Anti-Helicobacter pylori immunoglobulin G (IgG) and IgA antibody responses and the value of clinical presentations in diagnosis of H. pylori infection in patients with precancerous lesions. 
\u003cem\u003eWorld J Gastroenterol\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2003,\u003c/strong\u003e \u003cem\u003e9\u003c/em\u003e (4), 755-8.\u003c/li\u003e\n \u003cli\u003eHe, K.; Zhang, X.; Ren, S.; Sun, J., Deep Residual Learning for Image Recognition. \u003cem\u003eIEEE\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2016\u003c/strong\u003e.\u003c/li\u003e\n \u003cli\u003eLu, M. Y.; Williamson, D. F. K.; Chen, T. Y.; Chen, R. J.; Barbieri, M.; Mahmood, F., Data-efficient and weakly supervised computational pathology on whole-slide images. \u003cem\u003eNat Biomed Eng\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e5\u003c/em\u003e (6), 555-570.\u003c/li\u003e\n \u003cli\u003eMiech, A.; Laptev, I.; Sivic, J., Learnable pooling with context gating for video classification. \u003cem\u003earXiv preprint arXiv:1706.06905\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2017\u003c/strong\u003e.\u003c/li\u003e\n \u003cli\u003eSelvaraju, R. R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. In \u003cem\u003eGrad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization\u003c/em\u003e, 2017 IEEE International Conference on Computer Vision (ICCV), 22-29 Oct. 2017; 2017; pp 618-626.\u003c/li\u003e\n \u003cli\u003eCummins, G.; Cox, B. F.; Ciuti, G.; Anbarasan, T.; Desmulliez, M. P. Y.; Cochran, S.; Steele, R.; Plevris, J. N.; Koulaouzidis, A., Gastrointestinal diagnosis using non-white light imaging capsule endoscopy. \u003cem\u003eNat Rev Gastroenterol Hepatol\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2019,\u003c/strong\u003e \u003cem\u003e16\u003c/em\u003e (7), 429-447.\u003c/li\u003e\n \u003cli\u003eMa, C.; Zhang, P.; Du, S.; Li, Y.; Li, S., Construction of Tongue Image-Based Machine Learning Model for Screening Patients with Gastric Precancerous Lesions. 
\u003cem\u003eJ Pers Med\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2023,\u003c/strong\u003e \u003cem\u003e13\u003c/em\u003e (2), 271.\u003c/li\u003e\n \u003cli\u003eYoung, E.; Philpott, H.; Singh, R., Endoscopic diagnosis and treatment of gastric dysplasia and early cancer: Current evidence and what the future may hold. \u003cem\u003eWorld J Gastroenterol\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e27\u003c/em\u003e (31), 5126-5151.\u003c/li\u003e\n \u003cli\u003eHoffman, A.; Manner, H.; Rey, J. W.; Kiesslich, R., A guide to multimodal endoscopy imaging for gastrointestinal malignancy - an early indicator. \u003cem\u003eNat Rev Gastroenterol Hepatol\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2017,\u003c/strong\u003e \u003cem\u003e14\u003c/em\u003e (7), 421-434.\u003c/li\u003e\n \u003cli\u003eLi, R.; Ma, T.; Gu, J.; Liang, X.; Li, S., Imbalanced network biomarkers for traditional Chinese medicine Syndrome in gastritis patients. \u003cem\u003eSci Rep\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2013,\u003c/strong\u003e \u003cem\u003e3\u003c/em\u003e, 1543.\u003c/li\u003e\n \u003cli\u003eWang, Z. Y.; Wang, X.; Zhang, D. Y.; Hu, Y. J.; Li, S., [Traditional Chinese medicine network pharmacology: development in new era under guidance of network pharmacology evaluation method guidance]. \u003cem\u003eZhongguo Zhong Yao Za Zhi\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2022,\u003c/strong\u003e \u003cem\u003e47\u003c/em\u003e (1), 7-17.\u003c/li\u003e\n \u003cli\u003eZhou, W.; Yang, K.; Zeng, J.; Lai, X.; Wang, X.; Ji, C.; Li, Y.; Zhang, P.; Li, S., FordNet: Recommending traditional Chinese medicine formula via deep neural network integrating phenotype and molecule. \u003cem\u003ePharmacol Res\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2021,\u003c/strong\u003e \u003cem\u003e173\u003c/em\u003e, 105752.\u003c/li\u003e\n \u003cli\u003e张彦琼; 李梢, 网络药理学与中医药现代研究的若干进展. 
\u003cem\u003e中国药理学与毒理学杂志\u0026nbsp;\u003c/em\u003e\u003cstrong\u003e2015,\u003c/strong\u003e \u003cem\u003e29\u003c/em\u003e (6), 883-892.\u003c/li\u003e\n\u003c/ol\u003e"}],"funders":[{"identifier":"10.13039/501100001809","name":"National Natural Science Foundation of China","awardNumber":"T2341008"}],"institution":"Institute of TCM-X/MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRist/Department of Automation, Tsinghua University, 100084 Beijing, China","keywords":"gastric precancerous diseases, progression prediction, gastroscopic images, multimodal fusion, deep learning","lastPublishedDoi":"10.21203/rs.3.rs-4747833/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4747833/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptTitle":"Prediction of the gastric precancerous risk based on deep learning of multimodal medical images","postedDate":"July 18th, 2024","status":"posted","subjectAreas":[{"id":34662896,"name":"Gastroenterology \u0026 Hepatology"}]},"version":"v1","identity":"rs-4747833","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]"}

License: CC-BY-4.0