Prediction of dysphagia severity after lateral medullary infarction with deep learning

doi:10.21203/rs.3.rs-7007189/v1

2025 · doi:10.21203/rs.3.rs-7007189/v1

preprint OA: closed

Taeheon Lee, Bo Hae Kim, Kihwan Nam, Jin-Woo Park

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7007189/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 19 Feb, 2026; read the published version in Scientific Reports.

Abstract

Dysphagia is a common and debilitating complication in patients with lateral medullary infarction (LMI), affecting up to 100% of cases and significantly impairing quality of life. Accurate early prediction of dysphagia severity is essential for timely intervention and personalized rehabilitation planning. This study aimed to develop and validate a deep learning algorithm using acute-phase diffusion-weighted MRI to classify dysphagia severity in LMI patients.
A retrospective cohort of 163 patients with confirmed acute LMI was analyzed. Dysphagia severity was determined by videofluoroscopic swallowing studies (VFSS), categorizing patients into severe and non-severe groups. Lesion regions were manually labeled and preprocessed for model training. A Transformer-based deep learning architecture, the Hierarchical Vision Transformer (Hier-ViT), was employed due to its capacity to model spatial hierarchies and global image context. The model achieved an accuracy of 0.85, with a precision of 0.70, recall of 0.75, F1-score of 0.72, and an area under the ROC curve (AUC) of 0.69. These findings suggest that Hier-ViT can effectively classify dysphagia severity in LMI patients using early MRI, offering a potential tool for prognosis prediction. Further studies with larger cohorts and multi-modal data are needed to confirm clinical utility and enhance model generalizability.

Subject areas: Biological sciences/Computational biology and bioinformatics; Health sciences/Diseases; Health sciences/Health care; Health sciences/Medical research; Health sciences/Neurology

Keywords: Lateral medullary infarction, Dysphagia, Prognosis, Deep learning, Magnetic Resonance Imaging

1. Introduction

Dysphagia is a commonly documented complication after stroke, but its reported prevalence varies widely, ranging between 19% and 81% [1]. Dysphagia can significantly impair patients' quality of life, leading to complications such as malnutrition, dehydration, aspiration pneumonia, social isolation, depression, and anxiety [2, 3].

Strokes can be classified based on the affected brain region. Among these, lateral medullary infarction (LMI) is caused by ischemia in the lateral part of the medulla oblongata [4]. LMI is characterized by a range of neurological symptoms, including gaze-induced nystagmus [5], Horner's syndrome [6], and ipsilateral ataxia [7].
Importantly, dysphagia has been reported in 51–100% of patients with LMI [8, 9], and it tends to be more severe and prolonged in LMI patients than in patients with hemispheric stroke [10]. This can be attributed to the anatomical characteristics of the medulla, where the central pattern generators (CPGs) responsible for swallowing are located. CPGs are generally defined as neural networks capable of generating central commands that govern stereotyped, rhythmic motor behaviors such as swallowing [11, 12]. In the medulla, the CPG is composed primarily of the nucleus of the tractus solitarius (NTS) and the nucleus ambiguus (NA). Neurons in the NTS function as interneurons that coordinate and program the sequential swallowing motor pattern, while neurons in the NA primarily serve as motoneurons innervating the pharyngolaryngeal muscles and the esophagus [13].

Initial dysphagia is considered a key factor associated with poor outcomes following LMI [14]. Although most patients with dysphagia experience mild symptoms and recover quickly [15, 16], some patients with severe dysphagia require tube feeding for several months or even years [12, 17, 18]. Therefore, predicting the prognosis of dysphagia is clinically important for developing appropriate therapeutic strategies.

Several studies have attempted to identify factors associated with dysphagia prognosis in LMI patients [18, 19]. A previous lesion-symptom mapping study demonstrated that posterolateral involvement in both the upper and lower medulla, as visualized on diffusion-weighted imaging, was significantly associated with severe dysphagia, typically characterized by decreased pharyngeal constriction and the apparent absence of esophageal passage on videofluoroscopic swallowing study (VFSS), as well as by the initial requirement for enteral tube feeding [18, 20].
These findings suggest that anatomical factors related to the extent and vertical distribution of the lesion play a critical role in determining swallowing impairment. This underscores the importance of early lesion localization in predicting swallowing outcomes and guiding individualized rehabilitation planning.

Recently, artificial intelligence (AI) based on deep learning has been expanding its applications in various medical fields to enhance the diagnosis and treatment of diseases [21]. Deep learning methods are a type of representation learning that leverage multiple levels of abstraction. These methods consist of simple but non-linear modules that transform the representation at one level (starting with raw input) into a higher, more abstract level of representation [22]. Deep learning techniques have been widely applied across various organ systems, including the kidney, prostate, and spine [23]. Notably, brain image analysis has emerged as one of the most extensively studied areas in the field of medical imaging. A variety of tasks have been successfully addressed, such as the staging and early diagnosis of Alzheimer's disease using multimodal magnetic resonance imaging (MRI) data [24], automated segmentation of brain tumors including glioblastoma and meningioma from heterogeneous clinical MRI scans [25, 26], and lesion detection in patients with multiple sclerosis [27]. Additional applications include skull stripping for brain extraction [28] and classification of brain functional connectomes [29].

Given that anatomical lesion patterns in the medulla have been shown to significantly influence the severity of dysphagia in patients with LMI, accurate lesion localization is of particular clinical relevance. At the same time, recent advances in deep learning have led to substantial progress in brain imaging analysis, with applications ranging from disease classification to lesion segmentation and functional mapping.
These developments suggest that integrating anatomical insights with data-driven modeling may offer a powerful approach to predicting dysphagia severity. In this study, we aim to develop and validate a Transformer-based deep learning algorithm to automatically analyze the location of initial MRI lesions in patients with LMI and predict the prognosis of dysphagia. By leveraging the ability of Transformers to capture both local and global patterns, we expect to enhance the accuracy of prognosis prediction and facilitate the development of more effective rehabilitation strategies.

2. Materials and methods

2.1. Study design

We performed a retrospective study analyzing clinical and imaging data of patients who were admitted to Dongguk University Ilsan Hospital with acute LMI from September 2005 to July 2024. Participants satisfied the following inclusion criteria: (1) first-ever onset of acute stroke, (2) age of 20 years or older, and (3) diagnosis of lateral medullary infarction confirmed by MRI. Patients with other structural brain disorders, including neurodegenerative or neuromuscular conditions that could independently influence swallowing function, were excluded from the study. The study was approved by the Institutional Review Board of Dongguk University Ilsan Hospital (No. 2024-07-004). All procedures were performed in compliance with the relevant guidelines and regulations.

2.2. Data collection and outcome measures

All stroke patients underwent brain MRI within 24 hours of admission using a 1.5-T MR scanner (MAGNETOM-Avanto, Siemens, Erlangen, Germany). Diffusion-weighted MRI (DWI) parameters were as follows: b-values of 0 and 1000 s/mm², repetition time of 5400 ms, echo time of 77 ms, field of view of 220 × 220 mm, slice thickness of 3.0 mm, and an interslice gap of 0.3 mm. Following established protocols [20], three DWI slices were extracted from the lower, middle, and upper medulla levels based on the MNI brain template for analysis.
VFSS was conducted at a mean of 14 days following stroke onset using a fluoroscopic system (Sonialvision-100, Shimadzu Corporation, Kyoto, Japan). All examinations adhered to a standardized protocol, which included the following: (1) patients maintained an upright seated position throughout the procedure; (2) each subject was given 5 mL of diluted barium solution (35% w/v), curd-type yogurt, and mashed boiled pumpkin, each administered twice; and (3) the protocol was modified as needed according to the patient's clinical status and level of cooperation. Dysphagia severity was determined according to VFSS findings [20]. Severe dysphagia was defined by the presence of diminished pharyngeal contraction and a clear lack of observable esophageal passage. Patients who did not exhibit these features were categorized as having non-severe dysphagia.

2.3. Data preprocessing

Three brain MRI images were selected based on consistent anatomical landmarks to ensure comparability across subjects and were input into the model as independent samples. Each patient was labeled as having either severe or non-severe dysphagia based on VFSS results, creating a binary classification task. All images were resized to a standardized resolution and normalized for pixel intensity. The dataset was split into training (70%) and testing (30%) sets to evaluate model performance. No additional clinical variables were included at this stage, and the analysis focused solely on the spatial imaging features from early post-stroke MRI scans. This preprocessing ensured that the model learned discriminative patterns directly from medullary lesion characteristics while controlling for variations in image size and intensity.

2.4. Deep learning models

This study employed a Hierarchical Vision Transformer (Hier-ViT) model to classify dysphagia severity based on early brain MRI data.
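The preprocessing steps of Section 2.3 (resizing, intensity normalization, and a 70/30 split) can be illustrated with a minimal sketch. Several details here are assumptions, since the text does not specify them: min-max scaling as the normalization, nearest-neighbor resizing, class stratification, and splitting at the patient level so that slices from one patient never appear in both sets. All function names are hypothetical, and the cohort below is synthetic, built only to mirror the reported class sizes.

```python
import random


def min_max_normalize(image):
    """Scale the pixel intensities of a 2D list to [0, 1] (assumed normalization)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


def resize_nearest(image, out_h, out_w):
    """Resize a 2D list to out_h x out_w with nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def stratified_patient_split(labels, test_fraction=0.3, seed=0):
    """Split patient IDs into train/test, sampling each class separately."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels.values())):
        ids = sorted(pid for pid, y in labels.items() if y == cls)
        rng.shuffle(ids)
        n_test = round(len(ids) * test_fraction)
        test.extend(ids[:n_test])
        train.extend(ids[n_test:])
    return train, test


# Synthetic cohort mirroring the reported sizes: 44 severe (1), 119 non-severe (0).
labels = {pid: (1 if pid < 44 else 0) for pid in range(163)}
train_ids, test_ids = stratified_patient_split(labels)
print(len(train_ids), len(test_ids))  # 114 49

# Toy 2 x 2 "slice", normalized and then resized to 4 x 4.
toy_slice = resize_nearest(min_max_normalize([[0, 50], [100, 200]]), 4, 4)
```

Splitting by patient rather than by slice is a design choice of this sketch; it keeps the three slices of one patient on the same side of the split.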
Hier-ViT is a Transformer-based vision model that constructs multi-level representations by hierarchically merging non-overlapping image patches. Each input image was divided into fixed-size patches and linearly embedded, with positional encoding added to preserve spatial information. These embeddings were then passed through a series of Transformer encoder layers that progressively reduced spatial resolution while increasing feature abstraction, allowing for both local and global contextual learning. The model was trained using the preprocessed medulla MRI slices and labeled dysphagia severity, with performance evaluated based on classification metrics such as accuracy, precision, recall, and AUC. A schematic overview of the entire deep learning pipeline, from diffusion-weighted image preprocessing to patch embedding, hierarchical feature extraction, and severity classification, is presented in Fig. 1.

2.5. Statistical analysis

Statistical analysis was performed using Python 3.8 on an Ubuntu 22.04 operating system, utilizing the Scikit-learn (version 1.5) and SciPy (version 1.11.4) libraries. Model performance was evaluated using receiver operating characteristic (ROC) curve analysis, and the area under the curve (AUC) was calculated as a primary indicator of discriminative ability. Confidence intervals for the mean AUC were estimated using the bias-corrected and accelerated bootstrap method.

3. Results

3.1. Patient characteristics

A total of 163 patients met the inclusion criteria for this study. The mean age of the participants was 60.5 ± 13.0 years. Among them, 44 patients (27.0%) were classified as having severe dysphagia, including 33 males (75.0%) and 11 females (25.0%), with a mean age of 60.0 ± 14.5 years. The remaining 119 patients (73.0%) were classified as having non-severe dysphagia, comprising 71 males (59.7%) and 48 females (40.3%), with a mean age of 60.7 ± 12.6 years. VFSS was conducted at a mean of 12.4 ± 3.4 days from stroke onset.
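The ROC analysis and bootstrap confidence interval described in Section 2.5 can be illustrated with a minimal sketch. It is not the authors' analysis code: it computes AUC from pairwise comparisons and uses a plain percentile bootstrap, which is a simpler stand-in for the bias-corrected and accelerated (BCa) method named in the text (SciPy's `scipy.stats.bootstrap` offers `method='BCa'` for the latter). The function names and the four-sample toy data are hypothetical.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def percentile_bootstrap_auc_ci(y_true, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC (simpler than BCa; resamples cases with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy labels and predicted scores, not study data.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
print(toy_auc)  # 0.75
ci_low, ci_high = percentile_bootstrap_auc_ci(toy_labels, toy_scores)
```

With only four samples the interval is very wide; the point of the sketch is the resampling procedure, not the numbers.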
Table 1 summarizes the baseline demographic characteristics stratified by dysphagia severity.

Table 1. Patient characteristics (n = 163).

Characteristic             Severe (n = 44)           Non-severe (n = 119)
Age (yr)                   60.0 ± 14.5               60.7 ± 12.6
Gender (M/F)               33 / 11 (75.0 / 25.0%)    71 / 48 (59.7 / 40.3%)
Days from onset to VFSS    12.4 ± 3.4 (overall mean)

Values are mean ± SD.

3.2. Model performance

The Hier-ViT model achieved an accuracy of 0.85, precision of 0.70, recall of 0.75, and an F1-score of 0.72 in classifying dysphagia severity. The area under the receiver operating characteristic curve (AUC) was 0.69, indicating fair overall classification performance. These metrics are summarized in Table 2, and the confusion matrix and ROC curve, which show the classifier's ability to distinguish between severity groups, are presented in Fig. 2.

Table 2. Performance of prediction models.

Model       Accuracy    Precision    Recall    F1-score    AUC
Hier-ViT    0.85        0.70         0.75      0.72        0.69

4. Discussion

This study investigated the performance of a deep learning model based on the Hier-ViT in classifying the severity of dysphagia following LMI using early brain MRI findings. The model achieved an accuracy of 0.85, indicating that it correctly classified the dysphagia severity in 85% of patients. These findings suggest that the model's predictions are reasonably reliable and may support decision-making regarding rehabilitation strategies. The AUC of 0.69 reflects the model's ability to discriminate between severe and non-severe dysphagia cases across different threshold settings, indicating a fair level of overall classification performance. The precision of 0.70 means that when the model predicts severe dysphagia, it is correct 70% of the time, an important factor in minimizing false-positive predictions and avoiding unnecessary interventions.
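As a quick consistency check on the metrics reported in Section 3.2, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision (0.70) and recall (0.75); the function name is hypothetical, and no confusion-matrix counts are assumed.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Reported precision and recall from Table 2.
recomputed_f1 = f1_from_precision_recall(0.70, 0.75)
print(round(recomputed_f1, 2))  # 0.72
```

The recomputed value agrees with the F1-score of 0.72 given in Table 2.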
The recall of 0.75 shows that the model successfully identifies 75% of patients who truly have severe dysphagia, which is essential for ensuring that patients at high risk are not overlooked. The balance between precision and recall is demonstrated by the F1-score of 0.72, which indicates moderate yet clinically meaningful performance in terms of both identifying true cases and minimizing misclassification.

From an anatomical perspective, this performance is particularly meaningful in light of previous findings indicating that anatomical lesion patterns within the medulla significantly influence the severity of dysphagia in patients with LMI. Specifically, lesion-symptom mapping studies have demonstrated that posterolateral involvement of both the upper and lower medulla, as identified on diffusion-weighted imaging, is strongly associated with severe dysphagia requiring enteral feeding [20]. Given that infarctions in the medullary region typically involve extremely small anatomical territories that are often indistinguishable by visual inspection alone, the model's ability to achieve this level of classification accuracy suggests that it may capture subtle lesion characteristics that are not easily recognized through standard radiological assessment.

Recent advances in deep learning have enabled automated analysis of brain imaging data across various domains, including the segmentation of brain tumors, the staging of Alzheimer's disease, and the classification of neurological disorders [23]. In this context, our application of a Transformer-based deep learning model trained on early brain MRI data offers a novel approach that integrates anatomical lesion characteristics with data-driven prediction of functional severity.

Previous deep learning studies on post-stroke dysphagia have predominantly focused on diagnostic classification using CNNs, particularly applied to VFSS data.
While prior CNN-based approaches have demonstrated significant utility in analyzing static VFSS images, particularly in detecting laryngeal penetration, aspiration, or pharyngeal residue, these models typically rely on 2D convolutional architectures that extract spatial features from individual frames [30]. More recently, studies have introduced video-based action recognition models utilizing 3D convolutional networks, which are capable of capturing both spatial and temporal dynamics of bolus movement in VFSS videos [31]. Nevertheless, most existing models have primarily focused on diagnostic classification, and relatively few have addressed the prediction of long-term outcomes or functional severity in dysphagia.

In non–deep learning-based studies, the clinical course and severity of dysphagia following LMI have been the subject of continued investigation [8, 18, 20]. Importantly, the ability to stratify patients by severity at an early stage holds significant clinical relevance. Such early risk prediction can support timely intervention planning and resource allocation, particularly in settings where VFSS is delayed or unavailable.

The present study applies a Hier-ViT architecture to early brain MRI data to classify dysphagia severity. In this approach, the input image is divided into patches, which are then linearly projected and processed by the Transformer encoder [32]. Although CNN-based methods have demonstrated commendable performance, their ability to model long-range dependencies is inherently limited due to the localized nature of convolution operations. In contrast, Hier-ViT effectively addresses this limitation by capturing both local and global contextual relationships through its hierarchical self-attention mechanism [33]. Hier-ViT has demonstrated effectiveness in various medical imaging tasks and has shown superior performance compared to conventional CNN-based approaches [34].
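The patch-based processing described above can be illustrated with a minimal sketch of the token bookkeeping alone. It assumes a 2 × 2 neighbor-merging scheme in the style of the Swin Transformer cited as [32]; whether the Hier-ViT used in this study follows exactly this scheme is not stated in the text. The learned linear projections, positional encodings, and self-attention layers are omitted, and the function names and the 8 × 8 toy image are hypothetical.

```python
def partition_into_patches(image, patch):
    """Split an H x W 2D list into non-overlapping patch x patch tiles, each flattened."""
    h, w = len(image), len(image[0])
    if h % patch or w % patch:
        raise ValueError("image size must be divisible by the patch size")
    return [
        [
            [image[r + i][c + j] for i in range(patch) for j in range(patch)]
            for c in range(0, w, patch)
        ]
        for r in range(0, h, patch)
    ]


def merge_2x2(grid):
    """Concatenate each 2 x 2 block of neighboring tokens into one longer token."""
    rows, cols = len(grid), len(grid[0])
    if rows % 2 or cols % 2:
        raise ValueError("token grid must have even height and width")
    return [
        [
            grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


# Toy 8 x 8 image whose pixel values are simply their raster index.
toy_image = [[r * 8 + c for c in range(8)] for r in range(8)]
stage1 = partition_into_patches(toy_image, patch=2)  # 4 x 4 tokens, 4 values each
stage2 = merge_2x2(stage1)                           # 2 x 2 tokens, 16 values each
stage3 = merge_2x2(stage2)                           # 1 x 1 token, 64 values
print(len(stage1), len(stage2), len(stage3))  # 4 2 1
```

Each merge halves the spatial resolution of the token grid while lengthening every token, which is the sense in which a hierarchical model trades resolution for feature abstraction. In the Swin design a learned linear layer then reduces the concatenated token to a smaller dimension; that step is left out here.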
Applying a hierarchical Transformer to early MRI constitutes a novel and meaningful extension of deep learning applications in the domain of neuroimaging for stroke-related dysphagia, as it incorporates global contextual information while preserving spatial hierarchies, a key advantage when analyzing small, anatomically complex structures such as the medulla. More broadly, the implementation of a Hier-ViT model in this study highlights its potential utility in medical imaging domains that involve anatomically small but functionally critical regions. Despite inherent limitations, including modest dataset size, the model demonstrated stable classification performance, suggesting that transformer architectures may offer robustness in complex neuroanatomical prediction tasks. This reinforces the potential of early imaging-based models in supporting clinical decision-making.

This study has several limitations that should be acknowledged. First, the sample was drawn from a single-center cohort, which may limit the generalizability of the model's performance to broader populations with different demographic or clinical characteristics. Second, external validation was not performed, restricting the applicability of the model to other clinical settings and raising concerns about potential overfitting to institution-specific data. Third, the follow-up period was confined to the early phase of dysphagia recovery, thereby limiting insight into long-term functional outcomes. Lastly, the model was developed using only MRI and VFSS data, without incorporating additional clinical variables such as medical history or neurological assessments, which may have further enhanced the model's predictive capacity through multi-modal integration.

Future research should focus on validating the current findings across larger, multi-center datasets to enhance the generalizability and clinical robustness of the model.
Incorporating additional clinical variables (such as neurological assessments, comorbidities, or laboratory findings) as well as multi-modal imaging may further improve predictive performance. Moreover, extending the follow-up period to capture long-term swallowing outcomes will be essential to fully evaluate the clinical relevance of such models. As deep learning approaches continue to evolve, efforts should also be made to streamline model architecture for real-time clinical implementation and to assess their utility in guiding personalized rehabilitation strategies.

5. Conclusion

This study demonstrated the feasibility of utilizing a Hier-ViT model to predict the severity of dysphagia in patients with LMI based on early brain MRI findings. The model showed moderate but clinically meaningful classification performance, indicating that Transformer-based architectures are capable of effectively capturing spatial features associated with swallowing dysfunction. Given the clinical importance of early prognosis in formulating individualized rehabilitation strategies, these findings support the potential application of deep learning models as adjunctive tools in dysphagia management. Further studies involving larger, heterogeneous cohorts and integration of multi-modal clinical data are warranted to improve predictive accuracy and enhance generalizability.

Declarations

Acknowledgements

We thank the Department of Medical Information at Dongguk University Ilsan Hospital for their support in accessing radiologic data.

Author contributions

T.L. collected clinical and imaging data, performed data preprocessing and annotation, and drafted the manuscript. K.N. developed and implemented the deep learning model and contributed to data analysis. B.H.K. critically reviewed the manuscript and contributed to overall interpretation. J.-W.P. conceptualized and supervised the study, provided critical revisions, and served as co-corresponding author.
All authors read and approved the final version of the manuscript.

Data availability statement

The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Competing interests

The authors declare no competing interests.

Ethics approval

This study was approved by the Institutional Review Board of Dongguk University Ilsan Hospital (IRB No. 2024-07-004). All methods were carried out in accordance with relevant guidelines and regulations.

Consent to participate/consent to publish

Informed consent was waived by the IRB due to the retrospective nature of the study using de-identified patient data.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00252208).

References

[1] Barer, D. H. The natural history and functional consequences of dysphagia after hemispheric stroke. J. Neurol. Neurosurg. Psychiatry 52, 236-241 (1989).
[2] Clavé, P. & Shaker, R. Dysphagia: current reality and scope of the problem. Nat. Rev. Gastroenterol. Hepatol. 12, 259-270 (2015).
[3] Ekberg, O., Hamdy, S., Woisard, V., Wuttge-Hannig, A. & Ortega, P. Social and psychological burden of dysphagia: its impact on diagnosis and treatment. Dysphagia 17, 139-146 (2002).
[4] Vuilleumier, P., Bogousslavsky, J. & Regli, F. Infarction of the lower brainstem. clinical, aetiological and MRI-topographical correlations. Brain 118, 1013-1025 (1995).
[5] Lee, H. & Sohn, C. H. Axial lateropulsion as a sole manifestation of lateral medullary infarction: a clinical variant related to rostral–dorsolateral lesion. Neurol. Res. 24, 773-774 (2002).
[6] Dieterich, M. & Brandt, T. Wallenberg's syndrome: lateropulsion, cyclorotation, and subjective visual vertical in thirty‐six patients. Ann. Neurol. 31, 399-408 (1992).
[7] Nowak, D. A. & Topka, H. R. The clinical variability of Wallenberg's syndrome. the anatomical correlate of ipsilateral axial lateropulsion. J. Neurol. 253, 507-511 (2006).
[8] Norrving, B. & Cronqvist, S. Lateral medullary infarction: prognosis in an unselected series. Neurology 41, 244–248 (1991).
[9] Sacco, R. L. et al. Wallenberg's lateral medullary syndrome. clinical-magnetic resonance imaging correlations. Arch. Neurol. 50, 609-614 (1993).
[10] Ertekin, C., Aydogdu, I., Tarlaci, S., Turman, A. B. & Kiylioglu, N. Mechanisms of dysphagia in suprabulbar palsy with lacunar infarct. Stroke 31, 1370-1376 (2000).
[11] Steuer, I. & Guertin, P. A. Central pattern generators in the brainstem and spinal cord: an overview of basic principles, similarities and differences. Rev. Neurosci. 30, 107-164 (2019).
[12] Vigderman, A. M., Chavin, J. M., Kososky, C. & Tahmoush, A. J. Aphagia due to pharyngeal constrictor paresis from acute lateral medullary infarction. J. Neurol. Sci. 155, 208-210 (1998).
[13] Jean, A. & Car, A. Inputs to the swallowing medullary neurons from the peripheral afferent fibers and the swallowing cortical area. Brain Res. 178, 567-572 (1979).
[14] Kim, T. J. et al. Dysphagia may be an independent marker of poor outcome in acute lateral medullary infarction. J. Clin. Neurol. 11, 349-357 (2015).
[15] Kim, H., Chung, C. S., Lee, K. H. & Robbins, J. Aspiration subsequent to a pure medullary infarction: lesion sites, clinical variables, and outcome. Arch. Neurol. 57, 478-483 (2000).
[16] Chun, M. H., Kim, D. & Chang, M. C. Comparison of dysphagia outcomes between rostral and caudal lateral medullary infarct patients. Int. J. Neurosci. 127, 965-970 (2017).
[17] Gupta, H. & Banerjee, A. Recovery of Dysphagia in lateral medullary stroke. Case Rep. Neurol. Med. 2014, 404871 (2014).
[18] Kim, H., Lee, H. J. & Park, J.-W. Clinical course and outcome in patients with severe dysphagia after lateral medullary syndrome. Ther. Adv. Neurol. Disord. 11, 1756286418759864 (2018).
[19] Jang, S. H. & Kim, M. S. Dysphagia in lateral medullary syndrome: a narrative review. Dysphagia 36, 329-338 (2021).
[20] Cho, Y.-J., Ryu, W.-S., Lee, H., Kim, D.-E. & Park, J.-W. Which factors affect the severity of dysphagia in lateral medullary infarction? Dysphagia 35, 414-418 (2020).
[21] Miller, D. D. & Brown, E. W. Artificial intelligence in medical practice: the question to the answer? Am. J. Med. 131, 129-133 (2018).
[22] LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436-444 (2015).
[23] Lundervold, A. S. & Lundervold, A. An overview of deep learning in medical imaging focusing on MRI. Z. Med. Phys. 29, 102-127 (2019).
[24] Islam, J. & Zhang, Y. Brain MRI analysis for Alzheimer's disease diagnosis using an ensemble system of deep convolutional neural networks. Brain Inform. 5, 2 (2018).
[25] Laukamp, K. R. et al. Fully automated detection and segmentation of meningiomas using deep learning on routine multiparametric MRI. Eur. Radiol. 29, 124-132 (2019).
[26] Perkuhn, M. et al. Clinical evaluation of a multiparametric deep learning model for glioblastoma segmentation using heterogeneous magnetic resonance imaging data from clinical routine. Invest. Radiol. 53, 647-654 (2018).
[27] Yoo, Y. et al. Deep learning of joint myelin and T1w MRI features in normal-appearing brain tissue to distinguish between multiple sclerosis patients and healthy controls. NeuroImage Clin. 17, 169-178 (2018).
[28] Kleesiek, J. et al. Deep MRI brain extraction: A 3D convolutional neural network for skull stripping. Neuroimage 129, 460-469 (2016).
[29] Li, H., Parikh, N. A. & He, L. A novel transfer learning approach to enhance deep neural network classification of brain functional connectomes. Front. Neurosci. 12, 491 (2018).
[30] Lee, S. J., Ko, J. Y., Kim, H. I. & Choi, S. I. Automatic detection of airway invasion from videofluoroscopy via deep learning technology. Appl. Sci. 10, 6179 (2020).
[31] Nam, K. et al. Automated laryngeal invasion detector of boluses in videofluoroscopic swallowing study videos using action recognition-based networks. Diagnostics (Basel) 14, 1444 (2024).
[32] Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows in Proceedings of the IEEE/CVF International Conference on Computer Vision 9992–10002 (IEEE, New York, 2021).
[33] Cao, H. et al. Swin-unet: Unet-like pure transformer for medical image segmentation in European Conference on Computer Vision (Springer, 2022).
[34] Cantone, M., Marrocco, C., Tortorella, F. & Bria, A. Convolutional networks and transformers for mammography classification: an experimental study. Sensors (Basel) 23, 1229 (2023).

Editorial timeline at journal

Editorial decision: Revision requested, 12 Nov, 2025
Reviews received at journal, 10 Nov, 2025
Reviewers agreed at journal, 04 Nov, 2025
Reviewers agreed at journal, 04 Nov, 2025
Reviews received at journal, 26 Jul, 2025
Reviewers agreed at journal, 18 Jul, 2025
Reviewers invited by journal, 18 Jul, 2025
Editor assigned by journal, 17 Jul, 2025
Editor invited by journal, 10 Jul, 2025
Submission checks completed at journal, 07 Jul, 2025
First submitted to journal, 07 Jul, 2025
Figure legends

Figure 1. Deep learning pipeline using Hier-ViT for dysphagia severity classification. Diffusion-weighted images (DWIs) from axial medullary slices were preprocessed and segmented into fixed-size patches. Each patch was converted into an embedding vector and input into the hierarchical vision transformer (Hier-ViT) architecture, enabling multiscale feature aggregation. The output features were then used to classify patients into severe or non-severe dysphagia groups, based on videofluoroscopic swallowing study (VFSS) labels.

Figure 2. Confusion matrix (A) and ROC curve (B) of Hier-ViT model. (A) Confusion matrix showing model predictions of dysphagia severity. (B) Receiver operating characteristic (ROC) curve with area under the curve (AUC) of 0.69, indicating fair discriminative performance.

Author affiliations

Taeheon Lee, Bo Hae Kim, Jin-Woo Park (corresponding author): Dongguk University Ilsan Hospital
Kihwan Nam: Korea University

Published version: https://doi.org/10.1038/s41598-026-40751-9
LMI is characterized by a range of neurological symptoms, including gaze-induced nystagmus [\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e], Horner\u0026rsquo;s syndrome [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e], and ipsilateral ataxia [\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e]. Importantly, dysphagia has been reported in 51\u0026ndash;100% of patients with LMI [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e], and it tends to be more severe and prolonged in LMI patients than in patients with hemispheric stroke [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e]. This can be attributed to the anatomical characteristics of the medulla, where the central pattern generators (CPGs) responsible for swallowing are located. CPGs are generally defined as neural networks capable of generating central commands that govern stereotyped, rhythmic motor behaviors such as swallowing [\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e, \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e]. In the medulla, the CPG is composed primarily of the nucleus of the tractus solitarius (NTS) and the nucleus ambiguus (NA). Neurons in the NTS function as interneurons that coordinate and program the sequential swallowing motor pattern, while neurons in the NA primarily serve as motoneurons innervating the pharyngolaryngeal muscles and the esophagus [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e]. Initial dysphagia is considered a key factor associated with poor outcomes following LMI [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. 
Although most patients with dysphagia experience mild symptoms and recover quickly [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e, \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e], some patients with severe dysphagia require tube feeding for several months or even years [\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Therefore, predicting the prognosis of dysphagia is clinically important for developing appropriate therapeutic strategies. Several studies have attempted to identify factors associated with dysphagia prognosis in LMI patients [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eA previous lesion-symptom mapping study demonstrated that posterolateral involvement in both the upper and lower medulla, as visualized on diffusion-weighted imaging, was significantly associated with severe dysphagia, typically characterized by decreased pharyngeal constriction and the apparent absence of any evidence of esophageal passage in videofluoroscopic swallowing study (VFSS) findings, as well as the initial requirement for enteral tube feeding [\u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. These findings suggest that anatomical factors related to the extent and vertical distribution of the lesion play a critical role in determining swallowing impairment. 
This underscores the importance of early lesion localization in predicting swallowing outcomes and guiding individualized rehabilitation planning.\u003c/p\u003e\u003cp\u003eRecently, artificial intelligence (AI) based on deep learning has been expanding its applications in various medical fields to enhance the diagnosis and treatment of diseases [\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Deep learning methods are a type of representation learning that leverage multiple levels of abstraction. These methods consist of simple but non-linear modules that transform the representation at one level (starting with raw input) into a higher, more abstract level of representation [\u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eDeep learning techniques have been widely applied across various organ systems, including the kidney, prostate, and spine [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. Notably, brain image analysis has emerged as one of the most extensively studied areas in the field of medical imaging. A variety of tasks have been successfully addressed, such as the staging and early diagnosis of Alzheimer\u0026rsquo;s disease using multimodal magnetic resonance imaging (MRI) data [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], automated segmentation of brain tumors including glioblastoma and meningioma from heterogeneous clinical MRI scans [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e], and lesion detection in patients with multiple sclerosis [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e]. 
Additional applications include skull stripping for brain extraction [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e] and classification of brain functional connectomes [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eGiven that anatomical lesion patterns in the medulla have been shown to significantly influence the severity of dysphagia in patients with LMI, accurate lesion localization is of particular clinical relevance. At the same time, recent advances in deep learning have led to substantial progress in brain imaging analysis, with applications ranging from disease classification to lesion segmentation and functional mapping. These developments suggest that integrating anatomical insights with data-driven modeling may offer a powerful approach to predicting dysphagia severity.\u003c/p\u003e\u003cp\u003eIn this study, we aim to develop and validate a Transformer-based deep learning algorithm to automatically analyze the location of initial MRI lesions in patients with LMI and predict the prognosis of dysphagia. By leveraging the ability of Transformers to capture both local and global patterns, we expect to enhance the accuracy of prognosis prediction and facilitate the development of more effective rehabilitation strategies.\u003c/p\u003e"},{"header":"2. Materials and methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1. Study design\u003c/h2\u003e\u003cp\u003eWe performed a retrospective study analyzing clinical and imaging data of patients who were admitted to Dongguk University Ilsan Hospital with acute LMI from September 2005 to July 2024. Participants satisfied the following inclusion criteria: (1) first-ever onset of acute stroke, (2) age of 20 years or older, and (3) diagnosis of lateral medullary infarction confirmed by MRI. 
Patients with other structural brain disorders, including neurodegenerative or neuromuscular conditions that could independently influence swallowing function, were excluded from the study. The study was approved by the Institutional Review Board of Dongguk University Ilsan Hospital (No. 2024-07-004). All procedures were performed in compliance with the relevant guidelines and regulations.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2. Data collection and outcome measures\u003c/h2\u003e\u003cp\u003eAll stroke patients underwent brain MRI within 24 hours of admission using a 1.5-T MR machine (MAGNETOM-Avanto, Siemens, Erlangen, Germany). Diffusion-weighted MRI (DWI) parameters were as follows: b-values of 0 and 1000 s/mm\u0026sup2;, repetition time of 5400 ms, echo time of 77 ms, field of view of 220 \u0026times; 220 mm, slice thickness of 3.0 mm, and an interslice gap of 0.3 mm. Following established protocols [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e], three DWI slices were extracted from the lower, middle, and upper medulla levels based on the MNI brain template for analysis.\u003c/p\u003e\u003cp\u003eVFSS was conducted at a mean of 14 days following stroke onset using a fluoroscopic system (Sonialvision-100, Shimadzu Corporation, Kyoto, Japan). All examinations adhered to a standardized protocol, which included the following: (1) patients maintained an upright seated position throughout the procedure; (2) each subject was given 5 mL of diluted barium solution (35% w/v), curd-type yogurt, and mashed boiled pumpkin, each administered twice; and (3) the protocol was modified as needed according to the patient\u0026rsquo;s clinical status and level of cooperation.\u003c/p\u003e\u003cp\u003eDysphagia severity was determined according to VFSS findings [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. 
Severe dysphagia was defined by the presence of diminished pharyngeal contraction and a clear lack of observable esophageal passage. Patients who did not exhibit these features were categorized as having non-severe dysphagia.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3. Data preprocessing\u003c/h2\u003e\u003cp\u003eThree brain MRI images were selected based on consistent anatomical landmarks to ensure comparability across subjects and were input into the model as independent samples. Each patient was labeled as having either severe or non-severe dysphagia based on VFSS results, creating a binary classification task. All images were resized to a standardized resolution and normalized for pixel intensity. The dataset was split into training (70%) and testing (30%) sets to evaluate model performance. No additional clinical variables were included at this stage, and the analysis focused solely on the spatial imaging features from early post-stroke MRI scans. This preprocessing ensured that the model learned discriminative patterns directly from medullary lesion characteristics while controlling for variations in image size and intensity.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4. Deep learning models\u003c/h2\u003e\u003cp\u003eThis study employed a Hierarchical Vision Transformer (Hier-ViT) model to classify dysphagia severity based on early brain MRI data. Hier-ViT is a Transformer-based vision model that constructs multi-level representations by hierarchically merging non-overlapping image patches. Each input image was divided into fixed-size patches and linearly embedded, with positional encoding added to preserve spatial information. These embeddings were then passed through a series of Transformer encoder layers that progressively reduced spatial resolution while increasing feature abstraction, allowing for both local and global contextual learning. 
The model was trained using the preprocessed medulla MRI slices and labeled dysphagia severity, with performance evaluated based on classification metrics such as accuracy, precision, recall, and AUC. A schematic overview of the entire deep learning pipeline\u0026mdash;from diffusion-weighted image preprocessing to patch embedding, hierarchical feature extraction, and severity classification\u0026mdash;is presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5. Statistical analysis\u003c/h2\u003e\u003cp\u003eStatistical analysis was performed using Python 3.8 on an Ubuntu 22.04 operating system, utilizing the Scikit-learn (version 1.5) and SciPy (version 1.11.4) libraries. Model performance was evaluated using receiver operating characteristic (ROC) curve analysis, and the area under the curve (AUC) was calculated as a primary indicator of discriminative ability. Confidence intervals for the mean AUC were estimated using the bias-corrected and accelerated bootstrap method.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec9\" class=\"Section2\"\u003e\u003ch2\u003e3.1. Patient characteristics\u003c/h2\u003e\u003cp\u003eA total of 163 patients met the inclusion criteria for this study. The mean age of the participants was 60.5\u0026thinsp;\u0026plusmn;\u0026thinsp;13.0 years. Among them, 44 patients (27.0%) were classified as having severe dysphagia, including 33 males (75.0%) and 11 females (25.0%), with a mean age of 60.0\u0026thinsp;\u0026plusmn;\u0026thinsp;14.5 years. The remaining 119 patients (73.0%) were classified as having non-severe dysphagia, comprising 71 males (59.7%) and 48 females (40.3%), with a mean age of 60.7\u0026thinsp;\u0026plusmn;\u0026thinsp;12.6 years. 
VFSS was conducted at a mean of 12.4\u0026thinsp;\u0026plusmn;\u0026thinsp;3.4 days from stroke onset. Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e summarizes the baseline demographic characteristics stratified by dysphagia severity.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003ePatient characteristics (n\u0026thinsp;=\u0026thinsp;163).\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eCharacteristic\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eSevere\u003c/p\u003e\u003cp\u003e(n\u0026thinsp;=\u0026thinsp;44)\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNon-severe\u003c/p\u003e\u003cp\u003e(n\u0026thinsp;=\u0026thinsp;119)\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eAge (yr)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003e60.0\u0026thinsp;\u0026plusmn;\u0026thinsp;14.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e60.7\u0026thinsp;\u0026plusmn;\u0026thinsp;12.6\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eGender (M/F)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e33 / 11 (75.0 / 25.0%)\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e71 / 48 (59.7 / 40.3%)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003eDays from onset to VFSS\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colspan=\"2\" nameend=\"c3\" namest=\"c2\"\u003e\u003cp\u003e12.4\u0026thinsp;\u0026plusmn;\u0026thinsp;3.4 (overall mean)\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003ctfoot\u003e\u003ctr\u003e\u003ctd colspan=\"3\"\u003eValues are mean\u0026thinsp;\u0026plusmn;\u0026thinsp;SD.\u003c/td\u003e\u003c/tr\u003e\u003c/tfoot\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec10\" class=\"Section2\"\u003e\u003ch2\u003e3.2. Model performance\u003c/h2\u003e\u003cp\u003eThe Hier-ViT model achieved an accuracy of 0.85, precision of 0.70, recall of 0.75, and an F1-score of 0.72 in classifying dysphagia severity. The area under the receiver operating characteristic curve (AUC) was 0.69, indicating fair overall classification performance. These performance metrics are summarized in Table\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e. 
The confusion matrix and ROC curve are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e, illustrating the classifier\u0026rsquo;s ability to distinguish between severity groups.\u003c/p\u003e\u003cp\u003e\u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003ePerformance of prediction models.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"6\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u0026nbsp;\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eAccuracy\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePrecision\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003eRecall\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eF1-score\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c6\"\u003e\u003cp\u003eAUC\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cb\u003eHier-ViT\u003c/b\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003e0.85\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003e0.70\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e0.75\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003e0.72\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c6\"\u003e\u003cp\u003e0.69\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/colgroup\u003e\u003c/table\u003e\u003c/div\u003e\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eThis study investigated the performance of a deep learning model based on Hier-ViT in classifying the severity of dysphagia following LMI using early brain MRI findings. The model achieved an accuracy of 0.85, indicating that it correctly classified the dysphagia severity in 85% of patients. These findings suggest that the model\u0026rsquo;s predictions are reasonably reliable and may support decision-making regarding rehabilitation strategies. The AUC of 0.69 reflects the model\u0026rsquo;s ability to discriminate between severe and non-severe dysphagia cases across different threshold settings, indicating a fair level of overall classification performance. The precision of 0.70 means that when the model predicts severe dysphagia, it is correct 70% of the time\u0026mdash;an important factor in minimizing false-positive predictions and avoiding unnecessary interventions. The recall of 0.75 shows that the model successfully identifies 75% of patients who truly have severe dysphagia, which is essential for ensuring that patients at high risk are not overlooked. 
The balance between precision and recall is demonstrated by the F1-score of 0.72, which indicates moderate yet clinically meaningful performance in terms of both identifying true cases and minimizing misclassification.\u003c/p\u003e\u003cp\u003eFrom an anatomical perspective, this performance is particularly meaningful in light of previous findings indicating that anatomical lesion patterns within the medulla significantly influence the severity of dysphagia in patients with LMI. Specifically, lesion-symptom mapping studies have demonstrated that posterolateral involvement of both the upper and lower medulla, as identified on diffusion-weighted imaging, is strongly associated with severe dysphagia requiring enteral feeding [\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Given that infarctions in the medullary region typically involve extremely small anatomical territories that are often indistinguishable by visual inspection alone, the model\u0026rsquo;s ability to achieve this level of classification accuracy suggests that it may capture subtle lesion characteristics that are not easily recognized through standard radiological assessment.\u003c/p\u003e\u003cp\u003eRecent advances in deep learning have enabled automated analysis of brain imaging data across various domains, including the segmentation of brain tumors, the staging of Alzheimer\u0026rsquo;s disease, and the classification of neurological disorders [\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. In this context, our application of a Transformer-based deep learning model trained on early brain MRI data offers a novel approach that integrates anatomical lesion characteristics with data-driven prediction of functional severity.\u003c/p\u003e\u003cp\u003ePrevious deep learning studies on post-stroke dysphagia have predominantly focused on diagnostic classification using CNNs, particularly applied to VFSS data. 
While prior CNN-based approaches have demonstrated significant utility in analyzing static VFSS images\u0026mdash;particularly in detecting laryngeal penetration, aspiration, or pharyngeal residue\u0026mdash;these models typically rely on 2D convolutional architectures that extract spatial features from individual frames [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. More recently, studies have introduced video-based action recognition models utilizing 3D convolutional networks, which are capable of capturing both spatial and temporal dynamics of bolus movement in VFSS videos [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. Nevertheless, most existing models have primarily focused on diagnostic classification, and relatively few have addressed the prediction of long-term outcomes or functional severity in dysphagia. In non\u0026ndash;deep learning-based studies, the clinical course and severity of dysphagia following LMI have been the subject of continued investigation [\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e].\u003c/p\u003e\u003cp\u003eImportantly, the ability to stratify patients by severity at an early stage holds significant clinical relevance. Such early risk prediction can support timely intervention planning and resource allocation, particularly in settings where VFSS is delayed or unavailable.\u003c/p\u003e\u003cp\u003eThe present study applies a Hier-ViT architecture to early brain MRI data to classify dysphagia severity. In this approach, the input image is divided into patches, which are then linearly projected and processed by the Transformer encoder [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e]. 
Although CNN-based methods have demonstrated commendable performance, their ability to model long-range dependencies is inherently limited due to the localized nature of convolution operations. In contrast, Hier-ViT effectively addresses this limitation by capturing both local and global contextual relationships through its hierarchical self-attention mechanism [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e]. Hier-ViT has demonstrated effectiveness in various medical imaging tasks and has shown superior performance compared to conventional CNN-based approaches [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e]. This constitutes a novel and meaningful extension of deep learning applications in the domain of neuroimaging for stroke-related dysphagia, as it incorporates global contextual information while preserving spatial hierarchies\u0026mdash;a key advantage when analyzing small, anatomically complex structures such as the medulla.\u003c/p\u003e\u003cp\u003eMore broadly, the implementation of a Hier-ViT model in this study highlights its potential utility in medical imaging domains that involve anatomically small but functionally critical regions. Despite inherent limitations, including modest dataset size, the model demonstrated stable classification performance, suggesting that transformer architectures may offer robustness in complex neuroanatomical prediction tasks. This reinforces the potential of early imaging-based models in supporting clinical decision-making.\u003c/p\u003e\u003cp\u003eThis study has several limitations that should be acknowledged. First, the sample was drawn from a single-center cohort, which may limit the generalizability of the model\u0026rsquo;s performance to broader populations with different demographic or clinical characteristics. 
Second, external validation was not performed, restricting the applicability of the model to other clinical settings and raising concerns about potential overfitting to institution-specific data. Third, the follow-up period was confined to the early phase of dysphagia recovery, thereby limiting insight into long-term functional outcomes. Lastly, the model was developed using only MRI and VFSS data, without incorporating additional clinical variables such as medical history or neurological assessments, which may have further enhanced the model\u0026rsquo;s predictive capacity through multi-modal integration.\u003c/p\u003e\u003cp\u003eFuture research should focus on validating the current findings across larger, multi-center datasets to enhance the generalizability and clinical robustness of the model. Incorporating additional clinical variables\u0026mdash;such as neurological assessments, comorbidities, or laboratory findings\u0026mdash;as well as multi-modal imaging may further improve predictive performance. Moreover, extending the follow-up period to capture long-term swallowing outcomes will be essential to fully evaluate the clinical relevance of such models. As deep learning approaches continue to evolve, efforts should also be made to streamline model architecture for real-time clinical implementation and to assess their utility in guiding personalized rehabilitation strategies.\u003c/p\u003e"},{"header":"5. Conclusion","content":"\u003cp\u003eThis study demonstrated the feasibility of utilizing a Hier-ViT model to predict the severity of dysphagia in patients with LMI based on early brain MRI findings. The model showed moderate but clinically meaningful classification performance, indicating that Transformer-based architectures are capable of effectively capturing spatial features associated with swallowing dysfunction. 
Given the clinical importance of early prognosis in formulating individualized rehabilitation strategies, these findings support the potential application of deep learning models as adjunctive tools in dysphagia management. Further studies involving larger, heterogeneous cohorts and integration of multi-modal clinical data are warranted to improve predictive accuracy and enhance generalizability.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eAcknowledgements\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eWe thank the Department of Medical Information at Dongguk University Ilsan Hospital for their support in accessing radiologic data.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eT.L. collected clinical and imaging data, performed data preprocessing and annotation, and drafted the manuscript. K.N. developed and implemented the deep learning model and contributed to data analysis. B.H.K. critically reviewed the manuscript and contributed to overall interpretation. J.-W.P. conceptualized and supervised the study, provided critical revisions, and served as co-corresponding author. All authors read and approved the final version of the manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eData availability statement\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis study was approved by the Institutional Review Board of Dongguk University Ilsan Hospital (IRB No. 2024-07-004). 
All methods were carried out in accordance with relevant guidelines and regulations.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eConsent to participate/consent to publish\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eInformed consent was waived by the IRB due to the retrospective nature of the study using de-identified patient data.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. RS-2023-00252208).\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eBarer, D. H. The natural history and functional consequences of dysphagia after hemispheric stroke\u003cem\u003e.\u003c/em\u003e \u003cem\u003eJ. Neurol. Neurosurg. Psychiatry\u003c/em\u003e \u003cstrong\u003e52\u003c/strong\u003e, 236-241 (1989).\u003c/li\u003e\n\u003cli\u003eClav\u0026eacute;, P. \u0026amp; Shaker, R. Dysphagia: current reality and scope of the problem\u003cem\u003e.\u003c/em\u003e \u003cem\u003eNat. Rev. Gastroenterol. Hepatol.\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 259-270 (2015).\u003c/li\u003e\n\u003cli\u003eEkberg, O., Hamdy, S., Woisard, V., Wuttge-Hannig, A. \u0026amp; Ortega, P. Social and psychological burden of dysphagia: its impact on diagnosis and treatment\u003cem\u003e.\u003c/em\u003e \u003cem\u003eDysphagia\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 139-146 (2002).\u003c/li\u003e\n\u003cli\u003eVuilleumier, P., Bogousslavsky, J. \u0026amp; Regli, F. Infarction of the lower brainstem. Clinical, aetiological and MRI-topographical correlations\u003cem\u003e.\u003c/em\u003e \u003cem\u003eBrain\u003c/em\u003e \u003cstrong\u003e118\u003c/strong\u003e, 1013-1025 (1995).\u003c/li\u003e\n\u003cli\u003eLee, H. \u0026amp; Sohn, C. H. 
Axial lateropulsion as a sole manifestation of lateral medullary infarction: a clinical variant related to rostral\u0026ndash;dorsolateral lesion\u003cem\u003e.\u003c/em\u003e \u003cem\u003eNeurol. Res.\u003c/em\u003e \u003cstrong\u003e24\u003c/strong\u003e, 773-774 (2002).\u003c/li\u003e\n\u003cli\u003eDieterich, M. \u0026amp; Brandt, T. Wallenberg\u0026rsquo;s syndrome: lateropulsion, cyclorotation, and subjective visual vertical in thirty‐six patients\u003cem\u003e.\u003c/em\u003e \u003cem\u003eAnn. Neurol.\u003c/em\u003e \u003cstrong\u003e31\u003c/strong\u003e, 399-408 (1992).\u003c/li\u003e\n\u003cli\u003eNowak, D. A. \u0026amp; Topka, H. R. The clinical variability of Wallenberg\u0026rsquo;s syndrome. the anatomical correlate of ipsilateral axial lateropulsion\u003cem\u003e.\u003c/em\u003e \u003cem\u003eJ. Neurol.\u003c/em\u003e \u003cstrong\u003e253\u003c/strong\u003e, 507-511 (2006).\u003c/li\u003e\n\u003cli\u003eNorrving, B. \u0026amp; Cronqvist, S. Lateral medullary infarction: prognosis in an unselected series\u003cem\u003e.\u003c/em\u003e \u003cem\u003eNeurology\u003c/em\u003e \u003cstrong\u003e41\u003c/strong\u003e, 244\u0026ndash;248 (1991).\u003c/li\u003e\n\u003cli\u003eSacco, R. L. et al. Wallenberg\u0026rsquo;s lateral medullary syndrome. clinical-magnetic resonance imaging correlations\u003cem\u003e.\u003c/em\u003e \u003cem\u003eArch. Neurol.\u003c/em\u003e \u003cstrong\u003e50\u003c/strong\u003e, 609-614 (1993).\u003c/li\u003e\n\u003cli\u003eErtekin, C., Aydogdu, I., Tarlaci, S., Turman, A. B. \u0026amp; Kiylioglu, N. Mechanisms of dysphagia in suprabulbar palsy with lacunar infarct\u003cem\u003e.\u003c/em\u003e \u003cem\u003eStroke\u003c/em\u003e \u003cstrong\u003e31\u003c/strong\u003e, 1370-1376 (2000).\u003c/li\u003e\n\u003cli\u003eSteuer, I. \u0026amp; Guertin, P. A. Central pattern generators in the brainstem and spinal cord: an overview of basic principles, similarities and differences\u003cem\u003e.\u003c/em\u003e \u003cem\u003eRev. 
Neurosci.\u003c/em\u003e \u003cstrong\u003e30\u003c/strong\u003e, 107-164 (2019).\u003c/li\u003e\n\u003cli\u003eVigderman, A. M., Chavin, J. M., Kososky, C. \u0026amp; Tahmoush, A. J. Aphagia due to pharyngeal constrictor paresis from acute lateral medullary infarction\u003cem\u003e.\u003c/em\u003e \u003cem\u003eJ. Neurol. Sci.\u003c/em\u003e \u003cstrong\u003e155\u003c/strong\u003e, 208-210 (1998).\u003c/li\u003e\n\u003cli\u003eJean, A. \u0026amp; Car, A. Inputs to the swallowing medullary neurons from the peripheral afferent fibers and the swallowing cortical area\u003cem\u003e.\u003c/em\u003e \u003cem\u003eBrain Res.\u003c/em\u003e \u003cstrong\u003e178\u003c/strong\u003e, 567-572 (1979).\u003c/li\u003e\n\u003cli\u003eKim, T. J. et al. Dysphagia may be an independent marker of poor outcome in acute lateral medullary infarction\u003cem\u003e.\u003c/em\u003e \u003cem\u003eJ. Clin. Neurol.\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 349-357 (2015).\u003c/li\u003e\n\u003cli\u003eKim, H., Chung, C. S., Lee, K. H. \u0026amp; Robbins, J. Aspiration subsequent to a pure medullary infarction: lesion sites, clinical variables, and outcome\u003cem\u003e.\u003c/em\u003e \u003cem\u003eArch. Neurol.\u003c/em\u003e \u003cstrong\u003e57\u003c/strong\u003e, 478-483 (2000).\u003c/li\u003e\n\u003cli\u003eChun, M. H., Kim, D. \u0026amp; Chang, M. C. Comparison of dysphagia outcomes between rostral and caudal lateral medullary infarct patients\u003cem\u003e.\u003c/em\u003e \u003cem\u003eInt. J. Neurosci.\u003c/em\u003e \u003cstrong\u003e127\u003c/strong\u003e, 965-970 (2017).\u003c/li\u003e\n\u003cli\u003eGupta, H. \u0026amp; Banerjee, A. Recovery of Dysphagia in lateral medullary stroke\u003cem\u003e.\u003c/em\u003e \u003cem\u003eCase Rep. Neurol. Med.\u003c/em\u003e \u003cstrong\u003e2014\u003c/strong\u003e, 404871 (2014).\u003c/li\u003e\n\u003cli\u003eKim, H., Lee, H. J. \u0026amp; Park, J.-W. 
Clinical course and outcome in patients with severe dysphagia after lateral medullary syndrome\u003cem\u003e.\u003c/em\u003e \u003cem\u003eTher. Adv. Neurol. Disord.\u003c/em\u003e \u003cstrong\u003e11\u003c/strong\u003e, 1756286418759864 (2018).\u003c/li\u003e\n\u003cli\u003eJang, S. H. \u0026amp; Kim, M. S. Dysphagia in lateral medullary syndrome: a narrative review\u003cem\u003e.\u003c/em\u003e \u003cem\u003eDysphagia\u003c/em\u003e \u003cstrong\u003e36\u003c/strong\u003e, 329-338 (2021).\u003c/li\u003e\n\u003cli\u003eCho, Y.-J., Ryu, W.-S., Lee, H., Kim, D.-E. \u0026amp; Park, J.-W. Which factors affect the severity of dysphagia in lateral medullary infarction? \u003cem\u003eDysphagia\u003c/em\u003e \u003cstrong\u003e35\u003c/strong\u003e, 414-418 (2020).\u003c/li\u003e\n\u003cli\u003eMiller, D. D. \u0026amp; Brown, E. W. Artificial intelligence in medical practice: the question to the answer? \u003cem\u003eAm. J. Med.\u003c/em\u003e \u003cstrong\u003e131\u003c/strong\u003e, 129-133 (2018).\u003c/li\u003e\n\u003cli\u003eLeCun, Y., Bengio, Y. \u0026amp; Hinton, G. Deep learning. \u003cem\u003eNature\u003c/em\u003e \u003cstrong\u003e521\u003c/strong\u003e, 436-444 (2015).\u003c/li\u003e\n\u003cli\u003eLundervold, A. S. \u0026amp; Lundervold, A. An overview of deep learning in medical imaging focusing on MRI\u003cem\u003e.\u003c/em\u003e \u003cem\u003eZ. Med. Phys.\u003c/em\u003e \u003cstrong\u003e29\u003c/strong\u003e, 102-127 (2019).\u003c/li\u003e\n\u003cli\u003eIslam, J. \u0026amp; Zhang, Y. Brain MRI analysis for Alzheimer\u0026rsquo;s disease diagnosis using an ensemble system of deep convolutional neural networks\u003cem\u003e.\u003c/em\u003e \u003cem\u003eBrain Inform.\u003c/em\u003e \u003cstrong\u003e5\u003c/strong\u003e, 2 (2018).\u003c/li\u003e\n\u003cli\u003eLaukamp, K. R. et al. Fully automated detection and segmentation of meningiomas using deep learning on routine multiparametric MRI\u003cem\u003e.\u003c/em\u003e \u003cem\u003eEur. 
Radiol.\u003c/em\u003e \u003cstrong\u003e29\u003c/strong\u003e, 124-132 (2019).\u003c/li\u003e\n\u003cli\u003ePerkuhn, M. et al. Clinical evaluation of a multiparametric deep learning model for glioblastoma segmentation using heterogeneous magnetic resonance imaging data from clinical routine\u003cem\u003e.\u003c/em\u003e \u003cem\u003eInvest. Radiol.\u003c/em\u003e \u003cstrong\u003e53\u003c/strong\u003e, 647-654 (2018).\u003c/li\u003e\n\u003cli\u003eYoo, Y. et al. Deep learning of joint myelin and T1w MRI features in normal-appearing brain tissue to distinguish between multiple sclerosis patients and healthy controls\u003cem\u003e.\u003c/em\u003e \u003cem\u003eNeuroImage Clin.\u003c/em\u003e \u003cstrong\u003e17\u003c/strong\u003e, 169-178 (2018).\u003c/li\u003e\n\u003cli\u003eKleesiek, J. et al. Deep MRI brain extraction: A 3D convolutional neural network for skull stripping\u003cem\u003e.\u003c/em\u003e \u003cem\u003eNeuroimage\u003c/em\u003e \u003cstrong\u003e129\u003c/strong\u003e, 460-469 (2016).\u003c/li\u003e\n\u003cli\u003eLi, H., Parikh, N. A. \u0026amp; He, L. A novel transfer learning approach to enhance deep neural network classification of brain functional connectomes\u003cem\u003e.\u003c/em\u003e \u003cem\u003eFront. Neurosci.\u003c/em\u003e \u003cstrong\u003e12\u003c/strong\u003e, 491 (2018).\u003c/li\u003e\n\u003cli\u003eLee, S. J., Ko, J. Y., Kim, H. I. \u0026amp; Choi, S. I. Automatic detection of airway invasion from videofluoroscopy via deep learning technology\u003cem\u003e.\u003c/em\u003e \u003cem\u003eAppl. Sci.\u003c/em\u003e \u003cstrong\u003e10\u003c/strong\u003e, 6179 (2020).\u003c/li\u003e\n\u003cli\u003eNam, K. et al. Automated laryngeal invasion detector of boluses in videofluoroscopic swallowing study videos using action recognition-based networks\u003cem\u003e.\u003c/em\u003e \u003cem\u003eDiagnostics (Basel)\u003c/em\u003e \u003cstrong\u003e14\u003c/strong\u003e, 1444 (2024).\u003c/li\u003e\n\u003cli\u003eLiu, Z. et al. 
Swin transformer: hierarchical vision transformer using shifted windows in \u003cem\u003eProceedings of the IEEE/CVF International Conference on Computer Vision\u003c/em\u003e 9992\u0026ndash;10002 (IEEE, New York, 2021).\u003c/li\u003e\n\u003cli\u003eCao, H. et al. Swin-unet: Unet-like pure transformer for medical image segmentation in \u003cem\u003eEuropean Conference on Computer Vision\u003c/em\u003e (Springer, 2022).\u003c/li\u003e\n\u003cli\u003eCantone, M., Marrocco, C., Tortorella, F. \u0026amp; Bria, A. Convolutional networks and transformers for mammography classification: an experimental study\u003cem\u003e.\u003c/em\u003e \u003cem\u003eSensors (Basel)\u003c/em\u003e \u003cstrong\u003e23\u003c/strong\u003e, 1229 (2023).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Lateral medullary infarction, Dysphagia, Prognosis, Deep learning, Magnetic Resonance Imaging","lastPublishedDoi":"10.21203/rs.3.rs-7007189/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7007189/v1","license":{"name":"CC BY 
4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eDysphagia is a common and debilitating complication in patients with lateral medullary infarction (LMI), affecting up to 100% of cases and significantly impairing quality of life. Accurate early prediction of dysphagia severity is essential for timely intervention and personalized rehabilitation planning. This study aimed to develop and validate a deep learning algorithm using acute-phase diffusion-weighted MRI to classify dysphagia severity in LMI patients. A retrospective cohort of 163 patients with confirmed acute LMI was analyzed. Dysphagia severity was determined by videofluoroscopic swallowing studies (VFSS), categorizing patients into severe and non-severe groups. Lesion regions were manually labeled and preprocessed for model training. A Transformer-based deep learning architecture, the Hierarchical Vision Transformer (Hier-ViT), was employed due to its capacity to model spatial hierarchies and global image context. The model achieved an accuracy of 0.85, with a precision of 0.70, recall of 0.75, F1-score of 0.72, and an area under the ROC curve (AUC) of 0.69. These findings suggest that Hier-ViT can effectively classify dysphagia severity in LMI patients using early MRI, offering a potential tool for prognosis prediction. 
Further studies with larger cohorts and multi-modal data are needed to confirm clinical utility and enhance model generalizability.\u003c/p\u003e","manuscriptTitle":"Prediction of dysphagia severity after lateral medullary infarction with deep learning","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-07-22 20:18:26","doi":"10.21203/rs.3.rs-7007189/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-11-12T08:39:32+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-11-10T07:55:39+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"315834496410910296135398110155656474652","date":"2025-11-04T07:48:18+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"128227955620548088021756784822042862098","date":"2025-11-04T06:35:40+00:00","index":"hide","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-07-26T14:54:53+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"339659168543868828158611486211357768453","date":"2025-07-19T02:03:25+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-07-18T07:09:43+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-07-17T06:51:27+00:00","index":"","fulltext":""},{"type":"editorInvited","content":"","date":"2025-07-10T13:08:23+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-07-08T01:45:17+00:00","index":"","fulltext":""},{"type":"submitted","content":"Scientific Reports","date":"2025-07-08T01:42:42+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"scientific-reports","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"scirep","sideBox":"Learn more about [Scientific 
Reports](http://www.nature.com/srep/)","snPcode":"","submissionUrl":"","title":"Scientific Reports","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"stoa","reportingPortfolio":"Scientific Reports","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"cc28a378-1f34-41ce-b085-97c87309d2f5","owner":[],"postedDate":"July 22nd, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[{"id":51804314,"name":"Biological sciences/Computational biology and bioinformatics"},{"id":51804315,"name":"Health sciences/Diseases"},{"id":51804316,"name":"Health sciences/Health care"},{"id":51804317,"name":"Health sciences/Medical research"},{"id":51804318,"name":"Health sciences/Neurology"}],"tags":[],"updatedAt":"2026-02-23T16:10:29+00:00","versionOfRecord":{"articleIdentity":"rs-7007189","link":"https://doi.org/10.1038/s41598-026-40751-9","journal":{"identity":"scientific-reports","isVorOnly":false,"title":"Scientific Reports"},"publishedOn":"2026-02-19 15:57:26","publishedOnDateReadable":"February 19th, 2026"},"versionCreatedAt":"2025-07-22 20:18:26","video":"","vorDoi":"10.1038/s41598-026-40751-9","vorDoiUrl":"https://doi.org/10.1038/s41598-026-40751-9","workflowStages":[]},"version":"v1","identity":"rs-7007189","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7007189","identity":"rs-7007189","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

