Prediction of antimicrobial resistance of Klebsiella pneumoniae through machine learning and revelation of significant role of insertion sequence elements

Research Square
Research Article

Yiming Wang, Xinran Li, Ranran Gao, Yuan Tian, Changjun Wang, and 3 more

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-9397300/v1
This work is licensed under a CC BY 4.0 License
Status: Under Review
Version 1 posted 8

Abstract

Background
Klebsiella pneumoniae (K. pneumoniae) poses critical therapeutic challenges due to multidrug resistance (MDR). While antibiotic resistance genes (ARGs) are primary determinants, their expression and phenotypic impact are modulated by mobile genetic elements such as insertion sequence (IS) elements.
However, the contribution of IS elements to antimicrobial resistance (AMR) prediction remains largely unexplored in K. pneumoniae.

Methods
We retrieved genome sequences and corresponding antimicrobial susceptibility phenotypes for 2,732 K. pneumoniae isolates from the PATRIC database (2004–2024). We evaluated four feature sets (ARGs, IS elements, IS-ARG pairs, and curated key feature sets) using six machine learning (ML) algorithms to predict resistance phenotypes for 15 antibiotics. Model performance was assessed using accuracy, recall, positive predictive value (PPV), F1-score, and the area under the receiver operating characteristic (ROC) curve (AUC). SHAP analysis was employed to interpret feature contributions, and enrichment of ARGs and IS elements in key feature sets was examined. Hierarchical clustering based on IS-ARG co-occurrence patterns was performed to explore associations with MDR profiles.

Results
Random Forest (RF) and Support Vector Machine (SVM) outperformed the other models, achieving mean accuracies of 92.22% and 88.83%, respectively, across all antibiotics. SVM with the key feature set attained an AUC of 99.87% for trimethoprim. With IS elements as features, RF reached the highest per-antibiotic accuracy, 97.73% for levofloxacin, along with accuracies of 94.28% for meropenem, 96.01% for amikacin, and 93.44% for trimethoprim-sulfamethoxazole. Overall, however, models using ARGs outperformed those using IS elements, with a mean accuracy across all 15 antibiotics of 92.08% compared to 89.82%. IS elements were significantly enriched in the key feature sets of all 15 drugs (fold change ≥ 2, P ≤ 0.05), and RF and SHAP analysis identified specific IS elements (e.g., IS1380, IS66, IS1182) as key predictors. IS-ARG co-occurrence analysis revealed 314 pairs, with IS6-blaSHV pairs accounting for 35.99%. Hierarchical clustering stratified isolates into seven clusters with distinct MDR profiles.
Notably, IS21-blaKPC associations were linked to resistance against eight antimicrobials, while the absence of specific IS-ARG pairs correlated with resistance to ceftazidime, ciprofloxacin, and levofloxacin.

Conclusions
This study shows that using IS elements as genomic features can improve the predictive accuracy of ML models for specific antibiotics in K. pneumoniae, with RF and SVM achieving superior performance across multiple feature sets. IS elements served as critical predictors beyond ARGs, and their co-occurrence with ARGs was associated with MDR phenotypes. The identification of IS21-blaKPC as a marker of resistance to eight antibiotics and the stratification of isolates into clinically relevant clusters provide candidate targets for genomic surveillance. These findings demonstrate that integrating mobile genetic elements into resistance prediction frameworks can improve both predictive accuracy and biological interpretability, offering novel biomarkers and mechanistic insights for resistance surveillance.

Keywords: Klebsiella pneumoniae, Antimicrobial resistance genes, Insertion sequence elements, Machine learning, Multidrug resistance

Introduction
Antimicrobial resistance (AMR) represents one of the most formidable challenges to modern medicine, with Klebsiella pneumoniae (K. pneumoniae) emerging as a critical priority pathogen due to its capacity to acquire and disseminate multidrug resistance (MDR) determinants [1–3]. The increasing prevalence of carbapenem-resistant and extensively drug-resistant K. pneumoniae has compromised therapeutic options and is associated with elevated mortality, prolonged hospitalization, and substantial healthcare costs [4, 5]. Whole-genome sequencing (WGS) has revolutionized the surveillance and prediction of AMR by enabling comprehensive exploration of the antibiotic resistance genes (ARGs) harbored by an organism [6].
Machine learning (ML) approaches applied to WGS data have demonstrated considerable promise in predicting resistance phenotypes, often achieving accuracies exceeding 90% for several pathogens [7–9]. However, a number of these predictive models focused exclusively on the presence or absence of ARGs, implicitly treating them as context-independent determinants of resistance [10, 11]. In practice, the expression, mobility, and phenotypic impact of ARGs are profoundly influenced by the surrounding genomic context [12–14]. Insertion sequence (IS) elements, as the smallest and most abundant mobile genetic elements, are increasingly recognized as key modulators of this context [15]. IS elements can activate silent ARGs through promoter capture, enhance their expression via duplication of strong promoters, facilitate their dissemination by forming composite transposons, and generate genomic diversity through insertional inactivation of regulatory genes [16–20]. Despite these established roles in modulating ARG behavior, the contribution of IS elements to resistance phenotypes remains largely unexplored. Moreover, the co-localization of IS elements and ARGs represents a genomic signature that captures both the presence of a resistance determinant and its mobilization potential [21–23]. While one study has documented IS-ARG associations in Acinetobacter baumannii (A. baumannii) [24], a systematic assessment of how these paired elements influence resistance phenotypes has not been performed in K. pneumoniae.

In this study, we analyzed 2,732 publicly available K. pneumoniae genomes with associated antimicrobial susceptibility phenotypes. We constructed and systematically evaluated four distinct feature sets across six ML algorithms for resistance prediction against 15 antibiotics.
By employing feature selection and SHAP-based model interpretability, we sought to elucidate the genomic factors contributing to AMR and to characterize the co-occurrence patterns of IS elements and ARGs. Elucidating these interactions is critical for integrating mobile genetic elements into AMR prediction frameworks and for combating the spread of multidrug-resistant pathogens.

Materials and methods

Data collection and isolate selection
Genome sequences and corresponding antimicrobial susceptibility phenotypes for K. pneumoniae isolates from 2004 to 2024 were retrieved from the Pathosystems Resource Integration Center (PATRIC; https://www.bv-brc.org/). The selection criteria were as follows: (i) phenotypic evidence derived from laboratory methods; (ii) resistant or susceptible phenotypes; and (iii) a minimum of 80 isolates per phenotype class, with relatively balanced group sizes. Phenotypic susceptibility data encompassed 15 antibiotics from nine distinct classes: carbapenems (imipenem [IPM] and meropenem [MEM]), cephalosporins (ceftazidime [CAZ] and cefepime [FEP]), a cephamycin (cefoxitin [FOX]), penicillins (piperacillin-tazobactam [PTZ] and ticarcillin-clavulanate [TIM]), fluoroquinolones (ciprofloxacin [CIP] and levofloxacin [LVX]), aminoglycosides (amikacin [AK], gentamicin [GEN], and tobramycin [TOB]), an antifolate (trimethoprim [TMP]), tetracycline (TET), and a sulfonamide combination (trimethoprim/sulfamethoxazole [TMP/SMX]). Based on the genomic quality assessment results of CheckM v1.2.2 [25], isolates with a genome completeness ≥ 90% and contamination < 5% were retained.

Identification of genomic features
ARGs were annotated using Abricate v1.0.1 against the Comprehensive Antibiotic Resistance Database (CARD) [26]. IS elements were identified with ISEScan v1.7.2.3 [27], applying filters to exclude partial elements or sequences shorter than 400 bp.
Unclassified sequences were identified through alignment against the ISFinder reference database (https://www-is.biotoul.fr/) [28]. An IS-ARG pair was defined by three criteria: (i) the ARG and its associated IS element were located on the same strand of the sequence; (ii) their directionality was consistent; and (iii) the distance between them did not exceed 5 kb [29].

Feature selection and model construction
Model predictions were generated using four feature sets encoded as binary matrices (presence = 1, absence = 0): (i) ARGs, (ii) IS elements, (iii) IS-ARG pairs, and (iv) key feature sets. The dataset was randomly divided into a training set (80%) and a test set (20%). For each antibiotic, a frequency-based filter was applied to retain features detected in > 2% of all isolates. Key feature sets were selected using the RandomForestClassifier with stratified 5-fold cross-validation. Within each fold, the importance of each feature was evaluated and ranked using the rank function. Features with a mean rank < 100 across folds constituted the key feature set for each antibiotic. The Synthetic Minority Over-sampling Technique (SMOTE) was employed to address class imbalance in the training data [30]. Six machine learning algorithms were implemented using scikit-learn: Random Forest (RF), Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Naive Bayes (NB), and Gradient Boosting (GB). All ML models were initially trained using the default parameter settings.

Model evaluation and interpretation
Model performance was evaluated on the test set using accuracy, recall, positive predictive value (PPV), F1-score, and the area under the receiver operating characteristic (ROC) curve (AUC). SHapley Additive exPlanations (SHAP) v0.39.0 was employed to provide model interpretability. SHAP summary plots provided a global view of feature importance, while force plots illustrated the contribution of individual features.
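The key feature selection step described above (ranking features within each cross-validation fold, then keeping those with a mean rank below a cutoff) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `rank_features` and `select_key_features`, the tie handling (tied features share the average rank), and the toy importance values are all hypothetical. In practice, the per-fold importances would presumably come from a RandomForestClassifier fitted on each stratified fold, for example via its `feature_importances_` attribute.

```python
from statistics import mean


def rank_features(importances):
    """Rank the features of one fold; rank 1 is the most important.

    Features with identical importance share the average of their ranks
    (an assumption; the source does not state how ties were handled).
    """
    ordered = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    ranks = {}
    start = 0
    while start < len(ordered):
        end = start
        # Extend the block while the next feature has the same importance.
        while end + 1 < len(ordered) and ordered[end + 1][1] == ordered[start][1]:
            end += 1
        shared_rank = (start + 1 + end + 1) / 2
        for name, _ in ordered[start:end + 1]:
            ranks[name] = shared_rank
        start = end + 1
    return ranks


def select_key_features(fold_importances, max_mean_rank=100):
    """Return (feature, mean rank) pairs whose mean rank across folds is below the cutoff."""
    fold_ranks = [rank_features(fold) for fold in fold_importances]
    mean_ranks = {
        name: mean(ranks[name] for ranks in fold_ranks)
        for name in fold_ranks[0]
    }
    kept = [(name, r) for name, r in mean_ranks.items() if r < max_mean_rank]
    # Sort by mean rank, then by name, so the output order is deterministic.
    return sorted(kept, key=lambda item: (item[1], item[0]))


# Toy per-fold importances (hypothetical values; two folds instead of five).
folds = [
    {"blaKPC-2": 0.50, "IS1380_141": 0.30, "acrA": 0.10, "IS6_292": 0.10},
    {"blaKPC-2": 0.40, "IS1380_141": 0.45, "acrA": 0.05, "IS6_292": 0.10},
]
key_features = select_key_features(folds, max_mean_rank=3.5)
print(key_features)
```

With the settings reported in the text, the call would take the importances from five folds and use `max_mean_rank=100`; the small cutoff here only keeps the toy example readable.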
Enrichment and cluster analysis
Enrichment analysis of ARGs and IS elements in key feature sets was conducted by hypergeometric testing [31]. Bidirectional unsupervised hierarchical clustering [32] based on IS-ARG pairs was applied using the Euclidean distance metric and the complete linkage method [33].

Statistical analysis
Statistical analyses were performed using Python v3.12 and R v4.4.3. Associations between categorical variables were assessed using Fisher's exact test, with the Benjamini-Hochberg method applied for multiple testing correction to control the false discovery rate (FDR) at < 0.05.

Results

Dataset and phenotypic profile
This study utilized a dataset of 2,732 K. pneumoniae isolates obtained from the PATRIC database for AMR prediction (Table S1). Among these isolates, high resistance rates were observed for CAZ (81.59%), followed by LVX (78.58%) and CIP (74.79%). Notably, a total of 2,210 isolates (80.89%) exhibited an MDR phenotype, defined as resistance to three or more classes of antibiotics. Fifteen isolates were found to be resistant to 13 classes of antibiotics (Table 1 and Fig. 1).

Table 1 Statistics of the antibiotic resistance of 2,732 K. pneumoniae isolates
Antibiotic | Abbreviation | Antibiotic class | Mechanism of action | Number of resistant isolates | Number of sensitive isolates
Imipenem | IPM | Carbapenem | Cell Wall Synthesis | 585 | 1069
Meropenem | MEM | Carbapenem | Cell Wall Synthesis | 926 | 1687
Cefepime | FEP | Cephalosporin (Fourth Generation) | Cell Wall Synthesis | 1250 | 807
Ceftazidime | CAZ | Cephalosporin (Third Generation) | Cell Wall Synthesis | 2262 | 512
Cefoxitin | FOX | Cephamycin | Cell Wall Synthesis | 1129 | 938
Piperacillin/Tazobactam | PTZ | Penicillin (Beta-lactamase Inhibitor Combination) | Cell Wall Synthesis | 1306 | 810
Ticarcillin/Clavulanic Acid | TIM | Penicillin (Beta-lactamase Inhibitor Combination) | Cell Wall Synthesis | 87 | 154
Ciprofloxacin | CIP | Fluoroquinolone | DNA Synthesis | 1827 | 610
Levofloxacin | LVX | Fluoroquinolone | DNA Synthesis | 1347 | 364
Trimethoprim | TMP | Antifolate (Dihydrofolate Reductase Inhibitor) | Folic Acid Metabolism | 120 | 157
Trimethoprim/Sulfamethoxazole | TMP/SMX | Sulfonamide (Combination) | Folic Acid Metabolism | 1625 | 581
Amikacin | AK | Aminoglycoside | Protein Synthesis | 419 | 2355
Gentamicin | GEN | Aminoglycoside | Protein Synthesis | 1236 | 1443
Tobramycin | TOB | Aminoglycoside | Protein Synthesis | 917 | 832
Tetracycline | TET | Tetracycline | Protein Synthesis | 824 | 630

Landscape of ARGs and IS elements
A total of 124,851 ARGs were identified across all isolates. β-lactamase genes were the most dominant category, constituting 59.07% of all detected ARGs. Other prevalent resistance mechanisms included efflux pump systems associated with MDR (8.76%, n = 10,937), aminoglycoside-modifying enzymes (8.04%, n = 10,033), polymyxin resistance determinants (6.93%, n = 8,646), and quinolone resistance genes (6.42%, n = 8,015) (Table S2). Among all genes, the polymyxin resistance gene arnT was the most frequent, detected in 2,658 isolates (97.29%), followed by the efflux pump genes mdtQ (96.93%) and acrA (96.89%). A total of 62,347 occurrences of IS elements were identified.
These were grouped into 64 distinct IS types belonging to 22 transposon families (Table S3). The IS3 family was the most abundant (34.35%, n = 21,406), followed by the IS5 family (9.14%, n = 5,696). The most frequently detected individual IS elements were ISNCY_229 (present in 97.47% of isolates), IS1_316 (90.41%), and IS6_292 (88.07%).

Performance of ML models in AMR prediction
Four distinct feature sets (ARGs, IS elements, IS-ARG pairs, and key feature sets) were evaluated for their predictive performance using six ML models (Table S4). Across the 15 antibiotics, RF and SVM outperformed GB, LR, DT, and NB, with mean accuracies of 92.22% and 88.83%, respectively. Among the four feature sets, IS elements performed best for several drugs. The RF model using IS elements achieved the highest single accuracy of any model for any of the 15 antibiotics (97.73% for LVX), with additional accuracies of 94.28% (MEM), 96.01% (AK), and 93.44% (TMP/SMX), alongside AUCs of 98.17%, 98.91%, and 98.22%, respectively. Overall, however, models using ARGs exhibited a higher mean accuracy (92.08%) and greater stability, with accuracies ranging from 80.86% to 97.21%, compared with those using IS elements (mean 89.82%, range 62.96% to 97.73%) (Fig. 2). Notably, RF reached top accuracies of 97.20% for CAZ and 96.42% for IPM, while SVM delivered its best performance for CIP (97.21%). Additionally, models based on the key feature sets exhibited consistent performance, with a mean accuracy of 90.16% (range 72.60% to 96.36%); SVM attained the highest of these accuracies, 96.36%, for TMP. In contrast, almost all models using the IS-ARG pairs showed lower accuracies. Given the balanced performance of the key feature sets, we assessed their discriminative ability (AUC) for six representative antibiotics: the carbapenems IPM and MEM, the cephalosporin CAZ, the fluoroquinolones CIP and LVX, and the antifolate TMP (Fig. 3).
All models achieved AUCs exceeding 82.66% for these antibiotics. Notably, the SVM model using the key feature set reached an exceptional AUC of 99.87% for TMP, while RF yielded 99.60% for CIP. In summary, the RF model emerged as a high-performing classifier across multiple antibiotics. Its predictive capability was particularly enhanced when IS elements were used as genomic features, demonstrating their potential to improve predictive accuracy for specific antibiotics.

Importance of features interpreted by SHAP value
In addition to ARGs, RF identified a number of IS elements as crucial features for prediction (Fig. 4). For the majority of antibiotics (93.33%, 14/15), the number of ARGs in the key feature sets exceeded that of IS elements, with TOB being the sole exception. Among the 5,159 genetic elements initially identified across all isolates, 13.14% were ARGs and 3.52% were IS elements. Enrichment analysis confirmed that both ARGs and IS elements were enriched in the key feature sets of all drugs (average fold change ≥ 2, P < 0.05) [34] (Fig. 4). The scatter plots for six antibiotics (IPM, MEM, CAZ, CIP, LVX, and TMP) revealed that ISPa38 was retained in the key feature sets of all antibiotics. The SHAP analysis elucidated the impact of specific genetic elements. The key features identified by the two analytical methods were largely consistent. For IPM and MEM, most of the red dots were clustered on the negative side of the x-axis, indicating that the presence of these genetic elements pushed the model's predictions toward susceptibility. Notably, known carbapenem resistance genes such as blaKPC-2 and blaKPC-3 were included in the list. IS1182_104 was also identified as a crucial predictor for IPM and MEM (Figure S1 and Figure S2). For CAZ, IS1380_141 stood out as a key indicator for resistance prediction, followed by aac(6')-Ib10 and qnrB17 (Fig. 5).
For CIP resistance prediction, aac(6')-Ib10, IS1380_141, and blaSHV-33 contributed the most to the model (Fig. 6). For LVX, fosA5, KpnE, and IS66_43 showed positive associations with resistance (Fig. 7). The top three features contributing to TMP prediction were IS1380_141, qnrB5, and blaCTX-M-62, all of which showed a negative correlation with resistance (Figure S3). The SHAP force plots visualized the impact of key features on individual isolate predictions. For instance, in the representative CAZ-sensitive isolate, the cumulative contribution of sensitivity-associated markers (IS1380_141, aac(6')-Ib10, blaCTX-M-15) outweighed that of resistance-promoting features (blaSHV-62, blaSHV-108, bacA), driving the predicted value (0.02) below the base value. In contrast, in the CAZ-resistant isolate, the predicted value (0.96) exceeded the base value (Fig. 7). These findings indicated that the predictions were driven by the combined effect of multiple features and that IS elements were critical in AMR prediction of K. pneumoniae.

IS-ARG co-occurrence patterns as determinants of MDR
We explored the association between IS-ARG relationships and resistance phenotypes. Based on IS-ARG pairs, bidirectional unsupervised clustering stratified the isolates into seven clusters with different resistance patterns (Fig. 8). For a subset of isolates in cluster 1, IS elements and ARGs were both present in the genomes, but IS-ARG pairs were missing. These isolates were nevertheless resistant to CAZ, CIP, and LVX (Figure S4). This suggested a correlation between the lack of specific IS-ARG pairs and resistance to these three drugs. Conversely, the presence of specific IS-ARG pairs was associated with MDR phenotypes.
For a distinct subset of isolates in cluster 1, the associations between IS21 elements (IS21_203 and IS21_259) and blaKPC genes (including blaKPC-2, blaKPC-101, and blaKPC-3) corresponded to resistance against eight antibiotics (FEP, FOX, CAZ, CIP, IPM, LVX, MEM, and PTZ). Similar patterns were observed in other clusters. In cluster 2, co-occurrences of IS5_222 and IS91_50 with the armA, msrE, qacE, and qacEΔ1 genes were tied to resistance to eight antibiotics (AK, CAZ, CIP, GEN, PTZ, TOB, TMP, and TMP/SMX). In cluster 7, pairings of IS6_292 with blaSHV-217, blaSHV-44, blaSHV-119, and blaSHV-11 were linked to resistance to five drugs (FEP, CAZ, CIP, LVX, and TMP/SMX).

We further analyzed the distribution of the 314 IS-ARG pairs across the 2,732 isolates. The most prevalent combination was ISNCY_229-acrA (present in 2.57% of isolates), followed by IS6_292-blaSHV-187 (1.52%) (Table S5). IS elements frequently co-occurred with specific ARGs. Combinations involving IS6 and β-lactamase genes represented the predominant category, accounting for 59.87%. Within this group, IS6_292 was most frequently associated with blaSHV genes, which accounted for 35.99% of all 314 pairs, followed by blaLEN and blaOKP-B (each 8.91%). Other stable pairings included IS91_365-blaTEM and IS110_208-blaTEM. Notably, conserved bidirectional IS-ARG pairs were also identified. The IS21 elements (IS21_203 and IS21_259) were found to be exclusively associated with blaKPC genes, consistent with a role in mediating carbapenem resistance (Fig. 9).

Discussion
ML models offer a powerful framework for predictive tasks and have demonstrated broad applicability in K. pneumoniae [35–38]. Building on this, we constructed six ML models incorporating ARGs, IS elements, IS-ARG pairs, and key feature sets to predict AMR phenotypes.
The model evaluation demonstrated that the RF model delivered superior predictive performance, attaining the highest accuracy (97.73%) and AUC values, followed by SVM. This finding aligns with prior studies establishing RF as an effective algorithm for AMR prediction from WGS data [39, 40]. Tree-based ensemble methods and SVM frequently achieve optimal performance in genomic prediction by effectively modeling high-dimensional and non-linear patterns in WGS data [9, 41, 42]. Moreover, RF offers a degree of interpretability through feature importance ranking, which enhances model transparency. The models based on ARGs exhibited a higher average accuracy (92.08%) than those based on IS elements; ARGs primarily function through mechanisms such as enzymatic drug inactivation, target protection/modification, and active efflux [43, 44]. Nevertheless, our findings revealed that IS elements were crucial for predicting resistance to specific drugs, such as LVX, MEM, TMP/SMX, and AK. IS elements including IS110, IS5, ISL3, and ISPa38 were selected as key features for these antibiotics. IS elements play a significant role in AMR through various mechanisms. For instance, IS110 family transposases co-occur with Tn3 in bacterial resistance islands and are found integrated in plasmids, where their mode of action may facilitate excision, circular DNA formation, and targeted integration [45, 46]. The pKpQIL plasmid, which harbors ISL3 and IS5 elements, is distributed among KPC-producing K. pneumoniae and may functionally disrupt inner membrane proteins, thereby conferring AMR [47]. Remarkably, ISPa38 was retained in the key feature sets of all 15 antibiotics. ISPa38, a Tn3 family transposon, facilitates gene insertion into the DNA replisome through interaction with the host-encoded β-sliding clamp (DnaN), promoting the dissemination and expression of resistance determinants [48].
SHAP-based interpretability further identified the specific IS elements driving model decisions. IS1380, the top-contributing feature across multiple antibiotics, is frequently found in prophage regions adjacent to β-lactamase genes and other resistance genes on hypervirulence plasmids in K. pneumoniae [49]. The conserved spatial proximity between IS1380 and its cognate ARGs may facilitate horizontal gene transfer, thereby shaping AMR outcomes [50]. Specific IS elements also participate in genomic regulation in other bacterial species. For example, the proliferation of multiple IS classes, particularly IS66, is considered a hallmark of genomic plasticity in A. baumannii [51].

Among the 2,732 K. pneumoniae isolates, analysis of IS-ARG co-occurrence identified 314 distinct pairs, many exhibiting preferential associations. Single-molecule real-time sequencing has detected substantial copy numbers of IS6 and Tn3 elements in K. pneumoniae genomes, with IS6 members representing the most frequently observed transposon family, suggesting their transposability and sustained activity across diverse genetic backgrounds [50]. The predominant association of IS6_292 with blaSHV genes, comprising 35.99% of all pairs, reflects the dissemination of specific mobile genetic elements in K. pneumoniae. In our study, the majority of IS-ARG interactions exhibited conserved patterns. The coexistence of blaTEM-1 and IS110 in hybrid plasmids, as well as blaTEM-1 flanked by Tn3 and IS91, has been reported in clinical K. pneumoniae isolates from different countries [52, 53]. This structural conservation suggests that specific IS-ARG pairs are integrated as composite transposons [54]. Importantly, IS21 elements exhibited an exclusive association with blaKPC genes, which are typically embedded within Tn4401 transposons [32, 55], with IS21-mediated transposition requiring the combined action of istA and istB [56].
Bidirectional unsupervised clustering revealed that the occurrence of IS21-blaKPC-2 and IS21-blaKPC-3 correlated with resistance to eight antibiotics, corroborating the role of Tn4401-like transposons in carbapenemase dissemination. This finding reinforces the role of istA/istB-driven chromosomal rearrangements in the dissemination of resistance determinants. In A. baumannii, interactions between IS elements and ARGs, such as ISAba1-blaOXA-23, have been reported to be responsible for resistance to certain antibiotics [57]. Prior investigations have further established that IS-ARG interactions contribute to AMR phenotypes and have confirmed their mode of action [24]. The A. baumannii isolates lacking IS-ARG pairs in their genomes were sensitive to almost all 20 drugs, while certain K. pneumoniae isolates exhibited resistance despite lacking these genetic elements. This highlights a species-specific difference in the genetic architecture of AMR. Nevertheless, models based on the IS-ARG pairs yielded inferior predictive performance. Further investigation is needed to elucidate the contributions of IS-ARG pairs to resistance in K. pneumoniae [58].

Several limitations of this study should be acknowledged. First, our analysis was restricted to publicly available genomes from the PATRIC database. The lack of externally collected isolates for independent validation limits the generalizability of our findings. Second, we confined our input space to ARGs and IS elements. The exclusion of other genomic features limits our ability to attribute predictive importance solely to these two feature classes. Third, despite identifying associations between specific IS-ARG pairs and resistance phenotypes, definitive causal relationships remain to be established. Experimental validation through targeted gene knockout, complementation assays, and transcriptional profiling is required to determine the contributions of these paired elements.
These limitations notwithstanding, our study provides a scalable analytical framework and testable hypotheses for future investigation.

Overall, our study constructed binary classification models for AMR prediction across diverse antibiotic agents. We illustrated the associations between IS elements and AMR prediction, and explored the impact of IS-ARG interactions on MDR profiles. By highlighting the contribution of mobile genetic elements beyond ARGs, this work provides a foundation for incorporating genomic context into AMR prediction and mechanistic studies. Elucidating these interactions will be essential for developing targeted interventions to mitigate the spread of AMR in K. pneumoniae.

Abbreviations
K. pneumoniae: Klebsiella pneumoniae
MDR: Multidrug resistance
ARGs: Antibiotic resistance genes
IS: Insertion sequence
AMR: Antimicrobial resistance
ML: Machine learning
PPV: Positive predictive value
ROC: Receiver operating characteristic
AUC: Area under the ROC curve
WGS: Whole-genome sequencing
CARD: Comprehensive antibiotic resistance database
RF: Random forest
SVM: Support vector machine
LR: Logistic regression
DT: Decision tree
NB: Naive bayes
GB: Gradient boosting
SHAP: SHapley additive exPlanations
FDR: False discovery rate
PATRIC: Pathosystems resource integration center
SMOTE: Synthetic minority over-sampling technique

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The whole genome sequence data analyzed during the current study were derived from existing publicly available data in the PATRIC database (https://www.bv-brc.org/), as detailed in Table S1 of the Supplementary Information files.

Competing interests
The authors declare no conflicts of interest in this work.
Funding
This work was supported by grants from the National Natural Science Foundation of China (12571540) and Open Project Fund of Key Laboratory of Biosafety Defense, Ministry of Education (No: KLBD-2024-003).

Author Contributions
CJW, YC, DCL, and XL designed the study and revised the manuscript. RRG and YT participated in data collection, related bioinformatics analysis and visualization. YMW and XRL performed the analysis and drafted the manuscript. All authors reviewed and agreed on the final version for submission.

Acknowledgements
Not applicable.

References
1. GBD 2021 Antimicrobial Resistance Collaborators. Global burden of bacterial antimicrobial resistance 1990–2021: a systematic analysis with forecasts to 2050. Lancet. 2024;404:1199–226. doi:10.1016/s0140-6736(24)01867-1
2. Ding J, Yan W, Zheng R, et al. Combating Klebsiella pneumoniae: from antimicrobial resistance mechanisms to phage-based combination therapies. Front Cell Infect Microbiol. 2025;15:1691215. doi:10.3389/fcimb.2025.1691215
3. Tinnirello R, Iannolo G, Cagigi A, et al. An overview of Klebsiella pneumoniae ST392 amid the AMR silent pandemic and consequent environmental dissemination and health risks. Int J Hyg Environ Health. 2026;271:114710. doi:10.1016/j.ijheh.2025.114710
4. Yao Y, Zha Z, Li L, et al. Healthcare-associated carbapenem-resistant Klebsiella pneumoniae infections are associated with higher mortality compared to carbapenem-susceptible K. pneumoniae infections in the intensive care unit: a retrospective cohort study. J Hosp Infect. 2024;148:30–8. doi:10.1016/j.jhin.2024.03.003
5. Zhang J, Zhou R, Ren J, et al. Economic burden of carbapenem-resistant Klebsiella pneumoniae infections in Chinese hospitals: A 2019 analysis. J Glob Antimicrob Resist. 2026;46:63–70. doi:10.1016/j.jgar.2025.11.006
6. Matsumura Y, Yamamoto M, Gomi R, et al. Integrating whole-genome sequencing into antimicrobial resistance surveillance: methodologies, challenges, and perspectives. Clin Microbiol Rev. 2025;38:e0014022. doi:10.1128/cmr.00140-22
7. Kim JI, Maguire F, Tsang KK, et al. Machine Learning for Antimicrobial Resistance Prediction: Current Practice, Limitations, and Clinical Perspective. Clin Microbiol Rev. 2022;35. doi:10.1128/cmr.00179-21
8. Wang S, Zhao C, Yin Y, et al. A Practical Approach for Predicting Antimicrobial Phenotype Resistance in Staphylococcus aureus Through Machine Learning Analysis of Genome Data. Front Microbiol. 2022;13:841289. doi:10.3389/fmicb.2022.841289
9. Khaledi A, Weimann A, Schniederjans M, et al. Predicting antimicrobial resistance in Pseudomonas aeruginosa with machine learning-enabled molecular diagnostics. EMBO Mol Med. 2020;12:e10264. doi:10.15252/emmm.201910264
10. Chiaverini A, Curi R, Gori M, et al. Antimicrobial resistance of Listeria monocytogenes human strains and correlation to genomic data. Eur J Public Health. 2021;31:449.
11. Stoesser N, Batty EM, Eyre DW, et al. Predicting antimicrobial susceptibilities for Escherichia coli and Klebsiella pneumoniae isolates using whole genomic sequence data. J Antimicrob Chemother. 2013;68:2234–44. doi:10.1093/jac/dkt180
12. Zhang J, Lei H, Huang J, et al. Co-occurrence and co-expression of antibiotic, biocide, and metal resistance genes with mobile genetic elements in microbial communities subjected to long-term antibiotic pressure: Novel insights from metagenomics and metatranscriptomics. J Hazard Mater. 2025;489:137559. doi:10.1016/j.jhazmat.2025.137559
13. Wang M, Xiong W, Liu P, et al. Metagenomic Insights Into the Contribution of Phages to Antibiotic Resistance in Water Samples Related to Swine Feedlot Wastewater Treatment. Front Microbiol. 2018;9:2474. doi:10.3389/fmicb.2018.02474
14. Janse I, Beeloo R, Swart A, et al. The extent of carbapenemase-encoding genes in public genome sequences. PeerJ. 2021;9:e11000. doi:10.7717/peerj.11000
15. Humayun MZ, Zhang Z, Butcher AM, et al. Hopping into a hot seat: Role of DNA structural features on IS5-mediated gene activation and inactivation under stress. PLoS ONE. 2017;12:e0180156. doi:10.1371/journal.pone.0180156
16. Ding Y, Jiang X, Wu J, et al. Synergistic horizontal transfer of antibiotic resistance genes and transposons in the infant gut microbial genome. mSphere. 2024;9:e0060823. doi:10.1128/msphere.00608-23
17. Hernández-Allés S, Benedí VJ, Martínez-Martínez L, et al. Development of resistance during antimicrobial therapy caused by insertion sequence interruption of porin genes. Antimicrob Agents Chemother. 1999;43:937–9. doi:10.1128/AAC.43.4.937
18. Han X, Shi Q, Mao Y, et al. Emergence of Ceftazidime/Avibactam and Tigecycline Resistance in Carbapenem-Resistant Klebsiella pneumoniae Due to In-Host Microevolution. Front Cell Infect Microbiol. 2021;11:757470. doi:10.3389/fcimb.2021.757470
19. Du Y, Liu T, Gong Y, et al. Scarless excision of an insertion sequence in the OmpK36 promoter restores meropenem susceptibility in a non-carbapenemase-producing Klebsiella pneumoniae. Emerg Microbes Infect. 2025;14:2503922. doi:10.1080/22221751.2025.2503922
20. Wu Y, Zhao J, Li Z, et al. Within-host acquisition of colistin-resistance of an NDM-producing Klebsiella quasipneumoniae subsp. similipneumoniae strain through the insertion sequence-903B-mediated inactivation of mgrB gene in a lung transplant child in China. Front Cell Infect Microbiol. 2023;13:1153387. doi:10.3389/fcimb.2023.1153387
21. Razavi M, Kristiansson E, Flach CF, et al. The Association between Insertion Sequences and Antibiotic Resistance Genes. mSphere. 2020;5. doi:10.1128/mSphere.00418-20
22. Nielsen TK, Browne PD, Hansen LH. Antibiotic resistance genes are differentially mobilized according to resistance mechanism. GigaScience. 2022;11. doi:10.1093/gigascience/giac072
23. Partridge SR, Kwong SM, Firth N, et al. Mobile Genetic Elements Associated with Antimicrobial Resistance. Clin Microbiol Rev. 2018;31. doi:10.1128/cmr.00088-17
24. Xie F, Wang L, Li S, et al. Large-scale genomic analysis reveals significant role of insertion sequences in antimicrobial resistance of Acinetobacter baumannii. mBio. 2025;16:e0285224. doi:10.1128/mbio.02852-24
25. Parks DH, Imelfort M, Skennerton CT, et al. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25:1043–55. doi:10.1101/gr.186072.114
26. Jia B, Raphenya AR, Alcock B, et al. CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res. 2017;45:D566–73. doi:10.1093/nar/gkw1004
27. Xie Z, Tang H. ISEScan: automated identification of insertion sequence elements in prokaryotic genomes. Bioinformatics. 2017;33:3340–7. doi:10.1093/bioinformatics/btx433
28. Siguier P, Perochon J, Lestrade L, et al. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34:D32–6. doi:10.1093/nar/gkj014
29. Yao S, Yu J, Zhang T, et al. Comprehensive analysis of distribution characteristics and horizontal gene transfer elements of blaNDM-1-carrying bacteria. Sci Total Environ. 2024;946:173907. doi:10.1016/j.scitotenv.2024.173907
30. Dablain D, Krawczyk B, Chawla NV. DeepSMOTE: Fusing Deep Learning and SMOTE for Imbalanced Data. IEEE Trans Neural Netw Learn Syst. 2023;34:6390–404. doi:10.1109/TNNLS.2021.3136503
31. Cao J, Zhang S. A Bayesian extension of the hypergeometric test for functional enrichment analysis. Biometrics. 2014;70:84–94. doi:10.1111/biom.12122
32. Ortega-Paredes D, Del Canto F, Rios R, et al. Genomic Insights into Colistin and Tigecycline Resistance in ESBL-Producing Escherichia coli and Klebsiella pneumoniae Harboring blaKPC Genes in Ecuador. Antibiot (Basel). 2025;14. doi:10.3390/antibiotics14020206
33. Terlep TA, Bell MR, Talavage TM, et al. Euclidean Distance Approximations From Replacement Product Graphs. IEEE Trans Image Process. 2022;31:125–37. doi:10.1109/TIP.2021.3128319
34. Shu J, Liu Y, Shan Y, et al. Deep sequencing microRNA profiles associated with wooden breast in commercial broilers. Poult Sci. 2021;100:101496. doi:10.1016/j.psj.2021.101496
35. de la Lastra JMP, Wardell SJT, Pal T, et al. From Data to Decisions: Leveraging Artificial Intelligence and Machine Learning in Combating Antimicrobial Resistance a Comprehensive Review. J Med Syst. 2024;48:71. doi:10.1007/s10916-024-02089-5
36. Bilal H, Khan MN, Khan S, et al. The role of artificial intelligence and machine learning in predicting and combating antimicrobial resistance. Comput Struct Biotechnol J. 2025;27:423–39. doi:10.1016/j.csbj.2025.01.006
37. Zhou X, Yang M, Chen F, et al. Prediction of antimicrobial resistance in Klebsiella pneumoniae using genomic and metagenomic next-generation sequencing data. J Antimicrob Chemother. 2024;79:2509–17. doi:10.1093/jac/dkae248
38. Condorelli C, Nicitra E, Musso N, et al. Prediction of antimicrobial resistance of Klebsiella pneumoniae from genomic data through machine learning. PLoS ONE. 2024;19:e0309333. doi:10.1371/journal.pone.0309333
39. Ren Y, Chakraborty T, Doijad S, et al. Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning. Bioinformatics. 2022;38:325–34. doi:10.1093/bioinformatics/btab681
40. Ardila CM, Yadalam PK, González-Arroyave D. Integrating whole genome sequencing and machine learning for predicting antimicrobial resistance in critical pathogens: a systematic review of antimicrobial susceptibility tests. PeerJ. 2024;12:e18213. doi:10.7717/peerj.18213
41. Li D, Wang Y, Hu W, et al. Application of Machine Learning Classifier to Candida auris Drug Resistance Analysis. Front Cell Infect Microbiol. 2021;11:742062. doi:10.3389/fcimb.2021.742062
42. Haga H, Sato H, Koseki A, et al. A machine learning-based treatment prediction model using whole genome variants of hepatitis C virus. PLoS ONE. 2020;15:e0242028. doi:10.1371/journal.pone.0242028
43. Jhalora V, Bist R. A Comprehensive Review of Molecular Mechanisms Leading to the Emergence of Multidrug Resistance in Bacteria. Indian J Microbiol. 2025;65:844–65. doi:10.1007/s12088-024-01384-6
44. Galgano M, Pellegrini F, Catalano E, et al. Acquired Bacterial Resistance to Antibiotics and Resistance Genes: From Past to Future. Antibiot (Basel). 2025;14. doi:10.3390/antibiotics14030222
45. Durrant MG, Perry NT, Pai JJ, et al. Bridge RNAs direct programmable recombination of target and donor DNA. Nature. 2024;630:984–93. doi:10.1038/s41586-024-07552-4
46. Wang Y, Dagan T. The evolution of antibiotic resistance islands occurs within the framework of plasmid lineages. Nat Commun. 2024;15:4555. doi:10.1038/s41467-024-48352-8
47. Giordano C, Barnini S, Tsioutis C, et al. Expansion of KPC-producing Klebsiella pneumoniae with various mgrB mutations giving rise to colistin resistance: the role of ISL3 on plasmids. Int J Antimicrob Agents. 2018;51:260–5. doi:10.1016/j.ijantimicag.2017.10.011
48. Tang Y, Zhang J, Guan J, et al. Transposition with Tn3-family elements occurs through interaction with the host β-sliding clamp processivity factor. Nucleic Acids Res. 2024;52:10416–30. doi:10.1093/nar/gkae674
49. Bolourchi N, Naz A, Sohrabi M, et al. Comparative in silico characterization of Klebsiella pneumoniae hypervirulent plasmids and their antimicrobial resistance genes. Ann Clin Microbiol Antimicrob. 2022;21:23. doi:10.1186/s12941-022-00514-6
50. Jiang Y, Wang Y, Hua X, et al. Pooled Plasmid Sequencing Reveals the Relationship Between Mobile Genetic Elements and Antimicrobial Resistance Genes in Clinically Isolated Klebsiella pneumoniae. Genomics Proteom Bioinf. 2020;18:539–48. doi:10.1016/j.gpb.2020.12.002
51. Gaiarsa S, Bitar I, Comandatore F, et al. Can Insertion Sequences Proliferation Influence Genomic Plasticity? Comparative Analysis of Acinetobacter baumannii Sequence Type 78, a Persistent Clone in Italian Hospitals. Front Microbiol. 2019;10:2080. doi:10.3389/fmicb.2019.02080
52. Kuzina ES, Kislichkina AA, Sizova AA, et al. High-Molecular-Weight Plasmids Carrying Carbapenemase Genes blaNDM-1, blaKPC-2, and blaOXA-48 Coexisting in Clinical Klebsiella pneumoniae Strains of ST39. Microorganisms. 2023;11:459. doi:10.3390/microorganisms11020459
53. Mbelle NM, Feldman C, Sekyere JO, et al. Pathogenomics and Evolutionary Epidemiology of Multi-Drug Resistant Clinical Klebsiella pneumoniae Isolated from Pretoria, South Africa. Sci Rep. 2020;10:1232. doi:10.1038/s41598-020-58012-8
54. Rana C, Vikas V, Awasthi S, et al. Antimicrobial resistance genes and associated mobile genetic elements in Escherichia coli from human, animal and environment. Chemosphere. 2024;369:143808. doi:10.1016/j.chemosphere.2024.143808
55. Cai M, Song K, Wang R, et al. Tracking intra-species and inter-genus transmission of KPC through global plasmids mining. Cell Rep. 2024;43:114351. doi:10.1016/j.celrep.2024.114351
56. de la Gándara Á, Spínola-Amilibia M, Araújo-Bazán L, et al. Molecular basis for transposase activation by a dedicated AAA+ ATPase. Nature. 2024;630:1003–11. doi:10.1038/s41586-024-07550-6
57. Moffatt JH, Harper M, Adler B, et al. Insertion sequence ISAba11 is involved in colistin resistance and loss of lipopolysaccharide in Acinetobacter baumannii. Antimicrob Agents Chemother. 2011;55:3022–4. doi:10.1128/aac.01732-10
58. Darby EM, Trampari E, Siasat P, et al. Molecular mechanisms of antibiotic resistance revisited. Nat Rev Microbiol. 2023;21:280–95. doi:10.1038/s41579-022-00820-y

Additional Declarations
No competing interests reported.
Supplementary Files
SupplementaryFigureS1.pdf
SupplementaryFigureS2.pdf
SupplementaryFigureS3.pdf
SupplementaryFigureS4.pdf
SupplementaryTable.xlsx

Reviews received at journal: 15 May, 2026
Reviewers agreed at journal: 08 May, 2026
Reviewers agreed at journal: 08 May, 2026
Reviewers invited by journal: 08 May, 2026
Editor assigned by journal: 29 Apr, 2026
Editor invited by journal: 21 Apr, 2026
Submission checks completed at journal: 20 Apr, 2026
First submitted to journal: 20 Apr, 2026
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-9397300","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":640915256,"identity":"bff7df01-0f10-4964-95a5-9c2997f0946b","order_by":0,"name":"Yiming Wang","email":"","orcid":"","institution":"China Medical University","correspondingAuthor":false,"prefix":"","firstName":"Yiming","middleName":"","lastName":"Wang","suffix":""},{"id":640915260,"identity":"98a2d0dc-ba19-4fc4-bf5f-06b2c9565865","order_by":1,"name":"Xinran Li","email":"","orcid":"","institution":"China Medical University","correspondingAuthor":false,"prefix":"","firstName":"Xinran","middleName":"","lastName":"Li","suffix":""},{"id":640915266,"identity":"6f281df3-5253-4919-8540-c755d8408b8b","order_by":2,"name":"Ranran Gao","email":"","orcid":"","institution":"China Medical University","correspondingAuthor":false,"prefix":"","firstName":"Ranran","middleName":"","lastName":"Gao","suffix":""},{"id":640915275,"identity":"42afae31-17a6-4fd8-9354-06caf5e2bc75","order_by":3,"name":"Yuan Tian","email":"","orcid":"","institution":"China Medical University","correspondingAuthor":false,"prefix":"","firstName":"Yuan","middleName":"","lastName":"Tian","suffix":""},{"id":640915278,"identity":"8ae4c1b0-4da0-4712-8110-f09cd930aadd","order_by":4,"name":"Changjun Wang","email":"","orcid":"","institution":"Chinese PLA Center for Disease Control and Prevention","correspondingAuthor":false,"prefix":"","firstName":"Changjun","middleName":"","lastName":"Wang","suffix":""},{"id":640915284,"identity":"e3f87ff2-0270-4a37-af90-69da61f4dc3b","order_by":5,"name":"Yong 
Chen","email":"","orcid":"","institution":"Chinese PLA Center for Disease Control and Prevention","correspondingAuthor":false,"prefix":"","firstName":"Yong","middleName":"","lastName":"Chen","suffix":""},{"id":640915293,"identity":"428c7214-afd1-4a80-84bf-e8b3ba7ae55b","order_by":6,"name":"Dingchen Li","email":"","orcid":"","institution":"Chinese PLA Center for Disease Control and Prevention","correspondingAuthor":false,"prefix":"","firstName":"Dingchen","middleName":"","lastName":"Li","suffix":""},{"id":640915296,"identity":"42dc3fa9-f979-4fa7-a215-ce66eab3504d","order_by":7,"name":"Xiong Liu","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA1ElEQVRIiWNgGAWjYBACPmYgkQDE9ocZEh8kVNQQ1sIG08JwnOGxwYMzx4jQAmedZ3wm+bCFmQgt7DxmEg8q7tg1NjOnVSQ2sDHwt3cnEHAYj7FBwplnyc3MbGk3EnfIMEicObuBkBbDB4lth5OBDKCWM2wMBhK5BLUYHABp4WHm/1aQ2MZMlBawLXYSzAxpDERqYSsG+uVwggEzQ7JEwpljPAT9ws9/eJvkj4rD9gb8BxI//qiokeNv78WvBQYSG6AMHqKUg4A90SpHwSgYBaNg5AEA8w1C2ZR6PEwAAAAASUVORK5CYII=","orcid":"","institution":"Chinese PLA Center for Disease Control and Prevention","correspondingAuthor":true,"prefix":"","firstName":"Xiong","middleName":"","lastName":"Liu","suffix":""}],"badges":[],"createdAt":"2026-04-13 00:39:05","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-9397300/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-9397300/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":109759742,"identity":"992ecdbc-79c3-4308-9ccb-b507ef5d5dce","added_by":"auto","created_at":"2026-05-22 07:27:37","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":444913,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eAntimicrobial resistance profiles of 15 antibiotics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA. Antimicrobial susceptibility testing profiles against 15 antibiotics. 
The circle denotes the antibiotic type. Stacked bar plot showing the number of resistant (R, red) and sensitive (S, blue) samples for each antibiotic.\u003c/p\u003e\n\u003cp\u003eB. Distribution of isolates by number of antibiotics with resistant phenotypes. Darker colors represent resistance to a greater number of antibiotics, ranging from 0 (susceptible to all 13 agents) to 13 (resistant to 13 agents). The number and proportion of isolates for each resistance category are shown in the corresponding slice.\u003c/p\u003e","description":"","filename":"floatimage1.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/ae48930abec7d6d21ea4d7af.png"},{"id":109498658,"identity":"ce1b148c-a92d-4e05-b0b4-c716c0731e69","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":623753,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eComparative performance of six ML models across four feature sets for 15 antibiotics\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eRadar plots depicting the prediction accuracy of six machine learning models for 15 antibiotics: A. Random Forest (RF), B. Support Vector Machine (SVM), C. Gradient Boosting (GB), D. Decision Tree (DT), E. Logistic Regression (LR), and F. Naïve Bayes (NB). Each radar axis represents one antibiotic, scaled from 0.4 to 1 accuracy. 
The four concentric polygons in each plot correspond to the four feature sets evaluated: ARGs (red), IS elements (green), IS-ARG pairs (yellow), and the curated key feature set (blue).\u003c/p\u003e","description":"","filename":"floatimage2.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/a94670759d9fc6c4439a6add.png"},{"id":109759269,"identity":"50fa4b2a-b1b6-4e3d-b734-4fa0fc9b9778","added_by":"auto","created_at":"2026-05-22 07:26:23","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":316211,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eROC curves of six ML models against six representative antibiotics using the key feature set\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eReceiver operating characteristic (ROC) curves depicting the predictive performance of six machine learning models trained on the key feature set for\u003cstrong\u003e \u003c/strong\u003esix antibiotics: A.\u003cstrong\u003e \u003c/strong\u003eimipenem (IPM),\u003cstrong\u003e \u003c/strong\u003eB.\u003cstrong\u003e \u003c/strong\u003emeropenem (MEM), C.\u003cstrong\u003e \u003c/strong\u003eceftazidime (CAZ), D.\u003cstrong\u003e \u003c/strong\u003eciprofloxacin (CIP),\u003cstrong\u003e \u003c/strong\u003eE.\u003cstrong\u003e \u003c/strong\u003elevofloxacin (LVX),\u003cstrong\u003e \u003c/strong\u003eand\u003cstrong\u003e \u003c/strong\u003eF.\u003cstrong\u003e \u003c/strong\u003etrimethoprim (TMP). 
For each antibiotic, the area under the ROC curve (AUC) is reported for all six models in the accompanying legend.\u003c/p\u003e","description":"","filename":"floatimage3.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/382e0be506f9ba7a05eb42dd.png"},{"id":109498668,"identity":"ef49e53e-4134-4852-85be-e12d892a76bc","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":434512,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFeature importance and SHAP interpretation for ceftazidime (CAZ) resistance prediction using the RF model\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA. Scatter plot displaying feature importance for CAZ prediction in the RF model. The x‑axis shows significance from Fisher’s exact test comparing feature presence in resistant and susceptible isolates (–log₁₀(\u003cem\u003eP\u003c/em\u003e)). The y‑axis shows the rank of each feature (lower rank indicates higher importance). ARGs are highlighted in red, IS elements in green, and IS-ARG pairs in black. Dashed gray line marks the dense rank threshold of 100. The top 15 IS elements and ARGs are labeled.\u003c/p\u003e\n\u003cp\u003eB. SHAP summary bar plot. Mean absolute SHAP values for the top 15 features in the key feature sets. The color of the dots from blue to red shows the feature values from low to high.\u003c/p\u003e\n\u003cp\u003eC. SHAP beeswarm plot showing the distribution of SHAP values for each top feature. Each point represents an isolate; color indicates the feature value (red: high, blue: low). Positive SHAP values indicate contribution to resistance; negative values to susceptibility.\u003c/p\u003e\n\u003cp\u003eD. SHAP force plots for representative CAZ-susceptible and CAZ-resistant isolates. The plots visualize how individual genomic features drive the model’s prediction. 
Red bars push prediction toward resistance; blue bars toward susceptibility. The base value represents the mean model output across all isolates.\u003c/p\u003e","description":"","filename":"floatimage5.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/01bbf8340612aa06993e98b2.png"},{"id":109498664,"identity":"d6a90e71-ca28-45be-ada5-edd036d1389c","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"png","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":434367,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFeature importance and SHAP interpretation for ciprofloxacin (CIP) resistance prediction using the RF model\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA. Scatter plot displaying feature importance for CIP prediction in the RF model, B. SHAP summary plot, C. SHAP beeswarm plot, and D. SHAP force plots for representative CIP-susceptible and CIP-resistant isolates.\u003c/p\u003e","description":"","filename":"floatimage6.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/59783ad5720438cd4fb72af1.png"},{"id":109498665,"identity":"2bb3f74f-fd58-4aa1-89e1-11a7b91494d0","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"png","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":418337,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eFeature importance and SHAP interpretation for levofloxacin (LVX) resistance prediction using the RF model\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eA. Scatter plot displaying feature importance for LVX prediction in the RF model, B. SHAP summary plot, C. SHAP beeswarm plot, and D. 
SHAP force plots for representative LVX-susceptible and LVX-resistant isolates.\u003c/p\u003e","description":"","filename":"floatimage7.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/407a3fbc2a0b661fd3695a17.png"},{"id":109760494,"identity":"8b611b3f-d623-489b-9de5-5486cb6213ec","added_by":"auto","created_at":"2026-05-22 07:28:45","extension":"png","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":825819,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eUnsupervised hierarchical clustering based on IS-ARG pair profiles and isolates\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eBidirectional unsupervised hierarchical clustering was performed using Euclidean distance and the complete linkage method. The analysis stratified the 2,732 isolates into seven distinct clusters. The input matrix comprised binary presence/absence (orange: present; green: absent) of IS-ARG pairs (middle heatmap), alongside corresponding ARG and IS element profiles (lower heatmaps). The upper bar depicted the antimicrobial susceptibility phenotypes for 15 antibiotics, with blue indicating susceptible and red indicating resistant.\u003c/p\u003e","description":"","filename":"floatimage8.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/6a9ef3b186612fc02addc2c1.png"},{"id":109759985,"identity":"a7f3f9cf-15c4-4dfa-9d1e-4a5963cf8404","added_by":"auto","created_at":"2026-05-22 07:28:01","extension":"png","order_by":9,"title":"Figure 9","display":"","copyAsset":false,"role":"figure","size":619926,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eCo-occurrence patterns of IS-ARG pairs in 2,732 isolates\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eSankey diagram showing the prevalence of different IS elements (left) detected closely to the ARGs (right). A total of 18 distinct IS elements were identified within 5 kb upstream or downstream of ARGs, forming 314 unique IS-ARG associations. 
The height of each axis is proportional to the number of isolates with co-occurrence events.\u003c/p\u003e","description":"","filename":"floatimage9.png","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/b2de5dcca82e3f5d20299f21.png"},{"id":109760297,"identity":"2264302b-1353-441f-9904-cbaeb3a2bab1","added_by":"auto","created_at":"2026-05-22 07:28:28","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2471910,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/a2a4187e-7210-4467-91ec-4165dcc1bbe9.pdf"},{"id":109498655,"identity":"3cd78bff-2067-44fe-948c-8d2fd23903b3","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"supplement","size":1099923,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigureS1.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/015e43de0a86920140667b9c.pdf"},{"id":109498657,"identity":"25768f3c-f803-4fb9-8eb1-5b52bb04f0fb","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"pdf","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":1086303,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigureS2.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/eafe8556e93db782a629e475.pdf"},{"id":109760215,"identity":"3637ab46-977e-42be-b76b-3e92b7ddec9e","added_by":"auto","created_at":"2026-05-22 
07:28:19","extension":"pdf","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":975045,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigureS3.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/03a1db4317a3c35e2f6c2f87.pdf"},{"id":109498661,"identity":"d22bea6a-ebcf-4173-9494-6a56af6cda92","added_by":"auto","created_at":"2026-05-18 21:11:34","extension":"pdf","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":312363,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryFigureS4.pdf","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/7befc7673331e17153fe1eaa.pdf"},{"id":109799801,"identity":"03bd8866-9b55-4057-8c2c-dd34562551c7","added_by":"auto","created_at":"2026-05-22 15:34:11","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":5821770,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryTable.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-9397300/v1/67102b8cfb969447a3318df3.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Prediction of antimicrobial resistance of Klebsiella pneumoniae through machine learning and revelation of significant role of insertion sequence elements","fulltext":[{"header":"Introduction","content":"\u003cp\u003eAntimicrobial resistance (AMR) represents one of the most formidable challenges to modern medicine, with \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e (\u003cem\u003eK. pneumoniae\u003c/em\u003e) emerging as a critical priority pathogen due to its capacity to acquire and disseminate multidrug resistance (MDR) determinants [\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e]. 
The increasing prevalence of carbapenem-resistant and extensively drug-resistant \u003cem\u003eK. pneumoniae\u003c/em\u003e has compromised therapeutic options and is associated with elevated mortality, prolonged hospitalization, and substantial healthcare costs [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e]. Whole-genome sequencing (WGS) has revolutionized the surveillance and prediction of AMR by enabling comprehensive exploration of the antibiotic resistance genes (ARGs) harbored by an organism [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e]. Machine learning (ML) approaches applied to WGS data have demonstrated considerable promise in predicting resistance phenotypes, often achieving accuracies exceeding 90% for several pathogens [\u003cspan additionalcitationids=\"CR8\" citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e]. However, a number of these predictive models focused exclusively on the presence or absence of ARGs, implicitly treating them as context-independent determinants of resistance [\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eThe expression, mobility, and phenotypic impact of ARGs are profoundly influenced by the surrounding genomic context [\u003cspan additionalcitationids=\"CR13\" citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e]. Insertion sequence (IS) elements, as the smallest and most abundant mobile genetic elements, are increasingly recognized as key modulators of this context [\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e]. 
IS elements can activate silent ARGs through promoter capture, enhance their expression via duplication of strong promoters, facilitate their dissemination by forming composite transposons, and generate genomic diversity through insertional inactivation of regulatory genes [\u003cspan additionalcitationids=\"CR17 CR18 CR19\" citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e]. Despite these established roles in modulating ARGs behavior, the contribution of IS elements to resistance phenotypes remains largely unexplored. Moreover, the co-localization of IS elements and ARGs represents a genomic signature that captures both the presence of a resistance determinant and its mobilization potential [\u003cspan additionalcitationids=\"CR22\" citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e23\u003c/span\u003e]. While an individual study has documented IS-ARG associations in \u003cem\u003eAcinetobacter baumannii (A. baumannii)\u003c/em\u003e [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e], a systematic assessment of how these paired elements influence resistance phenotypes has not been performed in \u003cem\u003eK. pneumoniae\u003c/em\u003e.\u003c/p\u003e \u003cp\u003eIn this study, we conducted an analysis of 2,732 publicly available \u003cem\u003eK. pneumoniae\u003c/em\u003e genomes with associated antimicrobial susceptibility phenotypes. We constructed and systematically evaluated four distinct feature sets across six ML algorithms for resistance prediction against 15 antibiotics. By employing feature selection and SHAP-based model interpretability, we sought to elucidate the genomic factors contributing to AMR and characterize the co-occurrence patterns of IS elements and ARGs. 
Elucidating these interactions is critical for integrating mobile genetic elements into AMR prediction frameworks and for combating the spread of multidrug-resistant pathogens.\u003c/p\u003e"},{"header":"Materials and methods","content":"\u003cp\u003eData collection and isolate selection\u003c/p\u003e \u003cp\u003eGenome sequences and corresponding antimicrobial susceptibility phenotypes for \u003cem\u003eK. pneumoniae\u003c/em\u003e isolates from 2004 to 2024 were retrieved from the Pathosystems Resource Integration Center (PATRIC; \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.bv-brc.org/\u003c/span\u003e\u003cspan address=\"https://www.bv-brc.org/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The selection criteria were as follows: (i) phenotypic evidence derived from laboratory methods; (ii) phenotypes classified as either resistant or susceptible; and (iii) a minimum of 80 isolates per phenotype class, with relatively balanced group sizes. Phenotypic susceptibility data encompassed 15 antibiotics from nine distinct classes: carbapenems (imipenem [IPM] and meropenem [MEM]), cephalosporins (ceftazidime [CAZ] and cefepime [FEP]), a cephamycin (cefoxitin [FOX]), penicillins (piperacillin-tazobactam [PTZ] and ticarcillin-clavulanate [TIM]), fluoroquinolones (ciprofloxacin [CIP] and levofloxacin [LVX]), aminoglycosides (amikacin [AK], gentamicin [GEN], and tobramycin [TOB]), an antifolate (trimethoprim [TMP]), tetracycline (TET), and a sulfonamide combination (trimethoprim/sulfamethoxazole [TMP/SMX]). 
Based on the genomic quality assessment results of CheckM v1.2.2 [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e], isolates with a genome completeness\u0026thinsp;\u0026ge;\u0026thinsp;90% and contamination\u0026thinsp;\u0026lt;\u0026thinsp;5% were retained.\u003c/p\u003e \u003cp\u003eIdentification of genomic features\u003c/p\u003e \u003cp\u003eARGs were annotated using Abricate v1.0.1 against the Comprehensive Antibiotic Resistance Database (CARD) [\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e]. IS elements were identified with ISEScan v1.7.2.3 [\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e], applying filters to exclude partial elements or sequences shorter than 400 bp. Unclassified sequences were identified through alignment against the ISFinder reference database (\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www-is.biotoul.fr/\u003c/span\u003e\u003cspan address=\"https://www-is.biotoul.fr/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e) [\u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e28\u003c/span\u003e]. An IS-ARG pair was identified by three criteria: (i) the ARG and its associated IS element located on the same strand of the sequence; (ii) consistent directionality; and (iii) the base-pair distance not exceeding 5 kb [\u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e29\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eFeature selection and model construction\u003c/p\u003e \u003cp\u003eModel predictions were generated using four feature sets as binary matrices (presence\u0026thinsp;=\u0026thinsp;1, absence\u0026thinsp;=\u0026thinsp;0): (i) ARGs, (ii) IS elements, (iii) IS-ARG pairs, and (iv) key feature sets. The dataset was randomly divided into a training set (80%) and a test set (20%). 
For each antibiotic, a frequency-based filter was applied to retain features detected in \u0026gt;\u0026thinsp;2% of all isolates. Key feature sets were selected using the RandomForestClassifier with stratified 5-fold cross-validation. Within each fold, the importance of each feature was evaluated and ranked using the rank function. Features with a mean rank\u0026thinsp;\u0026lt;\u0026thinsp;100 across folds constituted the key feature set for each antibiotic. The Synthetic Minority Over-sampling Technique (SMOTE) was employed to address class imbalance in the training data [\u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e30\u003c/span\u003e]. Six machine learning algorithms were implemented using scikit-learn: Random Forest (RF), Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), Naive Bayes (NB), and Gradient Boosting (GB). All ML models were initially trained using the default parameter settings.\u003c/p\u003e \u003cp\u003eModel evaluation and interpretation\u003c/p\u003e \u003cp\u003eModel performance was evaluated on the test set using accuracy, recall, positive predictive value (PPV), F1-score, and the area under the receiver operating characteristic (ROC) curve (AUC). SHapley Additive exPlanations (SHAP) v0.39.0 was employed to provide model interpretability. SHAP summary plots provided a global view of feature importance, while force plots illustrated the contribution of individual features to the prediction for a single isolate.\u003c/p\u003e \u003cp\u003eEnrichment and cluster analysis\u003c/p\u003e \u003cp\u003eEnrichment analysis of ARGs and IS elements in key feature sets was conducted by hypergeometric testing [\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e]. 
Bidirectional unsupervised hierarchical clustering [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e] based on IS-ARG pairs was applied using the Euclidean distance metric and the complete linkage method [\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e].\u003c/p\u003e \u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eStatistical analysis\u003c/h2\u003e \u003cp\u003eStatistical analyses were performed using Python v3.12 and R v4.4.3. Associations between categorical variables were assessed using Fisher's exact test, with the Benjamini-Hochberg method applied for multiple testing correction to control the false discovery rate (FDR) at \u0026lt;\u0026thinsp;0.05.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cp\u003eDataset and phenotypic profile\u003c/p\u003e \u003cp\u003eThis study utilized a dataset of 2,732 \u003cem\u003eK. pneumoniae\u003c/em\u003e isolates obtained from the PATRIC database for AMR prediction (Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e). Among these isolates, high resistance rates were observed for CAZ (81.59%), followed by LVX (78.58%) and CIP (74.79%). Notably, a total of 2,210 isolates (80.89%) exhibited an MDR phenotype, defined as resistance to three or more classes of antibiotics. Fifteen isolates were found to be resistant to 13 classes of antibiotics (Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e and Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eStatistics of the antibiotic resistance of 2,732 \u003cem\u003eK. 
pneumoniae\u003c/em\u003e isolates\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"6\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"char\" char=\".\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAntibiotic\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAbbreviation\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAntibiotic classes\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAction Mechanism\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eNumber of resistant isolates\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eNumber of sensitive isolates\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eImipenem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eIPM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCarbapenem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e585\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e \u003cp\u003e1069\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eMeropenem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eMEM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCarbapenem\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e926\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e1687\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCefepime\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFEP\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCephalosporin (Fourth Generation)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1250\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e807\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCeftazidime\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCAZ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCephalosporin (Third Generation)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e2262\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e 
\u003cp\u003e512\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCefoxitin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eFOX\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eCephamycin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1129\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e938\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003ePiperacillin\u003c/p\u003e \u003cp\u003e/Tazobactam\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003ePTZ\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePenicillin (Beta-lactamase Inhibitor Combination)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1306\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e810\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTicarcillin\u003c/p\u003e \u003cp\u003e/Clavulanic Acid\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTIM\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003ePenicillin (Beta-lactamase Inhibitor Combination)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eCell Wall Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e87\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e154\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eCiprofloxacin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eCIP\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eFluoroquinolone\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eDNA Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1827\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e610\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eLevofloxacin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eLVX\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eFluoroquinolone\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eDNA Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1347\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e364\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTrimethoprim\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTMP\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAntifolate (Dihydrofolate Reductase Inhibitor)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFolic Acid Metabolism\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e120\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" 
char=\".\" colname=\"c6\"\u003e \u003cp\u003e157\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTrimethoprim\u003c/p\u003e \u003cp\u003e/Sulfamethoxazole\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTMP/SMX\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eSulfonamide (Combination)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eFolic Acid Metabolism\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1625\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e581\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eAmikacin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eAK\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAminoglycoside\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eProtein Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e419\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e2355\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eGentamicin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eGEN\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAminoglycoside\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eProtein Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e1236\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" 
colname=\"c6\"\u003e \u003cp\u003e1443\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTobramycin\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTOB\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eAminoglycoside\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eProtein Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e917\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e832\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003eTetracycline\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eTET\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eTetracycline\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eProtein Synthesis\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c5\"\u003e \u003cp\u003e824\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"char\" char=\".\" colname=\"c6\"\u003e \u003cp\u003e630\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eLandscape of ARGs and IS elements\u003c/p\u003e \u003cp\u003eA total of 124,851 ARGs were identified across all isolates. β-lactamase genes were the most dominant category, constituting 59.07% of all detected ARGs. 
Other prevalent resistance mechanisms included efflux pump systems associated with MDR (8.76%, n\u0026thinsp;=\u0026thinsp;10,937), aminoglycoside-modifying enzymes (8.04%, n\u0026thinsp;=\u0026thinsp;10,033), polymyxin resistance determinants (6.93%, n\u0026thinsp;=\u0026thinsp;8,646), and quinolone resistance genes (6.42%, n\u0026thinsp;=\u0026thinsp;8,015) (Table \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e). Among all genes, the polymyxin resistance gene \u003cem\u003earnT\u003c/em\u003e was the most frequent, detected in 2,658 isolates (97.29%), followed by the efflux pump genes \u003cem\u003emdtQ\u003c/em\u003e (96.93%) and \u003cem\u003eacrA\u003c/em\u003e (96.89%).\u003c/p\u003e \u003cp\u003eA total of 62,347 occurrences of IS elements were identified. These were grouped into 64 distinct IS types belonging to 22 transposon families (Table \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e). The IS\u003cem\u003e3\u003c/em\u003e family was the most abundant (34.35%, n\u0026thinsp;=\u0026thinsp;21,406), followed by the IS\u003cem\u003e5\u003c/em\u003e family (9.14%, n\u0026thinsp;=\u0026thinsp;5,696). The most frequently detected individual IS elements were IS\u003cem\u003eNCY_229\u003c/em\u003e (present in 97.47% of isolates), IS\u003cem\u003e1_316\u003c/em\u003e (90.41%), and IS\u003cem\u003e6_292\u003c/em\u003e (88.07%).\u003c/p\u003e \u003cp\u003ePerformance of ML models in AMR prediction\u003c/p\u003e \u003cp\u003eFour distinct feature sets (ARGs, IS elements, IS-ARG pairs, and key feature sets) were evaluated for their predictive performance using six ML models (Table \u003cspan refid=\"MOESM4\" class=\"InternalRef\"\u003eS4\u003c/span\u003e). Across 15 antibiotics, RF and SVM outperformed GB, LR, DT, and NB, with mean accuracies of 92.22% and 88.83%, respectively. Among the four feature sets, IS elements demonstrated superior performance for several drugs. 
The RF model using IS elements achieved the highest accuracy of any model across the 15 antibiotics (97.73% for LVX), and also reached accuracies of 94.28% (MEM), 96.01% (AK), and 93.44% (TMP/SMX), with corresponding AUCs of 98.17%, 98.91%, and 98.22%, respectively. By comparison, models using ARGs exhibited a higher mean accuracy (92.08%) and greater stability, with accuracies ranging from 80.86% to 97.21%, than those using IS elements (mean 89.82%, range 62.96% to 97.73%) (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e). Notably, RF reached top accuracies of 97.20% for CAZ and 96.42% for IPM, while SVM delivered its best performance for CIP (97.21%). Additionally, models based on the key feature sets exhibited consistent performance, with a mean accuracy of 90.16% (range 72.60% to 96.36%) and SVM attaining the highest accuracy of 96.36% for TMP. In contrast, almost all models utilizing the IS-ARG pairs showed lower accuracies.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eGiven the balanced performance of the key feature sets, we assessed the discriminative ability (AUC) for six representative antibiotics: carbapenems (IPM and MEM), a cephalosporin (CAZ), fluoroquinolones (CIP and LVX), and the antifolate trimethoprim (TMP) (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003e). All models achieved AUCs exceeding 82.66% for these antibiotics. Notably, the SVM model, using the key feature set, reached an exceptional AUC of 99.87% for TMP, while RF yielded 99.60% for CIP. In summary, the RF model emerged as a high-performing classifier across multiple antibiotics. 
Its predictive performance was highest when IS elements were used as genomic features, indicating their potential to improve predictive accuracy for specific antibiotics.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eImportance of features interpreted by SHAP values\u003c/p\u003e \u003cp\u003eIn addition to ARGs, RF identified a number of IS elements as crucial features for prediction (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e). For the majority of antibiotics (93.33%, 14/15), the number of ARGs in these key sets exceeded that of IS elements, with TOB being the sole exception. Among the 5,159 genetic elements initially identified across all isolates, 13.14% were ARGs and 3.52% were IS elements. Enrichment analysis confirmed that both ARGs and IS elements were enriched in key feature sets of all drugs (average fold change\u0026thinsp;\u0026ge;\u0026thinsp;2, \u003cem\u003eP\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;0.05) [\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e] (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe scatter plots for six antibiotics (IPM, MEM, CAZ, CIP, LVX, and TMP) revealed that IS\u003cem\u003ePa38\u003c/em\u003e was retained in the key feature sets of all antibiotics. The SHAP analysis elucidated the impact of specific genetic elements. The key features identified by the two analytical methods were largely consistent. For IPM and MEM, most of the red dots were clustered on the negative side of the x-axis, indicating that the presence of these genetic elements lowered the predicted resistance of \u003cem\u003eK. pneumoniae\u003c/em\u003e. 
Notably, known genes associated with carbapenem resistance, such as \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;2\u003c/sub\u003e and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;3\u003c/sub\u003e, were among the key features. IS\u003cem\u003e1182_104\u003c/em\u003e was also identified as a crucial predictor for IPM and MEM (Figure \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e and Figure \u003cspan refid=\"MOESM2\" class=\"InternalRef\"\u003eS2\u003c/span\u003e). For CAZ, IS\u003cem\u003e1380_141\u003c/em\u003e stood out as a key indicator for resistance prediction, followed by \u003cem\u003eaac(6')-Ib10\u003c/em\u003e and \u003cem\u003eqnrB17\u003c/em\u003e (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). For CIP resistance prediction, \u003cem\u003eaac(6')-Ib10\u003c/em\u003e, IS\u003cem\u003e1380_141\u003c/em\u003e, and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;33\u003c/sub\u003e contributed the most to the model (Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e6\u003c/span\u003e). For LVX, \u003cem\u003efosA5\u003c/em\u003e, \u003cem\u003eKpnE\u003c/em\u003e, and IS\u003cem\u003e66_43\u003c/em\u003e showed positive associations with resistance (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). The three features that contributed most to TMP prediction were IS\u003cem\u003e1380_141\u003c/em\u003e, \u003cem\u003eqnrB5\u003c/em\u003e, and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eCTX\u0026minus;M\u0026minus;62\u003c/sub\u003e, all of which were negatively associated with resistance (Figure \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e). The SHAP force plots visualized the impact of key features on individual isolate predictions. 
For instance, in the representative CAZ-sensitive isolate, the cumulative contribution of sensitivity-associated markers (IS\u003cem\u003e1380_141\u003c/em\u003e, \u003cem\u003eaac(6\u0026prime;)-Ib10, bla\u003c/em\u003e\u003csub\u003eCTX\u0026minus;M\u0026minus;15\u003c/sub\u003e) outweighed that of resistance-promoting features (\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;62\u003c/sub\u003e, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;108\u003c/sub\u003e, \u003cem\u003ebacA\u003c/em\u003e), driving a SHAP value (0.02) below the base value. In contrast, in the CAZ-resistant isolate, the value (0.96) exceeded the base value (Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e7\u003c/span\u003e). These findings indicated that the predictions were driven by the combined effect of multiple features and that IS elements were critical in AMR prediction of \u003cem\u003eK. pneumoniae\u003c/em\u003e.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eIS-ARG co-occurrence patterns as determinants of MDR\u003c/p\u003e \u003cp\u003eWe explored the association between IS-ARG relationships and resistance phenotypes.\u003c/p\u003e \u003cp\u003eBased on IS-ARG pairs, bidirectional unsupervised clustering stratified the isolates into seven clusters, displaying different resistance patterns (Fig.\u0026nbsp;\u003cspan refid=\"Fig8\" class=\"InternalRef\"\u003e8\u003c/span\u003e). For a subset of isolates in cluster 1, although IS elements and ARGs were distributed in their genomes, IS-ARG pairs were missing. These isolates demonstrated resistance to CAZ, CIP, and LVX (Figure \u003cspan refid=\"MOESM4\" class=\"InternalRef\"\u003eS4\u003c/span\u003e). This suggested a correlation between the lack of specific IS-ARG pairs and susceptibility phenotypes. Conversely, the presence of specific IS-ARG pairs was associated with MDR phenotypes. 
For a distinct subset of isolates in cluster 1, the associations between IS\u003cem\u003e21\u003c/em\u003e elements (IS\u003cem\u003e21_203\u003c/em\u003e and IS\u003cem\u003e21_259\u003c/em\u003e) and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u003c/sub\u003e genes (including \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;2\u003c/sub\u003e, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;101\u003c/sub\u003e, and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;3\u003c/sub\u003e) corresponded to resistance against eight antibiotics (FEP, FOX, CAZ, CIP, IPM, LVX, MEM, and PTZ). Similar patterns were observed in other clusters. In cluster 2, co-occurrences of IS\u003cem\u003e5_222\u003c/em\u003e and IS\u003cem\u003e91_50\u003c/em\u003e with \u003cem\u003earmA\u003c/em\u003e, \u003cem\u003emsrE\u003c/em\u003e, \u003cem\u003eqacE\u003c/em\u003e, and \u003cem\u003eqacEΔ1\u003c/em\u003e genes were associated with resistance to eight antibiotics (AK, CAZ, CIP, GEN, PTZ, TOB, TMP, and TMP/SMX). In cluster 7, pairings of IS\u003cem\u003e6_292\u003c/em\u003e with \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;217\u003c/sub\u003e, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;44\u003c/sub\u003e, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;119\u003c/sub\u003e, and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;11\u003c/sub\u003e were linked to resistance to five drugs (FEP, CAZ, CIP, LVX, and TMP/SMX).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eWe further analyzed the distribution of the 314 IS-ARG pairs across the 2,732 isolates. 
The most prevalent combination was IS\u003cem\u003eNCY_229\u003c/em\u003e-\u003cem\u003eacrA\u003c/em\u003e (present in 2.57% of isolates), followed by IS\u003cem\u003e6_292\u003c/em\u003e-\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u0026minus;187\u003c/sub\u003e (1.52%) (Table \u003cspan refid=\"MOESM5\" class=\"InternalRef\"\u003eS5\u003c/span\u003e). Our analysis revealed extensive co-occurrence of IS elements with specific ARGs. Combinations involving IS\u003cem\u003e6\u003c/em\u003e and β-lactamase genes represented the predominant category, accounting for 59.87% of all pairs. Within this group, IS\u003cem\u003e6_292\u003c/em\u003e was most frequently associated with \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u003c/sub\u003e genes, accounting for 35.99% of all 314 pairs, followed by \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eLEN\u003c/sub\u003e and \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eOKP\u0026minus;B\u003c/sub\u003e (each 8.91%). Other stable pairings included IS\u003cem\u003e91_365\u003c/em\u003e-\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eTEM\u003c/sub\u003e and IS\u003cem\u003e110_208\u003c/em\u003e-\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eTEM\u003c/sub\u003e. Notably, conserved bidirectional IS-ARG pairs were also identified. The IS\u003cem\u003e21\u003c/em\u003e elements (IS\u003cem\u003e21_203\u003c/em\u003e and IS\u003cem\u003e21_259\u003c/em\u003e) were found to be exclusively associated with \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u003c/sub\u003e genes, suggesting a role in mediating carbapenem resistance (Fig.\u0026nbsp;\u003cspan refid=\"Fig9\" class=\"InternalRef\"\u003e9\u003c/span\u003e).\u003c/p\u003e \u003cp\u003e \u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eML models offer a powerful framework for predictive tasks and have demonstrated broad applicability in \u003cem\u003eK. 
pneumoniae\u003c/em\u003e [\u003cspan additionalcitationids=\"CR36 CR37\" citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e]. Based on this, we trained six ML algorithms on four feature sets (ARGs, IS elements, IS-ARG pairs, and key feature sets) to predict AMR phenotypes. The model evaluation demonstrated that the RF model delivered superior predictive performance, attaining the highest accuracy (97.73%) and AUC values, followed by SVM. This finding aligns with prior studies establishing RF as an effective algorithm for AMR prediction from WGS data [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e]. Tree-based ensemble methods and SVM frequently achieve strong performance in genomic prediction by effectively modeling high-dimensional and non-linear patterns in WGS data [\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e, \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e, \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. Moreover, RF possesses inherent interpretability through feature importance ranking, which improves model transparency. The models based on ARGs exhibited a higher average accuracy (92.08%) than those based on IS elements (89.82%); ARGs primarily function through mechanisms such as enzymatic drug inactivation, target protection/modification, and active efflux [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e, \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eNevertheless, our findings revealed that IS elements were crucial for predicting AMR to specific drugs, such as LVX, MEM, TMP/SMX, and AK. 
IS elements including IS\u003cem\u003e110\u003c/em\u003e, IS\u003cem\u003e5\u003c/em\u003e, IS\u003cem\u003eL3\u003c/em\u003e, and IS\u003cem\u003ePa38\u003c/em\u003e were selected as key features for these antibiotics. IS elements play a significant role in AMR through various mechanisms. For instance, IS\u003cem\u003e110\u003c/em\u003e family transposases co-occur with Tn\u003cem\u003e3\u003c/em\u003e in bacterial resistance islands and are found integrated in plasmids, where their mode of action may facilitate excision, circular DNA formation, and targeted integration [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e, \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. The pKpQIL plasmid, which harbors IS\u003cem\u003eL3\u003c/em\u003e and IS\u003cem\u003e5\u003c/em\u003e elements, is distributed among KPC-producing \u003cem\u003eK. pneumoniae\u003c/em\u003e and may functionally disrupt inner membrane proteins, thereby conferring AMR [\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e]. Remarkably, IS\u003cem\u003ePa38\u003c/em\u003e was retained in the key feature sets of all 15 antibiotics. IS\u003cem\u003ePa38\u003c/em\u003e, a Tn\u003cem\u003e3\u003c/em\u003e family transposon, facilitates gene insertion into the DNA replisome through interaction with the host-encoded β-sliding clamp (DnaN), promoting the dissemination and expression of resistance determinants [\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e]. SHAP-based interpretation further identified the specific IS elements driving model decisions. As the top-contributing feature across multiple antibiotics, IS\u003cem\u003e1380\u003c/em\u003e is frequently found in prophage regions adjacent to β-lactamase genes and other resistance genes on hypervirulence plasmids in \u003cem\u003eK. 
pneumoniae\u003c/em\u003e [\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e]. The conserved spatial proximity between IS\u003cem\u003e1380\u003c/em\u003e and its cognate ARGs may facilitate horizontal gene transfer, thereby shaping AMR outcomes [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. Specific IS elements also participate in genomic regulation in other bacterial species. For example, the proliferation of multiple IS classes, particularly IS\u003cem\u003e66\u003c/em\u003e, is considered a hallmark of genomic plasticity in \u003cem\u003eA. baumannii\u003c/em\u003e [\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eAmong 2,732 \u003cem\u003eK. pneumoniae\u003c/em\u003e isolates, analysis of IS-ARG co-occurrence identified 314 distinct pairs, many exhibiting preferential associations. Single-molecule real-time sequencing has detected substantial copy numbers of IS\u003cem\u003e6\u003c/em\u003e and Tn\u003cem\u003e3\u003c/em\u003e elements in \u003cem\u003eK. pneumoniae\u003c/em\u003e genomes, with IS\u003cem\u003e6\u003c/em\u003e members representing the most frequently observed transposon family, suggesting their transposability and sustained activity across diverse genetic backgrounds [\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e]. The predominant association of IS\u003cem\u003e6_292\u003c/em\u003e with \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eSHV\u003c/sub\u003e genes, comprising 35.99% of all pairs, reflects the dissemination of specific mobile genetic elements in \u003cem\u003eK. pneumoniae\u003c/em\u003e. In our study, the majority of IS-ARG interactions exhibited conserved patterns. 
The coexistence of \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eTEM\u0026minus;1\u003c/sub\u003e and IS\u003cem\u003e110\u003c/em\u003e in hybrid plasmids, as well as \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eTEM\u0026minus;1\u003c/sub\u003e flanked by Tn\u003cem\u003e3\u003c/em\u003e and IS\u003cem\u003e91\u003c/em\u003e, have been reported in clinical \u003cem\u003eK. pneumoniae\u003c/em\u003e isolates from different countries [\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e, \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e]. This structural conservation indicates that specific IS-ARG pairs are integrated as composite transposons [\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e]. Importantly, IS\u003cem\u003e21\u003c/em\u003e elements exhibit an exclusive association with \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u003c/sub\u003e genes, which are typically embedded within Tn\u003cem\u003e4401\u003c/em\u003e transposons [\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e, \u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e], with IS\u003cem\u003e21\u003c/em\u003e-mediated transposition requiring the coordinated action of istA and istB [\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e]. Bidirectional unsupervised clustering revealed that the occurrence of IS\u003cem\u003e21\u003c/em\u003e-\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;2\u003c/sub\u003e and IS\u003cem\u003e21\u003c/em\u003e-\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eKPC\u0026minus;3\u003c/sub\u003e correlated with resistance to eight antibiotics, corroborating the role of Tn\u003cem\u003e4401\u003c/em\u003e-like transposons in carbapenemase dissemination. 
This finding reinforces the role of istA/istB-driven chromosomal rearrangements in the dissemination of resistance determinants.\u003c/p\u003e \u003cp\u003eIn \u003cem\u003eA. baumannii\u003c/em\u003e, the interactions between IS elements and ARGs have been reported to be responsible for the resistance to certain antibiotics, such as IS\u003cem\u003eAba1\u003c/em\u003e-\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eOXA\u0026minus;23\u003c/sub\u003e [\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e]. Prior investigations have further established that IS-ARG interactions contribute to AMR phenotypes and have confirmed their mode of action [\u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e24\u003c/span\u003e]. The \u003cem\u003eA. baumannii\u003c/em\u003e isolates lacking IS-ARG pairs in their genomes were sensitive to almost all 20 drugs, while certain \u003cem\u003eK. pneumoniae\u003c/em\u003e isolates exhibited resistance despite lacking these genetic elements. This highlights a species-specific difference in the genetic architecture of AMR. Nevertheless, models based on the IS-ARG pairs yielded inferior predictive performance. Further investigation is needed to elucidate the contributions of IS-ARG pairs to resistance in \u003cem\u003eK. pneumoniae\u003c/em\u003e [\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eSeveral limitations of this study should be acknowledged. First, our analysis was restricted to publicly available genomes from the PATRIC database. The lack of externally collected isolates for independent validation limits the generalizability of our findings. Second, we confined our input space to ARGs and IS elements. The exclusion of other genomic features has limited our ability to definitively attribute predictive importance solely to these two feature classes. 
Third, despite identifying associations between specific IS-ARG pairs and resistance phenotypes, definitive causal relationships remain to be established. Experimental validation through targeted gene knockout, complementation assays, and transcriptional profiling is required to determine the contributions of these paired elements. These limitations notwithstanding, our study provides a scalable analytical framework and testable hypotheses for future investigation.\u003c/p\u003e \u003cp\u003eOverall, our study constructed binary classification models for AMR prediction across diverse antibiotic agents. We illustrated the associations between IS elements and AMR prediction, and explored the impact of IS-ARG interactions on MDR profiles. By highlighting the contribution of mobile genetic elements beyond ARGs, this work provides a foundation for incorporating genomic context into AMR prediction and mechanistic studies. Elucidating these interactions will be essential for developing targeted interventions to mitigate the spread of AMR in \u003cem\u003eK. pneumoniae\u003c/em\u003e.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003e\u003cem\u003eK. 
pneumoniae\u003c/em\u003e\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003e \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e \u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eMDR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMultidrug resistance\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eARGs\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eAntibiotic resistance genes\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eIS\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eInsertion sequence\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eAMR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eAntimicrobial resistance\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eML\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eMachine learning\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003ePPV\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePositive predictive value\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eROC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eReceiver operating characteristic\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eAUC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eArea under the ROC curve\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e 
\u003cdiv class=\"Term\"\u003eWGS\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eWhole-genome sequencing\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eCARD\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eComprehensive Antibiotic Resistance Database\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eRF\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eRandom forest\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSVM\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSupport vector machine\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eLR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eLogistic regression\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eDT\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eDecision tree\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eNB\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eNaive Bayes\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eGB\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eGradient boosting\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSHAP\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSHapley Additive exPlanations\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv 
class=\"Term\"\u003eFDR\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eFalse discovery rate\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003ePATRIC\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003ePathosystems Resource Integration Center\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003cdiv class=\"DefinitionListEntry\"\u003e \u003cdiv class=\"Term\"\u003eSMOTE\u003c/div\u003e \u003cdiv class=\"Description\"\u003e \u003cp\u003eSynthetic minority over-sampling technique\u003c/p\u003e \u003c/div\u003e \u003c/div\u003e \u003c/div\u003e"},{"header":"Declarations","content":"\u003ch3\u003eEthics approval and consent to participate\u003c/h3\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003ch3\u003eConsent for publication\u003c/h3\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e\n\u003ch3\u003eAvailability of data and materials\u003c/h3\u003e\n\u003cp\u003eThe whole-genome sequence data analyzed during the current study were derived from existing publicly available data in the PATRIC database (https://www.bv-brc.org/), as detailed in Table S1 of the Supplementary Information files.\u003c/p\u003e\n\u003ch3\u003eCompeting interests\u003c/h3\u003e\n\u003cp\u003eThe authors declare no conflicts of interest in this work.\u003c/p\u003e\n\u003ch3\u003eFunding\u003c/h3\u003e\n\u003cp\u003eThis work was supported by grants from the National Natural Science Foundation of China (12571540) and the Open Project Fund of the Key Laboratory of Biosafety Defense, Ministry of Education (No: KLBD-2024-003).\u003c/p\u003e\n\u003ch3\u003eAuthor Contributions\u003c/h3\u003e\n\u003cp\u003eCJW, YC, DCL, and XL designed the study and revised the manuscript. RRG and YT participated in data collection, related bioinformatics analysis, and visualization. YMW and XRL performed the analysis and drafted the manuscript. 
All authors reviewed and agreed on the final version for submission.\u003c/p\u003e\n\u003ch3\u003eAcknowledgements\u003c/h3\u003e\n\u003cp\u003eNot applicable.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eGBD 2021 Antimicrobial Resistance Collaborators. Global burden of bacterial antimicrobial resistance 1990\u0026ndash;2021: a systematic analysis with forecasts to 2050. Lancet. 2024;404:1199\u0026ndash;226. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/s0140-6736(24)01867-1\u003c/span\u003e\u003cspan address=\"10.1016/s0140-6736(24)01867-1\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDing J, Yan W, Zheng R, et al. Combating \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e: from antimicrobial resistance mechanisms to phage-based combination therapies. Front Cell Infect Microbiol. 2025;15:1691215. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fcimb.2025.1691215\u003c/span\u003e\u003cspan address=\"10.3389/fcimb.2025.1691215\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eTinnirello R, Iannolo G, Cagigi A, et al. An overview of \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e ST392 amid the AMR silent pandemic and consequent environmental dissemination and health risks. Int J Hyg Environ Health. 2026;271:114710. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.ijheh.2025.114710\u003c/span\u003e\u003cspan address=\"10.1016/j.ijheh.2025.114710\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYao Y, Zha Z, Li L, et al. 
Healthcare-associated carbapenem-resistant \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e infections are associated with higher mortality compared to carbapenem-susceptible \u003cem\u003eK. pneumoniae\u003c/em\u003e infections in the intensive care unit: a retrospective cohort study. J Hosp Infect. 2024;148:30\u0026ndash;8. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.jhin.2024.03.003\u003c/span\u003e\u003cspan address=\"10.1016/j.jhin.2024.03.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang J, Zhou R, Ren J, et al. Economic burden of carbapenem-resistant \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e infections in Chinese hospitals: A 2019 analysis. J Glob Antimicrob Resist. 2026;46:63\u0026ndash;70. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.jgar.2025.11.006\u003c/span\u003e\u003cspan address=\"10.1016/j.jgar.2025.11.006\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMatsumura Y, Yamamoto M, Gomi R, et al. Integrating whole-genome sequencing into antimicrobial resistance surveillance: methodologies, challenges, and perspectives. Clin Microbiol Rev. 2025;38:e0014022. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/cmr.00140-22\u003c/span\u003e\u003cspan address=\"10.1128/cmr.00140-22\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKim JI, Maguire F, Tsang KK, et al. Machine Learning for Antimicrobial Resistance Prediction: Current Practice, Limitations, and Clinical Perspective. Clin Microbiol Rev. 2022;35. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/cmr.00179-21\u003c/span\u003e\u003cspan address=\"10.1128/cmr.00179-21\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang S, Zhao C, Yin Y, et al. A Practical Approach for Predicting Antimicrobial Phenotype Resistance in Staphylococcus aureus Through Machine Learning Analysis of Genome Data. Front Microbiol. 2022;13:841289. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fmicb.2022.841289\u003c/span\u003e\u003cspan address=\"10.3389/fmicb.2022.841289\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKhaledi A, Weimann A, Schniederjans M, et al. Predicting antimicrobial resistance in \u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e with machine learning-enabled molecular diagnostics. EMBO Mol Med. 2020;12:e10264. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.15252/emmm.201910264\u003c/span\u003e\u003cspan address=\"10.15252/emmm.201910264\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eChiaverini A, Curi R, Gori M, et al. Antimicrobial resistance of Listeria monocytogenes human strains and correlation to genomic data. Eur J Public Health. 2021;31:449.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eStoesser N, Batty EM, Eyre DW, et al. Predicting antimicrobial susceptibilities for \u003cem\u003eEscherichia coli\u003c/em\u003e and \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e isolates using whole genomic sequence data. J Antimicrob Chemother. 2013;68:2234\u0026ndash;44. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/jac/dkt180\u003c/span\u003e\u003cspan address=\"10.1093/jac/dkt180\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eZhang J, Lei H, Huang J, et al. Co-occurrence and co-expression of antibiotic, biocide, and metal resistance genes with mobile genetic elements in microbial communities subjected to long-term antibiotic pressure: Novel insights from metagenomics and metatranscriptomics. J Hazard Mater. 2025;489:137559. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1016/j.jhazmat.2025.137559\u003c/span\u003e\u003cspan address=\"10.1016/j.jhazmat.2025.137559\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWang M, Xiong W, Liu P, et al. Metagenomic Insights Into the Contribution of Phages to Antibiotic Resistance in Water Samples Related to Swine Feedlot Wastewater Treatment. Front Microbiol. 2018;9:2474. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fmicb.2018.02474\u003c/span\u003e\u003cspan address=\"10.3389/fmicb.2018.02474\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJanse I, Beeloo R, Swart A, et al. The extent of carbapenemase-encoding genes in public genome sequences. PeerJ. 2021;9:e11000. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.7717/peerj.11000\u003c/span\u003e\u003cspan address=\"10.7717/peerj.11000\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHumayun MZ, Zhang Z, Butcher AM, et al. 
Hopping into a hot seat: Role of DNA structural features on IS5-mediated gene activation and inactivation under stress. PLoS ONE. 2017;12:e0180156. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1371/journal.pone.0180156\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0180156\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDing Y, Jiang X, Wu J, et al. Synergistic horizontal transfer of antibiotic resistance genes and transposons in the infant gut microbial genome. mSphere. 2024;9:e0060823. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/msphere.00608-23\u003c/span\u003e\u003cspan address=\"10.1128/msphere.00608-23\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHern\u0026aacute;ndez-All\u0026eacute;s S, Bened\u0026iacute; VJ, Mart\u0026iacute;nez-Mart\u0026iacute;nez L, et al. Development of resistance during antimicrobial therapy caused by insertion sequence interruption of porin genes. Antimicrob Agents Chemother. 1999;43:937\u0026ndash;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/AAC.43.4.937\u003c/span\u003e\u003cspan address=\"10.1128/AAC.43.4.937\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eHan X, Shi Q, Mao Y, et al. Emergence of Ceftazidime/Avibactam and Tigecycline Resistance in Carbapenem-Resistant \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e Due to In-Host Microevolution. Front Cell Infect Microbiol. 2021;11:757470. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fcimb.2021.757470\u003c/span\u003e\u003cspan address=\"10.3389/fcimb.2021.757470\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eDu Y, Liu T, Gong Y, et al. Scarless excision of an insertion sequence in the OmpK36 promoter restores meropenem susceptibility in a non-carbapenemase-producing \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e. Emerg Microbes Infect. 2025;14:2503922. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1080/22221751.2025.2503922\u003c/span\u003e\u003cspan address=\"10.1080/22221751.2025.2503922\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eWu Y, Zhao J, Li Z, et al. Within-host acquisition of colistin-resistance of an NDM-producing \u003cem\u003eKlebsiella quasipneumoniae\u003c/em\u003e subsp. \u003cem\u003esimilipneumoniae\u003c/em\u003e strain through the insertion sequence-\u003cem\u003e903B\u003c/em\u003e-mediated inactivation of \u003cem\u003emgrB\u003c/em\u003e gene in a lung transplant child in China. Front Cell Infect Microbiol. 2023;13:1153387. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.3389/fcimb.2023.1153387\u003c/span\u003e\u003cspan address=\"10.3389/fcimb.2023.1153387\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eRazavi M, Kristiansson E, Flach CF, et al. The Association between Insertion Sequences and Antibiotic Resistance Genes. mSphere. 2020;5. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/mSphere.00418-20\u003c/span\u003e\u003cspan address=\"10.1128/mSphere.00418-20\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eNielsen TK, Browne PD, Hansen LH. Antibiotic resistance genes are differentially mobilized according to resistance mechanism. GigaScience. 2022;11. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/gigascience/giac072\u003c/span\u003e\u003cspan address=\"10.1093/gigascience/giac072\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003ePartridge SR, Kwong SM, Firth N, et al. Mobile Genetic Elements Associated with Antimicrobial Resistance. Clin Microbiol Rev. 2018;31. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/cmr.00088-17\u003c/span\u003e\u003cspan address=\"10.1128/cmr.00088-17\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXie F, Wang L, Li S, et al. Large-scale genomic analysis reveals significant role of insertion sequences in antimicrobial resistance of \u003cem\u003eAcinetobacter baumannii\u003c/em\u003e. mBio. 2025;16:e0285224. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1128/mbio.02852-24\u003c/span\u003e\u003cspan address=\"10.1128/mbio.02852-24\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eParks DH, Imelfort M, Skennerton CT, et al. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25:1043\u0026ndash;55. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1101/gr.186072.114\u003c/span\u003e\u003cspan address=\"10.1101/gr.186072.114\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eJia B, Raphenya AR, Alcock B, et al. CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res. 2017;45:D566\u0026ndash;73. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/nar/gkw1004\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkw1004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXie Z, Tang H. ISEScan: automated identification of insertion sequence elements in prokaryotic genomes. Bioinformatics. 2017;33:3340\u0026ndash;7. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/bioinformatics/btx433\u003c/span\u003e\u003cspan address=\"10.1093/bioinformatics/btx433\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eSiguier P, Perochon J, Lestrade L, et al. ISfinder: the reference centre for bacterial insertion sequences. Nucleic Acids Res. 2006;34:D32\u0026ndash;6. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1093/nar/gkj014\u003c/span\u003e\u003cspan address=\"10.1093/nar/gkj014\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eYao S, Yu J, Zhang T, et al. Comprehensive analysis of distribution characteristics and horizontal gene transfer elements of \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eNDM-1\u003c/sub\u003e-carrying bacteria. Sci Total Environ. 2024;946:173907. 
Keywords: Klebsiella pneumoniae, Antimicrobial resistance genes, Insertion sequence elements, Machine learning, Multidrug resistance

Version 1 posted May 18th, 2026. Submitted to BMC Microbiology on 2026-04-21; status: under review. License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). DOI: https://doi.org/10.21203/rs.3.rs-9397300/v1

