Benchmarking Illumina and Oxford Nanopore Technologies (ONT) sequencing platforms for whole genome sequencing of bacterial genomes and use in clinical microbiology

Research Article

Srinithi Purushothaman, Tim Roloff, Adrian Egli, Helena MB Seth-Smith

This is a preprint; it has not been peer reviewed by a journal.
https://doi.org/10.21203/rs.3.rs-7036422/v1
This work is licensed under a CC BY 4.0 License.
Status: Published. Journal publication published 15 Jan, 2026; the published version is in BMC Medical Genomics.
Version 1 posted 7

Abstract

Background
In microbial diagnostics, whole-genome sequencing (WGS) is used to address key questions such as species identification, presence of antimicrobial resistance genes (ARGs), virulence genes, and outbreak detection.
The choice of sequencing technology is crucial to ensure high-quality data, cost-effectiveness, and efficient reporting times. We aimed to compare Illumina (short-read) and ONT (long-read) sequencing methods for WGS on different bacterial species for base accuracy and reliable taxonomic and ARG identification.

Materials and Methods
We used clinical isolates of ESKAPE pathogens (n = 12) and ATCC strains (n = 8) of varying %G + C. Illumina sequencing was performed on MiSeq (PE150) and ONT sequencing on GridION with R9.4.1 and R10.4.1 flowcells. Base-calling was performed using Guppy, Dorado, and Rerio software. We used de novo assembly with Unicycler for Illumina and Flye for ONT, and two types of hybrid assemblies, Unicycler and Polypolish. We annotated genomes with Bakta and assessed the quality (QUAST, GTDB-Tk). We identified ARGs (AMRFinderPlus) and plasmids (MOB-suite). We mapped reads and called SNPs using Minimap2, Pilon, vcftools, and Snippy (Illumina). Core genome MLST analysis was conducted with Ridom SeqSphere+.

Results
Illumina sequencing provided consistently high-quality reads (median Q-score 35), whereas for ONT in SUP mode, R10.4.1 showed higher median quality (median Q-score 15.3) than R9.4.1 (median Q-score 13.9). Illumina-based assemblies generated fewer genes annotated as disrupted; for ONT assemblies, the base-caller affected assembly annotation accuracy, with High accuracy (HAC) and Super accuracy (SUP) base-calling modes performing better than FAST mode. ONT assemblies resolved rRNA operons better than Illumina assemblies. Sequencing errors were determined by SNP calling and varied widely by species, with ONT often generating more sequencing errors than Illumina. Hybrid assemblies combined accuracy and completeness effectively. Taxonomic identification and ARG detection were reliable across all methods.

Conclusion
Combining Illumina and ONT technologies yielded optimal bacterial genome sequencing results, leveraging the high accuracy of short reads and the improved contiguity of ONT long reads. The HAC and SUP ONT models with Dorado notably enhance genome assembly annotation and resolution of complex regions, although species-specific issues, likely due to repeat regions and base modifications, remain challenging even in SUP mode with Dorado. Hybrid approaches currently offer the most comprehensive and accurate genome assemblies for clinical microbiology. For reliable cgMLST, even using the most recent ONT methods, resolution must be assessed on a species-by-species basis.

Keywords: Whole genome sequencing, Oxford Nanopore Technologies, Illumina

Introduction
Bacterial typing and antimicrobial susceptibility testing (AST) are important services offered by clinical microbiology laboratories for infection control and prevention and to inform treatment strategies. Whole genome sequencing (WGS) offers detailed pathogen profiling and can be used for genome-based species identification, high-resolution typing, including cluster identification, and prediction of antimicrobial resistance genes (ARGs) and virulence genes (1–3). WGS data can thus be used for diagnostics and, importantly, also for surveillance programs (4, 5). To facilitate the accreditation of clinical laboratories and ensure consistent data interpretation and sharing, standardization of WGS workflows is critical (6, 7). A clear understanding of the advantages and limitations of different sequencing technologies is essential for their reliable integration into clinical and public health microbiology (8, 9). Short-read data generated on Illumina platforms have been used extensively for WGS of bacterial isolates in settings ranging from high- to limited-resource (10, 11).
Illumina platforms offer workflows with turnaround times (TAT) of 2–5 days (12, 13), high throughput, base call accuracy of 99.99% (14, 15), and read lengths of 75–300 bp (16). However, Illumina data has lower resolution for repetitive regions and structural variant identification, and is therefore limited in generating complete, circularized bacterial genome and plasmid assemblies (17–20). The advent of long-read sequencing technologies over the last decade is redefining the microbial genomics landscape and promises to shed light on uncharted regions of clinical bacterial genomes (21). Long-read sequencing technologies generate read lengths in kilobases, often enabling complete genome assembly (15). Of particular interest in clinical microbiology labs is Oxford Nanopore Technologies (ONT). ONT offers rapid TAT for library preparation and sequencing (in our hands, in the same range as for Illumina), real-time analysis, lower instrument costs, ease of implementation in resource-limited settings (e.g., the handheld MinION sequencer), and the ability to generate circularized bacterial chromosome and plasmid assemblies (22–26). However, ONT reads have lower base quality than Illumina reads, reflecting a higher error rate (16, 27). Over the years, ONT has continually modified pore chemistries and developed better base-calling algorithms to generate higher-quality reads and increase read accuracy. The speed of these developments and the subsequent rapid depreciation of previous hardware pose an issue for clinical microbiology laboratories. Frequent validation of the bioinformatic software, base-callers, and base-calling models is thus required to ensure reproducibility and standardization.
Although several studies (27–32) have evaluated long- and short-read sequencing platforms for bacterial WGS-based typing, these investigations have focused on specific Gram-negative or Gram-positive species, and comparisons across different ONT base callers and models remain limited. Our benchmarking study aimed to analyze a range of genomically diverse bacteria, including clinically relevant pathogens of the ESKAPE group (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.). We compared sequencing data from routine Illumina workflows, ONT chemistries using the R9.4.1 and R10.4.1 flowcells, and ONT base-callers, namely Guppy, Dorado, and Rerio, with different base-calling models: FAST, High accuracy (HAC), and Super accuracy (SUP). We assessed read quality, assembly quality, taxonomic identification, cluster identification, and ARG prediction. Along with the Illumina-only and ONT-only de novo assemblies, we also compared hybrid assemblies using both data types. With this benchmarking study, we aim to address whether the improved ONT data can generate stand-alone assemblies on a par with those from Illumina data for pathogen profiling, and to investigate the potential paradigm shift from short-read to long-read sequencing platforms.

Methods

Sample selection
We selected American Type Culture Collection (ATCC) bacterial isolates (n = 8) with %G + C content from 30–66%, and clinical ESKAPE isolates (n = 12). Table 1 lists the bacterial isolates used.

Table 1 Bacterial isolates used for the study. Clinical isolates were labeled as “s” for sensitive and “r” for resistant. For example, acibau_r or acibau_nr refers to A. baumannii with a specific resistance pattern. The resistant and sensitive ESKAPE isolates were identified and phenotypic antimicrobial sensitivity testing carried out with routine culture-based diagnostics.
| Bacterial isolates | Isolate information | Gram status | %G + C | Short name |
|---|---|---|---|---|
| Acinetobacter baumannii | Carbapenem resistant, blaOXA-23 | Negative | 39.0 | acibau_nr |
| Acinetobacter baumannii | Carbapenem resistant, blaOXA-23 | Negative | 38.8 | acibau_r |
| Acinetobacter pittii | Carbapenem resistant, blaOXA-500 | Negative | 38.7 | acipit |
| Burkholderia stabilis | ATCCBAA-67 | Negative | 66.4 | bursta |
| Campylobacter jejuni | ATCC700819 | Negative | 30.6 | camjej |
| Enterococcus faecalis | ATCC29212 | Positive | 37.5 | entfael |
| Enterococcus faecium | Vancomycin sensitive | Positive | 37.7 | entfae_s |
| Enterococcus faecium | Vancomycin resistant, vanA | Positive | 37.7 | entfae_r |
| Escherichia coli | ATCC25922 | Negative | 50.4 | esccol |
| Escherichia coli | Non-Extended spectrum beta lactamase (ESBL) producing, sensitive | Negative | 50.6 | esccol_s |
| Escherichia coli | ESBL producing, blaCTX-M-15 | Negative | 50.8 | esccol_r |
| Klebsiella pneumoniae | Carbapenem sensitive | Negative | 57.3 | klepne_s |
| Klebsiella quasipneumoniae | ATCC700603 | Negative | 57.7 | klequa |
| Pseudomonas aeruginosa | ATCC27853 | Negative | 66.1 | pseaer |
| Pseudomonas aeruginosa | Carbapenem sensitive | Negative | 66.4 | pseaer_s |
| Pseudomonas aeruginosa | Carbapenem resistant, blaVIM2 | Negative | 66 | pseaer_r |
| Staphylococcus aureus | Oxacillin sensitive | Positive | 32.8 | staaur_s |
| Staphylococcus aureus | Oxacillin resistant, mecA | Positive | 32.9 | staaur_r |
| Staphylococcus aureus | ATCC25923 | Positive | 32.9 | staaur |
| Streptococcus pyogenes | ATCC19615 | Positive | 38.5 | strpyo |

Two A. baumannii routine clinical isolates with the same resistance pattern were used to demonstrate the reliability of calling the blaOXA-23 allele across different genomes.

DNA preparation
We cultured the ATCC isolates on Columbia agar with 5% sheep blood and the ESKAPE isolates on LB agar plates. We extracted DNA from bacterial cultures using the QIAamp DNA Mini kit (Qiagen), following the manufacturer’s Gram-positive protocol with 20 mg/ml lysozyme.
We used DNA from a single extraction in all experiments for each isolate, with the exception of the ATCC isolates on Illumina (see below). We used the Qubit 4™ Fluorometer with the 1x dsDNA HS Assay Kit™ (Thermo Fisher Scientific) to quantify the DNA concentration.

Genome sequencing
We sequenced all ESKAPE isolates on the Illumina MiSeq platform with PE150, following QIAseq FX (Qiagen) library preparation according to diagnostic routine protocols (ISO norm 15189 / 17025). Illumina data for the ATCC isolate libraries, prepared with QIAseq FX, were obtained from a previous study (33) under ENA number PRJEB31421. For ONT sequencing, libraries were prepared with the rapid barcoding kits (RBK004 and RBK114.24) and sequenced on both R9.4.1 and R10.4.1 flowcells in a GridION sequencer, multiplexing twelve samples per flowcell. We used the same DNA for library preparation for both flowcell chemistries. We sequenced at 400 bps with the 4 kHz model for the default 72-hour runtime, on flowcells each with > 1000 available pores. Demultiplexing was done using the inbuilt MinKNOW software (v22.10.7). We further sequenced the eight ATCC isolates using the inbuilt 5 kHz Dorado-SUP model provided within the MinKNOW software (v24.06.15).

Base-calling for ONT
We base-called the Fast5 files from R9.4.1 and R10.4.1 in parallel using three command line base-callers: Guppy (v6.4.6), Dorado (v0.3.4), and Rerio ( [email protected] ). For Guppy and Dorado, three base-calling modes (FAST, HAC - High accuracy, and SUP - Super accuracy) were used; for Rerio, only the SUP mode is available. An overview of these base-calling options and programs is shown in Fig. 1. We converted the Fast5 files into Pod5 files for Dorado base-calling using pod5 tools (v0.1.5). We set the minimum base quality score cutoff to nine for all base-calling modes. Sequencing for these comparisons used the 4 kHz model.
We also compared the 5 kHz Dorado-SUP model for ATCC isolates. We used the fastq files obtained after base-calling for downstream data analysis.

Illumina read QC and assembly
We performed sequencing read quality control for the Illumina data using FastQC (v0.11.9) (34). Adapters were removed from the raw reads using Trimmomatic (v0.39) (35), and the trimmed reads were assembled using Unicycler (v0.5.0) (36). We calculated mean genome read depth as C = (L × N) / G, where C is the genome read depth, L the read length, N the total number of reads, and G the genome size. We achieved over 30x mean read depth (our clinically validated minimum) for all samples.

ONT read QC and assembly
We filtered reads with Filtlong (v0.2.0) (37), using a minimum read length of 200 bp (--min_length 200) and retaining the top 90% of bases by quality (--keep_percent 90). The sequencing read quality for the ONT data was calculated using NanoPlot (v1.40) (38). Over 30x mean read depth was obtained for all samples, except for a few assemblies generated with FAST mode. We converted the fastq files to fasta using Seqtk (1.3-r106) (39) for Medaka polishing. We de novo assembled the filtered reads using Flye (v2.9.1-b1780) (40) and obtained the assembly summary from the Flye log. The Flye assembly was polished with Medaka (v1.7.2) (41) using reads generated with the respective base-calling models. Medaka polishing was not performed for genomes obtained using Rerio base-calling, as no Medaka polishing model is available for this base-caller. For data generated using R10.4.1 flowcell chemistry, with both the 4 and 5 kHz models, we used the Medaka (v2.0.0) (41) polishing model for bacterial methylation.
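The read depth formula above, C = (L × N) / G, can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the function names, the `MIN_DEPTH` constant name, and the example numbers are hypothetical. For ONT data, where read lengths vary, the second helper replaces L × N with the sum of read lengths, which is an assumption about how the formula might be applied rather than something stated in the text.

```python
from typing import Iterable

# The study reports "over 30x" mean depth as its clinically validated minimum.
MIN_DEPTH = 30.0


def mean_read_depth(read_length: float, n_reads: int, genome_size: int) -> float:
    """C = (L * N) / G for reads of a fixed length L."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_length * n_reads / genome_size


def mean_read_depth_from_lengths(read_lengths: Iterable[int], genome_size: int) -> float:
    """Variable-length variant (assumption): total sequenced bases divided by G."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def passes_depth(depth: float, minimum: float = MIN_DEPTH) -> bool:
    """True when the mean depth is strictly above the minimum ("over 30x")."""
    return depth > minimum


if __name__ == "__main__":
    # Hypothetical example: 1.2 million 150 bp reads on a 5 Mb genome.
    depth = mean_read_depth(150, 1_200_000, 5_000_000)
    print(f"mean depth: {depth:.1f}x, passes: {passes_depth(depth)}")
```

For paired-end data, N here counts individual reads, so both mates of each pair contribute to the total.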

Hybrid assemblies
We generated Illumina-first hybrid assemblies using Unicycler (v0.5.0) (36), in which Illumina-only assemblies were polished with long reads generated from all possible base-callers and models, hereafter referred to as long read polished (lrp). Next, we generated ONT-first hybrid assemblies using Polypolish (v0.5.0) (42), in which Medaka-polished ONT-only assemblies were polished using Illumina reads, hereafter referred to as short read polished (srp). We did not generate hybrid assemblies for the 5 kHz sequencing, as the aim was to assess whether ONT-only reads, base-called with the bacterial base modification-aware model, outperform the hybrid assemblies.

Annotation
We annotated the assembled genomes using Bakta (v1.10.3) (43) and obtained the total number of rRNA, coding sequence (CDS), and pseudogenes from the Bakta report. We calculated the mean CDS length using SeqKit (v2.4.0) (44).

Reference genomes
Table 2 lists the reference genomes for the ATCC isolates. We used the lrp assemblies generated with the Rerio-SUP model as reference genomes for the ESKAPE isolates, as the Rerio-SUP model accounted for bacterial methylation.

Table 2 ATCC isolates and reference genomes used in analysis.

| ATCC isolates | ATCC ID | RefSeq accession number |
|---|---|---|
| Burkholderia stabilis | ATCCBAA-67 | NZ_CP016442.1, NZ_CP016443.1, NZ_CP016444.1 |
| Campylobacter jejuni | ATCC700819 | NC_002163.1 |
| Enterococcus faecalis | ATCC29212 | NZ_CP008816.1, NZ_CP008815.1, NZ_CP008814.1 |
| Escherichia coli | ATCC25922 | CP009072.1 |
| Klebsiella quasipneumoniae | ATCC700603 | NZ_CP014696.2, NZ_CP014697.2, NZ_CP014698.2 |
| Pseudomonas aeruginosa | ATCC27853 | CP015117.1 |
| Staphylococcus aureus | ATCC25923 | NZ_CP009361.1, NZ_CP009362.1 |
| Streptococcus pyogenes | ATCC19615 | NZ_CP008926.1 |

Assembly quality
We assessed the assembly quality using QUAST (v5.0.2) (45).
Taxonomic identification was carried out on the assembled genomes using the Genome Taxonomy Database Toolkit (GTDB-Tk) (v2.2.6) with database release (v214) (46). We performed ARG prediction using NCBI AMRFinderPlus (v3.12) (47) on the assembled genomes, using gene length coverage and sequence identity of 95% as cutoff scores. We used MOB-suite (v3.1.9) (48) to classify assembled contigs carrying the identified ARGs as being of either chromosome or plasmid origin.

Mapping and SNP error calling
We aligned the read length filtered ONT reads and the adapter-trimmed Illumina reads to the respective reference genomes for the ESKAPE or ATCC bacterial isolates using Minimap2 (v2.24) (49), in ont-mode for ONT reads and sr-mode for Illumina reads. The BAM files were sorted, deduplicated, and indexed using Samtools (v1.15.1) (50). We performed SNP calling from the BAM files using Pilon (v1.24) (51) and processed the vcf files with vcftools (v0.1.16) (52), retaining sites with a minimum of two alleles (--min-alleles 2) and a minimum quality threshold of 1 (--minQ 1). We extracted the SNPs annotated in the vcf files using vcftools (v0.1.16). Variant calling on short reads was also tested using Snippy (v4.6.0) (53) for comparison. We calculated the read depth of the called SNPs at each position using Samtools (v1.15.1) (50).

Core genome MLST (cgMLST) analysis
We analyzed all the assemblies in Ridom SeqSphere+ (v10.0.3) (54) using cgMLST analysis with defined schemes (cgmlst.org). Statistical tests were not feasible, as each comparison involved a single species, and species-specific effects would have influenced the results.

Results

Read and base quality
Across the diverse ATCC type strains and clinical isolates, Illumina sequencing yielded a median base quality of 35.9 (IQR 33.2–36.3).
In contrast, ONT sequencing with R10.4.1 flowcells and base-calling in SUP mode with Dorado and Guppy resulted in median read quality scores (mean Q-score) of 15.3 (IQR 14.7–15.9) and 15.4 (IQR 14.9–16.1), respectively. This was higher than for data from R9.4.1 flowcells base-called in SUP mode with Dorado and Guppy, which produced median Q-scores of 13.9 (IQR 13.6–14.1) and 14.2 (IQR 14–14.6), respectively (Fig. 2a). The number of bases generated with R9.4.1 flowcells was greater than with R10.4.1 flowcells, but the R10.4.1 flowcells produced longer reads, with a higher N50 (Fig. 2b and c). The quality metrics for each sample are provided in Supplementary Table S1. ATCC isolates resequenced with Dorado 5 kHz SUP mode yielded the highest median base quality, 19.5 (IQR 19.0–19.7), compared to the R10.4.1 (4 kHz) and R9.4.1 base-called reads (Supplementary Table S1). We also obtained duplex reads from the R10.4.1 flowcells. While the read quality was improved with duplex reads, the number of these reads was insufficient for downstream genome assembly (31). Therefore, these data are not included in the results.

Assembly quality and species identification
Bacterial genome assemblies generated using R10.4.1 Dorado SUP (4 and 5 kHz) and Rerio SUP base callers generally achieved higher assembly N50 values than those from R9.4.1 and Illumina platforms. The N50s are strongly species dependent: ONT and hybrid assemblies frequently had longer assembly N50 lengths, with exceptions in a few isolates sequenced with R9.4.1 flowcell chemistry. The isolates entfae_s and acipit had shorter assembly N50 lengths, and assemblies based solely on Illumina data generated shorter assembly N50 lengths than ONT and hybrid assemblies (Fig. 3a). There was no difference in the observed %G + C content between the assemblies from the two sequencing platforms. The mean depth for the ONT and Illumina data is provided in Supplementary Table S2.
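Because N50 is used above for both reads and assemblies, a short Python sketch of its usual definition may help: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of all bases. The function name and the toy contig lengths are hypothetical; tools such as QUAST and NanoPlot report this metric directly.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of read or contig lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # defensive fallback; the loop always returns for positive lengths


if __name__ == "__main__":
    # Hypothetical 5 Mb genome: one closed contig versus a fragmented assembly.
    closed = [5_000_000]
    fragmented = [1_000_000] * 3 + [500_000] * 4
    print("closed assembly N50:", n50(closed))
    print("fragmented assembly N50:", n50(fragmented))
```

The toy comparison shows why a single circularized chromosome gives an N50 equal to the genome size, while the same genome split into several contigs gives a much smaller value.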
GTDB-Tk correctly assigned bacterial taxonomy in all de novo assembled genomes, regardless of the sequencing technology or the flowcell chemistry (Supplementary Table S3), with two exceptions. B. stabilis was not classified in assemblies generated with R10.4.1 Dorado and Guppy in FAST mode, nor in the srp hybrid assemblies from R10.4.1 Dorado and Guppy in FAST mode. In addition, the R9.4.1 HAC mode assembly of ESKAPE P. aeruginosa could not be classified, as the assembly was incomplete.

Annotation quality
Across all species, annotated assemblies generated with Illumina data exhibited relatively consistent mean CDS lengths, suggesting accurate representation of the genome content and minimal frame-shift errors (Fig. 3b). Notably, annotated assemblies from R10.4.1 Dorado SUP (4 and 5 kHz) and Rerio SUP produced mean CDS lengths closest to those observed in Illumina assemblies, followed by the R10.4.1 Dorado HAC model (Fig. 3b). The FAST model with Guppy and Dorado performed the worst, with a lower mean CDS length. Pseudogenes, i.e. genes annotated as disrupted, were used as a proxy for incorrect base calls or indels leading to mis-annotation. Illumina-based assemblies exhibited fewer pseudogenes than ONT-only assemblies. Similarly, assemblies from R10.4.1 Dorado SUP (4 and 5 kHz), Rerio SUP, and R10.4.1 Dorado HAC models had fewer pseudogenes than those from other ONT flowcells and base calling models (Fig. 3c). Assemblies generated using the FAST base-calling models from both Dorado and Guppy, using data from either R9.4.1 or R10.4.1 chemistries, consistently exhibited higher pseudogene counts than those generated with the SUP models from Rerio and Dorado. In hybrid assemblies, annotation of lrp-based assemblies resulted in fewer pseudogenes than the srp-based assemblies, specifically those generated with R9.4.1 chemistry. Ribosomal RNA (rRNA) genes are often found in multiple copies within genomes (55).
ONT-only assemblies contained the same total numbers of annotated rRNA genes as the reference genomes, whereas these genes often collapsed into single copies in Illumina assemblies (Fig. 3d). Hybrid assemblies, integrating both long- and short-read data, showed no differences in terms of assembly length, N50, number of annotated rRNAs, and %G + C content.

ARG detection
All ARGs presented in Table 1 were successfully identified in assemblies of ESKAPE isolates, regardless of the sequencing platform used. However, assemblies generated using the FAST mode from ONT showed incomplete annotation of the vanA operon in E. faecium (entfae_r) or entirely missed ARGs for certain bacterial species, notably E. coli (esccol_r). Additionally, when using ONT R9.4.1 chemistry with Dorado and Guppy base calling in High accuracy (HAC) and Super accuracy (SUP) models, the vanA operon in E. faecium was detected with only 81% gene length coverage. To investigate the chromosomal or plasmid location of the ARGs, we correlated the ARG results with MOB-suite analysis of the assembled genomes. The identified ARGs were located on chromosomes, except for the vanA gene, which was classified as plasmid-encoded. Of note, the majority of the ARGs from lrp assemblies were not on contigs assigned specifically to chromosome or plasmid categories. The ARG results are provided in Supplementary Table S4.

Sequencing error calling
The number of sequencing errors was assessed using variant calling / SNP analysis of the ATCC isolates (n = 8), which have high-quality reference genomes available in the RefSeq database. Overall, Illumina reads mapped to the reference sequences generated fewer sequencing errors than ONT reads, with more errors seen in R10.4.1 than R9.4.1 reads (Fig. 4). In ONT data (Dorado-SUP, R10.4.1), more sequencing errors than in Illumina data were observed in C. jejuni (~10,000), E. faecalis (~73), E. coli (~140), and P. aeruginosa (~466).
Pilon identified sequencing errors at positions with at least 30x read depth. A species-specific effect was observed in variant calling across the different base-callers and models, which could be attributed to the varying %G + C content, the presence of modified bases (miscalled methylated sites), or highly repetitive regions in the genome. For ATCC S. aureus, fewer sequencing errors were detected in ONT reads from Dorado-SUP and Rerio-SUP (R10.4.1 and R9.4.1 flowcells), with one error in ONT reads compared to two in Illumina. For K. quasipneumoniae, none or one error was detected in ONT data from R10.4.1, Rerio (4 kHz), and Dorado SUP 4 kHz (n = 0 or n = 1). S. pyogenes had seven sequencing errors in ONT data (R10.4.1 Dorado 4 kHz and Rerio SUP) and six errors in Illumina data. For B. stabilis, Illumina reads had more errors than ONT data. No difference was observed in sequencing errors between the 4 kHz and 5 kHz models of Dorado SUP in R10.4.1. The complete number of sequencing errors detected and the alignment percentage to the respective reference genomes are provided in Supplementary Table S5 for ATCC and ESKAPE isolates. Similarly, the number of indels was higher in the ONT data than in the Illumina data, with the exception of K. quasipneumoniae and S. aureus. The Snippy comparison is also provided in Supplementary Table S5.

Core genome analysis of assemblies
An important readout in transmission and outbreak analysis is cgMLST. This approach allows high-resolution comparison of bacterial isolates by analyzing the allelic differences across a defined set of core genes shared within a species. It is often a genome assembly-based comparison. Where schemes were available, we compared our assemblies using cgMLST. Assemblies from Illumina and hybrid approaches (lrp and srp) exhibited lower allelic distances from their respective reference genomes than ONT-only assemblies, indicating higher similarity to the reference genomes at the core genome level (Fig. 5).
ONT-only assemblies generated with R10.4.1 Dorado SUP and Rerio SUP for S. aureus, K. quasipneumoniae, and K. pneumoniae also met the defined cluster thresholds. However, for other species, although some of the ONT-only assemblies from Dorado, Guppy, and Rerio (HAC and SUP models) achieved good target percentages above 90% (a quality control cutoff), their allelic distances exceeded the cgMLST cutoff, meaning that the isolates would not be identified in the same cluster in an outbreak context (Supplementary file S2). In contrast, ONT-only assemblies using the FAST model for Dorado and Guppy displayed good target percentages below 90% and higher allelic distances from the reference. The ATCC isolates sequenced with the Dorado SUP 5 kHz model showed lower allelic distances from their respective ATCC references and were on par with Illumina assemblies, except for C. jejuni, which still displayed 348 allele differences from the reference genome.

Discussion
In this study, we evaluated short-read Illumina and long-read ONT technologies and software versions for bacterial whole genome sequencing using eight ATCC strains and twelve ESKAPE clinical isolates, with analyses relevant for clinical microbiology, including genome assembly quality, antimicrobial resistance identification, and taxonomic identification. Illumina data consistently achieved much higher median quality scores than both ONT flowcell versions (56). The ONT results were also below the expected Q20, potentially influenced by library preparation methods (57). The higher total base yield produced by the R9.4.1 chemistry was likely due to shorter reads being preferentially sequenced, compared to the longer reads of R10.4.1 (Fig. 2b). Longer reads typically require more time to translocate through nanopores than shorter reads. Consequently, ONT platforms preferentially sequence shorter reads, resulting in an overall higher data yield for the R9.4.1 chemistry (58, 59).
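To put the quality scores discussed above, including the Q20 benchmark, on a common scale, a Phred-scaled Q-score can be converted to a per-base error probability with the standard relation P = 10^(-Q/10). The Python sketch below is illustrative only: the function names are hypothetical, and it simply applies the conversion to the medians reported in the Results rather than recomputing anything from reads.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Per-base error probability for a Phred quality score Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def phred_to_accuracy_pct(q: float) -> float:
    """Expected per-base accuracy in percent for a Phred quality score Q."""
    return 100 * (1 - phred_to_error_prob(q))


def error_prob_to_phred(p: float) -> float:
    """Inverse conversion: Q = -10 * log10(P), for 0 < P <= 1."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    # Median Q-scores as reported in the Results section.
    reported = {
        "Illumina": 35.9,
        "ONT R10.4.1 Dorado SUP (5 kHz)": 19.5,
        "ONT R10.4.1 Dorado SUP (4 kHz)": 15.3,
        "ONT R9.4.1 Dorado SUP": 13.9,
    }
    for label, q in reported.items():
        print(f"{label}: Q{q} ~ {phred_to_accuracy_pct(q):.2f}% per-base accuracy")
```

Because the scale is logarithmic, each 10-point increase in Q corresponds to a tenfold drop in expected error rate, so the gap between Q15 and Q36 is roughly two orders of magnitude.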
We observed that genome assemblies generated using ONT R10.4.1 chemistry and the SUP base-calling models from Rerio and Dorado (4 and 5 kHz) produced notably higher N50 values than R9.4.1 and Illumina assemblies. This increase in N50 indicates greater genome contiguity, which is particularly beneficial for resolving complex genomic structures, repetitive elements, and large structural variants. Previous studies using long-read technologies have reported similar findings (60–62), highlighting the advantage of ONT-only assemblies in generating more contiguous and complete genomes than Illumina-only assemblies. Pseudogene counts and mean CDS length varied by sequencing technology and base-calling model. Illumina assemblies consistently had the fewest pseudogenes, reflecting fewer sequencing errors and more accurate coding sequence annotations. Conversely, assemblies from FAST base-calling showed increased pseudogene counts due to lower raw read accuracy (63), which causes frameshifts, indels, and misplaced stop codons. Thus, selecting high-accuracy base-calling models (SUP or HAC) is crucial for precise genome annotation and comparative genomic studies, such as pan-genome analysis and genome-wide association studies, which rely on core gene annotations. ONT-only and hybrid assemblies resolved individual rRNA operons, whereas in Illumina-only assemblies these operons collapsed during assembly. This highlights the importance of ONT long reads in resolving repetitive genomic regions compared to short-read Illumina data. These results suggest that ONT-only assemblies from R10.4.1 chemistry, specifically those base-called with Dorado (5 kHz) in the SUP model, can yield accurate annotations.

Taxonomic identification of all assembled bacterial genomes was accurate, regardless of the sequencing technology used.
GTDB-Tk assigned all bacterial genomes to the correct species. This robustness could be attributed to its reliance on a conserved set of single-copy marker genes detected via amino-acid Hidden Markov Models (HMMs), which tolerate indels and underrepresentation of %G + C-extreme regions, and on Average Nucleotide Identity (ANI) comparisons focused on those conserved regions. By enforcing minimal marker-gene completeness thresholds, GTDB-Tk ensures sufficient phylogenetic signal for accurate taxonomic assignment (64–66).

The consistent detection of ARGs across the sequencing platforms highlights their reliability and effectiveness in identifying resistance determinants among ESKAPE pathogens. However, the limitations observed with ONT's FAST mode assemblies emphasize the importance of selecting appropriate sequencing and base calling strategies to accurately identify ARG alleles. Sequencing errors, particularly in ONT data, can lead to miscalled ARG alleles, which could have clinical implications, particularly for some beta-lactamase genes.

SNP calling is a critical process in bacterial WGS and comparison that relies on high-accuracy reads at a good read depth and is confounded by sequencing errors. Single-nucleotide changes can provide key insights during epidemiological investigations and outbreak analyses. We used SNP calling as a proxy for sequencing error analysis. A pronounced species-specific effect was evident in B. stabilis, E. faecalis, E. coli, C. jejuni, and P. aeruginosa, which displayed greater divergence between platforms, with ONT data showing up to four orders of magnitude more sequencing errors depending on the base-caller model. By contrast, S. aureus, S. pyogenes, and K. quasipneumoniae remained almost invariant, reflecting their relatively homogeneous genomes and lower repeat content.
These results suggest that genomic features such as methylated motifs, %G + C skew, and repeat density can lead to mapping artifacts (67, 68).

An important test of the generated assemblies was to evaluate their allelic distances from reference genomes using cgMLST. Illumina and hybrid assemblies consistently yielded lower allelic distances from their reference genomes and clustered closely, within the accepted cgMLST thresholds. Interestingly, ONT-only assemblies produced using the R10.4.1 base-caller models Dorado SUP 4 kHz and Rerio SUP 4 kHz performed comparably to Illumina for specific organisms, notably S. aureus, K. quasipneumoniae, and K. pneumoniae. However, our analysis revealed limitations for other species (C. jejuni, E. faecalis, E. coli, and P. aeruginosa) when ONT-only assemblies were generated with Dorado, Guppy, and Rerio using the High accuracy (HAC) and Super accuracy (SUP) models. Although some of these assemblies had good target percentages above 90%, their allelic distances frequently exceeded the cgMLST cutoffs. Performance was especially poor for C. jejuni. This observation suggests species-specific variability in assembly quality, potentially linked to genome complexity and to difficulties in sequencing native modified bases.

Furthermore, ONT-only assemblies based on the FAST models of Dorado and Guppy performed notably worse, showing both suboptimal good target percentages (< 90%) and increased allelic distances from the references. This underscores the limited suitability of FAST models for applications requiring high precision, such as cgMLST-based epidemiological investigations. As a consequence, ONT sequencing should be used in outbreak investigations only with great caution, and with model- and species-specific validation.
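The two cgMLST quantities discussed above, the percentage of good targets and the allelic distance from a reference, can be sketched as follows. This is an illustrative simplification, not the Ridom SeqSphere+ implementation: the profile representation, the function names, the choice to ignore loci missing in either profile, and the distance cutoff are assumptions of this sketch. Only the 90% good-target level comes from the text.

```python
from typing import Dict, Optional

# Locus name -> allele number; None marks a locus with no allele called.
Profile = Dict[str, Optional[int]]


def good_target_fraction(profile: Profile) -> float:
    """Fraction of scheme loci for which an allele was called."""
    if not profile:
        return 0.0
    called = sum(1 for allele in profile.values() if allele is not None)
    return called / len(profile)


def allelic_distance(a: Profile, b: Profile) -> int:
    """Number of loci called in both profiles whose allele numbers differ.

    Loci missing in either profile are ignored (an assumption of this sketch).
    """
    distance = 0
    for locus in a.keys() & b.keys():
        if a[locus] is None or b[locus] is None:
            continue
        if a[locus] != b[locus]:
            distance += 1
    return distance


def passes_cutoff(sample: Profile, reference: Profile, max_distance: int,
                  min_good_targets: float = 0.90) -> bool:
    """True if the sample has enough called loci and is close to the reference."""
    return (good_target_fraction(sample) >= min_good_targets
            and allelic_distance(sample, reference) <= max_distance)


# Hypothetical four-locus profiles, for illustration only.
reference = {"locus_1": 1, "locus_2": 4, "locus_3": 2, "locus_4": 7}
ont_only = {"locus_1": 1, "locus_2": 5, "locus_3": None, "locus_4": 7}

print(good_target_fraction(ont_only))                      # 0.75
print(allelic_distance(ont_only, reference))               # 1
print(passes_cutoff(ont_only, reference, max_distance=1))  # False
```

In this toy case the assembly fails the check because only 75% of loci were called, even though its allelic distance is within the hypothetical cutoff. The opposite situation is the one described above for some HAC and SUP assemblies: more than 90% good targets, but an allelic distance above the cutoff.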
With the recent Dorado SUP 5 kHz model provided within the MinKNOW software (v24.06.15), the sequenced ATCC isolates showed lower allelic distances, comparable to those of Illumina assemblies, across most species except C. jejuni. Based on these results, ONT-only assemblies generated using the Dorado (5 kHz) model can be considered on par with Illumina in terms of cgMLST cluster detection. However, care should be taken with species like C. jejuni, which consistently showed higher allelic distances using ONT data. This may be due to its low %G + C content and the presence of complex methylation motifs that are known to reduce ONT sequencing accuracy, as discussed in previous benchmarking efforts (69).

Several studies have also reported strain-specific erroneous clustering with the Dorado SUP 5 kHz model in species such as Listeria monocytogenes and Burkholderia pseudomallei (70, 71). A multicenter study led by Dabernig-Heinz et al. (72) demonstrated similar strain-specific erroneous clustering in E. faecium, K. pneumoniae, S. aureus, and L. monocytogenes. The authors also discussed PCR preamplification of input DNA for ONT sequencing as an error mitigation strategy to overcome the faulty cgMLST clustering arising from methylated sites in bacterial genomes. Similarly, PCR-based library preparation for ONT sequencing and masking of problematic genomic regions in Corynebacterium diphtheriae strains and vancomycin-resistant enterococci have been shown to match Illumina results in cgMLST clustering (73). Another ONT error mitigation strategy is to mask the modified bases in bacterial genomes by polishing the assemblies with the cgMLST polisher tool provided within the Ridom SeqSphere+ software suite.
The strategy is to identify and mask ambiguous base calls at methylation-related error hotspots, using allele-aware masking derived from raw-signal patterns, in order to refine allele assignments and eliminate systematic miscalls. Studies in diverse multidrug-resistant, nosocomially important Gram-negative and Gram-positive bacterial isolates and in %G + C-rich B. pseudomallei isolates (71, 74) highlight that targeted masking or polishing of ONT errors via the ONT-cgMLST-Polisher is critical to reach Illumina-comparable resolution in cgMLST clustering.

A limitation of this study is the small number of bacterial isolates used to validate the Dorado SUP 5 kHz performance across species. In addition, the study did not evaluate the latest ONT cgMLST polisher available within Ridom SeqSphere+. Nonetheless, the study underscores the importance of leveraging the synergy between long- and short-read sequencing platforms. It also traces the development of ONT across successive stages, from R9.4.1, through R10.4.1 at 4 kHz with Dorado and then Rerio, to Dorado SUP at 5 kHz, and emphasizes the need to further improve and validate ONT's bacterial base-calling models to account for modified bases across different bacterial species.

Conclusion
We systematically and comprehensively evaluated bacterial WGS across two platforms, Illumina and ONT, comparing multiple flowcell chemistries, base callers, and hybrid assembly approaches. ONT data, particularly those generated with R10.4.1 flowcells and the SUP models in Dorado and Rerio, yielded highly contiguous genome assemblies and resolved repetitive regions more effectively. In contrast, Illumina consistently provided more accurate base calls, albeit with more fragmented assemblies. Hybrid assemblies leverage the strengths of both technologies, producing complete assemblies with high accuracy.
Nevertheless, challenges persist in addressing base modifications in certain bacterial strains. Although ONT chemistries and base-calling algorithms continue to improve, species-specific sequencing errors remain and particularly affect outbreak investigations. The complementary use of long and short reads remains the most reliable strategy for achieving both complete and highly accurate bacterial genome assemblies.

Abbreviations
ANI: Average Nucleotide Identity
ARG: Antimicrobial resistance gene
ATCC: American Type Culture Collection
CDS: Coding DNA sequence
cgMLST: Core genome multilocus sequence typing
ENA: European Nucleotide Archive
ESBL: Extended-spectrum beta-lactamase
ESKAPE: Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter spp.
FAST: FAST (ONT base-calling mode)
GTDB-Tk: Genome Taxonomy Database Toolkit
HAC: High accuracy (ONT base-calling mode)
IQR: Interquartile range
ONT: Oxford Nanopore Technologies
PE: Paired end
PCR: Polymerase chain reaction
QC: Quality control
RBK: Rapid barcoding kit (ONT)
rRNA: Ribosomal RNA
SNP: Single nucleotide polymorphism
SUP: Super accuracy (ONT base-calling mode)
TAT: Turnaround time
VCF: Variant Call Format
WGS: Whole genome sequencing

Declarations
Ethics approval and consent to participate: Not applicable
Consent for publication: Not applicable
Competing interests: Not applicable
Funding: We thank the Swiss National Science Foundation (reference 310030_192515) for supporting the research of AE.
Author Contribution: Conceptualization and study design plan: SP, HSS, TR, and AE. Sequencing, bioinformatics analysis, and visualization: SP and HSS. Writing of the manuscript: SP. Providing critical feedback on the manuscript: HSS, TR, and AE.

Acknowledgement
We would like to thank Dr. Vladimira Hinic and PD Dr. Stefano Mancini from the Institute of Medical Microbiology, University of Zurich for kindly providing the clinical ESKAPE isolates.
We would also like to thank the Science IT team of the University of Zurich for High-Performance Cluster support for the bioinformatics analysis. We thank Dr. Marco Meola for the helpful discussion and advice.

Data Availability
The raw fastq files from ONT and Illumina sequencing have been deposited in the European Nucleotide Archive (ENA) under accession ID PRJEB90249 and will be made public upon acceptance of the manuscript.

References
1. Kozińska A, Seweryn P, Sitkiewicz I. A crash course in sequencing for a microbiologist. J Appl Genet. 2019;60:103–11.
2. Didelot X, Bowden R, Wilson DJ, Peto TEA, Crook DW. Transforming clinical microbiology with bacterial genome sequencing. Nat Rev Genet. 2012;13:601–12.
3. Simar SR, Hanson BM, Arias CA. Techniques in bacterial strain typing: past, present, and future. Curr Opin Infect Dis. 2021;34:339–45.
4. Price V, Ngwira LG, Lewis JM, Baker KS, Peacock SJ, Jauneikaite E, et al. A systematic review of economic evaluations of whole-genome sequencing for the surveillance of bacterial pathogens. Microb Genom. 2023;9.
5. Schadron T, van den Beld M, Mughini-Gras L, Franz E. Use of whole genome sequencing for surveillance and control of foodborne diseases: status quo and quo vadis. Front Microbiol. 2024;15:1460335.
6. Mutschler E, Roloff T, Neves A, Vangstein Aamot H, Rodriguez-Sanchez B, Ramirez M, et al. Towards unified reporting of genome sequencing results in clinical microbiology. PeerJ. 2024;12:e17673.
7. Lau KA, Gonçalves da Silva A, Theis T, Gray J, Ballard SA, Rawlinson WD. Proficiency testing for bacterial whole genome sequencing in assuring the quality of microbiology diagnostics in clinical and public health laboratories. Pathology. 2021;53:902–11.
8. Rossen JWA, Friedrich AW, Moran-Gilad J, ESCMID Study Group for Genomic and Molecular Diagnostics (ESGMD). Practical issues in implementing whole-genome-sequencing in routine diagnostic microbiology. Clin Microbiol Infect. 2018;24:355–60.
9. Bianconi I, Aschbacher R, Pagani E. Current uses and future perspectives of genomic technologies in clinical microbiology. Antibiot (Basel). 2023;12:1580.
10. Quainoo S, Coolen JPM, van Hijum SAFT, Huynen MA, Melchers WJG, van Schaik W, et al. Whole-genome sequencing of bacterial pathogens: The future of nosocomial outbreak analysis. Clin Microbiol Rev. 2017;30:1015–63.
11. Kigen C, Muraya A, Kyanya C, Kingwara L, Mmboyi O, Hamm T, et al. Enhancing capacity for national genomics surveillance of antimicrobial resistance in public health laboratories in Kenya. Microb Genom. 2023;9.
12. Bogaerts B, Winand R, Van Braekel J, Hoffman S, Roosens NHC, De Keersmaecker SCJ, et al. Evaluation of WGS performance for bacterial pathogen characterization with the Illumina technology optimized for time-critical situations. Microb Genom. 2021;7.
13. Liu L, Li Y, Li S, Hu N, He Y, Pong R, et al. Comparison of next-generation sequencing systems. J Biomed Biotechnol. 2012;2012:251364.
14. Bruzek S, Vestal G, Lasher A, Lima A, Silbert S. Bacterial whole genome sequencing on the Illumina iSeq 100 for clinical and public health laboratories. J Mol Diagn. 2020;22:1419–29.
15. De Maio N, Shaw LP, Hubbard A, George S, Sanderson ND, Swann J, et al. Comparison of long-read sequencing technologies in the hybrid assembly of complex bacterial genomes. Microb Genom. 2019;5.
16. Bejaoui S, Nielsen SH, Rasmussen A, Coia JE, Andersen DT, Pedersen TB, et al. Comparison of Illumina and Oxford Nanopore sequencing data quality for Clostridioides difficile genome analysis and their application for epidemiological surveillance. BMC Genomics. 2025;26:92.
17. Bachmann JA, Tedder A, Laenen B, Steige KA, Slotte T. Targeted long-read sequencing of a locus under long-term balancing selection in Capsella. G3 (Bethesda). 2018;8:1327–33.
18. Luo Y, Jang JH, Balkey M, Hoffmann M. 217 closed Salmonella reference genomes using PacBio sequencing. BMC Genom Data. 2025;26:15.
19. Sierra R, Roch M, Moraz M, Prados J, Vuilleumier N, Emonet S, et al. Contributions of long-read sequencing for the detection of antimicrobial resistance. Pathogens. 2024;13:730.
20. George S, Pankhurst L, Hubbard A, Votintseva A, Stoesser N, Sheppard AE, et al. Resolving plasmid structures in Enterobacteriaceae using the MinION nanopore sequencer: assessment of MinION and MinION/Illumina hybrid data assembly approaches. Microb Genom. 2017;3:e000118.
21. Shelenkov A, Petrova L, Mironova A, Zamyatin M, Akimkin V, Mikhaylova Y. Long-read whole genome sequencing elucidates the mechanisms of amikacin resistance in multidrug-resistant Klebsiella pneumoniae isolates obtained from COVID-19 patients. Antibiot (Basel). 2022;11:1364.
22. Avershina E, Frye SA, Ali J, Taxt AM, Ahmad R. Ultrafast and cost-effective pathogen identification and resistance gene detection in a clinical setting using nanopore Flongle sequencing. Front Microbiol. 2022;13:822402.
23. Wasswa FB, Kassaza K, Nielsen K, Bazira J. MinION whole-genome sequencing in resource-limited settings: Challenges and opportunities. Curr Clin Microbiol Rep. 2022;9:52–9.
24. Pronyk PM, de Alwis R, Rockett R, Basile K, Boucher YF, Pang V, et al. Advancing pathogen genomics in resource-limited settings. Cell Genom. 2023;3:100443.
25. Bayliss SC, Hunt VL, Yokoyama M, Thorpe HA, Feil EJ. The use of Oxford Nanopore native barcoding for complete genome assembly. Gigascience. 2017;6:1–6.
26. Wick RR, Judd LM, Holt KE. Assembling the perfect bacterial genome using Oxford Nanopore and Illumina sequencing. PLoS Comput Biol. 2023;19:e1010905.
27. Lerminiaux N, Fakharuddin K, Mulvey MR, Mataseje L. Do we still need Illumina sequencing data? Evaluating Oxford Nanopore Technologies R10.4.1 flowcells and the Rapid v14 library prep kit for Gram negative bacteria whole genome assemblies. Can J Microbiol. 2024;70:178–89.
28. Linde J, Brangsch H, Hölzer M, Thomas C, Elschner MC, Melzer F, et al. Comparison of Illumina and Oxford Nanopore Technology for genome analysis of Francisella tularensis, Bacillus anthracis, and Brucella suis. BMC Genomics. 2023;24:258.
29. Bogaerts B, Van den Bossche A, Verhaegen B, Delbrassinne L, Mattheus W, Nouws S, et al. Closing the gap: Oxford Nanopore Technologies R10 sequencing allows comparable results to Illumina sequencing for SNP-based outbreak investigation of bacterial pathogens. J Clin Microbiol. 2024;62:e0157623.
30. Foster-Nyarko E, Cottingham H, Wick RR, Judd LM, Lam MMC, Wyres KL, et al. Nanopore-only assemblies for genomic surveillance of the global priority drug-resistant pathogen, Klebsiella pneumoniae. Microb Genom. 2023;9.
31. Sanderson ND, Kapel N, Rodger G, Webster H, Lipworth S, Street TL, et al. Comparison of R9.4.1/Kit10 and R10/Kit12 Oxford Nanopore flowcells and chemistries in bacterial genome reconstruction. Microb Genom. 2023;9.
32. Sanderson ND, Hopkins KMV, Colpus M, Parker M, Lipworth S, Crook D, et al. Evaluation of the accuracy of bacterial genome reconstruction with Oxford Nanopore R10.4.1 long-read-only sequencing. Microb Genom. 2024;10.
33. Seth-Smith HMB, Bonfiglio F, Cuénod A, Reist J, Egli A, Wüthrich D. Evaluation of rapid library preparation protocols for whole genome sequencing based outbreak investigation. Front Public Health. 2019;7:241.
34. http:// . Accessed 6 Jun 2025.
35. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.
36. Wick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017;13:e1005595.
37. Wick R. Filtlong: quality filtering tool for long reads. Github.
38. De Coster W, Rademakers R. NanoPack2: population-scale evaluation of long-read sequencing data. Bioinformatics. 2023;39.
39. Li H. seqtk: Toolkit for processing sequences in FASTA/Q formats. Github.
40. Kolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019;37:540–6.
41. medaka. Sequence correction provided by ONT Research. Github.
42. Wick RR, Holt KE. Polypolish: Short-read polishing of long-read bacterial genome assemblies. PLoS Comput Biol. 2022;18:e1009802.
43. Schwengers O, Jelonek L, Dieckmann MA, Beyvers S, Blom J, Goesmann A. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom. 2021;7.
44. Shen W, Le S, Li Y, Hu F. SeqKit: A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE. 2016;11:e0163962.
45. Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29:1072–5.
46. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics. 2022;38:5315–6.
47. Feldgarden M, Brover V, Gonzalez-Escalona N, Frye JG, Haendiges J, Haft DH, et al. AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Sci Rep. 2021;11:12728.
48. Robertson J, Nash JHE. MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies. Microb Genom. 2018;4.
49. Li H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094–100.
50. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9.
51. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963.
52. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–8.
53. Seemann T. snippy::scissors: Rapid haploid variant calling and core genome alignment. Github.
54. Jünemann S, Sedlazeck FJ, Prior K, Albersmeier A, John U, Kalinowski J, et al. Updating benchtop sequencing performance comparison. Nat Biotechnol. 2013;31:294–6.
55. Espejo RT, Plaza N. Multiple ribosomal RNA operons in bacteria; Their concerted evolution and potential consequences on the rate of evolution of their 16S rRNA. Front Microbiol. 2018;9:1232.
56. Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol. 2011;12:R44.
57. Ni Y, Liu X, Simeneh ZM, Yang M, Li R. Benchmarking of Nanopore R10.4 and R9.4.1 flowcells in single-cell whole-genome amplification and whole-genome shotgun sequencing. Comput Struct Biotechnol J. 2023;21:2352–64.
58. Purushothaman S, Meola M, Roloff T, Rooney AM, Egli A. Evaluation of DNA extraction kits for long-read shotgun metagenomics using Oxford Nanopore sequencing for rapid taxonomic and antimicrobial resistance detection. Sci Rep. 2024;14:29531.
59. De La Cerda GY, Landis JB, Eifler E, Hernandez AI, Li F-W, Zhang J, et al. Balancing read length and sequencing depth: Optimizing Nanopore long-read sequencing for monocots with an emphasis on the Liliales. Appl Plant Sci. 2023;11:e11524.
60. Soto-Serrano A, Li W, Panah FM, Hui Y, Atienza P, Fomenkov A, et al. Matching excellence: Oxford Nanopore Technologies’ rise to parity with Pacific Biosciences in genome reconstruction of non-model bacterium with high %G + C content. Microb Genom. 2024;10.
61. Wagner GE, Dabernig-Heinz J, Lipp M, Cabal A, Simantzik J, Kohl M, et al. Real-time nanopore Q20 + sequencing enables extremely fast and accurate core genome MLST typing and democratizes access to high-resolution bacterial pathogen surveillance. J Clin Microbiol. 2023;61:e0163122.
62. Loman NJ, Quick J, Simpson JT. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat Methods. 2015;12:733–5.
63. Nanopore sequencing accuracy. Oxford Nanopore Technologies. https://nanoporetech.com/platform/accuracy
64. Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics. 2019;36:1925–7.
65. Parks DH, Chuvochina M, Rinke C, Mussig AJ, Chaumeil P-A, Hugenholtz P. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 2022;50:D785–94.
66. Mussig AJ, Chaumeil P-A, Chuvochina M, Rinke C, Parks DH, Hugenholtz P. Putative genome contamination has minimal impact on the GTDB taxonomy. Microb Genom. 2024;10.
67. Browne PD, Nielsen TK, Kot W, Aggerholm A, Gilbert MTP, Puetz L, et al. %G + C bias affects genomic and metagenomic reconstructions, underrepresenting %G + C-poor organisms. Gigascience. 2020;9.
68. Olson ND, Lund SP, Colman RE, Foster JT, Sahl JW, Schupp JM, et al. Best practices for evaluating single nucleotide variant calling methods for microbial genomics. Front Genet. 2015;6:235.
69. Wick R. ONT-only accuracy: 5 kHz and Dorado. Ryan Wick’s bioinformatics blog. 2023. https://rrwick.github.io/2023/10/24/ont-only-accuracy-update.html
70. Biggel M, Cernela N, Horlbog JA, Stephan R. Oxford Nanopore’s 2024 sequencing technology for Listeria monocytogenes outbreak detection and source attribution: progress and clone-specific challenges. J Clin Microbiol. 2024;62:e0108324.
71. Weigl S, Dabernig-Heinz J, Granitz F, Lipp M, Ostermann L, Harmsen D, et al. Improving Nanopore sequencing-based core genome MLST for global infection control: a strategy for %G + C-rich pathogens like Burkholderia pseudomallei. J Clin Microbiol. 2025;63:e0156924.
72. Dabernig-Heinz J, Lohde M, Hölzer M, Cabal A, Conzemius R, Brandt C, et al. A multicenter study on accuracy and reproducibility of nanopore sequencing-based genotyping of bacterial pathogens. J Clin Microbiol. 2024;62:e0062824.
73. Neuenschwander S, Borcard L, Gempeler S, Miani MT, Casanova C, Ramette A. Evaluation of Oxford Nanopore Technologies workflows for genomic epidemiology of outbreak-associated bacterial isolates in the clinical setting. medRxiv. 2025.
74. Prior K, Becker K, Brandt C, Cabal Rosel A, Dabernig-Heinz J, Kohler C, et al. Accurate and reproducible whole-genome genotyping for bacterial genomic surveillance with Nanopore sequencing data. J Clin Microbiol. 2025;:e0036925.

Additional Declarations
No competing interests reported.

Supplementary Files
SupplementaryS1Readqualitymetrics.xlsx
SupplementaryS2assemblyqualitymetrics.xlsx
SupplementaryS3GTDBtaxa.xlsx
SupplementaryS4ARG.xlsx
SupplementaryS5Sequencingerrors.xlsx

Version 1 posted
Editorial decision: Revision requested, 10 Sep, 2025
Reviews received at journal: 16 Jul, 2025
Reviewers agreed at journal: 10 Jul, 2025
Reviewers invited by journal: 08 Jul, 2025
Editor assigned by journal: 07 Jul, 2025
Submission checks completed at journal: 07 Jul, 2025
First submitted to journal: 03 Jul, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7036422","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":482331160,"identity":"0a50b89f-205d-4a44-96ba-037a5bebfb3f","order_by":0,"name":"Srinithi Purushothaman","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAAA2klEQVRIiWNgGAWjYFACxgaGBIYDPAYMzAeAPAkZIrUkALWwsSWAtPAQaVPCAQYDNqBFQEBYi/zsw20PHv64I2Mu3/P51Y0aCx4G9sNHN+DTYnAusd0gIeEZj2Ub7zbrnGNAh/Gkpd3Aq4WHsU0iIeEwj8Ex3m3GOWxALRI8Zni1yPfAtfA8M875R4QWhjMILcyPc9uI0GIA1pJ2GOiXNDPm3D4JHjZCfpHvYX8m+cPmsL058+HHn3O+1cnxsx8+ht9hSIBNAkwSqxwEmD+QonoUjIJRMApGDgAAgztEu+LA3w8AAAAASUVORK5CYII=","orcid":"","institution":"University of Zurich","correspondingAuthor":true,"prefix":"","firstName":"Srinithi","middleName":"","lastName":"Purushothaman","suffix":""},{"id":482331161,"identity":"9d99a6ea-6216-428c-b134-41979df281b6","order_by":1,"name":"Tim Roloff","email":"","orcid":"","institution":"University of Zurich","correspondingAuthor":false,"prefix":"","firstName":"Tim","middleName":"","lastName":"Roloff","suffix":""},{"id":482331162,"identity":"968db189-41b2-4b5a-9d15-628f55bd4e2d","order_by":2,"name":"Adrian Egli","email":"","orcid":"","institution":"University of Zurich","correspondingAuthor":false,"prefix":"","firstName":"Adrian","middleName":"","lastName":"Egli","suffix":""},{"id":482331163,"identity":"54d6bd9a-d172-4ebf-988e-8d85baf1726d","order_by":3,"name":"Helena MB Seth-Smith","email":"","orcid":"","institution":"University of 
Zurich","correspondingAuthor":false,"prefix":"","firstName":"Helena","middleName":"MB","lastName":"Seth-Smith","suffix":""}],"badges":[],"createdAt":"2025-07-03 09:38:55","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7036422/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7036422/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1186/s12920-025-02305-2","type":"published","date":"2026-01-15T16:29:39+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":86398052,"identity":"ac7bf45b-2003-4de8-8de4-878f53f1e913","added_by":"auto","created_at":"2025-07-10 08:13:53","extension":"png","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":331363,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eOverview of different base-calling options and programs for ONT sequencing.\u003c/strong\u003e All base callers were used in command-line mode, except Dorado (SUP-5kHz); FAST, HAC (High accuracy), and SUP (Super accurate)\u003c/p\u003e","description":"","filename":"image1.png","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/11fb948f5cb283ebb8e6a9a0.png"},{"id":86396947,"identity":"9223bec0-532f-4999-a623-9d583df3278c","added_by":"auto","created_at":"2025-07-10 08:05:53","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":145409,"visible":true,"origin":"","legend":"\u003cp\u003eONT read quality metrics for 20 sequenced isolates. \u003cstrong\u003ea\u003c/strong\u003e: mean quality score (qscore; red line represents the Illumina median quality score of 35); \u003cstrong\u003eb\u003c/strong\u003e: read length n50 in kilobase pairs, Illumina reads paired end 150 base pairs; \u003cstrong\u003ec\u003c/strong\u003e: total number of passed bases. The x-axis shows the base-caller models, and the y-axis shows the respective quality metrics. Each grid represents different ONT base-callers. 
The boxes represent the interquartile range, and the lines inside the boxes represent the median. The lines extending from the box (whiskers) represent the furthest data points that are not considered outliers. Gold (R9.4.1) and blue (R10.4.1) fill colors represent the different ONT flowcell chemistries. FAST; HAC (High accuracy); SUP (Super accuracy).\u003c/p\u003e","description":"","filename":"image2.png","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/c5e913975454f5de375af988.png"},{"id":86396942,"identity":"7e0ab7fe-b3e6-424b-a87f-682f2c43b803","added_by":"auto","created_at":"2025-07-10 08:05:53","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":232125,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ea-d: Assembly and annotation quality - \u003c/strong\u003eEach dot represents a bacterial genome assembly, with colors referring to isolates. \u0026nbsp;Each panel represents the base callers, and the nested panel represents the flowcell chemistry. a) Assembly N50 length in millions of base pairs. b) Mean CDS length in base pairs. c) Number of annotated pseudogenes. d) Number of rRNAs annotated. The x-axis represents the assemblies generated from the ONT (Flye, Medaka polished (v1.7.2 and v2.0.0)), srp, lrp, Illumina, and ATCC-reference genomes. The y-axis represents the respective assembly and annotation quality metrics. 
The reference panel represents the ATCC reference genomes.\u003c/p\u003e","description":"","filename":"image3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/ae45d7ca216f69a34c75dba2.jpg"},{"id":86396959,"identity":"ecd5211b-581f-4579-8c7c-a43b1be94582","added_by":"auto","created_at":"2025-07-10 08:05:53","extension":"jpeg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":247955,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eSequencing errors detected on the read level\u003c/strong\u003e - The x-axis represents the different models (FAST, HAC, and SUP) used for the base-callers (Dorado, Guppy, and Rerio) along with Illumina. The y-axis represents the number of errors detected by Pilon on the log 10 scale. The bar color represents the flowcells (Gold - R9.4.1; Blue - R10.4.1; Red - Illumina). bursta - \u003cem\u003eB. stabilis; \u003c/em\u003eentfael \u003cem\u003e- E. faecalis;\u003c/em\u003e esccol\u003cem\u003e - E. coli; \u003c/em\u003eklequa\u003cem\u003e - K. quasipneumoniae; \u003c/em\u003epseaer\u003cem\u003e - P. aeruginosa\u003c/em\u003e; staaur\u003cem\u003e- S. aureus; \u003c/em\u003estrpyo\u003cem\u003e - S. pyogenes.\u003c/em\u003e The circle indicates zero SNPs called.\u003c/p\u003e","description":"","filename":"image4.jpeg","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/24c25dae3e38f34ddd7481c1.jpeg"},{"id":86396943,"identity":"85546039-2430-43b2-945f-0abdd20cbb0a","added_by":"auto","created_at":"2025-07-10 08:05:53","extension":"png","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":81681,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ecgMLST allelic distance from reference - \u003c/strong\u003eEach colored dot represents a bacterial genome assembly, and the shapes represent the base caller modes. 
\u0026nbsp;Each panel represents the base callers, and the nested panel represents the flowcell chemistry. The x-axis represents the assemblies generated from the ONT (Flye, Medaka polished (v1.7.2 and v2.0.0)), srp, lrp, Illumina, and ATCC-reference genomes. The y-axis represents the allelic distance from the reference genomes calculated from Ridom Seqsphere+. The faded points represent the assemblies that have less than 90% of a good target percentage (smaller shapes), and the bold points represent the assemblies that have greater than 90% of good target percentage. The dashed blue line represents the minimum to a maximum of the Ridom allelic distance cutoff. The gray-shaded area within the blue-dashed lines represents the accepted range. The reference panel represents the ATCC reference genomes.\u003c/p\u003e","description":"","filename":"image5.png","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/d959cf1d524cc6ff47318eec.png"},{"id":100616050,"identity":"555abf78-508c-43f0-995a-cabc081f469a","added_by":"auto","created_at":"2026-01-19 17:38:48","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":2143815,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/9e9767a0-f817-4043-8e49-b412392ea2d1.pdf"},{"id":86398054,"identity":"5313e598-b3f2-4991-b0ce-816f7f0d5274","added_by":"auto","created_at":"2025-07-10 08:13:53","extension":"xlsx","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":50547,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryS1Readqualitymetrics.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/f0ad34f4adb8c4080d82b2c8.xlsx"},{"id":86398055,"identity":"888b3949-e803-43a3-8dda-cbb0f2263609","added_by":"auto","created_at":"2025-07-10 
08:13:53","extension":"xlsx","order_by":2,"title":"","display":"","copyAsset":false,"role":"supplement","size":203919,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryS2assemblyqualitymetrics.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/a87ebfd43f433b28a157860d.xlsx"},{"id":86398058,"identity":"a0b357b9-18e5-47c1-b3aa-41c491681c41","added_by":"auto","created_at":"2025-07-10 08:13:54","extension":"xlsx","order_by":3,"title":"","display":"","copyAsset":false,"role":"supplement","size":511089,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryS3GTDBtaxa.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/8e6adacc44e8db080291416a.xlsx"},{"id":86398050,"identity":"9a489cbc-ca4f-4c0a-b707-04e39cd3054a","added_by":"auto","created_at":"2025-07-10 08:13:52","extension":"xlsx","order_by":4,"title":"","display":"","copyAsset":false,"role":"supplement","size":69943,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryS4ARG.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/a2d99dbc222215fa224356d9.xlsx"},{"id":86398279,"identity":"af49f005-6a90-42d4-a889-b0c8639a5aec","added_by":"auto","created_at":"2025-07-10 08:21:53","extension":"xlsx","order_by":5,"title":"","display":"","copyAsset":false,"role":"supplement","size":205333,"visible":true,"origin":"","legend":"","description":"","filename":"SupplementaryS5Sequencingerrors.xlsx","url":"https://assets-eu.researchsquare.com/files/rs-7036422/v1/965a85db28220695238be09d.xlsx"}],"financialInterests":"No competing interests reported.","formattedTitle":"Benchmarking Illumina and Oxford Nanopore Technologies (ONT) sequencing platforms for whole genome sequencing of bacterial genomes and use in clinical microbiology","fulltext":[{"header":"Introduction","content":"\u003cp\u003eBacterial typing and antimicrobial susceptibility testing (AST) are important services offered by clinical 
microbiology laboratories for infection control and prevention and to inform treatment strategies. Whole genome sequencing (WGS) offers detailed pathogen profiling and can be used for genome-based species identification, high-resolution typing, including cluster identification, as well as prediction of antimicrobial resistance genes (ARG) and virulence genes (\u003cspan additionalcitationids=\"CR2\" citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e–\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e). WGS data can thus be used for diagnostics and, importantly, also for surveillance programs (\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e). To facilitate the accreditation of clinical laboratories and ensure consistent data interpretation and sharing, standardization of WGS workflows is critical (\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e). A clear understanding of the advantages and limitations of different sequencing technologies is essential for their reliable integration into clinical and public health microbiology (\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e, \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eShort-read, Illumina platform-generated data has been used extensively for WGS of bacterial isolates in settings spanning from high- to limited-resource (\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e). 
Illumina platforms offer workflows with turnaround times (TAT) of 2–5 days (\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e, \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e), high throughput, base call accuracy of 99.99% (\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e), and read lengths of 75–300 bp (\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e). However, Illumina data has lower resolution for repetitive regions and structural variant identification, and is therefore limited in generating complete, circularized bacterial genome and plasmid assemblies (\u003cspan additionalcitationids=\"CR18\" citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e–\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe advent of long-read sequencing technologies over the last decade is redefining the microbial genomics landscape and promises to shed light on uncharted regions of clinical bacterial genomes (\u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e). Long-read sequencing technologies generate read lengths in kilobases, often enabling complete genome assembly (\u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e15\u003c/span\u003e). Of particular interest in clinical microbiology labs is Oxford Nanopore Technologies (ONT). 
ONT offers rapid TAT for library preparation and sequencing (range here as for Illumina), real-time analysis, lower instrument costs, ease of implementation in resource-limited settings (e.g., handheld MinION sequencer), and the ability to generate circularized bacterial chromosome and plasmid assemblies (\u003cspan additionalcitationids=\"CR23 CR24 CR25\" citationid=\"CR22\" class=\"CitationRef\"\u003e22\u003c/span\u003e–\u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e26\u003c/span\u003e). However, ONT reads have lower base quality than Illumina reads because of a higher error rate (\u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e16\u003c/span\u003e, \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eOver the years, ONT has continually modified pore chemistries and developed better base-calling algorithms to generate high-quality reads and increase read accuracy. The speed of these developments and the subsequent rapid depreciation of previous hardware pose an issue for clinical microbiology laboratories. Frequent validation of the bioinformatic software, base-callers, and base-calling models is thus required to ensure reproducibility and standardization. 
Although several studies (\u003cspan additionalcitationids=\"CR28 CR29 CR30 CR31\" citationid=\"CR27\" class=\"CitationRef\"\u003e27\u003c/span\u003e–\u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e32\u003c/span\u003e) have evaluated long- and short-read sequencing platforms for bacterial WGS-based typing, these investigations have focused on specific Gram-negative or Gram-positive species, and comparisons across different ONT base callers and models remain limited.\u003c/p\u003e\u003cp\u003eOur benchmarking study aimed to analyze a range of genomically diverse bacteria, including clinically relevant pathogens of the ESKAPE group (\u003cem\u003eEnterococcus faecium\u003c/em\u003e, \u003cem\u003eStaphylococcus aureus\u003c/em\u003e, \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e, \u003cem\u003eAcinetobacter baumannii\u003c/em\u003e, \u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e, and \u003cem\u003eEnterobacter\u003c/em\u003e spp.). We compared sequencing data from routine Illumina workflows, ONT chemistries using the R9.4.1 and R10.4.1 flowcells, and ONT base-callers, namely Guppy, Dorado, and Rerio, with different base-calling models: FAST, High accuracy (HAC), and Super accuracy (SUP). We assessed read quality, assembly quality, taxonomic identification, cluster identification, and ARG prediction. Along with the Illumina-only and ONT-only \u003cem\u003ede novo\u003c/em\u003e assemblies, we also compared hybrid assemblies that combine both data types. 
With this benchmarking study, we aim to address whether the improved ONT data can generate stand-alone assemblies on a par with those from Illumina data for pathogen profiling, and to investigate a potential paradigm shift from short- to long-read sequencing platforms.\u003c/p\u003e"},{"header":"Methods","content":"\u003cp\u003e\u003cb\u003eSample selection\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe selected American Type Culture Collection (ATCC) bacterial isolates (n = 8) with %G + C content ranging from 30 to 66%, and clinical ESKAPE isolates (n = 12). Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e lists the bacterial isolates used.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003e\u003cb\u003eBacterial isolates used for the study\u003c/b\u003e. Clinical isolates were labeled with the suffix “s” for sensitive and “r” for resistant. For example, \u003cem\u003eacibau_r\u003c/em\u003e or \u003cem\u003eacibau_nr\u003c/em\u003e refers to \u003cem\u003eA. baumannii\u003c/em\u003e with a specific resistance pattern. 
The resistant and sensitive ESKAPE isolates were identified and phenotypic antimicrobial sensitivity testing carried out with routine culture-based diagnostics.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"5\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eBacterial isolates\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c2\"\u003e\u003cp\u003eIsolate information\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eGram status\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c4\"\u003e\u003cp\u003e%G + C\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c5\"\u003e\u003cp\u003eShort name\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eAcinetobacter baumannii\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbapenem resistant, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eOXA−23\u003c/sub\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e39.0\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eacibau_nr\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eAcinetobacter baumannii\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbapenem resistant, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eOXA−23\u003c/sub\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e38.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003eacibau_r\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eAcinetobacter pittii\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbapenem resistant,\u003c/p\u003e\u003cp\u003e\u003cem\u003ebla\u003c/em\u003e\u003csub\u003eOXA−500\u003c/sub\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e38.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eacipit\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eBurkholderia stabilis\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCCBAA-67\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e66.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003ebursta\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eCampylobacter jejuni\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC700819\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e30.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003ecamjej\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEnterococcus faecalis\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eATCC29212\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e37.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eentfael\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEnterococcus faecium\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eVancomycin sensitive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e37.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eentfae_s\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEnterococcus faecium\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eVancomycin resistant, \u003cem\u003evanA\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e37.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eentfae_r\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC25922\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e50.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c5\"\u003e\u003cp\u003eesccol\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eNon-Extended spectrum beta lactamase (ESBL) producing, sensitive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e50.6\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eesccol_s\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eESBL\u003c/p\u003e\u003cp\u003eproducing, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eCTX−M−15\u003c/sub\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e50.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eesccol_r\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbapenem sensitive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e57.3\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eklepne_s\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eKlebsiella 
quasipneumoniae\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC700603\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e57.7\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003eklequa\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC27853\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e66.1\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003epseaer\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbapenem sensitive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e66.4\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003epseaer_s\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eCarbapenem resistant, \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eVIM2\u003c/sub\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNegative\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c4\"\u003e\u003cp\u003e66\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003epseaer_r\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eStaphylococcus aureus\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eOxacillin sensitive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e32.8\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003estaaur_s\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eStaphylococcus aureus\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eOxacillin resistant, \u003cem\u003emecA\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e32.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003estaaur_r\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eStaphylococcus aureus\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC25923\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e32.9\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003estaaur\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eStreptococcus pyogenes\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd 
align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC19615\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003ePositive\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c4\"\u003e\u003cp\u003e38.5\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c5\"\u003e\u003cp\u003estrpyo\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eTwo \u003cem\u003eA. baumannii\u003c/em\u003e routine clinical isolates with the same resistance pattern were used to demonstrate the reliability of calling the \u003cem\u003ebla\u003c/em\u003e\u003csub\u003eOXA−23\u003c/sub\u003e allele across different genomes.\u003c/p\u003e\u003cp\u003e\u003cb\u003eDNA preparation\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe cultured the ATCC isolates on Columbia agar with 5% sheep blood and the ESKAPE isolates on LB agar plates. We extracted DNA from bacterial cultures using the QIAamp DNA Mini kit (Qiagen) with the Gram-positive protocol using 20 mg/ml lysozyme, according to the manufacturer’s protocol. We used DNA from a single extraction in all experiments for each isolate, with the exception of the ATCC isolates on Illumina (see below). We used the Qubit 4™ Fluorometer with 1x dsDNA HS Assay Kit™ (Thermo Fisher Scientific) to quantify the DNA concentration.\u003c/p\u003e\u003cp\u003e\u003cb\u003eGenome sequencing\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe sequenced all ESKAPE isolates on the Illumina MiSeq platform with PE150, following QIAseq FX (Qiagen) library preparation according to diagnostic routine protocols (ISO norm 15189 / 17025). Illumina data for the ATCC isolate libraries prepared with QIAseq FX were obtained from a previous study (\u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e33\u003c/span\u003e) under ENA number PRJEB31421. 
For ONT sequencing, libraries were prepared with the rapid barcoding kits (RBK004 and RBK114.24) and sequenced on both R9.4.1 and R10.4.1 flowcells in a GridION sequencer, multiplexing twelve samples per flowcell. We used the same DNA for library preparation for both flowcell chemistries. We sequenced at the 400 bps speed with the 4 kHz model for the default 72-hour runtime, on flowcells each with \u0026gt; 1000 available pores. Demultiplexing was performed using the inbuilt MinKNOW software (v22.10.7). We further sequenced the eight ATCC isolates using the inbuilt 5 kHz Dorado-SUP model provided within the MinKNOW software (v24.06.15).\u003c/p\u003e\u003cp\u003e\u003cb\u003eBase-calling for ONT\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe base-called the Fast5 files from R9.4.1 and R10.4.1 in parallel using three command-line base-callers: Guppy (v6.4.6), Dorado (v0.3.4), and Rerio (
[email protected]). For Guppy and Dorado, three base-calling modes were used (FAST; HAC, High accuracy; and SUP, Super accuracy); for Rerio, only the SUP mode is available. An overview of these different base-calling options and programs is shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. We converted the Fast5 files into Pod5 files for Dorado base-calling using pod5 tools (v0.1.5). We set the minimum base quality score cutoff to nine for all base-calling modes. The sequencing was done using the 4 kHz model. We also compared the 5 kHz Dorado-SUP model for ATCC isolates. We used the fastq files obtained after the base-calling for downstream data analysis.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eIllumina read QC and assembly\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe performed sequencing read quality control for the Illumina data using the FastQC program (v0.11.9)(\u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e34\u003c/span\u003e). The raw reads were subjected to adapter removal using Trimmomatic (v0.39)(\u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e35\u003c/span\u003e). After the adapter removal, we assembled the reads using Unicycler (v0.5.0)(\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e). We calculated the mean genome read depth as C = L × N / G, where C is the mean genome read depth, L the read length, N the total number of reads, and G the genome size. We achieved over 30x mean read depth (our clinically validated minimum) for all samples.\u003c/p\u003e\u003cp\u003e\u003cb\u003eONT read QC and assembly\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe filtered reads with Filtlong (v0.2.0)(\u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e), using a minimum read length of 200 bp (--min_length 200) and retaining the top 90% of bases by quality (--keep_percent 90). 
The sequencing read quality for the ONT data was calculated using NanoPlot (v1.40)(\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e). Over 30x mean read depth was obtained for all samples, except a few assemblies generated with FAST mode. We converted the fastq files to fasta using Seqtk (v1.3-r106)(\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e) for Medaka polishing. We \u003cem\u003ede novo\u003c/em\u003e assembled the filtered reads using Flye (v2.9.1-b1780)(\u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e) and obtained the assembly summary from the Flye log. The assembled FASTA obtained from Flye was polished with Medaka (v1.7.2)(\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e) using reads generated with the respective base-calling models. Medaka polishing was not performed for the genomes obtained using Rerio base-calling, as no Medaka polishing model is available for this base-caller. We utilized the Medaka (v2.0.0)(\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e) polishing model for bacterial methylation for the data generated using R10.4.1 flowcell chemistry for both the 4 and 5 kHz models.\u003c/p\u003e\u003cp\u003e\u003cb\u003eHybrid assemblies\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe generated Illumina-first hybrid assemblies using Unicycler (v0.5.0)(\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e), in which Illumina-only assemblies were polished with long reads generated from all possible base-callers and models, hereafter referred to as long read polished (lrp). Next, we generated ONT-first hybrid assemblies using Polypolish (v0.5.0)(\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e), in which Medaka-polished ONT-only assemblies were polished using Illumina reads, hereafter referred to as short read polished (srp). 
For the 5 kHz sequencing, we did not generate hybrid assemblies, as the aim was to assess whether ONT-only reads, base-called with the bacterial base modification-aware model, outperform the hybrid assemblies.\u003c/p\u003e\u003cp\u003e\u003cb\u003eAnnotation\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe annotated the assembled genomes using Bakta (v1.10.3)(\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e) and obtained the total numbers of rRNAs, coding sequences (CDS), and pseudogenes from the Bakta report. We calculated the mean CDS length using SeqKit (v2.4.0)(\u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cb\u003eReference genomes\u003c/b\u003e\u003c/p\u003e\u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab2\" class=\"InternalRef\"\u003e2\u003c/span\u003e lists the reference genomes for the ATCC isolates. We used the lrp assemblies generated with the Rerio-SUP model as reference genomes for the ESKAPE isolates, as the Rerio-SUP model accounted for bacterial methylation.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cdiv class=\"gridtable\"\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e\u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e\u003ctable float=\"Yes\" id=\"Tab2\" border=\"1\"\u003e\u003ccaption language=\"En\"\u003e\u003cdiv class=\"CaptionNumber\"\u003eTable 2\u003c/div\u003e\u003cdiv class=\"CaptionContent\"\u003e\u003cp\u003eATCC isolates and reference genomes used in the analysis.\u003c/p\u003e\u003c/div\u003e\u003c/caption\u003e\u003ccolgroup cols=\"3\"\u003e\u003c/colgroup\u003e\u003cthead\u003e\u003ctr\u003e\u003cth align=\"left\" colname=\"c1\"\u003e\u003cp\u003eATCC isolates\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eATCC ID\u003c/p\u003e\u003c/th\u003e\u003cth align=\"left\" colname=\"c3\"\u003e\u003cp\u003eRefseq Accession number\u003c/p\u003e\u003c/th\u003e\u003c/tr\u003e\u003c/thead\u003e\u003ctbody\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eBurkholderia stabilis\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCCBAA-67\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNZ_CP016442.1,\u003c/p\u003e\u003cp\u003eNZ_CP016443.1,\u003c/p\u003e\u003cp\u003eNZ_CP016444.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eCampylobacter jejuni\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC700819\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNC_002163.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEnterococcus faecalis\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC29212\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNZ_CP008816.1,\u003c/p\u003e\u003cp\u003eNZ_CP008815.1,\u003c/p\u003e\u003cp\u003eNZ_CP008814.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eEscherichia coli\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC25922\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCP009072.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eKlebsiella quasipneumoniae\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" 
colname=\"c2\"\u003e\u003cp\u003eATCC700603\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNZ_CP014696.2,\u003c/p\u003e\u003cp\u003eNZ_CP014697.2,\u003c/p\u003e\u003cp\u003eNZ_CP014698.2\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC27853\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eCP015117.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eStaphylococcus aureus\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC25923\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNZ_CP009361.1,\u003c/p\u003e\u003cp\u003eNZ_CP009362.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003ctr\u003e\u003ctd align=\"left\" colname=\"c1\"\u003e\u003cp\u003e\u003cem\u003eStreptococcus pyogenes\u003c/em\u003e\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c2\"\u003e\u003cp\u003eATCC19615\u003c/p\u003e\u003c/td\u003e\u003ctd align=\"left\" colname=\"c3\"\u003e\u003cp\u003eNZ_CP008926.1\u003c/p\u003e\u003c/td\u003e\u003c/tr\u003e\u003c/tbody\u003e\u003c/table\u003e\u003c/div\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eAssembly quality\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe assessed the assembly quality using QUAST (v5.0.2)(\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e). The taxonomy identification was carried out on the assembled genome using the Genome Taxonomy Database Tool kit (GTDB-Tk) (v2.2.6) with database release (v214)(\u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e). 
We performed ARG prediction using NCBI AMRFinderPlus (v3.12)(\u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e47\u003c/span\u003e) on the assembled genomes, using a gene length coverage and sequence identity of 95% as cutoffs. We used MOB-suite (v3.1.9)(\u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e48\u003c/span\u003e) to classify assembled contigs carrying the identified ARGs as being of either chromosomal or plasmid origin.\u003c/p\u003e\u003cp\u003e\u003cb\u003eMapping and SNP error calling\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe aligned the read-length-filtered ONT reads and the adapter-trimmed Illumina reads to the reference genomes of the corresponding ESKAPE or ATCC bacterial isolates using Minimap2 (v2.24)(\u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e49\u003c/span\u003e) in ont-mode for ONT reads and sr-mode for Illumina reads. The BAM files were sorted, deduplicated, and indexed using Samtools (v1.15.1)(\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e). We performed SNP calling from the BAM files using Pilon (v1.24)(\u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e51\u003c/span\u003e) and processed the VCF files with vcftools (v0.1.16)(\u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e52\u003c/span\u003e), retaining sites with a minimum of two alleles (--min-alleles 2) and a minimum quality threshold of 1 (--minQ 1). We extracted the SNPs annotated in the VCF files using vcftools (v0.1.16). Variant calling on short reads was also tested using Snippy (v4.6.0) (\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e53\u003c/span\u003e) for comparison. 
We calculated the read depth of the called SNPs at each position using Samtools (v1.15.1)(\u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e50\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cb\u003eCore genome MLST (cgMLST) analysis\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe analyzed all the assemblies in Ridom Seqsphere+ (v10.0.3)(\u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e54\u003c/span\u003e) and used cgMLST analysis with defined schemes (cgmlst.org). Statistical tests were not feasible, as each comparison involved a single species, and species-specific effects would have influenced the results.\u003c/p\u003e"},{"header":"Results","content":"\u003cp\u003e\u003cb\u003eRead and base quality\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe selection of diverse ATCC type strains and clinical isolates yielded a median base quality of 35.9 (IQR 33.2\u0026ndash;36.3) with Illumina sequencing. In contrast, sequencing using ONT with R10.4.1 flowcells and base-calling in SUP mode with Dorado and Guppy resulted in a median read quality score (mean Q-score) of 15.3 (IQR 14.7\u0026ndash;15.9) and 15.4 (IQR 14.9\u0026ndash;16.1), respectively. This was higher than for the data from R9.4.1 flowcells, which were base-called in SUP mode with Dorado and Guppy, producing a median Q-score of 13.9 (IQR 13.6\u0026ndash;14.1) and 14.2 (IQR 14\u0026ndash;14.6), respectively (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003ea). The number of bases generated with R9.4.1 flowcells was greater than with R10.4.1 flowcells, but the R10.4.1 flowcells produced longer reads, with a higher N50 (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb and c). 
The quality metrics for each sample are provided in \u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e.\u003c/b\u003e ATCC isolates resequenced with Dorado 5 kHz SUP mode yielded the highest median base quality, 19.5 (IQR 19.0\u0026ndash;19.7), compared to the R10.4.1 (4 kHz) and R9.4.1 base-called reads (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM1\" class=\"InternalRef\"\u003eS1\u003c/span\u003e\u003c/b\u003e). We also obtained duplex reads from the R10.4.1 flowcells. While the read quality was improved with duplex reads, the number of these reads was insufficient for downstream genome assembly (\u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e31\u003c/span\u003e). Therefore, these data are not included in the results.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eAssembly quality and species identification\u003c/b\u003e\u003c/p\u003e\u003cp\u003eWe noted that bacterial genome assemblies generated using R10.4.1 Dorado SUP (4 and 5 kHz) and Rerio SUP base callers generally achieved higher assembly N50 values compared to those from R9.4.1 and Illumina platforms. The N50 values were strongly species dependent: ONT and hybrid assemblies frequently had longer assembly N50 lengths, with exceptions in a few isolates sequenced with the R9.4.1 chemistry. The isolates entfae_s and acipit had shorter assembly N50 lengths, and assemblies based solely on Illumina data generated shorter assembly N50 lengths (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ea). There was no difference in the observed %G\u0026thinsp;+\u0026thinsp;C content between the assemblies from the two sequencing platforms; the mean depth for the ONT and Illumina data is provided in Supplementary \u003cb\u003eTable S2\u003c/b\u003e. 
GTDB-Tk correctly assigned bacterial taxonomy in the de novo assembled genomes, regardless of the sequencing technology or the flowcell chemistry (\u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM3\" class=\"InternalRef\"\u003eS3\u003c/span\u003e\u003c/b\u003e), with two exceptions. \u003cem\u003eB. stabilis\u003c/em\u003e was not classified in assemblies generated with R10.4.1 Dorado and Guppy in FAST mode, nor in the srp hybrid assemblies from R10.4.1 Dorado and Guppy in FAST mode. In addition, the R9.4.1 HAC-mode assembly of the ESKAPE \u003cem\u003eP. aeruginosa\u003c/em\u003e isolate could not be classified, as the assembly was incomplete.\u003c/p\u003e\u003cp\u003e\u003cb\u003eAnnotation quality\u003c/b\u003e\u003c/p\u003e\u003cp\u003eAcross all species, annotated assemblies generated with Illumina data exhibited relatively consistent mean CDS lengths, suggesting accurate representation of the genome content and minimal frame-shift errors (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eb). Notably, annotated assemblies from R10.4.1 Dorado SUP (4 and 5 kHz) and Rerio SUP produced mean CDS lengths that were closer to those observed in Illumina assemblies, followed by the R10.4.1 Dorado HAC model (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003eb). The FAST model with Guppy and Dorado performed the worst, with a lower mean CDS length.\u003c/p\u003e\u003cp\u003ePseudogenes, i.e. genes annotated as disrupted, were used as a proxy for incorrect base calls or indels leading to mis-annotation. Illumina-based assemblies exhibited a lower number of pseudogenes compared to ONT-only assemblies. Similarly, assemblies from R10.4.1 Dorado SUP (4 and 5 kHz), Rerio SUP, and R10.4.1 Dorado HAC models also had fewer pseudogenes compared to those from other ONT flowcells and base calling models (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ec). 
Assemblies generated using the FAST base-calling models from both Dorado and Guppy, using data from either R9.4.1 or R10.4.1 chemistries, consistently exhibited higher pseudogene counts than those generated with the SUP models from Rerio and Dorado. In hybrid assemblies, annotation of lrp-based assemblies resulted in fewer pseudogenes than the srp-based assemblies, specifically those generated with R9.4.1 chemistry.\u003c/p\u003e\u003cp\u003eRibosomal RNA (rRNA) genes are often found in multiple copies within genomes(\u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e55\u003c/span\u003e). ONT-only assemblies achieved the same total numbers of annotated rRNA genes as seen in the reference genomes, whereas they often collapse into single copies in Illumina assemblies (Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e3\u003c/span\u003ed). Hybrid assemblies, integrating both long- and short-read data, showed no differences in terms of assembly length, N50, number of annotated rRNAs, and %G\u0026thinsp;+\u0026thinsp;C content.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003e\u003cb\u003eARG detection\u003c/b\u003e\u003c/p\u003e\u003cp\u003eAll ARGs presented in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e were successfully identified in assemblies of ESKAPE isolates, regardless of the sequencing platforms used. However, assemblies generated using the FAST mode from ONT showed incomplete annotation of the \u003cem\u003evanA\u003c/em\u003e operon in \u003cem\u003eE. faecium\u003c/em\u003e (entfae_r) or entirely missed ARGs for certain bacterial species, notably \u003cem\u003eE. coli\u003c/em\u003e (esccol_r\u003cem\u003e)\u003c/em\u003e. Additionally, when using ONT R9.4.1 chemistry with Dorado and Guppy base calling in High accuracy (HAC) and Super accuracy (SUP) models, the \u003cem\u003evanA\u003c/em\u003e operon in \u003cem\u003eE. 
faecium\u003c/em\u003e was detected with only 81% gene length coverage.\u003c/p\u003e\u003cp\u003eTo investigate the chromosomal or plasmid location of the ARGs, we correlated the ARG calls with the MOB-suite classification of the assembled contigs. The identified ARGs were located on the chromosomes, except for the \u003cem\u003evanA\u003c/em\u003e gene, which was classified as plasmid-encoded. Of note, the majority of the ARGs from lrp assemblies were not on contigs assigned specifically to chromosome or plasmid categories. The ARG results are provided in \u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM4\" class=\"InternalRef\"\u003eS4\u003c/span\u003e\u003c/b\u003e.\u003c/p\u003e\u003cp\u003e\u003cb\u003eSequencing error calling\u003c/b\u003e\u003c/p\u003e\u003cp\u003eThe number of sequencing errors was assessed by variant calling (SNP analysis) on the ATCC isolates (n\u0026thinsp;=\u0026thinsp;8), which have high-quality reference genomes available in the RefSeq database. Overall, Illumina reads mapped to the reference sequences generated fewer sequencing errors than ONT reads, with more errors seen in R10.4.1 than R9.4.1 reads (Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e4\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eIn ONT data (Dorado-SUP, R10.4.1), more sequencing errors than in Illumina data were observed in \u003cem\u003eC. jejuni\u003c/em\u003e (~\u0026thinsp;10,000), \u003cem\u003eE. faecalis\u003c/em\u003e (~\u0026thinsp;73), \u003cem\u003eE. coli\u003c/em\u003e (~\u0026thinsp;140), and \u003cem\u003eP. aeruginosa\u003c/em\u003e (~\u0026thinsp;466). Pilon identified sequencing errors at a read depth of at least 30x. 
A species-specific effect was observed in variant calling across the different base-callers and models, which could be attributed to the varying %G\u0026thinsp;+\u0026thinsp;C content, presence of modified bases (miscalled methylated sites), or highly repetitive regions in the genome.\u003c/p\u003e\u003cp\u003eFor ATCC \u003cem\u003eS. aureus\u003c/em\u003e, fewer sequencing errors were detected in ONT reads from Dorado-SUP and Rerio-SUP (R10.4.1 and R9.4.1 flowcells), with one error in ONT reads compared to two in Illumina. For \u003cem\u003eK. quasipneumoniae\u003c/em\u003e, at most one error was detected in ONT data from R10.4.1, Rerio (4kHz), and Dorado SUP 4kHz (n\u0026thinsp;=\u0026thinsp;0 or n\u0026thinsp;=\u0026thinsp;1). \u003cem\u003eS. pyogenes\u003c/em\u003e had seven sequencing errors in ONT data (R10.4.1 Dorado 4kHz and Rerio SUP) and six errors in Illumina data. For \u003cem\u003eB. stabilis\u003c/em\u003e, Illumina reads had more errors than ONT data. No difference was observed in sequencing errors between the 4kHz and 5kHz models of Dorado SUP in R10.4.1. The full counts of sequencing errors detected and the alignment percentages to the respective reference genomes are provided in \u003cb\u003eSupplementary Table \u003cspan refid=\"MOESM5\" class=\"InternalRef\"\u003eS5\u003c/span\u003e\u003c/b\u003e for ATCC and ESKAPE isolates.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eSimilarly, the number of indels was higher in the ONT data compared to Illumina, with the exception of \u003cem\u003eK. quasipneumoniae\u003c/em\u003e and \u003cem\u003eS. aureus.\u003c/em\u003e The Snippy comparison is also provided in Supplementary \u003cb\u003eTable S5\u003c/b\u003e.\u003c/p\u003e\u003cp\u003e\u003cb\u003eCore genome analysis of assemblies\u003c/b\u003e\u003c/p\u003e\u003cp\u003eAn important readout in transmission and outbreak analysis is cgMLST. 
This approach allows for high-resolution comparison of bacterial isolates by analyzing the allelic differences across a defined set of core genes shared within a species. This is often a genome assembly-based comparison. Where schemes were available, we compared our assemblies using cgMLST. Assemblies from Illumina and hybrid approaches (lrp and srp) exhibit lower allelic distances from their respective reference genomes, indicating higher similarity to the reference genomes at the core genome level compared to ONT data (Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e5\u003c/span\u003e). ONT-only assemblies generated with R10.4.1 Dorado SUP and Rerio SUP for \u003cem\u003eS. aureus\u003c/em\u003e, \u003cem\u003eK. quasipneumoniae\u003c/em\u003e, and \u003cem\u003eK. pneumoniae\u003c/em\u003e also met the defined cluster thresholds. However, for other species, although some of the ONT-only assemblies from Dorado, Guppy, and Rerio (HAC and SUP models) achieved good target percentages above 90% (a quality control cut off), their allelic distances exceeded the cgMLST cutoff, meaning that the isolates would not be identified in the same cluster in an outbreak context (\u003cb\u003eSupplementary file S2\u003c/b\u003e). In contrast, ONT-only assemblies using the FAST model for Dorado and Guppy displayed lower than 90% good target percentages and higher allelic distances from the reference. The ATCC isolates sequenced with the Dorado SUP-5kHz model showed lower allelic distances from their respective ATCC references and were on par with Illumina assemblies, except for \u003cem\u003eC. 
jejuni\u003c/em\u003e, which still displayed 348 allele differences from the reference genome.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e"},{"header":"Discussion","content":"\u003cp\u003eIn this study, we evaluated short-read Illumina and long-read ONT technologies and software versions for bacterial whole genome sequencing using eight ATCC strains and twelve ESKAPE clinical isolates, with relevant analyses for clinical microbiology, including genome assembly quality, antimicrobial resistance identification, and taxonomic identification.\u003c/p\u003e\u003cp\u003eIllumina data consistently achieved much higher median quality scores compared to both ONT flow cell versions (\u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e56\u003c/span\u003e). The ONT results were also below the expected Q20, potentially influenced by library preparation methods (\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e57\u003c/span\u003e). The higher total base yield produced by the R9.4.1 chemistry was likely due to shorter reads being preferentially sequenced, compared to the longer reads of R10.4.1 (Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003eb). Longer reads typically require more time to translocate through nanopores than shorter reads. Consequently, ONT platforms preferentially sequence shorter reads, resulting in an overall higher data yield for the R9.4.1 chemistry (\u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e58\u003c/span\u003e, \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e59\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eWe observed that genome assemblies generated using ONT R10.4.1 chemistry and the SUP base-calling models from Rerio and Dorado (4 and 5 kHz) produced notably higher N50 values compared to R9.4.1 and Illumina assemblies. 
This increase in N50 indicates greater genome contiguity, which is particularly beneficial for resolving complex genomic structures, repetitive elements, and large structural variants. Similar findings have been reported by previous studies using long-read technologies, highlighting the advantage of ONT sequencing for improved assembly continuity and completeness (\u003cspan additionalcitationids=\"CR61\" citationid=\"CR60\" class=\"CitationRef\"\u003e60\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e62\u003c/span\u003e). These findings highlight the advantage of ONT-only assemblies in generating more contiguous genomes relative to Illumina-only assemblies.\u003c/p\u003e\u003cp\u003ePseudogene counts and mean CDS length varied by sequencing technology and base-calling model. Illumina assemblies consistently had the fewest pseudogenes, reflecting fewer sequencing errors and more accurate coding sequence annotations. Conversely, assemblies from FAST base-calling showed increased pseudogene counts due to lower raw read accuracy (\u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e63\u003c/span\u003e), causing frameshift mutations, indels, and misplaced stop codons. Thus, selecting high-accuracy base-calling models (SUP or HAC) is crucial for precise genome annotation and comparative genomic studies, such as pan-genome analysis and genome-wide association studies, which rely on core gene annotations.\u003c/p\u003e\u003cp\u003eUnlike Illumina-alone assemblies, in which rRNA operons collapsed into single copies, ONT-alone and hybrid assemblies were able to resolve individual rRNA operons. This highlights the importance of long-read sequences provided by ONT in resolving repetitive genomic regions compared to short-read Illumina. 
These results suggest that ONT-alone assemblies from R10.4.1 chemistry, specifically those base-called with Dorado (5kHz) in the SUP model, can yield accurate annotations.\u003c/p\u003e\u003cp\u003eTaxonomic identification of all assembled bacterial genomes was accurate, regardless of the sequencing technology used. GTDB-Tk assigned all bacterial genomes to the correct species. This robustness could be attributed to the reliance on a conserved set of single-copy marker genes detected via amino-acid Hidden Markov Models (HMMs), which tolerate indels and %G\u0026thinsp;+\u0026thinsp;C-extreme underrepresentation, and to Average Nucleotide Identity (ANI) comparisons focused on those conserved regions. By enforcing minimal marker-gene completeness thresholds, GTDB-Tk ensures sufficient phylogenetic signals for accurate taxonomic assignment (\u003cspan additionalcitationids=\"CR65\" citationid=\"CR64\" class=\"CitationRef\"\u003e64\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e66\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eThe consistent detection of ARGs across the sequencing platforms highlights their reliability and effectiveness in identifying resistance determinants among ESKAPE pathogens. However, the limitations observed with ONT's FAST mode assemblies emphasize the importance of selecting appropriate sequencing and base calling strategies to accurately identify ARG alleles. Sequencing errors, particularly in ONT data, can lead to miscalling ARG alleles which, particularly in the cases of some beta-lactamase genes, could have clinical implications.\u003c/p\u003e\u003cp\u003eSNP calling is a critical process in bacterial WGS and comparison that relies on high-accuracy reads at a good read depth and is confounded by sequencing errors. Single-nucleotide changes can provide key insights during epidemiological investigations and outbreak analyses. We used SNP calling as a proxy for sequencing error analysis. 
A pronounced species-specific effect was evident in \u003cem\u003eB. stabilis\u003c/em\u003e, \u003cem\u003eE. faecalis\u003c/em\u003e, \u003cem\u003eE. coli\u003c/em\u003e, \u003cem\u003eC. jejuni\u003c/em\u003e, and \u003cem\u003eP. aeruginosa\u003c/em\u003e, which displayed greater divergence between platforms, with ONT detecting up to four orders of magnitude more sequencing errors depending on the base‐caller model. By contrast, \u003cem\u003eS. aureus, S. pyogenes\u003c/em\u003e, and \u003cem\u003eK. quasipneumoniae\u003c/em\u003e remained almost invariant, reflecting their relatively homogeneous genomes and lower repeat content. These results underscore that genomic features such as the presence of methylated motifs, %G\u0026thinsp;+\u0026thinsp;C skew, and repeat density might lead to mapping artifacts (\u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e67\u003c/span\u003e, \u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e68\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eAn important test of the generated assemblies was to evaluate the allelic distances from reference genomes using cgMLST. Our results demonstrate that Illumina and hybrid assemblies consistently yielded lower allelic distances from their reference genomes and clustered closely, within the accepted cgMLST thresholds. Interestingly, ONT-only assemblies produced using base caller models R10.4.1 Dorado SUP 4kHz and Rerio SUP 4kHz also showed comparable performance to Illumina for specific organisms, notably \u003cem\u003eS. aureus, K. quasipneumoniae\u003c/em\u003e, and \u003cem\u003eK. pneumoniae\u003c/em\u003e. However, our analysis revealed limitations for other species (\u003cem\u003eC. jejuni, E. faecalis, E. coli\u003c/em\u003e, and \u003cem\u003eP. aeruginosa\u003c/em\u003e) when employing ONT-only assemblies with Dorado, Guppy, and Rerio using High accuracy (HAC) and Super accuracy (SUP) models. 
Despite some of these assemblies having good target percentages of alleles identified above 90%, their allelic distances frequently surpassed the cgMLST cutoffs. Especially poor was the performance with \u003cem\u003eC. jejuni\u003c/em\u003e. This observation suggests species-specific variability in the assembly quality, potentially linked to genome complexity and issues in sequencing native modified bases. Furthermore, ONT-only assemblies employing FAST models for Dorado and Guppy performed notably worse, demonstrating both suboptimal target percentages (\u0026lt;\u0026thinsp;90%) and increased allelic distances from references. This underscores the limited suitability of FAST models for applications requiring high precision, such as cgMLST-based epidemiological investigations. As a consequence, ONT sequencing should only be used with great caution in outbreak investigations, and with model- and species-specific validation.\u003c/p\u003e\u003cp\u003eUsing the recent Dorado SUP-5kHz model provided within the MinKNOW software (v24.06.15), the sequenced ATCC isolates displayed lower allelic distances, comparable to Illumina assemblies, across most species, except for \u003cem\u003eC. jejuni.\u003c/em\u003e Based on these results, ONT-only assemblies generated using the Dorado (5 kHz) model can be considered on par with Illumina in terms of cgMLST cluster detection. However, care should be taken for species like \u003cem\u003eC. jejuni\u003c/em\u003e, which consistently showed higher allelic distances using ONT data. 
This may be due to its low %G\u0026thinsp;+\u0026thinsp;C content and the presence of complex methylation motifs that are known to reduce ONT sequencing accuracy, as discussed in previous benchmarking efforts (\u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e69\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eInterestingly, several studies have also reported strain-specific erroneous clustering with the Dorado SUP 5 kHz model in several species, such as \u003cem\u003eListeria monocytogenes\u003c/em\u003e and \u003cem\u003eBurkholderia pseudomallei\u003c/em\u003e (\u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e70\u003c/span\u003e, \u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e). A multicenter study led by Dabernig-Heinz et al. (\u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e72\u003c/span\u003e) also demonstrated similar strain-specific erroneous clustering in \u003cem\u003eE. faecium, K. pneumoniae, S. aureus\u003c/em\u003e, and \u003cem\u003eL. monocytogenes\u003c/em\u003e. The authors also discussed PCR preamplification of input DNA for ONT sequencing as an error mitigation strategy to overcome the faulty cgMLST clustering arising from methylated sites in the bacterial genomes. Similarly, PCR-based library preparation for ONT sequencing and masking of problematic genomic regions in \u003cem\u003eCorynebacterium diphtheriae\u003c/em\u003e strains and vancomycin-resistant enterococci strains have been shown to match Illumina results in cgMLST clustering (\u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e73\u003c/span\u003e). Another ONT error mitigation strategy is to mask the modified bases in the bacterial genomes by polishing the assemblies using the cgMLST polisher tool provided within the Ridom SeqSphere+ software suite. 
The strategy is to identify and mask ambiguous base calls at methylation-related error hotspots using allele-aware masking derived from raw-signal patterns to refine allele assignments and eliminate systematic miscalls. Studies in diverse multidrug-resistant, nosocomially important Gram-negative and Gram-positive bacterial isolates and %G\u0026thinsp;+\u0026thinsp;C-rich \u003cem\u003eB. pseudomallei\u003c/em\u003e isolates (\u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e71\u003c/span\u003e, \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e74\u003c/span\u003e) highlight that targeted masking or polishing of ONT errors via the ONT-cgMLST-Polisher is critical to reach Illumina-comparable resolution in cgMLST clustering.\u003c/p\u003e\u003cp\u003eA limitation of the study is the small number of bacterial isolates that were used to validate the Dorado SUP 5 kHz performance across species. In addition, the study did not evaluate the latest cgMLST ONT polisher made available within Ridom SeqSphere+. Nonetheless, the study underscores the importance of leveraging the synergy between long- and short-read sequencing platforms, traces the development of ONT across different stages (from R9.4.1, through 4kHz R10.4.1 Dorado and Rerio, to 5kHz Dorado SUP), and emphasizes the need to further improve and validate ONT\u0026rsquo;s bacterial base-calling models to account for modified bases across different bacterial species.\u003c/p\u003e"},{"header":"Conclusion","content":"\u003cp\u003eWe systematically and comprehensively evaluated bacterial WGS across two platforms, Illumina and ONT, comparing multiple flowcell chemistries, base callers, and hybrid assembly approaches. ONT data, particularly those generated with R10.4.1 flowcells and the SUP models in Dorado and Rerio, yielded highly contiguous genome assemblies and resolved repetitive regions more effectively. 
In contrast, Illumina consistently provided more accurate base calls, albeit with more fragmented assemblies. Hybrid assemblies leverage the strengths of both technologies, producing comprehensive assemblies with a high level of accuracy. Nevertheless, challenges persist in addressing base modifications in certain bacterial strains, and although ONT chemistries and base-calling algorithms continue to improve, there are still species-specific sequencing errors that especially affect outbreak investigations. The complementary use of long and short reads remains the most reliable strategy for achieving both complete and highly accurate bacterial genome assemblies.\u003c/p\u003e"},{"header":"Abbreviations","content":"\u003cdiv class=\"DefinitionList\"\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eANI\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAverage Nucleotide Identity\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eARG\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAntimicrobial resistance gene\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eATCC\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eAmerican Type Culture Collection\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eCDS\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eCoding DNA sequence\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ecgMLST\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eCore genome multilocus sequence 
typing\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eENA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eEuropean Nucleotide Archive\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eESBL\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eExtended-spectrum beta-lactamase\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eESKAPE\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003e\u003cem\u003eEnterococcus faecium\u003c/em\u003e, \u003cem\u003eStaphylococcus aureus\u003c/em\u003e, \u003cem\u003eKlebsiella pneumoniae\u003c/em\u003e, \u003cem\u003eAcinetobacter baumannii\u003c/em\u003e, \u003cem\u003ePseudomonas aeruginosa\u003c/em\u003e, and \u003cem\u003eEnterobacter\u003c/em\u003e spp\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eFAST\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eFAST (ONT base-calling mode)\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eGTDB-Tk\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eGenome Taxonomy Database Toolkit\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eHAC\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eHigh accuracy (ONT base calling mode)\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eIQR\u003c/b\u003e\u003c/div\u003e\u003cdiv 
class=\"Description\"\u003e\u003cp\u003eInterquartile Range\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eONT\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eOxford Nanopore Technologies\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePE\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ePaired end\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003ePCR\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003ePolymerase chain reaction\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eQC\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eQuality control\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eRBK\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eRapid barcoding kit (ONT)\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003erRNA\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eRibosomal RNA\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eSNP\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eSingle nucleotide polymorphism\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eSUP\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eSuper accuracy (ONT base calling 
mode)\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eTAT\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eTurnaround time\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eVCF\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eVariant Call Format\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv class=\"DefinitionListEntry\"\u003e\u003cdiv class=\"Term\"\u003e\u003cb\u003eWGS\u003c/b\u003e\u003c/div\u003e\u003cdiv class=\"Description\"\u003e\u003cp\u003eWhole genome sequencing\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003cp\u003eNot applicable\u003c/p\u003e\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eConsent for publication\u003c/strong\u003e\u003cp\u003eNot applicable\u003c/p\u003e\u003c/p\u003e\u003cp\u003e\u003ch2\u003eCompeting interests\u003c/h2\u003e\u003cp\u003eNot applicable\u003c/p\u003e\u003c/p\u003e\u003ch2\u003eFunding\u003c/h2\u003e\u003cp\u003eWe thank the Swiss National Science Foundation (reference 310030_192515) for supporting the research of AE.\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eConceptualization and study design plan: SP, HSS, TR, and AE. Sequencing, bioinformatics analysis, and visualization: SP and HSS. Writing of the manuscript: SP. Providing critical feedback on the manuscript: HSS, TR, and AE.\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eWe would like to thank Dr. Vladimira Hinic and PD Dr. Stefano Mancini from the Institute of Medical Microbiology, University of Zurich for kindly providing the clinical ESKAPE isolates. 
We would also like to thank the High-Performance Cluster support provided by the Science IT team from the University of Zurich for performing the bioinformatics analysis. We thank Dr. Marco Meola for the helpful discussion and advice.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eThe raw fastq files from the ONT and the Illumina have been deposited in the European Nucleotide Archive (ENA) with accession ID: PRJEB90249 and will be made public upon acceptance of the manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eKozińska A, Seweryn P, Sitkiewicz I. A crash course in sequencing for a microbiologist. J Appl Genet. 2019;60:103\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDidelot X, Bowden R, Wilson DJ, Peto TEA, Crook DW. Transforming clinical microbiology with bacterial genome sequencing. Nat Rev Genet. 2012;13:601\u0026ndash;12.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSimar SR, Hanson BM, Arias CA. Techniques in bacterial strain typing: past, present, and future. Curr Opin Infect Dis. 2021;34:339\u0026ndash;45.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePrice V, Ngwira LG, Lewis JM, Baker KS, Peacock SJ, Jauneikaite E et al. A systematic review of economic evaluations of whole-genome sequencing for the surveillance of bacterial pathogens. Microb Genom. 2023;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchadron T, van den Beld M, Mughini-Gras L, Franz E. Use of whole genome sequencing for surveillance and control of foodborne diseases: status quo and quo vadis. Front Microbiol. 2024;15:1460335.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMutschler E, Roloff T, Neves A, Vangstein Aamot H, Rodriguez-Sanchez B, Ramirez M, et al. Towards unified reporting of genome sequencing results in clinical microbiology. PeerJ. 
2024;12:e17673.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLau KA, Gon\u0026ccedil;alves da Silva A, Theis T, Gray J, Ballard SA, Rawlinson WD. Proficiency testing for bacterial whole genome sequencing in assuring the quality of microbiology diagnostics in clinical and public health laboratories. Pathology. 2021;53:902\u0026ndash;11.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRossen JWA, Friedrich AW, Moran-Gilad J, ESCMID Study Group for Genomic and Molecular Diagnostics (ESGMD). Practical issues in implementing whole-genome-sequencing in routine diagnostic microbiology. Clin Microbiol Infect. 2018;24:355\u0026ndash;60.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBianconi I, Aschbacher R, Pagani E. Current uses and future perspectives of genomic technologies in clinical microbiology. Antibiot (Basel). 2023;12:1580.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eQuainoo S, Coolen JPM, van Hijum SAFT, Huynen MA, Melchers WJG, van Schaik W, et al. Whole-genome sequencing of bacterial pathogens: The future of nosocomial outbreak analysis. Clin Microbiol Rev. 2017;30:1015\u0026ndash;63.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKigen C, Muraya A, Kyanya C, Kingwara L, Mmboyi O, Hamm T et al. Enhancing capacity for national genomics surveillance of antimicrobial resistance in public health laboratories in Kenya. Microb Genom. 2023;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBogaerts B, Winand R, Van Braekel J, Hoffman S, Roosens NHC, De Keersmaecker SCJ et al. Evaluation of WGS performance for bacterial pathogen characterization with the Illumina technology optimized for time-critical situations. Microb Genom. 2021;7.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLiu L, Li Y, Li S, Hu N, He Y, Pong R, et al. Comparison of next-generation sequencing systems. J Biomed Biotechnol. 
2012;2012:251364.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBruzek S, Vestal G, Lasher A, Lima A, Silbert S. Bacterial whole genome sequencing on the Illumina iSeq 100 for clinical and public health laboratories. J Mol Diagn. 2020;22:1419\u0026ndash;29.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDe Maio N, Shaw LP, Hubbard A, George S, Sanderson ND, Swann J et al. Comparison of long-read sequencing technologies in the hybrid assembly of complex bacterial genomes. Microb Genom. 2019;5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBejaoui S, Nielsen SH, Rasmussen A, Coia JE, Andersen DT, Pedersen TB, et al. Comparison of Illumina and Oxford Nanopore sequencing data quality for Clostridioides difficile genome analysis and their application for epidemiological surveillance. BMC Genomics. 2025;26:92.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBachmann JA, Tedder A, Laenen B, Steige KA, Slotte T. Targeted long-read sequencing of a locus under long-term balancing selection in Capsella. G3 (Bethesda). 2018;8:1327\u0026ndash;33.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLuo Y, Jang JH, Balkey M, Hoffmann M. 217 closed Salmonella reference genomes using PacBio sequencing. BMC Genom Data. 2025;26:15.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSierra R, Roch M, Moraz M, Prados J, Vuilleumier N, Emonet S, et al. Contributions of long-read sequencing for the detection of antimicrobial resistance. Pathogens. 2024;13:730.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGeorge S, Pankhurst L, Hubbard A, Votintseva A, Stoesser N, Sheppard AE, et al. Resolving plasmid structures in Enterobacteriaceae using the MinION nanopore sequencer: assessment of MinION and MinION/Illumina hybrid data assembly approaches. Microb Genom. 
2017;3:e000118.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShelenkov A, Petrova L, Mironova A, Zamyatin M, Akimkin V, Mikhaylova Y. Long-read whole genome sequencing elucidates the mechanisms of amikacin resistance in multidrug-resistant Klebsiella pneumoniae isolates obtained from COVID-19 patients. Antibiot (Basel). 2022;11:1364.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAvershina E, Frye SA, Ali J, Taxt AM, Ahmad R. Ultrafast and cost-effective pathogen identification and resistance gene detection in a clinical setting using nanopore Flongle sequencing. Front Microbiol. 2022;13:822402.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWasswa FB, Kassaza K, Nielsen K, Bazira J. MinION whole-genome sequencing in resource-limited settings: Challenges and opportunities. Curr Clin Microbiol Rep. 2022;9:52\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePronyk PM, de Alwis R, Rockett R, Basile K, Boucher YF, Pang V, et al. Advancing pathogen genomics in resource-limited settings. Cell Genom. 2023;3:100443.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBayliss SC, Hunt VL, Yokoyama M, Thorpe HA, Feil EJ. The use of Oxford Nanopore native barcoding for complete genome assembly. Gigascience. 2017;6:1\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWick RR, Judd LM, Holt KE. Assembling the perfect bacterial genome using Oxford Nanopore and Illumina sequencing. PLoS Comput Biol. 2023;19:e1010905.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLerminiaux N, Fakharuddin K, Mulvey MR, Mataseje L. Do we still need Illumina sequencing data? Evaluating Oxford Nanopore Technologies R10.4.1 flowcells and the Rapid v14 library prep kit for Gram negative bacteria whole genome assemblies. Can J Microbiol. 
2024;70:178\u0026ndash;89.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLinde J, Brangsch H, H\u0026ouml;lzer M, Thomas C, Elschner MC, Melzer F, et al. Comparison of Illumina and Oxford Nanopore Technology for genome analysis of Francisella tularensis, Bacillus anthracis, and Brucella suis. BMC Genomics. 2023;24:258.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBogaerts B, Van den Bossche A, Verhaegen B, Delbrassinne L, Mattheus W, Nouws S, et al. Closing the gap: Oxford Nanopore Technologies R10 sequencing allows comparable results to Illumina sequencing for SNP-based outbreak investigation of bacterial pathogens. J Clin Microbiol. 2024;62:e0157623.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFoster-Nyarko E, Cottingham H, Wick RR, Judd LM, Lam MMC, Wyres KL et al. Nanopore-only assemblies for genomic surveillance of the global priority drug-resistant pathogen, Klebsiella pneumoniae. Microb Genom. 2023;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSanderson ND, Kapel N, Rodger G, Webster H, Lipworth S, Street TL et al. Comparison of R9.4.1/Kit10 and R10/Kit12 Oxford Nanopore flowcells and chemistries in bacterial genome reconstruction. Microb Genom. 2023;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSanderson ND, Hopkins KMV, Colpus M, Parker M, Lipworth S, Crook D et al. Evaluation of the accuracy of bacterial genome reconstruction with Oxford Nanopore R10.4.1 long-read-only sequencing. Microb Genom. 2024;10.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSeth-Smith HMB, Bonfiglio F, Cu\u0026eacute;nod A, Reist J, Egli A, W\u0026uuml;thrich D. Evaluation of rapid library preparation protocols for whole genome sequencing based outbreak investigation. Front Public Health. 
2019;7:241.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003e\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttp://www.bioinformatics.babraham.ac.uk/projects/fastqc/\u003c/span\u003e\u003cspan address=\"http://www.bioinformatics.babraham.ac.uk/projects/fastqc/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. Accessed 6 Jun 2025.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114\u0026ndash;20.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWick RR, Judd LM, Gorrie CL, Holt KE. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. PLoS Comput Biol. 2017;13:e1005595.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWick R. Filtlong: quality filtering tool for long reads. Github.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDe Coster W, Rademakers R. NanoPack2: population-scale evaluation of long-read sequencing data. Bioinformatics. 2023;39.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi H. seqtk: Toolkit for processing sequences in FASTA/Q formats. Github.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKolmogorov M, Yuan J, Lin Y, Pevzner PA. Assembly of long, error-prone reads using repeat graphs. Nat Biotechnol. 2019;37:540\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003emedaka. Sequence correction provided by ONT Research. Github.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWick RR, Holt KE. Polypolish: Short-read polishing of long-read bacterial genome assemblies. PLoS Comput Biol. 2022;18:e1009802.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchwengers O, Jelonek L, Dieckmann MA, Beyvers S, Blom J, Goesmann A. Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microb Genom. 
2021;7.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShen W, Le S, Li Y, Hu F. SeqKit: A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE. 2016;11:e0163962.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29:1072\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk v2: memory friendly classification with the genome taxonomy database. Bioinformatics. 2022;38:5315\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFeldgarden M, Brover V, Gonzalez-Escalona N, Frye JG, Haendiges J, Haft DH, et al. AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Sci Rep. 2021;11:12728.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRobertson J, Nash JHE. MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies. Microb Genom. 2018;4.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34:3094\u0026ndash;100.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLi H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078\u0026ndash;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWalker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE. 2014;9:e112963.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDanecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. 
Bioinformatics. 2011;27:2156\u0026ndash;8.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSeemann T. snippy::scissors: Rapid haploid variant calling and core genome alignment. Github.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJ\u0026uuml;nemann S, Sedlazeck FJ, Prior K, Albersmeier A, John U, Kalinowski J, et al. Updating benchtop sequencing performance comparison. Nat Biotechnol. 2013;31:294\u0026ndash;6.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEspejo RT, Plaza N. Multiple ribosomal RNA operons in bacteria; Their concerted evolution and potential consequences on the rate of evolution of their 16S rRNA. Front Microbiol. 2018;9:1232.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMiller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol. 2011;12:R44.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNi Y, Liu X, Simeneh ZM, Yang M, Li R. Benchmarking of Nanopore R10.4 and R9.4.1 flowcells in single-cell whole-genome amplification and whole-genome shotgun sequencing. Comput Struct Biotechnol J. 2023;21:2352\u0026ndash;64.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePurushothaman S, Meola M, Roloff T, Rooney AM, Egli A. Evaluation of DNA extraction kits for long-read shotgun metagenomics using Oxford Nanopore sequencing for rapid taxonomic and antimicrobial resistance detection. Sci Rep. 2024;14:29531.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDe La Cerda GY, Landis JB, Eifler E, Hernandez AI, Li F-W, Zhang J, et al. Balancing read length and sequencing depth: Optimizing Nanopore long-read sequencing for monocots with an emphasis on the Liliales. Appl Plant Sci. 2023;11:e11524.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSoto-Serrano A, Li W, Panah FM, Hui Y, Atienza P, Fomenkov A et al. 
Matching excellence: Oxford Nanopore Technologies\u0026rsquo; rise to parity with Pacific Biosciences in genome reconstruction of non-model bacterium with high %G\u0026thinsp;+\u0026thinsp;C content. Microb Genom. 2024;10.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWagner GE, Dabernig-Heinz J, Lipp M, Cabal A, Simantzik J, Kohl M, et al. Real-time nanopore Q20\u0026thinsp;+\u0026thinsp;sequencing enables extremely fast and accurate core genome MLST typing and democratizes access to high-resolution bacterial pathogen surveillance. J Clin Microbiol. 2023;61:e0163122.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLoman NJ, Quick J, Simpson JT. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat Methods. 2015;12:733\u0026ndash;5.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNanopore sequencing accuracy. Oxford Nanopore Technologies. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://nanoporetech.com/platform/accuracy\u003c/span\u003e\u003cspan address=\"https://nanoporetech.com/platform/accuracy\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics. 2019;36:1925\u0026ndash;7.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eParks DH, Chuvochina M, Rinke C, Mussig AJ, Chaumeil P-A, Hugenholtz P. GTDB: an ongoing census of bacterial and archaeal diversity through a phylogenetically consistent, rank normalized and complete genome-based taxonomy. Nucleic Acids Res. 2022;50:D785\u0026ndash;94.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMussig AJ, Chaumeil P-A, Chuvochina M, Rinke C, Parks DH, Hugenholtz P. Putative genome contamination has minimal impact on the GTDB taxonomy. Microb Genom. 
2024;10.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBrowne PD, Nielsen TK, Kot W, Aggerholm A, Gilbert MTP, Puetz L et al. %G\u0026thinsp;+\u0026thinsp;C bias affects genomic and metagenomic reconstructions, underrepresenting %G\u0026thinsp;+\u0026thinsp;C-poor organisms. Gigascience. 2020;9.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOlson ND, Lund SP, Colman RE, Foster JT, Sahl JW, Schupp JM, et al. Best practices for evaluating single nucleotide variant calling methods for microbial genomics. Front Genet. 2015;6:235.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWick R. ONT-only accuracy: 5 kHz and Dorado. Ryan Wick\u0026rsquo;s bioinformatics blog. 2023. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://rrwick.github.io/2023/10/24/ont-only-accuracy-update.html\u003c/span\u003e\u003cspan address=\"https://rrwick.github.io/2023/10/24/ont-only-accuracy-update.html\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBiggel M, Cernela N, Horlbog JA, Stephan R. Oxford Nanopore\u0026rsquo;s 2024 sequencing technology for Listeria monocytogenes outbreak detection and source attribution: progress and clone-specific challenges. J Clin Microbiol. 2024;62:e0108324.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWeigl S, Dabernig-Heinz J, Granitz F, Lipp M, Ostermann L, Harmsen D, et al. Improving Nanopore sequencing-based core genome MLST for global infection control: a strategy for %G\u0026thinsp;+\u0026thinsp;C-rich pathogens like Burkholderia pseudomallei. J Clin Microbiol. 2025;63:e0156924.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDabernig-Heinz J, Lohde M, H\u0026ouml;lzer M, Cabal A, Conzemius R, Brandt C, et al. A multicenter study on accuracy and reproducibility of nanopore sequencing-based genotyping of bacterial pathogens. J Clin Microbiol. 
2024;62:e0062824.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eNeuenschwander S, Borcard L, Gempeler S, Miani MT, Casanova C, Ramette A. Evaluation of Oxford Nanopore Technologies workflows for genomic epidemiology of outbreak-associated bacterial isolates in the clinical setting. medRxiv. 2025.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePrior K, Becker K, Brandt C, Cabal Rosel A, Dabernig-Heinz J, Kohler C et al. Accurate and reproducible whole-genome genotyping for bacterial genomic surveillance with Nanopore sequencing data. J Clin Microbiol. 2025;:e0036925.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"bmc-medical-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"mgnm","sideBox":"Learn more about [BMC Medical Genomics](http://bmcmedgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/mgnm/default.aspx","title":"BMC Medical Genomics","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true},"keywords":"Whole genome sequencing, Oxford Nanopore Technologies, Illumina","lastPublishedDoi":"10.21203/rs.3.rs-7036422/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7036422/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003ch2\u003eBackground\u003c/h2\u003e\u003cp\u003eIn microbial diagnostics, whole-genome sequencing (WGS) is used to address key questions such as species identification, presence of antimicrobial resistance genes (ARGs), virulence genes, and outbreak detection. The choice of sequencing technology is crucial to ensure high-quality data, cost-effectiveness, and efficient reporting times. We aimed to compare Illumina (short-read) and ONT (long-read) sequencing methods for WGS on different bacterial species for base accuracy and reliable taxonomic and ARG identification.\u003c/p\u003e\u003ch2\u003eMaterials and Methods\u003c/h2\u003e\u003cp\u003eWe used clinical isolates of ESKAPE pathogens (n\u0026thinsp;=\u0026thinsp;12) and ATCC strains (n\u0026thinsp;=\u0026thinsp;8) of varying %G\u0026thinsp;+\u0026thinsp;C. Illumina sequencing was performed on MiSeq (PE150) and ONT sequencing using GridION with R9.4.1 and R10.4.1 flowcells. Base-calling was performed using Guppy, Dorado, and Rerio software. 
We used \u003cem\u003ede novo\u003c/em\u003e assembly with Unicycler for Illumina and Flye for ONT, and two types of hybrid assemblies, Unicycler and Polypolish. We annotated genomes with Bakta and assessed the quality (QUAST, GTDB-Tk). We identified ARGs (AMRFinderPlus) and plasmids (MOB-suite). We mapped reads and called SNPs using Minimap2, Pilon, vcftools, and Snippy (Illumina). Core genome MLST analysis was conducted with Ridom Seqsphere+.\u003c/p\u003e\u003ch2\u003eResults\u003c/h2\u003e\u003cp\u003eIllumina sequencing provided consistently high-quality reads (median Q-score 35), whereas for ONT in SUP mode, R10.4.1 showed higher median quality (median Q-score 15.3) than R9.4.1 (median Q-score 13.9). Illumina-based assemblies generated fewer genes annotated as disrupted; for ONT assemblies, the base-caller affected assembly annotation accuracy, with High accuracy (HAC) and Super accuracy (SUP) base-calling modes performing better than FAST mode. ONT assemblies resolved rRNA operons better than Illumina assemblies. Sequencing errors were determined by SNP calling and varied widely by species, with ONT often generating more sequencing errors than Illumina. Hybrid assemblies combined accuracy and completeness effectively. Taxonomic identification and ARG detection were reliable across all methods.\u003c/p\u003e\u003ch2\u003eConclusion\u003c/h2\u003e\u003cp\u003eCombining Illumina and ONT technologies yielded optimal bacterial genome sequencing results, leveraging the high accuracy of short reads and improved contiguity of ONT long reads. The HAC and SUP ONT models with Dorado notably enhance genome assembly annotation and resolution of complex regions, although species-specific issues, likely due to repeat regions and base modifications, remain challenging even in SUP mode with Dorado. Hybrid approaches currently offer the most comprehensive and accurate genome assemblies for clinical microbiology. 
For reliable cgMLST even using the most recent ONT methods, resolution must be assessed on a species-by-species basis.\u003c/p\u003e","manuscriptTitle":"Benchmarking Illumina and Oxford Nanopore Technologies (ONT) sequencing platforms for whole genome sequencing of bacterial genomes and use in clinical microbiology","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-07-10 08:05:46","doi":"10.21203/rs.3.rs-7036422/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-09-10T10:37:12+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-07-16T10:38:58+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"86692382275224548279267959300723647251","date":"2025-07-10T06:17:14+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-07-08T10:56:49+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-07-07T09:07:27+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-07-07T09:06:06+00:00","index":"","fulltext":""},{"type":"submitted","content":"BMC Medical Genomics","date":"2025-07-03T09:32:34+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"bmc-medical-genomics","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"mgnm","sideBox":"Learn more about [BMC Medical Genomics](http://bmcmedgenomics.biomedcentral.com/)","snPcode":"","submissionUrl":"https://www.editorialmanager.com/mgnm/default.aspx","title":"BMC Medical Genomics","twitterHandle":"BMC_series","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"em","reportingPortfolio":"BMC Series","inReviewEnabled":true,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"ca19f4ff-8a77-4cb1-b858-b78b92533782","owner":[],"postedDate":"July 10th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2026-01-19T17:04:05+00:00","versionOfRecord":{"articleIdentity":"rs-7036422","link":"https://doi.org/10.1186/s12920-025-02305-2","journal":{"identity":"bmc-medical-genomics","isVorOnly":false,"title":"BMC Medical Genomics"},"publishedOn":"2026-01-15 16:29:39","publishedOnDateReadable":"January 15th, 2026"},"versionCreatedAt":"2025-07-10 08:05:46","video":"","vorDoi":"10.1186/s12920-025-02305-2","vorDoiUrl":"https://doi.org/10.1186/s12920-025-02305-2","workflowStages":[]},"version":"v1","identity":"rs-7036422","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7036422","identity":"rs-7036422","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}