Benchmarking long-read genome assemblers for three sequencing protocols and three agricultural species

doi:10.1101/2025.02.14.638238

2025

preprint OA: closed

Abstract

Today, long-read technologies make it possible to produce telomere-to-telomere genome assemblies. These assemblies permit more accurate whole-genome analyses, including in repeated regions. The genome assembly process can chain several steps such as read correction, contig assembly, contig polishing, scaffolding and gap filling, among others. Each step usually requires at least one software package and input sequences. For the end user, it is not necessarily clear which combination of tools, together with read types, will work best for a given genome. In this work, we focus on contig production, which is a central and complex task of the assembly process. It aims to produce the longest possible error-free sequences, called contigs, from reads. While it is possible to produce contigs from short reads, long reads are now widely preferred, since they produce much longer contigs. We evaluate several contig-producing software packages (usually called assemblers) on long reads generated by two sequencers using three protocols, for three complex eukaryotic species with different characteristics. Our aim is twofold. First, we would like to give readers insight into the impact of sequencing technology and assembler combinations, in order to help them make their choice for a given genome. Second, we would like to present different assembly metrics and provide a critical view of their interpretation.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

Added a sentence in the data and code availability section about Table 5, which contains all dataset identifiers.
