Detecting Foldback Artifacts in Long-reads

preprint OA: closed
Abstract

Long-read sequencing data is useful for detecting large and complex structural variations; however, technical artifacts can lead to false structural variant calls. In our analyses, we became aware of a foldback artifact in long-read data. Therefore, we developed the open-source Breakinator tool to flag putative foldback artifact reads, as well as previously known chimeric artifacts. Through an alignment-based approach, Breakinator can detect artifacts missed by existing quality control tools. We profiled the occurrences of foldbacks and chimeric reads in both Oxford Nanopore and PacBio sequences across a range of specimens, library types, sequencing chemistries, sequencing machines, and base-calling software.

Competing Interest Statement

M.M. holds equity in Bayer, Delve Bio, Isabl, and Karyoverse; consults for Delve Bio; receives research funding from Bayer; and receives patent licensing payments from Bayer and Labcorp.

Footnotes

Updated our method, Breakinator, to support the more common SAM/BAM/CRAM format files along with PAF files, highlighted that sequence between aligning read segments in artifact reads is enriched for repetitive elements, and added a paragraph discussing the envisioned use case of Breakinator in an analysis pipeline depending on sample and datatype.
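To make the alignment-based idea concrete, here is a minimal Python sketch of how split alignments in a PAF file might be screened for foldback-like and chimeric-like reads. It is an illustration only and is not Breakinator's implementation. The function names, the flag labels, and both distance thresholds (`foldback_max_dist`, `chimera_min_dist`) are hypothetical. The sketch assumes that a foldback shows up as two adjacent read segments aligning to the same reference sequence on opposite strands with nearby breakpoints, and that a chimeric read shows up as adjacent segments on different reference sequences or far apart on the same one.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    read: str
    qstart: int
    qend: int
    strand: str
    target: str
    tstart: int
    tend: int


def parse_paf(lines):
    """Group PAF records by read name, using the 12 mandatory PAF columns."""
    by_read = defaultdict(list)
    for line in lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 12:
            continue  # skip malformed or empty lines
        seg = Segment(f[0], int(f[2]), int(f[3]), f[4], f[5], int(f[7]), int(f[8]))
        by_read[seg.read].append(seg)
    return by_read


def exit_pos(seg):
    """Reference coordinate where the read leaves this segment."""
    return seg.tend if seg.strand == "+" else seg.tstart


def entry_pos(seg):
    """Reference coordinate where the read enters this segment."""
    return seg.tstart if seg.strand == "+" else seg.tend


def classify_reads(lines, foldback_max_dist=200, chimera_min_dist=1_000_000):
    """Return {read name: set of flags}. Both thresholds are hypothetical."""
    flags = {}
    for read, segs in parse_paf(lines).items():
        segs.sort(key=lambda s: s.qstart)  # order segments along the read
        read_flags = set()
        for a, b in zip(segs, segs[1:]):
            if a.target != b.target:
                read_flags.add("chimeric_candidate")
                continue
            dist = abs(exit_pos(a) - entry_pos(b))
            if a.strand != b.strand and dist <= foldback_max_dist:
                read_flags.add("foldback_candidate")
            elif dist >= chimera_min_dist:
                read_flags.add("chimeric_candidate")
        flags[read] = read_flags
    return flags
```

Calling `classify_reads(open("alignments.paf"))` on a PAF file (the file name is a placeholder) would return a flag set per read; reads with a single alignment get an empty set. A real tool would also need to handle mapping quality, overlapping segments, and the repetitive sequence between segments that the footnote mentions.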


Citation neighborhood (no data yet)

We don't have any in-corpus citations linked to this paper yet. This is a recent paper (2025) — citers typically take a year or two to land, and the OpenAlex reference graph may still be filling in.

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00