Skiver: Alignment-free Estimation of Sequencing Error Rates and Spectra using (k, v)-mer Sketches

preprint OA: closed

Abstract

Background

Quality control of sequencing datasets is an important first step in numerous bioinformatics pipelines such as mapping, variant calling, and assembly. Existing methods typically rely on alignment results or quality scores. However, the reference genome is not always available for mapping, and uncalibrated quality scores may yield biased estimates of error rates.

Results

We present skiver, a reference-free and alignment-free framework that estimates sequencing errors using (k, v)-mer sketches. By identifying the consensus through the sketched (k, v)-mers, skiver estimates survival and hazard rates that capture positional information of sequencing errors. Across simulated and real datasets from various sequencing platforms, skiver accurately recovers error rates and spectra. It also reliably handles complex datasets containing multiple strains, alleles, and repetitive regions through an outlier filtering strategy. Skiver is computationally efficient and provides a lightweight solution for error profiling in high-throughput sequencing.

Availability and Implementation

The implementation of skiver is available at https://github.com/GZHoffie/skiver, and the dataset and scripts for reproducibility are available at https://github.com/GZHoffie/skiver-test.

Competing Interest Statement

The authors have declared no competing interest.
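The consensus, survival, and hazard estimation described under Results can be illustrated with a toy sketch. It is not skiver's implementation. The abstract does not define a (k, v)-mer, so the code assumes it is a k-base key followed by the next v bases as a value. It further assumes that the consensus for a key is its most frequent value, and that "survival" at offset i is the fraction of observed values agreeing with the consensus on their first i bases. The function names `kv_mers` and `survival_and_hazard` are hypothetical. No sketching (subsampling), reverse-complement handling, or outlier filtering is performed.

```python
from collections import Counter, defaultdict


def kv_mers(read, k, v):
    """Yield (key, value) pairs: a k-mer and the v bases that follow it.

    Assumed definition; the abstract does not spell one out.
    """
    for i in range(len(read) - k - v + 1):
        yield read[i:i + k], read[i + k:i + k + v]


def survival_and_hazard(reads, k, v):
    """Toy estimate of positional survival and hazard from (k, v)-mers."""
    # Collect every value observed after each key.
    values_by_key = defaultdict(Counter)
    for read in reads:
        for key, value in kv_mers(read, k, v):
            values_by_key[key][value] += 1

    # survived[i]: number of values matching the consensus on their first i bases.
    survived = [0] * (v + 1)
    for counts in values_by_key.values():
        # Assumption: the most frequent value is the error-free consensus.
        consensus, _ = counts.most_common(1)[0]
        for value, n in counts.items():
            match_len = 0
            while match_len < v and value[match_len] == consensus[match_len]:
                match_len += 1
            for i in range(match_len + 1):
                survived[i] += n

    total = survived[0]
    if total == 0:
        raise ValueError("no (k, v)-mers found; reads are shorter than k + v")

    survival = [s / total for s in survived]
    # Hazard at offset i: probability of the first mismatch at base i,
    # given that the first i - 1 bases matched the consensus.
    hazard = [
        1 - survival[i] / survival[i - 1] if survival[i - 1] > 0 else float("nan")
        for i in range(1, v + 1)
    ]
    return survival, hazard


if __name__ == "__main__":
    # Nine identical reads plus one read with a single substitution.
    reads = ["ACGTAC"] * 9 + ["ACGTTC"]
    survival, hazard = survival_and_hazard(reads, k=3, v=2)
    print(survival, hazard)
```

Under these assumptions, a roughly constant hazard across offsets would correspond to a position-independent per-base error rate, while a hazard that varies with the offset would indicate positional structure in the errors.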



Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00