Quantitative Full-length transcriptome analysis by nanopore sequencing with Error-Aware UMI mapping

doi:10.64898/2026.01.14.699429

2026 · doi:10.64898/2026.01.14.699429

preprint OA: closed

Abstract

Comprehensive transcriptome profiling is essential for understanding RNA diversity and regulation, yet accurate identification and quantification of full-length transcript isoforms remain challenging with short-read sequencing technologies. Nanopore sequencing enables direct sequencing of long cDNA molecules and thus offers a powerful solution for full-length transcriptome analysis, but its application to quantitative transcriptomics is limited by PCR amplification bias and the difficulty of unique molecular identifier (UMI) recognition under high sequencing error rates. Here, we developed UMImap, a dedicated pipeline for robust UMI identification, error correction, and deduplication in nanopore data. By integrating transcript-aware UMI correction with long-read isoform assembly, UMImap substantially improves UMI recognition accuracy compared with existing methods and effectively mitigates PCR-induced duplication bias. Using this framework, we identified tens of thousands of full-length transcript isoforms, including a large fraction of previously unannotated isoforms that are significantly longer than reference annotations. Quantitative analyses demonstrate the reliability of UMImap for transcript-level quantification. Functional and pathway enrichment analyses of highly expressed novel isoforms revealed coherent and biologically meaningful patterns, including strong enrichment in RNA processing, splicing, and translation pathways. Our results establish UMImap as an effective solution for UMI-based quantification in nanopore full-length transcriptome sequencing and highlight the potential of long-read sequencing to simultaneously achieve accurate isoform discovery and expression analysis in complex transcriptomes.

Competing Interest Statement

The authors have declared no competing interest.
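The general idea of correcting UMIs within a transcript before deduplicating can be illustrated with a minimal sketch. This is not UMImap's algorithm, which the abstract does not describe in detail; it is a generic greedy scheme shown only to make the concept concrete. The function names (`edit_distance`, `dedup_counts`), the `max_dist` threshold, and the input format of `(transcript_id, umi)` pairs are all assumptions made for illustration. Edit distance is used rather than Hamming distance because nanopore errors include insertions and deletions.

```python
from collections import Counter, defaultdict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def dedup_counts(reads, max_dist=2):
    """Estimate unique molecules per transcript (hypothetical helper).

    reads: iterable of (transcript_id, umi) pairs.
    UMIs are compared only within the same transcript. A UMI is treated as
    an error copy if it lies within max_dist edits of a representative that
    is at least as abundant; otherwise it becomes a new representative.
    """
    by_transcript = defaultdict(Counter)
    for transcript_id, umi in reads:
        by_transcript[transcript_id][umi] += 1

    molecules = {}
    for transcript_id, umi_counts in by_transcript.items():
        representatives = []
        # Visit UMIs from most to least abundant (ties broken alphabetically).
        ordered = sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for umi, _ in ordered:
            if not any(edit_distance(umi, rep) <= max_dist
                       for rep in representatives):
                representatives.append(umi)
        molecules[transcript_id] = len(representatives)
    return molecules


# Toy input: five PCR copies of one molecule, two error-containing copies,
# a second distinct molecule, and the same UMI on a different transcript.
reads = (
    [("tx1", "ACGTACGTAC")] * 5
    + [("tx1", "ACGTACGTAA")]      # one substitution
    + [("tx1", "ACGTACGAC")]       # one deletion
    + [("tx1", "TTTTGGGGCC")] * 3  # unrelated UMI
    + [("tx2", "ACGTACGTAC")]      # same UMI, different transcript
)
print(dedup_counts(reads))  # {'tx1': 2, 'tx2': 1}
```

Grouping by transcript first means that identical UMIs attached to different transcripts are counted as separate molecules, and it keeps the pairwise comparisons small. A greedy pass like this is order-dependent and quadratic in the number of distinct UMIs per transcript, so it is a teaching aid rather than a production approach.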
