Single- and double-strand circulating DNA fragmentomics for enhanced cancer detection performance

preprint OA: closed
ABSTRACT In early detection of cancer, circulating cell-free DNA (cirDNA) obtained from blood samples is notable for its minimally invasive nature. We have developed an algorithm that discriminates between cancer patients and healthy individuals based on machine-learning-assisted analysis of cirDNA fragment end motifs, using data from shallow whole genome sequencing (a method we call EMA). We applied EMA to cirDNA from the plasma of patients with stage II-III breast cancer, stage I-III non-small cell lung cancer, and metastatic colorectal cancer (mCRC). CirDNA from 158 individuals was prepared following the conventional double-stranded DNA library preparation (DSP). Using 3 bp end motifs, each tumor type was detected with a sensitivity of 0.87-1.00, a specificity of 0.95, and an AUC above 0.96. The three selected cancer types could be differentiated from one another with an accuracy (ACC) above 0.94. Multi-cancer detection by pooling samples from the three cancer types showed ACC, AUC and sensitivity of 0.98, 0.99 and 0.98, respectively. Comparisons with 4 bp and 2 bp end motifs were conducted, and our main observations were confirmed using an external public dataset (N=366). We also performed a single-stranded DNA library preparation (SSP) on cirDNA from mCRC patients and healthy controls, which allowed us to carry out the first end motif analysis in the literature comparing DSP and SSP. Compared with EMAD (EMA using DSP), EMAS (EMA using SSP) showed a very significant difference in end motif frequency and improved cancer detection performance (ACC, AUC and sensitivity of 0.97, 1.00 and 0.99, respectively). Furthermore, EMAS performed best when the full fragment size range was used, whereas EMAD performed best when the dataset was restricted to fragments of 115-220 bp. The strong performance of EMAS, coupled with its compatibility with cost-effective shallow cirDNA sequencing, positions this methodology as a potentially transformative tool in early cancer screening.

Competing Interest Statement: The authors have declared no competing interest.
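
As a rough illustration of the end motif features described in the abstract, the sketch below counts the 3 bp motif at the start of each fragment sequence and normalizes the counts into a 64-value frequency profile, with an optional fragment size filter such as the 115-220 bp window mentioned for EMAD. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say how motifs are extracted (which end, strand handling, or whether the reference genome is used), so taking the first k bases of each fragment is an assumption, and the function name `end_motif_frequencies` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import product


def end_motif_frequencies(fragments, k=3, min_len=None, max_len=None):
    """Return the normalized frequency of each k-mer found at the start of the fragments.

    Assumption: the end motif is the first k bases of each fragment sequence.
    Fragments outside [min_len, max_len] or with non-ACGT bases in the motif are skipped.
    """
    # All 4**k possible motifs, so every profile has the same fixed set of keys.
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for seq in fragments:
        seq = seq.upper()
        if min_len is not None and len(seq) < min_len:
            continue
        if max_len is not None and len(seq) > max_len:
            continue
        motif = seq[:k]
        # Skip fragments shorter than k or with ambiguous bases (e.g. N) in the motif.
        if len(motif) == k and set(motif) <= set("ACGT"):
            counts[motif] += 1
    total = sum(counts.values())
    return {m: (counts[m] / total if total else 0.0) for m in motifs}


# Toy fragment sequences (made up for illustration only).
toy_fragments = ["CCAGT", "CCATT", "TGACC", "NNACG"]
profile = end_motif_frequencies(toy_fragments, k=3)

# Hypothetical size-restricted profile, mirroring the 115-220 bp window.
sized = end_motif_frequencies(["A" * 100, "C" * 150], k=3, min_len=115, max_len=220)
```

A per-sample profile of this kind could serve as the input vector to a classifier; the abstract does not name the machine learning model used, so none is assumed here.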

Year: 2025
