A proteogenomic approach to discover novel lncRNA-derived peptides and their potential clinical utility in hepatocellular carcinoma

preprint OA: closed
Abstract

Peptides are increasingly recognized for their versatile functions in biological contexts, but their clinical relevance and utility remain largely unexplored. Proteogenomic approaches can accelerate peptide discovery in clinical samples by integrating proteomic data with genomic and transcriptomic evidence. However, long noncoding RNA (lncRNA)-derived peptides (lncPeps) remain largely unidentified, which leaves MS/MS spectra unmatchable. To solve this problem, we used high-quality Ribo-seq translatomic datasets to generate an extensive database of human liver lncPeps, which we subsequently applied to proteomics data of tumor–adjacent normal tissue pairs from hepatocellular carcinoma (HCC) patients. Using the new database, we discovered 105 novel lncPeps, including lncPeps differentially expressed between tumor and non-tumor tissues and lncPeps significantly correlated with prognosis. Remarkably, combining the expression of lncPeps with canonical proteins in a LASSO regression model improved predictive performance for recurrence, increasing the AUC by 0.005 to 0.085 across three recurrence time points. These findings suggest that lncPep discovery contributes to our understanding of the molecular heterogeneity and progression of HCC, and broadens the range of potential biomarker candidates or treatment targets for the disease.

Competing Interest Statement

The authors have declared no competing interest.

Abbreviations

- HCC: hepatocellular carcinoma
- lncRNA: long noncoding RNA
- lncPep: lncRNA-encoded peptide
- ORF: open reading frame
- sORF: small ORF
- lncRNA-ORF: long non-coding RNA ORF
- Ribo-seq: ribosome profiling sequencing
- CDS: coding sequence
- UTR: untranslated region
- ncRNA: non-coding RNA
- PSM: peptide spectrum match
- nsSNP: non-synonymous single nucleotide polymorphism
- ROS: reactive oxygen species
- GSEA: gene set enrichment analysis
- FDR: false discovery rate
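The abstract reports the benefit of adding lncPeps as a gain in AUC (0.005 to 0.085) between a model built on canonical proteins alone and one that also includes lncPeps. As a rough illustration of what such a comparison measures, the sketch below computes ROC AUC for two sets of predicted recurrence scores and takes the difference. Everything in it is an assumption for illustration: the function name `roc_auc`, the labels, and the scores are made-up toy values, not data or code from the paper, and the LASSO fitting step that would produce the scores is omitted.

```python
def roc_auc(labels, scores):
    """ROC AUC via the pairwise (Mann-Whitney) identity.

    AUC equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = recurrence, 0 = no recurrence.
labels = [1, 1, 1, 0, 0, 0, 0, 1]

# Hypothetical predicted risks from two models (NOT values from the paper).
scores_proteins_only = [0.9, 0.4, 0.7, 0.5, 0.2, 0.3, 0.6, 0.8]
scores_with_lncpeps = [0.9, 0.6, 0.7, 0.5, 0.2, 0.3, 0.65, 0.8]

auc_base = roc_auc(labels, scores_proteins_only)
auc_combined = roc_auc(labels, scores_with_lncpeps)
auc_gain = auc_combined - auc_base

print(f"proteins only: {auc_base:.4f}")      # 0.8750
print(f"with lncPeps:  {auc_combined:.4f}")  # 0.9375
print(f"AUC gain:      {auc_gain:.4f}")      # 0.0625
```

In a real analysis the two score vectors would come from held-out or cross-validated predictions of the fitted models, evaluated separately at each recurrence time point, so that the AUC gain is not inflated by overfitting.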


Publication year: 2025

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00