quantms-rescoring enables deep proteome coverage across protein quantification, immunopeptidomics, and post-translational modifications experiments

doi:10.64898/2026.01.12.698877

2026 · doi:10.64898/2026.01.12.698877

preprint OA: closed

Abstract

The growing volume of public proteomics datasets and the advent of novel machine learning (ML)-based methods create unprecedented opportunities for discovery through large-scale reanalysis. However, traditional desktop tools are increasingly insufficient for processing and integrating data at this scale. To address this challenge, we present a novel package, quantms-rescoring, that extends the cloud-native quantms workflow with a machine learning-based rescoring module. Unlike prior tools that rescore single-engine outputs, quantms-rescoring integrates multiple search engines (SAGE, COMET, and MSGF+), performs automatic model selection and fine-tuning, and scales reproducibly on cloud infrastructure. In quantms-rescoring, we rely on multiple fragment-ion intensity prediction methods (AlphaPeptDeep and MS2PIP) and a retention-time prediction method (DeepLC) to improve results from multiple peptide database search engines. It features automatic model selection, fine-tuning, and retraining for MS/MS intensity and retention time prediction to select the best model for a given dataset. We applied the workflow to five representative datasets spanning DDA label-free quantification, tandem mass tag (TMT) 10-plex isobaric labelling of tumor proteomics data, immunopeptidomics, phosphoproteomics, and unseen lysine malonylation experiments. We achieved a 16-22.8% increase in identified spectra, along with the quantification of 2191 additional phosphorylated peptides and 1337 phosphosites. In the TMT-labeled clear cell renal cell carcinoma dataset, multiple search engines combined with quantms-rescoring identified 76 novel differentially expressed proteins. Additionally, 11,688 novel potential HLA-II binders were detected in the immunopeptidomics dataset by multiple search engines with quantms-rescoring. For unseen malonylation data, we reported over 58.8% more malonylation PSMs and 30.5% more modification sites than COMET alone. Together, these results show that multi-engine searches and machine learning-derived features can be combined in a scalable workflow that enhances identification, PTM localization, and quantification performance.

Competing Interest Statement

T.S. and O.K. are officers in OpenMS Inc., a non-profit foundation managing OpenMS development. All remaining authors declare no competing interests.
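
The abstract mentions automatic model selection "to select the best model for a given dataset" but does not state the selection criterion. The sketch below is one plausible illustration, not the package's implementation: it assumes that each candidate predictor is scored by the median Pearson correlation between observed and predicted fragment-ion intensities over a set of confident PSMs, and that the highest-scoring model wins. The function names (`pearson`, `select_best_model`), the model labels (`model_a`, `model_b`), and the toy intensity values are all hypothetical.

```python
from math import sqrt
from statistics import median


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no correlation signal
    return sxy / sqrt(sxx * syy)


def select_best_model(observed, predictions):
    """Pick the model whose predictions best match the observed spectra.

    observed:    list of intensity vectors, one per PSM
    predictions: dict mapping a model label to a list of predicted
                 intensity vectors aligned with `observed`
    Assumed criterion (not from the source): median per-PSM Pearson r.
    """
    scores = {
        name: median(pearson(obs, pred) for obs, pred in zip(observed, preds))
        for name, preds in predictions.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy data: two PSMs with four fragment-ion intensities each (made up).
observed = [[0.1, 0.5, 1.0, 0.3], [0.9, 0.2, 0.4, 0.7]]
predictions = {
    "model_a": [[0.1, 0.6, 0.9, 0.3], [0.8, 0.2, 0.5, 0.7]],
    "model_b": [[1.0, 0.5, 0.1, 0.3], [0.2, 0.9, 0.7, 0.4]],
}
best, scores = select_best_model(observed, predictions)
print(best, scores)
```

Using the median keeps a few poorly matched spectra from dominating the comparison; a real workflow might instead compare models on downstream identification gains.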

Source provenance

europepmc: last seen: 2026-05-20T01:45:00.602351+00:00