quantms-rescoring enables deep proteome coverage across protein quantification, immunopeptidomics, and post-translational modifications experiments

preprint OA: closed
Abstract The growing volume of public proteomics datasets and the advent of novel machine learning (ML)-based methods create unprecedented opportunities for discovery through large-scale reanalysis. However, traditional desktop tools are increasingly insufficient for processing and integrating data at this scale. To address this challenge, we present a novel package, quantms-rescoring, that extends the cloud-native quantms workflow with a machine learning-based rescoring module. Unlike prior tools that rescore single-engine outputs, quantms-rescoring integrates multiple search engines (SAGE, COMET, and MSGF+), performs automatic model selection and fine-tuning, and scales reproducibly on cloud infrastructure. quantms-rescoring relies on multiple fragment-ion intensity prediction methods (AlphaPeptDeep and MS2PIP) and a retention-time prediction method (DeepLC) to improve results from multiple peptide database search engines. It features automatic model selection, fine-tuning, and retraining for MS/MS intensity and retention time prediction to select the best model for a given dataset. We applied the workflow to five representative datasets spanning DDA label-free quantification, TMT 10-plex isobaric labelling of tumor proteomics data, immunopeptidomics, phosphoproteomics, and previously unseen lysine malonylation experiments. We achieved a 16-22.8% increase in identified spectra, along with the quantification of 2191 additional phosphorylated peptides and 1337 additional phosphosites. In the tandem mass tag (TMT)-labeled clear cell renal cell carcinoma dataset, multiple search engines combined with quantms-rescoring identified 76 novel differentially expressed proteins. Additionally, 11,688 novel potential HLA-II binders were detected in the immunopeptidomics dataset by multiple search engines with quantms-rescoring. For the unseen malonylation data, we reported more than 58.8% additional malonylation PSMs and 30.5% additional modification sites compared with COMET alone. Together, these results show that multi-engine searches and machine learning-derived features can be combined in a scalable workflow that enhances identification, PTM localization, and quantification performance.

Competing Interest Statement T.S. and O.K. are officers in OpenMS Inc., a non-profit foundation managing OpenMS development. All remaining authors declare no competing interests.
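
As a rough illustration of the automatic model selection mentioned in the abstract, the sketch below picks, from several candidate predictors, the one whose predictions correlate best with observed values on a small calibration set. This is an assumption made for illustration only: the abstract does not state which selection criterion quantms-rescoring uses, and the function names (`pearson`, `select_best_model`), the model labels (`model_a`, `model_b`), and the numbers are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant sequence carries no ranking information; score it as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_best_model(observed, predictions_by_model):
    """Return (best_model_name, scores), scoring each model by correlation.

    Hypothetical criterion: the real package may use a different metric.
    """
    scores = {
        name: pearson(observed, preds)
        for name, preds in predictions_by_model.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Made-up calibration data: observed values for four confident PSMs,
# plus predictions from two hypothetical candidate models.
observed = [0.1, 0.5, 0.9, 0.3]
candidates = {
    "model_a": [0.2, 0.4, 1.0, 0.3],  # tracks the observed values closely
    "model_b": [0.9, 0.1, 0.2, 0.8],  # anti-correlated with the observed values
}
best, scores = select_best_model(observed, candidates)
print(best)
```

In this toy setting the better-correlated candidate is chosen; a real workflow would presumably evaluate candidates on many more spectra and could then fine-tune the selected model on the dataset at hand.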


Publication year: 2026

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00