CellPulse: A Foundation Model of Coordinated Gene Dynamics Simulating Viral Infectious Diseases

preprint OA: closed
Abstract

Understanding how cells respond to perturbations such as viral infection requires models that capture coordinated gene dynamics. However, current gene expression foundation models rely predominantly on single-cell data and static gene expression, which limits their applicability in real clinical scenarios. We present CellPulse, a direction-aware foundation model trained on the Virus Stimulated Atlas (VISTA), a newly curated atlas of over 23 million bulk RNA-sequencing differential expression profiles from viral infections. CellPulse models the direction and magnitude of gene expression changes through a structured representation of differential expression and a direction-aware attention mechanism, enabling it to learn coherent regulatory programs. It shows strong diagnostic capability, accurately classifying 31 distinct virus types across diverse clinical and laboratory samples solely from host transcriptional signatures. Crucially, without prior knowledge injection, interpreting CellPulse reveals virus-associated host factors that mediate infection. In silico drug screening against a selection of these host factors yielded numerous compounds whose efficacy was confirmed in wet-lab assays, while cell-based and animal experiments further verified the causal relationship between host targets and viral infections. Overall, CellPulse represents a generalizable foundation model for deciphering coordinated gene dynamics from bulk transcriptomics, bridging host response modeling with clinical relevance and therapeutic discovery for infectious diseases and beyond.

Competing Interest Statement

The Wuhan Institute of Virology, on behalf of authors X.Z., Y.R., and X-X.Z., and the Institute of Software, on behalf of authors Y.W., L.Z., and D.L., have filed a patent application for a method for disease diagnosis and drug discovery based on modeling of large-scale gene expression data. All other authors declare no competing interests.

Publication year: 2026

Source provenance

europepmc
last seen: 2026-05-20T01:45:00.602351+00:00