scRegulate: Single-Cell Regulatory-Embedded Variational Inference of Transcription Factor Activity from Gene Expression

preprint OA: closed

Abstract

Motivation

Accurately inferring transcription factor (TF) activity from single-cell RNA sequencing (scRNA-seq) data remains a fundamental challenge in computational biology. While existing methods rely on statistical models, motif enrichment, or prior-based inference, they often depend on deterministic assumptions about regulatory relationships and rely on static regulatory databases. Few approaches effectively integrate prior biological knowledge with data-driven inference to capture novel, dynamic, and context-specific regulatory interactions.

Results

To address these limitations, we develop scRegulate, a generative deep learning framework leveraging variational inference to estimate TF activities guided by experimental TF-target gene relationships and progressively adapted based on the input scRNA-seq data. By integrating structured biological constraints with a probabilistic latent space model, scRegulate offers a scalable and biologically grounded estimation of TF activity and gene regulatory networks (GRNs). Comprehensive benchmarking on public experimental and synthetic datasets demonstrates scRegulate’s superior performance. Further, scRegulate accurately recapitulates experimentally validated TF knockdown effects on a Perturb-seq dataset for key TFs. Applied to experimental human PBMC scRNA-seq data, scRegulate infers cell-type-specific GRNs and identifies differentially active TFs aligned with known regulatory pathways. scRegulate’s TF activity representations capture transcriptional heterogeneity, enabling accurate clustering of cell types. scRegulate is highly efficient, frequently an order of magnitude faster than common baselines. Collectively, our results establish scRegulate as a powerful, interpretable, and scalable framework for inferring TF activities and GRNs from single-cell transcriptomics.

Availability

Results and scripts are available at github.com/YDaiLab/scRegulate.

Supplementary information

Supplementary data are available at Bioinformatics online.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

This revised version incorporates all updates made during peer review and matches the manuscript accepted (in press) at Bioinformatics. We added new benchmarking of scRegulate versus pySCENIC on mouse embryonic stem cell scRNA-seq data, with results reflected in the updated Figure S5 and Table S2, and clarified performance comparisons using experimental human PBMC data. We also expanded the related-work section and updated Table S4 to include Dictys and scMTNI as recent dynamic GRN methods. The distinction between synthetic PBMC benchmarking (GRouNdGAN) and experimental human PBMC data has been clarified, and terminology throughout the manuscript now consistently states that scRegulate infers, rather than reconstructs, GRNs.
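
The abstract describes TF activities as latent variables estimated by variational inference and guided by prior TF-target gene relationships. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: a Gaussian latent "TF activity" vector per cell is sampled with the reparameterization trick and decoded to gene expression through a linear map restricted to prior TF-target edges. All names (`sample_tf_activity`, `masked_decode`, `kl_to_standard_normal`), the toy dimensions, the linear decoder, and the fixed binary mask are assumptions made for illustration; in particular, the sketch does not show how scRegulate adapts the prior to the input data.

```python
import numpy as np


def sample_tf_activity(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def masked_decode(tf_activity, weights, prior_mask):
    """Map TF activity (cells x TFs) to expression (cells x genes).

    Only TF-gene edges present in prior_mask contribute (assumed fixed here).
    """
    return tf_activity @ (weights * prior_mask)


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over TFs; one value per cell."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)


# Toy example with hypothetical sizes: 4 cells, 3 TFs, 5 genes.
rng = np.random.default_rng(0)
n_cells, n_tfs, n_genes = 4, 3, 5

# Hypothetical prior network: rows are TFs, columns are target genes.
prior_mask = np.array(
    [
        [1, 1, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 0, 1, 1],
    ],
    dtype=float,
)

weights = rng.standard_normal((n_tfs, n_genes))  # stand-in for learned weights
mu = rng.standard_normal((n_cells, n_tfs))       # stand-in for encoder means
log_var = np.zeros((n_cells, n_tfs))             # stand-in for encoder log-variances

z = sample_tf_activity(mu, log_var, rng)         # per-cell TF activity sample
x_hat = masked_decode(z, weights, prior_mask)    # reconstructed expression
kl = kl_to_standard_normal(mu, log_var)          # regularization term per cell
```

In a trained model of this kind, the reconstruction error between `x_hat` and the observed expression would be combined with `kl` into a single objective; here the encoder outputs and weights are random placeholders, so the values carry no biological meaning.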

