Abstract
Subcellular-level spatial transcriptomics data provide unprecedented context for uncovering finer cellular clusters and their interactions. However, integrative analysis at subcellular resolution faces many challenges owing to the data's ultra-large volume, ultra-high sparsity, and severe susceptibility to technical conditions and batch effects. We introduce STUltra, a scalable and accurate framework for integrating subcellular-level spatial omics data across spatial, temporal, and biomedical dimensions. Built on contrastive learning, STUltra combines a robust graph autoencoder with an interval sampling step to enhance batch-effect correction and enable clear characterization of shared and condition-specific tissue structures. It also extends seamlessly to super-resolution platforms such as Visium HD, Xenium, and Stereo-seq. STUltra can thus identify finer-grained cluster dynamics with distinguishable profile features, offering insights beyond those of prior studies. For example, STUltra delineates interspersed macrophages within colorectal tumors, aligns the mouse brain hippocampus across three subcellular platforms, and maps a long-term muscle continuum during mouse embryonic development. Furthermore, in a mouse model of Alzheimer’s disease, STUltra detects disease-related astrocyte substructures and disentangles the regulatory network. Importantly, STUltra scales to datasets containing over 1,000,000 cells, outperforming existing tools in both accuracy and efficiency.
Competing Interest Statement
The authors have declared no competing interest.
5 Data availability
Source data for Figs. is available with this paper. The datasets analyzed in this study are all publicly available (Supplementary Table B4). Specifically, the human DLPFC dataset can be accessed through the spatialLIBD package (http://spatial.libd.org/spatialLIBD). The mouse sagittal posterior and anterior brain data can be accessed at https://support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Mouse_Brain_Sagittal_Posterior and https://support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Mouse_Brain_Sagittal_Anterior, respectively. The mouse brain data generated by Visium HD, BMK S1000, and Stereo-Seq v2 are available at www.genographix.com. The mouse embryo data can be accessed at https://db.cngb.org/stomics/mosta/. The human colorectal cancer Visium HD data are available from the 10X Genomics datasets website: https://www.10xgenomics.com/datasets/visium-hd-cytassist-gene-expression-libraries-of-human-crc. The human lesional and nonlesional skin data are publicly available at GEO under accession number GSE202011. The mouse heart data for myocardial infarction and mechanical injury can be downloaded from the GEO database under accession code GSE214611, and the metadata can be downloaded from https://drive.google.com/file/d/161iiznh3I8eiLe9tqb2xU7QDFmo6hLWD/view?usp=sharing.