Abstract
Background
Normalization is a critical yet often poorly understood step in microbiome studies. Suboptimal approaches may lead to inaccurate conclusions in downstream analyses of microbial communities. Currently, there is no benchmarking framework to evaluate how normalization affects both sample stratification and differential abundance simultaneously across taxonomic levels. In this paper, we propose a simulation pipeline based on real data and multivariate exploratory data analysis to provide a structured and reproducible assessment of normalization methods.
Results
Normalization methods differed in accuracy across taxonomic levels and sequencing depths. In our case study, at the phylum level, edgeR-TMM and rarefaction improved accuracy by reducing coverage-related variation while preserving biological structure. In contrast, at the genus level, the overall improvement from normalization was less pronounced, reflecting the weaker influence of sequencing depth variability in this scenario, and edgeR-TMM again provided the most accurate estimate of the biological effect. Multivariate visualizations supported these observations, highlighting both sample-level and taxon-level differences among methods. However, ordination-based summaries are not sufficient for differential abundance inference and can be misleading, which motivates the use of a simulation environment with known ground truth.
Conclusions
Normalization performance varied with sequencing depth, sparsity, taxonomic resolution, and dataset size. Thus, no single normalization method is expected to be optimal across all conditions. Our proposed simulation and analysis framework offers a reproducible and interpretable platform for evaluating existing and new normalization approaches in microbiome research for specific case studies.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
This version includes a minor correction in the Introduction. In Table 1, McKnight et al. 2019 [34] was previously described as a differential abundance study; this has been corrected to reflect that the study focused on beta-diversity (clustering). No other content, analyses, results, or conclusions were changed.