ABSTRACT
Over-representation analysis (ORA) is the most commonly used interpretation tool for gene lists despite well-documented limitations: pathway boundaries are fixed, genes are assumed independent, and results depend on the background set. Network-based methods address these using interaction-network modularity, but introduce hub bias: highly connected genes appear clustered under naive nulls because curated networks overrepresent well-studied genes. Existing corrections are imperfect: edge permutation destroys the topology the test should condition on, and propagation methods hide the confound in parameter tuning. We introduce MANGO (Moran’s Autocorrelation for Network Gene Over-representation), which asks one conditional question: does a gene set’s spatial autocorrelation on a fixed biological network exceed what its degree composition alone would predict? MANGO computes Global Moran’s I under a null that conditions on both the network and the binned degree distribution of the gene set, then decomposes significant signals at the component and gene level. In benchmarks, uniform nulls produce a false positive rate of 1.0 on hub-enriched gene sets with no real clustering; ten-bin degree-stratified nulls bring that to 0.0 with no power loss (AUC ≥ 0.98; on degree-typical signals, |ΔAUC| ≤ 0.004). Pathway-spiking simulations confirm detection of real biological clustering across diverse pathway sizes and degree profiles. Applied to the FIGI colorectal cancer GWAS (204 SNPs), the set is degree-typical (KS p = 0.83), yet Moran’s I is highly significant (p < 0.001). Component-level jackknife localizes the entire signal to a single 24-gene module spanning TGF-β, Wnt/cadherin, and related pathways, with four bottlenecks (SMAD3, MYC, CTNNB1, PTPN1) matching established CRC driver biology.
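The conditional test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an unweighted, undirected network given as a dense 0/1 adjacency matrix, a binary gene-set indicator as the variable for Global Moran's I, degree bins formed from degree quantiles, and a one-sided empirical p-value. The function names (`morans_i`, `degree_bins`, `stratified_null_pvalue`) and all parameters are hypothetical.

```python
import numpy as np

def morans_i(adj, members):
    """Global Moran's I of a binary set-membership indicator on a fixed graph."""
    n = adj.shape[0]
    x = np.zeros(n)
    x[members] = 1.0
    z = x - x.mean()                      # centered indicator
    s0 = adj.sum()                        # total edge weight (both directions)
    return (n / s0) * (z @ adj @ z) / (z ** 2).sum()

def degree_bins(adj, n_bins=10):
    """Assign each node to a degree-quantile bin (tied degrees may merge bins)."""
    deg = adj.sum(axis=1)
    inner_edges = np.quantile(deg, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.searchsorted(inner_edges, deg, side="right")

def stratified_null_pvalue(adj, members, n_bins=10, n_perm=999, seed=0):
    """Compare observed Moran's I with random sets that match the gene set's
    per-bin degree counts; the network itself is never permuted (assumed design)."""
    rng = np.random.default_rng(seed)
    members = np.asarray(members)
    bins = degree_bins(adj, n_bins)
    observed = morans_i(adj, members)
    counts = np.bincount(bins[members], minlength=n_bins)   # set genes per bin
    pools = [np.flatnonzero(bins == b) for b in range(n_bins)]
    null = np.empty(n_perm)
    for k in range(n_perm):
        # Draw, within each degree bin, as many nodes as the gene set has there.
        draw = np.concatenate([
            rng.choice(pools[b], size=counts[b], replace=False)
            for b in range(n_bins) if counts[b] > 0
        ])
        null[k] = morans_i(adj, draw)
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value
```

Because each null draw reproduces the gene set's binned degree composition, a hub-enriched set is compared only with equally hub-enriched random sets, which is the sense in which the null "conditions on" degree. Swapping the per-bin draw for a uniform draw over all nodes would give the naive null that the abstract reports as inflating false positives.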
eTOC blurb
MANGO tests whether a gene set’s spatial autocorrelation on a biological network exceeds what its degree composition predicts, by conditioning Global Moran’s I on the binned degree distribution with the network held fixed. Significant signals are decomposed to modules, bottleneck genes, and statistical drivers through component jackknife, articulation-point, and gene-jackknife analysis.
Competing Interest Statement
The authors have declared no competing interest.