Impact of analytic decisions on test-retest reliability of individual and group estimates in functional magnetic resonance imaging: a multiverse analysis using the monetary incentive delay task

doi:10.1101/2024.03.19.585755

2024 · doi:10.1101/2024.03.19.585755

preprint OA: closed

Abstract

Empirical studies reporting low test-retest reliability of individual blood oxygen-level dependent (BOLD) signal estimates in functional magnetic resonance imaging (fMRI) data have resurrected interest among cognitive neuroscientists in methods that may improve reliability in fMRI. Over the last decade, several individual studies have reported that modeling decisions, such as smoothing, motion correction and contrast selection, may improve estimates of test-retest reliability of BOLD signal estimates. However, it remains an empirical question whether certain analytic decisions consistently improve individual and group level reliability estimates in an fMRI task across multiple large, independent samples. This study used three independent samples (Ns: 60, 81, 119) that administered the same task (the Monetary Incentive Delay [MID] task) across two runs and two sessions to evaluate the effects of analytic decisions on the individual (intraclass correlation coefficient [ICC(3,1)]) and group (Jaccard/Spearman rho) reliability estimates of BOLD activity of task fMRI data. The analytic decisions in this study vary across four categories: smoothing kernel (five options), motion correction (four options), task parameterization (three options) and task contrasts (four options), totaling 240 different pipeline permutations (5 × 4 × 3 × 4). Across all 240 pipelines, the median ICC estimates are consistently low, with a maximum median ICC estimate of .43 - .55 across the three samples. The analytic decisions with the greatest impact on the median ICC and group similarity estimates are the Implicit Baseline contrast, Cue Model parameterization and a larger smoothing kernel. Using an Implicit Baseline in a contrast condition meaningfully increased group similarity and ICC estimates as compared to using the Neutral cue. This effect was largest for the Cue Model parameterization; however, improvements in reliability came at the cost of interpretability. This study illustrates that estimates of reliability in the MID task are consistently low and variable in small samples, and a higher test-retest reliability may not always improve interpretability of the estimated BOLD signal.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

This Stage 2 Registered Report was submitted for review to Peer Community In: Registered Reports (PCI RR) on March 19th, 2024. An error was identified in the ABCD pipeline in April 2024. A correction was made on May 2nd, 2024 (no interpretations changed) and the report was resubmitted for review on May 5th, 2024. Initial reviews were obtained on June 17th, 2024. The Stage 2 revision was submitted on June 29th, 2024. It was recommended for acceptance by the editor on July 9th, 2024 at PCI RR. This revision includes the accepted version of the Stage 2 registered report with the appropriate badge and updated Zenodo link for final code.
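
The pipeline count and the reliability metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The option labels are hypothetical placeholders (only the number of options per category comes from the abstract), `icc_3_1` follows the standard two-way mixed, consistency, single-measurement formula, and `jaccard` assumes the group maps have already been thresholded into binary masks, a step the abstract does not describe.

```python
from itertools import product

import numpy as np

# Hypothetical placeholder labels; only the option counts are from the abstract.
smoothing = [f"smooth_{i}" for i in range(5)]
motion = [f"motion_{i}" for i in range(4)]
parameterization = [f"model_{i}" for i in range(3)]
contrast = [f"contrast_{i}" for i in range(4)]

# Full crossing of the four decision categories: 5 * 4 * 3 * 4 pipelines.
pipelines = list(product(smoothing, motion, parameterization, contrast))


def icc_3_1(scores):
    """ICC(3,1) for an (n_subjects, k_sessions) array of estimates.

    Uses the two-way ANOVA mean squares:
    (MS_subjects - MS_error) / (MS_subjects + (k - 1) * MS_error).
    """
    y = np.asarray(scores, dtype=float)
    n, k = y.shape
    grand = y.mean()
    ss_total = ((y - grand) ** 2).sum()
    ss_subjects = k * ((y.mean(axis=1) - grand) ** 2).sum()
    ss_sessions = n * ((y.mean(axis=0) - grand) ** 2).sum()
    ss_error = ss_total - ss_subjects - ss_sessions
    ms_subjects = ss_subjects / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_subjects - ms_error) / (ms_subjects + (k - 1) * ms_error)


def jaccard(mask_a, mask_b):
    """Jaccard overlap of two binary maps (assumed already thresholded)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0
```

Because ICC(3,1) is a consistency measure, a constant shift between sessions (every subject's estimate rising by the same amount) leaves it at 1, whereas changes in the rank order of subjects lower it.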
