Evolving learning state reactivation and value encoding neural dynamics in multi-step planning

preprint OA: gold CC-BY-NC-4.0
Abstract

Planning in value-based decision making is often dynamic, with reinforcement learning (RL) providing a powerful framework for investigating how value and action at each step change across trials. Surprisingly, the evolving neural signatures of value estimation and state reactivation in multi-step planning, both within and across trials, have received little consideration. Here, using magnetoencephalography (MEG), we detail neural dynamics associated with planning, in which subjects were tasked with finding an optimal path to maximise reward. Behavioural evidence showed improved performance across trials, including an increasing disregard for low-value states. MEG data captured evolving value estimation signals such that, across trials, there was an emergence of stronger and earlier within-trial value encoding linked to boosted vmPFC activity. Value encoding signals showed a positive correlation with individual performance metrics, as reflected in overall task-related reward earnings. Strikingly, across trials, there was an attenuation of state reactivation for negative-value states, an effect that positively correlated with evolving negative-value state avoidance behaviour. The finding linking neural dynamics, including a valence-dependent selective reactivation of negative states, to across-trial behavioural improvement advances an understanding of learning during multi-step planning.

Competing Interest Statement

The authors have declared no competing interest.


Publication year: 2025
