2026-03-26 The University of Tokyo Institutes for Advanced Study

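Figure 1: Double autonomous alignment of neural connections and its dependence on the excitation/inhibition balance of the cerebral cortex
<Related information>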
- https://ircn.jp/pressrelease/20260303_morita_kumar
- https://www.jneurosci.org/content/early/2026/02/27/JNEUROSCI.1762-25.2026
Mesocorticostriatal reinforcement learning of state representation and value with implications for the mechanisms of schizophrenia
Kenji Morita and Arvind Kumar
Journal of Neuroscience, Published: 3 March 2026
DOI: https://doi.org/10.1523/JNEUROSCI.1762-25.2026
Abstract
Mesocorticostriatal dopamine projections are crucial for value learning, motivational control, and cognitive functions. However, while dopamine’s role in value learning as reward-prediction-error (RPE) has been much understood, precise roles in motivational control and cognitive functions remain more elusive. Computationally, this corresponds to that while the operation of mesostriatal dopamine could be minimally described by simple reinforcement learning (RL) models with one-dimensional reward/RPE and fixed state representation, (i) how reward-specific motivational control can be achieved through heterogeneous dopamine responses, and (ii) how sophisticated cortical state representation can be formed through mesocortical dopamine, cannot be captured by such simple models. To address both of these at once, we combined recent models for each of them: the “Reward Bases (RB)”, which achieved reward-specific motivational control through multi-dimensional RPE (but with fixed cortical representation), and the “online value-recurrent-neural-network (OVRNN)”, which achieved state-representation learning through training of RNN by RPE (but of one-dimensional). We show that the combined model can achieve both functions simultaneously via double ‘feedback alignments’ of the cortical and striatal downstream connections to the mesocorticostriatal dopamine projections. Crucially, cortical inhibition-dominance is a key for successful learning. Excessive excitation leads to aberrant persistent activity, which disrupts the alignments and impairs reward-specific motivational control and credit assignment. This implies how negative and positive symptoms of schizophrenia could emerge from excitation-inhibition imbalance, and we show how our model could explain altered brain activations in patients. Our model thus provides an integrated computational account for dopamine’s functions, with implications on how its dysfunctions link to schizophrenia.
Significance statement
Dopamine has been suggested to play crucial roles in value learning, motivational control, and cognitive functions, and they have been tried to be understood using the reinforcement learning (RL) framework. However, existing RL models have two limitations: reward identity/diversity is ignored, and state/action representation is handcrafted. Recent studies addressed either of them, but only separately. We combine these separate models, and demonstrate that reward-specific value and state representation can be simultaneously learned through double operations of “feedback alignment”, a bio-plausible alternative to the dominant machine-learning algorithm. Crucially, inhibition-dominance is a key for successful learning. Excessive excitation-induced persistent activity disturbs alignments and impairs motivational control and credit assignment, implying how excitation-inhibition imbalance could lead to negative and positive symptoms of schizophrenia.
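
To make the idea of a multi-dimensional RPE more concrete, the following is a minimal sketch, not the authors' combined model. It assumes a hand-fixed state representation (so there is no RNN and no feedback alignment), a hypothetical toy task with two cues and two reward types, and a simple weighted sum for combining reward-specific values under a motivational state. All names and parameter values (`TRANSITIONS`, `train`, `motivated_value`, `alpha`, `gamma`) are illustrative assumptions.

```python
# Toy temporal-difference learning with one RPE per reward type.
# Assumption: the task, state names, and parameters are hypothetical.

REWARD_TYPES = ("food", "water")

# state -> (next state, or None at episode end; reward vector on leaving the state)
TRANSITIONS = {
    "cue_A": ("wait_A", (0.0, 0.0)),
    "wait_A": (None, (1.0, 0.0)),  # cue A is eventually followed by food
    "cue_B": ("wait_B", (0.0, 0.0)),
    "wait_B": (None, (0.0, 1.0)),  # cue B is eventually followed by water
}


def train(n_episodes=500, alpha=0.1, gamma=0.9):
    """Learn one value per (state, reward type) using a vector-valued RPE."""
    values = {s: [0.0] * len(REWARD_TYPES) for s in TRANSITIONS}
    for episode in range(n_episodes):
        # Alternate deterministically between the two cues.
        state = "cue_A" if episode % 2 == 0 else "cue_B"
        while state is not None:
            next_state, reward = TRANSITIONS[state]
            for i in range(len(REWARD_TYPES)):
                next_value = values[next_state][i] if next_state is not None else 0.0
                # Reward-type-specific prediction error (one RPE dimension).
                rpe = reward[i] + gamma * next_value - values[state][i]
                values[state][i] += alpha * rpe
            state = next_state
    return values


def motivated_value(values, state, motivation):
    """Combine reward-specific values with motivational weights (assumed linear)."""
    return sum(m * v for m, v in zip(motivation, values[state]))


if __name__ == "__main__":
    learned = train()
    hungry, thirsty = (1.0, 0.0), (0.0, 1.0)
    for cue in ("cue_A", "cue_B"):
        print(cue,
              round(motivated_value(learned, cue, hungry), 3),
              round(motivated_value(learned, cue, thirsty), 3))
```

In this sketch, switching the motivation vector from hungry to thirsty reverses which cue is preferred without any relearning, which is the kind of reward-specific motivational control the abstract attributes to multi-dimensional RPE. Learning the state representation itself, which the abstract attributes to RPE-trained recurrent networks, is outside the scope of this toy example.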

