Paper Abstract

Humans rapidly learn abstract knowledge when encountering novel environments and flexibly deploy this knowledge to guide efficient and intelligent action. Can modern AI systems learn and plan in a similar way? We study this question using a dataset of complex human gameplay with concurrent fMRI recordings, in which participants learn novel video games that require rule discovery, hypothesis revision, and multi-step planning. We jointly evaluate models by their ability to play the games, match human learning behavior, and predict brain activity during the same task, comparing a suite of frontier Large Reasoning Models (LRMs) against model-free and model-based deep reinforcement learning agents and a Bayesian theory-based agent. We find that frontier LRMs most closely match human behavioral patterns during game discovery and predict brain activity an order of magnitude better than both reinforcement learning alternatives across cortical and subcortical regions, with effects robust to permutation controls. Through targeted manipulations, we further show that brain alignment reflects the model's in-context representation of the game state rather than its downstream planning or reasoning. Our results establish LRMs as compelling computational accounts of human learning and decision making in complex, naturalistic environments. A project page with interactive replays accompanies the paper.

Reason to Play: Behavioral and Brain Alignment Between Frontier LRMs and Human Game Learners
This research investigates whether modern Large Reasoning Models (LRMs) learn and plan in ways that mirror human cognition. By comparing how humans and AI agents navigate complex, novel video games, the authors explore whether these models can serve as accurate computational models of human decision-making. The study uses a unique dataset that pairs human gameplay with concurrent fMRI brain scans, allowing researchers to evaluate not just how well the models perform, but how closely their internal representations align with the neural activity observed in the human brain during the same tasks.

Evaluating AI through Human Gameplay

The study uses games built in the Video Game Description Language (VGDL) framework, which require agents to infer rules, identify objects, and plan multi-step actions in unfamiliar environments. The researchers tested eight frontier LRMs, including models from the Qwen and DeepSeek families, against model-free and model-based deep reinforcement learning (RL) agents and a symbolic Bayesian agent. Unlike RL models, which often require extensive training to master a specific game, the LRMs were evaluated "off-the-shelf": they had to learn the game mechanics during play, using only the information provided in the interaction itself.
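
To make the "off-the-shelf" setup concrete, the following is a minimal sketch of an in-context evaluation loop, not the authors' code. Everything in it is a hypothetical stand-in: `toy_game_step` is a made-up one-dimensional game rather than a VGDL game, and `stub_policy` is a hand-written placeholder where a call to an LRM would go. The point it illustrates is that the agent's only source of knowledge about the game is the interaction history passed to it at each step; nothing is trained between steps.

```python
# Toy stand-ins only: neither the game nor the policy comes from the paper.

MOVES = {"left": -1, "right": 1, "jump": 2}  # "jump" is the rule to discover
GOAL = 5

def toy_game_step(state, action):
    """Hypothetical 1-D corridor: positions 0..GOAL, reaching GOAL wins."""
    next_state = max(0, min(GOAL, state + MOVES[action]))
    reward = 1 if next_state == GOAL else 0
    return next_state, reward

def stub_policy(history):
    """Placeholder for an LRM call. It sees only the interaction history
    (the 'context') and carries no trained, game-specific parameters."""
    tried = {h["action"] for h in history}
    for action in MOVES:
        if action not in tried:
            return action  # try every action once to discover its effect
    # Then reuse whichever action produced the largest forward displacement.
    best = max(history, key=lambda h: h["next_state"] - h["state"])
    return best["action"]

def run_episode(max_steps=20):
    """Play one episode, appending every transition to the shared history."""
    state, history = 0, []
    for _ in range(max_steps):
        action = stub_policy(history)
        next_state, reward = toy_game_step(state, action)
        history.append({"state": state, "action": action,
                        "next_state": next_state, "reward": reward})
        state = next_state
        if reward:
            break
    return history

history = run_episode()
print([h["action"] for h in history])  # ['left', 'right', 'jump', 'jump']
```

In a real evaluation the history would presumably be serialized into the model's prompt, and the stub would be replaced by a model call; how the paper formats that prompt is not stated in this summary.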

Behavioral Alignment

The results show that frontier LRMs are markedly more human-like in their learning efficiency than traditional deep reinforcement learning agents. While RL baselines often require vast amounts of experience to master a game, LRMs show a "discovery" speed that closely matches that of human participants. The models also exhibit a distinct pattern of reasoning: they generate longer internal "chains of thought" when encountering novel game elements or starting a new level, and these traces shorten as the agent gains a better understanding of the game's rules. However, the study notes that although LRMs learn quickly, they sometimes show "perseveration," repeating a previously successful path even when a more efficient route is available.
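
One simple way to check the trace-length pattern described above is to compare the average length of reasoning traces on steps that involve a novel game element with the average on familiar steps. The sketch below assumes a hypothetical record format (`trace_tokens`, `novel`) and uses made-up numbers purely for illustration; neither the field names nor the values come from the paper's dataset.

```python
from statistics import mean

def mean_trace_length_by_novelty(steps):
    """Return (mean length on novel steps, mean length on familiar steps)."""
    novel = [s["trace_tokens"] for s in steps if s["novel"]]
    familiar = [s["trace_tokens"] for s in steps if not s["novel"]]
    return mean(novel), mean(familiar)

# Made-up illustrative records, NOT data from the study.
steps = [
    {"trace_tokens": 900, "novel": True},   # first encounter with an object
    {"trace_tokens": 700, "novel": True},   # start of a new level
    {"trace_tokens": 300, "novel": False},
    {"trace_tokens": 200, "novel": False},
    {"trace_tokens": 100, "novel": False},
]

novel_mean, familiar_mean = mean_trace_length_by_novelty(steps)
print(novel_mean, familiar_mean)  # 800 200
```

How "novel" is defined (first contact with an object type, first step of a level, and so on) is an analysis choice that this sketch leaves to the caller.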

Brain Activity and Internal Representations

A key finding of this research is that the internal representations of LRMs predict human brain activity well. When the models were shown the same game states as the human participants, their hidden-state features predicted BOLD (blood-oxygen-level-dependent) signals across cortical and subcortical regions an order of magnitude better than the reinforcement learning alternatives. Crucially, the researchers found that this alignment is driven by the model’s in-context representation of the current game state, rather than its downstream planning or reasoning processes. This suggests that the way these models "perceive" and organize information about a new environment resembles the way the human brain processes complex, interactive tasks.
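
The summary does not say which regression method the authors used to map hidden states to BOLD signals. A common choice for this kind of analysis is a linear encoding model, and the sketch below assumes ridge regression scored by held-out Pearson correlation, followed by a simple permutation control of the kind the abstract mentions. All data here are synthetic, the helper names are hypothetical, and the sketch omits steps a real fMRI pipeline would need (hemodynamic delay, temporally blocked cross-validation, and permutation schemes that respect autocorrelation).

```python
import random

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam*I) w = X^T y by Gauss-Jordan elimination."""
    d = len(X[0])
    # Augmented matrix [X^T X + lam*I | X^T y]
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)]
         + [sum(row[i] * t for row, t in zip(X, y))]
         for i in range(d)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(d):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][d] / A[i][i] for i in range(d)]

def predict(X, w):
    return [sum(x * wi for x, wi in zip(row, w)) for row in X]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Synthetic stand-ins: rows of X play the role of hidden-state features per
# time point, y plays the role of one voxel's BOLD signal.
rng = random.Random(0)
true_w = [0.5, -1.0, 0.25, 2.0]
X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(200)]
y = [sum(x * w for x, w in zip(row, true_w)) + rng.gauss(0, 0.5) for row in X]

X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
w = ridge_fit(X_train, y_train, lam=1.0)
pred = predict(X_test, w)
r_heldout = pearson(pred, y_test)

# Permutation control: break the pairing between predictions and signal.
shuffled, null = list(y_test), []
for _ in range(200):
    rng.shuffle(shuffled)
    null.append(pearson(pred, shuffled))
p_value = (1 + sum(n >= r_heldout for n in null)) / (1 + len(null))
print(round(r_heldout, 3), round(p_value, 3))
```

Under this setup, comparing two model families amounts to fitting the same encoding model on each family's features and comparing the held-out scores voxel by voxel or region by region.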

Implications for Cognitive Modeling

The study establishes frontier LRMs as compelling tools for understanding human learning and decision-making in naturalistic environments. By demonstrating that these models can achieve high levels of behavioral and neural alignment without task-specific fine-tuning, the authors provide a new framework for computational neuroscience. The project also provides an extensive open-source dataset, including over 100,000 reasoning traces and extracted hidden-state features, offering a rich resource for future research into the dynamics of in-context problem solving and the intersection of artificial and human intelligence.
