
Key Takeaways

  • Cognitive science often evaluates theories through narrow paradigms and local model comparisons, limiting the integration of evidence across tasks and realizations.
  • We introduce an automated adversarial collaboration framework for adjudicating among competing theories even when the candidate models and experiments must be discovered during the adjudication process.
  • The system combines LLM-based theory agents, program synthesis, and information-theoretic experimental design in a closed loop.
  • In a simulation study spanning three classic categorization theories, the framework recovered the ground-truth theory across most noise settings, with weaker reliability in the hardest settings.
Paper Abstract

Cognitive science often evaluates theories through narrow paradigms and local model comparisons, limiting the integration of evidence across tasks and realizations. We introduce an automated adversarial collaboration framework for adjudicating among competing theories even when the candidate models and experiments must be discovered during the adjudication process. The system combines LLM-based theory agents, program synthesis, and information-theoretic experimental design in a closed loop. In a simulation study spanning three classic categorization theories, the framework recovered the ground-truth theory across noise settings with weaker reliability in the hardest settings. Together, the framework and findings provide a concrete proof of concept for closed-loop, in-silico theory adjudication in cognitive science.

Automated Adversarial Collaboration for Advancing Theory Building in the Cognitive Sciences

Cognitive science often struggles to build broad, integrated theories because researchers typically focus on narrow experimental tasks and local model comparisons. This paper introduces an automated framework that uses artificial intelligence to conduct "adversarial collaboration"—a process where competing theories are pitted against one another in a closed-loop system. By combining Large Language Models (LLMs), automated program synthesis, and smart experimental design, the system can discover, test, and refine theories without requiring researchers to manually define every model or experiment in advance.

How the System Works

The framework functions as an iterative, in-silico debate. It begins by registering "theory agents"—LLM-based systems that represent specific theoretical claims and their corresponding computational models. These agents analyze where their predictions diverge from their rivals and propose novel experiments to settle the dispute. The system then uses information-theoretic design to select the most informative experiment, runs it, and observes the results. Based on these outcomes, the agents update their beliefs, critique their opponents, and revise their own models. This cycle continues until the system can determine which theoretical account is best supported by the evidence.
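The adjudication loop described above can be sketched in miniature. This is not the paper's implementation: the theory names, candidate experiments, and response probabilities below are invented placeholders, and each "theory agent" is reduced to a fixed Bernoulli prediction per experiment rather than an LLM with a synthesized model. What the sketch does preserve is the closed-loop logic: score candidate experiments by expected information gain over the current belief, run the most informative one, and update beliefs by Bayes' rule.

```python
import math
import random

random.seed(0)

# Minimal stand-ins for theory agents: each theory is reduced to a
# Bernoulli probability of a positive result per candidate experiment.
# Theory labels and probabilities are invented for illustration.
THEORIES = {
    "exemplar": {"exp_A": 0.9, "exp_B": 0.5},
    "rule":     {"exp_A": 0.6, "exp_B": 0.1},
    "cluster":  {"exp_A": 0.9, "exp_B": 0.9},
}
EXPERIMENTS = ("exp_A", "exp_B")

def entropy(dist):
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def update(prior, experiment, outcome):
    """Bayes update over theories given a binary experimental outcome."""
    post = {t: prior[t] * (THEORIES[t][experiment] if outcome
                           else 1 - THEORIES[t][experiment])
            for t in prior}
    z = sum(post.values())
    return {t: v / z for t, v in post.items()}

def expected_info_gain(prior, experiment):
    """Expected reduction in uncertainty over theories from one experiment."""
    h0 = entropy(prior)
    gain = 0.0
    for outcome in (True, False):
        p_out = sum(prior[t] * (THEORIES[t][experiment] if outcome
                                else 1 - THEORIES[t][experiment])
                    for t in prior)
        if p_out > 0:
            gain += p_out * (h0 - entropy(update(prior, experiment, outcome)))
    return gain

# Closed loop: choose the most informative experiment, observe a simulated
# outcome from a hidden ground-truth theory, and update beliefs.
truth = "rule"
belief = {t: 1.0 / len(THEORIES) for t in THEORIES}
for _ in range(6):
    exp = max(EXPERIMENTS, key=lambda e: expected_info_gain(belief, e))
    outcome = random.random() < THEORIES[truth][exp]
    belief = update(belief, exp, outcome)

winner = max(belief, key=belief.get)
print(winner, {t: round(p, 3) for t, p in belief.items()})
```

In this toy setup the loop keeps selecting `exp_B`, the experiment where the candidate theories disagree most, and belief concentrates on the ground truth within a few iterations. In the actual framework the experiment pool is not fixed in advance: agents propose new experiments, and program synthesis supplies the models that generate each theory's predictions.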

Testing Against Classic Theories

To validate the framework, the researchers tested it against three well-known theories from human categorization research: the Generalized Context Model (GCM), Rule plus Exception (RULEX), and the Supervised and Unsupervised STratified Adaptive Incremental Network (SUSTAIN). The system was tasked with identifying the "ground-truth" theory from synthetic data generated by these models under varying levels of noise. The framework successfully recovered the correct theory in noiseless conditions and performed reliably across most noisy conditions, demonstrating that it can effectively adjudicate between complex, competing scientific accounts.
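The recovery test above can be illustrated with a generic model-recovery sketch. The response profiles below are invented stand-ins, not fitted GCM, RULEX, or SUSTAIN models; the point is only the procedure: simulate choice data from one ground-truth model under a guessing-noise parameter, score every candidate by log-likelihood, and check whether the true model wins.

```python
import math
import random

random.seed(1)

# Illustrative stand-ins for the three candidate theories: each model is
# reduced to a per-stimulus probability of responding "category A".
# These response profiles are invented for the sketch, not from the paper.
MODELS = {
    "GCM":     [0.95, 0.90, 0.10, 0.05],
    "RULEX":   [0.99, 0.60, 0.40, 0.01],
    "SUSTAIN": [0.85, 0.80, 0.20, 0.15],
}

def noisy_p(model, stim, noise):
    """Mix the model's prediction with guessing at rate `noise`."""
    return (1 - noise) * MODELS[model][stim] + noise * 0.5

def simulate(truth, noise, n_trials=200):
    """Generate (stimulus, response) pairs from the ground-truth model."""
    data = []
    for _ in range(n_trials):
        stim = random.randrange(len(MODELS[truth]))
        data.append((stim, random.random() < noisy_p(truth, stim, noise)))
    return data

def log_likelihood(model, data, noise):
    ll = 0.0
    for stim, resp in data:
        p = noisy_p(model, stim, noise)
        ll += math.log(p if resp else 1 - p)
    return ll

# Model recovery: simulate from each ground truth in turn and check which
# candidate scores highest on the synthetic data.
noise = 0.2
recovered = {truth: max(MODELS, key=lambda m: log_likelihood(
                 m, simulate(truth, noise), noise))
             for truth in MODELS}
print(recovered)
```

Raising `noise` toward 0.5 flattens all three response profiles toward guessing, so the likelihood margins shrink and recovery becomes unreliable, which is the qualitative pattern the study reports for RULEX under heavy noise.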

Key Findings and Limitations

The study revealed that the framework’s success is influenced by the inherent "recoverability" of the models it synthesizes. For instance, the GCM and SUSTAIN theories were highly recoverable even when data was noisy. However, the performance for the RULEX theory degraded rapidly as noise increased. This suggests that the system’s effectiveness is tied not just to the adjudication loop itself, but also to the capabilities of the underlying program synthesis tools. While the framework serves as a successful proof of concept for automated theory building, the authors note that it is not yet a complete solution and requires further development to handle more complex, real-world human behavioral data.
