Recursive Multi-Agent Systems (RecursiveMAS) introduces a new way to scale the performance of multi-agent AI systems by treating their collaboration as a recursive, unified computation. Instead of having agents communicate through slow, text-based messages, this framework connects them in a continuous "latent space." By looping these agents together, the system can iteratively refine its reasoning, leading to more accurate results while significantly reducing the computational cost and token usage typically required for complex tasks.
Scaling Collaboration Through Recursion
Traditional multi-agent systems often struggle with inefficiency because agents must wait for one another to generate and process text. RecursiveMAS solves this by recasting the entire system as a recursive loop. Each agent acts like a layer in a larger model, passing "latent thoughts"—continuous numerical representations of information—directly to the next agent. This allows the system to perform deep, iterative reasoning without the overhead of constantly decoding and re-encoding text between steps.
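The idea of a latent handoff can be made concrete with a toy sketch. The real agents are LLMs; here each "agent" is just a single nonlinear transform over a shared vector, and all names and dimensions are assumptions for illustration. The point is that agent A's hidden state flows straight into agent B with no decode-to-text / re-encode step in between.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # shared latent dimension (assumed for this toy example)

def agent_step(h, W):
    """One agent refines the latent thought: a single nonlinear transform."""
    return np.tanh(W @ h)

# Two toy agents, each reduced to one random "layer" for illustration.
W_a = rng.normal(scale=0.5, size=(D, D))
W_b = rng.normal(scale=0.5, size=(D, D))

# Latent handoff: the thought stays a continuous vector throughout.
h = rng.normal(size=D)          # initial latent thought
for _ in range(3):              # recursive refinement loop
    h = agent_step(h, W_a)      # agent A refines the thought
    h = agent_step(h, W_b)      # agent B consumes A's latent output directly

print(h.shape)                  # still a D-dimensional vector, never text
```

In a text-based pipeline, each arrow between agents would instead be a full decode-and-re-encode round trip, which is exactly the overhead the recursive latent loop avoids.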
The RecursiveLink Module
At the heart of this framework is the "RecursiveLink," a lightweight module that acts as a bridge between agents. It performs two critical functions:
Inner Link: Helps an individual agent refine its own ongoing thoughts during the generation process.
Outer Link: Enables heterogeneous agents (different models or sizes) to exchange information seamlessly, even if their hidden states have different dimensions.
By using a residual connection, the RecursiveLink preserves the core meaning of the information while focusing only on aligning the data between agents. This design makes the system more efficient, replacing expensive vocabulary-space decoding with fast, latent-space transformations.
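The residual design described above can be sketched as follows. This is not the paper's implementation; it is a minimal numpy illustration, assuming the link is a learned alignment projection plus a small learned correction added back residually, so the link starts out as (nearly) a pass-through.

```python
import numpy as np

rng = np.random.default_rng(1)

class RecursiveLink:
    """Toy sketch of a latent bridge between agents (names are assumptions).

    align: a projection mapping the sender's hidden size to the receiver's
           (identity-initialised, so meaning is preserved at the start).
    delta: a small learned correction; the residual connection adds it to
           the aligned state rather than replacing it.
    """
    def __init__(self, d_in, d_out):
        self.align = np.eye(d_out, d_in)                    # near pass-through init
        self.delta = rng.normal(scale=0.01, size=(d_out, d_out))

    def __call__(self, h):
        aligned = self.align @ h                 # match the receiver's dimension
        return aligned + self.delta @ aligned    # residual: keep core content

# Inner link: same dimension, refining an agent's own latent thought.
inner = RecursiveLink(8, 8)
# Outer link: bridging a smaller agent (dim 8) to a larger one (dim 16).
outer = RecursiveLink(8, 16)

h = rng.normal(size=8)
print(inner(h).shape, outer(h).shape)  # (8,) (16,)
```

Because the link only learns the alignment correction, it stays far cheaper than projecting latent states back through a vocabulary-sized output layer.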
Training for System-Wide Co-optimization
To make the system work as a cohesive unit, the researchers developed an "inner-outer loop" training algorithm. In the inner loop, individual agents are warmed up to better handle latent thought generation. In the outer loop, the entire system is trained as a single entity. Gradients are back-propagated through the entire recursive loop, which allows the system to learn how to collaborate effectively. Theoretical analysis shows that this latent-space approach prevents the "gradient vanishing" problems often found in text-based multi-agent systems, ensuring that the learning process remains stable.
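The inner-outer structure can be sketched numerically. In this toy version (an assumption for illustration: real agents are LLMs, here each is a single matrix), the inner loop warms up agent A alone, and the outer loop trains the composed system end to end, with the gradient chained back through both agents in one pass.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
x = np.full(D, 0.5)              # fixed toy input
target = rng.normal(size=D)      # fixed toy target

# Two toy linear "agents" composed into one system.
W_a = rng.normal(scale=0.3, size=(D, D))
W_b = rng.normal(scale=0.3, size=(D, D))

# Inner loop: warm up agent A alone so its latent output is well-behaved
# (here: nudge it toward a stable pass-through, minimising ||W_a x - x||^2).
warm_lr = 0.1
for _ in range(20):
    h = W_a @ x
    W_a -= warm_lr * np.outer(2 * (h - x), x)

# Outer loop: train the whole system as a single entity; the squared-error
# gradient flows back through agent B *and* agent A via the chain rule.
lr = 0.02
for _ in range(500):
    h = W_a @ x                      # agent A's latent thought
    y = W_b @ h                      # agent B consumes it directly
    err = 2 * (y - target)           # d(loss)/dy for squared error
    W_b -= lr * np.outer(err, h)
    W_a -= lr * np.outer(W_b.T @ err, x)   # gradient chained through the loop

loss = float(np.sum((W_b @ (W_a @ x) - target) ** 2))
print(loss)  # shrinks toward zero as the agents co-adapt
```

Because every step of the chain stays in latent space, the gradient signal passes through smooth continuous transforms rather than discrete token choices, which is the intuition behind the stability result mentioned above.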
Performance and Efficiency Gains
RecursiveMAS was tested across nine benchmarks covering mathematics, science, medicine, search, and code generation. Compared to existing single-agent and multi-agent baselines, the framework delivered an average accuracy improvement of 8.3%. Beyond accuracy, the system proved to be highly efficient, achieving 1.2x to 2.4x faster inference speeds and reducing token usage by 34.6% to 75.6%. These results demonstrate that recursive latent collaboration is a powerful, scalable strategy for building more capable and efficient AI systems.