Recursive Multi-Agent Systems (RecursiveMAS) introduces a new way to scale the performance of multi-agent AI systems by treating their collaboration as a recursive, unified computation. Instead of having agents communicate through slow, text-based messages, this framework connects them in a continuous "latent space." By looping these agents together, the system can iteratively refine its reasoning, leading to more accurate results while significantly reducing the computational cost and token usage typically required for complex tasks.
Scaling Collaboration Through Recursion
Traditional multi-agent systems often struggle with inefficiency because agents must wait for one another to generate and process text. RecursiveMAS solves this by recasting the entire system as a recursive loop. Each agent acts like a layer in a larger model, passing "latent thoughts"—continuous numerical representations of information—directly to the next agent. This allows the system to perform deep, iterative reasoning without the overhead of constantly decoding and re-encoding text between steps.
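The idea of a latent handoff can be made concrete with a toy sketch. The real agents are LLMs; here each "agent" is just a single nonlinear transform over a shared vector, and all names and dimensions are assumptions for illustration. The point is that agent A's hidden state flows straight into agent B with no decode-to-text / re-encode step in between.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # shared latent dimension (assumed for this toy example)

def agent_step(h, W):
    """One agent refines the latent thought: a single nonlinear transform."""
    return np.tanh(W @ h)

# Two toy agents, each reduced to one random "layer" for illustration.
W_a = rng.normal(scale=0.5, size=(D, D))
W_b = rng.normal(scale=0.5, size=(D, D))

# Latent handoff: the thought stays a continuous vector throughout.
h = rng.normal(size=D)          # initial latent thought
for _ in range(3):              # recursive refinement loop
    h = agent_step(h, W_a)      # agent A refines the thought
    h = agent_step(h, W_b)      # agent B consumes A's latent output directly

print(h.shape)                  # still a D-dimensional vector, never text
```

In a text-based pipeline, each arrow between agents would instead be a full decode-and-re-encode round trip, which is exactly the overhead the recursive latent loop avoids.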
The RecursiveLink Module
At the heart of this framework is the "RecursiveLink," a lightweight module that acts as a bridge between agents. It performs two critical functions:
Inner Link: Helps an individual agent refine its own ongoing thoughts during the generation process.
Outer Link: Enables heterogeneous agents (different models or sizes) to exchange information seamlessly, even if their hidden states have different dimensions.
By using a residual connection, the RecursiveLink preserves the core meaning of the information while focusing only on aligning the data between agents. This design makes the system more efficient, replacing expensive vocabulary-space decoding with fast, latent-space transformations.
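The residual design described above can be sketched as follows. This is not the paper's implementation; it is a minimal numpy illustration, assuming the link is a learned alignment projection plus a small learned correction added back residually, so the link starts out as (nearly) a pass-through.

```python
import numpy as np

rng = np.random.default_rng(1)

class RecursiveLink:
    """Toy sketch of a latent bridge between agents (names are assumptions).

    align: a projection mapping the sender's hidden size to the receiver's
           (identity-initialised, so meaning is preserved at the start).
    delta: a small learned correction; the residual connection adds it to
           the aligned state rather than replacing it.
    """
    def __init__(self, d_in, d_out):
        self.align = np.eye(d_out, d_in)                    # near pass-through init
        self.delta = rng.normal(scale=0.01, size=(d_out, d_out))

    def __call__(self, h):
        aligned = self.align @ h                 # match the receiver's dimension
        return aligned + self.delta @ aligned    # residual: keep core content

# Inner link: same dimension, refining an agent's own latent thought.
inner = RecursiveLink(8, 8)
# Outer link: bridging a smaller agent (dim 8) to a larger one (dim 16).
outer = RecursiveLink(8, 16)

h = rng.normal(size=8)
print(inner(h).shape, outer(h).shape)  # (8,) (16,)
```

Because the link only learns the alignment correction, it stays far cheaper than projecting latent states back through a vocabulary-sized output layer.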
Training for System-Wide Co-optimization
To make the system work as a cohesive unit, the researchers developed an "inner-outer loop" training algorithm. In the inner loop, individual agents are warmed up to better handle latent thought generation. In the outer loop, the entire system is trained as a single entity. Gradients are back-propagated through the entire recursive loop, which allows the system to learn how to collaborate effectively. Theoretical analysis shows that this latent-space approach prevents the "gradient vanishing" problems often found in text-based multi-agent systems, ensuring that the learning process remains stable.
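The inner-outer structure can be sketched numerically. In this toy version (an assumption for illustration: real agents are LLMs, here each is a single matrix), the inner loop warms up agent A alone, and the outer loop trains the composed system end to end, with the gradient chained back through both agents in one pass.

```python
import numpy as np

rng = np.random.default_rng(2)
D = 4
x = np.full(D, 0.5)              # fixed toy input
target = rng.normal(size=D)      # fixed toy target

# Two toy linear "agents" composed into one system.
W_a = rng.normal(scale=0.3, size=(D, D))
W_b = rng.normal(scale=0.3, size=(D, D))

# Inner loop: warm up agent A alone so its latent output is well-behaved
# (here: nudge it toward a stable pass-through, minimising ||W_a x - x||^2).
warm_lr = 0.1
for _ in range(20):
    h = W_a @ x
    W_a -= warm_lr * np.outer(2 * (h - x), x)

# Outer loop: train the whole system as a single entity; the squared-error
# gradient flows back through agent B *and* agent A via the chain rule.
lr = 0.02
for _ in range(500):
    h = W_a @ x                      # agent A's latent thought
    y = W_b @ h                      # agent B consumes it directly
    err = 2 * (y - target)           # d(loss)/dy for squared error
    W_b -= lr * np.outer(err, h)
    W_a -= lr * np.outer(W_b.T @ err, x)   # gradient chained through the loop

loss = float(np.sum((W_b @ (W_a @ x) - target) ** 2))
print(loss)  # shrinks toward zero as the agents co-adapt
```

Because every step of the chain stays in latent space, the gradient signal passes through smooth continuous transforms rather than discrete token choices, which is the intuition behind the stability result mentioned above.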
Performance and Efficiency Gains
RecursiveMAS was tested across nine benchmarks covering mathematics, science, medicine, search, and code generation. Compared to existing single-agent and multi-agent baselines, the framework delivered an average accuracy improvement of 8.3%. Beyond accuracy, the system proved to be highly efficient, achieving 1.2x to 2.4x faster inference speeds and reducing token usage by 34.6% to 75.6%. These results demonstrate that recursive latent collaboration is a powerful, scalable strategy for building more capable and efficient AI systems.