Multi-Agent Transactive Memory (MATM) is a framework designed to help decentralized populations of AI agents share knowledge and learn from one another. Currently, when different AI agents solve tasks, they often discard their interaction histories—the step-by-step "trajectories" of their work—once a task is finished. This forces new agents to repeatedly solve the same problems from scratch. MATM creates a shared, searchable repository where agents can contribute their successful experiences and retrieve relevant procedural knowledge from others, effectively turning individual efforts into a collective, growing library of expertise.
How the Framework Works
MATM functions like a two-sided marketplace for procedural knowledge. When an agent (a "producer") completes a task, its action-observation trajectory is broken into chunks and stored in a shared index. When another agent (a "consumer") faces a similar problem, it can query this index to find relevant steps taken by previous agents. The system uses a "state-conditioned" approach, meaning the agent searches for guidance based on its current situation rather than just the initial task description. This allows the agent to receive specific, actionable advice on how to proceed from its current point in a task.
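The producer and consumer flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Chunk`, `SharedIndex`, `contribute`, and `query` are hypothetical, the fixed `chunk_size` is an assumption, and a bag-of-words cosine similarity stands in for whatever retriever the framework actually uses. The point it shows is that the query is built from the consumer's current state as well as the task description.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field

Step = tuple[str, str]  # (action, observation)


@dataclass
class Chunk:
    producer_id: str
    task: str
    steps: list[Step]

    def text(self) -> str:
        # Flatten the chunk into one string so it can be matched against a query.
        return self.task + " " + " ".join(f"{a} {o}" for a, o in self.steps)


def tokenize(text: str) -> Counter:
    # Assumption: a simple bag of lowercase word counts replaces a real embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class SharedIndex:
    chunks: list[Chunk] = field(default_factory=list)

    def contribute(self, producer_id: str, task: str,
                   trajectory: list[Step], chunk_size: int = 2) -> None:
        # Producer side: split a finished trajectory into fixed-size chunks.
        for i in range(0, len(trajectory), chunk_size):
            self.chunks.append(
                Chunk(producer_id, task, trajectory[i:i + chunk_size])
            )

    def query(self, task: str, current_state: str, k: int = 3) -> list[Chunk]:
        # Consumer side: the query is conditioned on the current state,
        # not only on the initial task description.
        q = tokenize(task + " " + current_state)
        ranked = sorted(self.chunks,
                        key=lambda c: cosine(q, tokenize(c.text())),
                        reverse=True)
        return ranked[:k]


index = SharedIndex()
index.contribute("agent_a", "put a clean mug on the shelf", [
    ("go to sink", "you see a dirty mug"),
    ("take mug", "you hold the mug"),
    ("clean mug with sink", "the mug is clean"),
    ("go to shelf", "you are at the shelf"),
])
for hit in index.query("put a clean cup on the shelf", "you hold the mug", k=1):
    print(hit.producer_id, hit.steps)
```

In a real deployment the lexical match would presumably be replaced by a learned retriever, and the chunk boundaries might follow task structure rather than a fixed size.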
Improving Retrieval with Learning to Rank
To ensure agents get the most helpful information, the researchers added a "Learning to Rank" (LTR) reranking stage. After an initial search retrieves a list of candidate trajectory chunks, the reranker scores them using a variety of features, such as the producer agent’s past performance, the consumer agent’s specific needs, and the similarity between the current task and the stored trajectory. The reranker is trained to prioritize trajectories with the highest "marginal utility", meaning those that actually improve the agent's success rate compared to working without help, so that the most effective solutions rise to the top.
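As a rough sketch of the reranking idea, the Python below trains a pointwise linear scorer on marginal-utility labels and uses it to reorder candidates. Everything here is an assumption made for illustration: the source does not specify the model family or the training procedure, the feature names (`producer_success_rate`, `consumer_skill`, `task_similarity`) are hypothetical stand-ins for the feature groups mentioned above, and the training numbers are invented toy values, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    chunk_id: str
    producer_success_rate: float  # hypothetical producer-side feature
    consumer_skill: float         # hypothetical consumer-side feature
    task_similarity: float        # hypothetical task-match feature

    def features(self) -> list[float]:
        # The leading 1.0 is a bias term.
        return [1.0, self.producer_success_rate,
                self.consumer_skill, self.task_similarity]


def marginal_utility(success_with: float, success_without: float) -> float:
    # Gain in success rate from using the chunk versus working without help.
    return success_with - success_without


def score(weights: list[float], cand: Candidate) -> float:
    return sum(w * x for w, x in zip(weights, cand.features()))


def train_linear_ranker(examples: list[tuple[Candidate, float]],
                        epochs: int = 500, lr: float = 0.1) -> list[float]:
    # Assumption: pointwise squared-error regression fitted with plain SGD.
    weights = [0.0] * 4
    for _ in range(epochs):
        for cand, target in examples:
            error = score(weights, cand) - target
            weights = [w - lr * error * x
                       for w, x in zip(weights, cand.features())]
    return weights


def rerank(candidates: list[Candidate], weights: list[float]) -> list[Candidate]:
    return sorted(candidates, key=lambda c: score(weights, c), reverse=True)


# Toy training data: each label is a success rate with the chunk minus a
# baseline success rate of 0.40 without it (all values invented).
training = [
    (Candidate("c1", 0.9, 0.5, 0.9), marginal_utility(0.83, 0.40)),
    (Candidate("c2", 0.8, 0.5, 0.2), marginal_utility(0.52, 0.40)),
    (Candidate("c3", 0.3, 0.5, 0.8), marginal_utility(0.61, 0.40)),
    (Candidate("c4", 0.2, 0.5, 0.1), marginal_utility(0.30, 0.40)),
]
weights = train_linear_ranker(training)
retrieved = [Candidate("low", 0.2, 0.5, 0.1), Candidate("high", 0.9, 0.5, 0.9)]
print([c.chunk_id for c in rerank(retrieved, weights)])
```

A pairwise or listwise objective would be a natural alternative to the pointwise loss used here; the sketch only shows how marginal utility can serve as the training target.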
Key Results
The researchers tested MATM in two complex, interactive environments: ALFWorld (household tasks) and WebArena (web navigation). The experiments showed that retrieving trajectories from the shared MATM repository consistently improved task performance. Furthermore, agents were able to complete tasks more efficiently, requiring fewer interaction steps to reach their goals. Importantly, these improvements were achieved without needing to coordinate the agents or perform expensive joint training, demonstrating that MATM can function as a scalable design pattern for open ecosystems where agents join and operate independently.
Considerations for Future Use
MATM is designed to grow organically as more agents interact with the repository, accumulating knowledge across diverse environments. Because the system allows for personalization—where the reranker learns which trajectories are most useful for specific types of agents—it can adapt to a heterogeneous population where different agents have varying capabilities. While the current research focuses on raw action-observation trajectories, the framework is flexible enough to support higher-level abstractions, such as shared skills or workflows, as the population’s collective memory continues to evolve.