Enhancing Multi-Agent Communication through Attention Steering with Context Relevance
Multi-agent systems powered by Large Language Models (LLMs) are highly effective at solving complex tasks through collaboration. However, as these systems interact, they generate massive amounts of conversation history. This creates a "lost-in-the-middle" effect where critical instructions and evidence are buried under irrelevant data, causing the models to hallucinate or lose focus. This paper introduces Agent-Radar, a training-free method designed to manage this context by dynamically steering an agent’s attention toward the most relevant information without deleting or compressing the original conversation history.
How Agent-Radar Works
Pruning or summarizing messages risks losing subtle but important details. Agent-Radar instead leaves the history intact and guides the model's focus within it. It evaluates every sentence in the communication history based on three criteria:
Semantic Relevance: It uses a sentence encoder to measure how closely a piece of information relates to the agent's current goal.
Spatial Decay: Based on organizational network theory, it prioritizes messages from agents that are "closer" in the communication graph, assuming that direct collaborators provide more actionable evidence than distant ones.
Temporal Decay: It recognizes that older messages often become outdated as the conversation evolves, so it gives higher priority to more recent contributions.
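The two decay signals above can be illustrated with a short sketch. The summary does not give the functional form Agent-Radar uses, so the exponential decay, the rate parameters, and the function names here are assumptions chosen for illustration only. Spatial distance is taken to be the hop count between two agents in the communication graph, and temporal distance the number of rounds since the message was sent.

```python
import math
from collections import deque


def hop_distance(graph, source, target):
    """Shortest-path length in edges between two agents, or None if unreachable.

    `graph` maps each agent name to the agents it communicates with directly.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def spatial_decay(hops, rate=0.5):
    """Weight in (0, 1]: closer agents weigh more. Exponential form is an assumption."""
    if hops is None:
        return 0.0  # unreachable agents contribute nothing in this sketch
    return math.exp(-rate * hops)


def temporal_decay(current_round, message_round, rate=0.3):
    """Weight in (0, 1]: recent messages weigh more. Exponential form is an assumption."""
    age = max(0, current_round - message_round)
    return math.exp(-rate * age)


# Hypothetical three-agent chain: planner <-> solver <-> critic
graph = {"planner": ["solver"], "solver": ["planner", "critic"], "critic": ["solver"]}
direct = spatial_decay(hop_distance(graph, "planner", "solver"))
distant = spatial_decay(hop_distance(graph, "planner", "critic"))
```

In this sketch a message from the solver reaches the planner with a higher weight than one from the critic, which matches the intuition that direct collaborators provide more actionable evidence.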
By combining these three signals, the system assigns a relevance score to each sentence. Only the most important content is highlighted for the agent during the generation process, ensuring the model remains grounded in the current task.
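A minimal sketch of the combination step follows. How Agent-Radar actually merges the three signals and how many sentences it highlights are not stated in this summary, so the product rule, the `top_k` cutoff, and all names below are hypothetical. The semantic value stands in for a similarity score from a sentence encoder and is supplied directly rather than computed.

```python
from dataclasses import dataclass


@dataclass
class ScoredSentence:
    text: str
    semantic: float  # similarity to the agent's current goal (assumed precomputed)
    spatial: float   # weight from distance in the communication graph
    temporal: float  # weight from message age

    @property
    def score(self):
        # Assumption: the three signals are multiplied; the paper may combine them differently.
        return self.semantic * self.spatial * self.temporal


def select_highlights(sentences, top_k=2):
    """Return the top_k highest-scoring sentences, kept in their original order.

    Nothing is removed from the history; this only marks what to emphasize.
    """
    ranked = sorted(sentences, key=lambda s: s.score, reverse=True)
    chosen = {id(s) for s in ranked[:top_k]}
    return [s for s in sentences if id(s) in chosen]


# Hypothetical history with made-up signal values
history = [
    ScoredSentence("The capital is Paris.", semantic=0.9, spatial=1.0, temporal=0.8),
    ScoredSentence("I like the weather today.", semantic=0.1, spatial=1.0, temporal=1.0),
    ScoredSentence("Earlier guess: Lyon.", semantic=0.8, spatial=0.5, temporal=0.4),
]
highlights = select_highlights(history, top_k=2)
```

The off-topic sentence scores low despite being recent and from a direct neighbor, because a product lets any one weak signal suppress the total. A weighted sum would behave differently, which is why the combination rule is flagged as an assumption.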
Performance and Results
The researchers tested Agent-Radar across five benchmarks covering question answering, mathematical reasoning, and general reasoning. The results show that Agent-Radar consistently outperforms existing context management methods, such as pruning or compression, with performance gains of up to 7.64 absolute points.
The method also carries over across frameworks. It was integrated into three popular multi-agent frameworks (GPTSwarm, AutoGen, and Multi-Agent Debate) and improved performance in all of them, with gains of up to 12.87 points, for example with GPTSwarm on the MuSiQue benchmark.
Robustness and Scalability
A significant challenge in multi-agent systems is that performance often drops as the number of agents or interaction rounds increases, because the conversation becomes cluttered with redundant or circular feedback. The analysis demonstrates that Agent-Radar remains effective even as systems scale. Because the method adjusts attention dynamically based on the current state of the conversation, it helps agents avoid context dilution and keeps the collaboration focused on the primary objective, even in complex, long-running interactions.