LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents
Long-horizon search agents often struggle as they perform complex tasks. As these agents reason, use tools, and gather information, their "working memory" becomes cluttered with redundant or irrelevant data. This accumulation increases costs, slows down performance, and raises the risk of errors or hallucinations. This paper introduces LongSeeker, an agent designed to manage its own context dynamically. By using a new paradigm called Context-ReAct, the agent can actively reshape its memory, keeping only what is necessary to solve a task effectively.
A New Way to Manage Memory
The core of this research is the Context-ReAct paradigm, which allows an agent to treat its memory as an elastic, adjustable space. Instead of simply recording every step of a search, the agent is trained to perform specific "meta-operations" alongside its reasoning and tool use. These operations allow the agent to decide in real-time how to organize its history. By co-generating these decisions with its standard reasoning, the agent learns to maintain a high-quality, relevant context throughout the entire duration of a long-horizon task.
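The co-generation idea can be sketched in a few lines. This is a minimal, hypothetical illustration assuming a policy that emits its reasoning, the next tool action, and a memory meta-operation in one generation; `DummyAgent` and the encoding of an operation as a plain function over the history are assumptions for illustration, not the paper's API.

```python
def context_react_step(agent, history, observation):
    # One generation yields the reasoning, the next tool action,
    # and a context operation (co-generation).
    thought, action, ctx_op = agent.generate(history, observation)
    history = ctx_op(history)  # reshape memory before recording the step
    return history + [(thought, action, observation)], action

class DummyAgent:
    """Stand-in policy: drops the oldest step once the history exceeds 2."""
    def generate(self, history, observation):
        if len(history) > 2:
            ctx_op = lambda h: h[1:]   # acts like a "delete oldest" op
        else:
            ctx_op = lambda h: h       # acts like a "skip" op
        return "reasoning...", "search(query)", ctx_op
```

The key design point is that the memory operation is part of the agent's output, not a separate background process, so the policy learns when to reshape its context as part of ordinary decision-making.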
The Five Atomic Operations
To provide precise control over memory, the researchers developed five specific actions that the agent can perform on its own history:
Skip: Keeps the current context as is when it is already efficient.
Compress: Summarizes a range of past steps into a concise abstract, helping to clear out clutter.
Rollback: Abandons a failed or unproductive line of reasoning and returns to an earlier, more promising state.
Snippet: Extracts a specific, exact piece of information (like a number or code) to ensure accuracy without needing to summarize or rewrite it.
Delete: Completely removes a step that provides no value, reducing noise.
The researchers proved that these operations are "expressively complete," meaning that any context history can be transformed into any target state by composing them.
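The five operations can be sketched as simple transformations over a list of steps. This is an illustrative sketch, not the paper's implementation; the `Step` record and the tag prefixes are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Step:
    content: str

def skip(history):
    """Skip: leave the context unchanged."""
    return list(history)

def compress(history, start, end, summary):
    """Compress: replace steps [start, end) with one summary step."""
    return history[:start] + [Step(f"[summary] {summary}")] + history[end:]

def rollback(history, checkpoint):
    """Rollback: discard everything after an earlier, more promising state."""
    return history[:checkpoint]

def snippet(history, index, exact_text):
    """Snippet: keep only an exact extract (a number, a code fragment)."""
    new = list(history)
    new[index] = Step(f"[snippet] {exact_text}")
    return new

def delete(history, index):
    """Delete: remove a step that provides no value."""
    return history[:index] + history[index + 1:]
```

Expressive completeness is intuitive from this view: Delete and Snippet can strip or pin individual steps, Compress can merge any contiguous range, and Rollback can truncate, so compositions of these reach any target history.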
Performance and Results
The team developed LongSeeker by fine-tuning a Qwen3-30B-A3B model on 10,000 synthesized search trajectories. When tested against several benchmarks, including BrowseComp and BrowseComp-ZH, LongSeeker demonstrated significant improvements over existing search agents. For example, it achieved a 61.5% score on BrowseComp, notably outperforming other models like Tongyi DeepResearch and AgentFold. These results suggest that by actively managing its own memory, an agent can achieve more reliable and efficient reasoning, moving context management from a background task to a central part of how AI agents think.