The paper "ToolChoiceConfusion: Causal Minimal Tool Filtering for Reliable LLM Agents" addresses a critical challenge in AI agent development: how to give large language models (LLMs) the right tools at the right time. Modern agents can perform complex tasks by calling external tools, but they often struggle when presented with too many options. The paper argues that showing an agent all available tools, or even just those that seem semantically related to a request, can lead to errors such as choosing the wrong tool or acting prematurely.
The Problem: ToolChoiceConfusion
As tool libraries grow, agents face a "ToolChoiceConfusion" failure mode. This occurs when an agent is exposed to tools that are relevant to the overall goal but are not actually useful at the current step. For example, if an agent needs to update a calendar event, it might be distracted by tools for creating or deleting events. These tools are related to the calendar, but they are not the correct next step if the agent first needs to search for an existing event ID. Exposing these "plausible but premature" tools increases the likelihood of errors, wastes tokens, and can lead to inefficient or failed task completion.
How CMTF Works
The authors propose Causal Minimal Tool Filtering (CMTF), a training-free method that filters tools based on "causal sufficiency" rather than on keyword or semantic similarity alone. Each tool is assigned a lightweight contract consisting of its preconditions (what must be true before use) and its effects (what state changes occur after use).
CMTF uses these contracts to build a dependency graph of the task. Instead of showing the agent every possible tool, it calculates the minimal path required to reach the user's goal and exposes only the single, necessary tool for the immediate next step. By limiting the "visible tool frontier" to only what is causally required, the agent is shielded from distracting or premature options.
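The idea can be illustrated with a minimal Python sketch built on the calendar example above. This is not the authors' implementation: the ToolContract class, the fact names such as "event_id_known", and the use of breadth-first search to find the shortest plan are all assumptions made for illustration, since the summary does not specify how the minimal path is computed.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical contract: facts required before a call, facts true after it."""
    name: str
    preconditions: frozenset[str]
    effects: frozenset[str]


def minimal_plan(tools, state, goal):
    """Shortest tool sequence from `state` to `goal`, or None if unreachable.

    Breadth-first search over sets of facts (an assumed planning strategy).
    """
    start, goal = frozenset(state), frozenset(goal)
    if goal <= start:
        return []
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        facts, plan = queue.popleft()
        for tool in tools:
            if not tool.preconditions <= facts:
                continue  # tool is premature in this state
            next_facts = facts | tool.effects
            if next_facts in seen:
                continue
            if goal <= next_facts:
                return plan + [tool.name]
            seen.add(next_facts)
            queue.append((next_facts, plan + [tool.name]))
    return None


def visible_frontier(tools, state, goal):
    """Expose only the first tool of the minimal plan (empty if none is needed)."""
    plan = minimal_plan(tools, state, goal)
    return plan[:1] if plan else []


# Hypothetical calendar tools and fact names.
TOOLS = [
    ToolContract("search_events", frozenset(), frozenset({"event_id_known"})),
    ToolContract("update_event", frozenset({"event_id_known"}), frozenset({"event_updated"})),
    ToolContract("create_event", frozenset(), frozenset({"event_created"})),
    ToolContract("delete_event", frozenset({"event_id_known"}), frozenset({"event_deleted"})),
]
GOAL = {"event_updated"}

print(visible_frontier(TOOLS, set(), GOAL))               # ['search_events']
print(visible_frontier(TOOLS, {"event_id_known"}, GOAL))  # ['update_event']
```

In this sketch the agent initially sees only search_events, because update_event cannot run until an event ID is known, and create_event and delete_event never appear because they do not lie on the shortest path to the goal.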
Key Findings
The researchers tested CMTF using a benchmark of 102 tasks and 100 tools across four different LLM backends. The results showed that CMTF significantly improves efficiency and reliability:
Reduced Complexity: The method successfully reduced the number of visible tools from 100 down to just one per step.
Cost Efficiency: By narrowing the tool menu, CMTF reduced token usage by approximately 90% compared to exposing all tools.
Performance: CMTF matched the strongest causal baselines in terms of overall task success while simultaneously reducing the occurrence of wrong-tool calls and premature actions.
Important Considerations
The study uses a controlled, synthetic benchmark to isolate tool-selection behavior from the variability of real-world APIs. Because the environment uses mocked, deterministic outputs, the findings focus specifically on how tool-exposure strategies influence agent decision-making. The authors note that this approach is designed to complement existing agent systems; it acts as a filter to improve the quality of the "menu" provided to the model, rather than replacing the model's own reasoning capabilities.