Learning to Adapt: Self-Improving Web Agent via Cog...

Learning to Adapt: Self-Improving Web Agent via Cognitive-Aware Exploration introduces a new framework designed to make web-browsing AI agents more autonomous and adaptable. Current web agents often rely on manually written instructions or expensive, human-created examples to learn how to navigate websites. This paper proposes a way for agents to learn on their own by actively exploring the web, identifying their own knowledge gaps, and systematically improving their reasoning without needing constant human supervision.

The SCALE Framework

The core of the approach is the SCALE (Self-Cognitive-Aware Learning and Exploration) framework, which assigns three distinct roles to a single AI model: the Selector, the Predictor, and the Judger. The Selector identifies potentially confusing or unfamiliar elements on a webpage and proposes an action. The Predictor then predicts what will happen if that action is taken. Finally, the Judger compares the prediction to the actual outcome. If the prediction is wrong, the agent has hit a "cognitive boundary", a limit in its current understanding. By focusing on these failures, the agent can update its knowledge and learn from its mistakes in a continuous, self-improving loop.
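The Selector, Predictor, and Judger loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the three roles are passed in as plain functions (in the paper a single model plays all three), pages are represented as short strings, and the names `explore`, `BoundaryCase`, and the toy site are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BoundaryCase:
    """One step where the predicted outcome did not match the real one."""
    page: str
    action: str
    predicted: str
    actual: str


def explore(
    page: str,
    selector: Callable[[str], str],
    predictor: Callable[[str, str], str],
    judger: Callable[[str, str], bool],
    environment: Callable[[str, str], str],
    steps: int,
) -> List[BoundaryCase]:
    """Run the select / predict / judge loop and collect mispredicted steps."""
    boundary_cases: List[BoundaryCase] = []
    for _ in range(steps):
        action = selector(page)              # Selector: propose an action
        predicted = predictor(page, action)  # Predictor: guess the outcome
        actual = environment(page, action)   # Take the action for real
        if not judger(predicted, actual):    # Judger: compare guess and outcome
            boundary_cases.append(BoundaryCase(page, action, predicted, actual))
        page = actual
    return boundary_cases


# Toy site (hypothetical): three pages linked in a cycle.
TRANSITIONS = {
    ("home", "click_menu"): "menu",
    ("menu", "click_cart"): "cart",
    ("cart", "click_home"): "home",
}
NEXT_ACTION = {"home": "click_menu", "menu": "click_cart", "cart": "click_home"}
# The toy agent holds one wrong belief: it expects "click_cart" to open checkout.
BELIEFS = {**TRANSITIONS, ("menu", "click_cart"): "checkout"}

cases = explore(
    "home",
    selector=lambda page: NEXT_ACTION[page],
    predictor=lambda page, action: BELIEFS[(page, action)],
    judger=lambda predicted, actual: predicted == actual,
    environment=lambda page, action: TRANSITIONS[(page, action)],
    steps=3,
)
```

In this toy run, `cases` holds a single entry for the `click_cart` step, the one place where the prediction and the real outcome disagree. Those collected cases are what a training step would then focus on.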

Global Planning with SCALE-Hop

To prevent agents from getting stuck in repetitive loops or focusing only on small, local areas of a website, the researchers introduced SCALE-Hop. This strategy treats the agent’s exploration history as a graph in which each node represents a specific state of a webpage. SCALE-Hop monitors how much of a site has been explored and uses a "verification-guided backtracking" mechanism. When the agent judges that it has exhausted its options in one area, this mechanism helps it navigate back to previously visited or unexplored sections, so that the agent develops a comprehensive, global understanding of the environment rather than learning only shallow, local behaviors.
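One way to picture the graph view of exploration history is the sketch below. It is a minimal illustration, not the paper's algorithm: the class `ExplorationGraph`, the coverage ratio, the breadth-first search for the nearest state with untried actions, and the reading of "verification" as checking each replayed step against the recorded transition are all assumptions made for this example.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Set


class ExplorationGraph:
    """Exploration history: page states as nodes, tried actions as edges."""

    def __init__(self) -> None:
        self.edges: Dict[str, Dict[str, str]] = {}  # state -> {action: next state}
        self.candidates: Dict[str, Set[str]] = {}   # state -> actions seen there

    def add_state(self, state: str, actions: List[str]) -> None:
        self.candidates.setdefault(state, set()).update(actions)
        self.edges.setdefault(state, {})

    def record(self, state: str, action: str, next_state: str) -> None:
        self.candidates.setdefault(state, set()).add(action)
        self.edges.setdefault(state, {})[action] = next_state

    def untried(self, state: str) -> Set[str]:
        return self.candidates.get(state, set()) - set(self.edges.get(state, {}))

    def coverage(self) -> float:
        """Fraction of known actions that have been tried at least once."""
        total = sum(len(actions) for actions in self.candidates.values())
        tried = sum(len(out) for out in self.edges.values())
        return tried / total if total else 1.0

    def path_to_frontier(self, start: str) -> Optional[List[str]]:
        """Breadth-first search for the nearest state with untried actions."""
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            state, path = queue.popleft()
            if self.untried(state):
                return path
            for action, nxt in sorted(self.edges.get(state, {}).items()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [action]))
        return None


def backtrack(
    graph: ExplorationGraph, start: str, step: Callable[[str, str], str]
) -> Optional[str]:
    """Replay recorded actions toward the frontier, verifying every step."""
    path = graph.path_to_frontier(start)
    if path is None:
        return None  # nothing left to explore from here
    state = start
    for action in path:
        expected = graph.edges[state][action]
        state = step(state, action)
        if state != expected:  # the site no longer behaves as recorded
            return None
    return state


# Toy history (hypothetical site): only "open_about" on the home page is untried.
SITE = {
    ("home", "open_list"): "list",
    ("list", "open_detail"): "detail",
    ("list", "go_home"): "home",
    ("detail", "go_list"): "list",
}
graph = ExplorationGraph()
graph.add_state("home", ["open_list", "open_about"])
for (state, action), next_state in SITE.items():
    graph.record(state, action, next_state)

route = graph.path_to_frontier("detail")
landed = backtrack(graph, "detail", lambda state, action: SITE[(state, action)])
```

Here the agent sits on the exhausted `detail` page, four of the five known actions have been tried, and the search routes it through `list` back to `home`, where one action is still unexplored. Returning `None` on a mismatch is one simple way to signal that the recorded map is stale and the route should be recomputed.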

SCALE-20k Dataset

To support the training of these agents, the authors created SCALE-20k, a large-scale dataset derived from the agent's own exploration traces across 19 real-world websites. This dataset contains over 25,000 items, including single-step actions, multi-step task trajectories, and page comprehension questions. By using this data, the researchers demonstrated that the framework significantly boosts the performance of existing Multimodal Large Language Models (MLLMs), such as InternVL2.5-8B and Qwen2.5-VL-7B, helping them become more effective at navigating complex and dynamic web environments.

Key Findings

The experimental results show that the SCALE framework allows agents to move beyond rigid, pre-defined task flows. By proactively seeking out challenging scenarios and learning from the resulting errors, the agents achieved significant improvements in task success rates. The study highlights that this self-driven approach is highly generalizable, meaning it can be applied to different AI models to help them adapt to novel websites more effectively than traditional methods that rely on static, human-provided data.
