Key Takeaways

  • Data science workflows are complex, iterative, and often require long-term planning.
  • Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science.
  • EvoDS is a self-evolving autonomous data science agent that learns to expand its skills and adaptively manage long-term context through agentic reinforcement learning.
  • Its two strategies, Autonomous Skill Acquisition (ASA) and Adaptive Context Compression (ACC), are orchestrated within a two-stage multi-agent training scheme, enabling EvoDS to improve autonomously over time.
  • On the theory side, the authors prove that EvoDS's hierarchical design reduces tool-selection error and that its optimization objective aligns with an information bottleneck principle, ensuring efficient context use.
Paper Abstract

Recent progress in Large Language Model (LLM) agents has enabled promising advances in automated data science. However, existing approaches remain fundamentally limited by their static action sets and lack of principled long-horizon context management, hindering their ability to accumulate reusable experience across tasks and operate reliably in multi-stage, iterative data science pipelines. To address these challenges, we introduce EvoDS, a self-evolving autonomous data science agent that learns to expand its skills and adaptively manage long-term context through agentic reinforcement learning. Specifically, EvoDS introduces two key strategies: (1) an Autonomous Skill Acquisition (ASA) mechanism, which enables agents to synthesize, validate, and reuse executable skills; and (2) an Adaptive Context Compression (ACC) strategy, which treats context management as a learned control problem rather than passive truncation. These strategies are orchestrated within a two-stage multi-agent training scheme, enabling EvoDS to autonomously improve over time. Theoretically, we prove that EvoDS's hierarchical design reduces tool-selection error, and its optimization objective aligns with an information bottleneck principle, ensuring efficient context use. Empirically, EvoDS outperforms state-of-the-art open-source data science agents by an average of 28.9% across four diverse benchmarks while eliminating out-of-token failures. Our code and data are available at this https URL.

EvoDS: Self-Evolving Autonomous Data Science Agent with Skill Learning and Context Management

Data science workflows are complex, iterative, and often require long-term planning. Existing AI agents can handle individual tasks such as writing code or running models, but they are limited by static action sets: they cannot easily reuse experience from past tasks or manage the large volume of information that long projects generate. EvoDS is a framework that addresses both problems with a self-evolving agent that autonomously acquires new skills and manages its own context, so that its performance improves over time.

A Hierarchical Approach to Teamwork

Rather than relying on a single, overburdened model, EvoDS uses a hierarchical multi-agent architecture. A "Manager Agent" acts as the central coordinator, overseeing the high-level goals of a project. It delegates specific subtasks (such as data cleaning, feature engineering, modeling, or visualization) to specialized sub-agents. Breaking a complex workflow into smaller, domain-specific pieces reduces the complexity of any single agent's task and keeps the execution context organized, which helps prevent the "lost-in-the-middle" reasoning errors that are common when a model works over a very long context.
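The delegation pattern described above can be sketched in a few lines of Python. This is only an illustration of a manager routing subtasks to domain-specific sub-agents; the names `SubAgent`, `ManagerAgent`, and `build_manager`, and the string-returning handlers, are hypothetical and are not taken from the EvoDS code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SubAgent:
    """Hypothetical sub-agent: a domain name plus a handler for subtasks."""
    name: str
    handle: Callable[[str], str]


@dataclass
class ManagerAgent:
    """Hypothetical coordinator that routes each subtask to one sub-agent."""
    sub_agents: Dict[str, SubAgent]
    log: List[str] = field(default_factory=list)

    def run(self, plan: List[Tuple[str, str]]) -> List[str]:
        results = []
        for domain, subtask in plan:
            agent = self.sub_agents.get(domain)
            if agent is None:
                raise KeyError(f"no sub-agent registered for domain {domain!r}")
            result = agent.handle(subtask)
            # The manager records only a short line per step, not the
            # sub-agent's full working context.
            self.log.append(f"{agent.name}: {result}")
            results.append(result)
        return results


def build_manager() -> ManagerAgent:
    # Placeholder handlers; a real sub-agent would call an LLM and tools.
    domains = ["cleaning", "features", "modeling", "visualization"]
    agents = {
        d: SubAgent(name=d, handle=lambda task, d=d: f"[{d}] done: {task}")
        for d in domains
    }
    return ManagerAgent(sub_agents=agents)


manager = build_manager()
plan = [("cleaning", "drop rows with missing target"),
        ("modeling", "fit a baseline model")]
for line in manager.run(plan):
    print(line)
```

In this sketch each sub-agent sees only its own subtask, which is one simple way to keep per-agent context small.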

Autonomous Skill Acquisition

A core innovation of EvoDS is its ability to expand its own capabilities through an Autonomous Skill Acquisition (ASA) mechanism. When a sub-agent encounters a task it doesn't have a pre-built tool for, it can synthesize a new, executable skill. This process involves four steps: synthesizing the code, verifying that it works, caching it in a repository, and finally adding it to the agent’s permanent toolkit once it has proven its utility through repeated successful use. This allows the agent to evolve and become more efficient as it encounters a wider variety of data science challenges.
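The four steps above (synthesize, verify, cache, promote) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `SkillLibrary` class, the test-case format, and the `PROMOTION_THRESHOLD` of three successful uses are hypothetical, since the summary does not say how EvoDS verifies skills or how many successes it requires.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

PROMOTION_THRESHOLD = 3  # assumed value; the source does not state one


@dataclass
class Skill:
    name: str
    source: str
    fn: Callable[..., Any]
    successes: int = 0


@dataclass
class SkillLibrary:
    cache: Dict[str, Skill] = field(default_factory=dict)    # verified, on trial
    toolkit: Dict[str, Skill] = field(default_factory=dict)  # promoted skills

    def synthesize(self, name: str, source: str) -> Skill:
        # Step 1: turn generated source into a callable.
        # A real system would run this in a sandbox, not a bare exec().
        namespace: Dict[str, Any] = {}
        exec(source, namespace)
        return Skill(name=name, source=source, fn=namespace[name])

    def verify(self, skill: Skill, cases: List[Tuple[tuple, Any]]) -> bool:
        # Step 2: the skill must reproduce every expected output.
        for args, expected in cases:
            try:
                if skill.fn(*args) != expected:
                    return False
            except Exception:
                return False
        return True

    def acquire(self, name: str, source: str,
                cases: List[Tuple[tuple, Any]]) -> bool:
        try:
            skill = self.synthesize(name, source)
        except Exception:
            return False
        if not self.verify(skill, cases):
            return False
        self.cache[name] = skill  # Step 3: cache the verified skill.
        return True

    def use(self, name: str, *args: Any) -> Any:
        skill = self.toolkit[name] if name in self.toolkit else self.cache[name]
        result = skill.fn(*args)  # an exception here is not counted as success
        skill.successes += 1
        # Step 4: promote after repeated successful use.
        if name in self.cache and skill.successes >= PROMOTION_THRESHOLD:
            self.toolkit[name] = self.cache.pop(name)
        return result
```

A caller would pass generated source text plus a few input/output checks to `acquire`, then call `use` as tasks arrive; the skill moves from `cache` to `toolkit` once it reaches the assumed threshold.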

Adaptive Context Compression

As data science projects progress, they generate large volumes of logs, code snippets, and data previews that can quickly fill a model's context window. EvoDS addresses this with an Adaptive Context Compression (ACC) strategy. Instead of simply deleting old information when the context is full, the agents use a learned control process to distill raw execution results into concise summaries. These summaries keep task-critical information, such as key outcomes or the root causes of errors, and discard irrelevant details. This lets the agent maintain a clear, relevant "memory" of the project without hitting token limits or losing track of the overall objective.
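To make the idea of summarizing rather than truncating concrete, here is a small Python sketch. It is deliberately not the EvoDS method: ACC is described as a learned policy, whereas this stand-in uses a fixed keyword heuristic. The `KEYWORDS` tuple, `compress_observation`, and `CompressedContext` are all hypothetical names chosen for illustration.

```python
from typing import List

# Assumed markers of "task-critical" lines; a learned policy would replace this.
KEYWORDS = ("error", "exception", "traceback", "accuracy", "rmse", "score")


def compress_observation(raw: str, max_lines: int = 3) -> str:
    """Keep only lines that look like outcomes or error causes."""
    lines = [ln.strip() for ln in raw.splitlines() if ln.strip()]
    critical = [ln for ln in lines if any(k in ln.lower() for k in KEYWORDS)]
    if not critical:
        # Nothing matched: fall back to the final line of the output.
        critical = lines[-1:]
    return "\n".join(critical[-max_lines:])


class CompressedContext:
    """Stores one short summary per step instead of raw execution output."""

    def __init__(self) -> None:
        self.entries: List[str] = []
        self.raw_chars = 0  # how much raw text has been seen so far

    def add(self, step: str, raw_output: str) -> None:
        self.raw_chars += len(raw_output)
        self.entries.append(f"{step}: {compress_observation(raw_output)}")

    def render(self) -> str:
        return "\n".join(self.entries)


ctx = CompressedContext()
ctx.add("load", "loading data\nrows: 1000\n"
                "Traceback (most recent call last):\nKeyError: 'price'")
ctx.add("train", "epoch 1\nepoch 2\nval accuracy: 0.91")
print(ctx.render())
print(f"raw chars: {ctx.raw_chars}, kept chars: {len(ctx.render())}")
```

The design point the sketch shares with the description is that old steps stay represented in the context by their summaries, so the cause of an earlier error remains visible later in the run.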

Performance and Results

EvoDS is trained using a two-stage process: first, it learns from expert demonstrations, and then it undergoes reinforcement learning to optimize its decision-making, skill usage, and memory management. In empirical tests across four diverse benchmarks, EvoDS outperformed state-of-the-art open-source data science agents by an average of 28.9%. Furthermore, the system demonstrated high reliability, successfully completing long-horizon tasks without suffering from the "out-of-token" failures that often plague other autonomous agents.
