MLEvolve: A Self-Evolving Framework for Automated M...

Key Takeaways

MLEvolve is a framework designed to help AI agents autonomously discover and optimize machine learning algorithms over long periods.
Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery and machine learning engineering (MLE), where sustained self-evolution becomes a key capability.
However, existing MLE agents suffer from inter-branch information isolation, memoryless search, and lack of hierarchical control, which together hinder long-horizon optimization.
We present MLEvolve, an LLM-based self-evolving multi-agent framework for end-to-end machine learning algorithm discovery.
For stable long-horizon iteration, we further decouple strategic planning from code generation with adaptive coding modes.

Paper Abstract

Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery and machine learning engineering (MLE), where sustained self-evolution becomes a key capability. However, existing MLE agents suffer from inter-branch information isolation, memoryless search, and lack of hierarchical control, which together hinder long-horizon optimization. We present MLEvolve, an LLM-based self-evolving multi-agent framework for end-to-end machine learning algorithm discovery. By extending tree search to Progressive MCGS, MLEvolve enables cross-branch information flow through graph-based reference edges and gradually shifts the search from broad exploration to focused exploitation with an entropy-inspired progressive schedule. To allow the agent to evolve with accumulated experience, we introduce Retrospective Memory, which combines a cold-start domain knowledge base with a dynamic global memory for task-specific experience retrieval and reuse. For stable long-horizon iteration, we further decouple strategic planning from code generation with adaptive coding modes. Evaluation on MLE-Bench shows that MLEvolve achieves state-of-the-art performance across multiple dimensions including average medal rate and valid submission rate under a 12-hour budget (half the standard runtime). Moreover, MLEvolve also outperforms specialized algorithm discovery methods including AlphaEvolve on mathematical algorithm optimization tasks, demonstrating strong cross-domain generalization. Our code is available at this https URL .

MLEvolve is a framework designed to help AI agents autonomously discover and optimize machine learning algorithms over long periods. While existing AI agents often struggle with "memoryless" search and isolated trial-and-error, MLEvolve introduces a multi-agent system that learns from its own history and shares insights across different search paths. By combining advanced search techniques with a structured memory system, it aims to automate the entire machine learning pipeline—from data preparation to model training—more efficiently than previous methods.

Improving Search with Progressive MCGS

Standard search methods often get stuck in local optima or waste time exploring unproductive paths. MLEvolve addresses this by using "Progressive Monte Carlo Graph Search" (MCGS). Unlike traditional tree-based searches that keep branches isolated, this approach uses a graph structure with reference edges so that information can flow between different branches. It also features a "progressive exploration schedule" that gradually shifts the agent's focus from broad, experimental exploration to deep, focused exploitation as the search progresses. The goal is for the agent to spend its limited time budget on the most promising solutions.
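To make the idea of a progressive schedule concrete, here is a minimal sketch. It is not MLEvolve's actual algorithm: the summary does not give the schedule's formula, so the sketch assumes a standard UCB-style selection rule whose exploration constant is linearly annealed over the run. The names `Node`, `exploration_weight`, `ucb_score`, `select_child`, and `add_reference`, as well as the constants `c_start` and `c_end`, are hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One candidate solution in the search graph (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    children: list[Node] = field(default_factory=list)
    # Reference edges link to nodes in other branches whose ideas were reused.
    references: list[Node] = field(default_factory=list)


def add_reference(node: Node, source: Node) -> None:
    """Record that `node` borrowed information from `source` in another branch."""
    if source is not node and source not in node.references:
        node.references.append(source)


def exploration_weight(step: int, total_steps: int,
                       c_start: float = 1.4, c_end: float = 0.1) -> float:
    """Assumed schedule: linearly anneal the exploration constant."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return c_start + (c_end - c_start) * progress


def ucb_score(child: Node, parent_visits: int, c: float) -> float:
    """Standard UCB1 score: mean reward plus a weighted exploration bonus."""
    if child.visits == 0:
        return math.inf  # always try unvisited children first
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_child(parent: Node, step: int, total_steps: int) -> Node:
    """Pick the child with the highest score under the current schedule."""
    c = exploration_weight(step, total_steps)
    return max(parent.children, key=lambda ch: ucb_score(ch, parent.visits, c))
```

With a large constant early in the run, rarely visited children win the selection; as the constant shrinks, the child with the best mean reward wins instead. That is the exploration-to-exploitation shift described above, under these simplified assumptions.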

Learning from Experience

To prevent the agent from repeating past mistakes, MLEvolve includes a "Retrospective Memory" system. This component acts as a library of experience, combining two types of data: a static domain knowledge base that provides a "cold-start" foundation of best practices, and a dynamic global memory that records the results of every attempt made during the search. By retrieving relevant past experiences before making new decisions, the agent can build upon previous successes and avoid strategies that have already proven ineffective.
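The two-part memory described above can be sketched as follows. This is an illustrative assumption, not the paper's implementation: the class `RetrospectiveMemory`, the `Experience` record, and the word-overlap ranking are all hypothetical stand-ins, and a real system would likely use a stronger retrieval method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Experience:
    description: str
    score: Optional[float]  # None marks cold-start knowledge with no run result


class RetrospectiveMemory:
    """Static knowledge base plus a dynamic log of attempts (hypothetical)."""

    def __init__(self, knowledge_base: List[str]) -> None:
        # Cold-start entries are available before any attempt has been made.
        self._static = [Experience(text, None) for text in knowledge_base]
        self._dynamic: List[Experience] = []

    def record(self, description: str, score: float) -> None:
        """Store the outcome of one attempt in the global memory."""
        self._dynamic.append(Experience(description, score))

    def retrieve(self, query: str, k: int = 3) -> List[Experience]:
        """Return up to k entries ranked by word overlap with the query."""
        query_tokens = set(query.lower().split())

        def overlap(entry: Experience) -> int:
            return len(query_tokens & set(entry.description.lower().split()))

        candidates = [e for e in self._static + self._dynamic if overlap(e) > 0]
        return sorted(candidates, key=overlap, reverse=True)[:k]
```

An agent using this sketch would call `retrieve` with a description of its next planned step and pass the returned entries, including their scores, into the planning prompt.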

Hierarchical Planning and Adaptive Coding

Many existing agents struggle because they try to rewrite entire codebases for every minor adjustment, which is inefficient and difficult to control. MLEvolve solves this by decoupling strategic planning from code generation. It uses a hierarchical approach that allows the agent to choose the most appropriate way to update its code—such as performing a full rewrite, a stepwise update, or a targeted "diff-based" edit—depending on the current state of the search. This makes the long-horizon optimization process more stable and precise.
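A small sketch can illustrate how a planner might pick among the three update styles named above. The selection rule here is an assumption made for illustration, since the summary does not state how MLEvolve chooses a mode; `CodingMode`, `choose_coding_mode`, and `apply_diff_edit` are hypothetical names.

```python
from enum import Enum


class CodingMode(Enum):
    FULL_REWRITE = "full_rewrite"
    STEPWISE_UPDATE = "stepwise_update"
    DIFF_EDIT = "diff_edit"


def choose_coding_mode(has_valid_solution: bool,
                       components_to_change: int) -> CodingMode:
    """Assumed heuristic: the smaller the planned change, the narrower the edit."""
    if not has_valid_solution:
        # Nothing runnable to build on, so regenerate the whole script.
        return CodingMode.FULL_REWRITE
    if components_to_change > 1:
        # Several parts change: update them one step at a time.
        return CodingMode.STEPWISE_UPDATE
    # A single localized change: patch only the affected snippet.
    return CodingMode.DIFF_EDIT


def apply_diff_edit(source: str, old: str, new: str) -> str:
    """Replace one snippet, refusing ambiguous or missing targets."""
    if source.count(old) != 1:
        raise ValueError("diff target must match exactly once")
    return source.replace(old, new)
```

Separating the mode decision from the code change itself mirrors the split between strategic planning and code generation: the planner decides how much to change, and the coder carries out only that scope.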

Performance and Results

When evaluated on MLE-Bench, MLEvolve improved on existing methods. Under a 12-hour time budget, which is half the standard runtime for such tasks, the framework achieved a 65.3% average medal rate, a new state of the art. The system also showed cross-domain capability, outperforming specialized algorithm discovery methods such as AlphaEvolve on mathematical algorithm optimization tasks. These results suggest that combining graph-based search with retrospective memory is effective for complex, long-horizon machine learning engineering tasks.