MLEvolve is a framework designed to help AI agents autonomously discover and optimize machine learning algorithms over long time horizons. Existing AI agents often struggle with "memoryless" search and isolated trial-and-error. MLEvolve instead introduces a multi-agent system that learns from its own history and shares insights across different search paths. By combining advanced search techniques with a structured memory system, it aims to automate the entire machine learning pipeline, from data preparation to model training, more efficiently than previous methods.
Improving Search with Progressive MCGS
Standard search methods often get stuck in local optima or waste time exploring unproductive paths. MLEvolve addresses this with "Progressive Monte Carlo Graph Search" (Progressive MCGS). Unlike traditional tree-based searches that keep branches isolated, this approach uses a graph structure so that information can flow between different branches. It also features a "progressive exploration schedule" that automatically shifts the agent’s focus from broad, experimental exploration to deep, focused exploitation as the search progresses. The goal is for the agent to spend its limited time budget on the most promising solutions.
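The shift from exploration to exploitation can be illustrated with a small sketch. It is not MLEvolve's actual schedule: the linear decay, the UCB-style score, the constants `c_start` and `c_end`, and all function names are assumptions chosen for illustration, and the graph structure itself is left out.

```python
import math

def exploration_weight(progress, c_start=1.4, c_end=0.1):
    """Linearly anneal the exploration constant (assumed schedule).

    progress is the fraction of the time budget already used, in [0, 1].
    """
    progress = min(max(progress, 0.0), 1.0)
    return c_start + (c_end - c_start) * progress

def ucb_score(mean_reward, visits, parent_visits, c):
    """Standard UCB1-style score: exploitation term plus exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited nodes are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits, progress):
    """Pick the child with the highest score under the current schedule."""
    c = exploration_weight(progress)
    return max(
        children,
        key=lambda ch: ucb_score(ch["mean"], ch["visits"], parent_visits, c),
    )

# Hypothetical candidates: A is well explored and strong, B is barely explored.
children = [
    {"name": "A", "mean": 0.85, "visits": 20},
    {"name": "B", "mean": 0.60, "visits": 2},
]
early = select_child(children, parent_visits=22, progress=0.0)
late = select_child(children, parent_visits=22, progress=1.0)
print(early["name"], late["name"])
```

With these made-up numbers, the large early exploration constant favors the rarely visited candidate, while the small late constant favors the candidate with the better average result.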
Learning from Experience
To prevent the agent from repeating past mistakes, MLEvolve includes a "Retrospective Memory" system. This component acts as a library of experience, combining two types of data: a static domain knowledge base that provides a "cold-start" foundation of best practices, and a dynamic global memory that records the results of every attempt made during the search. By retrieving relevant past experiences before making new decisions, the agent can build upon previous successes and avoid strategies that have already proven ineffective.
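A minimal sketch of such a memory is shown below, assuming a single store that holds both static knowledge and recorded attempts. The class name, fields, and the word-overlap (Jaccard) retrieval are hypothetical stand-ins; the source does not say how MLEvolve stores or retrieves entries.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class MemoryEntry:
    description: str
    score: Optional[float] = None  # None for static knowledge entries
    source: str = "static"         # "static" or "dynamic"

def _tokens(text: str) -> set:
    return set(text.lower().split())

def _jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0

class RetrospectiveMemory:
    """Toy memory: static best practices plus a log of past attempts."""

    def __init__(self, static_knowledge: List[str]):
        # Cold-start foundation: entries available before any attempt is made.
        self.entries = [MemoryEntry(text) for text in static_knowledge]

    def record(self, description: str, score: float) -> None:
        # Dynamic global memory: store the outcome of an attempt.
        self.entries.append(MemoryEntry(description, score, "dynamic"))

    def retrieve(self, query: str, k: int = 3) -> List[MemoryEntry]:
        # Rank all entries by word overlap with the query (assumed metric).
        q = _tokens(query)
        ranked = sorted(
            self.entries,
            key=lambda e: _jaccard(q, _tokens(e.description)),
            reverse=True,
        )
        return ranked[:k]

memory = RetrospectiveMemory(["use stratified k-fold for imbalanced classification"])
memory.record("gradient boosting with target encoding on tabular data", 0.81)
memory.record("deep cnn on tabular data overfit", 0.55)
for entry in memory.retrieve("tabular data gradient boosting", k=2):
    print(entry.source, entry.score, entry.description)
```

A real system would more likely use embedding-based similarity; word overlap is used here only to keep the example dependency-free.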
Hierarchical Planning and Adaptive Coding
Many existing agents struggle because they try to rewrite entire codebases for every minor adjustment, which is inefficient and difficult to control. MLEvolve addresses this by decoupling strategic planning from code generation. Its hierarchical approach lets the agent choose the most appropriate way to update its code (a full rewrite, a stepwise update, or a targeted "diff-based" edit) depending on the current state of the search. This makes the long-horizon optimization process more stable and precise.
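The separation between choosing an edit mode and applying it can be sketched as follows. The selection rules, the `change_scope` values, and the search-and-replace form of the diff edit are illustrative assumptions, not the rules MLEvolve uses.

```python
from enum import Enum

class EditMode(Enum):
    FULL_REWRITE = "full_rewrite"
    STEPWISE = "stepwise"
    DIFF = "diff"

def choose_edit_mode(has_working_solution, is_buggy, change_scope):
    """Hypothetical planner rule for picking an edit mode.

    change_scope is an assumed label: "local" for a small targeted change,
    anything else for a change touching a larger part of the pipeline.
    """
    if not has_working_solution:
        return EditMode.FULL_REWRITE  # nothing to build on yet
    if is_buggy or change_scope == "local":
        return EditMode.DIFF          # small, targeted fix
    return EditMode.STEPWISE          # update one pipeline stage at a time

def apply_diff_edit(code, search, replace):
    """Apply one search/replace edit; refuse ambiguous or missing matches."""
    if code.count(search) != 1:
        raise ValueError("search block must match exactly once")
    return code.replace(search, replace)

script = "lr = 0.1\nepochs = 5\n"
mode = choose_edit_mode(has_working_solution=True, is_buggy=False, change_scope="local")
if mode is EditMode.DIFF:
    script = apply_diff_edit(script, "lr = 0.1", "lr = 0.01")
print(mode.value)
print(script)
```

Requiring the search text to match exactly once is one simple way to keep targeted edits controllable: an edit that cannot be placed unambiguously fails loudly instead of silently changing the wrong line.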
Performance and Results
When tested on the MLE-Bench benchmark, MLEvolve demonstrated significant improvements over existing methods. Under a 12-hour time budget, which is half the standard runtime for such tasks, the framework achieved a 65.3% average medal rate, setting a new state of the art. The system also showed strong cross-domain capabilities, outperforming specialized algorithm discovery methods like AlphaEvolve on mathematical optimization tasks. These results suggest that the combination of graph-based search and retrospective memory is highly effective for complex, long-horizon machine learning engineering tasks.