NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation

Key Takeaways

LLM-powered multi-agent systems can now automate the full research pipeline from ideation to paper writing, but a fundamental question remains: automation for whom?
Researchers operate under different resource configurations, hold different methodological preferences, and target different output formats.
A system that produces uniform outputs regardless of these differences will systematically under-serve every individual user, making personalization a precondition for research automation to be genuinely usable.
We propose NanoResearch, a multi-agent framework that addresses these gaps through tri-level co-evolution.

Paper Abstract

LLM-powered multi-agent systems can now automate the full research pipeline from ideation to paper writing, but a fundamental question remains: automation for whom? Researchers operate under different resource configurations, hold different methodological preferences, and target different output formats. A system that produces uniform outputs regardless of these differences will systematically under-serve every individual user, making personalization a precondition for research automation to be genuinely usable. However, achieving it requires three capabilities that current systems lack: accumulating reusable procedural knowledge across projects, retaining user-specific experience across sessions, and internalizing implicit preferences that resist explicit formalization. We propose NanoResearch, a multi-agent framework that addresses these gaps through tri-level co-evolution. A skill bank distills recurring operations into compact procedural rules reusable across projects. A memory module maintains user- and project-specific experience that grounds planning decisions in each user's research history. A label-free policy learning mechanism converts free-form feedback into persistent parameter updates of the planner, reshaping subsequent coordination. These three layers co-evolve: reliable skills produce richer memory, richer memory informs better planning, and preference internalization continuously realigns the loop to each user. Extensive experiments demonstrate that NanoResearch delivers substantial gains over state-of-the-art AI research systems, and progressively refines itself to produce better research at lower cost over successive cycles.

NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation
Current AI-powered research systems can automate the entire scientific pipeline, from brainstorming ideas to writing papers. However, these systems often treat all users the same, ignoring individual research preferences, resource constraints, and methodological styles. NanoResearch addresses this by introducing a framework that learns and adapts to each specific researcher over time. By combining reusable procedural knowledge, long-term memory, and a policy that learns from feedback, the system evolves to become more effective and personalized the longer it works with a user.

The Three Pillars of Personalization

NanoResearch functions through a "tri-level co-evolution" process designed to bridge the gap between generic automation and personalized research:

Skill Bank: This module distills recurring research operations—such as specific debugging patterns or experimental setups—into compact, reusable rules. This ensures that the system does not have to "re-learn" how to solve common problems in every new project.
Memory Module: Unlike systems that only store session logs, this module maintains both user-specific and project-specific records. It grounds future planning decisions in the user’s actual research history, ensuring that the system remembers past successes, failures, and constraints.
Policy Learning: To handle nuanced preferences that are difficult to write down as rules, the system uses a label-free policy learning mechanism. It converts free-form user feedback into persistent updates to the system’s planner, allowing the AI to internalize a user’s unique style and research philosophy over time.
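
To make the first two pillars more concrete, here is a minimal Python sketch of what a skill bank and a two-scope memory could look like as data structures. It is an illustration only, not the paper's implementation: the class names (Skill, SkillBank, Memory), their fields, and the keyword-based matching are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Skill:
    """A compact procedural rule (hypothetical schema, not from the paper)."""
    name: str
    trigger: str  # phrase that signals when the rule applies
    rule: str     # short instruction reusable across projects


@dataclass
class SkillBank:
    skills: dict[str, Skill] = field(default_factory=dict)

    def add(self, skill: Skill) -> None:
        # Re-adding a skill under the same name overwrites the older rule.
        self.skills[skill.name] = skill

    def match(self, task: str) -> list[Skill]:
        # Naive substring matching stands in for whatever retrieval
        # a real system would use (assumption).
        text = task.lower()
        return [s for s in self.skills.values() if s.trigger.lower() in text]


@dataclass
class Memory:
    """Experience records kept at user scope and at project scope."""
    user_notes: dict[str, list[str]] = field(default_factory=dict)
    project_notes: dict[tuple[str, str], list[str]] = field(default_factory=dict)

    def remember(self, user: str, note: str, project: Optional[str] = None) -> None:
        if project is None:
            self.user_notes.setdefault(user, []).append(note)
        else:
            self.project_notes.setdefault((user, project), []).append(note)

    def recall(self, user: str, project: str) -> list[str]:
        # User-level notes come first, then notes specific to this project.
        return self.user_notes.get(user, []) + self.project_notes.get((user, project), [])


# Usage: one distilled debugging rule and two remembered facts.
bank = SkillBank()
bank.add(Skill("fix_oom", "out of memory", "Halve the batch size and retry."))
memory = Memory()
memory.remember("alice", "prefers small models")
memory.remember("alice", "baseline diverged at lr=1e-2", project="p1")
print([s.rule for s in bank.match("Training crashed: CUDA out of memory")])
print(memory.recall("alice", "p1"))
```

Keeping user-level and project-level notes in separate maps mirrors the distinction the summary draws between user-specific and project-specific records: a new project for the same user starts with the user's history but none of another project's details.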

How the System Evolves

The framework operates as a continuous loop where these three components support one another. Reliable skills lead to more successful experiments, which in turn populate the memory with richer, more useful data. This improved memory allows the system to create better research plans, while the policy learning mechanism ensures that the entire process realigns with the user’s intent after every cycle. By treating personalization as a core requirement rather than an add-on, NanoResearch allows the system to produce higher-quality research at a lower cost as it gains experience with a specific researcher.
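
The loop described above can be sketched as a toy simulation. Everything in it is an assumption made for illustration: the Planner class, its per-option weights, the keyword-based reading of feedback, and the run_cycle function are hypothetical stand-ins and do not reproduce the label-free policy learning method used in NanoResearch. The point is only to show the shape of the cycle: plan, record the outcome in memory, update the planner from free-form feedback, and promote approaches that work into reusable skills.

```python
from dataclasses import dataclass, field

# Words treated as negative feedback in this toy example (assumption).
NEGATIVE_WORDS = {"too", "avoid", "never", "not"}


@dataclass
class Planner:
    """Toy planner whose 'parameters' are per-option preference weights."""
    weights: dict[str, float] = field(default_factory=dict)

    def choose(self, options: list[str]) -> str:
        # Highest weight wins; on a tie, max() keeps the earliest option.
        return max(options, key=lambda o: self.weights.get(o, 0.0))

    def update(self, option: str, feedback: str, step: float = 1.0) -> None:
        # Crude stand-in for policy learning: free-form text is reduced
        # to a +1/-1 signal by spotting negative words.
        words = set(feedback.lower().split())
        signal = -1.0 if words & NEGATIVE_WORDS else 1.0
        self.weights[option] = self.weights.get(option, 0.0) + step * signal


def run_cycle(planner, skills, memory, options, get_feedback):
    """One research cycle: plan, record, learn from feedback, distill."""
    choice = planner.choose(options)      # planning uses current parameters
    memory.append(f"tried {choice}")      # outcome is stored as experience
    planner.update(choice, get_feedback(choice))  # feedback persists in weights
    if planner.weights[choice] > 0 and choice not in skills:
        skills.append(choice)             # approaches that work become skills
    return choice


# Usage: a user with a small compute budget gives free-form feedback.
def user_feedback(choice: str) -> str:
    if choice == "large_model":
        return "too expensive for my GPU budget"
    return "good, keep this"


planner, skills, memory = Planner(), [], []
options = ["large_model", "small_model"]
history = [run_cycle(planner, skills, memory, options, user_feedback) for _ in range(3)]
print(history)
```

In this sketch the planner tries the first option once, is pushed away from it by the negative feedback, and then settles on the option the user accepts, which is the kind of realignment after each cycle that the summary describes.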

Performance and Results

To test the framework, the researchers evaluated NanoResearch across 20 different research topics spanning seven domains, including machine learning, computer vision, and time-series analysis. The system was compared against several state-of-the-art automated research frameworks. The results showed that NanoResearch consistently outperformed existing systems in terms of output quality, adherence to user requirements, and overall research effectiveness. Furthermore, the system demonstrated a clear trend of improvement, becoming more efficient and accurate over successive research cycles, confirming that its self-evolving design successfully adapts to individual needs.
