SkillOS is a research framework designed to help AI agents evolve by learning how to curate their own "skills." Many current AI agents solve each task in isolation and so struggle to learn from past experience. SkillOS addresses this by letting an agent store, update, and delete reusable skills, represented as Markdown files, so that it becomes more efficient and capable over time as it encounters new, related tasks.
A Modular Approach to Self-Evolution
The SkillOS framework splits the agent’s responsibilities into two distinct parts: a "frozen" executor and a trainable curator. The executor is responsible for solving tasks using a collection of skills stored in an external repository (SkillRepo). Once a task is finished, the curator reviews the agent's performance and decides how to manage the repository. It can insert new skills, update existing ones, or delete those that are no longer useful. By keeping the executor fixed and training only the curator, the system ensures that the agent’s ability to manage its own knowledge base improves independently of its core problem-solving logic.
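The three curator operations described above can be illustrated with a small in-memory repository. This is only a sketch: the `CuratorOp` schema, the `apply` method, and the choice to key skills by name are assumptions made for illustration, not the framework's actual interface.

```python
from dataclasses import dataclass, field
from typing import Dict, Literal


@dataclass
class CuratorOp:
    """One repository edit proposed by the curator (hypothetical schema)."""
    action: Literal["insert", "update", "delete"]
    name: str
    markdown: str = ""  # ignored for "delete"


@dataclass
class SkillRepo:
    """In-memory stand-in for SkillRepo: skill name -> Markdown text."""
    skills: Dict[str, str] = field(default_factory=dict)

    def apply(self, op: CuratorOp) -> None:
        if op.action == "insert":
            if op.name in self.skills:
                raise KeyError(f"skill already exists: {op.name}")
            self.skills[op.name] = op.markdown
        elif op.action == "update":
            if op.name not in self.skills:
                raise KeyError(f"no such skill: {op.name}")
            self.skills[op.name] = op.markdown
        elif op.action == "delete":
            # Assumption: deleting a missing skill is treated as a no-op.
            self.skills.pop(op.name, None)
        else:
            raise ValueError(f"unknown action: {op.action}")


repo = SkillRepo()
repo.apply(CuratorOp("insert", "factor-quadratics", "# Factor quadratics\nTry integer roots."))
repo.apply(CuratorOp("update", "factor-quadratics", "# Factor quadratics\nCheck the discriminant first."))
repo.apply(CuratorOp("insert", "stale-skill", "# Stale\n"))
repo.apply(CuratorOp("delete", "stale-skill"))
print(sorted(repo.skills))
```

In this sketch the executor would only ever read `repo.skills`, while the curator is the sole writer, which mirrors the split between the frozen executor and the trainable curator.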
Learning Through Experience
To teach the curator how to manage skills effectively, the researchers use a reinforcement learning approach that mimics real-world streaming tasks. Instead of training on random, disconnected problems, the system organizes tasks into groups based on their underlying dependencies. For example, if an agent learns a specific mathematical technique in an early task, the system evaluates the curator based on whether that skill helps the agent solve a more complex, related problem later on. This "grouped" training provides the curator with clear feedback, helping it understand the long-term value of the skills it chooses to keep or refine.
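The idea of scoring the curator on later, related tasks can be sketched as follows. The reward used here (success rate on every task after the first in a group) is an assumption chosen for simplicity and is not necessarily the reward the researchers use; `run_group`, `toy_execute`, and the two toy curators are hypothetical names.

```python
from typing import Callable, Dict, List

Repo = Dict[str, str]  # skill name -> Markdown text


def run_group(
    tasks: List[str],
    execute: Callable[[str, Repo], bool],
    curate: Callable[[str, bool, Repo], Repo],
) -> float:
    """Run one group of related tasks in order and return a curator reward.

    Assumption: the reward is the success rate on every task after the
    first, i.e. the tasks that could benefit from earlier curation.
    """
    repo: Repo = {}
    outcomes: List[bool] = []
    for task in tasks:
        solved = execute(task, repo)       # frozen executor reads the repo
        outcomes.append(solved)
        repo = curate(task, solved, repo)  # curator edits the repo afterwards
    later = outcomes[1:]
    return sum(later) / len(later) if later else 0.0


# Toy stand-ins: a task is solvable only if its prerequisite skill is stored.
PREREQ = {"easy": None, "hard": "easy"}


def toy_execute(task: str, repo: Repo) -> bool:
    need = PREREQ[task]
    return need is None or need in repo


def keep_everything(task: str, solved: bool, repo: Repo) -> Repo:
    # Store a skill for every solved task.
    return {**repo, task: f"# {task}\nNotes from solving {task}."} if solved else repo


def keep_nothing(task: str, solved: bool, repo: Repo) -> Repo:
    # Never store anything.
    return repo


reward_keep = run_group(["easy", "hard"], toy_execute, keep_everything)
reward_drop = run_group(["easy", "hard"], toy_execute, keep_nothing)
print(reward_keep, reward_drop)
```

With these toy definitions, the curator that stores the early skill earns a higher reward than the one that stores nothing, because only the former lets the executor solve the dependent task.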
Performance and Generalization
Experiments show that SkillOS significantly improves both the effectiveness and efficiency of AI agents compared to traditional memory-based methods. In testing, the system achieved up to a 9.8% performance boost and required 6% fewer interaction steps to complete tasks. A key finding is that the learned curator is highly adaptable; it can work with different types of agent backbones and across various domains. Furthermore, as the agent continues to operate, the SkillRepo evolves, moving from simple instructions to more sophisticated, high-level "meta-skills" that allow the agent to handle increasingly complex challenges.
Key Takeaways
SkillOS demonstrates that agents can move beyond "one-off" problem solving by treating their knowledge as a living library. By using Markdown as a standard format for skills, the system makes the agent's internal knowledge more readable and structured. The research highlights that the bottleneck for self-evolving agents is not just the ability to store information, but the ability to curate it: deciding what is worth remembering and how to refine those lessons for future success.