EvolveNav: Proactive Preflection and Self-Evolving Memory for Zero-Shot Object Goal Navigation

Key Takeaways

  • Zero-Shot Object-Goal Navigation (ZS-OGN) requires embodied agents to explore and locate target objects without any prior training.
  • Recent methods leverage foundation models, but they typically rely on static priors and lack adaptation, which leads to repeated errors and costly trial and error.
  • In this paper, we propose a self-evolving ZS-OGN framework that enables continuous test-time improvement.
Paper Abstract

Zero-Shot Object-Goal Navigation (ZS-OGN) requires embodied agents to explore and locate target objects without any prior training. To this end, recent methods leverage foundation models. But they typically rely on static priors and lack adaptation, which leads to repeated errors and costly trial and error. In this paper, we propose a self-evolving ZS-OGN framework that enables continuous test-time improvement. Specifically, we build an agentic rule memory by extracting actionable knowledge from past trajectories. Then, we propose a retrieval strategy based on upper confidence bound, selecting effective rules by balancing semantic relevance and historical success. In addition, we introduce a memory-guided preflection module that forecasts potential outcomes before action, reducing inefficient exploration. Extensive experiments show that our method outperforms existing zero-shot baselines, achieving a 10.1% improvement in success rate with fewer unnecessary steps.

EvolveNav: Proactive Preflection and Self-Evolving Memory for Zero-Shot Object Goal Navigation
Zero-Shot Object-Goal Navigation (ZS-OGN) tasks challenge AI agents to find specific objects in unfamiliar environments without any prior training. Current methods use foundation models to help agents "see" and "reason," but these agents rely on static priors that never adapt. As a result, they repeat the same mistakes and explore inefficiently, spending steps on the same misleading visual cues again and again. EvolveNav addresses this with a framework that lets agents learn from their own trajectories at test time, turning navigation into a process of continuous self-improvement.

Learning from Experience

The core of EvolveNav is a self-evolving memory system. After each navigation attempt, the agent reviews its trajectory to identify which decisions led to success or failure. It distills these experiences into actionable "rules" (for example, recognizing misleading room layouts or identifying effective search patterns) and stores them in a memory bank. Because only a few rules can usefully inform any one decision, the agent retrieves them with a strategy based on the Upper Confidence Bound (UCB). Each rule is scored by its semantic relevance to the current situation and by its historical success, with extra weight for rules that have rarely been tried. This balances reuse of well-proven rules against testing new, unverified ones, so the rules the agent actually relies on stay both reliable and up to date.
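The retrieval idea can be illustrated with a minimal Python sketch. This is not the paper's implementation: the `Rule` schema, the additive scoring formula, the smoothing of the success rate, the exploration weight `c`, and the token-overlap `relevance` function (a toy stand-in for an embedding-based similarity) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Rule:
    """A distilled navigation rule with usage statistics (hypothetical schema)."""
    text: str
    uses: int = 0       # episodes in which the rule was retrieved
    successes: int = 0  # how many of those episodes succeeded


def relevance(query: str, rule: Rule) -> float:
    """Toy stand-in for semantic relevance: Jaccard overlap of lowercase tokens."""
    q = set(query.lower().split())
    r = set(rule.text.lower().split())
    union = q | r
    return len(q & r) / len(union) if union else 0.0


def ucb_score(query: str, rule: Rule, total_uses: int, c: float = 1.0) -> float:
    """Assumed score: relevance + smoothed success rate + exploration bonus."""
    # Laplace smoothing gives untried rules a neutral 0.5 success estimate.
    mean_success = (rule.successes + 1) / (rule.uses + 2)
    # The bonus shrinks as a rule is used more often (standard UCB shape).
    bonus = c * math.sqrt(math.log(total_uses + 1) / (rule.uses + 1))
    return relevance(query, rule) + mean_success + bonus


def retrieve(query: str, rules: list[Rule], k: int = 3, c: float = 1.0) -> list[Rule]:
    """Return the k highest-scoring rules for the current query."""
    total_uses = sum(r.uses for r in rules)
    ranked = sorted(rules, key=lambda r: ucb_score(query, r, total_uses, c), reverse=True)
    return ranked[:k]


def record_outcome(rule: Rule, success: bool) -> None:
    """Update a retrieved rule's statistics after the episode ends."""
    rule.uses += 1
    rule.successes += int(success)
```

Under these assumptions, a rule that has never been used receives the largest bonus and tends to be tried early, while a frequently used rule has to earn its place through relevance and past success. Setting `c=0` removes the exploration term and ranks purely on relevance and smoothed success rate.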

Proactive Decision Making

To reduce costly trial and error, EvolveNav introduces a "preflection" module. Instead of waiting to see whether a path leads to a dead end, the agent uses its memory bank to forecast potential outcomes before it moves. By retrieving the most relevant rules for the current scene, the agent evaluates candidate paths and filters out high-risk or low-value directions. This shift from reactive navigation to proactive, risk-aware planning lets the agent reach its goal in fewer steps.
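The filtering step can be sketched in Python as follows. This is a simplified illustration, not the paper's module: a foundation model would presumably do the forecasting, whereas this sketch swaps in plain keyword matching. The `MemoryRule` and `Candidate` types, the signed `effect` values, and the `min_value` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemoryRule:
    """Hypothetical rule: a cue to look for and how past outcomes value it."""
    cue: str       # keyword expected in a candidate's description
    effect: float  # > 0 if the cue preceded success, < 0 if it preceded failure


@dataclass
class Candidate:
    """A direction the agent could move in, with a description of what lies there."""
    name: str
    description: str


def forecast(candidate: Candidate, rules: list[MemoryRule]) -> float:
    """Predict a candidate's value before moving by summing matching rule effects."""
    text = candidate.description.lower()
    return sum(r.effect for r in rules if r.cue.lower() in text)


def preflect(candidates: list[Candidate], rules: list[MemoryRule],
             min_value: float = 0.0) -> list[Candidate]:
    """Drop candidates forecast below min_value and rank the rest best-first."""
    if not candidates:
        return []
    scored = [(forecast(c, rules), c) for c in candidates]
    kept = [(s, c) for s, c in scored if s >= min_value]
    if not kept:
        # Never leave the agent without an option: keep the least bad candidate.
        kept = [max(scored, key=lambda sc: sc[0])]
    kept.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in kept]
```

For example, with one rule that penalizes "mirror" and one that rewards "kitchen", a candidate described as a hallway with a large mirror is filtered out before the agent spends any steps on it, while a doorway onto a kitchen counter is ranked first.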

Proven Performance

The framework was tested on standard benchmarks, including the HM3D and MP3D datasets, against a variety of existing zero-shot navigation methods. By internalizing past lessons and anticipating risks, the agent achieved a 10.1% improvement in success rate over existing zero-shot baselines. It also took fewer unnecessary steps to reach its goals, which suggests that combining self-evolving memory with proactive preflection is an effective way to navigate unseen environments.
