ContextNest: Verifiable Context Governance for Autonomous AI Agents

Key Takeaways

  • ContextNest: Verifiable Context Governance for Autonomous AI Agents introduces a new governance layer designed to sit beneath existing Retrieval-Augmented Generation (RAG) systems.
  • We formalize this as context governance and present ContextNest, an open specification and reference implementation for governed AI-consumable knowledge vaults.
  • These mechanisms let organizations reconstruct which knowledge versions informed an agent output and whether those versions were AI-eligible when consumed.
  • We report first empirical results from two controlled experiments.
  • These results suggest that context governance addresses failure modes retrieval quality alone is not designed to resolve.
Paper Abstract

Autonomous AI agents increasingly depend on external knowledge stores, yet most retrieval pipelines provide relevance without durable guarantees of provenance, version identity, integrity, traceability, or point-in-time reconstruction. We formalize this as context governance and present ContextNest, an open specification and reference implementation for governed AI-consumable knowledge vaults. ContextNest does not replace Retrieval-Augmented Generation (RAG); it supplies the governance layer beneath retrieval, determining which artifacts are approved, current, attributable, and integrity-verified before retrieval systems operate over them. The specification combines typed Markdown documents with metadata, deterministic set-algebraic selectors, contextnest:// URI references, SHA-256 hash-chained version histories, graph-level checkpoints, source nodes for live data through the Model Context Protocol (MCP), and audit traces of agent context consumption. These mechanisms let organizations reconstruct which knowledge versions informed an agent output and whether those versions were AI-eligible when consumed. We report first empirical results from two controlled experiments. In a stale-version attack isolating the governance-versus-retrieval failure mode, governed selection strictly Pareto-dominates BM25 sparse retrieval, with higher answer-quality pass rate (97% versus 93-90%) at about one-third the input-token cost. In a retrieval-determinism experiment over a 1,060-document corpus, deterministic selectors and BM25 return stable document sets across repeated identical queries (Jaccard 1.0), while a dense+HNSW baseline is non-deterministic on 80% of queries (mean Jaccard 0.611, worst case 0.210). These results suggest that context governance addresses failure modes retrieval quality alone is not designed to resolve. We release a core engine, CLI, and MCP server under open licenses.

ContextNest: Verifiable Context Governance for Autonomous AI Agents introduces a new governance layer designed to sit beneath existing Retrieval-Augmented Generation (RAG) systems. While RAG is excellent at finding relevant information, it often lacks the ability to guarantee that the information is accurate, current, or authorized for use. This paper addresses the "context governance gap," where autonomous agents might confidently retrieve and act upon outdated or unverified information, such as superseded company policies or deprecated technical documentation.

The Governance Layer

ContextNest does not replace RAG; instead, it acts as a gatekeeper that ensures only approved, integrity-verified, and current information reaches the retrieval system. It treats knowledge as a governed asset by using typed Markdown documents with structured metadata. By organizing these documents into a vault with a clear stewardship model, organizations can define exactly which documents are eligible for AI consumption. This ensures that when an agent retrieves data, it is working with a "known good" subset of the organization's knowledge base.
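To make the idea of a "known good" subset concrete, the following is a minimal sketch of selecting eligible documents with plain set operations over metadata. It is not ContextNest's actual selector grammar or document schema: the `Doc` class, the `with_tag` and `with_status` helpers, and the status values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for a governed document's metadata."""
    doc_id: str
    tags: frozenset
    status: str  # assumed values: "approved", "draft", "superseded"


def with_tag(vault, tag):
    """Return the ids of all documents carrying the given tag."""
    return {d.doc_id for d in vault if tag in d.tags}


def with_status(vault, status):
    """Return the ids of all documents in the given status."""
    return {d.doc_id for d in vault if d.status == status}


def eligible_policies(vault):
    """Select (tag:policy AND status:approved) as a set intersection.

    Sorting the result makes the output order independent of set
    iteration order, so the same vault always yields the same list.
    """
    selected = with_tag(vault, "policy") & with_status(vault, "approved")
    return sorted(selected)


vault = [
    Doc("travel-policy-v1", frozenset({"policy", "travel"}), "superseded"),
    Doc("travel-policy-v2", frozenset({"policy", "travel"}), "approved"),
    Doc("expense-draft", frozenset({"policy"}), "draft"),
    Doc("api-guide", frozenset({"engineering"}), "approved"),
]

print(eligible_policies(vault))  # ['travel-policy-v2']
```

In this sketch the superseded and draft policies never reach the retrieval step, so a retriever operating over the result cannot surface them regardless of how relevant they look to a query.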

How It Works

The system relies on several technical mechanisms to provide transparency and security:

  • Deterministic Selectors: Unlike approximate dense retrieval, which may return different results for the same query, ContextNest uses a set-algebraic selector grammar so that the same selector always yields the same document set.

  • Cryptographic Integrity: Every document version is part of a SHA-256 hash-chained history, making it easy to detect if information has been tampered with.

  • Audit Trails: The system records every instance of context consumption, allowing organizations to reconstruct exactly which version of a document informed a specific AI decision at any point in time.

  • Live Data Integration: Through the Model Context Protocol (MCP), the system can pull in live data while maintaining the same governance standards applied to static documents.
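The hash-chained version history mentioned above can be sketched in a few lines. This is an illustration of the general technique only, under stated assumptions: the entry layout, the `GENESIS` placeholder, and the function names are hypothetical, and the exact bytes ContextNest hashes are not described in this summary.

```python
import hashlib

# Assumed placeholder meaning "no previous version"; not from the specification.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, content: str) -> str:
    """SHA-256 over the previous entry's hash followed by this version's content."""
    h = hashlib.sha256()
    h.update(prev_hash.encode("utf-8"))
    h.update(b"\n")
    h.update(content.encode("utf-8"))
    return h.hexdigest()


def append_version(history: list, content: str) -> None:
    """Append a new version whose hash commits to the whole prior chain."""
    prev = history[-1]["hash"] if history else GENESIS
    history.append(
        {"content": content, "prev": prev, "hash": entry_hash(prev, content)}
    )


def verify(history: list) -> bool:
    """Recompute every hash from the start; any edit to an earlier entry fails."""
    prev = GENESIS
    for entry in history:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["content"]):
            return False
        prev = entry["hash"]
    return True


history = []
append_version(history, "Refunds are accepted within 30 days.")
append_version(history, "Refunds are accepted within 14 days.")
print(verify(history))  # True

# Silently rewriting an old version breaks the chain.
history[0]["content"] = "Refunds are accepted within 90 days."
print(verify(history))  # False
```

Because each hash covers the previous hash, recording only the latest hash in an audit entry is enough to pin down the entire history that led to the version an agent consumed.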

Empirical Performance

The researchers tested ContextNest in two controlled experiments. In a "stale-version attack," they compared governed selection against BM25 sparse retrieval. Governed selection achieved a 97% answer-quality pass rate compared to 90–93% for BM25, at about one-third the input-token cost (a reduction of roughly two-thirds).

In a second experiment, over a 1,060-document corpus, the team measured how consistently different systems returned the same documents for repeated identical queries. Both ContextNest’s deterministic selectors and BM25 returned identical document sets every time (Jaccard 1.0), while a dense+HNSW baseline was non-deterministic on 80% of queries (mean Jaccard 0.611, worst case 0.210).
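The determinism figures are reported as Jaccard similarity between the document sets returned on repeated runs of the same query. The sketch below shows one plausible way to compute such a score. Averaging over all pairs of runs is an assumption, since this summary does not say how the paper aggregates runs, and the document ids are made up.

```python
from itertools import combinations


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(runs) -> float:
    """Average Jaccard similarity over every pair of runs of one query."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(x, y) for x, y in pairs) / len(pairs)


# Hypothetical results for three repeated runs of one query.
stable_runs = [["d1", "d2", "d3"]] * 3
unstable_runs = [["d1", "d2", "d3"], ["d1", "d2", "d4"], ["d1", "d5", "d6"]]

print(mean_pairwise_jaccard(stable_runs))              # 1.0
print(round(mean_pairwise_jaccard(unstable_runs), 3))  # 0.3
```

A score of 1.0 means every run returned exactly the same set, and anything lower means at least one document differed between runs.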

Important Considerations

ContextNest is specifically focused on "inference-time knowledge governance": ensuring that the information an agent consumes is approved, current, and traceable. The authors note that this is distinct from "action governance," which involves managing agent identity, tool-use authorization, and the final outcomes of an agent's work. ContextNest governs what an agent reads, not what it does. It does not currently handle action authorization or the complexities of real-time multi-user collaboration, which remain areas for future development.
