Why Neighborhoods Matter: Traversal Context and Provenance in Agentic GraphRAG
This research investigates the reliability of citations in Agentic GraphRAG systems—AI agents that navigate knowledge graphs to find answers. While these systems are designed to be more transparent than standard models by citing their sources, the authors argue that current citation methods are incomplete. They propose that an agent’s final answer is not just a product of the cited facts, but is also shaped by the broader journey the agent takes through the graph, including the nodes it visits but chooses not to cite.
Rethinking Citation Faithfulness
The paper frames citation faithfulness as a "trajectory-level" problem. In an agentic system, an AI might visit a dozen entities to reach a conclusion but only cite two of them. The authors hypothesize that the "visited-but-uncited" entities and the overall structure of the graph provide essential context that influences the agent's reasoning. To test this, they developed a methodology to isolate, remove, or mask different parts of the graph to see how these interventions affect the accuracy and stability of the AI's answers.
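To make the "visited-but-uncited" notion concrete, here is a minimal sketch of how one agent run could be recorded. The `Trajectory` class, its field names, and the entity ids are hypothetical and chosen for illustration; the summary does not describe the authors' actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trajectory:
    """Hypothetical record of one agent run over a knowledge graph."""
    visited: tuple[str, ...]   # entity ids in the order the agent reached them
    cited: frozenset[str]      # entity ids named in the final answer's citations

    def visited_but_uncited(self) -> list[str]:
        """Entities the agent saw but did not attribute, in first-visit order."""
        seen: set[str] = set()
        result: list[str] = []
        for entity in self.visited:
            if entity not in self.cited and entity not in seen:
                seen.add(entity)
                result.append(entity)
        return result


# Made-up example: the agent revisits bridge_1 and cites only two entities.
traj = Trajectory(
    visited=("q_entity", "bridge_1", "bridge_2", "answer_entity", "bridge_1"),
    cited=frozenset({"q_entity", "answer_entity"}),
)
print(traj.visited_but_uncited())  # ['bridge_1', 'bridge_2']
```

Keeping the visit order, rather than only a set, preserves the path the agent actually took, which is the object a trajectory-level audit would inspect.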
How the Experiments Work
The researchers conducted ablation studies on a benchmark of multi-hop questions, comparing several systems that range from standard RAG to more advanced agentic configurations. By systematically removing cited evidence, they tested whether those citations were truly necessary for a correct answer. By removing or masking "visited-but-uncited" entities, they measured whether the agent was relying on information it did not explicitly attribute. They also used "text-only" isolation to determine whether the structure of the graph alone (the way entities are connected) provides useful signals to the agent even when the text content is hidden.
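The interventions described above can be sketched on a toy graph. This is an illustrative assumption about what "removing" and "masking" might look like, not the authors' implementation: the `Graph` class, the `MASK` placeholder, and all function names are hypothetical. Removal here deletes a node together with its edges, while masking hides the node's text but leaves the structure intact.

```python
from dataclasses import dataclass, field

MASK = "[MASKED]"  # hypothetical placeholder for hidden text


@dataclass
class Graph:
    text: dict[str, str]                                   # entity id -> text
    edges: set[tuple[str, str]] = field(default_factory=set)


def remove_entities(g: Graph, targets: set[str]) -> Graph:
    """Delete the target entities and every edge that touches them."""
    text = {n: t for n, t in g.text.items() if n not in targets}
    edges = {(a, b) for a, b in g.edges
             if a not in targets and b not in targets}
    return Graph(text, edges)


def mask_entities(g: Graph, targets: set[str]) -> Graph:
    """Hide the text of target entities but keep all nodes and edges."""
    text = {n: (MASK if n in targets else t) for n, t in g.text.items()}
    return Graph(text, set(g.edges))


def build_variants(g: Graph, visited: set[str], cited: set[str]) -> dict[str, Graph]:
    """Build one ablated graph per intervention for re-running the agent."""
    uncited = visited - cited
    return {
        "remove_cited": remove_entities(g, cited),
        "remove_uncited": remove_entities(g, uncited),
        "mask_uncited": mask_entities(g, uncited),
    }


# Toy chain A - B - C - D; the agent visited A, B, C and cited A and C.
g = Graph(
    text={"A": "a text", "B": "b text", "C": "c text", "D": "d text"},
    edges={("A", "B"), ("B", "C"), ("C", "D")},
)
variants = build_variants(g, visited={"A", "B", "C"}, cited={"A", "C"})
```

Each variant would then be passed back to the agent and its answer compared with the answer on the unmodified graph. Comparing `remove_uncited` against `mask_uncited` separates the effect of an entity's text from the effect of its position in the graph.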
Key Findings
The results show that while cited evidence is usually necessary—removing it often causes the model to change its answer or lose accuracy—it is not sufficient to explain the agent's performance. The experiments revealed that:
Cited evidence is not the whole story: Even when cited entities are removed, agents can sometimes still answer correctly, suggesting they rely on other parts of the graph.
Structural context matters: When the graph structure is preserved but the text of non-cited entities is hidden, the agent performs better than if the entire graph were removed. This indicates that the agent uses the "map" of the graph to navigate and reason.
Provenance is a trajectory: Because visited-but-uncited entities contribute to the final answer, the authors conclude that auditing these systems requires looking at the entire retrieval path, not just the final list of citations.
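One simple way to quantify how often an intervention changes the agent's answer is an answer-change rate. The sketch below is an assumption about how such a measure could be computed; the function names and the normalization rule are hypothetical, and the sample answers are made up for illustration and are not results from the paper.

```python
def normalize(answer: str) -> str:
    """Crude normalization so trivial formatting differences do not count."""
    return answer.strip().lower()


def answer_change_rate(baseline: list[str], ablated: list[str]) -> float:
    """Fraction of questions whose answer differs after an intervention."""
    if len(baseline) != len(ablated):
        raise ValueError("answer lists must be aligned per question")
    if not baseline:
        return 0.0
    changed = sum(normalize(b) != normalize(a)
                  for b, a in zip(baseline, ablated))
    return changed / len(baseline)


# Made-up answers for four questions, before and after removing cited entities.
baseline_answers = ["Paris", "1969", "Marie Curie", "Nile"]
ablated_answers = ["Paris", "unknown", "Pierre Curie", "nile "]

rate = answer_change_rate(baseline_answers, ablated_answers)
print(rate)  # 0.5
```

A high rate after removing cited entities would suggest the citations are necessary. A nonzero rate after removing only visited-but-uncited entities would suggest the citations alone do not account for the answer.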
Future Directions and Limitations
The study is limited by its use of a small, controlled benchmark rather than a massive, real-world knowledge graph. The authors acknowledge that their findings are a starting point and suggest that future research should scale these experiments to larger, more complex datasets. Ultimately, the paper advocates a shift in how we evaluate AI transparency: instead of only checking whether a citation supports a claim, we should develop methods that account for the entire retrieval trajectory that led the agent to its conclusion.