From Agent Loops to Deterministic Graphs: Execution Lineage for Reproducible AI-Native Work
Modern AI systems often rely on "agent loops," where a model continuously iterates, uses tools, and updates its conversational history until it reaches a final answer. While effective for simple tasks, this approach creates hidden dependencies and makes it difficult to track how work evolves. This paper introduces "execution lineage," a model that represents AI work as a directed acyclic graph (DAG). By treating intermediate steps as stable, identifiable artifacts rather than transient text, this approach ensures that AI-generated work remains consistent, maintainable, and reproducible even as requirements change.
Moving Beyond Conversational Loops
Current agentic workflows typically store state within a single, evolving conversational transcript. This makes it hard to isolate specific updates or understand exactly what a final answer depends on. When a small change is needed, these systems often perform "global recomputation," where the entire process is restarted. The authors argue that this is a structural failure: because the system lacks explicit boundaries between different stages of work, it cannot reliably distinguish between what should remain stable and what needs to be updated.
The Power of Execution Lineage
The proposed execution lineage model shifts the focus from prompt-based history to a structured graph of computations. Each node in this graph represents a specific unit of work with declared inputs and a clear output contract. By assigning each node an "execution identity," the system can determine exactly when a piece of work needs to be re-run and when it can be safely reused. This allows for "partial recomputation," where only the parts of the workflow affected by a change are updated, while the rest of the work is reused unchanged.
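To make the idea of an execution identity concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `Node` class, the `execution_identity` and `execute` functions, and the choice of SHA-256 over a node's declared parameters plus its upstream identities are all assumptions made for illustration. A node is re-run only when its identity is not already in the cache.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical unit of work with declared inputs."""
    name: str
    deps: List[str]              # names of upstream nodes
    params: Dict[str, str]       # declared inputs (prompt text, config, ...)
    run: Callable[[Dict[str, str], Dict[str, str]], str]


def execution_identity(node: Node, dep_ids: Dict[str, str]) -> str:
    """Assumed identity: hash of the node's name, its declared params,
    and the identities of its upstream nodes. The body of `run` is not
    hashed in this sketch."""
    payload = json.dumps(
        {
            "name": node.name,
            "params": node.params,
            "deps": [dep_ids[d] for d in node.deps],
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def execute(nodes: List[Node], cache: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Run nodes given in topological order. Returns the outputs and the
    names of the nodes that actually had to be re-run."""
    ids: Dict[str, str] = {}
    outputs: Dict[str, str] = {}
    rerun: List[str] = []
    for node in nodes:
        ident = execution_identity(node, ids)
        ids[node.name] = ident
        if ident not in cache:
            upstream = {d: outputs[d] for d in node.deps}
            cache[ident] = node.run(node.params, upstream)
            rerun.append(node.name)
        outputs[node.name] = cache[ident]
    return outputs, rerun


def build(topic: str, note: str) -> List[Node]:
    """Toy three-node memo graph: analysis depends on background;
    appendix is independent of both."""
    return [
        Node("background", [], {"topic": topic},
             lambda p, up: f"Background on {p['topic']}"),
        Node("analysis", ["background"], {"style": "brief"},
             lambda p, up: f"Analysis of: {up['background']}"),
        Node("appendix", [], {"note": note},
             lambda p, up: f"Appendix: {p['note']}"),
    ]


cache: Dict[str, str] = {}
_, first = execute(build("transit", "v1"), cache)   # cold cache: all three nodes run
_, second = execute(build("transit", "v2"), cache)  # only the appendix input changed
_, third = execute(build("housing", "v2"), cache)   # background changed, so analysis follows
```

In this toy graph, editing the appendix note re-runs only `appendix`, while changing the background topic re-runs `background` and its dependent `analysis` and reuses the cached appendix. A real system would also need the identity to cover anything else that affects a node's output, such as the model version or tool configuration.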
Results: Quality vs. Consistency
The authors compared their DAG-based approach against traditional loop-centric systems on policy-memo update tasks. Both methods could produce polished final outputs, but they differed significantly in "maintained-state quality." When the authors introduced unrelated updates, loop-based systems often contaminated the final output with irrelevant context or regenerated the entire memo unnecessarily. In contrast, the execution lineage model achieved perfect preservation of unaffected work and propagated changes correctly through the graph. The study shows that task success in a single run can mask underlying inconsistencies that eventually cause problems in long-term, multi-step projects.
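One way to picture "preservation of unaffected work" is as the fraction of sections that an update should not touch and that come out byte-identical afterward. The sketch below is an assumed, simplified measure, not the metric defined in the paper; the function name `preservation_rate` and the toy memo sections are hypothetical.

```python
from typing import Dict, Set


def preservation_rate(before: Dict[str, str], after: Dict[str, str],
                      affected: Set[str]) -> float:
    """Share of sections outside `affected` whose text is unchanged.
    Returns 1.0 when there is nothing that should have been preserved."""
    unaffected = [name for name in before if name not in affected]
    if not unaffected:
        return 1.0
    kept = sum(1 for name in unaffected if after.get(name) == before[name])
    return kept / len(unaffected)


# Toy memo: only the "costs" section is affected by the update.
before = {"intro": "A", "costs": "B", "risks": "C"}

# Hypothetical loop-style result: "intro" was reworded even though it was unaffected.
loop_after = {"intro": "A (reworded)", "costs": "B2", "risks": "C"}

# Hypothetical lineage-style result: only the affected section changed.
lineage_after = {"intro": "A", "costs": "B2", "risks": "C"}

loop_score = preservation_rate(before, loop_after, {"costs"})
lineage_score = preservation_rate(before, lineage_after, {"costs"})
```

Here `loop_score` is 0.5 and `lineage_score` is 1.0, even though both final memos contain the updated costs section. That gap is the kind of difference a check on the final output alone would miss.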
Key Takeaways for AI Systems
The core insight is that final answer quality and the stability of the underlying state are two different things. For long-lived, complex workflows, the system must be able to explain its dependencies and provide stable boundaries between stages. By adopting a graph-based structure, developers can move away from "prompt engineering tricks" and toward a more rigorous, systems-level approach that treats intermediate artifacts as first-class, addressable objects. This ensures that as AI-native work evolves, the system remains predictable and reliable.