A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents
This paper introduces a formal architectural framework for building production-grade LLM agents. LLMs are inherently stochastic (probabilistic), whereas the software systems they interact with are deterministic. The author argues that the "seam" where these two worlds meet, the Stochastic-Deterministic Boundary (SDB), is the most critical, load-bearing part of any agent system. By treating this boundary as a formal contract, developers can move from trial-and-error debugging toward a structured, reliable engineering approach.

The Stochastic-Deterministic Boundary (SDB)

The core of the proposed methodology is the SDB, a four-part contract that governs how an LLM’s output becomes a real-world action. To ensure reliability, every interaction must include:

Proposer: The LLM that generates the output.
Verifier: A deterministic check (such as a schema, policy rule, or classifier) that validates the proposal.
Commit Step: The durable write or action that occurs only after the proposal is accepted.
Reject Signal: A typed response sent back to the LLM if the verification fails, allowing the model to correct its course.

The paper notes that most production failures are not caused by the LLM itself, but by weaknesses at this boundary. By explicitly defining these four parts, teams can prevent common issues like unauthorized writes or hallucinated successes.
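
The four-part contract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the names Reject, verify_refund, run_boundary, and stub_proposer, the refund scenario, and the 10,000-cent policy limit are all hypothetical, and the proposer is a stub standing in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Reject:
    """Typed reject signal returned to the proposer (hypothetical shape)."""
    code: str    # machine-readable reason, e.g. "schema_error"
    detail: str  # explanation the model can use to correct itself


def verify_refund(proposal: dict) -> Optional[Reject]:
    """Verifier: a deterministic schema check followed by a policy rule.

    Returns None to accept the proposal, or a Reject describing the failure.
    The 10,000-cent limit is an illustrative assumption.
    """
    amount = proposal.get("amount")
    if not isinstance(amount, int) or isinstance(amount, bool):
        return Reject("schema_error", "amount must be an integer number of cents")
    if amount <= 0 or amount > 10_000:
        return Reject("policy_violation", "amount must be between 1 and 10000 cents")
    return None


def run_boundary(
    propose: Callable[[Optional[Reject]], dict],
    verify: Callable[[dict], Optional[Reject]],
    commit: Callable[[dict], None],
    max_attempts: int = 3,
) -> bool:
    """Run propose -> verify -> commit, feeding each reject back to the proposer."""
    feedback: Optional[Reject] = None
    for _ in range(max_attempts):
        proposal = propose(feedback)
        feedback = verify(proposal)
        if feedback is None:
            commit(proposal)  # the commit step runs only after acceptance
            return True
    return False  # nothing was committed


def stub_proposer(feedback: Optional[Reject]) -> dict:
    """Stand-in for an LLM: over-proposes first, then corrects after a reject."""
    if feedback is None:
        return {"amount": 50_000}
    return {"amount": 2_500}


ledger: list = []  # stand-in for a durable store
ok = run_boundary(stub_proposer, verify_refund, ledger.append)
```

In this sketch the first proposal is rejected by the policy rule, the typed reject is passed back to the proposer, and only the corrected proposal reaches the ledger. If every attempt fails, run_boundary returns False without committing anything.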

Organizing Agent Runtimes

The author organizes agent design into three orthogonal concerns that every production system must address:

Coordination: How the system splits and recombines work (e.g., hierarchical delegation or scatter-gather).
State: How the system remembers information across pauses (e.g., event-driven logs or shared state machines).
Control: Who decides what runs and when to stop (e.g., supervisors or human-in-the-loop gates).

By mapping these concerns to six specific architectural patterns, such as "Event-Driven Sequencing" or "Supervisor plus Gate", the paper provides a catalog that lets developers choose the right structure based on whether their agent is conversational, autonomous, or long-horizon.
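
One way to make the three concerns concrete is to record one explicit choice per concern. The sketch below is only an illustration: the enum members reuse the examples named in the list above, while the class names, the RuntimeDesign record, and the pairing shown in the usage lines are assumptions rather than the paper's catalog of six patterns.

```python
from dataclasses import dataclass
from enum import Enum


class Coordination(Enum):
    HIERARCHICAL_DELEGATION = "hierarchical delegation"
    SCATTER_GATHER = "scatter-gather"


class State(Enum):
    EVENT_LOG = "event-driven log"
    SHARED_STATE_MACHINE = "shared state machine"


class Control(Enum):
    SUPERVISOR = "supervisor"
    HUMAN_GATE = "human-in-the-loop gate"


@dataclass(frozen=True)
class RuntimeDesign:
    """Hypothetical record: one choice per concern, none left implicit."""
    agent_kind: str  # "conversational", "autonomous", or "long-horizon"
    coordination: Coordination
    state: State
    control: Control

    def summary(self) -> str:
        return (
            f"{self.agent_kind}: coordination={self.coordination.value}, "
            f"state={self.state.value}, control={self.control.value}"
        )


# Illustrative pairing only; the source does not prescribe this combination.
design = RuntimeDesign(
    agent_kind="long-horizon",
    coordination=Coordination.SCATTER_GATHER,
    state=State.EVENT_LOG,
    control=Control.HUMAN_GATE,
)
line = design.summary()
```

Because the concerns are treated as orthogonal, each field is chosen independently, and the type forces all three to be stated before a design can be written down.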

Reliability and Architectural Momentum

A key insight of the paper is the distinction between per-call model variance and "architectural momentum." As base models improve, their per-call output variance decreases, but the reliability of the overall system still depends on the architecture surrounding the model.
The author defines architectural momentum as the strength of the SDB and the choice of runtime patterns. As base models become more capable, these architectural choices, rather than the model itself, become the primary lever for ensuring long-term system reliability.

Diagnostic Procedures and Failure Modes

The methodology includes a diagnostic procedure to help teams map production failures to specific pattern weaknesses. One notable failure mode identified is "replay divergence." This occurs in event-driven systems where an LLM consumer reads a deterministic event log; if the model version or prompt is updated, the same input can produce different downstream results, effectively breaking the consistency of the system's history. By using the provided five-step selection methodology, teams can produce an architecture decision record that helps identify and mitigate these risks before they manifest in production.
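
Replay divergence can be made detectable by recording which model version and prompt produced each logged decision. The following is a minimal sketch of that idea, not a mitigation taken from the paper: the fingerprint helper, the RecordedDecision record, the plan_replay function, and the model and prompt strings are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple


def fingerprint(model_version: str, prompt: str) -> str:
    """Short stable identifier for a (model version, prompt) pair."""
    digest = hashlib.sha256(f"{model_version}\n{prompt}".encode("utf-8"))
    return digest.hexdigest()[:16]


@dataclass(frozen=True)
class RecordedDecision:
    """Hypothetical log entry: the LLM output plus what produced it."""
    event_id: int
    consumer_fingerprint: str
    output: str


def plan_replay(
    log: List[RecordedDecision], current_fingerprint: str
) -> Tuple[List[int], List[int]]:
    """Split event ids into those safe to re-derive and those at risk.

    Entries produced under the current fingerprint are listed in `same`.
    Entries produced under a different model version or prompt are listed in
    `at_risk`: re-running the consumer on them may not reproduce the recorded
    output, so the recorded output should be reused or the change reviewed.
    """
    same: List[int] = []
    at_risk: List[int] = []
    for rec in log:
        if rec.consumer_fingerprint == current_fingerprint:
            same.append(rec.event_id)
        else:
            at_risk.append(rec.event_id)
    return same, at_risk


old = fingerprint("model-v1", "Classify the ticket.")  # placeholder names
new = fingerprint("model-v2", "Classify the ticket.")
log = [
    RecordedDecision(1, old, "refund"),
    RecordedDecision(2, new, "escalate"),
]
same, at_risk = plan_replay(log, new)
```

Here event 1 was decided under the older model version, so it is flagged as at risk of divergence, while event 2 matches the current configuration. Matching fingerprints do not make a stochastic consumer deterministic. The check only surfaces the cases where a model or prompt update has changed what produced the history.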
