Key Takeaways

  • Production LLM agents combine stochastic model outputs with deterministic software systems, yet the boundary between the two is rarely treated as a first-class architectural object.
  • This paper names that boundary the stochastic-deterministic boundary (SDB): a four-part contract among a proposer, verifier, commit step, and reject signal that specifies how an LLM output becomes a system action.
  • We argue that the SDB is the load-bearing primitive of production agent runtimes.
  • Around this primitive, we organize agent runtime design into three concerns: Coordination, State, and Control.
Paper Abstract

Production LLM agents combine stochastic model outputs with deterministic software systems, yet the boundary between the two is rarely treated as a first-class architectural object. This paper names that boundary the stochastic-deterministic boundary (SDB): a four-part contract among a proposer, verifier, commit step, and reject signal that specifies how an LLM output becomes a system action. We argue that the SDB is the load-bearing primitive of production agent runtimes. Around this primitive, we organize agent runtime design into three concerns: Coordination, State, and Control. We present a catalog of six runtime patterns that compose the SDB differently across conversational, autonomous, and long-horizon agents: hierarchical delegation, scatter-gather plus saga, event-driven sequencing, shared state machine, supervisor plus gate, and human in the loop. For each pattern, we trace its lineage to distributed-systems concepts and identify what changes when the worker is stochastic. The paper contributes a five-step methodology for selecting runtime patterns, a diagnostic procedure that maps production failures to pattern weaknesses, and a failure mode called replay divergence, in which LLM-based consumers of a deterministic event log produce different downstream outputs under model-version or prompt changes. A stylized reliability decomposition separates per-call model variance from architectural momentum, motivating the claim that as model variance decreases, pattern choice and SDB strength become increasingly important levers for long-run reliability. We apply the methodology to five workloads and provide one runnable reference implementation for a 90-day contract-renewal agent.

A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents
This paper introduces a formal architectural framework for building production-grade LLM agents. While LLMs are powerful, they are inherently stochastic (probabilistic), whereas the software systems they interact with are deterministic. The author argues that the "seam" where these two worlds meet—the Stochastic-Deterministic Boundary (SDB)—is the most critical, load-bearing part of any agent system. By treating this boundary as a formal contract, developers can move away from trial-and-error debugging and toward a structured, reliable engineering approach.

The Stochastic-Deterministic Boundary (SDB)

The core of the proposed methodology is the SDB, a four-part contract that governs how an LLM’s output becomes a real-world action. To ensure reliability, every interaction must include:

  • Proposer: The LLM that generates the output.

  • Verifier: A deterministic check (such as a schema, policy rule, or classifier) that validates the proposal.

  • Commit Step: The durable write or action that occurs only after the proposal is accepted.

  • Reject Signal: A typed response sent back to the LLM if the verification fails, allowing the model to correct its course.

The paper notes that most production failures are not caused by the LLM itself, but by weaknesses at this boundary. By explicitly defining these four parts, teams can prevent common issues like unauthorized writes or hallucinated successes.
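
The four parts can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the names `Reject`, `verify`, `run_sdb`, and `fake_llm`, the `amount` field, and the limit of 100 are all hypothetical, and the proposer is a stub standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Reject:
    """Typed reject signal returned to the proposer (field names are assumptions)."""
    code: str
    detail: str


def verify(proposal: dict) -> Optional[Reject]:
    """Deterministic verifier: a schema check plus one hypothetical policy rule."""
    if not isinstance(proposal.get("amount"), (int, float)):
        return Reject("schema", "amount must be a number")
    if proposal["amount"] > 100:
        return Reject("policy", "amount exceeds the limit of 100")
    return None


def run_sdb(propose: Callable[[Optional[Reject]], dict],
            commit: Callable[[dict], None],
            max_attempts: int = 3) -> bool:
    """Propose, verify, then commit; feed each reject back to the proposer."""
    feedback: Optional[Reject] = None
    for _ in range(max_attempts):
        proposal = propose(feedback)
        feedback = verify(proposal)
        if feedback is None:
            commit(proposal)  # the only path to a durable write
            return True
    return False  # attempts exhausted, nothing was committed


def fake_llm(feedback: Optional[Reject]) -> dict:
    """Stand-in for an LLM call: over-asks first, corrects after a policy reject."""
    if feedback and feedback.code == "policy":
        return {"amount": 80}
    return {"amount": 250}


ledger: list = []
ok = run_sdb(fake_llm, ledger.append)
```

In this sketch the commit function is reachable only through a passing verification, so a proposal that never validates leaves `ledger` untouched and the caller sees `False` rather than a claimed success.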

Organizing Agent Runtimes

The author organizes agent design into three orthogonal concerns that every production system must address:

  • Coordination: How the system splits and recombines work (e.g., hierarchical delegation or scatter-gather).

  • State: How the system remembers information across pauses (e.g., event-driven logs or shared state machines).

  • Control: Who decides what runs and when to stop (e.g., supervisors or human-in-the-loop gates).

By mapping these concerns to six specific architectural patterns (such as "Event-Driven Sequencing" or "Supervisor plus Gate"), the paper provides a catalog that allows developers to choose the right structure based on whether their agent is conversational, autonomous, or long-horizon.
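
As a rough aid, the catalog can be written down as a lookup table. The grouping below is an assumption read off the examples given for each concern in this summary, not a classification confirmed by the paper, and the helper names `patterns_for` and `uncovered_concerns` are hypothetical.

```python
from enum import Enum


class Concern(Enum):
    COORDINATION = "coordination"
    STATE = "state"
    CONTROL = "control"


# Assumed grouping, inferred from the examples listed under each concern.
PATTERNS = {
    "hierarchical delegation": Concern.COORDINATION,
    "scatter-gather plus saga": Concern.COORDINATION,
    "event-driven sequencing": Concern.STATE,
    "shared state machine": Concern.STATE,
    "supervisor plus gate": Concern.CONTROL,
    "human in the loop": Concern.CONTROL,
}


def patterns_for(concern: Concern) -> list:
    """Return the catalog patterns filed under one concern, sorted by name."""
    return sorted(name for name, c in PATTERNS.items() if c is concern)


def uncovered_concerns(chosen: list) -> list:
    """Return the concerns that no chosen pattern addresses."""
    covered = {PATTERNS[name] for name in chosen}
    return [c for c in Concern if c not in covered]


missing = uncovered_concerns(["hierarchical delegation", "human in the loop"])
```

A check like `uncovered_concerns` is one way to notice that a design has picked coordination and control patterns but has not yet said how state survives a pause.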

Reliability and Architectural Momentum

A key insight of the paper is the distinction between per-call model variance and "architectural momentum." As LLMs improve, the variance of their individual outputs decreases; however, the reliability of the overall system still depends on the architecture surrounding the model.
The author defines architectural momentum as the strength of the SDB and the choice of runtime patterns. As per-call variance falls, these architectural choices, rather than further model improvements alone, become increasingly important levers for long-run system reliability.
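
A toy calculation shows why architecture can matter more as runs get longer. It is not the paper's decomposition: it assumes independent attempts, a verifier that catches every bad proposal, and a fixed retry budget, and the function names and numbers are illustrative only.

```python
def step_success(p_call: float, retries: int) -> float:
    """Chance that one step commits a valid result.

    Assumes independent attempts and a perfect verifier, so a step fails
    only if every attempt (the first call plus all retries) is rejected.
    """
    return 1.0 - (1.0 - p_call) ** (retries + 1)


def run_success(p_call: float, steps: int, retries: int = 0) -> float:
    """Chance that all steps in a run succeed, assuming independent steps."""
    return step_success(p_call, retries) ** steps


# Compare a better model without a retry loop to a weaker model with one.
better_model_no_retry = run_success(0.99, steps=50)
weaker_model_one_retry = run_success(0.95, steps=50, retries=1)
print(better_model_no_retry, weaker_model_one_retry)
```

Under these assumptions, a 50-step run with a 95%-accurate call and a single verified retry finishes more often than the same run with a 99%-accurate call and no retry, which is the kind of effect the architectural argument points at.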

Diagnostic Procedures and Failure Modes

The methodology includes a diagnostic procedure to help teams map production failures to specific pattern weaknesses. One notable failure mode identified is "replay divergence." This occurs in event-driven systems where an LLM consumer reads a deterministic event log; if the model version or prompt is updated, the same input can produce different downstream results, effectively breaking the consistency of the system's history. By using the provided five-step selection methodology, teams can produce an architecture decision record that helps identify and mitigate these risks before they manifest in production.
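
Replay divergence can be made visible by recording each derived output together with the model version and prompt that produced it. The sketch below is one possible detection approach, not something the paper prescribes; `fingerprint`, `consume`, `divergent_events`, the event fields, and the stub `fake_llm` are all hypothetical.

```python
import hashlib
import json


def fingerprint(model_version: str, prompt: str) -> str:
    """Short stable hash identifying one (model version, prompt) configuration."""
    raw = json.dumps({"model": model_version, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]


def consume(event, model_version, prompt, llm, cache):
    """Return the recorded output for this event and configuration.

    The model is called only the first time; later replays under the same
    configuration read the recorded output instead of re-sampling.
    """
    key = (event["id"], fingerprint(model_version, prompt))
    if key not in cache:
        cache[key] = llm(event, model_version, prompt)
    return cache[key]


def divergent_events(log, old, new, llm, cache):
    """List ids of events whose output differs between two configurations."""
    return [e["id"] for e in log
            if consume(e, *old, llm, cache) != consume(e, *new, llm, cache)]


def fake_llm(event, model_version, prompt):
    """Stand-in for a model call: the v2 model treats large amounts differently."""
    if model_version == "v2" and event["amount"] > 100:
        return "escalate"
    return "approve"


log = [{"id": 1, "amount": 50}, {"id": 2, "amount": 500}]
cache = {}
prompt = "Classify the renewal."
changed = divergent_events(log, ("v1", prompt), ("v2", prompt), fake_llm, cache)
```

Running a comparison like this before a model or prompt upgrade lists the events whose downstream result would change, so the team can decide whether to pin the old outputs or accept the new ones.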
