Key Takeaways

  • This paper addresses a fundamental challenge in AI deployment: why highly capable models often fail when integrated into real-world institutions.
  • Framings of intelligence as faster search or learned runtimes do not explain why capable models remain difficult to deploy in open institutions.
  • We propose intent compilation: the transformation of partially specified human purpose into inspectable artifacts that bind execution.
  • The relevant deployment distinction is closed-world solver versus open-world agent.
  • In closed worlds, a checker is largely given; in open worlds, verification is distributed across semantic, evidentiary, procedural and institutional dimensions.
Paper Abstract

Recent work has framed intelligence in verifiable tasks as reducing time-to-solution through learned structure and test-time search, while systems work has explored learned runtimes in which computation, memory and I/O migrate into model state. These perspectives do not explain why capable models remain difficult to deploy in open institutions. We propose intent compilation: the transformation of partially specified human purpose into inspectable artifacts that bind execution. The relevant deployment distinction is closed-world solver versus open-world agent. In closed worlds, a checker is largely given; in open worlds, verification is distributed across semantic, evidentiary, procedural and institutional dimensions. We formalize this residual openness as a closure-gap vector, define delegation envelopes as pre-authorized regions of action space, distinguish misclosure from undersearch, and outline benchmark metrics for testing when closure interventions outperform additional inference-time search.

Toward a Science of Intent: Closure Gaps and Delegation Envelopes for Open-World AI Agents
This paper addresses a fundamental challenge in AI deployment: why highly capable models often fail when integrated into real-world institutions. The authors argue that current research focuses too heavily on "solving" tasks through faster reasoning or search, while ignoring the institutional requirements for action. They propose "intent compilation"—a process that transforms vague human goals into structured, inspectable contracts. By doing so, they aim to bridge the gap between a model’s internal reasoning and the external rules, evidence, and authority required to perform legitimate institutional actions.

The Problem: Closed-World vs. Open-World

The authors distinguish between "closed-world" solvers and "open-world" agents. In a closed-world task, the rules, success criteria, and evidence are already defined, allowing the AI to focus on finding the right answer. In an open-world setting, these conditions are often ambiguous or missing. The authors suggest that when these conditions are unsettled, simply giving an AI more time to "think" or search for an answer is ineffective. Instead, the system must first resolve the "closure gap"—the missing information regarding what is being asked, what evidence is allowed, what method is required, and who has the authority to act.
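The closure gap described above can be pictured as a small data structure scoring how settled each of the four dimensions is. This is a minimal sketch, not from the paper: the field names, the [0, 1] scoring, and the threshold are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class ClosureGap:
    """Residual openness along four dimensions (illustrative sketch).

    Each field is a score in [0, 1]: 0 means fully settled,
    1 means entirely unspecified.
    """
    semantic: float       # is what is being asked settled?
    evidentiary: float    # is what evidence is allowed settled?
    procedural: float     # is what method is required settled?
    institutional: float  # is who has authority to act settled?

    def is_closed(self, threshold: float = 0.1) -> bool:
        """Treat the task as effectively closed-world when every gap is small."""
        return all(
            g <= threshold
            for g in (self.semantic, self.evidentiary,
                      self.procedural, self.institutional)
        )


# A task whose evidence rules are unsettled: more search won't help,
# because the evidentiary gap must be closed first.
gap = ClosureGap(semantic=0.05, evidentiary=0.8,
                 procedural=0.0, institutional=0.3)
print(gap.is_closed())  # → False
```

The point of the sketch is that closure is a conjunction: a single large gap keeps the task open-world no matter how settled the other dimensions are.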

Intent Compilation and the Four-Contract Stack

To manage this, the authors introduce intent compilation, which organizes a task into four distinct contracts:

  • Semantic Contract: Defines the task ontology, such as goals, output formats, and acceptance criteria.

  • Evidentiary Contract: Specifies which sources are admissible, how to handle conflicts, and requirements for provenance.

  • Procedural Contract: Outlines the allowed tools, workflows, and safety constraints.

  • Institutional Contract: Sets the boundaries of authority, including who can approve actions and what happens if a policy is violated.

By formalizing these as a "contract tuple," the system makes the requirements for an action transparent and auditable, rather than hiding them within the model's opaque internal state.
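A contract tuple like the one above could be represented as four typed records bundled together. This is a hedged sketch of one possible encoding; the paper does not specify field names, and every field here is an assumption chosen to mirror the four bullets.

```python
from typing import NamedTuple


class SemanticContract(NamedTuple):
    goal: str
    output_format: str
    acceptance_criteria: tuple[str, ...]


class EvidentiaryContract(NamedTuple):
    admissible_sources: tuple[str, ...]
    conflict_policy: str
    provenance_required: bool


class ProceduralContract(NamedTuple):
    allowed_tools: tuple[str, ...]
    safety_constraints: tuple[str, ...]


class InstitutionalContract(NamedTuple):
    approvers: tuple[str, ...]
    violation_policy: str


class ContractTuple(NamedTuple):
    """The four-contract stack as one inspectable, auditable artifact."""
    semantic: SemanticContract
    evidentiary: EvidentiaryContract
    procedural: ProceduralContract
    institutional: InstitutionalContract


# Hypothetical example: a report-writing task compiled into contracts.
contracts = ContractTuple(
    semantic=SemanticContract(
        goal="summarize quarterly filings",
        output_format="markdown report",
        acceptance_criteria=("covers all filings", "cites every claim"),
    ),
    evidentiary=EvidentiaryContract(
        admissible_sources=("regulator database",),
        conflict_policy="prefer the most recent filing",
        provenance_required=True,
    ),
    procedural=ProceduralContract(
        allowed_tools=("search", "read"),
        safety_constraints=("no outbound email",),
    ),
    institutional=InstitutionalContract(
        approvers=("compliance team",),
        violation_policy="halt and escalate",
    ),
)
```

Because the tuple is plain data rather than model state, each contract can be inspected, diffed, and audited before any action executes.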

Delegation Envelopes

A key concept in the paper is the "delegation envelope," which defines the specific space where an AI is authorized to act autonomously. Inside this envelope, the agent can proceed based on its contracts. At the boundary of the envelope, the system is required to ask for clarification, and outside the envelope, it must escalate the task to a human or abstain entirely. The authors note that this envelope can be risk-sensitive: high-stakes actions require higher levels of confidence and stricter adherence to the four contracts before the system is permitted to execute them.
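The three-way behavior described above (act, clarify, escalate) can be sketched as a single decision function. The risk-weighted threshold and the width of the clarification band are illustrative assumptions, not values from the paper.

```python
def delegation_decision(confidence: float, risk: float,
                        clarify_band: float = 0.1) -> str:
    """Decide whether the agent may act, must clarify, or must escalate.

    confidence: the agent's confidence it satisfies its contracts, in [0, 1].
    risk: the stakes of the action, in [0, 1].
    The threshold rises with risk, making the envelope risk-sensitive:
    high-stakes actions demand higher confidence before execution.
    """
    threshold = 0.5 + 0.5 * risk   # stricter for high-stakes actions
    if confidence >= threshold:
        return "act"               # inside the envelope: proceed on contracts
    if confidence >= threshold - clarify_band:
        return "clarify"           # at the boundary: ask for clarification
    return "escalate"              # outside: hand off to a human or abstain


print(delegation_decision(confidence=0.9, risk=0.2))   # → act
print(delegation_decision(confidence=0.95, risk=1.0))  # → clarify
print(delegation_decision(confidence=0.3, risk=0.5))   # → escalate
```

Note how the same confidence of 0.9 would authorize a low-stakes action but not a high-stakes one: risk sensitivity shrinks the envelope rather than changing the agent.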

Misclosure vs. Undersearch

The authors introduce a new taxonomy to categorize AI failures. They distinguish between "undersearch"—where the AI has clear instructions but fails to find a correct answer—and "misclosure"—where the AI might generate a plausible answer, but it cannot be authorized because it violated a contract (e.g., using a prohibited source or acting without proper approval). This framework is designed to help researchers build better benchmarks that test not just the AI’s intelligence, but its ability to operate within the constraints of human institutions.
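A benchmark built on this taxonomy needs to score the two failure modes separately, since more search fixes one but not the other. The toy classifier below assumes only two observable signals (answer correctness and contract violation); that simplification is ours, not the paper's.

```python
from enum import Enum


class Failure(Enum):
    MISCLOSURE = "misclosure"    # plausible answer, but a contract was violated
    UNDERSEARCH = "undersearch"  # clear instructions, but no correct answer found
    NONE = "none"                # correct and authorized


def classify_failure(answer_correct: bool, contract_violated: bool) -> Failure:
    """Toy scorer separating the paper's two failure modes (illustrative)."""
    if contract_violated:
        # e.g. used a prohibited source, or acted without proper approval;
        # even a correct answer cannot be authorized.
        return Failure.MISCLOSURE
    if not answer_correct:
        # the spec was settled; the agent simply failed to find the answer.
        return Failure.UNDERSEARCH
    return Failure.NONE
```

Ordering matters in this sketch: a violated contract dominates, because an unauthorized answer fails institutionally regardless of whether it happens to be correct.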
