Key Takeaways

  • LLMs are increasingly used for end-user task planning, yet their black-box nature limits users' ability to ensure reliability and control.
  • While recent systems incorporate verification techniques, it remains unclear how users can effectively apply such rigid constraints to represent intent or adapt to real-world variability.
  • For example, prior work finds that hard-only constraints are too rigid, and numeric flexibility weights confuse users.
  • We present U-Define, a system that lets users define constraints in natural language and categorize them as either hard rules that must not be violated or soft preferences that allow flexibility.
Paper Abstract

LLMs are increasingly used for end-user task planning, yet their black-box nature limits users' ability to ensure reliability and control. While recent systems incorporate verification techniques, it remains unclear how users can effectively apply such rigid constraints to represent intent or adapt to real-world variability. For example, prior work finds that hard-only constraints are too rigid, and numeric flexibility weights confuse users. We investigate how interaction workflows can better support users in applying constraints to guide LLM-generated plans, examining whether abstracting strictness into high-level types (i.e., hard and soft) paired with distinct verification mechanisms helps users more reliably express and align intent. We present U-Define, a system that lets users define constraints in natural language and categorize them as either hard rules that must not be violated or soft preferences that allow flexibility. U-Define verifies these types through complementary methods: formal model checking for hard constraints and LLM-as-judge evaluation for soft ones. Through a technical evaluation and user studies with general and expert participants, we find that user-defined constraint types improve perceived usefulness, performance, and satisfaction while maintaining usability. These findings provide insights for designing flexible yet reliable constraint-based workflows.

U-Define: Designing User Workflows for Hard and Soft Constraints in LLM-Based Planning
Large Language Models (LLMs) are increasingly used for task planning, but their black-box nature makes it difficult for users to ensure that generated plans are reliable and aligned with their specific needs. Some systems attempt to verify LLM outputs, but they often rely on rigid, complex, or inaccessible methods that struggle to balance strict rules with personal preferences. U-Define addresses this with an interactive workflow in which users sort their requirements into two types, hard rules and soft preferences, and the system applies a verification method tailored to each. The goal is to give users both reliability and flexibility in their planning tasks.

Defining Constraints by Intent

The core of U-Define is the ability for users to express their requirements in natural language and classify them based on their importance. Hard constraints are defined as non-negotiable rules that must never be violated, such as dietary restrictions or safety protocols. Soft constraints are defined as preferences that offer flexibility, such as a desired pace or style. By allowing users to explicitly label these constraints, the system moves away from confusing numeric weighting systems and instead uses categories that align with how people naturally think about their goals.
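To make the labeling idea concrete, here is a minimal sketch of how a typed constraint could be represented. It is an illustration only, not U-Define's actual data model: the names `ConstraintType`, `Constraint`, and `split_by_type`, and the example constraints, are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ConstraintType(Enum):
    """Hypothetical labels mirroring the two categories described above."""
    HARD = "hard"  # non-negotiable rule that must never be violated
    SOFT = "soft"  # preference that allows some flexibility


@dataclass(frozen=True)
class Constraint:
    text: str             # the requirement in the user's own words
    kind: ConstraintType  # a category chosen by the user, not a numeric weight


def split_by_type(constraints):
    """Partition constraints into (hard, soft) lists, preserving input order."""
    hard = [c for c in constraints if c.kind is ConstraintType.HARD]
    soft = [c for c in constraints if c.kind is ConstraintType.SOFT]
    return hard, soft


# Assumed example input: one dietary rule and one pacing preference.
trip_constraints = [
    Constraint("No meals containing peanuts", ConstraintType.HARD),
    Constraint("Prefer a relaxed pace", ConstraintType.SOFT),
]
hard, soft = split_by_type(trip_constraints)
```

The point of the sketch is that the only strictness information the user supplies is a two-valued label, which can then decide how each constraint is checked.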

Tailored Verification Mechanisms

U-Define uses a two-pronged approach to verify that the LLM’s output matches the user's intent. For hard constraints, the system employs formal model checking, a rigorous mathematical process that ensures the plan strictly adheres to the defined rules. For soft constraints, the system uses an "LLM-as-judge" evaluation, which assesses the plan's alignment with the user's preferences. This combination allows the system to provide the high-level reliability of formal verification while maintaining the adaptive, context-aware benefits of LLMs.
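The routing between the two verification paths can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: a plain Python predicate stands in for the formal model checker, a stub scoring function stands in for the LLM judge, and the names `verify_plan`, `no_peanuts`, and `relaxed_pace_stub` are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical plan representation: an ordered list of step descriptions.
Plan = List[str]


def verify_plan(
    plan: Plan,
    hard_checks: Dict[str, Callable[[Plan], bool]],
    soft_judges: Dict[str, Callable[[Plan], float]],
) -> dict:
    """Route each constraint to the verifier that matches its type."""
    # Hard constraints get a binary pass/fail result; any failure rejects the plan.
    hard_results = {name: check(plan) for name, check in hard_checks.items()}
    # Soft constraints get a graded score; it is reported but never rejects the plan.
    soft_results = {name: judge(plan) for name, judge in soft_judges.items()}
    return {
        "accepted": all(hard_results.values()),
        "hard": hard_results,
        "soft": soft_results,
    }


def no_peanuts(plan: Plan) -> bool:
    # Stand-in for a formal check: a deterministic predicate over the plan.
    return not any("peanut" in step.lower() for step in plan)


def relaxed_pace_stub(plan: Plan) -> float:
    # Stand-in for an LLM judge: a real system would prompt a model here.
    return 1.0 if len(plan) <= 3 else 0.5


plan = ["Visit museum", "Lunch: peanut noodles", "Walk in the park"]
report = verify_plan(
    plan,
    hard_checks={"no peanuts": no_peanuts},
    soft_judges={"relaxed pace": relaxed_pace_stub},
)
```

In this sketch the plan is rejected because the hard check fails, even though the soft preference scores well, which reflects the asymmetry between the two constraint types.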

Improving User Experience and Reliability

Through a technical evaluation and user studies involving both general participants and domain experts, the researchers found that letting users define constraint types improves perceived usefulness, performance, and satisfaction while maintaining usability. The study showed that distinguishing between hard and soft constraints helps users feel more in control, as they can see exactly which requirements were met and which were violated, allowing them to iteratively refine their plans until they are satisfied.
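The iterative refinement described above could look roughly like the loop below. It is a hedged sketch rather than the system's actual workflow: `refine_until_valid`, `stub_generator`, and `no_peanuts` are hypothetical names, the generator is a stub in place of an LLM call, and only hard constraints drive regeneration here.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical plan representation: an ordered list of step descriptions.
Plan = List[str]


def refine_until_valid(
    generate: Callable[[List[str]], Plan],
    hard_checks: Dict[str, Callable[[Plan], bool]],
    max_rounds: int = 3,
) -> Tuple[Optional[Plan], List[Tuple[Plan, List[str]]]]:
    """Regenerate a plan, feeding back the violated hard constraints each round."""
    feedback: List[str] = []
    history: List[Tuple[Plan, List[str]]] = []
    for _ in range(max_rounds):
        plan = generate(feedback)
        # Record which hard constraints this attempt violated so the user can see them.
        violated = [name for name, check in hard_checks.items() if not check(plan)]
        history.append((plan, violated))
        if not violated:
            return plan, history
        feedback = violated
    # No attempt satisfied every hard constraint within the round limit.
    return None, history


def stub_generator(feedback: List[str]) -> Plan:
    # Stand-in for an LLM: it only avoids peanuts once told about the violation.
    if "no peanuts" in feedback:
        return ["Visit museum", "Lunch: vegetable noodles"]
    return ["Visit museum", "Lunch: peanut noodles"]


def no_peanuts(plan: Plan) -> bool:
    return not any("peanut" in step.lower() for step in plan)


final_plan, history = refine_until_valid(stub_generator, {"no peanuts": no_peanuts})
```

Keeping the per-round history makes it possible to show the user which requirements each attempt met or violated, which is the visibility the study participants relied on.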

Bridging Rigor and Flexibility

The findings suggest that the future of reliable AI planning lies in hybrid systems that combine deterministic, rule-based verification with the generative power of LLMs. By removing the need for users to understand complex, domain-specific programming languages, U-Define demonstrates that it is possible to create accessible, human-centered tools that handle the complexity of real-world planning. This approach provides a blueprint for designing AI systems that are both robust enough for high-stakes tasks and flexible enough for everyday use.
