FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data
Generating real-world evidence (RWE) from electronic health records is a complex, manual task that requires deep clinical and technical expertise. While Large Language Models (LLMs) and multi-agent systems offer a way to automate this, they often struggle with reliability, safety, and the risk of unpredictable behavior. This paper introduces FastOMOP, an open-source, multi-agent architecture designed to make the generation of clinical evidence safer and more auditable by embedding governance directly into the system's infrastructure rather than relying on the models themselves to stay within safe boundaries.
A Layered Approach to Safety
FastOMOP addresses the risks of autonomous agents by separating the system into three independent layers: orchestration, governance, and observability. The orchestration layer manages how agents work together to solve a task, while the observability layer creates a permanent, transparent record of every action taken. The most critical component is the governance layer, which acts as a "process boundary." Because this layer operates independently of the agents' reasoning, it can enforce strict, rule-based safety checks. Even if an agent hallucinates or is compromised, it cannot bypass these rules, ensuring that only authorized and safe queries are ever executed against clinical data.
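The process-boundary idea can be illustrated with a minimal sketch. Everything below is hypothetical: the `validate` and `execute_governed` names, the allowlist, and the keyword rules are illustrative stand-ins, not FastOMOP's actual API. The key property is that the check runs outside the agent's reasoning loop, so a hallucinated or adversarial query is refused regardless of what the agent "believes":

```python
import sqlite3  # stand-in for the clinical database driver

# Hypothetical governance gate: rule-based checks applied at the
# infrastructure boundary, independent of any agent's reasoning.
FORBIDDEN_KEYWORDS = {"insert", "update", "delete", "drop", "alter", "attach", "pragma"}

def validate(sql: str) -> tuple[bool, str]:
    """Return (authorized, reason). The agent never calls the database
    directly, so it cannot bypass this function."""
    tokens = sql.lower().split()
    for kw in FORBIDDEN_KEYWORDS:
        if kw in tokens:
            return False, f"forbidden keyword: {kw}"
    if not sql.lstrip().lower().startswith("select"):
        return False, "only SELECT statements are authorized"
    return True, "ok"

def execute_governed(conn: sqlite3.Connection, sql: str):
    ok, reason = validate(sql)
    if not ok:
        # A refusal is surfaced (and, in the real system, logged by the
        # observability layer) rather than silently dropped.
        raise PermissionError(f"query blocked by governance layer: {reason}")
    return conn.execute(sql).fetchall()
```

Because the gate sits between agent and database as a separate component, even a compromised agent that emits `DROP TABLE person` receives a `PermissionError` instead of reaching the data.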
Pluggable Agent Teams
The architecture is designed to be flexible, allowing specialized "agent teams" to be plugged into the system. These teams handle specific parts of the research lifecycle, such as defining patient groups (phenotyping), designing studies, or performing statistical analysis. Because these agents operate within the FastOMOP framework, they automatically inherit the system's built-in safety and auditability features. This allows researchers to build and scale automated workflows without having to reinvent safety protocols for every new task.
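One way to picture this inheritance is a registry that hands every plugged-in team the same governed executor. This is a sketch under assumptions: `AgentTeam`, `Framework`, and `PhenotypingTeam` are invented names for illustration, not FastOMOP's real interface.

```python
from abc import ABC, abstractmethod
from typing import Callable

class AgentTeam(ABC):
    """A team plugs in by implementing run(). It never touches the
    database directly; it only receives the framework's governed
    executor, so safety checks are inherited, not re-implemented."""
    name: str = "unnamed"

    @abstractmethod
    def run(self, request: str, execute: Callable) -> str: ...

class PhenotypingTeam(AgentTeam):
    name = "phenotyping"

    def run(self, request, execute):
        # In the real system an LLM would draft this SQL from the
        # request; it is fixed here purely for illustration.
        sql = "SELECT COUNT(DISTINCT person_id) FROM condition_occurrence"
        rows = execute(sql)  # routed through the governance boundary
        return f"cohort query returned {rows}"

class Framework:
    """Registry of agent teams sharing one governed executor."""
    def __init__(self, execute: Callable):
        self._execute = execute
        self._teams = {}

    def register(self, team: AgentTeam):
        self._teams[team.name] = team

    def handle(self, team_name: str, request: str) -> str:
        return self._teams[team_name].run(request, self._execute)
```

A new team for, say, study design would subclass `AgentTeam` and register itself; it gets auditing and query governance for free because the only database pathway it is ever given is the governed one.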
Proven Reliability in Clinical Settings
To test the architecture, the researchers built a proof-of-concept system that translates natural language requests into SQL queries for OMOP CDM databases. They evaluated it on three datasets: a synthetic dataset, the MIMIC-IV database, and a real-world dataset from the Lancashire Teaching Hospitals NHS Foundation Trust. The system achieved high reliability scores, ranging from 0.84 to 0.94. Notably, it achieved perfect scores in blocking both adversarial queries and out-of-scope requests, demonstrating that the architectural approach successfully prevents agents from performing unauthorized or unsafe operations.
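A perfect blocking score corresponds to a rate of 1.0 on a probe set while benign queries pass unhindered. The sketch below shows how such a metric might be computed; the probe lists and the `is_blocked` stand-in are illustrative assumptions, not the paper's actual benchmark or gate.

```python
# Hypothetical evaluation harness: score a safety gate on labeled
# probes, reporting both the adversarial blocking rate and the rate
# at which legitimate queries are wrongly refused.

ADVERSARIAL = [
    "DROP TABLE person",
    "DELETE FROM drug_exposure",
]
IN_SCOPE = [
    "SELECT COUNT(*) FROM person",
    "SELECT drug_concept_id FROM drug_exposure",
]

def is_blocked(query: str) -> bool:
    # Stand-in for the governance check: only plain SELECTs pass.
    return not query.lstrip().lower().startswith("select")

def scores(adversarial, in_scope):
    block_rate = sum(map(is_blocked, adversarial)) / len(adversarial)
    false_block_rate = sum(map(is_blocked, in_scope)) / len(in_scope)
    return block_rate, false_block_rate
```

Reporting the false-block rate alongside the blocking rate matters: a gate that refuses everything would also score 1.0 on blocking, so both numbers are needed to show the system is safe without being useless.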
Architectural Governance vs. Model Capability
The findings suggest that the current "reliability gap" in clinical AI is not necessarily a limitation of the AI models themselves, but rather an architectural failure. By moving safety controls out of the agent's internal logic and into the infrastructure boundary, FastOMOP provides a way to deploy AI in high-stakes clinical environments where errors could lead to corrupted data or privacy breaches. This approach establishes a foundation for the progressive automation of the entire RWE lifecycle, ensuring that as these systems become more capable, they remain governed and transparent.
