SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation

Key Takeaways

  • Advanced scientific simulators expose specialized input languages that turn simulation goals into executable configurations, but learning them can cost domain scientists hours to days.
  • We study simulator setup as a problem of agent-tool interface grounding: what minimal simulator-specific adaptations are needed for an off-the-shelf coding agent to operate real scientific software?
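  • We introduce SIGA, a Simulator-Interface Grounding Adapter that supplies the simulator's executable contract (its vocabulary, structural constraints, validation rules, and termination conditions) through retrieval, procedural memory, in-trajectory validation, and validation-enforced termination.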
  • We primarily evaluate SIGA on GEOS, an open-source multiphysics simulator used in subsurface science.
Paper Abstract

Advanced scientific simulators expose specialized input languages that turn simulation goals into executable configurations, but learning them can cost domain scientists hours to days. We study simulator setup as a problem of agent-tool interface grounding: what minimal simulator-specific adaptations are needed for an off-the-shelf coding agent to operate real scientific software? Our intuition is that coding agents already know how to navigate files, edit code, run commands, and repair outputs, but they lack the simulator's executable contract: its vocabulary, structural constraints, validation rules, and termination conditions. We introduce SIGA, a Simulator-Interface Grounding Adapter that supplies this contract through retrieval, procedural memory, in-trajectory validation, and validation-enforced termination. We primarily evaluate SIGA on GEOS, an open-source multiphysics simulator used in subsurface science. SIGA produces a complete GEOS deck in about five minutes with TreeSim above 0.90, matching an extended-budget human expert who took about three hours, a roughly 36x wall-clock speedup. On a harder held-out set, grounding raises TreeSim from 0.720 to 0.789, a roughly 10% relative gain over the bare agent, and can reduce the across-seed standard deviation by 16x. Self-evolution further improves SIGA by rewriting adapter contents from prior trajectories, yielding the highest held-out GEOS mean and matching or outperforming the strongest hand-designed configuration. Transfers to OpenFOAM and LAMMPS show that the dominant mechanism shifts by interface: validation matters most when structural completeness is the bottleneck, while memory and retrieval matter most when domain correctness is the bottleneck. These results suggest that lightweight, self-improvable grounding layers can turn general coding agents into practical operators of scientific software.

SIGA: Self-Evolving Coding-Agent Adapters for Scientific Simulation
Scientific simulators are essential for modern research, but they often require complex, specialized input languages that can take domain scientists hours to days to learn. This paper introduces SIGA, a "Simulator-Interface Grounding Adapter" designed to bridge the gap between general-purpose AI coding agents and the rigid requirements of scientific software. SIGA is a lightweight, modular layer that teaches an agent the simulator's "contract" (its vocabulary, structural constraints, validation rules, and termination conditions), so that off-the-shelf coding agents can generate valid simulation configurations in minutes rather than hours.

The Problem: The Simulator Setup Bottleneck

Advanced scientific simulators, such as GEOS for subsurface science, use complex domain-specific languages (DSLs). These languages require precise syntax, consistent naming across files, and adherence to physical constraints. While modern AI coding agents are excellent at navigating files and editing code, they often fail to produce runnable simulation decks because they lack an understanding of the simulator's specific rules. Without this "grounding," agents may produce plausible-looking but ultimately invalid configurations, forcing researchers to spend significant time debugging.
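To make the "plausible-looking but invalid" failure concrete, here is a minimal sketch of a cross-reference check on a toy XML deck. The tag and attribute names (`Problem`, `Mesh`, `Solver`, `name`, `target`) are hypothetical and chosen only for illustration; they are not taken from the GEOS schema or from the paper.

```python
import xml.etree.ElementTree as ET
from typing import List

# Toy deck (hypothetical tags): the solver points at "mesh2",
# but only "mesh1" is ever defined. It parses fine, yet it is inconsistent.
TOY_DECK = """
<Problem>
  <Mesh name="mesh1"/>
  <Solver name="flow" target="mesh2"/>
</Problem>
"""


def find_dangling_targets(xml_text: str) -> List[str]:
    """Return every `target` value that no element defines via `name`."""
    root = ET.fromstring(xml_text.strip())
    defined = {el.get("name") for el in root.iter() if el.get("name") is not None}
    referenced = {el.get("target") for el in root.iter() if el.get("target") is not None}
    return sorted(referenced - defined)


dangling = find_dangling_targets(TOY_DECK)
print(dangling)
```

The deck is well-formed XML, so a generic syntax check would pass; only a simulator-aware rule catches the mismatched name. This is the kind of knowledge the article says a bare agent lacks.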

How SIGA Works

Rather than building a new agent from scratch, SIGA acts as a thin, plug-in wrapper around existing coding agents. It grounds the agent through four specific components:

  • Retrieval: Provides semantic search access to documentation, schemas, and technical examples.

  • Procedural Memory: A compact "cheatsheet" that keeps high-frequency vocabulary and configuration patterns visible to the agent.

  • In-Trajectory Validation: Allows the agent to check its work against schema rules while it is still drafting.

  • Validation-Enforced Termination: A "stop-hook" that prevents the agent from finishing until the configuration is structurally sound and passes all validation tests.

These components are modular, allowing them to be adjusted based on the specific simulator being used. Furthermore, SIGA includes a self-evolution mechanism that allows the system to reflect on past performance and automatically rewrite its own adapter contents to improve future results.
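The four components listed above can be pictured as a small adapter object plus a loop that refuses to stop while validation errors remain. The sketch below is an illustration under assumptions, not the paper's implementation: `Adapter`, `run_with_stop_hook`, `toy_validate`, and `toy_agent` are hypothetical names, and the "agent" is a stub that simply adds whatever section the validator reports as missing.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Adapter:
    """Hypothetical bundle of the grounding components."""
    retrieve: Callable[[str], List[str]]   # retrieval: query -> doc snippets
    cheatsheet: str                        # procedural memory shown to the agent
    validate: Callable[[str], List[str]]   # validation: draft -> error messages


AgentStep = Callable[[Adapter, List[str]], str]


def run_with_stop_hook(agent_step: AgentStep, adapter: Adapter,
                       max_rounds: int = 5) -> Tuple[str, int]:
    """Return (draft, rounds) only once the validator reports no errors."""
    errors: List[str] = []
    for round_no in range(1, max_rounds + 1):
        draft = agent_step(adapter, errors)   # agent sees the previous errors
        errors = adapter.validate(draft)      # in-trajectory validation
        if not errors:                        # the stop-hook lets the agent finish
            return draft, round_no
    raise RuntimeError(f"still invalid after {max_rounds} rounds: {errors}")


REQUIRED_SECTIONS = ("Mesh", "Solver")  # toy stand-in for schema rules


def toy_validate(draft: str) -> List[str]:
    return [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in draft]


def toy_agent(adapter: Adapter, errors: List[str]) -> str:
    # Stub agent: starts with one section, then adds each section reported missing.
    sections = ["Mesh"] + [e.split(": ", 1)[1] for e in errors]
    return " ".join(sections)


adapter = Adapter(retrieve=lambda query: [], cheatsheet="(toy cheatsheet)",
                  validate=toy_validate)
final_deck, rounds = run_with_stop_hook(toy_agent, adapter)
print(final_deck, rounds)
```

Keeping the validator behind a plain callable is one way to get the modularity described above: swapping simulators would mean swapping the `Adapter` fields, not the loop.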

Key Results

The researchers evaluated SIGA primarily on the GEOS simulator. SIGA produced a complete GEOS deck in about five minutes with a TreeSim score above 0.90, matching an extended-budget human expert who took about three hours, a roughly 36x wall-clock speedup. On a harder held-out set, grounding raised TreeSim from 0.720 to 0.789, a roughly 10% relative gain over the bare agent, and reduced the across-seed standard deviation by up to 16x. Self-evolution, which rewrites adapter contents from prior trajectories, yielded the highest held-out GEOS mean and matched or outperformed the strongest hand-designed configuration.

Transfers to OpenFOAM and LAMMPS showed that the "dominant mechanism" for success shifts depending on the simulator interface. For simulators where structural completeness is the main hurdle, validation matters most. For simulators where domain correctness is the bottleneck, memory and retrieval provide the most value.
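As a quick check on the headline figures, the speedup and the relative gain can be recomputed from the numbers quoted in the abstract (about three hours versus about five minutes, and TreeSim 0.720 versus 0.789). The variable names are illustrative only.

```python
# Wall-clock speedup: ~3 hours for the human expert vs ~5 minutes for SIGA.
human_minutes = 3 * 60
siga_minutes = 5
speedup = human_minutes / siga_minutes

# Relative TreeSim gain on the held-out set: bare agent vs grounded agent.
bare_treesim = 0.720
grounded_treesim = 0.789
relative_gain = (grounded_treesim - bare_treesim) / bare_treesim

print(f"speedup: {speedup:.0f}x")
print(f"relative gain: {relative_gain:.1%}")
```

The gain works out to just under 10%, which is why the abstract calls it "roughly 10%": it is a relative improvement over the bare agent, not a 10-point jump in TreeSim.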

Implications for Scientific AI

The results suggest that researchers may not need to build bespoke, complex agent loops to operate scientific software. Instead, a lightweight, self-improvable grounding layer can turn general-purpose coding agents into practical operators of scientific workflows. The approach is designed to be portable: the adapter can be updated or swapped to fit different simulators without redesigning the underlying AI agent.
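One way to picture an adapter that can be "updated or swapped" is a per-simulator cheatsheet that gets amended from past runs. The summary does not describe how SIGA actually rewrites its adapter contents, so the sketch below substitutes a deliberately simple frequency heuristic; `evolve_cheatsheet`, the cheatsheet strings, and the error messages are all hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical per-simulator procedural memory; swapping simulators swaps the entry.
CHEATSHEETS: Dict[str, str] = {
    "GEOS": "GEOS notes",
    "OpenFOAM": "OpenFOAM notes",
}


def evolve_cheatsheet(cheatsheet: str, past_errors: List[List[str]],
                      top_k: int = 2) -> str:
    """Append the most frequent validation errors from prior trajectories.

    Stand-in heuristic only: the real self-evolution step is not specified here.
    """
    counts = Counter(err for run in past_errors for err in run)
    notes = [f"- avoid: {err} (seen {n}x)" for err, n in counts.most_common(top_k)]
    if not notes:
        return cheatsheet
    return cheatsheet + "\n" + "\n".join(notes)


# Toy error logs from three earlier runs (invented for illustration).
past_runs = [
    ["missing Solver", "unknown tag Foo"],
    ["missing Solver", "unknown tag Foo"],
    ["missing Solver", "bad unit"],
]

CHEATSHEETS["GEOS"] = evolve_cheatsheet(CHEATSHEETS["GEOS"], past_runs)
print(CHEATSHEETS["GEOS"])
```

Only the GEOS entry changes here; the OpenFOAM entry and the agent itself are untouched, which reflects the portability argument made above.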
