HarnessAPI: A Skill-First Framework for Unified Str...

Key Takeaways

  • HarnessAPI is a Python framework designed to solve the problem of "dual-maintenance" for developers building AI tools.
  • Every Python function deployed as an LLM tool must today exist in two forms: an HTTP endpoint for human-facing clients and CI pipelines, and an MCP tool registration for agent runtimes such as Claude and Cursor.
  • These representations share business logic yet diverge in all the surrounding machinery (routing, validation, serialisation, streaming, and schema maintenance), and they drift apart as the underlying code evolves.
  • We present HarnessAPI, a Python framework that eliminates this duplication by treating a typed skill folder as the single source of truth.
  • Dual-mode content negotiation lets the same handler serve SSE-streaming and JSON-returning clients with no handler changes.

Paper Abstract

Every Python function deployed as an LLM tool must today exist in two forms: an HTTP endpoint for human-facing clients and CI pipelines, and an MCP tool registration for agent runtimes such as Claude and Cursor. These representations share business logic yet diverge in all the surrounding machinery (routing, validation, serialisation, streaming, and schema maintenance), and they drift apart as the underlying code evolves. We present HarnessAPI, a Python framework that eliminates this duplication by treating a typed skill folder as the single source of truth. From one handler.py plus Pydantic schemas, the framework automatically derives a streaming HTTP endpoint with Server-Sent Events, an interactive OpenAPI/Swagger UI, and a zero-configuration MCP tool, all served from a single process. Dual-mode content negotiation lets the same handler serve SSE-streaming and JSON-returning clients with no handler changes. A dynamic code-generation mechanism ensures Pydantic type annotations propagate correctly to FastMCP's inspection layer, resolving a technical limitation that prevents naive closure-based registration. Measured across six representative skills using cloc, HarnessAPI reduces framework-facing boilerplate by 74% compared with a manually maintained dual-stack implementation (FastAPI server + FastMCP server). HarnessAPI subclasses FastAPI, inheriting its full middleware, dependency-injection, and deployment ecosystem. It is available at this https URL and on PyPI (pip install harnessapi).

HarnessAPI is a Python framework designed to solve the problem of "dual-maintenance" for developers building AI tools. Currently, developers often have to write the same business logic twice: once as an HTTP endpoint for web applications and again as an MCP (Model Context Protocol) tool for AI agents like Claude or Cursor. HarnessAPI eliminates this duplication by treating a single "skill folder" as the authoritative source of truth, automatically generating both HTTP and MCP interfaces from one set of Pydantic schemas and handler code.

The Skill-First Architecture

The core innovation of HarnessAPI is shifting the unit of work from the "route" or the "tool" to the "skill." In traditional setups, a developer must manually keep an HTTP route and an MCP registration in sync. If the underlying data model changes, both must be updated independently, which often leads to errors or stale schemas. HarnessAPI inverts this: a skill folder (containing a handler.py and a models.py) is the only thing the developer writes, and every transport layer is derived from it. Because both the HTTP and MCP interfaces are generated from the same Pydantic model at runtime, consistency is enforced structurally and the two schemas cannot drift apart.
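The idea of deriving both interfaces from one definition can be illustrated with a minimal, standard-library-only sketch. It is not HarnessAPI's API: a dataclass stands in for the Pydantic model, and register_skill, input_schema, the /skills/ path, and the dictionary layouts are all hypothetical names chosen for illustration. The only point it tries to show is that when the schema is computed once and shared, the HTTP and MCP views cannot disagree.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Callable, Dict, Tuple, get_type_hints


# Hypothetical stand-in for a skill's models.py (the paper uses Pydantic).
@dataclass
class SummarizeInput:
    text: str
    max_words: int = 50


# Hypothetical stand-in for a skill's handler.py.
def summarize(inp: SummarizeInput) -> Dict[str, Any]:
    words = inp.text.split()[: inp.max_words]
    return {"summary": " ".join(words)}


_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def input_schema(model: type) -> Dict[str, Any]:
    """Build a small JSON-Schema-like dict from the model's typed fields."""
    hints = get_type_hints(model)
    props = {name: {"type": _JSON_TYPES[tp]} for name, tp in hints.items()}
    required = [
        f.name
        for f in fields(model)
        if f.default is MISSING and f.default_factory is MISSING
    ]
    return {"type": "object", "properties": props, "required": required}


def register_skill(
    name: str, model: type, handler: Callable[[Any], Dict[str, Any]]
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Derive an HTTP route descriptor and an MCP tool descriptor from one skill."""
    schema = input_schema(model)  # computed once, shared by both views

    def call(payload: Dict[str, Any]) -> Dict[str, Any]:
        # Both transports validate and dispatch through the same code path.
        return handler(model(**payload))

    http_route = {
        "method": "POST",
        "path": f"/skills/{name}",  # hypothetical URL layout
        "request_schema": schema,
        "call": call,
    }
    mcp_tool = {"name": name, "inputSchema": schema, "call": call}
    return http_route, mcp_tool


route, tool = register_skill("summarize", SummarizeInput, summarize)
print(route["path"], tool["inputSchema"]["required"])
```

Adding a field to SummarizeInput changes both descriptors at once, which is the property the paragraph above attributes to the skill-first design.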

How It Works

HarnessAPI subclasses FastAPI and mounts FastMCP as a sub-application, allowing it to inherit the full ecosystem of middleware and deployment tools. When the server starts, the framework scans the skill directory, isolates each skill into its own synthetic package to prevent naming collisions, and registers it for both protocols. It also features "dual-mode content negotiation," meaning the same handler can serve a JSON response to batch clients or a streaming Server-Sent Events (SSE) response to interactive agents, depending on the request headers. To overcome technical limitations in how FastMCP inspects code, the framework uses a dynamic code-generation technique to ensure Pydantic type annotations are correctly recognized by the agent runtime.
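The code-generation step is the least obvious part of this description, so here is a hedged sketch of the general technique using only the standard library. The source says only that naive closure-based registration fails. This sketch assumes the cause is that a generic wrapper hides parameter names and annotations from signature inspection. The function names (closure_tool, generated_tool) and the example handler are hypothetical, and the real framework works with Pydantic models rather than a plain dict of types.

```python
import inspect
from typing import Any, Callable, Dict


def summarize(text: str, max_words: int) -> str:
    """Hypothetical skill handler."""
    return " ".join(text.split()[:max_words])


def closure_tool(handler: Callable[..., Any]) -> Callable[..., Any]:
    """Naive approach: wrap the handler in a generic closure."""

    def tool(**kwargs: Any) -> Any:
        return handler(**kwargs)

    # The wrapper's signature is just (**kwargs): no names, no types.
    return tool


def generated_tool(
    name: str, params: Dict[str, type], handler: Callable[..., Any]
) -> Callable[..., Any]:
    """Generate a wrapper whose signature lists each parameter explicitly."""
    # exec() runs arbitrary source, so only accept plain identifiers.
    for ident in (name, *params):
        if not ident.isidentifier():
            raise ValueError(f"not a valid identifier: {ident!r}")

    # Annotations are looked up in the exec namespace, so the generated
    # function carries the real type objects rather than strings.
    arg_list = ", ".join(f"{p}: _types[{p!r}]" for p in params)
    call_args = ", ".join(f"{p}={p}" for p in params)
    source = f"def {name}({arg_list}):\n    return _handler({call_args})\n"

    namespace: Dict[str, Any] = {"_types": dict(params), "_handler": handler}
    exec(source, namespace)
    return namespace[name]


naive = closure_tool(summarize)
typed = generated_tool("summarize_tool", {"text": str, "max_words": int}, summarize)

print(inspect.signature(naive))
print(inspect.signature(typed))
```

A registration layer that builds its schema from inspect.signature would see two typed parameters on the generated function and nothing usable on the closure, while both wrappers still call the same handler.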

Results and Efficiency

The framework was evaluated by comparing the amount of "boilerplate" code required to implement six representative skills using HarnessAPI versus a manual dual-stack approach (maintaining separate FastAPI and FastMCP servers). By using cloc to measure non-comment, non-empty lines of code, the study found that HarnessAPI reduces framework-facing boilerplate by 74%. This reduction is achieved without sacrificing feature parity, as the framework maintains the performance and flexibility of the underlying FastAPI and FastMCP technologies.
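For readers who want to see how such a percentage is formed, the arithmetic can be sketched as follows. The line counter is a rough approximation of what cloc reports (it skips blank lines and whole-line "#" comments only), and the function names are hypothetical. The numbers in the usage line are made-up placeholders chosen so the arithmetic is easy to follow. They are not the paper's measured line counts, which this summary does not give.

```python
def count_code_lines(source: str) -> int:
    """Count lines that are neither blank nor whole-line '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def percent_reduction(baseline_loc: int, reduced_loc: int) -> float:
    """Relative reduction of reduced_loc against baseline_loc, in percent."""
    if baseline_loc <= 0:
        raise ValueError("baseline_loc must be positive")
    return 100.0 * (baseline_loc - reduced_loc) / baseline_loc


# Placeholder inputs for illustration only, not figures from the paper.
example = percent_reduction(baseline_loc=200, reduced_loc=52)
print(example)
```

Under this definition, a 74% reduction means the skill-based version needs roughly a quarter of the framework-facing lines of the dual-stack baseline.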

Important Considerations

While HarnessAPI simplifies deployment, users should be aware of its current scope. The framework is designed for local development and production deployment of tools, but it includes optional features like a "handler hot-swap" endpoint that allows AI coding agents to modify logic on the fly. Because this involves executing code dynamically, it carries specific security implications that developers should manage according to their deployment environment. Additionally, while the framework handles the heavy lifting of multi-protocol exposure, end-to-end latency and concurrency benchmarks are reserved for future research.
