
State Machine Workflow Control for MCP Servers

Concierge is a workflow orchestration layer for MCP servers that uses state machines to control AI agent tool access by organizing capabilities into stages.

What It Is

Concierge is a workflow orchestration layer for Model Context Protocol (MCP) servers that introduces state machines to control which tools AI agents can access at any given moment. Instead of exposing all available tools simultaneously, Concierge organizes them into stages and defines explicit transitions between those stages. An agent working through an e-commerce flow, for example, would only see search tools during the browsing stage, cart management tools after transitioning to the cart stage, and payment tools once it reaches checkout. This architectural pattern prevents agents from attempting illogical sequences like processing payments before adding items to a shopping cart.

The system works as a wrapper around existing MCP servers, requiring minimal code changes to implement. Developers define stages as collections of tool names and specify which stage transitions are permissible. The underlying MCP server continues to function normally, but Concierge intercepts tool calls and enforces the state machine rules before allowing execution.
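The interception idea can be sketched in a few lines. This is not Concierge's actual implementation (the class name StageGate and its methods are invented for illustration): a gate that executes a tool only if the current stage exposes it, and changes stage only along a declared transition.

```python
class StageGate:
    """Illustrative stand-in for stage-based tool gating (not Concierge's API)."""

    def __init__(self, stages, transitions, start):
        self.stages = stages            # stage name -> list of tool names
        self.transitions = transitions  # stage name -> list of reachable stages
        self.current = start

    def visible_tools(self):
        """Tools the agent is allowed to see right now."""
        return list(self.stages[self.current])

    def call(self, tool_name, handler, *args, **kwargs):
        """Run a tool only if the current stage exposes it."""
        if tool_name not in self.stages[self.current]:
            raise PermissionError(
                f"'{tool_name}' is not available in stage '{self.current}'"
            )
        return handler(*args, **kwargs)

    def transition(self, target):
        """Move to another stage only along a declared edge."""
        if target not in self.transitions.get(self.current, []):
            raise ValueError(f"cannot move from '{self.current}' to '{target}'")
        self.current = target
```

A call to process_payment while the gate sits in a search stage would raise immediately, which is the architectural guarantee the prompt-engineering approach cannot make.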

Why It Matters

As MCP adoption grows and servers expose increasingly large tool collections, the cognitive load on language models becomes a genuine bottleneck. Models operating with 50, 100, or more available tools face a combinatorial explosion of possible action sequences. Even capable models struggle to maintain coherent multi-step workflows when every option remains perpetually available.

Traditional mitigation strategies prove inadequate at scale. Adding instructions like “always search before purchasing” to system prompts provides weak guardrails that models routinely ignore once the conversation grows long or the prompt pulls in another direction. Fine-tuning models for specific workflows is expensive and inflexible. Concierge addresses this through architectural constraints rather than prompt engineering or model modification.

Development teams building complex MCP-powered applications gain predictable agent behavior without sacrificing the flexibility of the MCP ecosystem. Customer service systems, data analysis pipelines, and multi-step automation workflows all benefit from explicit state management. The approach also improves debugging since developers can trace exactly which stage an agent occupied when an error occurred.

For the broader MCP ecosystem, Concierge demonstrates that protocol extensions can add sophisticated capabilities while maintaining backward compatibility. Existing MCP servers require no modifications to work with Concierge, lowering the barrier to adoption.

Getting Started

Installation requires Python and access to an existing MCP server. The repository at https://github.com/concierge-hq/concierge contains complete setup instructions and example implementations.

A basic workflow definition looks like this:


from fastmcp import FastMCP        # import paths may differ by installation
from concierge import Concierge    # see the repository for exact module names

# Wrap an existing MCP server
mcp_server = FastMCP("shopping-server")
app = Concierge(mcp_server)

# Define stages with their available tools
app.stages = {
    "search": ["query_products", "filter_by_category"],
    "selection": ["view_details", "add_to_cart"],
    "purchase": ["apply_coupon", "process_payment"]
}

# Define allowed transitions
app.transitions = {
    "search": ["selection"],
    "selection": ["purchase", "search"],
    "purchase": ["search"]
}

The agent begins in the first defined stage and can only call tools within that stage. Transitions occur explicitly through tool calls or programmatically through the Concierge API. For workflows with hundreds of tools, Concierge includes semantic search capabilities to help agents discover relevant tools within their current stage.
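Concierge's semantic search is not shown in the repository excerpt above, but the idea of stage-scoped discovery can be illustrated with a rough stand-in: rank only the current stage's tools against an agent query using simple string similarity (the function name discover_tools is invented here).

```python
from difflib import SequenceMatcher


def discover_tools(query, stage_tools, limit=3):
    """Return the current stage's tools ranked by similarity to the query.

    A crude approximation of semantic search: real implementations would
    use embeddings, but the key point is that ranking happens only over
    the tools visible in the current stage.
    """
    scored = [
        (SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
        for name in stage_tools
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:limit]]
```

Because the candidate set is restricted to the active stage before ranking, an agent in the search stage can never be steered toward a payment tool, no matter how the query is phrased.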

The repository includes deployment configurations for self-hosting and references a free hosting option for testing workflows without infrastructure setup.

Context

Concierge occupies a middle ground between fully autonomous agents and rigid scripted workflows. Tools like LangChain and AutoGPT emphasize agent autonomy, while traditional workflow engines like Apache Airflow prioritize deterministic execution. Concierge preserves agent decision-making within bounded contexts.

The state machine approach introduces constraints that may feel restrictive for exploratory tasks where agents benefit from unrestricted tool access. Research workflows or open-ended analysis sessions might perform better with traditional MCP configurations. The framework works best for processes with clear sequential dependencies.

Semantic search for tool discovery within stages adds computational overhead compared to simple tool filtering. Teams running latency-sensitive applications should benchmark performance with their specific tool collections. The tradeoff between agent accuracy and response time varies based on model selection and tool complexity.

Alternative approaches include prompt-based tool filtering, where system prompts dynamically adjust available tools, and hierarchical MCP servers that expose different tool subsets through multiple server instances. Each strategy offers different tradeoffs between implementation complexity and runtime flexibility.
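The key difference between prompt-based filtering and Concierge's approach is where enforcement lives. A filtering layer selects which tool schemas the model sees each turn, but nothing stops an out-of-list call server-side; a minimal sketch of that client-side selection step (function name invented) makes the contrast concrete:

```python
def tools_for_turn(all_tools, allowed_names):
    """Client-side filtering: choose which tool schemas the model sees
    this turn. Unlike a state-machine wrapper, this does not prevent
    the server from executing a tool outside the allowed set."""
    return [tool for tool in all_tools if tool["name"] in allowed_names]
```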