Concierge: Stage-Based Tool Access for MCP Agents
What It Is
Concierge is a Python library that adds state machine logic to Model Context Protocol (MCP) servers, preventing agents from calling tools in illogical sequences. Instead of exposing all available tools simultaneously, Concierge organizes them into stages and controls which tools appear based on the current workflow state.
The library wraps existing MCP servers and defines explicit stages with allowed transitions. For example, an e-commerce agent might progress through “browse,” “cart,” and “checkout” stages. At each stage, only relevant tools become visible - search functions during browsing, cart operations when managing items, and payment methods at checkout. The agent cannot call checkout tools at all until it has transitioned through the earlier stages.
This approach differs from traditional prompt engineering, which attempts to guide agent behavior through instructions. Concierge enforces constraints at the infrastructure level, making certain tool sequences impossible rather than merely discouraged.
Why It Matters
Large-scale agent deployments face a fundamental coordination problem. As tool libraries grow beyond a dozen functions, agents increasingly make nonsensical choices - attempting to finalize transactions before selecting products, or calling administrative functions during user-facing workflows. System prompts explaining proper sequences rarely prevent these errors, since language models lack inherent understanding of procedural dependencies.
Teams building production agents currently face an uncomfortable tradeoff: limit tool counts to maintain reliability, or accept frequent workflow failures. Concierge eliminates this constraint by making tool access conditional on workflow state. Developers can expose comprehensive tool libraries without risking illogical execution paths.
The context window benefits matter equally for cost-sensitive deployments. Serving 50 tool definitions on every message consumes thousands of tokens unnecessarily. By exposing only stage-appropriate tools, Concierge reduces per-message overhead substantially. For high-volume applications processing thousands of requests daily, this translates to measurable infrastructure savings.
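As a rough illustration of the overhead, here is a back-of-envelope estimate. All figures below (tokens per tool definition, tool counts, request volume) are assumptions for the sake of the arithmetic, not measurements:

```python
# Back-of-envelope estimate of context savings from stage-scoped tools.
# All figures are illustrative assumptions, not measurements.
TOKENS_PER_TOOL_DEF = 120   # assumed average tokens per tool schema
TOTAL_TOOLS = 50            # full tool library, served on every message
STAGE_TOOLS = 5             # tools visible in the current stage
REQUESTS_PER_DAY = 10_000

per_message_full = TOTAL_TOOLS * TOKENS_PER_TOOL_DEF    # 6,000 tokens/message
per_message_staged = STAGE_TOOLS * TOKENS_PER_TOOL_DEF  # 600 tokens/message
daily_savings = (per_message_full - per_message_staged) * REQUESTS_PER_DAY

print(f"saved per day: {daily_savings:,} tokens")  # 54,000,000 under these assumptions
```

Even if the real per-tool token cost differs, the savings scale linearly with both the tool count and the request volume.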
The approach also creates clearer debugging paths. When agents fail, developers can identify which stage triggered the error and examine only the tools available at that point, rather than analyzing the entire tool library.
Getting Started
Concierge requires an existing FastMCP server. Installation uses pip:
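Assuming the package is published under the project's name (unverified - check the repository's README for the actual distribution name):

```shell
pip install concierge
```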
The basic implementation wraps a FastMCP instance and defines stages with their associated tools:
from fastmcp import FastMCP
from concierge import Concierge  # import path assumed; see the project README

mcp_server = FastMCP("shopping-assistant")
app = Concierge(mcp_server)

# Map each stage to the tools visible while it is active.
app.stages = {
    "browse": ["search_products", "view_details"],
    "cart": ["add_to_cart", "remove_from_cart", "view_cart"],
    "checkout": ["apply_coupon", "process_payment"]
}

# Define which stages are reachable from each stage.
app.transitions = {
    "browse": ["cart"],
    "cart": ["checkout", "browse"],
    "checkout": ["browse"]
}
Stage transitions happen automatically when agents call tools that belong to different stages. The library tracks state per conversation session, allowing multiple concurrent users without interference.
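The automatic-transition behavior can be sketched independently of Concierge's internals. This is a minimal illustration of the technique, not the library's actual implementation; the `StageGate` class and its method names are invented for the example:

```python
# Minimal sketch of stage-based tool gating with automatic transitions.
# Not Concierge's implementation; names here are illustrative only.

class StageGate:
    def __init__(self, stages, transitions, initial):
        self.stages = stages            # stage -> list of tool names
        self.transitions = transitions  # stage -> stages reachable from it
        self.current = initial

    def visible_tools(self):
        # Only the current stage's tools are exposed to the agent.
        return self.stages[self.current]

    def call(self, tool):
        # Tool belongs to the current stage: no transition needed.
        if tool in self.stages[self.current]:
            return self.current
        # Tool belongs to a directly reachable stage: transition automatically.
        for nxt in self.transitions[self.current]:
            if tool in self.stages[nxt]:
                self.current = nxt
                return nxt
        # Tool is out of reach: the call is impossible, not just discouraged.
        raise PermissionError(f"{tool!r} is not reachable from {self.current!r}")

gate = StageGate(
    stages={
        "browse": ["search_products", "view_details"],
        "cart": ["add_to_cart", "view_cart"],
        "checkout": ["process_payment"],
    },
    transitions={"browse": ["cart"], "cart": ["checkout", "browse"], "checkout": ["browse"]},
    initial="browse",
)

gate.call("add_to_cart")  # auto-transitions browse -> cart
# Calling "process_payment" directly from "browse" would raise PermissionError.
```

Per-session state would simply mean one such gate object per conversation, which is why concurrent users do not interfere.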
The repository at https://github.com/concierge-hq/concierge includes additional examples for multi-step workflows and semantic tool search for larger collections.
Context
Concierge addresses problems that alternative approaches handle differently. Tool filtering through prompt engineering remains the most common strategy, but relies on model compliance rather than enforcement. Retrieval-augmented generation (RAG) for tool selection can surface relevant functions dynamically, though it adds latency and doesn’t prevent illogical sequences.
LangGraph and similar orchestration frameworks offer more sophisticated state management but require rebuilding agent logic around their abstractions. Concierge integrates with existing MCP infrastructure without architectural changes, making adoption simpler for teams already using MCP servers.
The stage-based model works best for linear or branching workflows with clear progression. Applications requiring dynamic tool access based on complex conditions may find the stage abstraction limiting. Agents that need simultaneous access to tools from multiple domains - like a customer service bot handling billing, technical support, and account management - might require careful stage design to avoid artificial constraints.
The library currently supports FastMCP specifically, limiting compatibility with other MCP server implementations. Teams using different frameworks would need to wait for broader integration or contribute adapters themselves.