MCP servers: architecture, ownership, SLA

Model Context Protocol servers are the infrastructure layer of context engineering at L3 - dedicated services that give AI agents structured, real-time access to your organization's knowledge.

·MCP servers provide structured context (architecture, ownership, SLAs) to agents
·Context is organized across at least 3 of the 5 levels: System, Code, Org, Historical, Operational
·Token budget management is implemented (agents receive context within defined token limits)

·Context sources are versioned and tested for correctness
·Context budgeting policy defines priority order when token limits are reached
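
A context budgeting policy of this kind can be sketched in a few lines. The example below is a minimal illustration, not a prescribed implementation: the `ContextItem` type, the "lower number wins" priority convention, and the four-characters-per-token estimate are all assumptions made for the sketch. A real deployment would use the tokenizer of the model it targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextItem:
    level: str      # one of the five levels, e.g. "System" or "Org"
    text: str
    priority: int   # lower number = more important (assumed convention)


def estimate_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: about 4 characters per token."""
    return max(1, len(text) // 4)


def pack_context(items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Keep the highest-priority items that fit within the token budget."""
    selected: list[ContextItem] = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            selected.append(item)
            used += cost
    return selected


items = [
    ContextItem("Historical", "x" * 400, priority=3),  # ~100 tokens
    ContextItem("System", "y" * 200, priority=1),      # ~50 tokens
    ContextItem("Org", "z" * 240, priority=2),         # ~60 tokens
]
packed = pack_context(items, budget=120)
print([item.level for item in packed])
```

With a budget of 120 estimated tokens, the System and Org items fit and the lower-priority Historical item is dropped. The priority order is the policy: it is the part worth writing down and reviewing.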

Evidence

·MCP server configuration files listing active context sources
·Token budget configuration in agent settings
·Context coverage audit showing 3+ context levels populated

What It Is

The Model Context Protocol (MCP) is an open standard that defines how AI agents discover and consume structured context from external services. An MCP server exposes "tools" - discrete capabilities that agents can call to retrieve information or take actions. In the context of coding workflows, MCP servers might expose: database schemas, API contract definitions, deployment status, team ownership graphs, incident history, or architecture documentation.
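
To make "discover and consume" concrete, here is a simplified sketch of the two requests involved. MCP messages are JSON-RPC 2.0, with `tools/list` for discovery and `tools/call` for invocation. Everything else in the example is an assumption for illustration: the `get_service_owner` tool, the in-memory `OWNERS` data, and the `handle` dispatcher are hypothetical, and the code omits transport, initialization, and most error handling. A production server would normally be built on the official SDK rather than hand-rolled like this.

```python
import json

# Hypothetical in-memory data standing in for a real service registry.
OWNERS = {"payments-api": "team-payments", "ledger": "team-finance"}

TOOLS = {
    "get_service_owner": {
        "description": "Return the team that owns a service.",
        "inputSchema": {
            "type": "object",
            "properties": {"service": {"type": "string"}},
            "required": ["service"],
        },
        "handler": lambda args: OWNERS.get(args["service"], "unknown"),
    }
}


def _error(request: dict, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request to discovery or invocation."""
    method = request.get("method")
    if method == "tools/list":
        result = {"tools": [
            {"name": name, "description": tool["description"],
             "inputSchema": tool["inputSchema"]}
            for name, tool in TOOLS.items()
        ]}
    elif method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(request, -32602, "Unknown tool")
        text = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        return _error(request, -32601, "Method not found")
    return {"jsonrpc": "2.0", "id": request.get("id"), "result": result}


listing = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
call = handle({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "get_service_owner", "arguments": {"service": "ledger"}},
})
print(json.dumps(call["result"]))
```

The agent first learns which tools exist and what arguments they take, then calls one by name. The input schema is what lets the agent construct a valid call without reading prose documentation.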

At L2, context is static - it lives in CLAUDE.md files and is written once, updated occasionally. At L3, context becomes dynamic: agents query MCP servers at runtime to retrieve the current state of the world. Instead of a static CLAUDE.md entry saying "the payments service owns these endpoints," an MCP server returns live ownership data from your service registry. Instead of a CLAUDE.md describing the database schema, an MCP server returns the actual schema from the running database.

Stripe is the canonical example: in their production agent workflows, agents have access to over 400 MCP tools covering every aspect of the Stripe platform - API definitions, internal service contracts, deployment pipelines, team ownership, and more. When an agent modifies a payments API endpoint, it can query the MCP server to see which teams own dependent services, what the current SLA commitments are, and what the deployment history looks like. This is qualitatively different from static context: the agent is reasoning with live organizational knowledge.

At L3, organizations treat MCP servers as internal infrastructure. They have defined owners, SLA targets (availability, latency), security postures (authentication, authorization, audit logging), and operational runbooks. The MCP server is not a prototype or a developer convenience - it's a production service on which agent workflows depend.

Why It Matters

MCP servers unlock a category of agent capability that static context files cannot provide:

Real-time accuracy - agents work with current state, not stale documentation that may be months out of date
Structured data instead of prose - database schemas, API contracts, and ownership graphs are queryable data structures, not paragraphs the agent has to parse
Organizational scope - MCP servers can aggregate context across the entire organization, not just a single repository
Programmatic access patterns - agents can retrieve only the context they need for a specific task, reducing context window pressure
Audit trail - every MCP query is logged, creating an audit trail of what context agents consulted when making decisions
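
The audit trail point is the easiest of these to illustrate. The sketch below wraps a tool handler so that every invocation is recorded, including failed ones. The `audited` decorator, the `AUDIT_LOG` list, and the convention of passing the agent's identity as the first argument are assumptions for the example. A real server would take the identity from its authentication layer and write to a durable log sink.

```python
import functools
import time

AUDIT_LOG: list[dict] = []  # stand-in for a real log sink


def audited(tool_name: str):
    """Wrap a tool handler so every invocation is recorded, including failures."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(agent_id: str, **arguments):
            entry = {"ts": time.time(), "agent": agent_id, "tool": tool_name,
                     "arguments": arguments, "ok": False}
            try:
                result = func(**arguments)
                entry["ok"] = True
                return result
            finally:
                # Runs on success and on exception, so failures are logged too.
                AUDIT_LOG.append(entry)
        return wrapper
    return decorator


@audited("get_service_owner")
def get_service_owner(service: str) -> str:
    owners = {"payments-api": "team-payments"}  # hypothetical data
    return owners[service]


get_service_owner("agent-42", service="payments-api")
try:
    get_service_owner("agent-42", service="missing")
except KeyError:
    pass
print([(e["agent"], e["tool"], e["ok"]) for e in AUDIT_LOG])
```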

The difference between L2 and L3 context engineering is the difference between a static map and a live navigation system. The static map (CLAUDE.md) is better than nothing. The live navigation system (MCP) tells you where the traffic jams are right now.

Tip

Start your MCP server journey with a read-only, non-sensitive data source - your team's service registry or API contract catalog. This lets you test the infrastructure pattern and demonstrate value without taking on security complexity at the outset.

Getting Started

Identify your highest-value context sources - What information do developers (and agents) spend the most time looking up? Database schemas, service ownership, API contracts, and deployment status are the most common answers. These are your MCP server candidates.
Stand up a simple MCP server - Use the official MCP SDK (TypeScript or Python) to build a server exposing 2-3 tools against one data source. Keep the first server small and focused. The goal is to validate the infrastructure pattern before building out broadly.
Define ownership explicitly - Every MCP server needs an owner: a team responsible for availability, correctness, and deprecation. Don't let MCP servers become orphaned infrastructure.
Set and publish SLAs - At minimum: availability target (e.g., 99.5%), maximum response latency (e.g., p95 < 500ms), and data freshness guarantee (e.g., ownership data refreshed every 15 minutes). Agent workflows that depend on your MCP server need to know what to expect.
Implement authentication and audit logging - MCP servers that expose sensitive organizational data need security controls. Use your organization's standard auth infrastructure. Log every tool invocation with the requesting agent's identity.
Document tools in CLAUDE.md - List available MCP servers and their key tools in the project's CLAUDE.md. Agents need to know what MCP servers exist before they can use them.
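
The ownership and SLA steps above can be captured as data and checked automatically. This is a sketch under stated assumptions: the `ServerManifest` and `Measurement` types, their field names, and the `ownership-mcp` server are hypothetical, and the target values simply reuse the example figures from the list (99.5% availability, p95 under 500 ms, 15-minute freshness).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerManifest:
    name: str
    owner_team: str
    availability_target: float    # fraction, e.g. 0.995
    p95_latency_limit_ms: float   # p95 must stay below this value
    max_staleness_s: float        # data freshness guarantee


@dataclass(frozen=True)
class Measurement:
    availability: float
    p95_latency_ms: float
    staleness_s: float


def sla_violations(manifest: ServerManifest, observed: Measurement) -> list[str]:
    """Return a readable list of breached targets (empty means healthy)."""
    problems = []
    if not manifest.owner_team:
        problems.append("no owner assigned")
    if observed.availability < manifest.availability_target:
        problems.append("availability below target")
    if observed.p95_latency_ms >= manifest.p95_latency_limit_ms:
        problems.append("p95 latency above target")
    if observed.staleness_s > manifest.max_staleness_s:
        problems.append("data staler than guaranteed")
    return problems


manifest = ServerManifest("ownership-mcp", "team-platform", 0.995, 500.0, 15 * 60)
observed = Measurement(availability=0.998, p95_latency_ms=620.0, staleness_s=300.0)
print(sla_violations(manifest, observed))
```

Here only the latency target is breached. Keeping the manifest next to the server code gives the owning team and the dependent agent workflows one place to look up what was promised.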

Common Pitfalls

Building MCP servers without defined ownership. An MCP server without an owner becomes orphaned infrastructure within months. When it goes down or returns stale data, no one knows who to contact. Establish ownership before production deployment, not after.

Serving too much data without filtering. An MCP tool that returns an entire database schema (hundreds of tables) when an agent only needs 3 tables wastes the agent's context budget and slows down responses. Design MCP tools to be queryable - return what's needed for the specific request, not a data dump.
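
As an illustration of a queryable tool, the sketch below returns column definitions only for the tables the caller names. It uses an in-memory SQLite database so it can run anywhere. The `describe_tables` function and the three example tables are hypothetical, and a real server would introspect whatever database engine it fronts.

```python
import sqlite3


def describe_tables(
    conn: sqlite3.Connection, tables: list[str]
) -> dict[str, list[tuple[str, str]]]:
    """Return (column, type) pairs for the requested tables only."""
    known = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    schema = {}
    for table in tables:
        if table not in known:
            continue  # also keeps arbitrary input out of the PRAGMA below
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        schema[table] = [(col[1], col[2]) for col in columns]
    return schema


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (id INTEGER PRIMARY KEY, amount_cents INTEGER, currency TEXT);
    CREATE TABLE refunds (id INTEGER PRIMARY KEY, payment_id INTEGER);
    CREATE TABLE audit_events (id INTEGER PRIMARY KEY, payload TEXT);
""")
print(describe_tables(conn, ["payments", "refunds"]))
```

The agent asks for two tables and receives two tables. The `audit_events` table never enters its context window.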

Ignoring latency. Agents call MCP servers synchronously during task execution. A slow MCP server (multi-second responses) compounds across multiple tool calls and makes agent workflows feel sluggish. Instrument your MCP servers from day one and set latency SLAs.
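
Instrumentation does not have to be elaborate to be useful. The following sketch times every call to a tool and computes a nearest-rank p95 from the samples. The `timed` decorator, the `LATENCIES_MS` store, and the `get_deploy_status` tool with its simulated 10 ms lookup are assumptions for the example. In practice these numbers would go to your existing metrics system.

```python
import time
from collections import defaultdict

LATENCIES_MS: dict[str, list[float]] = defaultdict(list)


def timed(tool_name: str):
    """Record the wall-clock duration of every call, including failed ones."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                LATENCIES_MS[tool_name].append(elapsed_ms)
        return wrapper
    return decorator


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample list."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


@timed("get_deploy_status")
def get_deploy_status(service: str) -> str:
    time.sleep(0.01)  # simulate a 10 ms backend lookup
    return "healthy"


for _ in range(5):
    get_deploy_status("payments-api")
print(f"p95: {p95(LATENCIES_MS['get_deploy_status']):.1f} ms")
```

The compounding is simple arithmetic: a task that makes ten sequential tool calls at two seconds each spends twenty seconds waiting before the agent has written anything.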

Not versioning the API. MCP server tool signatures evolve as your organizational knowledge model changes. Agents that depend on specific tool signatures will break if you change them without versioning. Apply API versioning practices to your MCP servers from the start.
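
One way to apply this, sketched below, is to register each tool under an explicit version and have callers pin the version they were written against. The registry layout, the `register` decorator, and both versions of `get_service_owner` are hypothetical. Other schemes, such as version suffixes in tool names, would serve the same purpose.

```python
from typing import Callable

# Registry keyed by (tool name, major version); all names here are hypothetical.
REGISTRY: dict[tuple[str, int], Callable[..., dict]] = {}
DEPRECATED: set[tuple[str, int]] = set()


def register(name: str, version: int, deprecated: bool = False):
    def decorator(func):
        REGISTRY[(name, version)] = func
        if deprecated:
            DEPRECATED.add((name, version))
        return func
    return decorator


@register("get_service_owner", 1, deprecated=True)
def get_owner_v1(service: str) -> dict:
    return {"owner": "team-payments"}


@register("get_service_owner", 2)
def get_owner_v2(service: str) -> dict:
    # v2 changes the result shape, so it ships as a new version instead of
    # silently altering v1.
    return {"owners": ["team-payments"], "escalation": "payments-oncall"}


def call_tool(name: str, version: int, **arguments) -> dict:
    """Invoke a pinned tool version, flagging deprecated ones in the result."""
    key = (name, version)
    if key not in REGISTRY:
        raise LookupError(f"{name} v{version} is not available")
    result = dict(REGISTRY[key](**arguments))
    if key in DEPRECATED:
        result["_warning"] = f"{name} v{version} is deprecated"
    return result


print(call_tool("get_service_owner", 1, service="payments-api"))
print(call_tool("get_service_owner", 2, service="payments-api"))
```

Agents pinned to version 1 keep working and see a deprecation warning, which gives the owning team a window to migrate them before the old version is removed.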

How Different Roles See It

Bob, Head of Engineering

Bob's team has successfully deployed CLAUDE.md files across most repositories (L2). Developers are reporting good results, but senior engineers are raising a new problem: the context in CLAUDE.md files keeps going stale. The service ownership section was accurate six months ago; since then, three service owners have changed and two services were deprecated. Manually updating CLAUDE.md isn't keeping up with organizational change.

What Bob should do: Bob has identified the natural L2-to-L3 transition trigger: static context that changes faster than humans can update it needs to become dynamic context from an MCP server. Bob should sponsor a proof of concept: one MCP server exposing service ownership from the existing service registry. This is typically low-effort to build (the data already exists; the MCP server just makes it queryable). Bob should track whether agents' suggestions about service interactions become more accurate after the MCP server is available, and use that as the business case for expanding MCP infrastructure.

Sarah, Productivity Lead

Sarah has been tracking agent suggestion quality and sees a pattern: suggestions about the current state of the system (who owns what, what's deployed, what the current schema looks like) are consistently worse than suggestions about how to write code. The coding quality has improved with CLAUDE.md; the operational knowledge quality hasn't.

What Sarah should do: Sarah should frame the MCP server investment in terms of two ROI streams: (1) fewer agent mistakes about current organizational state, and less time spent catching and correcting them, and (2) less time developers spend looking up that state manually. Both are measurable. She should also note the compounding nature: as agents take on more complex multi-step tasks (L4), the quality of their operational context becomes increasingly critical. MCP server investment at L3 is infrastructure that enables L4 and L5 capabilities.

Victor, Staff Engineer - AI Champion

Victor has been manually providing operational context in every agent session: the current schema migration state, which services are currently degraded, what the team's sprint focus is. He's essentially acting as a human MCP server. He knows this doesn't scale and has been researching the MCP specification to understand what building a real MCP server would take.

What Victor should do: Victor should build the first internal MCP server as a proof of concept, then use it to demonstrate the productivity delta to Bob. A minimal viable MCP server - say, one that exposes the current database schema and service health status - can be built in a day using the official MCP SDK. Victor should instrument it (log queries, measure latency), deploy it to the team's dev environment, and measure how much time it saves compared to his current manual-context approach. That measurement is the business case for operationalizing MCP infrastructure.

From the Field

Recent releases, projects, and discussions relevant to this maturity level.

release (L3)

topoteretes/cognee (github.com)
Cognee v0.5.6 transitions AI memory management from ad-hoc interactions to systematic context engineering by introducing bulk JSON/CSV import/export for large-s…

discovered (L3)

Doorman11991/budget-aware-mcp (github.com)
Model-agnostic code memory MCP server. Budget-aware graph retrieval for AI agents. Sub-millisecond queries, token budgeting, deterministic results. Built o…
budget-aware-mcp shifts AI agent context management from high-latency vector searches to deterministic, hop-based graph walks using CodeGraphContext and tree-si…

discovered (L3)

samber/cc-skills-golang (github.com)
🧑‍🎨 A collection of Golang agentic skills that works
Modular instruction sets for Go projects, focusing on performance, testing, and security, implement the Agent Skills protocol to optimize context window efficienc…

release (L3)

kodustech/kodus-ai (github.com)
Kodus-AI web-1.0.93 stabilizes enterprise integration by resolving connectivity issues for self-hosted GitLab instances, enabling AI agent deployment within pri…

