Infrastructure
The technical layer that enables (or blocks) agents. From shared Jenkins to ephemeral agent sandboxes.
Agent Runtime & Sandboxing
Level 1
- ·Agents run inside the developer's local IDE (no separate runtime)
- ·No isolation between agent execution and developer's local environment
Evidence
- ·Agent runs as IDE plugin with no containerization or isolation
- ·No sandboxing configuration exists
Level 2
- ·Dedicated development environments exist for agent execution (separate from the developer's primary workspace)
- ·Basic sandboxing via Docker or equivalent containers is implemented
- ·Agent credentials are scoped per project (not a single org-wide key)
Evidence
- ·Docker or container configuration files for agent environments
- ·Credential management configuration showing per-project scoping
- ·Environment provisioning documentation or scripts
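The container-based sandbox with per-project credentials can be sketched as a Compose service; the image name, mount paths, and secret names are illustrative, not a prescribed setup:

```yaml
# docker-compose.agent.yml -- one sandbox per project (names illustrative)
services:
  agent-sandbox:
    image: agent-runtime:latest      # hypothetical agent base image
    read_only: true                  # root filesystem is immutable
    network_mode: none               # no network unless explicitly granted
    volumes:
      - ./:/workspace:rw             # only this project's checkout is visible
    secrets:
      - project_token                # per-project credential, not an org-wide key

secrets:
  project_token:
    file: ./secrets/project_token    # injected at provision time
```

A tmpfs mount for build artifacts and an allow-listed egress proxy are common refinements on this baseline.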
Level 3
- ·Isolated agent environments (devbox model) prevent agents from accessing other projects
- ·Pre-warmed containers with codebase at HEAD and dependencies installed are available
- ·Network isolation prevents agents from reaching production systems
Evidence
- ·Devbox configuration showing per-project isolation boundaries
- ·Pre-warmed container pool metrics (pool size, warm hit rate, cold start rate)
- ·Network policy configuration (Kubernetes NetworkPolicy, firewall rules) blocking production access
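One way to express the production-blocking rule is a Kubernetes NetworkPolicy on the devbox namespace; the namespace, labels, and CIDR ranges below are placeholders:

```yaml
# Deny agent devboxes egress into the production network (ranges illustrative)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-deny-prod-egress
  namespace: agent-devboxes
spec:
  podSelector:
    matchLabels:
      role: agent-devbox
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.20.0.0/16   # production VPC range (placeholder)
```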
Level 4
- ·Ephemeral devboxes spin up in under 10 seconds (Stripe benchmark)
- ·Devboxes come pre-loaded with codebase, dependencies, and MCP tools
- ·Kernel-level policy enforcement restricts agent actions (syscall filtering, resource limits)
Evidence
- ·Devbox spin-up latency dashboard showing P50 under 10 seconds
- ·Devbox snapshot configuration showing pre-loaded codebase, deps, and MCP tools
- ·Kernel policy configuration (seccomp profiles, cgroup limits)
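Kernel-level enforcement via seccomp can be sketched as a default-deny profile that allow-lists only the syscalls the agent workload needs; this list is illustrative and far shorter than any real profile:

```json
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {
      "names": [
        "read", "write", "openat", "close", "fstat", "mmap", "munmap",
        "brk", "futex", "clone", "execve", "wait4", "exit_group",
        "rt_sigaction", "rt_sigreturn"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
```

Applied with `docker run --security-opt seccomp=profile.json`; cgroup limits (CPU, memory, pids) cover the resource side of the criterion.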
Level 5
- ·Dedicated compute infrastructure exists for agent fleet (not shared with developer workstations or production)
- ·Agent fleet auto-scales with load (agents scale up during business hours, scale down off-hours)
- ·Each agent runs in a fully isolated environment (Cursor approach: one machine per agent, or smart resource management)
Evidence
- ·Infrastructure allocation showing dedicated agent compute (separate from dev and prod)
- ·Auto-scaling configuration and scaling event logs
- ·Agent fleet dashboard showing per-agent isolation and resource utilization
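Load-based scaling of the fleet can be sketched with a Kubernetes HorizontalPodAutoscaler; the deployment name and bounds are illustrative, and scheduled off-hours scale-down would need an additional mechanism such as a cron-driven scaler:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: agent-fleet
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: agent-runner     # hypothetical agent deployment
  minReplicas: 2           # off-hours floor
  maxReplicas: 200         # business-hours ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```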
MCP & Tool Integration
Level 1
- ·No MCP servers are configured
- ·Agents rely solely on public API knowledge from training data
Evidence
- ·No MCP configuration files in repository or developer environment
- ·Absence of tool integration beyond IDE built-ins
Level 2
- ·1-3 MCP servers are configured (e.g., Git, Jira, documentation)
- ·MCP setup is documented but configured manually per developer
- ·Basic tool authorization is implemented (agents authenticate to MCP servers)
Evidence
- ·MCP server configuration files (mcp.json or equivalent)
- ·Setup documentation for MCP server installation per developer
- ·MCP server authentication configuration
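A minimal per-developer MCP configuration might look like the following; the exact file location varies by client, and the server package names here are illustrative:

```json
{
  "mcpServers": {
    "git": {
      "command": "uvx",
      "args": ["mcp-server-git", "--repository", "."]
    },
    "jira": {
      "command": "npx",
      "args": ["-y", "jira-mcp-server"],
      "env": { "JIRA_API_TOKEN": "${JIRA_API_TOKEN}" }
    }
  }
}
```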
Level 3
- ·Centralized MCP platform manages server provisioning, configuration, and lifecycle
- ·Domain-specific MCP servers exist (Architecture MCP, Ownership MCP, SLA MCP)
- ·RBAC controls which agents can access which MCP tools
Evidence
- ·MCP platform configuration showing centralized server management
- ·RBAC policy configuration for MCP tool access
- ·MCP server inventory listing domain-specific servers with owners
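There is no standard format for MCP-level RBAC yet; a hypothetical policy file mapping agent roles to permitted tools might look like:

```yaml
# Hypothetical RBAC policy (format illustrative, not a standard)
roles:
  reviewer-agent:
    allow:
      - git.read_file
      - ownership.lookup_owner      # Ownership MCP
  migration-agent:
    allow:
      - git.*
      - architecture.query_deps     # Architecture MCP
    deny:
      - sla.update_target           # write-path tools require an elevated role
```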
Level 4
- ·Toolshed model (Stripe): 400+ tools accessible behind a unified MCP gateway
- ·Agent discovery: agents can query available tools and their capabilities at runtime
- ·MCP governance covers lifecycle management, versioning, and audit logging
Evidence
- ·MCP gateway configuration showing 400+ registered tools
- ·Agent discovery API or protocol documentation with runtime tool listing
- ·MCP governance logs showing lifecycle events (deploy, version, deprecate, audit)
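Runtime tool discovery is built into the protocol: a client issues a tools/list request and gets back each tool's name, description, and input schema. A sketch of the JSON-RPC exchange (tool names illustrative):

```json
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "search_payments",
        "description": "Search payment records by merchant and date range",
        "inputSchema": { "type": "object", "properties": { "merchant": { "type": "string" } } }
      }
    ]
  }
}
```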
Level 5
- ·MCP operates as a bidirectional nervous system: production data flows to agents, agent actions flow to production
- ·Full production loop: Production -> MCP -> Agent -> Code -> Deploy -> Production
- ·Agent-to-Agent Protocol (A2A) and MCP are combined for multi-agent coordination
Evidence
- ·MCP configuration showing bidirectional data flow (production to agent, agent to production)
- ·End-to-end production loop traces (anomaly detected, agent invoked, fix deployed)
- ·A2A protocol configuration showing agent-to-agent communication channels
Build System
Level 1
- ·Build uses default tool configuration (Maven/Gradle defaults, npm scripts without optimization)
- ·Full rebuild runs on every change (no incremental build support)
Evidence
- ·Build configuration file with default/untuned settings
- ·CI logs showing full rebuild on every PR
Level 2
- ·Build caching is implemented (dependency cache, compilation cache)
- ·Parallel build steps are configured (test and lint run concurrently)
- ·Dedicated CI resources are allocated (not shared across all teams)
Evidence
- ·Build cache configuration (Gradle build cache, npm cache, Docker layer cache)
- ·CI pipeline configuration showing parallel step execution
- ·Dedicated runner or resource pool configuration
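For Gradle projects, the caching and parallelism criteria largely reduce to a few flags (equivalents exist for npm caching and Docker layer caching):

```properties
# gradle.properties
org.gradle.caching=true              # reuse task outputs across builds
org.gradle.parallel=true             # execute independent projects concurrently
org.gradle.configuration-cache=true  # skip re-running the configuration phase
```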
Level 3
- ·Advanced build system (Bazel, Buck2, or Pants) is adopted for primary codebase
- ·Remote execution (EngFlow or equivalent) distributes build steps across multiple machines
- ·Incremental builds run only changed targets (not full rebuild)
Evidence
- ·Bazel/Buck2/Pants BUILD files in repository
- ·Remote execution configuration (EngFlow, BuildBuddy, or equivalent)
- ·Build log showing incremental target selection
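With Bazel, incrementality comes from declaring fine-grained targets so only affected targets rebuild; a minimal BUILD file might look like this (paths and names illustrative):

```starlark
# services/payments/BUILD.bazel
java_library(
    name = "payments",
    srcs = glob(["src/main/java/**/*.java"]),
    deps = ["//libs/common"],
)
```

`bazel build //services/payments/...` then rebuilds only targets whose inputs changed, and remote execution distributes the resulting actions across worker machines.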
Level 4
- ·Any change gets build feedback in under 2 minutes
- ·Agent-specific build profiles exist (optimized for agent iteration patterns: fast feedback over comprehensive builds)
- ·Build system understands agent iteration patterns and pre-caches likely next builds
Evidence
- ·Build duration dashboard showing sub-2-minute feedback for all change types
- ·Agent-specific build profile configuration
- ·Pre-cache hit rate metrics for agent iteration patterns
Level 5
- ·Build is a commodity: near-instant feedback for agents regardless of codebase size
- ·Codebase is structured into self-contained modules/crates to eliminate the compilation bottleneck (Cursor lesson)
- ·Disk I/O is optimized for concurrent agent workloads (parallel reads/writes across modules)
Evidence
- ·Build duration dashboard showing near-instant feedback for standard changes
- ·Codebase architecture showing modular structure (crate/module boundaries)
- ·Disk I/O benchmarks for concurrent agent build workloads
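The modular structure the Cursor lesson points at can be sketched, for a Rust codebase, as a workspace of self-contained crates that compile independently; the member names are illustrative:

```toml
# Cargo.toml (workspace root) -- independent crates compile in parallel,
# so concurrent agents rarely contend on the same compilation unit
[workspace]
resolver = "2"
members = [
    "crates/editor-core",
    "crates/agent-runtime",
    "crates/lsp-bridge",
]
```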
Observability & Feedback Loop
Level 1
- ·Basic application logging exists
- ·Alerting fires on application errors
Evidence
- ·Logging configuration in application code
- ·Alert configuration (PagerDuty, Opsgenie, or equivalent)
Level 2
- ·Structured logging is implemented (JSON logs with consistent fields)
- ·OpenTelemetry basic instrumentation is deployed (traces and metrics)
- ·Post-deploy monitoring checks run after each deployment
Evidence
- ·Structured logging configuration showing JSON format with standard fields
- ·OpenTelemetry SDK configuration in application code
- ·Post-deploy monitoring job configuration in CD pipeline
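A structured-logging setup of the kind described above can be sketched in a few lines of stdlib Python; the field names are one common choice, not a standard:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line with consistent fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Merge any structured context passed via `extra={"context": {...}}`.
        if hasattr(record, "context"):
            payload.update(record.context)
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("payment processed", extra={"context": {"order_id": "o-123", "latency_ms": 42}})
```

Because every record is a JSON object with stable keys, downstream pipelines (and agents reading incident context) can filter and aggregate without fragile regex parsing.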
Level 3
- ·Full observability stack is operational (OpenTelemetry + Grafana/Datadog or equivalent)
- ·Production metrics feed into dashboards accessible to all developers
- ·Incident data (post-mortems, error patterns) is available as agent context
Evidence
- ·Observability stack configuration (OTel collector, Grafana dashboards)
- ·Production metrics dashboards with developer access
- ·Incident data accessible via MCP or structured API
Level 4
- ·Production anomaly detection auto-creates tickets and triggers agent investigation
- ·Self-healing for known patterns: agent detects a known error pattern, applies the corresponding fix, deploys, and verifies
- ·Infrastructure recommends code changes based on production data (Vercel SDI model)
Evidence
- ·Auto-ticket creation logs triggered by production anomalies
- ·Self-healing event logs showing detection, fix, deploy, and verification steps
- ·Infrastructure recommendation pipeline configuration (production data to code change suggestions)
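At its core, a self-healing loop for known patterns reduces to a signature-to-remediation registry; the patterns and fixes below are hypothetical stand-ins for a real catalog:

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class KnownPattern:
    """A known production error signature and the remediation it maps to."""
    name: str
    signature: re.Pattern
    fix: Callable[[], str]  # returns the id of the remediation to apply

# Hypothetical registry; a real system would load this from reviewed config.
REGISTRY = [
    KnownPattern(
        name="connection-pool-exhausted",
        signature=re.compile(r"HikariPool-\d+ - Connection is not available"),
        fix=lambda: "bump-pool-size",
    ),
    KnownPattern(
        name="out-of-memory",
        signature=re.compile(r"OOMKilled|java\.lang\.OutOfMemoryError"),
        fix=lambda: "raise-memory-limit",
    ),
]

def triage(log_line: str) -> Optional[str]:
    """Return a remediation id if the line matches a known pattern,
    else None (escalate to a human / open an investigation ticket)."""
    for pattern in REGISTRY:
        if pattern.signature.search(log_line):
            return pattern.fix()
    return None
```

In the full loop, the returned remediation id would drive an agent run that prepares the fix, deploys it, and verifies the anomaly cleared; unmatched lines escalate to humans.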
Level 5
- ·Full production-to-agent loop operates autonomously: anomaly detected, investigated, fixed, tested, deployed
- ·Infrastructure self-drives: code defines infrastructure, production performance informs code changes
- ·Anomaly-to-deploy cycle completes without human intervention for 80%+ of known issue categories
Evidence
- ·End-to-end autonomous fix traces (anomaly to deployed fix with no human steps)
- ·Infrastructure-as-code showing production-informed code changes
- ·Autonomous resolution rate dashboard showing 80%+ for known issue categories
Author Commentary
April 2026 update: MCP is now the universal standard for agent-tool integration. With 97M+ npm downloads and Cursor 3 shipping with 30+ built-in MCP plugins, the "should we adopt MCP?" question is settled. The question now is "how mature is your MCP layer?", anywhere from L1 (zero) to L5 (nervous system). Teams without any MCP servers are falling behind the baseline.

Disk I/O is the hidden bottleneck of multi-agent systems. Cursor discovered this while building a browser with hundreds of agents: compiling a monolith means many GB/s of reads and writes. The solution is to restructure the project into self-contained crates/modules. The same applies to JVM codebases: modularization isn't just clean code, it's agent throughput.

Stripe's devbox (10-second spin-up, pre-warmed) is the gold standard for an isolated agent runtime. Replicating it requires investment, but the alternative (the agent running on a developer's laptop) doesn't scale beyond L2.