Infrastructure - L5 Autonomous: Agent Runtime & Sandboxing

Each agent gets an isolated machine (the Cursor approach), or agents share machines with smart resource management


  • Dedicated compute infrastructure exists for the agent fleet (not shared with developer workstations or production)
  • Agent fleet auto-scales with load (agents scale up during business hours, scale down off-hours)
  • Each agent runs in a fully isolated environment (Cursor approach: one machine per agent, or smart resource management)
  • Cost per agent-hour is tracked and optimized
  • Fleet scaling responds to demand within 60 seconds

Evidence

  • Infrastructure allocation showing dedicated agent compute (separate from dev and prod)
  • Auto-scaling configuration and scaling event logs
  • Agent fleet dashboard showing per-agent isolation and resource utilization

What It Is

At L5, organizations running agent fleets at scale face a fundamental architectural choice: does each agent get its own isolated machine (strong isolation, higher cost), or do multiple agents share machines with smart resource management (weaker isolation per agent, lower cost, higher density)? Both approaches work at scale; the choice depends on the organization's security requirements, cost constraints, and the types of tasks agents run.

The isolated machine approach - one physical or virtual machine per agent task - provides the strongest security guarantees. Cursor's engineering blog describes this as their production approach for running background agents: each agent task runs in its own Firecracker microVM on its own physical host. There is no sharing of memory, kernel, or storage between tasks. If one agent task is compromised or behaves unexpectedly, it has zero ability to affect any other task. This isolation level is appropriate for tasks that handle sensitive code, process secrets, or make financial decisions. The cost is significant: you are paying for a full machine for each agent, regardless of how much of that machine the agent uses.
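Cursor has not published its provisioning code, but a minimal sketch can show what "one microVM per task" means at the Firecracker API level: each task gets its own machine config, kernel, and private root filesystem, with nothing shared between VMs. All paths, sizes, and the task ID below are hypothetical placeholders.

```python
import json


def firecracker_task_config(task_id: str, vcpus: int = 2, mem_mib: int = 1024) -> dict:
    """Build the per-task payloads a provisioner would send to the
    Firecracker API socket. One VM per agent task: dedicated CPU and
    memory, a private kernel, and a private root filesystem."""
    return {
        # PUT /machine-config: resources dedicated to this task only
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
        # PUT /boot-source: each microVM boots its own kernel
        "boot-source": {
            "kernel_image_path": f"/var/agents/{task_id}/vmlinux",
            "boot_args": "console=ttyS0 reboot=k panic=1",
        },
        # PUT /drives/rootfs: private storage, never shared across tasks
        "drives": [{
            "drive_id": "rootfs",
            "path_on_host": f"/var/agents/{task_id}/rootfs.ext4",
            "is_root_device": True,
            "is_read_only": False,
        }],
    }


cfg = firecracker_task_config("task-42")
print(json.dumps(cfg["machine-config"]))
```

The cost consequence is visible in the structure: every field is per-task, so every task pays for a full machine's worth of configuration and capacity.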

The shared machine approach runs multiple agents on the same node with resource management controls (Kubernetes resource limits, cgroups, separate namespaces) to isolate them from each other. This is higher density and lower cost. A single node running 20 agent containers can process 20 simultaneous tasks at a fraction of the cost of 20 separate VMs. The isolation is not as strong - agents share the kernel, and container escape vulnerabilities can theoretically allow one agent to affect others on the same host. But for many use cases, especially tasks that do not handle particularly sensitive data, this risk is acceptable.
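In Kubernetes terms, the shared-machine approach comes down to a container spec per agent with hard resource limits (enforced by cgroups) plus a locked-down security context. The image name and default sizes below are assumptions for illustration, not a prescribed configuration.

```python
def agent_container_spec(task_id: str, cpu: str = "2", memory: str = "4Gi") -> dict:
    """Container spec for one agent sharing a node with ~10-20 others.
    Limits cap each agent's CPU and memory; the kernel is still shared,
    which is exactly the isolation trade-off of this approach."""
    return {
        "name": f"agent-{task_id}",
        "image": "agent-runtime:latest",  # hypothetical image name
        "resources": {
            # requests == limits puts the pod in the Guaranteed QoS class,
            # so one noisy agent cannot starve its neighbors
            "requests": {"cpu": cpu, "memory": memory},
            "limits": {"cpu": cpu, "memory": memory},
        },
        "securityContext": {
            "runAsNonRoot": True,
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
        },
    }


spec = agent_container_spec("7")
print(spec["resources"]["limits"])
```

The security context narrows the container-escape surface but cannot eliminate it; that residual kernel-sharing risk is what the sensitivity tiers in the hybrid model are for.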

In practice, most organizations at L5 use a hybrid approach: different isolation tiers for different task types. Routine development tasks (write a test, fix a lint error, update documentation) run on shared machines with container isolation. High-sensitivity tasks (working in the payments service, modifying authentication code, handling credentials) run on isolated machines. The routing logic that assigns tasks to the appropriate tier is itself a nontrivial engineering problem, and its accuracy determines how well the hybrid architecture delivers on both security and cost.
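The tier-routing logic can be sketched as a small classifier over task metadata. The sensitivity signals below (path prefixes, capability names) are invented for illustration; a real router would draw them from code ownership, path rules, and secret-scanning metadata.

```python
from dataclasses import dataclass, field

ISOLATED_VM = "isolated-vm"
SHARED_CONTAINER = "shared-container"

# Hypothetical sensitivity signals for this sketch
SENSITIVE_PATHS = ("services/payments/", "services/auth/")
SENSITIVE_CAPABILITIES = {"secrets:read", "prod:deploy"}


@dataclass
class AgentTask:
    task_id: str
    touched_paths: list = field(default_factory=list)
    capabilities: set = field(default_factory=set)


def route(task: AgentTask) -> str:
    """Assign a task to an isolation tier: VM isolation if it needs a
    sensitive capability or touches sensitive code, containers otherwise."""
    if task.capabilities & SENSITIVE_CAPABILITIES:
        return ISOLATED_VM
    if any(p.startswith(SENSITIVE_PATHS) for p in task.touched_paths):
        return ISOLATED_VM
    return SHARED_CONTAINER


print(route(AgentTask("t1", ["services/payments/api.py"])))  # isolated-vm
print(route(AgentTask("t2", ["docs/readme.md"])))            # shared-container
```

Note the asymmetry: the router defaults to the cheap tier and escalates on any sensitive signal, so classification mistakes fail toward over-isolation only when the signals themselves are complete. Keeping the signal sources accurate is where most of the engineering effort goes.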

Why It Matters

  • The cost difference between approaches is large - isolated VMs for 100 simultaneous agents can cost 5-10x more than containerized agents on shared nodes; for large agent fleets, this cost difference is a significant budget consideration
  • Isolation model defines your security guarantee - the choice between Firecracker VMs and shared containers is a choice about the strength of the security guarantee you can make; organizations with strong security requirements need VM-level isolation, not just container-level isolation
  • Disk I/O sharing is the key performance bottleneck - Cursor's engineering team identified disk I/O (not CPU or memory) as the bottleneck when running hundreds of agents on shared machines; the architecture choice must account for storage isolation and IOPS allocation
  • Bin packing efficiency matters at scale - 1,000 agent tasks on isolated VMs requires 1,000 VMs; the same tasks on shared nodes might require 50-100 nodes, each running 10-20 agent containers; the infrastructure management overhead of 1,000 VMs vs. 100 nodes is substantial
  • The hybrid architecture scales better than pure approaches - a pure isolated-VM approach is too expensive at very high agent counts; a pure shared-container approach creates security concerns for sensitive tasks; the hybrid gives organizations a principled way to scale both security and cost simultaneously
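The bin-packing and cost claims above can be made concrete with a toy hourly cost model. All prices and densities here are made-up placeholders, chosen only so the numbers land in the ranges the bullets describe.

```python
import math


def fleet_cost(tasks: int, vm_hourly: float = 0.50,
               node_hourly: float = 1.50, agents_per_node: int = 20) -> dict:
    """Toy comparison: one VM per task vs. tasks bin-packed onto shared
    nodes. Prices are illustrative placeholders, not real cloud rates."""
    isolated = tasks * vm_hourly                      # 1 VM per task
    nodes = math.ceil(tasks / agents_per_node)        # bin packing
    shared = nodes * node_hourly
    return {
        "isolated_vms": isolated,
        "shared_nodes": nodes,
        "shared_cost": shared,
        "ratio": isolated / shared,
    }


print(fleet_cost(1000))
```

With these placeholder rates, 1,000 tasks need 1,000 VMs versus 50 shared nodes, and the isolated approach costs several times more, which is the shape of the trade-off the bullets describe, whatever the exact prices in a given cloud.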


How Different Roles See It

Bob - Head of Engineering

Bob's organization is planning a major expansion of agent usage that will bring simultaneous agent task counts from 50 to 300-500. At this scale, the infrastructure architecture choice has a significant cost and complexity impact. Bob needs to make an architectural decision that will serve the organization for 2-3 years, not just the current scale.


Sarah - Productivity Lead

Sarah cares about two things: developer experience (are agents fast and reliable?) and cost (are we spending money on agent infrastructure wisely?). Both concerns point to the same question about architecture: are we spending isolation budget in the right places? Spending a lot on isolated VMs for low-sensitivity tasks is poor ROI. Running sensitive tasks on shared containers is a security risk. Sarah wants the hybrid architecture but needs the data to support the routing decisions.


Victor - Staff Engineer, AI Champion

Victor has been following Cursor's engineering blog and has a clear technical opinion: Firecracker with per-task VMs is the right architecture for high-sensitivity work, and shared containers with tight cgroup controls are right for everything else. He has experimented with both and can provide concrete performance data: Firecracker startup at 200ms, container startup at 1 second, disk IOPS per task in both configurations.

