Pre-warmed containers with codebase

Pre-warmed containers are agent environments that have been prepared in advance and are waiting in a ready state before any task is assigned to them.

·Isolated agent environments (devbox model) prevent agents from accessing other projects
·Pre-warmed containers with codebase at HEAD and dependencies installed are available
·Network isolation prevents agents from reaching production systems

·Container warm pool size matches team's agent usage patterns
·Network isolation rules are tested and audited quarterly

Evidence

·Devbox configuration showing per-project isolation boundaries
·Pre-warmed container pool metrics (pool size, warm hit rate, cold start rate)
·Network policy configuration (Kubernetes NetworkPolicy, firewall rules) blocking production access

What It Is

Pre-warmed containers are agent environments that have been prepared in advance and are waiting in a ready state before any task is assigned to them. Instead of creating a container from scratch when a task arrives (pull image, start container, clone repo, install dependencies - often 60-180 seconds, and longer for large repositories), you maintain a pool of containers that have already completed this initialization. When a task arrives, it is assigned to a pre-warmed container immediately, with startup overhead measured in seconds rather than minutes.

The preparation work happens asynchronously in the background, driven by webhooks or scheduled jobs. When a developer pushes to a branch, the CI system creates a pool of pre-warmed containers checked out to that commit, with dependencies installed and the codebase ready. When an agent task is submitted, it is handed a pre-warmed container from the pool, given its task-specific credentials, and starts executing immediately. The container that just received a task is replaced in the pool by a new pre-warmed container being prepared in the background.
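The assign-and-replace cycle described above can be sketched as a small in-memory pool. This is a minimal illustration under stated assumptions, not a production design: `WarmPool`, `Container`, and `fake_provision` are hypothetical names, and `fake_provision` stands in for the slow work (pull image, start container, clone repo, install dependencies) that a real pool manager would run in the background.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count
from typing import Callable, Deque, Optional


@dataclass
class Container:
    """Hypothetical handle for one pre-warmed container."""
    container_id: str
    commit: str


class WarmPool:
    """Keeps up to `target_size` ready containers for a single commit."""

    def __init__(self, target_size: int, commit: str,
                 provision: Callable[[str], Container]) -> None:
        self.target_size = target_size
        self.commit = commit
        self._provision = provision  # performs the slow clone + install work
        self._ready: Deque[Container] = deque()

    def replenish(self) -> int:
        """Top the pool back up and return how many containers were created."""
        created = 0
        while len(self._ready) < self.target_size:
            self._ready.append(self._provision(self.commit))
            created += 1
        return created

    def acquire(self) -> Optional[Container]:
        """Hand out a ready container, or None if the pool is empty."""
        if not self._ready:
            return None  # caller falls back to a cold start or queues the task
        return self._ready.popleft()

    @property
    def available(self) -> int:
        return len(self._ready)


_ids = count(1)


def fake_provision(commit: str) -> Container:
    # Stand-in for: pull image, start container, clone repo, install dependencies.
    return Container(container_id=f"box-{next(_ids)}", commit=commit)
```

A caller would run `replenish()` after every `acquire()` (or on a timer) so that the container handed to a task is replaced without the task waiting for it.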

The codebase cloning and dependency installation steps are the main startup cost drivers. A large monorepo can take 2-5 minutes to clone. A Node.js project with a large node_modules can take 3-10 minutes to install. Pre-warming eliminates both costs from the critical path. The agent task starts with a container that already has main cloned and npm install completed - all the developer sees is near-instant task start.
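To see which step dominates in your own setup, it can help to time each phase separately rather than only the total. The helper below is a rough sketch: `time_phases` and the phase names are hypothetical, and each callable stands in for a real command such as a clone or a dependency install.

```python
import time
from typing import Callable, Dict, List, Tuple

# A phase is a label plus a zero-argument callable that performs the work.
Phase = Tuple[str, Callable[[], None]]


def time_phases(phases: List[Phase]) -> Dict[str, float]:
    """Run each phase in order and return its wall-clock duration in seconds."""
    timings: Dict[str, float] = {}
    for name, run in phases:
        start = time.perf_counter()
        run()
        timings[name] = time.perf_counter() - start
    return timings


# Example wiring (assumed commands, adjust to your project):
#   import subprocess
#   phases = [
#       ("clone", lambda: subprocess.run(["git", "clone", REPO_URL, "repo"], check=True)),
#       ("install", lambda: subprocess.run(["npm", "install"], cwd="repo", check=True)),
#   ]
#   print(time_phases(phases))
```

The per-phase numbers show how much of the startup cost pre-warming would actually move off the critical path.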

Pre-warming requires infrastructure investment: a pool manager that maintains the right number of containers for anticipated load, a mechanism to keep pools current as commits are pushed, and storage for the container pool state. But the payoff - startup times that match interactive developer expectations - is what makes autonomous agent workflows practical at scale. Stripe's public benchmark of 10-second devbox spin-up is achieved through pre-warming; cold-start containers cannot reach that number.

Why It Matters

Startup time determines whether developers wait or move on - a 3-minute devbox startup breaks developer flow, while a 10-second startup is effectively transparent. That difference in perceived latency determines whether developers use the infrastructure or work around it
Pre-warming amortizes expensive operations - codebase cloning and dependency installation happen once per pool refresh rather than once per task; for large codebases and dependency trees, this is a 10-100x reduction in per-task overhead
Pool management enables resource planning - maintaining a pool of pre-warmed containers provides a clean abstraction for capacity planning: you know exactly how many containers are available, how many tasks they can absorb, and when to scale the pool
Warm state includes build artifacts - pre-warmed containers can include pre-built artifacts (compiled binaries, transpiled JavaScript, populated build caches) that further accelerate task execution beyond just codebase presence
Enables responsive multi-agent dispatch - when a developer dispatches 5 parallel agent tasks, all 5 should start within seconds; without pre-warming, dispatching 5 tasks means 5 parallel 3-minute initializations before any work begins

Getting Started

Measure your current cold-start time - Time the full initialization sequence from container start to agent-ready: pull image, clone repository at HEAD, install dependencies, verify the agent can run. This baseline is what pre-warming needs to eliminate from the critical path.
Build a pool manager - Create a service that maintains N pre-warmed containers (start with N=5) ready to accept tasks. The pool manager should: create new containers to replace assigned ones, refresh containers when the main branch is updated, and kill containers that have been idle for more than a configurable duration (e.g., 30 minutes).
Optimize the base snapshot - Instead of running git clone and npm install in every container initialization, create a Docker image snapshot that includes the cloned repository and installed dependencies at a recent commit. Container initialization then reduces to pulling the snapshot and applying recent commits (git pull), which is much faster than a full clone.
Implement commit-triggered pool refresh - Set up a webhook from your repository (GitHub webhook or GitLab webhook) that triggers pool refresh when commits are pushed to monitored branches. Pool refresh should run initialization of new containers against the new commit and drain the old containers once the new ones are ready.
Add pool health monitoring - Monitor pool health with metrics: available container count, time-to-assign, initialization failure rate, and container age. Alert when the available count drops below 2 (potential task queuing) and when initialization failure rate exceeds 5%.
Tune pool size for your load pattern - Start with a small pool (3-5 containers) and observe assignment patterns over one week. If containers are frequently unavailable (tasks queue), increase the pool size. If most containers are idle when tasks arrive, reduce pool size. Target a pool utilization of 50-70%.
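The alert thresholds and utilization target named in the steps above can be written down as a small check. This is a sketch only: `PoolStats`, `pool_alerts`, and `sizing_hint` are hypothetical names, the defaults simply mirror the numbers in the text (fewer than 2 available, more than 5% initialization failures, 50-70% utilization), and how utilization is measured is left to your metrics system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PoolStats:
    available: int      # containers ready for assignment right now
    init_attempts: int  # container initializations attempted in the window
    init_failures: int  # initializations that failed in the window


def pool_alerts(stats: PoolStats,
                min_available: int = 2,
                max_failure_rate: float = 0.05) -> List[str]:
    """Return the names of any alert conditions that currently hold."""
    alerts: List[str] = []
    if stats.available < min_available:
        alerts.append("low_available")  # tasks may start queuing
    if stats.init_attempts > 0:
        failure_rate = stats.init_failures / stats.init_attempts
        if failure_rate > max_failure_rate:
            alerts.append("high_init_failure_rate")
    return alerts


def sizing_hint(utilization: float, low: float = 0.50, high: float = 0.70) -> str:
    """Suggest a pool-size change from utilization observed over a period."""
    if utilization > high:
        return "grow"
    if utilization < low:
        return "shrink"
    return "hold"
```

For example, `pool_alerts(PoolStats(available=1, init_attempts=100, init_failures=10))` reports both conditions, and `sizing_hint(0.6)` suggests leaving the pool size alone.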

Tip

Separate the "warm" state from the "personalized" state. A pre-warmed container has the codebase and dependencies ready. The task-specific personalization (credentials, specific branch checkout, task description) happens at assignment time and should take under 10 seconds. This separation means the expensive preparation work is generic (done once for all tasks) while the cheap customization is specific (done per task).
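One way to keep that separation honest is to model the two kinds of state as distinct types, so task-specific data can never be written into the generic warm state. The sketch below is illustrative; `WarmState`, `TaskSpec`, `AssignedContainer`, and `personalize` are hypothetical names rather than part of any particular tool.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class WarmState:
    """Generic, expensive state prepared once and shared by any task."""
    commit: str
    dependencies_installed: bool = True


@dataclass(frozen=True)
class TaskSpec:
    """Cheap, task-specific state applied at assignment time."""
    task_id: str
    branch: str
    description: str
    credentials: Dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class AssignedContainer:
    warm: WarmState
    task: TaskSpec


def personalize(warm: WarmState, task: TaskSpec) -> AssignedContainer:
    """Attach task-specific state to a warm container without mutating it."""
    if not warm.dependencies_installed:
        raise ValueError("container is not warm; refusing to assign")
    return AssignedContainer(warm=warm, task=task)
```

Because `WarmState` is frozen, anything a task needs has to travel in `TaskSpec`, which keeps the per-task step small enough to fit the under-10-second budget.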

Common Pitfalls

Pre-warming to the wrong commit. If the pool is pre-warmed to main but an agent task needs to work on a feature branch, the container needs to apply the branch changes before starting work. This is fast for recent branches but slow for long-lived branches with many divergent commits. Pre-warm pools for your most active branches, not just main.

Not refreshing the pool after large dependency changes. When package.json, requirements.txt, or go.mod changes substantially, the pre-warmed containers have the old dependencies. The first few tasks after such a change will either fail (wrong dependency versions) or take the full cold-start time to reinstall. Trigger a pool refresh whenever dependency manifests change.
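A push webhook handler can decide whether a dependency refresh is needed by looking at the changed file paths. The sketch below assumes a GitHub-style push payload, where each entry in `commits` carries `added`, `modified`, and `removed` path lists; `changed_paths`, `needs_dependency_refresh`, and the manifest set are hypothetical and would need extending (lockfiles, other ecosystems) for a real project.

```python
from pathlib import PurePosixPath
from typing import Any, Dict, Iterable, Set

# Manifest names taken from the text above; extend for your stack.
DEPENDENCY_MANIFESTS: Set[str] = {"package.json", "requirements.txt", "go.mod"}


def changed_paths(push_event: Dict[str, Any]) -> Set[str]:
    """Collect every file path touched by the commits in a push payload."""
    paths: Set[str] = set()
    for commit in push_event.get("commits", []):
        for key in ("added", "modified", "removed"):
            paths.update(commit.get(key, []))
    return paths


def needs_dependency_refresh(paths: Iterable[str]) -> bool:
    """True if any changed file is a dependency manifest, at any depth."""
    return any(PurePosixPath(p).name in DEPENDENCY_MANIFESTS for p in paths)
```

Matching on the file name rather than the full path means manifests in subdirectories of a monorepo also trigger a refresh.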

Letting pre-warmed containers accumulate state from previous tasks. If a container is "cleaned" after a task but not fully reset, state from one task can leak to the next. A pre-warmed container should be in a pristine state - the same state as a fresh initialization. Any container that has executed a task (even if the task was killed cleanly) should be destroyed and replaced, not recycled.

Pool size not tracking agent task load patterns. Agent load is not uniform: there are bursts (end of sprint, release preparation) and lulls. A static pool sized for peak load wastes resources during lulls. A static pool sized for average load queues tasks during bursts. Implement dynamic pool sizing that scales with the number of queued tasks.
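A very simple form of dynamic sizing grows the target with the queue and clamps it between a floor and a ceiling. This is a sketch under assumptions: `target_pool_size` is a hypothetical function, the default bounds are placeholders, and a real scaler would probably also smooth the signal to avoid thrashing.

```python
def target_pool_size(queued_tasks: int, min_size: int = 3, max_size: int = 20) -> int:
    """Return the desired number of warm containers for the current queue depth."""
    if queued_tasks < 0:
        raise ValueError("queued_tasks must be non-negative")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    # Keep the baseline pool, plus one extra container per queued task.
    desired = min_size + queued_tasks
    # Never drop below the floor or grow past the ceiling.
    return max(min_size, min(desired, max_size))
```

With the defaults, an empty queue keeps the pool at 3, four queued tasks raise the target to 7, and a burst of 100 queued tasks is capped at 20.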

Pre-warming on build machines rather than close to execution. Pre-warmed containers should live on the same infrastructure where tasks will execute. A pre-warmed container image created on a build server and then transferred to an execution node loses most of the startup benefit to the transfer time. Pre-warming should happen on the target execution nodes.

How Different Roles See It

Bob, Head of Engineering

Bob's team has adopted the devbox model and it is working, but developers are complaining about wait times. Spinning up a devbox for a new task takes 3-4 minutes, which is long enough that developers frequently switch to other work and lose context. Bob wants to improve the experience but does not know if the infrastructure investment in pre-warming is worth it.

What Bob should do: Bob should measure the opportunity cost of the current wait times. If developers average 5 agent tasks per day and each task has a 3-minute cold start, that is 15 minutes per developer per day of spinning-up time - about 3% of an 8-hour day. For a team of 20 developers, that is 300 person-minutes (5 person-hours) per day waiting for devboxes to start. Over a quarter, that is significant. Bob should commission a two-week infrastructure sprint to implement a basic pre-warmed pool. If the sprint brings startup times under 30 seconds, the ROI calculation easily justifies the investment.

Sarah, Productivity Lead

Sarah is tracking agent task throughput and has noticed that developers run fewer agent tasks than expected. When she interviews developers, a consistent theme emerges: "by the time the devbox is ready, I've already moved on to something else." The 3-minute startup is enough friction to prevent developers from using agents for small tasks that do not feel worth the wait. Pre-warming would unlock a category of quick agent tasks that are currently underused.

What Sarah should do: Sarah should identify the "quick tasks" that developers would hand to agents if startup took under 30 seconds but currently skip because of the 3-minute wait. Likely candidates: code explanation, generating a test for a single function, checking a regex, updating a type definition. These tasks individually seem small, but collectively they represent significant developer throughput. Sarah should present the pre-warming investment as "unlocking the quick-task category" with a concrete estimate of how many quick tasks per developer per day would be automated with under-30-second startup.

Victor, Staff Engineer - AI Champion

Victor has been benchmarking devbox performance and knows that the cold-start time on the team's current setup is 4 minutes. He has read about Stripe's 10-second benchmark and has a clear theory about how to achieve it: a pre-warmed pool combined with a snapshot-based image that includes the repository at HEAD with dependencies installed. He estimates 2 days to prototype and 1 week to productionize.

What Victor should do: Victor should build the prototype. Create the snapshot-based image (repository + dependencies baked in), write a minimal pool manager (a shell script or small Go binary that maintains N containers), and measure the resulting startup time. If he can demonstrate 10-second startup times in a prototype, he has the proof of concept that converts the pre-warming discussion from "interesting idea" to "we know it works and here is how to build it." Victor should also document the exact steps to reproduce the prototype so the infrastructure team can build on it rather than starting from scratch.
