CI < 2 minutes
CI under 2 minutes is the Optimized (L4) milestone and represents a qualitative shift in how CI is used.
- CI completes in under 2 minutes (median)
- Ephemeral sandbox environments spin up in under 10 seconds for agent CI loops
- Agent sandbox CI supports 50+ iteration attempts in 5 minutes without blocking team CI queue
- P95 CI duration is under 3 minutes
- CI feedback latency (from push to result) is tracked and reported
Evidence
- CI run duration dashboard showing median under 2 minutes
- Sandbox spin-up time metrics showing sub-10-second P50
- Agent CI iteration logs showing 50+ attempts within 5-minute windows
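The median and P95 thresholds above can be checked directly from exported run durations. The following is a minimal sketch, assuming durations are available as a list of seconds; the function names and the sample data are hypothetical, and the nearest-rank percentile used here is one of several common definitions, so a dashboard tool may report slightly different values.

```python
import math
from statistics import median


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def meets_l4_targets(durations_s, median_limit_s=120, p95_limit_s=180):
    """Compare CI run durations (seconds) against the 2-minute median and 3-minute P95 targets."""
    med = median(durations_s)
    p95 = percentile(durations_s, 95)
    return {
        "median_s": med,
        "p95_s": p95,
        "median_ok": med < median_limit_s,
        "p95_ok": p95 < p95_limit_s,
    }


# Hypothetical sample of ten CI run durations, in seconds.
sample = [85, 92, 101, 97, 110, 88, 95, 130, 170, 99]
report = meets_l4_targets(sample)
print(report)
```

With only a handful of samples, P95 is effectively the slowest run, so a real check would likely use a rolling window of at least a few hundred runs.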
What It Is
CI under 2 minutes is the Optimized (L4) milestone and represents a qualitative shift in how CI is used. At this speed, CI is no longer a gate that agents and developers wait for - it's a feedback mechanism they interact with continuously. A 90-second CI run means an agent can iterate 40 times per hour. The loop becomes tight enough that agents can run exploratory iteration: try an approach, get feedback, adjust, repeat - in a way that resembles test-driven development but at machine speed.
Achieving sub-2-minute CI is not primarily a configuration problem at this level. The pipeline architecture has already been optimized with caching, parallelization, and incremental builds. What remains is infrastructure-level work: pre-built base images that cut container setup time, warm runner pools that remove job startup latency, and distributed build systems (Bazel, Pants, Buck) that can execute build and test steps in parallel across many machines rather than within a single runner.
The 2-minute target also changes what "passing CI" means. At 5-minute CI, teams run a meaningful fast-path and accept that some slower tests run post-merge. At 2-minute CI, teams invest in making the meaningful fast-path comprehensive enough to be a real quality gate. This requires significant test architecture work: tests that run in 2 minutes across a large codebase must be highly selective, highly parallelized, and architecturally isolated from I/O. Teams at L4 have typically invested 6-12 months getting their test suite to this shape.
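One way to make a fast path "highly selective" is to run only the tests for modules affected by a change. The sketch below is an illustration of that general idea, not the method of any particular build system: it assumes a hand-written module dependency map and a test-to-module map, both hypothetical. Build systems such as Bazel derive this information from their own build graph.

```python
from collections import deque


def affected_modules(changed, deps):
    """Return the changed modules plus every module that transitively depends on them.

    deps maps a module name to the list of modules it depends on.
    """
    # Invert the graph: module -> modules that depend on it.
    dependents = {}
    for module, requires in deps.items():
        for required in requires:
            dependents.setdefault(required, set()).add(module)

    # Breadth-first walk over the inverted graph.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def select_tests(changed, deps, tests_by_module):
    """List the tests belonging to every affected module, in module-name order."""
    selected = []
    for module in sorted(affected_modules(changed, deps)):
        selected.extend(tests_by_module.get(module, []))
    return selected


# Hypothetical module graph and test mapping.
deps = {"api": ["core"], "billing": ["core", "api"], "docs": [], "core": []}
tests_by_module = {
    "core": ["test_core"],
    "api": ["test_api"],
    "billing": ["test_billing"],
    "docs": ["test_docs"],
}

print(select_tests(["api"], deps, tests_by_module))
```

A change to a leaf module such as `docs` selects one test, while a change to `core` selects nearly everything, which is why architectural isolation matters as much as the selection logic.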
The payoff at 2-minute CI is visible in throughput numbers. Stripe's engineering blog describes internal tooling that supports 1,000+ merges per week on monorepos, enabled by fast CI. Teams running dozens of parallel agents can sustain high PR throughput without CI becoming the bottleneck. The 2-minute threshold is where CI scales with agent usage rather than constraining it.
Why It Matters
- Agent iteration reaches machine speed - 30 iteration cycles per hour at a 2-minute run, and 40 at 90 seconds, means agents can work through complex multi-step implementations within a single 30-minute window
- CI becomes a development tool, not a gate - at 2 minutes, developers and agents use CI as a rapid feedback mechanism during development, not just before merge
- Merge queues become high-throughput - a 2-minute pipeline can process 30 PRs per hour per runner, enabling teams running dozens of parallel agents to sustain throughput without queuing
- Enables agent autonomy - when CI feedback arrives in 2 minutes, agents can close their own iteration loops without human intervention; the human reviews the final result, not each intermediate step
- Dramatically reduces total delivery time - a feature that requires 15 CI iterations takes 30 minutes at 2-minute CI versus 75 minutes at 5-minute CI; at scale, this compounds across every story point
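The arithmetic behind these figures is simple enough to check in a few lines. This is a minimal sketch with hypothetical function names; the results are upper bounds that assume back-to-back runs with no queueing and no agent or developer think time between iterations.

```python
def iterations_per_hour(ci_seconds):
    """Upper bound on CI iterations per hour for a given run duration."""
    return 3600 / ci_seconds


def total_ci_wait_minutes(iterations, ci_minutes):
    """Total time spent waiting on CI for a feature needing this many iterations."""
    return iterations * ci_minutes


print(iterations_per_hour(90))        # 90-second run
print(iterations_per_hour(120))       # 2-minute run
print(total_ci_wait_minutes(15, 2))   # 15 iterations at 2-minute CI
print(total_ci_wait_minutes(15, 5))   # 15 iterations at 5-minute CI
```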
How Different Roles See It
Bob's team has reached 5-minute CI and the improvement in agent productivity is visible: developers report getting more done in agent sessions and PR throughput has increased 40% over the past quarter. Now he wants to push to 2 minutes to unlock the full Optimized-level capability. The challenge is that reaching 2 minutes requires runner infrastructure investment and potentially adopting Bazel, both of which require budget and platform team support.
Bob should build a business case framing 2-minute CI as agent infrastructure investment. At 5-minute CI, an agent can complete at most 12 iterations per hour; at 2 minutes, that ceiling rises to 30. If 10 developers each run 2 one-hour agent sessions per day, that is 20 agent-hours per day, or up to 360 additional iterations per day (roughly 7,200 per month, assuming 20 working days). Bob should estimate what that's worth in terms of features delivered (using historical velocity data) and compare it to the cost of dedicated runners and the platform team's time for the Bazel migration. The ROI calculation is often compelling: faster CI infrastructure can pay for itself within 1-2 months when teams are running AI agents at scale.
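A business case like this can be reduced to a small calculation. The sketch below is illustrative only: the function names are hypothetical, the iteration rates are passed in as parameters so measured values can be substituted, and the example uses 60 divided by the CI duration in minutes as a ceiling that real sessions will not reach. The cost and value figures in the payback call are placeholders, not data.

```python
def additional_iterations_per_month(devs, sessions_per_day, hours_per_session,
                                    rate_before, rate_after, working_days=20):
    """Extra agent iterations per month gained by moving to a higher iteration rate.

    rate_before and rate_after are iterations per agent-hour.
    working_days is an assumption; adjust it to the team's calendar.
    """
    agent_hours = devs * sessions_per_day * hours_per_session * working_days
    return agent_hours * (rate_after - rate_before)


def payback_months(one_time_cost, monthly_cost, monthly_value):
    """Months until the one-time cost is recovered, or None if it never is."""
    net = monthly_value - monthly_cost
    if net <= 0:
        return None
    return one_time_cost / net


# Ceiling rates derived from CI duration; measured rates will be lower.
extra = additional_iterations_per_month(
    devs=10, sessions_per_day=2, hours_per_session=1,
    rate_before=60 / 5, rate_after=60 / 2,
)
print(extra)

# Placeholder figures purely to show the call shape.
print(payback_months(one_time_cost=30_000, monthly_cost=2_000, monthly_value=20_000))
```

The hard part of the real calculation is `monthly_value`: converting extra iterations into delivered features needs the team's own historical velocity data.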
Sarah's CI feedback latency dashboard now shows consistent 5-minute CI, and she's tracking agent iteration rate as a derived metric. The data shows that at 5 minutes, the team's most productive agent users are hitting a ceiling: their iteration rate plateaus at about 6 per hour, suggesting the CI wait is still the bottleneck even at 5 minutes. The developers describe it as "I start the next task while waiting for CI, then I'm context-switched and CI takes me a while to re-engage with."
Sarah should add a metric for "CI re-engagement latency" - how long after CI completes does it take the developer to respond to results? If this is consistently 3-5 minutes (the re-context-switch overhead), then the effective feedback loop is 8-10 minutes even with 5-minute CI. This finding makes the case for 2-minute CI more precisely: it's not just about raw speed, it's about whether the feedback arrives within the developer's attention window. Sarah should present this insight with data and propose 2-minute CI as the target that would keep feedback within the active attention window for most developers.
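The re-engagement metric can be derived from two timestamps per run. This is a minimal sketch, assuming each event is a pair of epoch-second timestamps (CI completion, first developer response); the event format and function names are hypothetical and would depend on what the CI system and code host actually expose.

```python
from statistics import median


def reengagement_latencies(events):
    """Minutes between CI completion and the developer's first response, per run.

    events is a list of (ci_completed_at, developer_responded_at) pairs in epoch seconds.
    """
    return [(responded - completed) / 60 for completed, responded in events]


def effective_loop_minutes(ci_minutes, reengagement_minutes):
    """Effective feedback loop: CI duration plus the time to re-engage with the result."""
    return ci_minutes + reengagement_minutes


# Hypothetical events: responses 3 and 5 minutes after CI finished.
events = [(0, 180), (1_000, 1_300)]
latencies = reengagement_latencies(events)
typical = median(latencies)

print(effective_loop_minutes(5, typical))       # effective loop in minutes
print(60 / effective_loop_minutes(5, typical))  # implied iterations per hour
```

An effective loop of 8-10 minutes implies roughly 6 to 7.5 iterations per hour, which is consistent with the plateau of about 6 per hour that the dashboard shows.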
Victor has been experimenting with Bazel on a side branch for a month. He's implemented BUILD files for the core library modules and set up EngFlow for remote caching. On those modules, test execution time has dropped from 3 minutes to 45 seconds. He's confident the approach works but needs to show a clear path to team-wide adoption before proposing it.
Victor should turn his Bazel experiment into a concrete migration proposal: which modules to migrate first (start with the most frequently changed modules for maximum impact), what the BUILD file maintenance burden looks like (estimated hours per month to keep BUILD files current), and what the expected CI time improvement is (based on actual measurements from his experiment). He should propose a phased approach: migrate three modules in sprint 1, measure the impact on CI time, and use the data to decide on expanding the migration. This measured approach reduces risk and builds confidence. Victor should also flag the EngFlow option explicitly - many teams that need remote caching for Bazel don't want to operate the caching infrastructure themselves, and EngFlow's managed service is a common enterprise choice.
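Picking the first modules to migrate can start from change frequency. The sketch below assumes a flat list of changed file paths (for example, collected from version-control history over the last quarter) and treats the top-level directory as the module; both the input format and that module convention are assumptions for illustration.

```python
from collections import Counter


def rank_modules_by_churn(changed_paths, top_n=3):
    """Return the top_n most frequently changed top-level directories with their change counts."""
    counts = Counter(
        path.split("/", 1)[0]
        for path in changed_paths
        if "/" in path  # skip files at the repository root
    )
    return counts.most_common(top_n)


# Hypothetical list of changed paths, one entry per file per commit.
changed_paths = [
    "core/a.py", "core/b.py", "api/x.py", "core/a.py",
    "billing/y.py", "api/z.py", "README.md",
]
print(rank_modules_by_churn(changed_paths))
```

Change frequency is only a first filter; modules with slow tests or many dependents may deserve priority even if they change less often.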