Shared runner queue
- CI pipeline exists but takes longer than 15 minutes
- Agents receive no real-time CI feedback (they wait for full pipeline completion)
- CI runs on every PR (not just on manual trigger)
- Shared runner queue exists even if slow
Evidence
- CI pipeline configuration file in repository
- CI run duration logs showing median > 15 minutes
What It Is
A shared runner queue is the default CI infrastructure configuration at L1: a fixed pool of CI runners (virtual machines or containers) shared across all developers, all teams, and all repositories in an organization. When more jobs are submitted than runners are available, jobs queue - waiting for a runner to become free before they can start. The queue is typically first-in, first-out, with no priority differentiation between a developer's quick lint check and a full integration test suite.
In the pre-AI era, shared runner queues were an acceptable starting point. Human developers push code infrequently enough - perhaps 5-10 times per day - that a modest shared pool handles the load with acceptable wait times. The organization budgets for runners based on typical human push frequency, and the queue rarely grows long enough to be painful.
AI agents break this assumption entirely. An agent iterating on a task can submit 10-20 CI runs per hour. A team of 10 developers each running one agent simultaneously now generates 100-200 CI jobs per hour, where the same team pushing by hand would generate 50-100 per day. A shared runner pool sized for human usage is immediately undersized. Queue times grow from seconds to minutes, and they keep growing for as long as jobs arrive faster than runners can finish them. And because all teams share the same queue, a burst of agent activity from one team degrades CI performance for every other team in the organization. The downstream effects ripple: developers from other teams see their CI runs queuing, blame the "AI experiments" for breaking CI, and create organizational friction around AI tool adoption.
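The capacity argument above can be made concrete with a small simulation. This is a minimal sketch, not a model of any particular CI system: the pool size of 4 runners, the fixed 15-minute run time, and the evenly spaced arrivals are all illustrative assumptions, and the function names are hypothetical.

```python
import heapq


def simulate_fifo_queue(arrival_times, run_minutes, num_runners):
    """Return the queue wait in minutes for each job in a FIFO shared pool.

    Assumes every job takes the same run_minutes (a simplification).
    """
    # Min-heap holding the time at which each runner next becomes free.
    free_at = [0.0] * num_runners
    heapq.heapify(free_at)

    waits = []
    for arrival in sorted(arrival_times):
        # FIFO: the next job in line takes whichever runner frees up first.
        runner_free = heapq.heappop(free_at)
        start = max(arrival, runner_free)
        waits.append(start - arrival)
        heapq.heappush(free_at, start + run_minutes)
    return waits


def evenly_spaced_arrivals(jobs_per_hour, hours=1):
    """Arrival times in minutes for jobs submitted at a steady rate."""
    interval = 60.0 / jobs_per_hour
    return [i * interval for i in range(int(jobs_per_hour * hours))]


if __name__ == "__main__":
    # Assumed pool: 4 runners, 15-minute jobs, so about 16 jobs/hour capacity.
    human_waits = simulate_fifo_queue(evenly_spaced_arrivals(10), 15, 4)
    agent_waits = simulate_fifo_queue(evenly_spaced_arrivals(150), 15, 4)
    print("human load, worst wait (min):", max(human_waits))
    print("agent load, worst wait (min):", max(agent_waits))
```

Under these assumptions the pool absorbs the human-rate load without any waiting, while the agent-rate load arrives far faster than the pool's capacity, so each additional job waits longer than the one before it.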
The shared runner queue problem is often the first infrastructure crisis that organizations hit when they start scaling AI agent usage. It's not a code problem; it's a capacity and isolation problem. The solutions exist and are well-understood (dedicated runners, auto-scaling, ephemeral sandboxes), but they require deliberate investment that many organizations make only after the crisis hits.
Why It Matters
- Agent bursts saturate shared queues immediately - a single developer running agents can generate more CI load than an entire human team, creating unpredictable queue spikes that affect everyone
- No isolation between teams means agent activity creates organizational friction - when one team's agent experiments slow down another team's CI, it creates social pressure against AI tool adoption
- Queue time is invisible in most CI dashboards - a CI run reported as 15 minutes actually took 25 minutes end to end because it waited 10 minutes in queue first; the queue time is often not reported separately, hiding the real problem
- Queue prioritization is impossible in a first-in, first-out shared queue - a critical bug fix queuing behind an agent's experimental run has no way to jump the queue; all jobs are equal
- Agent operators lose visibility - agents waiting in a long queue have no way to know if the wait is normal or indicates a problem; they simply wait, blocking the iteration loop
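The hidden queue time described in the list above can be recovered whenever a CI system records when a job was created, when it started, and when it finished. The sketch below assumes those three timestamps are available; the field names `created_at`, `started_at`, and `finished_at` are hypothetical and will differ between CI providers.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JobTimestamps:
    # Hypothetical field names; map them to whatever your CI system exports.
    created_at: datetime   # job submitted to the queue
    started_at: datetime   # a runner picked the job up
    finished_at: datetime  # job completed


def split_latency(job):
    """Split end-to-end latency into queue time and run time, in minutes."""
    queue = (job.started_at - job.created_at).total_seconds() / 60
    run = (job.finished_at - job.started_at).total_seconds() / 60
    return {
        "queue_minutes": queue,
        "run_minutes": run,
        "total_minutes": queue + run,
    }


if __name__ == "__main__":
    job = JobTimestamps(
        created_at=datetime(2024, 1, 8, 9, 0),
        started_at=datetime(2024, 1, 8, 9, 10),
        finished_at=datetime(2024, 1, 8, 9, 25),
    )
    # The example from the list: 10 minutes queued + 15 minutes running.
    print(split_latency(job))
```

Reporting `queue_minutes` alongside `run_minutes` is what makes the difference between "the pipeline is slow" and "the pool is saturated" visible.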
How Different Roles See It
Bob's team has started rolling out AI agents, and three weeks in he is getting complaints from developers on other teams: their CI is slow, and they have traced it to the shared runner pool being saturated by agent-generated CI jobs from Bob's team. Bob has been tracking only CI run times, not queue times, and had not realized the scope of the problem.
Bob should immediately create a dedicated "agent-ci" runner pool for his team's agent-driven workflows and update his team's CI configurations to route agent jobs to this pool. This is an afternoon of work and immediately isolates the load. Bob should also set up per-team CI cost dashboards so he can see his team's consumption going forward and have data for capacity planning conversations with the platform team. The isolation work is not just about solving the queue problem - it's about being a responsible consumer of shared infrastructure as agent usage grows. Bob should proactively communicate what he did and why to the other team leads, which rebuilds trust and demonstrates that AI adoption can be done without disrupting the rest of the organization.
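The routing rule behind this isolation is simple, whatever mechanism the CI system uses to express it (runner labels, tags, or groups). The sketch below shows only the decision logic and is not tied to any CI provider: the `actor` field, the `SHARED_POOL` name, and the set of agent accounts are assumptions for illustration, while `agent-ci` is the pool name used in the text.

```python
AGENT_POOL = "agent-ci"  # dedicated pool name from the text
SHARED_POOL = "shared"   # hypothetical name for the existing shared pool


def choose_runner_pool(job, agent_actors):
    """Route agent-triggered jobs to the dedicated pool, everything else to shared.

    `job` is a dict with a hypothetical "actor" key naming the account that
    triggered the run; `agent_actors` is the set of accounts used by agents.
    """
    if job.get("actor") in agent_actors:
        return AGENT_POOL
    return SHARED_POOL


if __name__ == "__main__":
    agents = {"agent-bot"}  # hypothetical service account used by the agents
    print(choose_runner_pool({"actor": "agent-bot"}, agents))
    print(choose_runner_pool({"actor": "alice"}, agents))
```

Keying the decision on who triggered the job, rather than on the branch or repository, keeps human pushes to the same repository on the shared pool. That is one possible design choice; routing by branch prefix or by workflow would work equally well if agents do not run under distinct accounts.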
Sarah's developer experience survey results show that "CI wait times" are the top friction point across the organization, but the underlying cause is split: some teams have slow CI (long run times) while other teams have fast CI but long queue times. Sarah needs to separate these two problems because they have different solutions.
Sarah should update her CI feedback latency dashboard to separately track p50 and p95 queue time and run time for each team. The teams with long queue times need runner isolation, not pipeline optimization. The teams with long run times need caching and parallelization. Sarah should present the disaggregated data in the next engineering all-hands: "team A has a queue time problem, teams B and C have a run time problem, and teams D and E are fine - here are the different solutions we're deploying." This targeted diagnosis prevents teams from investing in pipeline optimization when their real problem is runner isolation, and vice versa.
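A disaggregated report of this kind could be computed along the following lines. This is a sketch under stated assumptions: the input is a list of `(team, queue_minutes, run_minutes)` tuples, percentiles use the nearest-rank method, and the thresholds in `diagnose` are illustrative (the 15-minute run limit echoes the criterion earlier in this article; the 5-minute queue limit is an arbitrary placeholder).

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_by_team(jobs):
    """Compute p50 and p95 of queue time and run time for each team."""
    by_team = defaultdict(lambda: {"queue": [], "run": []})
    for team, queue_min, run_min in jobs:
        by_team[team]["queue"].append(queue_min)
        by_team[team]["run"].append(run_min)

    summary = {}
    for team, series in by_team.items():
        summary[team] = {
            f"{kind}_p{pct}": percentile(values, pct)
            for kind, values in series.items()
            for pct in (50, 95)
        }
    return summary


def diagnose(team_summary, queue_limit=5, run_limit=15):
    """Name the problem(s) a team has; the limits are assumed thresholds."""
    problems = []
    if team_summary["queue_p95"] > queue_limit:
        problems.append("queue time: needs runner isolation")
    if team_summary["run_p95"] > run_limit:
        problems.append("run time: needs caching and parallelization")
    return problems or ["fine"]


if __name__ == "__main__":
    jobs = [("A", 12, 8), ("A", 10, 9), ("B", 1, 25), ("B", 0, 30), ("D", 1, 6)]
    for team, stats in summarize_by_team(jobs).items():
        print(team, stats, diagnose(stats))
```

Diagnosing on p95 rather than p50 is a deliberate choice in this sketch: queue saturation from agent bursts tends to show up in the tail before it moves the median.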
Victor's team has already moved to a dedicated runner pool and his agent workflows run without queue delays. But he's watching the organization's shared pool fill up as other teams start experimenting with agents. He has the solution (dedicated pools) and the data (queue time reduction from his team's migration).
Victor should write a "CI Runner Isolation Guide" - a practical document that describes exactly how his team migrated from the shared pool to dedicated runners, including the CI YAML changes, the runner registration process, and the expected queue time improvement. He should share it in the engineering Slack and offer to help the two most-affected teams implement it in a pair-programming session. Victor should also raise the broader organizational infrastructure question: should the platform team implement a standard "per-team runner pool" pattern that any team can request self-service? This structural change would prevent the shared queue saturation problem from recurring as agent adoption spreads.