Shared runner queue

·A CI pipeline runs on pull requests
·CI results are reported after the pipeline completes

·CI runs on every PR (not just on manual trigger)
·Shared runner queue exists even if slow

Evidence

·CI pipeline configuration file in repository
·CI run duration logs showing median > 15 minutes

What It Is

A shared runner queue is the default CI infrastructure configuration at L1: a fixed pool of CI runners (virtual machines or containers) shared across all developers, all teams, and all repositories in an organization. When more jobs are submitted than runners are available, jobs queue - waiting for a runner to become free before they can start. The queue is typically first-in, first-out, with no priority differentiation between a developer's quick lint check and a full integration test suite.
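The first-in, first-out behaviour described above can be sketched in a few lines. This is a minimal simulation, not how any particular CI platform schedules work; the function name simulate_fifo, the job tuples, and the two-runner pool are assumptions made for illustration.

```python
import heapq

def simulate_fifo(jobs, num_runners):
    """Return each job's queue wait for a FIFO queue served by a fixed runner pool.

    jobs: list of (submit_minute, duration_minutes), already sorted by submit time.
    """
    # Min-heap holding the minute at which each runner next becomes free.
    free_at = [0] * num_runners
    heapq.heapify(free_at)
    waits = []
    for submit, duration in jobs:
        runner_free = heapq.heappop(free_at)   # earliest available runner
        start = max(submit, runner_free)       # the job waits if no runner is idle
        waits.append(start - submit)
        heapq.heappush(free_at, start + duration)
    return waits

# Hypothetical workload: three 20-minute integration suites, then a 1-minute lint check.
jobs = [(0, 20), (0, 20), (0, 20), (1, 1)]
print(simulate_fifo(jobs, num_runners=2))  # [0, 0, 20, 19]
```

The one-minute lint check waits 19 minutes because nothing in a flat FIFO queue lets it jump ahead of the longer jobs submitted before it.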

In the pre-AI era, shared runner queues were an acceptable starting point. Human developers push code infrequently enough - perhaps 5-10 times per day - that a modest shared pool handles the load with acceptable wait times. The organization budgets for runners based on typical human push frequency, and the queue rarely grows long enough to be painful.

AI agents break this assumption entirely. An agent iterating on a task can submit 10-20 CI runs per hour. A team of 10 developers each running one agent simultaneously now generates 100-200 CI jobs per hour, compared with roughly 6-12 per hour from the same 10 developers pushing by hand (5-10 pushes each, spread over an eight-hour day). A shared runner pool sized for human usage is immediately undersized. Queue times grow from seconds to minutes. And because all teams share the same queue, a burst of agent activity from one team degrades CI performance for every other team in the organization. The downstream effects ripple: developers on other teams see their CI runs queuing, blame the "AI experiments" for breaking CI, and create organizational friction around AI tool adoption.
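A rough way to see why the pool becomes undersized is to compare offered load with pool capacity. The sketch below is back-of-the-envelope arithmetic only. The pool size (8 runners), the 15-minute average job, and the human-rate figure of 10 jobs per hour are illustrative assumptions; 150 is simply the midpoint of the 100-200 range above.

```python
def pool_utilization(jobs_per_hour, avg_job_minutes, num_runners):
    """Offered load divided by pool capacity. Above 1.0 the queue keeps growing."""
    capacity_per_hour = num_runners * 60 / avg_job_minutes
    return jobs_per_hour / capacity_per_hour

# Hypothetical pool: 8 runners, 15-minute average job (32 jobs per hour of capacity).
print(pool_utilization(10, 15, 8))    # 0.3125 -> comfortable at human push rates
print(pool_utilization(150, 15, 8))   # 4.6875 -> agents offer ~4.7x what the pool can serve
```

Any utilization above 1.0 means jobs arrive faster than they can be served, so the backlog grows for as long as the burst lasts.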

The shared runner queue problem is often the first infrastructure crisis that organizations hit when they start scaling AI agent usage. It's not a code problem; it's a capacity and isolation problem. The solutions exist and are well-understood (dedicated runners, auto-scaling, ephemeral sandboxes), but they require deliberate investment that many organizations make only after the crisis hits.

Why It Matters

·Agent bursts saturate shared queues immediately - a single developer running agents can generate more CI load than an entire human team, creating unpredictable queue spikes that affect everyone
·No isolation between teams means agent activity creates organizational friction - when one team's agent experiments slow down another team's CI, it creates social pressure against AI tool adoption
·Queue time is invisible in most CI dashboards - the 15-minute CI run actually took 25 minutes because it waited 10 minutes in queue; the queue time is often not reported separately, hiding the real problem
·A flat shared queue has no prioritization - a critical bug fix queuing behind an agent's experimental run has no way to jump the queue; all jobs are equal
·Agent operators lose visibility - agents waiting in a long queue have no way to know if the wait is normal or indicates a problem; they simply wait, blocking the iteration loop

Getting Started

·Measure your actual queue time, not just CI run time - Most CI platforms report job duration from start to finish, not including queue time. Enable queue time reporting explicitly: GitHub Actions shows "Queued" time in the job summary; CircleCI shows queue time in the workflow view; Buildkite shows it in the build timeline. Establish baseline p50 and p95 queue times per team.
·Identify the teams and workflows generating the most load - Query your CI platform's API or analytics to find which repositories and workflow types are generating the most CI minutes per day. This identifies the load distribution and helps you decide whether to add capacity, add isolation, or change usage patterns.
·Add capacity to the shared pool as an immediate stopgap - If queue times are already painful, add more runners to the shared pool while planning the longer-term isolation work. GitHub Actions allows adding larger or more concurrent runners. CircleCI allows increasing concurrency. Buildkite supports adding more agents to the cluster. This buys time without requiring architecture changes.
·Create a dedicated "agent runs" pool - Set up a separate runner pool specifically for CI runs triggered by AI agents. Label these runners (e.g., agent-ci), and configure agent-driven workflows to use the agent-ci label. This separates agent-generated load from human-generated load and prevents agent bursts from affecting developer CI times.
·Implement basic priority queuing - Many CI platforms offer some way to separate or rank work. Route critical paths (main branch builds, hotfix branches) to a high-priority queue or reserved runners so they do not wait behind lower-priority agent jobs. The mechanism depends on the platform (Buildkite queues, CircleCI resource classes, GitHub Actions concurrency groups), and not every platform can preempt a job that is already running, so check what yours supports before designing around it.
·Plan the path to dedicated runners per team - The shared queue is a transitional state. The target state is dedicated runner pools per team (see the Dedicated Runners Per Team guide). Plan the migration: identify the teams generating the most load, create dedicated pools for those teams first, and migrate the remaining teams over subsequent sprints.
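The first step, establishing baseline p50 and p95 queue times per team, might look like the sketch below once job records have been exported from the CI platform. The record shape (team, created_at, started_at as ISO 8601 strings) is an assumption for illustration, not any platform's actual schema, and the sample records are made up.

```python
import math
from datetime import datetime

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def queue_seconds(job):
    """Queue time = moment the job started running minus moment it was created."""
    created = datetime.fromisoformat(job["created_at"])
    started = datetime.fromisoformat(job["started_at"])
    return (started - created).total_seconds()

def queue_time_by_team(jobs):
    """Group queue times by team and report p50 and p95 in seconds."""
    by_team = {}
    for job in jobs:
        by_team.setdefault(job["team"], []).append(queue_seconds(job))
    return {
        team: {"p50": percentile(waits, 50), "p95": percentile(waits, 95)}
        for team, waits in by_team.items()
    }

# Hypothetical exported job records.
sample_jobs = [
    {"team": "payments", "created_at": "2024-05-01T10:00:00", "started_at": "2024-05-01T10:00:30"},
    {"team": "payments", "created_at": "2024-05-01T10:05:00", "started_at": "2024-05-01T10:13:00"},
    {"team": "search", "created_at": "2024-05-01T10:00:00", "started_at": "2024-05-01T10:00:05"},
]
print(queue_time_by_team(sample_jobs))
```

Reporting p95 alongside p50 matters here because agent bursts tend to show up in the tail long before they move the median.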

Tip

Queue time is the invisible component of CI latency. If developers report that "CI is slow" but your timing data shows run time is acceptable, queue time is almost certainly the culprit. Measure queue time explicitly before diagnosing. A run that sits in queue for 8 minutes before starting is a queue problem, not a CI optimization problem.

Common Pitfalls

Solving the symptom (adding runners) instead of the cause (no isolation). Adding more runners to the shared pool is a stopgap that becomes expensive quickly when agents are generating continuous load. The correct solution is isolation - dedicated pools for teams and agent workloads - not perpetually growing the shared pool.

Not charging back CI costs to teams. When CI infrastructure is shared with no visibility into per-team consumption, teams have no incentive to optimize their usage. A team running inefficient full-rebuild CI on every agent commit has no feedback mechanism telling them they're consuming 10x the resources of an efficient team. Implement per-team CI cost dashboards to create that feedback.
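A per-team consumption view can start as a simple aggregation over exported job records. The sketch below assumes each record carries a team name and a duration_minutes field; both names and the sample values are hypothetical.

```python
from collections import defaultdict

def ci_minutes_by_team(jobs):
    """Total CI minutes per team, largest consumer first."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job["team"]] += job["duration_minutes"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Hypothetical records: one team doing full rebuilds, one running an incremental build.
sample_usage = [
    {"team": "search", "duration_minutes": 4},
    {"team": "payments", "duration_minutes": 15},
    {"team": "payments", "duration_minutes": 15},
    {"team": "payments", "duration_minutes": 15},
]
print(ci_minutes_by_team(sample_usage))  # [('payments', 45.0), ('search', 4.0)]
```

Multiplying the totals by a per-minute runner cost would turn the same table into a chargeback report.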

Assuming queue times are acceptable because they're "only a few minutes." At human push frequency, a 3-minute queue is irritating but acceptable. At agent push frequency (10-20 pushes per hour), a 3-minute queue means every agent iteration costs 3 extra minutes, which adds up to 30-60 minutes of queuing for every hour of intended iteration. In practice the agent cannot sustain that push rate at all: the queue, not the agent, sets the iteration speed. The queue time threshold that's acceptable for humans is not acceptable for agents.
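The arithmetic is easy to check. The sketch below assumes a single agent that waits for each CI result before pushing again; the 3-minute run time is a hypothetical figure added to show the effect on iteration rate.

```python
def queue_minutes_per_hour(pushes_per_hour, queue_minutes):
    """Total minutes spent queuing if the agent could sustain the given push rate."""
    return pushes_per_hour * queue_minutes

def iterations_per_hour(run_minutes, queue_minutes):
    """Iterations a serial agent loop can finish per hour, given run and queue time."""
    return 60 / (run_minutes + queue_minutes)

print(queue_minutes_per_hour(10, 3))   # 30
print(queue_minutes_per_hour(20, 3))   # 60
print(iterations_per_hour(3, 0))       # 20.0 with no queue
print(iterations_per_hour(3, 3))       # 10.0 once a 3-minute queue is added
```

With a fast 3-minute pipeline, a 3-minute queue halves the number of iterations the agent can complete in an hour.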

Mixing priorities without explicit queuing. Without priority-based queuing, all jobs compete equally for runners. This means a low-priority experimental agent run sits in the same queue as a critical production deployment. Implement explicit priority queuing before agents generate significant load - after that, retroactively adding priorities is harder because teams have already built workflows that assume flat queuing.
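To make the contrast with flat queuing concrete, here is a minimal priority queue sketch. It is not any CI platform's scheduler: the class name, the job names, and the convention that a lower number means more urgent are all assumptions. It only reorders waiting jobs and does not preempt a job that is already running.

```python
import heapq
import itertools

class PriorityJobQueue:
    """Waiting jobs ordered by priority, FIFO within the same priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker preserving submission order

    def submit(self, name, priority):
        # Lower number = more urgent (a convention chosen for this sketch).
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_job(self):
        """Pop the most urgent waiting job."""
        return heapq.heappop(self._heap)[2]

queue = PriorityJobQueue()
queue.submit("agent-experiment-1", priority=5)
queue.submit("agent-experiment-2", priority=5)
queue.submit("hotfix-deploy", priority=0)

started = [queue.next_job() for _ in range(3)]
print(started)  # ['hotfix-deploy', 'agent-experiment-1', 'agent-experiment-2']
```

The hotfix was submitted last but starts first, while the two agent runs keep their relative order.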

Ignoring the cascading effect on agent context. An agent that submits a change and then waits 10 minutes in queue before CI even starts may no longer be in a good position to act on the results when they arrive: its context window may have filled up in the meantime, or the thread of the task may have gone stale. Long queue times aren't just slow - they can cause agents to lose track of what they were doing and require human re-engagement. Fast CI and fast queue resolution are both necessary for agent effectiveness.

How Different Roles See It

Bob, Head of Engineering

Bob's team has started rolling out AI agents, and three weeks in he's getting complaints from developers on other teams: their CI is slow, and they've traced it to the shared runner pool being saturated by agent-generated CI jobs from Bob's team. Bob has been tracking only CI run times, not queue times, and hadn't realized the scope of the problem.

Bob should immediately create a dedicated "agent-ci" runner pool for his team's agent-driven workflows and update his team's CI configurations to route agent jobs to this pool. This is an afternoon of work and immediately isolates the load. Bob should also set up per-team CI cost dashboards so he can see his team's consumption going forward and have data for capacity planning conversations with the platform team. The isolation work is not just about solving the queue problem - it's about being a responsible consumer of shared infrastructure as agent usage grows. Bob should proactively communicate what he did and why to the other team leads, which rebuilds trust and demonstrates that AI adoption can be done without disrupting the rest of the organization.

Sarah, Productivity Lead

Sarah's developer experience survey results show that "CI wait times" are the top friction point across the organization, but the underlying cause is split: some teams have slow CI (long run times) while other teams have fast CI but long queue times. Sarah needs to separate these two problems because they have different solutions.

Sarah should update her CI feedback latency dashboard to separately track p50 and p95 queue time and run time for each team. The teams with long queue times need runner isolation, not pipeline optimization. The teams with long run times need caching and parallelization. Sarah should present the disaggregated data in the next engineering all-hands: "team A has a queue time problem, teams B and C have a run time problem, and teams D and E are fine - here are the different solutions we're deploying." This targeted diagnosis prevents teams from investing in pipeline optimization when their real problem is runner isolation, and vice versa.
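The disaggregated diagnosis can be expressed as a small classification over per-team p95 figures. Everything in this sketch is illustrative: the thresholds (2 minutes of queue, 15 minutes of run time), the field names, and the team numbers are assumptions, not measurements.

```python
def diagnose(stats, queue_limit_s=120, run_limit_s=900):
    """Map a team's p95 queue and run times to the kind of fix it needs."""
    slow_queue = stats["queue_p95_s"] > queue_limit_s
    slow_run = stats["run_p95_s"] > run_limit_s
    if slow_queue and slow_run:
        return "runner isolation and pipeline optimization"
    if slow_queue:
        return "runner isolation"
    if slow_run:
        return "pipeline optimization (caching, parallelization)"
    return "no action needed"

# Hypothetical per-team p95 figures in seconds.
teams = {
    "A": {"queue_p95_s": 600, "run_p95_s": 420},
    "B": {"queue_p95_s": 20, "run_p95_s": 1500},
    "D": {"queue_p95_s": 15, "run_p95_s": 300},
}
for name, stats in teams.items():
    print(name, "->", diagnose(stats))
```

Keeping the two thresholds separate is the point: a team is never told to optimize its pipeline when the data says its jobs are simply waiting for a runner.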

Victor, Staff Engineer - AI Champion

Victor's team has already moved to a dedicated runner pool and his agent workflows run without queue delays. But he's watching the organization's shared pool fill up as other teams start experimenting with agents. He has the solution (dedicated pools) and the data (queue time reduction from his team's migration).

Victor should write a "CI Runner Isolation Guide" - a practical document that describes exactly how his team migrated from the shared pool to dedicated runners, including the CI YAML changes, the runner registration process, and the expected queue time improvement. He should share it in the engineering Slack and offer to help the two most-affected teams implement it in a pair-programming session. Victor should also raise the broader organizational infrastructure question: should the platform team implement a standard "per-team runner pool" pattern that any team can request self-service? This structural change would prevent the shared queue saturation problem from recurring as agent adoption spreads.