Incremental test selection (only changed paths)

Running only the tests affected by a given code change - using dependency graph analysis - so CI feedback stays fast as the codebase grows to millions of lines.

·Expected results are derived from requirements/specs (the requirement is the oracle, not the code)
·Acceptance tests are auto-generated from ticket requirements (Autonomous Requirements pipeline)
·Incremental test selection runs only tests affected by changed code paths

·Oracle reliability is reviewed per service, not just overall
·Test generation from tickets includes edge cases, not just happy paths

Evidence

·Oracle-reliability dashboard (e.g., TORS) with per-service breakdown
·Ticket-to-test pipeline configuration with sample outputs
·CI configuration showing incremental test selection (e.g., Bazel test targeting, Jest --changedSince)

What It Is

Incremental test selection is the practice of running only the tests that are affected by a specific code change, rather than the full test suite. When you change a function in module A, incremental test selection identifies which tests depend on module A (directly or transitively) and runs only those. Tests for modules B, C, and D are skipped, provided nothing they exercise depends on module A, because the change can't affect them.

This sounds simple, but it requires sophisticated dependency analysis. Tools like Nx (for JavaScript monorepos), Bazel (for multi-language builds), pytest-testmon (for Python), and Gradle's incremental build feature maintain a dependency graph of the codebase - mapping which files depend on which other files, which tests cover which code paths, and which tests are safe to skip for a given change. The analysis happens automatically; developers just run tests and see only the relevant ones execute.
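
The core of that analysis can be sketched in a few lines. This is a minimal illustration, not how Nx, Bazel, or pytest-testmon are implemented: the function name `affected_tests`, the module names, and the dict-based graph are all assumptions made for the example. It walks the import graph backwards from the changed modules and selects every test that touches an impacted module.

```python
from collections import defaultdict, deque


def affected_tests(imports, tests, changed):
    """Return tests that depend, directly or transitively, on a changed module.

    imports: dict mapping each module to the set of modules it imports.
    tests:   dict mapping each test name to the set of modules it imports.
    changed: iterable of changed module names.
    """
    # Invert the import edges: for each module, record who imports it.
    dependents = defaultdict(set)
    for module, deps in imports.items():
        for dep in deps:
            dependents[dep].add(module)

    # Breadth-first walk from the changed modules along the reversed edges.
    impacted = set(changed)
    queue = deque(impacted)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)

    # A test is selected if it imports at least one impacted module.
    return sorted(name for name, deps in tests.items() if deps & impacted)


# Hypothetical graph: B imports A; C and D do not depend on A.
IMPORTS = {"A": set(), "B": {"A"}, "C": set(), "D": {"C"}}
TESTS = {"test_a": {"A"}, "test_b": {"B"}, "test_c": {"C"}, "test_d": {"D"}}

print(affected_tests(IMPORTS, TESTS, {"A"}))  # ['test_a', 'test_b']
```

Real tools build the graph from build files or coverage data rather than a hand-written dict, but the selection step is the same reverse-reachability question.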

At Level 1-2, teams run the full test suite on every change. This works while the suite is small, typically up to a few hundred tests. Beyond that, full runs take 15, 30, or 60 minutes, which makes the feedback loop too slow for effective development. At Level 3 (Systematic), incremental test selection is applied to every change so that CI feedback time stays under 10 minutes regardless of codebase size.

The distinction at L3 is that incremental selection is configured and maintained as an organizational practice, not a tool one engineer uses on their laptop. It's applied in the CI pipeline, tested for correctness (selection errors that skip relevant tests are treated as bugs), and measured for impact on feedback loop time.

Why It Matters

Incremental test selection becomes critical as AI-assisted development accelerates code production:

Feedback loop preservation - When AI agents produce code changes faster than humans, the feedback loop becomes the bottleneck. A 45-minute test suite run on every change means an agent gets barely more than one iteration per hour (60 / 45 ≈ 1.3). Incremental selection can reduce the run to 5-8 minutes, which allows roughly 7-12 iterations per hour.
Parallel agent enablement - Multiple agents running simultaneously each need fast CI feedback. Running full test suites for 10 parallel agents simultaneously would saturate CI infrastructure. Incremental selection reduces per-agent CI load.
Developer experience - Long CI times are a documented cause of developer frustration and interrupted flow. Keeping local test runs under 2 minutes and CI runs under 10 minutes is a developer productivity goal in its own right.
Cost reduction - CI infrastructure costs scale with compute time. Incremental selection can reduce CI costs by 60-80% in large monorepos.
Scale enablement - Without incremental selection, test suite time grows roughly linearly with codebase size. With it, test time tracks the size of the change rather than the size of the codebase, so it stays roughly constant for typical changes. This is the difference between a CI system that degrades over time and one that scales.

Tip

Before implementing incremental test selection, measure your current test suite execution time and segment it by module or service. Often, 20% of the test suite accounts for 80% of the execution time - and those tests are rarely touched. Identifying the slow-test offenders lets you prioritize both incremental selection and test optimization.
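
A rough way to do this segmentation is sketched below. It assumes you already have per-test durations as `(test_id, seconds)` pairs (for example, extracted from a JUnit XML report) and that test IDs start with the module name followed by `/`. Both the ID scheme and the sample numbers are hypothetical.

```python
from collections import defaultdict


def time_by_module(durations):
    """Sum test durations per module, slowest module first.

    durations: iterable of (test_id, seconds); the module is assumed to be
    the part of test_id before the first "/".
    """
    totals = defaultdict(float)
    for test_id, seconds in durations:
        module = test_id.split("/", 1)[0]
        totals[module] += seconds
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def share_of_slowest(durations, fraction=0.2):
    """Share of total runtime consumed by the slowest `fraction` of tests."""
    times = sorted((seconds for _, seconds in durations), reverse=True)
    total = sum(times)
    if total == 0:
        return 0.0
    count = max(1, round(len(times) * fraction))
    return sum(times[:count]) / total


# Hypothetical measurements, in seconds.
DURATIONS = [
    ("billing/test_invoice.py::test_total", 40.0),
    ("billing/test_invoice.py::test_tax", 2.0),
    ("search/test_index.py::test_rebuild", 5.0),
    ("search/test_query.py::test_rank", 2.0),
    ("auth/test_login.py::test_ok", 1.0),
]

print(time_by_module(DURATIONS))    # billing first, then search, then auth
print(share_of_slowest(DURATIONS))  # share of runtime in the slowest 20% of tests
```

If the second number is close to 0.8, a small set of slow tests dominates the suite and is worth optimizing alongside the selection work.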

Getting Started

Choose the right tool for your stack - The incremental selection tool must understand your language's module system and dependency resolution. Common choices are Nx for JavaScript/TypeScript monorepos, Bazel for multi-language projects, pytest-testmon for Python, and Gradle's incremental build (which skips test tasks whose inputs are unchanged) for Java/Kotlin. Note that Gradle's --tests flag only filters tests by name; it performs no dependency analysis on its own. Don't build custom dependency tracking - use tools designed for this problem.
Generate a dependency graph baseline - Most incremental selection tools require an initial analysis run to build the dependency graph. Run this on your main branch and store the graph metadata where later runs can restore it (a CI cache or, if the tool's output is small and stable, the repository) so that incremental analysis only needs to track changes.
Implement in CI before local - Start with incremental selection in CI, not developer workstations. CI runs on clean checkouts where the full dependency graph is available. Local runs can use it after CI is proven stable.
Validate selection correctness - Periodically run the full test suite and compare the results to what incremental selection would have run. Any test that fails in the full run but would have been skipped by incremental selection is a selection error. Track selection accuracy as a metric.
Handle integration tests separately - Incremental selection works well for unit and component tests. Integration tests that span multiple services are harder to select incrementally because their dependencies are implicit (database schema, API contracts). Run integration tests on a longer schedule or use a separate selection strategy.
Measure and publish the impact - Before-and-after CI time comparison is your business case metric. Track average CI run time by branch type and publish it on your engineering metrics dashboard.
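
The correctness check described in the validation step can be sketched as a set comparison. The text above does not define "selection accuracy", so the definition used here (the fraction of audited changes with no selection error) is an assumption, as are the function names and the sample audits.

```python
def selection_errors(full_run_failures, selected_tests):
    """Tests that failed in the full run but would have been skipped."""
    return sorted(set(full_run_failures) - set(selected_tests))


def selection_accuracy(audits):
    """Fraction of audited changes for which no failing test was skipped.

    audits: list of (full_run_failures, selected_tests) pairs, one per change.
    """
    if not audits:
        return 1.0
    clean = sum(
        1 for failures, selected in audits
        if not selection_errors(failures, selected)
    )
    return clean / len(audits)


# Hypothetical audit data from four periodic full-suite runs.
AUDITS = [
    ({"test_b"}, {"test_a", "test_b"}),  # failure was selected: fine
    ({"test_d"}, {"test_a"}),            # failure was skipped: selection error
    (set(), {"test_c"}),                 # no failures
    (set(), set()),                      # no failures, nothing selected
]

print(selection_errors({"test_d"}, {"test_a"}))  # ['test_d']
print(selection_accuracy(AUDITS))                # 0.75
```

Every name returned by `selection_errors` points at a dependency the graph is missing, which is the information needed to fix the selection rather than just record the miss.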

Common Pitfalls

Silent selection errors. The dangerous failure mode of incremental selection is not incorrectly including tests - it's incorrectly excluding them. If the dependency graph misses a dependency and skips a relevant test, a real bug can pass CI undetected. Validate selection correctness regularly and treat selection errors as high-priority bugs.

Over-trusting the dependency graph for dynamic dependencies. Dependency graphs capture static import relationships. Dynamic dependencies - tests that read from a shared database, tests that depend on environment variables, tests that load configuration files - may not be captured. Tests with implicit environmental dependencies need to be explicitly tagged to always run, regardless of what code changed.
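
One way to express the "always run" rule is to union the graph-based selection with every test carrying an explicit tag. The sketch below is tool-agnostic: the `always_run` tag name, the `SuiteEntry` type, and the test names are hypothetical, and a real setup would use the tagging mechanism of your test runner or build tool.

```python
from dataclasses import dataclass

ALWAYS_RUN = "always_run"  # hypothetical tag name, not built into any tool


@dataclass(frozen=True)
class SuiteEntry:
    name: str
    tags: frozenset = frozenset()


def final_selection(suite, graph_selected):
    """Names of tests to run, in suite order.

    A test runs if the dependency graph selected it OR it is tagged
    always_run, regardless of which code changed.
    """
    chosen = set(graph_selected)
    return [t.name for t in suite if t.name in chosen or ALWAYS_RUN in t.tags]


SUITE = [
    SuiteEntry("test_pricing"),
    SuiteEntry("test_checkout"),
    SuiteEntry("test_schema_migration", frozenset({ALWAYS_RUN})),  # shared database
    SuiteEntry("test_env_config", frozenset({ALWAYS_RUN})),        # environment variables
]

print(final_selection(SUITE, {"test_pricing"}))
# ['test_pricing', 'test_schema_migration', 'test_env_config']
```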

Using incremental selection as a substitute for test optimization. If individual tests are slow (multi-second unit tests, database-hitting tests that should be mocked), incremental selection reduces how many slow tests run but doesn't fix the underlying slowness. Combine incremental selection with test performance optimization for maximum impact.

Outdated dependency graphs causing incorrect selection. After large refactors, significant dependency changes, or tool upgrades, the cached dependency graph may be stale. Implement automated graph invalidation and full rebuild triggers (e.g., when package.json, build files, or configuration files change) to prevent failures caused by a stale graph.
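
A rebuild trigger of this kind can be as simple as a pattern check on the changed file list. The patterns below are illustrative assumptions and would need to match your own build and configuration files. When the check returns true, the pipeline would discard the cached graph and run the full suite.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files whose change invalidates the cached graph.
# Note: in fnmatch, "*" also matches "/" so "*.gradle" covers nested paths.
REBUILD_PATTERNS = (
    "package.json",
    "*/package.json",
    "*.gradle",
    "BUILD.bazel",
    "*/BUILD.bazel",
    "pytest.ini",
)


def needs_full_run(changed_files, patterns=REBUILD_PATTERNS):
    """True if any changed file matches a graph-invalidating pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


print(needs_full_run(["services/web/package.json"]))  # True
print(needs_full_run(["src/app/main.py"]))            # False
```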

How Different Roles See It

Bob, Head of Engineering

Bob's team's CI pipeline takes 38 minutes end-to-end. Developers wait for CI results before responding to review comments, creating a 38-minute minimum cycle for each round of feedback. With two or three review rounds per PR, a single feature accumulates 76 to 114 minutes of pure CI wait time, before counting the initial run or any retries.

What Bob should do: Bob should make CI time a first-class engineering metric with a target of under 10 minutes. The 38-minute pipeline almost certainly has multiple contributors: slow tests, full-suite runs, sequential job configuration. Incremental test selection is one lever. Bob should run a CI time audit: which jobs take the most time, and which changes trigger the most test execution? Nx or similar tooling will show the dependency bottlenecks immediately. Bringing CI time from 38 to under 10 minutes with incremental selection is a realistic one-sprint initiative and a highly visible quality-of-life improvement for the whole team.

Sarah, Productivity Lead

Sarah has been trying to improve cycle time (time from PR open to merge) and has hit a wall. She's addressed review turnaround time but the CI wait component is stubbornly high. She suspects test suite time is the culprit but hasn't been able to quantify it.

What Sarah should do: Sarah should decompose cycle time into its components: code writing time, review wait time, CI wait time, and merge queue time. If CI wait time is the dominant component (which it often is for teams with long test suites), incremental test selection addresses it directly. The metric to track after implementation: median CI wait time per PR. Incremental selection in a monorepo typically reduces median CI time by 60-80%. If Sarah can attribute that time reduction to a concrete reduction in cycle time, she has a compelling ROI story for the tooling investment.
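
The decomposition can be prototyped before any dashboard exists. The sketch below assumes each PR has been reduced to minutes spent in the four phases named above. The field names and sample values are hypothetical, and extracting them from your review and CI systems is the real work.

```python
from statistics import median

PHASES = ("coding", "review_wait", "ci_wait", "merge_queue")


def median_by_phase(prs):
    """Median minutes spent in each cycle-time phase across PRs."""
    return {phase: median(pr[phase] for pr in prs) for phase in PHASES}


def dominant_phase(prs):
    """The phase with the largest median duration."""
    medians = median_by_phase(prs)
    return max(medians, key=medians.get)


# Hypothetical per-PR measurements, in minutes.
PRS = [
    {"coding": 60, "review_wait": 30, "ci_wait": 76, "merge_queue": 5},
    {"coding": 90, "review_wait": 60, "ci_wait": 114, "merge_queue": 10},
    {"coding": 45, "review_wait": 45, "ci_wait": 152, "merge_queue": 15},
]

print(median_by_phase(PRS)["ci_wait"])  # 114
print(dominant_phase(PRS))              # ci_wait
```

Running the same calculation before and after the rollout gives the median CI wait comparison that the ROI argument depends on.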

Victor, Staff Engineer - AI Champion

Victor has already configured Nx for the JavaScript services and reduced their CI time from 25 minutes to 4 minutes. The improvement is dramatic and well-received. But the Python services and the Java monolith still run full test suites and take 40 minutes.

What Victor should do: Victor has proven the pattern on one stack - now he needs to replicate it on the others. For Python, pytest-testmon is the closest equivalent to Nx's affected-test selection. For the Java monolith, Gradle's incremental build should provide similar benefits once the monolith is split into modules with their own test tasks. Victor should document the implementation approach for each stack (the configs, the CI integration steps, the validation approach) and present it as a cross-team initiative. A 70% reduction on a 40-minute pipeline saves 28 minutes per run, so the stack with the most CI runs per day is the highest-impact target; if that is the Java monolith, Victor should start there.
