Bazel + Remote Caching (EngFlow)
- CI completes in under 5 minutes (median)
- Remote caching is implemented (Bazel remote cache, EngFlow, Gradle Enterprise)
- Incremental builds run only changed modules or fragments
- P95 CI duration is under 8 minutes
- Build system supports hermetic builds (reproducible outputs regardless of machine)
Evidence
- CI run duration dashboard showing median under 5 minutes
- Remote cache configuration and cache hit rate metrics
- Build configuration showing incremental/changed-only targeting
What It Is
Bazel is Google's open-source build system, designed from the ground up for large monorepos and fast incremental builds. Unlike traditional build systems (Make, Gradle, webpack) that operate on files and directories, Bazel operates on hermetic, content-addressed build actions: every build step declares its exact inputs and outputs, so as long as those declarations are complete, Bazel can determine with certainty whether a step needs to be re-executed or can be served from cache. This property - correctness by construction - is what makes Bazel's caching reliable where other systems often are not.
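The idea of a content-addressed action can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Bazel's actual key derivation: the `action_key` function, the flat string encoding, and the sample `tsc` command are all assumptions made for the example. The point is only that the key depends on the declared command, environment, and input contents, and on nothing else.

```python
from __future__ import annotations

import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def action_key(command: list[str], inputs: dict[str, bytes], env: dict[str, str]) -> str:
    """Derive a cache key from everything the action declares.

    Hypothetical simplification: a real implementation hashes a structured
    description of the action, not a flat string like this one.
    """
    h = hashlib.sha256()
    h.update("\0".join(command).encode())
    for name in sorted(env):  # sorted so dict ordering never changes the key
        h.update(f"\0env:{name}={env[name]}".encode())
    for path in sorted(inputs):  # key depends on input *contents*, not timestamps
        h.update(f"\0in:{path}:{digest(inputs[path])}".encode())
    return h.hexdigest()


cmd = ["tsc", "--project", "pkg/a"]
sources = {"pkg/a/index.ts": b"export const x = 1;\n"}
edited = {"pkg/a/index.ts": b"export const x = 2;\n"}

key_before = action_key(cmd, sources, {"PATH": "/usr/bin"})
key_same = action_key(cmd, dict(sources), {"PATH": "/usr/bin"})
key_after = action_key(cmd, edited, {"PATH": "/usr/bin"})

print(key_before == key_same, key_before == key_after)  # prints: True False
```

Identical declared inputs give an identical key, so the cached output can be reused; any change to an input's content gives a different key and forces re-execution.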
Remote caching extends Bazel's local cache to a shared, network-accessible store. When a developer or CI runner executes a build action that has already been computed by someone else (another developer, a previous CI run), Bazel fetches the pre-computed output from the remote cache rather than recomputing it. The result is that in a team of 20 developers all working on the same monorepo, most build actions can be served as cache hits pulled from the network rather than computed locally. CI runs that would take 10 minutes from scratch can complete in roughly 90 seconds when most actions are cache hits.
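The lookup order described above (local cache, then shared remote cache, then actual execution) might look roughly like the following sketch. The `RemoteCache` and `Builder` classes are hypothetical in-memory stand-ins invented for this example; a real remote cache is a network service, and the key would be derived from the action's inputs rather than hard-coded.

```python
from typing import Callable, Dict, Optional


class RemoteCache:
    """In-memory stand-in for a shared, network-accessible cache (hypothetical)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._store.get(key)

    def put(self, key: str, output: bytes) -> None:
        self._store[key] = output


class Builder:
    """One developer machine or CI runner with its own local cache."""

    def __init__(self, remote: RemoteCache) -> None:
        self.remote = remote
        self.local: Dict[str, bytes] = {}
        self.executed = 0  # number of actions actually run on this machine

    def build(self, key: str, run: Callable[[], bytes]) -> bytes:
        if key in self.local:              # 1. local cache hit
            return self.local[key]
        output = self.remote.get(key)      # 2. remote cache hit
        if output is None:                 # 3. miss everywhere: execute and share
            output = run()
            self.executed += 1
            self.remote.put(key, output)
        self.local[key] = output
        return output


shared = RemoteCache()
developer, ci_runner = Builder(shared), Builder(shared)


def compile_step() -> bytes:
    return b"compiled output"


developer.build("action-key-123", compile_step)   # developer computes the action
ci_runner.build("action-key-123", compile_step)   # CI fetches it from the shared cache

print(developer.executed, ci_runner.executed)  # prints: 1 0
```

In this toy run the CI runner executes nothing: the developer's earlier build already populated the shared cache for that key.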
EngFlow is a commercial provider of managed Bazel Remote Build Execution (RBE) and remote caching infrastructure. EngFlow runs the caching and execution infrastructure so your team doesn't have to operate servers that implement Bazel's Remote Execution API (REAPI). Teams connect their Bazel builds to EngFlow's cluster using a few lines of configuration and get: a high-performance remote cache, optional distributed build execution across EngFlow's machines, and build analytics that show which actions are slow and why.
The combination of Bazel + remote caching is the infrastructure pattern that enables L3-L4 CI times for large, complex codebases. Simple codebases can hit 5-minute CI with parallelization and basic caching. But for monorepos with millions of lines of code and complex dependency graphs, Bazel's content-addressed caching is the tool that makes sub-5-minute CI achievable. Organizations like Google, Stripe, Dropbox, and Twitter have published accounts of using Bazel-style build systems to keep CI fast as codebases grow.
Why It Matters
- Correct incremental builds by construction - when actions are genuinely hermetic, Bazel's action model ensures that cache hits are correct; you avoid the stale-artifact bugs that plague Makefiles and Gradle incremental builds
- Shared cache across developers and CI - a CI run following a developer's local build hits the cache for actions the developer already computed, collapsing "CI rebuilds everything from scratch" to "CI verifies the already-computed result"
- CI time decoupled from codebase size - without Bazel, CI time grows as the codebase grows; with Bazel remote caching, CI time grows only with the size of the changed sub-graph, not the total codebase
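- Remote Build Execution distributes compilation - EngFlow's RBE can distribute a 10-minute compilation across 50 machines; with perfect parallelism that is about 12 seconds (600 s / 50), though real builds are limited by their critical path and by scheduling and transfer overhead; this is the mechanism that makes sub-minute CI achievable for large codebases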
- Build analytics expose bottlenecks - EngFlow's analytics dashboard shows action-level timing, cache hit rates, and critical path analysis; optimization becomes data-driven rather than guesswork
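The "changed sub-graph" mentioned in the list above can be made concrete with a small sketch: given a dependency graph and a set of changed targets, only those targets and their transitive dependents need rebuilding. The `affected_targets` function and the target labels are illustrative assumptions, not Bazel internals; Bazel exposes a comparable reverse-dependency query through `bazel query 'rdeps(...)'`.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set


def affected_targets(deps: Dict[str, List[str]], changed: Set[str]) -> Set[str]:
    """Return the changed targets plus everything that transitively depends on them."""
    # Invert the graph: for each target, who depends on it directly?
    reverse = defaultdict(set)
    for target, direct_deps in deps.items():
        for dep in direct_deps:
            reverse[dep].add(target)

    # Breadth-first walk over reverse edges, starting from the changed targets.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


# Hypothetical monorepo: target -> direct dependencies.
deps = {
    "//app:web": ["//lib:ui", "//lib:api"],
    "//app:admin": ["//lib:api"],
    "//lib:ui": ["//lib:core"],
    "//lib:api": ["//lib:core"],
    "//lib:core": [],
    "//tools:lint": [],
}

print(sorted(affected_targets(deps, {"//lib:ui"})))
# prints: ['//app:web', '//lib:ui']
```

A change to `//lib:ui` touches two of six targets here; everything else can come from cache, which is why CI cost tracks the size of the change rather than the size of the repository.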
How Different Roles See It
Bob's team runs a large TypeScript monorepo with 800k lines of code. They've implemented basic caching and parallelization, but CI is still at 8 minutes because the TypeScript compilation step takes 5 minutes even with incremental builds. A senior engineer has proposed adopting Bazel with EngFlow remote caching to get compilation to under 1 minute. Bob is interested but concerned about the migration effort.
Bob should fund a 2-sprint "Bazel proof of concept" with a single senior engineer (or Victor, the AI champion). The goal: get one service in the monorepo building with Bazel and remote caching, measure the compilation time improvement, and estimate the effort to migrate the rest. At the end of 2 sprints, Bob will have: actual performance data for the remote caching benefit, a realistic effort estimate for the full migration, and a recommendation on whether to proceed. This is a much better decision basis than theoretical estimates. If the proof of concept shows 5x compilation speedup, the migration effort is justified. If it shows 2x speedup with 6 months of migration effort, it may not be.
Sarah's CI feedback latency data shows that the TypeScript compilation step is 62% of total CI time. Caching and parallelization haven't helped this step because it's a single large tsc invocation that can't be trivially parallelized. Sarah sees Bazel remote caching as the intervention that addresses the specific bottleneck the data identifies.
Sarah should frame the Bazel proposal in terms of the specific bottleneck it addresses: "compilation is 62% of CI time; Bazel remote caching would serve pre-compiled outputs from cache for unchanged modules, reducing this step from 5 minutes to an estimated 30 seconds on most runs." She should then estimate the CI time impact: 8-minute CI minus 4.5 minutes of compilation time savings equals approximately 3.5-minute CI. That's a 56% reduction in CI time from a single infrastructure investment. Sarah should present this as a testable hypothesis: implement Bazel remote caching on one module, measure the compilation time, and validate the estimate before committing to a full migration.
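Sarah's estimate is simple enough to check in a few lines. The sketch below just reproduces the arithmetic from the paragraph above; the function name and parameters are made up for illustration, and the 30-second figure remains an estimate to be validated by measurement.

```python
def projected_ci_minutes(total: float, step: float, step_after: float) -> tuple:
    """Project total CI time after speeding up one step (all values in minutes)."""
    saved = step - step_after            # time removed from the slow step
    new_total = total - saved            # assumes the rest of the pipeline is unchanged
    reduction_pct = 100 * saved / total  # savings as a share of current CI time
    return new_total, reduction_pct


# Figures from the text: 8-minute CI, 5-minute compile step, 30 seconds when cached.
new_total, reduction = projected_ci_minutes(total=8.0, step=5.0, step_after=0.5)
print(f"{new_total:.1f} min, {reduction:.0f}% reduction")  # prints: 3.5 min, 56% reduction
```

The projection assumes the compile step is on the critical path and that nothing else in the pipeline changes, which is exactly what the one-module trial is meant to test.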
Victor has been following Bazel development for two years and has used it at a previous company. He knows it works but also knows the migration pitfalls. He's the right person to lead the proof of concept Bob described.
Victor should own the Bazel proof of concept with a specific scope: one service, two sprints, clear success criteria (compilation time < 60 seconds with warm cache, cache hit rate > 85% after day 1). He should document every step of the proof of concept - BUILD file authoring decisions, hermeticity violations found and fixed, EngFlow configuration - as the foundation of the migration playbook if the team decides to proceed. Victor should also evaluate BuildBuddy as an alternative to EngFlow (open-source, self-hostable, lower recurring cost) and include a comparison in his recommendation. His technical credibility makes his recommendation on "Bazel yes/no, EngFlow vs. BuildBuddy" the deciding input for Bob's decision.
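Victor's success criteria are concrete enough to encode as a check that can run against each measured build. This is a minimal sketch: the `PocRun` record and `meets_criteria` function are hypothetical names, the sample numbers are invented, and how the hit and miss counts are collected depends on the cache backend in use.

```python
from dataclasses import dataclass


@dataclass
class PocRun:
    """Measurements from one warm-cache build during the proof of concept."""
    compile_seconds: float
    cache_hits: int
    cache_misses: int

    @property
    def hit_rate(self) -> float:
        total = self.cache_hits + self.cache_misses
        return self.cache_hits / total if total else 0.0


def meets_criteria(run: PocRun,
                   max_compile_seconds: float = 60.0,
                   min_hit_rate: float = 0.85) -> bool:
    """Apply the thresholds from the text: compile < 60 s and hit rate > 85%."""
    return run.compile_seconds < max_compile_seconds and run.hit_rate > min_hit_rate


# Invented sample measurements, for illustration only.
good = PocRun(compile_seconds=48.0, cache_hits=920, cache_misses=80)
slow = PocRun(compile_seconds=75.0, cache_hits=920, cache_misses=80)

print(good.hit_rate, meets_criteria(good), meets_criteria(slow))  # prints: 0.92 True False
```

Recording each run in this shape also gives the migration playbook a consistent data trail to cite in the final recommendation.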