"How much did we save?" - silence
- DORA metrics (deployment frequency, lead time, change failure rate, MTTR) are not tracked, or tracked inconsistently
- No AI-specific metrics exist
- Team acknowledges the need for AI-specific metrics beyond traditional DORA
- Basic deployment frequency is at least known (even if not dashboarded)
Evidence
- Absence of metrics dashboard or inconsistent/manual tracking
- No AI-specific fields in existing metrics systems
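To make the gap concrete, here is a minimal sketch of what consistent tracking of the four DORA metrics could look like, with one AI-specific field added. The `Deployment` record, its field names, and the `ai_assisted` flag are all hypothetical; a real team would populate something like this from its CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime            # first commit of the change
    deployed_at: datetime             # when the change reached production
    failed: bool                      # caused an incident or rollback
    restored_at: Optional[datetime]   # when service was restored, if it failed
    ai_assisted: bool                 # hypothetical AI-specific field


def dora_summary(deploys: list[Deployment], period_days: int) -> dict:
    """Summarize the four DORA metrics plus the share of AI-assisted changes."""
    if not deploys:
        raise ValueError("no deployments recorded for this period")

    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]

    mttr_hours = None
    if restore_times:
        total_seconds = sum(rt.total_seconds() for rt in restore_times)
        mttr_hours = total_seconds / len(restore_times) / 3600

    return {
        "deploys_per_week": len(deploys) / (period_days / 7),
        "median_lead_time_hours": median(
            lt.total_seconds() for lt in lead_times
        ) / 3600,
        "change_failure_rate": len(failures) / len(deploys),
        "mttr_hours": mttr_hours,
        "ai_assisted_share": sum(d.ai_assisted for d in deploys) / len(deploys),
    }
```

Even a summary this small, produced on a fixed schedule, replaces manual or inconsistent tracking with a record that later comparisons can build on.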
What It Is
"How much did we save with AI?" is the question every engineering leader eventually faces from finance, from the CTO, or from the board. At L1, the answer is silence. Not "we saved $X" and not "we don't know yet" - just an uncomfortable pause followed by anecdotes, vague claims about "developer happiness," and a promise to get better data. The silence is the defining symptom of L1 metrics: the investment has been made, the tools have been deployed, but no one built the measurement infrastructure to answer the ROI question.
The silence has predictable consequences. Without ROI data, AI tool budgets become targets during cost-cutting cycles. Without ROI data, decisions about which AI tools to invest in are made on vendor demos and developer preference rather than evidence. Without ROI data, engineering leaders cannot make the case for expanding AI usage to skeptical business stakeholders. The silence is not just an embarrassment - it's a strategic liability.
The root cause of the silence is not malice or negligence. It's a sequencing error. Teams adopt AI tools because they seem promising and developers want them. They focus on adoption: getting tools installed, getting developers trained, getting the workflows working. Measurement is deferred until "after we're settled in." But "after we're settled in" never creates a natural moment to build measurement infrastructure, so it never happens. The deferral becomes permanent, and the team is left trying to reconstruct impact from incomplete data months or years later.
Understanding what "we saved" actually means is also harder than it sounds. Saved compared to what? Compared to hiring two more developers? Compared to the velocity you would have had without AI tools? Compared to a competitor who didn't adopt AI? The ROI question requires a counterfactual, and without baseline data, the counterfactual is unanswerable. This is why the measurement infrastructure must be built before or at the moment of AI tool adoption, not after.
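A rough sketch of the arithmetic shows why the baseline matters. The function below is illustrative only: the parameter names and the choice of "hours per change" as the unit are assumptions, and it attributes the entire difference from baseline to the AI tools, which a careful analysis would need to justify.

```python
def estimate_roi(
    baseline_hours_per_change: float,  # measured BEFORE adoption (the counterfactual)
    current_hours_per_change: float,   # measured after adoption
    changes_per_year: int,
    loaded_hourly_cost: float,         # assumed fully loaded cost of one engineer-hour
    annual_tool_cost: float,
) -> dict:
    """Estimate savings relative to a pre-adoption baseline.

    Assumption: the whole difference from baseline is credited to the tools.
    """
    if annual_tool_cost <= 0:
        raise ValueError("annual_tool_cost must be positive")

    hours_saved = (
        baseline_hours_per_change - current_hours_per_change
    ) * changes_per_year
    gross_savings = hours_saved * loaded_hourly_cost
    net_savings = gross_savings - annual_tool_cost

    return {
        "hours_saved": hours_saved,
        "gross_savings": gross_savings,
        "net_savings": net_savings,
        "roi": net_savings / annual_tool_cost,
    }
```

Every output depends on `baseline_hours_per_change`. If that number was never recorded, the function has nothing to subtract from, which is the unanswerable counterfactual in code form.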
Why It Matters
- Silence kills the budget - when finance asks for AI ROI and gets silence, the default assumption is "we don't know if this is working" - which justifies cutting the budget; teams with data protect their tools, teams without data lose them
- Silence prevents scaling - you can't make the case for expanding AI usage from 10 developers to 100 if you can't show what the 10 developers gained; measurement is the prerequisite for organizational scaling
- Silence erodes leadership credibility - engineering leaders who champion AI investment and can't produce ROI data look like they made decisions on hype; one bad budget cycle can set AI adoption back a year
- The window for baseline data closes - every month without measurement is a month of baseline data that's permanently lost; when you eventually instrument AI impact, you can never fully reconstruct the pre-AI comparison
- Silence gets filled by proxy metrics - in the absence of real ROI data, organizations default to proxies like PR count or lines of code; these metrics are gameable and misleading, and they incentivize the wrong behaviors
How Different Roles See It
Bob is in a quarterly business review when the CFO asks: "We're spending $400K a year on AI coding tools. What's the return?" Bob knows the tools are valuable - his developers love them - but he has no data to cite. He mentions that developers seem more productive, that he's heard positive feedback, and that he'll get better data for the next review. The CFO makes a note.
Sarah has been asked to produce a report on AI tool ROI for the annual engineering review. She starts pulling data and realizes she has almost nothing: some license invoices, some developer survey responses from 8 months ago, and GitHub commit history. There's no systematic measurement of AI impact anywhere.
Victor wants to advocate for expanding the team's AI investment - more agent workflows, dedicated CI infrastructure, team-wide MCP server setup. But every time he makes the case, leadership asks "but what's the ROI on what we've already invested?" and he doesn't have a clean answer.