Maturity Matrix

DORA + basic AI tracking

  • DORA metrics are tracked consistently with a dashboard
  • AI tool license count vs. active usage rate is measured
  • PR throughput per developer is tracked
  • AI acceptance rate (% of AI suggestions accepted) is measured per tool
  • Metrics are reviewed in team retrospectives at least monthly
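
As a rough illustration of the two AI-specific ratios in the list above, the sketch below computes license utilization and acceptance rate per tool. The ToolUsage record, its field names, and the sample numbers are assumptions made for this example; real vendor exports will use different shapes and names.

```python
from dataclasses import dataclass


@dataclass
class ToolUsage:
    """Hypothetical per-tool usage export; all field names are assumptions."""
    tool: str
    licenses_purchased: int
    active_users: int          # users with at least one AI interaction in the period
    suggestions_shown: int
    suggestions_accepted: int


def utilization_rate(u: ToolUsage) -> float:
    """Share of purchased licenses that were actually used in the period."""
    if u.licenses_purchased == 0:
        return 0.0
    return u.active_users / u.licenses_purchased


def acceptance_rate(u: ToolUsage) -> float:
    """Share of shown AI suggestions that developers accepted."""
    if u.suggestions_shown == 0:
        return 0.0
    return u.suggestions_accepted / u.suggestions_shown


# Illustrative numbers only, not real data.
usage = [
    ToolUsage("assistant-a", 50, 32, 12_000, 3_600),
    ToolUsage("assistant-b", 20, 5, 1_500, 300),
]

for u in usage:
    print(f"{u.tool}: utilization {utilization_rate(u):.0%}, "
          f"acceptance {acceptance_rate(u):.0%}")
```

Keeping the two ratios separate per tool matters: a tool can have high license utilization and a low acceptance rate, or the reverse, and the two cases call for different follow-up.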

Evidence

  • DORA metrics dashboard with current data
  • License utilization report (licenses purchased vs. active users)
  • PR throughput chart showing per-developer breakdown

What It Is

At L2 (Guided), teams have moved past the L1 silence on metrics. They track the four DORA metrics - deployment frequency, lead time for changes, change failure rate, and mean time to restore - and have added a basic layer of AI-specific tracking on top. "Basic AI tracking" means capturing the signals that DORA doesn't cover: what share of your PRs are AI-assisted, how many developers actively use AI tools each week, and how AI-labeled PRs compare to PRs without an AI label on review time and defect rate.
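
To make the four DORA metrics concrete, here is a minimal sketch that derives them from a list of deployment records. The Deployment record, its field names, and the sample timestamps are assumptions for illustration; in practice the data would come from your CI/CD and incident tooling, and teams differ on exactly how they define each metric.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical deployment record; field names are assumptions."""
    commit_at: datetime                     # first commit of the change
    deployed_at: datetime                   # when the change reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # set only for failed deployments


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deploys: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics over a reporting period."""
    lead = [hours(d.deployed_at - d.commit_at) for d in deploys]
    failures = [d for d in deploys if d.failed]
    restore = [hours(d.restored_at - d.deployed_at)
               for d in failures if d.restored_at is not None]
    return {
        "deployment_frequency_per_day": len(deploys) / period_days,
        "median_lead_time_hours": median(lead) if lead else None,
        "change_failure_rate": len(failures) / len(deploys) if deploys else None,
        "mean_time_to_restore_hours": mean(restore) if restore else None,
    }


# Illustrative data: four deployments in a 7-day window, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deployment(t0, t0 + timedelta(hours=10)),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=20)),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=30),
               failed=True, restored_at=t0 + timedelta(days=2, hours=32)),
    Deployment(t0 + timedelta(days=4), t0 + timedelta(days=4, hours=40)),
]

metrics = dora_metrics(deploys, period_days=7)
for name, value in metrics.items():
    print(name, value)
```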

The DORA + basic AI tracking combination is the minimum viable measurement stack for a team that has made a meaningful AI investment. DORA gives you delivery performance. Basic AI tracking gives you the first correlation signal: are the developers who use AI tools more also performing better on DORA metrics? This correlation is imperfect - it doesn't prove causation, it doesn't control for developer seniority, and it doesn't account for which AI usage patterns are producing results - but it's the first step toward evidence-based AI program management.
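
One simple way to look at that first correlation signal is a Pearson correlation between per-developer AI usage and a delivery metric. The sketch below is only an illustration: the choice of weekly active days and median lead time as the two variables, and the sample values, are assumptions, and a coefficient like this carries all the caveats listed above.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Illustrative per-developer values, not real data.
ai_active_days_per_week = [5, 4, 4, 2, 1, 0]
median_lead_time_hours = [20, 26, 24, 40, 38, 45]

r = pearson(ai_active_days_per_week, median_lead_time_hours)
# A negative r here means more AI usage goes with shorter lead times.
print(f"usage vs. lead time: r = {r:.2f}")
```

With only a handful of developers the coefficient is very noisy, so treat it as a prompt for further questions rather than a result to report upward.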

The word "basic" is important. Basic AI tracking at L2 does not include ITS, CPI, TORS, or Agent Autonomy Score - those are L3/L4 metrics that require more sophisticated instrumentation. Basic tracking is: usage rates, AI-labeled PR percentage, and a first pass at comparing AI-assisted vs. non-AI-assisted PR cycle times. This is achievable with a few GitHub Actions, a labeling convention, and a simple dashboard.
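
The basic tracking described above can be sketched in a few lines once PRs carry a label. In this example the label name "ai-assisted", the PR dictionary shape, and the sample timestamps are all assumptions; substitute whatever your labeling convention and PR export actually provide.

```python
from datetime import datetime
from statistics import median

AI_LABEL = "ai-assisted"  # assumed labeling convention, not a standard

# Illustrative PR records, not real data.
prs = [
    {"labels": ["ai-assisted"], "opened": "2024-03-01T09:00:00", "merged": "2024-03-01T15:00:00"},
    {"labels": ["ai-assisted", "bug"], "opened": "2024-03-02T10:00:00", "merged": "2024-03-02T20:00:00"},
    {"labels": [], "opened": "2024-03-01T09:00:00", "merged": "2024-03-02T09:00:00"},
    {"labels": ["docs"], "opened": "2024-03-03T08:00:00", "merged": "2024-03-03T20:00:00"},
    {"labels": ["ai-assisted"], "opened": "2024-03-04T09:00:00", "merged": "2024-03-04T17:00:00"},
]


def cycle_hours(pr: dict) -> float:
    """Hours from PR opened to PR merged."""
    opened = datetime.fromisoformat(pr["opened"])
    merged = datetime.fromisoformat(pr["merged"])
    return (merged - opened).total_seconds() / 3600


def summarize(prs: list[dict]) -> dict:
    """AI-labeled PR share plus median cycle time for each group."""
    ai = [cycle_hours(p) for p in prs if AI_LABEL in p["labels"]]
    other = [cycle_hours(p) for p in prs if AI_LABEL not in p["labels"]]
    return {
        "ai_pr_share": len(ai) / len(prs) if prs else 0.0,
        "ai_median_cycle_hours": median(ai) if ai else None,
        "other_median_cycle_hours": median(other) if other else None,
    }


summary = summarize(prs)
print(summary)
```

The median is used rather than the mean because a few long-running PRs can otherwise dominate the comparison.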

Most teams at L2 discover a pattern in their data that surprises them: there is a significant spread between high-usage and low-usage developers on DORA metrics. Developers who use AI tools daily and have developed good prompting and review habits are measurably faster. Developers who have licenses but use them inconsistently show no throughput improvement. This insight - that usage rate is a better predictor of impact than license count - is the core finding that drives L2 to L3 progression.
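
The high-usage vs. low-usage spread can be checked with a simple cohort split. The sketch below assumes a per-developer table of AI-active days and median PR cycle time; the 15-day threshold, the tuple layout, and the sample values are illustrative assumptions, not recommended cut-offs or real results.

```python
from statistics import median

# Assumed cut-off: "high usage" means AI-active on at least 15 of the
# last 20 working days. Pick a threshold that fits your own data.
HIGH_USAGE_MIN_DAYS = 15

# (developer, AI-active days in last 20 working days, median PR cycle hours)
# Illustrative values, not real data.
developers = [
    ("dev-1", 19, 9.0),
    ("dev-2", 17, 11.0),
    ("dev-3", 16, 10.0),
    ("dev-4", 6, 18.0),
    ("dev-5", 3, 22.0),
    ("dev-6", 0, 20.0),
]


def cohort_medians(devs, threshold):
    """Median cycle time for the high-usage and low-usage cohorts."""
    high = [cycle for _, days, cycle in devs if days >= threshold]
    low = [cycle for _, days, cycle in devs if days < threshold]
    if not high or not low:
        raise ValueError("both cohorts need at least one developer")
    return median(high), median(low)


high_median, low_median = cohort_medians(developers, HIGH_USAGE_MIN_DAYS)
relative_reduction = (low_median - high_median) / low_median

print(f"high-usage median cycle time: {high_median:.1f} h")
print(f"low-usage median cycle time:  {low_median:.1f} h")
print(f"relative reduction:           {relative_reduction:.0%}")
```

Note that license holders with zero active days land in the low-usage cohort here, which is what separates usage rate from license count as a predictor.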

Why It Matters

  • Combines established and new signals - DORA metrics are trusted by engineering leadership and finance; pairing them with AI tracking gives AI metrics the credibility boost of being presented alongside a recognized framework
  • Surfaces the usage/impact correlation - basic AI tracking almost always reveals that high-usage developers outperform low-usage developers on delivery metrics; this data makes the case for adoption programs that move low-usage developers upward
  • Creates accountability for AI tool ROI - a team tracking DORA + AI metrics can answer the ROI question with data rather than anecdotes; "our AI-high-usage cohort has 40% shorter PR cycle time than our low-usage cohort" is a defensible ROI claim
  • Establishes the foundation for L3 metrics - ITS, CPI, and TORS require understanding of the AI-assisted PR workflow; teams that have already labeled PRs and tracked basic AI metrics have the data hygiene foundation to add these more sophisticated metrics later
  • Makes adoption gaps visible - basic tracking reveals which teams or individuals are not adopting AI tools effectively; this makes targeted intervention possible instead of letting low adoption persist unnoticed

How Different Roles See It

Bob, Head of Engineering

Bob has established DORA tracking across his teams and is now seeing that deployment frequency is up year-over-year, but he can't tell how much of that is AI tools vs. the new CI pipeline they rebuilt in Q2. He wants to isolate the AI contribution.

Sarah, Productivity Lead

Sarah has developer usage rate data from Copilot Business and basic DORA metrics from GitHub Insights. She wants to connect the two - to show that developers using Copilot more are shipping faster - but she's not sure how to do the analysis rigorously.

Victor, Staff Engineer - AI Champion

Victor runs agent-heavy workflows and knows the DORA + basic tracking metrics are table stakes. He's already thinking about ITS and CPI, the L3 metrics. But he recognizes that the team needs to build the L2 foundation before jumping to L3.
