Development · L4 Optimized · Code Review & Quality

Policy-based auto-approval: 60%+ Green target

Setting a 60%+ Green rate as an organizational policy turns code quality into a measurable team KPI - and makes the auto-merge system self-reinforcing as teams work to qualify more PRs.

  • Automated Green/Yellow/Red classification runs on every PR
  • Green-classified PRs auto-merge without human review
  • Auto-approve rate target of 60%+ Green PRs is tracked and reported
  • Yellow PRs receive expedited human review (within 1 hour)
  • Classification model accuracy is validated monthly against human review outcomes
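The classification rules above can be sketched in code. This is a minimal illustrative model, not the real classifier: the field names (`test_coverage`, `lint_errors`, `ai_review_flags`, `touches_high_risk_path`) and the thresholds are assumptions chosen to match the criteria in the list.

```python
# Hypothetical sketch of per-PR Green/Yellow/Red classification.
# All field names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class PullRequest:
    test_coverage: float       # fraction of changed lines covered by tests
    lint_errors: int
    lines_changed: int
    ai_review_flags: int       # issues raised by the AI review agent
    touches_high_risk_path: bool

def classify(pr: PullRequest) -> str:
    """Return "Green" (auto-merge), "Yellow" (expedited review), or "Red"."""
    # Hard failures go straight to Red.
    if pr.lint_errors > 0 or pr.ai_review_flags > 2:
        return "Red"
    # Green requires coverage, safe size, no AI flags, and no high-risk files.
    green = (
        pr.test_coverage >= 0.8
        and pr.lines_changed <= 400
        and pr.ai_review_flags == 0
        and not pr.touches_high_risk_path
    )
    return "Green" if green else "Yellow"
```

A well-tested 120-line change outside high-risk paths would score Green here; the same change touching a high-risk file drops to Yellow, matching the behavior Victor observes later in this page.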

Evidence

  • Dashboard showing Green/Yellow/Red distribution across PRs
  • Auto-merge logs for Green PRs with zero post-merge reverts
  • Monthly auto-approve rate report showing 60%+ Green target tracking
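The monthly auto-approve rate report reduces to a simple per-team aggregation. This is a hedged sketch assuming each PR record is a `(team, classification)` pair; the record shape and function names are illustrative.

```python
# Illustrative monthly Green-rate report against the 60% policy target.
from collections import Counter

TARGET = 0.60  # 60%+ Green organizational target

def green_rate_report(prs):
    """prs: iterable of (team, classification) pairs for the month.
    Returns {team: {"green_rate": float, "meets_target": bool}}."""
    totals, greens = Counter(), Counter()
    for team, label in prs:
        totals[team] += 1
        if label == "Green":
            greens[team] += 1
    return {
        team: {
            "green_rate": greens[team] / totals[team],
            "meets_target": greens[team] / totals[team] >= TARGET,
        }
        for team in totals
    }
```

A team with 3 Green PRs out of 4 reports a 75% rate and meets the target; 1 Green out of 3 reports 33% and does not.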

What It Is

Policy-based auto-approval with a 60%+ Green target is the organizational layer on top of the Green auto-merge system. Rather than just enabling auto-merge and letting teams use it however they want, the 60% target sets an explicit expectation: 60% or more of a team's PRs should score Green and auto-merge each week.

The target is both a quality signal and a productivity signal. A team consistently hitting 60%+ Green is a team that writes code meeting a high quality bar before review: good test coverage, clean lint, well-structured changes of safe size, no AI review issues. A team below 60% has one or more systemic quality problems - test coverage gaps, persistent lint violations, oversized PRs, or changes consistently touching high-risk areas without the quality to match.

The 60% figure represents a specific balance: it's enough automation to meaningfully reduce the review bottleneck (if 60% of PRs auto-merge, human reviewers handle 40% of the volume they would otherwise handle), while keeping human review for the complex and high-risk 40%. Teams far above 60% might be defining Green too loosely. Teams far below 60% are leaving significant efficiency gains unrealized.

The policy creates an incentive structure: teams want to be above 60% because it means their code is flowing quickly to production. When a team's Green rate drops, engineering leads investigate: what changed? Is test coverage degrading? Are PRs getting larger? Is the AI review agent flagging more issues? The metric surfaces process problems before they become quality incidents.
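The "investigate when the rate drops" trigger can be made mechanical: compare the current week's Green rate to a trailing baseline and flag large drops. A minimal sketch, assuming weekly rates are already computed; the 10-point drop threshold is an assumption, not a stated policy.

```python
# Hedged sketch of a Green-rate drop alert: flag a team when this week's
# rate falls more than `threshold` below its recent trailing average.
def green_rate_dropped(history, current, threshold=0.10):
    """history: list of recent weekly Green rates (0.0-1.0) for one team.
    Returns True when the drop exceeds the threshold and merits a look."""
    if not history:
        return False  # no baseline yet, nothing to compare against
    baseline = sum(history) / len(history)
    return (baseline - current) > threshold
```

A team averaging ~63% that falls to 48% in one week gets flagged for the lead to dig into coverage, PR size, or AI review findings; normal week-to-week noise does not.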

Why It Matters

The 60% target transforms auto-merge from a technical feature into an organizational practice:

  • Creates accountability - Teams have a quality target they're measured against. The Green rate is visible, comparable across teams, and tracked over time.
  • Drives process improvement - Teams that care about their Green rate invest in the practices that raise it: better test writing, smaller PRs, using AI review to pre-check before submitting. The metric incentivizes the right behaviors.
  • Makes trade-offs explicit - A team working in a high-risk area (core security, payment processing) will naturally have a lower Green rate because more changes qualify as Yellow or Red. The policy accommodates this: the target for high-risk teams might be 40% rather than 60%. Making this explicit is better than pretending all teams are the same.
  • Produces fleet-level visibility - With the Green rate tracked per team and per repository, engineering leadership can see quality trends across the organization. A sudden drop in one team's Green rate signals a quality problem that merits investigation.
  • Compounds with AI code generation - At L4-L5, AI agents are generating significant code volumes. Agent-generated code that's well-structured and tested should score Green at high rates. If agents are producing code that consistently scores Yellow or Red, the agent configuration needs tuning.

The 60% target also makes the business case for quality investments self-evident. Every percentage point increase in the Green rate is measurable throughput improvement: more PRs auto-merge, less time in review queues, faster cycle time. Quality and efficiency are the same metric.

Tip

Publish the Green rate leaderboard internally. Teams that are consistently at 70%+ are doing something right - make it visible. Teams at 30% are struggling with something specific - offer support rather than blame. The metric works best as a learning tool, not a performance evaluation.
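Publishing the leaderboard is straightforward to automate. A sketch under illustrative assumptions: team rates arrive as a dict, and teams below 40% get a support marker rather than a ranking penalty, in the spirit of the tip above.

```python
# Illustrative Green-rate leaderboard, best first, with a "support" marker
# for struggling teams instead of blame. Threshold is an assumption.
def leaderboard(rates, support_below=0.40):
    """rates: {team: green_rate}; returns formatted lines, highest rate first."""
    lines = []
    for team, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
        note = " (offer support)" if rate < support_below else ""
        lines.append(f"{team}: {rate:.0%}{note}")
    return lines
```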

Getting Started

6 steps to get from here to the next level

Common Pitfalls

Mistakes teams actually make at this stage - and how to avoid them

How Different Roles See It

Bob - Head of Engineering

Bob has enabled auto-merge and is tracking the Green rate across his 8 teams. Four teams are consistently above 55%, two are at 40%, and two are below 30%. The sub-30% teams are his most experienced teams working on the core transaction processing service. He's not sure whether the low rate reflects a problem or just the nature of their work.

What Bob should do - role-specific action plan

Sarah - Productivity Lead

Sarah has been reporting PR cycle time to her stakeholders for a year. Now that auto-merge is enabled, she's watching cycle time fall - but she wants a metric that shows the underlying quality trend, not just the throughput trend. She wants to show that the team is getting better at writing code that meets quality standards, not just that they've automated the merge step.

What Sarah should do - role-specific action plan

Victor - Staff Engineer, AI Champion

Victor is now focused on helping the teams below 40% improve their Green rate. Digging into the data for the team at 28%, he has identified the root cause: 60% of their PRs score Yellow because they touch authentication code, which is on the high-risk file list. The authentication code needs changes more often than the team expected when they categorized it.

What Victor should do - role-specific action plan