Development - L1 Ad-hoc - Code Review & Quality

No distinction between AI-generated and human-written code

At L1, teams can't see how much of their codebase is AI-generated - making it impossible to measure adoption, calibrate review depth, or understand quality patterns.

  • All code is reviewed by a human before merge
  • No automated review tooling beyond basic CI checks
  • Code review turnaround is tracked (even if slow)
  • Team is aware that AI-generated code has higher defect rates (1.7x more issues, 2.74x more security vulnerabilities)

Evidence

  • PR approval records showing human reviewer on every merged PR
  • Average review turnaround time in PR analytics
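Gathering this evidence can be as simple as a small script over exported PR data. A minimal sketch, assuming each PR record carries ISO-8601 `opened_at` and `approved_at` timestamps (field names are illustrative; adapt them to whatever your PR analytics export actually uses):

```python
from datetime import datetime

def avg_review_turnaround_hours(prs):
    """Average hours between a PR being opened and receiving approval.

    PRs with no approval yet (approved_at is None) are skipped.
    """
    deltas = [
        (datetime.fromisoformat(pr["approved_at"])
         - datetime.fromisoformat(pr["opened_at"])).total_seconds() / 3600
        for pr in prs
        if pr.get("approved_at")
    ]
    return sum(deltas) / len(deltas) if deltas else None

prs = [
    {"opened_at": "2026-04-01T09:00:00", "approved_at": "2026-04-01T15:00:00"},
    {"opened_at": "2026-04-02T10:00:00", "approved_at": "2026-04-03T10:00:00"},
    {"opened_at": "2026-04-03T08:00:00", "approved_at": None},  # still in review
]
print(avg_review_turnaround_hours(prs))  # (6 + 24) / 2 = 15.0
```

Even a crude number like this satisfies the "turnaround is tracked, even if slow" criterion and gives a baseline to improve against.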

What It Is

At L1, AI-generated code and human-written code are indistinguishable in the repository. A developer uses Copilot to generate a function body, accepts the suggestion, and commits it. The commit looks identical to a commit where the developer typed every character manually. There are no labels, no metadata, no annotations - nothing in the version history reveals the proportion of code that came from an AI model.

This lack of distinction is the natural starting point. When developers first adopt autocomplete or chat-based AI, they're adding a tool to their personal workflow without changing how they interact with the team's shared systems. The code looks the same, passes the same CI checks, gets reviewed the same way. There's no reason - yet - to distinguish the provenance.

The problem emerges when AI code becomes a significant fraction of commits. As adoption grows from one enthusiast to half the team, the organization develops a meaningful blind spot: it can't measure AI adoption at all. It can't answer "what percentage of our new code is AI-generated?" It can't tell whether AI-generated code has different defect rates than human-written code. It can't calibrate review depth for changes where AI did most of the work versus changes that required careful human reasoning. All of this information is invisible because no one added provenance tracking when adoption was small.

The solution isn't restriction - it's instrumentation. Teams don't need to prevent AI code, they need to see it.
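Instrumentation can start as small as an agreed commit-message trailer. A minimal sketch of measuring adoption that way; the `AI-assisted:` trailer name is an assumption, not an existing standard, so any consistent convention your team agrees on works equally well:

```python
def ai_assisted_share(commit_messages):
    """Fraction of commits carrying an 'AI-assisted: yes' trailer.

    Trailers are plain lines in the commit message body, so this works
    on the output of `git log --format=%B` with no extra tooling.
    """
    def is_ai(msg):
        return any(
            line.strip().lower() == "ai-assisted: yes"
            for line in msg.splitlines()
        )
    if not commit_messages:
        return 0.0
    return sum(is_ai(m) for m in commit_messages) / len(commit_messages)

messages = [
    "Add retry logic to uploader\n\nAI-assisted: yes",
    "Fix typo in README",
]
print(ai_assisted_share(messages))  # 0.5
```

The trailer costs developers a few keystrokes per commit, and once it exists, the "what percentage of our new code is AI-generated?" question becomes a one-liner over the git history rather than guesswork.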

Why It Matters

The invisibility of AI-generated code creates several compounding problems as teams move up the maturity ladder:

  • Adoption is unmeasurable - If you can't see how much of the codebase is AI-generated, you can't measure whether your AI tooling investment is being used. ROI calculations become guesswork.
  • Quality patterns are invisible - If AI-generated code has different defect rates (in either direction), you can't discover that pattern without provenance data. You can't improve what you can't see.
  • Review calibration is impossible - A reviewer who knows a PR is 90% AI-generated might (correctly) spend more time on business logic correctness and less on syntax. Without that signal, review effort is allocated uniformly regardless of how the code was produced.
  • Compliance risk accumulates silently - Some industries and clients require disclosure of AI-generated content. Without tracking, compliance is effectively impossible.
  • Context is lost for future analysis - Code that seems fine today may turn out to have been generated by a model with a known weakness. Without provenance, you can't audit which parts of your codebase might be affected.

As of April 2026, the cost of not distinguishing AI from human code is no longer theoretical - it's quantifiable. Studies show AI-generated code has 1.7x more issues than human-written code, with 2.74x more security vulnerabilities. In March 2026 alone, 35 new CVEs were attributed to AI-generated code. Not distinguishing AI code provenance is now a measurable security risk, not just a process gap. Organizations without attribution cannot audit which parts of their codebase may be affected by model-specific weaknesses, cannot prioritize security review for AI-heavy modules, and cannot demonstrate compliance with emerging AI code disclosure requirements.

The transition from L1 to L2 on this dimension is the shift from "we know AI is being used somewhere" to "we can see where AI code is in our repository and how it behaves." That visibility is the foundation for L3 systematic measurement and L4 policy-based automation.

Tip

Start tracking AI code origin at the PR level - a simple label or PR template checkbox is enough to begin. You don't need line-level attribution to get actionable data. Even coarse-grained data ("this PR was primarily AI-generated") is far more useful than no data at all.
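Once PRs carry the label, the adoption number falls out directly. A sketch, assuming a label named `ai-assisted` and PR records with `merged` and `labels` fields (both names are illustrative placeholders for whatever your platform's export provides):

```python
AI_LABEL = "ai-assisted"  # assumed label name; use whatever your team picks

def ai_pr_share(prs):
    """Percentage of merged PRs carrying the AI label."""
    merged = [pr for pr in prs if pr.get("merged")]
    if not merged:
        return 0.0
    tagged = sum(AI_LABEL in pr.get("labels", []) for pr in merged)
    return 100.0 * tagged / len(merged)

prs = [
    {"merged": True, "labels": ["ai-assisted"]},
    {"merged": True, "labels": []},
    {"merged": False, "labels": ["ai-assisted"]},  # unmerged PRs excluded
]
print(ai_pr_share(prs))  # 50.0
```

Pairing this with defect data per PR is the natural next step: the same label lets you compare issue rates between AI-tagged and untagged PRs, which is the quality-pattern visibility L2 is about.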

Getting Started

6 steps to get from here to the next level

Common Pitfalls

Mistakes teams actually make at this stage - and how to avoid them

How Different Roles See It

Bob - Head of Engineering

Bob's leadership team is asking him to report on AI adoption. He approved Copilot licenses for 50 engineers six months ago. He knows usage is uneven but can't answer basic questions: How many developers are using it weekly? What percentage of new code is AI-generated? Is the code quality different? He has no data.

What Bob should do - role-specific action plan

Sarah - Productivity Lead

Sarah is preparing a quarterly review of the company's AI tooling investment. The CFO wants to know the return on the Copilot licenses. Sarah has no data showing AI impact on productivity because the team has never tracked AI code provenance. She's looking at usage metrics from GitHub's dashboard but they don't connect to outcomes.

What Sarah should do - role-specific action plan

Victor - Staff Engineer, AI Champion

Victor is a heavy Copilot user and has been encouraging his teammates to use it. But in code review, he's noticed he has no way to know which parts of a PR were AI-generated. Sometimes he reviews AI-generated code as if it were carefully hand-crafted, only to discover (from a conversation with the author) that it was a Copilot completion accepted without much scrutiny. He thinks this should be visible.

What Victor should do - role-specific action plan