Development / L2 Guided / Code Review & Quality

Diff awareness - reviewer knows it's AI code

When reviewers know which parts of a PR are AI-generated, they can calibrate review depth to match the actual risk - spending more time on business logic and less on syntax.

  • AI-assisted review tool (CodeRabbit, Qodo, or equivalent) is active on all repositories
  • Linter rules are configured and run in CI on every PR
  • PRs clearly indicate whether code is AI-generated or AI-assisted (labels, tags, or commit metadata)
  • AI review suggestions are triaged (accepted or rejected) rather than ignored
  • Linter configuration is committed to the repository and versioned
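The label requirement above can be enforced mechanically in CI. A minimal sketch, assuming a three-label convention (`ai-generated` / `ai-assisted` / `human-authored`) that is our illustration, not a standard:

```python
# Provenance-label check, e.g. run in CI against the PR's label list.
# The label names are an assumed convention, not part of any specific tool.
PROVENANCE_LABELS = {"ai-generated", "ai-assisted", "human-authored"}

def check_provenance_labels(pr_labels):
    """Return (ok, message): a PR must carry exactly one provenance label."""
    found = PROVENANCE_LABELS.intersection(pr_labels)
    if len(found) == 1:
        return True, f"provenance declared: {found.pop()}"
    if not found:
        return False, "missing provenance label (ai-generated / ai-assisted / human-authored)"
    return False, f"conflicting provenance labels: {sorted(found)}"
```

Failing the build on a missing label is optional at L2; even a warning comment keeps the disclosure habit visible without blocking merges.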

Evidence

  • AI review tool configuration in CI pipeline
  • Linter configuration file in repository
  • PR labels or commit metadata distinguishing AI-generated code

What It Is

Diff awareness is the practice of making AI code provenance visible to reviewers at review time - so a reviewer looking at a pull request knows which parts were AI-generated versus hand-crafted. At L2 (Guided), this is implemented through lightweight conventions: a PR label (ai-assisted), a note in the PR description ("The database query in UserRepository.kt was generated by Claude and reviewed by me"), or a commit message convention that distinguishes AI-generated changes.
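The commit-message variant of this convention can use a git-style trailer (a `Key: value` line in the final paragraph of the message). A sketch, assuming a hypothetical `AI-Assisted:` trailer name:

```python
def ai_assisted_trailer(commit_message):
    """Return the value of a hypothetical 'AI-Assisted:' trailer, or None.

    Git trailers live in the last paragraph of the message, so only the
    final blank-line-separated block is scanned.
    """
    last_block = commit_message.strip().split("\n\n")[-1]
    for line in last_block.splitlines():
        if line.lower().startswith("ai-assisted:"):
            return line.split(":", 1)[1].strip()
    return None

# Example commit message using the assumed convention:
msg = """Add retry logic to UserRepository

Reviewed the generated query by hand before committing.

AI-Assisted: yes (Claude, query generation only)
"""
```

Because trailers survive in `git log`, this variant also leaves a searchable trail once the team later wants to analyze provenance retroactively.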

This awareness is a stepping stone from the L1 state (no distinction between AI and human code) to the L3 state (systematic tracking and policy enforcement). At L2, the mechanism is social and manual - the developer discloses, the reviewer uses the information - rather than automated or enforced. But even manual disclosure provides immediate value: it changes how reviewers allocate their attention.

Knowing a function was AI-generated tells a reviewer something important: the code is likely syntactically correct and stylistically consistent, but may be semantically wrong for the specific business context. AI code tends to be good at the "how" and inconsistent at the "what" - it produces well-structured code that does the wrong thing, or that's missing the crucial edge case that exists only in your specific system's behavior. This is the exact opposite of the typical human-generated code failure mode (correct in intent, inconsistent in execution), and it calls for a different review focus.

Diff awareness doesn't mean AI code gets more review - it means it gets differently focused review. The reviewer can skim the syntax (the AI handled that) and focus on whether the logic matches the business requirement.

Why It Matters

Review calibration based on provenance makes review more efficient and more effective simultaneously:

  • More efficient - Reviewers don't need to read AI-generated boilerplate as carefully as human-written logic. Knowing a function is AI-generated allows skimming the implementation and focusing on the interface and contract.
  • More effective - AI-generated code has characteristic failure modes: plausible-looking but wrong business logic, missing error cases that aren't present in training data analogues, over-engineering for generic cases. Knowing code is AI-generated prompts reviewers to test it against these failure modes specifically.
  • Cultural normalization - When developers disclose AI use, it becomes normal. This is healthier than the alternative (silent AI use), where reviewers are reviewing AI code without knowing it and applying inappropriate review heuristics.
  • Data foundation - PR-level disclosure (even informal) is the seed of the provenance data you'll need for L3 systematic measurement. You can't analyze AI code quality patterns if you don't know which code is AI-generated.
  • Author self-review trigger - The act of writing "this was AI-generated" in a PR description prompts the author to consciously review the AI code before submitting. This alone catches a meaningful number of issues.
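The "data foundation" point can be made concrete with a small tally over exported PR records. A sketch, assuming hypothetical records shaped as `{"labels": [...]}` (your PR export's actual shape will differ):

```python
from collections import Counter

def provenance_tally(prs):
    """Count PRs by provenance label; 'undisclosed' when none (or several) apply.

    Each record is a hypothetical dict with a 'labels' list.
    """
    known = {"ai-generated", "ai-assisted", "human-authored"}
    counts = Counter()
    for pr in prs:
        found = known.intersection(pr.get("labels", []))
        counts[found.pop() if len(found) == 1 else "undisclosed"] += 1
    return counts
```

A high `undisclosed` count is itself useful: it shows how far the disclosure habit has actually spread before you invest in L3 tooling.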

The key insight is that AI code is not uniformly higher or lower quality than human code - it's differently distributed in its failure modes. Review practices that calibrate to those specific failure modes get more value from the same review effort.

Tip

Establish a lightweight convention now rather than a heavyweight one. A single PR label and a sentence in the description is enough to start. Teams that try to track AI code at line-level granularity before they have automated tooling burn out quickly. Get the habit established first; improve the precision later.

Getting Started


Common Pitfalls


How Different Roles See It

Bob - Head of Engineering

Bob's team has been using Copilot for six months. In retrospective discussions, he's heard senior engineers say they sometimes feel like they're reviewing code without knowing how it was produced - some looks AI-generated but they're not sure, which makes them uncertain about review depth. There's no disclosure convention, so every review is flying blind on provenance.


Sarah - Productivity Lead

Sarah has noticed that her PR cycle time metrics show high variance: some PRs close in 2 hours, others take 3 days. She suspects the long-tail PRs involve a lot of revision cycles (review comments, fixes, re-review). She wants to understand whether AI-generated code is correlated with more or fewer revision cycles.
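Once PRs carry a provenance label, Sarah's question becomes a simple grouped comparison. A sketch over hypothetical records shaped as `{"labels": [...], "cycle_hours": float}`:

```python
from statistics import median

def cycle_time_by_provenance(prs):
    """Median PR cycle time (hours) per provenance group.

    Records are hypothetical: {'labels': [...], 'cycle_hours': float}.
    PRs without an 'ai-assisted' label fall into 'other'.
    """
    groups = {}
    for pr in prs:
        key = "ai-assisted" if "ai-assisted" in pr["labels"] else "other"
        groups.setdefault(key, []).append(pr["cycle_hours"])
    return {k: median(v) for k, v in groups.items()}
```

Median rather than mean keeps the long-tail PRs Sarah already sees from dominating the comparison.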


Victor - Staff Engineer, AI Champion

Victor reviews 15-20 PRs per week. He's started recognizing Copilot-generated code by its patterns - a certain way of structuring error handling, characteristic variable names, boilerplate that's slightly too complete to be hand-typed. He's getting good at it, but it feels like guesswork. He wants explicit disclosure so he can calibrate efficiently.
