Agent instruction files in repo (60k+ repos on GitHub)

Agent instruction files - CLAUDE.md, .cursorrules, copilot-instructions.md - have become a standard software project artifact, with 60,000+ repositories on GitHub already containing them.

·CLAUDE.md or equivalent exists with project description, tech stack, and top conventions
·Written coding conventions document exists and is referenced from agent instruction files
·Agent instruction files are committed to the repository (not local-only)

·CLAUDE.md includes explicit prohibitions (banned libraries, anti-patterns)
·Agent instruction files are reviewed as part of the standard PR process

Evidence

·CLAUDE.md, .cursorrules, or .github/copilot-instructions.md in repository root
·Coding conventions document accessible from agent instruction files
·Commit history showing agent instruction file updates
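
The first evidence item can be checked mechanically. The following is a minimal sketch, assuming the three locations listed above are the only ones of interest; the function name `find_instruction_files` is hypothetical, and the check only confirms that a file exists on disk, not that it is committed.

```python
from pathlib import Path
from typing import Union

# Locations taken from the evidence list above, relative to the repository root.
CANDIDATES = ("CLAUDE.md", ".cursorrules", ".github/copilot-instructions.md")


def find_instruction_files(repo_root: Union[str, Path]) -> list[str]:
    """Return the candidate instruction files that exist under repo_root.

    Hypothetical helper: it checks presence on disk only, so a file that is
    local-only (untracked by git) would still be reported.
    """
    root = Path(repo_root)
    return [rel for rel in CANDIDATES if (root / rel).is_file()]
```

An empty result for a repository where AI tools are in daily use is a quick signal that this gate has not been passed.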

May 2026 Update

Three shifts have elevated agent instruction files from "rules in a Markdown file" to "shared spec interface":

DESIGN.md is replacing Figma exports for UI work. Token-efficient plain-text constraints turn out to be 10-100x cheaper than image embeds and produce more consistent agent output. Add DESIGN.md alongside CLAUDE.md / AGENTS.md whenever the agent touches UI.
Spec-Driven Development (SDD) frames AGENTS.md as a living spec. Andrew Ng and JetBrains launched an SDD with Coding Agents course on April 15; Thoughtworks is documenting enterprise adoption. The pattern: specs are not one-shot prompts; they are versioned, reviewed artifacts that both humans and agents read, and they evolve as the system does.
Skills (SKILL.md) are now the unit of reuse (June 2026). A SKILL.md packages a reusable, named capability (instructions plus referenced scripts and resources) that an agent loads on demand, and vendors have started shipping official skill-packs: Apple for Xcode 27 (exportable via xcrun agent skills export), NVIDIA (BioNeMo plus NVIDIA-Verified Agent Skills), Cloudflare (Zero Trust), and Alpaca (June 17). The convention-file stack is converging on AGENTS.md + CLAUDE.md + SKILL.md + llms.txt: the instruction files set durable context, while Skills package the repeatable how-to. Anthropic reports that Claude answers 95% of internal analytics queries accurately (up from 21%) through skills plus data governance rather than a bigger model. That accuracy decays to ~65% within a month if the skills are left untended, so treat Skills as maintained artifacts, not write-once files.

What It Is

Agent instruction files are project-specific configuration documents that tell AI agents how to behave in a given codebase. They go by different names depending on the tool: CLAUDE.md for Claude Code, .cursorrules for Cursor, .github/copilot-instructions.md for GitHub Copilot, AGENTS.md for OpenAI Codex. But they share the same purpose: providing AI agents with the context they need to make useful, convention-respecting suggestions.

As of 2025, these files have proliferated dramatically. Over 60,000 repositories on GitHub contain CLAUDE.md or .cursorrules files - a number that grew by an order of magnitude in 18 months as AI coding tools moved from early adoption to mainstream use. The open-source community has effectively standardized on the concept: when you create a project that uses AI tools, you add an instruction file. It's become as natural as adding a .gitignore or a README.md.

The ecosystem is currently fragmented. Each tool has its own file format and discovery mechanism. Cursor reads .cursorrules from the repository root and project-level .cursor/rules/ directories. Claude Code reads CLAUDE.md files at the root and in subdirectories, and merges them hierarchically. GitHub Copilot reads .github/copilot-instructions.md. Projects serious about AI tooling often maintain multiple files or a single canonical file that they reference from the tool-specific locations.
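
To make the idea of hierarchical merging concrete, here is a simplified model of it, not Claude Code's actual implementation. It assumes that every CLAUDE.md between the repository root and the directory of the file being edited applies, with root-level context first and the most specific file last. The function names `collect_instruction_files` and `merge_instructions` are hypothetical.

```python
from pathlib import Path


def collect_instruction_files(repo_root: Path, target: Path, name: str = "CLAUDE.md") -> list[Path]:
    """Return instruction files from repo_root down to target's directory, root first."""
    repo_root = repo_root.resolve()
    directory = target.resolve()
    if not directory.is_dir():
        directory = directory.parent
    # Raises ValueError if the target lies outside the repository.
    directory.relative_to(repo_root)

    found = []
    current = directory
    while True:
        candidate = current / name
        if candidate.is_file():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent
    # The walk went leaf-to-root; reverse so general context precedes specific.
    return found[::-1]


def merge_instructions(files: list[Path]) -> str:
    """Concatenate files in order, so subdirectory rules appear after root rules."""
    return "\n\n".join(path.read_text(encoding="utf-8").strip() for path in files)
```

Ordering root-first is a design choice in this sketch: it lets a module-level file refine or narrow a project-wide rule simply by appearing later in the merged text.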

What these files contain ranges from minimal (tech stack + run commands) to comprehensive (architectural patterns, forbidden practices, per-module conventions, test requirements). The most effective instruction files are neither too short (missing critical context) nor too long (agents struggle to attend to instructions buried in thousands of lines of text).

Why It Matters

The proliferation of agent instruction files reflects a broader shift in how software projects are organized. Documentation used to be for humans; agent instruction files are documentation for AI. As AI agents become standard collaborators, projects without instruction files are at a structural disadvantage - their agents make more mistakes, require more correction, and produce less consistent code.

The 60k+ repository count shows this is now standard practice, not an experimental edge case
Open-source examples provide templates - you can study how successful projects structure their instruction files before writing your own
The tool ecosystem is converging - despite different file names, the structure and content of effective instruction files are becoming standardized
First-mover effect in your organization - teams with good instruction files pull ahead of teams without them; the gap compounds over time as agents generate more code
Cross-tool compatibility matters - as developers switch tools or use multiple AI tools, a well-structured instruction file that can be referenced from multiple locations reduces maintenance burden

The 60k+ number also serves as a practical benchmark. If you're evaluating whether to invest in writing a CLAUDE.md file, the answer from the open-source community is clear: organizations that take AI-assisted development seriously have already made this investment.

Tip

Before writing your instruction file from scratch, search GitHub for CLAUDE.md or .cursorrules files in repositories using your tech stack. The open-source community has done substantial experimentation with what works - use their examples as a starting point.

Getting Started

Choose your primary file - If your team uses Claude Code as the primary tool, start with CLAUDE.md. If Cursor is dominant, start with .cursorrules. You can always add the others later; start with what your team will benefit from immediately.
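Study open-source examples - Use GitHub code search with path:CLAUDE.md or path:.cursorrules (path: is the file-path qualifier in the current search syntax; older guides use filename:), and add a framework or library name to narrow the results to your stack. Look at highly-starred repositories in your tech stack. Notice the structure and content patterns that appear repeatedly.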
Write the minimum viable file - Project overview (2-3 sentences), tech stack (bulleted list), setup commands (literal commands that work), top 5 conventions (specific, actionable rules). Commit this within a week of deciding to write it.
Add a subdirectory structure for complex projects - For monorepos or projects with distinct domains, create subdirectory-level instruction files. Claude Code reads CLAUDE.md files in subdirectories and merges them with the root file, which allows per-module conventions to coexist with project-wide standards.
Establish a review process - Assign a team member to review and update the instruction file quarterly. Add "update CLAUDE.md if this changes the architecture" to your PR template.
Measure the impact - Compare AI suggestion acceptance rates before and after introducing the instruction file. In most cases, the improvement is significant enough to be obvious without formal measurement.
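
The "minimum viable file" step above can be sketched as a small generator. This is an illustration under stated assumptions: the section headings, the `ProjectContext` type, and the `render_instruction_file` function are all hypothetical, and no tool requires this exact layout.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectContext:
    """Inputs for a first-draft instruction file (hypothetical structure)."""
    overview: str                      # 2-3 sentences describing the project
    tech_stack: list[str]              # one entry per language, framework, or service
    setup_commands: list[str]          # literal commands that work from a clean checkout
    conventions: list[str] = field(default_factory=list)  # top rules, at most 5


def render_instruction_file(ctx: ProjectContext) -> str:
    """Render the four sections of a minimum viable instruction file as Markdown."""
    if len(ctx.conventions) > 5:
        raise ValueError("keep the first version to the top 5 conventions")
    lines = ["# Project overview", "", ctx.overview, ""]
    lines += ["# Tech stack", ""] + [f"- {item}" for item in ctx.tech_stack] + [""]
    lines += ["# Setup commands", ""] + [f"- `{cmd}`" for cmd in ctx.setup_commands] + [""]
    lines += ["# Conventions", ""] + [f"- {rule}" for rule in ctx.conventions]
    return "\n".join(lines) + "\n"
```

Capping the conventions at five mirrors the advice above: the first version should be small enough to commit within the week, and it can grow through the review process afterwards.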

Common Pitfalls

Writing for humans, not agents. Instruction files read by agents should be explicit, structured, and unambiguous. Narrative documentation written for human readers works well in READMEs; agent instruction files benefit from clear section headers, bulleted lists, and precise directives. "Never use the any type" is a better agent instruction than "We try to maintain strong typing where possible."
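
The contrast above between a precise directive and a hedged one can be checked with a rough lint pass. This is a sketch only: the phrase list in `HEDGES` is an assumption chosen for illustration, not an established standard, and `find_vague_directives` is a hypothetical name.

```python
import re

# Assumed list of hedging phrases that tend to weaken a directive.
HEDGES = ("try to", "where possible", "ideally", "if you can", "should probably")
HEDGE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(h) for h in HEDGES) + r")\b",
    re.IGNORECASE,
)


def find_vague_directives(text: str) -> list[tuple[int, str]]:
    """Return (line number, first hedging phrase) for each line that contains one."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = HEDGE_PATTERN.search(line)
        if match:
            findings.append((number, match.group(0).lower()))
    return findings
```

A flagged line is a prompt for a human rewrite rather than an error: some hedges are legitimate, but most can be replaced with a rule an agent can follow literally.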

Putting everything in one flat file. For large repositories, a single long CLAUDE.md becomes unwieldy. Agents' ability to attend to instructions degrades as file length increases. Use a hierarchical structure: root CLAUDE.md for project-wide context, subdirectory CLAUDE.md files for domain-specific conventions.
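
One way to notice when a flat file has outgrown itself is a simple line budget. The sketch below assumes a limit of 200 lines purely for illustration; the source gives no specific threshold, so tune `max_lines` to what works for your tools. The function name `files_over_budget` is hypothetical.

```python
def files_over_budget(files: dict[str, str], max_lines: int = 200) -> dict[str, int]:
    """Map each instruction file path to its line count if it exceeds max_lines.

    `files` maps a path to that file's text. The default of 200 lines is an
    arbitrary placeholder, not a documented limit of any tool.
    """
    over = {}
    for path, text in files.items():
        count = len(text.splitlines())
        if count > max_lines:
            over[path] = count
    return over
```

A file that shows up here is a candidate for splitting into a root file plus subdirectory files, as described above.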

Treating tool-specific files as equivalent. .cursorrules and CLAUDE.md are read by different tools with different behaviors. A .cursorrules file that works well for Cursor autocomplete may not be appropriately structured for a Claude Code agent undertaking a multi-step task. Maintain tool-specific files that are optimized for each tool's behavior.

Not benefiting from the open-source ecosystem. The open-source community has done significant experimentation with agent instruction files. Don't start from scratch when there are thousands of public examples in your tech stack. Borrow liberally, adapting to your specific context.

How Different Roles See It

Bob, Head of Engineering

Bob read about the 60k+ stat in a newsletter and asked his leads how many of their repositories have a CLAUDE.md file. The answer was two out of thirty-seven. Bob knows his team is behind the curve, but a one-time mandate to "add CLAUDE.md files to all repos" isn't the right approach - it'll produce low-quality files that don't help anyone.

What Bob should do: Bob should run a structured rollout. First, identify the five repositories where AI tools are used most heavily - these are where the investment pays off fastest. Second, have a senior engineer produce a high-quality CLAUDE.md for one of these repositories as a reference implementation. Third, run a workshop where other teams use the reference implementation as a template to produce their own. Quality over breadth: five excellent instruction files are worth more than thirty mediocre ones. Bob can then track adoption and quality metrics to build the case for the next wave.

Sarah, Productivity Lead

Sarah wants to benchmark her organization's AI context engineering maturity against industry practice. She's been using the "60k+ repos" data point to argue that agent instruction files are table stakes, not a nice-to-have, but she needs more than a number to make the case for investment.

What Sarah should do: Sarah should commission a one-day audit: for each of the team's main repositories, have a developer spend an hour testing AI tool quality with and without a CLAUDE.md file. The delta in suggestion quality is the concrete evidence. She can also survey developers on the teams that have instruction files vs. those that don't - the satisfaction differential tends to be significant. With both the qualitative and quantitative data, she can make the investment case to stakeholders: instruction file authoring is a small up-front cost, plus light ongoing maintenance, that improves AI tool ROI for the lifetime of the repository.
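
Once suggestions are tallied, the with/without comparison in the audit reduces to simple arithmetic. The sketch below assumes each test session records how many suggestions were shown and how many were accepted; the function names are hypothetical, and the result is a change in percentage points, not a relative improvement.

```python
def acceptance_rate(accepted: int, shown: int) -> float:
    """Fraction of shown suggestions that were accepted."""
    if shown <= 0:
        raise ValueError("shown must be positive")
    if not 0 <= accepted <= shown:
        raise ValueError("accepted must be between 0 and shown")
    return accepted / shown


def acceptance_delta(without_file: tuple[int, int], with_file: tuple[int, int]) -> float:
    """Percentage-point change in acceptance rate; each argument is (accepted, shown)."""
    return 100 * (acceptance_rate(*with_file) - acceptance_rate(*without_file))
```

With an hour of testing per repository the sample is small, so the number is best read as directional evidence alongside the developer survey.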

Victor, Staff Engineer - AI Champion

Victor is curious about how other teams structure their instruction files. He's written a CLAUDE.md for his main repository but isn't sure if he's capturing the right things, or if there are structural patterns he's missing. He wants to learn from the broader ecosystem.

What Victor should do: Victor should spend an afternoon studying the best open-source instruction files in his stack. The GitHub code search path:CLAUDE.md, combined with a stack-specific term such as TypeScript (or whatever his stack is), will surface a wide range of examples. The language: qualifier filters by the language of the matched file, which for CLAUDE.md is Markdown, so it does not narrow results to a stack. He should look for patterns in structure, what gets included, what stays out, and how complex projects handle multi-module conventions. He should then update his own file based on what he learns and write a brief guide for his team: "Here's the structure that works, here's why, here's how to add to it over time." This becomes the template for the rest of the organization's instruction file rollout.

From the Field

Recent releases, projects, and discussions relevant to this maturity level.

akashrmalhotra/3d-portfolio (github.com): 3d portfolio website which devs can use. Developers utilize this React 18 and TypeScript boilerplate to accelerate the deployment of immersive 3D interfaces by abstracting complex Three.js and GSAP (@g…

VoltAgent/awesome-design-md (github.com): Collection of DESIGN.md files that capture design systems from popular websites. Drop one into your project and let coding agents build matching UI. Google Stitch’s DESIGN.md specification replaces traditional Figma exports and JSON schemas with plain-text markdown design tokens, optimizing for LLM token eff…

jonmagic.com: GitHub Copilot Session Search and Resume CLI. Engineering teams are addressing context fragmentation in terminal-based AI by building persistence layers on top of GitHub Copilot CLI's local SQLite storage l…

kyegomez/PROMPTS.md (github.com): Understanding CLAUDE.md, MEMORY.md, SKILLS.md, SOUL.md, and Related Prompting Mechanisms. Engineering teams are standardizing agentic behavior by deploying root-level Markdown files that serve as a version-controlled 'personality package' to overcome…

