L3 (Systematic): Coding Agent Usage

CLI agents (Claude Code, Codex) as primary

How shifting from IDE plugins to CLI-based agents makes AI a programmable, scriptable part of your development workflow rather than a typing assistant.

  • CLI agents (Claude Code, Codex) are the primary coding interface for 50%+ of feature work
  • Per-team or per-repo rules files exist and are maintained with code review
  • Coding conventions are written as explicit, agent-parseable rules (not implicit tribal knowledge)
  • Agent usage is tracked per developer and per repository
  • Agent instruction files follow a standardized template across the organization

Evidence

  • CLI agent session logs or telemetry showing primary usage
  • Rules files in repository with commit history showing regular updates
  • Coding conventions document cross-referenced from agent instruction files

What It Is

CLI agents - Claude Code, OpenAI Codex CLI, Aider, and similar tools - run in your terminal rather than inside an IDE plugin. This architectural difference is more significant than it appears. A CLI agent is a programmable tool: it can be invoked by scripts, chained into workflows, run in CI/CD pipelines, triggered by git hooks, and executed in remote environments without a graphical IDE. An IDE plugin is a convenience feature; a CLI agent is infrastructure.
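To make "agent as infrastructure" concrete, here is a minimal sketch of a script-callable review step. It assumes an agent CLI that accepts a prompt argument and reads context from stdin in non-interactive mode (Claude Code's `claude -p` is one such invocation); the `AGENT_CMD` variable is a hypothetical knob for this sketch, used to swap in Aider, another agent, or a stub for testing.

```shell
# A reviewable-anywhere agent step: callable from scripts, hooks, or CI.
agent_review() {
  # $1: path to a diff (or any file) to review.
  # AGENT_CMD is a hypothetical override; defaults to claude in print mode.
  ${AGENT_CMD:-claude -p} "Review the following diff for obvious bugs:" < "$1"
}

# Example wiring into a git hook (.git/hooks/pre-push):
#   git diff origin/main...HEAD > /tmp/outgoing.diff
#   agent_review /tmp/outgoing.diff
```

The same function body works unchanged in a Makefile target, a CI step, or a cron job — that portability is the architectural difference the paragraph above describes.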

At L3 (Systematic), CLI agents become primary - not supplementary. The developer's main AI interaction is no longer the chat sidebar or inline suggestions, but a terminal session where they describe tasks, the agent executes them, and the developer reviews results. The IDE still exists (with Copilot still running for inline assistance), but the high-leverage work happens in the CLI.

Claude Code is the canonical example: run claude in your project root, give it a task, and it uses tools to read files, make edits, run tests, and iterate until the task is complete. As of March 2026, Claude Code also supports Computer Use (interacting with GUIs, browsers, and desktop applications) and Auto Mode (dynamically choosing between tool use strategies), extending CLI agents beyond pure code tasks into full-environment automation. OpenAI's original Codex CLI is no longer actively developed - OpenAI shifted investment to Codex integrated within ChatGPT, which operates as a cloud-hosted agent rather than a local CLI tool. Aider supports multiple backends and is especially strong for pair-programming style interaction. Gemini Code Assist Agent Mode, now GA on IntelliJ (and other JetBrains IDEs), blurs the CLI/IDE boundary by offering terminal-grade agentic capabilities from within the IDE. All of these tools share the same core architecture: a language model with tool access, running autonomously.

The "primary" designation at L3 reflects a workflow inversion. At L1-L2, developers write code and occasionally use AI to help. At L3, developers describe what they want and the agent writes code, with the developer reviewing and steering. The human role shifts from implementer to orchestrator.

Why It Matters

The move to CLI-first agents is the inflection point where AI assistance becomes systematically integrated rather than ad-hoc:

  • Scriptability - CLI agents can be invoked from Makefiles, shell scripts, CI/CD pipelines, and GitHub Actions; IDE plugins cannot
  • Automation foundation - the same CLI command you run manually can become an automated step in your workflow; this is the path to L4 unattended agents
  • Environment independence - CLI agents run anywhere a terminal runs: local machines, CI runners, remote servers, Docker containers
  • Composability - CLI agents can be chained with other CLI tools; claude "generate tests" | grep TODO is a legitimate workflow
  • Separation of concerns - your IDE focuses on editing; your terminal focuses on agentic tasks; concerns don't compete for the same interface
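As one illustration of the scriptability point, a CI step can invoke the agent headlessly and act on its output. This is a hedged sketch, not a prescribed pipeline: it assumes a prompt-argument CLI such as `claude -p`, and `AGENT_CMD` is a hypothetical override used here so the step can be tested with a stub.

```shell
# Sketch of a CI shell step: capture an agent review, surface a warning line.
agent_ci_review() {
  # $1: output file for the agent's review (defaults to agent-review.txt).
  ${AGENT_CMD:-claude -p} "Summarize this branch's changes and flag risky edits." \
    > "${1:-agent-review.txt}"
  if grep -qi "risk" "${1:-agent-review.txt}"; then
    echo "agent flagged risky edits"
  fi
}
```

Because it is plain shell, the identical function runs in a GitHub Actions `run:` block, a GitLab job, or a local pre-merge script — no plugin host required.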

CLI agents also enable a crucial L3 practice: systematic measurement. When agent invocations are CLI commands, they can be logged, timed, and analyzed. You can measure how long tasks take, how often agents need corrections, and which task types produce the best results. This measurement is what makes L3 systematic rather than just guided.
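The measurement idea can be as simple as a wrapper that times and logs every invocation. This is a sketch of the practice, not a feature of any particular tool: it assumes a prompt-argument CLI like `claude -p`, and `AGENT_CMD` / `AGENT_LOG` are hypothetical knobs introduced for illustration.

```shell
# Wrap each agent invocation so every task is timed and recorded as CSV.
agent_timed() {
  task="$1"
  start=$(date +%s)
  ${AGENT_CMD:-claude -p} "$task"
  status=$?
  end=$(date +%s)
  # Columns: UTC timestamp, seconds elapsed, exit status, task description.
  printf '%s,%s,%s,"%s"\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$((end - start))" "$status" "$task" \
    >> "${AGENT_LOG:-$HOME/.agent-usage.csv}"
  return "$status"
}
```

A month of this log answers exactly the questions named above: task duration, correction frequency by task type, and per-developer usage — the raw material for the L3 measurement practice.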

Tip

Create shell aliases for your most common agent tasks. alias write-tests='claude "write unit tests for the file I just modified, following patterns in existing tests"' turns a multi-step interaction into a single command. These aliases are also the seeds of your L4 automation scripts.
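When the task needs an argument, a shell function composes better than an alias. A hedged variant of the tip above, with a hypothetical function name and prompt wording (adapt both to your conventions; `AGENT_CMD` is an illustration-only override):

```shell
# Function form of the alias tip: takes the target file as an argument.
write_tests() {
  ${AGENT_CMD:-claude -p} "Write unit tests for $1, following patterns in the existing tests."
}

# Usage:  write_tests src/parser.py
```

Functions like this are one step away from the unattended scripts described at L4: the invocation is already parameterized and machine-callable.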

Getting Started

6 steps to get from here to the next level

Common Pitfalls

Mistakes teams actually make at this stage - and how to avoid them

How Different Roles See It

Bob - Head of Engineering

Bob's team is using Copilot well (L2), but adoption of Claude Code has been slower. Developers say the CLI interface feels unfamiliar and they prefer the IDE plugins they're used to. Bob isn't sure whether to push adoption or let it happen organically.

What Bob should do - role-specific action plan

Sarah - Productivity Lead

Sarah can now see a cleaner measurement story at L3. CLI agents produce logs, and logs produce metrics. She wants to establish the measurement infrastructure before the team scales their agent usage.

What Sarah should do - role-specific action plan

Victor - Staff Engineer, AI Champion

Victor has been using Claude Code as his primary development tool for months. He's written shell scripts that invoke it automatically on common task patterns and has integrated it into his personal Makefile. His iteration speed on new features is 3-4x what it was before.

What Victor should do - role-specific action plan