Delivery Management

How we manage delivery in the age of agents: from human PR review to an autonomous delivery pipeline.

4 capabilities · 20 levels · 61 practices · 61 guides

The matrix · full map · L4 is where teams aim

Capability ↓ · Maturity →

L1 · Stage 01 · Ad-hoc
L2 · Stage 02 · Guided
L3 · Stage 03 · Systematic
L4 · Stage 04 · Optimized (sweet spot)
L5 · Stage 05 · Autonomous

01 · CI/CD Pipeline · 15 guides

Speed and reliability of your build-test-feedback loop for AI-generated code

L1: CI runs; everyone waits (3 practices · 3 guides)
L2: Ten minutes, basic caching (3 practices · 3 guides)
L3: Five minutes, incremental by default (3 practices · 3 guides)
L4: Two minutes in an isolated microVM (3 practices · 3 guides)
L5: Feedback in seconds, capacity on demand (3 practices · 3 guides)

02 · Merge & Deploy · 15 guides

How PRs flow from creation to production - throughput, automation, and conflict handling

L1: Every merge has a human in it (3 practices · 3 guides)
L2: A queue does the rebasing (3 practices · 3 guides)
L3: Policy decides what merges, and when (3 practices · 3 guides)
L4: Green auto-merges straight to production (3 practices · 3 guides)
L5: A thousand merges a week, agent-driven (3 practices · 3 guides)

03 · Metrics · 16 guides

What you measure to understand AI-assisted engineering productivity and quality

L1: DORA at best; AI still unmeasured (3 practices · 3 guides)
L2: Token spend is finally on a dashboard (3 practices · 3 guides)
L3: You know your cost per merged PR (3 practices · 3 guides)
L4: Auto-approve rate is a managed number (5 practices · 5 guides)
L5: Cost per feature, value per token (2 practices · 2 guides)

04 · Governance & Compliance · 15 guides

Controls around AI-generated code - licensing, security scanning, and audit trails

L1: Personal subscriptions, no policy (3 practices · 3 guides)
L2: A policy exists and spend has caps (3 practices · 3 guides)
L3: Every agent action leaves a trail (3 practices · 3 guides)
L4: Provenance is cryptographic, checks automated (3 practices · 3 guides)
L5: Compliance watches the regulators for you (3 practices · 3 guides)

Climb the matrix

You don't have to figure this out alone.

Every level in this matrix has a path. Read the playbooks written by teams that have already climbed it. Run the assessment with our consultants. Start where you are.

guide · 15 min read

AI-native CI/CD pipeline design

From green/red gates to autonomous deployment triggers.

playbook · 20 min read

Agent-aware incident response

How to structure on-call for agentic systems.

workshop · half-day

Delivery metrics workshop

Measuring what matters in an AI-accelerated pipeline.

Live with Visdom

Book an AI Maturity Assessment session with your team.

We walk you through all four perspectives, score where you actually are, and leave you with a 90-day plan to climb in the dimensions that matter most.

What's included: 90-day plan, scored assessment, coaching.

Author Commentary

July 2026 update: after the bill came due, June reshaped the delivery pipeline around two questions - where does the agent run, and can you prove what it did.

On the cost side, the market turned from tokenmaxxing to efficiency (CNBC, June 26). The new metric is value per token, and token arbitrage became a real pattern: judgment stays on the premium model while code-writing is dispatched to a cheaper specialist. Anthropic's plan to meter Agent SDK usage on a separate credit was announced for June 15, then paused on June 16, so pricing is still a live procurement risk, not a settled cost. FrontierCode added the quality counterweight: track mergeability and post-merge outcomes, because a high pass rate hides "unmergeable slop."
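To make the difference between pass rate and mergeability concrete, here is a minimal Python sketch of the numbers this paragraph points at. Everything in it is an assumption for illustration: the `AgentPR` record, its field names, and the choice of "merged PRs that were not reverted, per million tokens" as a stand-in for value per token. Your own definition of value will differ.

```python
from dataclasses import dataclass


@dataclass
class AgentPR:
    """One agent-authored pull request (hypothetical record shape)."""
    tokens: int             # total tokens spent producing the PR
    cost_usd: float         # model spend attributed to the PR
    passed_ci: bool         # did the pipeline go green?
    merged: bool            # did it actually merge?
    reverted: bool = False  # post-merge outcome: was it rolled back?


def delivery_metrics(prs):
    """Compute pass rate, merge rate, cost per merged PR, and a value-per-token proxy."""
    total_cost = sum(p.cost_usd for p in prs)
    total_tokens = sum(p.tokens for p in prs)
    passed = [p for p in prs if p.passed_ci]
    merged = [p for p in prs if p.merged]
    kept = [p for p in merged if not p.reverted]  # merged and still in place
    return {
        "pass_rate": len(passed) / len(prs) if prs else 0.0,
        "merge_rate": len(merged) / len(prs) if prs else 0.0,
        # All spend counts, including PRs that never merged.
        "cost_per_merged_pr": total_cost / len(merged) if merged else None,
        # Assumed proxy for "value per token": surviving PRs per million tokens.
        "kept_prs_per_million_tokens": (
            len(kept) * 1_000_000 / total_tokens if total_tokens else None
        ),
    }


sample = [
    AgentPR(tokens=100_000, cost_usd=2.0, passed_ci=True, merged=True),
    AgentPR(tokens=200_000, cost_usd=3.0, passed_ci=True, merged=True, reverted=True),
    AgentPR(tokens=300_000, cost_usd=1.0, passed_ci=True, merged=False),
    AgentPR(tokens=400_000, cost_usd=4.0, passed_ci=False, merged=False),
]
metrics = delivery_metrics(sample)
```

In this sample three of four PRs pass CI but only two merge and one survives, which is the gap a pass-rate dashboard alone would not show.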

The pipeline also moved into hardware-isolated, increasingly self-hosted sandboxes. AWS Lambda MicroVMs (June 22, Firecracker, built for AI-generated code) and self-hosted E2B/Daytona keep the agent's code inside your own VPC. GitHub made security validation for third-party coding agents GA (June 9), so CodeQL, dependency scanning, and secret scanning run before a PR is finalized. Provenance entered the merge path with Dapr 1.18 Verifiable Execution (June 11): cryptographic, tamper-evident traces of which agent did what.

Governance gained a new top priority: vendor sovereignty. When Anthropic disabled Fable 5 worldwide in 72 hours and Fable 5 on Bedrock turned out to ship prompts back to the vendor, model/vendor portability and data-residency routing stopped being nice-to-haves. EU AI Act GPAI enforcement powers begin Aug 2, 2026 (a Digital Omnibus deal may delay high-risk duties to Dec 2027).

Stripe Minions is still the L5 north star. The new homework is proving the fleet is worth what it costs, and that you could still run it if your vendor vanished tomorrow.
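One way to picture model/vendor portability combined with data-residency routing is a small preference-ordered router, sketched below in Python. The `Provider` type, the provider names, and the region labels are all hypothetical, and a real router would also weigh cost, capability, and contract terms. The sketch only shows the fallback behavior: if the preferred vendor is switched off, work moves to the next provider that still keeps data in the required region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """A model endpoint you could route agent work to (hypothetical shape)."""
    name: str
    regions: frozenset      # regions where prompts are processed and stored
    available: bool = True  # flip to False when the vendor is disabled


def pick_provider(providers, required_region):
    """Return the first available provider, in preference order,
    that keeps data inside required_region."""
    for provider in providers:
        if provider.available and required_region in provider.regions:
            return provider
    raise LookupError(f"no available provider keeps data in {required_region}")


# Preference order: hosted vendor first, self-hosted model as the fallback.
fleet = [
    Provider("vendor-a", frozenset({"us", "eu"})),
    Provider("self-hosted", frozenset({"eu"})),
]
normal = pick_provider(fleet, "eu")

# Simulate the vendor disappearing: the same call now lands on the fallback.
degraded_fleet = [
    Provider("vendor-a", frozenset({"us", "eu"}), available=False),
    Provider("self-hosted", frozenset({"eu"})),
]
fallback = pick_provider(degraded_fleet, "eu")
```

Raising an error when no provider qualifies is a deliberate choice in this sketch: silently routing to a provider outside the required region would defeat the residency requirement.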
