One-shot unattended agents (Stripe Minions model)
How to launch AI agents that run to completion autonomously - writing code, running tests, fixing errors, and producing a PR without human supervision.
- Unattended agents (Stripe Minions model, Cursor Automations) execute tasks without developer presence
- Agents are invocable from at least two channels (Slack, CLI, Web, PagerDuty)
- Each developer runs 3-5 parallel agent sessions concurrently
- Agent task completion rate without human intervention exceeds 60%
- Agent invocation produces a PR within a defined SLA (e.g., under 30 minutes for standard tasks)
Evidence
- Agent invocation logs from multiple channels with timestamps
- Dashboard showing parallel agent session counts per developer
- PR history showing agent-authored PRs merged without synchronous developer oversight
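The three criteria above (multi-channel invocation, completion rate above 60%, time-to-PR SLA) can be computed directly from invocation logs. A minimal sketch, assuming a hypothetical log schema of (task id, channel, start time, PR-opened time, needed-human flag); the field names and records are illustrative, not from any real system:

```python
from datetime import datetime, timedelta

# Hypothetical log records: (task_id, channel, started, pr_opened, needed_human)
LOGS = [
    ("T1", "slack", datetime(2026, 1, 5, 9, 0),  datetime(2026, 1, 5, 9, 22), False),
    ("T2", "cli",   datetime(2026, 1, 5, 9, 5),  datetime(2026, 1, 5, 9, 48), True),
    ("T3", "web",   datetime(2026, 1, 5, 10, 0), datetime(2026, 1, 5, 10, 19), False),
]

SLA = timedelta(minutes=30)

def completion_rate(logs):
    """Share of tasks finished without human intervention (target: > 60%)."""
    return sum(1 for r in logs if not r[4]) / len(logs)

def within_sla(logs):
    """Share of invocations that produced a PR inside the SLA."""
    return sum(1 for r in logs if r[3] - r[2] <= SLA) / len(logs)

def channels(logs):
    """Distinct invocation channels seen in the logs (target: at least two)."""
    return {r[1] for r in logs}

assert completion_rate(LOGS) > 0.6
assert len(channels(LOGS)) >= 2
```

In practice these would run against the real invocation log store and feed the per-developer dashboard mentioned above.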
What It Is
One-shot unattended agents are AI agents given a task and left to run until they complete it - no human supervision, no confirmation prompts, no mid-task steering. The agent writes code, runs the test suite, reads the output, fixes failures, iterates, and when it reaches a passing state, opens a pull request. The developer reviews the PR; they don't watch the work.
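The run-to-completion loop described above can be sketched in a few lines. This is a sketch under stated assumptions, not a definitive implementation: `run_checks`, `agent_fix`, and `open_pr` are hypothetical hooks onto whatever test runner, agent API, and PR tooling you use.

```python
import subprocess

MAX_ITERATIONS = 10  # give up definitively rather than loop forever

def pytest_runner():
    """Example check: run the project's test suite and capture its output."""
    proc = subprocess.run(["python", "-m", "pytest", "-q"],
                          capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def run_to_completion(task, run_checks, agent_fix, open_pr):
    """One-shot loop: no confirmation prompts, no mid-task steering.

    run_checks() -> (passed, output); agent_fix(task, output) and
    open_pr(task) are hypothetical stand-ins for your agent and PR tooling.
    """
    for _ in range(MAX_ITERATIONS):
        passed, output = run_checks()
        if passed:
            return open_pr(task)           # tests green: hand off a PR
        agent_fix(task, output)            # feed failure output back to the agent
    return "failed: iteration limit hit"   # definitive failure, report back
```

The key design choice is the terminal condition: the agent either reaches a green test run and opens a PR, or hits the iteration limit and reports failure; at no point does it block on a human.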
This pattern is named after Stripe's "Minions" system, described publicly in 2024-2025: a framework for dispatching AI agents to complete tasks in isolated sandboxes. Each "minion" gets a task description, a clean environment, and execution authority. When it succeeds (tests green, PR opened) or fails definitively (stuck in a loop, cannot fix the errors), it reports back. Stripe uses this pattern at scale for tasks like dependency upgrades, API compatibility fixes, and security patches.

While Stripe Minions remains the canonical example, the pattern has since been commercialized. Cursor Automations (2026) provides always-on agents triggered from external systems - Slack messages, Linear tickets, GitHub issues, PagerDuty alerts - that spin up, complete a task, and produce a PR without developer initiation. Claude Code subagents with named @ mentions enable composing multi-agent workflows where a parent agent delegates subtasks to specialized child agents, each running in its own context. The one-shot unattended pattern is no longer experimental infrastructure - it's becoming a standard product feature.
At L4 (Optimized), unattended agents are the default mode for well-defined tasks. The shift from L3 is the removal of the human loop: at L3, a developer runs Claude Code in YOLO mode and monitors the terminal; at L4, the agent runs in a sandbox and the developer checks their email for a PR notification. The developer's attention is decoupled from the agent's execution.
The sandbox is essential. Unattended agents must run in isolated environments - ephemeral VMs, Docker containers, or git worktrees - where their actions cannot affect production systems, other developers' work, or infrastructure outside the task scope. The sandbox provides the blast radius limit that makes autonomous execution safe.
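One way to make the blast radius limit concrete is to construct the agent's launch command so that isolation is the default. A minimal sketch using an ephemeral Docker container; the image name and the `agent run` CLI are hypothetical placeholders, and the flags should be tightened or relaxed per task:

```python
import shlex

def sandbox_command(image, repo_dir, task):
    """Build a `docker run` invocation that confines an unattended agent
    to an ephemeral container with the repo checkout as its only mount."""
    return [
        "docker", "run",
        "--rm",                     # ephemeral: container deleted on exit
        "--network", "none",        # no outbound access by default; relax
                                    # only as far as the task needs (e.g.
                                    # to reach the PR API)
        "-v", f"{repo_dir}:/work",  # blast radius limited to this checkout
        "-w", "/work",
        image,
        "agent", "run", task,       # hypothetical unattended-agent CLI
    ]

cmd = sandbox_command("agent-sandbox:latest", "/tmp/repo-T42", "upgrade lodash")
print(shlex.join(cmd))
```

Git worktrees achieve the same scoping for lighter-weight setups: each agent gets its own working directory and branch, so parallel sessions cannot clobber each other's state.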
Why It Matters
Unattended agents fundamentally change the economics of software development:
- Decouples developer attention from task execution - a developer can have 5 agents running while they focus on architecture, review, or strategy; execution scales independently of human time
- Enables overnight work - agents launched at end of day complete tasks while the developer sleeps; the next morning's inbox contains PRs, not partially finished work
- Makes parallelism natural - one developer can launch multiple agents on different tasks without context switching overhead; each task progresses independently
- Shifts human role to review - developers become reviewers and directors rather than implementers; this is a fundamentally different and more leveraged use of senior engineer time
- Creates reproducible task execution - the same task specification produces the same kind of PR; this consistency enables measurement and optimization
The Stripe Minions model also demonstrates something important about L4 readiness: it requires L2-L3 investment to be safe. Unattended agents without a mature CLAUDE.md produce autonomous bad decisions at scale. The context engineering work of L2-L3 is what makes L4 viable rather than merely risky.
The test for whether a task is suitable for unattended execution is: can you write acceptance criteria that an automated test suite can verify? If yes, the agent can work to those criteria autonomously. If not, the task requires human judgment that an agent cannot substitute for.
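That suitability test can be made mechanical: encode each task with the command that verifies its acceptance criteria, and refuse to dispatch tasks that have none. A minimal sketch with hypothetical task records (field names and task descriptions are illustrative):

```python
def is_unattended_suitable(task):
    """A task qualifies for one-shot execution only if its acceptance
    criteria are expressed as a command an automated check can run."""
    return bool(task.get("acceptance_cmd"))

TASKS = [
    {"id": "T1", "desc": "upgrade lodash to 4.x",
     "acceptance_cmd": "npm test"},
    {"id": "T2", "desc": "make the dashboard feel snappier",
     "acceptance_cmd": None},  # a taste call: needs human judgment
]

dispatchable = [t["id"] for t in TASKS if is_unattended_suitable(t)]
print(dispatchable)  # only tasks with machine-checkable criteria qualify
```

Tasks that fail the gate stay in the human queue; the gate itself becomes a measurable filter on what fraction of the backlog is agent-ready.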
Getting Started
6 steps to get from here to the next level
Common Pitfalls
Mistakes teams actually make at this stage - and how to avoid them
How Different Roles See It
Bob has been hearing about "autonomous agents" for months and is excited about the concept but nervous about the execution. His team is at L3 - they use CLI agents well, have mature CLAUDE.md files, and good test coverage. He wants to pilot unattended agents but doesn't know where to start safely.
What Bob should do - role-specific action plan
Sarah can see that unattended agents have significant ROI potential - if an agent can complete a task while a developer is doing something else, that's pure throughput multiplication. But she needs to measure this to justify the infrastructure investment (sandboxes, orchestration tooling).
What Sarah should do - role-specific action plan
Victor has already run unattended agents manually (launching Claude Code in YOLO mode and walking away). He's seen it work well for test generation and dependency upgrades. He wants to build the automation layer that makes this scalable and repeatable.
What Victor should do - role-specific action plan
Further Reading
6 resources worth reading - hand-picked, not scraped
From the Field
Recent releases, projects, and discussions relevant to this maturity level.