McKinsey's QuantumBlack documented their agentic software architecture in public. The surprise isn't the model choice or the orchestration framework — it's that the whole thing rides on a folder layout, frontmatter metadata, and a single human review at the end. Here's the architecture in plain English, and a working scaffold you can drop into any Claude Code project this afternoon.
McKinsey's QuantumBlack practice published the architecture they use for agentic software development. I was expecting a vendor pitch. What I got was something more useful — a documented, opinionated, surprisingly minimal pattern that any small team can copy.
The headline number from their own materials: roughly 90% of agent-generated code lands accurately, leaving about 10% for human review. The bank-modernization case study they describe — 400 pieces of legacy software, $600M budget — is being executed with humans in supervisory roles over agent squads, each agent contributing to a defined sequence.
That part is the press-release version. The interesting part is the architecture underneath, because they wrote it down.
The pattern is two layers and a folder:
1. An orchestration layer that is deterministic, not agentic — a workflow engine that controls phase transitions and artifact state.
2. An execution layer of specialized agents, each running inside a bounded scope.
3. A folder convention (`.sdlc/`) that holds context, specs, templates, and accumulated knowledge as machine-readable artifacts.
Humans enter the loop exactly once: when the PR opens. The whole feature — specs, architecture decisions, task breakdown, code, tests — is reviewed together. Earlier intervention kills the speed advantage, because every interruption forces a context rebuild.
That's the entire pattern. The reason it works is that it's boring in the right places.
*[Image: architecture overview]*
## Why "Deterministic Orchestration" Matters
The instinct, when you start building agent workflows, is to make the orchestrator itself an agent. A planner agent that decides what to do next. A router agent that picks which sub-agent to call. A supervisor agent that grades the supervisors.
That works in demos. It fails in production for the same reason agentic loops with no exit conditions fail: agents are non-deterministic, and chaining non-determinism multiplies it.
The QuantumBlack pattern flips this. The orchestrator is a plain workflow engine — phase A produces artifact X, phase B reads artifact X and produces artifact Y, phase C reads Y and produces Z. Transitions are gated by automated evaluations, not by an agent's opinion. Phase ordering is fixed.
Inside each phase, an agent does the creative work — write the spec, propose the architecture, generate the tasks, write the code. But the agent is bounded. It has a clear input (the previous phase's artifact), a clear output (this phase's artifact), and a clear exit condition (the artifact passes evaluation).
The result: the system has agentic behavior in the parts where you want creativity, and zero agentic behavior in the parts where you want repeatability. Most of the failure modes of agentic development — looping, oscillation, drift — come from putting non-deterministic logic where deterministic logic belongs.
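Concretely, the deterministic layer can be as small as a fixed loop. A sketch under stated assumptions: `run_agent` is a stub standing in for the real agent invocation, and the per-phase gate script names (`.sdlc/gates/check-<phase>.sh`) are hypothetical, not from QuantumBlack's materials.

```shell
#!/usr/bin/env bash
# Deterministic orchestration sketch: phase order is a fixed array, and a
# transition happens only when the phase's gate exits 0. Nothing in this
# loop asks an agent what to do next.
set -euo pipefail

FEATURE_DIR="${1:-.sdlc/specs/demo}"   # feature folder (default for illustration)
PHASES=(requirements architecture tasks implementation)

run_agent() {                          # stub: the real call would invoke the
  echo "running ${1}-agent against ${FEATURE_DIR}"   # phase agent here
}

for phase in "${PHASES[@]}"; do
  run_agent "$phase"                   # creative work happens inside the phase
  gate=".sdlc/gates/check-${phase}.sh" # hypothetical gate script name
  if [ -x "$gate" ]; then
    "$gate" "$FEATURE_DIR" || { echo "gate failed: ${phase}"; exit 1; }
  fi
done
echo "all phases complete"
```

The agents stay non-deterministic inside `run_agent`; the loop around them never is.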
If you've ever watched an agent confidently re-decide the same architecture question on three different turns of the same session, you already know why this matters.
## The `.sdlc/` Folder
The whole convention rests on this:
```
.sdlc/
  context/      # persistent project knowledge — coding standards, domain glossary, infra map
  specs/        # per-feature specifications, one folder per feature
  templates/    # artifact templates used by every phase
  knowledge/    # accumulated decisions and post-mortems (writes back from PR reviews)
src/            # the actual codebase
```
Every artifact inside `.sdlc/specs/<feature-id>/` carries frontmatter like this:
```yaml
---
id: feat-2026-0427-agentic-sdlc
status: in_review
phase: implementation
parent: spec-2026-0427-agentic-sdlc
artifacts:
  spec: ./spec.md
  architecture: ./architecture.md
  tasks: ./tasks.yaml
  pr: https://github.com/org/repo/pull/4421
created: 2026-04-26
owner: haim
---
```
This is the part that makes the pattern compose. Because every artifact is machine-readable and every artifact references its parent, an agent can be dropped into any phase and reconstruct the full context by walking the tree backwards. There is no hidden state. The entire history of how this feature was reasoned about is in the folder.
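Because the frontmatter is plain text, "walking the tree backwards" needs nothing heavier than awk. A minimal sketch, assuming each artifact id maps to a folder under `.sdlc/specs/` whose `spec.md` carries the `id:` and `parent:` fields shown above; the helper names are mine, not from the article.

```shell
#!/usr/bin/env bash
set -euo pipefail

frontmatter_field() {  # frontmatter_field <file> <key>: first "key: value" hit
  awk -v key="$2" -F': *' '$1 == key { print $2; exit }' "$1"
}

lineage() {            # print the id chain from a leaf artifact up to its root
  local id="$1"
  while [ -n "$id" ] && [ "$id" != "null" ]; do
    echo "$id"
    local file=".sdlc/specs/${id}/spec.md"
    [ -f "$file" ] || break           # parent not materialized yet, so stop
    id="$(frontmatter_field "$file" parent)"
  done
}
```

`lineage feat-2026-0427-agentic-sdlc` would print the feature id, then its parent spec id, and so on up the tree: exactly the context-reconstruction walk described above.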
When the PR reviewer leaves a comment that changes a design assumption, that comment gets written back to .sdlc/knowledge/. The next time an agent is asked to design something similar, that knowledge file is part of its context. The system gets better the more PRs it ships.
This is the closest thing I've seen to documentation that maintains itself.
*[Image: folder structure]*
## The Phased Workflow
The QuantumBlack flow has four phases, each gated:
1. Requirements — agent reads the request and any linked context, produces a spec.md with frontmatter, problem statement, success criteria, scope boundaries.
2. Architecture — agent reads the spec, surveys .sdlc/context/ for relevant patterns, produces architecture.md with component decisions and trade-offs.
3. Tasks — agent decomposes the architecture into a tasks.yaml with explicit dependencies and acceptance criteria per task.
4. Implementation — agent (or parallel agents, one per task) writes code and tests against src/, opens a PR, and links every commit back to the originating task.
Each phase ends with an automated evaluation gate: spec covers the requested scope, architecture is consistent with context/, tasks are independently executable, code passes tests and lints. Only after the gate passes does the next phase start.
The human reviewer does not see the spec when it's written, the architecture when it's drafted, or the tasks when they're decomposed. They see all of it together when the PR opens. The PR is the unit of review.
This sounds wrong the first time you read it. The instinct is to stop the agent at every phase, check, course-correct. The QuantumBlack data — and my own experience running similar loops — says the opposite. The agent is faster, more consistent, and easier to evaluate when you let it complete the full cycle. Mid-flight intervention forces a context rebuild and almost always introduces inconsistency between artifacts. If something is wrong in the spec, you'd rather see it expressed all the way through to code, because then you can see whether it actually breaks anything.
## The 90% Number, Honestly
The "90% accuracy" claim from QuantumBlack's public materials is not a software-delivery speed-up — it's the share of agent-generated code that requires no human correction in their internal benchmarks. The "30%" figure that floats around the same conversation is from a different domain (credit-review workflows). I am separating these on purpose because the temptation to mash all the numbers together is exactly how this kind of architecture pattern gets oversold.
What I think the honest claim looks like is this: with a deterministic orchestrator, bounded agent execution, and a frontmatter-driven artifact tree, a small team can run several full requirements-to-PR cycles per day on features that used to take a week. The agent isn't faster than a senior engineer. It's faster than the coordination overhead between a junior, a tech lead, an architect, and a PR reviewer. That's where the time savings live.
If you remember nothing else from this post, remember that. Agentic development is not a code-generation story. It's a coordination story.
*[Image: phased workflow]*
## Try It Yourself
Here is a working scaffold you can drop into a Claude Code project this afternoon. It builds the .sdlc/ folder, gives you a spec template with frontmatter, and wires four agents — one per phase — that read and write into the tree.
### Step 1 — Scaffold the folder
```bash
mkdir -p .sdlc/{context,specs,templates,knowledge}

cat > .sdlc/templates/spec.md <<'EOF'
---
id:
phase: requirements
parent: null
artifacts:
  spec: ./spec.md
  architecture: ./architecture.md
  tasks: ./tasks.yaml
  pr: null
created:
---

## Problem

## Success criteria
-

## Scope
- In scope:
- Out of scope:

## Open questions
EOF

cat > .sdlc/templates/architecture.md <<'EOF'
---
id:
phase: architecture
parent:
---

## Data flow

## Trade-offs considered

## Decisions
EOF

cat > .sdlc/templates/tasks.yaml <<'EOF'
spec_id:
tasks:
  - id: t1
    title: ""
    depends_on: []
    acceptance:
      - ""
EOF
```
### Step 2 — Define the four phase agents
Drop these into .claude/agents/ (the Claude Code agent directory). Each agent has a single phase scope and a single exit condition. They read and write inside the feature's spec folder, nothing else.
```markdown
---
name: requirements-agent
description: Phase 1 of the .sdlc/ flow — turns a feature request into a spec.md
tools: Read, Write, Edit, Glob, Grep, WebSearch
---

You are the requirements phase of an agentic SDLC.

Read the feature request from the user. Read every file in .sdlc/context/.
Read .sdlc/templates/spec.md. Write a new spec to
.sdlc/specs/<feature-id>/spec.md.

Exit criteria:
- spec.md exists with valid frontmatter
- problem, success criteria, scope are filled in
- open questions are listed (do not invent answers)

Do NOT touch architecture.md, tasks.yaml, or src/.
```
```markdown
---
name: architecture-agent
description: Phase 2 — reads spec.md and produces architecture.md
tools: Read, Write, Edit, Glob, Grep
---

You are the architecture phase of an agentic SDLC.

Read .sdlc/specs/<feature-id>/spec.md. Survey .sdlc/context/ for
existing patterns. Read .sdlc/knowledge/ for prior decisions. Write
architecture.md using .sdlc/templates/architecture.md as the template.

Exit criteria:
- architecture.md frontmatter parent matches the spec id
- every "open question" from the spec is either answered or explicitly
  marked as still-open
- decisions section references existing patterns where applicable

Do NOT modify spec.md. Do NOT write tasks.yaml.
```
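The first exit criterion here is mechanically checkable, so it belongs to the deterministic layer. A hedged sketch of such a gate, in the same spirit as the spec gate in Step 4 (the file name and `field` helper are mine):

```shell
#!/usr/bin/env bash
# .sdlc/gates/check-architecture.sh (illustrative name): verify that
# architecture.md's `parent` frontmatter field equals spec.md's `id`.
set -euo pipefail

field() { awk -v k="$2" -F': *' '$1 == k { print $2; exit }' "$1"; }

check_architecture() {
  local dir="$1" spec_id arch_parent
  spec_id="$(field "$dir/spec.md" id)"
  arch_parent="$(field "$dir/architecture.md" parent)"
  [ -n "$spec_id" ] || { echo "spec.md has no id"; return 1; }
  [ "$arch_parent" = "$spec_id" ] \
    || { echo "parent mismatch: $arch_parent != $spec_id"; return 1; }
  echo "architecture gate OK"
}

# run as a script: check-architecture.sh <feature-dir>
if [ "$#" -ge 1 ]; then check_architecture "$1"; fi
```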
```markdown
---
name: tasks-agent
description: Phase 3 — decomposes architecture into tasks.yaml
tools: Read, Write, Edit, Glob
---

You are the task-decomposition phase.

Read spec.md and architecture.md. Produce tasks.yaml using the template.
Each task must be independently executable, have explicit acceptance
criteria, and declare its dependencies.

Exit criteria:
- every component in architecture.md is covered by at least one task
- task ids are referenced in dependency edges that form a DAG (no cycles)
- acceptance criteria are testable
```
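The DAG criterion is also gate-able without an agent: POSIX `tsort` fails on cyclic input. A sketch that assumes the tasks.yaml shape from the Step 1 template; a real YAML parser would be sturdier than this awk, which only understands `- id:` lines followed by inline `depends_on: [...]` lists.

```shell
#!/usr/bin/env bash
set -euo pipefail

check_tasks_dag() {  # check_tasks_dag <tasks.yaml>: exit 0 iff deps form a DAG
  awk '
    /^ *- id:/       { id = $NF }                 # remember the current task id
    /^ *depends_on:/ {
      line = $0
      gsub(/.*\[|\].*|,/, " ", line)              # keep just the dep ids
      n = split(line, deps, " ")
      for (i = 1; i <= n; i++) print deps[i], id  # edge: dep precedes task
    }
  ' "$1" | tsort > /dev/null                      # tsort exits non-zero on a cycle
}
```

`check_tasks_dag tasks.yaml && echo "tasks gate OK"` slots straight into the gate chain.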
```markdown
---
name: implementation-agent
description: Phase 4 — implements tasks against src/, opens PR
tools: Read, Write, Edit, Glob, Grep, Bash
---

You are the implementation phase.

Read tasks.yaml. For each task in dependency order: implement the change
in src/, write or update tests, run the test suite, commit with a message
that includes the task id (e.g. "feat: <task-id>: short description").

When all tasks pass: open a PR. The PR description must link every commit
back to the originating task and reference the spec, architecture, and
tasks files.

Exit criteria:
- all tests pass
- PR is open
- spec.md frontmatter pr field is updated to the PR URL
- spec.md status is set to in_review
```
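Two of those exit criteria live in spec.md's frontmatter, so the implementation gate can check them with grep alone. A minimal sketch (the gate function name is mine):

```shell
#!/usr/bin/env bash
set -euo pipefail

check_pr_fields() {  # check_pr_fields <spec.md>: frontmatter exit criteria
  grep -q '^pr: https://' "$1"      || { echo "pr field not set"; return 1; }
  grep -q '^status: in_review' "$1" || { echo "status is not in_review"; return 1; }
  echo "implementation gate OK"
}
```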
### Step 3 — Run the pipeline
From the repo root:
```bash
# kick off the requirements phase against a free-form feature request
claude code --agent requirements-agent \
  "Add rate limiting to the public API. Per-IP, 100 req/min, returns 429."

# review the generated spec, then trigger the next phase
claude code --agent architecture-agent \
  "Process spec at .sdlc/specs/feat-2026-0426-rate-limiting/spec.md"

# continue through tasks and implementation
claude code --agent tasks-agent \
  "Process .sdlc/specs/feat-2026-0426-rate-limiting/architecture.md"

claude code --agent implementation-agent \
  "Process .sdlc/specs/feat-2026-0426-rate-limiting/tasks.yaml"
```
### Step 4 — Wire the evaluation gates
The deterministic part is the evaluation between phases. The simplest version is a shell script per phase:
```bash
# .sdlc/gates/check-spec.sh
#!/usr/bin/env bash
set -euo pipefail
SPEC="$1"

# frontmatter exists
head -1 "$SPEC" | grep -q '^---$' || { echo "missing frontmatter"; exit 1; }

# required sections
for section in "## Problem" "## Success criteria" "## Scope"; do
  grep -q "^$section" "$SPEC" || { echo "missing $section"; exit 1; }
done

# success criteria has at least one bullet
awk '/^## Success criteria/ { in_sec = 1; next }
     /^## /                 { in_sec = 0 }
     in_sec' "$SPEC" | grep -q '^- ' \
  || { echo "no success criteria"; exit 1; }

echo "spec gate OK"
```
You can graduate to a real workflow engine (Temporal, Prefect, even GitHub Actions) once the pattern stabilizes. Start with shell scripts. The point of the deterministic layer is that it's auditable and boring, not that it's sophisticated.
### Step 5 — Close the knowledge loop
When a PR review changes an assumption, write the lesson back:
```bash
cat >> .sdlc/knowledge/2026-decisions.md <<'EOF'

## 2026-04-26 — Rate-limiting buckets per-IP, not per-API-key

Reviewer:
Decision: Buckets are keyed by client IP, not API key, because anonymous
endpoints have no key. This overrides the architecture agent's initial
proposal.
EOF
```
The next time an agent designs anything in this surface area, that knowledge file is in its context window. The system learns from its reviewers.
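How those files actually reach the context window is an implementation choice the pattern leaves open; the crudest workable version is keyword retrieval over `.sdlc/knowledge/`. A sketch (the selection strategy is mine, not QuantumBlack's):

```shell
#!/usr/bin/env bash
set -euo pipefail

relevant_knowledge() {  # relevant_knowledge <keyword>: knowledge files worth injecting
  # case-insensitive, recursive, filenames only; empty output when nothing matches
  grep -ril -- "$1" .sdlc/knowledge/ 2>/dev/null || true
}
```

An orchestrator would concatenate whatever `relevant_knowledge "rate limit"` returns into the architecture agent's prompt before the phase starts.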
## What This Doesn't Solve
The pattern is not a silver bullet. It explicitly assumes:
- Your codebase has stable conventions worth distilling into .sdlc/context/. Greenfield projects with no history will struggle until enough decisions accumulate.
- Your test suite is trustworthy. Implementation-agent's exit criterion is "all tests pass." If the tests are weak, the agent will ship weak code that passes weak tests.
- Your reviewers are willing to read full PRs, not approve them in chunks. The single-review-at-the-end discipline only works if the review is real.
- You have someone who maintains .sdlc/templates/ and .sdlc/context/. Templates rot. Context drifts. Owning that maintenance is the new thing teams need to staff.
The most underestimated part is the last one. Documentation has historically failed because there was no incentive to keep it current. In this pattern, the templates and context files are load-bearing — agents read them every cycle, and bad context produces bad output immediately. That feedback loop is what keeps the docs honest.
## The Bigger Picture
Every team that's serious about agentic development is converging on something close to this shape. Anthropic's Managed Agents API gives you the bounded-execution piece. GitHub Agentic Workflows gives you the deterministic-orchestration piece. Spec-driven frameworks (Kiro, Spec Kit, AGENTS.md) are settling on artifact conventions that look very similar to `.sdlc/`.
The thing QuantumBlack got right is making it small enough to copy. The whole pattern is a folder, four phases, four agents, and a shell script per gate. That's it.
If your team is still trying to make a single super-agent do the whole pipeline, this is your sign to break it into phases. The overhead of designing the artifact contracts pays for itself the first time an agent crashes mid-implementation and the next agent picks up exactly where it left off, because everything it needs is already on disk.
Coding was never the bottleneck. Coordination was. Agentic SDLC is the first pattern I've seen that takes that seriously.