2026-05-19
Tutorial: AI-Assisted Project Development — Human in the Loop
The single most important pattern in AI-assisted development is the
human-in-the-loop cadence: a structured, recursive process where every
step is guided by a human and executed with AI assistance. It scales from a
three-minute bug fix all the way up to a six-month project delivery.
This tutorial walks through the full macro-process, shows how each persona
participates, and introduces shrink to fit — the principle that lets you
apply the same pattern at any scale.
What Is the Human-in-the-Loop Cadence?
Most AI-assisted development falls into one of two traps: either the human
micromanages every keystroke (defeating the purpose of AI assistance), or the
human steps away entirely and the agent drifts. The human-in-the-loop cadence
avoids both by defining clear handoff points where human judgment is required
and AI autonomy is bounded.
The cadence follows this macro-structure:
```mermaid
flowchart TD
    A[Project Setup<br/>Diataxis + Deep Dives] --> B[Phase 1: Requirements<br/>Product Marketing]
    B --> C[Phase 2: Architecture<br/>Architect]
    C --> D[Phase 3: PM & Sprint Breakdown<br/>Developer + PM]
    D --> E[Phase 4: Implementation Loops<br/>Sprint Planning → Issues → Dev → Review]
    E --> F{More sprints?}
    F -->|Yes| E
    F -->|No| G[Milestone Review]
    G --> H{More milestones?}
    H -->|Yes| D
    H -->|No| I[Project Completion]
```
Every box is human-in-the-loop — the human approves outputs, makes
decisions, and sets direction. The AI executes, drafts, and accelerates.
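As a rough illustration of the gate between boxes, the sketch below runs phases in order and stops at the first output the human rejects. The phase names come from the flowchart; `run_cadence`, `GateLog`, and the `draft` and `approve` callbacks are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Phase names follow the flowchart; everything else here is hypothetical.
PHASES = [
    "Requirements",
    "Architecture",
    "PM & Sprint Breakdown",
    "Implementation Loops",
]


@dataclass
class GateLog:
    approved: List[str] = field(default_factory=list)
    stopped_at: Optional[str] = None


def run_cadence(
    phases: List[str],
    draft: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> GateLog:
    """Run phases in order; stop at the first one the human rejects."""
    log = GateLog()
    for phase in phases:
        output = draft(phase)            # the AI executes and drafts
        if not approve(phase, output):   # the human decides
            log.stopped_at = phase
            break
        log.approved.append(phase)
    return log
```

In a real session `approve` would be a person reading the draft, not a function; the point of the sketch is only that no phase starts until the previous output has been accepted.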
Why Human-in-the-Loop?
A critical goal of the human-in-the-loop pattern is to catch divergence from
goals early. When an agent runs unchecked, ambiguity in requirements
compounds: a vague PRD phrase becomes a misunderstood architecture decision,
which becomes an incorrectly implemented feature, surfacing as a bug weeks
later. Each step amplifies the original ambiguity — small gaps in early
requirements expand into costly rework.
Human-in-the-loop takes more time per step, but it lets the human
course-correct early and often: filling gaps, flagging misunderstandings, and
adjusting direction before work is merged.
```mermaid
flowchart LR
    S((Start)) -->|HITL| H[Human-in-the-Loop]
    S -->|Autonomous| A[Autonomous Agent]
    H --> H1[Review]
    H1 --> H2[Review]
    H2 --> H3[Review]
    H3 --> G((◎ On target))
    A --> A1[Execute]
    A1 --> A2[Execute]
    A2 --> A3[Execute]
    A3 --> M((○ Miss))
```
Even though the HITL path has more steps (review checkpoints), it converges on
the goal because each review is a chance to steer back to center. The
autonomous path appears faster per step but drifts further with every unchecked
decision.
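The compounding argument can be made concrete with a toy model. This is an illustration, not a measurement: the amplification factor and the fraction of a gap that a review removes are assumed numbers, and `final_gap` is a hypothetical helper.

```python
def final_gap(steps: int, amplify: float, correction: float,
              initial_gap: float = 1.0) -> float:
    """Toy model of requirement drift.

    Each step multiplies the gap by `amplify`; a review after the step
    removes a `correction` fraction of it (0.0 means no review at all).
    All numbers are illustrative assumptions.
    """
    gap = initial_gap
    for _ in range(steps):
        gap *= amplify               # ambiguity compounds downstream
        gap *= (1.0 - correction)    # a review steers back toward the goal
    return gap


autonomous = final_gap(steps=3, amplify=2.0, correction=0.0)   # 8.0
reviewed = final_gap(steps=3, amplify=2.0, correction=0.75)    # 0.125
```

Under these assumed values the unreviewed gap grows eightfold over three steps, while the reviewed gap shrinks. With a weaker review the gap could still grow, so the model supports the direction of the argument, not any particular magnitude.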
Cost Containment
Quality isn't the only reason to keep humans in the loop. AI usage is a
direct operating cost, and unchecked autonomous agents are unchecked spend.
Every prompt, every large context window, every multi-step chain of agent calls
appears on your invoice. A solo developer experimenting with AI feels this at a
small scale; an engineering organization running dozens of autonomous agents
feels it immediately.
The human-in-the-loop cadence creates natural cost pressure by design:
Humans take on what they're better at. Not every task benefits from AI
generation. Reviewing a two-line config change, writing a one-sentence
changelog entry, or renaming a variable are tasks a human completes in seconds
— an AI completes them in thousands of tokens. The cadence surfaces these
opportunities: when a human is already reviewing output, they can simply fix
the small things themselves rather than round-tripping to the agent.
Scope boundaries contain consumption. Sprints are fixed scope. When each
sprint has a defined goal and a human must approve before the next begins, AI
consumption is bounded per cycle. There are no runaway agent loops spending
tokens on work that was never approved.
The right tool for the right task. The cadence makes the AI-vs-human
decision explicit at every step. Complex reasoning, large-scale generation,
and cross-file refactors are where AI delivers outsized leverage. Mechanical
tasks, judgment calls, and stakeholder communication are where humans are
faster and cheaper. A well-run HITL workflow routes each task accordingly.
Organizational resilience against AI dependency. This is the longer-term
cost argument: teams that run fully autonomous pipelines accumulate hidden
risk. If a model is deprecated, a provider raises prices, an API changes
behavior, or leadership decides to shift tools, the team has no fallback —
they've lost the institutional knowledge of how to do the work without the
AI. Human-in-the-loop keeps engineers engaged with the codebase, the
architecture, and the decisions being made. That knowledge doesn't disappear
if the AI does. It also means the team can intelligently evaluate when the
next model generation actually improves their workflow — rather than being
locked in by dependency.
The goal is not to minimize AI usage — it's to maximize value per dollar
spent. Human-in-the-loop is the mechanism that keeps that ratio in check.
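One way to picture "scope boundaries contain consumption" is a per-sprint token budget that refuses further spend until a human approves more. The sketch below assumes you can count tokens per agent call; `SprintBudget` and `BudgetExceeded` are hypothetical names, not part of any provider's API.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would exceed the approved sprint budget."""


class SprintBudget:
    """Hypothetical per-sprint token budget approved by a human."""

    def __init__(self, sprint: str, token_limit: int) -> None:
        self.sprint = sprint
        self.token_limit = token_limit
        self.used = 0

    def charge(self, tokens: int) -> int:
        """Record usage and return the remaining budget.

        Refuses the charge, leaving usage unchanged, if it would
        exceed the limit.
        """
        if self.used + tokens > self.token_limit:
            raise BudgetExceeded(
                f"{self.sprint}: {tokens} tokens requested, "
                f"{self.token_limit - self.used} remaining"
            )
        self.used += tokens
        return self.token_limit - self.used
```

Catching `BudgetExceeded` is the natural point to hand control back to the human, who can either raise the limit or take the remaining task themselves.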
The Core Principle: Shrink to Fit
The cadence above looks heavyweight for a single bug fix. That's where
shrink to fit comes in: the same process applies at every scale, but the
phases collapse or compress based on context.
| Scale | Example | How the Cadence Shrinks |
|---|---|---|
| Full project | New product launch | Full six-phase cadence |
| Feature | Add payment integration | Phases 1-3 are quick conversations; Phase 4 is one sprint |
| Bug fix | Fix login crash | Phase 1 = "here's the bug", Phase 2 = "here's the fix", Phases 3-4 = one task, Phase 5 = merged |
| Typo | Fix a docs typo | The entire cadence is a single prompt: "Fix typo on line 42. Verify with the project's lint command." |
The invariant is always human approval at each boundary. The content
changes; the structure stays.
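The table can be read as data: each scale maps to a shorter list of steps, and every step keeps its approval gate. The sketch below is one possible encoding. The step descriptions paraphrase the table, and `SHRINK_TO_FIT` and `checkpoints` are hypothetical names.

```python
from typing import Dict, List, Tuple

# Steps per scale, paraphrased from the table. Illustrative only.
SHRINK_TO_FIT: Dict[str, List[str]] = {
    "full project": [
        "Phase 1: Requirements",
        "Phase 2: Architecture",
        "Phase 3: PM & Sprint Breakdown",
        "Phase 4: Implementation Loops",
        "Phase 5: Milestone Reviews",
        "Phase 6: Project Completion",
    ],
    "feature": ["Phases 1-3: quick conversations", "Phase 4: one sprint"],
    "bug fix": [
        "Phase 1: here's the bug",
        "Phase 2: here's the fix",
        "Phases 3-4: one task",
        "Phase 5: merged",
    ],
    "typo": ["Single prompt with a verification command"],
}


def checkpoints(scale: str) -> List[Tuple[str, str]]:
    """Pair every step with its gate; the gate never disappears."""
    return [(step, "human approval") for step in SHRINK_TO_FIT[scale]]
```

The number of steps varies with scale, but `checkpoints` attaches the same gate to each one, which is the invariant the table describes.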
Phase 0: Project Setup
Applies to: new projects or brownfield projects without existing agent docs.
Before the cadence can run, the project needs:
- A Diataxis documentation structure: the agent-docs layout
  (00-readme/ through 13-personas/)
- Deep dives on the existing codebase, architecture, conventions
- Persona definitions for the roles that will participate in the cadence
If this setup is already complete, skip to Phase 1. Otherwise, run a bootstrap
session to initialize the structure, then run deep-dive sessions with the
Architect and Developer personas to populate the explanation and reference docs.
Phase 1: Requirements
Persona: Product Marketing
The Product Marketing persona owns this phase. Their job is to produce a
requirements document (PRD) that captures:
- Context: What problem are we solving? For whom? Why now?
- Requirements: Functional and non-functional, with acceptance criteria
- Prioritization: Impact vs. effort, strategic alignment, customer value
- Success criteria: How will we know this is done?
In practice: Describe the project or feature to the agent using the
Product Marketing persona. The agent will ask clarifying questions, then
produce a structured PRD. Review and approve before moving on.
Deliverable: Approved PRD.
Phase 2: Architecture
Persona: Architect
The Architect persona takes the approved requirements and produces:
- System architecture / component breakdown
- Technology decisions (or validates existing ones)
- Key interfaces and data flow
- Architecture Decision Records (ADRs) for significant choices
In practice: Hand the requirements to the agent using the Architect
persona. Review the architecture proposal, debate trade-offs, and approve
before moving on.
Deliverable: Architecture document + ADRs (01-explanation/architecture-*.md,
01-explanation/decisions/).
Phase 3: Project Management & Sprint Breakdown
Personas: Developer + Project Manager
The Developer and Project Manager personas work together to turn requirements
and architecture into a plan:
- Task breakdown: Decompose the work into discrete, estimatable tasks
- Estimates: Each task gets two estimates:
  - Human hours: Time without AI assistance (traditional estimate)
  - AI-assisted hours: Time with AI agent assistance
- Sprint definition: Group tasks into sprints achievable in "long
  sessions" with the AI. Each sprint should have a clear goal and
  deliverable.
- Project management document: A plan showing:
  - Project flow and dependencies
  - Milestones (groups of sprints)
  - Human vs. AI-assisted date estimates
  - Sprint timelines (Gantt-style)
  - Risk areas where human judgment is critical
In practice: The agent drafts the full PM document. Review the estimates
— debate any AI estimate that seems too aggressive or too conservative.
Approve the sprint structure before implementation begins.
Deliverable: Approved project management plan (05-plans/roadmap.md,
05-plans/sprint-backlog.md, 05-plans/pm-plan.md).
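The dual estimates and sprint grouping described in this phase could be represented as follows. This is a minimal sketch: the `Task` fields mirror the two estimates, while the greedy grouping rule and the `session_hours` capacity are assumptions made for illustration, not a prescribed planning method.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    human_hours: float   # estimate without AI assistance
    ai_hours: float      # estimate with AI agent assistance


def group_into_sprints(tasks: List[Task],
                       session_hours: float) -> List[List[Task]]:
    """Greedy, order-preserving grouping by AI-assisted hours.

    A new sprint starts when adding the next task would exceed
    `session_hours` (an assumed capacity for one long session).
    """
    sprints: List[List[Task]] = []
    current: List[Task] = []
    load = 0.0
    for task in tasks:
        if current and load + task.ai_hours > session_hours:
            sprints.append(current)
            current, load = [], 0.0
        current.append(task)
        load += task.ai_hours
    if current:
        sprints.append(current)
    return sprints
```

A draft produced this way is only a starting point. Dependencies, risk areas, and the human's debate over individual estimates would reshape the grouping before approval.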
Phase 4: Implementation Loops
Personas: Developer + Project Manager
This is where the actual work happens, one sprint at a time.
Sprint Planning
At the start of each sprint, the Developer and PM personas:
- Review sprint goal and backlog
- Create stories / issues for every task
- Assign estimates, priorities, and labels
- Confirm tests are included in every task definition
Trust-Based Delegation
The human decides how much autonomy to give the agent based on accumulated
trust:
| Trust Level | Delegation Scope | Review Cadence |
|---|---|---|
| Low | Single task per prompt | Review every output |
| Medium | Group of related issues | Review after each group |
| High | Entire sprint | Sprint review at end |
Start low. As the agent demonstrates understanding of the project's
conventions, code style, and testing patterns, widen the scope.
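To make the table concrete, the sketch below turns a sprint (modeled here as groups of related issues) into the batches the agent works through between human reviews. The `Trust` enum and the `batches` function are hypothetical, and the issue labels in the usage note are placeholders.

```python
from enum import Enum
from typing import List


class Trust(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def batches(trust: Trust, sprint: List[List[str]]) -> List[List[str]]:
    """Split a sprint into the units reviewed by the human.

    `sprint` is a list of groups of related issues.
    """
    if trust is Trust.LOW:
        # Single task per prompt: review every output.
        return [[issue] for group in sprint for issue in group]
    if trust is Trust.MEDIUM:
        # Group of related issues: review after each group.
        return [list(group) for group in sprint]
    # Entire sprint: one review at the end.
    return [[issue for group in sprint for issue in group]]
```

For a sprint `[["#1", "#2"], ["#3"]]`, low trust yields three review points, medium yields two, and high yields one.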
Implementation
Every task follows the same pattern:
- Branch: Create a feature branch from main; never push directly
- Implement: Agent writes code following project conventions
- Test: Unit and integration tests (included in the task definition)
- Review: AI-augmented code review
  - Initially, the human reviews every PR. They may use an AI from a
    different LLM provider than the one that wrote the code to get an
    independent second opinion. Treat the AI as an enhanced reviewer that
    flags issues for the human to evaluate, not an authority that replaces
    human judgment.
  - As trust builds, the human can delegate review responsibility: first
    letting AI approve routine changes (docs, refactors with tests), then
    broader changes, eventually allowing fully AI-conducted reviews and
    approvals.
  - Code review guidelines must be written into the agent's instructions
    (e.g., AGENTS.md, CLAUDE.md, or the project's review template).
    Guidelines should cover: correctness, test coverage, style consistency,
    security, performance, and documentation. The agent should cite the
    specific guideline when flagging or approving each issue.
- Merge: Squash-merge to main after approval
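The early stage of review delegation, together with the rule that findings cite a guideline, might be encoded like this. The guideline names come from the list above; `Finding`, `ai_may_approve`, and the change-kind strings are hypothetical, and the rule covers only the first delegation step (routine changes).

```python
from dataclasses import dataclass
from typing import List

# Guideline areas named in the review instructions.
GUIDELINES = (
    "correctness",
    "test coverage",
    "style consistency",
    "security",
    "performance",
    "documentation",
)


@dataclass(frozen=True)
class Finding:
    guideline: str   # must cite one of GUIDELINES
    note: str

    def __post_init__(self) -> None:
        if self.guideline not in GUIDELINES:
            raise ValueError(f"unknown guideline: {self.guideline!r}")


def ai_may_approve(kind: str, has_tests: bool,
                   findings: List[Finding]) -> bool:
    """Early-delegation rule (an assumption for this sketch).

    The AI approves only routine changes (docs, or refactors that come
    with tests) and only when it raised no findings. Everything else
    goes to the human.
    """
    routine = kind == "docs" or (kind == "refactor" and has_tests)
    return routine and not findings
```

Widening delegation later would mean relaxing `routine`, which keeps the trust decision in one visible place rather than spread across prompts.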
In practice: Use the Parallel Hours time tracking integration
(/session-start [issue#] and /session-end)
to log AI-assisted time automatically. Each task is tracked separately so you
can compare human estimates vs. actual AI-assisted time.
Sprint Review
At the end of every sprint, the agent generates a sprint review:
- What was completed (vs. planned)
- What was not completed and why
- Time tracked (human vs. AI-assisted)
- Any blockers or risks for the next sprint
- Updated estimates for remaining work
The human reviews, adjusts the backlog, and approves moving to the next
sprint.
Deliverable: Completed features in main, sprint review document.
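The numeric parts of a sprint review can be derived from tracked tasks. The sketch below assumes each task records both estimates and, once completed, the actual AI-assisted hours. `TrackedTask` and `sprint_summary` are hypothetical names, and the sketch does not model the time tracking integration itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrackedTask:
    name: str
    human_estimate: float                    # hours, without AI assistance
    ai_estimate: float                       # hours, with AI assistance
    actual_ai_hours: Optional[float] = None  # None means not completed


def sprint_summary(tasks: List[TrackedTask]) -> dict:
    """Summarize completed vs. planned work and tracked time."""
    done = [t for t in tasks if t.actual_ai_hours is not None]
    return {
        "completed": [t.name for t in done],
        "not_completed": [t.name for t in tasks
                          if t.actual_ai_hours is None],
        "human_estimate_hours": sum(t.human_estimate for t in done),
        "ai_estimate_hours": sum(t.ai_estimate for t in done),
        "actual_ai_hours": sum(t.actual_ai_hours for t in done),
    }
```

The reasons for incomplete work, the blockers, and the risks still need to be written by the agent and checked by the human. Only the totals are mechanical.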
Phase 5: Milestone Reviews
After several sprints, a milestone completes. Milestone reviews are more
substantial than sprint reviews:
- Demo: What was built (working software or documentation)
- Metrics: Human vs. AI-assisted time across all sprints, estimate
  accuracy, velocity trend
- Retrospective: What worked, what didn't, what to change
- Updated roadmap: Remaining milestones re-estimated based on actual
  velocity
The human and agent present the review together. This is a key trust-building
moment — consistent milestone deliveries justify wider delegation in future
phases.
Deliverable: Milestone review document, updated roadmap.
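Re-estimating the roadmap from actual velocity can be as simple as scaling the remaining estimates by the ratio of actual to estimated hours so far. This is one plausible rule, assumed for illustration. `reestimate` is a hypothetical helper and the figures in the usage note are made up.

```python
from typing import Dict


def reestimate(remaining: Dict[str, float],
               estimated_done: float,
               actual_done: float) -> Dict[str, float]:
    """Scale remaining estimates by observed estimate accuracy.

    `estimated_done` and `actual_done` are total hours for the work
    completed so far. A ratio above 1 means work ran slower than planned.
    """
    if estimated_done <= 0:
        raise ValueError("estimated_done must be positive")
    factor = actual_done / estimated_done
    return {name: hours * factor for name, hours in remaining.items()}
```

For example, if completed work was estimated at 20 hours and took 25, a remaining milestone estimated at 40 hours would be re-estimated at 50. The human still decides whether the ratio reflects a trend or a one-off.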
Phase 6: Project Completion
The final phase ties everything together:
- Final delivery of all milestones
- Project retrospective (what went well, what to improve next time)
- Knowledge base updates (updated docs, ADRs, personas)
- Close-out metrics (total human time, total AI-assisted time, velocity,
estimate accuracy)
Putting the Cadence Together
Here's how a full project might look end-to-end:
| Week | Phase | Activity | Human Role |
|---|---|---|---|
| 1 | Setup | Bootstrap docs, deep dives | Guide deep dives |
| 2-3 | Requirements + Architecture | PRD, ADRs, architecture docs | Review and approve |
| 4 | PM Sprint Breakdown | Task breakdown, estimates, sprint plan | Debate estimates, approve plan |
| 5-6 | Sprint 1 | Sprint planning, implementation, review | Daily check-ins, sprint review |
| 7-8 | Sprint 2 | Sprint planning, implementation, review | Daily check-ins, sprint review |
| 9 | Milestone 1 Review | Demo, metrics, retro | Present with agent |
| 10-16 | Sprints 3-6 | Repeat sprint pattern | Trust level increases |
| 17 | Milestone 2 Review | Demo, metrics, retro | Present with agent |
| 18 | Completion | Final delivery, project retro | Close out |
Remember: shrink to fit. A bug fix skips weeks 1-4 entirely. A feature
add compresses weeks 1-3 to a single conversation. Only full projects run
the entire table.
Personas Reference
Each phase of the cadence is owned by one or more personas. Personas are
detailed character definitions — background, goals, pain points, how they
communicate — that you give to the agent before a phase begins. Switching
personas shifts the agent's perspective and focus without changing the
underlying model.
| Persona | Name | Phase | What they own |
|---|---|---|---|
| Product Marketing | Taylor | Phase 1 | PRD, problem framing, success criteria, prioritization |
| Architect | Amara | Phase 2 | System design, technology decisions, ADRs, component breakdown |
| Project Manager | Morgan | Phase 3–4 | Sprint planning, task breakdown, estimates, milestone tracking |
| Developer | Wei | Phase 3–4 | Implementation, tests, code review, branch and merge workflow |
| Docs Author | Yuki | Throughout | Documentation structure, templates, reference material, tutorials |
| Editorial Reviewer | Kai | Throughout | Inclusive language, clarity, accuracy across all content types |
| Evaluator | Sam | Pre-adoption | Framework fit assessment, integration effort, adoption readiness |
| Operator | Jordan | Setup + ongoing | GitHub configuration, issue labels, MCP servers, CI/CD pipelines |
| Junior Developer | Hikaru | Phase 3–4 | Onboarding path, learning scaffolding, step-by-step guidance |
| Legal & Contracts | Morgan Ellis | As needed | AI adoption proposals, contracts, training scopes, PoC boundaries |
Personas work best when introduced explicitly at the start of a conversation:
"You are Taylor, the Product Marketing persona for this project. Here is the
persona definition: [paste persona]. Now let's work on the PRD for..."
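If you switch personas often, the opening message can be assembled from a small template. The wording follows the example above; `persona_prompt` is a hypothetical helper, and the persona definition text is whatever your project stores for that persona.

```python
def persona_prompt(name: str, role: str, definition: str, task: str) -> str:
    """Build the opening message that introduces a persona explicitly."""
    return (
        f"You are {name}, the {role} persona for this project. "
        f"Here is the persona definition:\n{definition}\n"
        f"Now let's work on {task}."
    )
```

Calling `persona_prompt("Taylor", "Product Marketing", definition_text, "the PRD")` reproduces the pattern shown above, with `definition_text` standing in for the pasted persona.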
Key Artifacts
Each phase produces a concrete deliverable. These are the canonical documents
the cadence relies on:
| Artifact | Phase | Purpose |
|---|---|---|
| PRD (Product Requirements Document) | 1 | Captures problem, requirements, acceptance criteria, and success metrics |
| Architecture document | 2 | System design, component breakdown, technology decisions |
| ADRs (Architecture Decision Records) | 2 | Rationale for significant technical choices |
| PM plan | 3 | Sprint structure, milestones, estimates (human vs. AI-assisted), Gantt timeline |
| Sprint backlog | 3–4 | Task list with story points, labels, and priority for each sprint |
| Stories / issues | 4 | Individual tasks with acceptance criteria, estimates, and test requirements |
| Sprint review | 4 | Completed vs. planned, time tracked, blockers, updated estimates |
| Milestone review | 5 | Demo, metrics, retrospective, updated roadmap |
| Project retrospective | 6 | Final metrics, lessons learned, updated docs and personas |
The cadence is the structure. The personas are the judgment. The artifacts are
the receipts — the evidence that each phase completed with human approval before
the next began.