This is Part 1 of a two-part series. This post covers the solo operator case: one human, six AI agents, a disciplined loop, and real software shipping. Part 2 covers what breaks when you scale past one operator.

By Marco Vargas, Cloud Engineering Solution Architect 

I’ve been through two industry paradigm shifts in my career: Cloud and Agile. Both took years to be recognized as paradigm changes rather than tool upgrades. I couldn’t tell you the exact day either one clicked for me.

AI was different. The click was sharp, and I can tell you exactly what I was working on when it landed.

I’m not adopting a new tool. I’m evolving into a new paradigm.

If you’ve been in development long enough, you know most high-level languages are imperative. You tell the machine what to do step by step, line by line. If X, then Y. Loop, assign, return.

Years later, declarative paradigms took hold. Infrastructure as Code showed up: Terraform, Bicep, Ansible. You stopped describing the steps and started declaring the outcome. The tool figured out the how.

At first, AI looked like the next hop on that same road. I’d been using AI tools heavily for months: Claude, Cursor, ChatGPT, Codex, Gemini. I used them to accelerate the work I already knew how to do. Faster imperative code, faster declarative infra. But I had the feeling I was using them wrong. Maybe not wrong, exactly: I wasn’t taking advantage of what AI actually was.

AI is the next jump. The relationship with the machine itself changes.

Treat AI as a highly competent but not astute assistant. It punishes ambiguity. It rewards clear, well-structured instructions.

Here is where that click landed for me.

A year ago I had a team of developers building a multi-tenant accounting system for a real client. Months in, progress was slow. The platform, not the team, was why.

The project ran on Odoo: mature, open-source, accounting built in, the obvious choice for any consulting team. Conventions over code, XML over types, database records over explicit configuration. Human-friendly in every sense. And that was exactly the problem. A platform that rewards human intuition punishes an agent. Humans learn conventions by osmosis. Agents can’t. They need the rule in a file they can grep.

So I threw the Odoo project out and restarted from zero.

One human and six AI agents rebuilt the accounting system on a deliberately AI-legible stack: Next.js, Prisma, Terraform on AWS, declarative everywhere.

In a single session, the rebuild went from zero to a green foundation in 45 minutes. The git log has the receipts. From there, 35+ sprints in 15 days. By sprint 30-something, the reviews got faster because the agents stopped drifting. I kept every gate. The system is in pre-production today, awaiting stakeholder approval before launch.

What I landed on was an AI-First SDLC: not a tool upgrade, a different relationship with the machine entirely.

This post is Solo Mode: one human operator, a team of six AI agents, one project, one disciplined loop. The clearest place to see the paradigm is a deliberate rebuild from scratch, which is exactly what I did. Legacy migration, multi-team coordination, and enterprise governance are different problems for different posts. They share the same paradigm choice when iterating with a machine: imperative, declarative, or AI-first.

Stack at a glance:

  • Frontend + API: Next.js (App Router) + TypeScript
  • ORM: Prisma (declarative schema, generated client, migration diffs out of the box)
  • DB: PostgreSQL
  • Auth: NextAuth v5 + AWS Cognito (JWT sessions)
  • Infra: Terraform on AWS. Disposable environments, nothing lives on a pet server.
  • CI/CD: GitHub Actions
  • Testing: Vitest for unit, Playwright for end-to-end
  • Bias: favor declarative over imperative everywhere. Prisma schema over hand-rolled SQL, Terraform over AWS console clicks, state machines over scattered if branches, CLAUDE.md rules over per-PR reminders.

The declarative bias isn’t aesthetic. It’s the reason the AI setup works at all.

What the declarative bias bought, concretely

The AI-legibility principle isn’t abstract. Every stack choice earns its place:

  • Prisma over hand-rolled SQL or ActiveRecord-style ORMs. The schema is a single declarative file; migrations are diffs against that file; the client is generated. Agents don’t guess. They read schema.prisma.
  • Terraform over AWS console clicks. Infrastructure is a declarative graph; drift is detectable; teardown is one command. Disposable environments become the default, not a special effort.
  • State machines over scattered `if` branches. Transitions are a finite list; illegal states are unrepresentable; the Tech Lead agent can grep the state machine instead of tracing call sites. A minimal sketch follows this list.
  • CLAUDE.md rules over per-PR reminders. Non-negotiable constraints live in one file; every agent loads them; nothing depends on a human remembering to remember.
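
To make the state-machine point concrete, here is a minimal TypeScript sketch. The invoice states are illustrative, not the project’s actual model:

```typescript
// Illustrative: an invoice lifecycle as a declarative transition map.
// Legal moves are a finite, greppable list; anything else throws.
type InvoiceState = "draft" | "posted" | "paid" | "void";

const TRANSITIONS: Record<InvoiceState, readonly InvoiceState[]> = {
  draft:  ["posted", "void"],
  posted: ["paid", "void"],
  paid:   [], // terminal: no mutation after payment
  void:   [], // terminal
};

function transition(from: InvoiceState, to: InvoiceState): InvoiceState {
  if (!TRANSITIONS[from].includes(to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

Because every legal move lives in one data structure, an agent greps one symbol and knows the whole lifecycle; there is no call graph to trace.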

Every piece of the stack is legible to an agent without a tour guide. That’s the actual product.

Current state: the AI-First SDLC loop that works

What follows is what currently works on this project. It is not a recipe. Plenty of pieces can be done better, and probably will be in the next iteration. The point is to document the journey, not hand down a template.

The virtual dev team

Six Claude Code subagents, one human decision-maker. Each agent has a single-file definition, explicit I/O boundaries, and a model choice tuned to its role:

#   Agent       Model    Responsibility
1   PO          Sonnet   BRD → GitHub Issues + sprint plan
2   Architect   Opus     Schema, ADRs, technical design
3   Developer   Sonnet   Implementation (one story at a time)
4   Tech Lead   Opus     Code review, security, performance
5   DevOps      Sonnet   Infrastructure, CI/CD
6   QA          Sonnet   Tests, bug reports, release notes
*   Auditor     Opus     Deep structural reviews, on demand

The human (me) reviews between every phase. That’s not a bottleneck. That’s the point.

The sprint loop

PO         → GitHub Issues (Todo) + plan.md    →  I review
Architect  → design.md + schema                →  I review
Developer  → code + moves issues → Testing     →  I spot-check
Tech Lead  → review + fix antipatterns         →  I review
DevOps     → deploy to test                    →  I test
QA         → tests + version bump + release    →  I review
Me         → review.md                         →  feeds next sprint
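
Conceptually, the loop is a walk over the phases with a human gate after each one. This sketch shows the control flow only; `runAgent` and `humanGate` are hypothetical stand-ins for the real Claude Code invocation and my manual review:

```typescript
// Conceptual sketch only: the real orchestrator is a Claude Code command,
// not this script. runAgent and humanGate are stand-ins.
const PHASES = ["po", "architect", "developer", "tech-lead", "devops", "qa"] as const;
type Phase = (typeof PHASES)[number];

async function runAgent(phase: Phase, sprint: number): Promise<void> {
  // Stateless invocation: the agent reads its inputs from docs and GitHub
  // Issues, writes its artifact (plan.md, design.md, ...), and exits.
}

async function humanGate(phase: Phase): Promise<void> {
  // Nothing proceeds past this point without a human sign-off.
}

async function runSprint(n: number): Promise<void> {
  for (const phase of PHASES) {
    await runAgent(phase, n);
    await humanGate(phase);
  }
}
```

The property that matters: no phase starts until the previous artifact exists on disk and a human has looked at it.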

The coordination primitives

  • GitHub Issues as the source of truth for story status.
  • Filesystem as the bus. Agents communicate via docs (plan.md, design.md, qa-report.md, questions.md), not shared conversation.
  • Agents are stateless; context composes through pointers. The PO writes the plan and points the Architect at the relevant area. The Architect reads the plan, consults prior ADRs, writes the design, and points the Developer at where the code should go. The Developer doesn’t “know the codebase”: the design tells it where. The Tech Lead reads the commits, not the history. Each handoff is a pointer, not a memory dump. One exception: the codebase auditor keeps a small, bounded memory of architectural patterns and recurring antipatterns, and it runs on demand, not inline in the sprint loop.
  • Sprints are small by rule. Each agent sees only the current sprint, and the sprint itself is capped at four implementation stories; arch, infra, and devops stories don’t count toward the cap. That cap is load-bearing: it keeps the PO’s plan one-screen-readable, the Architect’s design focused, and the Developer’s working set tight. Because the sprint is bounded, no hop ever has to cover more than a screen’s worth of scope.
  • Global contract at .claude/CLAUDE.md. One file, loaded by every agent. It carries the stack definition, non-negotiable security rules (role checks on every endpoint, row-level tenancy filters, append-only audit trails, immutable primary keys where the business demands permanence), coding standards (Decimal not float for money; canonical business formulas and fixed domain constants, so every agent computes the same way), agent-coordination rules (one story at a time; on uncertainty, write to `questions.md` and STOP; check docs/feedback/ first), Definition of Done, versioning, and git conventions. Editing it changes every agent’s behavior on the next invocation. No deploy, no restart. A sketch of the security rules as a route handler follows this list.
  • A single command (/sprint N) that orchestrates the loop end-to-end.
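
To ground those security rules, here is roughly what “role checks on every endpoint, row-level tenancy filters” looks like in a Next.js route handler. This is an illustrative sketch: the `invoice` model, the role values, and the `role`/`tenantId` fields on the session are assumptions, not the project’s code:

```typescript
// Illustrative sketch of the CLAUDE.md non-negotiables applied to one route:
// a role check on the endpoint, then a tenant filter on every row read.
// Assumes the NextAuth session type is augmented with `role` and `tenantId`.
import { NextResponse } from "next/server";
import { auth } from "@/auth";   // NextAuth v5 helper (path assumed)
import { db } from "@/lib/db";   // Prisma client singleton (path assumed)

export async function GET() {
  const session = await auth();
  if (!session?.user) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }
  if (!["ADMIN", "ACCOUNTANT"].includes(session.user.role)) {
    return NextResponse.json({ error: "Forbidden" }, { status: 403 });
  }
  // Row-level tenancy: no query leaves this handler without a tenant scope.
  const invoices = await db.invoice.findMany({
    where: { tenantId: session.user.tenantId },
  });
  return NextResponse.json(invoices);
}
```

Because the rule lives in CLAUDE.md, a handler missing either check is a greppable, reviewable defect rather than a matter of taste.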

Why it works solo

  • One decision-maker: no coordination overhead between humans.
  • Human-in-the-loop at every phase: I see every handoff.
  • Filesystem artifacts, not in-memory conversation. Agents are stateless between invocations. State lives on disk (see the sketch after this list).
  • Model selection per role: Opus where judgment matters (review, architecture), Sonnet where execution matters.
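
Here is what “state lives on disk” means for a single agent pass. The paths and the `produceDesign` helper are hypothetical:

```typescript
// Illustrative: an agent's entire input is what it reads from the filesystem,
// and its entire output is what it writes back. No shared conversation.
import { readFile, writeFile } from "node:fs/promises";

async function architectPass(sprint: number): Promise<void> {
  // Input: the PO's plan. The path layout is hypothetical.
  const plan = await readFile(`docs/sprints/S${sprint}/plan.md`, "utf8");

  // Stand-in for the actual Opus invocation.
  const design = await produceDesign(plan);

  // Output: the artifact the Developer reads on its own, later invocation.
  await writeFile(`docs/sprints/S${sprint}/design.md`, design, "utf8");
}

async function produceDesign(plan: string): Promise<string> {
  return `# Design\n\nDerived from a ${plan.length}-character plan.`;
}
```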

The journey: what the git log actually says

Two moments from the git log make the argument.

The real pivot: 45 minutes, 8 commits

One session. 10:55 AM to 11:38 AM. Eight commits, straight from the git log:

10:55  [S0-001] scaffold Next.js 15 + TypeScript strict + Tailwind
10:58  [S0-002] Prisma 6 setup + User/Role schema + db singleton
11:01  [S0-003] NextAuth v5 + credentials provider + 7 roles + login page
11:02  [S0-004] shadcn/ui + Tailwind CSS v4 + Button/Input/Card components
11:03  [S0-005] Docker multi-stage build + docker-compose with PostgreSQL 16
11:04  [S0-006] CI/CD pipeline + Vitest + smoke tests
11:35  [S0-QA]  33/33 tests passing, fix lint command for Next.js 16
11:38  [S0-DEVOPS] add /api/health endpoint, verify full CI pipeline

The entire foundation in 45 minutes: scaffold, database, auth, UI library, containerization, CI, tests, health endpoint, green pipeline. One human, six agents, a declarative AI-legible stack.

This is the moment the paradigm claim stopped being theoretical.

Sprint velocity, and the gates get quieter

Across twelve days, the project ran S10 through S35. Multiple sprints per day by the end. Each one: PO → Architect → Developer → Tech Lead → DevOps → QA, with ADRs, design docs, tests, release notes.

The human-review gates at each phase went from mandatory and slow (early on) to quick, light-touch passes (around sprint thirty-something). I didn’t remove supervision. The CLAUDE.md rules, the per-agent definitions, and the sprint contract converged to the point where the agents stopped drifting. The loop became self-correcting.

That’s where the project lives now.

Where the loop earns its keep

Two examples from the git log, each with receipts:

  • The rulebook drifts; the Tech Lead catches it. CLAUDE.md says “money is always Decimal, never float.” Across multiple sprints, that rule was violated four times in new places. Every time, the Tech Lead agent caught it and fixed it before it shipped. Cited: [S30-QA] Decimal fixes, [S32-review] replace float arithmetic with Decimal, [S35-002] Decimal discipline in an API route, fix: add decimal.js to migrator Docker stage. A single-file rulebook is the source of truth. The Tech Lead is the enforcement layer. A short demonstration of the failure mode follows this list.
  • Security antipatterns surface in review, not in development. [S29-review] flagged three issues in one pass: a missing tenant-scoped row filter, a non-atomic multi-step mutation, and a missing guard against modification after a terminal state. [S32-review] caught XSS in an email template. The Developer agent writes plausible code. The Tech Lead agent (Opus) catches the real vulnerabilities. If the Tech Lead had been Sonnet, these would have shipped.
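
The float rule is load-bearing, not pedantry. A minimal demonstration with decimal.js, the library named in the commit above, shows the class of bug the Tech Lead keeps catching:

```typescript
// Why "money is always Decimal, never float": binary floats cannot represent
// most decimal fractions exactly, and ledger math compounds the error.
import Decimal from "decimal.js";

console.log(0.1 + 0.2);                                    // 0.30000000000000004
console.log(0.1 + 0.2 === 0.3);                            // false
console.log(new Decimal("0.1").plus("0.2").toString());    // "0.3"
console.log(new Decimal("0.1").plus("0.2").equals("0.3")); // true
```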

Conclusion

Stop treating AI as another tool. Read this post for the shift, not the product. 

What you walked through is six Claude Code subagents, a single CLAUDE.md contract, sprints capped at four stories, and filesystem handoffs instead of shared conversation. None of those choices make sense until the mental model changes. There’s no library to install, no SaaS to subscribe to, no framework to adopt. You copy a pattern.

The shift isn’t mine either. People across the world are converging on the same pattern: an agent can carry a role. Not a personality, just an operational profile: a developer who writes code, an architect who defines structure, a reviewer who flags drift. Chain role-bound agents together with bounded context, structured handoffs, and review gates, and value compounds one piece at a time. The handoff medium varies (filesystem here; graph stores, SQL chains, or no shared state at all elsewhere), but the shape is the same. That convergence is the signal: this is becoming the norm, not a novelty.

The shift is something you do, not something you buy. Start with the mindset, not the tools. The rulebook comes before the agents. Structured handoffs beat shared conversation. Every phase gets a review gate, because that is the quality primitive that keeps agents from drifting. When you look at the next orchestrator framework, ask which primitives it actually gives you.

The next time you face a problem you want to solve with software, keep one question in mind:

If you could clone yourself into an assembly line, with each clone handling one bounded part of the work, then add outside specialists for the parts beyond your expertise, and if every clone and every outside specialist cost real money by the minute, how would you define the task you hand to each of them?

Answer that honestly, and the orchestration follows.

Start.

