
VM0 dev workflow: Managing AI agents like a team

In the VM0 dev team, every developer works with multiple Claude Code instances at the same time. Usually more than eight.

We treat Claude Code the same way we treat a real developer. (Yes, our company is half-jokingly called AI Colleagues Co!)

Because of that, the design philosophy behind the VM0 dev workflow mirrors classic team management practices in software engineering.

We use GitHub Issues to track work, Pull Requests for code review and merging, and GitHub Actions to handle automation. Over two months, this setup helped us ship 404 releases and write more than 230,000 lines of code.

This post explains how we made that workable, and why the key problem was never AI capability, but human coordination.

AI-powered dev workflow in practice

When you coordinate many AI agents in parallel, the bottleneck isn't whether the model can write code. The real bottleneck is human cognitive load.

This workflow consists of 14 slash commands, organized into three layers: Deep Dive, Issue Management, and PR Management.

Let’s first walk through my workflow and how a feature usually gets built.

  1. Requirement alignment

    A human opens a Claude session and starts with /deep-research. Claude gathers facts from the codebase, documentation, and relevant context. We discuss the findings and align on what problem we are actually solving.

  2. Solution exploration

    Using /deep-innovate, Claude proposes several possible directions, with trade‑offs. We discuss, narrow down, and choose a direction.

  3. Issue creation

    We create a GitHub issue using /issue-create. The human reviews the issue to make sure requirements are clearly captured.

  4. Planning and approval

    We use /issue-plan to let Claude continue the work. Claude will automatically run the full deep-dive workflow and post the results to the issue (a sketch of this posting step follows the list), including:

    1. findings from /deep-research
    2. comparisons from /deep-innovate
    3. a concrete implementation plan from /deep-plan

  5. Implementation

    After approval, /issue-action lets Claude implement the plan, write tests, open a PR, and ensure CI passes.

  6. Review and merge

    We use /pr-review for a structured review, then do final human review before merging.

The human intervenes at three checkpoints: requirements, direction, and acceptance. Everything else runs autonomously.
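
As promised above, here is a minimal sketch of what the "post the results to the issue" part of step 4 could look like with the GitHub CLI. It assumes gh is installed and authenticated, and that the deep-dive phases have already written research.md, innovate.md, and plan.md (the output files described later in this post). It is an illustration of the idea, not our actual /issue-plan implementation.

```python
# Illustrative sketch only: post deep-dive results to a GitHub issue as comments.
# Assumes `gh` is installed and authenticated, and that the deep-dive phases
# have already produced research.md, innovate.md, and plan.md.
import subprocess

def post_deep_dive_results(issue_number: int) -> None:
    phases = [
        ("Research findings", "research.md"),
        ("Approaches considered", "innovate.md"),
        ("Implementation plan", "plan.md"),
    ]
    for title, path in phases:
        with open(path, encoding="utf-8") as f:
            body = f"## {title}\n\n{f.read()}"
        # `gh issue comment <number> --body <text>` adds a comment to the issue.
        subprocess.run(
            ["gh", "issue", "comment", str(issue_number), "--body", body],
            check=True,
        )

post_deep_dive_results(123)  # the issue number here is just a placeholder
```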

Mindset shift: you’re leading a team of AI developers

The moment I realized we needed a structured workflow was when adding more Claude sessions actually made things worse. The more instances I ran in parallel, the harder it became to track what each one was doing, what state the work was in, and what had already been decided.

Without external tools, I simply couldn’t manage that many Claude instances at once. That’s when it clicked: this wasn’t an AI problem, it was a management problem.

GitHub is already the natural tool for collaboration in software development, so instead of inventing something new, I started treating Claude the same way I treat a human teammate. Once I did that, my management bandwidth suddenly scaled.

Ten years of project and team management experience finally made sense in this new context. By treating Claude as a team member and GitHub as our shared communication and management space, the whole system became manageable again.

A good team leader knows when to engage and when to step back:

| Checkpoint | What I do | What AI does |
| --- | --- | --- |
| Requirements | Align on the problem, clarify scope | Research codebase, gather context |
| Direction | Review findings, approve approach | Propose 2-3 approaches, evaluate trade-offs |
| Acceptance | Review PR, verify quality | Implement, test, fix CI |

This mirrors how effective software teams operate. I don't micromanage developers; I set clear requirements, review key decisions, and verify the final output. The same principle applies when managing AI agents.

The deep dive flow enforces structured, slow thinking

The deep dive workflow enforces deliberate thinking before implementation. When Claude runs into a dead end, we force it to stop and think, and then we talk it through together. The workflow has three phases:

| Phase | Command | Purpose | Output |
| --- | --- | --- | --- |
| Research | /deep-research | Gather facts, understand context | research.md |
| Innovate | /deep-innovate | Explore multiple approaches | innovate.md |
| Plan | /deep-plan | Define concrete steps | plan.md |

Each phase has strict boundaries.

These constraints force Claude into slow, deliberate reasoning instead of jumping straight to code. Without them, edge cases and architectural concerns are often missed!
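
To illustrate what a "strict boundary" can look like in practice, here is a hypothetical guard that refuses to start a phase until every earlier phase has written its output file. The file names match the table above; the guard itself is just a sketch of the gating idea, not part of our commands.

```python
# Hypothetical sketch of phase gating: each deep-dive phase may only start
# once every earlier phase has produced its output file.
from pathlib import Path

PHASE_OUTPUTS = {
    "research": "research.md",   # written by /deep-research
    "innovate": "innovate.md",   # written by /deep-innovate
    "plan": "plan.md",           # written by /deep-plan
}
PHASE_ORDER = ["research", "innovate", "plan"]

def assert_phase_allowed(phase: str) -> None:
    """Raise if any earlier phase has not written its output yet."""
    for earlier in PHASE_ORDER[: PHASE_ORDER.index(phase)]:
        if not Path(PHASE_OUTPUTS[earlier]).exists():
            raise RuntimeError(
                f"Cannot start '{phase}': {PHASE_OUTPUTS[earlier]} from the "
                f"'{earlier}' phase is missing."
            )

assert_phase_allowed("plan")  # fails unless research.md and innovate.md exist
```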

Usage example

/deep-research investigate the authentication flow, I'm seeing token expiration issues

[Claude researches, analyzes 12 related files, finds 3 similar patterns]

/deep-innovate what are our options for fixing this?

[Claude presents 3 approaches with trade-offs, you pick one]

/issue-create let's track this fix

For simple tasks, you can skip the deep dive and go directly to /issue-create.

For complex tasks with technical uncertainty, the deep dive phases help ensure you and Claude are aligned before implementation begins.

Use GitHub as shared memory

Most AI tools treat context as temporary. When the session ends, the memory disappears.

VM0 uses GitHub as persistent memory:

| GitHub feature | What it stores |
| --- | --- |
| Issue body | Requirements and decisions |
| Issue comments | Research, options, plans |
| PR comments | Reviews and summaries |
| Labels | Workflow state |

This also solves a human problem: context recovery.

When I am managing 8+ Claude instances, I receive notifications that work is complete. But from Claude's conversation alone, I can't reconstruct what it was doing, what decisions were made, or what the current state is.

GitHub issues solve this. Each issue displays:

  1. the requirements captured when the issue was created
  2. the research findings from /deep-research
  3. the approaches compared in /deep-innovate
  4. the implementation plan from /deep-plan

This structured format makes review efficient: I can quickly scan the phases, understand the approach, and approve or request changes, all without needing to remember the original conversation.

When work finishes, I don’t need to remember what happened in a chat window. I can open the issue and see the full story, structured and written down.
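
As a rough illustration of that context recovery, the sketch below pulls an issue's title, body, and comments with the GitHub CLI so a fresh session, human or agent, can rebuild the full story. It assumes an authenticated gh and is not our actual tooling.

```python
# Illustrative sketch: rebuild working context from a GitHub issue.
# Assumes the GitHub CLI (`gh`) is installed and authenticated.
import json
import subprocess

def load_issue_context(issue_number: int) -> str:
    # `gh issue view <number> --json title,body,comments` returns the issue as JSON.
    result = subprocess.run(
        ["gh", "issue", "view", str(issue_number),
         "--json", "title,body,comments"],
        check=True, capture_output=True, text=True,
    )
    issue = json.loads(result.stdout)
    parts = [f"# {issue['title']}", issue["body"]]
    parts += [comment["body"] for comment in issue["comments"]]
    return "\n\n---\n\n".join(parts)

print(load_issue_context(123))  # the issue number here is just a placeholder
```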

Handoff between agents

Because all context lives in GitHub, work can move between agents seamlessly: the instance that ran /issue-plan doesn't have to be the one that runs /issue-action, since any agent can pick up the issue and continue from what's already recorded there.

For long discussions, /issue-compact consolidates everything into a clean issue body. This makes handoffs easy for both humans and AI.
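
The final step of that consolidation might look something like the sketch below: take the summary (which Claude produces in the real workflow) and overwrite the issue body with it. Again, this assumes an authenticated GitHub CLI and is not the actual /issue-compact implementation.

```python
# Illustrative sketch of /issue-compact's last step: replace the issue body
# with a consolidated summary so the next agent starts from one clean document.
# The summary itself is produced by Claude in the real workflow.
import subprocess

def replace_issue_body(issue_number: int, consolidated_summary: str) -> None:
    # `gh issue edit <number> --body <text>` overwrites the issue body.
    subprocess.run(
        ["gh", "issue", "edit", str(issue_number), "--body", consolidated_summary],
        check=True,
    )

replace_issue_body(123, "## Summary\n\n(placeholder for the consolidated write-up)")
```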

Let’s summarize the workflow patterns

After all that, let me summarize a few practical tips.

Simple tasks

/issue-create → /issue-plan → /issue-action → /pr-check-and-merge

Use this when requirements are clear and the work is straightforward.

Complex tasks

/deep-research → discussion → /deep-innovate → discussion →
/issue-create → /issue-plan → /issue-action →
/pr-review → /pr-check

This prevents wasted effort on the wrong approach.

Parallel work

Multiple agents can work at once while the human reviews completed checkpoints. This is where the workflow scales best.

Agent 1: /issue-plan #123
Agent 2: /issue-plan #124
Agent 3: /pr-review #100
Agent 4: /deep-research new feature requirements

Command reference

Deep dive commands

| Command | Purpose |
| --- | --- |
| /deep-research | Gather information, understand codebase. No suggestions allowed. |
| /deep-innovate | Explore 2-3 approaches, evaluate trade-offs. No code allowed. |
| /deep-plan | Create concrete implementation steps. No implementation allowed. |

Issue commands

| Command | Purpose |
| --- | --- |
| /issue-create | Create issue from conversation context |
| /issue-bug | Create bug report with reproduction steps |
| /issue-feature | Create feature request focused on requirements |
| /issue-plan | Execute full deep-dive workflow, post results to issue |
| /issue-action | Continue implementation after human approval |
| /issue-compact | Consolidate issue body + comments for handoff |

PR commands

| Command | Purpose |
| --- | --- |
| /pr-check | Monitor CI pipeline, auto-fix, retry up to 3x |
| /pr-review | Review PR commit-by-commit against project standards |
| /pr-comment | Summarize conversation discussion to PR comment |
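
To give a feel for the /pr-check behavior in the table above, here is a hedged sketch of a "watch CI, try to fix, retry up to three times" loop built on the GitHub CLI. The attempt_auto_fix placeholder stands in for Claude's actual fix step; none of this is our real command code.

```python
# Illustrative sketch of a /pr-check-style loop: wait for CI, and if it fails,
# hand the failure back for a fix and retry, up to three times.
# Assumes an authenticated GitHub CLI; `attempt_auto_fix` is a placeholder.
import subprocess

def ci_passes(pr_number: int) -> bool:
    # `gh pr checks <number> --watch` waits for checks to finish and
    # exits non-zero if any of them did not pass.
    result = subprocess.run(["gh", "pr", "checks", str(pr_number), "--watch"])
    return result.returncode == 0

def attempt_auto_fix(pr_number: int) -> None:
    """Placeholder: in the real workflow, Claude inspects the failure and pushes a fix."""

def pr_check(pr_number: int, max_retries: int = 3) -> bool:
    for attempt in range(max_retries + 1):
        if ci_passes(pr_number):
            return True
        if attempt < max_retries:
            attempt_auto_fix(pr_number)
    return False

pr_check(100)  # the PR number here is just a placeholder
```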

Getting started

  1. Start simple: Use /issue-create → /issue-plan → /issue-action for your first task
  2. Add deep dive for complex tasks: When requirements are unclear or technically complex, start with /deep-research
  3. Scale gradually: Add more Claude instances as you get comfortable with the review rhythm
  4. Trust the process: Let Claude work autonomously between checkpoints

The workflow is designed to be adopted incrementally. You don't need to use all 14 commands from day one. Start with the basic issue flow, then add deep dive phases and parallel work as you gain confidence.

Scaling considerations: What to do when you have more agents

The workflow has been tested with 10+ concurrent Claude instances. Our recommendation: keep it to roughly 10 agents per person.

The limiting factor isn't the workflow, it's human attention and decision quality. When managing more than 10 agents, you risk becoming a bottleneck at review checkpoints, and decision quality starts to degrade.

The classic "two pizza team" principle applies here. The same constraints that limit human team size also limit how many AI agents one person can effectively manage.

I'm currently exploring an 8×8 two-tier team structure for scaling beyond 10 agents, but haven't yet developed effective practices. I'll share more when there are concrete results…

The VM0 dev workflow changes how we think about software development when AI becomes part of the team.

When you treat AI agents as team members rather than tools, everything clicks into place. GitHub becomes your team's shared memory. Issues become work items. PRs become deliverables. And you become the team leader, focusing on architecture, direction, and quality while your AI team handles the implementation.

That's how we shipped 404 releases in 2 months. And it's how you can scale your own development with AI.
