In the VM0 dev team, every developer works with multiple Claude Code instances at the same time. Usually more than eight.
We treat Claude Code the same way we treat a real developer. (Yes, our company is half-jokingly called AI Colleagues Co!)
Because of that, the design philosophy behind the VM0 dev workflow mirrors classic team management practices in software engineering.
We use GitHub Issues to track work, Pull Requests for code review and merging, and GitHub Actions to handle automation. Over two months, this setup helped us ship 404 releases and write more than 230,000 lines of code.
This post explains how we made that workable, and why the key problem was never AI capability, but human coordination.
AI-powered dev workflow in practice
When you coordinate many AI agents in parallel, the bottleneck isn't whether the model can write code. The real bottleneck is human cognitive load.
This workflow consists of 14 slash commands, organized into three layers: Deep Dive, Issue Management, and PR Management.
Let’s first look at what my workflow looks like and how a feature usually gets built.
1. **Requirement alignment**: A human opens a Claude session and starts with /deep-research. Claude gathers facts from the codebase, documentation, and relevant context. We discuss the findings and align on what problem we are actually solving.
2. **Solution exploration**: Using /deep-innovate, Claude proposes several possible directions, with trade-offs. We discuss, narrow down, and choose a direction.
3. **Issue creation**: We create a GitHub issue using /issue-create. The human reviews the issue to make sure requirements are clearly captured.
4. **Planning and approval**: Use /issue-plan to let Claude continue the work. Claude automatically runs the full deep-dive workflow and posts the results to the issue, including:
   - findings from /deep-research
   - comparisons from /deep-innovate
   - a concrete implementation plan from /deep-plan
5. **Implementation**: After approval, /issue-action lets Claude implement the plan, write tests, open a PR, and ensure CI passes.
6. **Review and merge**: We use /pr-review for a structured review, then do a final human review before merging.

The human intervenes at three checkpoints: requirements, direction, and acceptance. Everything else runs autonomously.
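The commands in this walkthrough are Claude Code custom slash commands, which are defined as markdown prompt files under .claude/commands/. A minimal sketch of defining one; the prompt text here is a simplified illustration, not our actual command:

```shell
#!/bin/sh
# Custom slash commands live in .claude/commands/<name>.md and are invoked
# as /<name>. $ARGUMENTS is replaced with whatever follows the command.
# This prompt body is a simplified example, not the real /deep-research.
mkdir -p .claude/commands
cat > .claude/commands/deep-research.md <<'EOF'
Research the topic described in: $ARGUMENTS
Gather facts from the codebase, documentation, and related issues.
Report findings only. Do NOT propose solutions in this phase.
EOF
```

The "findings only" constraint in the prompt is what enforces the phase boundaries described later.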
Mindset shift: you’re leading a team of AI developers
The moment I realized we needed a structured workflow was when adding more Claude sessions actually made things worse. The more instances I ran in parallel, the harder it became to track what each one was doing, what state the work was in, and what had already been decided.
Without external tools, I simply couldn’t manage that many Claude instances at once. That’s when it clicked: this wasn’t an AI problem, it was a management problem.
GitHub is already the natural tool for collaboration in software development, so instead of inventing something new, I started treating Claude the same way I treat a human teammate. Once I did that, my management bandwidth suddenly scaled.
Ten years of project and team management experience finally made sense in this new context. By treating Claude as a team member and GitHub as our shared communication and management space, the whole system became manageable again.
A good team leader knows when to engage and when to step back:
| Checkpoint | What I do | What AI does |
|---|---|---|
| Requirements | Align on the problem, clarify scope | Research codebase, gather context |
| Direction | Review findings, approve approach | Propose 2-3 approaches, evaluate trade-offs |
| Acceptance | Review PR, verify quality | Implement, test, fix CI |
This mirrors how effective software teams operate. I don't micromanage developers; I set clear requirements, review key decisions, and verify the final output. The same principle applies when managing AI agents.
The deep dive flow enforces structured, slow thinking
The deep dive workflow enforces deliberate thinking before implementation. When Claude runs into a dead end, we force it to stop and think, then talk the problem through together. The workflow has three phases:
| Phase | Command | Purpose | Output |
|---|---|---|---|
| Research | /deep-research | Gather facts, understand context | research.md |
| Innovate | /deep-innovate | Explore multiple approaches | innovate.md |
| Plan | /deep-plan | Define concrete steps | plan.md |
Each phase has strict boundaries.
- Research: no suggestions
- Innovate: no details
- Plan: no implementation
These constraints force Claude into slow, deliberate reasoning instead of jumping straight to code. Without them, edge cases and architectural concerns are often missed!
Usage example
```
/deep-research investigate the authentication flow, I'm seeing token expiration issues
[Claude researches, analyzes 12 related files, finds 3 similar patterns]

/deep-innovate what are our options for fixing this?
[Claude presents 3 approaches with trade-offs, you pick one]

/issue-create let's track this fix
```
For simple tasks, you can skip the deep dive and go directly to /issue-create.
For complex tasks with technical uncertainty, the deep dive phases help ensure you and Claude are aligned before implementation begins.
Use GitHub as shared memory
Most AI tools treat context as temporary. When the session ends, the memory disappears.
VM0 uses GitHub as persistent memory:
| GitHub feature | What it stores |
|---|---|
| Issue body | Requirements and decisions |
| Issue comments | Research, options, plans |
| PR comments | Reviews and summaries |
| Labels | Workflow state |
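Mechanically, persisting a phase is just a structured issue comment plus a label. A sketch using the GitHub CLI; phase_comment is a hypothetical helper, and the issue number and label name are examples:

```shell
#!/bin/sh
# Build a structured comment body for one deep-dive phase.
# phase_comment is an illustrative helper, not part of any tool.
phase_comment() {
  # $1 = phase name, $2 = findings text
  printf '## %s\n\n%s\n' "$1" "$2"
}

# Posting it uses the real GitHub CLI (needs auth, so shown commented out):
#   gh issue comment 123 --body "$(phase_comment "Research" "$(cat research.md)")"
#   gh issue edit 123 --add-label "state:planned"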
This also solves a human problem: context recovery.
When I am managing 8+ Claude instances, I receive notifications that work is complete. But I can't reconstruct from Claude's conversation what it was doing, what decisions were made, or what the current state is.
GitHub issues solve this. Each issue displays:
- The original requirements
- Research findings (what was discovered)
- Innovation phase (what options were considered)
- The approved plan (what will be implemented)
This structured format makes review efficient: I can quickly scan the phases, understand the approach, and approve or request changes, all without needing to remember the original conversation.
When work finishes, I don’t need to remember what happened in a chat window. I can open the issue and see the full story, structured and written down.
Handoff between agents
Because all context lives in GitHub, work can move between agents seamlessly:
- One agent creates an issue or PR
- Another continues later using /deep-research issue 123, /issue-plan 123, or /deep-research PR 124
For long discussions, /issue-compact consolidates everything into a clean issue body. This makes handoffs easy for both humans and AI.
Let’s summarize the workflow patterns
After all that, let me summarize a few practical tips.
Simple tasks
/issue-create → /issue-plan → /issue-action → /pr-check-and-merge
Use this when requirements are clear and the work is straightforward.
Complex tasks
/deep-research → discussion → /deep-innovate → discussion →
/issue-create → /issue-plan → /issue-action →
/pr-review → /pr-check
This prevents wasted effort on the wrong approach.
Parallel work
Multiple agents can work at once while the human reviews completed checkpoints. This is where the workflow scales best.
```
Agent 1: /issue-plan #123
Agent 2: /issue-plan #124
Agent 3: /pr-review #100
Agent 4: /deep-research new feature requirements
```
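Fanning out that parallel work can also be scripted, assuming Claude Code's non-interactive print mode (claude -p). plan_prompt is a hypothetical wrapper and the issue numbers are examples:

```shell
#!/bin/sh
# Build the slash-command prompt for one issue (illustrative helper).
plan_prompt() {
  printf '/issue-plan #%s' "$1"
}

# One headless session per issue; the actual spawn is commented out
# since it requires Claude Code to be installed and authenticated.
for issue in 123 124; do
  echo "would run: claude -p \"$(plan_prompt "$issue")\""
  # claude -p "$(plan_prompt "$issue")" &
done
# wait   # then review each result at its checkpoint
```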
Command reference
Deep dive commands
| Command | Purpose |
|---|---|
| /deep-research | Gather information, understand codebase. No suggestions allowed. |
| /deep-innovate | Explore 2-3 approaches, evaluate trade-offs. No code allowed. |
| /deep-plan | Create concrete implementation steps. No implementation allowed. |
Issue commands
| Command | Purpose |
|---|---|
| /issue-create | Create issue from conversation context |
| /issue-bug | Create bug report with reproduction steps |
| /issue-feature | Create feature request focused on requirements |
| /issue-plan | Execute full deep-dive workflow, post results to issue |
| /issue-action | Continue implementation after human approval |
| /issue-compact | Consolidate issue body + comments for handoff |
PR commands
| Command | Purpose |
|---|---|
| /pr-check | Monitor CI pipeline, auto-fix, retry up to 3x |
| /pr-review | Review PR commit-by-commit against project standards |
| /pr-comment | Summarize conversation discussion to PR comment |
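The retry behavior behind /pr-check can be sketched as a plain loop. retry_up_to is an illustrative helper; the gh pr checks call is the real GitHub CLI subcommand, commented out since it needs a live PR:

```shell
#!/bin/sh
# Run a command until it succeeds, up to $1 attempts total.
retry_up_to() {
  max=$1; shift
  n=1
  while ! "$@"; do
    [ "$n" -ge "$max" ] && return 1   # give up after max attempts
    n=$((n + 1))
    # In the real workflow, Claude would attempt an auto-fix here
    # before retrying.
  done
  return 0
}

# Example of the real CI check (requires auth and an open PR):
#   retry_up_to 3 gh pr checks 124 --watch
```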
Getting started
- Start simple: Use /issue-create → /issue-plan → /issue-action for your first task
- Add deep dive for complex tasks: When requirements are unclear or technically complex, start with /deep-research
- Scale gradually: Add more Claude instances as you get comfortable with the review rhythm
- Trust the process: Let Claude work autonomously between checkpoints
The workflow is designed to be adopted incrementally. You don't need to use all 14 commands from day one. Start with the basic issue flow, then add deep dive phases and parallel work as you gain confidence.
Scaling considerations: What to do when you have more agents
The workflow has been tested with 10+ concurrent Claude instances. Our recommendation:
- Up to 10 agents: Comfortable for deep collaboration with each
- Beyond 10: Not recommended
The limiting factor isn't the workflow, it's human attention and decision quality. When managing more than 10 agents, you risk becoming a bottleneck at review checkpoints, and decision quality starts to degrade.
The classic "two pizza team" principle applies here. The same constraints that limit human team size also limit how many AI agents one person can effectively manage.
I'm currently exploring an 8×8 two-tier team structure for scaling beyond 10 agents, but haven't yet developed effective practices. I'll share more when there are concrete results.
The VM0 dev workflow changes how we think about software development when AI becomes part of the team.
When you treat AI agents as team members rather than tools, everything clicks into place. GitHub becomes your team's shared memory. Issues become work items. PRs become deliverables. And you become the team leader, focusing on architecture, direction, and quality while your AI team handles the implementation.
That's how we shipped 404 releases in 2 months. And it's how you can scale your own development with AI.


