AI Commentary · Part 13 of 14

Multi Agent Setup for Developers

We’ve all heard about using multiple agents to speed up development. Affaan Mustafa, who won the Anthropic x Forum Ventures Hackathon in NYC, took the idea to the extreme. His project, Everything Claude Code (ECC), has 38 specialized agents coordinated by a central planner agent. This is analogous to a 30-person engineering team with a TL/TPM coordinating the work.

In this post, we show how to set up a multi-agent workflow for development and how it works in practice.

Native Agent Teams

Claude Code has a native agent teams feature. Once it’s enabled, Claude Code can spawn teammate agents, each with its own context window and tool access.

To Enable

Agent teams can be enabled by setting the environment variable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS to 1, i.e. export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1

It can also be set in the project .env, provided that file is loaded into the environment before Claude Code starts, i.e. echo "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1" >> .env

To Invoke

Invoking teammates is straightforward. Just ask for a team in the prompt:

“Review this PR with a team — one agent for security, one for performance, one for correctness.”

How It Works

Agent teams are made of three components:

  1. Team lead: resides in the primary interactive session, breaks the task into subtasks, and spawns teammates. In the previous example, it breaks the PR review into three subtasks and spawns three teammates.
  2. Teammates: independent agents with their own context windows and tool access. Because they work in parallel, they can speed up completion of the overall task.
  3. Mailbox system: the channel the agents use to send messages to each other.
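
The three roles above can be modeled in a few lines. The sketch below is a toy illustration, not how Claude Code implements agent teams: threads stand in for teammates, a queue stands in for the mailbox, and the name `run_team` is hypothetical.

```python
import queue
import threading


def run_team(subtasks):
    """Toy model of a team lead: one teammate per subtask, results via a mailbox.

    `subtasks` maps a role name to the piece of work handed to that role.
    """
    mailbox = queue.Queue()  # teammates post messages here; the lead reads them

    def teammate(role, task):
        # A real teammate would be an agent with its own context window.
        # Here it only reports back so the message flow is visible.
        mailbox.put((role, f"{role} review of {task}: done"))

    # The lead spawns one teammate per subtask and lets them run in parallel.
    workers = [
        threading.Thread(target=teammate, args=(role, task))
        for role, task in subtasks.items()
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    # The lead drains the mailbox and assembles the combined result.
    results = {}
    while not mailbox.empty():
        role, message = mailbox.get()
        results[role] = message
    return results
```

Calling `run_team({"security": "the PR", "performance": "the PR", "correctness": "the PR"})` mirrors the three-way PR review from the prompt above: three workers run side by side, and the lead only sees what arrives in the mailbox.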

Use Patterns

Start with 2-3 teammates. You can set them up to run tests in parallel, e.g. unit tests, integration tests, and end-to-end tests. Here are some handy dandy file tools I created using this process.

DIY Multi-Agents

If you want more control, you can simply launch multiple Claude Code sessions. Because each Claude Code session is isolated, each one must be told explicitly what its role is and what it needs to do. This can be accomplished through system prompts or separate CLAUDE.md files. For example, a system prompt could be:

“You are the Python backend agent. Poll taskbox.db for tasks where assigned_to = 'py' and status = 'pending'. When done, update the status and write output to outputs/py/.”

Or the CLAUDE.md could be:

# CLAUDE.md for the QA agent
You are the QA agent in a 6-agent team.
- Your job: write and run tests for code in outputs/py/ and outputs/js/
- Read completed tasks from taskbox.db where assigned_to = 'qa'
- Write test results to outputs/qa/results.md
- Message the PM by inserting a row into messages.db

All the agents share the same repo directory and communicate through a SQLite database or plain files. The human user only sends messages to the PM session, which interprets the goal, creates tasks, and assigns them to the agents. The other agents run in a loop, polling for new work.
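
To make the polling loop concrete, here is a minimal sketch of the task queue, assuming a single `tasks` table in taskbox.db. The `assigned_to` and `status` columns come from the prompts above; the `description` column, the `in_progress` and `complete` status values, and the function names are assumptions for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id          INTEGER PRIMARY KEY,
    assigned_to TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'pending',
    description TEXT NOT NULL
)
"""


def open_taskbox(path="taskbox.db"):
    """Open the shared database and make sure the tasks table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def add_task(conn, assigned_to, description):
    """What the PM session does: create a pending task for one role."""
    cur = conn.execute(
        "INSERT INTO tasks (assigned_to, description) VALUES (?, ?)",
        (assigned_to, description),
    )
    conn.commit()
    return cur.lastrowid


def claim_next(conn, role):
    """What a worker does on each poll: claim its oldest pending task.

    Returns (id, description), or None when there is nothing to do.
    """
    row = conn.execute(
        "SELECT id, description FROM tasks "
        "WHERE assigned_to = ? AND status = 'pending' ORDER BY id LIMIT 1",
        (role,),
    ).fetchone()
    if row is None:
        return None
    with conn:  # commits on success
        cur = conn.execute(
            "UPDATE tasks SET status = 'in_progress' "
            "WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
    # The status guard means a task claimed by another session is not claimed twice.
    return row if cur.rowcount == 1 else None


def complete_task(conn, task_id):
    """Mark a task finished so downstream agents can pick up its output."""
    with conn:
        conn.execute("UPDATE tasks SET status = 'complete' WHERE id = ?", (task_id,))
```

A worker would call `claim_next(conn, "py")` in a loop with a short sleep between polls. Since every session opens the same file, SQLite's locking handles concurrent writers for a handful of agents.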

What a Session Launch Looks Like

bash
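# Start the session, then split it into six panes (Ctrl-b % or Ctrl-b ").
# Each pane runs a different command.
tmux new-session -s octobots

# --append-system-prompt takes the prompt text itself, so each file is read with cat.
# Flags can differ between versions; check `claude --help` for yours.

# Pane 0: PM (you talk to this one)
claude --append-system-prompt "$(cat prompts/pm.md)"

# Pane 1: BA
claude --append-system-prompt "$(cat prompts/ba.md)"

# Pane 2: TL
claude --append-system-prompt "$(cat prompts/tl.md)"

# Pane 3: Python dev
claude --append-system-prompt "$(cat prompts/py.md)"

# Pane 4: JS dev
claude --append-system-prompt "$(cat prompts/js.md)"

# Pane 5: QA
claude --append-system-prompt "$(cat prompts/qa.md)"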

Each prompts/<role>.md file is what gives that agent its identity, task-polling instructions, and communication protocol.
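
If you would rather script the launch than type into each pane, the commands can be generated. The sketch below only builds the tmux command lists and does not run anything. The role names and session name come from the example above; the function name and the use of `--append-system-prompt` are assumptions, so substitute whichever flag your Claude Code version documents.

```python
import shlex

ROLES = ["pm", "ba", "tl", "py", "js", "qa"]


def tmux_commands(session="octobots", roles=ROLES):
    """Build (but do not run) the tmux commands that start one pane per role."""
    commands = [["tmux", "new-session", "-d", "-s", session]]
    for index, role in enumerate(roles):
        if index > 0:
            # A new pane becomes the active pane, so the next send-keys lands in it.
            commands.append(["tmux", "split-window", "-t", session])
            # Re-tile after each split so there is always room for another pane.
            commands.append(["tmux", "select-layout", "-t", session, "tiled"])
        prompt_file = shlex.quote(f"prompts/{role}.md")
        launch = f'claude --append-system-prompt "$(cat {prompt_file})"'
        commands.append(["tmux", "send-keys", "-t", session, launch, "Enter"])
    return commands
```

Each entry can be passed to `subprocess.run(command, check=True)`, after which `tmux attach -t octobots` shows all six panes.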

Third Party Orchestrators

Tools like Multiclaude and Gas Town offer alternative orchestration patterns. Gas Town is better for solo devs running many parallel agents; Multiclaude is stronger for team usage and longer “fire and forget” workflows.

The Everything Claude Code Setup

ECC is essentially a supercharged DIY multi-agent setup with a lot more built in. Rather than wiring up a SQLite task queue, system prompts, and tmux sessions yourself, you get all of that infrastructure pre-built:

  • 63 pre-built specialized agents (planner, architect, code-reviewer, security-reviewer, go-reviewer, etc.) you’d otherwise have to write from scratch as custom system prompts
  • /multi-plan, /multi-execute, /multi-backend, /multi-frontend commands that handle the multi-agent task decomposition and orchestration for you
  • Hooks that fire on tool events (file edits, session start/stop) to handle memory persistence and context passing between agents — this is the equivalent of the SQLite message queue we described, but automated
  • Skills as reusable workflow definitions agents can invoke
  • Rules as always-on guidelines each agent follows within its scope

The agents communicate through files on disk. ECC defines SessionStart and SessionEnd hooks to load and save context, which is how memory persists across agents.
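
The load and save pattern is simple enough to sketch. The code below is an assumption about the general shape of such hooks, not ECC's actual implementation: one JSON file per agent in a shared directory, written when a session ends and read when the next one starts. The function names and file layout are hypothetical.

```python
import json
from pathlib import Path


def save_context(store_dir, agent, notes):
    """What a session-end hook might do: persist this agent's notes to disk."""
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    payload = {"agent": agent, "notes": notes}
    (store / f"{agent}.json").write_text(json.dumps(payload, indent=2))


def load_context(store_dir):
    """What a session-start hook might do: collect every agent's saved notes."""
    store = Path(store_dir)
    if not store.is_dir():
        return {}  # first run, nothing saved yet
    context = {}
    for path in sorted(store.glob("*.json")):
        data = json.loads(path.read_text())
        context[data["agent"]] = data["notes"]
    return context
```

Because each agent writes only its own file, two sessions ending at the same moment never overwrite each other's notes.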

Example

Let’s walk through a concrete end-to-end example — building a user authentication feature in a Next.js + PostgreSQL app.

Setup (one time)

bash
# Install ECC (run these two slash commands inside a Claude Code session)
/plugin marketplace add https://github.com/affaan-m/ECC
/plugin install ecc@ecc

# Copy rules for your stack (run the rest in a regular shell)
git clone https://github.com/affaan-m/ECC.git
cd ECC
mkdir -p ~/.claude/rules/ecc
cp -r rules/common ~/.claude/rules/ecc/
cp -r rules/typescript ~/.claude/rules/ecc/

The Project

You have a Next.js app with a PostgreSQL database. You want to add:

  • JWT-based login/signup API routes
  • A React login form
  • Tests for both
  • Security audit before shipping

Step 1: Plan

You talk only to the planner. Open Claude Code in your project root:

/ecc:plan “Add JWT authentication — signup and login API routes,
a React login form, and full test coverage”

The planner agent responds with a breakdown like:

Implementation Blueprint:
─────────────────────────
Backend tasks (assigned: py/ts backend agent):
[T-01] Create users table migration (email, hashed_password, created_at)
[T-02] POST /api/auth/signup — validate, hash password, insert user, return JWT
[T-03] POST /api/auth/login — validate credentials, verify hash, return JWT
[T-04] JWT middleware for protected routes

Frontend tasks (assigned: js/ts frontend agent):
[T-05] LoginForm component (email + password fields, error states)
[T-06] useAuth hook — wraps fetch calls to /api/auth/*
[T-07] Protected route wrapper component

QA tasks (assigned: qa agent):
[T-08] Unit tests for signup/login route handlers
[T-09] Integration test for full auth flow
[T-10] E2E test: user can sign up, log in, see protected page

Writing tasks to state store…
Tasks written. Run /multi-execute to begin.

This gets written to the SQLite state store.
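
As a rough picture of what "written to the state store" could mean, the sketch below stores the blueprint in one SQLite table. The task IDs and agent groupings come from the planner output above; the table layout, the shortened titles, and the function names are assumptions rather than ECC's real schema.

```python
import sqlite3

# (task id, assigned agent, short title), taken from the blueprint above
PLAN = [
    ("T-01", "backend", "Create users table migration"),
    ("T-02", "backend", "POST /api/auth/signup"),
    ("T-03", "backend", "POST /api/auth/login"),
    ("T-04", "backend", "JWT middleware for protected routes"),
    ("T-05", "frontend", "LoginForm component"),
    ("T-06", "frontend", "useAuth hook"),
    ("T-07", "frontend", "Protected route wrapper component"),
    ("T-08", "qa", "Unit tests for signup/login route handlers"),
    ("T-09", "qa", "Integration test for full auth flow"),
    ("T-10", "qa", "E2E test for sign up, log in, protected page"),
]


def write_plan(conn, plan):
    """Store every planned task with status 'pending'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id TEXT PRIMARY KEY, agent TEXT NOT NULL, "
        "title TEXT NOT NULL, status TEXT NOT NULL DEFAULT 'pending')"
    )
    with conn:
        # OR IGNORE keeps a re-run from resetting tasks that are already under way.
        conn.executemany(
            "INSERT OR IGNORE INTO tasks (id, agent, title) VALUES (?, ?, ?)", plan
        )


def tasks_for(conn, agent):
    """Task IDs assigned to one agent, in plan order."""
    rows = conn.execute("SELECT id FROM tasks WHERE agent = ? ORDER BY id", (agent,))
    return [row[0] for row in rows]
```

Each agent then needs only its own slice, for example `tasks_for(conn, "qa")`, when it starts up.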

Step 2: Execute

/multi-execute

This spawns three agents. In tmux you’d see three panes light up. Each agent loads the task list at startup (via the SessionStart hook) and picks up its assigned tasks.

Backend agent starts working on T-01 through T-04:

outputs/backend/
migrations/001_create_users.sql
pages/api/auth/signup.ts
pages/api/auth/login.ts
lib/auth/jwt.ts
lib/auth/password.ts

Frontend agent works on T-05 through T-07 in parallel:

outputs/frontend/
components/LoginForm.tsx
hooks/useAuth.ts
components/ProtectedRoute.tsx

QA agent starts on T-08, but it needs the backend output first. It polls the state store, sees T-01 through T-04 are still in_progress, so it writes to WORKING-CONTEXT.md:

[QA] Waiting on T-01 to T-04. Will begin unit tests once
backend marks tasks complete.

When the backend agent finishes T-02, it updates the state store:

T-02: status=complete, output=pages/api/auth/signup.ts

The QA agent’s next poll picks this up and starts writing tests against it immediately, without waiting for the full backend to finish.
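
The "start as soon as a dependency completes" behavior comes down to a readiness check on every poll. Here is a minimal sketch, assuming each task records which other tasks it depends on. The dependency map and the sub-task IDs in the usage note are hypothetical, since the blueprint above does not spell out dependencies.

```python
def ready_tasks(status, depends_on):
    """Return pending task IDs whose dependencies are all complete.

    `status` maps task ID to 'pending', 'in_progress' or 'complete'.
    `depends_on` maps task ID to the list of task IDs it needs first.
    """
    ready = []
    for task_id, state in status.items():
        if state != "pending":
            continue
        needed = depends_on.get(task_id, [])
        if all(status.get(dep) == "complete" for dep in needed):
            ready.append(task_id)
    return sorted(ready)
```

With `status = {"T-02": "complete", "T-03": "in_progress", "T-08/signup": "pending", "T-08/login": "pending"}` and `depends_on = {"T-08/signup": ["T-02"], "T-08/login": ["T-03"]}`, only the signup tests are ready. The login tests stay queued until T-03 finishes.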

Step 3: Review a Blocker

Halfway through, the backend agent hits a snag — it needs to know what fields the login form will actually send. It writes to WORKING-CONTEXT.md:

[BACKEND] T-03 blocked: need to confirm request shape from frontend.
Assuming { email: string, password: string } — frontend agent please confirm.

You notice this in your tmux pane. You relay the message to the frontend agent (since agents can’t interrupt each other directly):

# In the frontend agent’s pane:
> The backend agent is assuming login sends { email, password }.
Is that what useAuth sends?

Frontend agent confirms and updates its output. You tell the backend agent to unblock and continue. It updates the task:

T-03: status=in_progress (unblocked)
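
Since you are the relay, it helps to spot blockers without rereading the whole file. The sketch below assumes the convention used in the notes above, where each note starts with a bracketed role name and may wrap onto following lines. The parser and its function names are illustrative and not part of ECC.

```python
import re

# A note starts with a bracketed, upper-case role name, e.g. "[QA] ...".
NOTE_START = re.compile(r"^\[(?P<role>[A-Z]+)\]\s+(?P<text>.*)$")


def parse_notes(markdown):
    """Split WORKING-CONTEXT.md text into a list of {role, text} notes."""
    notes = []
    for line in markdown.splitlines():
        match = NOTE_START.match(line)
        if match:
            notes.append({"role": match["role"], "text": match["text"].strip()})
        elif notes and line.strip():
            # A wrapped line belongs to the note that precedes it.
            notes[-1]["text"] += " " + line.strip()
    return notes


def blockers(notes):
    """Notes that mention being blocked, in file order."""
    return [note for note in notes if "blocked" in note["text"].lower()]
```

Running `blockers(parse_notes(text))` on the file contents returns only the backend note about T-03, which tells you which pane needs a reply.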

Step 4: Security Audit

Once all tasks are marked complete, before you merge anything:

/security-scan

The AgentShield security agent spins up and scans everything in outputs/:

Security Audit Results
──────────────────────
✓ Password hashing: bcrypt with cost factor 12 — OK
✗ JWT secret: hardcoded string in lib/auth/jwt.ts — CRITICAL
Fix: move to process.env.JWT_SECRET

✗ Signup route: no rate limiting — HIGH
Fix: add rate-limit middleware (suggest: upstash/ratelimit)

✓ SQL queries: parameterized — OK
✓ No secrets in outputs/ — OK

2 issues found. Fix before merging.
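
To give a feel for the kind of check behind the hardcoded-secret finding, here is a deliberately naive sketch. It is not how AgentShield works. The heuristic, the regular expression, and the function name are all assumptions: it flags any identifier containing "secret", "token" or "password" that is assigned a quoted literal of eight or more characters.

```python
import re

# Naive heuristic: a sensitive-looking name assigned a quoted string literal.
HARDCODED = re.compile(
    r"""\b(\w*(?:secret|token|password)\w*)\s*[:=]\s*["'][^"']{8,}["']""",
    re.IGNORECASE,
)


def scan_text(source):
    """Return (line number, identifier) for each suspicious assignment."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in HARDCODED.finditer(line):
            findings.append((number, match.group(1)))
    return findings
```

A line such as `const JWT_SECRET = "super-secret-value";` is flagged, while `const secret = process.env.JWT_SECRET;` is not, because nothing quoted is assigned. Real scanners add entropy checks and allowlists to cut down on false positives.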

You fix the two issues directly, or ask the backend agent to fix them:

> Fix the two security issues flagged by the audit:
1. Move JWT secret to env var
2. Add rate limiting to the signup route

Step 5: Code Review

/code-review

The code-reviewer agent reads all the outputs and gives feedback:

Code Review — Auth Feature
──────────────────────────
backend/signup.ts
⚠ No input validation on email format — suggest zod schema
✓ Error handling looks solid

frontend/LoginForm.tsx
⚠ No loading state while fetch is in-flight — UX issue
✓ Error states handled correctly

hooks/useAuth.ts
✓ Clean abstraction, easy to test

Overall: 2 minor issues. Good to fix before merge.

Step 6: Merge Outputs

Once you’re happy, you copy the outputs into your actual source tree:

bash
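cp -r outputs/backend/pages/api/auth pages/api/
cp -r outputs/backend/lib/auth lib/
cp outputs/frontend/components/LoginForm.tsx components/
cp outputs/frontend/hooks/useAuth.ts hooks/
cp outputs/frontend/components/ProtectedRoute.tsx components/
# Copy the contents of __tests__ so an existing __tests__/ directory does not end up nested
mkdir -p __tests__
cp -r outputs/qa/__tests__/. __tests__/
# The migration in outputs/backend/migrations/ goes wherever your project keeps migrations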

Run your test suite to confirm everything passes, then open your PR.
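
If you merge agent output often, the copy step can be scripted so that directories merge into existing ones instead of nesting. The helper below is a sketch with a hypothetical name and signature. It needs Python 3.8 or later for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def merge_outputs(mapping, dry_run=False):
    """Copy each (source, destination) pair and return the pairs handled.

    Directories are merged into the destination; files overwrite their target.
    Sources that do not exist are skipped.
    """
    handled = []
    for source, destination in mapping:
        src, dst = Path(source), Path(destination)
        if not src.exists():
            continue  # an agent never produced this output
        if not dry_run:
            if src.is_dir():
                shutil.copytree(src, dst, dirs_exist_ok=True)
            else:
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
        handled.append((str(src), str(dst)))
    return handled
```

Running it once with `dry_run=True` prints nothing and touches nothing, but the returned list shows exactly which paths would be copied.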

What Just Happened

| Step | Who did it | How they knew what to do |
| --- | --- | --- |
| Task decomposition | planner agent | Your prompt |
| Backend code | backend agent | Tasks from SQLite state store |
| Frontend code | frontend agent | Tasks from SQLite state store, in parallel |
| QA tests | QA agent | Polled state store, read outputs/backend/ |
| Blocker resolution | You (relayed) | WORKING-CONTEXT.md |
| Security audit | AgentShield agent | Scanned outputs/ |
| Code review | code-reviewer agent | Read all outputs/ |

The agents never actually talked to each other directly — you and the shared file system were the communication layer. But because the planner decomposed the work cleanly upfront and the hooks kept state consistent, it feels coordinated.