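Addy Osmani: "Claude Code Swarms (Agent Teams)" (full text)
Original: https://addyosmani.com/blog/claude-code-agent-teams/
Author: Addy Osmani (Google Director, Cloud AI / Gemini)
Published: February 5, 2026
About this edition: complete paragraph-by-paragraph translation with translator's notes. Companion close reading: the Addy Osmani trilogy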
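Translator's preface
This piece is the "fourth must-read" beyond the Addy Osmani trilogy: it drags multi-agent collaboration out of the LangChain lab phase and into first-class status in Claude Code. In February 2026 Anthropic shipped the Agent Teams feature in Claude Code (the community calls it a swarm). From that day on, multi-agent collaboration for developers is no longer "stitch together your own bash scripts and tmux panes" but a switch in settings.json.
Before reading, it helps to already have a feel for Claude Code subagents. A subagent is "a small focused worker that finishes and reports back to the main agent." An agent team is a qualitative change: multiple Claude Code instances work in parallel, can message each other, claim tasks, and debate independently, each with its own inbox plus a shared task list. It is a paradigm jump from "foreman plus subordinates" to "flat squad."
The practical value of this piece is that Addy spells out which scenarios call for a swarm and which are better served by a single agent. Don't swarm for the sake of swarming. After reading, you will know that parallel debugging hypotheses, cross-layer features (frontend + backend + tests), and parallel code review are the three scenarios where a swarm delivers a qualitative gain; in most other cases a single session or subagents are the better deal.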
Claude Code Swarms
原文:Claude Code now supports agent teams (swarms). Instead of a single agent working through a task sequentially, a lead agent can delegate to multiple teammates that work in parallel - researching, debugging, and building while coordinating with each other. Enable agent teams in your settings.json to give it a spin. If you’ve been playing with multi-agent orchestration via Conductor, Gas Town or similar, this news will be exciting to you.
Claude Code 现在支持 agent teams(社区叫 swarm)。不再是一个 agent 顺序走完任务,而是一个 lead agent 把工作委派给多个并行工作的 teammate —— 一边做研究、debug、构建,一边互相协调。在你的 settings.json 里启用 agent teams 就能体验。如果你之前在玩 Conductor、Gas Town 或类似的多 agent 编排,这个消息会让你兴奋。
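🟢 Translator's note: Conductor (from Melty Labs) and Gas Town (Steve Yegge's project) are experimental multi-agent orchestration projects that started gaining traction in the second half of 2025; both let you run multiple coding-agent instances in parallel. Anthropic making this capability official signals its promotion from "niche trick" to "built-in product capability."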
原文:The community has been calling these patterns swarms - coordinated teams of AI agents, each with specialized roles, working in parallel with structured communication. What started as developers discovering feature-flagged capabilities in Claude Code’s binary and building workarounds with subagents and bash scripts is now a first-class feature. The TeammateTool, the inbox-based communication, the tmux split panes - it’s all here.
社区一直把这种模式叫 swarm(蜂群) —— AI agent 协调团队,每个有专精角色,通过结构化通信并行工作。原本是开发者从 Claude Code 二进制里挖出 feature flag 的能力、用 subagent 和 bash 脚本拼出 workaround;现在它成了一等功能。TeammateTool、inbox 通信、tmux 分屏 —— 全都在这里。
原文:This matters because it’s a fundamentally different architecture from the single-agent model most of us have been using. If you’ve been following the shift from conductor to orchestrator or experimenting with parallel agent workflows, agent teams are where those ideas become concrete.
这件事重要,因为它是和我们大多数人在用的单 agent 模型在架构上根本不同的另一种东西。如果你一直关注 从 conductor 到 orchestrator 的迁移,或者实验过 并行 agent workflow,agent team 就是这些想法落地具象的地方。
Why multi-agent coordination matters
原文:The single-agent model has a well-known failure mode. You ask Claude to do something complex - refactor authentication across three services, say - and it gets maybe 60% of the way there before context degrades. Details from step 2 blur into step 5. You /clear and start over. Repeat until frustrated.
单 agent 模型有一个众所周知的失败模式。你让 Claude 做一件复杂的事 —— 比如跨三个服务重构认证 —— 它走到 60% 左右,context 开始退化。第 2 步的细节在第 5 步模糊掉。你 /clear 重来。重复直到挫败。
原文:The core insight behind swarms is simple ~ LLMs perform worse as context expands. This isn’t just about hitting token limits, rather the more information in the context window, the harder it is for the model to focus on what matters right now. Adding a project manager’s strategic notes to a context that’s trying to fix a CSS bug actively hurts performance.
swarm 背后的核心洞察很简单 ——
原文金句:LLMs perform worse as context expands.
中译:context 越大,LLM 表现越差。
这不仅是 token 上限的问题;context window 里信息越多,模型越难聚焦在此刻真正重要的东西上。给一个正在修 CSS bug 的 context 加上 PM 的战略笔记,反而会让性能变差。
原文:Human teams work in kind of the same way. We don’t have backend engineers sitting in on frontend code reviews. We don’t CC the entire company on every Slack thread. Specialization is about focus.
人类团队大致也这么工作。我们不让后端工程师参加前端 code review。我们不在每条 Slack 线程里 CC 全公司。专精的本质是聚焦。
原文:Multi-agent patterns formalize this. By giving each agent a narrow scope and clean context, you get better reasoning within each domain, independent quality checks, natural checkpoints between phases, and graceful degradation when one agent fails. The testing agent has testing in its context, not the three-hour planning discussion. The security reviewer doesn’t wade through performance optimization notes.
multi-agent 模式把这件事形式化。给每个 agent 一个窄 scope + 干净 context,你就在每个领域内得到更好的推理、独立的质量检查、阶段之间的自然 checkpoint、以及单个 agent 失败时的优雅降级。测试 agent 的 context 里只有测试,而不是三小时的规划讨论。安全 reviewer 不必趟过性能优化笔记。
原文:The caveat is that this only works when tasks are properly scoped. “Build me an app” burns tokens while agents flail. “Implement these five clearly-defined API endpoints according to this specification” produces something good.
前提是任务必须 scope 得对。
- “给我搭个 app” → agent 们瞎飘,烧 token。
- “按这份规约实现这五个明确定义的 API 端点” → 产出好东西。
What an agent team is
原文:The architecture is straightforward. One Claude Code session becomes the team lead. It spawns teammates - each a full, independent Claude Code instance with its own large token context window. There’s a shared task list with dependency tracking, an inbox-based messaging system for inter-agent communication, and teammates can self-claim work as they finish tasks.
架构很直白。一个 Claude Code session 成为 team lead(队长)。它派生 teammate(队员) —— 每个都是一个完整、独立的 Claude Code 实例,带着自己的大 token context window。有一个共享的 task list 带依赖追踪,有一个基于 inbox 的消息系统用于 agent 间通信,teammate 可以在做完任务后自己抢下一个工作。
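| Component | Role |
|---|---|
| Team lead | Creates the team, spawns teammates, coordinates work |
| Teammates | Independent Claude Code instances working on assigned tasks |
| Task list | Shared work items with dependency tracking and auto-unblocking |
| Mailbox | Direct messaging between agents, not just reporting back to the lead |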
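To make the mailbox row concrete, here is a minimal in-memory sketch of inbox-based messaging in which any member can message any other member directly. The `Team` class and its `send` and `read` methods are hypothetical names invented for illustration; the article does not describe how Claude Code implements its inboxes.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class Team:
    """Hypothetical model: one lead, several teammates, one inbox per member."""
    lead: str
    teammates: list[str] = field(default_factory=list)
    inboxes: dict[str, deque] = field(default_factory=lambda: defaultdict(deque))

    def members(self) -> list[str]:
        return [self.lead, *self.teammates]

    def send(self, sender: str, recipient: str, text: str) -> None:
        # Teammates may message each other, not only the lead.
        if sender not in self.members() or recipient not in self.members():
            raise ValueError("both sender and recipient must be on the team")
        self.inboxes[recipient].append((sender, text))

    def read(self, member: str) -> list[tuple[str, str]]:
        # Drain the inbox and return messages in arrival order.
        messages = list(self.inboxes[member])
        self.inboxes[member].clear()
        return messages
```

The difference from a subagent setup is visible in `send`: there is no rule forcing the recipient to be the lead.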
原文:This is different from Claude Code’s existing subagents. Subagents are focused workers that report results back to a single parent - they can’t talk to each other. Agent teams are actual collaboration - teammates share findings, challenge each other’s approaches, and coordinate independently. The tradeoff is token cost - each teammate is a separate Claude instance.
这与 Claude Code 已有的 subagent 不同。subagent 是聚焦工人,把结果汇报给一个父 agent —— 他们彼此说不上话。agent team 是真正的协作 —— teammate 共享发现、互相挑战方案、独立协调。代价是 token 成本 —— 每个 teammate 是一个独立的 Claude 实例。
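| | Subagents | Agent teams |
|---|---|---|
| Context | Own window; results return to the caller | Own window; fully independent |
| Communication | Report back to the main agent only | Teammates message each other directly |
| Coordination | Main agent manages everything | Shared task list + self-coordination |
| Best for | Focused tasks where only the result matters | Complex work that needs discussion and collaboration |
| Token cost | Lower | Higher: each teammate is a separate instance |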
原文:Use subagents when you need quick, focused workers. Use agent teams when teammates need to share findings, challenge each other, and coordinate on their own.
需要快速、聚焦的工人时用 subagent。需要 teammate 共享发现、互相挑战、自我协调时用 agent team。
Getting started
原文:Enable agent teams by adding the experimental flag to your settings:
在你的 settings 里加上实验性 flag 启用 agent teams:
// settings.json
{
"env": {
"CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
}
}
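If you already have a settings file with other keys, you may prefer to merge the flag in rather than paste over the file. The sketch below is one way to do that; the function name `enable_agent_teams` is hypothetical, and the path is left to the caller because the article does not say which settings.json it means. It also assumes the file is strict JSON, so the `// settings.json` label shown above must not be part of the file itself.

```python
import json
from pathlib import Path

# Flag name and the "env" key are taken from the article's snippet.
FLAG = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def enable_agent_teams(settings_path: Path) -> dict:
    """Set the experimental flag under "env", preserving all other settings."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)  # raises on comments or invalid JSON
    env = settings.setdefault("env", {})
    env[FLAG] = "1"
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling it twice is harmless: the second call rewrites the same value and leaves unrelated keys untouched.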
原文:Then tell Claude what team you want in natural language. It handles spawning and coordination from there:
然后用自然语言告诉 Claude 你想要什么 team。它从这里开始接管 spawning 和协调:
I'm designing a CLI tool that helps developers track TODO comments across
their codebase. Create an agent team to explore this from different angles:
one teammate on UX, one on technical architecture, one playing devil's advocate.
我在设计一个 CLI 工具,帮开发者跨代码库追踪 TODO 注释。创建一个
agent team 从不同角度探索:一个 teammate 关注 UX,一个关注技术架构,
一个唱反调(devil's advocate)。
原文:Claude creates the team with a shared task list, spawns teammates for each perspective, has them explore the problem, and synthesizes findings. The lead’s terminal lists all teammates and what they’re working on.
Claude 用共享 task list 创建 team,为每个视角派生 teammate,让他们探索问题,综合发现。lead 的终端列出所有 teammate 和他们在做什么。
原文:You can also be explicit about team structure:
你也可以显式规定 team 结构:
Create a team with 4 teammates to refactor these modules in parallel.
Use Opus for each teammate.
创建一个 4 人 team,并行重构这几个模块。每个 teammate 用 Opus。
The sweet spot for swarms
原文:Here’s a pattern that works: tasks where parallel exploration adds real value and teammates can operate largely independently.
这是一个能 work 的模式:
原文金句:tasks where parallel exploration adds real value and teammates can operate largely independently.
中译:并行探索确实带来价值、且 teammate 能在很大程度上独立操作的任务。
原文:Competing hypotheses for debugging. Spawn five teammates each investigating a different theory about why the app exits after one message. Have them talk to each other to disprove each other’s theories, like a scientific debate. This is genuinely better than sequential investigation, which suffers from anchoring - once you explore one theory, subsequent investigation is biased toward it. Multiple investigators running adversarial debates converge on root causes faster.
1. 并行假说 debug。 派 5 个 teammate,每个调查一个不同理论:为什么 app 在一条消息后退出。让他们互相对话,像科学辩论一样反驳对方的假说。这真比顺序调查好 —— 顺序调查受锚定效应(anchoring)之苦,一旦你探索了一个理论,后续调查都会偏向它。多个调查者跑对抗式辩论,更快收敛到 root cause。
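🟢 Translator's note: Of all the swarm use cases, this is the one I am personally most bullish on. The biggest failure mode in bug investigation is "locking onto an early hypothesis, then filtering all later evidence through it." Having five teammates each hold a different hypothesis and try to falsify one another amounts to enforced Popper-style falsification. A single agent cannot reproduce that.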
原文:Parallel code review with different lenses. One teammate on security implications, one checking performance impact, one validating test coverage. A single reviewer tends to gravitate toward one type of issue at a time. Splitting review criteria into independent domains means each gets thorough attention simultaneously.
2. 多视角并行 code review。 一个 teammate 看安全影响,一个查性能影响,一个验证测试覆盖。单个 reviewer 倾向于一次只关注一种问题。把 review 标准拆到独立领域,每个都同时获得彻底关注。
原文:Cross-layer feature work. Changes that span frontend, backend, and tests - each owned by a different teammate. Instead of one agent context-switching between layers, three agents work in parallel with full focus on their domain.
3. 跨层 feature 工作。 跨前端、后端、测试的改动 —— 每层由不同 teammate 拥有。不是一个 agent 在层之间切 context,而是三个 agent 在自己的领域里全焦点并行。
原文:Research and exploration. Multiple teammates investigate different approaches simultaneously, share what they find, and converge on the best path forward. Research findings flow directly into implementation context - no telephone game.
4. 研究与探索。 多个 teammate 同时调查不同方法,分享发现,收敛到最佳前进路径。研究发现直接流入实现 context —— 没有传话游戏(telephone game)。
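🟢 Translator's note: The telephone game (传声筒游戏 in Chinese) is the game where a message gets distorted as it is passed around a circle. Addy's point: in a swarm, research feeds directly into implementation with nobody relaying it in between, so little is lost.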
Controlling the team
原文:You interact with agent teams through the lead’s terminal. A few controls that matter in practice:
你通过 lead 的终端和 agent team 交互。实践中重要的几个控制:
Display modes
原文:Two options. In-process (default): all teammates run inside your main terminal. Use Shift+Up/Down to select a teammate and type to message them directly. Press Enter to view a teammate’s session, Escape to interrupt their turn, Ctrl+T to toggle the task list. Works in any terminal.
两个选项。
In-process(进程内,默认):所有 teammate 跑在你的主终端里。Shift+Up/Down 选择 teammate,直接打字给他们发消息。Enter 查看 teammate 的 session,Escape 打断他们当前回合,Ctrl+T 切换 task list。任何终端都能用。
原文:Split panes: each teammate gets its own pane via tmux or iTerm2. You see everyone’s output at once and click into a pane to interact directly. Set it in settings or per-session:
Split panes(分屏):每个 teammate 通过 tmux 或 iTerm2 获得自己的 pane。你一次看到所有人的输出,点进某个 pane 直接交互。在 settings 里设置,或单 session 覆盖:
{
"teammateMode": "tmux"
}
原文:Or override for a single session:
或者单 session 覆盖:
claude --teammate-mode in-process
Plan approval
原文:For risky work, require teammates to plan before implementing. The teammate works in read-only mode until the lead approves their approach:
对风险工作,要求 teammate 在实现前先 plan。teammate 在只读模式下工作,直到 lead 批准他们的方案:
Spawn an architect teammate to refactor the authentication module.
Require plan approval before they make any changes.
派一个 architect teammate 重构认证模块。要求 plan approval,
之后才能做任何改动。
原文:If rejected, they revise and resubmit. You can influence the lead’s judgment with criteria: “only approve plans that include test coverage” or “reject plans that modify the database schema.”
如果被拒,他们修改重提。你可以用标准影响 lead 的判断:“只批准包含 test coverage 的 plan” 或 “拒绝修改数据库 schema 的 plan”。
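The approval criteria quoted above can be pictured as a small gate. In Claude Code the lead judges plans in natural language, so the sketch below is only an analogy: the plan is modelled as a plain dict, and `requires_tests`, `no_schema_changes`, and `review_plan` are hypothetical names, not part of any real API.

```python
from typing import Callable, Optional

# A criterion returns an objection string, or None if the plan passes it.
Criterion = Callable[[dict], Optional[str]]


def requires_tests(plan: dict) -> Optional[str]:
    return None if plan.get("includes_tests") else "plan must include test coverage"


def no_schema_changes(plan: dict) -> Optional[str]:
    if plan.get("modifies_schema"):
        return "plan must not modify the database schema"
    return None


def review_plan(plan: dict, criteria: list[Criterion]) -> list[str]:
    """Return the objections; an empty list means the plan is approved."""
    return [msg for check in criteria if (msg := check(plan)) is not None]
```

A teammate whose plan comes back with objections stays read-only, revises, and resubmits until `review_plan` returns an empty list.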
Delegate mode
原文:Press Shift+Tab to restrict the lead to coordination only - spawning, messaging, shutting down teammates, and managing tasks. No code touching. This stops a common problem: the lead getting distracted and implementing things itself instead of waiting for teammates.
按 Shift+Tab 把 lead 限制为只做协调 —— 派生、发消息、关停 teammate、管理任务。不碰代码。 这阻止一个常见问题:lead 分心,自己动手实现,而不是等 teammate。
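🟢 Translator's note: The name "delegate mode" translates literally into Chinese as 委派模式. It exists specifically to keep the lead from overstepping into worker territory, a very human failure mode that engineering managers fall into as well.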
Interacting with teammates directly
原文:Each teammate is a full Claude Code session. You can message any of them directly to give additional instructions, ask follow-up questions, or redirect their approach - without going through the lead.
每个 teammate 都是完整的 Claude Code session。你可以直接给他们任何一个发消息,提供额外指令、追问、调整方法 —— 不必走 lead。
Task management
原文:The shared task list coordinates work across the team. Tasks have three states: pending, in progress, and completed. Tasks can depend on other tasks - a pending task with unresolved dependencies can’t be claimed until those dependencies complete. Auto-unblocking happens when a blocking task finishes.
共享 task list 协调全 team。任务有三个状态:pending、in progress、completed。任务可以依赖其他任务 —— 依赖未解决的 pending 任务无法被 claim。阻塞任务完成时自动解锁。
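The three states and the dependency rule can be sketched as a tiny task list. This is an illustrative model under stated assumptions, not Claude Code's storage format: `State`, `Task`, and `TaskList` are hypothetical names, and "auto-unblocking" is modelled by recomputing blocked status on demand instead of by sending notifications.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"


@dataclass
class Task:
    name: str
    depends_on: set[str] = field(default_factory=set)
    state: State = State.PENDING
    owner: Optional[str] = None


class TaskList:
    def __init__(self, tasks: list[Task]) -> None:
        self.tasks = {task.name: task for task in tasks}

    def is_blocked(self, task: Task) -> bool:
        # Blocked while any dependency has not completed.
        return any(self.tasks[dep].state is not State.COMPLETED for dep in task.depends_on)

    def claimable(self) -> list[str]:
        # Pending, unassigned, and unblocked tasks, in insertion order.
        return [
            task.name
            for task in self.tasks.values()
            if task.state is State.PENDING and task.owner is None and not self.is_blocked(task)
        ]

    def claim(self, name: str, teammate: str) -> None:
        if name not in self.claimable():
            raise ValueError(f"{name!r} cannot be claimed right now")
        task = self.tasks[name]
        task.state = State.IN_PROGRESS
        task.owner = teammate

    def complete(self, name: str) -> None:
        # Dependents become claimable the next time claimable() is called.
        self.tasks[name].state = State.COMPLETED
```

With tasks `api` and `frontend` (which depends on `api`), only `api` is claimable at first; once `api` is completed, `frontend` shows up in `claimable()` without any extra step.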
原文:The lead can assign tasks explicitly, or teammates can self-claim the next unassigned, unblocked task when they finish. Task claiming uses file locking to prevent race conditions.
lead 可以显式分配任务,也可以让 teammate 完成后自己抢下一个未分配、未阻塞的任务。Task claim 用文件锁防止竞态。
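The article says claiming uses file locking but does not say how. One common way to get that guarantee is exclusive file creation, sketched below as an assumption: the `.claim` file naming and the `try_claim` helper are hypothetical, and this is not necessarily the mechanism Claude Code uses.

```python
import os
from pathlib import Path


def try_claim(task_dir: Path, task_id: str, teammate: str) -> bool:
    """Return True if this teammate won the claim, False if it is already held."""
    claim_file = task_dir / f"{task_id}.claim"
    try:
        # O_CREAT | O_EXCL fails if the file exists, so only one caller succeeds.
        fd = os.open(claim_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(teammate)
    return True
```

Because the existence check and the creation happen in a single system call, two teammates racing for the same task cannot both see "unclaimed" and both proceed.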
原文:Teams and tasks are stored locally:
team 和 task 存在本地:
~/.claude/teams/{team-name}/config.json # team metadata + members
~/.claude/tasks/{team-name}/ # Task list
原文:Teammates can read the config file to discover other team members.
teammate 可以读 config 文件发现其他 team 成员。
Shutdown
原文:Ask the lead to shut down specific teammates - they can approve or reject the request with an explanation. To clean up the whole team:
让 lead 关停特定 teammate —— 他们可以带解释批准或拒绝。清理整个 team:
Clean up the team
原文:Always use the lead for cleanup. Teammates shouldn’t run it because their team context may not resolve correctly, potentially leaving resources in an inconsistent state. The lead checks for active teammates and fails if any are still running, so shut them down first.
永远用 lead 做清理。teammate 不应该跑清理,因为他们的 team context 可能解析不正确,可能留下不一致状态的资源。lead 会检查活跃 teammate,如果有还在跑的就 fail,所以先关停他们。
A management parallel
原文:I keep coming back to this: the skills that make someone a strong engineering manager translate directly into effective agent orchestration. Agent teams make this even more explicit.
我反复回到这件事:让一个人成为强力工程经理的技能,直接转化成有效的 agent 编排能力。agent team 让这件事更明显。
原文:Task sizing matters. Too small and coordination overhead dominates. Too large and teammates work too long without check-ins, risking wasted effort. The sweet spot is self-contained units that produce a clear deliverable. Having 5-6 tasks per teammate keeps everyone productive and lets the lead reassign work if someone gets stuck.
任务粒度重要。 太小,协调开销主导。太大,teammate 太久不汇报,有浪费精力的风险。甜蜜区是自包含单元 + 清晰可交付物。每个 teammate 5-6 个任务,让大家保持产能,lead 在有人卡住时能重新分配。
原文:File ownership matters. Two teammates editing the same file leads to overwrites. Break the work so each teammate owns a different set of files. Same boundary-setting you’d do with a human team to avoid merge conflicts.
文件所有权重要。 两个 teammate 编辑同一个文件 → 覆盖事故。把工作拆开,让每个 teammate 拥有不同的文件集合。和你为了避免 merge conflict 在人类团队里设置的边界一样。
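Before spawning a team, it can help to check a planned split for shared files. The helper below is a hypothetical pre-flight check, not a Claude Code feature: it takes a mapping from teammate name to the files you intend to assign and reports any file with more than one owner.

```python
from collections import defaultdict


def find_ownership_conflicts(assignments: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each file claimed by more than one teammate to its sorted owners."""
    owners = defaultdict(list)
    for teammate, files in assignments.items():
        for path in set(files):  # ignore duplicates within one teammate's list
            owners[path].append(teammate)
    return {path: sorted(names) for path, names in owners.items() if len(names) > 1}
```

An empty result means every file has exactly one owner; anything else is a candidate for the overwrite problem described above and a hint to redraw the boundaries.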
原文:Context loading matters. Teammates get your project’s CLAUDE.md, MCP servers, and skills automatically, but they don’t inherit the lead’s conversation history. Include task-specific details in the spawn prompt:
Context 加载重要。 teammate 自动获得项目的 CLAUDE.md、MCP servers、skills,但他们不继承 lead 的对话历史。在 spawn prompt 里包含任务专属细节:
Spawn a security reviewer teammate with the prompt: "Review the
authentication module at src/auth/ for security vulnerabilities. Focus
on token handling, session management, and input validation. The app
uses JWT tokens stored in httpOnly cookies. Report any issues with
severity ratings."
派一个 security reviewer teammate,prompt 如下:"审查 src/auth/ 的
认证模块,找安全漏洞。聚焦 token 处理、session 管理、输入验证。
该 app 使用存在 httpOnly cookies 里的 JWT token。报告所有问题
并标注 severity。"
原文:The more specific the brief, the better the output. Same as always - but now you’re writing briefs for a team, not a single agent.
brief 越具体,输出越好。和以前一样 —— 但现在你为一个 team 写 brief,不是一个 agent。
Caveats
原文:This is experimental. The rough edges are real and worth knowing about.
这是实验性的。毛刺是真的,值得知道。
原文:The lead sometimes implements instead of delegating. Tell it to wait: “Wait for your teammates to complete their tasks before proceeding.” Or use delegate mode (Shift+Tab) to restrict the lead to coordination-only tools.
Lead 有时不委派,自己动手。 告诉它等:“在 teammate 完成任务前等着。” 或者用 delegate mode(Shift+Tab)限制 lead 只能用协调类工具。
原文:No session resumption for in-process teammates. /resume and /rewind don’t restore in-process teammates. After resuming, the lead may try to message teammates that no longer exist. Spawn fresh ones.
in-process teammate 不支持 session 恢复。 /resume 和 /rewind 不会恢复 in-process teammate。恢复后,lead 可能尝试给已经不存在的 teammate 发消息。派新的。
原文:Task status can lag. Teammates sometimes fail to mark tasks as completed, blocking dependent tasks. Check whether the work is actually done and nudge the lead or update manually.
Task 状态可能滞后。 teammate 有时忘了把任务标 completed,阻塞了依赖任务。检查工作是不是真做完了,提醒 lead 或手动更新。
原文:One team per session, no nested teams. A lead manages one team at a time. Teammates can’t spawn their own teams or teammates. Only the lead manages the team. This is deliberate - preventing infinite recursion, runaway token costs, and loss of human oversight. Clean up the current team before starting a new one.
每个 session 一个 team,没有嵌套 team。 lead 一次管一个 team。teammate 不能派生自己的 team 或 teammate。只有 lead 管理 team。这是故意的 —— 防止无限递归、token 成本失控、丢失 human oversight。开新 team 前清理当前 team。
原文:Token costs scale with teammates. Each teammate is a separate Claude instance with its own context window. For routine tasks, a single session is more cost-effective. Multi-agent patterns pay off on larger, parallelizable work - not for fixing a typo.
Token 成本随 teammate 数量增长。 每个 teammate 都是独立 Claude 实例,有自己的 context window。对常规任务,单 session 更划算。multi-agent 模式只在大型、可并行的工作上回本 —— 不是修一个 typo。
原文:Split panes require tmux or iTerm2. Not supported in VS Code’s integrated terminal, Windows Terminal, or Ghostty. The default in-process mode works everywhere.
Split panes 需要 tmux 或 iTerm2。 VS Code 集成终端、Windows Terminal、Ghostty 不支持。默认 in-process 模式哪里都能用。
原文:Permissions propagate from the lead. All teammates start with the lead’s permission settings. If the lead runs with --dangerously-skip-permissions, all teammates do too. You can change individual teammate modes after spawning, but can’t set per-teammate modes at spawn time.
权限从 lead 传播。 所有 teammate 用 lead 的 permission 设置启动。如果 lead 用 --dangerously-skip-permissions 跑,所有 teammate 也是。spawn 之后能改单个 teammate 的模式,但 spawn 时不能按 teammate 设置。
原文:Shutdown can be slow. Teammates finish their current request or tool call before shutting down, which can take time.
关停可能慢。 teammate 在关停前会完成当前请求或 tool call,这可能要一些时间。
A word of warning
原文:There’s a seductive quality to watching agents work in parallel. The activity metrics are impressive - commits per hour, parallel task completion, lines of code touched.
看着 agent 并行工作有种诱人的品质。活动指标令人印象深刻 —— 每小时 commit 数、并行任务完成、动到的代码行数。
原文:But activity doesn’t always translate to value.
但活动不总是转化为价值。
原文:The risk with multi-agent systems is that they make it easy to produce large quantities of code very quickly. That code still needs to be right, maintainable, and actually solving the problem. I’ve seen developers lose the plot, spending more time configuring orchestration patterns than thinking about what they’re building.
multi-agent 系统的风险是让你很容易非常快地产出大量代码。 那些代码仍然需要是对的、可维护的、真在解决问题。我见过开发者迷失方向,花在配置编排模式上的时间多于花在思考自己在搭什么上的时间。
原文:Let the problem guide the tooling, not the other way around. If a single agent in a focused session gets you there faster, use that. If you need parallel specialists, use agent teams. Agent teams add coordination overhead and use significantly more tokens than a single session. They work best when teammates can operate independently. For sequential tasks, same-file edits, or work with many dependencies, a single session or subagents are more effective.
原文金句:Let the problem guide the tooling, not the other way around.
中译:让问题引导工具,而不是反过来。
如果单 agent 在聚焦 session 里更快,用单 agent。如果你需要并行专家,用 agent team。agent team 增加协调开销,消耗的 token 比单 session 多得多。它们在 teammate 能独立工作时最好用。对顺序任务、同文件编辑、多依赖工作,单 session 或 subagent 更有效。
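The guidance in this paragraph can be compressed into a rough rule of thumb. The function below is only one possible encoding of it: the three boolean inputs and the name `suggest_mode` are simplifications invented for illustration, and real decisions will involve more judgment than three flags.

```python
def suggest_mode(parallelizable: bool, teammates_must_talk: bool, same_file_edits: bool) -> str:
    """Rough heuristic distilled from the article's advice; not an official rule."""
    if not parallelizable or same_file_edits:
        # Sequential work and shared files gain little from a team.
        return "single session or subagents"
    if teammates_must_talk:
        # Parallel specialists who need to share findings and challenge each other.
        return "agent team"
    # Parallel, independent, result-only work.
    return "subagents"
```

For example, a cross-layer feature with separate files and active coordination maps to "agent team", while a long sequential refactor of one file maps to "single session or subagents".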
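🟢 Translator's note: This is the most important warning in the piece. A swarm is a tool that gets people high (watching several agents work at once gives an engineer the thrill of "conducting an orchestra"), but for 90% of everyday coding tasks a single session is faster, cheaper, and easier to control.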
Getting the most out of swarms with Compound Engineering
原文:If you want a more structured workflow around agent teams, the Compound Engineering Plugin from Every might be worth a look. It’s a Claude Code plugin that adds specialized review agents and a plan → work → review → compound cycle designed around the idea that each unit of engineering work should make subsequent units easier.
如果你想要一套围绕 agent team 的更结构化 workflow,Every 出的 Compound Engineering Plugin 值得看一眼。这是一个 Claude Code 插件,加入了专精的 review agent,以及一个 plan → work → review → compound 循环 —— 围绕这个想法设计:每个工程工作单元都应该让后续工作单元更容易。
原文:Install it directly in Claude Code:
直接在 Claude Code 里装:
/plugin marketplace add https://github.com/EveryInc/compound-engineering-plugin
/plugin install compound-engineering
原文:The parts most relevant to agent teams: /workflows:plan turns feature ideas into detailed implementation plans (exactly the kind of upfront specification that makes agent delegation work well), /workflows:review runs multi-agent code review before merging (security, performance, architecture, and complexity - each with its own specialized reviewer), and /workflows:compound documents learnings so future agents benefit from past work.
对 agent teams 最相关的部分:
- /workflows:plan 把 feature 想法变成详细实现 plan(正是这种前置规约让 agent 委派工作得好)。
- /workflows:review 在 merge 前跑多 agent code review(安全、性能、架构、复杂度,各有专精 reviewer)。
- /workflows:compound 记录学到的东西,让未来 agent 受益于过往工作。
原文:That last piece is the interesting one. The plugin’s philosophy - 80% planning and review, 20% execution - maps cleanly onto what makes agent teams effective. The better your specs, the better the agent output. The more learnings you codify, the less each subsequent agent flails. It’s the same compounding dynamic I described with AGENTS.md and persistent context but packaged into a repeatable workflow.
最后这块最有趣。插件的哲学 —— 80% 规划和审查、20% 执行 —— 干净地对应到”什么让 agent team 有效”。spec 越好,agent 产出越好。学到的东西越多被编码下来,后续每个 agent 越少瞎飘。这和我在 AGENTS.md 和持久 context 里描述的复利效应一样,只是被打包成了可复用 workflow。
原文:It also works with OpenCode and Codex (experimentally) if you’re not exclusively in Claude Code.
它也(实验性地)与 OpenCode 和 Codex 兼容,如果你不是只用 Claude Code。
Getting started: how I would do it
原文:If you’re new to multi-agent coordination, start small.
如果你刚开始接触 multi-agent 协调,从小处起步。
原文:Start with research and review. Tasks that have clear boundaries and don’t require writing code - reviewing a PR from three angles, researching a library, investigating a bug with competing theories. These show the value of parallel exploration without the coordination complexity of parallel code changes.
1. 从 research 和 review 起步。 边界清晰、不需要写代码的任务 —— 从三个角度 review 一个 PR、调研一个库、用对抗假说调查一个 bug。这些展示并行探索的价值,没有并行代码改动的协调复杂度。
原文:Then try cross-layer features. Frontend, backend, and tests each owned by a different teammate. Clean boundaries, clear deliverables.
2. 然后试跨层 feature。 前端、后端、测试各由不同 teammate 拥有。边界干净,交付物清晰。
原文:Then scale to larger refactors. Multiple services, parallel implementation after a shared design phase. This is where the time compression gets dramatic - what might take days of sequential work compresses into hours of parallel execution plus review.
3. 然后扩展到更大的重构。 多服务,共享设计阶段之后并行实现。这是时间压缩戏剧化的地方 —— 本来要几天顺序工作的东西,被压缩成几小时的并行执行 + review。
原文:The core skill isn’t writing less code. It’s decomposing problems into structures that agent teams can execute - knowing what to build, what correctness means, and how to verify results. The implementation increasingly becomes a matter of sufficiently precise specification.
原文金句:The core skill isn’t writing less code. It’s decomposing problems into structures that agent teams can execute.
中译:核心技能不是写更少的代码。是把问题分解成 agent team 可执行的结构 —— 知道要造什么、正确性意味着什么、怎么验证结果。实现越来越变成”足够精确的规约”问题。
原文:This is what agentic engineering looks like when the agents can actually coordinate. The architecture for coordinated AI agent systems is here. Use it wisely.
这就是当 agent 真能协调时,agentic engineering 看起来的样子。协调式 AI agent 系统的架构已经在这里了。明智地使用它。
原文:The full Claude Code agent teams documentation has the complete setup and usage guide.
完整的 Claude Code agent teams 文档 有完整 setup 和使用指南。
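Translator's summary
- Subagents and agent teams are two different tools, not substitutes. Subagents fit "send someone off to do the work and report back"; agent teams fit "a squad that collaborates and challenges each other." 90% of scenarios are still best served by a single session or subagents. The ones that truly suit a swarm fall into four categories: debugging with parallel hypotheses, multi-lens code review, cross-layer features, and research and exploration.
- The biggest risk of a swarm is getting high on activity metrics. Addy's warning matters: agent teams make it easy to double code output, but that code still has to be correct, maintainable, and actually solve the problem. If your team starts chasing the "conducting an orchestra" feeling instead of delivered value, you are in comprehension-debt territory (see "Comprehension Debt" in the Addy trilogy).
- Management skills translate directly into agent orchestration skills. Task sizing, file ownership, context loading, delegate mode: each maps to a human-team problem that engineering managers handle daily. That makes senior engineering managers an undervalued asset in the agent era.
- Compound Engineering's "80% planning and review, 20% execution" philosophy: the better the spec, the better the agent output. This interlocks with Addy's point in "Long-running Agents" that the skill gaining value is not writing code but writing specs that survive contact with an autonomous executor. The core skill for new engineers is shifting toward problem decomposition and specification writing.
- Anthropic pulling multi-agent out of the experimental fringe and into a built-in capability means frameworks like LangGraph and AutoGen, which let you assemble multi-agent systems yourself, need to reassess their product positioning. When one switch in Claude Code gives you a swarm, a framework's value lies in more complex orchestration, running your own models, or spanning vendors, not in providing multi-agent capability as such.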
🔗 Sources
- Original: https://addyosmani.com/blog/claude-code-agent-teams/
- Companion close reading: the Addy Osmani trilogy
- Companion full texts: Agent Harness Engineering (full text) / Long-running Agents (full text)
- Official documentation: Claude Code, Agent Teams
- Related original: Addy, From conductor to orchestrator
- Related original: Addy, Coding agents and the engineering manager
- Related original: Addy, Self-improving agents
- Related plugin: Compound Engineering Plugin (Every)
📝 Companion close reading and translator's commentary: the Addy Osmani trilogy