Playwright MCP / CLI + SKILLs: The Production-Grade Answer to AI-Written Tests
Impact: Starting with Playwright 1.58 in 2026, Microsoft promotes the CLI+SKILLs mode, which is 4× cheaper than pure MCP in token cost; Playwright has 45.1% market adoption (Cypress: 14.4%) and 33M weekly npm downloads.
Practicality: 🟢 hands-on
Key dates: Playwright 1.59.1 (2026-04-01); Playwright MCP v0.0.73 (2026-05-01)
🔥 Impact card
- Playwright 1.59.1 (2026-04-01) / 1.58.0 (2026-01-23, introduced CLI+SKILLs) / 1.57.0 (2025-11-25, switched to Chrome for Testing)
- Playwright MCP v0.0.73 (2026-05-01)
- Data: 33M weekly npm downloads (Cypress: 6.5M); 95k★ on GitHub
- Key skill repositories: lackeyjb/playwright-skill + AgentMantis/test-skills + anthropics/skills
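🎯 Why this is a must-read
In 2024 the question was "can AI write tests?"; in 2025 it was "are the tests any good?"; in 2026 it is "which workflow produces production-grade tests?" The mainstream answer is Skill (conventions) + Playwright MCP/CLI (real context) + mandatory code review.
If your team still has an LLM write e2e tests straight from a prompt, roughly 90% of those tests will contain hallucinated selectors and stop working within a week. This piece covers production-grade best practices.
Summary in one sentence
AI should not write e2e tests from a bare prompt; the LLM needs real page context (MCP), team conventions (Skill), and human review.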
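💎 Quote wall
★ "CLI+SKILLs is 4× cheaper than pure MCP for coding agents." (Microsoft Playwright team, 2026 announcement.) Translator's note: this is the reality of token cost. With pure MCP, every tool call takes several round trips (the LLM describes the call → the MCP server responds); with CLI+SKILLs the agent spawns a process directly, which compresses a tool call into a single round trip.
★ "Don't let the model guess selectors. Give it real page snapshots." (Checkly methodology blog.) Translator's note: this is the biggest root cause of AI-written tests failing. Working only from a requirements description, the LLM invents [data-testid="submit-button"] when the real page has no such testid. Only an ariaSnapshot lets the LLM see the real structure.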
📋 Close reading
1. Installing Playwright MCP (VSCode / Cursor / Claude Desktop)
# VSCode: one-command install
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'
// Cursor / Claude Desktop config
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
🟢 Translator's note: once installed, the agent gains MCP tools including these five: browser_navigate / browser_click / browser_snapshot / browser_route (mock) / browser_console_messages.
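When a Cursor or Claude Desktop config already lists other servers, the Playwright entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that merge. The type and function names (`McpServerEntry`, `McpConfig`, `withPlaywrightServer`) are hypothetical and not part of any SDK; only the entry itself is taken from the config above.

```typescript
// Hypothetical types for illustration; they mirror the JSON shape shown above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Entry copied from the config shown above.
const PLAYWRIGHT_ENTRY: McpServerEntry = {
  command: "npx",
  args: ["@playwright/mcp@latest"],
};

// Returns a new config with the Playwright server added.
// Servers that were already configured are kept untouched.
function withPlaywrightServer(existing: Partial<McpConfig>): McpConfig {
  return {
    ...existing,
    mcpServers: {
      ...(existing.mcpServers ?? {}),
      playwright: PLAYWRIGHT_ENTRY,
    },
  };
}
```

Returning a new object instead of mutating the input keeps the helper safe to call on a config that was just parsed from disk.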
2. Microsoft's 2026 recommendation: CLI + SKILLs mode (instead of pure MCP)
<!-- .claude/skills/playwright-test/SKILL.md -->
---
name: playwright-test
description: |
  Generate Playwright e2e tests with real page context.
  Use when user asks to write or update e2e tests.
---
# Playwright Test Generation
When asked to write a test:
1. Navigate to the target URL with `npx playwright codegen` or use page.goto
2. Get page structure via `page.ariaSnapshot()` — DON'T guess selectors
3. Write the test in `tests/<feature>.spec.ts`
4. Run `npx playwright test` to verify
5. If failures, debug with trace: `npx playwright trace show test-results/.../trace.zip`
## Rules
- Always use `page.getByRole`, `page.getByLabel`, `page.getByText` (a11y selectors)
- Avoid CSS selectors except when no a11y attribute exists
- One test per user flow, named after the flow
## Examples
- references/login-flow-test.spec.ts.md
- references/form-submit-test.spec.ts.md
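🟢 Translator's note: SKILL.md captures "team conventions" in a format the agent can read; the CLI (npx playwright) lets the agent spawn a process directly and get real results, using 4× fewer tokens than pure MCP tool calls.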
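Because the agent decides when to load a skill from the frontmatter, a quick check that `name` and `description` are present can catch a broken SKILL.md early. The sketch below is a minimal, hand-rolled reader for just those two fields. It is not a general YAML parser and not a substitute for a real validator; `SkillMeta` and `parseSkillFrontmatter` are hypothetical names.

```typescript
interface SkillMeta {
  name: string;
  description: string;
}

// Reads `key: value` pairs between the first two `---` lines.
// Handles only flat keys and `|` / `>` block scalars, which is all the example above uses.
function parseSkillFrontmatter(markdown: string): SkillMeta {
  const lines = markdown.split(/\r?\n/);
  const start = lines.findIndex((l) => l.trim() === "---");
  const end = lines.findIndex((l, i) => i > start && l.trim() === "---");
  if (start === -1 || end === -1) {
    throw new Error("missing frontmatter delimiters");
  }

  const fields: Record<string, string> = {};
  let currentKey: string | null = null;
  for (const line of lines.slice(start + 1, end)) {
    // Keys start in column 0; indented lines continue the previous block scalar.
    const match = /^([A-Za-z_-]+):\s*(.*)$/.exec(line);
    if (match) {
      const key = match[1] ?? "";
      const value = match[2] ?? "";
      currentKey = key;
      fields[key] = value === "|" || value === ">" ? "" : value;
    } else if (currentKey !== null && line.trim() !== "") {
      fields[currentKey] = `${fields[currentKey] ?? ""} ${line.trim()}`.trim();
    }
  }

  const name = fields["name"] ?? "";
  const description = fields["description"] ?? "";
  if (name === "" || description === "") {
    throw new Error("SKILL.md frontmatter needs both name and description");
  }
  return { name, description };
}
```

Block-scalar lines are joined with single spaces here, which is enough for a presence check but does not preserve the original line breaks.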
3. Key API: page.ariaSnapshot()
// Before the LLM writes a test, show it the real page structure
import { test, expect } from '@playwright/test';
test('debug — see real page', async ({ page }) => {
  await page.goto('https://example.com');
  console.log(await page.ariaSnapshot());
  // Prints the a11y tree; the LLM writes selectors based on it
});
The output looks roughly like this:
- main:
  - heading "Welcome" [level=1]
  - textbox "Email"
  - button "Sign In"
🟢 Translator's note: this is the key to getting runnable tests out of an LLM. Once it sees textbox "Email", it knows to use page.getByRole('textbox', { name: 'Email' }) and has no need to guess a testid.
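The mapping from a snapshot line to a role-based locator is mechanical, and a small sketch makes that concrete. The code below parses lines in the simplified format shown above and emits the matching `page.getByRole(...)` expression as a string. `SnapshotNode`, `parseAriaSnapshot`, and `toLocatorExpression` are hypothetical helpers, and the regex assumes names without embedded double quotes; real snapshots can carry more syntax than this handles.

```typescript
interface SnapshotNode {
  role: string;
  name: string;
}

// Parses lines such as `- button "Sign In"`.
// Nodes without an accessible name (for example `- main:`) are skipped.
function parseAriaSnapshot(snapshot: string): SnapshotNode[] {
  const nodes: SnapshotNode[] = [];
  for (const line of snapshot.split(/\r?\n/)) {
    const match = /^\s*-\s+([a-z]+)\s+"([^"]*)"/.exec(line);
    if (match) {
      nodes.push({ role: match[1] ?? "", name: match[2] ?? "" });
    }
  }
  return nodes;
}

// Renders the locator call an agent would write for this node.
function toLocatorExpression(node: SnapshotNode): string {
  return `page.getByRole(${JSON.stringify(node.role)}, { name: ${JSON.stringify(node.name)} })`;
}
```

Fed the snapshot above, this yields one locator each for the heading, the textbox, and the button, all derived from what the page actually exposes rather than from a guessed attribute.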
4. Three production-grade skill repositories
| Repository | Highlights | Install |
|---|---|---|
| anthropics/skills | Official; includes evals + A/B blind testing (updated 2026-03) | /plugin marketplace add anthropics/skills |
| lackeyjb/playwright-skill | The model decides on its own when to invoke it, then writes and runs the test automatically | /plugin install playwright-skill@playwright-skill |
| AgentMantis/test-skills | 5 lifecycle skills (init / pom / regression / handover / promote); supports 40+ agents | npx skills add AgentMantis/test-skills --agent claude-code |
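🟢 Translator's note: AgentMantis/test-skills is the most complete of the three. Its 5 skills cover the full lifecycle of an e2e suite: init (bootstrap the suite), pom (write page objects), regression (regression cases), handover (handover docs), and promote (promote from staging to prod).
5. Checkly's golden prompt template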
You are a Playwright test generator and an expert in TypeScript,
Frontend development, and Playwright end-to-end testing.
Use MCP tools to navigate the site (don't assume).
Access page snapshots before interactions.
Generate tests only after completing all steps.
Verify with `npx playwright test`.
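🟢 Translator's note: the two words "don't assume" are what matter. An LLM's default behavior is to guess from its training data; this prompt forces it to call the tools and look at the real page first.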
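In practice the template is reused across many pages, so it helps to keep the fixed instructions in one place and append the per-task details. The sketch below shows one way to do that. `TestRequest` and `buildTestPrompt` are hypothetical names, and the two trailing fields (target URL and user flow) are an assumption about what a team would pass in, not part of Checkly's template.

```typescript
// Fixed instructions, copied from the template above.
const BASE_PROMPT = [
  "You are a Playwright test generator and an expert in TypeScript,",
  "Frontend development, and Playwright end-to-end testing.",
  "Use MCP tools to navigate the site (don't assume).",
  "Access page snapshots before interactions.",
  "Generate tests only after completing all steps.",
  "Verify with `npx playwright test`.",
].join("\n");

// Hypothetical per-task input.
interface TestRequest {
  url: string;
  flow: string;
}

// Appends the task details to the fixed instructions.
// Rejects URLs without a scheme so the agent is never sent to a guessed address.
function buildTestPrompt(request: TestRequest): string {
  if (!/^https?:\/\//.test(request.url)) {
    throw new Error(`url must start with http:// or https://, got: ${request.url}`);
  }
  return `${BASE_PROMPT}\n\nTarget URL: ${request.url}\nUser flow to cover: ${request.flow}`;
}
```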
6. New in Playwright 1.59 (hands-on)
// CLI debugging for agents
npx playwright test --debug=cli
// Async disposables (Playwright 1.59)
import { test, chromium } from '@playwright/test';
test('cleanup automatic', async () => {
  await using browser = await chromium.launch();
  await using context = await browser.newContext();
  // ... both are closed automatically when they go out of scope
});
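// Multiple clients sharing one browser
// wsEndpoint: WebSocket URL of an already-running browser server (defined elsewhere)
const browser = await chromium.connect(wsEndpoint);
const ctxA = await browser.bind('A');
const ctxB = await browser.bind('B');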
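For readers new to `await using`: at scope exit the runtime calls each resource's `[Symbol.asyncDispose]()` in reverse declaration order, even if the body throws. The sketch below imitates that ordering with an explicit try/finally so it runs without the new syntax. `Closable` and `withResources` are hypothetical stand-ins, not Playwright APIs, and the sketch is simplified: it does not aggregate errors the way the real language feature does.

```typescript
// Hypothetical stand-in for a browser or context: anything with an async close().
interface Closable {
  close(): Promise<void>;
}

// Runs `body`, then closes every tracked resource in reverse order,
// even when `body` throws. This mirrors the cleanup order of `await using`.
async function withResources<T>(
  body: (track: <R extends Closable>(resource: R) => R) => Promise<T>,
): Promise<T> {
  const acquired: Closable[] = [];
  const track = <R extends Closable>(resource: R): R => {
    acquired.push(resource);
    return resource;
  };
  try {
    return await body(track);
  } finally {
    // Last acquired is closed first: a context is closed before its browser.
    for (const resource of acquired.reverse()) {
      await resource.close();
    }
  }
}
```

The reverse order matters because later resources usually depend on earlier ones, which is why the context in the example above is released before the browser that created it.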
7. Vitest 4 Browser Mode (an alternative to e2e)
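// vitest.config.ts
import { defineConfig } from 'vitest/config';
import { playwright } from '@vitest/browser-playwright';

export default defineConfig({
  test: {
    browser: {
      enabled: true,
      provider: playwright(), // assumes the Playwright provider package is installed
      instances: [{ browser: 'chromium' }, { browser: 'firefox' }],
    },
  },
});
// Visual regression assertion (inside a browser-mode test)
await expect(locator).toMatchScreenshot();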
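🟢 Translator's note: Vitest 4 Browser Mode became stable on 2025-10-22, with first-class Playwright Trace integration. For component tests, Vitest Browser Mode is the lighter choice over Playwright; for full e2e, Playwright remains the tool.
🟢 Translator's overall take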
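- Change now: consolidate all of the team's e2e prompt sheets into .claude/skills/playwright-test/SKILL.md and run skills-ref validate
- Must install: Playwright MCP (local debugging) + AgentMantis/test-skills (team conventions)
- Hard rule: the LLM may not write selectors directly; it must call page.ariaSnapshot() first
- PR gate: add a CI step that runs npx playwright test --reporter=html; AI-written tests that fail may not be merged
- Don't: blindly upgrade Cypress to 15. Leave old projects alone if they have no pain points; start new projects directly on Playwright (it wins on cost, performance, and AI integration)
- Further reading: SKILL.md (how to write a skill) + the MCP protocol + TypeScript 7 Corsa (faster type checking for TS test code)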
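The hard rule on selectors can be backed by a cheap static check in the PR gate, alongside the test run. The sketch below flags lines that pass a string literal to `.locator(...)`, which is where guessed CSS or testid selectors usually appear. The pattern is a heuristic chosen for this illustration, `SelectorFinding` and `findRawSelectors` are hypothetical names, and legitimate CSS fallbacks would still need a human reviewer to approve them.

```typescript
interface SelectorFinding {
  line: number; // 1-based line number in the test file
  text: string; // the offending line, trimmed
}

// Heuristic: a string literal passed straight to .locator(...).
// Role-based calls such as getByRole / getByLabel / getByText are not matched.
const RAW_SELECTOR = /\.locator\(\s*['"`]/;

// Returns one finding per line that uses a raw selector string.
function findRawSelectors(source: string): SelectorFinding[] {
  const findings: SelectorFinding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    if (RAW_SELECTOR.test(text)) {
      findings.push({ line: index + 1, text: text.trim() });
    }
  });
  return findings;
}
```

A CI script could read each changed spec file, call `findRawSelectors`, and fail the job when the result is non-empty, leaving the reviewer to decide whether a given CSS selector is a justified exception.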