· FRONTEND-2026-RADAR · 2026.05.05 · 11 MIN ·

Playwright MCP / CLI + SKILLs: the production-grade answer to AI-written tests

Starting with Playwright 1.58 in 2026, Microsoft is pushing a CLI+SKILLs mode whose token cost is 4× lower than pure MCP; Playwright has 45.1% market adoption and 33M weekly downloads. Walkthrough plus translator's notes. · by 思扬

Impact: starting with Playwright 1.58 in 2026, Microsoft is pushing the CLI+SKILLs mode, with token cost 4× lower than pure MCP; Playwright has 45.1% market adoption (Cypress 14.4%); 33M weekly npm downloads.
Practical density: 🟢 hands-on
Key dates: Playwright 1.59.1 (2026-04-01); Playwright MCP v0.0.73 (2026-05-01)

🔥 Impact card

Playwright 1.59.1 (2026-04-01) / 1.58.0 (2026-01-23, introduced CLI+SKILLs) / 1.57.0 (2025-11-25, switched to Chrome for Testing)
Playwright MCP v0.0.73 (2026-05-01)
Data: 33M weekly npm downloads (Cypress 6.5M); 95k★ on GitHub
Key skill repos: lackeyjb/playwright-skill + AgentMantis/test-skills + anthropics/skills

🎯 Why this is a must-read

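In 2024 the question was “can AI write tests?”, in 2025 it was “are they any good?”, and in 2026 it is “which workflow produces production-grade tests?”. The mainstream answer is Skill (conventions) + Playwright MCP/CLI (real context) + mandatory code review.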

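If your team still has an LLM write e2e tests straight from a prompt, 90% of those tests will contain hallucinated selectors and be dead within a week. This piece covers production-grade best practice.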

One-sentence summary

AI cannot write e2e tests from a bare prompt; the LLM needs real page context (MCP) + team conventions (Skill) + human review.

💎 Quote wall

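★ “CLI+SKILLs is 4× cheaper than pure MCP for coding agents.” (Microsoft Playwright team, 2026 announcement.) Translator's note: this is the reality of token cost. With pure MCP, every tool call takes multiple round trips (the LLM describes the call → the MCP server responds); CLI+SKILLs lets the agent spawn the process directly, compressing a tool call into a single round trip.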

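★ “Don’t let the model guess selectors. Give it real page snapshots.” (A key line from the Checkly methodology blog.) Translator's note: this is the single biggest root cause of AI-written tests failing. The LLM reads the requirements and blindly writes [data-testid="submit-button"] when the real page has no such testid. Only an ariaSnapshot lets the LLM see the real structure.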

📋 Core close reading

1. Installing Playwright MCP (VS Code / Cursor / Claude Desktop)

# One-line install for VS Code
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'

// Cursor / Claude Desktop config
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

🟢 Translator's note: once installed, the agent gets MCP tools such as browser_navigate / browser_click / browser_snapshot / browser_route (mock) / browser_console_messages.

2. Microsoft's 2026 recommendation: CLI + SKILLs mode (replacing pure MCP)

<!-- .claude/skills/playwright-test/SKILL.md -->
---
name: playwright-test
description: |
  Generate Playwright e2e tests with real page context.
  Use when user asks to write or update e2e tests.
---

# Playwright Test Generation

When asked to write a test:

1. Navigate to the target URL with `npx playwright codegen` or use page.goto
2. Get page structure via `page.ariaSnapshot()` — DON'T guess selectors
3. Write the test in `tests/<feature>.spec.ts`
4. Run `npx playwright test` to verify
5. If failures, debug with trace: `npx playwright trace show test-results/.../trace.zip`

## Rules
- Always use `page.getByRole`, `page.getByLabel`, `page.getByText` (a11y selectors)
- Avoid CSS selectors except when no a11y attribute exists
- One test per user flow, named after the flow

## Examples
- references/login-flow-test.spec.ts.md
- references/form-submit-test.spec.ts.md

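🟢 Translator's note: SKILL.md captures “team conventions” in a format the agent can read; the CLI (npx playwright) lets the agent spawn a process directly and get real results, using 4× fewer tokens than pure MCP tool calls.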
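
The Rules section of the SKILL.md above can also be checked mechanically on generated specs before a human reviews them. What follows is a minimal sketch, not part of Playwright or any of the skill repos: `findDisallowedSelectors` is a hypothetical helper, and its regex is a crude heuristic that flags `locator('...')` and `getByTestId('...')` calls so a reviewer can ask whether an a11y locator was available instead.

```typescript
// Hypothetical review helper (an assumption for illustration): flags locator
// calls that bypass the a11y-first rule (getByRole / getByLabel / getByText).
// The pattern is a rough heuristic, not a real TypeScript parser.

interface SelectorFinding {
  line: number; // 1-based line number in the spec source
  call: string; // the matched call, e.g. "locator('#submit')"
}

function findDisallowedSelectors(source: string): SelectorFinding[] {
  const findings: SelectorFinding[] = [];
  source.split('\n').forEach((text, index) => {
    // A fresh regex per line avoids carrying lastIndex across lines.
    const pattern = /\.(locator|getByTestId)\(\s*(['"])(.*?)\2/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(text)) !== null) {
      const quote = match[2];
      findings.push({
        line: index + 1,
        call: `${match[1]}(${quote}${match[3]}${quote})`,
      });
    }
  });
  return findings;
}

// Example input: one compliant line, two lines that should be flagged.
const generatedSpec = [
  "await page.getByRole('textbox', { name: 'Email' }).fill('a@b.co');",
  "await page.locator('#submit-button').click();",
  "await page.getByTestId('submit-button').click();",
].join('\n');

const findings = findDisallowedSelectors(generatedSpec);
// findings holds two entries: line 2 (locator) and line 3 (getByTestId)
```

A check like this only narrows what the reviewer has to look at; the SKILL.md rule still allows CSS selectors when no a11y attribute exists, so a finding is a prompt for review rather than an automatic failure.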

3. Key API: `page.ariaSnapshot()`

// Before the LLM writes a test, show it the real structure
import { test, expect } from '@playwright/test';

test('debug — see real page', async ({ page }) => {
  await page.goto('https://example.com');
  console.log(await page.ariaSnapshot());
  // Prints the a11y tree; the LLM writes selectors from it
});

The output looks roughly like this:

- main:
    - heading "Welcome" [level=1]
    - textbox "Email"
    - button "Sign In"

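🟢 Translator's note: this is the key to getting the LLM to write tests that actually run. When it sees textbox "Email", it knows to use page.getByRole('textbox', { name: 'Email' }) instead of guessing a testid.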
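
To make the step from snapshot to locator concrete, here is a minimal sketch. `snapshotToLocators` is a hypothetical helper, not a Playwright API, and it only handles the simple `- role "name"` lines in the sample output above; real snapshots carry more attributes and nesting than this handles.

```typescript
// Hypothetical helper (an assumption for illustration): turns simple
// aria-snapshot lines such as
//   - textbox "Email"
// into getByRole locator expressions. Lines without a quoted accessible
// name (for example "- main:") are skipped, since they give nothing to match on.

function snapshotToLocators(snapshot: string): string[] {
  const locators: string[] = [];
  for (const rawLine of snapshot.split('\n')) {
    // Optional indent, "- ", a lowercase role, then a double-quoted name.
    const match = /^\s*-\s+([a-z]+)\s+"([^"]*)"/.exec(rawLine);
    if (match === null) continue;
    const role = match[1];
    // Escape single quotes so the generated expression stays valid code.
    const name = (match[2] ?? '').replace(/'/g, "\\'");
    locators.push(`page.getByRole('${role}', { name: '${name}' })`);
  }
  return locators;
}

// The sample output from the section above, line for line.
const sampleSnapshot = [
  '- main:',
  '    - heading "Welcome" [level=1]',
  '    - textbox "Email"',
  '    - button "Sign In"',
].join('\n');

const locators = snapshotToLocators(sampleSnapshot);
// locators[1] is: page.getByRole('textbox', { name: 'Email' })
```

The point of the sketch is the direction of the data flow: every locator is derived from a line that exists in the snapshot, so nothing in the generated test refers to an element the page does not have.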

4. Three production-grade skill repos

Repo	Highlights	Install
anthropics/skills	Official; includes evals + A/B blind testing (updated 2026-03)	`/plugin marketplace add anthropics/skills`
lackeyjb/playwright-skill	The model decides on its own when to invoke it; writes and runs tests automatically	`/plugin install playwright-skill@playwright-skill`
AgentMantis/test-skills	5 lifecycle skills (init / pom / regression / handover / promote); supports 40+ agents	`npx skills add AgentMantis/test-skills --agent claude-code`

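🟢 Translator's note: AgentMantis/test-skills is the most complete. Its 5 skills cover the full lifecycle of an e2e suite: init (bootstrap the suite) / pom (write page objects) / regression (regression cases) / handover (handover docs) / promote (promote from staging to prod).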

5. Checkly's golden prompt template

You are a Playwright test generator and an expert in TypeScript,
Frontend development, and Playwright end-to-end testing.

Use MCP tools to navigate the site (don't assume).
Access page snapshots before interactions.
Generate tests only after completing all steps.
Verify with `npx playwright test`.

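🟢 Translator's note: the words “don’t assume” are the key. The LLM's default behaviour is to guess from training data; this prompt forces it to call the tools and look at the real page first.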
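
When the agent has no MCP tools available, one way to keep the spirit of the template is to refuse to build a generation prompt without a real snapshot. The sketch below is an assumption for illustration: `buildTestPrompt` and its parameters are hypothetical, and embedding the snapshot in the prompt is a variation on the template (which asks the model to fetch snapshots through tools), not something the Checkly post prescribes.

```typescript
// The template text from the section above, kept verbatim.
const BASE_PROMPT = [
  'You are a Playwright test generator and an expert in TypeScript,',
  'Frontend development, and Playwright end-to-end testing.',
  '',
  "Use MCP tools to navigate the site (don't assume).",
  'Access page snapshots before interactions.',
  'Generate tests only after completing all steps.',
  'Verify with `npx playwright test`.',
].join('\n');

// Hypothetical helper: builds the final prompt and fails loudly when the
// caller has not captured a snapshot, so the model is never asked to guess.
function buildTestPrompt(task: string, url: string, snapshot: string): string {
  if (snapshot.trim() === '') {
    throw new Error('Refusing to build a prompt without a page snapshot');
  }
  return [
    BASE_PROMPT,
    '',
    `Task: ${task}`,
    `Target URL: ${url}`,
    'Current page snapshot:',
    snapshot,
  ].join('\n');
}

const prompt = buildTestPrompt(
  'Write a login test',
  'https://example.com',
  '- textbox "Email"\n- button "Sign In"',
);
```

Failing early keeps the "snapshot first" rule out of the model's hands: the pipeline cannot reach the generation step with an empty context.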

6. New in Playwright 1.59 (hands-on)

// CLI debugging for agents (shell command)
npx playwright test --debug=cli

// async disposables (Playwright 1.59)
import { test, chromium } from '@playwright/test';

test('cleanup automatic', async () => {
  await using browser = await chromium.launch();
  await using context = await browser.newContext();
  // ... both are closed automatically when they go out of scope
});

// Multiple clients sharing one browser
const browser = await chromium.connect({ wsEndpoint });
const ctxA = await browser.bind('A');
const ctxB = await browser.bind('B');

7. Vitest 4 Browser Mode (another option in place of e2e)

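// vitest.config.ts
import { defineConfig } from 'vitest/config';
// Vitest 4 ships each browser provider as its own package
import { playwright } from '@vitest/browser-playwright';

export default defineConfig({
  test: {
    browser: {
      enabled: true,
      provider: playwright(),
      instances: [{ browser: 'chromium' }, { browser: 'firefox' }],
    },
  },
});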

// Visual regression assertion
await expect(locator).toMatchScreenshot();

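🟢 Translator's note: Vitest 4 Browser Mode went stable (2025-10-22), with first-class Playwright Trace integration. For component tests, Vitest browser mode beats Playwright (it is lighter); for full e2e, Playwright is still the choice.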

🟢 Translator's overall take

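Change now: consolidate all of the team's e2e prompt sheets into .claude/skills/playwright-test/SKILL.md and run skills-ref validate
Must install: Playwright MCP (local debugging) + AgentMantis/test-skills (team conventions)
Hard rule: the LLM may not write selectors directly; it must call page.ariaSnapshot() first
PR gate: add a CI step running npx playwright test --reporter=html; failing AI-written tests may not be merged
Don't: blindly upgrade Cypress to 15. Leave old projects alone if nothing hurts; start new projects on Playwright (it wins on cost, performance, and AI integration)
Further reading: SKILL.md (how to write skills) + the MCP protocol + TypeScript 7 Corsa (faster type checking for TS test code)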
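
The PR gate in the list above can be sketched as a small decision function. This sketch swaps the HTML reporter for Playwright's JSON reporter, because a script needs machine-readable counts; it assumes the report's `stats` object carries `expected`, `unexpected`, `flaky`, and `skipped`. The name `shouldBlockMerge`, the optional flaky policy, and the "no tests ran" rule are illustrative assumptions, not something the article prescribes.

```typescript
// Subset of the Playwright JSON report that this sketch reads.
interface ReportStats {
  expected: number;   // tests that passed as expected
  unexpected: number; // tests that failed
  flaky: number;      // tests that passed only on retry
  skipped: number;
}

interface GateDecision {
  block: boolean;
  reason: string;
}

// Hypothetical policy: block on any failure, optionally on flaky tests,
// and also when nothing ran (an empty suite should not count as green).
function shouldBlockMerge(stats: ReportStats, blockOnFlaky = false): GateDecision {
  if (stats.unexpected > 0) {
    return { block: true, reason: `${stats.unexpected} failing test(s)` };
  }
  if (blockOnFlaky && stats.flaky > 0) {
    return { block: true, reason: `${stats.flaky} flaky test(s)` };
  }
  if (stats.expected === 0) {
    return { block: true, reason: 'no tests ran' };
  }
  return { block: false, reason: 'all tests passed' };
}

const decision = shouldBlockMerge({ expected: 12, unexpected: 1, flaky: 0, skipped: 0 });
// decision.block is true; decision.reason is "1 failing test(s)"
```

In CI, the stats would come from the file written by the JSON reporter, and the job would exit non-zero whenever `block` is true; the HTML report can still be produced alongside it for human review.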