WebMCP
当浏览或自动化通过 WebMCP API(window.navigator.modelContext)公开工具的网页时,应使用此技能。它教会代理如何在网站上发现、检查并调用 WebMCP 工具,而不是依赖 DOM 抓取或 UI 操作。
作者:brunobuddy · 最新版本:1.0.0
收藏:1 · 下载:1k
说明文档
# WebMCP — Discover and Use Website Tools
## What is WebMCP
WebMCP is a browser API that lets websites expose JavaScript functions as structured tools for AI agents. Pages register tools via `window.navigator.modelContext`, each with a name, description, JSON Schema input, and an `execute` callback. Think of it as an MCP server running inside the web page itself.
Spec: https://github.com/webmachinelearning/webmcp
## Detecting WebMCP Support
Before interacting with tools, check whether the current page supports WebMCP:
```js
const supported = "modelContext" in window.navigator;
```
If `false`, the page does not expose WebMCP tools — fall back to DOM interaction or actuation.
## Discovering Available Tools
Tools are registered by the page via `provideContext()` or `registerTool()`. The browser mediates access. To list available tools from an agent's perspective, evaluate:
```js
// Browser-specific — the exact discovery API depends on the agent runtime.
// Typically the browser exposes registered tools to connected agents automatically.
// From page-script perspective, tools are registered like this:
window.navigator.modelContext.provideContext({
tools: [
{
name: "tool-name",
description: "What this tool does",
inputSchema: { type: "object", properties: { /* ... */ }, required: [] },
execute: (params, agent) => { /* ... */ }
}
]
});
```
Key points:
- Each tool has `name`, `description`, `inputSchema` (JSON Schema), and `execute`.
- `provideContext()` replaces all previously registered tools (useful for SPA state changes).
- `registerTool()` / `unregisterTool()` add/remove individual tools without resetting.
- Tools may change as the user navigates or as SPA state updates — re-check after page transitions.
## Tool Schema Format
Tool input schemas follow JSON Schema (aligned with MCP SDK and Prompt API tool use):
```js
{
name: "add-stamp",
description: "Add a new stamp to the collection",
inputSchema: {
type: "object",
properties: {
name: { type: "string", description: "The name of the stamp" },
year: { type: "number", description: "Year the stamp was issued" },
imageUrl: { type: "string", description: "Optional image URL" }
},
required: ["name", "year"]
},
execute({ name, year, imageUrl }, agent) {
// Implementation — updates UI and app state
return {
content: [{ type: "text", text: `Stamp "${name}" added.` }]
};
}
}
```
## Invoking Tools
When connected as an agent, send a tool call by name with parameters matching `inputSchema`. The `execute` callback runs on the page's main thread, can update the UI, and returns a structured response:
```js
// Response format from execute():
{
content: [
{ type: "text", text: "Result description" }
]
}
```
- Tools run sequentially on the main thread (one at a time).
- `execute` may be async (returns a Promise).
- The second parameter `agent` provides `agent.requestUserInteraction()` for user confirmation flows.
## User Interaction During Tool Execution
Tools can request user confirmation before sensitive actions:
```js
async function buyProduct({ product_id }, agent) {
const confirmed = await agent.requestUserInteraction(async () => {
return confirm(`Buy product ${product_id}?`);
});
if (!confirmed) throw new Error("Cancelled by user.");
executePurchase(product_id);
return { content: [{ type: "text", text: `Product ${product_id} purchased.` }] };
}
```
Always respect user denials — do not retry cancelled tool calls.
## Agent Workflow
1. Navigate to the target website.
2. Check `"modelContext" in window.navigator` to confirm WebMCP support.
3. Discover registered tools (names, descriptions, schemas).
4. Select the appropriate tool based on the user's goal and the tool description.
5. Invoke with correct parameters matching `inputSchema`.
6. Read the structured response and relay results to the user.
7. After SPA navigation or state changes, re-discover tools — the set may have changed.
8. If no WebMCP tool fits the task, fall back to DOM-based interaction.
## Important Constraints
- **Browser context required** — tools only exist in a live browsing context (tab/webview), not headlessly.
- **Sequential execution** — tool calls run one at a time on the main thread.
- **No cross-origin tool sharing** — tools are scoped to the page that registered them.
- **Permission-gated** — the browser may prompt the user before allowing tool access.
- **Tools are dynamic** — SPAs may register/unregister tools based on UI state.