WebMCP

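Use this skill when browsing or automating web pages that expose tools through the WebMCP API (window.navigator.modelContext). It teaches agents to discover, inspect, and invoke WebMCP tools on a website instead of relying on DOM scraping or UI actuation.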

Author: brunobuddy · Latest version: 1.0.0

Favorites: 1 · Downloads: 1k

# WebMCP — Discover and Use Website Tools

## What is WebMCP

WebMCP is a browser API that lets websites expose JavaScript functions as structured tools for AI agents. Pages register tools via `window.navigator.modelContext`, each with a name, description, JSON Schema input, and an `execute` callback. Think of it as an MCP server running inside the web page itself.

Spec: https://github.com/webmachinelearning/webmcp

## Detecting WebMCP Support

Before interacting with tools, check whether the browser exposes the WebMCP API on the current page:

```js
const supported = "modelContext" in window.navigator;
```

If `false`, WebMCP is not available here, so fall back to DOM interaction or actuation. A `true` result only means the API exists; the page may still have registered no tools.
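The one-line check above throws a `ReferenceError` where `window` is undefined (for example in a worker or a Node-based test harness). A slightly more defensive sketch follows; `hasWebMCP` is a hypothetical helper name, and the navigator object is passed in so the check can be exercised outside a browser:

```js
// Hypothetical helper: returns true only when the given navigator-like
// object exposes a non-null `modelContext` property.
function hasWebMCP(nav) {
  return (
    nav != null &&
    typeof nav === "object" &&
    "modelContext" in nav &&
    nav.modelContext != null
  );
}

// In a page, pass the real navigator; elsewhere `nav` stays undefined.
const nav = typeof window !== "undefined" ? window.navigator : undefined;
console.log(
  hasWebMCP(nav) ? "WebMCP available" : "No WebMCP: fall back to DOM interaction"
);
```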

## Discovering Available Tools

Tools are registered by the page via `provideContext()` or `registerTool()`, and the browser mediates access to them. How an agent lists the registered tools depends on the agent runtime; the snippet below shows the page-side registration that produces them:

```js
// Browser-specific — the exact discovery API depends on the agent runtime.
// Typically the browser exposes registered tools to connected agents automatically.
// From page-script perspective, tools are registered like this:
window.navigator.modelContext.provideContext({
  tools: [
    {
      name: "tool-name",
      description: "What this tool does",
      inputSchema: { type: "object", properties: { /* ... */ }, required: [] },
      execute: (params, agent) => { /* ... */ }
    }
  ]
});
```

Key points:
- Each tool has `name`, `description`, `inputSchema` (JSON Schema), and `execute`.
- `provideContext()` replaces all previously registered tools (useful for SPA state changes).
- `registerTool()` / `unregisterTool()` add/remove individual tools without resetting.
- Tools may change as the user navigates or as SPA state updates — re-check after page transitions.
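The difference between replacing and incrementally editing the tool set can be illustrated with a toy in-memory model. This is not the browser implementation: `ToyModelContext` and its `names()` method are hypothetical, and rejecting duplicate names in `registerTool()` is an assumption of this sketch.

```js
// Toy model of the registration semantics listed above.
// It only mirrors the replace-versus-add behavior; it is not the real API.
class ToyModelContext {
  constructor() {
    this.tools = new Map(); // tool name -> tool definition
  }

  // Replaces the whole tool set, as provideContext() is described to do.
  provideContext({ tools = [] } = {}) {
    this.tools = new Map(tools.map((tool) => [tool.name, tool]));
  }

  // Adds one tool without touching the others.
  registerTool(tool) {
    if (this.tools.has(tool.name)) {
      // Assumption of this sketch: duplicate names are rejected.
      throw new Error(`Tool "${tool.name}" is already registered`);
    }
    this.tools.set(tool.name, tool);
  }

  // Removes one tool by name and leaves the rest in place.
  unregisterTool(name) {
    this.tools.delete(name);
  }

  // Hypothetical inspection helper for this demo.
  names() {
    return [...this.tools.keys()];
  }
}

const ctx = new ToyModelContext();
ctx.provideContext({ tools: [{ name: "search" }, { name: "add-to-cart" }] });
ctx.registerTool({ name: "checkout" });
console.log(ctx.names().join(", ")); // search, add-to-cart, checkout

// A second provideContext() call drops everything registered before it.
ctx.provideContext({ tools: [{ name: "track-order" }] });
console.log(ctx.names().join(", ")); // track-order
```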

## Tool Schema Format

Tool input schemas follow JSON Schema (aligned with MCP SDK and Prompt API tool use):

```js
{
  name: "add-stamp",
  description: "Add a new stamp to the collection",
  inputSchema: {
    type: "object",
    properties: {
      name: { type: "string", description: "The name of the stamp" },
      year: { type: "number", description: "Year the stamp was issued" },
      imageUrl: { type: "string", description: "Optional image URL" }
    },
    required: ["name", "year"]
  },
  execute({ name, year, imageUrl }, agent) {
    // Implementation — updates UI and app state
    return {
      content: [{ type: "text", text: `Stamp "${name}" added.` }]
    };
  }
}
```
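Before calling a tool, an agent can sanity-check its arguments against the declared `inputSchema`. The sketch below is a minimal, hand-rolled check that covers only `required` and single-string `type` keywords on a flat object; `checkArgs` is a hypothetical helper, and a real agent would more likely rely on a full JSON Schema validator.

```js
// Returns a list of human-readable problems; an empty list means the
// arguments passed this (deliberately limited) check.
function checkArgs(inputSchema, args) {
  const errors = [];
  const properties = inputSchema.properties ?? {};

  for (const key of inputSchema.required ?? []) {
    if (!(key in args)) errors.push(`missing required property "${key}"`);
  }

  for (const [key, value] of Object.entries(args)) {
    const expected = properties[key]?.type;
    // Unknown properties and non-string `type` values are not checked here.
    if (typeof expected !== "string") continue;
    const actual = Array.isArray(value)
      ? "array"
      : value === null
        ? "null"
        : typeof value;
    const ok =
      expected === "integer" ? Number.isInteger(value) : actual === expected;
    if (!ok) errors.push(`property "${key}" should be ${expected}, got ${actual}`);
  }
  return errors;
}

// Schema copied from the add-stamp example above (descriptions omitted).
const addStampSchema = {
  type: "object",
  properties: {
    name: { type: "string" },
    year: { type: "number" },
    imageUrl: { type: "string" }
  },
  required: ["name", "year"]
};

console.log(checkArgs(addStampSchema, { name: "Penny Black", year: 1840 }).length); // 0
console.log(checkArgs(addStampSchema, { name: "Penny Black", year: "1840" })[0]);
// property "year" should be number, got string
```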

## Invoking Tools

When connected as an agent, send a tool call by name with parameters matching `inputSchema`. The `execute` callback runs on the page's main thread, can update the UI, and returns a structured response:

```js
// Response format from execute():
{
  content: [
    { type: "text", text: "Result description" }
  ]
}
```

- Tools run sequentially on the main thread (one at a time).
- `execute` may be async (returns a Promise).
- The second parameter `agent` provides `agent.requestUserInteraction()` for user confirmation flows.
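On the agent side, handling these responses might look like the sketch below. Both helper names are hypothetical, and `invoke` stands in for whatever call mechanism the agent runtime provides; the document does not define one.

```js
// Joins the text parts of a tool result in the shape shown above,
// ignoring any parts that are not of type "text".
function textFromResult(result) {
  const parts = Array.isArray(result?.content) ? result.content : [];
  return parts
    .filter((part) => part?.type === "text" && typeof part.text === "string")
    .map((part) => part.text)
    .join("\n");
}

// Issues calls strictly one after another, matching the sequential
// execution model. `invoke` is a hypothetical runtime-supplied function:
// (name, args) => result or Promise<result>.
async function callSequentially(calls, invoke) {
  const outputs = [];
  for (const { name, args } of calls) {
    outputs.push(textFromResult(await invoke(name, args)));
  }
  return outputs;
}
```

Awaiting each call before issuing the next keeps the agent from queueing work against page state that an earlier call may have changed.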

## User Interaction During Tool Execution

Tools can request user confirmation before sensitive actions:

```js
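async function buyProduct({ product_id }, agent) {
  // Ask the user to confirm; the callback's return value is passed back here.
  const confirmed = await agent.requestUserInteraction(async () => {
    return confirm(`Buy product ${product_id}?`);
  });
  // A denial surfaces to the agent as a failed tool call.
  if (!confirmed) throw new Error("Cancelled by user.");
  // executePurchase is the page's own purchase logic (not shown here).
  executePurchase(product_id);
  return { content: [{ type: "text", text: `Product ${product_id} purchased.` }] };
}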
```

Always respect user denials — do not retry cancelled tool calls.
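When several tools need the same confirm-then-act pattern, the page can factor it out. The sketch below assumes `agent.requestUserInteraction()` behaves as in the example above (it runs the callback and resolves to its return value); `withConfirmation`, `ask`, and `action` are hypothetical names.

```js
// Wraps an action so it only runs after the user confirms.
// `ask(params)` returns (or resolves to) true or false;
// `action(params)` returns (or resolves to) the result text.
function withConfirmation(ask, action) {
  return async function execute(params, agent) {
    const confirmed = await agent.requestUserInteraction(async () => ask(params));
    if (!confirmed) throw new Error("Cancelled by user.");
    const text = await action(params);
    return { content: [{ type: "text", text }] };
  };
}

// Usage: an execute callback equivalent in shape to buyProduct above.
// The real purchase logic would go inside the second function.
const buyWithConfirmation = withConfirmation(
  ({ product_id }) => confirm(`Buy product ${product_id}?`),
  ({ product_id }) => `Product ${product_id} purchased.`
);
```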

## Agent Workflow

1. Navigate to the target website.
2. Check `"modelContext" in window.navigator` to confirm WebMCP support.
3. Discover registered tools (names, descriptions, schemas).
4. Select the appropriate tool based on the user's goal and the tool description.
5. Invoke with correct parameters matching `inputSchema`.
6. Read the structured response and relay results to the user.
7. After SPA navigation or state changes, re-discover tools — the set may have changed.
8. If no WebMCP tool fits the task, fall back to DOM-based interaction.
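The workflow above can be sketched as a single function. Every hook here (`discoverTools`, `pickTool`, `invokeTool`, `domFallback`) is a hypothetical callback supplied by the agent runtime, since discovery and invocation are runtime-specific; navigation (step 1) and re-discovery (step 7) are left to the caller.

```js
// Runs steps 2 to 6 and 8 of the workflow for one goal.
async function runWithWebMCP(hooks, goal) {
  const { nav, discoverTools, pickTool, invokeTool, domFallback } = hooks;

  // Step 2: feature-detect WebMCP on the navigator object.
  if (nav == null || !("modelContext" in nav)) return domFallback(goal);

  // Step 3: discover the currently registered tools.
  const tools = await discoverTools();

  // Step 4: choose a tool; pickTool returns { name, args } or null.
  const choice = pickTool(tools, goal);

  // Step 8: nothing fits, so use DOM-based interaction instead.
  if (choice == null) return domFallback(goal);

  // Steps 5 and 6: invoke and hand the result back to the caller.
  return invokeTool(choice.name, choice.args);
}
```

Passing the hooks in keeps the control flow testable with stubs and independent of any particular browser or agent runtime.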

## Important Constraints

- **Browser context required**: tools exist only while the page is loaded in a live browsing context (tab or webview); they cannot be called without opening the page.
- **Sequential execution** — tool calls run one at a time on the main thread.
- **No cross-origin tool sharing** — tools are scoped to the page that registered them.
- **Permission-gated** — the browser may prompt the user before allowing tool access.
- **Tools are dynamic** — SPAs may register/unregister tools based on UI state.
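
Because the tool set is dynamic, an agent that caches tool lists may want to compare snapshots taken before and after a navigation or state change. A minimal sketch, where `diffTools` is a hypothetical helper and each snapshot is an array of objects with a `name` field:

```js
// Compares two tool snapshots by name and reports what changed.
function diffTools(before, after) {
  const prev = new Set(before.map((tool) => tool.name));
  const next = new Set(after.map((tool) => tool.name));
  return {
    added: [...next].filter((name) => !prev.has(name)),
    removed: [...prev].filter((name) => !next.has(name))
  };
}

const change = diffTools(
  [{ name: "search" }, { name: "add-to-cart" }],
  [{ name: "search" }, { name: "checkout" }]
);
console.log(change.added.join(", "));   // checkout
console.log(change.removed.join(", ")); // add-to-cart
```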