WebMCP: A Bomb Google Planted in Chrome 146

AI Agents no longer need to "pretend to be human" to browse the web.


Google has quietly launched an early preview of WebMCP in Chrome 146, which can be enabled via a flag.


And this thing could fundamentally rewrite how AI Agents interact with web pages.

The preview lets AI Agents query and execute a site's services directly, without having to browse the web like a human user. Services can be declared via the imperative navigator.modelContext API or through declarative forms.

And this, in developer Alex Volkov's words, is like "an API within the UI."

This is really interesting.

WebMCP is a new standard that lets web developers expose a set of tools directly to AI Agents and agentic browsers, so they no longer need to click buttons but can call functions on the website directly.

Current Agents

The way AI Agents currently operate web pages is essentially simulating a human user: taking screenshots, identifying where buttons are, clicking, filling forms, waiting for page loads...

It's like hiring a genius assistant but making them operate the computer blindfolded, relying on a constant stream of screenshots to "see" what's on the screen.

The result is: slow, expensive, and fragile...

When a website redesigns, the Agent is lost.

A simple search operation might consume thousands of tokens to process screenshots and DOM parsing.

WebMCP's approach is completely different: make the website actively tell the Agent "what I can do."

Two Exposure Methods

WebMCP offers developers two paths.

Imperative API

Register tool functions via JavaScript's navigator.modelContext.registerTool(). For example, an e-commerce site could register a search_products tool; the AI Agent, upon discovery, directly passes keywords to call it and receives structured product data—no screenshots, no DOM parsing, no simulated clicks on the search box.
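The registerTool API is still iterating, so the following is only a sketch of how this flow might look. The mock navigator.modelContext object stands in for the real browser-provided API, and the tool's name, handler, and product data are all invented for illustration:

```javascript
// Mock of navigator.modelContext -- in Chrome 146 the browser provides
// this object; here it only stands in so the sketch is self-contained.
const navigator = {
  modelContext: {
    _tools: new Map(),
    registerTool(tool) {
      this._tools.set(tool.name, tool);
    },
  },
};

// An e-commerce site registers a product-search tool. The handler is
// hypothetical; a real site would query its own backend here.
navigator.modelContext.registerTool({
  name: "search_products",
  description: "Search the catalog and return structured product data",
  inputSchema: {
    type: "object",
    properties: { query: { type: "string" } },
    required: ["query"],
  },
  async execute({ query }) {
    // Structured JSON instead of a rendered page.
    return {
      products: [{ id: 1, name: `Result for "${query}"`, price: 9.99 }],
    };
  },
});

// An agent that discovers the tool calls it directly -- no screenshots,
// no DOM parsing, no simulated clicks on the search box.
navigator.modelContext._tools
  .get("search_products")
  .execute({ query: "mechanical keyboard" })
  .then((result) => console.log(result.products[0].name));
// prints: Result for "mechanical keyboard"
```

The agent receives a small JSON object it can reason over directly, which is where the token savings discussed below come from.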

Declarative Forms

By annotating HTML form elements, the Agent automatically understands interactive capabilities on the page. This method is simpler, suitable for lightweight scenarios.
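The declarative markup has not been finalized, so purely as an illustration (every tool-related attribute name below is hypothetical, not from the spec), an annotated search form might look something like this:

```html
<!-- Hypothetical annotation: the attribute names are illustrative only;
     the actual declarative syntax is still being specified. -->
<form action="/search" method="get"
      toolname="search_products"
      tooldescription="Search the catalog by keyword">
  <input name="q" type="search" required>
  <button type="submit">Search</button>
</form>
```

The idea is that the form the site already ships for humans doubles as the tool description for the Agent, with no JavaScript required.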

The two methods can be mixed.

Complex sites can use the imperative API for fine-grained control, while simpler sites adopt declarative forms for quick integration, maximizing flexibility.

Extremely Token-Efficient

According to early test data, WebMCP's structured tool calls can cut token consumption by up to 89% compared to screenshot-based Agent interaction.

This means that while it might have taken 2000 tokens to process a screenshot to "understand" a page, now a JSON response of 20-100 tokens gets the job done.

And no screenshot verification is needed: the tool's return value confirms the result directly.
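As a back-of-the-envelope check, the quoted 89% figure is consistent with an interaction of roughly 220 tokens replacing a 2,000-token screenshot step. The per-call breakdown below is assumed, not from the source:

```javascript
// Rough sanity check on the "up to 89%" savings claim.
const screenshotTokens = 2000;  // cost of processing one screenshot (quoted)
const requestTokens = 120;      // assumed tool-call request overhead
const jsonResponseTokens = 100; // worst case of the quoted 20-100 range

const savings =
  1 - (requestTokens + jsonResponseTokens) / screenshotTokens;
console.log(`${Math.round(savings * 100)}% fewer tokens`);
// prints: 89% fewer tokens
```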

Microsoft and Google Join Forces

Moreover, WebMCP isn't just Google playing alone.

Microsoft's Edge team had independently proposed a "Web Model Context" scheme, and the Chrome team had a similar "Script Tools" proposal.

When the two teams compared notes, they found the proposals overlapped, so they merged them into a unified WebMCP proposal under the W3C Web Machine Learning Community Group.

Kyle Pflug, Product Manager for the Microsoft Edge platform, stated:

WebMCP lets web pages expose MCP tools to Agents, similar to tools exposed by traditional MCP servers, but without requiring a separate server component. This is naturally suited for "human-in-the-loop" scenarios because it runs in the browser's browsing context, which can simplify state and authentication—something very tricky in traditional browsing Agent schemes.

Simply put: The web page itself becomes an MCP server, but without actually running a server.

How Authentication Works

You might wonder: How does authentication work? Will it reuse the user's existing login session?

The answer is: Yes, exactly.

WebMCP runs in the browser's browsing context, naturally inheriting the user's current authentication session and the browser's same-origin security model. The tools the Agent calls have exactly the same permissions as manual user operations, requiring no additional OAuth flow or API Key.
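A sketch of what this means in practice, again using mocks to stand in for the browser environment; the /api/orders endpoint and its response data are invented:

```javascript
// Mock navigator.modelContext -- the browser provides the real one.
const navigator = {
  modelContext: {
    _tools: new Map(),
    registerTool(tool) {
      this._tools.set(tool.name, tool);
    },
  },
};

// Mock fetch: in a real page, the browser attaches the user's existing
// session cookie to this same-origin request automatically -- no OAuth
// flow, no API key. The order data here is invented.
const fetch = async (url, opts) => ({
  json: async () => ({ orders: [{ id: 42, status: "shipped" }] }),
});

// The tool handler runs in the page's own browsing context, so it sees
// exactly what the signed-in user would see -- no more, no less.
navigator.modelContext.registerTool({
  name: "get_my_orders",
  description: "List the signed-in user's recent orders",
  inputSchema: { type: "object", properties: {} },
  async execute() {
    const res = await fetch("/api/orders", { credentials: "same-origin" });
    return res.json();
  },
});

navigator.modelContext._tools
  .get("get_my_orders")
  .execute()
  .then((r) => console.log(r.orders[0].status));
// prints: shipped
```

Because the call is scoped by the same-origin model, an Agent invoking the tool has identical permissions to a manual click by the logged-in user.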

This is far simpler than traditional server-side MCP schemes.

Kyle Pflug also confirmed that they expect "some websites to use both WebMCP and traditional MCP servers" because they serve different scenarios: WebMCP is suitable for browser scenarios with users present, while traditional MCP is suitable for headless server-side scenarios.

Humans and AI

WebMCP's design philosophy has a clear red line: Agents are assistants, not replacements.

The official documentation lists several principles:

  • The webpage's human-facing interface remains primary; WebMCP will not replace your UI

  • AI Agents enhance rather than replace human interaction

  • All operations by the Agent remain visible and controllable to the user

  • Humans and AI collaborate, rather than AI working alone

Therefore, WebMCP does not support headless browsing, fully autonomous Agents, or backend service integration. It is designed precisely for the scenario where the user is sitting in front of the browser, with the Agent helping alongside.

The Future of a Two-Layer Web

As mainstream browsers begin to natively support structured interaction between AI Agents and web pages, an interesting change is happening: websites may need to split into two layers.

The human-facing layer: Visual, brand-focused, narrative-driven.

The Agent-facing layer: Structured, schema-driven, fast-response.

Perhaps it's time to discuss "Agent SEO":

How friendly your website is to AI Agents might become a new dimension of competition; websites that don't expose WebMCP tools might gradually become "invisible" to Agents.

WebMCP is still at a very early stage: the API design is iterating, and the Chrome 146 implementation requires manually enabling a flag. But the direction may already be self-evident:

The browser is no longer just a tool for humans; it is simultaneously becoming the operating system for AI Agents.

