Google DeepMind has released a browser that can generate entire websites in real time using Gemini 3.1 Flash-Lite.
You type a sentence, and it instantly "writes" the webpage for you right before your eyes.
With this move, Google has essentially confirmed my previous argument with a single demo.
Let's Look at the Demo First
Google's Flash-Lite Browser looks like an ordinary browser, except the address bar has been transformed into an input box. Instead of entering a URL, you simply tell it what you want to see.
Google engineers stated in the video:
What you see in this browser is not a real website. It is generated from scratch by Gemini 3.1 Flash-Lite.
In the demo, the user entered "a guide to watering my cheese plant".
And then... the browser started "generating".
The tab displayed "Generating...", while the address bar showed PlantCare > Watering Your Monstera. 1,068 tokens, 1.93 seconds, and a complete plant care webpage materialized on the screen from nothing.
The generated page featured a navigation bar, icons, a multi-column layout, along with watering frequency, temperature requirements, and soil checking instructions.
Here is the key detail:
When you click "Search" in the navigation, it generates a search page on the spot. However, there is no actual search functionality behind this search box. The engineer explained:
There is no real search function in the search box. We send the current page and the entered text to the model together. It understands what should be displayed next, then rewrites the complete code to "imagine" the next step.
In other words, every click and every input you make on the page triggers the model to regenerate the entire page's code.
There is no pre-generated data, no history. The model infers what the next page should look like based on the current page and the elements you interact with.
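The loop described above can be sketched in a few lines. Note that `call_model` is a hypothetical stand-in for a Gemini Flash-Lite API call, and the prompt wording is illustrative; the browser's actual prompt format has not been published:

```python
def call_model(prompt: str) -> str:
    # Placeholder for a real Gemini Flash-Lite call (streamed in practice).
    return "<html><body><h1>PlantCare</h1></body></html>"

def next_page(current_html: str, interaction: str) -> str:
    """Every click or input sends the entire current page plus the
    interacted element to the model, which rewrites the page from
    scratch: no server, no database, no history."""
    prompt = (
        "You are a browser. The page currently on screen is:\n"
        f"{current_html}\n"
        f"The user just interacted with: {interaction}\n"
        "Output the complete HTML for the next page."
    )
    return call_model(prompt)

page = next_page("<html>...</html>", 'clicked "Search" in the nav bar')
```

The striking design choice is what is absent: there is no state other than the page itself. The current HTML is the whole memory of the session.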
2,122 tokens, 4.86 seconds.
A complete multi-page website, from intent to rendering, entirely in real-time.
The most interesting part was the final demo: the user asked it to generate "the most annoying website on the internet".
It actually generated a page that read "WELCOME TO THE CHAOS", with red dashed borders, a large green button saying "CLICK ME IF YOU CAN!", and a purple banner at the top warning "DON'T CLICK ANYTHING!".
2,031 tokens, 5.24 seconds. Complete with a touch of AI-style humor.
Moreover, because the code is streamed, the page begins rendering while it is still being generated, so the latency users perceive is shorter than the actual generation time.
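A toy simulation shows why streaming shrinks perceived latency: the first paintable content arrives long before the final chunk. The chunking and page content below are made up for illustration:

```python
def stream_chunks(html: str, chunk_size: int = 40):
    """Stand-in for a streaming model response: yields the page in chunks."""
    for i in range(0, len(html), chunk_size):
        yield html[i:i + chunk_size]

page = "<html><body>" + "<p>Water weekly in summer.</p>" * 20 + "</body></html>"
rendered = ""
first_visible_chunk = None
total_chunks = 0
for n, chunk in enumerate(stream_chunks(page), start=1):
    rendered += chunk  # the browser re-renders the partial document each time
    total_chunks = n
    if first_visible_chunk is None and "<body>" in rendered:
        first_visible_chunk = n  # first paintable content, well before the end
```

Here the body becomes visible in the very first chunk of sixteen, which is the whole trick: users start reading while the model is still writing.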
The engineer also mentioned that if you want a more refined UI, you can switch to the Flash or Pro models. But for real-time response scenarios like this, Flash-Lite's speed advantage is decisive.
Three-Layer Differentiation
In my previous article "GUI Is Dying, CLI Is Everything", I discussed how the CLI-Anything project enables Agents to control all desktop software via command line. Last week, I wrote "OpenCLI: Everything Can Be CLI", extending this concept to websites and Electron applications.
In these two articles, my core argument was:
GUI is essentially a translation layer. Humans spent 40 years wrapping computers in graphical interfaces, but Agents don't need this layer of translation at all.
Google's demo this time validates this from another direction.
What it proves is: GUI doesn't even need to be "pre-designed" anymore.
What is the traditional Web? Developers pre-write HTML/CSS/JavaScript, deploy it to servers, and users request static or dynamic pages in return.
The premise of this entire process is: someone designed the interface in advance.
Flash-Lite Browser eliminates this premise. Pages are no longer "retrieved from a server" but "written live in front of you".
If you look at this alongside the revival of CLI, you'll find that interfaces are undergoing a three-layer differentiation:
Bottom Layer: CLI becomes the runtime for Agents. Agents control computers using command lines, with text input, structured output, composability, and strong determinism. This is their native language.
Middle Layer: Protocols become the communication standard for Agents. MCP connects Agents to tools, AG-UI connects Agents to users, and A2A connects Agents to Agents. A complete protocol triangle is taking shape.
Top Layer: GUI becomes the output of AI. Interfaces are no longer pre-drawn by human designers but are generated in real-time by AI based on intent. You get exactly what you ask for.
Who Still Needs GUI?
One thing needs to be made clear here: GUI hasn't truly "died".
It has simply changed owners.
Previously, GUI was for humans. People controlled computers by clicking buttons and filling out forms. But now? Humans just speak natural language to AI.
CLI is for Agents, while GUI has flipped into something AI shows to humans.
This reversal is genuinely thought-provoking.
It's like when we use AskUserQuestion in Claude Code. When an Agent needs human confirmation during task execution, what does it do? It pops up a text question for the human to answer.
This is essentially a minimalist GUI, except the initiator has changed from human to AI.
Google's Flash-Lite Browser pushes this logic to the extreme: AI doesn't just pop up a question; it directly generates an entire webpage interface for you.
You say you want to see a plant watering guide, and it renders a complete plant care website with navigation, search, and columns.
Previously, humans operated GUI to command computers. Now, AI generates GUI to display information to humans.
The direction of interaction has reversed.
Your Terminal Is Already an AI Runtime
The 2025 Stack Overflow Developer Survey shows that 78% of professional developers spend more than half their work time in the terminal.
In 2023, this number was 62%.
Claude Code was released in February 2025 and reached $1 billion ARR by November. A SemiAnalysis report from February 2026 shows that 4% of public commits on GitHub were generated by Claude Code.
Faros AI surveyed 99 professional developers; 59% use Claude Code, with satisfaction ranking first.
The trend behind these numbers indicates: The terminal is transforming from "a place to execute commands" to "a place where you delegate work to AI".
IDEs are designed for "suggestions": you write code, they offer completions. CLI Agents are designed for "delegation": you state requirements, they do the work. These are two different categories.
Research also shows that the factor determining whether developers use CLI or GUI is not professional level but task type. CRUD and debugging use CLI; monitoring uses Web consoles.
In other words, task type determines interaction form, not user preference.
Applied to the AI era, the same logic holds. Agents execute tasks using CLI because text protocols are their native language. Displaying results to humans? That uses GUI because humans understand information most efficiently through visual means.
The Protocol Triangle
However, CLI and GUI alone are not enough. Agents need to communicate with tools, with users, and with other Agents. This requires a set of standard protocols.
A "protocol triangle" is currently forming in the industry:
MCP (Model Context Protocol): Initiated by Anthropic, released in late 2024, and donated to the Linux Foundation in late 2025. OpenAI has also officially adopted it. It solves the problem of how Agents connect to tools and APIs.
AG-UI (Agent-User Interaction Protocol): An open-source protocol initiated by CopilotKit. Microsoft's Agent Framework is already compatible, and Google ADK has integrated it. There are over 2 million agent-user interactions per week. It solves the problem of how Agents communicate with front-end UI.
A2UI (Agent-to-UI): Google's open-source declarative UI specification. Agents generate JSON to describe interface components, and clients render using native components. It doesn't send executable code; instead, it combines interfaces through a trusted component catalog, ensuring security.
Three protocols, three lines, building the infrastructure of the Agent world.
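The A2UI idea, declarative JSON rendered only through a trusted component catalog, can be sketched as follows. The field names and catalog here are illustrative, not the actual A2UI schema:

```python
# Components the client is willing to render (illustrative catalog).
TRUSTED_CATALOG = {"card", "heading", "text", "button"}

# The agent emits data describing the UI, never executable code.
ui_payload = {
    "type": "card",
    "children": [
        {"type": "heading", "text": "Watering Your Monstera"},
        {"type": "text", "text": "Water when the top 5 cm of soil is dry."},
        {"type": "button", "text": "Remind me weekly"},
    ],
}

def render(node: dict) -> str:
    """Render only components from the trusted catalog; anything
    outside it is rejected, so agent output cannot inject code."""
    if node["type"] not in TRUSTED_CATALOG:
        raise ValueError(f"untrusted component: {node['type']}")
    children = "".join(render(child) for child in node.get("children", []))
    return f"<{node['type']}>{node.get('text', '')}{children}</{node['type']}>"

html = render(ui_payload)
```

The security model lives in that one `if` statement: the agent proposes, but the client only composes from parts it already trusts.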
The real-time UI generation demonstrated by Flash-Lite Browser is essentially an extreme demonstration of the A2UI approach: the Agent doesn't just describe the interface; it directly writes complete HTML/CSS/JavaScript.
Not Reliable Enough
Of course, this is still somewhat conceptual and not reliable enough yet.
The Decoder's assessment of Flash-Lite Browser was:
The results are unstable, and the content quickly drifts off-topic and becomes nonsensical.
After all, when you ask an LLM to generate a complete webpage in real-time, the results may differ each time. Navigating to the same page might show a three-column layout last time and a two-column layout this time. Searching the same keyword might return completely different content.
Some complained: "model-generated UI in production? the debugging stories alone will be legendary".
Others raised a sharper question: if the page itself is generated on the fly, how would you ever catch a phishing page?
Indeed, when webpage content is entirely generated by AI, traditional URL verification, certificate checks, and domain blacklists...
All of these security mechanisms become ineffective.
Flash-Lite's speed is 360+ tokens per second, 2.5 times faster than Gemini 2.5 Flash. The pricing is quite affordable: $0.25 per million tokens for input, $1.50 per million tokens for output.
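A back-of-the-envelope calculation makes "cheap" concrete. As a simplifying upper bound, this bills all 2,122 tokens from the search-page demo at the output rate, even though input tokens are cheaper:

```python
# Gemini 3.1 Flash-Lite list prices cited above (USD per million tokens).
OUTPUT_PRICE_PER_M_TOKENS = 1.50
tokens_for_search_page = 2_122

# Upper-bound cost: treat every token as an output token.
cost = tokens_for_search_page / 1_000_000 * OUTPUT_PRICE_PER_M_TOKENS
print(f"~${cost:.4f} per generated page")  # about a third of a cent
```

At well under a cent per page, the economics of generating instead of serving are clearly not the obstacle; reliability is.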
But "fast" and "cheap" don't equal "reliable".
At least at this stage, real-time generated UI is more suitable for prototype exploration and idea visualization. There's still a long way to go before production environments.
The Fifth Migration
In my article "Karpathy: All Software Will Be Rewritten for Agents", I proposed a framework called "Four Migrations":
In the mainframe era, software users were operators.
In the PC era, users became ordinary people.
In the mobile era, users became everyone.
In the Agent era, users became AI.
Looking back now, I think we should add one more layer.
The Fifth Migration: Interface users shift from "human-operated" to "AI-generated".
The first four migrations changed "who uses software". The fifth migration changes "who creates interfaces".
Previously, designers drew prototypes, front-end developers wrote code, testers validated, and then it went live. A page from design to launch took a week if fast, a month if slow.
Now, AI can generate a complete page with 2,000 tokens in 5 seconds.
Of course, there's a world of difference in quality between these two types of "interfaces". But the direction is: interfaces are transforming from "products pre-designed by humans" to "services generated in real-time by AI based on intent".
Websites are no longer documents but conversations. Browsers are no longer readers but rendering engines. Front-end engineers are no longer people who write interfaces but people who define component libraries and safety guardrails.
The transition from "pre-fabricated pages" to "instant generation" is a fundamental change to the concept of digital state. If UI is created at the moment of interaction, then the concept of "static website" becomes a historical artifact.
Intent-Driven
Connecting all these threads, you'll see a clear trajectory:
The endgame of interfaces is no longer fixed buttons and pages, but dynamic generation that follows intent.
Humans speak to AI in natural language. AI executes tasks using CLI and APIs. AI displays results to humans using real-time generated GUI.
In this cycle, neither CLI nor GUI has disappeared. They've each found new positions.
CLI serves Agents. GUI serves humans. Natural language connects the two.
And this demo from Google, though somewhat rough, demonstrates a possibility: if browsers no longer "fetch" pages but "generate" them...
Then all the Web infrastructure we've spent 30 years building—from CDN to SEO to caching strategies to responsive design...
Do we need to rethink all of it?
The entire Web may be transforming from an "archive of information"
into a "renderer of intent".
Related Links:
Google DeepMind Flash-Lite Browser: https://aistudio.google.com/flashlite-browser
Google DeepMind Tweet: https://x.com/GoogleDeepMind/status/2036483295983100314
Gemini 3.1 Flash-Lite: https://deepmind.google/models/gemini/flash-lite/
AG-UI Protocol: https://www.copilotkit.ai/ag-ui
A2UI Protocol: https://developers.googleblog.com/introducing-a2ui-an-open-project-for-agent-driven-interfaces/