Stanford Releases 423-Page AI Report! US-China Gap Narrows to Just 2.7%; Tsinghua and DeepSeek Enter Global Top 10

Reported by New Zhixyuan

Editors: Hao Kun, Taozi

[New Zhixyuan Digest] Stanford's "2026 AI Index Report" has been officially released! This 423-page heavyweight document is packed with findings: in the peak showdown between China and the US in AI, the gap has nearly vanished, shrinking to just 2.7%. Of the 95 most representative AI models released globally over the past year, almost all come from major tech giants. Most brutally, employment for developers aged 22-25 has fallen by 20%.

Today, Stanford HAI released its blockbuster "2026 AI Index Report"!

This 423-page annual report comprehensively reveals the latest power landscape of the global AI industry.

It offers one core conclusion: AI capabilities are growing incredibly fast; however, humanity's ability to measure and govern it has not kept pace.

Among these findings, the most shocking conclusion is this:

The performance gap between Chinese and US AI models has basically disappeared. In their peak showdowns, the lead changes hands frequently, with Anthropic's current advantage standing at a mere 2.7%.

The US spends more on AI than anyone else, yet it is becoming increasingly difficult to recruit top talent.

The report also points out that AI evolution has not only avoided the so-called "bottleneck" but is instead surging forward at an unprecedented speed.

Over the past year, more than 90% of the world's top models have matched or even surpassed human performance on PhD-level scientific questions, multimodal reasoning, and competition mathematics.

Especially in coding capabilities, SWE-bench scores skyrocketed from 60% to nearly 100% within a single year.

However, the phenomenon of AI being "lopsided" is extremely severe, presenting a distorted reality:

LLMs can win gold medals at the International Mathematical Olympiad (IMO), yet they cannot correctly read an analog clock, with an accuracy rate of only 50.1%.

Meanwhile, the fear of AI stealing jobs has turned from prediction into reality, with contemporary young "workers" being the first to suffer.

Below, we dive directly into the substance: the 12 most critical hard-core trends from the "2026 AI Index Report" worth attention.

Other Highlights at a Glance:

Global AI computing power has increased 30-fold in 3 years; NVIDIA holds 60% of the market, and almost all chips come from a single foundry, TSMC.
In 2025, global corporate AI investment reached $581.7 billion, more than doubling year-on-year, with the US alone absorbing nearly half.
The number of AI researchers entering the US has dropped 89% over 7 years, with an 80% drop in the past year alone.
Employment for software developers aged 22-25 has declined by 20% since 2024, with entry-level positions singled out for cuts.
China has cumulatively built 85 public AI supercomputers, more than double that of North America, ranking first globally.
AI usage in Chinese workplaces exceeds 80%, far surpassing the global average of 58%.
The strongest models are becoming increasingly black-boxed; 80 out of 95 representative models have not released their training code.

China and US Face Off

Gap Narrows to Just 2.7%

Stanford plotted the #1 US model and the #1 Chinese model from the Arena leaderboard since May 2023 on the same coordinate system.

In May 2023, gpt-4-0314 led with 1320 points, while China's chatglm-6b trailed by over 300 points.

In February 2025, DeepSeek-R1 briefly tied with top US models for the first time.

By March 2026, the US's Claude Opus 4.6 scored 1503 points, while China's dola-seed-2.0-preview scored 1464 points.

Today, the gap between Chinese and US AI is only 39 points. Converted to a percentage, that is 2.7%.
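The conversion from points to a percentage can be checked by hand. The sketch below is only an illustration: the article does not say which score the report divides by, so both denominators are shown, and the win-probability step assumes Arena scores follow the standard Elo-style logistic with a 400-point scale (an assumption, not something stated in the source).

```python
# Scores quoted in the article for March 2026.
us_score = 1503   # Claude Opus 4.6
cn_score = 1464   # dola-seed-2.0-preview


def relative_gap(leader: float, trailer: float) -> tuple[float, float, float]:
    """Return the point gap and the gap as a percentage of each score."""
    gap = leader - trailer
    return gap, 100 * gap / trailer, 100 * gap / leader


def elo_expected_win(gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the leader under a standard Elo-style logistic.

    Assumption: the leaderboard uses a base-10 logistic with a 400-point scale.
    """
    return 1 / (1 + 10 ** (-gap / scale))


gap, pct_of_trailer, pct_of_leader = relative_gap(us_score, cn_score)
win_rate = elo_expected_win(gap)

print(f"gap: {gap} points")
print(f"as % of the trailing score: {pct_of_trailer:.2f}%")
print(f"as % of the leading score:  {pct_of_leader:.2f}%")
print(f"implied head-to-head win rate for the leader: {win_rate:.1%}")
```

Dividing by the trailing model's score rounds to 2.7%, while dividing by the leader's rounds to 2.6%. Under the Elo assumption, a 39-point lead corresponds to the leader being preferred in a little over half of head-to-head comparisons.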

Even more noteworthy is the frequency of leadership changes over the past year. Since early 2025, the top models of both nations have swapped positions on the Arena leaderboard multiple times.

The model counts are also drawing closer.

In 2025, the US released 50 "notable models," and China closely followed with 30 top-tier large models.

In the first tier, OpenAI, Google, Alibaba, Anthropic, and xAI stand together, splitting the global TOP 5 evenly.

Looking further down to the TOP 10, Chinese institutions and enterprises occupy four spots: Alibaba, DeepSeek, Tsinghua, and ByteDance.

The focus of the open-source ecosystem has also clearly shifted eastward this year.

DeepSeek, Qwen, GLM, MiniMax, and Kimi have continuously pushed the capability curve of open-source weights forward.

Adding in paper publication counts, citation numbers, patent output, and industrial robot installations, China ranks first globally in all categories.

Price is another battlefield.

Overseas developers have calculated on X that the output price of Seed 2.0 Pro is approximately one-tenth that of Claude Opus 4.6.

Performance is neck-and-neck, yet the price is only one-tenth. The chain reaction of this fact has only just begun.

90% of Frontier Models Come from Industry

Unprecedented Speed to God-Tier Status

Of the 95 most representative models released last year, over 90% came from the industrial sector, not academic institutions or government labs.

Academia can no longer keep up with the frontier.

The release speed is also accelerating abnormally.

In February 2026 alone, eight flagship models entered the arena in the same month, including Gemini 3.1 Pro, Claude Opus 4.6, GPT-5.3 Codex, Grok 4.20, Qwen 3.5, Seed 2.0 Pro, MiniMax M2.5, and GLM-5.

The cycle to reach "god-tier" status has changed from "years" to "months".

Benchmarks Hit Ceiling in One Year

AI Has No Bottleneck

The steepest curve is in programming.

SWE-bench Verified, a benchmark for fixing real bugs, rose from 60% to nearly 100% in one year.

It didn't just rise by a few points; it basically hit the ceiling.

Terminal-Bench, which tests an Agent's ability to handle real terminal tasks, rose from 20% last year to 77.3%.

The success rate of cybersecurity Agents in solving problems rose from 15% to 93%.

Gemini Deep Think won a gold medal at the International Mathematical Olympiad.

PhD-level scientific Q&A (GPQA Diamond), competition math (AIME), and multimodal reasoning (MMMU), tough benchmarks once considered out of reach for AI, have all been cracked by frontier models.

The most telling metric is Humanity's Last Exam.

This is a test specifically designed to "stump AI and favor human experts," with questions provided by top experts in various fields.

Last year, OpenAI's o1 scored 8.8%. In one year, frontier models pushed the score up by another 30 percentage points. Currently, both Claude Opus 4.6 and Gemini 3.1 Pro have surpassed 50%.

Jagged Frontier

Can Win IMO Gold but Can't Read a Clock

However, the same index threw out another set of numbers.

The accuracy rate of the strongest models on the task of "reading an analog clock" is 50.1%.

The operational success rate of robots in laboratory simulation environments (RLBench) has reached 89.4%. However, when moved to real home scenarios to complete chores like washing dishes or folding clothes, the success rate immediately drops to 12%.

Between the lab and the kitchen, there is a 77 percentage point gap.

Researchers have named this phenomenon the "jagged frontier." The distribution of AI capabilities is uneven; it can win a gold medal in math olympiads but cannot reliably tell you what time it is.

AI is accelerating, but not at the same pace in every direction.

Additionally, in agent tasks, OSWorld tests show that frontier AI capabilities (66.3%) are approaching the human baseline.

However, in the PaperArena test, which specifically evaluates research logic, the strongest AI-powered agents scored only 39%, possessing only half the capability of a PhD student.

But this unevenness does not stop enterprises from stuffing AI into production lines.

Another number from the AI Index is that global corporate AI adoption has reached 88%. Nine out of ten companies have integrated AI into some workflow.

The risks are rising in tandem. Recorded AI-related incidents increased from 233 in 2024 to 362.

Money is Accelerating

$581.7 Billion Poured into AI

In 2025, global corporate AI investment reached $581.7 billion, a year-on-year increase of 130%. Of this, private investment was $344.7 billion, up 127.5%.

Both curves have more than doubled.
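A year-on-year increase of 130% means the 2025 total is 2.3 times the prior year's figure, not 1.3 times. The sketch below back-calculates the implied 2024 values from the numbers quoted above; these implied values are derived here for illustration and are not figures taken from the report.

```python
def prior_year_value(current: float, pct_increase: float) -> float:
    """Back out last year's value from this year's value and the % increase."""
    return current / (1 + pct_increase / 100)


corporate_2025 = 581.7   # USD billions, quoted in the article
private_2025 = 344.7     # USD billions, quoted in the article

corporate_multiple = 1 + 130 / 100      # 2.3x the prior year
private_multiple = 1 + 127.5 / 100      # 2.275x the prior year

# Implied 2024 levels (derived, not quoted from the report).
implied_corporate_2024 = prior_year_value(corporate_2025, 130)
implied_private_2024 = prior_year_value(private_2025, 127.5)

print(f"corporate: {corporate_multiple:.2f}x, implied 2024 ≈ ${implied_corporate_2024:.1f}B")
print(f"private:   {private_multiple:.3f}x, implied 2024 ≈ ${implied_private_2024:.1f}B")
```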

By country, the US is in a league of its own. In 2025, US private AI investment was $285.9 billion. Furthermore, 1,953 new AI startups were added in one year, which is more than 10 times that of the second-ranked country.

Money is accelerating towards the US. However, another core US resource is flowing in the opposite direction.

Talent is Flowing Away

AI Researchers Entering the US Dropped 89%

One set of numbers in the report gives pause.

From 2017 to the present, the number of AI researchers and developers entering the US has decreased by 89%.

More critically, this decline is accelerating. In the past year alone, the drop reached 80%.
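The two percentages can be reconciled with a little arithmetic. The sketch below assumes both figures describe the same annual inflow series and that the 80% drop is measured against the previous year's level; the article does not spell out either point, so treat the result as illustrative.

```python
total_drop = 0.89       # cumulative decline since 2017, as quoted
last_year_drop = 0.80   # decline over the most recent year, as quoted

# Inflow today, as a fraction of the 2017 level.
level_now = 1 - total_drop

# Inflow one year ago, as a fraction of the 2017 level:
# level_now = level_before * (1 - last_year_drop)
level_before = level_now / (1 - last_year_drop)

# Cumulative decline that had already happened before the most recent year.
drop_before_last_year = 1 - level_before

print(f"inflow now:          {level_now:.0%} of the 2017 level")
print(f"inflow one year ago: {level_before:.0%} of the 2017 level")
print(f"decline before the most recent year: {drop_before_last_year:.0%}")
```

Under these assumptions, inflow had fallen by roughly 45% over the earlier years and then dropped from about 55% of the 2017 level to 11% in a single year, which is why the decline is described as accelerating.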

The US remains the country with the highest density of AI researchers globally, but the faucet of inflow is being tightened.

The curves for money and talent are diverging. This is a situation unseen in the past decade.

Computing Power Up 30x in Three Years

The Chokepoint Lies in One Company's Hands

While the AI capability curve accelerates, the underlying computing power curve is running even faster.

From 2021 to now, total global AI computing power has increased 30-fold. In the past three years, it has more than tripled every year.
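The 30-fold figure and the "more than tripled every year" claim can be cross-checked with a compound-growth calculation. This is a rough sketch: it assumes steady year-over-year growth, and it also shows the five-year case because the article gives both "three years" and "from 2021 to now" as the window.

```python
def annual_growth_factor(total_multiple: float, years: int) -> float:
    """Constant yearly multiplier that compounds to total_multiple over `years`."""
    return total_multiple ** (1 / years)


total_multiple = 30.0  # 30-fold increase quoted in the article

factor_3y = annual_growth_factor(total_multiple, 3)
factor_5y = annual_growth_factor(total_multiple, 5)

print(f"over 3 years: x{factor_3y:.2f} per year")
print(f"over 5 years: x{factor_5y:.2f} per year")
```

A 30-fold rise over three years works out to a little over 3x per year, which matches the tripling claim. Spread over five years, the same rise would be closer to doubling each year.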

This curve is supported by a handful of companies.

NVIDIA's GPUs alone account for over 60% of the world's AI computing power. Amazon and Google rank second and third with self-developed chips, but combined, they still fall far short of NVIDIA.

Almost all of these chips come from a single foundry: TSMC. The steeper the computing curve, the narrower the chokepoint.

Meanwhile, the cost is also increasing.

The total power consumption of global AI data centers has reached 29.6 GW, equivalent to the entire electricity demand of New York State during peak hours. The estimated carbon emissions for a single training run of xAI Grok 4 are 72,816 tons of CO2 equivalent, comparable to the exhaust of 17,000 cars driven for a full year.
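The car comparison implies a per-vehicle figure that is easy to sanity-check. The sketch below simply divides the two numbers quoted above; the resulting per-car value is derived here and is not stated in the article.

```python
training_emissions_t = 72_816   # tons of CO2 equivalent, as quoted
equivalent_cars = 17_000        # cars driven for a full year, as quoted

# Implied emissions per car per year (derived, not quoted).
tons_per_car_year = training_emissions_t / equivalent_cars

print(f"implied emissions: {tons_per_car_year:.2f} t CO2e per car per year")
```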

Where data centers are built, where the electricity comes from, and where chips are produced: these three questions have become the biggest headaches on the desks of AI company CEOs this year.

Generative AI Penetration Reaches 53% in Three Years

Chinese Workplace Usage Exceeds 80%

Generative AI has achieved a 53% penetration rate among the global population within three years.

This speed is faster than personal computers and faster than the internet.

However, penetration varies widely by country. Singapore (61%) and the UAE (54%) are both ahead of the US, which ranks only 24th among surveyed countries with a penetration rate of 28.3%.

If we switch the dimension from consumers to the workplace, the contrast is even greater.

Another set of data in the report shows that in 2025, 58% of employees globally have begun regularly using AI at work. However, in five countries—China, India, Nigeria, the UAE, and Saudi Arabia—this proportion exceeds 80%.

China's workplace AI penetration rate is already more than 20 percentage points higher than the global average.

Even more interesting is consumer value.

The AI Index estimates that, as of early 2026, generative AI tools create $172 billion in value annually for US consumers. From 2025 to 2026, the median value per user tripled.

The vast majority of users are still using the free version.

The amount ordinary people are willing to pay for AI is far lower than the value it creates for them. This gap is what all AI companies are currently trying to close.

Entry-Level Positions Sharply Reduced

Developer Jobs for Ages 22-25 Slashed by 20%

The part of the AI Index most likely to leave Chinese readers silent is the section on youth employment.

Among software developers aged 22 to 25, employment has dropped by approximately 20% from 2024 to the present.

During the same period, employment among their older colleagues actually grew.

It's not just development jobs. Other high AI-exposure industries like customer service are showing the same pattern.

Even more worrying are the results of corporate surveys. Executives interviewed generally expect future layoffs to be larger than those of the past few months.

This isn't about macro unemployment rates; it's about entry-level positions being singled out for cuts.

When the first job is gone, a rung of the entire career ladder is missing. No one can currently calculate the long-term impact of this.

AI is Rewriting the Way Scientific Discoveries are Made

If the employment section is cold, the science section is hot.

AI-related papers in natural sciences, physical sciences, and life sciences increased by 26% to 28% year-on-year in 2025.

In terms of specific applications, this year saw AI complete an end-to-end weather forecasting process for the first time: raw meteorological observation data goes in, and final forecasts for temperature, wind speed, and humidity come out, with no traditional numerical model in between.

AI is transforming from "helping you write papers" and "helping you calculate numbers" to "making discoveries itself."

Hospitals are the same. In 2025, a large number of hospitals began deploying AI tools that automatically generate clinical records from consultation conversations. Doctors from multiple hospital systems reported that the time spent writing medical records decreased by up to 83%, and work burnout significantly declined.

But the same index poured cold water on medical AI. A review of over 500 clinical AI studies found that nearly half relied on exam-style datasets, with only 5% using real clinical data.

That AI can reduce the time doctors spend typing is certain. The clinical value of AI on real patients still has many question marks.

Self-Learning Wave Explodes Globally

Formal Education Has Fallen Behind

Formal education cannot keep up with AI.

Four out of five high school and college students in the US now use AI to complete school assignments. However, only half of high schools have AI usage policies, and only 6% of teachers believe these policies are written clearly.

Students are running ahead, teachers are still standing still, and rules have not yet appeared.

While formal education falls behind, a self-learning wave is exploding globally. The report states that the three countries with the fastest growth in learning AI engineering skills are the UAE, Chile, and South Africa.

Not the US, not Europe.

The steepest part of the skill curve is growing where no one is looking.

Strongest Models Become Least Transparent

Experts and Public Drift Apart

The strongest models are becoming the least transparent models.

The average score of the Foundation Model Transparency Index dropped from 58 last year to 40 this year. The AI Index directly named Google, Anthropic, and OpenAI, stating they have all given up on disclosing the training data scale and training duration of their latest models.

Of the 95 most representative models released last year, 80 did not release training code.

Public sentiment has also become more complex.

Globally, the proportion of people who believe AI benefits outweigh the harms rose from 52% to 59%. But during the same period, the proportion feeling nervous about AI rose from 50% to 52%.

Both directions are growing simultaneously.

The most divided is the US. Only 33% of Americans believe AI will make their jobs better, compared to a global average of 40%. Americans' trust in their own government's regulation of AI is the lowest among surveyed countries, at 31%.

Singaporeans' trust in their government's regulation of AI is 81%.

Following the recent attack on Sam Altman's home, insiders in Silicon Valley were "surprised to find" that ordinary people in Instagram comment sections did not sympathize, with some even feeling it "should have been more intense."

They didn't realize things had gotten this bad.

Data from Pew and Ipsos cited in the research report shows that the gap in perception between experts and the public on dimensions like AI's impact on employment, healthcare, and the economy generally exceeds 30 percentage points, with the largest item reaching 50 percentage points.

On one side, curves in the lab are soaring; on the other, unease in ordinary people's hearts is accumulating.

There is no bridge in between.

In Conclusion

There are hundreds of charts in the 423-page report, but it really only draws one graph.

The horizontal axis is time, the vertical axis is capability.

The model capability curve is flying, the computing power curve is flying, the investment curve is flying, the adoption rate curve is flying. Everything else is either standing still or moving downward.

This is the entire content of the 2026 AI Index.

AI is accelerating. Everything else is decoupling.

If you are in this industry, the question to ask now is not "what will the future be like," but "which curve are you standing on?"

References:
https://hai.stanford.edu/ai-index/2026-ai-index-report
https://hai.stanford.edu/news/inside-the-ai-index-12-takeaways-from-the-2026-report
https://www.nature.com/articles/d41586-026-01199-z
https://hai.stanford.edu/assets/files/ai_index_report_2026.pdf