Anthropic CEO: The Data Bottleneck for Large Models No Longer Exists, Models Are Training Themselves

Anthropic CEO Dario Amodei has just said that the training data bottleneck the entire industry has been worrying about no longer exists. For years, the industry has been obsessed with scraping the open web: more data, more text, more human output to feed the models. But Amodei says: "I think data is no longer the most central thing."

This is a fundamental shift. In Amodei's words: "Static data is becoming less and less relevant. A large amount of the data we use today comes from reinforcement learning environments in which the model trains; it is dynamic data generated by the model itself." That data is not scraped, not licensed, and not written by humans. It is generated by the model itself through pure trial and error.

When you train a model on complex mathematics or agentic coding, you are not stuffing a textbook into it; you are giving it an environment. The model experiments, fails, adjusts, and tries again on its own. As Amodei puts it: "Give it some math problems, and the model tries to solve them on its own." It generates its own experience over millions of iterations, each one building on the last, with no human intervention required. This shatters the entire narrative that "large models will hit a data wall."
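To make the mechanism concrete, here is a minimal, hypothetical sketch of such a trial-and-error loop. It is not Anthropic's actual training code; `Model`, `sample_problems`, and `check_answer` are stand-in names assumed for illustration. The point is only that every training example comes from the model's own verified attempts rather than from scraped or human-written text.

```python
# Hypothetical sketch of a self-generated-data training loop.
# The model attempts verifiable problems; only its own successful
# attempts become new training data. No human-written text is involved.
import random

class Model:
    """Toy stand-in for a language model policy."""

    def attempt(self, problem: str) -> str:
        # A real model would produce a chain of reasoning and an answer.
        return f"candidate answer to {problem} ({random.random():.2f})"

    def train_on(self, examples: list) -> None:
        # A real system would run a gradient update (RL or fine-tuning).
        print(f"updating weights on {len(examples)} self-generated examples")

def sample_problems(n: int) -> list:
    # The environment supplies tasks with checkable outcomes (math, code, agents).
    return [f"math problem #{i}" for i in range(n)]

def check_answer(problem: str, answer: str) -> bool:
    # A verifier (symbolic checker, unit tests, etc.) scores each attempt.
    return random.random() > 0.7  # placeholder for a real check

def self_training_loop(model: Model, iterations: int) -> None:
    for step in range(iterations):
        experience = []
        for problem in sample_problems(32):
            answer = model.attempt(problem)           # trial
            if check_answer(problem, answer):         # error signal
                experience.append((problem, answer))  # keep only successes
        model.train_on(experience)                    # learn from own output

self_training_loop(Model(), iterations=3)
```

Each pass through the loop turns the model's own verified output into the next round's training data, which is the dynamic, self-generated data Amodei is describing.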

You cannot hold back competitors by locking down copyrights, nor slow this race with paywalls. When models learn from their own synthetic experience, the open web becomes irrelevant. The only real bottleneck left is computing power.

And this is where the geopolitical contest becomes critical. The country that wins the computing power race won't just build smarter models; it will build models that generate their own intelligence, compound on themselves, and push past every boundary of human knowledge. We are no longer training large models on the past.

