DeepMind's Nobel-winning CEO's latest interview: The current large model path is not a dead end, but the brute-force methods everyone uses might be wrong; Chinese models are already leading in the open-source domain

Edited by Yu Cheng

Recently, Google DeepMind CEO Demis Hassabis appeared on Y Combinator's (YC) flagship interview series "How to Build the Future."

The series was originally initiated a few years ago by OpenAI CEO Sam Altman, primarily inviting top figures in tech to discuss frontier technology, entrepreneurship, the future of AI, scientific breakthroughs, and other grand topics. The show was briefly paused later, but after Garry Tan took over as YC President, he revived it and now hosts it himself.

Some readers may already be familiar with Demis, but for those who might not be, here's a brief introduction:

Demis was a chess prodigy as a child and designed a hit video game, "Theme Park," at the age of 17. After graduating, starting a business, and then returning to academia, he earned a PhD in cognitive neuroscience, where he focused on researching the mechanisms of memory and imagination in the brain.

In 2010, he co-founded DeepMind with a single mission: to solve intelligence.

Since then, DeepMind's AlphaGo has defeated the Go world champion, and the AlphaFold series has cracked biology's 50-year grand challenge of protein structure prediction, work that earned him the Nobel Prize in Chemistry in 2024. He now leads Google DeepMind in building Gemini and pushing toward Artificial General Intelligence (AGI).

After listening to this interview, we found that he and OpenAI co-founder Greg Brockman share a very similar passion for and sense of mission regarding AI. Both decided from a very young age, back when AI was considered unworkable, that researching AI was the most impactful thing they could do.

Demis believes the current mainstream large model path is not a dead end. Paradigms like large-scale pre-training, RLHF (Reinforcement Learning from Human Feedback), and Chain of Thought reasoning will be part of the final architecture for AGI.

He discussed current obstacles to achieving AGI, including challenges like continual learning, long-term reasoning, and memory.

The human brain excels at continual learning. During the REM (Rapid Eye Movement) sleep stage within the dream cycle, the hippocampus is highly active and participates in memory replay, which helps consolidate memory and integrate new knowledge into the existing knowledge base. Current models still lack this continual learning mechanism, which he believes is one factor hindering agents from executing complete tasks.

Regarding long-term reasoning, he gave an example of playing chess with Gemini. By observing Gemini's thought process, he found that sometimes Gemini would consider a move, realize it was a terrible blunder, but because it couldn't find a better one, it would still play that move.

"In a very precise reasoning system, you shouldn't see that happening," so he believes AI is still missing something regarding "introspection" of its own thinking process, meaning there is significant room for improvement in monitoring Chain of Thought reasoning.

At the same time, he emphasized that to achieve AGI, you must have an "active system" that proactively solves problems for you, and agents are the necessary path for this. He stated, "From the very beginning of DeepMind, we've been working on agents," and he believes we are just getting started.

On whether AI has creativity, Demis couldn't give a definitive answer. He believes that if AI could invent Go, propose "a new set of Millennium Prize problems considered just as profound, meaningful, and worthy of a lifetime's research and effort by top mathematicians," or propose the "Annus Mirabilis" (miracle year) achievements that Einstein published in 1905, including the Special Theory of Relativity, using only the physics knowledge available in 1901, then he would agree that AI possesses the ability to create new things.

He predicts AGI's arrival around 2030, much later than the late 2026 or 2027 predicted by Anthropic CEO Dario Amodei. When Garry asked him for entrepreneurial advice for young people wanting to enter the AI industry, he mentioned that they should imagine what the world will look like once AGI is achieved and build things that will still be useful when the AGI era arrives.

Furthermore, he counter-intuitively emphasized the value of small models. He said that through distillation, small models can acquire the same capabilities as large models. Small models can also serve AI applications quickly, efficiently, cheaply, and with very low latency, which is what it takes to power Google's dozen-plus products that each have over a billion users. They are also better suited to running on edge devices like phones, smart glasses, and robots.

Speaking of edge devices, he mentioned that the best models for edge devices are likely open-source models. In the open-source arena, Chinese models are currently leading, while Gemma is also very competitive.

We genuinely feel that Demis's talk was packed with substance, every sentence valuable, and Garry's questions were highly insightful and right to the point. It was a high-quality conversation. The interview contains many more brilliant insights. The full transcript is below:

Current Model Paradigms Will Be Part of AGI's Final Architecture

Garry: You've been thinking about AGI for longer than almost anyone. When you look at the current paradigm—large-scale pre-training, RLHF, Chain of Thought—how much of the final architecture for AGI do you think we've figured out? What's fundamentally still missing?

Demis: On that question, I think those components you just mentioned will definitely be part of the final architecture for AGI. They've come such a long way, and we've already proven they can do so many things. I can't imagine a few years down the line we'll discover it's a dead end; that doesn't make sense to me. But on top of the techniques we know work, maybe one or two key things are still missing. Things like continual learning, long-term reasoning, certain aspects of memory, these are still unsolved right now. Plus, how to make systems perform more consistently in all areas. I think these are all required for AGI. The situation could be that existing technology, with some innovations and incremental improvements, can scale to that stage. Or maybe there are still one or two core big problems that need to be cracked. I don't think there are more than one or two, if they truly exist. My bet is it's about a 50/50 probability. Of course, at Google DeepMind, we are working on both directions simultaneously.

Areas like memory still have huge room for innovation: Continual Learning, Dream Cycles, and the Hippocampus

Garry: I think, when dealing with a series of agent systems, what's most mind-blowing to me is that they are largely reusing the same weights. This concept of continual learning is very interesting, because currently we're sort of jury-rigging it with duct tape, like using "dream cycles" at night and similar approaches.

Demis: Exactly, dream cycles are very cool. We used to think about this in terms of the consolidation of episodic memory. Actually, this is what I studied during my PhD: how the hippocampus works, and how it elegantly integrates new knowledge into the existing knowledge base. The brain does this remarkably well; it replays important episodes during sleep, especially during REM sleep, so you can learn from them. In fact, one of the ways our earliest Atari program, DQN, was able to master games was through "experience replay." We borrowed this idea from neuroscience: replaying successful trajectories many times. That was back in 2013, the 'Dark Ages' of AI, and it was a very important development. I agree with you, right now we are somewhat jury-rigging things, like shoving everything into the context window. But that seems a bit unsatisfactory, right? Even though we are dealing with machines rather than biological brains, and in theory you can have a context window or memory of millions or tens of millions of tokens with perfect recall, there is still a cost to looking up the information relevant to the specific decision you have to make right now. That cost is non-negligible, even if you can potentially store everything. I think there's actually a lot of room for innovation in areas like memory.
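
(Note: The "experience replay" idea mentioned above can be illustrated with a small buffer that stores past transitions and hands back random batches to learn from. The sketch below is a generic illustration, not DeepMind's code: the names `Transition` and `ReplayBuffer` are hypothetical, states are plain integers for brevity, and sampling is assumed to be uniform over everything stored rather than restricted to successful trajectories.)

```python
import random
from collections import deque
from typing import List, NamedTuple


class Transition(NamedTuple):
    """One step of agent experience (hypothetical, simplified layout)."""
    state: int
    action: int
    reward: float
    next_state: int
    done: bool


class ReplayBuffer:
    """Fixed-size memory of past transitions that can be replayed in batches."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # A bounded deque silently discards the oldest item once it is full.
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        self._items.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Uniform sampling without replacement from everything stored so far.
        return self._rng.sample(list(self._items), batch_size)

    def __len__(self) -> int:
        return len(self._items)


# Usage: store five steps in a buffer that only has room for three.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(Transition(state=step, action=0, reward=1.0,
                          next_state=step + 1, done=False))
batch = buffer.sample(2)  # two of the three most recent transitions
```

Because the buffer is bounded, old experience is eventually forgotten, which is one small-scale version of the storage and lookup trade-off discussed in this answer.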

Garry: I mean, it is amazing, because it feels like a million-token context window is already huge, to be honest, it's big enough already.

Demis: It is big enough for most scenarios where it should be used. If you think of the context window as equivalent to "working memory," humans only have a capacity of a few items, around 7 or so, and now we have million- or even ten-million-token context windows. But the problem is, we try to store everything in it, including unimportant and incorrect things. This current approach is quite brute-force and doesn't seem right. Also, if you try to process live video and naively log all the tokens, a million tokens isn't actually that much, only about 20 minutes' worth. So, if you want a system that understands things happening in your life over a span of, say, a month or two, you actually need more.

Many Ideas for Building Gemini Came from Early AlphaGo Explorations

Garry: DeepMind has historically leaned towards Reinforcement Learning and search, with systems like AlphaGo, AlphaZero, and MuZero. How much of that philosophy is embedded in the process of building Gemini today? Is Reinforcement Learning still underrated?

Demis: Yes, I think it might indeed be underrated. It ebbs and flows like the tide. From the very beginning of DeepMind, we have been working on agents. In fact, that was our stated focus. All the Atari work, and especially AlphaGo, are agent systems. By agents, we mean systems that can autonomously complete goals, make proactive decisions, and formulate plans. To make it tractable, we did this in the domain of games, then in increasingly complex games like StarCraft (AlphaStar). We basically played through every game on the market. The next question was, can you generalize these models into world models or language models, rather than just models for simple or complex games? That's what we've been doing over the past few years. But actually, you could think of much of what we do today, including the 'thinking modes' and 'chain-of-thought reasoning' of all leading models, as a return to certain aspects of AlphaGo's early explorations. I actually think a lot of the work we did back then is still highly relevant today. We are revisiting those old ideas in more general ways and at massive scale, including Monte Carlo Tree Search and other methods that enhance reinforcement learning. I think many ideas from AlphaGo and AlphaZero are highly relevant to today's Foundation Models. I think most progress we'll see in the coming years will stem from this.

Through Distillation, Small Models Gain the Same Capabilities as Large Ones

Garry: I have a question. Obviously, today you need larger and larger models to get smarter and smarter, but we also see 'distillation' working, where smaller models can run much faster. You have incredible Flash models, which I find perform at about 95% of frontier models, but at one-tenth the cost, is that right?

Demis: I think this is one of our core strengths. While you have to build the biggest models to have frontier capabilities, one of our biggest advantages has always been the ability to very quickly distill and package that capability into increasingly smaller models. After all, we invented this distillation process; people like Jeff and Oriol are world experts on this. And we have a huge demand to do this, because we have to support perhaps the world's largest AI application surface. Obviously, there is Search with AI Overviews, then the Gemini app, and now almost every Google product (Maps, YouTube, etc.) has Gemini or its related technology in it. That's billions of users, over a dozen products with over a billion users each; they must be served extremely fast, efficiently, cheaply, and with very low latency. This gives us a very important incentive to make Flash, or even smaller Flash-Lite models, extremely efficient. Hopefully, this will end up being very useful for many of the workloads you all run.

Garry: I'm curious how smart these smaller models can actually become. Does the distillation process have limits? Like, could a 50B or 400B model become as smart as today's Mythos (referring to top-tier large models)?

Demis: I don't think we've hit any kind of information limit yet, or at least none of us know of one. Maybe at some point there's an information density limit you can't surpass, but our current hypothesis is: about a year or half a year after one of our leading Pro models or frontier models is released, you'll be able to get the same capability in very tiny edge device models. You can see this with our Gemma models, too. I hope everyone is using Gemma 2 models; I think they have amazing power relative to their size. This again uses a lot of distillation techniques and ideas on how to make these tiny models extremely efficient. So, I haven't seen any theoretical limit yet. I think we are very far from that limit.
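
(Note: For readers unfamiliar with distillation, one widely used formulation trains a small "student" model to match the temperature-softened output distribution of a large "teacher" model. The sketch below illustrates only that general idea. It is not a description of how Flash or Gemma models are actually trained; the function names and the temperature value are illustrative assumptions, and the logits are made-up numbers.)

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float) -> List[float]:
    """Turn logits into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature ** 2 factor is the usual rescaling that keeps gradient
    magnitudes comparable across different temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Usage with made-up logits over three classes.
teacher = [2.0, 1.0, 0.1]
matching_student = [2.0, 1.0, 0.1]
mismatched_student = [0.1, 1.0, 2.0]
loss_match = distillation_loss(teacher, matching_student)       # zero: distributions agree
loss_mismatch = distillation_loss(teacher, mismatched_student)  # positive: student disagrees
```

Minimizing this loss over many inputs pushes the student to reproduce not just the teacher's top answer but its whole ranking of alternatives, which is a large part of why a much smaller model can recover so much of the larger model's behavior.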

Small models are lower cost, faster, and more suitable for local deployment

Garry: That's amazing. That's really good, because one of the weirdest things we see now is that throughput for engineers can be 500x to 1000x of what it was just six months ago. In this room, some people are doing roughly 1000x the workload of a Google engineer from the 2000s, as Steve Yegge talks about.

Demis: I think that's incredibly exciting. Small models have many uses. One is cost, but speed is equally critical. If you consider programming or other things, you can iterate much faster, especially when collaborating with the system. There are many demands for fast systems, maybe not quite frontier-level, like you said, reaching 95% or 90%, but that's good enough, and the gain you get from iteration speed far outweighs the lost 10%. I think another big thing is running these systems on edge devices, not just for efficiency, but also for privacy and security. If you consider running these systems across different devices handling extremely private information, or think about robotics—like a robot in your home—I think you'll want highly efficient and powerful localized models. These local models might be orchestrated by some big or frontier model in the cloud, but you only delegate to it in specific circumstances. Maybe you process all audio and video streams locally, and the data stays local. I can imagine that being a very desirable end state.

To achieve full general intelligence, the "continual learning" challenge must be solved

Garry: Going back to context and memory. Current models are stateless, but if a developer uses a task model with 'continual learning' capability, what would that development experience look like? Do you have any ideas on how to guide it?

Demis: I think that's very interesting. I believe the current lack of continual learning is one of the factors preventing agents from executing complete tasks. They are very useful for certain aspects of a task now; you can cobble them together to do cool things, but they can't adapt well to the context you're situated in. I think that's the missing piece for them to truly be 'fire and forget' and handle everything themselves. They need to be able to learn about the specific context you place them in. To get full general intelligence, we have to crack this problem.

AI is still missing something regarding 'introspection' of its own thinking process

Garry: How are we doing on reasoning? Models can now do impressive chain-of-thought reasoning, yet they still fail on things that a good undergrad wouldn't get wrong. What specifically needs to change? What progress do you expect to see in reasoning?

Demis: There's still a lot of room for innovation within the 'thinking paradigm.' I would say our current methods are still quite simplistic and brute-force. You can imagine there is a lot of room for monitoring the Chain of Thought, such as intervening during the thinking process. I often get this impression with our systems and competitors' systems: they almost 'overthink' and get stuck in loops. One thing I like to do sometimes is play chess with Gemini. All the leading foundation models perform pretty poorly at games, which is interesting. Watching their thinking traces is very cool because games like chess are well understood: I can quickly tell if the model goes off track, and whether its thinking is sound is easy to verify. What we see is, sometimes it will consider a move, realize it's a blunder, but it can't find a better one, so it goes back and plays that move anyway. In a very precise reasoning system, you shouldn't see that happening. So, I think there's still a huge gap, but to be clear, it probably only needs one or two tweaks to close it. These gaps are obvious. That's why you see this 'spiky intelligence': on one hand, it can solve super-hard gold medal problems in the IMO (International Mathematical Olympiad); on the other hand, as we've seen, if you ask in a certain way, it still makes basic arithmetic errors or fundamental reasoning errors. So, to me, it's missing some kind of 'introspection' of its own thinking process.

Agents are the necessary path to AGI, and we are just getting started

Garry: Agents are very hot right now. Some say they are overhyped. I personally think we are just at the beginning. It's utterly crazy. Where does DeepMind's internal research tell you the actual capability of agents stands relative to the external hype?

Demis: I agree with you, I think we are just getting started. To achieve AGI, you must have an 'active system' that proactively solves problems for you. That has always been clear to us. So, agents are that necessary path, and I think we are just starting out. I think we're all adapting to how best to work; you yourself are at the cutting edge in your personal experiments with this. I believe many of you are doing the same. I think the key is how to integrate it into your workflow so it's not just a 'cherry on top,' but starts handling some fundamental tasks. My impression is that right now we are all doing various experiments, but it might only be in the last few months that we are starting to discover truly valuable use cases, and the technology might just now be good enough to support that, right? It's no longer a toy-like demo, but genuinely adding value to your time and efficiency. I see many people trying things like launching dozens of agents to run for 40 hours, and I'm not sure I see output that justifies that investment yet, but I think that day will come. So, I still think we're in the experimental phase. We haven't seen that AAA blockbuster that's top of the charts and completely 'vibe coded,' right? I've seen, and personally built, some neat little demos, as I'm sure we all have; for example, I can now prototype 'Theme Park' in half an hour, a game that took me 6 months to make at age 17. It's truly heartbreaking and staggering, and I even get a feeling that if I spent a whole summer delving into it, I could really make something incredible. But it still requires craftsmanship, human 'soul,' and taste. I think that's something you must ensure you bring to anything you build. And I think this also shows that it's still missing something, because why haven't we seen a kid make a hit game that sells 10 million copies yet? Given the effort already invested, that should be possible. So, something is still missing. Maybe it's related to the process, or the tools; I'm not entirely sure. You might know better than me, as I'm sure you're all experimenting in this area. I haven't seen the kind of result I expected, the one that truly unlocks all the value, and I think that will come within the next 6 to 12 months.

Unsure if AI has creativity

Garry: How much of this is done autonomously, or is it... I mean, I don't think we'll see 'autonomy first.' We might actually see people in this room operating at 1000x efficiency first; that's what you should see first, and then many of you, like game companies or other types of companies, will use these tools to build some kind of blockbuster app or game; that will happen first, before more parts become automated. I mean, there's human involvement, and right now humans don't want to say these were made by agents.

Demis: If we want to talk about creativity, I often make a point: look at things we've already done, like AlphaGo. Obviously, everyone knows about Move 37 in Game 2. For me, I had been waiting for a moment like that, to use it to kickstart scientific projects like AlphaFold. We started AlphaFold the day we returned from Seoul, that was 10 years ago. I'm going to Korea again soon to celebrate AlphaGo's tenth anniversary. But simply coming up with 'Move 37' isn't enough. That's cool and useful, but can it invent Go? That's what I want to see. I want a system where, if you give it a high-level description, it could invent Go. For example, a description like: a game you can learn the rules of in 5 minutes but takes a lifetime to master; aesthetically beautiful, yet you can play a game in a couple of hours one afternoon. You can imagine this is the description I give, and I want the returned result to be Go. Clearly, today's systems can't do that. So the question is why; I think something is still missing there.

Garry: Maybe someone in this room could build it.

Demis: If so, the answer is 'nothing is missing,' it's just the way we use the systems. That might actually be the answer. Maybe our systems today already have this capability, provided a sufficiently brilliant, creative person uses them, provides the project's thrust and soul, and masters the tools to the point of almost merging with them. I could imagine, if you try these tools day and night (as many of you are doing) and combine that experience with true deep creativity, you could make something even more incredible.

Chinese models lead in open-source, and edge devices are best with open-source models

Garry: Let's pivot to talk about open-source models, or open-weight models. The recently released Gemma, you are building extremely capable and accessible open-source models, even runnable locally. What does this mean to you? Will AI become something in users' hands, rather than mostly living in the cloud? Does this change who can build with these models?

Demis: Broadly speaking, we are strong supporters of open source and open science. You mentioned AlphaFold at the beginning; we released it all publicly for free, and even today, all our scientific work is published in leading journals. We want to create world-leading models at their respective sizes, and we hope Gemma has already achieved that. We are heavily committed to this path and hope you all will experiment with, build on, and enjoy using Gemma. I think the downloads have already reached 40 million, and that's in just two and a half weeks. So we're very excited about this. At the same time, I think having a 'Western tech stack' in the open-source arena is important. Obviously, many Chinese models are excellent; they are currently leading in the open-source domain, and we believe Gemma is very competitive at every size point. For us, it involves resources, talent, and compute; nobody has enough idle compute to simultaneously make two maximum-scale frontier models with different properties. That's very difficult. But what we have decided so far is that our edge models (things we intend to use on Android, smart glasses, and robots) are best as open-source models, because once you deploy them onto these endpoints, they are inherently exposed on the device anyway. So they might as well be fully open, right? Therefore, we made a decision to unify this at the 'Nano' size level. This works for us strategically as well. We want as many people as possible building on it, and of course, we ourselves will build on it.

Multimodal model Gemini has long-term advantages; Genie is very important for robotics

Garry: Earlier, before we came on stage, I showed you my version of the Samantha demo from 'Her.' For me, it's a bit nerve-wracking trying to demo things to you. But it worked out, which was awesome. Gemini was born multimodal. I spend a lot of time on these models; I mean, the contextual depth and tool-calling capabilities during direct voice interaction with the model are, honestly, the strongest out there right now, bar none.

Demis: Yes. I think this is still a slightly underestimated aspect of the Gemini family, that we designed it to be multimodal from the beginning. That actually made the task harder initially, harder than just focusing purely on text, but we believed we would benefit from it in the long run. I think we are now seeing the payoff in terms of building world models. For instance, Genie, which we built on top of Gemini, I think is very important for robotics. That's why Gemini Robotics (which many of you may have tried) will be built on top of the multimodal foundation model. We think Gemini's strong advantage in multimodality gives us a competitive edge. We are increasingly applying this to projects like Waymo. Also, if you imagine a device and assistant going with you into the real world (be it a phone, glasses, or something else), it needs to understand the physical world around you, intuitive physics, and the physical context you're in. This is exactly what our systems are extremely good at, and I suspect that's why you like using them in your setup. We plan to continue pushing in this direction; I think we have the strongest models currently for handling these types of problems.

For decades to come, inference will not be "basically free"

Garry: The cost of inference is dropping rapidly. When inference is basically free, what becomes possible? How does this change the direction your team is actually optimizing for?

Demis: I'm not sure inference will truly become "basically free." There's something like the Jevons Paradox here; I think we will end up using whatever compute we can get our hands on. You can imagine millions of agents, swarms of agents working collaboratively; that's one way to consume inference compute. Or you could imagine single agents or smaller groups of agents thinking in multiple directions and then doing ensembling. We are experimenting with all these things, and you probably are too. I think all this will consume all available inference compute.

(Note: The Jevons Paradox, also known as the Jevons effect, is a classic phenomenon in economics. Its core conclusion is that when technological progress significantly increases the efficiency with which a resource is used, total consumption of that resource often rises rather than falls.)
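
(Note: A toy calculation can make the paradox concrete. Assume, purely for illustration, that demand for work has a constant price elasticity, so that a drop in the cost of a unit of work raises the amount of work demanded by a fixed power of that drop. The function name and the numbers below are hypothetical and are not estimates of real inference demand.)

```python
def resource_use_multiplier(efficiency_gain: float, demand_elasticity: float) -> float:
    """Factor by which total resource consumption changes after an efficiency gain.

    Assumes constant-elasticity demand: work demanded scales as
    cost ** (-demand_elasticity).
    """
    cost_multiplier = 1.0 / efficiency_gain                   # each unit of work gets cheaper
    work_multiplier = cost_multiplier ** -demand_elasticity   # so more work is demanded
    return work_multiplier / efficiency_gain                  # each unit also needs less resource


# Usage: a 10x efficiency gain under three hypothetical demand responses.
elastic = resource_use_multiplier(10.0, 1.5)    # above 1: total consumption rises
inelastic = resource_use_multiplier(10.0, 0.5)  # below 1: total consumption falls
neutral = resource_use_multiplier(10.0, 1.0)    # about 1: the two effects cancel
```

Under these assumptions, total consumption grows whenever the elasticity is above 1, which is the regime the paradox describes.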

I mean, maybe one day the cost can be almost zero. Of course, if we solve nuclear fusion, superconductors, battery optimization, or some combination thereof (which I think we will, through materials science), energy costs will be basically zero, but there will still be costs for the physical manufacturing of chips and such. I think for at least the next few decades, there will still be some bottleneck along these lines. If so, there will still be a quota on the inference side, and we will still need to use it efficiently.

A complete 'virtual cell' using AI is about 10 years away

Garry: Fortunately, small models are getting smarter and smarter, which is fantastic. There are many bio and biotech founders in the audience. AlphaFold 3 took us beyond proteins into the broader biomolecular realm. How far are we from simulating a complete cellular system? Or is this still a fundamentally different order of magnitude of difficulty?

Demis: We spun out Isomorphic Labs from DeepMind after AlphaFold 2, and it is going incredibly well. It's not just developing AlphaFold; as many of you know, AlphaFold is just one link in the drug discovery chain. We are trying to do adjacent biochemistry and chemistry research to design compounds with the right properties. We will have some major announcements in this area soon. I think progress is going very well. Ultimately, you want a complete 'virtual cell.' I mention this in many science talks: a fully working simulation of a cell, which you can perturb, and whose output is close enough to experimental results to have practical utility. You can skip a huge number of search steps, generate vast amounts of synthetic data to train other models, and then predict real cell behavior. I think we are roughly 10 years away from a complete virtual cell. We are starting to work on this from the science side at DeepMind now, first with a virtual nucleus, as it's relatively self-contained. The trick to handling all these problems is: can you slice out a chunk of the complexity? Ultimately, you want to simulate the human body, but can you simulate it at the right level of detail? What sufficiently independent slice can you pull out? You can simulate and approximate the inputs and outputs of this independent system, then just focus on the system itself. From this perspective, the nucleus is very interesting. Another issue is that there's not enough data yet. You need data. I've talked to leading scientists working on various electron microscopy and other imaging techniques. If we could image a live cell without killing it, that would be revolutionary, because then you could turn it into a vision problem, and we know how to solve vision problems. But currently, I haven't seen any technique that provides nanometer-scale resolution without destroying the cell while also capturing all the dynamic interactions. You can obviously take static images at that resolution, which are now very detailed, but that's not enough to turn it into a vision problem. So there are two possible paths: a hardware-driven approach that supplies the data, or better deep learning-based simulators for these dynamic systems, which is the more modeling-heavy solution.

AI will be the ultimate tool for science, used to solve the "root node" problems in science

Garry: You've been looking at various scientific domains, not just bio, but material science, drug discovery, climate modeling, mathematics. If you had to rank the scientific fields that will be most dramatically transformed over the next five years, what's on your list?

Demis: They are all incredibly exciting. This is the main motivation and original intent behind my over 30 years of work on AI: using AI as the ultimate tool. I've always believed AI would be the ultimate tool for science, to accelerate scientific understanding, scientific discovery, medical progress, and our comprehension of the universe around us. Actually, when you look at our original mission statement (which is still how we think), it involved two steps. Step one: solve intelligence, i.e., build AGI. Step two: use it to solve everything else. Over time, we had to modify it slightly because people would ask, 'Do you really mean solve everything else?' We literally meant it. I think people are starting to understand what that means today, but specifically, I mean solving the 'root node problems' in science. Those are scientific problems that, if cracked, can open up entirely new branches of research or avenues of discovery. AlphaFold is the prototypical case of what we wanted to do. Over 3 million researchers globally, practically every biologist in the world, now use AlphaFold. Some pharmaceutical exec friends of mine tell me that almost every new drug developed from now on will use AlphaFold at some stage in the R&D process. That's something we're incredibly proud of, and the kind of impact we hope to have with AI. But I do think this is just the beginning. I haven't seen a scientific or engineering field where it can't help. As for the domains you mentioned, I feel like we are at an 'AlphaFold 1' moment. We have very promising results, but haven't fully cracked the grand challenge in each field yet. But I think in the next few years, we'll have a lot to talk about in all these areas. The fields you mentioned, from materials science all the way to mathematics, are all very exciting.

Garry: I mean, it feels somewhat Promethean. Like this capability has been handed to humanity.

Demis: I suppose so. Of course, the Prometheus myth also comes with a caution: we must be careful about how we use it and for what purpose, and about the misuse that the same set of tools could enable.

Combining AI with Deep Tech creates enormous value; pursue what you truly love

Garry: There are many in this room trying to build companies applying AI to science. In your view, what distinguishes a startup genuinely pushing the frontier forward from one that just wraps an API around a foundation model and calls it 'AI-driven science'?

Demis: Look, here's what I would recommend. I was thinking, if I were in Y Combinator's position, what would I do? One thing you must do, obviously, is keep track of where AI technology is heading. That's one of the tough parts. But I do think there is tremendous space in combining the direction AI is heading with some area of deep technology. I think that 'golden overlap' exists right there, whether it's materials, medicine, or other deeply difficult scientific domains. For those interdisciplinary areas, especially ones involving the 'world of atoms' (the physical world), there are no shortcuts, at least for the foreseeable future. These areas are relatively safe from being immediately swamped by the next foundation model update. So, if you are looking for opportunities, I think that's one of the more defensible areas. I've always loved Deep Tech, so I am biased toward it. I think anything worthwhile and enduring is rarely easy, which is why I've always been drawn to it. Obviously, AI was that way when we started in 2010, right? Back then, investors and even academics told me, 'we tried it in the 90s and we know it doesn't work'; it was considered a very niche topic. But if you have conviction in your idea and a thesis for why this time is different, or a unique combination in your background (ideally, you are an expert in both machine learning and the domain you're applying it to, or you can create a founding team with that expertise), I think you can have a huge impact and build vast value there.

Garry: That's a very important message. I mean, it's easy to forget that once you are successful, you are successful; but before you are, everyone is against you.

Demis: Oh, absolutely. I mean, no one believed in it. That's why I think you have to work on something you truly love. For me, I was going to work on AI no matter what. From a very young age, I decided this was the most impactful thing I could think of. It turned out to be true, but it might not have been; maybe we could have been 50 years too early. And it was also the most interesting thing I could think of. So, even if we were still in some tiny garage today, and it still wasn't working well, I would still be working on AI today. I would still find a way, maybe back in academia or something else, but I would find a way to continue researching it.

Scientific fields ripe for an AlphaFold-like breakthrough: a "vast combinatorial search space," a clear objective function, and sufficient data

Garry: AlphaFold was a case where you pursued it and succeeded. So, what makes a scientific domain ripe for an AlphaFold-style breakthrough? Is there a pattern or specific objective function?

Demis: When I have five free minutes, I should write this down. But the lesson I've learned from all our Alpha projects, specifically AlphaGo and AlphaFold, is this: the problems that suit the techniques we have, and that I like to hunt for, are those well characterized as 'vast combinatorial search spaces.' In a way, the vaster the space, the better. It means no brute-force or special-case algorithm can solve it. Both the possible moves in Go and the different configurations of proteins have possibilities far exceeding the number of atoms in the universe. Then, you need a clear objective function. For example, you can view it as minimizing free energy in a protein or winning a game of Go. You need a clearly specified objective function so you can do hill-climbing optimization. Next is enough data, or a simulator that can generate plenty of in-distribution synthetic data for you. If these conditions are met, I think using today's methods, you can go very far in solving the problem and finding the 'needle in a haystack' solution. By the way, I view drug discovery in the same way, right? There is definitely some compound out there that can address this disease with no side effects and so on, provided we can find it. As long as the laws of physics allow it, the only question is how to find it in an efficient, tractable manner. I think we proved for the first time with AlphaGo that these systems can find that needle, in that case, the perfect Go move.
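As a rough illustration of the recipe described above (a very large combinatorial space, a clearly specified objective function, and hill-climbing on that objective), here is a minimal Python sketch. It is a toy under stated assumptions, not how AlphaGo or AlphaFold work: the hidden bit string `TARGET`, the `matching_bits` objective, and the `flip_one_bit` neighborhood are all hypothetical, and the objective is deliberately smooth so that plain hill climbing cannot get stuck. Real problems such as Go or protein folding have far more rugged landscapes and rely on learned models to guide the search.

```python
import random


def hill_climb(objective, start, neighbors):
    """Steepest-ascent hill climbing: move to the best neighbor until none improves."""
    current, current_score = start, objective(start)
    while True:
        best = max(neighbors(current), key=objective)
        best_score = objective(best)
        if best_score <= current_score:
            return current, current_score  # no neighbor is better: local optimum
        current, current_score = best, best_score


# Toy search space: all bit strings of length 40, i.e. 2**40 candidates.
N_BITS = 40
rng = random.Random(0)
# Hypothetical hidden "needle in the haystack" that the search must find.
TARGET = tuple(rng.randint(0, 1) for _ in range(N_BITS))


def matching_bits(bits):
    """Objective function: number of positions that agree with the hidden target."""
    return sum(a == b for a, b in zip(bits, TARGET))


def flip_one_bit(bits):
    """Neighborhood: every bit string that differs from `bits` in exactly one position."""
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]


solution, score = hill_climb(matching_bits, (0,) * N_BITS, flip_one_bit)
print(score, solution == TARGET)
```

Enumerating all 2**40 candidates would take about a trillion objective evaluations, while this guided search needs fewer than 2,000. That gap is the reason a clear objective matters: it turns an intractable enumeration into a sequence of small, checkable improvements.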

AI is close to making true scientific discoveries: the 'Einstein Test'

Garry: Getting slightly more meta. We talk about humans using these methods to create AlphaFold, but there's a meta-level: humans using AI to explore the space of possible hypotheses. How far are we from an AI system that can do genuine scientific reasoning, and not just pattern matching on data?

Demis: I think very close. We're working on general systems like this. We have a system called Co-scientist, and other algorithms like AlphaEvolve, which can do more than a base Gemini; obviously, all frontier labs are experimenting in this area. So far, although we are all working on the same things, like math problems harder than those at the International Mathematical Olympiad (IMO), I haven't seen any truly significant discovery yet. That's my own assessment. I think it's imminent. I think it might relate to the creativity we talked about earlier and going beyond the known frontier. Obviously, at that point, it's no longer just pattern matching, because there are no patterns to match; it's a step beyond extrapolation. It's a kind of analogical reasoning, which I think these systems currently lack, or at least we aren't using them in the right way. So, in science, I often ask: can it propose a truly interesting hypothesis, not just solve one? And by 'just solve one,' I mean things like solving the Riemann hypothesis or another Millennium Prize problem. That would obviously be remarkable, and maybe we are a few years away from doing that. I'd love to solve the P vs NP problem; that's my favorite. But even harder than that is: can it propose a new set of Millennium Prize problems that top mathematicians would consider just as profound, meaningful, and worthy of dedicating a life's work to? I think that is a higher level of difficulty, and I think we still don't know how to do that. Still, I don't believe this is supernatural; I think these systems will eventually be able to do this. Maybe we're just missing one or two things. One test we sometimes use is what I call the 'Einstein Test': Can you train a system with the physics knowledge available in 1901, and then see if it can propose the 'Annus Mirabilis' results, including special relativity, as Einstein did in 1905? Can it do that? I think we can run this test and observe if it's possible.
Once achieved, I think we'll be on the verge of these systems being able to invent truly novel things.

Predicts AGI arrival around 2030; advises building things that will still be useful in the AGI era

Garry: Final question, for the technical people in this room who want to work on massive-scale AI, even approaching the scale of what you've built. You've been a pioneer for many years, and this is one of the world's greatest AI efforts. For that, I think everyone in this room gives heartfelt thanks to you and your colleagues at DeepMind. Thank you. Regarding building systems at the frontier, what do you know now that you wish you had known when you were 25?

Demis: I think we've covered a part of it, which is that you'll find pursuing hard and profound problems is, in some ways, no harder than pursuing shallow, easy, more superficial ones. They are just difficult in different ways. But considering life is very short, and your time and energy are limited, you might as well invest your energy in things that would not happen, or would not move forward, unless you did them. I would view things through that lens. Another thing is, we talked about Deep Tech, and I love interdisciplinary work. I think combining domains and searching for the connections between them will be even more prevalent in the coming years, and using AI will make this much easier. Finally, I would say, depending on your AGI timeline (my estimate is around 2030), if you start a Deep Tech journey today, it's typically, in my estimation, a 10-year journey. Now you have to contemplate AGI arriving mid-journey. What does that mean? It's not necessarily bad, but you have to factor it in: Can your project leverage it? What will AGI systems do with it? This goes back a bit to what you mentioned earlier about AlphaFold and general AI systems. One thing I can foresee is that models like Gemini, Claude, or these general systems will use specialized systems like AlphaFold as tools. I don't think we'll stuff everything into one giant 'brain,' because there would be too much degradation; putting all protein information into Gemini would make no sense. We don't need Gemini to do protein folding. Going back to your point on information efficiency, it would absolutely negatively affect its language capabilities. Therefore, I think a better approach is to have very good general tool-calling models that can even train those specific tools, but the tools exist in independent systems.
Thinking through the implications is interesting, including what you might build today, and also the physical layer, like what kind of factories you'd build, what kind of financial systems, and so on. I think you need to take this seriously: imagine what that world will look like, and then build things that will still be useful when that world arrives mid-journey.
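The architecture Demis describes, a general model that calls specialized systems as independent tools instead of absorbing them, can be sketched roughly as follows. Everything here is a hypothetical stand-in for illustration: `fold_protein` and `prove_lemma` are stubs, not real APIs, and `general_model` uses a trivial prefix rule where a real system would have a language model emit a structured tool call.

```python
from typing import Callable, Dict, Tuple

ToolFn = Callable[[str], str]


def fold_protein(sequence: str) -> str:
    """Hypothetical stub for a specialized structure predictor living in its own system."""
    return f"predicted structure for {len(sequence)}-residue sequence"


def prove_lemma(statement: str) -> str:
    """Hypothetical stub for a specialized mathematics tool."""
    return f"proof attempt for: {statement}"


# Tool registry: specialized systems stay outside the general model.
TOOLS: Dict[str, ToolFn] = {
    "fold_protein": fold_protein,
    "prove_lemma": prove_lemma,
}


def general_model(request: str) -> Tuple[str, str]:
    """Stand-in for the general model: choose a tool name and the argument to pass it.

    The prefix rule below is only an assumption to keep the sketch runnable.
    """
    if request.startswith("fold:"):
        return "fold_protein", request[len("fold:"):].strip()
    if request.startswith("prove:"):
        return "prove_lemma", request[len("prove:"):].strip()
    raise ValueError(f"no tool available for request: {request!r}")


def run(request: str) -> str:
    """Route a request through the general model to the chosen specialized tool."""
    tool_name, argument = general_model(request)
    return TOOLS[tool_name](argument)


print(run("fold: MKTAYIAK"))
```

The design point is the separation: a tool can be retrained or replaced without touching the general model, which is the degradation concern raised in the answer above.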

Garry: Demis Hassabis, everyone. (Applause)

Reference link:

https://www.youtube.com/watch?v=JNyuX1zoOgU