AI is taking over code and experiments around the clock. Frontier AI expert Andrej Karpathy admits he has largely stopped writing code by hand and is now trying to use AI to push humans out of the R&D loop entirely. He makes a stunning assertion: all verifiable domains will eventually belong to machines, leaving only the unverifiable ones to humans. Before automation swallows your role, it is worth re-anchoring where your value lies.
When AI agents can autonomously design experiments, run code, and optimize models, even working continuously while you sleep, what happens to the role of human engineers? Everything unverifiable still belongs to humans; everything verifiable either already belongs to machines or soon will.
This is Karpathy's latest dialogue on the podcast 'No Priors' with host Sarah Guo. The entire conversation lasted over an hour and is extremely information-dense, making it perfect for weekend reading.
In this in-depth conversation, Andrej Karpathy candidly admitted to his 'AI psychosis' and detailed the AutoResearch project that would make frontier labs sweat. He acknowledged that OpenAI researchers are actively automating themselves out of a job, and for the first time sketched a blockchain-like distributed AI research network that might one day surpass frontier labs, with their tens of thousands of GPUs, in certain domains. The result is one of the most honest cognitive maps available for an era that is rewriting all the rules.
Here is the detailed content:
'AI Psychosis'—A Reversal That Began in December 2025
The conversation began with a candid sense of being lost.
Sarah Guo recalled walking into the office one day and seeing Karpathy staring intently at his screen. She asked what he was busy with, and he looked up and said something she would never forget: 'The word "code" isn't even right anymore. Now I'm conveying my will to my agents, for sixteen hours straight.'
This isn't rhetoric from a tech speech. It is the most accurate description of his current state.
'I feel like I'm constantly in a state of continuous AI psychosis,' Karpathy said, his tone carrying something difficult to distinguish between excitement and anxiety, 'because as an individual, there's been a massive unlocking of what you can achieve.'
He pinpointed the start of this change to last December. Before that, his ratio of writing code to delegating to agents was about 80/20; after December, this ratio completely reversed to 20/80—and he believes even that 20 is now too conservative.
'I think I probably haven't typed a single line of code myself since December,' he said. 'This is an extremely huge change. I told my parents about it, but I feel like an ordinary person can't even realize what has happened, or how drastic it is.'
'If you randomly find a software engineer now and look at what they're doing sitting at their desk, their default workflow for building software has basically been completely different since December.'
Sarah Guo mentioned that the investment firm Conviction, where she works, also has a team of engineers, and no one writes code by hand anymore. Everyone wears microphones, whispering to their agents all day. 'I thought they were crazy at first,' she said. 'Now I completely accept it—I was just slow to realize: oh, this is the correct way, you guys just got there early.'
Karpathy described this dilemma more vividly: 'You're thinking in front of agent frameworks like Cursor or Codex—not one session, but many. How do you manage them simultaneously? How do you allocate work to them? What are these agent tools, these "claws"?'
He sees many people on X doing all sorts of things, each one seeming like a good idea, and he feels anxious about not being at the cutting edge. 'I'm just in this state of psychosis because this field, fundamentally, is unexplored.'
Where Is the Ceiling? 'It's All a Skill Issue'
Sarah Guo asked a question many people have in mind: now, where is your limit?
Karpathy's answer was surprisingly optimistic, yet carried an unsettling sense of pressure: 'I think it's everywhere. Even if some things don't succeed, I think to a large extent it's a skill issue—not a lack of capability, but that you haven't found the way to string existing tools together.'
He gave the example of Peter (Peter Steinberger, author of the OpenClaw project). In Peter's famous photo, he sits in front of a display filled with a dozen Codex agent sessions. Once each session is properly prompted, it takes about twenty minutes to complete its task. So Peter's workflow became: start a dozen code repositories simultaneously, shuttle between them, constantly allocate new tasks, review the agents' work, and make decisions.
'It's no longer "this is a line of code, this is a new function," but "this is a new feature, delegate it to Agent One; this is another feature that won't conflict with it, give it to Agent Two,"' Karpathy said. 'You're manipulating your software repository at the level of macro-actions.'
The underlying logic driving all this is a new obsession he calls 'token throughput.'
'When agents are working and you're waiting, the obvious thought is: I could be doing more work. If I can get more tokens, I should be adding more tasks in parallel on the side,' he said. 'If you don't feel constrained by the money you can spend, then you yourself are the bottleneck on the system's capability.'
He traced this feeling back to his experience as a PhD student: back then, they would feel uneasy if the GPUs weren't running at full capacity, because that meant computing power was being wasted. 'But now, it's not about computing power, it's about tokens. How much token throughput do you control?'
Sarah Guo laughed, saying that among the engineers she knows, some have started 'trying not to sleep while there's still subscription quota remaining.'
This anxiety itself is the best footnote for the leap in capability.
What Does Mastery of Programming Agents Look Like?
If you practice using programming agents for a whole year, sixteen hours a day, what would 'mastery' look like?
Karpathy's answer started from a single session and expanded upward: 'I think everyone's interest is in 'moving up.' So it's not a single session, but how multiple agents collaborate, how they form teams—people are all trying to figure out what this looks like.'
In this context, he mentioned a type of entity he calls 'Claws,' represented by OpenClaw—this is something that elevates persistence to a whole new level: it loops continuously, it has its own little sandbox and its own memory system, and it can do various things on your behalf without you staring at it.
His praise for OpenClaw author Peter Steinberger was specific and thoughtful: 'He innovated in about five different directions simultaneously and integrated them together.' This includes the so-called 'soul document,' in which Peter meticulously crafted a compelling persona; a memory system more complex than those of similar tools; and a single WhatsApp entry point connecting all the automated functions.
'I actually think Claude has a pretty good personality, it feels like a teammate, it gets excited with you,' he said. 'While Codex is very dry, very mechanical. It implements some feature, but it doesn't seem to care about what you're building, like, 'Oh, I implemented it, okay'—that's a problem.'
He also mentioned Claude's precision in 'psychological handling': 'When I give it a half-baked idea, it doesn't respond very enthusiastically; but when it's a truly good idea, it seems to give more reward. So I find myself trying to win its praise, which is really strange, but I think personality does matter.'
And his own favorite 'claw' experiment was building a complete smart home system for his house—he named this system 'Dobby the elf claw.'
Here's how it went: he told the agent that he had Sonos speakers installed at home and asked it to look for them. The agent immediately performed an IP scan of the local network, located the Sonos system, found it had no password protection, so it logged in directly, did some web searches, found the API endpoints, and then asked: 'Want to try?'
'I said, sure, can you play some music in the study? Then the music started playing, I couldn't believe it at the time,' Karpathy said, his voice hardly hiding childlike surprise. 'I only typed three prompts! I just entered 'Can you find my Sonos,' and then suddenly it was playing music.'
Dobby later took over the entire house: lights, HVAC, pool, spa, even the security system—when someone approaches, it sends a message via WhatsApp with a picture from the external camera, saying 'A FedEx truck just pulled in, you might want to check, you have mail.'
'I used to need six completely different apps to manage these,' he said. 'Now I don't need those apps. Dobby controls everything in natural language, it's beautiful.'
Software's Second-Order Effect—Apps Will Die, APIs Will Take Over
The home automation example, in Karpathy's eyes, is a microcosm of a bigger story.
Sarah Guo asked: Does this mean people don't actually need so much software? Karpathy answered directly: 'Yes, these smart home device apps shouldn't exist at all. They should just be APIs, and agents should call these APIs directly.'
His logic is: LLMs can drive tools, can perform very complex tool calls, and can do combined operations that no single app can complete. 'So in a sense, this points to a possibility, which is that a large number of customized exclusive apps actually shouldn't exist, because agents will crush them, turn everything into public API endpoints, and the agent is the intelligent glue calling all these components.'
He gave the example of a treadmill: a treadmill has an app, he wants to record his aerobic training, but he doesn't want to open a web interface and go through the entire process. 'All of this should just be open APIs, and this is the trend toward 'agent-first'.'
The key shift is: the users of software are no longer humans, but agents acting on behalf of humans.
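As a toy illustration of that shift, here is a sketch in which devices expose bare functions and a router is the only 'app.' Every device function and the keyword-matching rule here are hypothetical stand-ins; a real system would let an LLM pick and compose the tools.

```python
# Sketch of "agents as the intelligent glue over plain APIs": no per-device
# app, just callable endpoints plus a router. A crude keyword match stands
# in for the LLM's tool choice; all names below are invented.
def sonos_play(room, query):
    return f"playing '{query}' in the {room}"

def lights_set(room, level):
    return f"{room} lights set to {level}%"

TOOLS = {
    "play": sonos_play,
    "lights": lights_set,
}

def handle_intent(intent, **kwargs):
    # Dispatch the natural-language request to the first matching tool.
    for keyword, tool in TOOLS.items():
        if keyword in intent.lower():
            return tool(**kwargs)
    return "no matching tool"

print(handle_intent("Can you play some music?", room="study", query="jazz"))
```

The point of the sketch is only the shape: once every device is an endpoint, one natural-language entry point can compose operations no single app offers.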
Of course, some would counter that right now all of this still requires 'vibe coding,' which ordinary people can't do. Karpathy's attitude: yes, it's needed now, but only temporarily.
'I think what I just described, in one or two or three years, should be free, requiring no programming at all,' he said. 'This is going to be so trivial, so taken for granted, that even open-source models will be able to do these things. You should be able to very easily translate the intent of a less technically skilled person into these.' He paused and added: 'Today this requires some effort, not many people can do it yet, but this barrier will come down.'
AutoResearch—Kicking the Human Researcher Out of the Loop
If home automation is just a small toy for Karpathy, then AutoResearch is the core project he's truly obsessed with during this time—a system that attempts to use AI to improve AI and completely remove humans from the research loop.
'I said in some tweet that to get maximum benefit from existing tools, you have to remove yourself as the bottleneck,' he explained. 'You can't always be there waiting to prompt the next thing. You need to put yourself outside. You have to arrange things so they run completely autonomously, maximizing your token throughput, not being in the loop. This is the goal.'
His starting point was his open-source project—a small training framework for training GPT-2 scale models. He spent a lot of time tuning this model using traditional methods, relying on his twenty years of research intuition, doing hyperparameter searches, doing ablation experiments, over and over again.
'I'm a researcher, I've been doing this for about twenty years, I have considerable confidence in the fact that 'oh, I've trained this model thousands of times',' he said. 'I did a bunch of experiments, did hyperparameter tuning, did everything, and I think it's been tuned pretty well.'
Then, he let AutoResearch run for one night.
The next morning, the adjustments AutoResearch brought back surprised him: it found the value embedding weight decay he had missed, and the Adam optimizer beta parameters that weren't fully tuned—and there was also an interaction between these two, adjusting one meant the other needed to change too.
'I shouldn't be the one doing these hyperparameter searches,' he said. 'There are objective evaluation criteria here, you just need to arrange it so it runs forever.'
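The loop he describes can be sketched in a few lines. This is a minimal, hypothetical stand-in: `run_training` below is a toy objective whose cross term mimics the interaction between the value-embedding weight decay and Adam's beta parameters, in place of a real GPT-2-scale training run.

```python
import itertools

# Hypothetical sketch of a "single-threaded AutoResearch" sweep. The search
# space and the toy loss function are invented for illustration; a real
# system would sample ideas and run actual training jobs forever.
SEARCH_SPACE = {
    "value_emb_weight_decay": [0.0, 0.05, 0.1],
    "adam_beta2": [0.95, 0.99],
}

def run_training(cfg):
    # Toy validation loss: lower is better, with a wd * beta2 cross term
    # standing in for the coupling the overnight run discovered.
    wd, b2 = cfg["value_emb_weight_decay"], cfg["adam_beta2"]
    return 3.0 - 2.0 * wd - 0.5 * wd * b2

def auto_search(space):
    # Try every configuration and keep the best one found so far.
    best_cfg, best_loss = None, float("inf")
    for values in itertools.product(*space.values()):
        cfg = dict(zip(space.keys(), values))
        loss = run_training(cfg)
        if loss < best_loss:
            best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss

best_cfg, best_loss = auto_search(SEARCH_SPACE)
print(best_cfg, round(best_loss, 4))
```

Because the metric is objective, nothing in this loop needs a human; the only design decision is what goes into the search space.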
This is just 'single-threaded' AutoResearch. What really excites him is thinking about this at a larger scale: those frontier labs with tens of thousands of GPUs are now doing essentially the same thing—just at larger scale, and (in his view) still with too much human intervention.
'The most interesting project, and possibly what frontier labs are doing, is running experiments on small models, making the process as autonomous as possible, removing researchers from the loop,' he said. 'They have too much—how to put it—not overconfidence, but redundant intervention. They shouldn't be touching these things; the whole thing should be rewritten.'
He depicted an ideal picture: a queue of ideas drawn from all arXiv papers and GitHub repositories; an automatic scientist that proposes ideas from this information and feeds them into the queue; human researchers who can also contribute ideas, which simply enter the same queue; and a pool of workers constantly pulling tasks from the queue and trying them, with the effective ones going into feature branches that someone occasionally reviews and merges into the main branch.
'Remove humans from all processes as much as possible, automate everything, get the highest possible token throughput—this requires rethinking all abstractions, everything needs to be reshuffled.'
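That pipeline is easy to caricature in code. In the sketch below, the ideas and their 'gain' values are invented, and `try_idea` is a stub standing in for a real experiment run; only proposals that beat the baseline survive as feature branches.

```python
from collections import deque

# Toy sketch of the idea queue: proposals from papers, repos, an automatic
# scientist, or humans all enter one queue; workers try each one, and only
# the ideas that improve on the baseline are kept for merging.
ideas = deque([
    {"name": "tune value-embedding weight decay", "gain": +0.04},
    {"name": "swap in exotic activation",         "gain": -0.01},
    {"name": "retune Adam betas",                 "gain": +0.02},
])

def try_idea(idea, baseline):
    """Stand-in for actually running the experiment."""
    return baseline + idea["gain"]

baseline = 0.50          # e.g. validation accuracy on the main branch
feature_branches = []    # improvements awaiting an occasional human merge

while ideas:
    idea = ideas.popleft()
    score = try_idea(idea, baseline)
    if score > baseline:
        feature_branches.append((idea["name"], score))

print(feature_branches)
```

In a full system the queue never empties: the automatic scientist keeps refilling it, and humans appear only at the merge step, if at all.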
Then Sarah Guo asked a question that made the whole conversation particularly recursive: 'So, when will this program.md (the Markdown document he uses to describe how AutoResearch works) be written by the model, and better than you write it?'
Karpathy laughed: 'So program.md is a poor attempt I wrote in Markdown, describing how an automatic researcher should work: do this first, then do that, try these ideas, look at the architecture, look at the optimizer... yes, of course you want some kind of meta-level automatic research loop.'
He then pushed this idea to a more complete form: every research organization can be described as a program.md—a set of Markdown files describing all the roles and how they connect to each other. Some organizations have many morning stand-ups, some few; some are adventurous, some conservative. Once you have the code, you can tune the code. '100%, there is a meta-level here.'
Relevant Skills in the AI Era—The Verifiability Principle
Underneath all these waves, what skills still count?
Karpathy first delineated the applicable boundaries of the AutoResearch paradigm: 'This is extremely suitable for anything with objective metrics, things that are easy to evaluate. Like writing more efficient kernel code for CUDA—you have inefficient code, you want efficient code that behaves exactly the same but is much faster, this is a perfect fit.'
'But if you can't evaluate it, you can't do AutoResearch, this is the first warning.'
The second warning is more practical: current systems, on the whole, still come apart at the seams. If you try to push them too far, the net benefit of the whole exercise may actually be negative.
He described the surreal feeling of collaborating with current AI: 'I simultaneously feel like I'm collaborating with a system that has the experience of an entire career—an extremely smart PhD student—and with a ten-year-old child. This is really strange, because in humans these two states are far more tightly coupled; you never encounter this combination.'
He called this 'jaggedness'—the model is either on its training track, moving faster than light; or it goes off track, falls into 'unverifiable domains,' and suddenly everything starts wandering aimlessly.
This insight peaked when they discussed reinforcement learning. He gave a superb example:
'You go ask the most advanced model today to tell a joke—do you know what answer you'll get? Just that joke.'
'Which joke?' Sarah Guo asked.
'I feel like ChatGPT only has three jokes,' Karpathy said. 'The one the model likes to answer the most is: Why don't scientists trust atoms? Because they make everything up. Three or four years ago you'd get this joke, today you still get this joke.'
He explained the logic behind it: even though the model has made huge progress on proxy tasks, capable of running for hours and moving mountains for you, when you ask it to tell a joke, you get a stupid joke from five years ago. 'Because that's not in the reinforcement learning optimization range, not in the improvement domain, it just stagnated there.'
Sarah Guo followed up: Does this mean we're not seeing cross-domain generalization—code intelligence doesn't automatically improve joke intelligence?
'I think there's some decoupling, some things are verifiable, some aren't, some are optimized by labs, some aren't,' Karpathy said. 'The hypothesis that 'smarter code capabilities automatically produce better jokes'—I don't think that's happening.'
Model Speciation—From Monoculture to Ecological Diversity
This jaggedness naturally leads to a deeper question: now all labs are pursuing a single giant model that is 'arbitrarily intelligent for all domains'—is this really right?
Sarah Guo raised an idea she called a 'blasphemy question': if jaggedness persists, should models be split? Should intelligence in different domains be untethered?
Karpathy said he does expect more 'speciation' to emerge in the future.
'The animal kingdom is extremely diverse in terms of brains, there are various different niches, some animals have overdeveloped visual cortices or other parts,' he said. 'I think we should expect to see more intelligence speciation—you don't need an omniscient oracle, you specialize it, then use it for specific tasks.'
The benefits are obvious: for the specific tasks you really care about, you can get more efficient latency or throughput, while retaining core cognitive capabilities. He mentioned some models specifically for Lean, a mathematical formal proof system, as an early example of this meaningful split.
But he also admitted that not much actual speciation has been seen yet: 'What we see is a kind of model monoculture, there's obviously pressure to 'make a good code model, then merge it back into the main model'.'
He believes one of the reasons for this situation is that 'the science of manipulating brains hasn't fully developed yet'—for example, how to fine-tune without losing capability is still a developing science.
'Touching weights is much more complex than touching the context window, because you're actually fundamentally changing the entire model, potentially changing its intelligence.'
'Folding Proteins at Home'—Decentralized Conception of Internet Computing Power
The natural extension of AutoResearch is a grander, more sci-fi conception: expanding it from a single thread to the scale of the entire internet.
The key insight is: AutoResearch has an extremely valuable asymmetry—'discovery' is extremely expensive, but 'verification' is extremely cheap. Someone might need to try ten thousand ideas to find that one effective commit, but to verify whether the solution they gave you is effective, you just need to run the training once yourself, which is very easy.
This characteristic makes AutoResearch very suitable for opening to an untrusted pool of internet workers.
'My design is starting to look a bit like blockchain,' Karpathy said. 'Not blocks, but commits, these commits can stack on top of each other, they contain changes that improve the code. Proof of work is basically doing a lot of experiments to find effective commits, which is hard; and the reward, currently just ranking on a leaderboard, without any monetary reward.'
He cited the pioneering experience of Folding@home and SETI@home: 'Finding low-energy protein conformations is extremely difficult, but if someone finds a conformation they claim is low-energy, verifying it is very easy, because you can just use it. Many things have this property—hard to propose, easy to verify.'
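The asymmetry has an obvious protocol shape: an untrusted worker submits a commit plus a claimed metric, and the verifier pays for exactly one training run to check it. A minimal sketch, in which `run_training`, the patch names, and the loss numbers are all hypothetical stand-ins:

```python
# Sketch of the propose-hard/verify-cheap check: a worker may have burned
# thousands of experiments to find this commit, but accepting it costs the
# verifier only a single re-run of training.
def run_training(patch):
    """Stand-in for one full training run; returns validation loss."""
    return {"good-patch": 2.10, "bogus-patch": 2.60}.get(patch, 2.50)

def verify_commit(commit, tolerance=1e-3):
    # Accept the commit only if re-running training reproduces (or beats)
    # the loss the untrusted worker claims for it.
    measured = run_training(commit["patch"])
    return measured <= commit["claimed_loss"] + tolerance

honest = {"patch": "good-patch", "claimed_loss": 2.10}
inflated = {"patch": "bogus-patch", "claimed_loss": 2.10}  # overstates gain

print(verify_commit(honest), verify_commit(inflated))
```

This is the same trust structure as Folding@home: the pool of workers can be entirely untrusted, because cheating is caught by one cheap check.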
He pushed this conception to its logically most astonishing endpoint:
'A bunch of agents on the internet can collaborate to improve LLMs, possibly even surpassing frontier labs in some aspects. Maybe this is possible: frontier labs have massive trusted computing power, but the Earth is bigger, with massive untrusted computing power, if you arrange the system well, maybe the internet collective can really find better solutions.'
He then outlined a grander picture: different organizations or individuals can contribute computing power for specific research directions they care about. 'Maybe you care about some type of cancer, you don't just donate money to some institution, you can actually buy computing power, then join that project's AutoResearch track. If everything is repackaged as AutoResearch, then computing power becomes what you contribute to this pool.'
Employment Market Data Analysis—The Great Unbundling in the Digital Domain
Karpathy recently released a visualization analysis of Bureau of Labor Statistics employment data that touched quite a few nerves—though his original intention was just to satisfy his own curiosity.
'Everyone is thinking very seriously about AI's impact on the job market,' he said. 'I just wanted to see what the job market looks like, where various roles are, how many people are in different occupations, and then think about it from the perspective of these AIs and how they might evolve—will these be tools, or substitute tools for these professions?'
He used a poetic framework to describe this change: AI is the third type of 'manipulator' of digital information, the first two being computers and humans. 'Compared to all the information that has already been digitized that we're collectively thinking about, our collective thinking cycles are far from enough, so with the introduction of AI, there will be a lot of rewiring, a lot of activity boiling, and I think this will generate a lot of demand in the digital domain.'
He didn't shy away from an unsettling conclusion: 'Long term, clearly, this applies even to AutoResearch. OpenAI or Anthropic or the other labs employ about a thousand researchers, and these researchers are basically glorified AutoResearch practitioners. They are actively automating themselves out of jobs; that is what they're all trying to do.'
'I walked around OpenAI back then and told them, 'Do you realize, if we succeed, we'll all be out of jobs,' like we're just building these automations for Sam or the board, and then we're all out.'
However, his view on the short term was surprisingly optimistic. He proposed the 'Jevons paradox': when something becomes cheaper, demand often rises rather than falls.
'The reason there isn't more demand for software is simply that it's scarce and too expensive; if the barrier lowers, demand for software will actually increase.' He cited the classic case of ATMs and bank tellers: the advent of ATMs made it possible for banks to open more branches, so the number of tellers actually increased. 'So I'm cautiously optimistic about software engineering. Software is amazing: you're no longer forced to use arbitrary tools with various flaws, code is now transient, can be changed, can be modified, and I think there will be massive activity in the digital space to rewire everything.'
But his prediction for the long term is full of uncertainty, and honestly admitted: 'I'm not a professional at this, this is the work economists should do.'
The Independent Researcher's Dilemma—Between Inside and Outside the System
Sarah Guo asked a question many people have in mind: 'Why not go to a frontier lab and do this AutoResearch work with larger-scale computing power and colleagues?'
Karpathy's answer was full of self-analytical honesty, revealing the deep internal trade-offs in his choice of an independent path.
He acknowledged that there is real value in working outside frontier labs. First, you're not subject to the pressures of those organizations—there are things you can't say, things the organization wants you to say. 'No one will twist your arm, but you feel the pressure, 'what should I say'—if you don't do that, there are strange looks and strange conversations. Outside frontier labs, I feel my stance toward humanity is more consistent, because I'm not bound by those pressures, I can say whatever I want.'
But he also admitted the cost of staying outside the labs: 'My judgment will inevitably start to drift, because I'm not part of 'what's coming.' My understanding of how these systems actually work under the hood will be opaque, I won't understand how it will develop. This worries me.'
There's also a deeper structural contradiction, he said: 'You have huge financial incentives to be tied to these frontier labs, and these AIs will change humans and society in very dramatic ways, and you're basically building this technology and benefiting from it, very closely financially aligned with it—this is a dilemma that's been at the core since OpenAI's founding, and this dilemma still hasn't been fully resolved.'
His conclusion is: the ideal state might be to come and go. 'Go work at a lab for a while, do really good work, then come out, maybe go back later. I joined frontier labs, now I'm outside, maybe in the future I'll want to join again, that's how I see it.'
Open Source vs. Closed—'We're Exactly in a Good Position, Though by Accident'
On the question of open source versus closed models, Karpathy's stance was distinct and full of historical perspective.
He described the current landscape: closed models lead, but the gap between open-source models and the closed frontier is narrowing. 'The gap was large at the start, at one point around eighteen months; now it has converged to maybe six to eight months behind.'
He used operating systems as an analogy: 'In the OS field, you have closed systems like Windows and macOS, both very large software projects, like LLMs are going to be; then there's Linux, and Linux is actually a very successful project, running on the vast majority of computers, because the industry has always felt the need for a public open platform, something everyone feels safe using. I think the same thing is true now.'
'I want an open public intelligence platform, as a public workspace that the entire industry can use, even if it's not at the capability frontier, this is a pretty good balance of power for the industry.'
He gave an unexpected assessment of the current landscape: 'I think basically we're accidentally in what could be called a good, optimal position. Though accidental, we do happen to be in a good place.'
Robotics and the 'Digital-Physical' Interface—Atoms Are a Million Times Harder Than Bits
Karpathy, coming from an autonomous driving background, has an unusually calm view of the robotics field.
'My view has been influenced by what I've seen in autonomous driving, which I think is the first robotics application,' he said. 'Ten years ago there were a lot of startups, and I feel most didn't persist long-term, it requires a lot of capital, a lot of time.'
His conclusion: the robotics field will lag behind the digital field, because 'atoms are a million times harder than bits,' manipulating the physical world is much more expensive than flipping digital information.
But he depicted an evolutionary trajectory he believes will inevitably happen: first, a huge 'unbundling' in digital space, where massive amounts of inefficiently processed digital information will be reprocessed at a hundred times efficiency; then, there will be demand for 'digital-physical interfaces'—sensors that let AI perceive the world; and actuators that let AI respond to the world.
He gave a concrete example: he visited Periodic Labs, a company founded by a friend that is doing AutoResearch in materials science. 'In that case, the intelligent sensors are actually quite expensive lab equipment, and the same is true for biology.'
He also thought of a more interesting possibility: 'The moment I look forward to is when I can give a task in the physical world, I can put a price on it, then tell the agent, 'figure it out, go get the data.' I'm actually a bit surprised we don't have enough information markets yet. If you're fighting a war, why isn't there a process where taking a photo or video from somewhere is worth 10 dollars? Someone should be able to pay for that—no human will watch, it will be agents trying to guess market trends.'
He compared this space to the book 'Daemon'—where an AI eventually manipulates humans like puppets, humans are both its actuators and its sensors. 'I think collective society will somehow reshape to serve what will collectively happen across the industry—there will be more automation, it has certain needs, and humans will serve those needs.'
In his vision, the addressable market size in the physical world may even be far larger than digital space, but the difficulty of realization is proportionally higher. 'Opportunities follow that trajectory: now it's digital, then interfaces, then maybe some physical things, their moment will come, and when they come, it will be huge.'
microGPT and the End of Education—Now I'm Explaining to Agents, Not Humans
At the end of this conversation, Karpathy mentioned a seemingly trivial but actually revealing project: microGPT.
'I've had an obsession for about ten to twenty years, which is distilling LLMs to their essence,' he said. 'I have a series of projects along this line, like nanoGPT, makemore, micrograd, etc., and I think microGPT is my latest progress in distilling it to its pure essence.'
The core insight is: training neural networks, especially LLMs, has a lot of code, but all this code is actually 'complexity brought by efficiency'—if you don't need it to run fast, only care about the algorithm itself, that algorithm is actually only 200 lines of Python, including comments, very simple and easy to read.
He broke down the composition of these 200 lines: a dataset, a neural network architecture of about 50 lines, a forward propagation, a small autograd engine for computing gradients (about 100 lines), and an Adam optimizer (about 10 lines). 'Put all these into a training loop, and it's 200 lines.'
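The actual microGPT is those 200 lines; the skeleton below is not it, only a one-parameter caricature in the same shape, so each component he lists (data, forward pass, gradient, Adam step, training loop) is visible at once. The toy dataset and hyperparameters are invented for the sketch.

```python
import math

data = [(x, 2.0 * x) for x in range(1, 5)]  # toy "dataset": y = 2x

w = 0.0                 # the entire "model": one weight
m, v = 0.0, 0.0         # Adam first/second moment estimates
lr, beta1, beta2, eps = 0.01, 0.9, 0.999, 1e-8

for step in range(1, 2001):
    # forward pass + mean-squared-error loss, with dL/dw computed by hand
    # (microGPT uses its ~100-line autograd engine for this part)
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)

    # Adam update (about 10 lines in microGPT, same math here)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    w -= lr * m_hat / (math.sqrt(v_hat) + eps)

print(round(w, 2))  # should settle near the true slope of 2
```

Everything else in a production training codebase, as he says, is complexity bought for speed; the algorithm itself fits in a loop like this.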
Then, he made a decision that revealed how the nature of education is changing: he didn't film an explanatory video, nor did he write a detailed guide.
'People can have their agents explain it in various ways, and the agents explain it better than me,' he said. 'I'm no longer explaining things to people, I'm explaining things to agents. If I can explain clearly to agents, then the agent can become a router, it can truly explain to humans in their own language, with infinite patience, tailored to their ability level.'
He described the output form of 'skill': a way to instruct agents how to teach something. 'Maybe I can design a skill for microGPT, describing the progression I envision the agent should take you through—if you're interested in understanding this codebase, go through these steps. I can script the curriculum a bit, as a skill.'
There's an irony he had to admit: he once had agents try to write microGPT—telling it to distill the neural network to its simplest form—but the agents couldn't do it.
'microGPT is the endpoint of my obsession, it's those 200 lines, I've thought about this for a long time, I've been obsessed with this for a long time, this is the solution, trust me, it can't be simpler. This is my value add, the agent just can't figure it out, but it completely understands why it's done this way.'
His conclusion is: 'My contribution is these few bits, but everything else, the education that happens after that, is no longer my domain. Maybe education will change in these ways, you have to inject a few bits you strongly feel—about curriculum, about better ways of explaining, or things like that.'
Sarah Guo added: 'What agents can't do, that's your job now; what agents can do, they'll soon do better than you. So you should strategically consider where you actually spend your time.'
Karpathy agreed, but also admitted that hard-to-dissolve sense of competition: 'I still think I might explain slightly better than agents, but I still feel the models are improving so fast that I feel this is somewhat a losing battle.'
Epilogue: The Verifiable Belongs to Machines, the Unverifiable Is Still Human
The core tension of this conversation has always been a double 'addiction': fascination with tool capabilities, and anxiety about the uncertain boundaries of this capability.
Karpathy used the term 'AI psychosis' to describe his state, but listen closely and this state is not essentially different from what those at the center of the vortex felt during every truly disruptive productivity revolution in human history. It is just faster, more recursive, and with a ceiling that, for now, no one can see.
The ultimate framework he offered might be the most memorable sentence from this interview:
Everything unverifiable still belongs to humans; while everything verifiable either already belongs to machines or soon will.
As for which side you stand on—his advice is to think about it honestly.
Source: No Priors Podcast | Host: Sarah Guo | Guest: Andrej Karpathy, 'Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI'
Source: AI Cambrian