Why I Think Karpathy is Wrong on the AGI Timeline

Oct 20, 2025 · 10 min

Episode description

Karpathy is confusing LLM limitations with AI system limitations, and that makes all the difference. 

Become a Member: https://danielmiessler.com/upgrade

Transcript

S1

Andrej Karpathy came on Dwarkesh's podcast recently, and I have a number of thoughts. The consensus seems to be that Karpathy thinks AGI is ten years away and therefore Gary Marcus is right, and people like myself and Sholto and all the other people saying AGI is within a few years have just basically lost the war. It's a compelling narrative, but that's not really what happened. He did, however,

say that he thinks AGI is ten years out. But the AGI debate has always hinged on definitions, and I think the one that Karpathy is using is the reason he's wrong. It came from back when he was at OpenAI, and it basically goes like this: an AI that can do any economically valuable work as well as or better than a human. And again, that goes all the way back to, like, I don't know, whenever Karpathy was at OpenAI. This is over five years ago. I simply don't think this is

the best definition to use. I quite like it as a pure definition or as a computer science definition, but I think we should use one that focuses more on practically and directly helping humans and avoiding bad outcomes for humans, as opposed to talking about what's interesting and valuable to AI people like us. I'm worried about human worker replacement, specifically human knowledge work, and that's why I've been using

this definition since 2023. And Dwarkesh is now using this definition as well, which is an AI system that can replace an average knowledge worker. For me, this is a better definition for two reasons. One, it focuses on the fact that it's an AI system and not one particular component of a system, like a model. Two, it provides a more direct benchmark for the thing we care about, which is: are companies actually replacing workers with the system, yes or no? And this system part is extremely key.

I have no reason or even ability to disagree with Karpathy on the limitations of pure LLMs. He recently wrote yet another LLM from scratch, by hand, a thousand lines of code. He is the actual sensei here. Like, I know 0.0017% of what he knows about LLMs. The problem is, AI systems aren't just the LLMs themselves. They're not naked neural nets. When you go to ChatGPT and you're talking with GPT-5, you're not talking to a base neural net,

you're talking to an AI system. You're talking to the result of that initial LLM being shaped and molded with colossal amounts of extra scaffolding and engineering to be the best possible system it can be for doing that particular task, in this case being a chatbot or an assistant. This distinction is crucial because replacing human jobs will also be done through composite, stitched-together systems that are many times

more powerful than their parts. To replace a project manager or an executive assistant, the companies building human worker replacements aren't going to wait for GPT-9 or Gemini 7.5 to maybe solve their problems. Human worker replacement will happen through AI products and systems that work around the pure limitations of LLMs and of individual model intelligence: RAG, expanding context windows, context management, things like that. And the

best example of this is actually Claude Code. It's just a brilliant example. Just throwing out estimates: when Claude Code came out, which was earlier in '25, basically March of '25 when it launched, it was like five times better than Opus, which was its best model at the time, for doing coding tasks and stuff like that. Well, it's less than ten months later and it's already gotten many times better than that. It's like a night

and day difference. Yes, the models got better, but that's not what made the difference. It was constant iterative improvements, grinding towards improving how the AI talks to itself and how humans interact with the AI: coordination, context management, context engineering. And just now they added skills, which takes the whole

thing to a completely different tier. This is exactly the type of efficiency ratchet that will apply to human work replacement. Where we don't have enough context window to read all the company's docs, companies will have or invent systems to do that. Where it's not general enough to match human flexibility, they'll just add so many great use cases and capabilities, based roughly around the agent skills thing that Anthropic just released, that we eventually won't notice because

it'll cover most use cases. The part that concerns me most about the speed of progress towards AI replacing human knowledge workers is not just the speed of the AI system improvement. It's also the fact that the bar is so low. A good portion of our culture's comedy is based on

the utter incompetence of, like, half of our workforce. We're talking about the worst possible customer service, people bragging about how little work they do, making a sport of doing the bare minimum, showing up the bare minimum amount of time, doing hardly any work and getting away with it and getting paid. People absolutely detesting their jobs. Even decent workers just mindlessly punch in and out a lot of

the time. Mediocrity is the baseline, almost by definition. That is what multibillion dollar human worker replacement startups are competing with, not the top 10% performers that, you know, a lot of us know, at least for now. Think of it this way: in the time that we went from Claude Code not existing to getting really, really good to now having shareable work task replacement skills, the bottom 50% of knowledge workers improved by how much? Zero. In the time

since ChatGPT came out, right? So we're talking about late '22. So we're talking about, what is that, over three years. In the time since ChatGPT came out, we're talking about a stark difference in AI before then and now. Three full years go by, and the bottom 50% of knowledge workers improved their capabilities by how much? Again, 0%. The bar for human work replacement is not moving, while the capabilities

of AI systems are going absolutely apeshit. Now, you might push back saying this is only for the people not trying very hard or who aren't that smart or whatever. True. But it doesn't matter. You and me and Dwarkesh and Karpathy are going to be fine. So what? I'm worried about everyone else. If AI only eats the absolute worst bottom 50% of knowledge workers in the next 5 or 10 years, we're still talking about hundreds of millions of jobs,

or even 25%. I just did a bunch of research on this, and the total number of knowledge workers worldwide is right around a billion. 1 billion knowledge workers. So half is a big percentage. That's 500 million people. But let's just say it's 10%. Let's just say it's 25%. And we've already established that these are the least competent people at the job. So no, they won't be pivoting easily to another knowledge work position.
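For a rough sense of scale, the percentages above can be turned into headcounts. This is a minimal sketch that assumes the roughly 1 billion knowledge workers quoted in the episode; the function name `displaced_workers` and the constant `TOTAL_KNOWLEDGE_WORKERS` are hypothetical, introduced only for illustration.

```python
def displaced_workers(total_workers: int, percent: int) -> int:
    """Return the headcount that `percent` of `total_workers` represents."""
    # Integer arithmetic keeps the result exact for whole-number percentages.
    return total_workers * percent // 100


# Assumption: about 1 billion knowledge workers worldwide (the episode's figure).
TOTAL_KNOWLEDGE_WORKERS = 1_000_000_000

for percent in (10, 25, 50):
    count = displaced_workers(TOTAL_KNOWLEDGE_WORKERS, percent)
    print(f"{percent}% of knowledge workers is {count:,} people")
```

Even the most conservative 10% scenario is on the order of a hundred million people, which is why the exact percentage matters less to the argument than the fact that the bar is not moving.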

This is why I disagree with Karpathy on AGI. It's not because he's wrong about LLMs having severe limitations. He's not. But he's focused on the wrong thing. If the thing we care about is AI's near-term and practical impact on humanity, the thing to watch is not the pure LLM tech or the specific technical limitations of RL in achieving continuous learning. It's the trillions of dollars being invested in replacing the worst performing human workers, who will likely never get better

than they already are. Those trillions are being spent on scaffolding workarounds to LLM limitations that provide us just general enough AGI to start replacing people, and from there it will only improve. Given what we've seen in systems like Claude Code, Cursor, and Codex that dramatically magnify model capability, while the models continue to improve along their own axis as well, do you really want to bet that good enough generality won't be hit in the next couple of years? I

wouldn't take that bet. And this is why I think AGI will arrive before 2028. Like a 70% chance. A rough guess; who really knows. And before 2030, I'm guessing 95%. Not because all the stuff Karpathy is talking about will be solved by then, but because it won't matter if it's solved. With trillions of dollars in funding and trillions of dollars in market opportunity, we're almost guaranteed to Claude Code our way past a very low bar of millions of barely there employees.

Transcript source: Provided by creator in RSS feed