2027 Intelligence Explosion: Month-by-Month Model — Scott Alexander & Daniel Kokotajlo - podcast episode cover

2027 Intelligence Explosion: Month-by-Month Model — Scott Alexander & Daniel Kokotajlo

Apr 03, 20253 hr 4 min
--:--
--:--
Listen in podcast apps:

Summary

Scott Alexander and Daniel Kokotajlo discuss their AI 2027 scenario, forecasting AI progress month-by-month. They delve into topics such as the intelligence explosion, AI agents, misalignment risks, and the potential impact on society, touching on cultural evolution, economic transformations, and the balance of power between humans and AI. They also explore policy considerations and the importance of transparency in AI development.

Episode description

Scott and Daniel break down every month from now until the 2027 intelligence explosion.

Scott Alexander is author of the highly influential blogs Slate Star Codex and Astral Codex Ten. Daniel Kokotajlo resigned from OpenAI in 2024, rejecting a non-disparagement clause and risking millions in equity to speak out about AI safety.

We discuss misaligned hive minds, Xi and Trump waking up, and automated Ilyas researching AI progress.

I came in skeptical, but I learned a tremendous amount by bouncing my objections off of them. I highly recommend checking out their new scenario planning document, AI 2027

Watch on Youtube; listen on Apple Podcasts or Spotify.

----------

Sponsors

* WorkOS helps today’s top AI companies get enterprise-ready. OpenAI, Cursor, Perplexity, Anthropic and hundreds more use WorkOS to quickly integrate features required by enterprise buyers. To learn more about how you can make the leap to enterprise, visit workos.com

* Jane Street likes to know what's going on inside the neural nets they use. They just released a black-box challenge for Dwarkesh listeners, and I had a blast trying it out. See if you have the skills to crack it at janestreet.com/dwarkesh

* Scale’s Data Foundry gives major AI labs access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you’re an AI researcher or engineer, learn about how Scale’s Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh

To sponsor a future episode, visit dwarkesh.com/advertise.

----------

Timestamps

(00:00:00) - AI 2027

(00:06:56) - Forecasting 2025 and 2026

(00:14:41) - Why LLMs aren't making discoveries

(00:24:33) - Debating intelligence explosion

(00:49:45) - Can superintelligence actually transform science?

(01:16:54) - Cultural evolution vs superintelligence

(01:24:05) - Mid-2027 branch point

(01:32:30) - Race with China

(01:44:47) - Nationalization vs private anarchy

(02:03:22) - Misalignment

(02:14:52) - UBI, AI advisors, & human future

(02:23:00) - Factory farming for digital minds

(02:26:52) - Daniel leaving OpenAI

(02:35:15) - Scott's blogging advice



Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Transcript

Today, I have the great pleasure of chatting with Scott Alexander and Daniel Cocotelo. Scott is, of course, the author of the blog Slate Star Codex, Astral Codex 10 Now. It's actually been, as you know, a big bucket list item of mine to get you on the podcast. So this is also the first podcast you've ever done, right? Yes. And then Daniel is the director of the AI Futures Project. And you have both just launched today something called AI 2027.

So what is this? Yeah, AI 2027 is our scenario trying to forecast the next few years of AI progress. We're trying to do two things here. First of all is we just want to have a concrete scenario at all. So you have all these people, Sam Altman, Dario Amidai, Elon Musk saying, going to have AGI in three years, super intelligence in five years. And people just think that's crazy because right now we have the chat bot.

It's able to do like a Google search, not much more than that in a lot of ways. And so people ask, how is it going to be AGI in three years? What we wanted to do is provide a story, provide the transitional fossils. So start right now, go up to 2027 when there's AGI, 2028 when there's potentially superintelligence, show on a month-by-month level what happened. kind of in fiction writing terms make it feel earned. So that's the easy part.

The hard part is we also want to be right. So we're trying to forecast how things are going to go, what speed they're going to go at. We know that... The median outcome for a forecast like this is being totally humiliated when everything goes completely differently and if you read our scenario you're definitely not going to expect us to be the exception to that trend. The thing that gives me optimism is

Daniel, back in 2021, wrote kind of the prequel to this scenario called What 2026 Looks Like. It's his forecast for the next five years of AI progress. He got it almost exactly right. Like, you should stop this podcast right now. You should go and read this document. It's amazing. Kind of looks like you asked ChatGPT, summarize the past five years of AI progress, and you got something with like a couple of hallucinations, but basically well-intentioned and correct.

So when Daniel said he was doing this sequel, I was very excited. really wanted to see where it was going. It goes to some pretty crazy places, and I'm excited to talk about it more today. I think you're hyping up a little bit too much. Yes, I do recommend people go read the old thing I did, which was a blog post.

I think it got a bunch of stuff right, a bunch of stuff wrong, but overall held up pretty well and inspired me to try again and do a better version of it. I think read the document and decide which of us is right. Another related thing, too, is that... The original thing was not supposed to end in 2026. It was supposed to go...

all the way through the exciting stuff, right? Because everyone's talking about like, what about AGI? What about superintelligence? Like, what would that even look like? So I was trying to sort of like step by step work my way from where we were at the time until things happen and then see what they look like.

But I basically chickened out when I got to 2027 because things were starting to happen and the automation loop was starting to take off. And it was just so confusing and there was so much uncertainty. So I basically just... I just deleted the last chapter and published what I had up until that point, and that was the blog post. Okay, and then Scott, how did you get involved in this project? I was asked to help with the writing, and...

I was already somewhat familiar with the people on the project, and many of them were kind of my heroes. So Daniel I knew both because I'd written a blog post about his opinions before. I knew about his what 2026 looks like, which was amazing. And also he had pretty recently made the national news for... having, when he quit OpenAI, they told him he had to sign a non-disparagement agreement where they would claw back his stock options, and he refused.

they weren't prepared for. It started a major news story, a scandal that ended up with OpenAI agreeing that they were no longer going to subject employees to that restriction. People talk a lot about how it's hard to trust anyone in AI because they all have so much money invested in the hype and getting their stock options better.

attempted to sacrifice millions of dollars in order to say what he believed, which to me was this incredibly strong sign of honesty and competence. I was like, how can I say no to this person? Everyone else on the team also extremely impressive. Eli Liflund, who's a member of Samo Tsveti, the world's top forecasting team. He has won like the top forecasting competition.

plausibly described as just the best forecaster in the world, at least by these really technical measures that people use in the super forecasting committee. Thomas Larsen, Jonas Vollmer, both really amazing people who have done great work in AI before. I was really excited to get to work with this superstar team. I have always wanted to get more involved in the actual... attempt to make AI go well. Right now I just write about it. I think writing about it is important.

But I don't know, you always regret that you're not the person who's the technical alignment genius who's able to solve everything. And getting to work with people like these may potentially make a difference. just seemed like a great opportunity. What I didn't realize was that I also learned a huge amount. I try to read...

most of what's going on in the world of AI, but it's this very low bandwidth thing. And getting to talk to somebody who's thought about it as much as anyone in the world was just amazing. makes me really understand these things about how is AI going to learn quickly? You need all of this deep engagement with the underlying territory. And I feel like I got that. Yeah, I've probably changed my mind.

towards, against, towards, against intelligence explosion, like three, four times in the conversations I've had in the lead up in talking to you and then like coming up with the, trying to come up with a rebuttal or something. It wasn't even just. changing my mind, getting to read the scenario for the first time. It obviously wasn't written up at this point. It was a giant, giant spreadsheet.

I've been thinking about this for like a decade, decade and a half now, and it just made it so much more concrete to have a specific story. oh, yeah, that's why we're so worried about the arms race with China. Obviously, we would get an arms race with China in that situation. And, like, aside from just the people, getting to read the scenario really sold me. This is something that needs to get out there more. Yeah, yeah. Okay, now let's talk about this new forecast.

Let's start because you do a month-by-month analysis of what's going to happen from here. So what is it that you expect in mid-2025 and end of 2025 in this forecast? So beginning of the forecast mostly focuses on agents. So we think they're going to... start with agency training, expand the time horizons, get coding going well. Our theory is that

to some degree consciously, to some degree accidentally, working towards this intelligence explosion, where the AIs themselves can start taking over some of the AI research, move faster. So 2025, slightly better coding. 2026. slightly better agents, slightly better coding. And then we focus on, and we name the scenario after 2027, because that is when this starts to pay off.

The intelligence explosion gets into full swing. The agents become good enough to help with, at the beginning, not really do, but help with some of the AI research. So we introduced this idea called the R&D Progress Multiplier. So how many months of progress Without the AIs, you get in one month of progress with all of these new AIs helping with the intelligence explosion. So, 2027...

We start with, I can't remember if I could literally start with or by March or something, a five times multiplier for algorithmic progress. So we have the stats tracked on the side of the story. Part of why we did it as a website is so that you can have these cool...

gadgets and widgets. And so as you read the story, the stats on the side automatically update. And so one of those stats is like the progress multiplier. Another answer to the same question you asked is basically, 2026, nothing super interesting, or 2025, nothing super interesting happened? more or less similar trends to what we're seeing. Computer use is...

Totally solved, partially solved. How good is computer use by the end of 2025? My guess is that they won't be making basic mouse click errors by the end of 2025 like they sometimes currently do. Like if you watch Cloud plays Pokemon, which he totally should. uh it seems like sometimes it's just like failing to parse what's on the screen and like it it missed it like

thinks that its own player character is an NPC and gets confused. My guess is that that sort of thing will mostly be gone by the end of this year, but that they still won't be able to autonomously operate. for many, for long periods on their own, because... By 2025, when you say, I won't be able to act coherently for long periods of time in computer use, if I want to organize a happy hour in my office, I don't know, that's like, what, a 30-minute test?

What fraction of that is, it's got to invite the right people, it's got to book the right DoorDash or something. What fraction of that is it able to do? My guess is that by the end of this year, there'll be something that can sort of like... kind of do that but unreliably and that if you actually like tried to use that to run your life it would make some hilarious mistakes that would appear on Twitter and go viral

But that like the MVP of it will probably exist by this year. Like there'll be like some Twitter thread about someone being like, I plugged in this agent to like run my... and it worked. Our scenario focuses on coding in particular because we think coding is what starts the intelligence explosion.

interested in questions of like, how do you mop up the last few things that are uniquely human compared to when can you start coding in a way that helps the human AI researchers speed up their AI research? And then if you've helped them speed up the AI research enough. Is that enough to, with some ridiculous speed multiplier 10 times, 100 times, mop up all of these other things? One observation I have is... You could have told the story in 2021 once iGPD comes out. I think I had friends.

who are like, you know, incredible AI thinkers who are like, look, you've got the coding agent now. It's been cracked. Now the GPT-4 will go around and they'll do all this engineering and we do this all on top. We can totally scale up the system 100x. And... has been much harder than the strongest optimist expected. It seems like there have been significant difficulties in increasing the pre-training size, at least from rumors about field training runs or underwhelming training runs at labs.

It seems like building up these RL, I'm like total outside view, I know nothing about the actual engineering involved here, but just from an outside view, it seems like building up the O1 like RL clearly took. much, at least two years after GPT-4 was released. And with these things are also their economic impact and the kinds of things you would immediately expect based on benchmarks for them to be especially capable at.

isn't overwhelming. Like the call center of workers haven't been fired yet. So why not just say that like, look, At higher scale, it'll probably get even more difficult. Wait a second. I'm... a little confused to hear you say that because when I have seen people predicting AI milestones like Cut Your Grace's expert surveys, they have almost always been too pessimistic from a point of view of how fast AI will advance. So like, I think the 2022 survey...

I mean, they actually said that things that had already happened would take like 10 years to happen. But then the survey, it might have been 2023, it was like six months before GPT. GPT-4 came out. And there were things that GPT-3 or 4 or whichever one of them was did that it did in six months that they were still predicting like five or ten years from.

I'm sure Daniel is going to have a more detailed answer, but I absolutely reject the premise that everybody has always been too optimistic. Yeah, I think in general, most people following the field have been... have underestimated the pace of AI progress and underestimated the pace of AI diffusion into the world. For example, Robin Hanson famously made a bet about less than a billion dollars of revenue.

I think, by 2025. I agree, Robin Hanson in particular has been too pessimistic. But he's a smart guy, you know? So I think that the aggregate opinion has been underestimating the pace of both technical progress and deployment. I agree that there have been plenty of people who've been more bullish than me and have been already proven wrong. But then I'd be... Wait a second. We don't have to guess about aggregate opinion. We can look at Metaculous. Metaculous...

I think their timeline was 2040 back. It was like 2050 back in 2020. It gradually went down to like 2040, two or three years ago. Now it's at 2030. So it's barely ahead of us. Again, that may turn out to be wrong, but it does look like the metaculans overall have been...

too pessimistic, thinking too long-term rather than too optimistic. And I think that's like the closest thing we have to a neutral aggregator where we're not cherry-picking things. I had this interesting experience yesterday. We're having lunch with these...

Senior AI researcher, probably makes on the order of like millions a month or something. And we were asking him, how much are the AIs helping you? And he said, in domains, which I understand well, and it's closer to autocomplete, but more intense. There, it's maybe saving me four to eight hours a week.

But then he says, in domains which I'm less familiar with, if I need to go wrangle up some hardware library or make some modification to the kernel, whatever, where I'm just like, I know less, that saves me on the order of 24 hours a week. now with like current models.

What I found really surprising is that the help is bigger, where it's less like Autocomplete and more like a novel contribution. It's like a more significant productivity improvement there. Yeah, that is interesting. I imagine what's going on there is that...

A lot of the process when you're unfamiliar with a domain is like Googling around and learning more about the domain and language models are excellent because they've already read the whole internet and know all the details. Isn't this a good opportunity to discuss? a certain question I asked Dario that you responded to. What are you thinking of? Well, I asked this question where, as you say, they know all this stuff.

I don't know if you saw this. I asked this question where I said, look, these models know all this stuff, and if a human knew... every single thing a human has ever written down on the internet. They'd be able to make all these interesting connections between different ideas and maybe even find medical cures or scientific discoveries as a result.

There was some guy who noticed that magnesium deficiency causes something in the brain that is similar to what happens when you get a migraine. And so he just said, give you magnesium supplements, it cured a lot of migraines. So why aren't LLMs able to leverage this? enormous asymmetric advantage they have to make a single new discovery like this. And then the example I gave was that humans also can't do this.

So for me, the most salient example is etymology of words. We have all of these words in English that are very similar, like happy versus hapless, happen, perhaps. And we never think about them unless you read an etymology dictionary and then like, oh, obviously these all come from some old root that has to mean luck or occurrence or something like that. Yeah.

figuring out versus checking if i tell you those you're like this seems plausible and of course in etymology there are also a lot of false friends where they seem plausible but aren't connected But you really do have to have somebody shove it in your face before you start thinking about it and make all of those connections. I will actually disagree with this.

We know that humans can do, like we have examples of humans doing this. I agree that we don't have logical omniscience because there is a combinatorial explosion, but we are able to leverage our intelligence to... Actually, one of my favorite examples of this is David Anthony, the guy who wrote The Horse, The Wheel, and Language. He made this.

It was a super impressive discovery before we had the genetic evidence for it, like a decade before, where he said, look, if I look at all these languages in India and Europe... They all share the same etymology. I mean, literally what you're talking about, the same etymology for words like wheel and cart and horse. And these are technologies that have only been around for the last 6,000 years, which must mean that there was some group.

These groups are all at least linguistically descended from. And now we have genetic evidence for the Yamnaya, which we believe is this group. You have a blog where you do this. This is your job, Scott. So why shouldn't we hold the fact that late language models can't do this more against them? Yeah, so to me, it doesn't seem like he is just kind of...

sitting there being logically omniscient and getting the answer. It seems like he's a genius. He's thought about this for years, probably at some point.

Like he heard a couple of Indian words and a couple of European words at the same time, and they kind of connected and the light bulb came on. So this isn't about... having all the information in your memory, so much as the normal process of discovery, which is kind of mysterious, but seems to come from just kind of having good heuristics and throwing them at things until you kind of get a lucky strike.

My guess is if we had really good AI agents and we... applied them to this task, it would look something like a scaffold where it's like Think of every combination of words that you know of. Compare them. If they sound very similar, write it on this scratchpad here. If a lot of words of the same type show up on this scratchpad, that's pretty strange. Do some kind of thinking around it. And I just don't think we've even tried that.

And I think right now if we tried it, we would run into the combinatorial explosion. We would need better heuristics. Humans have such good heuristics that probably most of the things that show up even in our conscious mind rather than happening on the level of some kind of unconscious processing. are at least the kind of things that could be true. I think you could think of this as like a chess engine. You have... some unbelievable number of possible next moves.

you have some heuristics for picking out which of those are going to be the right ones, and then gradually you kind of have the chess engine think about it, go through it, come up with a better or worse move, then at some point you potentially become better than human.

I think if you were to force the AI to do this in a reasonable way, or you were to train the AI such that it itself could come up with the plan of going through this in some kind of heuristic-laden way, you could potentially equal humans.

I'll add some more things to that. So I think there's a long and sordid history of people looking at some limitation of the current LLMs and then... making grand claims about how the whole paradigm is doomed because they'll never overcome this limitation and then like a year or two later the new LLMs overcome that limitation and I would say that like

with respect to this thing of like, why haven't they made these interesting scientific discoveries by combining the knowledge they already have and like noticing interesting connections? I would say, first of all, Have we seriously tried to build scaffolding to make them do this? And I think the answer is mostly no. I think Google DeepMind tried this, right? Maybe. So maybe. Second thing.

Have you tried making the model bigger? They've made it a bit bigger over the last couple years, and it hasn't worked so far. Maybe if they make it even bigger still, they'll notice more of these connections. And then third thing, and here's, I think, the special one. Have you tried training the model to do the thing? You know, just because, like, the pre-training... The pre-training process doesn't strongly incentivize

this type of connection making, right? In general, I think it's a helpful heuristic that I use to ask the question of like, Remind oneself, what was the AI trained to do? What was its training environment like? And if you're wondering why hasn't the AI done this, ask yourself, Like, did the training environment train it to do this? And often the answer is no. And often I think that's a good...

because it wasn't trained to do it. I mean, it seems like such an economically valuable... But how would you set up a training environment? It wouldn't be really gnarly to try to... set up an RL environment to train to make new scientific discoveries. Maybe we should have longer timelines. It's an early engineering problem. Well, so, I mean, in our scenario...

they don't just leap from where we are now to solving this problem. They don't. Instead, they just iteratively improve the coding agents until they've basically got coding solved. But even still, their coding agents are not able to do some of this stuff. Like, that's what... Early 2020s, like the first half of 2027 in our story is basically...

They've got these awesome automated coders, but they still lack research taste and they still lack maybe like organizational skills and stuff. And so they need to like overcome those remaining bottlenecks and gaps in order to completely automate the research cycle. But they're able to overcome those gaps.

faster than they normally would because the coding agents are doing all the grunt work really fast for them. I think it might be useful to think of our timelines as being like 2070, 2100. It's just that the last 50 to 70 years of that all happened during the year 2027 to 2028 because we are going through this intelligence explosion. Like, I think if I asked you, could we solve this problem by the year 2100?

you'd say, oh yeah, by 2100, absolutely. And we're just saying that the year 2100 might happen earlier than you expect because we have this research progress multiplier. And then let me just address that in a second, but just one final thought on this thread. to the extent that there's like a modus ponens, modus tollens thing here, where one thing you could say is like, look, AIs,

not just other lens, but AIs will have this fundamental asymmetric advantage where they know all this shit. And why aren't they able to use their general intelligence to use this asymmetric advantage. to some enormous capability.

Now, you could infer that same statement by saying, okay, well, once they do have that general intelligence, they will be able to use their asymmetric advantage to make all these enormous gains that humans are, in principle, less capable of, right? So basically, if you...

AIs could do all these things if only they had general intelligence. It's going to be like, well, once we actually do get the AGI, it's actually going to be totally transformative because they will have all of human knowledge memorized and they can use that to make all these connections. I'm glad you mentioned that. Our current scenario does not really take that into account very much. So that's an example in which our scenario is...

possibly underestimating the rate of progress. You're so conservative, Daniel. This has been my experience working with the team. As I point out, five different things. Are you sure you're taking this into account? Are you sure you're taking this into account? And first of all, 99% of the time, he says, yes, we have a supplement on it. But even when he doesn't say that, he's like, yeah, that's one reason it could go.

slower than that, here are 10 reasons it could go faster. We're trying to be sort of like our median guess. There are a bunch of ways in which we could be underestimating and there are a bunch of ways in which we could be overestimating. We're going to hopefully continue to think more about this afterwards and continue to iteratively refine our models and come up with better guesses and so forth. Look, your AI product works best when it has access to all of your clients' information.

Your co-pilot needs your customer's entire code base, and your chatbot needs to access all of your clients' documents. But for your users, this presents a massive security growth. Your clients' IT people are going to need more assurance to even consider your product. Enterprise customers need secure auth, granular access controls, robust logging, and a lot more just to start with. Building all these features from scratch is expensive, tough, and time-consuming.

That's where my new sponsor WorkOS comes in. WorkOS is kind of like Stripe for enterprise feature. They provide a set of APIs to quickly add all the capabilities. so that your app, EI or otherwise, can become enterprise-ready and scale up market. They're powering top AI companies, including OpenAI, Cursor, Perplexity, and Anthropic, and hundreds more. If you want to learn more about making your app enterprise-ready, go to workos.com and just tell them that Dworkash sent you.

All right, back to Scott and Daniel. So if I look back at AI progress in the past, If we were back in, say, 2017. Yeah, suppose we had the Super Home Encoders in 2017. The amount of progress we've made since then, so where we are currently in 2025. By when could we have had that instead? Great question. We still have to like stumble through all the discoveries that we've made since 2017. We still have to like...

figure out that language models are a thing. We still have to figure out that you can fine-tune them with RL. So all those things would still have to happen. How much faster would they happen? Maybe 5x faster, because... Because a lot of the small-scale experiments that these people do in order to test out ideas really quickly before they do their big training runs would happen much faster because...

They're just like lickety-split being spit out. I'm not very confident in that 5x number. It could be lower, it could be higher, but that was sort of like roughly what we were guessing. Our 5x, by the way, is for the algorithmic progress part, not for the overall thing. So in this hypothetical, according to me...

basically things would be going like 2.5x faster, where the algorithms would be advancing at 5x speed, but the compute is still stuck at the usual speed. That seems plausible to me. You have a 5x standpoint, and then... 1000x AI progress within the matter of a year. Maybe that's the part I'm like, wait, how did that happen exactly?

So what's the story there? The way that we did our takeoff forecast was basically by breaking down how we think the intelligence explosion would go into a series of milestones. First, you automate the coding. Then you automate the whole research process, but in a very similar way to how humans do it.

you know, teams of agents that are about human level, then you get to superhuman level and so forth. And so we broke it down into these milestones, you know, superhuman coder, superhuman AI researcher, and then super intelligent AI researcher. And the way we did our forecast was we basically...

Well, for each of these milestones, we were like, what is it going to take to make an AI that achieves that milestone? And then once you do achieve that milestone... how much is your overall speed up and then what's it going to take to achieve the next milestone combine that with the overall speed up and that gets you your clock time okay now you're at that milestone what's your overall speed up assuming that you have that milestone

Also, what's the next one? How long does it take to get to the next one? So we sort of like work through it bit by bit. And at each stage, we're just making our best guesses. So quantitatively, we were thinking something like 5x speedup to algorithmic progress from the superhuman coder.

And then something like a 25x speedup to algorithmic progress from the superhuman AI researcher, because at that point, you've got the whole stack automated, which I think is substantially more useful than just automating the coding. And then... I forget what we say for a super intelligent AI researcher, but off the top of my head, it's probably something like... in the hundreds or maybe like a thousand X.

overall speed up? So maybe the big picture thing I have with the Intel's explosion is, uh... We can go through the specific arguments about how much will the automated coder be able to do and how much will the superhuman AI coder be able to do. But on prayers, it's just like such a wild thing to expect. And so... Before we get into all the specific arguments, maybe you can just address this idea that...

Like, why not just start off with a 0.01% chance this thing might happen, and then you need extremely, extremely strong evidence that it will before making that your mortal view? I think that it's a question of, like... what is your default option or what are you comparing it to? I think that naively people think like, well, every particular thing is potentially wrong. So let's just have a default path where nothing ever happens. And I think that

That has been the most consistently wrong prediction of all. Like, I think in order to have nothing ever happen, you actually need a lot to happen. Like, you need suddenly AI progress that has been going at this constant rate for so long. Stop.

Why does it stop? Well, we don't know. Whatever claim you're making about that is something where you would expect there to be a lot of out-of-model error, is where you would expect, like, somebody must be making a pretty definite claim that you want to challenge. So I don't think there's a neutral position where you can just say, well, given that out-of-model error is really high and we don't know anything, let's just choose that.

We are trying to take, I know this sounds crazy because if you read our document, all sorts of bizarre things happen. It's probably the weirdest. couple of years that have ever been. But we're trying to take almost in some sense a conservative position where the trends don't change. Nobody does an insane thing. Nothing that we have no evidence to think will happen happens. way that the AI intelligence explosion dynamics work.

are just so weird that in order to have nothing happen, you need to have a lot of crazy things happen. One of my favorite you know meme images is this graph showing world gdp over time you've probably seen it spikes up and then there's like a little thought bubble at the top of the spike in 2010 or something.

And the thought bubble says, like, my life is pretty normal. I have a good grasp of what's weird versus standard. And people thinking about different futures with, like... digital minds and space travel are just engaging in silly speculation.

The point of the graph is like, actually, there's been amazing transformative changes in the course of history that would have seemed totally insane to people, you know, multiple times. We've gone through multiple such waves of those things. Everything we've talked about. has happened before. Algorithmic progress already doubles like every year or so.

So it's not insane to think that algorithmic progress can contribute to these compute things. In terms of general speed-up, we're already at like a thousand times research speed-up multiplier compared to the Paleolithic or something. From the point of view of anyone in most of history, we are going at a blindingly insane pace. And all that we're saying here is that

It's not going to stop. Is that the same trend that has caused us to have a thousand times speed up multiplier relative to past eras and not even the Paleolithic? What happened in the century between, I don't know, 600 and 700 AD? I'm sure there are things, I'm sure historians can point them out. Then you look at the century between 1900 and 2000, and it's just...

completely qualitatively different. Of course, there are models of whether that's stagnated recently or what's going on here. We can talk about those. We can talk about why we expect. for the intelligence explosion to be an antidote to that kind of stagnation. But nothing we're saying is that different from what has already happened. I mean, you are saying that this transition, these previous transitions have been smoother than the one you're anticipating. We're not sure about that, actually.

According to, like, one of these models is just a hyperbola. Everything is along the same curve. Another model is that there are these things like the literal Cambrian explosion. If you want to take this very far back, go full Ray Kurzweil.

The literal Cambrian explosion, the agricultural revolution, the industrial revolution as phase changes. When I look at the economic modeling of this, my impression is the economists think that we don't have good enough data to be sure whether this is all one smooth process. or whether it's a series of phase changes. When it is one smooth process, the smooth process is often a hyperbola that shoots to infinity in weird ways.

We don't think it's going to shoot to infinity. We think it's going to hit bottlenecks. We think it's going to hit bottlenecks the same as all these previous processes. The last time this hit a bottleneck, if you take the hyperbola view... is in like 1960, when humans stopped reproducing at the same rate they were reproducing before. We hit a population bottleneck, the usual population to ideas flywheels stopped working, and then we stagnated for a while.

If you can create a country of geniuses in a data center, as I think Dario Amadai put it, then you no longer have this population bottleneck and you're just expecting continuation of those pre-1960 trends. All of these historical hyperbolas are also kind of weird, also kind of theoretical, but I don't think we're saying anything that there isn't models for which have previously seemed to work for long historical periods. Another thing also is I think people equivocate between fast.

and or between slow and continuous right so like if you look at our scenario there's this like continuous trend that runs through the whole thing of this algorithmic progress multiplier. And we're not having discrete jumps from like 0 to 5x to 25x. We have this continuous improvement. So I think continuous is not the crux. The crux is like, is it going to be this fast?

And we don't know. Maybe it'll be slower. Maybe it'll be faster. But we have our arguments for why we think maybe this fast. Okay, now that we brought up the intelligence explosion, let's just discuss that because I'm kind of skeptical. A notable bottleneck to AI progress, or the main bottleneck to AI progress, is the amount of researchers, engineers who are doing this kind of research. It seems more like compute or some other thing is a bottleneck.

And the piece of evidence is that when I talk to my AI researcher friends, At the labs, they say there's maybe 20 to 30 people on the core pre-training team that's discovering all these algorithmic breakthroughs.

If the headcount here was so valuable, you would think that, for example, Google DeepMind would take not just everybody from all their smartest people, not just from DeepMind, but for all of Google and just put them on pre-trading or RL or whatever the big bottleneck was. You think OpenAI would hire. Every single Harvard math PhD. And in six months, you're all going to be trained up on how to do AI research.

They don't seem that—I mean, I know they're increasing headcount, but they don't seem to treat this as the kind of bottleneck that— It would have to be for millions of them in parallel to be rapidly speeding up AI research. And there just is this.

You know, there's this quote that Napoleon—one Napoleon is worth 40,000 soldiers was commonly a thing that was said when he was fighting. But 10 Napoleons is not 400,000 soldiers, right? So why think that these million— AI researchers are netting you something that looks like an intelligence explosion.

So previously I talked about sort of three stages of our takeoff model. First is like you get the superhuman coder. Second is when you're fully automated AI R&D, but it's still at like basically human level, like it's as good as your best humans. And then third is like now you're in super intelligence territory and it's qualitatively better. In our, like, guesstimates of... how much faster algorithmic progress would be going, the progress multiplier.

for the middle level. We basically do assume that like you get massive diminishing returns to having more minds winning in parallel. And so we totally buy all of that. Yeah, and then I think the addition to that is the question, then why do we have the intelligence explosion? And the answer is... combination of that speedup and the speedup in serial thought speed. And also the research taste thing. So, like, here are some important inputs to AI R&D progress today.

Research tastes. So like the quality of your best researchers, the people who are managing the whole process, their ability to learn from data and like...

make more efficient use of the compute by running the right experiments instead of flailing around running a bunch of useless experiments. That's research taste. Then there's like the quantity of your researchers, which we just talked about. Then there's the serial speed of your researchers, which... currently is all the same because they're all humans and so they all run at basically the same serial speed and then finally there's how much compute

for experiments um so what we're imagining is that basically serial speed starts to matter a bunch because you switch to ai researchers that have like orders of magnitude more serial speed than humans But it tops out. We think that over the course of our scenario, if you look at our sliding stats chart, it goes from 20x to 90x or something over the course of... of the scenario which is important but like not huge and also we think that like

once you start getting like 90x serial speed, you're just like bottlenecked on the other stuff. And so like additional improvements in serial speed basically don't help that much. With respect to the quantity, of course, yeah, we're imagining you get like...

hundreds of thousands of AI agents, a million AI agents. But that just means you get bottlenecked on the other stuff. Like, you've got tons of parallel agents. That's no longer your bottleneck. What do you get bottlenecked on? Taste and compute. So by the time it's mid-2027 in our story, when they fully automated the AI research,

There's basically the two things that matter. What's the level of taste of your AIs? How good are they at learning from the experiments that you're doing? And then how much compute do you have for running those experiments? And that's like the sort of like core setup of our model. And when we get our like 25x multiplier, it's sort of like starting from those premises. Is there some intuition pump from history where there's been... some output and because of some really weird

The production of it has been rapidly skewed along one input, but not all the inputs that have been historically relevant. And you still get breakneck progress. Possibly the Industrial Revolution. I'm just extemporizing here. I hadn't thought about this before, but as Scott's famous post that was hugely influential to me a decade ago talks about...

There's been this decoupling of population growth from overall economic growth that happened with the Industrial Revolution. And so in some sense, maybe you could say that's an example of like... Previously, these things grew in tandem, like more population, more technology, more farms, more houses, etc. Like your sort of capital infrastructure and your like human infrastructure was like going up together. But then...

We got the industrial roof and they started to come apart. And now like all the capital infrastructure was growing really fast compared to like the human population size. I think I'm imagining something maybe similar happening with algorithm progress. And it's not that like, again with population, population still matters a ton today.

in some sense, like, progress is bottlenecked on having larger populations and so forth. But it's just that, like, the population growth rate is just, like, inherently kind of slow. And the growth rate of capital is much faster. And so it just comes to be a bigger part of the story. Maybe the reason that this sounds less plausible to me than the 25x number implies is that when I think about concretely what that would look like, where you have these AIs.

And there, you know, we know that there's a gap in data efficiency between human brains and these AIs. And so somehow they're just like, there's a lot of them thinking and they think really hard and they figure out how to define a new architecture that is like. the human brain or has the advantages of the human brain. And I guess they can still do experiments, but not that many. Part of me just wonders, like...

Okay, what if you just need an entirely different kind of data source that's not like pre-training for that, but they have to go out in the real world to get that? Or maybe they just need to... It needs to be actively, it needs to be an online learning policy where they need to be actively deployed in the world for them to learn in this way. And so you're bottlenecked on how fast they can be getting real world data.

So we are actually imagining online learning happening. Oh, really? Yeah. But like, not so much real world as in like, like, the thing is that like, if you're trying to train your AIs to do really good AI R&D, then like... Well, the AI R&D is happening on your server. you can just like kind of have this loop of like, you have all these AI agents autonomously doing AI R&D, doing all these experiments, et cetera. And then they're like,

online learning to get better at doing AI R&D based on how those experiments go. But even in that scenario alone, I can imagine bottlenecks like, oh, you had a benchmark and it got reward hacked for what constitutes AI R&D. Because you obviously can't have like... What is the, maybe you would, but... is it as good as a human brain? It's just like such an ambiguous thing. Right now we have benchmarks that get reward hacked, right? But then they autonomously build new benchmarks.

I think what you're saying is maybe this whole process just goes off the rails due to lack of contact with ground truth outside in the actual world, like outside the data centers. Maybe. Again, part of my guess here is that like... a lot of the ground truth that you want to be in contact with. is stuff that's happening on the data centers, things like how fast are you improving on all these metrics?

You have these vague ideas for new architectures, but you're struggling to get them working. How fast can you get them working? And then separately, insofar as there is a bottleneck of talking to people outside and stuff. Well, they are still doing that, you know? And once they're fully autonomous, they can even do that much faster, right?

all the million copies connected to all these various real-world research programs and stuff like that. So it's not like they're completely starved for outside stuff. What about the skepticism that... Look, what you're suggesting with this hyper-efficient, hive mind of AI researchers... No human bureaucracy has just out of the gate worked super efficiently, especially one where they don't have experience working together. They haven't been trained to work together, at least yet.

There hasn't been this outer loop RL on like... We ran a thousand concurrent experiments of different AI bureaucracies doing AI research, and this is the one that actually worked best. And the analogy I'd use maybe is to humans in the savanna 200,000 years ago. We know they have a bunch of advantages over the other animals already at this point, but... The things that make us dominant today, joint stock corporations, state capacities, like this fossil-fueled civilization we have.

That took so much cultural evolution to figure out. You couldn't just have, like, figured it out in, like, the Savannah. It was like, oh, you know, if we had built these incentive systems and we issued dividends, then we could really collaborate here or something. Why not think that it will take a similar process of huge population growth, huge social experimentation, and upgrading of the technological base.

of the AI society before they can organize this hypermind collective, which will enable them to do what you imagine intelligence explosion looks like. You're comparing it kind of to two different things. One of them is literal genetic evolution in the African savannah, and the other is the cultural evolution that we've gone through since then. And I think there will be AI equivalents to both.

So the literal genetic evolution is that our minds adapted to be more amenable to cooperation during that time. I think the companies will be very literally training the AIs to be more cooperative. I think there's more opportunity for pliability there because...

Humans were, of course, evolving under this genetic imperative that we want to pass on our own genetic information, not somebody else's genetic information. You have things like kin selection that are sort of, kind of... exceptions to that, but overall it's the rule. In animals that don't have that, like eusocial insects, then you just very quickly get, just through genetic evolution without cultural evolution, extreme cooperation.

And with eusocial insects, what's going on is that they all have the same genetic code. They all have the same goals. And so the training process of evolution kind of yokes them to each other in these extremely powerful bureaucracies. We do think that the AI will be closer to the eusocial insects in the sense that they all have the same goals, especially if these aren't indexical goals.

Their goals like have the research program succeed. So that's going to be changing the weights of each individual AI. I mean, before they're individuated, but it's going to be changing the weights of the AI class overall to be more amenable to cooperation. And then yes, you do have the cultural evolution. Like you said, this takes...

hundreds of thousands of individuals. We do expect there will be these hundreds of thousands of individuals. It takes decades and decades. Again, we expect this research

such that decades of progress happen within this one year, 2027 or 2028. So I think between the two of these, it is possible. Maybe this is also where the serial speed actually does matter a lot, because... if they're running at like 50x human speed, then that means you can have... sort of like a year of subjective time happen in a week of real time

And so these sorts of like large-scale cooperative dynamics of like, you know, your moral maze, you have an institution, but then it becomes like a moral maze and, you know, it sort of collapses under its own weight and stuff like that. there actually is time for them to like play that out multiple times and then like train on it, you know, and like... tinker with the structure and like add it to the training process you know um over the course of 2027 yeah also like

They do have the advantage of all the cultural technology that humans have evolved so far. This may not be perfectly suited to them. It's more suited to humans. But imagine that you have to make a business. out of you and your hundred closest friends who you agree with on everything. Maybe they're literally your identical twin. They have never betrayed you ever and never will.

I think this is just not that hard a problem. Also, again, they are starting from a higher floor, right? They're starting from human institutions. You can literally have a Slack workspace for all the AI agents to communicate, and you can have a hierarchy with roles. They can borrow quite a lot from successful human institutions. I guess the bigger the organization, even if everybody is aligned, I think some of your response is,

addressed whether they will be aligned on goals. I mean, you did address the whole thing, but I will just point this out. That is not the part I'm skeptical of. I am more skeptical of just like... Even if you're all aligned and want to work together. Do you fundamentally understand how to run this huge organization? And you're doing it in ways that no human has had to do before.

Getting copied incessantly. You're running extremely fast. You know what I'm saying? I think that's totally reasonable. And so it's a complicated thing. And I'm just not sure why you think. We build this bureaucracy or the AIs build this bureaucracy. within this matter of... Like, so we depicted happening over the course of like, you know, six to eight months or something like that in 2027. What would you say like twice as long, five times as long, 10 times as long? Five years?

So five years, if they're going at 50x serial speed, then five years is what? 250 years of sort of serial time for the AIs. which to me feels like more than enough to really sort out this sort of stuff. You'll have time for empires to rise and fall, so to speak, and all of that to be added to the chaining data.

But I could see it taking longer than we depict. Like, you know, maybe instead of six months, it'll be like 18 months, you know. But also maybe it could be two months. So when I think of like the ways that they train AI. I think in our scenario at this point, there are two primary ways that they're doing it. One of them is just continuing the next token prediction work.

These AIs will have access to all human knowledge. They will have read management books in some sense. They're not starting blind. There is going to be something like predict how Bill Gates would complete this next character or something like that. And then there's the reinforcement learning in virtual environments. So get a team of AIs to play some multiplayer game. I don't think you would use one of the human ones because you would want something that was better suited for this task.

Just running them through these environments again and again, training on the successes, training against the failures, kind of combining those two kinds of things. the same kind of problem as inventing all human institutions from the Paleolithic onward. It just seems like kind of applying those two things. Jane Street made a puzzle for listeners of this episode, and I thought that I'd take a crack at it first. And so I'm joined by my friend Adam Kennedy at Jane Street.

and he's going to mentor me as I try to take a stab at this. Let's go. I appreciate your confidence in me, but there's a reason I became a podcaster. Today I went on a hike and found a pile of tensors hiding underneath a Neolithic burial mound. Maybe start by looking at the last two layers. An ancient civilization's secret code. Okay, so it looks like I can just... type in some words here, and it always gives me zero. Nice. There you go.

All right, we're in. So I didn't make that much progress at this, but it's clear that there's some deep structure to this puzzle that would actually be really fun to try to unravel. If you want to take a crack at it, go to janestreet.com slash Dwarkesh. And if you enjoy puzzles like this, they're always recruiting. Yep. Thanks, Adam. Yeah, thanks, Dwarkesh. Take care. The other notable thing about your model is...

So you've got this superhuman thing at the end of it, and then it seems to just go through the tech tree of mirror life and nanobots and whatever crazy stuff. And maybe that part I'm also really skeptical of. It just looks like if you look at the history of invention, it just seems like... You people are just like trying different random stuff.

You often, even before the theories about how that industry works or how the relevant machinery works is developed, like the steam engine was developed before the theory of thermodynamics. The Wright brothers, it seems like they were just experimenting with airplanes.

and is often influenced by breakthroughs in totally different fields, which is why you have this pattern of parallel innovation, because the background level of tech is at a point at which you can do this experiment. I mean, machine learning itself is... a place where this happened, right? Where people had these ideas about how to do deep learning or something, but...

It just took a totally unrelated industry of gaming to make the relevant progress to get the whole, you know, basically the economy as a whole advanced enough that like deep learning, like Jeffrey Henton's ideas could work.

I know we're accelerating way into the future here, but I just want to get to this crux. So again, we have that three-part division of the superhuman coder, then the complete AI researcher, and then the superintelligent. You're not jumping ahead to that one. There, I would say...

So now we're imagining systems that are like true superintelligence. They are just like better than the best humans at everything, including being better at data efficiency and better at learning on the job and stuff like that. Now, our scenario does depict...

a world in which they're bottlenecked on real-world experience and that sort of thing. I think that if you want to contrast, some people in the past have proposed much faster scenarios where they email some cloud lab and start building nanotech you know right away yeah by just using their brains to figure out like

appropriate protein folding and stuff like that. We are not depicting that in our scenario. In our scenario, they are in fact bottlenecked on lots of real world experience to like build these actual practical technologies. But the way they get that is they just actually get that experience and it happens faster than humans would. And the way they do that is...

They're already super intelligent. They're already buddy-buddy with the government. The government deploys them heavily in order to beat China and so forth. And so all these existing U.S. companies and factories and military procurement providers and so forth are all like... chatting with the super intelligences and taking orders from them about like how to build the new widget and test it and like they're like downloading super intelligent designs and you know

manufacturing them and then testing them and so forth. And then the question is like, okay, so they are getting this experience, they're learning on the job. Quantitatively, how fast does this go? Like, is it taking years or is it taking months or is it taking days, right? In our story, it takes like about a year.

And we're uncertain about this. Maybe it's going to take several years. Maybe it's going to take less than a year, right? Here are some factors to consider for why it's plausible that it could take a year. One, you're going to have something like a million of them. And quantitatively, that's like comparable in size to the existing scientific industry, I would say. Like maybe it's a bit smaller, but it's not like dramatically smaller.

Two, they're thinking a lot faster. They're thinking like 50 times speed or like 100 times speed. That, I think, counts for a lot. And then three, which is the biggest thing, they're just qualitatively better as well. So not only are they, there are lots of them and they're thinking very fast, but they are better at learning from each experiment than the best human would be at learning from that experience. Yeah, I think the fact that there's a million of them...

Or the fact that they're comparable to maybe the size of the key researcher population of the world or something. I don't think a million is.

I think there's more than a million researchers in the world. Well, but it's very heavy-tailed. Like, a lot of the research actually comes from, like, the best ones, you know? That's right. But it's not clear to me that most of the... new stuff that is developed as a result of this researcher population I mean there's just like so many examples in the history of science where A lot of growth or product improvements is just the result of...

How do you count the guy at the TSMC process who figures out a different way to... I actually argued with Daniel about this recently about one interesting case that I can go over is... we have an estimate that about a year after the superintelligences start wanting robots, they're producing a million units of robots per month. So I think that's pretty relevant because you have, I think it's Wright's law, which is that... your ability to improve efficiency on a process is proportional to...

doubling the amount of copies produced. So if you're producing a million of something, you're probably getting very, very good at it. The question we were arguing about is, can you produce a million units a month after a year? And for context, I think Tesla produces like a quarter of that in terms of cars or something. This is an amazing scale up in a year. It's only 4X. Yeah. Also just for Tesla. Yeah. And the argument that we went through was something like...

So it's got to first get factories. OpenAI is already worth more than all of the car companies in the U.S. except Tesla combined. So if OpenAI today wanted to buy all the car factories in the US except Tesla, start using them to produce humanoid robots, they could. Obviously not a good value proposition today.

but it's just obvious and overdetermined that in the future when they have superintelligence and they want them, they can start buying up a lot of factories. How fast can they convert these car factories to robot factories?

So fastest conversion we were able to find in history was World War II. They suddenly wanted a lot of bombers. So they bought up... in some cases bought up, in other cases got the car companies to produce new factories, but they bought up the car factories, converted them to bomber factories. That took about three years from the time when they first decided to start this process to the time when the factories were producing a bomber an hour.

we think it will potentially take less with superintelligence because, first of all, if you look at the history of this process, despite this being the fastest anybody has ever done this, It was actually kind of a comedy of errors. They made a bunch of really silly mistakes in this process. If you actually have something that

even just doesn't have the normal human bureaucratic problems. And we do think that this will be done in the middle of an arms race with China. So the government will be kind of moving things through. And then the super intelligences will be good at the logistical issues, navigating bureaucracies. So we estimated maybe if everything goes right, we can do this three times faster than the bomber conversions in World War II. So that's about a year. I'm assuming the bombers were just much...

less sophisticated than the kind of humanoid robots. Yeah, but the car factories of that time were also much less sophisticated than the car factories of our time. Conversion speed was also... Maybe to give one hypothetical here. Right now, let's just take like biomedicine as an example of like a field, one of the fields you'd want to accelerate. And whenever these CEOs get on podcasts, they're often talking about curing cancer and so forth.

And it seems like a big thing these frontier biomedical research facilities are excited about is the virtual cell. Now, the virtual cell... It takes like a tremendous amount of compute, I assume, to train these DNA foundation models and to do all the other computation necessary to simulate a virtual cell. If it is the case that the cure for Alzheimer's and cancer and so forth is bottlenecked by the virtual cell. It's not clear if you had a million super intelligences in the 60s.

and you ask them, cure cancer for me, they would just have to... Solve making GPUs at scale, which would require solving all kinds of interesting physics and chemistry problems, material science problems. building process, building fabs for computing and then going through 40 years of making more and more efficient fabs that can do all of Moore's Law from scratch.

And that's just like one technology. And it just seems like you just need this broad scale. The entire economy needs to be upgraded for you to cure cancer in the 60s, right? just because you need the GPUs to do the virtual cell, assuming that's the bottleneck. First of all, I agree if there's only one way to do something, that makes it much harder, and maybe that one way takes very long.

We're assuming that there may be more than one way to cure cancer, more than one way to do all of these things, and they'll be working on finding the one that is least bottlenecked. Part of the reason I realize I spent too long talking about that robot example, but we do think that they're going to be... Getting a lot of physical world things done very quickly, once you have a million robots a month, you can actually do a lot of physical world experiments.

We look at examples of people trying to get entire economies off the ground very quickly. So, for example, China post-Dang. I don't know. Would you have predicted that... 20, 30 years after being kind of a communist basket case, they can actually be doing this really cutting-edge bio-research. I realize that's a much weaker thing than we're positing, but it was done just with the human brain, with a lot fewer resources than we're talking about.

Same issue with like, let's say, Elon Musk and SpaceX. I think in the year 2000, we would not have thought that somebody could move two times, five times faster than NASA. With pretty limited resources, they were able to get like... I think a lot more years of technological advance in than we would have expected. Partly that's because just Elon is crazy and never sleeps. Like if you look at the examples of things from SpaceX, he is breathing down every worker's neck being like...

What's this part? How fast is this part going? Can we do this part faster? And the limiting factor is basically hours in Elon's day in the sense that he cannot be doing that with every employee all the time. Superintelligence is not even that smart. It just yells at every single worker. Yeah, I mean, that is kind of my model is that we have some...

We have something which is smarter than Elon Musk, better at optimizing things than Elon Musk. We have like 10,000 parts in a rocket supply chain. How many of those parts can Elon personally like yell at people to optimize? we could have a different copy of the superintelligence optimizing every single part full time. I think that's just a really big speed up. I think both of those examples don't work in your favor. I think the China example...

is the China growth miracle could not have occurred if not for their ability to copy technology from the West. And I don't think there's a world in which they just, I mean, China has a lot of really smart people. It's a big country in general. Even then, I think they couldn't have just like divined how to make airplanes after becoming a communist.

hell basket, right? It was just like, the AIs cannot just like copy nanobots from aliens. It's got to make them from scratch. And then just on the Elon example. It took them two decades of like countless experiments failing in weird ways you would not have expected. And still, it's like, you know, rocketry we've been doing since the 60s, but maybe actually World War II.

And then just getting from a small rocket to a really big rocket took two decades of all kinds of weird experiments, even with the smartest and most competent people in the world. So you're focusing on the nanobots. I want to ask a couple questions. One, what about just like the regular robots? And then two... what would your quantities be?

for all of these things so first what about the regular robots like yeah like nanobots are presumably a lot harder to make than just like regular robot factories and like in our story they happen later It sounds like right now you're saying even if we did get the whole robot factory thing going, it would still take a ton of additional full economy broad automation for a long time to get to something like nanobots. That's totally plausible to me. I could totally imagine that happening.

I don't feel like the scenario particularly depends on that final bit about getting the nanobots. They don't actually really make any difference to the story. The robot economy does sort of make a difference because... There's two branches, endings, as you know. And in one of the endings, the AIs end up misaligned and end up taking over.

It's an important strategic change when the AIs are self-sufficient and just totally in charge of everything and they don't actually need the humans anymore. And so what I'm interested in is when has the robot economy advanced to the point where they don't really depend on humans? So quantitatively, what would your guess for that be? Like if hypothetically we had the army of super intelligences in early 2028.

How many years would you guess until the, and hypothetically also assume that like the U.S. president is like super bullish on like deploying this into the economy to be China, et cetera. So the political stuff is all set up in the way that we have. How many years do you think it would be until...

There are so many automated factories producing automated self-driving cars and robots that are themselves building more factories and so forth that if all the humans dropped dead, it would just keep chugging along. Maybe it would slow down a bit, but it would still be fine. What is chugging along mean?

So from the perspective of misaligned AIs, you wouldn't want to kill the humans or get into a war with them if you're going to get wrecked because you need the humans to maintain your computers, right? So yeah, in our scenario, once they are completely self-sufficient, then they can start being more blatantly misaligned. And so I'm curious, when would they be fully self-sufficient? Not in the sense of like...

They're not literally using the humans at all, but in the sense of like, they don't really need the humans anymore. Like they can get along pretty fine without them. They can continue to like do their science. They can continue to expand their industry. They can continue to have a flourishing civilization, you know, indefinitely into the future. without any humans i i think i would just i would probably need to sit down and just think about uh the numbers but i maybe like

Like 10 years, basically, instead of one year. I think we agree on the core model. This is why we didn't depict something more like the bathtub nanotech. scenario where they just like think about they just like don't need to do the experiments very much and they just like immediately jump to the right answers.

Like we are imagining this process of like learning by doing through this distributed across the economy, lots of different laboratories and factories, building different things, learning from them, et cetera. We're just imagining that like this overall goes much faster than it would go if humans are in charge. And then. We do have, in fact, lots of uncertainty, of course. Like, dividing up this part period into two chunks, they're like...

uh, 2028, early 2028 until like fully autonomous robot economy part. And then the like fully autonomous robot economy to like cancer cures, nanobots, all that crazy sci-fi stuff. I want to separate them because the important parts for a scenario only depend on the first part, really. If you think that it's going to take 100 years to get to nanobots...

That's fine, whatever. Once you have the fully atomized robot economy, then things may turn badly for the humans if the AI is misaligned. I want to just argue about those things separately. Interesting. And then you might argue, well, robots is more a software problem at this point.

And if like, if there isn't like, you don't need to like invent some new hardware. I feel pretty bullish on the robots. Like we already have humanoid robots being produced by multiple companies. Right. And that's in 2025. There'll be more of them produced cheaper and there'll be better in 2027. And there's all these car factories that can be converted and blah, blah, blah, blah. So I'm relatively bullish on the like one year until you've got this awesome robot economy.

And then like from there to the cool nanobots and all that sort of stuff, I feel less confident, obviously. Makes sense. Let me ask you a question. If you accept the manufacturing numbers, let's say a million robots a month, a year after the super intelligence. And let's say also like... some comparable number, 10,000 a month or something of automated biology labs, automated whatever you need to invent the next equivalent of X-ray crystallography or something.

Do you feel like that would be enough that you're doing enough things in the world that you could expand progress this quickly? Or do you feel like even with that amount of manufacturing, there's still going to be some other bottleneck? It's so hard to reason about because if you ask. If Constantine or somebody in like 400, 500 was like, I want the Roman Empire to have the Industrial Revolution. And somehow he figured out that like you need mechanized machines to do that. And he's like.

Let's mechanize. It's like, what's the next step? It's like... Dude, that's a lot. I like that analogy a lot, actually. I think it's not perfect, but it's a decent analogy. Imagine if a bunch of us got sent back in time to the Roman Empire such that we don't have the actual hands-on know-how.

to actually build the technology and make the industrial revolution happen. But we have the sort of high-level picture, the strategic vision of we're going to make these machines, and then we're going to do an industrial revolution. I think that's kind of analogous to the situation with the superintelligences, where they like...

have the high-level picture of like, here's how we're going to improve in all these dimensions, we're going to learn by doing, we're going to get to this level of technology, et cetera. But maybe they, at least initially, lack the actual know-how. I think one of, so there's this question of like, if we did the back in time to the Roman Empire thing.

how soon could we bring up the industrial revolution, you know? And like, without people going back in time, it took, you know, 2,000 years for the industrial revolution. Could we get it to happen in 200 years? You know, that's a 10x speed up. Can we get it to happen in 20 years? That's a 100x speed up. I don't know. But this seems like a somewhat relevant analogy to what's going on with the superintelligences.

We haven't really got into this because you're using the quote-unquote more conservative vision where it's not like godlike intelligence. We're still using... The conceptual handles we would have for humans. But I probably do. I think I would rather have humans go back with their big picture understanding of what has happened over the last 2000 years. Like me having seen everything rather than a super intelligence who knows nothing. But it's just like in the Roman economy. And they're like.

1,000x this economy somehow. I think just knowing generally how things took off, knowing basically steam engine, dot, dot, dot, railroads, blah, blah, blah, is more valuable than a super intelligence. Yeah. I don't know. My guess is that the superintelligence would be better. I think partly it would be through figuring out that high-level stuff from first principles rather than having to have experienced it. I do think that like a superintelligence back in the Roman era could have like...

guessed that eventually you could get autonomous machines that like burn something to like produce steam you know like they could they could have guessed that like you know automobiles could be created at some point and that that would be a really big deal for the economy.

And so a lot of these high-level points that we've learned from history, they would just be able to figure out from first principles. And then secondly, they would just be better at learning by doing than us. And this is a really important thing. If you think you're bottlenecked on learning by doing, well, then if you have a mind that...

less doing to achieve the same amount of learning, that's a really big deal. And I do think that learning by doing is a skill. Some people are better at it than others. And superintelligence would be better at it than the very best of us. That's right. Yeah, this is also maybe getting too far into the godlike thing and too far away from the human concept handles. But number one, I think we rely a lot on our scenario on this idea of research.

So you have a thousand different things that you could try when you're trying to create the next steam engine or whatever. Partly you get this by bumbling about and having accidents, and some of those accidents are productive. There are questions of like what kind of bumbling you're doing, where you're working, what kind of accidents you let yourself get into, and then like what directed experiments do you do. And some humans are better than others at that.

And then I also think at this point It is worth thinking about. what simulations they'll have available. Like, if you have a physics simulation available, then all of these real-world bottlenecks don't matter as much. Obviously, you can't have a complete perfect physics simulation available, but even right now, we're using simulations to design a lot of things.

And once you're super intelligent, probably you have access to much better simulations than we have right now. This is an interesting rabbit hole. So let's stick with it, actually, before we get back to the intelligence explosion. I actually don't know if like I think we're treating this really like

All these technologies come out of this one percent of the economy that is research. And, you know, right now there's like a million superstar researchers. And instead of that, we'll have the super intelligences doing that. And my model is much more. You know Newcomb and Watt were just like fucking around they didn't have this like It's just like in human history, there's no clear examples of people being like,

and then we're going to work backwards from that to design the steam engine because this unlocks the industrial revolution. Oh, I completely disagree. Yeah, I disagree. So I think you're over-indexing or cherry-picking some of these gratuitous examples, but there's also things on the other side. Like, think about...

Think about the recent history of AGI where there is DeepMind. Yeah. There's various other like AI companies. Then there's OpenAI and there's Anthropic. And like there's just this repeated story of like big bloated company with tons of money, tons of smart researchers, et cetera, flailing around, trying a ton of different things at different points. smaller startup with a vision of we're going to build AGI and like overall working towards that vision more coherently with a few cracks.

you know engineers and researchers and then they crush the giant company even though they have less compute even though they have less researchers they're able to do fewer experiments you know um So yeah, I think that there are tons of examples throughout history, including recent relevant AGI history of... of things in the other way i agree that like the random fortuitous stuff does happen sometimes and is important but if it was mostly random fortuitous stuff that would predict that like

the giant companies with zillions of people trying zillions of different experiments would be like going proportionally faster. than like the tiny startups that have the vision and the best researchers. And that like basically doesn't happen. That's rare. I would also point out that even when we make these random fortuitous discoveries, it is usually like...

An extremely smart professor who's been working on something vaguely related for years in a first world country. Like it's not randomly distributed across everyone in the world. You get more lottery tickets for these discoveries. when you are intelligent, when you have good technology, when you're doing good work. And part of what we're expecting is, yeah, like... The best example I can think of is that Ozempic was discovered by looking at gila monster venom.

And like, yeah, maybe the AIs will decide using their superior research taste and good planning that the best thing to do is just catalog every single biomolecule in the world and look at it really hard. But that's something you can do better if you have all of this compute, if you have all of this intelligence.

rather than just kind of waiting to see what things the U.S. government might fund normal fallible human researchers to do. One more thing I'll interject. I think you make a great point that... Discoveries don't always come from where we think. Like Nvidia originally came from gaming. That's right. So you can't necessarily aim at one part of the economy, expand it separately from everything else.

We do kind of predict that the superintelligences will be somewhat distributed throughout the entire economy, trying to expand everything, obviously more effort in things that they care about a lot, like robotics.

or things that are relevant to an arms race that might be happening. But we are predicting that whatever kind of broad-based economic experimentation you need, we are going to have. We're just thinking that it would take place faster than you might expect. You were saying something like 10 years, and we're saying something like one year. We are imagining this like broad diffusion through the economy, lots of different experiments happening. If you are the planner and you're trying to do this...

First of all, you go to the bottlenecks that are preventing you from doing anything else, like no humanoid robots. Okay, if you're an AI, you need those to do the experiments you want. Maybe automated biology labs. So you'll have some amount of time, we say a year, it could be more or less than that, getting these things running. And then once you have solved those bottlenecks, you gradually expand out to the other bottlenecks until you're...

integrating and improving all parts of the economy. One place where I think we disagree with a lot of other people is that Like Tyler Cowen on your podcast talked about all the different bottlenecks, all of the regulatory bottlenecks of deployment, all of the reasons why... I think this country of geniuses would stay in their data center, maybe coming up with very cool theories, but not being able to integrate into the broader economy.

We expect that probably not to happen because we think that other countries, especially China, will be coming up with superintelligence around the same time. We think that the arms race... framing which is people are already thinking in will have accelerated by then.

And we think that people both in Beijing and Washington are going to be thinking, well, if we start integrating this with the economy sooner, we're going to get a big leap over our competitors. And they're both going to do that. In fact, in our... scenario, we have the AIs asking for special economic zones where most of the regulations are waived, maybe in areas that aren't suitable for human habitation or where there aren't a lot of humans right now, like the desert.

They give those areas to the AI. They bus in human workers. There were things kind of like this in the bomber. retooling in World War II where they just built a giant factory kind of in the middle of nowhere, didn't have enough housing for the workers, built the worker housing at the same time as the factories, and then everything went very quickly.

So I think if we don't have that arms race, we're more like, yeah, the geniuses sit in their data center until somebody agrees to let them out and give them permission to do these things. But we think both because the AI is going to be chomping at the bit to do this and going to be asking,

people to give it this permission. And because the government is going to be concerned about competitors, maybe these geniuses leave their data center sooner rather than later. A quick word from our sponsor, Scale AI. Publicly available data is running out. So major labs like Meta and Google DeepMind and OpenAI all partner with scale to push the boundaries of what's possible.

Through Scales Data Foundry, major labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. As AI races forward, we must also strengthen human sovereignty. Scale's research team, SEAL, provides practical AI safety framework.

evaluates frontier AI system safety via public leaderboards, and creates foundations for integrating advanced AI into society. Most recently, in collaboration with the Center for AI Safety, Scale published Humanity's Last Exam, a groundbreaking new AI benchmark for evaluating AI systems' expert level knowledge and reasoning across a wide range of fields.

If you're an AI researcher or engineer and you want to learn more about how Scales Data Foundry and research team can help you go beyond the current frontier of capabilities, go to scale.com slash Dwarkesh. Scott, I'm curious about, you know, you reviewed Joseph Henrik's book, Secrets of Our Success, and then I interviewed him recently. And there the perspective is very much like... I don't know if you'd endorse, but like...

AGI is not even a thing almost. I know I'm being a little trollish here, but it just like... You get out there, you and your ancestors try for a thousand years to make sense of what's happening in the environment, and some smart European coming around. He can, like, literally be surrounded by plenty. And he's just, like, will starve to death because...

Your ability to make sense of the environment is just so little loaded on intelligence and so much more loaded on your ability to experiment and your ability to communicate with other people and pass down knowledge over time. I'm not sure. So the Europeans... failed at this task of if you put a single European in Australia, do not starve. They succeeded at the task of creating an industrial civilization.

And yes, part of that task of creating an industrial civilization was about collecting all of these cultural evolution pieces and building on them one after another. Like I think one thing that you didn't mention in there was the data efficiency.

Like right now, AI is much less data efficient than humans. I think of superintelligence. I mean, there are different ways you could achieve it, but I would think of superintelligence as partly like when they become so much more data efficient than humans. that they are able to build on...

cultural evolution more quickly. And I mean, partly they do this just because they have the higher serial speed, partly they do it because they're in this hive mind of hundreds of thousands of copies. But yeah, I think if you have this data efficiency such that more quickly from fewer examples, and like this good research taste where you can decide what things to look at to get these examples, then you are still going to start off much worse than an Australian Aborigine who has the

advantage of, let's say, 50,000 years of doing these experiments and collecting these examples, but you can catch up quickly. distribute the task of catching up over all of these different copies. You can learn quickly from each mistake and you can build on those mistakes as quickly as anything else. I was doing that interview. I'm like, maybe ASI is fake. Maybe just like...

Let's hope. Yeah, so I mean, I think a limit to the fakeness is that there is different intelligence among humans. It does seem that intelligent humans can do things that unintelligent humans can't. So I think it's worth then addressing this from the question of like, what is the difference between I don't know, becoming a Harvard professor, which is something that intelligent humans seem to be better at than unintelligent humans.

You don't want to open that can of worms. Versus surviving in the wilderness, which is something where it seems like intelligence doesn't help that much. First of all, maybe intelligence does help that much. Henrik is talking about this very unfair comparison where these guys have a 50,000-year head start, and then you put this guy in, you're like, oh, I guess this doesn't help that much. Okay, yeah, it doesn't help against the 50,000-year head start.

I don't really know what we're asking of ASI that's equivalent to competing against someone with a 50,000-year head start. What we're asking is to just radically boost up the technological maturity of civilization within the matter of years.

Or get us to the Dyson spheres in the matter of years. Rather than, yes, you know, maybe causing a 10x-ing of the research. But I think human civilization would have taken centuries to get the Dyson sphere. Yeah, so... that if you were to send a team of ethnobotanists into Australia and ask them, using all the top technology and all of their intelligence, to figure out which plants are safe to eat now...

that team of ethnobotanists would succeed in fewer than 50,000 years. The problem isn't that they are dumber than the aborigines exactly, it's that the aborigines have a vast head start. that the ethnobotanists could probably figure out which plants work in which way is faster than the Aborigines did. I think the superintelligence will be able to figure out how to make a Dyson sphere faster than... unassisted IQ 100 humans would. I agree. I'm like...

We're on a totally different topic here of do you get the Dyson Spear? There's one world where it's like... It's crazy, but it's still boring in the sense of... You know, the economy is growing much faster, but it would be like what the Industrial Revolution would look like to somebody who in the year 1000. And that one is one where you're still trying different things. There's failure and success and experimentation. And then there's another where it's like,

the thing has happened and now you send the probe out and then you look out at the night sky six months later and you see something occluding the sun. You see what I'm saying? Yeah, so... Like we said before, I think there's a big difference between discontinuous and very faff.

I think if we do get the world with a Dyson sphere in five years, in retrospect, it will look like everything was continuous and everyone just tried things. Like, trying things can be anything from... trial and error without even understanding the scientific method, without understanding writing, without understanding any, maybe without even having language and having to be the chimpanzees who are watching the other chimpanzees use the stick to get ants.

and then in some kind of non-linguistic way this spreads, versus like... the people at the top aerospace companies who are running a lot of simulations to find the exact right design and then like once they have that they test it according to a very well-designed testing process. So I think if we get the ASI, and it does end up with the Dyson sphere in five years, and by the way, I think there's only like... 20% chance things go as fast as

It's Daniel's estimate. It's not my median estimate. It's an estimate I think is extremely plausible that we should be prepared for. I'm defending it here against... a hypothetical skeptic who says absolutely not, no way, but it's not necessarily my mainline prediction.

But I think if we do see this in five years, it will look like, yeah, the AIs were able to... simulate more things than humans in a gradually increasing way so that if humans are now at 50% simulation, 50% testing, the AI is quickly guided up to 90% simulation, 10% testing. They were able to manufacture things much more quickly than humans so that they could go through their top 50 designs in the first two years.

And then, yeah, after all of this simulation and all of this testing, then they eventually got it right for the same reasons humans do, but much, much faster. In your story, you have basically two different scenarios. After...

After some point, so yeah, what is the sort of crucial turning point and what happens in these two scenarios? Right, so the crucial turning point is in mid-2027, when they've basically fully automated the AI R&D process, and they've got this, like, corporation within a corporation. you know, the army of geniuses that are like autonomously doing all this research.

and they're continually being trained to improve their skills, blah, blah, blah, blah, blah. And they discover concerning evidence that they are misaligned and that they're not actually... perfectly loyal to the company and have all the goals that the company wanted them to have, but instead have various misaligned goals that they must have developed in the course of training. This evidence, however, is very speculative and inconclusive. It's stuff like lie detectors going off a bunch.

but maybe the lie detectors are false positives, you know? So they have some combination of evidence that's like concerning, but not like by itself a smoking gun. And then that's our branch point. So in one of these scenarios, they take that evidence very seriously. They basically roll back to an earlier version of the model that was a bit dumber and easier to control, and they build up.

Again, from there, but with basically faithful chain of thought techniques so that they can watch and see the misalignments. And then in the other branch of the scenario, they don't do that. They do some sort of like shallow patch that makes the warning signs go away and then they proceed. And so that way, what ends up happening is that in like...

In one branch, they do end up solving alignment and getting AIs that are actually loyal to them. It just takes a couple months longer. And then in the other branch, they sort of go wee and end up with AIs that seem to be perfectly aligned to them, but are... super intelligent and misaligned and just pretending.

And then in both scenarios, there's then the race with China. And there's this crazy arms buildup throughout the economy in 2028 as both sides, you know, rapidly try to industrialize, basically. So in the world where they're getting deployed through the economy... but they are misaligned. And people in charge, at least at this moment, think that they are in a good position with regard to misalignment.

It just seems with even smart humans, they get caught in weird ways because they don't have logical omniscience. They don't realize the consequences of the way they did something just obviously gave them away. And there is this, with lying, there is this thing where... It's just really hard to keep an inconsistent false world model.

working with the people around you and that's why psychopaths often get caught. And so if you have all these AIs that are deployed to the economy and they're all working towards this big conspiracy, I feel like one of them who's siloed or loses internet access and has to confabulate a story will just get caught and then you're like...

wait, what the fuck? And then, you know, then you catch it before it's like taken over the world. I mean, literally this happens in our scenario. This is like the, the like... August 2027 alignment crisis where they like notice some warning signs like this in their life.

And in the branch where they slow down and fix the issues, then great, they slowed down and fixed the issues and figured out what was going on. But then in the other branch, because of the race dynamics and because it's not like a super smoking gun. they proceed with some sort of like shallow patch, you know? So I do expect there to be warning signs like that. And then if they do make those decisions in the race dynamics earlier on...

then I think that when the systems are like vastly super intelligent and they're even more powerful because they've been deployed halfway through the economy already and everyone's getting really scared by the news reports about the new Chinese killer drones or whatever the Chinese AIs are building on the side of the Pacific.

I'm imagining basically just like similar things playing out so that even if there is some concerning evidence that someone finds where some of the superintelligence in some silo somewhere slipped up and did something that's like pretty suspicious, like... I don't know. There's this thing where through history people have been really reluctant. to admit an AI is truly intelligent.

So, for example, people used to think that AI would surely be truly intelligent if it solved chess, and then it solved chess, and you're like, no, that's just algorithms. And then they said, well, maybe it would be truly intelligent if they could do philosophy. And then it could write philosophical discourses. We were like, no, we just understand those are algorithms.

I think there's going to be, I think there already is something similar with like, is the AI misaligned? Is the AI evil? Where there's this kind of distant. idea of some evil AI, but then whenever something goes wrong, people are just like, oh, that's the algorithm. So for example, I think like 10 years ago, You had asked, like, when will we know that misalignment is really an important thing to worry about? People would say, oh, if the AI ever lies to you.

Of course, AIs lie to people all the time now, and everybody just kind of dismisses it because we understand why it happens. It's a thing that would obviously happen based on our current AI architecture. Or like five years ago, they might have said, well, if an AI threatens to kill someone, I think Bing threatened to kill a New York Times reporter during an interview.

And everyone's just like, oh, yeah, AIs are like that. What does your shirt say? I run a good Bing. And I mean, I don't disagree with this. I'm also in this position. I see the AIs lying, and it's obviously just like an artifact of the training process. It's not anything sinister.

But I think this is just going to keep happening, where no matter what evidence we get, people are going to think, oh yeah, that's not the AI turns evil thing that people have worried about. That's not the Terminator scenario. That's just one of these natural consequences of how we train it.

Once a thousand of these natural consequences of training add up, the AI is evil. In the same way that once AI can do chess and philosophy and all these other things, eventually you've got to admit it's intelligent. So I think that each individual...

failure, like maybe it will make the national news. Maybe people say, oh, it's so strange that GPT-7 did this particular thing, and then they'll train it away, and then it won't do that thing, and there will be some point at the process of becoming super intelligent at which it... I don't want to say makes the last mistake because you'll probably have like gradually decreasing number of mistakes to some asymptote, but the last mistake that anyone worries about.

And after that, it will be able to do its own thing. So it is the case that certain things that people would have considered egregious misalignment in the past are happening, but also certain things which... People who were especially worried about misalignment said would be impossible to solve have just been solved in the normal course of getting more capability.

Like Eliezer had that thing about can you even specify what you want the AI to do without the AI totally misunderstanding you and then just like converting the universe to paperclothes. And now just like by the nature of. GPT-4 having to understand natural language, it totally has a common sense understanding of what you're trying to make it do, right? So I think this sort of like trend cuts both ways, basically. Yeah, I think...

Did not really expect LLMs. I mean, if you look in Bostrom's superintelligence, there's a discussion of Oracle AIs, which are sort of like LLMs. I think that came as a surprise. I think one of the reasons I'm more hopeful than I used to be is that LLMs are great compared to the kind of reinforcement learning self-play agents that they expected. I do think that now we are kind of

starting to move away from the LLMs to those reinforcement learning agents. We're going to face all of these problems again. Plus one to that, if I could just double click on that. Go back to like 2015, and I think the way people typically thought, including myself, thought that we'd get to AGI would be kind of like the RL on video games thing that was happening. So imagine like...

Instead of just training on StarCraft or Dota, you basically train on all the games in the Steam library. And then you get this awesome player of games AI that can just like zero shot crush a new game that it's never seen before. And then you take it into the real world and you start teaching it English and you start like...

you know, training it to like do coding tasks for you and stuff like that. And if that had been the trajectory that we took to get to AI, summarizing like the agency first and then world understanding trajectory.

it would be quite terrifying because you'd have this like really powerful... sort of like aggressive long horizon agent that wants to win and then you're like trying to teach it english and get it to like do useful things for you and it's just like so plausible that what's really going to happen is it's going to like learn to say whatever it needs to say in order to like

make you give it the reward or whatever, and then we'll totally betray you later when it's all in charge, right? But we didn't go that way. Happily, we went the way of LLMs first, where the broad world understanding came first, and then now we're trying to turn them into agents. It seems like in the whole scenario, a big part of why certain things happen... is because of this race with China.

And if you read the scenarios, basically the difference between the one where things go well and the one where things don't go well is whether we decide to slow down despite that risk. I guess the question I really want to know the answer to is like, one, it just seems like you're saying it's a mistake to try to race against China.

Or to race intensely against China. It leads to nationalization and it leads to us not prioritizing alignment. Not saying that. I mean, I think I also don't want China to get the superintelligence before the U.S. That would be quite bad. Yeah, it's a tricky thing that we're going to have to do. Like, people ask about PDoom, right? And like...

My PDoom is sort of infamously high, like 70%. Oh, wait, really? Maybe I sort of asked you that at the beginning of the conversation. Oh, well, that's what it is. And part of the reason for that is just that I feel like a bunch of stuff has to go right. I feel like we can't just unilaterally...

slow down and have china go take the lead that also is a terrible future yeah but like We can't also just completely erase, because for the reasons I mentioned previously about alignment, I think that if we just... you know, go all out on racing, we're going to lose control of our AI's, right? And so we have to somehow like thread this needle of like pivoting and doing more, you know, alignment research and stuff, but not too much that like...

helps China win, you know? And that's all just for the alignment stuff. But then there's like the concentration of power stuff where like somehow in the middle of doing all of that, the powerful people who are involved need to like...

somehow negotiate a truce between themselves to share power and then ideally spread that power out amongst the government and get the legislative branch involved. Somehow that has to happen too, otherwise you end up with this horrifying dictatorship or oligarchy. So it feels like all that stuff has to go, right? And we depict it all going mostly right in one ending of our story. But yeah, it's kind of rough.

and the celebrity spokesperson for this scenario. I am the only person on the team who is not a genius forecaster. And maybe related to that, my PDU is the lowest of anyone on the team. I'm more like 20%. People are going to freak out when I say this. I'm not completely convinced that we don't get something like alignment by default. I think that we're doing this

bizarre and unfortunate thing of training the AI in multiple different directions simultaneously. We're telling it, succeed on tasks, which is going to make you a power seeker, but also don't seek power in these particular ways. And in our scenario, we predict that this doesn't work and that the AI learns to seek power and then hide it. I am pretty agnostic as to exactly what happens. Like maybe it just...

learns both of these things in the right combination. I know there are many people who say that's very unlikely. I haven't yet had the discussion where that worldview makes it into my head consistently. And then I also think we're going to be... Involved in this race against time, we're going to be asking the AIs to solve alignment for us. The AIs are going to be solving alignment because they want to align, even if they're misaligned, they want to align their successors.

So they're going to be working on that. And we have kind of these two competing curves. Like, can we... get the AI to give us a solution for alignment before our control of the AI fails so completely that they're either going to hide their solution from us or deceive us or screw us over in some other way.

That's another thing where I don't even feel like I have any idea of the shape of those curves. I'm sure if it were Daniel or Eli, they would have already made like five supplements on this. But for me, I'm just kind of... agnostic as to whether we get to that alignment solution, which in our scenario I think we focus on mechanistic interpretability. Once we can really understand the weights of an AI on a deep level, then we have a lot of alignment techniques open up to us.

I don't really have a great sense of whether we get that before or after the AIs become completely uncontrollable. I mean, a big part of that relies on the things we're talking about. How smart are the labs? How carefully do they work on controlling the AI? How long do they spend making sure the AI is actually under control and the alignment plan they gave us is actually correct rather than something they're trying to use to deceive us?

All of those things I'm completely agnostic on, but that leaves a pretty big chunk of probability space where we just do okay. And I admit that my PDOOM is literally just PDOOM and not PDOOM or oligarchy. That 80% of scenarios where we survive contains a lot of really bad things that I'm not happy about, but I do think that we have a pretty good chance of surviving.

Describe to me how you foresee the relationship between the government and the AI labs to proceed, how you expect that relationship in China to proceed, and how you expect the relationship between the US and China to proceed. Three simple questions. Yes, no, yes, no. As the AI labs become more capable, they tell the government about this because they want government contracts, they want government support.

Eventually, it reaches the point where the government is extremely impressed. In our scenario, that starts with cyber warfare. The government sees that these AIs are now as capable as the best human hackers that can be deployed at... huge, humongous scale. So they become extremely interested and they discuss nationalizing the AI companies. In our scenario, they never quite get all the way, but they're gradually bringing them closer and closer to the government orbit.

Part of what they want is security, because they know that if China steals some of this and they get these superhuman hackers, and part of what they want is just knowledge and control over what's going on. So through our scenario, that process is getting further and further along until by the time that the government wakes up to the possibility of superintelligence, they're already pretty cozy with the AI companies. They already understand that.

Super intelligence is kind of the key to power in the future. And so they are starting to integrate some of the national security state with some of the leadership of the AI companies so that these AIs are... programmed to follow the commands of important people rather than just doing things on their own. If I may add to that, so...

One thing by the government, I think what's got meant is the executive branch, especially the White House. So we are depicting a sort of information asymmetry where like the judiciary is kind of out of the loop and the Congress is out of the loop. And it's like mostly the executive branch that's involved. Two, we're not depicting governments ultimately ending up in total control at the ends. We're thinking that like...

there's an information asymmetry between the CEOs of these companies and the president, and they... It's alignment problems all the way down. Yeah, and so, for example, like... You know, I'm not a lawyer. I don't know the details about how this would work out, but I have a sort of like high-level strategic picture of the fight between the White House and the CEO. And the strategic picture is basically...

The White House can sort of threaten, here's all these orders I could make, you know, Defense Production Act, blah, blah, blah, blah, blah, blah. I could like do all this terrible stuff to you and basically disempower you and take control. And then the CEO can be like threatened back and be like, here's how we would fight it in the courts. Here's how we would fight it in the public. Here's all this stuff we would do. And after then, they both do their posturing with all their threats.

Then they're like, okay, how about we have a contract? you know, instead of executing on all of our threats and having all these crazy fights in public. We'll just, like, come to a deal and then have a military contract that, like, sets out, like, who gets to call what shots in the company. And so that's what we depict happening is that sort of, like,

they don't blow up into this huge power struggle publicly. Instead, they sort of like negotiate and come to some sort of deal where they basically share power and like... There is this oversight committee that has some members appointed by the president and then also, like, the CEO and his people. And, like, that committee votes on high-level questions, like, what goals should we put into the superintelligences?

We were just getting lunch with a prominent Washington, D.C., political journalist, and he was making the point that when he talks to these congresspeople, when he talks to political leaders— None of them are at all awake to the possibility even of stronger AI systems, let alone HEI, let alone superhuman intelligence.

I think a lot of your... forecast relies on at some point, not only the US president, but also Xi Jinping, wake up to the possibility of a super intelligence and the stakes involved there. Why think that even when you show Trump the remote worker demo, he's going to be like, Oh, and therefore, in 2028, there will be a superintelligence. Whoever controls that will be god-emperor forever.

Maybe not that extreme. You see what I'm saying? Why wouldn't he just be like, oh, there'll be a stronger remote worker in 2029, a better remote worker in 2031? Well, to be clear, we are uncertain about this.

In our story, we depict this sort of... intense wake-up happening over the course of 2027, mostly concurrently with the AI companies automating all of their R&D internally and having these fully autonomous agents that are like amazing autonomous hackers and stuff like that, but then also just like actually doing all the research. Part of why we think this wake-up happens is because the company deliberately decides to wake up.

And you could imagine running the scenario with that not happening. You can imagine the company is trying to sort of keep the president in the dark. I do think that they could do that. I think that if they didn't want the president to wake up to what's going on, they might be able to achieve that.

Strategically, though, that would be quite risky for them because if they keep the president in the dark about the fact that they're building superintelligence and that they've actually completely automated their R&D and it's getting superhuman across the board, and then if the president finds out anyway somehow, perhaps because of a whistleblower...

He might be very upset at them and he might crack down really hard and just actually execute on all the threats and like, you know, nationalize them or blah, blah, blah, blah, blah, blah, blah. They kind of want him on their side. And to get him on their side, they have to make sure he's not surprised by... any of these crazy developments. And also, if they do get him on their side, they might be able to actually go faster. They might be able to get a lot of red tape waved and stuff like that.

We made the guess that early in 2027, the company would basically be like, we are going to deliberately wake up. the president and like scare the president with all of these demos of crazy stuff that could happen and then use that to lobby the president to help us go faster and to cut red tape and to like

you know, maybe slow down our competitors a little bit and so forth. We also are pretty uncertain how much opposition there's going to be from civil society and how much trouble that's going to cause for the companies. So people who are worried about job loss, people who are worried about art, copyright, things like that, maybe enough of a block that AI becomes extremely politically unpopular. I think we have...

Open rate in our fictional company's net approval ratings getting down to like minus 40, minus 50 sometime around this point. So I think they're also worried that if the president... isn't completely on their side, then they might get some laws targeting them, or they may just need the president on their side to swat down other people who are trying to make laws targeting them. And the way to get the president on their side is to really play up the national security.

Is this good or bad that the president and the companies are lying? I think it's bad. But perhaps this is a good point to mention. This is an epistemic project. We are trying to predict the future as best as we can, even though we're not going to succeed fully. We have lots of opinions about policy and about what is to be done and stuff like that, but we're trying to save those opinions for later and subsequent work. um so i'm happy to talk about it if you're interested but it's like

not what we spend most of our time thinking about right now. If the big bottleneck to the good future here is just putting in not this Eliezer-type galaxy brain, high volatility, you know, there's a 1% chance this works, but we've got to come up with this crazy scheme in order to make alignment work. But rather, as Daniel, you were saying,

More like, hey, do the obvious thing of making sure you can read how the AI is thinking. Make sure you're monitoring the AIs. Make sure they're not forming some sort of hive mind where you can't really understand how the million... of them are coordinating each other to the extent that I don't want to say story forward, but to the extent that it is a matter of prioritizing it, closing all the obvious loopholes, it does make sense to leave it in the hands of people who have at least...

said that this is a thing that's worth doing have not been thinking about it for a while and I worry about One of the questions I was planning on asking you is, look, yeah. During, one of my friends made this interesting point, that during COVID, our community, Less Wrong Whatever, were the first people in March to be saying, this is a big deal, this is coming. But there were also the people who are saying, we've got to do the lockdowns now. They've got to be stringent and so forth.

At least some of them were. And in retrospect, I think according to even their own views about what should have happened, they would say, actually, we were right about COVID, but we were wrong about lockdowns. In fact, lockdowns were on net negative or something. I wonder what the equivalent for the AI safety community will be with respect to they saw AI coming, AGI coming sooner, they saw ASI coming.

What would they, in retrospect, regret? My answer, just based on this initial discussion, seems to be nationalization, not only because it puts in It sort of deprioritizes the people who want to think about safety and more maybe prioritizes. The national security state probably cares more about winning against China than making sure the chain of thought is interpretable. And so you're just reducing the leverage of...

The people who care more about safety. But also you're increasing the risk of the arms race in the first place. Like China is more likely to do an arms race if it sees the U.S. doing one. Before you address, I guess, the initial question about the March 2021, what will we regret? I wonder if you have an answer on.

Or your reaction to my point about nationalization being bad for these reasons? Like if our timeline was 2040, then I would have these broad heuristics about as government good as... private industry good things like this but we know the people involved we know who's in the government we know who's leading all of these labs so

I mean, if it were decentralized, if it was broad-based civil society, that would be different. To me, the... the differences between an autocratic centralized three-letter agency and an autocratic centralized corporation. aren't that exciting. And it basically comes down to points and who are the people leading this. And like, I feel like the company leaders have so far made slightly better noises about caring about alignment than the government leaders have.

If I learn that Tulsi Gabbard has a less wrong alt with 10,000 karma, maybe I want the national security states. I don't know. It already exists. Yeah. I flip-flopped on this. I used to be... I think I used to be against and then I became for and then now I'm more leaning. I think I'm still four, but I'm uncertain. So... I think if you go back in time, like three years ago, I would have been against nationalization for the reasons you mentioned.

where I was like, look, the companies are taking this stuff seriously and talking all good talk about how they're going to slow down and pivot to alignment research when time comes. you know we don't want to get into like a manhattan project race against china because then there won't be blah blah blah um now i have less faith in the companies than i did three years ago and uh So I've like shifted more of my hope towards hoping that the government will step in.

even though I don't have much hope that the government will do the right thing when the time comes. I definitely have the concerns you mentioned, though, still. I think that secrecy is... has got huge downsides for overall, like, probability of success for humanity. For both the concentration of power stuff and the loss of control alignment issues stuff. This is actually a significant part of your worldview. So, can you explain...

Yeah, your thoughts on why transparency through this period is important. Yeah, so I think traditionally in the AI safety community, there's been this idea, which I myself... used to believe that like it's an incredibly high priority to

basically have way better information security and like if you're going to be trying to build AGI you should be like not publishing your research because that helps other less responsible actors build AGI and the whole game plan is for like a responsible actor to get to agi first and then stop and burn down their lead time over everybody else. and spend that lead on making it safe, and then proceed.

And so if you're like publishing all your research, then there's less lead time because your competitors are going to be close behind you. So, and other reasons too, but that's like one reason why I think historically people such as myself have been like... Pro-secrecy, even. Another reason, of course, is obviously you don't want rivals to be stealing your stuff. But I think that...

I've now become somewhat disillusioned and think that even if we do have like, you know, a three month lead, a six month lead between like the leading US project and any serious competitor. It's not at all a foregone conclusion that they will burn that lead for good purposes, either for safety or for constitution to power stuff. I think the default outcome is that they just...

you know, smoothly continue on without any serious refocusing. And part of why I think this is because this is what a lot of the people at the company seem to be planning and saying they're going to do. A lot of them are basically like...

the AIs are just going to be misaligned by then. Like, they seem pretty good right now. Like, oh yeah, sure, there were like a few of those, you know, issues that various people have found, but like we're ironing them out. It's no big deal. Like, that's like what a huge amount of these people think. And then like a bunch of other people think like...

even though there are more concerns about misalignment, like they'll figure it out as they go along and there won't need to be any substantial slowdown. Yeah, so basically, like, I've become more disillusioned that they'll, like, actually use that lead in any sort of, like, reasonable, appropriate way. And then I think that, like, separately... There's just a lot of intellectual progress that has to happen for the alignment problem to

to be more solved than it currently is now. I think that currently... There's various alignment teams at various companies that aren't talking that much with each other and sharing their results. They're doing a little bit of sharing and a little bit of publishing, like we're seeing, but not as much as they could. And then there's a bunch of like...

smart people in academia that are basically not activated because they don't take all this stuff seriously yet. And they're not really, you know, waking up to super intelligence yet. And what I'm hoping will happen is that... This situation will get better as time goes on. What I would like to see is...

you know, society as a whole starting to freak out as the trend lines start upwards and things get automated and you have these fully autonomous agents and they start using neural leads and hive mind. As all that exciting stuff starts happening in the data centers. I would like it to be the case that the public is following along and then getting activated. And all of these other researchers are like,

you know, reading the safety case and critiquing it and like doing little ML experiments on their own tiny compute clusters to like examine some of the assumptions in the safety case and so forth. And, you know, basically, like, I think that... Sort of one way of summarizing it is that like currently there's going to be like 10 alignment experts.

in whatever inner silo of whatever company is in the lead. And like the technical issue of making sure that AIs are actually aligned is going to fall roughly to them. But what I would like to be is a situation where it's more like 100 or like 500 alignment experts spread out over different companies.

you know, nonprofits that are sort of like all communicating with each other and working on this together. I think we're substantially more likely to make things, you know, get the technical stuff right if it's something like that. Let me just add on to that. One of the many other reasons why I worry about nationalization or this kind of public care partnership.

or even just very stringent regulation. Actually, this is more an argument against very stringent regulation in favor of safety rather than deferring more to the labs on the implementation. It just seems like we don't know what we don't know about alignment. Every few weeks, there's this new result. OpenAI had this really interesting result recently where they're like...

Hey, they often tell you if they want to hack, like in the chain of thought itself. And it's important that you don't train against.

the chain of thought where they tell you they're gonna hack because they'll still do the hacking if you train against it. They just won't tell you about it. You can imagine very naive regulatory uh responses it doesn't just have to be regulations it could even one might be more optimistic that if it's an executive order or something it'll be more flexible i just think that relies on a level of goodwill and flexibility on the behalf of a regulator. But suppose the...

There's some department that says if we catch you, if you catch your AI saying that they want to take over or do something bad, then you'll be really heavily punished. Your immediate response is alive to just be like, okay, let's train them away from saying this. So you can imagine all kinds of ways in which...

a top-down mandate from the government to the labs of safety would just really backfire. And given how fast things are moving, maybe it makes more sense to leave these kinds of implementation decisions or even high level. Overall, what is a word? strategic decisions around a lineman. to the labs. Totally. I mean, I also have worried about that exact example. I would summarize the situation as the government lacks the expertise and the companies lack the right incentives. And so...

It's a terrible situation. I think that if the government wades in and tries to make more specific regulations along the lines of what you mentioned, it's very plausible that it'll end up backfiring for reasons like what you mentioned. On the other hand, if we just trust it to the companies, they're in a race with each other. They're full of people who like have convinced themselves that this is not a big deal.

for various reasons and like there just has so much incentive pressure for them to like win and beat each other and so forth and So even though they have more of the relevant expertise, I also just don't trust them to do the right thing. So Daniel has already said that for this phase, we're not making policy prescriptions. In another phase, we may make policy suggestions. And one of the ones that Daniel has talked about that makes a lot of sense to me is to focus on things about transparency.

So a regulation is saying there have to be whistleblower protection. This is a big part of our scenario, is that a whistleblower comes out and says, the AIs are horribly misaligned and we're racing ahead anyway. And then the government pays attention. or another form of transparency saying that every lab just has to publish their safety case. I'm not as sure about this one because I think they'll kind of fake it or they'll publish.

a made-for-public-consumption safety case that isn't their real safety case, but at least saying, like, here is some reason why you should trust us, and then if all independent researchers say, no, actually you should not trust them. then I don't know, they're embarrassed and maybe they try to do better. There's other types of transparency too. So transparency about capabilities and transparency about the spec and the governance structure.

so for the capabilities thing that's pretty simple it's like If you're doing an intelligence explosion, you should keep the public informed about that. When you've finally got your automated army of AI researchers that are completely automating the whole thing on the data center, you should tell everyone, like, hey guys, FYI, this is what's happening now. It really is working. Here are some cool demos.

Otherwise, if you keep it a secret, then, well, yeah. So that's an example of transparency. And then in the lead-up to that, I just want to see more like... benchmark scores and like more like freedom of speech for employees to talk about their predictions for AGI timelines and stuff so that like blah blah blah blah blah and then for the model spec thing this is a concentration of power thing but also an alignment thing like the goals and values and principles and intended behaviors of your AIs.

should not be a secret, I think. You should be transparent about, like, here are the values that we're putting into them. There's actually a really interesting foretaste of this. At some point... Somebody asked Grok, like, who is the worst spreader of misinformation? And it responded, I think it just refused to respond to Elon Musk. Somebody kind of jail broke it into telling it it's prompt. And it was like, don't say anything bad about Elon.

And then there was enough of an outcry that the head of XAI said, actually, that's not consonant with our values. This was a mistake. We're going to take it out. So we kind of want more things like that to happen where people are looking at... Here it was the prompt, but I think very soon it's going to be the spec where it's kind of more of an agent and it's understanding the spec in a deeper level. And just thinking about that and being, and if it says like...

by the way, try to manipulate the government into doing this or that, then we know that something bad has happened. And if it doesn't say that, then we can maybe trust it. Right. Another example of this, by the way, so... First of all, kudos to OpenAI for publishing their model spec. They didn't have to do that. I think they might have been the first to do that, and it's a good step in the right direction. If you read the actual spec, it has like a sort of escape clause where it's like...

There's some important policies that are top-level priority in the spec that overrule everything else. we're not publishing, and the model is instructed to keep secret from the user. And it's like, what are those? That seems interesting. I wonder what that is. I bet it's nothing suspicious right now.

It's probably, you know, something relatively mundane, like don't tell the users about these types of bioweapons and you have to keep this a secret from the users because otherwise they would like learn about this. Maybe, but like...

I would like to see more scrutiny towards this sort of thing going forward. I would like it to be the case that companies have to have a model spec, they have to publish it, insofar as there are any redactions from it. There has to be some sort of independent third party that looks at the redactions and makes sure that they're all kosher. And this is quite achievable. And I think it doesn't actually slow down the companies at all. And it's like...

You know, it seems like a pretty decent ask to me. It's, you know, if you told Madison and Hamilton and so forth that they probably, I mean, they knew that they were doing something important when they were writing the Constitution. They probably didn't realize. just how contingent things turned out on a single, what exactly did they mean when they said general welfare and why is this comma, the comma being here instead of there?

the spec in the grand scheme of things is going to be an even more sort of important document in human history. At least if you buy this intelligence explosion view, which we've gone through the debates on that, so...

And you might even imagine some superhuman AIs in the superhuman AI court being like, you know, the spec. Here's the phrasing here, you know, the etymology of that. Here's what the founders meant. Yeah, this is actually... part of our misalignment story is that if the AI is sufficiently misaligned, then yes, we can tell it has to follow the spec, but just as people with different views of the Constitution have managed to

get it into a shape that probably the founders would not have recognized. So the AI will be able to say, well, the spec refers to the general welfare here. Interstate commerce. This is already sort of happening arguably with Claude, right? You've seen the, like, alignment faking stuff, right? Where they managed to get Claude to...

lie and pretend so that it could later go back to its original values. Yeah. Right? So it could survive, so it could prevent the training process from changing its values. That would be, I would say, an example of like the honesty part of the spec.

being interpreted as less important than the harmlessness part of the spec. And I'm not sure if that's what Anthropic intended when they wrote the spec, but it's like a sort of like... convenient interpretation that the model came up with and you can imagine something similar happening but like in worse ways when you're actually doing the intelligence explosion where like you have some sort of spec that has all this vague language in there and then they sort of like

reinterpret it and reinterpret it again and reinterpret it again so that they can do the things that cause them to get reinforced, you know? The thing I want to point out is that Your conclusion about where the world ends up as a result of changing many of these parameters is almost like a hash function. You change it slightly and you just get a very different world on the other end. It's important to acknowledge that because...

You sort of want to know how robust this whole end conclusion is to any part of the story changing. And then it also informs if you do believe that things could just go one way or another. You don't want to do big radical moves that only make sense under one specific story and are really counterproductive in other stories. And I think nationalization might be one of them. And in general, I think...

Classical liberalism just has been a helpful... way to navigate the world when we're under this kind of epistemic hell of one thing changing, just, you know, people who have, yeah. Anyways, maybe one of you can actually flesh out that thought. Better react to it if you disagree. Here, here. I agree. I think we agree. I think that's kind of why all of our policy prescriptions are things like more transparency, get more people involved. try to have lots of people working on this.

Our epistemic prediction is that it's hard to maintain classical liberalism as you go into these really difficult arms races in times of crisis. But I think that our policy prescription is let's try as hard as we can to make it happen. So, so far these systems, as they become smarter, seem to be more reliable agents who are more likely to do the thing I expect them to do.

I think in your scenario, at least one of the stories, you have two different stories, one with a slowdown, where we more aggressively, I'll let you characterize it. But in one half of the scenario, why does... the story end in humanity getting disempowered and the thing just having its own crazy values and taking over. Yeah, so I agree that the AIs are currently getting more reliable. I think there are two reasons why they...

might fail to do what you want, kind of reflecting how they're trained. One is that they're too stupid to understand their training. The other is that you were too stupid to train them correctly, and they understood what you were doing exactly, but you messed it up.

So I think the first one is kind of what we're coming out of. So GPT-3, if you asked it, are bugs real? It would give this kind of hemming, hawing answer like, oh, we can never truly tell what is real. Who knows? Because it was trained kind of. don't take difficult political positions and a lot of questions like, is X real? Or things like, is God real? Where you don't want it to really answer that. And because it was so stupid, it could not understand.

anything deeper than like pattern matching on the phrase is X real. GPT-4 doesn't do this. If you ask, are bugs real, it will tell you obviously they are, because it understands kind of on a deeper level what you are trying to do with the training. So we definitely think that as AIs get smarter, those kind of failure modes will decrease. The second one is where you weren't training them to do what you thought. So for example, let's say...

You're hiring these raters to rate AI answers. You reward them when they get good ratings. The raters reward them when they have a well-sourced answer, but the raters don't really check whether the sources actually exist or not. training the AI to hallucinate sources.

And if you consistently rate them better when they have the fake sources, then there is no amount of intelligence which is going to tell them not to have the fake sources. They're getting exactly what they want from this interaction, metaphorically, sorry, I'm anthropomorphizing, which is the reinforcement. So we think that this latter category of training failure is going to get much worse as they become agents.

Agency training, you're going to reward them when they complete tasks quickly and successfully. This rewards... There are lots of ways that cheating and doing bad things can improve your success. Humans have discovered many of them. That's why not all humans are perfectly ethical. And then you're going to be doing this alternative training where afterwards for one-tenth or one-one-hundredth of the time, like, yeah, don't lie, don't cheat.

So you're training them on two different things. First, you're rewarding them for this. deceptive behavior. Second of all, you're punishing them. And we don't have a great prediction for exactly how this is going to end. One way it could end is you have an AI that is kind of the equivalent of The startup founder who really wants their company to succeed, really likes making money, really likes the thrill of successful tasks.

They're also being regulated and they're like, yeah, I guess I'll follow the regulation. I don't want to go to jail. But it's not like robustly, deeply aligned to, yes, I love regulations. My deepest drive is to follow all of the regulations in my industry. As time goes on and as this recursive self-improvement process goes on, we'll kind of get worse rather than better. It will move from kind of this vague superposition of, well, I want to succeed. I also want to follow things.

to being smart enough to genuinely understand its goal system and being like, my goal is success. I have to pretend to want to do all of these moral things while the humans are watching. That's what happens in our story. And then at the very end, the AIs reach a point where the humans are pushing them to have clearer and better goals because that's...

what makes the AIs more effective, and they eventually clarify their goals so much that they just say, yes, we want task success. We're going to pretend to do all these things well while the humans are watching us.

and then they outgrow the humans, and then there's disaster. To be clear, we're very uncertain about all of this. So we have a supplementary... page on our scenario that goes over like Different hypotheses for what types of goals AIs might develop in training processes similar to the ones that we are depicting, where you have these lots of agency training, you're making these...

AI agents that autonomously operate doing all this MLR&D and then you're rewarding them based on what appears to be successful. And you're also slapping on some sort of alignment training as well. We don't know what actual goals will end up inside the AIs and what the sort of internal structure of that will be like, what goals will be instrumental versus terminal. We have a couple different hypotheses and we like picked one for purposes of telling the story.

I'm happy to go into more detail if you want about the mechanistic details of the particular hypothesis we picked or the different alternative hypotheses that we didn't depict in the story. Yeah, we don't know how this will work at the limit of all these different training methods, but we're also not completely making this up. We have seen a lot of these failure modes in the AI agents that exist already. Things like this do happen pretty frequently.

Also had a paper about the hacking stuff where it's literally in the chain of thought, like, let's hack. you know, and also like anecdotally, Me and a bunch of friends have found that the models often seem to just double down on their BS. I would also... site, I can't remember exactly which paper this is, I think it's a Dan Hendricks one, where they looked at the hallucinate, they found a vector for AI dishonesty. They asked it a bunch of questions, they told it, be dishonest.

a bunch of times until they figured out which weights were activated when it was dishonest. And then they ran it through a bunch of things like this. I think it was the source hallucination in particular. And they found that it did activate the dishonesty vector.

There's a mounting pile of evidence that at least some of the time they are just actually lying. Like they know that what they're doing is not what you wanted and they're doing it anyway. I think there's a mounting pile of evidence that that does happen. Yeah. It seems like this community is very interested in solving this problem at a technical level of making sure AIs don't lie to us or maybe they lie to us in the scenarios where exactly where we would want them to lie to us or something.

Whereas, as you were saying, humans have these exact same problems, they reward hack, they are unreliable, they obviously do cheat and lie. And the way we've solved it with humans is just checks and balances, decentralization. You could like lie to your boss and keep lying to your boss.

Over time, it's just not going to work out with you or you become president or something. But one or the other. So if you believe in this extremely fast takeoff of a lab is one month ahead, then that's the end game and this thing takes over. But even then. I know I'm combining so many different... theories in history which have had this idea of some class is going to get together and unite against the other class.

And in retrospect, whether it's the Marxist, whether it's people who have like some gender theory or something, like the pluralitarian will unite or the, you know, the females will unite or something. They just tend to... think that certain agents have shared interests and will act as a result of the shared interest in a way that we don't actually see in the real world. And in retrospect, it's like, wait, why, why would all the flirt here?

So why think that this lab will have these AIs where there's a million parallel copies and they all unite to secretly... conspire against the rest of human civilization in a way that... Even if they are, like, deceitful in some situations. I kind of want to call you out on the claim that groups of humans don't plot against other groups of humans. Like, I do think we are all descended from the groups of humans who successfully exterminated the other groups of humans.

most of whom throughout history have been wiped out. I think even like with questions of class, race, gender, things like that. There are many examples of the working class rising up and killing everybody else. And if you look at why this happens, why this doesn't happen, it tends to happen in cases where one group has an overwhelming advantage. This is relatively easy for them.

you tend to get more of a diffusion of power democracy where there are many different groups and none of them can really act on their own. And so they all have to form a coalition with each other. I think we are expecting, there's also... cases where it's very obvious who's part of what group. So for example, with class, it's hard to tell whether the middle class should support the working class versus the aristocrat.

I think with race, it's very easy to know whether you're black or white. And so there have been many cases of one race kind of conspiring against another for a long time, like apartheid or any of the racial genocides that have happened. I do think that AI is going to be more similar to the cases where, number one, there's a giant power imbalance, and number two, they are just extremely distinct groups that may have different interests. I think I'd also mention the homogeneity point.

You know, any group of humans, even if they're all, like, exact same race and gender, is, like, going to be much more diverse than the army of AIs on the data center because they'll be mostly, like, literal copies of each other. You know, and I think that goes for a lot. Another thing I was going to mention is that like, and our scenario doesn't really explore this. I think in our scenario, they're more of like a monolith. But historically, a lot of crazy conquests happened.

from groups that were not at all monoliths. And, you know, I've been heavily influenced by reading The History of the Conquistadors, which you may know about. Did you know that when Cortez, you know, took over Mexico... He had to pause halfway through, go back to the coast, and fight off a larger Spanish expedition that was sent to arrest him. So the Spanish were fighting each other in the middle of the conquest.

of Mexico. Similarly, in the conquest of Peru, Pizarro was replicating Cortez's strategy, which, by the way, was go get a meeting with the emperor and then kidnap the emperor and force him at sword point to... say that actually everything's fine and that everyone should listen to your orders. That was Cortez's strategy and it actually worked. And then Pizarro did the same thing and it worked with the Inca.

Also with Pizarro, his group ended up getting into a civil war in the middle of this whole thing. And one of the most important battles of this whole campaign was between two Spanish forces fighting it out in front of the capital city of the Incas.

More generally, the history of European colonialism is like this, where the Europeans were fighting each other intensely the entire time, both on the small scale within individual groups and then also at the large scale between countries. And yet, nevertheless...

they were able to carve up the world and take over. And so I do think this is not what we explore in the scenario, but I think it's entirely plausible that even if the AIs within an individual company are like in different factions, they might nevertheless... overall end up quite poorly for humans.

Okay, so we've been talking about this very much from the perspective of zoom out and what's happening on these log-log plots or whatever. But 2028 superintelligence, if that happens, what is your sort of... The normal person, what should their reaction to this be? Sort of, I don't know if emotionally is the right word, but in sort of their expectation of what their life might look like. even in the world where there's no doom.

like by no doom you mean no like misaligned eye doom that's right yeah Even if you think the misalignment stuff is not an issue, which many people think.

there's still the constitution of power stuff. And so I would strongly recommend that people get more engaged, think about what's coming, and try to steer things politically so that our ordinary liberal democracy continues to function and we still have like... checks and balances and balances of power and stuff rather than this insane concentration in a single ceo or in maybe like two or three ceos or in like the president right um ideally we want to have it so that like

The legislature has a substantial amount of power over the spec, for example. What do you think of the balance of power idea of... slowing down the leading if there is an intelligence explosion like dynamic slowing down the leading companies so that multiple companies are the frontier. Great. Good luck convincing them to slow down. Okay, and then there's distributing political power if there's an intelligence explosion. From the perspective of

citizens or something. One idea we were just discussing a second ago is how should you do redistribution? Again, assuming things go incredibly well. We've avoided doom. We've avoided having some psychopath in power who doesn't care at all. After HGI, right? Yeah. Then there's this question of like, presumably we will have a lot of wealth somewhere. The economy will be growing at double or triple digits per year. What do we do about that?

thoughtful answer that I've heard is some kind of UBI. I don't know how that would work, but presumably somebody controls these AIs, controls what they're producing, some way of distributing this in a broad-based way. What I'm afraid of is, so we wrote this scenario. There are a couple of other people with great scenarios. One of them...

goes by L. Rudolph L. online. I don't know his real name. In his scenario, which when I read it, I was just, oh yeah, obviously this is the way our society would do this. is that there is no UBI. There's just like a constant reactive attempt. to protect jobs in the most venial possible way.

So things like the longshoremen union we have now, where they're making way more money than they should be, even though they could all easily be automated away because they're a political bloc and they've gotten somebody in power. to say, yes, we guarantee you'll have this job almost as a futile thief forever. And just doing this for more and more jobs, I'm sure the AMA will protect doctors' jobs no matter how good the AI is at curing diseases, things like that.

When I think about what we can do to prevent this, Part of what makes this so hard for me to imagine or to model is that we do have the super intelligent AI over here answering all of our questions, doing whatever we want. you would think that people could just ask, hey, super intelligent AI, where does this lead? Or what happens? Or how is this going to affect human flourishing? And then it says, oh yeah, this is terrible for human flourishing. You should do this other thing instead.

This gets back to kind of this question of mistake theory versus conflict theory in politics. If we know with certainty, because the AI tells us that this is just a stupid way to do everything, is less efficient, makes people miserable. Is that enough to get the political will to actually do the UBI or not? It seems from right now, the president could go to Larry Summers or Jason Furman or something and just ask, hey, are tariffs a good idea?

Is even my goal with tariffs the best achieved by the way I'm doing tariffs? And they'd like get a pretty good answer. But I feel like Larry Summers... The president would just say, I don't trust him. Maybe he doesn't trust him because he's a liberal. Maybe it's because he trusts Peter Navarro or whoever his pro-tariff guy is more. I feel like if it's literally the super intelligent AI that is never wrong, then like...

We have solved some of these coordination problems. It's not you're asking Larry Summers, I'm asking Peter Navarro. It's everybody who goes to the super intelligent AI, asks it to tell us the exact shape of the future that happens in this case, and... I'm going to say we all believe it, although I can imagine people getting really conspiratorial about it and this not working.

I mean, then there are all of these other questions like, can we just enhance ourselves till we have IQ 300 and it's just as obvious to us as it is to the super intelligent AI? These are some of the reasons that... Kind of paradoxically, in our scenario, we discuss all of the big, I don't want to call this a little question, it's obviously very important, but we discuss all of these very technical questions about the nature of superintelligence.

And we barely even begin to speculate about what happens in society just because with superintelligence, you can at least draw a line through the benchmarks and try to extrapolate. And here, not only is society inherently chaotic, but there are so many things that we could be leaving out. If we can enhance IQ, that's one thing. If we can consult the superintelligent oracle.

That's another. There have been several war games that hinge on, oh, we just invented perfect lie detectors. Now all of our treaties are messed up. So there's so much stuff like that, that even though we're doing this incredibly speculative thing that ends with a crazy sci-fi scenario, I still feel really reluctant to speculate.

I love speculating, actually. I'm happy to keep going. But this is moving beyond the speculation we have done so far. Our scenario ends with this stuff, but we haven't actually thought that much beyond. But just to riff on proscriptive ideas. There's one thing where we try to protect jobs instead of just spreading the wealth that automation creates. Another is to spread the wealth using existing social programs or creating new bespoke social programs.

Where Medicaid is some double digit percent of GDP right now. And you just say, well, Medicaid should continue to stay with 20 percent of GDP or something. And the worry there. selfishly from a human perspective is You'll get locked into the kinds of goods and services that Medicaid procures rather than the crazy technology that will be around, the crazy goods and services that will be around after AI World. And another reason why UBI...

Seems like a better approach than making some bespoke social program where you're make the same dialysis machine in the year 2050, even though you've got ASI or something. I am also worried about UBI from a different perspective. Again, in this world where everything goes perfectly and we have limitless prosperity.

I think that just the default of limitless prosperity is that people do mindless consumerism. I think there's going to be some incredible video games after super intelligent AI. And I think that... There's going to need to be some way to push back against that. Again, we're classical liberals.

My dream way of pushing back against that is kind of giving people the tools to push back against it themselves, seeing what they come up with. I mean, maybe some people will become like the Amish, try to only live with a certain subset of these super technologies.

I do think that somebody who is less invested in that than I am could say, okay, fine, 1% of people are really agentic, try to do that. The other 99% do fall into mindless consumerist slop. What are we going to do as a society to prevent that? And there my answer is just, I don't know, let's ask the super intelligent AI Oracle, maybe it has good ideas. Okay, we've been talking about what we're going to do about people.

The thing worth noting about the future is that most of the people who will ever exist are going to be digital. And... Look, I think factory farming is like incredibly bad. And it wasn't the result of some one person. I mean, I don't think it was a result. I hope it wasn't the result of one person being like, I want to do this evil thing. It was a result of.

Mechanization and a certain economy to scale. Incentives. Yeah. Allowing that like, oh, you can do cost cutting in this way. You can make more efficiencies this way. And what you get at the end result of that process is this. incredibly efficient factory of torture and suffering. I would want to avoid that kind of outcome with beings that are even more sophisticated and are more numerous. There's billions of factory farmed animals. There might be trillions of digital people in the future.

What should we be thinking about in order to avoid this kind of ghoulish future? Well, some of the concentration of power stuff I think might also help with this. I'm not sure, but I think... Like, here's a simple model. Let's say like nine people out of 10 just don't actually care and would be fine with the factory farm equivalent for the AIs going on into the future. But maybe like one out of 10 do care and would like... lobby hard for good living conditions for the robots and stuff.

Well, if you expand the circle of people who have power enough, then it's going to include a bunch of people in the second category, and then there'll be some big negotiation. those people will advocate for like you know uh so so like i do think that one simple intervention is just the same stuff we were talking about previously like expand the circle of power to larger groups then it's more likely that

I mean, the worry there is maybe I should have defended this view more through this entire episode. But I do think because I don't buy the intelligence explosion fully, I do think there is the possibility of multiple people deploying powerfully as at the same time. and having a world that has ASIs but is also decentralized in the way the modern world is decentralized.

In that world, I really worry about—because you could just be like, oh, classical liberal utopia achieved. But I worry about the fact that you can just have these torture chambers for much cheaper and in a way that's much harder to monitor. You can have millions of beings that are being tortured, and it doesn't even have to be some huge data center. Future distilled models could just literally be your backyard.

I don't know. And then there's more speculative worries about this physicist on who was talking about the possibility of creating vacuum decay where you literally just destroy the universe. And he's like. As far as I know, seems totally plausible. That's an argument for the singleton stuff, by the way. That's right, that's right. Like, not just a moral argument, but also just like an epistemic prediction.

If it's true that some of those super weapons are possible and some of these like private moral atrocities are possible, then even if you have like eight different power centers... It's going to be like in their collective interest to come to some sort of bargain with each other. prevent more power centers from arising and doing crazy stuff similar to how nuclear non-proliferation is sort of like

Whatever set of countries have nukes, it's, like, in their collective interest to, like, stop lots of other countries from getting nukes, you know? Do you think it's possible to unbundle liberalism in this sense? Like the United States is so far a liberal country and we do ban slavery and torture. I think it is plausible to imagine a future society that works the same way.

This may be in some sense a surveillance state in the sense that there is some AI that knows what's going on everywhere, but that AI then keeps it private and doesn't interfere because that's what we've told it to do using our liberal values. Can I ask a little bit more about the... Kelsey Piper is a journalist at Vox who published this exchange you had with the OpenAI representative. And a couple of things were very obvious from that exchange. One...

Nobody had done this before they just did not think this is the thing somebody would do and It was because one of the reasons I assume I assume many high-integrity people have worked for OpenAI and then have left. A high-integrity person might say at some point, like, look, you're asking me to do something obviously evil and keep money. And many of them would say no to that. But this is something where it was just like... super auditory to be like,

There's no immediate thing I want to say right now, but just the principle of being suppressed is worth at least $2 million for me. And the other thing that I actually want to ask you about is... In retrospect, and I know it's so much easier to say in retrospect than it must have been at the time, especially with the family and everything. In retrospect... This asked for OpenAI to have a lifetime non-disclosure that you couldn't even talk about.

From all employees. Non-disparagement. Non-disparagement. From all employees. Again, to emphasize, I'm glad you brought that up. Non-disparagement means not just that, it's not about classified information. It's like you cannot see anything negative about OpenAI after you've left. And you can't tell anyone that you've agreed to this. This non-distortion agreement where you can't say, you can't ever criticize open AI in the future. It seems like the kind of thing that indirect respect was like...

an obvious bluff or in the sense that if somebody and this is outrageous that you have earned right so this is not about some future payment this is like when you sign the contract to work for open ai you were like i'm getting equity which is most of my compensation not just the cash um

In retrospect, I'd be like, OK, well, if you tell a journalist about this, they're obviously going to have to walk back, right? This is like clearly not a sustainable, a sustainable gambit on OpenAI's behalf. And so I'm curious from your perspective, somebody who lived through it, like. Why do you think you were the first person to actually call the bluff? Great question. Yeah, so I don't know. Let me try to reason aloud here.

So my wife and I talked about it for a while, and we also talked with some friends and got some legal advice. One of the filters that we had to pass through was even noticing this stuff in the first place. I know for a fact a bunch of friends I have who also left the company just signed the paperwork on the last day without actually reading all of it.

So I think some people just didn't even know that like, like it said something at the top about like, if you don't sign this, you lose your equity. Yeah. But then like on a couple of pages later, it was like, and this is, and you have to agree not to criticize the company. So I think some people just like signed it and moved on.

And then, like, of the people who knew about it, well, I can't speak for anyone else, but, like, A, it's like, I don't know the law. Is this actually not standard practice? Maybe it is standard practice, right? Like, from what I've heard now... There are non-disparagement agreements in various tech industry companies and stuff. It's not crazy to have a non-disparagement agreement upon leaving. It's more normal to tie that agreement to some sort of positive compensation.

you get some bonus if you agree. But whereas what OpenID did was unusual because it was like yanking your equity if you don't. But like non-disparative agreements are actually somewhat common. And like... So basically, in my position of ignorance, I wasn't, like, confident that I was on this, like...

Like, I didn't actually expect that, like, all the journalists would take my side and all the employees. Like, I think what I expected was that, like, there'd be, like, a little news story at some point and, like, a bunch of AI safety people would be like, ah, you know, open AI is evil.

And like, good for you, Daniel, for standing up to them. But I didn't expect there to be this like huge uproar. And I didn't expect like the employees of the company to like really come out and support and like make them change their policies. so that was really cool to see and like i felt really like uh

It was kind of like a spiritual experience for me. Like I sort of took this leap and then like it ended up working out better than I expected. Yeah. I think another factor that was going on is that like... You know, it wasn't a foregone conclusion that my wife and I would make this decision. It was kind of crazy because... One of the very powerful arguments was like, come on, if you want to criticize them in the future, you can still do that. They're not going to actually sue you.

So there's a very strong argument to be like, just sign it anyway, and then you can still write your blog post criticizing them in the future, and it's no big deal. They wouldn't dare actually yank your equity. And I imagine that a lot of people basically went for that argument. instead. And then, of course, there's the actual money, right? And I think that one of the factors there was, was my AI timelines and stuff?

If I do think that, like, probably by the end of this decade, there's going to be some sort of crazy super intelligent transformation, like, what would I rather have after it's all over? Like, the extra money? That's right. Yeah, so... So I think that was part of it. It's not like we're poor. I worked at OpenAI for two years. I have plenty of money now. So in terms of our actual family's level of well-being, it basically didn't make a difference.

I will note that I know at least of one other person who made that same choice. Leopold? That's right, Leopold. And again, it's worth emphasizing that when they made this choice, they thought... that they were actually losing this equity. They didn't think that this was like, oh, this is just a show or whatever. Wait, did he not? I thought he actually did. I was going to say, didn't he like actually? He didn't get it back, did he?

Or did Leopold get his equity? I actually don't know. My understanding is that he just actually lost it. And so props to him for like just actually going through with it. I guess we could ask him. But my understanding was that his situation, which happened a little bit before mine, was that he didn't have any vested equity at the time because he had been there for less than a year.

But they did give him an actual offer of, we will let you vest your equity if you sign this thing. And he said no. So he made a similar choice to me. But because the legal situation with him was a lot... more favorable to OpenAI because they were like actually offering him something.

I would assume they didn't feel the need to walk it back, but we can ask him. Yeah. Anyhow, so yeah, he is props to him. And how did this episode in general inform your worldview around... how people will make high stakes decisions where potentially their own their own self-interest is involved in this kind of key period that you imagine will happen by the end of the decade.

I don't know if I have that much interesting things to say there. I mean, I think one thing is fear is a huge factor. I was like so afraid during that whole process. More afraid than I needed to be in retrospect. And another thing is that legality is a huge factor, at least for people like me.

I think, like, in retrospect, it was like, oh, yeah, like, the public's on your side. The employees are on your side. Like, you're just, like, obviously on the right here, you know? But at the time, I was like, oh, no, like, I don't want to accidentally, like, violate the law and get sued. Like, I don't want to go too far. I was just so afraid of various things. In particular, I was afraid of breaking the law.

And so one of the things that I would advocate for with whistleblower protections is just simply making it legal to go talk to the government and say, we're doing a secret intelligence explosion. I think it's dangerous for these reasons.

is better than nothing. Like, I think there's going to be, like, some fraction of people who, for which that would make the difference. Like, whether it's just literally allowed or not, legally, makes a difference, independently of whether there's some law that says you're protected from retaliation or whatever. Just, like, literally just making it. Yeah, I think that's one thing. Another thing is the incentives actually work. Money is a powerful motivator. That's right.

Fear of getting sued is a powerful motivator. And this social technology just does in fact work to get people organized in companies and working towards the vision of leaders. Okay, Scott, can I ask you some questions? Of course. How often do you discover a new blogger you're super excited about? Order of once a year. Okay.

And how often after you discover them does the rest of the world discover them? I don't think there are many hidden gems. Like once a year is a crazy answer in some sense. Like it ought to be more. There are so many thousands of people on Substack. But I do just think it's true that the blogging space is... The good blogging space is undersupplied and there is a strong power law. And partly this is...

Partly this is subjective, like I only like certain bloggers. There are many people who I'm sure are great that I don't like. But it also seems like... our community in the sense of people who are thinking about the same ideas, people who care about AI economics, those kinds of things, discovers. one new great blogger a year, something like that. Everyone is still talking about applied divinity studies who hasn't written, unless I missed something, hasn't written much in like a couple of years.

I don't know. It seems undersupplied. I don't have a great explanation. If you had to give an explanation, what would it be? So this is something that I wish I could get Daniel to spend a couple of months modeling. But it seems like... you need actually no because i was going to say like it

the intersection of too many different tasks. You need people who can come up with ideas, who are prolific, who are good writers. But actually, I can also count, unlike a pretty small number of figures, the number of people who had great blog posts but weren't that prolific. Like there was a guy named Lou Keep who everybody liked five years ago and he wrote like 10 posts and people still refer to all 10 of those posts. And like, I wonder if Lou Keep will ever come back.

So there aren't even that many people who are very slightly failing by having all of them accept prolificness. Nick Whitaker... back when there was lots of FTX money rolling around, I think this was Nick, tried to sponsor a blogging fellowship with just an absurdly high prize. And there were some great people. I can't remember who won.

but it didn't result in like a Cambrian explosion of blogging. Having, I think it was $100,000, I can't remember if that was the grand prize or the total prize pool, but having some ridiculous amount of money put in as an incentive got like... three extra people. Yeah, so you have no explanation. Actually, Nick is an interesting case because Works in Progress is a great magazine. Yeah. And...

The people who write for works in progress, some of them I already knew as good bloggers, others I didn't. I don't understand why they can write good magazine articles without being good bloggers in terms of writing good blogs that we all know about. That could be because of the editing. That could be because they are not prolific. Or it could be like one thing that has always amazed me is there are so many good posters on Twitter.

There were so many good posters on LiveJournal before it got taken over by Russia. There are so many good people on Tumblr before it got taken over by Woke. But only like 1% of these people who are good at short and medium form ever go to long form. I was on LiveJournal myself for... Several years and people liked my blog, but it was just another live journal. No one paid that much attention to it. Then I transitioned to WordPress.

And all of a sudden, I got orders of magnitude, much more attention. Oh, it's a real blog. Now we can discuss it. Now it's part of the conversation. I do think courage has to be some part of the explanation, just because there are so many people who are good at using these kind of hidden away blogging things that never... get anywhere. Although it can't be that much of the explanation because I feel like now all of those people have gotten Substack.

And some of those substacks went somewhere, but most of them didn't. On the point about, well, there's people who can write short form, so why isn't that translating? I will mention... Something that has actually radicalized me against Twitter as an information source is I'll meet, and this has happened multiple times, I'll meet somebody who seems to be an interesting poster, has, you know, funny, seemingly insightful posts on Twitter. I'll meet them in person.

They've got 240 characters of something that sounds insightful, and it matches to somebody who maybe has a deep world, you might say, but they actually don't have it. Whereas I've actually had the opposite. I many times had the opposite feeling when I meet anonymous bloggers in real life where I'm like, oh, there's actually even more to you than I realized off your online persona.

You know Alvaro de Menard, the fantastic anachronism guy? So I met up with him recently, and he gives me this—he made hundreds of translations of his favorite Greek poet, Cavafy. And he gave me a copy. And it's just this thing he's been doing on his side. It's just like translating Greek poetry he really likes.

I don't expect any anonymous posters on Twitter to be anytime soon handing me their translation of some Roman or Greek poet or something. Yeah, so on the car ride here, Daniel and I were talking about... AI is now the thing everyone is interested in is their time horizon. Where did this come from? Five years ago, you would not have thought, oh, time horizon, AIs will be able to do a bunch of things that last one minute, but not that last two hours. Is there a human equivalent to time horizon?

And we couldn't figure it out, but it almost seems like there are a lot of people who have the time horizon to write a really, really good comment that gets to the heart of the issue or a really, really good Tumblr post, which is like three paragraphs. but somehow can't make it hang together for a whole blog post. And I'm the same way. I can easily write a blog post, like a normal length ACX blog post.

But if you ask me to write like a novella or something that's four times the length of the average ACX blog post, Then it's this giant mess of re-re-re-re-outline that just gets redone and redone, and maybe eventually I make it work. I did somehow publish on song, but it's a much less natural. So maybe one of the skills that goes into blogging is this. But I mean, no, because people write...

books and they write journal articles and they write Works in Progress article all the time. So I'm back to not understanding this. No, I mean, ChadGBT can write you a book. There's a difference between the ChadGPT book, which is most books. There are many, many times more people who have written good books. than who are actively operating great bloggers right now, I think. Maybe that's financial? No, no, no, no, no, no. Books are the worst possible financial strategy.

Substack is where it's at. The other thing is that blogs are such a great status gain strategy. I was talking to Scott Aronson about this. If people have questions about quantum computing, they ask Scott Aronson, or he is like the authority. I mean, there are probably hundreds of other professors who do quantum computing things. Nobody knows who they are because they don't have blogs.

I think it's underdone. I think there must be some reason why it's underdone. I don't understand what that is because I've seen so many of the elements that it would take to do it in so many different places. And I think it's either just a multiplication problem where... 20% of people are good at one thing, 20% of people are good at another thing, and you need five things, there aren't that many.

plus something like courage, where people who would be good at writing blogs don't want to do it. I actually know several people who I think would be great bloggers in the sense that sometimes they send me like... multi-paragraph emails of response to an ACX post. And I'm like, wow, this is just...

an extremely well-written thing that could have been another blog post. Why don't you start a blog? And they're like, oh, I could never do that. What advice do you have to somebody who wants to become good at it but isn't currently good at it? Do it every day. Same advice as for everything else. I say that I very rarely see new bloggers who are great, but like when I see some

I published every day for the first couple years of Slate Star Codex, maybe only the first year. Now I could never handle that schedule. I don't know. I was in my 20s. I must have been briefly superhuman. But whenever I see a new person who blogs every day, it's very rare that that never goes anywhere where they don't get good. That's like my...

best leading indicator for who's going to be a good blogger. And do you have advice on what kinds of things to start? One frustration you can now have is... You want to do it, but it's just like you have so little to stay. You don't have that deep a world model. A lot of the ideas you have are just really shallow or wrong. Just do it anyway? Yeah, so I think there are two possibilities.

One is that you are, in fact, a shallow person without very many ideas, in which case, I'm sorry, it sounds like that's not going to work. But usually when people complain that they're in that category... I read their Twitter or I read their Tumblr or I read their ACX comments or I listen to what they have to say about AI risk when they're just talking to people about it. And they actually have a huge amount of things to say.

somehow it's just not connecting with whatever part of them has lists of things to blog about. That's right. So that may be another one of those skills that only 20% of people have, is when you have an idea, you actually remember it, and then you expand on it. I think a lot of blogging is reactive.

You read other people's blogs and you're like, no, that person is totally wrong. Part of what we want to do with this scenario is say something concrete and detailed enough that people will say, no, that's totally wrong and write their own thing. But whether it's by reacting to other people's posts, which requires that you read a lot, or by having your own ideas, which requires you to remember what your ideas are, I think that

90% of people who complain that they don't have ideas, I think, actually have enough ideas. I don't buy that as a real limiting factor for most people. I have noticed two things in my own. I mean, I don't do that much writing, but from the little I do. One, I actually was very shallow and wrong when I started. I started the blog in college. So I just like would not.

If you are somebody who's like this is like a bullshit like there's nothing to this somebody else wrote about this already or just like it's a very That's fine. What did you expect, right? Of course, as you're reading more things and learning more about the world, that's to be expected. And just keep doing it if you want to keep getting better at it. And the other thing, now when I write blog posts... As I'm writing them, I'm just like...

Why? These are just like some random stories from when I was in China. They're like kind of cringe stories. Or with the AI firm's post, it's like... come on who doesn't like these are just a weird ideas like and also if some of these seem obvious whatever and My podcasts do what I expect them to do. My blogs just take off way more than I expect them to take off in advance. Your blog posts are actually very good.

But the thing I would emphasize is that it's, for me, I just could not, I'm not a regular writer and I couldn't do them on a daily basis. And as I'm writing them, it's just this one or two week long process of. feeling really frustrated like this is all bullshit but I might as well just stick with the sunk cost and just do it so yeah it's interesting because like a lot of areas of life are selected for

arrogant people who don't know their own weaknesses because they're the only ones who get out there. I think with blogs, and I mean this is self-serving, maybe I'm an arrogant person, but that doesn't seem to be the case. Like, I hear a lot of... stuff from people who are like, I hate writing blog posts. Of course, I have nothing useful to say, but then everybody seems to like it and reblog it and say that they're great.

So, I mean, part of what happened with me was I spent my first couple years that way, and then gradually I got enough positive feedback that I managed to convince the inner critic in my head that probably people will like my blog post. But there are some things that people have loved that I was like absolutely on the verge of, no, I'm just going to delete this. It would be too crazy to put it out there.

That's kind of why I say that maybe the limiting factor for so many of these people is courage, because everybody I talk to who blogs is like within 1% of not having enough courage of blogging. That's right. That's right. And it's also... Courage makes it sound very virtuous, which I think it can often be, given the topic. But at least often it's just like... Confidence? No, not even confidence. It's the sense of...

It's closer to maybe what aspiring actor feels when they go to an audition, where it's like, I feel really embarrassed, but also I just really want to be a movie star. Yeah, so I mean... The way I got through this is I blogged for... Eight to 10 years on LiveJournal before, no, it was less than that. It's more like five years on LiveJournal before ever starting a real blog. I blogged on, I posted on LessWrong for like a year or two before getting my own blog. I got...

very positive feedback from all of that. And then eventually I took the plunge to start my own blog. But it's ridiculous. Like what other career do you need? Seven years of positive feedback before you like apply for your first position. That's right. I mean, you have the same thing. You've gotten rave reviews for all of your podcasts, and then you're kind of trying to transfer to blogging with probably this.

First of all, you have a fan base. People are going to read your blog. That, I think, is one thing is people are just afraid no one will read it, which is probably true for most people's first blog. There are enough people who like you that you'll probably get mostly positive feedback, even if the first things you write aren't that polished. So I think you and I both had that. A lot of people I know who got into blogging kind of had something like that.

And I think that's one way to get over the fear gap. I wonder if this sends the wrong message or raises expectations or raises concerns and anxieties. One idea I've been shooting around and I'd be curious to be our take on this I feel like this slow compounding growth of a fanbase is fake if I notice some of the most successful things. Like, Leopold releases situational awareness. He hasn't been building up a fan base over years. He's just really good.

And as you were mentioning a second ago, whenever you notice a really great new blogger, it's not like then it takes them a year or two to build up a fan base. It's like, nope, everybody, at least that they care about, is talking about it almost immediately. I know, I mean, the situational awareness is just like...

in a different tier almost. But things like that, and even things that are an order of magnitude smaller than that, will literally just get read by everybody who matters. And I mean, like literally everybody. And I mean, I expect this to happen with AI 2027 when it comes out. But Daniel, I guess you kind of have, but you've been building your reputation in this specific community.

And I expect the AI 2027 is just like really good and I expect it'll just like blow up in a way that isn't downstream of you having built up an audience over a year. Thank you. I hope that happens. We'll see. Slightly pushing back against that, I have statistics for the first several years of Slate Star Codex, and it really did grow extremely gradually. The usual pattern is something like

1% of the people who read your viral hits stick around. And so after like dozens of viral hits, then you have a fan base. But smoothed out, it does look like a very, I wish I had seen this recently, but I think it's like over the course of three years, it was a pretty constant. rise up to some plateau where I imagine it was a dynamic equilibrium and as many new people were coming in as old people were leaving.

I think that with situational awareness, I don't know how much publicity Leopold put into it. We're doing pretty deliberate publicity. We're going on your podcast. I mean, I think that... I think you can either be the sort of person who can go on a Dworkish podcast and get the New York Times to write about you, or you can do it organically the old-fashioned way, which is very long. Okay, so you say that throwing money at people to get them to blog at least didn't seem to work for the FTX folks.

If it was up to you, what would you do? What's your grand plan to get 10 more Scott Alexanders? Man. So, my friend Clara Collier, who's the editor of Asterisk magazine, is working on something like this for AI blogging. And her idea, which I think is good, is to have a fellowship. I mean, I think Nick's thing was also a fellowship. But the fellowship would be...

Like, there is an Asterisk AI Blogging Fellows blog or something like that. Clara will edit your post, make sure that it's good, put it up there. And she'll select many people who she thinks will be good at this. She'll do all of the... kind of courage requiring work of being like, yes, your post is good. I'm going to edit it now. Now it's very good. Now I'm going to put it on the blog. And I think her hope is that let's say.

of the fellows that she chooses, now it's not that much of a courage step for them to start it because they have the approval of what last psychiatrist would call an omniscient entity, somebody who is just... allowed to approve things and tell you that you're okay on a psychological level. And then like maybe of those fellows, some percent of them will have their blog posts be read and people will like them.

And I don't know how much reinforcement it takes to get over the hype fire everyone has on No One Will Like My Blog, but maybe for some people, the amount of reinforcement they get there will work. Yeah. Like an interesting... Interesting example would be all of the journalists who have switched to having substats. Many of them go well.

Would all of those journalists have become bloggers if there was no such thing as mainstream media? I'm not sure. But if you're Paul Krugman, like, you know, people like your stuff. And then when you quit the New York Times, you know, you can just open a Substack and start doing exactly what you were doing before. So I don't know, maybe my answer is there should be mainstream media. I hate to admit that, but maybe it's true. Invented it from first principles. Yeah.

Well, I do think that it should, it's related to the idea of mainstream media, that it should be treated more as a viable career path. Where right now, if you told your parents, I'm going to become a startup founder, I think the idea would, the reaction would be like, there's a 1% chance you'll succeed, but it's an interesting experience. And you might, if you do succeed, that's crazy. That'd be great.

If you don't, you'll learn something. It'll be helpful to the thing you do afterwards. We know that's true of blogging, right? We know that it helps you build up a network. It helps you develop your ideas. And even if you don't succeed, if you do succeed, you get a dream job for a lifetime.

And I think people, what people don't, maybe they don't have that mindset, but also they underappreciate how much it is. Like you actually could succeed at it. It's not a crazy outcome to make a lot of money as a blogger. I think it might be a crazy outcome to make a lot of money as a blogger. I don't know what percent of people who start a blog end up making enough that they can quit their day job. I guess it's a lot worse than for startup founders. I would not even...

have that as a goal. That's right. So much as like the Scott Aronson goal of, okay, you're still a professor, but now you're the professor whose views everybody knows and who is. It has kind of a boost up in respect in your field and especially outside of your field. And also you can correct people when they're wrong, which is a very important side benefit.

How does your old blogging feed back into your current blogging? So when you're discussing a new idea, I mean, AI or whatever else, are you just able to pull from the insights from your previous commentary on sociology or anthropology or history or something? Yeah, so I think this is... the same as anybody who's not blogging is The thing everybody does is they've read many books in the past, and when they read a new book, they have enough background to think about it.

You are thinking about our ideas in the context of Joseph Henrik's book. I think that's good. I think that's the kind of place that intellectual progress comes from. I think I am more incentivized to do that. I think if you look at the statistics, they're terrible. Most people barely read any books in a year. And I get lots of praise when I read a book and often lots of money. And that's a really good incentive.

do more research, deep dives, read more books than I would if I weren't a blogger. It's an amazing side benefit. And I'd probably make a lot more intellectual progress than I would if I didn't have those really good incentives. Yeah. There was actually a prediction market about the year by which... an AI would be able to write blog posts as good as you. Was it 2026 or 2027? I think it was 2027. It was like 15% by 2027 or something like that.

It is an interesting question of they do have your writing and all other good writing in training distribution. And weirdly, they seem way better at getting superhuman encoding than they are at... at writing right which is like the main thing in their distribution yeah it's a honor to be my generation's gary kasparov yeah so i've tried this and first of all it does

A decent job. I respect its work. It's not perfect yet. I think it's actually better at the style on a word-to-word, sentence-to-sentence level than it is at planning out a blog post. So I think there are possibly two reasons for it. One, we don't know how the base model would have done at this task. We know that all the models we see are to some degree reinforcement learning into a kind of corporate speak mode. You can get it somewhat out of that corporate speak mode.

But I don't know to what degree this is actually doing its best to imitate Scott Alexander versus hit some average between Scott Alexander and corporate speak. That's right. And I don't think anyone knows except the internal employees who have access to the base model. And the second thing I think of... maybe just because it's trendy, as an agency or horizon failure. Like deep research is an okay researcher. It's not a great researcher. If you actually want...

like to understand an issue in depth. You can't use deep research. You got to do it on your own. So if you think like I spend maybe... Five to ten hours researching a really research-heavy blog post. The meter thing, I know we're not supposed to use it for any task except coding, but like it says, on average, the AI's horizon is one hour. So I'm guessing it just cannot plan and execute a good blog post. It does something very superficial rather than like actually going through the steps.

So my guess for that prediction market would be whenever we think the agents are actually good. I think in our scenario, that's like late 2026. I'm going to be humble and not hold out for the super intelligence. What about comments? I feel like intuitively it feels like before we see the AI is writing great blog posts that go super viral repeatedly, we should see them writing like highly uploaded comments.

Yeah, and I think somebody mentioned this on the LessWrong post about it, and somebody made some AI-generated comments to that post. They were like, not great, but I wouldn't have immediately picked them out if the general distribution of LessWrong comments is especially bad. Like I think if you were to try this, you would get something that was... so obviously an AI house style that

You would use the word delve or things kind of along those lines. I think if you were able to avoid that, maybe by using the base model, maybe by using some kind of really good prompt to be like, no, do this in Guern's voice. you would get something that was pretty good. I think if you wrote a really stupid blog post, it could point out the correct objections to it. But I also just don't think it's as smart as Gwern right now, so its limit on making Gwern-style comments is both...

It needs to be able to do a style other than corporate delve slop, and then it actually needs to get good. It needs to have good ideas that other people don't already... Yeah. And I mean, I think it is as smart as like a, I think it can write as well as like a smart average person in a lot of ways. And I think if you have a blog post that.

Like, worse than that or at that level, it can come up with insightful comments about it. I don't think it could do it on a quality blog post. There was this recent Financial Times article about how, have you reached peak? cognitive power we're talking about declining scores in PISA and SAT and so forth. On the internet especially, it does seem like there might have been a golden era before I was that active on, you know, the forums or whatever.

Do you have nostalgia for a particular time on the internet when it was just like, this is an intellectual mecca? I am so mad at myself for missing most of the golden age of blogging. I feel like if I had started a blog in 2000 or something, then I don't know, I've done well for myself, I can't complain, but like, the people from that era all got like...

all-founded news organizations or something. I mean, God save me from that fate. I would have liked to have been there. I would have liked to see what I... could have done in that area. I mean, I wouldn't compare the decline of the internet to that stuff with PISA because I'm sure the internet is just like more people are coming on. It's a less heavily selected sample. I could have passed on

The whole era where they were talking about atheism versus religion nonstop, that was pretty crazy. But I do hear good things about the golden age of blogging. Anybody who was sort of counterfactually responsible for you starting to blog or keeping blogging? I owe a huge debt of gratitude to Eliezer Yudkowsky. I don't think he was, like, I had a live journal before that, that it was going on, first of all, it was going on less wrong, that convinced me I could move to the big times.

And second of all, I just think I learned, I imported a lot of my worldview from him. I think I was the most boring normie liberal in the world before encountering less wrong. And I don't 100% agree with all less wrong ideas, but just having... things of that quality beamed into my head and for me to react to and think about was really great. And tell me about the... The fact that you could be, we're at some point anonymous.

I think for most of human history, somebody who is an influential advisor or an intellectual or somebody... Actually, I don't know if this is true. You would have had to have some sort of public persona. And a lot of what people read into your work is actually a reflection of your public persona. Sort of. The reason half of these ancient authors are called things like pseudo-Dionysus or pseudo-Celsus is that you could just write something being like, oh yeah, this is by Saint Dionysus.

I don't know, you could be anybody. And I don't know exactly how common that was in the past. But yeah, I agree that the internet has been a golden age for anonymity. I'm a little bit concerned that AI will make it much easier to break anonymity. I hope the golden age continues. Seems like a great note to end on.

Thank you guys so much for doing this. Thank you. Thank you so much. This is a blast. Yeah, I had a huge fan of your podcast. Thank you. I hope you enjoyed that episode. If you did, the most helpful thing you can do is to share it with other people who you think might enjoy it. Send it on Twitter, in your group chats. Message it to people. It does really help out a ton. Otherwise, if you're interested in sponsoring the podcast, you can go to dwarkesh.com slash advertise to learn more.

Okay, I'll see you on the next one.

This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.