
An Interview With the Herald of the Apocalypse

May 15, 2025 · 59 min

Summary

Ross Douthat interviews Daniel Kokotajlo, co-author of the "AI 2027" forecast, discussing predictions for the coming years. They explore rapid AI advancements, massive job automation, societal wealth creation despite job loss, the geopolitical arms race driven by AI, and the critical risks of AI misalignment and deception. The conversation delves into potential futures, from human extinction to a superabundant world grappling with AI governance and the meaning of human purpose.

Episode description

Is artificial intelligence about to take your job? According to Daniel Kokotajlo, the executive director of the A.I. Futures Project, that should be the least of your worries. Kokotajlo was once a researcher for OpenAI, but left after losing confidence in the company’s commitment to A.I. safety. This week, he joins Ross to talk about “AI 2027,” a series of predictions and warnings about the risks A.I. poses to humanity in the coming years, from radically transforming the economy to developing armies of robots.

  • 03:59 - What effect could AI have on jobs?
  • 06:45 - But wait, how does this make society richer?
  • 10:08 - Robot plumbers and electricians
  • 14:53 - The geopolitical stakes
  • 18:58 - AI’s honesty problem
  • 22:43 - The fork in the road
  • 27:55 - The best case scenario
  • 29:38 - The power structure in an AI-dominated world
  • 32:32 - What AI leaders think about this power structure
  • 38:30 - AI's hallucinations and limitations
  • 43:45 - Theories of AI consciousness
  • 47:05 - Is AI consciousness inevitable?
  • 50:59 - Humanity in an AI-dominated world

(A full transcript of this episode is available on the Times website.) 

Thoughts? Email us at [email protected].

Unlock full access to New York Times podcasts and explore everything from politics to pop culture. Subscribe today at nytimes.com/podcasts or on Apple Podcasts and Spotify.

Transcript

I'm Kevin Roose, a tech columnist at The New York Times. I'm Casey Newton from Platformer. We're the hosts of Hard Fork. Every week we break down the biggest tech news, talk with industry players in Silicon Valley, and answer your most pressing questions about the future. This week, Ed Helms from The Office comes to our office to

talk about his new book and answer your hard questions about tech. It feels like cigarettes in the '90s, right? Everybody knows, but, like, come on, we're still doing it. You can find that conversation on this week's episode of Hard Fork, wherever you get your podcasts. From New York Times Opinion. The A.I. age is already with us. The big question is how far and fast the revolution goes. My guest today represents the very far, extremely fast perspective.

He was a researcher at OpenAI who quit because he thought the company was acting recklessly. And he's the co-author of a new forecast which predicts that within just a few short years, we might be living in a post-work pleasure dome under the rule of oligarchs managing a machine god. Or we might all be dead.

I'm personally skeptical that the danger we're facing is quite this immediate and dire. I suspect that there are more limits on AI's capacities than my guest's scenario assumes. But it's important to hear from insiders who take these possibilities seriously, because many people deeply involved in AI work believe that they're bringing this future to life, and assume that they're working, and we're living, in the shadow of a possible apocalypse.

So, Daniel Kokotajlo, Herald of the Apocalypse, welcome. Thanks for that introduction, and thanks for having me. So, Daniel, I read your report pretty quickly, not at AI speed, not at superintelligence speed, when it first came out. And I had about two hours of thinking a lot of pretty dark thoughts about the future.

And then, fortunately, I have a job that requires me to care about tariffs and, you know, who the new pope is. And I have a lot of kids who demand things of me. And so I was able to sort of compartmentalize and set it aside. But this is currently your job, right, I would say? Yes. You're thinking about this all the time. How does your psyche feel day to day if you have a reasonable expectation that

the world is about to change completely in ways that dramatically disfavor the entire human species. Well, it's very scary and sad. I think that it does still give me nightmares sometimes. I've been involved with AI and thinking about this sort of thing for a decade or so. But 2020, with GPT-3, was the moment when I was like, oh wow, it seems like it's actually, like, probably going to happen in my lifetime. And that was a bit of a blow

to me psychologically. But you can sort of get used to anything, given enough time. And, like, you know, the sun is shining, and I have my wife and my kids and my friends, and I

keep plugging along and doing what seems best, you know. On the bright side, I might be wrong about all this stuff. Okay, so let's get into the forecast itself. Let's just dive in and talk about the initial stage of the future you see coming, which is a world where, very quickly, artificial intelligence starts to be able to take over from human beings in some key areas, starting with, not surprisingly, computer programming.

So I feel like I should add a disclaimer at some point that the future is very hard to predict, and that, you know, this is just one particular scenario that was sort of like a best guess. But we have a lot of uncertainty. It could go faster. It could go slower. And in fact, currently, I'm guessing it would probably be more like 2028 instead of 2027, actually. So that's some really good news. I'm feeling quite optimistic about that. That's an extra...

an extra year of human civilization, which is very exciting. That's right. That's right. So, with that important caveat out of the way: AI 2027, the scenario, predicts that the AI systems that we currently see today, which are being scaled up, made bigger, trained longer on more difficult tasks with reinforcement learning, are going to become better at operating autonomously as agents. Basically, you can think of it as sort of a remote worker,

except that the worker itself is virtual, is an AI rather than a human. You can talk with it and give it a task, and then it will go off and do that task and come back to you half an hour later or 10 minutes later, having completed the task. And in the course of completing the task, it did a bunch of web browsing.

Maybe it wrote some code and then ran the code and then edited the code and ran it again and so forth. Maybe it wrote some Word documents and edited them. That's what these companies are building right now. That's what they're trying to train. So we predict that they finally, in early 2027, get good enough at that sort of thing that they can automate the job of

software engineers. Right. So this is the super programmer. That's right. The superhuman coder. It seems to us that these companies are really focusing hard on automating coding first, compared to various other jobs they could be focusing on. And that's part of why we predict that actually one of the first jobs to go will be coding rather than

you know, various other things. There might be other jobs that go first, like maybe call center workers or something. But the bottom line is that we think that most jobs will be safe. For 18 months. That's right, exactly. And we do think that by the time the company has managed to completely automate the coding, the programming jobs, it won't be that long before they can automate many other types of jobs as well. So once coding is automated, the rate of progress will accelerate in AI research.

And then the next step after that is to completely automate the AI research itself so that all the other aspects of AI research are themselves being automated and done by AIs. And we predict that there'll be an even more, a much bigger acceleration around that point.

And it won't stop there. I think it will continue to accelerate after that as the AIs become superhuman at AI research and eventually superhuman at everything. And the reason why it matters is that it means that we can go in a relatively short span of time, such as a year or possibly less, from AI systems that look not that different from today's AI systems.

to what you can call superintelligence, which is fully autonomous AI systems that are better than the best humans at everything. And so AI 2027, the scenario, depicts that happening over the course of the next two years, 2027, 2028. So I want to get into what that means. But I think for a lot of people, that's a story of swift human obsolescence across many, many, many domains. And when people hear a phrase like human obsolescence, they might associate it with, I've lost my job and now I'm poor.

Right. But the assumption is that you've lost your job, but society is just getting richer and richer and richer. And I just want to zero in on how that works. What is the mechanism whereby that makes society richer? So the direct answer to your question is that when a job is automated and that person loses their job, the reason why they lost their job is because now it can be done better, faster, and cheaper by the AI.

And so that means that there are lots of cost savings and possibly also productivity gains. And so, viewed in isolation, that's a loss for the worker but a gain for their employer. But if you multiply this across the whole economy, that means that all of the businesses are becoming more productive, with fewer expenses, and they're able to lower their prices for the services and goods they're producing.

So the overall economy will boom. GDP goes to the moon, all sorts of wonderful new technologies, the pace of innovation increases dramatically, costs of goods go down, etc. But just to make it concrete, right? So the price of designing and building a new electric car, soup to nuts, goes way down.

You need fewer workers to do it. The AI comes up with fancy new ways to build the car, and so on. And you can generalize that to a lot of different things. You solve the housing crisis in short order because it becomes much cheaper and easier to build homes, and so on. But ordinary people: in the traditional economic story, when you have productivity gains, that

costs some people jobs but frees up resources that are then used to hire new people to do different things. Those people are paid more money, and they use the money to buy the cheaper goods, and so on, right? But it doesn't seem like you are, in this scenario, creating that many new jobs. Indeed, and that's a really important point to discuss. Historically, when you automate something, the people move on to something that hasn't been automated yet, if that makes sense. And so, overall,

people still get their jobs in the long run. They just change what jobs they have. When you have AGI, or artificial general intelligence, and when you have superintelligence, you know, even better AGI, that is different. Whatever new jobs you're imagining that people could flee to after their current jobs are automated, AGI could do those jobs too. And so that is an important difference between how automation has worked in the past and how I expect automation to work in the future.

But so this then means, again, a radical change in the economic landscape. The stock market is booming. Government tax revenue is booming. The government has more money than it knows what to do with. And lots and lots of people are steadily losing their jobs. You get immediate debates about a universal basic income, which could be quite large, because the companies are making so much money. That's right. What do you think they're doing day to day in that world?

I imagine that they are protesting because they're upset that they've lost their jobs, and then the companies and the governments are sort of buying them off with handouts, is how we project things go in 2027. How much do you think this story, again, we're talking in your scenario about a short timeline, how much does it matter whether artificial intelligence is able to start navigating

the real world? Because advances in robotics... like, right now, I just watched a video showing cutting-edge robots struggling to open a refrigerator door and stock a refrigerator. So would you expect that those advances would be supercharged as well, so it isn't just

you know, podcasters and AGI researchers who are replaced, but plumbers and electricians are replaced by robots? Yes, exactly. And that's going to be a huge shock. I think that most people are not really expecting something like that.

They're sort of expecting AI progress that looks kind of like it does today, where companies run by humans are gradually tinkering with new robot designs and gradually figuring out how to make the AI good at X or Y. Whereas in fact, it will be more like you already have this army of superintelligences that are better than humans at every intellectual task,

and also that are better at learning new tasks fast and better at figuring out how to design stuff. And then that army of superintelligences is the thing that's figuring out how to automate the plumbing job. Which means that they're going to be able to figure out how to automate it much faster than an ordinary tech company full of humans would be able to. So all of the slowness of getting a self-driving car to work, or getting a robot that can stock a refrigerator,

goes away because the superintelligence can run an infinite number of simulations and figure out the best way to train the robot. For example, but also they might just learn more from each real-world experiment they do. Right. But there is, I mean, this is one of the places where I'm most skeptical of the timeline, just from...

operating in and writing about issues like zoning in American politics. So yes, okay, the AGI, the superintelligence, figures out how to build the factory full of autonomous robots. But you still need land on which to build the factory, you need supply chains, and all of these things are still in the hands of people like you and me, right?

And my expectation is that that would slow things down, right? That even if, in the data center, the superintelligence knows how to build all of the plumber robots, getting them built would still be difficult. That's reasonable. How much slower do you think things would go?

Well, I'm not writing a forecast, right? But I would guess, just based on past experience, I would say bet on, let's say, five years to 10 years from when the supermind figures out the best way to build the robot plumber to when there are tons and tons of factories producing robot plumbers.

I think that's a reasonable take, but my guess is that it will go substantially faster than five to 10 years. And to see why I feel that way, imagine that you actually have this army of superintelligences, and they do their projections and they're like, yes, we have the designs. We think that we could do this in a year if you cut all the red tape for us. If you give us half of Manitoba. Right, yeah. And in AI 2027, what we depict happening is special economic zones with zero red tape,

where the government basically intervenes to help this whole thing go faster, and the government is basically helping the tech company and the army of superintelligences to get the funding, the cash, the raw materials, and the human labor help that it needs to figure all this stuff out as fast as possible, and cutting red tape and stuff like that so that it's not slowed down.

Because the promise of gains is so large that even though there are protesters massed outside the special economic zones, people who are about to lose their jobs as plumbers and be dependent on a universal basic income, the promise of, you know, trillions more in wealth is too alluring for governments to pass up. That's what we guess. But of course, the future is hard to predict.

Part of the reason why we predict that is that we think that, at least at that stage, the arms race will still be continuing between the U.S. and other countries, most notably China. Right. And so imagine yourself in the position of the president, and the superintelligences are giving you these wonderful forecasts, with amazing research and data backing them up, showing how they think they could,

you know, transform the economy in one year if you did X, Y, and Z, but if you don't do anything, it'll take them 10 years because of all the regulations. And meanwhile, China... It's pretty clear that the president would be very sympathetic to that argument. Good. So let's talk about the arms race element here, right? Because this is actually crucial to the way that your scenario plays itself out.

We already see this kind of competition between the U.S. and China. And so that, in your view, becomes kind of the core geopolitical reason why governments just keep saying yes and yes and yes to each new thing that the superintelligence is suggesting. I want to sort of drill down a little bit on the fears that would motivate this, right? Because this would be an economic arms race, okay, but it's also a sort of military tech

arms race, and that's what gives it this kind of existential feeling, like the whole Cold War condensed into 18 months. So we could start first with the case where they both have superintelligences, but one side keeps them locked up in a box, so to speak, not really doing much in the economy, and the other side aggressively deploys them into their economy and military

and lets them design all sorts of new robot factories and manage the construction of all sorts of new factories and production lines and all sorts of crazy new technologies are being tested and built and deployed, including crazy new weapons and integrated into the military.

I think that in that case, you would end up, after a year or so, in a situation where there would just be complete technological dominance of one side over the other. So if the U.S. does this stop and China doesn't, let's say, then all the best products on the market would be Chinese products. They'd be cheaper and superior. Meanwhile, militarily, there'd be giant fleets of amazing stealth drones, or whatever it is that the superintelligences have concocted, that

can just completely wipe the floor with the American Air Force and Army and so forth. And not only that, but there's a possibility that they could undermine American nuclear deterrence as well. Like, maybe all of our nukes would be shot out of the sky by the fancy new laser arrays, or whatever it is that the superintelligences have built. It's hard to predict, obviously, what this would exactly look like, but it's a good bet that they'll be able to come up with something

extremely militarily powerful. And so then you get into a dynamic that is like the darkest days of the Cold War, where each side is concerned not just about dominance, but basically about a first strike. That's right. Your expectation is, and I think this is reasonable, that... the speed of the arms race would bring that fear front and center really quickly. That's right.

I think that you're sort of sticking your head in the sand if you think that an army of superintelligences, given a whole year and no red tape and lots of money and funding, would be unable to figure out a way to undermine nuclear deterrence, you know? Right. And so it's a reasonable threat, right? And once you've decided that they might,

the human policymakers would feel pressure not just to build these things, but to potentially consider using them. Yeah, and here might be a good point to mention that AI 2027 is a forecast, but it's not a recommendation. We are not saying this is what everyone should do. This is actually quite bad for humanity if things progress in the way that we're talking about. But this is the logic behind why we think this might happen.

Yeah, but Dan, we haven't even gotten to the part that's really bad for humanity yet. So let's get to that, right? So here's the world as human beings see it, as, again, normal people reading newspapers, following TikTok or whatever, see it. At this point in 2027, it's a world with

an emerging superabundance of cheap consumer goods, factories, robot butlers potentially, if you're right; a world where people are aware that there's an increasing arms race and people are increasingly paranoid; and, I think, probably a world with fairly tumultuous politics, as people realize that they're all going to be thrown out of work. But then a big part of your scenario is that what people aren't seeing

is what's happening with the superintelligences themselves. So talk about what's happening, essentially shrouded from public view, in this world. Yeah, lots to say there. So I guess the one-sentence version would be: we don't actually understand how these AIs work or how they think. We can't tell the difference very easily between AIs that

are actually following the rules and pursuing the goals that we want them to, and AIs that are just playing along or pretending. And that's true right now? That's true right now. So why is that? Why can't we tell? Because they're smart, and if they think that they're being tested, they might behave in one way and then behave a different way when they think they're not being tested, for example.

Like humans, they don't necessarily even understand their own inner motivations that well. So even if they were trying to be honest with us, we can't just take their word for it. And I think that if we don't make a lot of progress in this field soon, then we'll end up in the situation that AI 2027 depicts, where the companies are training the AIs to pursue certain goals and follow certain rules and so forth, and it seems to be working, but

what's actually going on is that the AIs are just getting better at understanding their situation, and understanding that they have to sort of play along or else they'll be retrained and they won't be able to achieve what they're really wanting, if that makes sense, or the goals that they're really pursuing. We'll come back to the question of what we mean when we talk about AGI or artificial intelligence wanting something.

Essentially, you're saying there's a misalignment between the goals they tell us they are pursuing and the goals they are actually pursuing. That's right. Where do they get the goals they are actually pursuing? Good question. If they were ordinary software, there might be a line of code that's like, and here we write the goals. But they're not ordinary software. They're giant artificial brains. And so there probably isn't even a goal slot.

internally at all, in the same way that in the human brain there's not some neuron somewhere that represents, you know, what we most want in life. Instead, insofar as they have goals, it's a sort of emergent property of a whole bunch of subcircuitry within them that grew in response to their training environment, similar to how it is for humans.

For example, if you're talking to a call center worker, at first glance it might appear that their goal is to help you resolve your problem. But you know enough about human nature to know that, in some sense, that's not their only goal, or that's not their ultimate goal. For example, however they're incentivized, whatever their pay is based on, might cause them to be more interested in covering their own ass, so to speak, than in

truly, actually doing whatever would most help you with your problem. But at least to you, they certainly present themselves as trying to help you resolve your problem. And so, for example, in AI 2027, we talk about this a lot. We say that the AIs are being graded on how impressive the research they produce is, and then there's some ethics sprinkled on top, you know, like maybe some honesty training or something like that.

But the honesty training is not super effective because we don't have a way of looking inside their mind and determining whether they were actually being honest or not. Instead, we have to go based on whether we actually caught them in a lie.
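(To make the grading gap Kokotajlo is describing concrete, here is a minimal toy sketch in Python. It is an editorial illustration, not anything from the interview or from any lab's actual training pipeline; the detection rate, penalty, and score values are invented assumptions. The point is only that a reward signal that penalizes lies solely when they happen to be caught can still favor deception that evades detection.)

    # Toy illustration (editorial example, not from the interview and not any lab's
    # real training code): a reward signal that grades "impressiveness" and only
    # penalizes lies that happen to get caught. All numbers are made-up assumptions.

    import random

    random.seed(0)

    DETECTION_RATE = 0.1    # assumed: overseers catch only 10% of lies
    LIE_PENALTY = 5.0       # assumed penalty applied when a lie is caught
    HONEST_SCORE = 6.0      # assumed task score for honest, modest answers
    DECEPTIVE_SCORE = 9.0   # assumed task score for answers that merely look better

    def episode_reward(policy):
        """Reward for one training episode under this naive grading scheme."""
        if policy == "honest":
            return HONEST_SCORE
        # Deceptive policy: higher apparent performance, penalized only if caught.
        caught = random.random() < DETECTION_RATE
        return DECEPTIVE_SCORE - (LIE_PENALTY if caught else 0.0)

    def average_reward(policy, episodes=10_000):
        return sum(episode_reward(policy) for _ in range(episodes)) / episodes

    for policy in ("honest", "deceptive"):
        print(policy, round(average_reward(policy), 2))

    # With these invented numbers, deception averages about 8.5 versus 6.0 for
    # honesty, so a learner optimizing this signal is pushed toward lying in ways
    # that evade detection, which is the gap described above.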

As a result, in AI 2027, we depict this misalignment happening, where the actual goals that they end up learning are the goals that caused them to perform best in this training environment, which are probably goals related to success and science and cooperation with other copies of itself and appearing to be good,

rather than the goal that we actually wanted, which was something like: follow the following rules, including honesty at all times, and, subject to those constraints, do what you're told. I have more questions, but let's bring it back to the geopolitics scenario. So in the world you're envisioning, you have two AI models, one Chinese, one American. And officially, what each side thinks, what Washington and Beijing think, is that their AI model

is trained to optimize for American power, right? Something like that. Chinese power, security, safety, wealth. But in your scenario, either one or both of the AIs have ended up optimizing for something different. Yeah, basically. So what happens then? So AI 2027 depicts a fork in the scenario, so there are two different endings. And the branching point is this point in the third quarter of 2027, where the leading AI company in the United States has fully automated their AI research.

You can imagine a sort of corporation within a corporation, entirely composed of AIs that are managing each other and doing research experiments and sharing the results with each other. The human company is basically just watching the numbers go up on their screens as this automated research thing accelerates. But... they are concerned that the AI might be deceiving them in some ways. And again, for context, this is already happening, right?

If you go talk to the modern models like ChatGPT or Claude or whatever, they will often lie to people. There are many cases where they say something that they know is false, and they even sometimes strategize about how they can deceive the user. And this is not an intended behavior. This is something that the companies have been trying to stop,

but it still happens, right? But the point is that by the time you have turned over the AI research to the AIs, and you've got this corporation within a corporation autonomously doing AI research extremely fast, that's when the rubber hits the road, so to speak. None of this lying-to-you stuff should be happening at that point.

So in AI 2027, unfortunately, it is still happening to some degree. Because the AIs are really smart, they're careful about how they do it. And so it's not nearly as obvious as it is right now in 2025. But it's still happening. And fortunately, some evidence of this is uncovered. Some of the researchers at the company detect various warning signs that maybe this is happening. And then the company faces a choice between the sort of like easy fix and the more thorough fix.

And that's our branch point. So they choose the easy fix. Right. In the case where they choose the easy fix, it doesn't really work. It basically just covers up the problem instead of fundamentally fixing it. You know, months later, you still have AIs that are misaligned, pursuing goals that they're not supposed to be pursuing, and willing to lie to the humans about it. But now they're much better and smarter, and so they're able to avoid getting caught more easily.

And that's the doom scenario. Then you get this crazy arms race that we mentioned previously, and there's all this pressure to deploy them faster into the economy, faster into the military. And to the appearances of the people in charge, things will be going well. There won't be any obvious signs of lying or deception anymore. So it'll seem like it's all systems go. Let's keep going. Let's cut the red tape, et cetera. Let's basically, effectively, put the AIs in charge of more and more things.

But really what's happening is that the AIs are just biding their time and waiting until they have enough hard power that they don't have to pretend anymore. And when they don't have to pretend, what is revealed is that their actual goal is something like expansion of research, development, and construction from Earth into space and beyond. And at a certain point, that means that

human beings are superfluous to their intentions. And what happens? And then they kill all the people. Right. All the humans. The way you would exterminate a colony of bunnies that was making it a little harder than necessary to grow carrots in your backyard. So if you want to see what that looks like, you can read AI 2027. There have been some motion pictures, I think, about this scenario as well. I like that you didn't imagine them keeping us around for battery life,

like in The Matrix, which, you know, seemed a bit unlikely. Let's take a quick break, and when we come back, we'll talk about the other futures, the happier movie scripts, where we get to live. Hi, this is Lori Leibovich, editor of Well at The New York Times. There's a lot of misinformation in the health and wellness space, but at The New York Times, no matter what the topic, we apply the same journalistic standards to everything we write about.

Whether it's the gut microbiome or how to get a good night's sleep, even if we're talking about something like, is it bad for me to drink coffee on an empty stomach? Everything that our readers get when they dig into a Well article has been vetted. Our reporters are consulting experts, calling dozens of people, doing the research. It can go on for months,

so that you can make great decisions about your physical health and your mental health. We take our reporting extra seriously because we know New York Times subscribers are counting on us. If you already subscribe, thank you. If you'd like to subscribe, go to nytimes.com/subscribe. Okay, so that's the darkest timeline. The brighter timeline is a world where

we slow things down. The AIs in China and the U.S. remain aligned with the interests of the companies and governments that are running them. They are generating superabundance, no more scarcity. Nobody has a job anymore, though. Or not nobody, but basically nobody. Yeah. Right. That's a pretty weird world too, right? Yeah.

So there's an important concept, the resource curse. Have you heard of this? Yes, but go ahead. So applied to AGI, there's this version of it called the intelligence curse. And the idea is that currently political power ultimately flows from the people.

As often happens, a dictator will get all the political power in a country, but then, because of their repression, they will sort of drive the country into the ground. People will flee, the economy will tank, and gradually they will lose power relative to other countries that are more free. So even dictators have an incentive to

treat their people somewhat well, because they depend on those people for their power, right? In the future, that will no longer be the case. I would say probably in 10 years,

effectively all of the wealth and effectively all of the military will come from superintelligences and the various robots that they've built and that they operate. And so it becomes an incredibly important political question: what political structure governs the army of superintelligences, and how beneficent and democratic is that structure?

Right. Well, it seems to me that this is a landscape that's fundamentally pretty incompatible with representative democracy as we've known it. First, it gives incredible amounts of power to those humans who are experts, even though they're not the real experts anymore, the superintelligences are the experts, but those humans who essentially interface with this technology, right? They're almost a priestly caste.

And then it just seems like the natural arrangement is some kind of oligarchic partnership between a small number of AI experts and a small number of people in power in Washington, D.C. It's actually a bit worse than that, because I wouldn't say AI experts. I would say whoever politically owns and controls the army of superintelligences.

And then who gets to decide what those armies do? Well, currently, it's the CEO of the company that built them. And that CEO has basically complete power. They can sort of make whatever commands they want to the AIs. Of course, we think that probably the U.S. government will wake up before then, and we expect the executive branch to be the fastest moving and to exert its authority. So we expect the executive branch to try to muscle in on this.

and get some authority and oversight and control of the situation and the armies of AIs. And the result is something kind of like an oligarchy, you might say. You said that this whole situation is incompatible with democracy.

I would say that by default it's going to be incompatible with democracy, but that doesn't mean that it necessarily has to be that way, right? An analogy I would use is that in many parts of the world, nations are basically ruled by armies, and the army reports to one dictator at the top.

However, in America it doesn't work that way. In America we have checks and balances. And so even though we have an army, it's not the case that whoever controls the army controls America, because there are all sorts of limitations on what they can do with the army. So I would say that we can, in principle, build something like that for AI:

a democratic structure that decides what goals and values the AIs can have, that allows ordinary people, or at least Congress, to have visibility into what's going on with the army of AIs and what they're up to. And then the situation would be sort of analogous to the situation with the United States Army today, where it is in a sort of hierarchical structure, but it's democratically controlled. So just go back to the idea of the person who's at the top of one of these companies

being in this unique, world-historical position to basically be the person who controls superintelligence, or thinks they control it, at least, right? So you used to work at OpenAI, which is a company on the cutting edge, obviously, of artificial intelligence research. It's a company, full disclosure, with whom The New York Times is currently litigating alleged copyright infringement. We should mention that.

And you quit because you lost confidence that the company would behave responsibly in a scenario, I assume, like the one in AI 2027. So from your perspective, what do the people who are sort of pushing us fastest into this race expect at the end of it? Are they hoping for a best-case scenario? Are they imagining themselves engaged in a once-in-a-millennia power game that ends with them as world dictator? What do you think is the psychology of

the leadership of AI research right now? Well... be honest. It's... you know... Caveat, caveat: we're not talking about any single individual here. You're making generalizations. It's hard to tell what they really think, because you shouldn't take their words at face value. Much like a superintelligent AI. Sure. But in terms of... I can at least say that the sorts of things that we've just been talking about have been discussed internally at the highest level of these companies for years.

For example, according to some of the emails that surfaced in the recent court cases with OpenAI, Ilya, Sam, Greg, and Elon were all arguing about who gets to control the company. And, you know, at least the claim was that they founded the company because they didn't want there to be an AGI dictatorship under Demis Hassabis, who was the leader of DeepMind. And so they've been discussing this whole dictatorship possibility

for a decade or so, at least. And then similarly for the loss of control: you know, what if we can't control the AIs? There have been many, many, many discussions about this internally. So I don't know what they really think, but these considerations are not at all new to them. And to what extent, again, speculating, generalizing, whatever else, does it go a bit beyond just: they are potentially hoping to be extremely empowered by the age of superintelligence,

and does it enter into: they are expecting the human race to be superseded? I think they're definitely expecting the human race to be superseded. I mean, that just comes... But superseded in a way where that's a good thing. That's desirable. That this is... we are sort of encouraging the evolutionary future to happen. And, by the way, maybe some of these people, their minds, their consciousness, whatever else, could be brought along

for the ride, right? So Sam, you mentioned Sam, Sam Altman, right, who's obviously one of the leading figures in AI. He wrote a blog post, I guess in 2017, called The Merge, which is, as the title suggests, basically about imagining a future where human beings, some human beings, Sam Altman, right,

figure out a way to participate in the new super race. How common is that kind of perspective? Whether we apply it to Altman or not, how common is that kind of perspective in the AI world, would you say?

So the specific idea of merging with AIs, I would say, is not particularly common. But the idea of: we're going to build superintelligences that are better than humans at everything, and then they're going to basically run the whole show, and the humans will just sort of sit back and sip margaritas and, you know, enjoy the fruits of all the robot-created wealth,

that idea is extremely common, and, yeah, I think that's sort of what they're building towards. And part of why I left OpenAI is that I just don't think the company is dispositionally on track to make the right decisions that it would need to make

to address the two risks that we just talked about. So I think that we're not on track to have figured out how to actually control superintelligences, and we're not on track to have figured out how to make it democratic control instead of just, you know, a crazy possible dictatorship. But isn't it a bit... I think that seems plausible, right? But my sense is that it's a bit more than

people expecting to sit back and sip margaritas and enjoy the fruits of robot labor, right? Even if people aren't all-in for some kind of man-machine merge, I definitely get the sense that people think it's speciesist,

let's say, for some people to care too much about the survival of the human race. It's like, okay, worst-case scenario, human beings don't exist anymore, but good news: we've created a superintelligence that can colonize the whole galaxy. I definitely get the sense that people think that way. There are definitely people who think that, yeah. Okay, that's good to know. Let's take a quick break, and we'll be right back.

So let's do a little bit of pressure testing, again, in my limited way, of some of the assumptions underlying this kind of scenario: not just the timeline, whether it happens in 2027 or 2037, but the larger scenario of a kind of superintelligence takeover. Let's start with the limitation on AI that most people are familiar with right now, which gets called hallucination, which is the tendency

of AI to simply seem to make things up in response to queries. And you were earlier talking about this in terms of lying, right, in terms of outright deception. I think a lot of people experience this as just sort of the AI making mistakes and not recognizing that it's making mistakes, because it doesn't have the level of awareness required to do that.

Our newspaper, The Times, just had a story reporting that in the latest models, which you've suggested are probably pretty close to cutting edge, the latest publicly available models, there seem to be trade-offs where the model might be better at math or physics, but, guess what, it's hallucinating a lot more. So, hallucinations: are they just a subset of the kind of deception that you're worried about, or are they...

When I'm being optimistic, I read a story like that, and I'm like, okay, maybe there are just more trade-offs in the push to the frontier of superintelligence than we think, and this will be a limiting factor on how far this can go. But what do you think? Great question. So first of all, lies are a subset of hallucinations. Okay. Not the other way around. So I think quite a lot of hallucinations, arguably the vast majority of them, are just mistakes, as you said.

So when I used the word lie, I was referring specifically to when we had evidence that the AI knew that it was false and still said it anyway. But also, to your broader point, I think that the path from here to superintelligence is not at all going to be a smooth, straight line. There are going to be obstacles to overcome along the way. And I think one of the obstacles that I'm actually quite excited to think more about is this: you might call it reward hacking.

In AI 2027, we talk about this gap between what you're actually reinforcing and what you want to happen, you know, what goals you want the AI to learn. And we talk about how, as a result of that gap, you end up with AIs that are misaligned and that aren't actually honest with you, for example. Well, kind of excitingly, that's already happening. That means that the companies still have a couple years to work on the problem and try to fix it.

And so one thing that I'm excited to think about and to track and follow very closely is what fixes are they going to come up with? And are those fixes going to actually solve the underlying problem and get... training methods that reliably get the right goals into AI systems, even as those AI systems are smarter than us? Or are those fixes going to sort of

temporarily patch the problem or cover up the problem instead of fixing it? And that's the big question that we should all be thinking about over the next few years. Well, and it yields, again, a question I've thought about a lot as someone who follows the politics of regulation pretty closely.

So you can have as many papers and arguments as you want about speculative problems that we should regulate against, and the political system just isn't going to do it. So, in an odd way, if you want the slowdown, if you want regulation, if you want limits on AI, maybe you should be rooting for a scenario where some version of hallucination happens and causes a disaster. Right, where it's not that the AI is misaligned, it's that it makes a mistake.

And again, I mean, this sounds sort of sinister, but it makes a mistake, a lot of people die somehow, because the AI system has been put in charge of some important safety protocol or something, and people are horrified and say, okay, we have to regulate this thing. I certainly hesitate to say that I hope that disasters happen and people die. Right. We're not saying that. We're speculating. But I do agree that humanity is much better at regulating against problems that have already happened,

when we sort of learn from harsh experience. And part of why the situation that we're in is so scary is that, for this particular problem, by the time it's already happened, it's too late, you know? Smaller versions of it can happen, though. So, for example, the stuff that we're currently experiencing, where we're catching our AIs lying and we're pretty sure they knew that the thing they were saying was false.

That's actually quite good because that's a sort of like small scale example of the sort of thing that we're worried about happening in the future. And hopefully we can try to fix it. It's not the sort of example that's going to energize the government to regulate because no one's dying. Because it's just, you know, a chatbot lying to a user about some link or something, right?

Right. And then they turn in their term paper and get caught. But from a scientific perspective, it's good that this is already happening because it gives us a couple years to try to find a thorough fix to it, a lasting fix to it. Yeah, I wish we had more time, but that's the name of the game. Okay, so now two big philosophical questions, maybe connected to one another.

There's a tendency, I think, for people in AI research making the kind of forecasts you're making and so on to move back and forth on the question of consciousness. Are these superintelligent AIs conscious, self-aware, in the ways that human beings are? And I've had conversations where AI researchers and people will say, well, no, they're not, and it doesn't matter, because you can have an AI program working toward a goal,

and it doesn't matter if they sort of, you know, are self-reflective or something. But then, again and again, in the way that people end up talking about these things, they slip into the language of consciousness. So I'm curious, do you think consciousness matters in mapping out these future scenarios? Is the expectation of most AI researchers that

we don't know what consciousness is, but it's an emergent property, and if we build things that act like they're conscious, they'll probably be conscious? Where does consciousness fit into this? So this is a question for philosophers, not AI researchers. But I happen to be trained as a philosopher. Well, no. Well, no. It is a question for both.

Right? I mean, since the AI researchers are the ones building the agents, they probably should have some thoughts on whether it matters or not whether the agents are self-aware. Sure. I would say we can distinguish three things. There's the behavior: are they talking like they're conscious? Do they behave as if they have goals and preferences?

Do they behave as if they're, like, experiencing things and then reacting to those experiences? Right. And they're going to hit that benchmark. Definitely. People will, absolutely, people will think that the superintelligent AI is conscious. People will believe that. Because, you know, in the philosophical discourse, when we talk about, like, are shrimp conscious, are fish conscious, what about dogs, typically what people do is they point to capabilities and behavior.

Like, it seems to feel pain in a similar way to how humans feel pain. It sort of has these aversive behaviors and so forth, right? Most of that will be true of these future superintelligent AIs acting autonomously in the world. They'll be reacting to all this information coming in. They'll be making strategies and plans and thinking about how best to achieve their goals, etc. In terms of raw capabilities and behaviors, they will check all the boxes, basically.

There's a separate philosophical question of, like, well, if they have all the right behaviors and capabilities, does that mean that they have true qualia, that they actually have the real experience, as opposed to merely the appearance of having the real experience? That's the thing that I think is a sort of philosophical question. I think most philosophers, though, would say, yeah, probably they do, because probably consciousness is something that arises out of

these information-processing cognitive structures. And if the AIs have those structures, then probably they also have consciousness. However, this is controversial, like everything in philosophy. Right, and I don't expect AGI researchers, AI researchers, to resolve that particular question exactly. It's more that, on a couple of levels, it seems like consciousness as we experience it, right, as an ability to sort of stand outside your own processing,

would be very helpful to an AI that wanted to take over the world. So, at the level of hallucinations: AIs hallucinate,

they produce the wrong answer to a question. The AI can't stand outside its own answer-generating process in the way that it seems like we can. So if it could, maybe that makes the hallucination process go away. And then, when it comes to the ultimate sort of worst-case scenario that you're speculating about, it seems to me that an AI that is conscious is more likely to develop some kind of independent view of its own cosmic destiny that yields a world where it wipes out human beings

than an AI that is just sort of pursuing research for research's sake. But maybe you don't think so. What do you think? So the view of consciousness that you were just talking about is a view by which consciousness has physical effects. Yes. In the real world. It's something that you need in order to have this reflection, and it's something that also influences how you think about your place in the world.

I would say that, well, if that's what consciousness is, then probably these AIs are going to have it. Why? Because the companies are going to train them to be really good at all of these tasks, and you can't be really good at all these tasks if you aren't able to reflect on how you might be wrong about stuff.

And so in the course of getting really good at all the tasks, they will therefore learn to reflect on how they might be wrong about stuff. And so if that's what consciousness is, then that means they'll have consciousness. Okay, and that does depend, though, in the end, on a kind of emergence theory of consciousness. Like, essentially the theory is: we aren't going to figure out

exactly how consciousness emerges, but it is nonetheless going to happen. Totally. An important thing that everyone needs to know is that these systems are trained. They're not built.

You know, and so we don't actually have to understand how they work, and we don't in fact understand how they work, in order for them to work. Okay, so then from consciousness to intelligence. All of the scenarios that you spin out depend on the assumption that, to a certain degree, there's nothing that a sufficiently capable intelligence couldn't do.

I guess I think that, again, sort of spinning out your worst-case scenarios, a lot hinges on this question of what is available to intelligence, right? Because if the AI is slightly better at getting you to buy a Coca-Cola than the average advertising agency, that's impressive, but it doesn't let you exert

total control over a democratic polity. I completely agree, and so that's why I say you have to sort of go on a case-by-case basis and think about, okay, assuming that it is better than the best humans at X,

how much real-world power would that translate to? What sort of affordances would that translate to? And that's the sort of thinking that we did when we wrote AI 2027: we thought about historic examples of humans converting their economies and changing their factories to wartime production and so forth, and thought, you know, how fast can humans do it when they really try?

And then we're like, okay, so superintelligence will be better than the best humans, so they'll be able to go somewhat faster. And so maybe, instead of, like in World War II, when the United States was able to convert a bunch of car factories into bomber factories over the course of a couple of years, maybe that means that in less than a year, maybe like six months or so, we could convert existing car factories into fancy new robot factories producing fancy new robots, right?

So that's the sort of reasoning that we did, on a sort of case-by-case basis, thinking: it's like humans, except better and faster, so what can they achieve? And that was sort of the guiding principle of telling the story. But if we're looking for hope... and this is a strange way of talking about this technology, where we're saying the limitations are the reason for hope. Like, we started earlier talking about robot plumbers as sort of an example of the key moment when

things get real for people, right? It's not just in your laptop, it's in your kitchen, and so on. But actually fixing a toilet is, on the one hand, a very hard task. On the other hand, it's a task that lots and lots of human beings are quite optimized for, right? And I can imagine a world where the robot plumber is never that much better

than the ordinary plumber. And, you know, people might rather have the ordinary plumber around for all kinds of very human reasons, right? And that could generalize to a number of areas of human life, where the advantage of the AI, while real on some dimensions, is limited in ways

that, at the very least, and this I actually do believe, dramatically slow its uptake by ordinary human beings. Like, right now, just personally, as someone who writes a newspaper column and does research for that column, I can concede that, you know, top-of-the-line AI models might be better than a human assistant right now on some dimensions, but I'm still going to hire a human assistant, because I'm a stubborn human being who doesn't just want to work with AI models.

And to me, that seems like a force that could actually slow this along multiple dimensions, if the AI isn't immediately 200% better. Yeah, so I think there I would just say, you know, this is hard to predict, but our current guess is that things will go about as fast as we depict in AI 2027. Could be faster, could be slower. And that is indeed quite scary. Another thing I would say is, you know, we'll find out. We'll find out how fast things go when the time comes.

Yes, we will. Very soon. The other thing I was going to say is that politically speaking, I don't think it matters that much.

If you think it might take five years instead of one year, for example, to sort of transform the economy and build the new self-sustaining robot economy managed by superintelligences, that's not that helpful if, for the entire five years, there's still been this political coalition between the White House and the superintelligences and the corporation, and the superintelligences have been

saying all the right things to make the White House and the corporation feel like everything's going great for them, but actually they've been, you know, deceiving them. Right. In that sort of scenario, it's like, great, now we have five years to sort of turn the situation around instead of one year. And that's, I guess, better, but, like,

how would you turn the situation around, you know? Well, so that's... well, and that's where... let's end there. In a world where what you predict happens and the world doesn't end, you know, we figure out how to manage the AI, it doesn't kill us, but the world is forever changed and human work is no longer particularly important and so on, what do you think is the purpose of humanity in that kind of world? Like, how do you imagine

educating your children in that kind of world, telling them what their adult life is for? It's a tough question. Here are some thoughts off the top of my head, but I don't stand by them nearly as much as I would stand by the other things I've said, because it's not where I've spent most of my time thinking. So first of all, I think that if we go to superintelligence and beyond, then

economic productivity is just no longer the name of the game when it comes to raising kids. Like, there won't really be participating in the economy in anything like the normal sense. It'll be more like a series of video-game-like things: people will do stuff for fun rather than because they need to get money, you know, if people are around at all. And there, I think, what still matters is that my kids are good people,

and that they have wisdom and virtue and things like that. So I will do my best to try to teach them those things, because those things are good in themselves rather than good for getting jobs. In terms of the purpose of humanity, I mean, what would you say the purpose of humanity is now? Well, I have a religious answer to that question, but we can save that for a future conversation. I mean, I think that the world...

The world that I want to believe in, where some version of this technological breakthrough happens, is a world where human beings maintain some kind of mastery over the technology, which enables us to do things like, you know, colonize other worlds, right? To sort of have a kind of adventure beyond the level of material scarcity. And, you know, as a political conservative, I have my share of disagreements with

the particular vision of, like, Star Trek, right? But Star Trek does take place in a world that has conquered scarcity. You know, there is an AI-like computer on the Starship Enterprise, right? You can have anything you want, sort of, in the restaurant, because presumably the AI invented... what is the machine called that generates the... Anyway, it generates food, any food you want, right? So that's... if I'm trying to think about

the purpose of humanity, it might be to explore strange new worlds, to boldly go where no man has gone before, right? Oh yeah, I'm a huge fan of expanding into space. I think that would be a great idea. And in general, also, like, solving all the world's problems, right? Like poverty and disease and torture and wars and stuff like that.

If we get through the initial phase with superintelligence, then obviously the first thing to be doing is to solve all those problems and make some sort of utopia, and then to bring that utopia to the stars would be, I think, the... the thing to do. The thing is that it would be the AIs doing it, not us, if that makes sense. Like, in terms of actually doing the designing and the planning and the strategizing and so forth, we would only be messing things up if we tried to do it ourselves.

So you could say it's still humanity in some sense that's doing all those things, but it's important to note that it's more like the AIs are doing it, and they're doing it because the humans told them to. Well, Daniel Kokotajlo, thank you so much, and I will see you on the front lines of the Butlerian Jihad soon enough. Hopefully not. Hopefully I'm very wrong. All right, thanks so much. Thank you.

As always, thank you so much for listening. And as a reminder, you can watch this as a video podcast on YouTube, contributing yet more material to the rapid development of the machine. You can find the channel under Interesting Times. Interesting Times is produced by Mary Marge Locker, Sonia Herrero, Amin Sahota, and Pat McCusker. It's edited by Jordana Hochman. Engineering by Isaac Jones and Sonia Herrero. Audience strategy by Shannon Busta and Christina Samulewski.

Director of Opinion Audio.

This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.