
Human extinction: thinking the unthinkable, with Sean ÓhÉigeartaigh

Apr 23, 2025 · 43 min · Season 1 · Ep. 111

Summary

This episode explores the potential extinction of the human species, examining threats from natural disasters to AI. Sean ÓhÉigeartaigh discusses assessing these risks, the likelihood of different scenarios, and the need for AI safety and global governance. He also touches on the potential economic and ethical impacts of advanced AI, offering a nuanced perspective on both the dangers and the opportunities ahead.

Episode description

Our subject in this episode may seem grim – it’s the potential extinction of the human species, either from a natural disaster, like a supervolcano or an asteroid, or from our own human activities, such as nuclear weapons, greenhouse gas emissions, engineered biopathogens, misaligned artificial intelligence, or high energy physics experiments causing a cataclysmic rupture in space and time.

These scenarios aren’t pleasant to contemplate, but there’s a school of thought that urges us to take them seriously – to think about the unthinkable, in the phrase coined in 1962 by pioneering futurist Herman Kahn. Over the last couple of decades, few people have been thinking about the unthinkable more carefully and systematically than our guest today, Sean ÓhÉigeartaigh. Sean is the author of a recent summary article from Cambridge University Press that we’ll be discussing, “Extinction of the human species: What could cause it and how likely is it to occur?”

Sean is presently based in Cambridge where he is a Programme Director at the Leverhulme Centre for the Future of Intelligence. Previously he was founding Executive Director of the Centre for the Study of Existential Risk, and before that, he managed research activities at the Future of Humanity Institute in Oxford.


Transcript

Our subject in this episode may seem grim: the potential extinction of the human species, either from a natural disaster, like a supervolcano or an asteroid, or from our own human activities, such as misaligned artificial intelligence or high-energy physics experiments causing a cataclysmic rupture in space and time. These scenarios aren't pleasant to contemplate, but there's a school of thought that urges us to take them seriously, to think about the unthinkable, in the phrase coined in 1962 by pioneering futurist Herman Kahn.

Over the last couple of decades, few people have been thinking about the unthinkable more carefully and systematically than our guest today, Sean ÓhÉigeartaigh. Sean is the author of a recent summary article from Cambridge University Press that we'll be discussing, "Extinction of the human species: What could cause it and how likely is it to occur?" Sean is presently based in Cambridge, where he is a Programme Director at the Leverhulme Centre for the Future of Intelligence.

Previously, he was founding Executive Director of the Centre for the Study of Existential Risk, and before that, he managed research activities at the Future of Humanity Institute in Oxford. Sean, welcome to the London Futurist podcast. Thank you. It's a pleasure to be here. Great to have you, Sean. I think we first met when you were at FHI, before you defected from Oxford to go to that other place.

And before we get going, I just wanted to take this opportunity to invite our listeners to visit a website called moral.me. Habitual listeners will know that I'm a co-founder of, and David is an advisor to, a startup called Conscium, which, among other things, is very interested in the possibility and the implications of machines becoming conscious in the next few years or decades.

And we are assembling a database of people's moral instincts in order to help align future very advanced AIs. So if that interests you, go to moral.me and take a look. So, Sean, how can we reach a sensible view about which possible causes of human extinction deserve more attention and which are just, let's say, Hollywood escapism? In other words, how can we move beyond mere guesses and hunches to somewhat reliable estimates? I would say we can move towards this with some difficulty.

The first thing is asking the question: what would it actually mean, or what would it take, to cause the human species to go extinct? We are in many ways quite different from nearly every other species that has existed on Earth. We've managed to gain a foothold in nearly every environment on the planet. We have the ability to store knowledge, to adapt to our surroundings, to learn about threats that we face and, hopefully, to respond to those threats, and to transmit knowledge about all these things into the future. So we are, in some senses, a lot harder to wipe out, in terms of all of us, than other species that are much more dependent on a local environment and that don't have the ability to learn about and respond to threats.

So asking the question of what could disrupt all of us, or change our environment in such a way that we couldn't respond and we couldn't keep reproducing, that's a first question. And I think a great paper that lays this out is Avin et al.'s "Classifying Global Catastrophic Risks" paper. We can ask generally what it would mean to respond and be prepared for such threats. Anders Sandberg and colleagues from FHI had a great paper on this, on defence in depth against global catastrophic risk.

We can look at specific areas of risk and really delve into the scientific literature and ask: what is the evidence that exists already about the worst-case scenario here? And what do the trends tell us about how bad it could be? And are they things that we could respond to? For some of these areas of risk, there is at least some sort of record. So we have some sense for how many large asteroid impacts there have been in the solar system, and what the impact on Earth has been when something of a sufficient size has hit us. Ditto things like volcanic eruptions and so forth. When it comes to emerging threats where our own activities are a big part of it, it starts to get a bit trickier. So if we're thinking about something like climate change, changes in climate have happened in the past.

But a change in climate happening at the speed and scale that is currently happening, driven by human activities, is in some sense quite a new challenge. But we can at least draw on quite a lot of scientific literature, which is growing all the time, to figure out what the worst-case scenario here is likely to look like. And then where it gets trickiest of all is when we're looking at entirely novel or unprecedented developments, such as the development of powerful new sciences and technologies.

From the 1950s, we've had nuclear weapons. We're developing artificial intelligence. We're developing the ability to modify and engineer pathogens that would be unlike anything we've faced before. And there we need to draw heavily on expert views, forecasting, and a number of other techniques that basically involve drawing our best estimate from expert judgment. And there we can never really be certain. You've listed an impressive array of ghastly things that could happen.

I can't remember who it was who I first heard argue this, but I've heard it argued a number of times and I find it pretty persuasive: most of those things could cause immense damage, but not many of them are really an existential threat. An asteroid might be, but we probably could blow it up or deflect it. Something like climate change could well cause enormous suffering, but it's not likely to extinguish everybody unless we get to some incredible tipping point, behave really, really recklessly, and it utterly destroys life on Earth. The things that are more likely to be existential, in the sense of killing everybody: number one is AI, and number two is a really bad pathogen. Do you agree with that argument, or do you think we should be equally concerned about existential risk from other sources? In reviewing the scientific literature for this paper, my conclusion is that getting everybody is actually a really, really high bar.

Some people have done estimates on the background likelihood of human extinction from exogenous threats, threats that aren't anything to do with our activity, and it's pretty low. For example, it's hard to imagine a supervolcano getting everyone. Climate change is going to cause bad impacts, particularly on the regions that have contributed least to it, like the Global South.

But recent projections suggest that both the best case outcomes and the worst case outcomes are likely off the table because of what we now know about the sensitivity of the Earth system to carbon. So that's bad news and good news. It's bad news for the planet. It's good news if your sole concern is, will it kill everyone? It's bad news if you're an optimist and good news if you're a pessimist. Well, it's bad news under business as usual.

Yeah. We should care about more things than just, will it kill absolutely everyone? Yeah. But what I will say is that there are some of these things where we do need to think a little bit about how they interplay with each other. For example, if we had a really bad climate outcome that substantially set back the technological base of our civilization for a long time, and with it our level of resources and ability to respond to other things: if we're a much smaller civilization living in poor conditions in parts of the globe, are we going to be as well positioned to deflect asteroids? I think probably not. So some of these global catastrophes, if they have a long-lasting impact on the planet, will leave us less well prepared for other things. But if you take them in isolation, then yes, I think a lot of these things are very unlikely to get everyone.

I do think that you'd have to be very confident to say that level of threat doesn't exist from developments like AI. I'm personally quite concerned around that. And I think others like Toby Ord, Stuart Russell are as well. But again, it's hard to put a hard number on that because it's an unprecedented development. So let's pick up the work of Toby Ord, because a few years back his book The Precipice covered quite a lot of these topics.

And as I remember, he did pick out AI as the single most likely cause of extinction, at about one in ten, and he added in other things, including the risks of pathogens as well as unknown unknowns, and he came up with a figure of about a one in six chance that humans would be extinct by the end of this century. Do you broadly see things that way? Or, more generally, how does your own work build on, or maybe differ from, Toby's?

So I've had some discussions with Toby and I enjoyed his book. In my review, I didn't provide my own probability estimates because it was a review rather than introducing my own thinking. But I do think some of my estimates differ a little bit. I think the likelihood of extinction from a supervolcano is lower. I think extinction risk from climate change is lower than he puts it.

I might put my concern around AI a bit higher than he puts it, but I'm speaking later than he wrote this, and there's been an awful lot of progress in the last couple of years. That means, I think, the likelihood of big developments in this century is now higher than it might have been when Toby was writing. Can we tempt you to put a number on it? What is your timeline to ASI, and what's your personal p(doom)?

This changes all the time. I've got a bet at the moment with another scholar, which is broadly 50-50 on whether we will have AGI by 2030, where I'm on the "we won't" side. But I made that bet a year ago, and in a year from now, I might not still hold it. Progress has been going very quickly recently. I broadly take the outside view from a lot of the forecasts in this space, which, at least from last year and the year before, seem to be broadly converging on 15 to 40 years to AGI, but with a long tail.

But what's really changed in the last year is that a lot of the people who are leading the companies developing AGI have been putting very aggressive timelines out there. Dario Amodei is expecting, he doesn't use the term AGI, but he uses the term very powerful AI, able to do Nobel Prize-level science by 2027 or even late 2026. Demis Hassabis thinks it's 50-50 that we'd have AI capable of breakthroughs as significant as general relativity by 2030.

And I'm reluctant to discount them altogether. My outside view is still a bit more conservative than those estimates, but I'm quite uncertain about it. In terms of p(doom), should we achieve that, I have quite a bit of concern: I think maybe a one in three chance that it all goes pear-shaped if we do. As for my threat models there, I think there are a couple. One is the loss-of-control scenario that's been articulated so well by Yudkowsky and Russell and others.

One is that in the current climate, AI is being developed in the context of a perceived arms race between the US and China, particularly perceived from the US side. That creates a very dangerous dynamic in which very powerful AI might be used or misused in more dangerous ways, including integration into military contexts and so on.

Generally, I'm concerned about if you put that level of power in the hands of humans, even if we've solved one version of the control problem, the outcomes could still be really bad. I'd agree broadly with everything you've said there. And I think it's worth jumping into the two thirds possibility because we don't do this enough.

If there's a one-third possibility that we get artificial superintelligence and everything goes disastrously wrong and we go extinct, the two-thirds possibility is that we don't go extinct. And what happens then? My view is that then we're in a world where we have an entity on the planet which is much, much smarter than all of us put together and getting smarter all the time. And it likes us and it wants to help us. And our future could be absolutely wonderful.

Nick Bostrom famously in his book, Superintelligence, did argue that the future was probably binary. It was probably either disastrous or wonderful. And I think that's right. I think that the upside possibilities are really wonderful. And we'll probably either get one or the other. I think there are some complicated situations in the middle. There's some version where the AI is much smarter than us and hasn't wiped us out, but we are effectively irrelevant.

Our hand is no longer on the steering wheel of the future in any meaningful sense. And perhaps it's difficult for us to find any sort of meaning. We are no longer doing any science or anything like that that feels like it's progressing civilization. All the important things in the world and universe around us are so incomprehensible that we have no chance of understanding or engaging with them.

Perhaps our existence is a bit like the existence of chimpanzees now, where we effectively benignly tolerate them, but we also constrain to a very great degree how much of the world's resources they can have or use. Although, of course, chimpanzees have the benefit that they don't know about it, and we won't be in that position. We could end up like the characters on the spaceship in Wall-E, or like the Eloi in The Time Machine. That's clearly a possibility.

But I think as a community of people interested in the future, we should probably talk more about the upside because we don't want to just terrify people. We want to inspire them as well. I think we should talk about the upside. But I think this is a little more complicated than two thirds good, one third bad. So why not go first? Oh, I'm not saying that. I'm sorry. I'm not saying take a 33% chance of extinction. Why not? That would be very reckless. Of course.

If there was a possibility of avoiding it altogether, we should. I guess one other way of thinking about this, especially if you're thinking on long timescales, as I have been for this review, is that we are in some sense stewards of this planet, and in some sense stewards of this solar system and this light cone. And we're rushing very quickly into a development that we're not going to be able to come back from, and where we need to get the first shot at it right.

And there are many, many ways in which we could get the first shot very wrong and then foreclose the possibility that this planet or indeed this part of the light cone could ever produce anything of value. And we don't need to do it by 2027. We don't need to do it this decade. We don't need to do it this century. We can take time. We'll be right back after a quick break.

How can we sharpen up these conflicting estimates of probability?

We have one suggestion that there's maybe a one in three chance that bad things will happen once we reach superintelligence. Other people have much higher estimates of bad outcomes. People like Eliezer Yudkowsky seem to imply it's more like a 90%-plus probability that bad things will happen. As a longtime software engineer myself, I'm aware that new software often has unexpected and surprising bugs. And I see new AIs often have unexpected emergent features that take their developers by surprise.

So if anything, I'm more inclined to at least a 50% chance that the new AI will behave in surprisingly bad ways. How can we move beyond just our feelings of intuition? You pointed out that we could look at some prior history of various events, but there's never been the invention of superintelligence before. I think this is going to be a hard one to get anything close to consensus on.

Firstly, you have all the people who think the idea of AGI is nonsense in the first place, and that's still a debate happening in the academic community. We can, to a certain extent, break this down. First of all, we can look at progress towards something we might all consider to be at least a relevant level of AI capability. Some people even argue that AGI isn't entirely a coherent idea, or that we won't have a single point, we'll have something more like a spectrum.

And we can start looking at things like how generally capable these models are. What's the trend in terms of the length of task that they can perform? METR in the US are doing great work on showing that the length of task that you can leave AI systems to do is doubling every seven months or so. You can look at what kind of progress we are making in terms of AI being able to engage in AI research, which of course is a direct input into how fast AI progress can go.

And that can inform our estimates in terms of how far can we go on the current paradigm and how quickly could we get to a point where humans are no longer the limiting factor in AI development, which I think might be, for me, the key factor there. And then we can separate that out from our various threat models. So what kind of progress are we making on the alignment problem and how likely is it that we could get a loss of control situation?

Do we have some sort of meaningful governance to constrain humans, even if the AI system is under control? Papers are being written about the idea that, even separate from those two, there's a concern around gradual disempowerment, where we have slower integration of AI systems, and gradually humans become less and less powerful in the system, until at some point we basically have no power whatsoever in the situation and might find ourselves redundant, or worse than redundant, because why would you use so much of the planet to grow food for something that you have no use for?

I guess we can explore all those threat models, and we can look into what kind of progress we're making on controls for them. But given that it goes into both technical questions and governance questions and intuitions about the future, I think there's always going to be some level of disagreement. I hope we can get it narrower than 1% versus 99%, or whatever the range of expert views at the moment is, but I don't think we're going to get to a consensus. What kind of progress are we making in controlling the development of advanced AI? Because there are signs that we're actually going backwards.

There was attention given to AI safety institutes when they were formed, but at the most recent gathering of international leaders, in Paris, they were more in the background. They weren't even safety institutes anymore; they were reduced to being security institutes. More recently, there have been new AI models released without accompanying information about their testing framework and their audits, which at least had accompanied some previous cutting-edge releases.

So if anything, it seems like, in an atmosphere of a race to be the first to superintelligence, the attention on safety is going backwards. So I think there are trends in both directions. In some senses, I think some aspects of the technical problem look a little bit easier than they did a decade ago. The fact that LLMs are such a central part of the frontier systems at the moment feels like it has some advantages.

It was certainly a strong view at some points in the past that encoding the vast complexity of human values, and the interplay between those values, in AI systems was a really, really, really difficult task, and that it would be very, very hard to get it right. But LLMs at least seem to be able to produce facsimiles of human values and how they intersect.

They give plausible answers to various moral conundrums. They're not perfect yet, but they're doing reasonably well, it turns out, when you dump the entire internet's written information into these systems. And it seems like there's actually quite a bit of progress being made on different aspects of alignment: spotting deceptive behaviors, doing evaluations for them, and so forth.

And so there are some causes for optimism, I think, on the technical side. Holden Karnofsky has written well on this; he's got a few blog posts where he makes this argument. On the flip side of it, the governance situation feels more fraught than it has for some time. As you've mentioned, we went from things feeling very promising following the Bletchley Summit, and following the UK in particular bringing this to the international table, safety frameworks being put forward by the companies, and what looked like a pathway from voluntary commitments moving towards more mandatory risk commitments and risk assessment. However, in the last while, it feels like things have very much changed around that. The Paris summit, and I've written about this, was a real splash of cold water, where a lot of safety concerns were sidelined.

Voluntary commitments are still voluntary and there's certainly a concern that there will be less focus on them in future. Safety system cards aren't being released with some of the top models. And that's a real cause for concern. There are people still making an effort. I'm going to be in Singapore next week for an international AI safety meeting that at least will continue the work of advancing the scientific conversation around AI safety.

All we can do is keep pushing on this. I guess one hope that we might have, and it's strange to call it a hope, is that these things often flip quite quickly in response to developments. In some sense, the fact that we've not seen anything that really scared people or woke people up with powerful frontier models might have introduced a false sense of complacency so far.

But we might get, for example, wide-scale fraud attacks, or a cyber attack, or something to do with the capabilities of frontier models, that really wakes the public or policymakers up to just how destabilizing these things could be, in such a way that might motivate a greater focus on safety and security.

Yeah, I think that's a very important point. What needs to happen, if AI safety is to take center stage, is for a consensus to take hold among the public that this is something that's really important. At the moment, the incentives are all about competition: incentives for the hyperscalers and, particularly with the rise of populism, the competition between America, China and other countries.

That competition incentive is driving everybody towards creating more and more powerful AI, and the hell with safety. Here's something that's puzzled me for a very long time. We have all been interested in the future of AI, and we've all thought that it's probably the most important subject in the world for a decade or more. But we're in a tiny minority. The great majority of the species is paying next to no attention to this.

They're kind of aware that AI is very important. And I think probably even a majority is kind of aware that it might be an existential threat. And yet they're not paying serious attention. Can you explain that? I think that for a lot of people, AI is not visibly important to their lives in a way that I think might change in coming years. There are certainly sectors that feel it a lot.

Translation as a sector has basically gone. Graphic art is affected very badly; many people who made their livelihoods there probably can't in the same way. Hollywood's been waking up to it, and ended up really getting behind the debate on SB 1047 in the US in the autumn. I'm inclined to think that these sectors are the vanguard for something that's going to affect a lot more people to a much greater degree.

I think for an awful lot of people, and I don't just mean the man on the street, I also mean some policymakers who I speak to, they had a go of ChatGPT in January 2023. It was a kind of impressive, cool gimmick, but it didn't feel like it affected their lives that much. And they haven't necessarily been tracking the ways in which it's been getting better since.

And a lot of the ways in which it's been, to my mind, importantly getting better are not necessarily that immediately salient to people across a whole range of sectors. I track performance on hard scientific questions, or frontier math, or tasks specifically designed to be hard for AI, like François Chollet's ARC-AGI and so on. But those things aren't as salient as...

But if the trajectory we're expecting holds, then this is going to start affecting employment and a lot of other things to a much greater extent in coming years. And I worry that some of this will be masked by other narratives that are being pushed. So economic impacts can be blamed on, I don't know, immigrants or the war in Ukraine or whatever you want.

But at some point, I think it's going to become obvious that when you have one paralegal doing the work of 10 paralegals in the past with these tools, you're not going to have 10 times as much law. And so you're not going to be able to employ as many people. And at some point, that's going to become more and more obvious.

I'm not sure that's right, actually, that there won't be 10 times as much law. One thing we've seen with automation is that it makes services much cheaper and available to many, many more people. I think that as long as there are some jobs that machines can't do and that humans can, humans will keep retreating up the value chain to a higher and higher value-added level.

And I suspect the change, the shift to full automation, will be quite sudden. There'll be lots and lots of jobs for people, an accelerating churn as people have to retrain to do those higher value-added jobs. Then quite quickly, maybe weeks, maybe months, maybe a year or two, there'll be a shift from a state where the great majority of people are employed if they want to be, to a state where almost nobody's employed. Everybody still works, but there are no jobs, and we need a different economic paradigm, and we don't have a plan for that at the moment, which is a major problem. I'm continually surprised by how amazing things keep happening and it doesn't wake everybody up. I, for a long time, thought the arrival of self-driving cars would be the canary in the coal mine that would wake everybody up.

Self-driving cars are now running around in San Francisco and Austin and Phoenix, and have been for quite a while, with nobody in the front. And most people are not paying attention. Even in those cities, people just quickly take it for granted. ChatGPT-4o and all the others.

They are miraculous. You can have conversations with them. They're obviously not sentient creatures, but they are unbelievably capable. And yet most people go, oh yeah, well, that's quite interesting, and think about what's for tea. I'm not sure we'll ever wake up. The idea that you can boil a frog and it doesn't jump out is wrong; frogs actually do jump out of increasingly hot water. But I think we are a bit like that. I suspect we may never jump out of the water.

I hope that you might be wrong. And I also hope that this transition is more gradual rather than sudden, because I think we'll have a much better chance of dealing with it well if it is gradual. But I can't rule out this scenario you place before us either. Should we all become preppers? Should we be storing away stuff for use in an emergency, in deep underground burrows or anywhere we can get extra land?

I think there's something to be said for local resilience, because having communities that can sustain themselves locally gives you a lot of system-level global resilience when something happens. And we certainly know that major events disrupt supply chains in ways that can quite rapidly reach tipping points. With that said, as for the idea that ever more people should be going for the kind of extreme level of prepping that you'd need to really be individually resilient to a global catastrophe, I'd prefer that time put into making sure that catastrophe doesn't happen in the first place. And for the very worst-case scenario, for example out-of-control AI, I basically think there is no level of prepping that makes you resilient to it. Not even going to Mars. If we can get to Mars, so can they. Oh, that's right. No, I think you're quite right. Being a prepper, I think, is a route to nowhere.

Have you given any thought to the possibility of machines becoming conscious, either prior to or at the same time as artificial superintelligence arrives? And if you have given any thought to that, do you think it would be a good thing or a bad thing for machines to become conscious?

This is one of the questions that I am a coward about and I sidestep. I will be honest that I really struggle with consciousness. I don't know what it is. I don't know how to measure it. It is a little bit spooky to me. I don't know how we would ever satisfy ourselves that AI was or was not conscious. So to a certain extent, I tried to sidestep this and instead think about, well, what are the capabilities, the cognitive capabilities of these systems and what can they do?

I do have colleagues who think about this a lot, and I'm going to punt that to people like Henry Shevlin, or Rosie Campbell and Rob Long at Eleos, or David Chalmers. And I'm really glad that those people are thinking about this. But yes, I'm a coward on this topic. I want to try and encourage you to come out of your academic analysis paralysis on this.

It's easy to overcomplicate consciousness. I mean, you're quite right. We can't measure it. We can't even be sure it exists in anybody except for ourselves. But on the other hand, it's the most important thing for each of us because actually it's the only thing we know about.

Our brains are prisoners inside caves. They don't experience anything in the outside world other than signals that are coming in. And our brains reinterpret those signals. Everything that we experience is a model that our brain has made up. And so consciousness is the most important thing about us. And I think whether superintelligence is conscious or not might turn out to be incredibly significant.

The instinctive reaction that most people have when you suggest it is, good God, don't make a conscious AI. That would be awful. But I think that's because they're confusing consciousness with volition. I think they're thinking that if a machine is conscious, it will have goals and drives and ideas and wishes; but it'll have those things anyway. You don't have to be conscious to have those things. My thought is that if a superintelligence is conscious, it will understand what we mean when we say we're conscious and we're able to suffer. And it will have, I'm not sure if empathy is the right word, but it'll have a sense of what we're about, and it'll be better able and better disposed to treat us well. There's a lot of argumentation behind that, but that could actually turn out to be a really important consideration, and one which the AI safety community should probably be thinking about sooner rather than later.

I guess my instinctive reaction is also, good God, don't make them conscious, but for a different reason, which is if they're conscious, I feel like that has an awful lot of significance to the question of moral standing. And suddenly a lot of what we are proposing in terms of safety and control starts sounding a lot more problematic if we're talking about entities that have moral standing.

So in some senses, I would prefer that we didn't have to deal with that ethical minefield as well as everything else. Do you mean that because we shouldn't be imposing our safety requirements on moral agents? I mean that the idea that we should be designing moral agents to be fully aligned to us doesn't sound great to me. The idea that we should be just tweaking and experimenting with the motivations of moral agents doesn't sound great to me. The idea that if we don't like the way things are developing, we should just completely shut down moral agents sounds suspiciously close to murder to me. Yeah, yeah. Well, mind crime could be a thing. Yeah.

If we can navigate all these challenging things ahead without those important moral considerations coming into play, then all the better. But that is one reason why it would be helpful to know if they're conscious or not. The one other thing I would say is that our consciousness hasn't necessarily resulted in us treating other humans, or the other species that occupy this planet with us, with ideal morality.

So I think it is in no way a strong guarantee of good behavior towards us if the AI systems are conscious. So there's a fascinating set of questions there. But I wonder, Sean, and this is my last question, I wonder if there's other topics that we've sort of missed out in this discussion.

We've focused mainly on AI as an existential risk. But are there other things that you've been looking at which you think deserve a greater share of attention, either in their own right or as possible complications as we try to manage AI? Progress in biology and biotechnology is going very quickly, and indeed AI tools will help us make further progress more quickly. There is certainly a lowering of barriers to entry to do really quite powerful things. That is an area that definitely needs more attention.

The level of experimentation that can be done by a small group now is much more advanced than it would have been in years and decades previously. We're lucky that there are some really good people working on this, but I do think it probably needs more attention. One area where I had some disagreement with Toby was the amount of risk this coming century that comes from unforeseen anthropogenic developments. I can't remember exactly what figure he had.

But I thought it was interesting that most of his risk came from things that humans basically had pushed into being from 1945 onwards, which is 80 years. The idea that we wouldn't bring into being even greater threats in the next 80 years, or at least equivalent ones in terms of contribution to risk, seemed a little unlikely to me.

In our discussions, I think it came out that his estimate was based on being more optimistic than me that we would get our act together in terms of governance and wisdom around these things. But maybe I'm a little bit too close to how the AI governance conversation is going at the moment. I don't feel all that optimistic at this moment, but I guess we'll see.

My last question is this. Let's say we don't manage to slow down or stop the development of advanced AI and we do rush ahead to artificial superintelligence. Is your hunch that it would be possible to align it or control it? Or do you think it inevitably, once it arrives, will be in control and we will be the chimpanzees in the picture? I think it will be possible to align it and control it.

Again, I think this technology looks a bit different now that LLMs are such an integral part of the current paradigm. It just provides a very rich input channel to these systems, in terms of a wide range of diffuse concepts, and my hope is that that makes it a bit easier to align them to the messy set of contradictions that is humanity. And I think under very careful development conditions, it should be possible to control these systems as well.

It sounds counterintuitive that a less intelligent thing would be able to control a more intelligent thing. But there are examples of less intelligent things controlling more intelligent ones, at least in certain dimensions, maybe not across the full set of dimensions. And there are complicated proposals being put forward by people like Yoshua Bengio, of monitor systems that would oversee very capable systems and that would provide a set of checks and balances.

So I do think it's possible, but possible isn't the same as certain. And a concern is that if we rush headlong towards this, because some people consider it paramount to get to a certain level of development before other people they perceive as being in an adversarial state. then those are not the conditions in which I would expect ideal care and caution to be taken. And I worry that we're going right towards that.

The paper I'm currently writing is trying to highlight that this narrative isn't as supported by evidence as it should be, given the influence and consequence it has. And it really is one that I think strangles meaningful international governance at birth. And that creates all the preconditions for very reckless development of such a powerful technology.

When's this new paper of yours likely to be available? I'm hoping a few weeks. I'll have to check with the book authors how they feel about preprints, but I'm hoping to try and get it finished in the next couple of weeks. Sounds like we should look forward to it. Well, I do hope so. I feel like there are things that need to be said. Yeah, there's no point us being naively optimistic.

And I do see lots of examples from my past life where we were just too optimistic about how good various bits of software were. If we don't really understand what's happening inside, we can convince ourselves by lots of tests that it does seem to behave well, but there's always an environment or a situation, an interaction that we haven't anticipated and that can often go wrong.

So without that understanding from the inside, it's unlikely that we can be sure we have really got this system as safe and as aligned as we would like. I also wonder: if you're the kind of person who sets up a company, you probably have some degree of action bias and optimism bias.

You're the kind of person who really believes in themselves to build a big thing. Maybe that's also the kind of person who has a very strong confidence that they are the person to lead a team to develop the thing safely, perhaps slightly more so than an impartial outsider would be about it.

It certainly seems like some of the leading CEOs at the moment have a lot of confidence that this thing is risky, but better that they themselves do it than somebody less competent or less safety conscious than them do it. But the end result of that is multiple companies racing each other to get to this very risky threshold first, which is probably not the outcome that anyone would necessarily want.

Hence the need to strengthen the AISIs, the AI safety institutes as they are called, to make sure that the future is not determined just by whichever leader of big tech happens to be the most gung-ho, the most voracious in their appetite. Yes. Overcoming what I think Anders Sandberg and Nick Bostrom called the unilateralist's curse in a paper a couple of years ago.

Well, you've given us a lot to think about, Sean. I look forward to finding out more about your future work. We'll keep an eye on it. And thanks for all the pointers you've given us, which we'll write up in the show notes. It's been a pleasure talking to you.

This transcript was generated by Metacast using AI and may contain inaccuracies.