The case for a conditional AI safety treaty, with Otto Barten

May 09, 2025 · 38 min · Season 1 · Ep. 113

Summary

This episode features Otto Barten discussing a Conditional AI Safety Treaty as a means to govern AI development and deployment. He addresses the challenges of international cooperation, balancing innovation with safety, and the role of public awareness. The discussion covers the specifics of the treaty, potential obstacles, and the Existential Risk Observatory's ongoing efforts to promote AI safety.

Episode description

How can a binding international treaty be agreed and put into practice, when many parties are strongly tempted to break the rules of the agreement, for commercial or military advantage, and when cheating may be hard to detect? That’s the dilemma we’ll examine in this episode, concerning possible treaties to govern the development and deployment of advanced AI.

Our guest is Otto Barten, Director of the Existential Risk Observatory, which is based in the Netherlands but operates internationally. In November last year, Time magazine published an article by Otto, advocating what his organisation calls a Conditional AI Safety Treaty. In March this year, these ideas were expanded into a 34-page preprint which we’ll be discussing today, “International Agreements on AI Safety: Review and Recommendations for a Conditional AI Safety Treaty”.

Before co-founding the Existential Risk Observatory in 2021, Otto had roles as a sustainable energy engineer, data scientist, and entrepreneur. He has a BSc in Theoretical Physics from the University of Groningen and an MSc in Sustainable Energy Technology from Delft University of Technology.

Selected follow-ups:


Music: Spike Protein, by Koi Discovery, available under CC0 1.0 Public Domain Declaration

Promoguy Talk Pills
Agency in Amsterdam dives into topics like Tech, AI, digital marketing, and more drama...

Listen on: Apple Podcasts   Spotify

Digital Disruption with Geoff Nielson
Discover how technology is reshaping our lives and livelihoods.

Listen on: Apple Podcasts   Spotify

Transcript

That's the dilemma we'll examine in this episode, concerning possible treaties to govern the development and deployment of advanced AI. Our guest is Otto Barten, Director of the Existential Risk Observatory, which is based in the Netherlands but operates internationally. In November last year, Time magazine published an article by Otto advocating what his organisation calls a Conditional AI Safety Treaty. In March this year these ideas were expanded into a 34-page preprint which we'll be discussing today,

“International Agreements on AI Safety: Review and Recommendations for a Conditional AI Safety Treaty”. Before co-founding the Existential Risk Observatory in 2021, Otto had roles as a sustainable energy engineer, data scientist and entrepreneur. He has a BSc in Theoretical Physics from the University of Groningen and an MSc in Sustainable Energy Technology from Delft University of Technology. Otto, welcome to the London Futurists Podcast. Thank you, David. Thank you for joining us, Otto.

Otto, before we discuss AI safety, what could you tell us about how your own personal focus evolved from climate change to existential risk? Basically, the trigger for me was just getting to know the problem of existential risk, especially AI existential risk, but also other existential risks.

If you are aware that there is a possibility that either humanity ends its existence completely or that we lose all future value, then I think it's obviously of overriding importance compared to other things. Even something like climate, which I still do think is very important as well. Climate is also an existential risk in itself, but probably not the largest one.

It's estimated by Toby Ord, an existential risk researcher and effective altruism co-founder, to be about a one in a thousand chance, while AI is estimated to be roughly one in ten for the next hundred years. So that's two orders of magnitude higher. And I think these estimates are probably realistic. I still think climate change is a big issue, but mostly for non-existential reasons. I would personally rank it at maybe number two, after the future of AI, among important issues in the world.
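
To spell out the arithmetic behind that comparison, using only the two figures quoted above:

```latex
% Ratio of the two risk estimates mentioned above:
% AI roughly 1/10, climate roughly 1/1000, over the next hundred years.
\[
  \frac{1/10}{1/1000} = 100 = 10^{2}
\]
% i.e. the AI estimate is two orders of magnitude higher than the climate estimate.
```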

But a little bit more about my personal journey, maybe. I was working as a sustainable energy engineer indeed, building wind turbines, and I later had a startup for smart charging electric cars, working on the engineering side of the energy transition, as you might call it. But I was also an activist at Extinction Rebellion and another NGO in the Netherlands. And I attended a lecture in London by Anders Sandberg, a futurist at the Future of Humanity Institute, which you actually hosted back then.

I learned there that there is a possibility that we end our existence. I had never thought about that. I thought it was quite absurd, in retrospect, to have never thought about the future of humanity. I couldn't even tell, to within one order of magnitude, how many years we might have left. I think it is really important to zoom out and to think about our larger future.

I was also convinced that technology is probably going to play a huge role in this. It has played a huge role so far in us being able to feed the 8 billion people that we have now. We would never have been anywhere close to this world without technology. That's definitely important. And for future technology, I think it's quite unlikely that we have discovered all of science yet. It's quite unlikely that we have built all technology yet.

I think it's quite likely that there are big things ahead of us that will be immensely important for our future. And it doesn't have to be hundreds of years ahead of us. It could also be even a few years ahead of us. Perhaps one more example. I was working on this startup back then. The concept was to smart charge electric cars whenever there is most solar and wind power available.

And I was working on this, and it would have taken, in any case, maybe about 10 years or something before we could do this at scale. And I was thinking, okay, 10 years from now (that was back in 2020), it doesn't seem that unlikely that we have AGI already, and this whole enterprise is basically for nothing: either because AGI went well, and then an AGI can do this in a trivial time, or because AGI went poorly and now we're not around anymore.

It just seemed kind of pointless, to be honest, to work on something that has a bit of a longer timescale, if you know or suspect that human-level AI is coming and the impact that it could have. Yeah. Before we get into the existential risk of AI, I'll take...

From your time as an activist, did you manage to square the circle of how to carry out protests which register with the media and with the public, but don't really annoy people? I think that in the UK anyway, and I'm pretty sure in America and other places, the activities of some of the activist groups like Just Stop Oil have actually created a backlash and pushed people away from the message that they're trying to spread. Is there a way to square that circle?

I think there are all kinds of things that you can do, and you can be on this whole spectrum between giving policy advice and being very nice and polite to people, all the way to doing the more disruptive actions that Just Stop Oil has been doing.

I do think there's a serious risk of backlash, and I think that happened in the Netherlands as well, where Extinction Rebellion was doing protests. I think that could be one of the reasons that many people voted populist, because they radicalised to the right in response to actions by groups like Extinction Rebellion.

On the other hand, to be honest, I think that the responsibility rests with the people responding to this and not really with the people doing these actions. I think everyone is responsible for what they're doing themselves. What the activists are doing is simply saying: this is an enormous problem, it requires a lot more attention, and we demand from our government that it solves this problem. And I think that's not unreasonable at all.

It may very well be a reasonable thing to do, but if it causes enough members of the public to react against the message, then that has an effect on public policy. It undermines any momentum there is towards a more progressive public policy, and that's a real problem. I don't think it's a real problem, but I think that the responsibility for that does not rest with the activists, but with the people actually responding in such a way.

I think those activists are morally quite ahead in a sense. They are saying what they believe is true and they're acting like it. And I think that's more or less the best that you can expect from people. That's already quite a lot.

But you could expect even more, and say: okay, apart from doing this, please also take into account all the effects that it could have on society, try to model this, and try to act on it. I think that's too much to expect from a lot of people. But I do think that if you are reliably able to do this, and you are certain or near certain that your protest will cause backlash, then it might be better not to pursue that strategy and to pursue another strategy instead. You could see this perhaps as a mistake in activist organisations that is leading to an adverse outcome. But it's all pretty hard to predict. I do think you also need pressure to get things moved.

My own view is that if an organisation is perceived as only being negative, as only being against, as only advocating some sacrifice, then it's unlikely to be widely embraced. It needs to have a positive message too.

So rather than just saying let's use less energy, a better message is: there is a wonderful green economy ahead in which there are lots of jobs, lots of energy, and a healthy environment. Please do it like this and transition away from the old trajectory.

And with AI, I think just saying let's pause AI can be counterproductive. So I like the framing that other people say, which is we're not trying to pause all of AI. We're just trying to pause the parts that are particularly dangerous. And we will get the benefits of AI. We'll get the benefits more reliably provided various safety measures are advocated.

So that's the takeaway I had. My message to myself last year was: I need to be clear in spelling out the risks, because many people deny the risks. But my message to myself this year, 2025, is to keep on emphasising the positive upsides of these safety approaches. I don't know if that matches your own thoughts, Otto.

It's very difficult, I think. It's very difficult in this field to even say what is net positive and net negative, and it's even more difficult to say what is the most net positive. I think there are all kinds of takes that are quite different that could all be defended intellectually.

I do think Pause AI can be defended intellectually, as potentially the most obvious thing to do. It's kind of a precautionary principle approach, and they're being explicit about a precautionary principle approach: if it's possible that this leads to human extinction, then let's not do it. I do think that's a common sense thing to say.

However, of course, this is also causing backlash among the people developing it, and then it depends on what your threat model is, what your solution proposal is, and which actors need to move, and in what way, to get this done. Is your solution proposal something that can be brought about by policy action only, or do you need public awareness as well? There are many takes that I think are perfectly defendable intellectually, and I think PauseAI is one of them.

However, it's also definitely a possibility that it will cause backlash, and that actually what you need is less action: only informing actors, for example, or only doing some kind of intervention to get people at the labs to behave more responsibly, or to get bureaucrats, for example, to bring about small changes in legislation. And I think that's also a defendable approach.

I think it's really difficult to say what will work in the end. Personally, I sympathize with people who say what's on their mind. And I think it's up to the rest of society to take the right actions in the end. So let's get on to your Conditional AI Safety Treaty. As David said, you've published a 34-page preprint, “International Agreements on AI Safety: Review and Recommendations for a Conditional AI Safety Treaty”. What is a Conditional AI Safety Treaty?

Great question. It's combining several elements that we think are potentially the best from the space. One is an AI safety treaty, and an AI safety treaty has been proposed by many people. The more explicit variants of that say something like the precautionary principle: if AI development is that dangerous, and it could cause an existential event, then let's agree with multiple countries, possibly all the countries of the world, or possibly just the US and China, for example, to not do this.

That's one of the intellectual outputs of the space, if you like, that I think is promising. I think this is also a useful rallying point. Something else that we have now incorporated in the Conditional AI Safety Treaty is a responsible scaling policy,

and this is propagated and already done by labs like Anthropic, and I think OpenAI as well, and Google DeepMind: they are doing evaluations, basically testing what the capability of our models is and what the misalignment of our models is,

and if the capability is close to a main threat model, then they commit to, well, it's a little bit unclear what they commit to exactly, but let's say they would at least not release such a model that can, for example, be used to build bioweapons, or can cause loss of control, or some other threat model that is very serious.
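
As a rough sketch of the kind of if-then logic such a responsible scaling policy encodes, consider the toy decision rule below; the metric names, scores and red lines are hypothetical illustrations, not any lab's actual criteria.

```python
# Toy sketch of an evaluation-gated release decision, loosely modelled on the
# responsible scaling policies described above. All metric names, scores and
# red lines below are hypothetical illustrations, not any lab's real criteria.

from dataclasses import dataclass


@dataclass
class EvalResult:
    bioweapon_uplift: float  # 0..1 score from dangerous-capability evaluations
    autonomy: float          # 0..1 score from loss-of-control / autonomy evaluations


# Hypothetical red lines: crossing any of them blocks release.
RED_LINES = {"bioweapon_uplift": 0.5, "autonomy": 0.5}


def release_decision(result: EvalResult) -> str:
    """Return 'release', or 'pause' naming any capability that crosses a red line."""
    crossed = [name for name, limit in RED_LINES.items()
               if getattr(result, name) >= limit]
    return f"pause (crossed: {', '.join(crossed)})" if crossed else "release"


if __name__ == "__main__":
    print(release_decision(EvalResult(bioweapon_uplift=0.2, autonomy=0.1)))  # release
    print(release_decision(EvalResult(bioweapon_uplift=0.7, autonomy=0.1)))  # pause
```

The point is only that release is gated on pre-agreed evaluation thresholds, rather than on a judgment call made after the model is built.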

Now if things are clear-cut, if it's clearly visible whether an AI is going to cause existential damage, then people will stop.

But I think the problem is that a lot of these things are contentious. People are saying, well, it looks to me that my AI won't cause any damage. I've tested it a bit. It doesn't cause any existential damage in the test environment. Therefore, I want to release it. And others might say, you haven't tested it adequately. You've only tested it under particular conditions. There's going to be a lot of argument as to whether it really is responsible to release it. How can that be answered?

I would say someone building it themselves is quite likely biased and is quite likely not really committing that much of their own bandwidth to safety. I worked as an engineer and what you're really trying to do is solve the problem ahead of you and for those working on AI, the problem ahead of them is increasing capability.

That's, I think, top of their mind and probably occupying 99% of their headspace, because it's already difficult enough, and it's really difficult to think about other things as well. So I don't think you can expect those people to have safety top of mind, simply for bandwidth reasons, but also for bias reasons. Of course, they are heavily involved in those companies. They have shares and everything.

So of course what they want to do is get it out and make money with it in the end. And for academics, this is somewhat better, but not hugely better, I think. They might not have monetary incentives, but they do, of course, have fame and glory incentives to get their stuff published before somebody else does.

So I don't think you can expect from those people that they are very safety conscious. And I agree with you, David, that if it were completely obvious that we would lose control over this model, or completely obvious that someone would manufacture a bioweapon, then they would probably stop themselves. And it might be that we actually end up in that scenario.

But I think a much better scenario would be if other people would make the trade-off between safety and economic development, or societal use of a model, and not the ones developing it themselves. And of course, traditionally, governments have held this role, and I think it really makes sense in this case that they decide.

So also at the moment, given that you have the hyperscalers locked in a very fierce battle with each other to continually push the performance and capability of their systems, given that you have America and China seemingly increasingly locked in a competitive struggle, and also given that you have a whole slew of people who think the idea of existential risk is science fiction, including very senior, experienced and reputable members of the AI community like Andrew Ng and Yann LeCun.

Do you think there is any realistic possibility at the moment of an AI safety treaty being negotiated and agreed, without first there being a Chernobyl-style AI catastrophe? I'm not sure. So a catastrophe might be one option, of course, what people in the space call a warning shot. I'm quite uncertain about timelines, and I'm quite uncertain about how public opinion is updating about AI already.

And I think quite a big part of the public opinion, but also of experts actually, is still quite skeptical about AI existential risk for capabilities reasons.

So they're simply looking at large language models and they're like, okay, we're prompting here, but how is this going to take over the world? I don't see it. And I think that's a fairly commonly held opinion, and in a sense it's reasonable, I think, if you look at current AI. So I would expect that, if and when AI capabilities progress further, and I hope that will happen gradually and not very fast.

If it does happen very fast, it's quite dangerous. If it happens gradually, then I think we're in a safer situation, and we will probably have the situation that these models are gradually getting better. But they're getting obviously better, they're getting obviously more capable, and they're also getting obviously more dangerous for misuse.

Then I would expect public opinion, but also expert opinion, to move quite a bit towards more caution. And I think in such a world, an AI safety treaty becomes much more thinkable already. But even right now, I do see positive points in the regulation that's already been accepted in Brussels. There's the AI Act, of course, and also an EU code of conduct. This EU code of conduct is already enforcing a responsible scaling policy for companies.

That's already going fairly far towards what we are proposing in the Conditional AI Safety Treaty. And that's with current levels of awareness. So I do think there is reason for hope, even in the current climate. But that having been said, of course, the international climate right now is also quite against cooperation, unfortunately, especially between the US and China, which is, unfortunately again, the most important cooperation required for this.

So right now, it does seem to be difficult, but I can easily see ways that this would improve a lot over the coming years. Both the cooperation between major countries, but also AI existential risk awareness in the public and in experts. Let's talk about the EU controlling what AI companies do. Just today I think there was news that Meta is threatening to downgrade the functionality of some of its tools in Facebook in Europe.

European users will have a poorer experience because of what the EU is mandating. So I don't think it's by any means clear who's going to have the upper hand in the struggle between the big tech firms, who want the freedom to innovate as they see fit, and the politicians, who are often viewed as being out of touch with what they're trying to control.

Another thing that happened today, well, probably yesterday, with Facebook, was that Mark Zuckerberg appeared on another podcast, by a young man called Dwarkesh Patel, and was asked about the political situation in America, where he has given a million pounds to Trump's inauguration event and has made various other gestures of support towards Trump. And Zuckerberg's reply was very interesting.

He did not breathe a word of criticism about Trump and what a lot of people think is a blatant attack on democracy, massive corruption and economic sabotage. But he was very critical of the previous government. What I think we're seeing in America, regardless of what you think about Trumpism, is proof positive that the big tech firms are not too big to be taken on by government.

Trump is having a wrestling match with Big Tech and he is winning, hands down. They are all bending the knee. Some of them have jumped in feet first and absolutely swallowed the Kool-Aid, like Musk, but others are clearly overawed by him and intimidated by him. And they are toeing the line. They're doing what the government, in the form of Trump, tells them to do. And I am sure that Europe can also make big tech companies do what it wants them to do.

If the lawmakers decide, here are the new rules, then those are the new rules, and Big Tech will obey.

Mostly, that's also my understanding: that sovereignty is still trumping business interests in the end, usually.

Although of course businesses can lobby quite a bit, and they can put pressure on governments, or they can leave and go to another country. So it's not trivial. But I do think that if we really want to do things as a society, then that usually happens. There are ways to do things, but I think the problem in many cases is that we don't really want to do the right thing. That's also why I'm a believer in raising awareness, certainly of this issue, but also of some others.

Well let's compare this with the threat from nuclear weapons. I think everybody in the world is aware of the dangers of nuclear weapons. Everybody in the world hopes that there's no World War III, but still nuclear missile capability is spreading despite international treaties prohibiting it. There is increasing threat from some rogue state.

who ignore the treaties. There is a threat potentially from terrorist groups who might get their hands on some of the weapons that haven't been well managed by failing states elsewhere in the world. If we can't even keep an eye on nuclear weapons, despite the world wanting to, what hope is there for controlling AI capability, given that AI capability isn't easy to detect? Excellent question, and I think we can broaden this a little bit, even to treaties in general.

In the international policy community, people generally support the concept of treaties between countries, but they're also sometimes skeptical about enforcement, which is always subject to political will and not always easy. I would say that a world with treaties is much better than a world without treaties. Despite all their weak points, there are quite some relative success stories of treaties.

And I would actually point to nuclear non-proliferation as more or less a success story, if you like, in the sense that we haven't had a nuclear war for the last 70 years. And I think that, with 12,000 nuclear warheads already pointing at us every second, it's something that we should very much appreciate that this has never happened.

It's quite unlikely if you would have asked me, maybe I'm a bit of a pessimist in a sense, but if you would have asked me in 1945, is there going to be a nuclear war in the next 70 years, I would have said surely. It's quite a miracle, if you like, that this hasn't happened.

I think that's the right word, actually, Otto. I think miracle is the right word. There have been at least seven, or at least six, I think, very near misses where we just escaped by the skin of our teeth. And, of course, most people don't know about those. And nuclear technology is much, much harder to deploy than advanced AI, and advanced AI is getting easier and cheaper every year or so, which isn't really true of nuclear weapons.

So I'm not sure that our narrow escape from nuclear devastation (it probably wouldn't have been extinction, but devastation) is a great reassurance for what happens when really advanced AI gets cheap. For AI, I wouldn't look at one person creating AI on their computer and trying to regulate this somehow. I agree that this is probably not possible, although I'm not sure if you have read Bostrom's easy nukes paper.

He's basically pointing to this scenario where hypothetically everyone would have a way to easily build a nuclear bomb. He's more or less proposing universal surveillance as an option against that. I'm quite skeptical about universal surveillance.

I don't think we should go in this direction, and personally I don't think we would have any option at that point of still regulating it. I think we should look at things that are bigger and regulatable, and those are of course chip factories, or even lithography companies. There is a giant supply chain in the end that is very delicate and spread between many countries. For example, there's only one company, Zeiss in Germany, that can make the giant mirrors that are apparently also required.

I think there are many of these supply chain companies that are actually crucial and that are all working together right now to increase our AI capabilities. And I think in such a situation, blocking this somehow should be doable. So the place to regulate AI is probably not at the scientist trying to work on this or at the engineer programming this. It should be somewhere up in the supply chain. There should be something that's actually regulatable.

One of the great things about Nick Bostrom, galaxy brain that he is, is his willingness to take arguments to their logical conclusion and accept the consequences. And his idea in the vulnerable world paper, as one way of putting it, is that as megadeath becomes much cheaper, in other words as it becomes easy for a disgruntled teenager to kill half the human population with a bioweapon, we have to accept that maybe privacy is a thing of yesterday, and that a universal panopticon, where some kind of central authority can see what we're all doing and can stop us doing stuff if it gets nasty, maybe that is an inevitable outcome. I'm not saying that's true, but I love the fact that Bostrom is willing to take arguments to that sort of extreme.

And talking of taking arguments to an extreme, one of my current hobby horses, as listeners will know, is machine consciousness. And I'd like to run something by you, Otto. Do you think that machine consciousness can play a role in AI safety? I want to read you a quote from the homepage of the California Institute for Machine Consciousness, which is a non-profit set up by Joscha Bach.

He says attempting to control highly advanced agentic systems far more powerful than ourselves is unlikely to succeed. Our only viable path may be to create AIs that are conscious, enabling them to understand and share common ground with us. Does that resonate with you at all?

I don't think it will work. That's the main issue. First of all, I think consciousness is completely undefined, so it's really difficult to have discussions about consciousness if the scientists specializing in this have completely different definitions of what we're actually talking about. So first of all, we would need to ground this somehow in something physical or measurable.

Maybe self-awareness is a related concept where it's already a bit more clear what people actually mean. I would say self-awareness is the capability to model yourself in exchange with other people or with the rest of the world. And I think that that awareness, or that capability, is actually a part of intelligence, and I think it's quite likely that we'll get this by default as capabilities are increasing.

I can hardly see an AI being very active if it doesn't have a clue what role it has in the rest of the world. So I would say that it's quite likely, and perhaps it's even displayed in large language models currently, that AI will have some grasp of the rest of the world, of the kind of responses it can expect from the humans it's interacting with, but also of itself.

So I would say that quite likely we will get self-awareness by default. However, it's a completely different story whether this actually helps with AI safety. We are quite well aware of the concept of ants. We know more or less what ants do and what ants want. But still, of course, if we're building a road and there's an anthill in the way, we will definitely build the road anyway. So I think it's not obvious that AI, even if it's conscious, would have common ground with us because of that.

I like the idea of emphasizing the awareness of the AI. Perhaps the AI itself could be doing the total universal surveillance of what it's doing. In other words, if an AI is asked to create a dangerous pathogen, it will say: hey, I'm aware that this would cause a terrible risk, I'm not going to do it.

A bit like some of the AIs already trying to resist some things when they're asked. Currently these AIs can usually be jailbroken, so a model may refuse at first to do something, but if you ask it in the right way you can bypass its protection. But if that awareness is somehow deep enough in the AI, it's possible that it will act as a kind of universal surveillance. It won't be reporting on people. It won't be saying, you know, this guy's cheating, or doing bad things in terms of their normal life.

It will only prevent them from doing something really damaging. How we program that, of course, is a huge problem. There's the risk that it will be incorrectly programmed. There's the risk that it will go wrong in some cases. There's the risk it will indeed be jailbroken despite my wishes. So I still think we need to do more.

in terms of, as you said, controlling the supply chain. But I think there is something that's well worth pursuing in that idea. I'm not sure I have much to add to that. I agree with what you're saying. My next question to you then, Otto, is what is next for the Existential Risk Observatory? You've promoted this conditional AI safety treaty. What's coming next?

From the start, the Existential Risk Observatory has been about reducing human extinction risk by informing the public debate. So we will continue to do that, and we will continue to mostly do media work. We've published about 40 media items so far, and we're planning to expand that, and also to do some research in this area of how high public awareness actually is right now. According to our last measurement point, it was 18%, measured in some way, of course.

I'm quite curious to track this, whether this is going up and how fast, whether there is a tipping point. So 18% is what? Is that the number of people who are aware of existential risk? What we've actually done is ask people the question through a survey. Assume that humanity goes extinct in the next 100 years. What do you think that the top three most likely causes are?

The reason that we've done that is to not prime people somehow towards AI, and to see how many people come up with AI or something similar, like a robot apocalypse. We're surveying the public, so we get all kinds of interesting answers. But roughly 18% said, before any intervention of ours, that AI was in their top three. So they were at least connecting the concept of human extinction with the concept of AI.

And this has also risen: in the first measurement we had about seven percent, then five percent, and then eighteen percent in the third. According to some literature, there are tipping points between 10 and 25% when a new idea is entering a group of people. I can imagine that this is actually a new idea penetrating society that will at some point reach a tipping point. But we'll also try to increase awareness of existential risk ourselves by doing media work.

We'll also, of course, continue to develop the Conditional AI Safety Treaty and try to push this further in policy circles. In particular, there's an AI summit in India coming up, not an AI Action Summit, not a Safety Summit, but an AI Impact Summit, I think. That could be an interesting moment to inform people about the treaty as a policy proposal. It's one thing for people to be aware of a risk. It's another thing for them to actually prioritise action against it.

If you ask people, do you support democracy, do you support freedom, most people will say yes. But are they actually taking action to defend democracy and defend freedoms, or whatever else they want? There are many stages involved in building a social movement that will actually cause change. That's true for sure. But I do think that public awareness is quite a bottleneck right now for taking action on AI safety.

I think a lot of people, if you condition them on, let's say, AI being an existential risk, even if they would be skeptical of it, would say: yes, obviously in that case we would need regulation. Not everyone. Effective accelerationists are a notable exception, but I think this is fairly common.

And I think right now, also, many politicians, if you talk to them about this problem, say: yes, I would like to help you with that. I would actually also like to pass regulation on this topic, but I cannot really sell this to my voters.

Even if they would be aware of it themselves and they would want to do something about it, the politician is also not universally powerful, of course. They very much have a voting base to take into account. And if they know that this issue is not at all seen in their voting base, then there's only so much that they can do. So awareness, even if people are not actually joining a campaign or something or not taking any concrete action themselves, is still very important.

This starts perhaps with those informing society. That's one reason that we're focusing on the media. They have quite an important role to play here, in between politics, if you like, and the public, but so do other thought leaders like academics, artists, etc. We think it's quite important that those people get on board with these concepts, at least.

So you've got the activity to share information, to raise awareness. But let's come back to what you would like to tell people about the Conditional AI Safety Treaty. I'm not sure we've actually got to the bottom of what the conditionality is.

Is the idea that there's a treaty prepared and people don't put it into practice yet, but when a certain threshold is approached, when people can see the canaries dying, as it were, to use that metaphor of the canary in the coal mine, there will be a treaty that people are ready to quickly adopt at that point?

Yeah, so what the treaty says is that if we get too close to the level of a main threat model, for example loss of control, but also other threat models, and alignment has not been conclusively solved, so if there is a significant safety risk, as assessed by an international network of AI safety institutes, then the signatory countries agree that they will pause training runs of those dangerous models. That's the core of what it really says.
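
To make that if-then structure concrete, here is a toy sketch of the conditionality as a decision rule; the function names, the majority-vote aggregation and the numbers are illustrative assumptions, not the treaty text itself.

```python
# Toy sketch of the Conditional AI Safety Treaty's if-then clause as described
# above: if an international network of AI safety institutes assesses that a
# main threat model (for example loss of control) is too close, and alignment
# has not been conclusively solved, signatories pause the relevant training runs.
# The majority-vote aggregation and all names here are illustrative assumptions.

from typing import List


def treaty_triggered(institute_assessments: List[bool], alignment_solved: bool) -> bool:
    """True when the conditional pause comes into force.

    institute_assessments: one bool per AI safety institute, True meaning that
    institute judges a serious threat model to be too close for comfort.
    """
    majority_sees_risk = sum(institute_assessments) > len(institute_assessments) / 2
    return majority_sees_risk and not alignment_solved


def signatory_action(triggered: bool) -> str:
    return "pause dangerous training runs" if triggered else "continue development"


if __name__ == "__main__":
    # Example: four of six institutes judge loss of control to be too close,
    # and alignment is not conclusively solved, so the pause applies.
    assessments = [True, True, True, True, False, False]
    print(signatory_action(treaty_triggered(assessments, alignment_solved=False)))
```

In this reading, the trigger is the safety institutes' assessment combined with alignment remaining unsolved, and the prescribed action for signatories is to pause the relevant training runs.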

And a big advantage of already doing this right now is that it solves the coordination problem, as any treaty does. Any treaty I would say between countries is solving a coordination problem that these two countries would like to do something, but they can only do it if they know that the other country is doing the same thing.

This coordination problem is now very clearly present, I think, between China and the US, but maybe later also between other countries, about AI existential risk. So that's one thing that the Conditional AI Safety Treaty would solve.

But it also solves a timing problem, and that ties into the conditionality, or if-then clause, of the treaty, which says that only if and when we get too close to either loss of control or another main threat model, then those countries will need to do something about it.

So in principle, countries that are skeptical should still be able to sign up. They may say, well, I don't think AI is going to get out of control. I think it's never going to pose an existential risk. The conditions you're worrying about won't arise. So they should still be able to sign it. Then if circumstances change, if they see things happening which they never expected would happen, it will then lead them to flip more easily into the compliance mode rather than the criticism mode.

I think so too. I think there is no rational reason for someone who is skeptical about the capabilities of AI, skeptical that we will get to such a capable AI soon, to oppose it. If people are skeptical about that, there's still no reason for them not to sign or support this treaty. That goes for countries, but also for scientists and members of the public. That's the main advantage of the conditionality proposal.

Personally, I found that I had a hard time supporting the pause right now. I mean, I do support the pause; I support the work of PauseAI as well. But even if I were personally campaigning for a pause right now, if I were the king of the Netherlands or the king of the world or something and I actually had the power to do this (the king of the Netherlands doesn't, by the way), then I would still have a hard time actually pausing AI, because even I would say, okay, but we're probably not at that level yet. There's also a big economic reason to continue development. But I do really think that the world would be better if the Conditional AI Safety Treaty were actually put into action, and I think this should be done right now, and it could be done right now.

And what kind of reaction are you getting? Are people saying, yes, good idea, sign me up? Or are they opposing it in some ways or just ignoring it? They're pretty interested in our work. In the existential risk space itself, there's giant support for something like this.

There are discussions about relative details. For example, should you have a network of AI safety institutes deciding this? Or should you have a centralized global AI organization deciding this? We can discuss those relative details, but the concept of some kind of an AI safety treaty is broadly supported, and I think also the inclusion of conditionality, although the PauseAI people and StopAI people are not really on this wavelength. But I think there's a lot of support in general in the AI existential risk space.

Outside of the AI existential risk space, I think people are quite impressed with this work. And in a sense, what you're communicating is just that I'm not simply being hyperbolic or something. In the past, we've written pieces in Time, for example, warning about AI existential risk,

framing it as human extinction and saying we have to do something about this. But a lot of people are simply not taking this into account, since they assume it's hyperbole or something. That's a difficult part of this communication. And I think by even thinking a few steps further and saying, okay, this is what we should actually do about it, you're also communicating that at least you've given this some thought, so it seems a bit less hyperbolic.

I do think there's quite some demand, if you like, outside of the existential risk space for communicating solutions as well. That's also feedback that I got from a journalist once: yes, okay, it's an interesting and important problem, but I'm just more interested in writing pieces about solutions than about problems.

And this is a solution. I think you can present this in a way, and we've also done this in the Time piece, to say: okay, AI existential risk is a big problem, but it could relatively easily be solved if we actually sign this treaty and if we actually take these actions. So please keep communicating solutions.

So we can figure out what the contentious areas still are: how these evaluations are going to be done, who's going to do them, how they're going to be agreed. But those are the right conversations to have. So thanks, Otto, very much for coming on the London Futurists Podcast. Thank you for the positive reception. Many thanks.

This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.