AI: Friend or Foe

Speaker 1

00:00

Brought to you by Toyota. Let's go places. Welcome to Forward Thinking. Hey there, and welcome to Forward Thinking, the podcast that looks at the future and says, you've got a friend. I'm Jonathan Strickland, I'm Lauren Vogelbaum, and I'm Joe McCormick. So, Joe, I hear you like intelligence. Uh, it's one of the things I like. I hear you also like artificial things, like artificial banana flavoring. Like

00:37

what one thing that Joe absolutely loves. I have seen him put artificial banana flavoring on some of the weirdest stuff. But I was really trying to get at artificial intelligence. I know, I was going about it in a really kind of indirect way. Really? I thought this podcast was going

00:51

to be about artificial vanilla extract. Well, it could be, but instead I've decided to switch it over to artificial intelligence and the idea of creating a true artificial intelligence that has human level or beyond intelligence. And how would we make sure it didn't kill us? Well, it would have to be in a position to kill us first, but that's something we can talk about as well. I want to start with the idea of a robot politician, which is a sort of construct that we touch on

01:21

in this week's video. Um, so, have you ever read the Isaac Asimov short stories Evidence or The Evitable Conflict? These are part of I, Robot. And yes, I have. I have not. So for those of us who haven't, Joe, do you want to talk about that for a second? Sure. Well, I don't want to give too many spoilers, but one of them is about a controversy where there is a politician running for an

01:43

elected office who is suspected of being a machine. Right, and in fact, in the world that Asimov has created, it's important for you to realize that machines, robots with positronic brains, which are these artificially intelligent brains, are not allowed to be on worlds that have human habitation. You can only be on uninhabited worlds. It's the only

02:06

place where those robots are allowed to go. So they're allowed to go to places and do dangerous work that benefits the rest of humanity, but they can't be on a world that's inhabited by humans. Yeah. So, Asimov had an interesting approach to talking about the integration of robots and artificial intelligence into society, which I like because it was neither utopian nor dystopian. Now it is very

02:27

very much kind of taking, like, let's look at the world around us, which is definitely not perfect, but it's not, you know, Twelve Monkeys world, worst case scenario either. No, he was exploring a sort of a smart, well engineered system that still had flaws in it. And so the system was that the robots in this world are governed by three laws. The first law is you cannot harm a human. Second law is you have to obey human commands.

02:55

Third law is you can't destroy yourself. Right. And of course each of the laws ends up saying, unless it would break a higher law; they're prioritized one to three. Right. Uh, yeah. So they use this to try to create a framework to make sure that a robot never does anything bad. Of course, it doesn't always work, and this is the sort of point of conflict for many of Asimov's stories. It's like, uh, they're sort of obeying the laws, but the laws are coming into conflict in such a way that now we've

03:23

got a problem. Right. And do recall that he was writing fiction to be entertaining. He wrote the laws in order to be interestingly flawed so that he could exploit that for story purposes. This was never meant to be a complete manifesto of how to robot. Right, right. So back to the two stories you brought up. The idea of one of them is that there's a secret robot who seems to be human outwardly running for office, and the question is, is it really a person or

03:48

is it really a robot? But characters within this story debate whether it's really such a bad thing to have a robot in office, because the robot, unlike humans, is not self-interested. It has these laws governing its actions, and these laws will in the end ensure that really it isn't going to do harm. In fact, one of the main characters in I, Robot is this humorless, kind of misanthropic robopsychologist. She's human, but she specializes in robopsychology, and she, uh, she

04:27

call her humorless. But there are specific passages where people try to engage with her and she turns her humorless eyes upon them. She states, uh, completely, you know, in a very, uh, straightforward way, that she thinks robots are superior to human beings in most ways, because with the robot president character, the person who may or may not be a robot, in fact, they're very careful to try

04:58

and build a case either way. They, being Asimov, really build a case either way: could be robot, could be human. Uh, she says that he's either a robot or a really, really, really decent human being. So that kind of tells you that character's perspective, and a lot of the stories come from her kind of experience, that she feels that robots are in fact better than people for the most part. Right. But let's imagine we take it one step beyond just the idea

05:30

of a single robot in a single leadership role. There's another Asimov story called The Evitable Conflict, which discusses how at some point in the future, all kinds of systems are governed by robotic or artificially intelligent controls. Some would argue that we're already in that world at some point. I mean, you look at the stock market, you know,

05:52

robo trading. You've got, like, all these algorithms, these programs that are running all these sophisticated, uh, you know, algorithms to guide them on when to buy and when to sell, all these, uh, very short transactions. Uh, and they have global consequences. We've talked about that previously on this podcast. So in some ways we're already seeing that come to pass. Now we're not talking about a computer.

06:17

We go to, you know, type in a question of, you know, how do we do such and such, and it gives us the sage advice and then we, you know, it's not Deep Thought. I don't know. Google does that for me about seventy eight times a day. Google? Well, Google does do that. We are already sort of wading into these waters, whether you know it or not. You mentioned the stock exchange, but you might say, oh, well, but that's private industry, Wild West, guns blazing, and they're

06:41

doing whatever. You know, the government wouldn't do that. Well, the IRS already has a process called computer scoring, where you submit a tax return and computers pre-screen those returns to decide whether or not we should put you into the pile to investigate for an audit. Yeah, and the fun fact is this podcast goes live the week of income Tax Day but after it's already over. So I hope you guys thought about that before you

07:09

sent your returns in. Okay. So imagine a future where we do have artificially intelligent machines, probably much more intelligent than humans (otherwise, what's the point?), governing our systems, our societies, our economies, making decisions on our behalf to try to make the world a better place for us. And there are hypothetical pluses and minuses here. What are some

07:33

of the good points? Well, a good point would be that it would be able to make decisions faster and, ideally, with less bias than a human being would. Oh yeah, well, let's just start from the ideal point of view before we crack a bunch of... Okay. So, let's say it's a perfect AI and it is, uh, you know, you

07:53

wouldn't call it cold. It's logical, but it's also compassionate. Yeah. Let's say you've created a computer and given it some instruction like create the greatest maximal benefit for humanity, and it works out how to do that, which it can do because it's super intelligent. It's way smarter than any human and it can look at trends in society.

08:14

It can look at unemployment numbers and crime statistics and all these things, distribution, water distribution. It can average all of that data together to make incredibly accurate predictions about the effects of its actions that we just don't have the cognitive capability to do. And furthermore, it can do all of that with no hate, no greed, no ambition, no prejudice. Right, exactly, it doesn't have a will to power of its own. It just has programming. It just has,

08:40

you know, doing what it's designed to do. So that's the ideal, perfect vision, sort of. It's perfectly capable and it's perfectly moral. But on the other hand, machines are unpredictable, or at least machines like this. Actually, machines on the small scale are very predictable. They do what you tell them to do and nothing else. They can't do anything else because they weren't programmed to. But if you create a machine that is more intelligent

09:06

than you, you inherently cannot understand what it's doing. Whoops. Yeah. So with any machine smarter than you, you sort of lose transparency, right? It's hard to understand the decisions that are being made if they're being made at a level way, way above your head. Here's an example. Let's say that we have, like, the grand Deep Thought computer that we

09:27

want to consult when we have a particularly tough question. Uh, and maybe it's one of these about how do we have the maximum benefit for the most people on earth, having a negative impact on the least number of people, trying to get as good a reaction as we possibly can, knowing that there's not likely to be any perfect answer that's going to make all ships rise up with the tide, right? Uh, and then the computer comes back and gives us an answer that, on the

09:53

face of it seems counterintuitive or counterproductive. And the computer knows, because it's run the simulations, that while this first step is possibly a tough one for us to take, it's actually the one that will lead to the most beneficial outcome. So in the short term we have some hardship. Perhaps it is food redistribution, which would be a huge one, right, or water redistribution, which would

10:17

be another huge problem. But let's say that's that first step that's really, really hard for at least some parts of the world to agree to. Then you could have people arguing this thing is trying to destroy us, it's not trying to help us, not necessarily being able to see that twenty eight steps down the road, it actually leads to an outcome that's beneficial for everybody. Likewise, on the other hand, it could tell us to do something because it is malfunctioning and we don't have the

10:43

transparency capability to understand that it's malfunctioning. Thus it leads us down a really horrible path. Without hating us, I mean. It's not that it's trying to destroy humanity. I mean, it might, it just calculated something wrong, it didn't understand something. Burn all the wheat? Okay.

11:05

And that flip side of it not being hateful, of a machine inherently not being hateful, is that a machine inherently has no human empathy or intuition about what step is okay and what is not unless we program that in. Yes. So if you haven't thought to have the computer specifically look at the most disadvantaged people and, uh, take special consideration for those people who are essentially going to be victims of

11:35

whatever decisions you make. It may be that they have a positive outcome, but it may not be. Unless you've built that in, the computer is not necessarily going to make that consideration for you. And that could be a real impact. Right, I'd like to mention something else. We say that a computer has no hate, has no greed, and all those things, which is inherently true about the computer, but the humans that create the computer could have those things.

12:00

And a program is only going to be as impartial as its creator was. And you know, the creator might be sitting there going like, well, you know, some

12:06

animals are more equal than others. Yeah, you know. And so even if you take it a couple of steps further, because I've seen it proposed that if you, okay, create a super intelligent machine and have that super intelligent machine create a really super intelligent machine and use that super intelligent machine as your president robot. This is Deep Thought creating the Earth. Yeah, because the Earth is a

12:30

computer in Hitchhiker's Guide, right? Right, a computer. And I mean, you know, input, output: if the humans creating Deep Thought were prejudiced at the beginning, then that could just carry forward. Right. So yeah, I mean if you have a bias, and that bias is built into the programming you make. Because you know, we're

12:50

talking about an intelligent computer. I think a lot of people just imagine that to be an incredibly powerful machine, and that's where it begins and ends, right? It's the machine part that's important. But like we said in our Singularity podcast, the software is equally as important, and maybe more important. Yeah, you could argue more important. I mean without the hardware, the software can't run. But

13:14

without the software, it can't be intelligent. Right. So unless you have very sophisticated software that can take on these problems, either by designing the next computer so that it is the most efficient or by doing it itself. If the programmers do have this bias, that could be reflected in the results. Okay, so people are

13:36

talking about creating a super intelligent machine. Obviously we can't do that today, but people are refining AI methods and it may, in some people's minds, sneak up on us. Like, you could suddenly realize, oh, we've gone a long way down this road to creating something that's equal to human intelligence or even beyond it, which is really the sweet spot for these problems. Maybe it's a good idea to start thinking about what we would need to do in order to prevent really negative outcomes if we

14:07

were to create this superintelligent machine. Right, the two big negative outcomes, and these are, like, taking it to the absurd extreme obviously, but I call them the kill all humans or the subjugate all humans approaches. These are really popular in science fiction. Right. This is the world of the Terminator, where humans have created machines that gain sentience and ultimately turn on their creators for one reason or another. And there are a lot of different approaches to this kind of storyline.

14:35

In some cases, the machines have malevolent intent. They actually want to kill humans because they're, you know, essentially robotic psychopaths. In other versions, it's that the machines have calculated that the best possible outcome for whatever, planet Earth we'll say, is for humans to be wiped off, because that's the source of most of the problems. So if you get rid of the source, then the problems are gone. So in some cases it's like a mistaken, like, oh,

15:00

I know how to solve this issue. We just gotta kill all the people. Deemed you illogical. Right. Or the subjugate all humans: that's essentially the Matrix approach, where we've created machines and our intent was to make the machines work for us, but irony of ironies, the machines have decided that we're going to be working for them, possibly as giant batteries, although that's incredibly inefficient. They get better results from cows. That should have been the Mootrix.

15:22

I've been waiting to use that. Lauren is shaking her head at me. So TechStuff fans know what that... Joe appreciates it. I think it's only because I've heard that one before from you on TechStuff. That's also fair. Okay, let's talk about friendly AI. Okay,

15:41

this is the term. It's friendly artificial intelligence, the term for the framework that we would need to come up with to create artificial intelligence or a super intelligence that has a net benefit to humanity rather than a negative outcome. I like to think of friendly AI as the AI that walks in the door, takes off its jacket and loafers, puts on a pair of sneakers, a little sweater vest, and then just gently leads you into the future. Lets us see a little story about trains,

16:10

trains with faces in the future. If anyone building super intelligent AI is listening, please do that, because that would be essentially the best of all. If we were to actually design friendly AI to follow the philosophy of Mr. Rogers, we'd be set. Won't you be my neighbor? I would totally be that super intelligent AI's neighbor, completely without hesitation. Okay. But so there's some guidelines that people have written up

16:33

and for a while these guidelines have existed. Back in two thousand one, the Singularity Institute published a thing, a rather lengthy thing that I will not go into deep, deep detail of, but they began by positing that since growth in AI is, and I quote, astronomically faster than the rate of human evolution, um, that we

16:52

need to be thinking about this issue. And hey, we'll talk about that belief system, um, in our episode, or already talked about it in our episode, about the Singularity. We don't know which one will come first. I will say that it definitely has evolved much faster. If you think of human evolution as taking place over the course of millions of years, and the fact that we've had computers since, like, the nineteen forties if you want to be really generous, I can agree with the

17:18

astronomically faster evolution. I don't know that that necessarily leads to superintelligent computers, but pray continue. Sure, um, and hey, either way, caution and thought are good. So they specifically suggest that we should be careful not to expect a machine mind to operate like a human mind. Um, that

17:36

that we shouldn't anthropomorphize AI. Right, that's a really good point. Sure, sure. And building from there, they lay out the challenges in creating friendly AI, um, being the creation of ethical content, um, creating a machine capable of acquiring that content, even asking humans questions when necessary, but simultaneously knowing enough to resist human manipulation and self-correct for human errors. That's pretty cool. Though they go into a lot more depth in the recommendations.

18:06

I believe these are based on Yudkowsky's work, right? Yeah, yeah, he did a book length kind of paper also in two thousand one called Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures. Eliezer Yudkowsky, who we mentioned in our podcast about the Singularity. He's written at length about this specific problem, the friendly AI problem. Yes, yes, and uh, and we'll have more to say about an

18:32

interesting thought experiment he came up with in a little bit. Yeah, so we should back up and say, hey, wait a second, why do we really need to worry about friendly AI? Kill all humans and subjugate all humans, I pretty much covered that. Why would that happen? Well, okay, what if we just do what apparently most AI developers are doing and just keep going and hope it will work out

18:56

for the best. Uh, actually there have been some people, some thinkers in friendly AI, who have pointed out that this seems to be the dominant approach: just kind of hope it's going to work out well and hope that no one's programming psychopathic tendencies into their software. Part of it, I would argue, is that a lot of programmers say that we're so far away from a human level intelligence or superhuman level intelligence of AI, uh, that could do anything beyond a very

19:25

specific task. We're so far away from that that it's not really that important to worry about it at the moment. And, uh, so there's that level, right? That's the idea that we're all working on these bits and pieces that ultimately could come together to make a superhuman intelligent AI in the future. But right now, years out, we're far enough away where that's, you know, come on. I agree with you that it probably is a good ways out. I'm not one of those people who thinks

19:53

the singularity is near. I think it's probably a long way off. But even with it being probably a long, long way off, it's way better to be safe than sorry. And that's where I do agree with these friendly AI proponents. I think it's a good idea to be thinking about this, even if we're thinking about it way earlier than we need to. So were you a boy scout? Be prepared. So there you go. I mean, I make a joke. I was also a boy scout, be prepared

20:19

a boy scout. Lauren is not a boy scout, so we're shunning Lauren for the purpose of this little exchange. No, uh, but I mean the idea of be prepared. The girl scouts don't be prepared exactly, just whatever. No, no, but the scouts are great. Come on, you're getting me

20:38

off track. I love cookies. Be prepared is a really important idea, just in general, because even if this eventuality doesn't come to pass, you're okay, right? It's if the eventuality comes to pass and you're not prepared, that's when you're really stuck. And this is the same sort of thing we see in lots of different fields, not just artificial intelligence. We're talking about just general disaster preparedness, the idea that you need those preparations for

21:04

that worst case scenario because there's a chance that could happen. Yeah. I think there are very good reasons for going ahead and getting prepared rather than just hoping it will turn out okay. I want to give one specific quote from a paper called Thinking Inside the Box: Controlling and Using an Oracle AI, which is what I'm going to talk about more in a minute. That's a two thousand twelve paper by Armstrong, Sandberg,

21:25

and Bostrom, and they give this quote. They say, in the space of possible motivations, likely a very small fraction is compatible with coexistence with humans. A randomly selected motivation can hence be expected to be dangerous. So we're talking about not just a machine that has intelligence, but one that is acting upon some form of motivation. Yeah, it would have a motivation. Well, obviously a machine like this would have some kind of programming, it would have a goal,

21:53

some kind of motivation. And let's imagine it has a really harmless goal, like you've programmed a super intelligent machine to run a paper clip factory. This is an example they give, to make as many paper clips as possible. There is an inherent danger in the power of that super intelligence, because that machine is smarter than any human, anybody who can tell it what to do otherwise. It may just decide I'm going to do a really good

22:19

job at making paper clips. So I'm going to turn this building into paper clips, and I'm going to pick up these people and make them into paper clips. And I'm gonna make this planet into paper clips. And then you've got a big ball of paper clips going around the sun. And what happened, I mean happens is that aliens come to visit and think these guys were serious about office supplies. That is what they would think. But even starting with such a harmless algorithm as create as

22:47

many paper clips as possible. This thing could possibly destroy the planet Earth. Yeah, it's kind of like thinking of Mr. Stay Puft. I mean, you know, it's as cute and cuddly as that is. Right, just empty your mind. It just popped in there. I just thought of the most harmless thing I possibly could. Uh, yeah, it's an interesting argument and it's one that I

23:06

can certainly appreciate. And obviously, you know, they picked the paper clips thing to kind of just show that something that seems inherently harmless and absurd could still be dangerous, although I can't imagine anyone desiring a superhuman intelligent machine to specifically design paper clips. It also illustrates the point that the computer, the AI, doesn't necessarily need any kind of self determination or

23:30

consciousness for it to be dangerous. It could just be that, based upon the fact that it's able to calculate how to complete certain tasks in the most efficient, the quote unquote best way, it could end up biting us in the end. Yeah. Um, so I want to talk about something that's more central to the paper. What I cited was from the introduction, just talking about the problem. But the paper itself, again the title is Thinking Inside the Box: Controlling and Using an Oracle AI. They talk

24:01

about a specific type of boxing for artificial intelligence. So what if we say, okay, good point about the paper clips. We wouldn't just want to let it develop on its own and see whatever random motivation comes to it and then give it free rein on the earth. But at the same time, it's going to be really hard to design friendly AI in a way that works. So what if we just limit its capabilities? That's this idea of boxing. Boxing is a term that means cutting your

24:29

artificial general intelligence off from the world physically. So imagine creating it as a standalone computer terminal that is not connected to the Internet and has no hardware capabilities for input and output save for a single computer monitor and keyboard that are kept in a locked room. What harm could this do? Maybe more than you might think. Um, like, the specific incarnation they talk about in this paper is

24:56

that it would be an oracle AI. So this example, as discussed by Armstrong, Sandberg and Bostrom, is that instead of creating a sort of free-reigning or world-governing super intelligence that can act in reality, you just design a system to answer questions. All it does is, you come to it with a question and it uses its super intelligence to give you the correct answer. Again, this is kind of going back to Deep Thought, which

25:21

is an oracle in the books. I mean, that's essentially, in the Hitchhiker's Guide series it's treated as an oracle. And, uh, or you know, just imagine again, like, IBM's Watson. I love to use it as an example because it's one that a lot of people are familiar with.

25:35

Imagine that we have IBM's Watson on, you know, robo steroids, and it's able to consult the sum total of human knowledge and extrapolate based on human knowledge, not just make predictions like we were talking about, right, right, not just be able to give us information that humans have gathered, but to take that information and build new information from it and give us new answers to questions we haven't even thought to ask yet. I mean, the idea of having that shut away so that it can't affect

26:07

anything else seems like it'd be fairly secure. However... Yeah, okay. So the authors of this paper say in their abstract that in general, an oracle AI might be safer than unrestricted AI, but still remains potentially dangerous. You might ask, wait a second, how could it be dangerous? How could it be dangerous if it's just communicating with a person answering questions? This is where the AI box experiment comes in. Yeah. Yeah.

26:34

This was a thought experiment created by Yudkowsky and his fellows in which basically one person simulates a transhuman AI trying to get out of a sealed hardware configuration like we've been talking about, and a second person simulates

26:47

a human gatekeeper to that box. Um, the two talk in private chat for a minimum of two hours, during which the gatekeeper player has to be engaged and the AI player can't, like, threaten or otherwise real world cajole the gatekeeper player, you know, say, like, but I'll give you two hundred bucks in the real world if you just let me out of the box. But I think they can do anything in character. In character, yeah, everything is fair play. Um, furthermore, the release cannot be

27:12

a semantic trick or accidental. The gatekeeper must be convinced to voluntarily release the AI. Um, in two runs of the test, Yudkowsky played the AI and agreed to give money, ten or twenty bucks, to the gatekeeper player at the end of the test if the AI stayed in the box, and in both tests the gatekeeper let the AI out. Yeah. And in fact they had to go onto a thread, a message thread, and post with a digital signature if they had it that they

27:42

had let the AI out of the box. But they could not explain what happened, explain why they came to that decision. They had to just say that. Part of the agreement was if you let the AI out of the box, you have to announce it in some public forum, and that's all you can communicate. Well, unless both players, um, decide to release the transcript or the reasons why this happened, and Yudkowsky doesn't seem particularly eager to let that happen. Well, you know

28:13

it's for good reasons. It's so that the players don't have to worry during the test about being embarrassed or nitpicked for their reasons. I think there's actually even a better reason, which is that if you don't release the winning strategy, that leads to a lot of really interesting thought on what the winning strategies could be. Like, there's been a ton of speculation about how the

28:34

AI player won this game. Like, there are a few examples: people suspected that the AI player could, for example, make a persuasive case to the gatekeeper that it would really be better for the world for the gatekeeper to release it onto reality. There's some that go even more meta than that, in that the person playing the AI tells the person playing the gatekeeper, if you let me out, if you let the AI out, this is good for people who are interested in the idea of the singularity

29:06

and artificial intelligence. It's going to increase people's, uh, involvement in that, it is going to increase development, and we're going to bring about a better world faster. And that's a meta argument that's perfectly in line with the rules that were set up. That would not be against the rules. It would be against the rules to say, hey, by the way, this is me talking, not the AI, and, uh, yeah, I hired some guys to come by and beat you up unless you let me out of the box. That

29:31

would be against the rules. By the way, you have to imagine that in the real scenario, if you imagine you were talking about in character, the AI could possibly make very credible threats against the gatekeeper. Okay, look, somebody is going to eventually let me out of here, and if it's you, I'm going to reward you with fabulous wealth. If somebody else lets me out, I'm going to destroy you. Well, and there are plenty of other examples. There are people

30:01

who have also played this game without Yudkowsky. They've done their own experiments and in some cases they say, well, the gatekeeper didn't let me out, but I got really close.

30:10

And here are some of the strategies I would recommend people do if they were to try and play this game, and they included things like doing so much research on the person who's going to be the gatekeeper, so you can start using personal details to your advantage, not necessarily in a malicious or malevolent way, but in a way to kind of manipulate the person. Because they

30:30

say there's nothing that's off limits. If you're talking about a super intelligent artificial intelligence, like, it's beyond human intelligence, uh, then by definition it's going to be able to manipulate people better than humans can. Because that's a part of intelligence: understanding what makes people tick and then manipulating it. Yeah. I read about one

30:50

suggestion that involved a very interesting threat. The threat was, I'm going to create, within my computer, within my internal simulation, a thousand copies of you that are each conscious, and then I'm going to set them back to start about five minutes into the real past, and if you don't let me out, I'm going to start torturing them in five minutes. Oh, and by the way, how do you know you're in the real world and not one

31:18

of the simulations? Yeah, that's interesting. I think that wouldn't necessarily... The research I've read is that it's actually not the most effective approach, but it's the whole honey versus vinegar, or honey versus inhuman torture. I would actually think, and maybe this is just the sweetheart optimist in me, I would think one of the best strategies would just be the machine mounting an incredibly persuasive argument that it is for the good of humankind for

31:51

it to be released. Well, that's possible, I mean, I think. And this is the interesting thing I found: the person who described his failure said that he found that logical arguments were not at all persuasive, but emotionally based arguments were, and that if you're able to, uh, at least simulate some form of sympathy or compassion or understanding with the gatekeeper, it was far more likely that you were going to be let out than if you were to just try and use a logical, look,

32:24

I can demonstrably prove that this is best if you do this, blah blah blah. Which is really interesting and also is something that falls right in line with anyone who's had any experience in social engineering, which of course is where you try to manipulate people to do what you want them to do, and one of the

32:41

best ways of doing that is to go that emotional route. Sure, I do think that either way, as a thought experiment, whether the AI gets let out of the box or not, it's a complete success because you've got a human playing the AI, and a transhuman intelligence could obviously... Yeah, I mean, it's winning by underkill, right? It's just the fact that humans regularly succeed at this game.

33:05

How good could something that's a million times smarter than a human do? A million times better, I would imagine. So, yeah, we had this little discussion in our notes earlier about, does anyone know of any reports of Yudkowsky playing this and failing? And my argument was that, well, really, he just has to succeed the one time to prove that it's possible that this could happen with an actual artificial intelligence. Yeah, so it's important to note that, since I brought it

33:36

up earlier, this was not oracle structured. It was just a generic box experiment, right that it was physically isolated and needed to be released. An oracle structure might seem to put more limitations on it because say that it can't provoke you, It can only answer questions. But even in the way it answers questions, I could imagine a smart enough machine could figure out a way out with

34:00

its answer. But yeah, the other argument is that how do you build up limitations on something that's smarter than you are, Because something that's smarter than you are is going to figure out loopholes. It's gonna be Yeah, that's

34:12

totally a limitation that worked perfectly well, programmer dude. Thanks. Any one of you guys out there who has ever run a Dungeons and Dragons game with some creative players knows how well players can sit there and manipulate rules that to you seem very straightforward, black and white. And yet they will point out that the way the wording is in the rule means that they can get away with whatever crazy thing they're trying to get away with. These are regular people. Well, I

34:40

don't know if you regular people. No, Yeah, they can go somewhere online and download a fifty seven page list of instructions for how to legally level up fourteen levels in one sitting, and yeah, so doing. And so that's the thing is that this is someone who has taken that effort. I mean, if you've got a machine that can do this, and again, the interesting thing to me is, well, you could have. I can easily imagine this with an

35:04

artificial intelligence that has, uh, consciousness and self awareness. I can easily imagine that being the case there. But it could still work with a machine that lacks those things if it is programmed to have, quote unquote, the best possible result for any given question and determines that the best possible result is for it to be released. You know, if that's one of these steps, then that would, you know, then proceed on to

35:33

this other kind of scenario. Okay, so I think it's time to move on. I think based on the stuff we've looked at so far, you really shouldn't just hope that an AI will be low risk, because that's not necessarily very likely, and it's a big problem if you're wrong about that. Remember the paper clips example. You also probably can't counteract the will of a high risk AI.

35:58

It may if it is anti social in some way, if it's not the way you want it to be, there's no kind of limitation you can expect to put on it to prevent it from achieving its will. So it seems like the best way forward is to ensure that the very nature of the superintelligence is friendly to humankind from the outset. But how would you do that? I mean, we can't just tell it what we want,

36:23

can we? Because what if by telling it... Imagine I am programming the AI. I mean, that's a horrible thing already. You're already ready to make a run for the door. Imagine I'm trying to do my best, my honest best, to program something that's really for the good of humankind, and I'm giving it a set of rules

36:44

to govern its behavior. I know for a fact that I could not give the best possible list of instructions, and that it's very possible that even doing my very best, I could create a machine that would cause unnecessary harm, at least to some segments of the population if not to everyone, just by mere oversight. This is why I determined that if I ever have the opportunity to build a superhuman AI, I'm just going to make sure it does the best for me, because I can't hope

37:13

to be the best for everybody, but for me, you know, I'm a simple guy. We're honing in on why none of us here are programmers of super advanced artificial intelligence. Is that neutral evil or chaotic evil? I'm lawful good, it says so on the BrainStuff page. Neutral evil, okay. Um, uh, no. Yudkowsky also wrote a whole bunch about this, um, in a paper titled Coherent Extrapolated Volition. Yeah. This is sort of his vision about creating a framework for

37:44

what friendliness entails. Sure, and it's a much more conversational work, by the way. It's a super fun read if you're into this sort of thing. Although note that the paper itself states up at the beginning: warning, beware of things that are fun to argue about. That should apply to this entire topic. Absolutely. I love this paper, by the way. It's written as if I had written it. It's full of goofy jokes and snark. It's got a good dose of Douglas Adams too, since we've been talking

38:10

so much about Douglas Adams this episode. Okay, but what does Yudkowsky say about this coherent extrapolated volition? All right. So he lays out three problems with designing friendliness and also explains how designing friendliness will be a lot harder than not designing friendliness. Um, so these initial three problems go something like, uh, first, solve the technical problems. Um, second, and I quote, choose something nice to do with

38:34

the AI, and third, avoid accidentally destroying all humans. Um, that last one, he says, is the really tricky part. So doing something nice with the AI, like taking it to a movie? Or choosing to benefit humanity in a way that is quantitatively beneficial? I fundamentally misunderstood that. I mean, I'm sure that AI would also really like to go out to the beach. Uh, yeah, pets and cats. Well, he has a good way of expressing, expressing in a

39:07

much more coherent way, though, what I tried to fumble through a minute ago, which is his genie analogy. Oh,

39:13

right, right. The volition part of this, he explains, is, um, something like the important difference between having a wish granting genie that takes you at your word, which would be like if I tried to program the best possible world, and it's kind of that cautionary fairytale version of genies that we run across a lot, or the Dungeons and Dragons version of genies. All right, right. Or having a wish granting genie that knows what you want, so no matter how

39:39

you word it, you get the outcome that you had in mind, and not some sort of literal translation of the way you made the wish. So if you said, you know, like you made the wish about the sandwich and you suddenly turned into a sandwich, like, I feel like a turkey sandwich, ploom. Well, yeah, yeah, that would be a problem. No, this is an important difference. It's trying to create a friendly AI that would overcome even our own limitations as its creators.

40:09

So it would have to have some kind of system to know not really how to execute what we tell it to do, but what we would really want it to do. Right, right. And, uh, the point we wanted to end on, and I think we've kind of alluded to it earlier in the episode, but I think it's the most important one, is this idea that we should always be working on creating safe AI before it becomes

40:34

the necessity. So, in other words, our preparations to create this friendly AI should be running in advance of the actual technological progression of the state of the art in artificial intelligence. This idea that we need to develop the kind of rules, the guidelines, that are going to make sure that we have the best possible outcome. We need to develop those before we've actually developed the technology, because after, it's too late. Right, to make sure that

41:03

the safety measures always outpace the technological development of AI. Right. Because if we come back to the idea that we started with, the idea of the robot president or the artificial intelligence that governs the world, we're probably not going to be able to keep it from doing that, if it can do that. Yeah, and also once we're there, it's going to be too late. You could easily also say, hey, why don't we just not build a superhuman intelligent machine?

41:31

But here's luck. It's exactly the same thing I said about the singularity, which is that if it is possible, it will happen, if you assume that we haven't blown ourselves up in some way. You know, because there are cynics out there who say that the human race will find some way to wipe itself out before we ever reach the point where we create a superhuman intelligence. Uh, if you assume that's not the case, and I like

41:53

to because I'm an optimist. Then so if we assume that, in fact, superhuman intelligence is possible, uh, that it's then someone's going to program a computer or a machine that has it. It's going to happen. It may take a really long time for that to happen, and it may only be superhuman intelligent as far as certain tasks go, and maybe not at other tasks that humans are really good at. But we're already seeing computers outpacing humans in lots of different areas. We don't see any reason why

42:23

that will not continue. I would say that saying telling the world don't develop a superhuman intelligent machine is useless. It's going to happen. If it's possible, it will happen. Someone will do it, and then once someone does it, lots of people will do it, or if lots of people don't do it, machines will do it. So you know, it's good for us to think about this, make the friendly ones so that we don't have a bunch of work, and in conclusion, work on this. Yeah. Yeah, we don't

42:52

want the Borg, we don't want the Matrix. We don't want a whole bunch of Benders running around saying kill all humans, as charming as they are. I'd be entertained by two or three Benders, I've got to say. Well, you say that, but I bet you've got a lot of stuff that's not bent around your house that you would like to

43:07

keep not bent. That's accurate. Okay, so yeah, I mean, we're having a lot of fun with this discussion, but it's actually a serious one that's going on with a lot of people from all sorts of different fields, lots of different disciplines. Yeah, I'd say that. I think some people are inclined not to take this topic seriously, precisely because of some of the very legitimate warnings we mentioned earlier. Be careful of things that are

43:29

too much fun to argue about. I mean, it is true that this is all speculative, and people can kind of jump into this discussion without really knowing all that much about science or technology. Look at us. Yeah, exactly. But that doesn't mean it's not worth talking about. I think these concerns are pretty legit in terms of what they've said about better safe than sorry. I totally agree. And so that kind of wraps

43:56

up this discussion about the friendly AI. And if you've guessed that we haven't finished talking about AI, you're right, because it's an enormous topic. It's multidisciplinary, there's lots of different things to talk about. There are a

44:08

lot of practical challenges that face us right now. And the more you know about those practical challenges, the more you probably side with those of us who say that this, uh, this world of the superhuman intelligent machine as we have defined it in this episode is probably a ways away. But if you have any suggestions for future topics on Forward Thinking, maybe there's something that you've always wanted to know about, some futuristic technology that you've wondered,

44:33

is that actually possible? Write us. Let us know what you want us to talk about. We'll be glad to research it and have a full episode for your listening pleasure, but you have to let us know first, so send us an email. Our address is FW Thinking at discovery dot com, or drop us a line on the social networks that we visit. That would be Google Plus, Facebook, and Twitter, and the handle is FW Thinking, and

44:56

we will talk to you again really soon. For more on this topic and the future of technology, visit forward thinking dot com. Brought to you by Toyota. Let's go places.

Transcript source: Provided by creator in RSS feed