Welcome to TechStuff, a production from iHeartRadio. Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio and love all things tech, and it is time for a classic TechStuff episode. This episode originally published on June twenty three, two thousand fourteen. It is titled Passing the Turing Test, something that we frequently associate with artificial intelligence and actual, you know, like machines thinking, or at
least appearing to think. So let's listen in to this classic episode. The Turing test is named after Alan Turing, and we've done a full episode on Alan Turing way back when, back in November. Yeah, phenomenal person, amazing thinker. Yeah, one of the, like, the grandfathers of computer science. Also a tragic life story, which we went into in detail back in
that episode. And so you might wonder why, right? What does the guy who was essentially the father of computer science, or grandfather of computer science, have to do with a story about a computer program in June two thousand fourteen, a computer program with an interesting name, Eugene Goostman, passing the Turing test? What does that all have to do with each other? Well, to answer that, we
have to ask, what is a Turing test? Well, as it turns out, way back in the nineteen fifties, he started envisioning a thought experiment. Yeah, he published a paper in a journal called Mind in nineteen fifty called Computing Machinery and Intelligence, and he sort of laid out his thought experiment there. It's interesting because what he did was he took this idea of a party game and then
adapted it for computers. Right. The party game that it's based on would have three participants, an interrogator, a man, and a woman, all situated so that they can't see one another, and the interrogator is supposed to ask questions to try to figure out which of the participants is the dude and which is the lady, exactly. And it's the dude's job to try and mislead the interrogator into believing that he, in fact, is the
lady and the other one is the man. It's the lady's job to say, hey, I want you to get this right. I'm the lady. That other guy, that's the dude. And so the interrogator has to ask questions. Now, obviously they can't see each other, because if they could see each other, then that would probably give things away. Probably. They really shouldn't be able to hear each other either, because that could
also give things away. Sure, and if you use handwritten notes, then the interrogator might be able to make judgments based on the handwriting style. So it should really be typewritten, right. So you want to remove as many easily identifiable traits from this game as possible to make it all about the questions and the answers. Now, Turing said, what if we were to take the same basic premise, but instead of having two human interviewees, replace one of
those humans with a machine. Now, if that machine can convince the interrogator that the machine itself is a human being, that would be a pretty phenomenal achievement. Or, you know, just generally, if the interrogator wasn't sure which of the two interviewees, right, which one is the human and which one is the machine. Or there could even be
a case where you don't know. It may be that you have two humans that you're interrogating, and it may be one of those things where, if you don't know for a fact that one of them is a machine, that makes it even harder, right, at
least assuming that the computer program is sophisticated enough. Now, Turing was saying that we don't have any machines right now that can do this, but I envision a time when computers will be able to do such a thing, where if you were to interrogate a computer, you would get back responses that would be convincing enough to make it difficult to determine if it were
man or machine. And so he predicted that in fifty years' time, which would be the year two thousand, computers and software would be sophisticated enough that interrogators would only be able to guess correctly seventy percent of the time, meaning that they would be fooled by the computer thirty percent of the time. Right. So, this was just kind of a thing he was coming up with, like an idea,
a prediction, not necessarily a test. Although, very much like Moore's observation became Moore's law, this became what is known as the Turing test. People talk about a Turing test as a machine capable of fooling people into thinking it's another person at least thirty percent of the time after five minutes of conversation. Very good point. Yes, it needs
to be five minutes of conversation. If you are just getting maybe two or three responses, that might not be enough for you to be able to draw a conclusion that you feel good about. If after five minutes you still are not entirely certain, then that might say that this machine, in fact, has passed the Turing test.
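The criterion the hosts describe, fooling at least thirty percent of judges after five minutes, can be sketched as a tiny scoring function. This is illustrative only; the function name is made up, and the only real numbers used are the ones from the episode (thirty-three percent of thirty judges fooled).

```python
# Illustrative sketch of the commonly cited Turing test pass criterion:
# a machine "passes" if it fools at least 30 percent of judges after a
# five-minute conversation. The verdict lists below are example data.

def passed_turing_test(verdicts, threshold=0.30):
    """verdicts: list of booleans, True if a judge was fooled
    (judged the machine to be human) after the conversation."""
    fooled_fraction = sum(verdicts) / len(verdicts)
    return fooled_fraction >= threshold

# Eugene Goostman's reported 2014 result: 10 of 30 judges fooled (33 percent).
print(passed_turing_test([True] * 10 + [False] * 20))  # True
```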
And this has been extrapolated to mean something about machine intelligence, because Turing himself tied the idea of how we perceive a machine's intelligence directly to artificial intelligence and even how we perceive human intelligence. Because here's the thing, Turing's idea is a little cheeky, and I love the fact that it's so cheeky. So Turing kind of said, would you say such a machine is intelligent? If it appears to be intelligent, is it fair to say that it
is intelligent? And Turing said, why not? Because I'm only able to tell that I am intelligent, right, because of my experience. I'm only able to have my own personal experience. I can't experience what someone else's life is like. All right, sitting across from each other, Jonathan and I can only assume that the other one is intelligent, right? And it's because the other person is displaying traits that we associate with intelligence. They seem to be able to take in information,
respond to it, make decisions. And based on the fact that we ourselves also do that thing, we go ahead and say, all right, well, they clearly have the same features that I have, which includes intelligence. Now, he says, why would we not extend that same courtesy to a machine if it also appeared to display those same features? He says it doesn't matter if the computer is quote unquote thinking.
If it can fool you. Yeah, if it's able to simulate it well enough, you might as well say it's intelligent, because simulate is probably a kinder phrase for that than fool. Yes. Yeah, well, I know it is fooling, essentially, I mean, because ultimately you're talking about a computer programmer who's making this happen. So nowadays we think of this as the Turing test. Can a machine, thirty percent of the time or more, fool someone after five minutes of conversation into thinking it's a human? Now,
this is really hard to do. This is non trivial. I mean, it sounds almost simple. Yeah, the concept is simple; the execution, incredibly difficult. Because here's the thing, human language is varied. We have unstructured, unpredictable ways to say things. Like, if I were to tell you that it's a hundred degrees outside and extremely humid, and you go out there, then I'm sure all of our listeners would have slightly different ways to express their thoughts
on the conditions outside. Some of them would probably contain colorful metaphors. Mine very likely would contain colorful metaphors, particularly if I had to be outside for any length of time. But that's the point. We would all have different ways of saying this. So how do you make a computer program able to interpret all the myriad ways we can all express the same thought, let alone
any thought? All right, this is what's referred to as natural language recognition, and it's a really huge problem in artificial intelligence and a lot of other speech related computer programming. Exactly. Yeah, this is where a computer program has to be able to parse the language so it recognizes things like, this word is a noun, this word is a verb, this word alters this other word. And not only does it need to be able to
recognize it, it needs to be able to respond in kind, right, to respond appropriately in some way or another. Sure. Yeah, you could have a computer program that's literally making up sentences, or, you know, the approximation of a sentence, randomly, where it's just pulling strings of words and placing them in a sequence and then presenting them. But that wouldn't be convincing at all. If I were to say, hello, how are you today, and Lauren was to say, blue
panther pickup jump down street, I'd be like, what? And even that was closer to being a sentence than some of the random stuff that you would see if it was just truly, completely random. So it also has to be able to endure a five minute long conversation, like we said, in order to pass the Turing test. So you can't have too much repetition, or that gives it away. Absolutely.
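That random word salad approach can be sketched in a few lines. This is purely illustrative, reusing the hosts' own example words; it shows why stringing words together in a random sequence is immediately unconvincing.

```python
import random

# The naive "pull strings of words and place them in a sequence" approach
# the hosts describe. The word list is taken from their example.
WORDS = ["blue", "panther", "pickup", "jump", "down", "street", "hello", "today"]

def word_salad(n=6):
    """Return n randomly chosen words joined into a fake 'sentence'."""
    return " ".join(random.choice(WORDS) for _ in range(n))

print(word_salad())  # e.g. gibberish like "street blue today panther jump down"
```

No grammar, no state, no memory of the conversation, which is exactly why a judge would rule it out within one exchange.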
If you've ever been playing a video game and all of the NPCs say the same thing over and over and over again. Yeah, if all the chatbot says is, hey, listen, then you're clearly playing Zelda, and you're not actually having a decent conversation. That being said, there is someone I know who plays a fairy at the Georgia Renaissance Festival, and hey, listen is heavily represented in the repertoire. It's pretty amazing. No, it's pretty awesome. But at any rate, yeah, so these
are big problems. You have to build a database of words; you have to be able to figure out what kind of syntax you are going for. It's a wide open, huge problem, and solving this problem can be really beneficial in lots of ways. We'll talk a little bit about that later. It's beyond just making a program that seems human, right. That's one way of looking at it. But there are a lot of other benefits that come along with it, which we'll chat about towards the end
of the show. But for right now, let's go into the story behind Eugene Goostman and whether or not it was actually the first chatbot to pass the Turing test. Yeah. So, first of all, we've got three programmers in this story: Vladimir Veselov, Eugene Demchenko, and Sergey Ulasen. As you may guess from their names, they all hail from Russia and Ukraine, although
not all of them still live there. But starting in two thousand one, they got to work on this, right? Yeah, they were trying to design a computer program that would pose specifically as a thirteen year old boy from Odessa in Ukraine. Yeah, and that meant that they had specific parameters that they could work within. It automatically helped reduce some of that unpredictability and that lack of restriction that you would have if you were to just say,
this is a fluent adult speaker of a given language. Yeah, giving him these, you know, setting up the expectation for the judges that this is a non native English speaker, it's a kid, essentially. You know, they'll expect him to have limited knowledge of the world and different subject areas, and a limited understanding of English vocabulary and grammar and all that kind of stuff.
So you're already managing expectations. That's going to come into play when we talk about some of the criticisms about this. Although I do think it's a very clever way, and a lot of previous chatbots have had similar, yes, kinds of approaches. Yeah, because like we said, if you were to take a quote unquote pure approach to this, it's really, really challenging. So yeah, by limiting this, the judges have an idea of, well, this could be a thirteen year old boy, or it
could be a computer program. It means that the computer program doesn't have to be as sophisticated as one that would be completely fluent and have, you know, an adult's experiences and ability to communicate. So that was step one. And Eugene Goostman took part in a competition that had five total chatbots. It was one of five, and the competition took place on the sixtieth anniversary of Turing's death, and the program managed to fool thirty three
percent of the judges into thinking it was actually a person. And there were thirty judges, from what I understand. So that was where you got all the headlines of chatbot beats Turing test, computer beats Turing test, which already is not accurate. There were some slight hiccups in a little bit of the news reporting that we'll get
into later in that part of the story. But, you know, the key takeaways I think here are that this was a competition that was celebrating Turing. Awesome. And that there were thirty judges, some of whom were celebrities, yes, including an actor who had appeared on Red Dwarf, right. Yes. And then a couple of years before that, because you said they started working on this in two thousand one, this was not the first time that Goostman
had entered competition. Two years previously, that same software had convinced twenty-nine percent of judges at a similar competition that was held at Bletchley Park. Now, Bletchley Park, that's where Turing helped crack the Enigma machine, the encoding device that the German military was using during World War Two. So this was a big celebration. It was at the centennial celebrating his birth, and it ended up falling just
short of passing the Turing test. The organizer of the more recent event, the one in which Goostman ran away with quote unquote beating the Turing test, that organizer was Kevin Warwick. That name may sound familiar to some of our listeners if you've ever heard us talk about cyborgs. He's the guy who had an RFID chip surgically implanted into him. His wife also did at one point. Yes, they could communicate with each other through them. Some unflattering news media sometimes refers
to him as Captain Cyborg. Yeah. There are some critics who say that he courts publicity in a manner that is unbecoming of a scientist. Yes, really of anybody. Yes, yes, those are the critics who say that, by the way. I just want to
make that clear. At any rate, he said that this was the first time a chatbot had passed the Turing test at an event where the conversation was open ended, meaning that they had not previously decided upon a specific topic or line of questioning. The judges were allowed to say whatever they wanted to the chatbot, and it had to respond, which obviously makes it harder, because you have
to have a much wider breadth of potential responses. Yeah. Yeah, because again, if you were to say, all right, this chatbot is just going to talk about, I don't know, sporting events from last year, well, then you can prepare pretty well for that. Yeah, exactly. So again, it's one of those things where the unrestricted nature adds in a degree of difficulty. It's time for us to take a quick break, but we'll
be right back. So why would you need to make the qualification that this is an open ended approach and this is the first chatbot to manage it? That would be because, despite what you may have heard, Eugene Goostman was not the first program to beat the Turing test, not by a long shot. So work on these sort
of chatbots, these kinds of artificial conversationalists, that's real recent, right? I mean, they just started doing that, like, what, maybe three or four years ago, or two thousand one at the earliest, I mean, that's when they started with Goostman. Yeah. No, in the nineteen sixties and seventies. Say what? Back in the mid nineteen sixties there was Eliza, which was written by Joseph Weizenbaum. If you're pronouncing it in the correct German, that's correct. Excellent, I'm finally learning.
I'm sure he pronounced it Wisenbaum. But yeah, no, it would be Weizenbaum. At any rate, this was a program that would respond to human conversation in what ideally would be a relevant way. Yes, it was obviously an early attempt. It was not meant to be a program that takes on the Turing test. It was really, again, kind of like a thought experiment, the idea of what does it take to create a piece of software that can react to questions
and make it make sense. At that point, it was more a can we do this than a let's do this for real. Yeah, and these are the sort of foundations that you have to lay in order for other things like the Eugene Goostman program to be successful. So he created a language analyzer. Now, this specifically would look at words that users would put in and then compare them against a database of words that were stored in the computer's memory. And he also created scripts. Now, in
this case, the scripts were sets of rules. They're kind of like, you know, like a protocol or an algorithm, in a way. These rules dictated how Eliza would respond to messages, in order to cut down on that huge, massive number of variables we were talking about, that whole unrestricted, unpredictable thing. And so they would have different,
kind of, like, overlays. Think of them as overlays that would kind of guide Eliza's responses. And the most well known one was called Doctor, which put Eliza in the role of a Rogerian psychiatrist. This is the person who responds to everything with a question. All right. It's that passive interview style where, you know, you repeat back. You know, if I go, oh, man, I'm really sad about my cat. Oh, tell me, what is it about
your cat that makes you sad? And then you can say things like, you know. And that's one of those things where, as a conversation starts to wind down, you then have another line of questions. So tell me more about your mother, like the whole tell me about your mother thing. That's really coming back to this
kind of model of psychiatrist. But yeah, if you've ever heard the joke about them responding to anything with another question, just taking the last word and turning it into a question. What am I paying you for? Why do you think you're paying for that? That kind of thing. So that's how Eliza operated. And in fact, you can find examples of Eliza transcripts, and
even the actual, well, there are ports, you could say, where people have essentially created their own version of Eliza just using the original Eliza as a guide. You can find tons of them on the web, and you can attempt to have a conversation with them. It's not terribly compelling, but it's kind of fun. Usually within maybe four or five exchanges, you've already run into something where you're like, well, this can't be a person, or if it's a person, it's the weirdest person I've ever
conversed with. But again, it wasn't really an earnest attempt to create something that would pass the quote unquote Turing test, right, which I don't think was being referred to as such at that point yet. Yeah, people kind of knew about Turing's prediction, but it
wasn't so much called a Turing test. Another chatbot that premiered in the early nineteen seventies went even further and actually was an attempt to try and pass the Turing test with a very specific approach, kind of like, you know, Eugene Goostman was a very specific approach to narrow down those parameters. In this case, this one was called Parry, P-A-R-R-Y. Right, and Kenneth Colby created it to emulate a patient who has paranoid schizophrenia. Yeah, someone who has, you know,
sort of a persecution complex. They imagine that there are people or other entities that are out to get them, and in this specific case, he kind of really embraced this approach. It reminds me of people who create, like, a Dungeons and Dragons character, but then give their character an entire backstory. Yeah. Yeah, this Parry persona was an entire persona. It was a twenty eight year old man with a job as a post office clerk who was single, had no brothers
and sisters, and rarely saw his parents. He had specific hobbies. He liked to go to the movies and horse racing. Yeah, he liked to bet on the horses. And he had placed bets with bookies in the past, right, and he reasoned that bookies have an association with the criminal underworld, and that therefore the mafia knew about him and
were out to get him. Now, all of that might sound very ridiculous to you. If you've never had any kind of interaction with someone who suffers from paranoid schizophrenia, that can seem like, well, that seems cartoonish. But no, this kind of thinking is not uncommon, you know, whether it's a criminal organization or the government or some other, even unnamed, entity. Oh, absolutely, it's very much realistic in terms of that kind of diagnosis.
And if I was speaking about it with humor in my voice a moment ago, it's only because I am absolutely tickled that the programmers of this program built it all in. Yeah, I mean, it's actually pretty entertaining that they went so far as to make this whole backstory to explain, because that's what gives it the believability. And then they
ended up testing it in conversation with a human. Parry would gradually start to introduce his thoughts, quote unquote thoughts, about being persecuted, and would respond sensitively to anything said about his appearance, family, or religious beliefs. I've actually seen lots of transcripts of conversations with Parry, and sure enough, it's one of those things where, you know, you might have a few exchanges, and then Parry ends up saying something that seems really odd, but not so odd as
to seem artificial. It just seems like a non sequitur, you know, something like, well, that's what they want you to think. And if you were to say, who are they? The mafia. Who did you think I was talking about? Like, the mafia is after you? Of course they're after me. They know who I am. That kind of stuff. And it's disturbing. Like, you know, when you know what it is, it's kind of amusing, but if you're in the middle of a conversation, you're going, oh, this poor person. Yeah.
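The scripted, rule-based behavior described for Eliza and Parry can be roughly sketched like this. To be clear, this is an illustrative toy, not either program's actual code: the patterns and canned responses are invented for the example, with one Doctor-style reflection rule in the spirit of Eliza and one sensitive-topic rule in the spirit of Parry.

```python
import random
import re

# A "script" here is just keyword patterns paired with response templates,
# tried in order, with a fallback question when nothing matches.
SCRIPT = [
    # Doctor-style (Eliza): turn the user's statement back into a question.
    (r"my (.*) makes me sad", ["Tell me, what is it about your {0} that makes you sad?"]),
    (r"i am (.*)", ["How long have you been {0}?", "Why do you say you are {0}?"]),
    # Parry-style: react defensively when a sensitive topic comes up.
    (r".*mafia.*", ["Of course they're after me. They know who I am."]),
]
FALLBACKS = ["Please go on.", "Tell me more about your mother."]

def respond(message):
    """Return a scripted response to one line of user input."""
    text = message.lower().strip(".!?")
    for pattern, templates in SCRIPT:
        match = re.fullmatch(pattern, text)
        if match:
            return random.choice(templates).format(*match.groups())
    return random.choice(FALLBACKS)

print(respond("My cat makes me sad."))
# -> "Tell me, what is it about your cat that makes you sad?"
print(respond("Is the mafia after you?"))
# -> "Of course they're after me. They know who I am."
```

Everything off script falls through to a stock question, which is exactly the repetition that gives these programs away after a few exchanges.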
And so, in order to test it, Colby did a couple of different things. He did one test where eight psychiatrists interviewed both Parry and a human patient via teletypewriter. So in both cases the psychiatrists could not see who they were interviewing. This is going back to the kind of the original Turing test idea, or at least Turing's proposed experiment. And in this case, only two of the eight were able to identify that one of the
interviewees was human and the other was a machine. In a second test, Colby presented a group of a hundred psychiatrists, almost said psychics there, with transcripts of interviews. Yeah, he had pairs of them, exactly. I foresee... no. But they gave these psychiatrists transcripts of interviews between an interviewer and Parry, and an interviewer and a human patient, and forty out of the hundred responded. I don't know if the other sixty just never got it or if they didn't.
Response rates are variable, right. So of the forty who responded, nineteen of them guessed incorrectly. So that's almost fifty percent, you know, right up there with a pretty impressive amount. Now, again, we have to look at the fact that Parry is operating under a very restricted set of rules. We're talking about a paranoid schizophrenic, someone who we would assume would occasionally give non normative answers,
exactly, to conversational pieces. And again, it's a limited time that you're having with this person, or this entity in this case, this program. And because the psychiatrists had a specific expectation of the type of interactions they were going to see, that could have affected their answers, right. So, ideally, in the perfect situation, you would have this interview happening where you have no
expectation as to what the answers should be. In other words, you don't know ahead of time that the interviewee has any kind of restrictions upon them, so you would be interviewing anyone like any average person. But that's obviously not what we're talking about here, nor was it for Eugene Goostman. So I've also seen, by the way, transcripts of people who set up Eliza and Parry to
talk to each other. So you have Eliza acting as the Rogerian psychiatrist and Parry the paranoid schizophrenic, having bizarre conversations, and they usually don't last very long, because Parry gets upset. And obviously by gets upset, I just mean that Parry ends up essentially shutting down the conversation, because Eliza just wants to ask questions and Parry gets suspicious of people who are asking questions. And by, again, gets suspicious, I'm
saying it is following specific rules that make it feel like this computer program is getting suspicious. But they are entertaining. If you ever do a search online, just look for Eliza Parry transcripts. There are a few of them, and they're all pretty entertaining. But so, since that time, I mean, obviously lots and lots of chatbots have been
created for multiple reasons. Oh yeah, well, you know, some of them are trying to test the Turing test, and others are trying to fool you into giving out your credit card information or clicking on a link that leads to malicious software. Yeah, anyone who's been on any kind of chat program, specifically like AIM, has probably encountered this at least once or twice, where they're
getting an unsolicited message from someone or something. Yeah, if you type back a couple of times, you realize, oh, this is not actually a person; this is an attempt to either get information from me or have me click on a link. Yeah, that's a thing. But there are some examples of, I guess saying legitimate is weird, but there are some examples of more
scholarly attempts. Yeah, like PC Therapist. That one was by Joseph Weintraub, and it fooled fifty percent of its judges into thinking that it was human. Of ten judges, of course, so, like, if I'm doing my math correctly, that's five. That I believe. Unless we're talking about quantum judges, and these judges were both right and wrong at the same time. And it was a whimsical program, I guess you could say. Yeah, I think it's fair to say its answers could be
pretty smart ass. Also, I read some of these transcripts and it actually surprised me that enough judges thought that it was a person. Maybe they thought it was a person who was purposefully attempting to fool them into thinking it was a computer, huh, you know, because it would say things like I compute, therefore I am that kind
of stuff where it was specifically, yeah. So you're looking at this and you're thinking, all right, well, maybe this is kind of going back to that original Turing test party game, where the effort is for the person who's being interviewed to try to throw everybody off, and that's perfectly within the rules, unless you state upfront, no, just be honest in
your answers. There's nothing in here, by the way, that says that the interviewee has to tell the truth necessarily, unless you just state that as a parameter at the beginning. So, in other words, you could be like, I'm totally the computer, and you're the human being interviewed. I don't know if that's a fair way of saying that the device won or lost, but it is a possibility. Then we have one from two thousand eleven. Now, this one
is a really pretty impressive one. And this was Cleverbot, which was made by a fellow named Rollo Carpenter, and it fooled fifty nine point three percent of a live audience at an event in India with more than a thousand people. Yeah, the way this worked was that the audience watched as interviewers interacted via text with either Cleverbot or a human in the course of a four minute interview. So it's a little shorter than what Turing had said, but not by a whole lot.
Four minutes is still a good amount of time. Sure, that is twenty percent less, that's true, that's true. So keep that in mind. But at any rate, it was, you know, a pretty interesting experience. And also, from what I read, they misidentified, they thought that the human was a computer sixty percent of the time, because they didn't necessarily just say that it was a computer or it wasn't. So now we see that there are a few examples of chatbots quote unquote passing the Turing test.
So what does that mean? Does it mean that the machines are actually thinking? No. I mean, it's not to say that computers don't have a certain amount of machine intelligence, but there's absolutely a distinction between that and what we consider to be human intelligence. That's true. The programmers themselves have said that this doesn't mean a machine is able to think. They're just able to interpret commands and then follow a set of rules to
make a response, which is still pretty cool. And it certainly doesn't mean that the Turing test is worthless as an exercise. No, it is, in fact, improving our ability to create programs that can understand, or at least respond to, natural language. Natural language recognition is one of those big things where, if you're really able to crack it, then you can have some amazing opportunities open up, and we've seen this recently with
stuff like Siri. Oh, absolutely. Being able to speak to your computer rather than having to, I mean, even if you could speak to your computer through a keyboard and have it understand what you're saying. I mean, it's the reason why Google spends so much time and money on its search algorithms, trying to figure out what you really mean when you search for a certain phrase. Because traditionally, you know, before we
really got into the natural language recognition era, it meant that in order to work with a computer, you had to work with a computer on the computer's terms. You had to learn the commands, you had to learn the way to navigate a computer system in order for it
to do what you wanted it to do. Once you get to a point where natural language recognition software is robust enough, the computer is working on your terms. You can put in however you're thinking, like, whatever mental exercise you've gone through to ask this computer to do something, whatever you do to kind of express a thought as a command to this computer, the computer can then interpret it and respond. And it's not just for serving
you back whatever information you happen to be looking for. I mean, we're talking about being able to just look at a computer and say, you know, I really want a graph that looks blue and has these percentages in it and is about this thing, and it just does it. Yeah, like, I want to see what the population distribution of Atlanta is in a bar chart or something, and then it could go out, find that information, put it into a bar chart, and yeah,
that's pretty phenomenal stuff. We have a little bit more to say about passing the Turing test, but before we get to that, let's take another quick break. We see other examples of machine intelligence everywhere, things like pattern recognition, probabilistic predictions, for example, Pandora. You know the Music Genome project.
It's yeah, that's that's pattern recognition. Yeah, it's looking for elements of songs that you say you like and then looking for other stuff that's not in the specific category you mentioned or the specific examples you mentioned, and there's something else you probably will like because you like these other things that also have this stuff in it. Uh,
you know, sometimes that's less functional than other times. It makes me think of Patton Oswald has a great routine about TiVo that does the same sort of thing where he says, you know, TiVo is great. I mean, I like, I love Westerns, and I'll have it set to TiVo a Western for me, and I come back, and then they'll be all these other Westerns that will be suggested.
I didn't even know about. Thank you, TiVo. But then sometimes TiVo gets it wrong, and I come back and everything has got horses in it, because Westerns have horses in them. So I've got My Little Pony and cartoons with horses and unicorns and things, and I have to say, no, TiVo, that's a bad TiVo. But TiVo says, but you said you liked horses. Same sort of thing. Like, when you get more sophisticated, the computer program starts to
anticipate things and makes these probabilistic models, these models where there are certain percentages associated with various responses, and it goes with whichever one seems to be the
most prevalent, assuming it meets a threshold. If this sounds familiar to you, it's because that's how IBM's Watson worked, right, which is a really good example of natural language recognition. Absolutely, because not only was it able to recognize natural language, it had to interpret things like wordplay the way Jeopardy does. This is the machine that went up on Jeopardy and
beat the returning champions, or former champions. And you know, if you've ever played Jeopardy or watched Jeopardy, you know that there are categories that depend on things like puns or homonyms or other forms of wordplay. So it has to parse all of that, and that's even more complicated than just taking a simple sentence and figuring out, all right, what are the potential responses to whatever this
phrase is. So, great example. You know, it would end up coming up with a potential answer. It would assign a percentage of how quote unquote sure it was that that was the right answer, and if the percentage was higher than its threshold, it would buzz in and give that as its guess. Sometimes it was wrong, but it was right a lot of the time, so that's kind
of cool. So, getting back to Eugene Gene the Main Machine, as I called him in my notes at one point. And of course I'm anthropomorphizing when I say him; it's an it. Well, it has a dude name. Yeah, it has a dude name and a dude persona, but ultimately it's an it, isn't it? Would you say that perhaps some of the reporting around this was maybe a little misleading, or at least hyped? Well, you know, okay, the entire Eugene Goostman
chatbot sounds really cool. I haven't met it personally. No, I haven't either, although you can. There is an Internet version, and I'm not sure that it's the same version that's being used in competition, because I've seen some transcripts from the Internet version, and they don't seem good at all. They seem bad, right. I guess you'd have to talk to some actual thirteen-year-old boys from Ukraine. Actually, yeah,
that's part of it, you know. There have certainly been some questions among natural language AI enthusiasts online about whether we're really just lowering our expectations for human communication. Which, yeah, that's a totally different way of looking at it, and a depressing one, to say that, oh, well, if you come from this place and if you are of this age, then I expect you to only be able to communicate at this level, right, which is depressing.
It certainly is. Well, but you know, it's a valid point, I think. I think it's a good thing to be thinking about in this kind of situation. You know, beyond that, though, and I certainly don't want to downplay the apparent achievements of its programmers, because I haven't programmed any capable chatbots lately; it's been forever since I did. But there are a few things that are just a little bit shady about the news.
First off, the original press release, which came out of the University of Reading, I believe, stated that a quote supercomputer had achieved this feat. And perhaps, charitably, it was a mistake or misunderstanding on the part of the writer of the press release, but some skeptics have suggested that it was in fact a purposeful publicity play, one that in fact worked, because a whole lot of news headlines around the interwebs repeated the error very excitedly. Yes,
because it was not a supercomputer. It was a computer running a piece of software. It's a program. Yes, the program did the work. I mean, the computer just provided the horsepower. Right, it's the software that did all the actual work, and it wasn't on a supercomputer. Little-known fact: supercomputers have better things
to do than run chatbot software, generally. Yeah. Yeah, we're talking about things like figuring out global weather and climate change patterns and things like that, you know, or the way that money works. Chatbots are low on their priority list. It's like number seven at least. And also, the press release in question was largely a quotation from Dr. Kevin Warwick. Kevin Warwick, of course,
being the fellow who organized this entire competition, who's an engineer and a futurist, and also the instigator and/or enjoyer of a certain amount of hype and debate about future technologies. Yeah, he is, obviously, you can tell. This is the guy who elected to have surgery performed on him so he could have that RFID chip and call himself a cyborg. This is someone who not only embraces these ideas of futurism, but is actively trying to promote them and get us to them. Now, we're not even saying that that's a bad thing. What we are saying is that that may give him somewhat of a bias when it comes to proclaiming a piece of computer software an amazing achievement that beat the Turing test. Right, sure. I mean, he
admits, basically, to being a provocateur. He says that that's really his job, you know, to get people excited about tech and engineering and the future. And we get that. We agree that's our job too. We think it's rad. We think that perhaps... I don't want to put words into your mouth, Lauren.
I think perhaps that there's a different way of going about it, where you can still be excited, but you can be a little more grounded in the way you present things. Because I also think the achievement of creating a chatbot that could be convincing is a fantastic achievement. I mean, it's something that, even under any number of qualifications, is incredibly challenging to do, no matter how
you frame it. I do think, however, that if you seem to over-inflate the achievement, you run the danger of making people feel jaded about it later, which I think... Crying computer wolf. Exactly. Yeah, exactly. So it's one of those things where, you know, you have to take the context into account, right? Don't downplay the achievement, but don't sit there and say, like, aha, now we have intelligent computers everywhere. That's not true either.
I saw there's a great Wired article that specifically went into kind of debunking the whole beating-the-Turing-test thing, and again kind of saying the same thing we're saying: take the context into account. And as part of that article, they ended up asking a cognitive scientist named Gary Marcus of NYU about this, and Marcus proposed a new version of the Turing test, because he says the old version is not really a
measurement of machine intelligence. It does kind of illustrate ways of creating natural language recognition and clever ways to fool the human side, huh. And it was very valid historically at the time, because, you know, textual communication was new and exciting, and it pushed the field forward, it really did. But now we've gotten to a point where fooling the person on the other side of a keyboard is not necessarily the
goal that we should be looking at. He proposes that the next version of the Turing test should be that a piece of software, any kind of program that wants to beat it, has to first quote unquote watch a movie, television show, YouTube video, something, some kind of video media, and then be able to
respond to questions about it. So, sort of like, here, let me show you this ten-minute video on car safety, and then asking questions specifically about the video: what happened after they fastened the seat belt, that kind of thing. And if the computer program is able to answer it, then that would be a much more convincing Turing test than just kind of spewing out a script, which is, you know, adding another layer of difficulty on top
of an already difficult task. But that's the only way you can go forward. Otherwise we're just going to see increasingly sophisticated chatbots. Yeah, and that is actually a very difficult and interesting problem. Wasn't it just recently that there was a computer that we taught, I mean, not us personally, but that humanity, some researchers, taught to identify pictures
of cats? That was the AI program that essentially went through thousands and thousands of, I think it was, images and videos, and then became able to identify cats. It essentially defined what a cat was, because no one taught it. Right. It learned what cats are based upon their appearance, and can look at pictures of cats and say, that is totally a cat, essentially saying that thing that is in that video is the
same as this other thing that's in this picture, that's the same as this other thing that's in this totally different video. Which sounds trivial and hilarious, and it kind of is hilarious. Also easy, because, I mean, come on, everything on the internet has cats in it. Yeah, that is kind of a gimme, isn't it? But
but still, no, it is cool. I mean, just that level of image recognition, being able to take an object and look at it from a different angle than you were taught, or one that's a different color, a different size, yeah, all these things. All of these things are easy for us, hard for computers,
so seeing something make that breakthrough is really exciting. Anyway, we thought we would take that story and kind of break it down for you guys, explain how it's still cool, but maybe not as cool as the way some of the headlines are saying. Right, and also to say, hey, internet journalists, step up your game. Yeah, I understand that you want people to read your stuff. Oh yeah. And you're under deadline pressure, and that's terrible. But
let's look hard. Let's represent reality, shall we? Don't just spit out press releases the way that you found them. Yeah, and it would be criminal to neglect to mention that Noel, our producer, reminded me that obviously this is a very important field of study, because we want to be able to tell the difference between computers and humans when the future of Blade Runner becomes our reality, and you're chasing down a replicant and you have to determine if it's actually a replicant or a
human being. And that wraps up this classic episode about passing the Turing Test. Honestly, the whole Turing test thing, that general concept, has morphed over the decades, and it's more like shorthand, right? Passing the Turing test isn't so much about passing a specific hypothetical test. It's more about this idea of creating a machine that appears to be indistinguishable from a human as far as, you know, processing and communicating information, so that you feel like the
machine is truly intelligent. I think the Turing test is one of those concepts that changes over time, and it's more like you start to find out what it's not rather than what it is. It's an interesting thing, very much like artificial intelligence itself, or even machine consciousness. These are all concepts that are kind of wibbly wobbly, as Doctor Who might say. Well, if you have suggestions for topics I should cover in future episodes of tech Stuff,
reach out to me. The best way to do that is over on Twitter. The handle for the show is TechStuff HSW, and I'll talk to you again really soon. Tech Stuff is an I Heart Radio production. For more podcasts from I Heart Radio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen to your favorite shows.