Passing the Turing Test

Speaker 1

00:04

Get in touch with technology with TechStuff from HowStuffWorks.com. Hey, everyone, and welcome to TechStuff. I'm Jonathan Strickland, and, uh, you know, something hit the news recently. You may have seen it; in fact, you may have seen it in multiple places: a computer program passing the Turing test, possibly the first such program ever to pass the Turing test. And we had planned on doing an episode on this, and then on top of that, one of our listeners, Nick, on Twitter said,

00:40

just saw this headline on Google News: a computer just passed the Turing test in landmark trial. So we knew that the timing was perfect. Thank you, Nick, for writing in. We are really ready to talk about this, but in order to do that, it's important that we, you know, kind of lay some groundwork. So first of all, the Turing test is named after Alan Turing, and we've done a full episode on Alan Turing way back when, back in November of... Yeah. Phenomenal person, amazing thinker. Yeah, one

01:12

of the, like, the grandfather of computer science. Also a tragic life story, which we went into in detail back in that episode. And so you might wonder, right, what does that have to do with this? The guy who was essentially the father of computer science, or grandfather of computer science, what does he have to do with the story about a computer program in June two thousand fourteen, a computer program with an interesting name, Eugene Goostman, passing the Turing test?

01:41

What does that all have to do with each other? Well, to answer that, we have to ask: what is a Turing test? Well, as it turns out, way back in the nineteen fifties, he started envisioning, uh, a thought experiment. Yeah, he published a paper in a journal called Mind in nineteen fifty called Computing Machinery and Intelligence, and he sort of laid out his thought experiment there. And it's interesting because what he did was he took this idea of

02:08

a party game and then adapted it for computers. Right. The party game that it's based on would have three participants, an interrogator, a man, and a woman, um, all situated so that they can't see one another, and the interrogator is supposed to ask questions to try to figure out which of the participants is the dude and which is the lady. Exactly. And it's the dude's job to try and mislead the interrogator to believe that he, in fact, is the lady and the other one is the man.

02:39

It's the lady's job to say, hey, I want you to get this right. I'm the lady. That other guy, that's the dude. And so the interrogator has to ask questions. Now, obviously they can't see each other, because if they could see each other, then that would probably give things away. Probably. They really shouldn't be able to hear each other, because that could also give things away. Sure, and if you use handwritten notes, then the interrogator might be able

03:03

to make judgments based on the handwriting style. So it should really be typewritten. Right, so you want to remove as many easily identifiable traits from this game as possible to make it all about the questions and the answers. Now, Turing said, what if we were to take the same basic premise, but instead of having

03:23

two human interviewees, replace one of those humans with a machine? Now, if that machine can convince the interrogator that the machine itself is a human being, it would be a pretty phenomenal achievement. Or, you know, just generally, if the interrogator wasn't sure which of the two interrogatees, right, which one is the human, which one is the machine. Or there could even be a case where you don't know.

03:50

It may be that you have two humans that you're interrogating, and it may be one of those things where, you know, if you don't know for a fact that one of them is the machine, that makes it even harder. Right, at least assuming that the computer program

04:02

is sophisticated enough. Now, Turing was saying that we don't have any machines right now that can do this, but I envision a time when computers will be able to do such a thing, where if you were to interrogate a computer, you would get back responses that would be, uh, convincing enough to make it difficult to determine

04:21

if it were man or machine. And so he predicted that in fifty years' time, which would be the year two thousand, computers and software would be sophisticated enough that interrogators would only be able to guess correctly seventy percent of the time, meaning they would be fooled by the computer thirty percent of the time. Right. So, uh, this was just kind of a thing he was coming up with, like an idea, a prediction, not necessarily a test, although very much like Moore's observation became Moore's law, this became

04:51

what is known as the Turing test. People talk about passing the Turing test as a machine being capable of fooling people into thinking it's another person at least thirty percent of the time after five minutes of conversation. Very good point. Yes, it needs to be five minutes of conversation. If you are just getting maybe two or three responses, that might not be enough for you to be able to draw

05:12

a conclusion that you feel good about. If after five minutes you still are not entirely, entirely certain, then that might say that this machine, in fact, has passed the Turing test. And this has been extrapolated to mean something about machine intelligence, because Turing himself tied the idea of how we perceive a machine's intelligence directly to artificial intelligence and even to how we perceive human intelligence. Because, here's the thing, Turing's idea is a little cheeky, and I

05:47

love the fact that it's so cheeky. So Turing kind of said, would you say such a machine is intelligent? If it appears to be intelligent, is it fair to say that it is intelligent? Turing says, why not? Because I'm only able to tell that I am intelligent, right, because of my experience. I'm only able to have my own personal experience. I

06:08

can't experience what someone else's life is like. Right. Sitting across from each other, Jonathan and I can only assume that the other one is intelligent. Right, and it's because the other person is displaying traits that we associate with intelligence. They seem to be able to take in information, respond to it, make decisions. And based on the fact that we ourselves also do that thing, we go ahead and say, all right, well, they clearly have

06:33

the same features that I have, which includes intelligence. Uh, now, he says, why would we not extend that same courtesy to a machine if it also appeared to display those same, uh, features? He says it doesn't matter if the computer is quote unquote thinking, if it can fool you. Yeah, if it's able to simulate it well enough, you might as well say it's intelligent. Because simulate is probably

06:58

a kinder phrase for that than fool. Yeah. Yeah, well, I know, it is fooling, essentially, I mean, because ultimately you're talking about a computer programmer who's making this happen. So nowadays we think of this as the Turing test: can a machine, thirty percent of the time or more, fool someone after five minutes of conversation into thinking it's a human? Now, this is really hard to do. This is, yeah, this is non-trivial. I mean, it sounds almost simple. Yeah,

07:28

the concept is simple, the execution incredibly difficult. Because here's the thing: human language is varied. We have unstructured, unpredictable ways to say things. Like, if I were to tell you that it's a hundred degrees outside and it is humid, like the humidity's at ..., and you go out there, then I'm sure all of our listeners would have slightly different ways to express their thoughts on the conditions outside.

08:00

Some of them would probably contain colorful metaphors. Probably mine. Very likely mine would contain colorful metaphors, particularly if I had to be outside for any length of time. But that's the point. We would all have different ways of saying this. So how do you make a computer program able to interpret the myriad ways we can all express the same thought, let alone any thought? Right,

08:27

this is what's referred to as natural language recognition, and it's a really huge problem in artificial intelligence and a lot of other speech-related computer programming. Exactly. Yeah, this is where a computer program has to be able to parse the language so it recognizes things like: this word is a noun, this word is a verb, this word alters this other word. Uh, and not only does it need to be able to recognize it, it

08:56

needs to be able to respond in kind. Right, so appropriately, in some way or another. Sure. Yeah, you could have a computer program that's literally making up sentences, or, you know, the approximation of a sentence, randomly, where it's just pulling strings of words and placing them in a sequence and then presenting them. But that wouldn't be convincing at all. If I were to say, hello, how are you today, and Lauren was to say blue

09:21

panther pickup jump down street, I'd be like, what? And even that was closer to being a sentence than some of the random stuff that you would see if it was just truly, completely random. So it also has to be able to, uh, endure a five-minute-long conversation, like we said, in order to pass the Turing test. So you can't have too much repetition, or that gives it away. Absolutely. If you've ever been playing a video game and all of the NPCs say the same thing over

09:48

and over and over again. Yeah, if all the chatbot says is "Hey, listen," then you're clearly playing Zelda and you're not actually having a decent conversation. That being said, there is someone I know who plays a fairy at the Georgia Renaissance Festival, and "Hey, listen" is heavily represented in the repertoire. Pretty amazing. No, it's pretty awesome. But at any rate, yeah, so these are big problems. You have to build a database of words, you have to be able to figure out what kind

10:19

of syntax you are going for. It's a wide open, huge problem, and solving this problem can be really beneficial in lots of ways. We'll talk a little bit about that later. It's beyond just making a program that seems human. Right, that's one way of looking at it. But there are a lot of other benefits that come along with it, which we'll chat about towards the end of the show. But for right now, let's go into the story behind Eugene Goostman and whether or not it

10:45

was actually the first chatbot to pass the Turing test. Yeah. So, first of all, we've got three programmers in this story: Vladimir Veselov, Eugene Demchenko, and Sergey Ulasen. As you may guess from their names, they all hail from Russia and the Ukraine as well, although not all

11:05

of them still live there. But starting in two thousand one, they got to work on this. Right. Yeah, they were trying to design a computer program that would pose specifically as a thirteen-year-old boy from Odessa in the Ukraine. Yeah, and that meant that they had specific parameters that they could work within. It automatically helped reduce some of that unpredictability and that lack of restriction that you would have if you were to just say,

11:33

this is a fluent adult speaker of a given language. Yeah, yeah, giving him these, you know, setting up the expectation from the judges that, you know, this is a non-native English speaker. It's a kid, essentially. Um, you know, they'll expect him to have limited knowledge of the world and different subject areas, and a limited understanding of English vocabulary and grammar and all that kind of stuff.

11:59

So you're already managing expectations. That's going to come into play when we talk about some of the criticisms about this, although I do think it's a very clever way, and a lot of previous chatbots have had similar, yes, similar kinds of approaches. Yeah, because like we said, if you're to take a quote unquote pure approach to this, it's really, really challenging. So yeah, by limiting this, the judges have an idea of, well, this could be a thirteen-year-old boy or

12:26

it could be a computer program. It means that the computer program doesn't have to be as sophisticated as one that would be completely fluent and have, you know, an adult's experiences, uh, and ability to communicate. So that was step one. And, uh, Eugene Goostman took part in a competition, uh, that had five total chatbots. It was one of five. The competition took place on the sixtieth anniversary of Turing's death, and the program managed to fool thirty-three percent of the judges into thinking it was

12:59

actually a person. Uh, and there were thirty judges, from what I understand. So that was where you got all the headlines of chatbot, er, computer beats

13:10

Turing test, which already is not accurate. There were some slight, um, hiccups in a little bit of the news reporting. We'll get into that part of the story later, but, you know, the key takeaways I think here are that this was a competition that was celebrating Turing. Awesome. Um, and that there were thirty judges, some of whom were celebrities, yes, including an actor who

13:38

had appeared on Red Dwarf. Right. Yes. And then, a couple of years before that, because you said they started working on this in two thousand one, this was not the first time that Goostman had entered a competition. Two years previously, that same software had convinced twenty-nine percent of the judges at a similar competition that was held at Bletchley Park. Now, Bletchley Park's where Turing helped crack the Enigma machine, the encoding device that the German military was using

14:07

during World War Two. So this was a big celebration. It was at the centennial celebrating his birth, and it ended up falling just short of passing the Turing test. Uh, the organizer of the more recent event, the one in which Goostman ran away with quote unquote beating the Turing test, uh, that organizer was Kevin Warwick. That name may sound familiar to some of our listeners if you've ever heard us talk about cyborgs. He's the guy who had an RFID chip surgically implanted into him.

14:42

His wife also did at one point. Yes, they could communicate with each other through them. Some unflattering news media sometimes refer to him as Captain Cyborg. Yeah. There are some critics who say that he, uh, courts publicity in a manner that is, um, unbecoming of a scientist. Yes, really of anybody. Yes, yes, uh, those are the critics who say that. By the way,

15:08

I just want to make that clear. At any rate, he said that this was the first time a chatbot had passed the Turing test at an event where the conversation was open-ended, meaning that they had not previously, uh, decided upon a specific topic or line of questioning. The judges were allowed to say whatever they wanted to the chatbot and it would respond, which obviously makes it harder, because you have to have a much wider breadth

15:33

of potential responses. Yeah. Yeah, because again, if you were to say, all right, this chatbot is just going to talk about, um, I don't know, sporting events from last year, well, then you can prepare pretty well for that. Yeah, exactly. So it's one of those things where the unrestricted nature adds in a degree of difficulty. So why would you need to make the qualification that this is an open-ended approach and this is the

15:59

first chatbot to manage it? That would be because, despite what you may have heard, Eugene Goostman was not the first program to beat the Turing test, not by a long shot. So work on these sorts of chatbots, these kinds of artificial conversationalists, that's real recent, right? I mean, they just started doing that, like, what, maybe three or four years ago, or two thousand one at

16:23

the earliest, I mean, that's when they started with Goostman. Yeah. No, in the nineteen sixties and seventies. Say, back in the mid nineteen sixties there was Eliza, which was written by Joseph Weizenbaum. If you're pronouncing it in the correct German, that's correct. Excellent, I'm finally learning. I'm sure he pronounces it Wisenbaum, but yeah, no, it would be Weizenbaum. At any rate, um, this was a program, um, that would respond to human conversation in what

16:53

ideally would be a relevant way. Yes, it was obviously an early attempt. It was not meant to be a program that takes on the Turing test. It was really, again, kind of like a thought experiment, the idea of what it takes to create a piece of software that can react to questions and make it make sense. At that point, it was more a "can we do this?" than a "let's do this for real." Yeah, and these are the sort of foundations that you have to lay in order for other things like the Eugene

17:23

Goostman program to be, uh, successful. So he created a language analyzer. Now, this specifically would look at words that users would put in and then compare them against a database of words that were stored in the computer's memory. And then he also created scripts. Now, in this case, the scripts were sets of rules. They're kind of like, you know,

17:45

like a protocol or an algorithm, in a way. These rules dictated how Eliza would respond to messages in order to cut down on that huge, massive number of variables we were talking about, the whole unrestricted, unpredictable thing. And so they would have different, uh, kind of like overlays. Think of it as an overlay that would kind of guide Eliza's responses. And the most well-known one was called DOCTOR, which put Eliza in the role of a

18:13

Rogerian psychiatrist. Uh, this is the person who responds to everything with a question. Right. It's that passive interview style where, you know, you repeat back. You know, if I go, "Oh, man, like, I'm really sad about my cat," "Oh, tell me, what is it about your cat that makes you sad?"
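The reflect-it-back behavior described here can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the `RULES` table, the `FALLBACK` line, and the `respond` function are hypothetical names invented for this sketch, and the patterns are not Weizenbaum's original DOCTOR rules. It shows the general idea of a script as a set of pattern-and-template rules.

```python
import re

# Hypothetical, heavily simplified rules in the spirit of a DOCTOR-style script.
# Each rule pairs a pattern with a response template. These are illustrative
# assumptions, not the rules of the original Eliza program.
RULES = [
    (re.compile(r"\bI(?: am|'m) (?:really )?sad about my (\w+)", re.IGNORECASE),
     "What is it about your {0} that makes you sad?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE),
     "Tell me more about your {0}."),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE),
     "Why do you feel {0}?"),
]

# Used when no rule matches, so the conversation can keep going.
FALLBACK = "Please go on."


def respond(message: str) -> str:
    """Return the response for the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Reuse the user's own words, minus trailing punctuation.
            captured = match.group(1).rstrip(".!?").lower()
            return template.format(captured)
    return FALLBACK


if __name__ == "__main__":
    for line in ["Oh man, I'm really sad about my cat",
                 "My mother never calls",
                 "Nice weather today"]:
        print(line, "->", respond(line))
```

Because the rules are tried in order and the first match wins, a script like this stays small, which is also why a handful of exchanges is usually enough to expose it.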

18:33

And then you can say things like, you know... And that's one of those things where, as a conversation starts to wind down, you then have another line of questions: "So tell me more about your mother." Like, the whole tell-me-about-your-mother thing, that's really coming

18:47

back to this kind of model of psychiatrists. But yeah, if you've ever heard the joke about them responding to anything with another question, just taking the last word and turning it into a question: "What am I paying you for?" "Why do you think you're paying me?" That kind of thing. Uh, so that's how Eliza responded. And in fact, you can find examples of the Eliza program's scripts, transcripts, and even the actual, well,

19:14

There are ports, you could say, where people have essentially created their own version of Eliza just using the original Eliza as a guide. You can find tons of them on the web, and you can attempt to have a conversation with them. It's not terribly compelling, but it's kind of fun. Usually within maybe four or five exchanges, you've already run into something where you're like, well, this can't be a person, or if it's a person, it's

19:38

the weirdest person I've ever conversed with. But again, it wasn't really an earnest attempt to create something that would pass the quote unquote Turing test, which I don't think was being referred to as such at that point yet. Yeah, people knew about Turing's prediction, but it wasn't so much called a Turing test.

19:56

Another chatbot that premiered in the early nineteen seventies went even further and actually was an attempt to try and pass the Turing test with a very specific approach, kind of like, you know, Eugene Goostman is a very specific approach to narrow down those parameters. In this case, this one was called PARRY, P-A-R-R-Y. Right, and Kenneth Colby created it to emulate a patient who has paranoid schizophrenia. Yeah, someone who has,

20:24

you know, sort of a persecution complex. Uh, they imagine that there are people or other entities that are out to get them. And in this, uh, specific case, he kind of really embraced this approach. It reminds me of people who create, like, a Dungeons and Dragons character, but then give their character an entire backstory. Yeah. Yeah, yeah, this PARRY persona was

20:50

an entire persona. It was a twenty-eight-year-old man with a job as a post office clerk who was single, had no brothers and sisters, and rarely saw his parents. He had specific hobbies. He liked to go to the movies and horse racing. Yeah, he liked to bet on the horses and had placed a bet with bookies in the past. Right, and he realized later that bookies have an association with the criminal underworld, and that therefore the

21:15

mafia knew about him and were out to get him. Now, all of that might sound very ridiculous to you. If you've never had any kind of interaction with someone who suffers from paranoid schizophrenia, that can seem like, well, that seems cartoonish. But no, this kind of thinking is not uncommon, you know, whether it's a criminal organization or the government or some other, even unnamed, entity. Oh, absolutely. Uh, it's very much realistic in terms of that kind

21:44

of diagnosis. And if I was speaking about it with humor in my voice a moment ago, it's only because I am absolutely tickled that the programmers of this program built it in. Yeah, it's actually pretty entertaining that they went so far as to make this whole, uh, backstory to explain it, because that's what gives it the believability. Right, and then they ended up testing it. In conversation with a human,

22:09

PARRY would gradually start to introduce his thoughts, quote unquote thoughts, about being persecuted, and would respond sensitively to anything said about his appearance, family, or religious beliefs. I've actually seen lots of transcripts of conversations with PARRY, and sure enough, it's one of those things where, you know, you might have a few exchanges and then PARRY ends up saying something that seems really odd, but not so odd as

22:34

to seem artificial. It just seems like it's a non sequitur, you know, something like, "Well, that's what they want you to think." And if you were to say, "Who are they?" "The mafia. Who did you think I was talking about?" Like, "The mafia's after you?" "Of course they're after me. They know who I am." That kind of stuff. And it's disturbing. Like, you know, when you know what it is, it's kind of amusing, but if you're in the middle of a conversation, it's like, oh, this poor person. Yeah.

23:00

And so in order to test it, uh, Colby did a couple of different things. He did one test where eight psychiatrists interviewed both PARRY and a human patient via teletypewriter. So in both cases, the psychiatrists could not see who they were interviewing. This is going back to kind of the original Turing test idea, or at

23:20

least Turing's proposed experiment. And in this case, only two of the eight were able to identify that one of the interviewees was human and the other was a machine. In a second test, Colby presented a group of a hundred psychiatrists, not psychics, with transcripts of interviews between PARRY... Yeah, he had, he had pairs of... exactly. Ooh, I foresee... No. But they gave these psychiatrists transcripts of interviews between an interviewer and PARRY, and an interviewer and a human patient,

23:53

and forty out of the hundred responded. I don't know if the other sixty just never got it or if they didn't take the... Response rates are variable. Right, so of the forty who responded, nineteen of them guessed incorrectly. So that's almost fifty percent, you know, getting right up there with a pretty impressive amount. Now, again, we have to look at the fact that PARRY is

24:19

operating under a very restricted set of rules. We're talking about a paranoid schizophrenic, someone who we would assume would occasionally have non-normative answers, exactly, to conversational pieces. And again, it's a limited time that you're having with this person, or this entity in this case, this program. And, uh, because the psychiatrists had a specific expectation of the type of interactions they were going to

24:48

see, that could have affected their answer. Right. So ideally, in the perfect situation, you would have this interview happening where you have no expectation as to what the answers should be. In other words, you don't know ahead of time that the interviewee is having any kind of restrictions upon that person, so that you would be interviewing anyone, like any average person. But that's obviously not what we're talking about here, nor was

25:21

it the case for Eugene Goostman. So I've also seen, by the way, transcripts of people who set up Eliza

25:30

and PARRY to talk to each other. So you have Eliza acting as the Rogerian psychiatrist and PARRY the paranoid schizophrenic having bizarre conversations, and they usually don't last very long because PARRY gets upset. And obviously by gets upset, I just mean that PARRY ends up essentially shutting down the conversation, because Eliza just wants to ask questions, and PARRY gets suspicious of people who are asking questions. And by

25:56

gets suspicious, again, I'm saying following specific rules that make it feel like this computer program's getting suspicious. But they are entertaining. If you ever do a search online, just look for Eliza PARRY transcripts. There are a few of them, and they're all pretty entertaining. But since that time, I mean, obviously, lots and lots of

26:15

chatbots have been created for multiple reasons. Oh yeah. Well, you know, some of them are trying to pass the Turing test, and others are trying to fool you into giving out your credit card information or clicking on a link that, uh, links to malicious software. Yeah. Anyone who's been on any kind of chat program, specifically like AIM, has probably encountered this at least once or twice, where they're getting an unsolicited message from someone or

26:45

something. Or something. Yeah, and if you type a couple of times, you realize, oh, this is not actually... this is an attempt to either get information from me or have me click on a link. Yeah, that's a thing. But there are some examples of, I guess saying legitimate is weird, but there are some examples of other, more scholarly attempts. Yeah, like PC Therapist. Uh, that one was by Joseph Weintraub, and it fooled fifty percent of its judges into thinking that it was human.

27:20

Of ten judges, of course, so if I'm doing my math correctly, that's five. That, I believe you are. Unless we're talking about quantum judges. Um, these judges were both right and wrong at the same time. And, um, it was a whimsical program, I guess you could say. Yeah, I think it's fair to say its answers could be pretty smart-ass. Also, I read some of these transcripts, and it actually surprised me that enough judges thought that it was a person.
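The mental math running through these examples can be written down as a short Python sketch. It assumes the popular reading of the test mentioned earlier in the episode, where a program passes if it fools at least thirty percent of its judges; the names `fooled_rate` and `passes` and the default threshold are illustrative assumptions, not part of any official contest rules. The figures used are the ones quoted in the episode: five of ten judges for PC Therapist, and nineteen of the forty responding psychiatrists for PARRY (a transcript-reading test, not a live five-minute conversation).

```python
def fooled_rate(fooled: int, judges: int) -> float:
    """Fraction of judges who mistook the program for a human."""
    if judges <= 0:
        raise ValueError("judges must be a positive number")
    return fooled / judges


def passes(fooled: int, judges: int, threshold: float = 0.30) -> bool:
    """True if the fooled rate meets the assumed thirty percent bar."""
    return fooled_rate(fooled, judges) >= threshold


if __name__ == "__main__":
    # Figures as quoted in the episode.
    results = {
        "PC Therapist": (5, 10),
        "PARRY (transcript test)": (19, 40),
    }
    for name, (fooled, judges) in results.items():
        rate = fooled_rate(fooled, judges)
        print(f"{name}: {rate:.1%} fooled, passes: {passes(fooled, judges)}")
```

Keeping the threshold as a parameter makes it easy to see how much the verdict depends on where the bar is set, which is a large part of why the headlines are disputed.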

27:49

Maybe they thought it was a person who was purposefully

27:51

attempting to fool them into thinking it was a computer. Uh, you know, because it would say things like "I compute, therefore I am," that kind of stuff, where it was specifically... Yeah, so you're looking at this stuff and you're thinking, all right, well, maybe this is kind of going back to that original Turing test party game where the effort is for, like, the person who's being interviewed to try to throw everybody off. And that's perfectly

28:15

within the rules, unless you state up front, no, just be honest in your answers. There's nothing in here, by the way, that says that the interviewee has to tell the truth necessarily, unless you just state that as a parameter at the beginning. So in other words, you could be like, I'm totally the computer, and you're the human being interviewed. Uh, I don't know if that's a fair way of saying that the device won or lost, but it is a possibility. Uh, then we

28:40

have two thousand eleven. Now, this one is a really pretty impressive one. And this was Cleverbot, which was made by a fellow named Rollo Carpenter, and it fooled fifty-nine point three percent of a live audience at an event in India with more than a thousand people. Yeah, the way this worked was that the audience watched as interviewers interacted via text with either Cleverbot or a human in the course of a four-minute interview. So it's a little shorter than what Turing had said, but

29:11

not by a whole lot. And four minutes is still a good amount of time. Sure, though that is twenty percent less. That's true. That's true, so keep that in mind. But at any rate, it was, you know, a pretty interesting experience. And also, from what I read, they misidentified... they thought that the human was a computer sixty percent of the time. Um, because they didn't necessarily just say that it was a computer or it was Cleverbot. So now we see that there are a few examples of

29:44

chatbots quote unquote passing the Turing test. So what does that mean? Does it mean that the machines are actually thinking? Um, no. I mean, it's not to say that computers don't have a certain amount of machine intelligence, but there's absolutely a distinction between that and what we consider to be human intelligence. That's true. The programmers themselves have said that this doesn't mean a machine

30:10

is able to think. Uh, they're just able to interpret commands and then follow a set of rules to make a response, which is still pretty cool. And it certainly doesn't mean that the Turing test is worthless as an exercise. No. In fact, it's improving our ability to create programs that can understand, or at least respond to, natural language. Natural language recognition is one of those big things where, if you're really able to crack it, then you can have some amazing

30:43

opportunities open up. And we've seen this recently with stuff

30:45

like Siri. Oh, absolutely. Being able to speak to your computer rather than having to... I mean, even if you could speak to your computer through a keyboard and have it understand what you're saying. I mean, it's the reason why Google spends so much time and money on its search algorithms, trying to figure out what you really mean when you search for a certain phrase. Because traditionally, you know, before we really

31:12

got into the natural language recognition era, it meant that in order to work with a computer, you had to work with a computer on the computer's terms. You had to learn the commands, you had to learn the way to navigate a computer system in order for it to

31:25

do what you wanted it to do. Once you get to a point where natural language recognition software is robust enough, the computer is working on your terms. You can put in however you're thinking, like, whatever mental exercise you've gone through to ask this computer to do something, whatever you do to kind of express a thought, uh, as a command to this computer. The computer can then interpret and then respond. And it's not just for serving you

31:55

back whatever information you happen to be looking for. I mean, we're talking about being able to just look at a computer and say, you know, I really want a graph that looks blue and has these percentages in it and is about this thing, and it just does it. Yeah, like, I want to see what the population distribution of Atlanta is in a bar chart or something, and then it goes out, finds that information, puts it into a bar chart, and yeah,

32:24

that's pretty phenomenal stuff. We see other examples of machine intelligence everywhere, things like pattern recognition and probabilistic predictions. For example, Pandora. You know, the Music Genome Project. Yeah, that's

32:35

pattern recognition. It's looking for elements of songs that you say you like, and then looking for other stuff that's not in the specific category or the specific examples you mentioned, and saying, here's something else you probably will like because you like these other things that also have this stuff in it. Sure. Uh, you know, sometimes that's

32:54

less functional than other times. It makes me think of, Patton Oswalt has a great routine about TiVo that does the same sort of thing, where he says, you know, TiVo is great. I mean, I love Westerns, and I'll have it set to TiVo a Western for me, and I come back, and then there'll be all these other Westerns suggested that I didn't even know about. Thank you, TiVo. But then sometimes TiVo gets it wrong and I come back and everything has got horses in it,

33:19

because Westerns have horses in them. So I've got My Little Pony and cartoons with horses and unicorns and things, and I have to say, no, TiVo, that's a bad TiVo. But TiVo says, but you said you liked horses. Same sort of thing. When you get more sophisticated, then the computer program starts to anticipate things and makes

33:39

these probabilistic models, these models where there are certain percentages associated with various responses, and it goes with whichever one seems to be the most prevalent, assuming it meets a threshold. If this sounds familiar to you, it's because that's how IBM's Watson worked, which is a really good example of natural language recognition. Absolutely, because not only was it able to recognize natural language, it had to

34:06

interpret things like wordplay. Jeopardy. This is the machine that went up on Jeopardy and beat the returning champions, or former champions. Uh, and you know, if you've ever played Jeopardy or watched Jeopardy, you know that there are categories that depend on things like puns or homonyms or

34:26

other forms of wordplay. So it has to parse all of that, and that's even more complicated than just taking a simple sentence and figuring out, all right, what are the potential responses to whatever this phrase is. So great, great example. You know, it would end

34:42

up coming up with a potential answer. It would assign a percentage of how quote unquote sure it was that that was the right answer, and if the percentage was higher than its threshold, which I think was something like that, it would buzz in and give that as a guess. Sometimes it was wrong, but it was right a lot

35:02

of the time, so that's kind of cool. Uh, so, getting back to Eugene. Gene, the main machine, as I called him in my notes at one point. Uh, and of course I'm anthropomorphizing when I say him. It's an it. Well, it has a dude name. Yeah, it has a dude name and a

35:19

dude persona. But it's ultimately an it. Would you say that perhaps some of the reporting around this was maybe a little misleading, or at least hype? Well, you know, okay, the entire Eugene Goostman chatbot sounds really cool. I haven't met it personally. No, I haven't either, although you can. There is an Internet version, and I'm not sure that it's the same version that's being used in competition, because I've seen some transcripts from the Internet

35:51

version and they don't seem good at all. They seem bad. Right, I guess you'd have to talk to some actual thirteen-year-old boys from actually yeah, that well, this is part of it. You know, there's certainly been some questions among, uh, natural language AI enthusiasts online about whether

36:11

we're really just lowering our expectations for human communication. Which, yeah, that's a totally different way of looking at it, and a depressing one, to say that, oh, well, if you come from this place and if you are of this age, then I expect you to only be able to communicate at this level. Right, which is depressing. It certainly is. Well, but you know, it's a valid point, I think. I think it's a good thing to be thinking about

36:37

in this kind of situation. UM. You know. Beyond that though, and I certainly don't want to downplay the apparent achievements of its programmers, because I haven't programmed any capable chatbots today ever since I did. But there are a few things that are just a little bit shady about the

36:59

news. Um, first off, the original press release, which came out of the University of Reading, I believe, um, stated that a quote supercomputer had achieved this feat, and perhaps, charitably, it was a mistake or misunderstanding on the part of the writer of the press release. But some skeptics have suggested that it was in fact a purposeful publicity play that in fact worked, because a whole lot of news headlines around the interwebs repeated the error very excitedly. Yes,

37:31

because it was not a supercomputer. It was a computer running a piece of software, a program. Yes, the program did the work. I mean, the computer just provided the horsepower. Right, it's the software that did all the actual work, and it wasn't on a supercomputer. Little-known fact: supercomputers have better things to do than run

37:51

chatbot software, generally. Yeah. Yeah, we're talking about things like figuring out global weather change patterns and things like that, you know, or the way that money works. Chatbots: low on their priority list. It's like number seven at least. Um, and also, this press release in question was largely a quotation from Dr. Kevin Warwick.

38:18

Kevin Warwick, of course, being the fellow who organized this entire competition, who's an engineer and a futurist, um, and also the instigator and or enjoyer of a certain amount of hype and debate about future technologies. Yeah, he is, obviously, you can tell. This is the guy who elected to have surgery performed on him so he could have that RFID chip and call himself

38:46

a cyborg. This is someone who not only embraces these ideas of futurism, but is actively trying to promote them and get to them. And we're not even saying that that's a bad thing. What we are saying is that may give him somewhat of a bias when it comes to proclaiming a piece of computer software to be an amazing achievement that beat the Turing test. Right, sure. I mean, he admits basically to being a provocateur.

39:15

He says that that's really his job, you know, is to get people excited about tech and engineering and the future. And we get that, like, that,

39:25

we agree. That's our job too. We think it's rad. We think that perhaps, I don't want to put words into your mouth, Lauren, I think perhaps that there's a different way of going about it where you can still be excited, but you can be a little more grounded in the way you present things, because I also think the achievement of creating a chatbot that could be, uh, convincing is a fantastic achievement. I mean, it's something that, even under any number of qualifications, is incredibly challenging to do,

39:52

no matter how you frame it. Um, I do think, however, that if you seem to overinflate the achievement, you run the danger of making people feel jaded about it later, which I think... Computer wolf. Exactly, yeah, exactly. So it's one of those things where, you know, you have to take the context into account, right, and don't downplay the achievement. But don't sit there and say like, aha, now we have intelligent computers everywhere. That's not

40:21

that's not true either. I saw there's a great Wired article that specifically went into, uh, kind of debunking the whole beating the Turing test thing, and again kind of saying the same thing we're saying, like take the context into account. And as part of that article they ended up asking a cognitive scientist named Gary Marcus of NYU about this, and Marcus proposed a new version of the Turing test because, he says, the old version

40:51

is not really a measurement of machine intelligence. Uh, it does kind of illustrate ways of creating natural language recognition and clever ways to fool the human side. Uh huh. And it was very valid historically, at the time, because, you know, textual communication was new and exciting and, you know, it pushed the field forward, it really did. But now we've gotten to a point where fooling the person on the other side of a keyboard is not necessarily the goal that we should be

41:21

looking at. He proposes the next version of the Turing test should be that a computer software, uh, like any kind of program that wants to beat it, what it has to do is first quote unquote watch a movie, television show, YouTube video, something, some kind of video media, and then be able to respond to questions about it.

41:41

So sort of like, here, let me show you this ten-minute video on car safety, and then asking questions specifically about the video, what happened after they fastened the seat belt, that kind of thing. And if the computer program is able to answer it, then that would be a much more convincing Turing test than just kind of spewing out a script, which is, you know, it's adding another layer of difficulty on top of an already difficult task. But that's the only way you

42:10

can go forward. Otherwise we're just going to see increasingly sophisticated chatbots. Yeah, and that is actually a

42:17

very difficult and interesting problem. Wasn't it just recently that there was a computer that we taught, I mean not us personally, but that humanity, some researchers, taught to identify pictures of cats? That was the AI program that essentially went through thousands and thousands of, uh, I think it was images and videos, and then became able to identify cats, essentially defined what

42:41

a cat was, because no one taught it. Right, it learned what cats are based upon their appearance and can look at pictures of cats and say, that is totally a cat, essentially saying that thing that is in that video is the same as this other thing that's in this picture, that's the same as this other thing that's in this totally different video. Which sounds trivial and hilarious, and it kind of is hilarious. Also easy, because I mean, come on, everything on the internet

43:08

has cats in it. Yeah, that is kind of a gimme, isn't it? But still, no, it is cool. I mean, because just that level of image recognition, I mean being able to take an object and look at it from a different angle than you were taught, or that's a different color, different size. Yeah, all of these things are easy for us, hard for computers. So seeing something make that breakthrough is

43:32

really exciting. Anyway, we thought we would take that story and kind of break it down for you guys, explain how it's still cool but maybe not as cool as the way some of the headlines are saying. Right, and also say, hey, internet journalists, um, step up your game. Yeah, I understand that you want people to read your stuff. Oh yeah, and you're under deadline pressure and that's terrible. That's hard. That's reality. But don't, just don't

43:59

just spit up press releases the way that you found them. Yeah, and, uh, I would be criminal to neglect to mention, Noel, our producer, reminded me that obviously this is a very important field of study, because we want to be able to tell the difference between computers and humans when the future of Blade Runner becomes our reality and you're chasing down a replicant and you have to

44:24

determine if it's actually a replicant or a human being. Very good point. Computer programmers, just make sure you don't explain, sort of, you know, what you do if you see a turtle laying on its back and you've decided not to turn it over, because that's like our gimme. Yeah, yeah, we need that. So everything else, fair game. All right, so that wraps up this discussion, guys. First of all, thank you so much, Nick, for asking

44:52

us about that. And if anyone else wants to ask us to talk about anything in particular, right now I would highly suggest you use Twitter, Facebook, or Tumblr. We have the handle TechStuffHSW. We will soon have an email address, coming any day now. Yeah, we're making this transition from one kind of email server to a different one. We're getting new email addresses, and as of right now, as we're recording this podcast,

45:17

I think TechStuff does not yet have one. Our future address will be tech stuff at how stuff works dot com. So if you're listening to this in try it. Yeah, it should be fine. But if you're listening to this the day it comes out and you wanted to send us a message and it's bouncing back, try Twitter, Facebook or Tumblr and we will get your message there. And yes,

45:40

we are working on this. We'll have it up as soon as we possibly can, and we'll talk to you again really soon. For more on this and thousands of other topics, visit how stuff works dot com.

Transcript source: Provided by creator in RSS feed