Could we make a sarcastic supercomputer?

Speaker 1

00:04

Get in touch with technology with TechStuff from HowStuffWorks.com. Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with HowStuffWorks and I love all things tech. Today, I want to talk to you about an interesting topic that I got to explore a couple of years ago with Joe McCormick and Lauren Vogelbaum as we debated the

00:31

possibilities of computers learning how to understand sarcasm. We did it for a podcast called Forward Thinking, which was around for a couple of years. It was a lot of fun to work on. That show is over, but I thought I would revisit the topic and talk about it for you guys and kind of go over what it would take to have a computer that could actually understand when someone's being sarcastic. Now, to understand why this is a big deal, it helps to have a refresher

01:02

course on how computers process information. And I know I talk about this a lot, but I still think it's important to cover the basics when you want to talk about something as advanced as being able to detect and understand sarcasm. So computers understand machine code or assembly language. This is a language that corresponds with the actual physical architecture of the computers, so the way the computer is built,

01:30

that's how this language interacts. It's essentially how the physical components of the computer are able to handle electric current or voltage differences in order to process information, and computers can interpret this and execute upon this language very quickly.

01:52

It is the basic language of those physical components. However, it is almost impossible for a human to work with this, at least in a way that is at all efficient, because it ultimately, for most computers, boils down to binary language, right, zeros and ones. So you see a huge block of zeros and ones, and unless you are Neo from The Matrix, it means nothing to you. So we speak in natural

02:23

language to one another. Natural language, however, is filled with a lot of components that make it very very challenging for machines to interpret, like ambiguity, or there might be double meanings in a phrase and you may mean both meanings at the same time, and that is too complicated for most machines to be able to process. They just can't deal with that. So to bridge the gap between the way we humans communicate and the way that computers

02:54

process language, we have created programming languages and compilers. Now, programming languages fall into two broad categories. It's more like a spectrum, and you could be further on one end than the other, and we typically call them high level programming languages and low level programming languages. The lower the level of programming language, the closer it is to machine code, and the easier it is for a computer to understand, but the harder it is to work with if you

03:26

happen to be, you know, a human being. High level programming languages are easier for humans to understand. Now, if you have never taken any courses in programming and you're looking at a page of code, it can seem indecipherable to you. It is just meaningless strings of characters. But once you learn the rules of that programming language, how you construct an instruction and a series of instructions, how

03:54

you go from one instruction to the next. Once you understand the rules, it actually becomes quite easy to use in the grand scheme of things, much easier than machine language would be. But again, the problem here is that computers don't understand programming languages, not natively. Even though this is not exactly the same as human natural language, it's also not the same as machine language. That's why

04:18

you need compilers. A compiler is essentially a translator. It takes this high level programming language, or higher level anyway, and then converts it into a machine readable language for the computer to actually execute upon. And this is all in the design of the programming languages and the compilers. So this is the way that for decades we have interacted with computers, when you're talking about it on a direct level, not just executing a program, but

04:49

creating code, creating programs for computers to run. Over the last few decades, we've had some very very smart people working on natural language systems for machines which would allow a computer to interpret natural language in a way that would make some sort of sense and for the computer to be able to act upon that language. And we've seen this in plenty of examples recently. Most smartphones have

05:22

some sort of smart assistant. You have standalone products like Amazon's Echo, you have Google Home, you've got tons of devices that can interact with people. They can be activated by typically an alert phrase, which I'm not going to say because I don't want any of you guys to have to deal with that. I know how irritating it is when I'm watching a video and someone activates their specific system and then mine begins to respond and all my lights start going on and off because the people

05:55

on YouTube were talking funny. I know how irritating that is. But you use that to activate it, and then you can speak, and typically you can say the same thing several different ways and the device appears to understand you no matter how you word it. And this is a real challenge because we human beings can find lots of different ways to say the same thing. For example, if I say what is the weather today, it could be very similar to if I ask the question, is it

06:25

going to rain today? Both of those are asking for information about the weather, but are very different ways of saying that. A good natural language recognition program will be able to parse that information and then return the appropriate response. This is not an easy thing to do. Typically it involves creating a neural network structure, and I've talked about

06:50

artificial neural networks recently. That's typically a network of artificial neurons, each of which can accept multiple binary inputs, so either a zero or a one input that represents some sort of yes,

07:06

no or on off kind of feature. It can accept multiple inputs of that nature, so multiple zeros or ones that all factor into making a decision, and then it has a weighting for each of those components, and then it produces a single output that's also binary in nature, either a zero or a one, and it passes that on to other artificial neurons further down the chain. Sometimes that will come back around and you have a recurrent artificial neural network.
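The single artificial neuron described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `neuron`, the weights, the threshold values, and the umbrella features are all made up for the example and do not come from any particular system.

```python
def neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of binary inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0


# Hypothetical yes/no features for "should I bring an umbrella?":
# [is the sky cloudy?, did the forecast mention rain?, is it windy?]
features = [1, 1, 0]
weights = [0.3, 0.6, 0.1]  # assumed weighting for each input

first_output = neuron(features, weights, threshold=0.5)

# The binary output is passed on to another neuron further down the chain,
# alongside a second (also hypothetical) binary input.
second_output = neuron([first_output, 1], [0.7, 0.2], threshold=0.8)
```

A real network chains many of these together and learns the weights from data rather than having them typed in by hand.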

07:37

The goal here is for this process to ultimately result in a response that is reasonably certain to meet the requirements of the person asking the question. This tends to be talked about in the realm of probabilities. We talk about how certain the machine is that the response is the appropriate one, and if it falls below a certain threshold, then the machine would typically respond with I'm sorry, I don't know what you're asking for, or something similar

08:10

to that. There are cases where you just get misinterpreted and you'll get a response that does not reflect whatever you asked. That's a little different. That's where the machine has drawn a conclusion, has been reasonably certain that it came to the right conclusion, but it turns out it was wrong the whole way. But that's the process. Now, when it comes to sarcasm, that adds yet another layer of difficulty, because now a machine isn't just parsing what you are saying.
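The confidence threshold idea from a moment ago can be sketched like this. It is a minimal illustration, assuming the system has already produced candidate responses with confidence scores between 0 and 1; the names `respond` and `FALLBACK`, the scores, and the 0.6 cutoff are hypothetical.

```python
FALLBACK = "I'm sorry, I don't know what you're asking for."


def respond(candidates, threshold=0.6):
    """Pick the most confident candidate, or fall back if none is confident enough.

    `candidates` maps each possible response to a confidence between 0 and 1.
    """
    if not candidates:
        return FALLBACK
    best, confidence = max(candidates.items(), key=lambda item: item[1])
    return best if confidence >= threshold else FALLBACK


# Hypothetical scores for the question "is it going to rain today?"
confident = respond({"Rain is likely this afternoon.": 0.82, "Playing a rain playlist.": 0.11})
unsure = respond({"Rain is likely this afternoon.": 0.41, "Playing a rain playlist.": 0.38})
```

Note that a threshold like this only catches the cases where the machine knows it is unsure. A misinterpretation with a high confidence score sails straight through.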

08:42

It has to understand what you mean, the meaning of your words and the meaning of the way you deliver them. Those could be different. So if I were to just write out a phrase with no tone, no body language, not emphasizing any one word over another, it might be very difficult to detect what my intent was. It may seem like I'm being sincere, when in fact I'm being insincere.

09:11

For example, if I were to say that guy is super tall, but I'm being sarcastic, then just in that phrase the way I write it out, you would think, oh, well, that person he's looking at must be super tall. How do you recognize sarcasm? How can you detect that this is in place and then understand what the meaning underneath it is? One of the approaches that has been put forward relates to IBM's Watson platform. Now, Watson first made

09:48

headlines back when it was a contestant on Jeopardy. It went up against two former champions, including Ken Jennings, who shows up on a HowStuffWorks podcast. Anyway, Watson went up against these two former champions and it was able to interpret natural language. It had to in order

10:07

to play the game of Jeopardy. And for those who do not know what Jeopardy is or are not familiar with the game show, Jeopardy is a game where you are presented with categories of trivia, and each category has multiple questions or multiple entries in it, and they range in dollar value, and the lower dollar value ones are easier to answer than the higher dollar value ones. Typically, the way Jeopardy works is that you're given, quote unquote, the answer and you have

10:44

to provide the question. So if the answer were this film that detailed the adventures of a young playwright in sixteenth century England won Best Picture, you would say, what was Shakespeare in Love? So this computer is playing against these two former champions. This was sort of an exhibition series of games. It wasn't meant for a competition in the way the typical Jeopardy games were. There was money on the line. It was an exhibition, and Watson won. It beat both of the champions, and it did

11:23

what I was telling you. It would analyze the clue that was given, the answer that was given, it would try and generate a question to correspond with that answer, and only if the question met a certain threshold of confidence would Watson buzz in. If it did not meet that level of confidence, Watson would remain quiet. And most importantly, Watson was not at all connected to the Internet. All the information was contained within a massive series of servers

11:54

more than, gosh, I can't even remember. There's a ton of processors attached to it. So a very powerful machine, but it still wasn't exactly able to detect sarcasm. It could work with wordplay, and it could work with riddles, so that was really impressive. But what it really did was it gave IBM the opportunity to say, we have this platform here, and we're welcoming developers to create applications that tap into this platform and make use of this

12:28

in order to do interesting stuff with it. And IBM was largely working with the medical industry at that point to try and help doctors treat and diagnose patients, and it was sort of computer guidance. It wasn't that you had an automatic doctor, but rather the doctor had what equates to a medical expert to confer with when trying to determine what's the best course of action for a patient.

12:57

IBM put up an application programming interface, or API, and let developers create their own cognitive computing applications built on top of Watson. One of those was called the Tone Analyzer. It still exists. Back when we were doing this episode for Forward Thinking, it was in the form of analyzing some text and telling you whether or not that text would come across as agreeable or argumentative, or positive or negative, and it would assign tone to those pieces.

13:32

I'll explain more about what it did and how it did it in just a minute, but first let's take a quick break to thank our sponsor. So how did this tone analyzer work? It would search for cues in any written text, social cues, written cues, emotional cues, in order to determine the overall tone of a piece, which actually meant that the analyzer would tag individual words within a text, words that it recognized and had already pre

14:13

labeled as falling into various categories. So words that might have a positive meaning, like happy, glad, joy, things like that, those would get tagged as cheerful. It would assign tags to all the individual words and then tally everything up. So let's say you've got a bunch of sentences and it starts individually labeling certain words as being cheerful or sad or angry or helpful, and then it adds it all up and then would give you a percentage.
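The tag-and-tally approach described above can be sketched roughly as follows. This is not how IBM's tool was actually implemented; the tiny word list `TONE_LEXICON` and the function name `tone_percentages` are assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical pre-labeled word list; a real lexicon would be far larger.
TONE_LEXICON = {
    "happy": "cheerful",
    "glad": "cheerful",
    "joy": "cheerful",
    "miserable": "sad",
    "furious": "angry",
    "help": "helpful",
}


def tone_percentages(text):
    """Tag each recognized word, tally the tags, and report each as a share of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {}
    tally = Counter(TONE_LEXICON[w] for w in words if w in TONE_LEXICON)
    return {tone: 100 * count / len(words) for tone, count in tally.items()}


result = tone_percentages("I am glad and happy today")
```

Here two of the six words are tagged cheerful, so the cheerful share is a third of the message. Nothing in the tally looks at how the words relate to each other.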

14:47

So a message might be agreeable or thirty conscientious, you would actually get multiples of these, and that would just really indicate the density of those types of words within the message itself. Now, in an ideal world, if language were very simple to understand and interpret by machines, this would help you gauge how people would respond to your work. Right? So you could write a message. Before you send it, you put it through the tone analyzer and it tells you what sort of a tone you

15:25

are setting. So if you wanted to create a business letter, you could send it through this tone analyzer, and if it came back as saying it's coming across as indecisive, you might want to go back in and edit that message so that you can make a more straightforward and decisive message and not give the wrong impression before you send the message out to your actual human recipient, and come up with alternate word choices in order to make sure that your message is received the way you

15:55

intended it. And anyone who has communicated over the internet can think of ways that this might have been helpful in the past, because again, language depends on so many different elements to get your meaning across, and when you reduce it to the written form, especially the written form online, where we tend to be very short with our communication, it comes in very quick bursts, a couple of sentences here or there. We lack all that body language, we

16:26

lack that tone. It's very easy to misinterpret. I'm sure there's been an example in your life where either you got offended from receiving something that was meant in a way that was different from the way you interpreted it, or the reverse happened, where you sent a message and somebody had a reaction you did not anticipate because they could not tell what tone you were using just from the words you were using. Machines have that same problem.

16:52

In the future, an analyzer like this tone analyzer could be incorporated into word processors or email servers, or email services, I should say, or social media platforms. So you start typing in your message, and before you hit publish or post or send, you could analyze that text.

17:11

It could tell you what the tone is, and then you could say, oh, no, that's gonna come across totally the wrong way, and you could actually fix it before you posted it or sent it, and then you wouldn't have that awkward decision of whether or not to edit something, or, in the case of Twitter, which continues to refuse to

17:27

allow you to edit tweets, to delete a tweet. I deleted a tweet the other day when I posted a link to a news story, and I had made a rookie mistake, one that I try to avoid, but I did it this past time, which is that I didn't think to look at the date when the news item had been published, and it had been published a full year earlier, so it was not new news, it was old news. And then I deleted the tweet, and it wasn't up

17:55

for long, but I still felt dumb about it. It would have been nice to have been able to check that. Although that's not tone, obviously, it's similar in the idea that you want to check before you end up offending someone, unless you're one of those jerk faces that just sets out to offend people, in which case, rethink your strategy. There are better things to do. You can make just as big an impact being a positive person as you can being a jerk face.

18:22

I know it can seem like it's more work, but it's also more rewarding in the long run. Okay, soapbox done. So there is a demo of the tone analyzer that's available online, and back when we were recording Forward Thinking, the demo worked in a way where it would tell you about emotional tone and break it down by percentage. It's a little different now, but I want to tell you what words we used and the results we got in the past

18:50

because they were so much fun. Granted you would get a different result now because the tone analyzer has been tweaked since we recorded that episode. So when we recorded that episode, one of my co hosts decided to put a sentence that is somewhat known in literary circles into this tone analyzer and find out what it said. And the sentence used was it is a truth universally acknowledged that a single man in possession of a good fortune

19:17

must be in want of a wife. Now, the analyzer said that the emotional tone was cheerful, the social tone was seventy six percent open and fifty agreeable, and the writing tone was analytical. You can also view the sentence in terms of word count as opposed to the weighted value of individual words, and using that view, five percent of the sentence was in an emotional tone, in

19:46

a social tone, and five percent in a writing tone. Now, the analyzer highlights each word according to how it classifies them, So emotional words would be highlighted in red or pink in that older version of the tone analyzer, social words would show up in blue, and writing tones would be in green. And you could click on any word and the analyzer would offer alternative words that you might want to use and classify those words in the tones that

20:14

they are associated with. That way you could shape your message to meet the tone you wish to convey. Also, the tone analyzer demo used the business letter format as the means of comparison. So, in other words, we compared Jane Austen to a business letter. Presumably, if you were to use a full version of the analyzer, not just the demo version, you would have other options so you could compare it with other models, not just a business letter.

20:42

Joe McCormick included an excerpt from Dostoyevsky's Notes from Underground. That excerpt was, I could not become anything, neither good nor bad, neither a scoundrel nor an honest man, neither a hero nor an insect. And now I am eking out my days in my corner, taunting myself with the bitter and entirely useless consolation that an intelligent man cannot seriously become anything, that only a fool can become something. The feedback was that the emotional tone had anger at cheerfulness

21:19

at so happy anger negative at. The social tone was agreeable zero percent conscientious, zero percent open. The writing tone was analytical, zero percent confident and tentative. Joe would actually end up highlighting some of the words to find out which words were the ones that ended up giving that cheerfulness result. Those four words were good, honest, hero, and intelligent, and that's important because those words, the way they are used in

21:59

that passage are not used in a positive sense. They are positive words, but they're meant to show a negation there, not an assertion. So that really highlights a big problem in this tone analyzer, which is that it's tagging these words individually without context. So if I wrote the phrase I am not glad, it would tag the word glad and say that's a cheerful word.
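The "I am not glad" problem is easy to reproduce with a context-free tagger. The sketch below contrasts that naive behavior with one very crude fix, ignoring a cheerful word when a negating word sits right before it. The word lists, the function names, and the one-word lookback window are all assumptions for illustration, and real context handling is much harder than this.

```python
import re

CHEERFUL_WORDS = {"glad", "happy", "joy"}               # hypothetical lexicon
NEGATORS = {"not", "never", "no", "neither", "nor"}     # hypothetical negation cues


def naive_cheerful_hits(text):
    """Tag cheerful words one at a time, with no context at all."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w in CHEERFUL_WORDS]


def negation_aware_hits(text, window=1):
    """Skip a cheerful word if a negator appears within `window` words before it."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, word in enumerate(words):
        if word not in CHEERFUL_WORDS:
            continue
        preceding = words[max(0, i - window):i]
        if not any(p in NEGATORS for p in preceding):
            hits.append(word)
    return hits


naive = naive_cheerful_hits("I am not glad")
aware = negation_aware_hits("I am not glad")
```

The naive version still reports glad as a cheerful hit, while the lookback version drops it. A rule this simple would still be fooled by sarcasm, where nothing in the words themselves signals the reversal.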

22:32

But I said I am not glad. If I told you I am not glad, you would not think, oh, well, that's a cheerful thing to say or a positive thing to say. But according to the tone analyzer, it would come across as a cheerful statement because it had tagged that word as being cheerful. And the other words are not that strong; they don't warrant being

22:53

tagged in a way like that. Now, over time, we might have a tone analyzer that can actually take context into account, and then you would learn a lot more about the actual meaning behind a phrase. It would be more than just tone. So if you were trying to get across tone by using more complicated and subtle word choice, where you're sort of being poetic in your expression, you're trying to get across a feeling by

23:28

using irony or sarcasm, then a tone analyzer like this would totally miss it because it would just be counting the hits and not understanding the usage there, the hidden meaning, the wordplay. So that is going to be a real challenge. So it's kind of another interesting use of IBM's Watson. There are a lot of other ones that we could talk about, like Chef Watson, which was

23:54

my favorite. Chef Watson would generate new recipes based upon ingredients that you would tell it that you had on hand, and it wouldn't go and reference old recipes and pull one up for you. Instead, it would make flavor profiles based upon all the different combinations of food that were found in various recipe books and generate a brand new recipe for you, right there on the spot.

24:19

And sometimes they were wackadoodle crazy, y'all. So in a way you could say that Chef Watson was another way of seeing how IBM's Watson has a lot of promise, but it requires a ton of work on the app level in order to leverage it and make actual practical use out of it. I have more to say about computers detecting sarcasm. But first let's take

24:45

a quick word from our sponsor. So back in twent there were some researchers at the Hebrew University in Israel who designed a system called the Semi-Supervised Algorithm for Sarcasm Identification, or SASI, and they used SASI to analyze collections of nearly six million tweets and also around sixty six thousand product reviews from Amazon. They wanted to find rich treasure troves of sarcasm, and it turns out reviews and

25:31

tweets fit the bill. Sarcasm is typically conveyed in some vocal tone, right, and nonverbal cues. So you have to first go someplace where sarcasm is rampant in text form to be able to really fine tune how you can identify sarcasm versus something that's meant exactly the way it's written on the surface level. So they started to map out the various features that

26:03

were common in sarcastic comments online. So they were looking for things like hyperbolic words, and if you're using a lot of exaggeration, that could be a key. Excessive punctuation was another one, especially ellipses, which I tend to use a lot, though I don't know if I use them so much for sarcasm as I do for just timing purposes, to indicate this is the beat I would take if I were saying this out loud. I guess that's just as irritating, though. Also, how straightforward is the sentence structure?
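The surface cues just listed, hyperbolic words, excessive punctuation, and ellipses, could be counted with something like the sketch below. To be clear, this is not the researchers' actual feature set or code; the word list `HYPERBOLIC_WORDS`, the function name `surface_features`, and the regular expressions are assumptions chosen to illustrate the idea.

```python
import re

# Hypothetical list of exaggeration words; a real system would learn or curate far more.
HYPERBOLIC_WORDS = {"best", "worst", "ever", "totally", "absolutely", "amazing"}


def surface_features(text):
    """Count a few text-only cues that might hint at sarcasm."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "hyperbolic_words": sum(w in HYPERBOLIC_WORDS for w in words),
        # Three or more dots in a row, or the single ellipsis character.
        "ellipses": len(re.findall(r"\.{3,}|\u2026", text)),
        # Runs of two or more exclamation or question marks.
        "repeated_punctuation": len(re.findall(r"[!?]{2,}", text)),
    }


features = surface_features("Best. Purchase. Ever... it broke in a day!!!")
```

Counts like these would then be handed to a classifier trained on labeled examples, since no single cue is reliable on its own.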

26:35

And they gave it examples of sarcasm. They fed it tweets that were tagged hashtag sarcasm, so that the machine, quote unquote, knew that that was already a sarcastic tweet and could start to analyze it and build out a model for what sarcasm is. They also fed it a bunch of one star Amazon reviews that had been judged to be sarcastic by a panel consisting of fifteen human beings, and the system was told it had to rate sentences on a scale of one to five, one being not sarcastic.

27:10

They mean exactly what the sentence says. Five being holy cow, this person should write for The Onion, this is incredibly sarcastic. SASI could identify sarcastic Amazon reviews with precision, not bad, but when it came to Twitter, it did even better, I think, probably because there had to be very short messages on Twitter. This was before Twitter had even expanded to 280 characters, so it's still back in the 140 character days.

27:43

The precision rate for SASI for Twitter was so it was really good at detecting straightforward sarcasm, the kind that a lot of people on Twitter use, because you have limited space so you can't really set it up in a more complex way, but it was also more prone to false negatives than false positives. In other words, it was more likely to look at a sarcastic message and say that's not sarcastic than it was to look at a straightforward

28:18

message and say, no, that is sarcastic. So that was kind of interesting. Back to Watson. Another use of Watson came out of the Milken Institute Global Conference, where IBM showed off some research that it had been working on internally, and it was calling this research Debating Technologies.

28:41

This was a project in which IBM was trying to see if they could feed a computer raw information, have the computer synthesize the information, understand that information, at least on a computational level, and then create a debating strategy for both pros and cons based on that information. So it would take a huge amount of content, like all of Wikipedia, for example, and then on any given subject that would be covered in Wikipedia, it would be asked to form an argument that is in favor of or

29:19

is against a concept, whatever that concept might be. John Kelly of IBM showed off in a demo how the tool could be used to predict pro or con arguments about a subject based on a body of information. So you might be able to use this technology in order to anticipate what an opposing person might say on any given subject. Let's say that you are getting ready to debate a topic. You might feed that information to a

29:55

computer system using this Watson platform. You might feed in a ton of information, and then you might say, imagine someone who is against this particular topic, whatever it might be. Let's say it's renewable energy and the efficiency of solar panels, whether or not it makes sense to invest in solar panels. Let's say that your stance is that you have to argue

30:22

for solar panels. You might say, what would someone who wants to argue against solar panels say? And then Watson would analyze this information and return to you what it thinks would be an argument someone would use to support that stance, and then you could prepare for that, which would be an incredible tool. I mean, you could think of this as for political debates. It would be amazing.

30:50

You could think of how you might want to prepare so that you can argue intelligently against an opponent, and you can already anticipate what that opponent is going to say because you know their general stance on a topic, but you might not know what tactics they might use to support that stance. Maybe politics isn't a great choice because that's not always in the realm of rationality. That often falls into a call toward emotional response rather than

31:17

rational response. That's more of a commentary on politics in general. Regardless of what side you might be on, all sides do this. Anyway, he actually showed at this

31:29

demo a different example. He said, what if you were to take the statement that the sale of violent video games to minors should be banned? That's the topic, and the computer would then go through all the information it had access to. It would end up sorting out all the parts that were relevant to the discussion, so it just put those aside and that would become the core of the

31:52

data it would reference. It would then go through and identify basic statements as either being a pro stance for banning the sale of violent video games to minors or a con stance, saying no, we should be able to sell violent video games to minors. The tool scanned four million articles.

32:14

It returned the top ten articles that were determined to be the most relevant to that particular debate, and it scanned approximately three thousand sentences from top to bottom, and it then identified sentences that contained candidate claims, that is, statements that could be interpreted as being either pro or con for the stance. Then it identified the parameters of those claims. Then it assessed the claims for pro and con polarity, then constructed a sample pro

32:44

or con statement. And the statements in the demo were kind of interesting. And since the computer is constructing arguments based upon what people have already written, it would reflect a lot of vague statements that aren't a firm stance. So, in other words, it couldn't take a bunch of stuff that was written that itself did not take either a pro or con stance and then transform that magically

33:09

into the perfect pro stance or the perfect con stance. It's dependent upon the words that human beings have already written, so it could not magically come up with a killer argument if the data that had been written about this subject didn't come down on a firm stance one way or the other. The point of the demonstration wasn't to create a tool that could either troll people or counter trolls. It was to show that a computer could be useful to aid in the reasoning process when you're

33:43

making a critical decision. Again, to go back to that medical example, it could be used to help a doctor determine which diagnosis is the most likely to be accurate for a patient, what course of treatment might be the most helpful for that patient, and thus it could have real practical use outside of this more esoteric, interesting debate use. Now, will we see computers in the future able to detect sarcasm just as easily as your

34:17

typical human being can when given the right circumstances? And I use the word typical reluctantly, but you get what I mean. I don't know. It's gonna take some time. It takes an awful lot of processing power too. You have to remember that for these neural network systems, the ones that are running these various platforms and programs and strategies, they take up a lot of processing power. Our brains have billions of neurons in them, so we

34:52

have a very sophisticated supercomputer sitting in our heads. Moreover, our brains are insanely energy efficient. They require about the equivalent of twenty watts of power. A supercomputer needs a lot more power than that. So while we're seeing advances in this, it requires so much processing power, so much energy. It is not a practical approach to most forms of computing,

35:19

at least from a consumer standpoint. You might see a future where this sort of stuff is all in the cloud and then we can access it through an app or a program or whatever. That way, you don't have to have a supercomputer sitting on your desk in order to tap into those capabilities, but you have to have an Internet connection, which most of us these

35:41

days tend to have fairly frequently. I mean, there are a lot of people out there who at this point have had a persistent Internet connection for pretty much their whole lives, which blows my mind. But that's the kind of world we'd have to live in in order to really take advantage of this, at least in the near term. I don't know if we're going to see a computer that can analyze, say, an article from The Onion and not only point out that it's being sarcastic or ironic,

36:09

but also point out why it's funny. I think at one point, when you start analyzing comedy, that gets to be a level where nothing is ever funny ever again. But it is a really interesting problem. So that's this look back on whether AI is ever going to understand sarcasm. I'm curious to hear what you

36:28

guys think. Do you think we're closer than I am suggesting? Maybe. Well, I mean, we're definitely closer than we were when we did this episode on Forward Thinking, because that was a few years ago. But I don't know that we're, you know, significantly closer. It's a real tough problem. Or do you think that sarcasm is one of those things that's just innately human and machines are never really going

36:50

to be able to handle it? We've got a lot of programs out there that appear to be sarcastic, but that's because they're acting on preprogrammed responses to things that we ask them. It's not exactly the same. It's kind of cheating, but I'm curious to hear what you guys think. Also, make sure you go to our brand new website for TechStuff. That's TechStuff Podcast

37:13

dot com. That's where you're going to find all the links to all sorts of stuff, like how to contact me. In case you're wondering, the best way is through email. It's techstuff at howstuffworks dot com, or through Facebook or Twitter, that's TechStuff HSW. But all that information is also on the website, as is a link to our store at TeePublic. Remember, every single purchase you make at that store helps out the show.

37:34

Don't forget to follow us on Instagram, and I'll talk to you again really soon. For more on this and thousands of other topics, visit HowStuffWorks.com.

Transcript source: Provided by creator in RSS feed
