Dr Yossi Keshet on Decoding Speech, AI, Morality, and the Future

⁠¶ Intro / Opening

00:00

Welcome back to another riveting episode of Data Driven. Joining us today, lakeside and positively glowing from his Appalachian retreat, is Frank. Meanwhile, the always astute and ever energetic Andy is here to keep us grounded. But enough about us. Today, we have a true luminary in the field of AI, someone who's blending the worlds

00:22

of academia and enterprise with seamless finesse. He's an associate professor at the Technion, has published over 100 research papers on automated speech recognition, and is the chief scientist at AIOLA. Please welcome Dr. Yossi Keshet or, as he's known to his friends, Yossi.

00:47

Alright. Hello, and welcome to Data Driven, the podcast where we explore the emergent fields of artificial intelligence, data science, and, of course, data engineering, without which the whole world would probably stop turning. And you know, data engineering is important. That's basically it. Still working on that revamped monologue for season 8, Andy. Were

01:10

you on vacation? You're on vacation. I am on vacation. And for those of you who can't see on camera, who are listening, not watching, I am literally lakeside, in the foothills. Well, not the foothills. We are actually in the Appalachian Mountains. Or is it Appalachian? I never... I've heard of those. I never got a clear read on it. Say either. So, you know... I say either.

01:32

Yeah. Yeah. So I am in Deep Creek Lake, Maryland, which is kind of like... Maryland doesn't really have a panhandle per se, but if it did, this is what it would be. I think I'm probably 5 miles from West Virginia and about 20 miles from Pennsylvania. So it's kind of like this quiet little corner of the state. And I've been, you know, reading and studying. Today, I hit day 600 consecutive on Pluralsight. Nice. So, recording this June 17th. And, how

02:06

things with you, Andy? Things are good. I'm gonna throw out a plug for datadrivenmedia.tv because, if you're listening, while Frank was mentioning that, he was actually panning the camera over to the lake. But if you're subscribing to datadrivenmedia.tv, you get to see us. You get to see the video, and you can see, for instance, that I am wearing my "Data is the new oil" T-shirt, which you can pick up. I'm just full of

02:35

sponsor stuff today. I'm just doing... Well, it's self-sponsored. And, honestly, we really need to get better at that. Right? We have data channel. Tv. For listeners to the show, I will give a preview: Data Driven Academy is launching soon. You have a course coming up the end of the month. Actually, yeah, it's Fabric. We're recording this on the 17th; it's the 24th of June, but I'm also doing 2 more, at

03:03

near the ends of July and August. And in addition to that, while we're shamelessly plugging away here, before we get to our very interesting guest, I'm also bringing back my Day of Azure Data Factory, which was wildly popular. I delivered it at a couple of international conferences in '22 and '23. And, yeah, let's see if people are interested. What do you do Friday afternoons, Andy? Oh, there's this thing, Frank. Thanks for

03:34

mentioning that. Totally free. We're trying to get better at this. That's all. We do, yeah, Data Engineering Fridays. And if you go to dataengineeringfridays.com, you can learn more about that. Frank, I noticed you're doing a lot of stuff using the encore replay feature in Restream. And you shared that with me. I started doing that with Data Engineering Fridays as well. It's a great way to, you know, get your message out there. And, you

04:04

know, I had no idea replays would help. But my gosh, they really have. It's just a matter of hitting the... I can't even talk... the algorithm the right way. Yeah. And, you know, I think it's a good segue for our guest, Dr. Yossi Keshet. He's the chief scientist at AIOLA, an AI-powered tech company that automates business workflows

04:30

by capturing spoken data. Yossi is also an associate professor at the Faculty of Electrical and Computer Engineering at the Technion in Israel. Yossi is an award-winning scholar and has published over 100 research papers about automated speech recognition and speech synthesis. Welcome to the show, Yossi. Hi. Nice for having me. Thank you for having me. Hey. No problem. No problem. We are very excited to have you. And, you're not just an

04:59

academic, but you've also proven yourself in actual enterprise. Which sounds really bad as I say that out loud, but I think you knew it was a compliment. So what is AIOLA? Can you tell me a little bit about that? Because I'm curious about that and about workflows around spoken data. So

⁠¶ AIOLA targets foundational industries through AI.

05:27

AIOLA is a company that aims to target the, you know, very basic and foundational industries. If I may, let's start with the general scene of automatic speech recognition now, and then you will understand where AIOLA stands, because we now have OpenAI and everything, and you could say we solved the AI problem. It's not like that. We are in amazing shape in

05:57

terms of automatic speech recognition. We have a paper that shows that Whisper, the model of OpenAI, is as good as humans in detecting and transcribing language when we speak about American English, with noise, without noise, and also with L2 speakers, that is, non-native speakers of the language. And the results are that Whisper, the OpenAI model, is the same as human listeners. And that is

06:26

the main thing. But the thing is that when you come to industries, usually they have jargon, they have special words. And those words are either rare in the language or they are non-words. Say I'm a medical doctor performing a surgery and I would like to transcribe what I'm saying during the surgery. There are words which are not

06:53

often used or which are non-English words. And in that case, those automatic speech recognizers don't work at all. They don't detect those words. And at AIOLA, this is our target: to take those words, which are actually the most important words. Those are the jargon of the industry, of the facility. So the goal is to help those industries come up with automatic speech recognition for reporting, for transcribing speech.
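
To make the jargon problem concrete, here is a minimal sketch of what a missed jargon word looks like and one naive way to patch it. It is a text-only stand-in chosen for illustration, not AIOLA's method, which the guest describes later in the conversation as a neural keyword spotter working on the audio. The function name `snap_to_jargon`, the cutoff value, and the example term are all assumptions; only the standard-library `difflib` call is real.

```python
from difflib import get_close_matches

def snap_to_jargon(transcript, jargon_terms, cutoff=0.8):
    """Replace near-miss words in a transcript with the closest jargon term.

    Illustration only: fuzzy string matching on the recognizer's text output,
    using a user-supplied list of jargon terms defined in advance.
    """
    corrected = []
    for word in transcript.split():
        # get_close_matches returns up to n candidates scoring >= cutoff
        matches = get_close_matches(word.lower(), jargon_terms, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)

# Hypothetical case: a general-purpose recognizer misspells a rare surgical term.
print(snap_to_jargon("pass the laproscope", ["laparoscope"]))
```

A text-level patch like this only helps when the recognizer's output is already close to the right word, which is one reason to detect the jargon in the audio itself.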

07:25

I have a question. When you say automatic, what what makes it automatic? Is it just kinda, what exactly does that mean?

⁠¶ Automatic speech recognition similar to ChatGPT model.

07:34

So automatic speech recognition today works very, very similar to the way ChatGPT works. ChatGPT works on a model called the transformer. It's a deep learning architecture, which has a history based on previous recurrent architectures. And, as we all know, it can predict text amazingly. In automatic speech recognition, it's almost the same thing, but there is another component to this transformer, which is called the encoder.
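
The transformer mentioned here is built around an operation called attention, which the guest returns to later in the conversation. As a rough sketch, assuming toy two-dimensional vectors and the hypothetical names `softmax` and `attend`, scaled dot-product attention for a single query can be written in plain Python like this. Real models use learned projections, many attention heads, and tensor libraries, none of which appear here.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query over a short sequence."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is the weighted average of the value vectors.
    mixed = [sum(w * v[i] for w, v in zip(weights, values))
             for i in range(len(values[0]))]
    return weights, mixed

# Hypothetical vectors for three positions in a sequence.
keys = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
values = keys
weights, mixed = attend([1.0, 0.0], keys, values)
print(weights, mixed)
```

The position whose key points in the same direction as the query receives the largest weight, which is the sense in which one element of a sequence "refers to" another.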

08:12

This part takes the speech and actually transfers it to a great representation that can be used with, let's call it the other side, with this GPT. Together, they can transcribe speech, as I described, in a very good way, as good as humans in some cases. I will say, like, I've been messing around with the app that's on the phone for ChatGPT, and,

08:45

I use the voice interaction feature. It is amazingly good at getting rid of the ums, the ahs, the scatterbrained thoughts that I sometimes have when I talk to it. Like, it can really distill a lot of things. I'm impressed with it. The last time I did anything serious with speech recognition was probably, like, maybe 4 years ago, and it's really improved. Like, I mean, orders of magnitude

09:11

better than I thought. I mean, it's almost at Star Trek level. You know? I'm not sure about those. It depends on the company, if it's Apple or Google, and they don't declare which models they use. I think, personally, they don't use Whisper or the latest models that we have for automatic speech recognition, that is, transcribing speech. And the goal is a little bit different

09:36

on the phone. You actually want to maybe... Right. ...make notes, send an email, send a text message, and maybe the vocabulary is less defined. There is another problem with the phones. Oh, no. Go ahead. I want to call my friend. His name is Xi, and the last name is Chung. How do you pronounce it? What do you do with that? Am I gonna say "he" or "chi"? So there is a problem of proper names and how do you

10:10

define them. And this is a completely different problem. It's still an open problem, and the goal is a little bit different. So when we are assessing the quality of those models, it's a little bit different from the assessment of just spoken language like what we do now. No. I mean, that's a great point. I mean, my last name, you know, technically is Lavin. But, you know, growing up, for reasons many,

10:38

big and small, it became Lavinia. And so the phone, depending on if it's an Android or an Apple, gets confused pretty easily. And that is an interesting point. Some names... Andy is lucky to have an easy name for the system. But not everybody does. So I understand that. Sure. I also wanna double-click on American English. You said that a bunch of times. Like, is there an inherent bias in these model trainings because these are done by American

11:13

companies? Yes. There is. Okay. The

⁠¶ American English research bias in speech intelligibility.

11:17

data is mostly American English. The research institutes are mostly American. So I don't know if you'd call it inherent or implicit bias, but there is a bias, definitely. We are investigating, by the way, the intelligibility of speech in some cases. What is the intelligibility for an American listener versus the intelligibility for myself, who am not an American listener, but I know English?

11:48

What is the "best" speaker? What is the best listener? How can we transfer those to a speech recognizer? How can we transfer those to assessing the quality of speech? What does it mean? What does it mean about the pathologies in speech? And there is ongoing research in this field. Interesting. I often wonder, like, you know, it's not just English. Right? Like, you know, if you listen to Spanish, there's different dialects of

12:19

Spanish. Right? Even German. You know, I'm sure there's plenty of dialects of all these languages and, like, how do you train a model so that it can get to be as good at understanding x and x versus x and y versus, you know, the base language, the base standard? I don't know. That's fascinating. It seems like it could be an endless loop of, like, training. It is. Indeed, it is. And when we train, there is another... so I'm

12:52

working on deep learning and AI. And what we found out is that it may be the case that if you train on one language, a huge amount of data from one language, let's say American English, but then train on less data in Spanish, you actually get some advantage from the training on American English. So, again, in this model, Whisper of OpenAI, most of the data is American English, but, actually, other languages are really great.

13:22

Again, Spanish is amazing. So maybe, like humans, as we learn more and more languages, it's easier for us. This is a very interesting point.

⁠¶ Studying foreign languages improved understanding of grammar.

13:33

No. That's an interesting idea because I know, like, I never understood English grammar, American or otherwise, until I studied a foreign language. And when I studied it, it was German. And, you know, German kept a lot of the archaic things that are in English and continued to keep them important. Like in English, you know, who and whom used to confuse the you-know-what out of me.

14:01

Right? But when I learned in German about different cases and things like that, I was like, oh, that's why it is. Right? So, just like you said, having more data or data from another point of view, I suppose, or another way to look at the world, helped me look at my world a little better. Maybe that's how AI will work too. I don't know.

14:28

Maybe. We don't know. We actually have a guess about that because those networks actually solve an optimization problem, a mathematical optimization problem. It's a problem that we define with an equation, and we need to have a computer running to solve it. The equation is over a training set of examples. So it's: one person said this, another person said something else.
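
The equation is not spelled out in the conversation. A common way to write an optimization problem over a training set is empirical risk minimization, sketched below with assumed notation: $x_i$ is the $i$-th recording, $y_i$ its transcript, $f_\theta$ the network with parameters $\theta$, $\ell$ a loss such as cross-entropy, and $N$ the number of training examples.

```latex
\theta^{*} \;=\; \arg\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_{\theta}(x_i),\, y_i\bigr)
```

Each training example contributes one term to the sum, so "one person said this, another person said something else" corresponds to different pairs $(x_i, y_i)$.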

14:55

And what happens is that, again, when we have a large amount of data, it seems that those networks get to an amazing place. So this algorithm, Whisper, or other algorithms, it's really from the recent years, like 2, 3 years. That's it. They perform amazingly with the same mechanism, not with the same amount of data. Yeah. That's the

15:29

fascinating aspect of all of this. It's just that some problems seem harder than they ought to be, and then some solutions to problems seem way more effective than they ought to be. It's also interesting to say... so Whisper, OpenAI Whisper, was trained on 600,000 hours of speech. But this is way, way more than just a kid learning a language.

15:56

A kid learning a language is exposed to far fewer hours of speech, and speech that is less accurate, less coherent. And this is something Noam Chomsky raised years ago, like, 50 years ago. And it's still an open question, like, whether we can make those systems work better if we know the language. I guess you learned German faster than any machine that works today.
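
To put the training figure quoted in the conversation in perspective, the hours can be converted to years of continuous, round-the-clock audio. This is plain unit conversion on the number as stated, nothing more.

```latex
\frac{600{,}000\ \text{h}}{24\ \text{h/day} \times 365\ \text{days/year}}
  \;=\; \frac{600{,}000}{8{,}760}\ \text{years}
  \;\approx\; 68.5\ \text{years}
```

That is more than 68 years of nonstop listening.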

16:30

That's yeah. It's it's and I'm glad you mentioned Noam Chomsky because that kinda was like so for those who don't know, Noam Chomsky is, among other things, a noted linguist scholar. I highly recommend you do a search on him because that's a that's a good Wikipedia rabbit hole to fall into. But, how much does linguistics come up in this? Right? Because I think what's fascinating about this field for me is a lot of, my grandfather, my great grandfather

17:01

was a linguistics professor. And, you know, as the family lore goes (I never met him; he died a decade or 2 before I was born), he spoke, like, 12 languages. He was a professor of, like, 5 or 6. And, you know, a lot of people on that side of the family seem to be gifted in language. And one of the fields I was tempted to study in

17:23

university was linguistics. And I just find it interesting how the overlap in the Venn diagram is now much larger than it used to be in terms of linguistics and computer science. So what are your thoughts? Like, if you have a company like AIOLA, right, how many people are, you know, honest-to-goodness linguists versus computer scientists and AI engineers? So there are no linguists there. Oh,

17:59

really? Okay. There are no linguists. But I have to tell you, there was a professor called Frederick Jelinek. He was the head of language research at Johns Hopkins University in Baltimore. He was amazing. He was one of the smartest people on earth. And he developed many of the speech recognition algorithms. He said, "Every time I fire a linguist, the performance of the speech recognizer goes up." And this is embarrassing. But I've been... I

⁠¶ Passionate about linguistics and cognitive sciences. No AI has this capability.

18:36

myself, first, really like linguistics. I really like cognitive sciences, and I really try to combine them with my work. But it's really amazing that all those AI systems don't have any of that. So you don't teach ChatGPT what is a noun, what is a verb, what is anything. You don't use linguists. You don't tell the speech model, this is the prominent word, this is the end of the sentence. It just happens through a huge amount of data. And

19:14

this is interesting. This somehow contradicts Noam Chomsky, who said that there is a universal grammar, that language is innate, that there is maybe some black box in our brain which is tuned to learn a language. And we are not sure about that. There is no direct proof whether it's correct or not. As humans, we are born with language. This is part of our being human. We are not born with written language. Written language was invented.

19:48

The spoken language is something like how a zebra has stripes. This is our nature, and this is interesting. This is not happening in AI. The best successes didn't have linguists; they don't have any restriction on what should be said or not. Maybe AI will be a tool to somehow make linguistic research more effective and try to understand what happens in the brain, what happens in the cognition part.

⁠¶ Phenomenal correlation between artificial and neural mechanisms.

20:23

But I would like to tell you about another research project we are preparing here, which is really amazing. One of the things is that, so, there is this ChatGPT. It's a language model. We also have something in the brain. It's also a neural network. And when we try to compare them, there is a huge correlation between what happens in the artificial neural network of GPT and the neural

20:50

biological neural network in the brain. It was shown several years ago, and here we show it again with the most modern automatic speech recognizers. So this is a phenomenal correlation between the artificial and the neural mechanisms. I was gonna ask about that because I'm familiar with, you know, at least the abstracts of

21:17

the research from a few years ago and now. And I was curious if there had been any new correlations or, you know, new research, new connections that have been made between machines learning languages and the way our brains work. It sounds like that's true. So we just initiated a research project here in my lab about that. There were some French guys, mainly King and his colleagues at Meta, and I forgot the university in France. So they

22:01

showed that there are those correlations. They showed simple correlation. They showed it with an LLM, with a language model. What we show is a little bit different. We show correlation with automatic speech recognition. So we put people under fMRI. We scan their brain at some resolution, and we try to find correlation with their brain activity during reading and during speaking aloud, and ask what is the correlation with the best model we know for

22:31

speech recognition. And there are correlations. I have to say that there is a mechanism in the transformer, this architecture of neural network. There is a mechanism called attention. This mechanism allows those models to have the connection between words and themselves. So, "I'm eating an apple. It was delicious." So "it" refers to the apple. Okay? So there is an attention mechanism. This is what makes those models amazing. So there is an attention mechanism, I guess, in the

23:01

brain. So we try to correlate this attention mechanism in the models and compare it to the activity in the brain. We don't have results yet, but it seems promising. And we also ask another question. What if you don't read aloud? What if you read silently? What if you have dyslexia? What if you have another type of pathology? What are the correlations then? So this is fascinating. And there is correlation. I still don't know what's going to happen

23:31

with that. But I I know the pathologist, but it's unbelievable, the correlation. That that is really exciting, especially when you're examining things like dyslexia, which is considered, you know, not normal, or maybe that's not the right term for it, but a challenge at a minimum. The cool the cool kids call that neurodivergent

23:52

now. I think Neurodivergent. Thank you, Frank. So when you're studying, you know, when you're studying that sort, I'm wondering if there's a place for that, in in the artificial. I'm curious. What what do you mean? Can you So, yeah, is there is is there any benefit to, I say, transferring the thought processes of people who are neurodivergent and and automating that and making that part of the, you know, the the language model or or speech recognition?

24:29

Yeah. I think so. I think so. First, it's a tool to analyze what happens in the brain. Yeah. But it's very difficult. We don't have any debugger for the brain. We don't see the code of the brain. We don't see that this function doesn't work. And most of the work is to design the experiment, and

24:55

it's really amazing. In our design, as I told you, I'm asking people to read aloud and comparing it to what automatic speech recognition is supposed to do. But I'm also asking people to read silently, and then I follow their eyes. I have a machine that follows their eyes. I track their eyes and I see which word they are reading now. And I can use that to follow

25:28

what they read. But in order to operate that on a speech recognizer model, I need the speech. So during the design of the experiment, I need artificial speech or I need them to read aloud afterwards. It's a big question how to do that properly and how to make things happen, but definitely working with people with problems is, first, to help them; second, to understand them; and third, to maybe understand the brain and make AI better.

26:04

I also think, like, stroke victims, right, could benefit down the line from a better understanding of language models. Right? Like, maybe there would be some kind of therapy that could be directed to that. I think it's fascinating. I always love those fields where they touch upon more than one thing. Right? This isn't just math. This isn't just computer science. Like, it's linguistics. But, you know, it's a little bit of everything. It's like a giant, like, pot of

⁠¶ Innovating transcription: improving on old industry practices.

26:26

stew that you just throw a bunch of stuff in, and it all kind of mixes. It's almost like intellectual gumbo, I guess, would be the word. Right? But what drove you to make your company? Like, what was the driving force to say, hey... You know, I remember many, many years ago, in an office, you would always see doctors talking into these little, like, miniature recorders.

27:01

Right? In the olden days, they would go off to some data center somewhere and somebody would... not data center, but, like, some typing center, call center where people would transcribe that. You know, obviously, that is now an artifact of the past as these models have gotten better. What was the goal in your company to say we can do this better? What was that breakthrough moment of, like, here's what the industry already does, here's how we can do

27:33

it better. So there is

⁠¶ GPT's influence on various fundamental industries.

27:36

so we all know ChatGPT, and it influences our life. Instead of Google, we now search with GPT, and it's amazing. It's unbelievable. So I thought, what about the very fundamental industries? What about, like, when you check an airplane? You use a special jargon. You cannot touch anything. You cannot leave even a pen there because otherwise the plane wouldn't be valid for flight. What about industries like the food

28:08

industries, when you need to report the process? You have gloves, you cannot touch an iPad, you can barely write. And what about other industries like, maybe, chip technology, when you make nanotechnologies and when you make, you know, silicon chips and silicon wafers? You are in coveralls. You are wearing gloves. You need to report the process. All those industries have special jargons. They use special

28:41

terms to describe what they're doing. They don't have access to write something, and they are very limited in the way they provide. And on the other hand, we had speech recognition, but speech recognition doesn't work on those jargon words. Those jargon words are actually the most important to those industries, and this was the goal for AIOLA. So what we do is we operate automatic speech recognition, the best automatic speech recognition,

29:12

but we also operate something else. We also operate something called keyword spotting. It's another deep network, which is focused on detecting only the jargon words. So you can define those jargon words in advance. You don't need to train on them. You can define them, and they all work together. They work as a complementary couple to make a very robust prediction, and we can detect those jargon words and make reporting on the

29:44

process just by speaking. So it can be used in any industry, any industry that doesn't have access to the most modern AI systems, where the speech recognizer wouldn't work. They have problems, like, writing and formulating their reports. Yeah. So I'm curious how those work together. You mentioned that you've got the speech recognizer. You've got the keyword,

30:15

engine. Are they 2 separate engines that are just always running, maybe agents running at the same time, or are they encapsulated? Say, does the speech recognizer have a, you know, a subset or a function built into it to do the keyword recognition? So just to be sure, those keywords in some industries are not English words. So it can be a word which nobody

30:44

knows about. It was not shown on the Internet. Like, ChatGPT is trained on the data over the Internet. There are some words that are not there. This is your proprietary company; you have invented a word to describe this part of the engine. So... Yeah. So we have this keyword spotting. It is trained to detect keywords in general. They are defined by,

31:10

text, and it operates. We have 2 modes of operation. One of them works on the encoder part of the automatic speech recognition, and then it guides, it steers the speech recognition towards the correct transcription. And there is another mode, which uses our own representation of speech, and then it also guides the automatic speech recognition to a better location and to detect those

31:39

words. And, actually, we can show that you can combine any words, they can be from different languages, and we can detect them, like, almost 100% correctly, those jargon words. That was going... sorry. Go ahead. No. No. No. Sorry. No. That's okay. That makes perfect

⁠¶ Using multiple languages can enhance comprehension.

31:58

sense now, what you just said about using multiple languages, you know, English plus all of the other languages, because sometimes people will struggle if they're an English-as-a-second-language speaker. They'll struggle to find the right English word, and they'll substitute a word from their native language. And in other cases, they'll be perhaps teaching on a topic, and they may revert back to an older language, Greek, Latin, something

32:32

like that. That may be part of the, the lecture or, you know, I could see that in medicine. I could see it in, you know, all all sorts of literature studies. I could see a lot of that. And that that kinda clicked for me as you were saying that that makes sense that you would have additional languages. Yeah. I also wonder, like, in in also conversational context. Right? Like, you know, Spanglish is a

32:57

thing. Franglais is the French and English kinda mashed together, and I know that whenever you have 2 groups of people kinda come together, like, you know, there's always some kind of weird mix of language that kinda just evolves, either naturally or forced. I mean, that's... Right. That's another debate. Are you thinking Belter Creole? Belter, you know, I wasn't going there, but that's an excellent example.

33:24

So, Yossi looks very confused. So there's a series of books called The Expanse. It was an excellent TV show for about 6 seasons, and it's basically set 200, 300 years in the future. And as humans colonize the asteroid belt, people from all over the world kinda all end up living together. So the Belter Creole language is a creole of, you know, literally dozens of languages. Right? So, like, it'll switch from, you know, Hindi to Arabic to,

33:57

English to French to... there's even some German in there. I've heard some of that. And there are these kind of weird mixes of things. Right? So the word for the Belter people, like, people who live in the Belt, is Beltalowda. Belt obviously comes from, you know, the asteroid belt, English. Lowda, I think, is a Hindu term. I think. Don't hate on me in the comments. Don't hate on me in the comments.

34:19

But I know wallah is a Hindu term. Right? So, you know, when they talk about people who live on Earth or Mars, they refer to them as well wallahs, gravity well wallahs. Right? And I only know wallah because of dish wallahs; Wired magazine did a whole story about dish wallahs in the nineties. Anyway, I mean, I think, like, you know, I suppose that approach could work for something like a creole. Right? Like, we have

34:46

multiple languages kinda mixed together. Or is that not really a massive business case? Creole is really complicated. It's a language. It's a real language, and it's complicated. The more delicate case of that is what we call in research code-switching. When I'm... Right. When I speak Hebrew, for example, I don't have a

⁠¶ Switching between languages in code-switching research.

35:08

word for the, you know, the Internet router. So I say "router" in English. Or I say "email" or, I don't know, there are so many words in English that are used, especially in technology, worldwide in other languages, and this is code-switching. There is another case. I think Andy pointed it out: sometimes when you are stressed, or let's say your L1 is Spanish, but your L2 is American

35:36

English, or you're bilingual. And sometimes when you are stressed, you just switch the one word, and this is an amazing phenomenon. This is research with Tamar Gollan from the University of California, San Diego, and Matt Goldrick from Northwestern University. And I provide, again, a mechanism to detect that and to do research on that. And the key question is, like, why do you do that? Why and when do you do that? Is it stress? What is the state of

36:05

describing those? Are you gonna describe it in the American way, the Spanish word, or is it gonna be vice versa? And this is really interesting. It's not my field of research. I just know how to detect them, and Interesting. To detect them really well, but I don't know why it happens and what the mechanism behind it is. I could definitely see the opportunity in starting with being able to detect, you know, these, I

36:42

don't know the right word for them. I'll call them modes. You know, a mode of speech where someone is mixing two languages. And I'm sure those vary. So Like when I go Jersey on you. Right? We can't say any more about that, Frank. We're trying to keep our clean rating. But yes. Exactly. But, sorry. Inside joke. But, yeah, I could see modes of speaking for someone who is more familiar with English as a second language.

37:16

And, you know, of course, they know their native language. They'll always know that. But as they, I don't wanna use the wrong word here, but I'm thinking experience is probably the best word, as they gain more experience with their second language, they may switch words less or switch languages less. And detecting that, I think, is key. I understand now more about what you're doing, what

37:42

you're accomplishing. And that's the very first step to then being able to produce speech in those different modes. And that would be, you know, a fascinating accomplishment.

37:58

The more we can have machines speak to us in the language that we're most familiar with, that, of course, you know, is almost there now, mostly there right now, but have it be able to speak to us in these different modes, where the machine switches back to our first language, you know, based on some algorithmic calculation. That sounds fascinating. Yeah. It is. I'm not sure we are there yet. We have a long way to go there. But, Sure. Yeah. Makes

38:34

sense. Fascinating. Well, this is how it starts, though. Right? This is fascinating. This is, yeah, somehow there is an elephant in the room. We may have to say something about AI and its regulation and what happens now. And, if I may, I would like to say something about this because I have a totally different point of view about that. Please. So everybody is speaking about regulation, and it might be a catastrophic situation if those machines are connected

39:13

together and they start to train themselves. They try to build a meta architecture and try to train themselves, and then they come up with something which is better than human. Some people call it the singularity point. So this is frightening. They're smarter than us. Maybe they're gonna kill us all. And people speak about regulation now, and there are several institutes in Europe and in the US trying to tackle that. And that

39:44

is amazing. That is really important, but I think we missed something here. And I'll tell you why. So there is a book. It's here. You know, Isaac Asimov, I, Robot. You probably know that. So, like, the first page of this book is the Three Laws of Robotics. A robot may not injure a human being or, through inaction, allow a human being to come to harm. A robot must obey orders, and so on. So let's say we have the regulation. AI cannot hurt humans. Okay?

40:16

But that isn't enough. It's not good enough, because if the AI is smart enough, I mean, it will show us humans that it really obeys the laws, but it wouldn't. And this is frightening. And here I suggest looking a little bit at human morality, and why do humans have laws? So we need to think, if I may, about human psychology. In human psychology, we have a mechanism to obey the law.

⁠¶ Superego: Freud's guilt and fear mechanism. Evolutionary.

40:47

It's called the superego. It was invented or defined by Freud. So we have a mechanism that if we don't obey a law, we feel either guilt or fear. And this mechanism was evolutionary. So say we have a group of monkeys. They obey the alpha monkey because they're frightened of him. They have some kind of

41:10

primitive superego. We obey the law because either we are frightened of the police or we feel the guilt. It's like those experiments that show that somebody left something on the table, and we don't take it because we feel guilt or we feel something. So this mechanism, what I claim, should be transferred to the

41:37

AI machine. This should be the regulation. So what is the superego? The superego is an infrastructure to be moral, and we need a digital version of that. This is the regulation we need. We need the infrastructure to be moral in machines. And what does it mean? So superego means that it's a little bit like self harm, if I may. It's like we feel guilt. We feel something bad if we do something not okay, if we don't obey the law.

42:04

So it's like a self destruction for the AI machine. So the AI machine, if it doesn't obey the law, should feel something. It

⁠¶ Book writing claiming need for non-standard regulations.

42:11

cannot feel. So, Right. It will destruct itself. So this is my claim. This is a book I'm writing, and this is something very fundamental. We all speak about this regulation, but I think it doesn't help just to do standard regulation. And if I may say another thing, the last thing is that if you read I, Robot carefully, there are several short stories there, and he speaks about robots that

42:37

obey the law. And if you look carefully at those robots that obey the law, all of them have a superego. They feel guilt. The first story is about a robot that plays with a girl, and he feels guilt about winning all the time. So he lets her win. So he feels guilt. It means that he has a superego. And then he feels frightened of the mother of the girl. And it's

43:04

really amazing. So I think, so in this book I'm trying to describe the psychological concept of superego and then describe why it needs to be moral and how we can find a way to put it in regulation, like the infrastructure itself and not just laws. That is a very interesting problem you're trying to solve. Very important problem at that. Agreed. And culturally, in the US, we have a saying that you cannot legislate morality, where legislate and regulate would be, you know,

43:42

synonyms. Exactly. Right? So Right. Right. And legal code is code. I definitely get what you're saying. And I think it's super important. You mentioned you were writing a book about this. Now you have to tell me more because I wanna read this book. Same. I'm in the process of looking for an agent and it's complicated. It's supposed to be a popular book trying to explain the psychology of Freud. What is the superego, the ego, and the id,

44:16

and then describe what is the pathology. So we all have a pathology. So you have the pathology of, it's called the criminal personality disorder. This person will not have a superego. It's like Richard the Third from Shakespeare. He didn't have a superego. He killed his family and didn't feel guilt. So this is what's

44:44

going to happen with those machines. And then I give some literature examples of what a superego is, like from Crime and Punishment, where the guy killed the old lady, but nobody caught him killing the lady. He murdered her. Nobody caught him, but he still feels guilt. So he has a very big superego. And then I describe what happens in other moral theories of human beings, all of them connected to the

45:17

superego. And then I try to describe a little bit how machine learning is trained. Again, solving an optimization problem. And then I try to describe how we can have a digital superego, if we can. No. It's like you're giving it a conscience of sorts. Exactly. Yeah. And I just wanted to add, we may be able to help you. Maybe not find an agent, but find a publisher. Both Frank and I are published. And we, you know, we know Andy has a lot of

45:51

Andy's got a lot of connections in publishing. Well That would be great. I am not, I just wrote a lot of books for different publishing houses, and I know some people that, if they can't help you directly, can probably point you to someone who can. And, again, I am wholly motivated by wanting to read this book. Same. Like, I think it's important because I live in the Washington DC area. Right? So, like, there's a lot of people there who are policy

46:20

makers. Right? Like, and they just assume, and I think a lot of humans fall for this. Right? You see this when the European Union passed their AI regulation act. They assume that regulation's gonna solve all their problems. And I think regulations prove that one of the fundamental forces in the universe is unintended consequences. And, you know, when you regulate something, you don't end

⁠¶ AI movie plot illustrates ethics in robotics.

46:46

the problem. You change the way people will route around it. Right? Like, and I think a good example of this in AI is the movie Megan, which I don't know if you've seen, or M3GAN. I'm not sure how to pronounce it, where I think she was about to torture, I don't wanna give the plot away, but the robot child, Chucky, kinda goes evil. Like, this is the basic kind of plot line, and the person who created her

47:11

was like, you can't kill me because it's against your programming. She goes, oh, I said nothing about killing you. I was gonna put you in a coma, and you'll live, you know, however many years. Like, I mean, that's a great example of, like, you know, don't kill. Right? Seems like a pretty reasonable instruction to give a robot, particularly a child's toy. Don't kill anyone. But, you know, she realized, like, well, kill

47:32

equals death. So if I don't kill you, if I just hospitalize you or incapacitate you, that doesn't conflict with rule number one. Right? Which I think is, no. Obviously, as, you know, humans, we're like, well, it's not really the spirit of the law, or the rule. But clearly, the robot, or the AI in this case, kind of figured it out. Like, I don't know. I think you're right. And any regulation's like that too. Right? How many loopholes do people discover, whether it's

48:01

tax laws or, you know, this. It's like, well, technically, it's legal. Is it actually, you know, what the law intended? No. Like, it's Yeah. You need almost something like a nuance engine, Yossi, to Yeah. To get the machine to interpret the laws. And I've read Asimov as well, big fan. And that's what happens downstream of the three laws as they begin to fail, because the robots are doing exactly what they're programmed to

48:39

do. And they're finding ways that, in our opinion, human opinion, circumvent the three laws, but really don't break the robot's programming. And it's all about, you know, how do you define harm? Like, Frank's example is a great, you know, great example of that. So, yeah, fascinating stuff. Yeah. We gotta Awesome stuff. We gotta help you write this book. I wanna read this book. Yeah. I want to raise

49:09

another point, but the opposite of the point that you raised. Like, what happens with the autonomous car, for example? Let's focus on autonomous cars. So there will be an autonomous car. Who is responsible for a car accident? Accidentally, somebody was killed. You are the owner. Somebody is the owner of the car. He sits there. He bought the car, but the car killed somebody. So who, this is an open problem. This is, again,

49:43

a moral problem. So what I suggest here is, maybe it will take time, I guess. Maybe the car, if we can build the superego mechanism for morality, you know, just the infrastructure for morality, can take the morality of the human. And if somehow it inherits the driver's morality, you can blame the driver. I'll give you another example, which will be maybe much more concrete. So we say now that there will be ChatGPT for every person, for every laptop and iPhone and whatever.

50:21

You will have your own GPT with your own life, which follows your own history. And the

⁠¶ GPT discussion focuses on personalized and helpful interaction.

50:28

discussion with this GPT will be very personalized and very helpful. What happens in that case? So in that case, if this GPT will take your responsibilities and morality, somehow we can copy your morality and it will be part of it. So if you're moral, it will be moral. If you're not, it's not, but this is your responsibility as a human. And I think this is the way to go with that. We need just the infrastructure and not the law. Anybody can define the law, and anybody

51:01

can break the law. We just need the infrastructure for at least the machine to know that it broke the law. And this is really important. I think Oh, I totally agree. Totally agree. Well, gosh. We're coming up on time, Frank. Yeah. This was awesome. So, any book recommendations? Obviously, I, Robot, I think, would be good reading in this space. You also mentioned Shakespeare too,

51:34

Richard the Third. So a book which I'm reading now is by Despentes, Vernon Subutex. It's amazing. It's three books, and it actually discusses whatever is not AI, anything which cannot be solved with AI. It speaks about a person who has a vinyl shop, a shop to sell vinyl and then CD-ROMs, and now he cannot sell anything. So this shop is closed, and then he tries to somehow manage, but he ends up on the street. He's, like,

52:08

homeless, and he meets many people. And, like, every chapter is a different person or a group or pair of people, and it's really fascinating. It's all those things that you cannot solve with AI. It's all the human interaction, the very, very basic human interaction. Amazing. It won the Booker Prize in 2018. Nice. Where can folks find out more about you? So I have a website under Joseph Keshet, and they can find me there. Excellent.

52:47

Any parting thoughts, Andy? No. Just great interview. I appreciate that. One, I would ask if you'd repeat the name of the book you just mentioned about the different stories. What's the name of that book? It's a single story. It's by Despentes, Vernon Subutex. It's from French. Oh, okay. Amazing. Amazing. Amazing. Awesome. Excellent. That's it. That's it for me. But that's a great talk. Thank you. Excellent talk. Thank you.

53:18

And we'll let Bailey finish the show. Well, folks, that brings us to the end

⁠¶ End of insightful data-driven episode, future technology.

53:22

of another enlightening episode of Data Driven. We've navigated the fascinating intricacies of automatic speech recognition, explored the moral quandaries of AI, and pondered the future of technology with none other than one of the best minds in the field, doctor Yossi Keshet. Remember, if you enjoyed today's conversation, don't forget to subscribe to data driven media dot tv for exclusive video content. You can also grab some fantastic merch like the my data is the

53:52

new oil t shirt Andy's sporting today. And while Frank is basking in the Appalachian sunshine, you can bet we're already cooking up the next episode to keep your data driven minds engaged and entertained. Until next time, stay curious, stay informed, and always keep questioning. Cheerio.

Transcript source: Provided by creator in RSS feed