KCAA: Inside Analysis with Eric Kavanagh (Sun, 22 Oct, 2023)

00:00

...frustration that the House can't even vote on his resolution that condemns Hamas and supports Israel. The Foreign Affairs Committee chair added that the House's inability to govern empowers and emboldens adversaries to claim that democracy doesn't work. Nine House Republicans are now considered likely candidates for Speaker after the party dropped Jim Jordan as their nominee

00:18

on Friday. I'm Chris Karragio, NBC News Radio. You're on board KCAA's Inland Express. KCAA Loma Linda, 1050 AM, the station that leaves no listener behind. The information economy has arrived. The world is teeming with innovation as new business models reinvent every industry. Inside Analysis is your source of information and insight about how to make the most of this exciting new era. Learn more at insideanalysis.com. That's insideanalysis.com. And

00:55

now here's your host, Eric Kavanagh. Yes, oh yes, folks, welcome to the future. Indeed, your host Eric Kavanagh here on the only

01:07

coast-to-coast radio show in the US of A that's all about the information economy, Inside Analysis. And folks, I'm very excited today to have a very special guest, an industry visionary, someone who has been working in the space for a number of years and has always seemed to have his finger on the pulse of where things are and where things are going, and that's especially difficult to do these days with this remarkable new power of generative AI. We'll

01:34

be talking to Andrew Turner. He's the general manager of a company called Praxi Data. They're doing some very interesting things in that space. And before we dive in, I wanted to throw out a bit of my past here as I think about a metaphor to describe what we're seeing in the world and what this gen AI stuff is really doing. And I'm reminded of the philosopher about whom

01:53

I wrote my thesis in college in philosophy, Jacques Derrida. He wrote this very dense piece called "Structure, Sign, and Play in the Discourse of the Human Sciences," and it was deep. He starts off with this whole concept. He says, let us assume that something has happened, like an event, if that word weren't so loaded, and he refers to it as a sort of

02:16

rupture or redoubling. And what he's kind of talking about is a movement to another level of self-awareness, almost like cultural self-awareness, and how that cultural self-awareness is now shaping and sort of reshaping how we interact with each other, how cultures are shaped around the world and change. And if you think about this generative AI stuff, what these folks have really created is a reflection of the text of our world.

02:49

That's really what it boils down to. These engines like OpenAI and Bard and LLaMA and Llama 2, and all these huge foundational models, as they're called, have been trained on the corpus of text on the internet, whether it's Wikipedia or any other source. Twitter, for example, is a very good source of information, or X as it's now called. And they are now reflecting back to us this corpus of text, of language, of communication. And Andrew will talk a

03:15

bit about the transformer models and some other things that we're enabling. Jürgen Schmidhuber, who I've interacted with, very smart guy, he wrote papers on these transformer models, and they do really interesting things. So in the earlier days of these large language models, you could only see like a token or two

03:34

to the left or right, in other words, backwards or forwards. You can think of a token as like half a word, basically, and what these engines are doing is just predicting text that the engine thinks you want to see based upon your prompt. Well, there's this whole issue around embeddings you have to use, which are like memories to a human, basically. But your embeddings, combined with the generative power of these models, this predictive capability, is

03:59

what has created these things. The transformers allowed the engine to have multiple agents, if you will, so now you can see a number of tokens forward, a number of tokens back, and get a much clearer picture. Just think of it as extending your view as you're driving down the highway. Before, it's like you're driving in dense fog. Now the fog has gone, and you can see pretty far forward as you're driving along. So it helps to create more

04:25

accurate depictions of what the prompt is telling the engine to create. But it still is an engine just predicting text, basically. But the other nice thing about these transformers is that they allow for there to be multiple agents, which I refer to as almost like a peanut gallery of participants who are suggesting what the next word should be, and then at a certain point in time, the engine goes, okay, I'm going to go with this word next, I'm going to

04:49

go with that word next, etc. Well, that's what they're doing. But we're just at the beginning of this. I mean, interactive AI, I think, is going to come next, and that's going to be even more interesting. But let's just stay focused on the gen AI, and what do we mean by dark data? So the title of today's episode includes dark data. That's all the data you

05:08

have that you don't really know about, that you haven't cataloged, that's sitting on servers somewhere or sitting in a file system somewhere, and that's probably more than eighty percent of the data that any mid-size or large organization has. So you'd better be very careful about pointing this AI gun at a vast trove of dark data, because you don't know what's coming back. And this stuff is so complex now that it's difficult to figure out exactly how the engine got the

05:33

answer that it got. And this is partly why you have hallucinations. As a friend of mine, Eugene Burke, says, these large language models don't have an epistemological boundary, meaning they don't know what they don't know, and that could be a very dangerous thing, because that's when they start making stuff up. Like ChatGPT told me I wrote three books. I was like, did I? Are you seeing into the future or something? Because I don't recall writing any

05:56

books, but maybe. But anyway, someone who has focused on this for a while now and is working with a company that is trying to harness and, if you will, tame this dark data, shine a light on this dark data, is Andrew Turner of Praxi Data. Andrew, welcome to Inside Analysis. Tell us a bit about what you're working on today and how you're helping companies make the most of this technology while mitigating the risks. Well, yes,

06:19

thanks. Firstly, thank you for letting me join you from a very cold London tonight, so a slightly different time zone to where you are. Yeah, I mean, I think, as we all know, all these technologies go through a certain cycle, and you know, we're now in this kind of, I want to say hype, but we're in this kind of roller coaster ride of, you know, recovering from OpenAI's move of releasing ChatGPT, you know, just about twelve months ago now, really, isn't it?

06:53

So now, I suppose what I would say is it's kind of the dawn of, actually, okay, let's stop talking about all the things we could do with it. Let's talk about something we can do with it. Right? So I think this is where the pragmatism kicks in, and you know, the marketing's got

07:13

to stop and the actual delivery of value is going to happen. So yeah, that whole thing, not saying that the marketing will stop, but yeah, I mean, I think, you know, it was a really good intro. Thanks for doing the intro. I think the interesting thing that we find is that people are still struggling with what to do with this stuff. So, you know, they've been working on data-related programs for

07:34

multiple years, and then this kind of tsunami of generative AI happens. I was talking to the chief data officer of a very famous media company in London, and the positive thing was that she said that the CEO literally contacted her when this whole ChatGPT thing hit the business and said, what do we do from a policy point of view? Do we embrace it, or do we actually, you know, put a kind of moratorium

08:03

on it so no one can use it? So I think the positive she took out of that is that the CEO actually contacted her to say, how do we govern this type of technology? You know, let's not shut it down, let's work out how we can actually coexist with it. But what's probably happening in lots of boardrooms at the moment is, what do you do with it? What is the business use case? How do you prove the value and make sure that

08:28

you can get something out of it quite quickly to build on? It's the classic thing about scoping something that's of a reasonable scale, not too big, so you don't create this Bermuda Triangle that you disappear into and never get out of. But, you know, scope something that's actually of value, and then actually go and

08:48

deliver something, and then you can scale it from there. So it's back to those kind of principles, I think, whether you're embracing generative AI or other things. But yeah, one of the things that we did at Praxi Data is try to look at how do we hit this conundrum of managing data at scale. And to the point you made in the intro, there's this whole issue of lots of data just sat around

09:11

and being moved around your organization and being, you know, stored. Yeah, but do you actually have your arms around it? So, you know, shining a light and putting a real, proper enterprise-level discovery across that data set, whether that's unstructured data, semi-structured, or structured data. Traditionally, people have looked purely at structured data. Obviously, when you get into large language models, you're looking at unstructured and

09:37

semi-structured. So how do you get that kind of overview of that data and then actually identify where your risks and opportunities are? And actually, one of the things we've done recently, and this has become more important, is build effectively business use cases and industry models to help these organizations really accelerate, because there's lots of tools out there, but a lot of them are very generic. So, you know, say you're a property

10:05

and casualty insurer. They don't want to have a generic tool. They want to look at metadata and particular data models that they can actually relate to from a business point of view to make decisions. That's the work that we've been doing at Praxi Data to be different. Because I was speaking to Mike Ferguson, who you probably know, at Big Data a few weeks ago. He said there's something like thirty-eight different data discovery and

10:35

data catalog types of solution out there. So if I'm a buyer, if I go back to where I used to work at Tesco or GE or whatever, it's a buyer's nightmare. How do you decide what you're going to buy? So what we do is something very, very narrow, but very, very deep. That's what we're focused on: how do we add value, and not try and say we can do everything, but effectively we do automation, we do curation as a

11:03

service. So, you know, we're looking at how do we use our industry models and domain expertise and give that value-add into the actual overall analytics value chain. That's what we're focused on. And you know what, you just gave me a good opportunity to bring Ted Lasso into the conversation, which of course has taken America by storm on Netflix, or maybe it's Apple TV. I

11:22

don't even know which streaming service. I think it's actually Apple. But the joke there is, they take this guy who's like a college football coach or something, and they bring him over to England to coach an actual football team, we call it soccer, which is an outrageous proposition because it's like, what the heck does this guy know about soccer? Or, you know, Premier League football, which is a serious deal.

11:43

It's a very, very different game. And the point being here, you typically want someone with domain expertise, because you don't want to hire, you know, a medical doctor to fix your car, for example. I mean, maybe he or she knows how to fix cars just serendipitously, but that's certainly not their core area of expertise. And if you're talking about trying to wrangle with these models that have, some of them, 1.9 or 3.2 billion vertices

12:11

or something, this is incredibly complex stuff. You can't boil the ocean. It's an age-old mantra in the IT world: don't boil the ocean. That's kind of what folks are starting to do. But you're not taking that direction. You're saying, no, no, hold on, let's focus on these specific ontologies for certain industries and businesses, because that way you can

12:33

pull this off. Because if you try to use, and this was my concern very early in the game when I was listening to people talk about this, I'm like, okay, I get where you're going with this, but my goodness, the training will take forever and a day, and by the time you're done training it, like, you know, are you even in business anymore? I mean, it's like, yeah, you can get value, but geez, you've got to be focused, right? Well,

12:56

yeah, the issue as well is that, you know, I mean, I was talking to a bank in the Nordic region the other day, and the issue still is around the fact that the technology takes so long to deliver. Yeah, and then what you've got to do too is work out a sustainability model. You've got to win the hearts and minds of the people who are going to manage the data going

13:22

forward. If it takes, you know, multiple years to get to that point where they can actually start adopting it, then, you know, the hard stuff is the soft stuff, actually: the change management to engage, reward, incentivize, train, and educate the people who actually have got that deep domain expertise. When they see something that they can actually relate to very, very quickly,

13:46

you're going to improve that engagement. You're going to make change management a lot easier as well, because you're talking in their language. If you've got a generic tool and you start with a blank canvas, then you've got a lot of work to do before you even get to that, you could say, base camp, you know, from an Everest or climbing-a-mountain kind of analogy, right? And that, you

14:05

know, that's frustrating. If you're the leader of that business unit or that particular function, you've got a lot on your hands to sort out. So we saw this a few years ago when Andrew, who's the CEO and founder, and myself were working in another company, and we were doing some herculean pieces of work in very large enterprises, but

14:24

it was just taking too long to get to that base camp. So we said, how can we accelerate that? Put the effort into building the models. So, you know, you can obviously get free open-source models from Hugging Face. There's been some really great work they've done, but we believe that there's some real added value in those industry use cases and industry models, and that's what we've been focused on as our kind of

14:46

USP. Well, and so the ontologies or the industry models that you start with, they're really like roadmaps. They're like a legend, if you will, for this discovery process. So at least you have hallways to go down, rooms to look into, instead of trying to, you know, find something in the entirety of the United States, which is just too big to digest, right? And

15:11

that's the challenge. If you're trying to digest too much, you just get overloaded, and the system, the brain, just crashes and you just move on to something else. You just can't do it. So you've provided pathways, essentially, of discovery, and then you use the technology to look around and to curate. And you know, I have this new concept I keep talking about. I need to write something and stick a pin in it, so to speak. But I

15:35

kind of view these foundational models as a second chance for data. And what I mean by that is we spent all these years organizing, analyzing, doing data warehousing and then data lake architectures and all this stuff, but it was all sort of the same concept: you persist data somewhere, and then you go around and you try to analyze it and number-crunch it and so forth. And you know, we have a mess out there. I mean, if you think of the corpus of information that any mid-size to

16:03

large organization has, it's just monstrous. It's all over the map, it's in all sorts of formats, and there's really no way you're going to tame that. But with these foundational models, it's like we've got a second chance to sit down and carefully curate what we want to put into our embeddings, into our memories, if you will, for the AI to learn, and if you do that right, that, I think, is going to be really compelling.

16:26

What do you think? We've got two minutes before the break. Yeah, I mean, I think it's that, you know, phrase we use, messy, murky dark data. How do you get your arms around it? How do you make a real impact on that? And one of the things we did when we architected our solution is to do it at cloud scale, because, you know, that's no longer a constraint. So we're going to be making it available on the cloud provider marketplaces, so you can basically try

16:49

before you buy. You can get your arms around this stuff. You can apply it to your own data set, get the momentum, get people on board, and then scale it from that. Because infrastructure is no longer a constraint. You can get on with that. You can select it from the marketplace and do it by tomorrow. Provisioning that type of infrastructure is like an overnight thing. It's

17:14

all, it takes you a couple of hours. It's not something you have to work weeks and weeks and months for, and have an army of people to do. You can do that from, you know, the established players. So that's also what we're going to be releasing as well, to make it accessible and make it, you know, adoptable, and then you're actually going to win people over, and then people are going to adopt it. Well, right, because you think about the patience of the average person

17:37

these days, and there's no doubt that change is always difficult. Change is always hard. Some people are better at it than other people, but everyone has to deal with change. And you know, probably the older you get, the less willing you are to change. Old habits die hard, as they say. But to your point, if someone can make progress in a fairly short period of time, or at least see where they're going, that's the key. That's the way you kind of win the hearts

18:03

and minds. And you know, it sort of reminds me of my own writing process when I have to write something new. At first, I'm like, oh goodness, what am I gonna do? I have to find a starting point. I have to find where is that base camp. Okay, the base camp is up there. And once I find that, I can climb my way to it and then take a few deep breaths and then prepare for the next part of the journey. And that's kind of where we are

18:25

right now. I think everyone's trying to find their base camp. But folks, don't touch that dial. I'll be right back. You're listening to Inside Analysis. Welcome back to Inside Analysis. Here's your host, Eric Kavanagh. All right, folks, take us to the future. Indeed, I love that song. That's by Black Bananas, is the band. You gotta love it. I love that song, and it reminds me of William Gibson, the futurist. I've always wanted to meet a futurist, to be like,

18:57

are you even here right now, man? But anyway, he's a smart guy, and he said, the future is here already, it's just not evenly distributed. It's like, oh, that's a good one. And of course, it also means the past is still here, just in pockets all around us. So it's an interesting dynamic. But we're talking today with Andrew Turner, general manager of Praxi Data. That's P-R-A-X-I Data, just like it sounds. And we're talking about gen AI and making sense of dark

19:23

data. And we wanted to talk about the different roles, things like prompt engineering, for example, which is a real task now, a real skill set. And as many people have said, and I think this is true, it's not so much that AI is going to take jobs away, per se, as it is that people who learn to use AI effectively are going to have a much better time getting jobs and doing their jobs than people who don't. It's like any other tool. And you don't see too many yaks out there

19:49

in the fields tilling the land. We're using giant machines to do that stuff these days because they're much more efficient. And I've always believed that if machines can do the job better than people, let the machine do the job, and let the people do the more interesting things on top of what the machines do. But Andrew, we are going to have to deal with some pretty heavy-duty

20:08

change management, right? In our deep dive earlier, you talked about this McKinsey quote where they said that they now believe we'll be able to automate many of these processes ten years sooner than we had previously thought. That's a pretty big deal. I mean, that's a step change. You're like skipping a decade, basically, and it's going to require us all

20:29

to get our sea legs, right? What do you think? Yeah, I mean, I think it's absolutely, I mean, that quote came out on the fourth of October, so it's only a few weeks ago, and when you think about the ramifications, it's absolutely staggering. You know, I remember the days of business process reengineering, but to your point, it's like BPR on steroids,

20:53

you know, times ten. So it's kind of like saying, you know, it's a real opportunity to redesign and really step-change and reorganize the organization. And any CEO that's listening to this session that kind of says, oh, you know, we can put it on the back burner, we can ignore the impact of generative AI, you know, that would be a mistake.

21:18

But the question, I think, is then about, you know, how do you then sit down with your people team? What is the organization of the future? Because I think it is that point you made a bit earlier. It's not just replacement, it's coexistence, and actually how do you shift the work into that new type of role? And then there is a piece around upskilling and, you know, how do you upskill the right

21:44

roles in the future. And, you know, I was sat in the audience at this recent session where a couple of CTOs were presenting their take, you know, from a technical perspective, their view on generative AI, and they said, you know, pretty much the

22:00

whole organization structure is going to be impacted at different levels. And the people that we traditionally call developers, they're the rock stars of

22:12

today and the stars of the future. But then, you know, I think it's going to change that role, because people are spending a lot more money on people in this kind of area, and actually, how do you get the productivity out of those, you know, everybody wants to be a data scientist, and you pay market rate to attract the best, but then actually, how do

22:33

you put the tools around them to actually become productive and impactful in that organization? So gen AI puts the next layer of complexity on that. I think there's an opportunity for organizations to help other enterprises, to actually educate them on prompt engineering. I think it's a key,

22:49

it's going to be a core competency going forward. And if you've not done prompt engineering or interacted with Bard, GPT, et cetera, then, you know, it seems like interacting with a real person.

23:03

It's quite remarkable. Yeah, it really is. And you know, I had Matt McLarty, the CTO of Boomi, on the show a few weeks ago, and he had this great line where he referred to what era we're in, in a certain sense, and we talked about machine learning and AI and what's really happening now, and he said, it's a time of learning, meaning we humans are now learning what this stuff can do and how it works, and we're learning about how to make it hum. We're learning about

23:32

what the challenges are. And it kind of reminds me of something. I worked with a couple of Russians years ago, and it was a good Russian and a bad Russian. The good Russian was Nikolai Minchugo. He was a very smart guy. He ran the first ever advertising agency in Moscow, because when perestroika came about, all of a sudden you had a need for ad agencies. There was no need before in the communist system, because why would you have an ad agency? Right? So he was the first to

23:55

do that. He's a designer but a very good coder too, and he built out this system. But he would say funny things to me. Like one day he said, the funny thing about the web is no one knows how it works. All we know is it works. Right? And that's kind of where we are with this LLM stuff. I mean, we do know more about how it works. You can get into the weeds, and the embeddings are crucial, right, because the models have been trained on this corpus

24:22

of text that is vast. It's very, very large. So there's good stuff, there's bad stuff, there's truthful stuff, there's not truthful stuff, and so it has all this material. But the AI engine itself is kind of like a teenager with a tremendous vocabulary, but it still doesn't know the sort of mores and the rules and what it should do versus what it shouldn't do. And these are all important things to know. And so to me,

24:45

those embeddings are crucial. And I get back to this sort of second chance for data concept because once your organization understands what these things are and you kind of wrap your head around some of the use cases, like customer support, for sure, it's a big one. News gathering, news dissemination, news generation I think is going to be revolutionized by this stuff. Once you wrap your head around it, then you can sort of start the process and

25:10

figure out, well, what do we want? How do we want to teach our teenage child? Do we want to send it to Catholic school? Do we want to send it to public school? Do we want to homeschool, basically? And you really have to know what your appetite is, and that gets into the training side, right? So you don't want to be trying to boil the ocean. Don't try to, you know, do two full-time jobs and educate your five kids by yourself. You're probably going to

25:33

burn yourself out if you do that. So you have to kind of understand the scope and the appetite and where you're going. And if you do that and you use tools, I think, like Praxi Data, for example, I think you're going to have a really good chance of success. But what do you think, Andrew? Well, I mean, I think what you touched on there

25:48

was what I'll call data toxicity. You know, there was a scenario highlighted this week where, if you get it wrong in the way you connect these things together, you can actually generate data, and then it publishes that data, and then it feasts on itself. It becomes like a cannibal of its own

26:08

data sets. So the danger there is that, you know, you go back to origin: what's the origin, what's the syndicated origin, and what's true and what's not true? Yeah, right. There's a piece around lineage and the kind of inheritance of that data, which comes back to some fundamental principles of how you manage data at scale that you need to get right. So, you know, that's the point you made a bit earlier about this. You know,

26:40

what we see is this messy, murky data. You've got all this kind of patchwork quilt. You know, people have gone out and bought lots of tools, and they've evolved over time. But actually what we see it being is kind of like, I suppose, I think we talked about it, a backplane, you know, like a backplane concept.

26:57

You know, how do you have a backplane across your enterprise that you can actually rely on, that you look after properly, that's going to take in these different types of data source? So you have that, it's an age-old phrase, golden record definition, and people are still struggling with

27:15

that. So, you know, all we're trying to do is not say you're going to throw away all the IT investments you've already made, but how do you actually tap into these models, this industry insight, to add value by getting those people on board and then dealing with the change management we talked about a bit earlier, because that is really the sustainability model. That's the

27:40

biggest issue we can see out there. You know, people can keep buying these tools, but until you actually work out a path to get people engaged, that's the key to it. So, you know, I don't know whether we'll have situations in the future where we have robots managing the data in our organizations. There's that tacit knowledge that's in here, in the brain, right, that people have got from experience, that is difficult,

28:03

difficult to codify. I'm sure we'll work out, through brain-computer interface, a way of codifying it at some point in the future. But for the next few years, I think that's where we're trying to add some value in this kind of ocean of data that we're dealing with. Yeah, and I think, you know, for use cases, certainly chatbots are going to get a lot better. You know, a lot of times in enterprise technology, stuff comes out a bit too

28:29

soon. Right? When the chatbots first came out, I think they just weren't quite ready for prime time, and so they didn't do as well as we needed them to do. And you know as well as I do that people have a very short attention span these days and very little patience.

28:45

So when something doesn't work out, they're like, oh, that didn't work, and then these impressions last, right? These impressions linger for periods of time, and it's hard to sort of shake off that old mindset and realize, no, this is a new tool, this is a new way of doing

28:57

things. But I think that the key is to help people understand that, used properly, this new suite of technologies is going to make your job a heck of a lot easier, and it's going to make your job more interesting and more fun, is I guess the way I'd put it, because in general, machine learning algorithms are very good at tackling very tedious tasks, like scanning fifty thousand purchase orders and finding the twenty-seven that have some error

29:27

in them. Well, a human being going record after record after record? Forget about it, man. It's a nightmare, and it's terrible for morale. So as long as people understand, and this is my beef with the media writ large, I have this expression I throw out there that the narrative is always wrong, because the narrative is a story, and the story is not reality. The narrative is some story I've spun up around components of reality. But the reality with AI is that it is making our jobs easier,

29:59

or making certain jobs easier. It's not necessarily taking jobs away. And it's not the red-eyed robot that's going to take over the world and all that stuff. But the media loves to focus on that. They love to focus on the negative because that sells, it gets eyeballs and so forth. And I think it's important that people recognize this as a whole new tool set. You know, I mentioned Derrida's sort of redoubling. It's a reflection on

30:21

everything that we've seen already. So now it's almost more like riding a wave instead of trying to create the waves. We have the waves now; now we

30:29

have to figure out how to ride them. What do you think? Yeah, yeah, I mean, I think what you were touching on a bit earlier about the chatbot situation was, you know, obviously the data that feeds the chatbot, right? So you were touching on this thing where there have been some scary situations recently with, you know, what you do

30:47

privately and what you do publicly. Yeah, so, you know, how do you deal with that, the risk around data privacy, preserving privacy around that? You know, building your own private LLM models, your own data models, so combining your domain expertise with recognized external data sources, and that's where you're going to have that differentiator in service, because if you create that kind of data source that then is an input into your chatbot,

31:19

you're then going to stand a much better chance of having a very delightful customer experience. That's going to drive loyalty. Yes. So, you know, one of the use cases we've been looking at recently is with an organization that's dealing with insurance and roadside recovery and that kind of thing, where you're kind of combining lots of, you know: are you getting complaint letters into a call center? How is the call center agent dealing with

31:48

that, and linking it, and saying who is the particular customer that's affected by that? So you're getting different types of data input, but you've got to connect these dots, and how do you put an instruction in place to allow you to do that? You've then got to, you know, work it out. It's a complex object. It's a complex structure of how it sits in a CRM

32:08

system or something like that. But then actually you've got a different type of interaction, of data coming in through letters and call center records. How do you make sure you connect those dots to make the right decision to look after

32:20

that customer or deal with that service intervention? So those are the types of things where we can see that having this kind of foundational capability is very important, where you're actually connecting the different types of data that before were siloed.

32:37

That's a really good point. Yeah, that's very interesting, because you're right. When you think about the old way of connecting systems, and I'm old enough to remember EAI, Enterprise Application Integration, right, where we would connect one application to another. Well, in the old days, you would look at things like primary keys and foreign keys and stuff like this as a very, what's it, not declarative, what's the other one? I can't even remember the name.

33:07

Different kinds of programming. It's very specific instruction: go here, do this, go here, do that. But it's rigid. It's a very rigid connection. And what you want is more of a fluid connection that can accept inputs from multiple sources, like you say, from the call center, from text messages, from mobile, from your laptop or whatever they are, all

33:28

these different sources, which to manually connect, yeah, you can do it, but it's kludgy and it's also not very performant, and then you wind up spending all this time and effort trying to figure out how to optimize the performance. So let's add some hardware over here, let's add some caching over there. These are very specific tactical things that you can do, but it's hard to pull off. Whereas with these foundational models, that's what they do for

33:52

a living. That's what they're designed to do, is to take sources of

33:57

information and fuse them and then generate some output. And if you can train that to do something using only your trusted data, the second chance for data that we're talking about, then you're really going to get somewhere because those things can do it really really really fast, and they just kind of figure stuff out about how to do that dynamically, instead of having to give such specific instructions to your systems people about how to get this connected to that and not

34:21

connected to this. They're kind of solving that out of the box. But folks, don't touch that dial. I'll be right back. You're listening to Inside Analysis. Welcome back to Inside Analysis. Here's your host, Eric Kavanagh. All right, folks, welcome back to our show today, the Dark Data Challenge: Solving the Dark Data Challenge with Practical GenAI. Yours truly, Eric Kavanagh, talking to Andrew Turner of Praxi Data. Praxi, practical. I don't know if

34:54

that was the logic behind it, but you know. And now let's talk a little bit about public versus private. Right? So when you're on Google, or when you're using ChatGPT, let's say the original one, that's the public version that's trained on all this data that's out there, but it hallucinates, it does all kinds of different things. What you really want is a private instance that you then very carefully train by curating only the

35:20

data that you want, only the trusted data. It's like teaching your kid how to grow up. You want to be careful about what you expose them to. Not too much YouTube, not too much TikTok. Maybe we screwed up on that one a little bit with our kid, but anyway, we try our best. But tell us about your thoughts. I think every company is going to want to have their own private instance of one of these models and just over time build it out and let it support things like customer support,

35:46

let it support things like even supply chain and other stuff. But what are your thoughts, Andrew? Yeah, I mean, I think people originally thought there was only one model, right? They thought there was only a public one. And you've probably heard about that. You know, there have been various examples where, you know, the high-tech company based in South Korea uploaded their source code into ChatGPT, et cetera, et cetera. So, you know, I think that, you

36:16

know, this is where the CISO comes into it. This is where, you know, the IT security compliance piece and the chief data officer around, you know, data privacy come in: you've got to govern your assets as you would govern, you know, any other assets. So it's just a different type of asset that you're having to manage here. So yeah,

36:37

there's a raging debate. But the net-net is you've got to build your own private, you know, generative AI asset library, you know, your set of models to address that problem. Because, you know, we know there are some class actions going on, some litigation going on with certain tech companies at the moment around, you know, the things that they've done prior to coming out to market, and, you know, it's obviously causing a

37:05

lot of concern for a lot of organizations. So I think it's about getting the right advice around how you architect what you're going to put in place. But yeah, you've got to have something that's protected. It's kind of like a sleep-at-night factor that you've got to think about,

37:21

and you've got to get the right people around the table. And I think, you know, the key thing also, if you're a CEO or a CFO, is looking at the leadership team you've got working for you to give you briefings, so you understand the ramifications of putting this thing together in the right way. I think then you can really exploit it. So that's the kind of thing that

37:45

we're seeing. People are definitely speeding up now. You know, they've been watching and waiting, they've been dealing with this kind of massive tsunami and all the, you know, as you mentioned, the kind of media and PR around this, but people now want to get into brass tacks, get into real use cases, you know, work out how they can actually make some real big impact in

38:06

their organization, reshaping their organizations, as we talked about earlier, with their organization structures. And then, you know, I think it's raising the importance of that Chief Data Officer role, the Chief Digital Officer, the CIO, the CTO, to be at the top table again, you know, and actually to really help the CEO and CFO and top team to make those really, really strategic decisions.

38:29

Yeah. Well, you think about digital transformation, which we've been talking about for twenty years or so, and we've been doing digital transformation, but this is almost like wholesale organizational transformation, is what can happen if you do this stuff right. What do you think? Yeah, no, definitely. I mean, I was literally two hours away from

38:50

presenting something and that McKinsey quote came out. I mean, it's absolutely mind-blowing, right? You know, taking ten years off an organization's evolution in one stroke. So, you know, obviously, is that a precursor for, you know, oh please, please bring in a very, very large strategic consulting firm to actually help you with that transformation? But you need some help. I think, you know, you need to get the right

39:15

people around the table, and you need external help. And that's what's driving up the cost of resources, because there is a finite resource of people who know what the path should be. So, you know, that's the key thing now: how do you get the right team on board? How do you accelerate with this stuff? Don't wait. You've

39:34

got to get your arms around it. So yeah, at least the planning side of things, right? You've got to get the C-suite and key directors on board to recognize that this is a direction for almost everyone for the future. I mean, to not use this stuff is a bit crazy. I think you're just going to get steamrolled if you don't. It's like the dot-com age, when the dot-com came out and everyone realized, oh wow, so now I can sell everything online. I don't have to have all these

40:00

brick-and-mortar stores. Well, that was a pretty big change in terms of how we view the internet, and now you have this new, incredibly powerful set of technologies. And to your point, you can use different models. There's ChatGPT, there's Bard, you know, GitHub has Copilot, you know, Microsoft, they all have their own version, if you will. So you have to start somewhere, so at least start the planning process,

40:27

get people to sit down and understand what's possible. And the best way to do that on an individual basis is just to play with this stuff, right? Just get in there and start playing with it, giving it prompts. Yeah, so create your lab, you know, create your data lab. But it's reinforcing

40:45

that type of innovation lab. And, you know, we're quite impressed actually with what Meta are doing with Llama 2. We've been pretty impressed with that as a foundational capability, because they've been, you know, through some quite interesting PR the last few years about data privacy and preserving that kind of dimension. But, you know, we think they've done a really good job around the stuff they've

41:09

been doing around armors. So I think that's something to think about as you, you know, put that reference architecture in place, and you get the right people in the room, and you get on with a sprint, you know, get some value done in a few weeks, get that back into the board and show what's possible with this stuff, and that's the way you're going to transform your industry rather than being transformed or

41:31

put in a difficult position where your competitors are actually doing something. You know, one thing we were looking at recently was that you can use these types of tool to do competitive analysis, identifying, you know, what the strengths and weaknesses are of yourselves and competitors, as indexed by these types of model. That's right. If you're not leveraging

41:53

that kind of stuff, you're missing out. Well, and I heard, and I haven't verified this, but I've heard it from a few sources now and it makes complete sense, that Google has been quietly building a vectorization layer underneath G Drive and Gmail, such that in your prompt in Bard you can do @Gmail, @Drive, and it will then point to your vectorized information and use that to generate

42:21

its answer. And that is pretty darn cool, I mean, to already have it there. And just to explain to folks what these vectors are, what these embeddings mean: an embedding can be any piece of information. It can be a press release, it could be any number of things that you're using to

42:36

build out the foundation for your personalized engine, if you will. And then you turn it into mathematical or numeric values, basically, and that way you can number crunch and do comparisons and do nearest neighbor and k-means, distance differences basically. And that's how these things run. They're just doing all these calculations, just tremendous amounts of calculations. But you think just in terms of use cases. You know, I sit there and think to myself,

43:02

MRIs. The MRI, the actual file of someone's MRI, is a massive file. It's huge. I mean, it barely fits onto a CD-ROM. But if you vectorize it, you probably collapse it by ninety-nine percent in terms of size. And now you can also analyze it more effectively. So this can solve all kinds of problems, all the data storage issues that we have. We can now vectorize this stuff and then put those big fat files in just ice-cold storage that's very, very cheap. It depends what

43:31

kind of industry you're in. I mean, we're getting involved in a lot of healthcare stuff and synthetic data. You know, actually creating synthetic data sets which have got the characteristics of the actual data. You've got to go down that route as well. That's another dimension that people need to think about. It's not just about privacy,

43:50

right, it's actually about synthetic data as well. It's an important, you know, principle you've got to work out in certain industries, because there's obviously, you know, a patient record, there's a lot of, you know, biometric data, et cetera, et cetera. You've got to work out how to manage that data at scale as well. So that's another thing that, you know, it's a lifelong kind of thing to sort out. But I think, you know, generative AI is giving the industry a

44:15

leg up. But it's, you know, a ten-headed hydra. How do you manage that in a smart way? A lot of maturity, but also a lot of risk. So it's back to basics, but it's also back to, you know, getting executive sponsorship on this kind of stuff and actually getting the right people around the table, and then, you know, trial and error. You know, sometimes in enterprises what we see today is they're still a

44:43

bit scared of failing. But, you know, you've got to do sprints. You've got to apply agile principles. You've got to get to the point where, you know, you're going to learn something off the back of doing something quickly, and then you're going to iterate on that. And it sounds like an obvious way of approaching it, but we still see some FUD factor around that. Yeah, that's exactly right. Well,

45:05

folks, look these guys up online: Praxi Data, Andrew Turner. You've been listening to Inside Analysis. All right, folks, time for the podcast bonus segment here on Inside Analysis, talking today with Andrew Turner of Praxi Data, all about GenAI and solving the dark data challenge. So Andy, take it away. Yeah, so we were talking about the situation with a lot of these models, and, you know, there's obviously a huge advantage you can get

45:36

if you get this right, but also sometimes they get it wrong. So, you know, we've got this situation where you've probably heard about this thing called hallucinations. You know, you've heard the thing where we get incorrect data. We're getting bias put into this data set, and then

45:51

inaccurate responses from the platform. So, you know, how do we keep learning from this data, how do we keep improving this data accuracy and ensure that we get a good output, a good quality output you can rely on? The other thing that we've seen as well is that there's this whole raging debate about business data with public products, public generative AI products. So, you know, how do you make sure you're compliant? You know, if you're in a regulated industry, you know, how

46:19

do you deal with data leakage, data theft? You know, how do you understand that the third-party data you may be integrating is actually syndicated and actually good to integrate with your model? And how do you

46:30

deal with the usual things around, you know, risks around hacking? So, you know, we've got a strong position here: we believe that you've really got to build, you know, your own private models, and that's where we see we can augment that with our industry models to help you get a leg

46:47

up even further. One of the things that we see becoming a core competence of organizations, and of individuals and teams within organizations, is this thing we're calling prompting, or prompt engineering as people are starting to call it, where, you know, you develop this kind of style to interact with the models and to get your desired output, and you can actually then start to request a certain way of interacting with it, even

47:15

a certain character of how you interact with it as well. You've got to provide the context. You've got to understand the temperature on that particular prompt that you're integrating as well, and also make sure that you start to build a library up, so you build a valuable library that you can use as a reference model, a bit like we've done with industry models from our data point of view, to help your organization evolve and become more competent

47:42

over time. So, you know, looking at this kind of magic, we'd say, of generative AI and how you can combine that with, you know, corporate data, you've got to look at how you integrate that. So, you know, you can use it individually, you can go into that consumer version, ChatGPT, Bard, et cetera, and use it like that. But obviously ultimately you're going to move from an individual use case into an integrated business scenario. You know, this is an example of, you know, the kind of power you

48:08

can do today. You know you can actually start to say, okay, you know, asking these questions. You know, how could I create a direct competitor to a particular company, either your own company or a company that

48:20

you've seen appear on your radar? And yeah, you know, use the models to find out what their strengths and weaknesses are, doing interactive SWOT analysis, you know, dynamically with prompt engineering. What IP or trade secrets do they have, or have they published, that you can then catalog, understand, and create a way of dealing with, creating a battle card? You know, how do you sell against that particular

48:47

position they have in the market? This is the kind of thing where you can actually really use it from a marketing and strategic point of view to really make a difference to your day job. You know, to talk about public and private, what that means is that you've got

49:01

to build this competency internally. You've got to look at, you know, we talked about prompt engineering, we talked about upskilling, we talked about helping not only train the models, but train your organization and evolve those roles, that new organization structure of the future. But also you've got to think about how you manage that data at scale and apply some of the privacy concepts

49:23

of, you know, compliance, security and control. Because whether you're in a regulated industry or a non-regulated industry, you've got to think about building your own

49:30

model. So how do you put these little pieces of the recipe together in a way that's going to help you get to that next level of capability? In terms of the future of generative AI, you know, what we see is a number of things happening here. You know, we're at the point, and this is already starting to happen, where you can actually interact with it through voice, you know, where you can actually start to evolve this into what we

49:55

say are multimodal capabilities, where you can combine, you know, video with images, with text, with audio, with actual code, so you actually create this kind of multimodal type of output, and then evolve that ultimately into some kind of platform that you can use internally and that has, you know, access to corporate and personal data, obviously respecting, you know, GDPR, CCPA and all the different regulations you've got across the world. And the whole point is

50:23

about boosting the performance of your organization. You've got to think this through not only from a people point of view, but also how do you put the process in place, and how do you build the products and data platforms that you need for the future? So, you know, finishing off with what we do at Praxi Data. As I mentioned, we're trying to make sense of this kind of messy, murky dark data out there.

50:45

There's been a lot of investment in lots of different tools, lots of different platforms, but we're still seeing a really huge problem. So the area that we're focusing on is really the domain knowledge and the industry libraries that we've built to allow you to discover data, you know, curate data at scale, you know, have curation as a service, and also have conversations with your data. So that's what we see as an important thing,

51:10

whether that's structured, semi-structured, or unstructured data. And we're saying that we're effectively delivering, you know, a narrow type of capability, but being very deep. So not trying to be a suite of products, not trying to be a Microsoft, not trying to be a Google, but trying to be specifically focused on those industries we're looking at: financial services, banking, insurance, healthcare, government use cases, and

51:32

ultimately moving into telco and retail, going downstream from there. But we're trying to be part of that analytics and insight value stream, and also not only delivering industry libraries, but also delivering business rules out of the box that you can apply to identify risks, identify opportunities. That's fantastic. Folks... Yeah, go ahead, one last slide, go ahead. So, you know,

51:55

it's very fast. I suppose a big watershed moment is that on the sixth of November, there's going to be the first OpenAI developer conference over in San Francisco, so it sounds like something that people should consider going to. Obviously, it's going to be an evolution of GitHub Copilot, which has been, you know, a raging capability that can coexist with developers.

52:17

And I suppose the key thing is about, as I said a bit earlier, skills, compensation and dealing with, you know, the FUD factor, the fear, uncertainty, and doubt. So go and grab this initiative with both hands and see how you can embrace it and apply it to your organization. A few books for you to read and think

52:37

about as you go forward. There's a particular one, the one on the right-hand side, AI 2041, that was written by the ex-CEO of Google in China. Mustafa Suleyman is the original founder of DeepMind, which is now part of Google. And a couple of other interesting things, if you want some bedtime reading to finish off. If you want to know more about Praxi Data, my email address is at at praxidata dot com, and we'd love to hear from you, to speak to you about our use

53:07

case and our capability. Thanks for your time. Good stuff. Another deep dive on Inside Analysis. Look these guys up online at praxidata dot com. We'll talk to you next time. You've been listening to Inside Analysis.

Transcript source: Provided by creator in RSS feed