KCAA: Inside Analysis with Eric Kavanagh (Sun, 24 Dec, 2023)

00:00

People know, if you go into Excel, you look at a spreadsheet, you have rows and columns. Well, rows typically contain lots of different types of data: first name, last name, phone number, email address, address, all this kind of fun stuff, notes. That's all in a record. That's one record, and then the columns tend to be the same, or they should be the same: first names, last names, phone numbers, et cetera. One reason why that's important is because compression is easier on a column,

00:27

and so Vertica was one of the first major column-oriented databases. It wasn't the first, but it was one of the first; Sybase IQ was one before that, years ago. It was one of the first to say, you know what, people want to do analytics. People want to do analysis. Compression is a big part of that. And being able to rapidly slice and dice information is a big part of analytics. That's what we need to be able to do.
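
A minimal sketch of why a column tends to compress better than a row, assuming made-up records and simple run-length encoding as the stand-in technique (the speakers do not name a specific compression scheme, and this is not a description of how Vertica or Sybase IQ work internally):

```python
from itertools import groupby

# Hypothetical records, laid out row by row as in a spreadsheet.
rows = [
    ("Ann", "Lee", "CA"),
    ("Bob", "Ray", "CA"),
    ("Cy", "Fox", "CA"),
    ("Di", "Kim", "NY"),
    ("Ed", "Orr", "NY"),
]


def to_columns(records):
    """Pivot row-oriented records into one list per column."""
    return [list(column) for column in zip(*records)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
state_column = columns[2]

# Stored as a column, the repeated states sit next to each other.
encoded_state = run_length_encode(state_column)

# Stored row by row, the same values are interleaved with names,
# so there are no consecutive repeats for the encoder to collapse.
row_major = [value for record in rows for value in record]
encoded_rows = run_length_encode(row_major)

print(len(state_column), "values ->", len(encoded_state), "runs")
print(len(row_major), "values ->", len(encoded_rows), "runs")
```

The design point is only that similar values end up adjacent in a column, which is what gives any compression scheme something to work with.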

00:50

So Michael Stonebraker, who invented the modern database. By the way, if you look this guy up, he's amazing. Way back at MIT, like, I guess, fifty-odd years ago, he came up with Postgres and Ingres, which were database architectures that have now been used by tons of different companies. Well, I'm telling you this backstory just to help you understand how we got here. And what's fascinating is this data fabric stuff came really after the Hadoop

01:15

movement. People started thinking, all right, we have these incredibly topographically challenging environments. How are we going to be able to provision across multinational corporations with many, many different users, sometimes tens of thousands of people? You'll learn in the database world, concurrency is a real issue, and being able to serve many concurrent users is a real issue because it's like having

01:40

ten thousand people come into your boutique shop someday. Well, you can't even fit them all in there. This is what happens with concurrency. You have to purposely design your architecture to handle that kind of stuff. As my buddy Mark Madsen, who's an analyst again these days, once told me, he said it takes ten years to find the edge case for a database that will crash it. Ten years, like, after you're in production before you finally figure

02:06

out what's going to throw it off. All right, this is how complex databases are. So data fabric is even more complicated than that. I mean, quite frankly, it really is a whole array of data pipelines and triggers and monitors and sensors. Now we have data observability these days, which is all about watching for upstream changes in data. Did a field change? Is there a new column in this database? Did we not get the data?
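
The three questions above (did a field change, is there a new column, did the data arrive) can be sketched as a small check. This is only an illustration: the function names, the schema-as-dictionary representation, and the alert wording are all assumptions, not the API of any observability product.

```python
def detect_schema_changes(expected, observed):
    """Compare two {column_name: type_name} mappings and report drift."""
    added = sorted(set(observed) - set(expected))
    removed = sorted(set(expected) - set(observed))
    retyped = sorted(
        name
        for name in set(expected) & set(observed)
        if expected[name] != observed[name]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


def check_batch(expected_schema, observed_schema, row_count):
    """Return human-readable alerts for one upstream delivery."""
    alerts = []
    if row_count == 0:
        alerts.append("no data arrived")
    changes = detect_schema_changes(expected_schema, observed_schema)
    for kind, names in changes.items():
        for name in names:
            alerts.append(f"column {kind}: {name}")
    return alerts


# Hypothetical schemas: what the analyst expects vs. what showed up today.
expected_schema = {"customer_id": "int", "email": "str", "signup": "date"}
observed_schema = {"customer_id": "str", "email": "str", "region": "str"}

for alert in check_batch(expected_schema, observed_schema, row_count=0):
    print(alert)
```

Running a check like this when data lands, rather than when an analyst's query comes back empty, is the "early indicator" idea in miniature.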

02:32

Well, these kinds of early indicators are very useful for data engineering teams and for analysts, because hitherto you'd show up and start doing your query and you didn't get the data that you want. Well, what the heck happened? Let's look into it, like raise a ticket with IT and wait a couple of weeks. Well, those days have to be gone, folks, and data fabric is designed to be able to solve for that. And then data mesh, as I suggested, well, it's a fascinating concept. Exactly what it is,

03:00

I don't know. We're going to find out, but I think I've been talking for a moment now, so I'm going to go ahead and pipe down and throw it over to Steve Sarsfield from OpenText Vertica. Steve, you've been around maybe almost as long as I have, or about as long as I have, so you've seen this whole evolution. It's kind of amazing, it's

03:19

kind of bewildering. But I do feel like we're getting to a place now where, at least to a certain extent, the nuts and bolts of the data administration don't have to concern the analysts and the end users who just want to have fun with the data. What do you think? Yeah, absolutely. You know, it's funny. I think over the years, what has

03:38

happened is the data center of gravity has changed quite a bit, right? So it used to be that the data warehouse was the center of gravity, but we can't do that anymore, right? We can't just continually move all the data into the data warehouse. So we have things like data lakes,

03:53

and we have object stores that we can store our data in. But the center of gravity has changed because the volume of data is getting so much greater and we can't possibly load that in, and the number of people who want access to analytics has changed, and so we have different challenges now, we have big challenges that we can have. And so really what data fabric and data mesh really represent is that change in gravity, where the data sits, whether

04:19

it's, you know, local, in an object store, on the cloud, various places. I think it kind of represents accessing and managing and using that data, not in one central place, right? Yeah, and that's a really good point. And you know, you bring up data lakes, and of course now we've gone from data lake to data lakehouse to lakehouse architecture and, you know, all this stuff. And I think the key is to remember the

04:46

product focus, from a business perspective, of the data product. But, you know, I'm kind of reminded too of the NoSQL movement. So out of Hadoop you had this sort of NoSQL, which meant "not SQL" and sometimes

05:02

"not only SQL." And what happened is very shortly after that, these NoSQL databases started strapping SQL engines on top of them, because guess what, SQL is a standard. That's the Structured Query Language, for those who don't know SQL. It's a standard, and so you need that stuff.

05:19

But I remember when the data lake craze came about, I started wondering myself, are we making some of the same mistakes again that we made last time, where you just throw a bunch of stuff out there and then, okay, you hope to be able to find it? And people talk about the data swamp and things of this nature, and that's why you have people saying, now, lakehouse architecture. They're coming up with more clever ways to be able

05:41

to get access to it. But that's really the bottom line: access path and efficiency, right? Whatever your information architecture is, the key is you want to optimize the access path for the important users and you want to be efficient and durable and secure and all that other stuff, right, Steve? Yeah, I think we don't really have an ETL problem or ELT problem anymore, right? We're not trying to move the data around. What we're trying to do

06:04

is access it in place. You're absolutely right about that, Eric. And so, you know, we need to do some things in order to access in place. It could be data mesh or data fabric, and you know, with a data mesh, we want to make sure that we have access to the metadata so we can access that data on top of the data and then perform our analysis. With a data fabric, that metadata may not exist, right? So we may have to create a semantic layer, our own semantic layer, using

06:34

a graph or some other technology. And so again it's about accessing the data without having to move it into a single consolidated data warehouse. Yeah, that's a really good point. And the metadata, so I talked about that in the semantics and data catalogs and some of these other things. They are all

06:53

there to help us make sense of the data. And the other interesting development here from a business analyst perspective is that in the data warehousing world, we stripped out all the context to be able to get it through thin pipes to slow processors and be able to slice and dice the data. Well,

07:10

you don't have to do a lot of that stuff anymore. I mean, I remember the schema-on-read concept coming out of the data lake, and even that's a bit challenging, right, because it's going to slow the process down a little bit. You have to make sure you get that correct. But the general thrust I'm throwing out here is that you don't necessarily need to strip out so much context, read: metadata, anymore to be able to facilitate use down

07:34

the road, right, Steve? Yeah, that's absolutely true. You know, I think that metadata, that context, is super important for the organization. It kind of describes how you use the data and what you want to do with the data, and so stripping that out can be, you know, detrimental to whatever you want to accomplish in terms of business. Right, we do these things because we want to understand our business better. We want to drive three areas. Area one is we want to make sure that we can drive

08:03

additional revenue within the organization. We want to make sure that we're super efficient and using only the servers that we need to have and only the processes that we need to do. And we want to make sure that we're compliant. So if someone asked us for reports, we should be able to go out

08:18

to those data sources and very easily get access to that. And so, you know, having that metadata, having that sort of, whatever it is, fabric or mesh or whatever, to allow us to access those data sources and perform those three things, I think, is super crucial. Yeah, we have got a couple of good questions from our studio audience, by the way, so I'm just going to throw this one out there. One of our attendees is writing,

08:43

"So does that mean that users would access the data at the source? That is usually not a good idea because it could put too much pressure on source databases. Even with data mesh or fabric, don't we need to move the data to some central location for analytics access?" And this is a good question, but I will say you are starting to see a greater focus on federated data access and leaving the data where it sits, just to be able to

09:07

touch it as needed. But there is a point about overburdening source systems. That's why we came up with these things in the first place, right? We realized that SAP ERP was not easy to query, and so we pulled data out of that and put it into the warehouse. But what do you think about that? There is some concern, but I think the processes are getting more and more elegant and efficient at being able to do that. What do you think, Steve? Yeah, it's a case-by-case basis. I know

09:31

for Vertica, what we do is we do have data virtualization. So if, for example, you wanted to run a report and do a join against a table that's sitting in Oracle, that's pretty easy for us to do, and a lot of other databases have that too. So being able to go out and access that third party is key. It will tax the third-party system, though. And what I see a lot of companies doing, if they're concerned about that, is they're leveraging object stores these days, so they're taking a lot of their

10:01

data and just dumping it into an object store. That provides you with some pretty interesting capabilities. So a lot of the databases, including Vertica, have this whole concept of separation of compute and storage. So I have my data, it's sitting in a separate storage object store. I want to do some data loading. I spin up three nodes. I do data loading. I want marketing to run reports. I spin up three nodes. It

10:28

does the reports. You know, I want my various dashboards running really fast. I spin up five nodes for the CEO, and I run the reports. They're all operating on that same data, but we're able to access that through one location, through one engine. And we see that happening a lot in the market. It's happening within, you know, within our customer base, and it's kind of a new way of doing an architecture, that whole separation of compute and storage. Yeah, and these are all deep

11:00

architectural determinations that get made. And kind of where I was going with that comment about large language models is, I don't really understand how they work. But what I do know is they can get around all kinds of these issues, because you're dealing with multidimensional. Remember MOLAP, right, multidimensional online analytical processing, the MicroStrategy folks? You want multiple different dimensions.

11:22

But these large language models have like three point one billion vertices or something, and there are lots of different ways you can slice and dice stuff. But again, we don't know about the veracity, we don't know about the clarity. We're not sure. And of course, with something like a data warehouse, you want to be really gosh darned sure about what you're doing, right.

11:43

We want to know, you know, what the most current information is, what the most trusted information is, and we want to be able to use that. And so you know, to some extent, MDM used to help us with that. We used to take all the data and create an MDM system around that. I think this is the next evolution of ETL. It's an evolution of MDM. And you know, because we have such large volumes and everyone wants access to the data, data mesh, data fabric are

12:07

the next thing, the next evolution of that. Yeah, and so you have some of those principles baked in, right? I mean, I think that's the key, is that we've learned over the years. I remember asking myself a rhetorical question: is MDM the next SOA? If you think about

12:24

service-oriented architecture, we had fine-grained services and coarse-grained services, right? And even though no one really talks about SOA anymore, I think what happened is the principles of service-oriented architecture kind of got baked into how we do cloud, and now that's just the norm. But real quick, what do you think about that, Steve? Yeah, I think you're right. I think it is, you know, the basic way that we do cloud.

12:50

It's, but, you know, I can't tell you how many customers I talk to that have, you know, like a pub/sub system where they're publishing data and subscribing to data, Kafka or some of the open source tools that are available for that. So, you know, yeah, there is, you know, this concept of a data bus that I think a lot of companies deal with, including Kafka, that kind of helps that. Yeah, it's funny, you know, going down memory lane again. I worked for Daman Consulting back

13:20

in two thousand and one. That's when I got into this whole space, and Michael Haston, super smart guy and consultant, has the same birthday as me, and we were excited about that. He would talk about what he called an enterprise backplane, and what he was talking about is what you just mentioned, this bus. Basically, it's a staging area of data. And I meant to mention this in my opening remarks as well, that intelligent caching is older

13:45

than the hills. I mean, it's something we came up with a long time ago, and it's a very useful construct for being able to facilitate access to often-used data, right? Now, how you manage that, how you construct that, matters a lot and will have a big impact on whether or not it works. But it's not a new concept, right? I mean, caches are a huge part of data virtualization, for example, and now a big part of data fabric. That's what I was talking about, the preprocessing stuff.
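
A minimal sketch of a cache for often-used data, assuming least-recently-used eviction and a hypothetical `loader` function standing in for a slow source query. The speakers do not say which caching policy any product uses; LRU is chosen here only because it is simple to show.

```python
from collections import OrderedDict


class LRUCache:
    """Keep the most recently used results; evict the oldest when full."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader          # hypothetical: fetches from the slow source
        self.entries = OrderedDict()  # key order tracks recency of use
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark as most recently used and skip the source.
            self.entries.move_to_end(key)
            self.hits += 1
            return self.entries[key]
        # Cache miss: go to the source, then remember the answer.
        self.misses += 1
        value = self.loader(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Drop the least recently used entry.
            self.entries.popitem(last=False)
        return value


source_calls = []


def slow_source_query(key):
    """Stand-in for an expensive query against a source system."""
    source_calls.append(key)
    return key.upper()


cache = LRUCache(capacity=2, loader=slow_source_query)
for key in ["sales", "inventory", "sales", "orders", "inventory"]:
    cache.get(key)

print("source queries:", source_calls)
print("hits:", cache.hits, "misses:", cache.misses)
```

How the capacity and eviction policy are chosen is exactly the "how you manage that" part: a cache that is too small or evicts the wrong entries sends most requests back to the source anyway.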

14:13

And you know, the beautiful thing is machines. You know, unless you turn them off, they don't sleep. And machine learning unless you turn it off, is just learning and learning and learning. They're just kind of crawling around looking for patterns, and we humans are pretty darn predictable if you get right down to it. So if you do have a machine learning layer and it's monitoring usage of data, it's going to know when the peaks and valleys

14:35

are. And to your point earlier, you can spin up three nodes or spin up four nodes, or spin up two nodes, or whatever the case may be. Not that you're never going to have a hiccup again; there will always be hiccups, there will always be downtimes and things of this nature. But the point is we're getting really close. And I think that's the key with data fabric at least, is it's trying to be as prepared as it can be for whatever data usage you're going to need in the next hour,

15:01

the next day, whatever the case may be. Well, don't touch that dial, folks, we'll be right back on a fantastic episode of Inside Analysis. Welcome back to Inside Analysis. Here's your host, Eric Kavanagh. All right, folks, back here on Inside Analysis, part of the DM Radio Broadcasting Network. We're talking to Steve Sarsfield of OpenText Vertica and Eugene Burke

15:30

from Digital Strategies Group. Eugene, you heard us ranting and raving about data fabric versus data mesh in the opening, and I had a great question around logical data models and where this is all going. I mean, you know, when I think about these large language models, again, they have absorbed far more than just text. They have absorbed concepts. They have absorbed

15:52

formulae, whole spreadsheets. You know, one of my buddies was saying, he goes, "I just go to these things to get the numbers of things," because that was absorbed as well. Now again, you do have this whole issue of moorings and anchors of truth, as some people call these things, and you have to worry about all that. But what's your take on the data mesh versus data fabric religious war, or is it that big a deal at all? So yeah, so, Steve, I guess I would have a two-part

16:19

question to get us started on this segment. Are data fabric and data mesh twin sons of different mothers? Are they destined to fight, or do they have different objectives and different mountains to conquer? And how do they relate to LLMs and this kind of adoption of a completely different paradigm for enterprise ask-and-answer computing? I can answer that if you want. You know, I kind of look at data mesh and data fabric as, I look at the origin

17:00

story. You know, every superhero has an origin story, right, and I think data fabric and data mesh have different origin stories. So in the case of data fabric, the origin story has a lot to do with graph databases. You know, graph databases, to some extent, they are a solution looking for a problem. To some extent; I don't want to completely paint them like that, but you know the problem that they

17:30

solved. The one problem that they really solve is that if you have disparate data, and that data is sparse, and that data doesn't have any metadata attached to it, it's sort of like an unknown, graph databases do a really good job of building linkages between the data. Data that's sitting in different files, data that's sitting in different columns and rows, and so graph databases do a really good job at that. Thus, data fabric. The,

18:03

uh, you know, Eric mentioned the Hadoop model, and I think the origin story of data mesh is Hadoop. You know, we have data that's kind of sitting in there in the data lake. We want to have access to it. We want to make sure we know, probably, that the metadata is accurate, and we know that maybe our company grew through acquisition. So I've

18:25

got data warehouses, multiple data warehouses in our organization. I've got a data lake in our organization, but the metadata associated with that is probably pretty good and we can sort of kind of trust it. That's data mesh, and putting all that data together is kind of where that comes from. So when I look at data fabric and I look at data mesh, I look at those origin stories, and it doesn't really answer, though, which one I should use.

18:52

And so, you know, if you kind of turn that around and you are a pharmaceutical company and you have a lot of disparate data, and the data is a little bit messy and it's a little bit sparse, maybe the thing to do is to set up a semantic layer, access that, and use a data fabric. If you're a company that has grown through acquisition, you've got multiple data warehouses, a CRM system and an ERP system. But guess what, the

19:15

data is pretty good, pretty fit for use. Maybe data mesh is the solution, and I think that's the big difference in my head between those solutions. Eugene, do you want to comment on that? Sure. I guess the follow-on is, do you agree with some people's assessment that the data fabric is more IT-driven and a data mesh is more business pull, or business-driven and organized around the business topology or topography?

19:49

So what mesh is trying to solve is to put the business back in the data driver's seat. Yeah, I think that's true. And you know, one of the reasons for that, again, is the introduction of a graph database and a semantic layer, right? That's a pretty tough thing to do. It requires a specialized set of knowledge that not a lot of people have, only a handful of people. And trying to understand what SPARQL and Cypher are, and trying to understand basically what a triple is versus, you know,

20:22

a standard column and row, that requires specialized knowledge that we just don't have normally in the normal database world. So it is IT-driven. It usually has a component of services and technology that are bound together, that kind of work together to create that semantic layer. With data mesh, you know, we could almost pull that off if we have a good understanding of metadata and, you know, how to manage all of those and data virtualization and some of the

20:55

other tools that you might use for a data mesh. Yeah, that's, you know, it's more of a business initiative. Yeah, thank you. You guys could comment on that. That's how I see it, though. Yeah, you know, I also have had difficulty wrapping my head around triples, right? And I remember when the semantic web was going to solve all the world's problems and it never really kind of got there. Now there are semantic layers that you can use for a database, and that's a very useful thing.
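
For anyone else who has had trouble with triples versus a standard column and row, here is a minimal sketch in plain Python. The identifiers (`customer:1`, `placed`, `order:7`) and both helper functions are made up for illustration; this is not SPARQL, Cypher, or any graph database's API.

```python
# Hypothetical customer record in ordinary column-and-row form.
customer_row = {"name": "Ann Lee", "city": "Boston"}


def row_to_triples(subject, row):
    """Turn one row into (subject, predicate, object) triples."""
    return [(subject, column, value) for column, value in row.items()]


def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the given parts; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


triples = row_to_triples("customer:1", customer_row)

# A link to data from a different source is just one more triple,
# with no shared table schema required.
triples.append(("customer:1", "placed", "order:7"))
triples.append(("order:7", "total", 120))

linked_orders = match(triples, subject="customer:1", predicate="placed")
print(linked_orders)
```

The row fixes its columns up front, while the triple form lets new facts and linkages be added one statement at a time, which is the property graph approaches lean on for sparse, disparate data.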

21:23

It's very similar to data catalogs, right? I mean, the data catalog is there to capture the meaning of things and to enable business people to connect dots in their systems, basically, right? I mean, that's what a data catalog is supposed to do. But again, these things are all sort of, it's just interesting how they're all sort of moving forward at their own pace. Although I'd never heard before until now, so thank you, that the origin story of data mesh was Hadoop. I did not know that. Listen,

21:53

I may be wrong about that. All of these technologies, though, you know, they rely on multiple technologies, right? It's the coming together of multiple technologies. So we have databases, and databases that used to run data warehouses are having capabilities around data mesh, right? They're able to go and access data that's outside of them and perform analysis on them like they couldn't ever before. And we have query engines, you know,

22:22

Presto and Trino and some of those technologies. They can access data that's sitting in a data lake, that doesn't have any metadata, that's not sitting in a database. Visualization tools, graph databases, data virtualization, data catalog. So a lot of those things are coming together as technologies to form data mesh, data fabric. Yeah, that's a really good way to put it. A data fabric, I think, is the amalgam of all these different things.

22:48

Yeah. It is designed to be an efficient and thoroughly capable data foundation to handle whatever data uses the business may have. Which is interesting too, because it's almost like we're moving in the direction of so-called HTAP, right, hybrid transactional analytical processing, which always made me kind of wonder. There was this approach where a query would come in and you could have sort of a sniffer there awaiting. It's like, okay, is this an analytical query or an

23:18

operational query? And if it's operational, okay, I'll go this way. If not, then I'll go that way. I wonder about that. Anytime you have these sort of if-then statements at the foundation of a database technology, well, that's going to affect performance, right? I mean, all this, when you get right down to it, has to do with performance. Can this thing perform the tasks I want it to perform quickly enough, efficiently enough,

23:42

and accurately enough? And you know, that's where I see tremendous pressure from these LLMs on basically everything that's in the data stack, everything that's in the information stack, because, you know, the key is to have your embedding strategy and to know, and that gets back to your point, Steve, from earlier in the show, that you, as an organization, need to start getting your data prepared for these large language models and make sure that it's trusted, make sure that

24:12

it's governed, understand your processes. And one of the things that you can use to do that is a large language model, because they're actually pretty good at being able to ascertain and then articulate specific processes that you need to go through. You know, it's really interesting. I was talking to Steve Lucas, the CEO of Boomi, former president of SAP Americas. He was over at Marketo for

24:34

a while. He was saying, when they got in there, he started looking at what they were seeing inside their own systems, and he asked it, "Show me all the different versions of order-to-cash that we have." It was like, okay, that just started, like, holy love it. Wow. So we're just now scratching the surface. Now, it doesn't mean all this other stuff is going to go away, not right away, but it does mean you have to get your data ready, and that means quality checks,

25:06

that means lineage checks, things of that nature. What else would you suggest, Steve, and then maybe Eugene, from your experience? How can organizations get ready for that? How can you prepare your data for large language models? First, Steve. I mean, a big one for me is access and security, right? That is a huge one. A lot of companies have PII sitting in a data store somewhere. It may be encrypted, it may not

25:32

be. How do you identify that PII and make sure that you're not exposed as a company to all the potential fines that you could get around that? That is a regulatory mess that could, you know, you want to make sure that

25:48

that problem is solved. We have some technologies at OpenText that allow you to actually go and take a look at even free-form text, even video and audio, allow you to access that and take a look and make sure that there's no PII there, so that, you know, you're not vulnerable to lawsuits and stuff. So security, access, encryption, that's a real key one for, you know, making sure that your data is in order for the next generation. Yeah, Eugene, that's one I can probably go on, but

26:22

Eugene, any? So, there are few things like an LLM for exposing flaws in your data, and one of the flaws it exposes would be security holes. So to your point, if you have PHI or PII and the LLM has a way to find it, it will find

26:41

it, right, because that's what it's built to do. And so now is the time to understand your IAM architecture and to make sure that if you're going to use a large language model for customer service, patient service, provider service, that you really understand the pathways for accessing your highly sensitive data. Because you don't want to have an audit come up to say, okay, here's violation, violation, violation, or, God forbid, a breach, right,

27:15

because you didn't adequately think through your architecture. Yeah, so that's a technical component. You know, there's also sort of the operational components of it, right? You know, with data governance we used to talk about people, processes and technology, and so the people and the processes are also something you need to look at. How is data handled, is data copied, you know, regarding processes, and then people: who has access to it,

27:42

how are they handling it, and so on. So again, you know, there is a component of data governance to data mesh and data fabric. So the mixture of a Wild West data culture and LLMs is quite potentially dangerous. Yeah, that is true. The other cool thing here, and some of the cooler technologies I've come across have this capacity to scan your environment. Some of the data catalogs have this capacity. Very, very useful stuff. There's a

28:12

company, it's really more in the security and governance space, ExtraHop. I haven't taken a briefing from them in a long time, but what I loved about them is that they will scan. They basically just siphon off your network traffic and then create a digital twin of your entire information landscape to show you every database, every application, anything that's touching it. You see an object for

28:34

that, and that's the kind of tech that you can use. And of course LLMs, I mean, like I said, we've said before, if you point one at your information architecture, it's going to start sucking that stuff up. And if you weren't careful about what went in there, it's going to be very difficult to get it out. It's like unlearning things. It's hard to unlearn something, you know. And I often use the analogy of raising children

28:57

to explain how to train your large language model. If you've got a two-year-old, you don't want to let her hang out with a bunch of gangsters in the hood for a while, because they're gonna absorb all this information, these behavioral patterns. So you have to be careful about how you train, what access you give to these models, just as you do raising your child, because all of a sudden your kid'll be reflecting things back to you that you don't like, and you're like, well, where

29:22

did that come from? Well, I don't know, where did you allow this child to go? What did you share with this child? Well, just a funny story. One of our babysitters watched with our, I guess, three-year-old the Chucky movie, like Chucky, the little killer, and we're just like, all right, why did you do that? Now our kid loves these really dark, disturbing movies. We're like, all right,

29:44

I don't know that was the best move on our part. But you know, once it's in there, you're not gonna get it back out unless you, like, unplug the whole thing and start from scratch. That's not gonna be fun. That's not gonna be fun. And I think that this is a major trend line that we are going to see here over the next number of years. It's going to be very interesting. But this is what I see. We'll pick this up after the break and see what Steve and Eugene

30:10

think about this. But I think most large organizations, even mid-sized and small companies, are going to pick their poison. They're going to choose: okay, I'm going to use Bard, or I'm going to use OpenAI, or I'm going to use Anthropic. I'm sure some others will come out, and they're going to begin this process of training that large language model, that AI model, on their corporate data. Well, that process, I promise you,

30:33

is a really, really important process. And I'm often throwing out this concept. I'm saying it's a second chance for data. And what I mean by that is we've spent the last forty-odd years doing all sorts of things to move data, cleanse data and enrich data, load data, access, analyze, parse, all this stuff to get some value from it. And now this is a big reset. We're going to hit the reset button on data. And it's called the large language model. But folks, don't touch that dial.

31:02

We'll be right back. You're listening to Inside Analysis. Welcome back to Inside Analysis. Here's your host, Eric Kavanagh. All right, folks, welcome back here to Inside Analysis, part of the DM Radio Broadcasting Network. Your host here, Eric Kavanagh, with Steve Sarsfield of OpenText Vertica and Eugene Burke of Digital Strategies Group, and I wanted to throw this question at both of you. One of our attendees is writing, talking about logical data models.

31:37

Right, we had logical data warehouses, which was a sort of virtual data warehouse. There are lots of different ways you could do these things, and he writes that LDMs are the semantic layer that is missing: "I presently create LDMs to document primary keys, alternate keys, business definitions, relationships, PII classifications, and many other things that are not in the physical implementation layer." That's a very clever way to go about things, and I think that you

32:01

could even load some of those as embeddings into your large language model. But these are really interesting observations, so I'll throw it over to Steve first. You know, I've got this concept that these large language models represent a second chance for data. What do you think about all that? Well, yeah, I think that's true. You know, it's interesting how over the years

32:23

we have these different ways of accessing and managing data. I think you're right that large language models will be the way that we create semantic models and access data in the future. So yeah, I'm not sure I have any more to add to that. That's a great way to access data. Yeah, I'll throw it over to Eugene, because it's not like, from my perspective, it's

32:47

yeah, yes. First of all, these are text generators. That's really what they're designed to do. Of course they're also art generators and different things like that. But what I'm really thinking of, in the darkest corners of my mind here, is that, you know, a multidimensional structure that has like four or five dimensions can be pretty complex. These things have billions of vertices, which means the complexity is through

33:15

the roof. It's just all over the darn place. And that allows you... and I'll throw in one last little story, Eugene, and then you can comment on it. I remember watching Carl Sagan when I was like ten or eleven on his show, Cosmos, and he was like, billions and billions of stars, and he gave this presentation where he had a table and he had a bunch of little pieces of paper, like squares and circles, on the table, and he goes, this is a two dimensional world, and all these little creatures are in their two

33:42

dimensional world. Well, imagine if someone could come along and pick up one of these two dimensional creatures and lift it into the air, and now all of a sudden it can see all the two dimensional creatures floating around. What kind of impact would that have on your thought processes and on your ability to extrapolate and come up with new ideas? Oh, I mean, I can see that like it was yesterday, and that was like forty two years ago

34:06

that I saw this. But it was just such an excellent way of articulating the power of perception and of perspective. But what do you think of all that, Eugene? Is it a second chance for data? It is. And so back to the analogy, or the story, of

34:25

feeding something that a three year old ought not to have. If you're trying to use LLMs in a customer interaction scenario, you only get one chance to make that first impression, and if the consumers or customers lose trust in the implementation of a large language model, it's going to be very, very difficult to get it back if you don't have the guardrails to say, oh, it's still learning. Well, most people won't understand that. But if it's

35:01

a large language model doing bank customer service or provider customer service, it needs to be fed the good Gerber food, right? And so here's the second chance for data: feed it only the good stuff, right? So thinking about strategy for preparing these models, maybe don't expose it to kind of the swamp, right, because it's going to ingest some of the swampiness. So I was reading yesterday a really incisive kind of analysis. Tech debt is

35:37

one thing. Data tech debt is actually more insidious than that, because once your business customers, your internal customers or your external customers, lose trust in the data that you're showing them, the information that you're saying represents the business, or represents your customer status, and it's wrong, then you have almost no end of pulling your hair out. So I guess the moral to that story is prepare your models very well. Hey, just an

36:15

add on to that, it's pretty interesting. I read an article in Time magazine that talked about how ChatGPT used Kenyan workers to actually get rid of the toxicity. Right, so they used workers to actually go through the data and say, is this toxic content that we're putting into ChatGPT or not? Right, and that was one of the ways that they got it so that they didn't have too much toxic information in ChatGPT.

36:44

But you know, what I think we should think about is, you know, how can human intelligence and AI intelligence augment each other as we're building these models? So we can look to that too, kind of as a test. Yeah, that's an excellent point to make, Steve. And I think the upshot of what I'd like to share with the audience here today is that your data warehouse, your data marts, the things that you've worked very hard on need to be a crucial and foundational component of your AI model, and

37:19

they should be front and center as a trusted source. That is the primary source that you want to use, is something like a data warehouse, because you have governance, because you have all this attention that's been paid to the model. And remember, the data of your organization reflects the organization itself,

37:37

you know. And I think we could see some really interesting things happening with dynamically generated data models that look at the data and the flow of data and go, you know, maybe you should reconfigure your data model to better reflect this new way that things are working in your organization. I think that's coming too. There are a lot of fun things that can kind of

37:58

spin out of this. But you know, the other fun point I'd throw out here, and maybe get a comment from each of you, is I've been seeing that when you connect these engines to your data sources, there are lots of cool things that you can do. And think about if you've ever done a Google search for instructions for a particular app: how can I use XYZ app? And you get something that says, okay, step one, go to the file and click on the red button. And you go there and

38:24

there's no red button. You're like, all right, what is this talking about? Because it's an old version. It's an old version of the previous app or the previous version of the app. It is a very difficult problem to solve, or at least has been historically, because you don't control Google search engines, right, you don't control that stuff, and so they're just

38:40

going to find old stuff that is difficult to manage. Well, if you do this correctly, the large language model can sense, like change data capture, when something has changed, and when the source file has changed, it goes, aha, update, and let's go grab that stuff dynamically. So you start thinking about that, and that's really powerful stuff, because when you make the change to the system of record, it's almost instantaneously reflected in the large language model that you're using to interact

39:07

with the environment. But closing thoughts from you, Steve, what do you think? Yeah, I just want to comment on that. So databases have things like indexes and materialized views, and we have something called projections, right, projections. What they allow you to do is, based on your queries, how can I optimize the data? So that is all AI-driven now for a lot of companies out there, for Vertica and for some of the

39:31

other database companies that exist. So I'm watching, as an AI, I'm watching the queries that take place, and I'm saying, you know, if I had this materialized view, or if I had an index configured this way, I could switch it around and I could run that query ten times faster. Some of that functionality exists today. That's something that we're always looking to build out, you know, for optimizing speed. So yeah, I mean there's that component too. Yeah, and I think that you can also again

40:04

dynamically provision these things. So that's the other fun thing my buddy Loo Simon was mentioning, is he's like, think about this. Historically, you had to learn to speak computer in order to talk to your computer. You had to learn the structured query language, you had to learn Python, you had to learn some language to be able to communicate with this machine. You don't have to do that anymore. Now you can use natural language to communicate with

40:30

these things. And I mean, I'm telling you this prompt engineering stuff, it's really it's taking off as well it should, and it's going to be amazing when we hit what's called interactive AI, which I think is really the next big phase that's coming, and it's going to be wild because if you let these intelligent bots loose on your environment, we'll start I absolutely agree with

40:49

that. I think the next big thing, the next big Snowflake, that next big database if you will, or analytics engine that comes to market that solves that problem, that says we're going to be really simple, you're going to type natural language queries and we're not going to care at all about SQL or SPARQL or Cypher or any kind of language, we're going to answer that query based on that language that you enter in. I think the company that

41:15

does that is going to succeed and be the next big thing. Yeah, I think you're right. Well, folks, this has been an absolute blast talking to two experts, Steve Sarsfield of OpenText Vertica, look them up on LinkedIn, and Eugene Burke of Digital Strategies Group. Things are changing very,

41:32

very rapidly. And you know, one of the fun quotes I heard the other day with respect to understanding machine learning and artificial intelligence and where it's all going was learning meaning human learning, meaning we as humans need to learn how these things work. And the way you do that is by playing with

41:51

them. Send me an email, info at inside analysis dot com. You've been listening to Inside Analysis, and now it's time for today's podcast bonus, in which host Eric Kavanagh talks about the transformative impact of large language models such as ChatGPT and other predictive analytics tools on the future of data analysis and decision making

42:14

in the industry. All right, ladies and gentlemen, hello and welcome to this virtual summit on Inside Analysis. Yours truly, Eric Kavanagh, here. I can never miss an opportunity to promote Future Proof, the world's first made-for-TV webinar series. Very excited about that. Check your local listings. We're now in Washington, DC and Silicon Valley and Los Alamos and lots of other fun places, so check your local listings or hop onto YouTube to see past shows.

42:42

Let's dive right in: data fabric versus data mesh. Cut from the same cloth? Yes, indeed. So let's talk about where this really comes from, and that's the modern data stack. I'm sure many of you've heard this concept,

42:53

the modern data stack. One of my favorite lines about this is that we kind of sacrificed state at the altar of scale. And what I mean by that is we broke apart all the different component parts of a database into separate layers for storage, for integration, for processing, analytics, orchestration, semantics, governance, artificial intelligence, machine learning, of course, security. And well, guess what? All of that can be done inside of a database.

43:20

But the modern data stack really attempted to solve for scale issues, to be able to scale out any one of these component parts, to scale out and then to scale back down. That's what you want, that's the optimal scenario, because I'm sure many of you recall back in the day, before we had something like the modern data stack, you had to provision for the highest workload irrespective of whatever workload you had or what your budget was going to be.
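To put rough numbers on that provisioning trade-off, here is a minimal sketch in Python. The hourly demand figures and the unit cost are made up purely for illustration, and the function names are hypothetical rather than anything from a real product. It compares paying for peak capacity around the clock with paying only for what each hour actually needs.

```python
# Hypothetical hourly demand in arbitrary "compute units" (illustrative numbers only).
hourly_demand = [2, 2, 3, 10, 4, 2]
UNIT_COST = 1.0  # assumed cost per compute unit per hour


def fixed_peak_cost(demand, unit_cost):
    """Cost when capacity is sized for the highest workload for every hour."""
    return max(demand) * len(demand) * unit_cost


def elastic_cost(demand, unit_cost):
    """Cost when capacity scales out and back down to match each hour's demand."""
    return sum(demand) * unit_cost


peak = fixed_peak_cost(hourly_demand, UNIT_COST)
elastic = elastic_cost(hourly_demand, UNIT_COST)
print(f"provisioned for peak: {peak}, scaled to demand: {elastic}")
```

With these assumed numbers, the gap between the two totals is the money spent on capacity that sits idle outside the one peak hour. A real comparison would also have to account for the overhead of scaling and of managing the extra moving parts.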

43:44

And at times when you had peak usage, that was okay. But when you don't have peak usage, you're spending a lot of money that you don't really have to spend. So that's kind of where this thing came from. A panelist is having trouble getting in, so let me try to multitask while I'm talking to you here. But basically what's happening here is you have this situation where we're trying to be able to, again, leverage the power of compute wherever it is needed. So if I need to

44:15

ingest a ton of data, that's what I do. If I need to process a bunch of data, that's what I do. If I need to scale out the governance components of the equation here, then maybe that's what I need to do. So that's where the modern data stack came from. But there are a lot of component parts that, well, they complicate things, because anytime you have multiple parts, well, there are connections between all these parts, and that can slowly but surely cause you some trouble, so you want to watch out

44:42

for that, and that's one of the downsides. But let's kind of dive deeper in. So what is the data fabric? It is a substitute for a database. It's supposed to be more flexible, more versatile, more durable, faster, better, easier to govern. All that fun stuff. Sounds great. Is it easier to manage? No, it's not easier to manage. It's going to be significantly harder to manage than a single database. Right, and I've been joking to myself: are DBAs going away? Will we have

45:08

DFAs, data fabric administrators? I don't think that's going to happen. I think you're still going to have database administrators. Really, data engineers is the key. So data engineers are sort of the new DBAs, if you will. So DBAs will still be around, they're just doing slightly different things. So what else is a big part of the whole world of data fabric? It's this whole concept of automation. Right, that's what we really want to be

45:34

able to do is automate things. And so we're automating various components of integration. So think preprocessing, for example, think monitoring for usage patterns and then provisioning additional access at a certain time, like maybe at the end of the week, maybe at the end of the month, for example, when you have a close to do. Situations like that, you want to be able

45:57

to get additional resources. Well, with automation of a data fabric, you can pre provision, you can pre process data, and that's a big part of what facilitates the data fabric's ultimate goal, which is to make life a lot easier for the consumption of data, whether for people by people or for

46:14

machines, whether for machine learning, et cetera. You want to be able to detect these patterns, and that's actually a really strong use case for machine learning, because even though we might think that we're not very predictable as human beings, the truth is we are very predictable, and our behavior patterns can

46:30

easily be ascertained, understood, and then codified by machines. And so that's a big part of data fabric: to watch for patterns of usage and then be able to pre-provision, pre-process data, for example, to do some work before someone shows up such that it's already ready to go. So I do want to mention one fun thing about data fabrics. One of the leading companies, according to Gartner, in the data fabric space was Talend. Many of you may know Talend. Lots of companies use Talend. It was an open source

47:00

company. Talend Open Studio has been really an open source stalwart for the last, gosh, twenty years almost. I think it was around two thousand and three or so that they started to take off. I remember talking to them in two thousand and five when I worked for the Data Warehousing Institute. Well, Qlik bought Talend. The company Qlik, sorry, Qlik bought Talend. Qlik, of course, is a business intelligence platform, and they finally finished their acquisition of Talend a

47:28

number of months ago, and guess what happened? They announced that the open source version of Open Studio is going bye bye, that's going away. Well, as I said, they were a leader in the data fabric space. So what does this mean about the future of data fabric? Short answer is, I don't really know. And you can look it up online, you'll find lots of commentary about this. It is a hot topic for sure, but the bottom line is it is going to be closed source from now on, so now branded.

47:55

There are lots of integration vendors that are not open source. Informatica is not open source, Matillion is not open source, Ab Initio is not open source. Lots of companies in the integration space are not open source. But Talend was, and it was very well known for that. It was kind of like the Lancelot of open source. Right, the strongest knight in King Arthur's court, if you will, was Talend, and that is now not

48:16

open source. It's all going to be closed source. What does that mean for the future of data fabric? I'm not entirely sure, but it's probably not the best news. Thanks so much for your time. You've been listening to Inside Analysis. This segment brought to you by Christmas and The Fat Greek in Yucaipa. Still have some Christmas shopping or gifts to buy for that special person, business associate, or friend? Nothing says love like a Fat Greek food

48:40

certificate. Nick, Chris, and their families at the Fat Greek want you to know about their holiday gift certificates. You can also get whole holiday meals for the family or friends too: quick, easy and affordable, and you get money back too. You could see more about the Fat Greek on the big Yucaipa LED monster signs on the ten Freeway, the gateway to Yucaipa. Fat

49:00

Greek is your holiday relief station. Kick your feet up and enjoy the holidays with all kinds of Greek comfort food to take that load off your feet, and you might want to partake in an adult beverage at their full cocktail bar. You can also catch a game on their big eighty five inch TV too. The Fat Greek in Yucaipa on Yucaipa Boulevard across from the golf course near Oak Glen Boulevard, open every day except Tuesdays. You can pre order a holiday

49:22

meal at gofat Greek dot com. That's gofat Greek dot com, and Happy Holidays from the Fat Greek. And while you're at it, don't forget to say opa this holiday season. Was your car involved in an accident, or just need help with dents? All Magic Paint and Body Collision Centers, in business for over thirty years. Their highly trained staff and certified technicians and friendly staff are the best in

49:53

the business and treat each car as if it was their own. All Magic Paint and Body Collision Centers are family owned and offer state of the art equipment and tools to ensure optimum results. They use the latest technology in computerized color matching and specialize in frame repairs with their modern laser measuring systems. They're OEM

50:12

certified and they have four locations to serve you. All Magic Paint and Body Collision Centers offer rental car assistance with free drop off and pickup services too, and their work has a lifetime guarantee. All Magic Paint and Body Collision Centers are in Norco, Eastvale, Moreno Valley and in Fontana. Call them at one eight hundred and sixty one Magic. That's one eight hundred sixty one Magic. All Magic Paint and Body Collision Centers, one eight hundred sixty one Magic.

50:37

All Magic Paint and Auto Body says, drive carefully. This segment is sponsored by Dickey's Barbecue, now in Yucaipa at three point thirty five sixty two Yucaipa Boulevard in the Bomb Shopping Center. Dickey's Barbecue, where you can get sauced with five delicious barbecue sauces. For the holidays, there's the Dickey's holiday feast options, everything you need for a festive gathering with delicious, hassle free meals if

51:07

you just heat and serve. Whatever your needs are, they have the perfect option, including the complete feast, the dinner feast, or the single holiday meats and sides, available for pickup and delivery from Dickey's. And there's no charge for kids on Sundays. In fact, the kids get free ice cream. Dickey's Barbecue, now open in Yucaipa at three point thirty five sixty two Yucaipa Boulevard and the Bond Shopping Center. Dickey's Barbecue, whatever your needs are. This segment brought

51:37

to you by Christmas and the Fat Greek in Yucaipa. Still have some Christmas shopping or gifts to buy for that special person, business associate, or friend? Nothing says love like a Fat Greek food certificate. Nick, Chris, and their families at the Fat Greek want you to know about their holiday gift certificates. You can also get whole holiday meals for the family or friends too:

51:59

quick, easy and affordable, and you get money back too. You could see more about the Fat Greek on the big Yucaipa LED monster signs on the ten Freeway at the gateway to Yucaipa. Fat Greek is your holiday relief station. Kick your feet up and enjoy the holidays with all kinds of Greek comfort food to take that load off your feet, and you might want to partake in an adult beverage at their full cocktail bar. You can also catch a

52:22

game on their big eighty five inch TV too. The Fat Greek in Yucaipa on Yucaipa Boulevard across from the golf course near Oak Glen Boulevard, open every day except Tuesdays. You can pre order a holiday meal at gofat Greek dot com. That's gofat Greek dot com, and Happy Holidays from the Fat Greek. And while you're at it, don't forget to say opa this holiday season.

52:44

Hill's Country Kitchen is now open for business. Hill's Country Kitchen, the newest restaurant in Yucaipa, at the corner of Fifteenth and Yucaipa Boulevard across from Crafton Hills College, located in the collection at Crafton Hills Shopping Center, along with Laser Legacy, the original Rosie Hills Country Kitchen, where you're always welcomed. Hill's has the recipe for delicious breakfast, reasonably priced lunches, and amazingly scrumptious dinner.

53:08

Hill's Country Kitchen in Yucaipa is now open for breakfast and lunch and amazingly scrumptious dinner, where you're always welcomed. Hill's has the recipe, at the corner of Fifteenth and Yucaipa Boulevard, across from Crafton Hills College. Redlands Auto Electric reminds everyone that the blood you donate gives someone another chance at life. Someday that someone might be a friend, a loved one, or even

53:38

you, so please give blood and give the gift of life. This message courtesy of Redlands Auto Electric at one one sixty five West Park Avenue in Redlands. Known for quality, integrity and knowledgeable service, call nine O nine seven ninety two four seven seven six Redlands Auto Electric on the air because they care. This segment sponsored by the All RAMS Market, RAMS Express Car Wash and the RAMS Ultimate One Stop Shop at twenty ninety mentone Boulevard on the corner of

54:07

Crafton Hills Avenue, just across from Jacento Farms. RAMS is now open, all new, well stocked, bringing fresh and sharp convenience to Mentone. With their clean, safe chemicals, they'll leave your car all shiny and protect it. When you buy gas inside the one stop shop, you'll get savings on your car wash. Make life easy and do it all with RAMS. You can

54:28

save with the Dinosaur and your top tier Sinclair Fuel. RAMS has great car wash membership deals, and you can save on gas when you wash your car. Satisfy your appetite when you drop in the one stop shop convenience store. Smell the chicken: delectable, mouth watering, crispy, crunchy fried chicken, that is. RAMS is now open at twenty ninety Mentone Boulevard in Mentone. Thank you to RAMS for their community spirit and sponsoring this radio station. RAMS in Mentone is

54:58

waiting for you. As a small business owner or decision maker looking to start a new business, are you frustrated with having to handle all the administrative business for your company, a constant turnover of bookkeepers and staff to keep the business running and out of trouble while you are trying to grow your company? You

55:19

can't do all this paperwork and expand your business too. Executive services can provide your company with chief financial officer services including tax return preparation and advice, bookkeeping, financial statement preparation and analysis services, loan package preparation, payroll marketing services, notary services, new business formations, business liability insurance services, and an on call CFO to help you with any business questions or issues you run into.

55:45

Call Executive Services at eight hundred seven oh seven fourteen seventy seven now to get your company organized and have peace of mind. For only a couple hundred dollars a week, you can have an experienced CFO on call to handle all of your company's administrative issues and problems. Call eight hundred and seven oh seven fourteen twenty seven or visit the Executive Services website www dot xscbs dot com for more information. KCAA Loma Linda, The Legacy KCAA ten fifty AM

56:15

and Express one oh six point five FM. You're listening to The Inland Talk Express ten fifty AM and one o six point five FM. KCAA Loma Linda, you're listening to an encore presentation of this program. KCAA, The Inland Talk Express. Welcome to the Fabulous Lifestyle radio talk show. I'm your host, Sherry Marie robi Rosa, here on KCAA broadcasting network. We're affiliated with CNBC and NBC News and Sports, where we cover over five million households in Greater Los Angeles.

57:00

If you've missed our previous shows, you can watch us on our TV streaming channels, distributions on Roku TV, Amazon fireTV, and the Android app. Just subscribe to the Building Solid Foundations channel. Hello friends, I'm the fashion host for the Fabulous Lifestyle radio show. My name again is Sharry Marie rob Rosa, also known as Mommy Majesty. I'm a life and fitness coach, a fashion designer and stylist, an author and speaker, and an interior

57:35

designer, and the mother of twelve children. Yes you heard that right, twelve children. I'm also the CEO of Mommy Majesty Fitness, where I love to coach women through my signature program that I created and used after each of my twelve pregnancies to lose sixty to eighty pounds and to get my hourglass shape back twelve times. It's my heart's desire to empower women to show up as the most beautifully fit, stylish, confident version of themselves, starting from the

58:10

inside out. So that they can achieve all their dreams and goals in this life. I'm so excited to be back here with you again today to dive into one of my favorite topics, the world of fashion. Today, we're going to be discussing part three of our series, How to Create Your Personal

58:30

Style Portrait. Then after the commercial break, we're going to hear from our very special guest, Latasha Finnel. She is the CEO of Boss Lady Bling Blingy Jewelry and she creates some beautiful pieces, so you won't want to miss hearing from her. So we're going to discuss part three of our series, How to Create Your Personal Style Portrait. But first I wanted to recap Part one and Part two for those of you who may have missed those episodes.

59:01

So in part one, we discussed our mindset and we covered five of the most common negative thoughts that keep women from believing that they are able to create their own style. The first one is, I have to lose weight before I can be stylish, and we discussed why that's not true, that you can

59:22

be beautiful and stylish at every weight and every age. And the second thought was that it's going to take too much time to be stylish, and we discussed that, while that is true at the beginning, it takes a little time to figure out your style and to curate your wardrobe, but after that, you're

59:40

going to save yourself tons of time getting ready in the future. And the third negative thought was that it's going to be too expensive, and we discussed why that is not true, because there are many beautiful, stylish pieces at every price point available to all of us. And the fourth thought is that people will judge me if I show up looking beautiful and stylish, and we discussed that while that

Transcript source: Provided by creator in RSS feed: download file
