Agentic RAG with Ed Charbeneau

Speaker 1

00:01

How'd you like to listen to .NET Rocks with no ads? Easy. Become a patron. For just five dollars a month you get access to a private RSS feed where all the shows have no ads. Twenty dollars a month will get you that and a special .NET Rocks patron mug. Sign up now at patreon.dotnetrocks.com. Hi, this is Carl Franklin.

Speaker 2

00:24

And this is Richard Campbell.

Speaker 1

00:25

We've got two special shows coming up soon, episode nineteen ninety nine and two thousand.

Speaker 2

00:32

For episode nineteen ninety nine, we're collecting people's Y2K stories. What did you do to help the Y2K event not actually happen?

Speaker 1

00:40

And for episode two thousand, we're going to be sharing stories about how .NET shaped your career.

Speaker 2

00:46

We have a special page at dotnetrocks.com/voxpop where you can record messages for us that we can play on these special episodes. So tell us what you did for Y2K and what .NET means to you, and of course how long you've been listening to .NET Rocks. So go to dotnetrocks.com/voxpop now and leave us a message before the thought evaporates like whiskey left in a glass overnight. Do it.

Speaker 1

01:22

Hey, it's .NET Rocks. I'm Carl Franklin, I'm Richard Campbell, and Ed Charbeneau is here. We're gonna be talking to him in a little bit. But first we've got our opening business to do. Ed, feel free to jump in if you need to. Will do. All right. This being episode nineteen ninety seven, let's start with what happened that year. So, world events and cultural things: the handover of Hong Kong to China. In July, the UK officially transferred sovereignty of Hong Kong to China after one hundred and...

Speaker 2

01:51

Fifty six years.

Speaker 1

01:52

One hundred and fifty six years.

Speaker 2

01:54

Yeah, and you remember China promised one country, two systems for fifty years.

Speaker 1

01:58

You know, promises don't mean anything on the world stage.

Speaker 2

02:01

Apparently that event had a huge impact in Vancouver, because in the couple of years leading up to it, a huge number of well-to-do Hong Kong folk (they were actually British subjects as well) bought homes in and around the Vancouver area.

Speaker 1

02:17

Wow, huge numbers. Wow.

Speaker 2

02:18

So it drove up our housing prices. And then the handover was kind of a non-event back then. They largely didn't come until a few years later. When their kids got old enough, they're like, we want to send them abroad to school. Hey, we own a house in Vancouver. So they just send their kids, you know, and a Lamborghini, you know, Creosa.

Speaker 1

02:35

Sers and I bet the Chinese restaurants got better too.

Speaker 2

02:38

Chinese restaurants are always good here. They call it Hongcouver for a reason.

Speaker 3

02:43

My memory of the politics at the time is very, very rusty because I was a scrappy eighteen years old with my daughter being born that year.

Speaker 2

02:52

Yeah, you were busy. I was busy, Yeah, no kidding.

Speaker 1

02:56

Princess Diana died on August thirty first. I remember. Everybody remembers where they were when they heard that, but I was actually playing a gig in downtown Mystic, Connecticut.

Speaker 2

03:05

Did you call it out?

Speaker 1

03:06

Yeah? Yeah, yeah, you have to. You have to. Plus it was on the TV, like everybody was stunned. The Kyoto Protocol was adopted in December. That's an international agreement to reduce greenhouse gas emissions.

Speaker 2

03:20

Sorry, adopted? How optimistic of you?

Speaker 1

03:23

Yeah, well it was adopted by Kyoto, wasn't it.

Speaker 2

03:27

No, that was the location with it. It was supposed to be an international treaty that everyone.

Speaker 1

03:31

Yeah, well it was adopted at first and then it got unadopted.

Speaker 2

03:34

Well, everybody says they're going to do things. It's just a question of whether they do things.

Speaker 1

03:37

Yeah, Like I said, promises don't really mean anything on the world stage, do they?

Speaker 2

03:40

I would argue, and I say this when I talk about climate change, that the projection in nineteen ninety seven at the Kyoto meeting was like a four degree temperature increase by twenty fifty, and that's actually off the table. So we did some stuff, nowhere near enough. Currently the projection is somewhere between two and a half and three, which is, by the way, still way too much. Yeah, but like, we have made progress.

Speaker 1

04:08

Yeah.

Speaker 3

04:09

I was hearing something, I think it might have been Neil deGrasse Tyson talking about it, and he said there was some success out of that, that it greatly reduced, like, the CFCs that were being put in the ozone.

Speaker 2

04:23

Well, CFCs were largely eliminated by the Montreal Protocol in nineteen seventy seven. Although you know, the reason the Montreal Protocol went so well is because everybody could make money off of it. Every refrigerator needs to be replaced. Every, you know, air conditioner needs to be replaced. Like, that was very exciting for certain parts of the market.

Speaker 3

04:40

I might I might have swapped the two in my mind.

Speaker 2

04:43

Yeah, I do. When I'm talking to students about, you know, climate change, I mention the Montreal Protocol. It's obviously from before they were born, but it's also like, we can as a civilization get together and agree to do something and actually do it.

Speaker 3

04:57

I was born a year later.

Speaker 2

04:59

There you go.

Speaker 1

05:00

The Asian financial crisis began in nineteen ninety seven. The collapse of Thailand's currency spread all across Asia. Recessions, unrest. Yeah, not good. The Oklahoma City bombing trial verdict happened. That was two years prior, nineteen ninety five, but the verdict was handed down in ninety seven. There's this guy named Andrew Cunanan who went on a killing spree, murdering several people, including Gianni Versace, and died by suicide after a

05:38

nationwide manhunt. Happy Happy Happy Joy Joy Joy. Mother Teresa died.

Speaker 3

05:45

Yeah, did I catch the most somber, like, look-back day or...

Speaker 1

05:51

Yeah, it'll be looking up. I do the bad news. Richard does the good news. Oh, okay. The Heaven's Gate mass suicide. Remember the Nikes?

Speaker 2

06:01

Yeah yeah, they were web developers too, they were in JavaScript. Yeah, I'm sorry I say that. A lot of that's normal.

Speaker 1

06:06

Well, anyway, it was, you know, a cult, and thirty nine members died by drinking poison or something and going to sleep with their Nikes on. Yeah. Tiger Woods won the Masters. That's good news, the youngest Masters champion. Mike Tyson bit Evander Holyfield's ear off. Jeez. Harry Potter and the Philosopher's Stone came out. And yeah, what do you got for technology and science?

Speaker 2

06:39

Let's start in space. There are, how many, eight Shuttle launches in nineteen ninety seven, which is an unusually large amount. Three of them are Shuttle-Mir missions. It just happens to line up that way. You know, they're roughly every six months. So there's Atlantis in January, again in May, and then again in September. Flew all of the Shuttle-Mir missions. And that's taking crew up, taking crew down, and experiments and things like that.

Speaker 3

07:04

I was living in Florida at the time. I caught a couple of those. Yeah, from the parking lot at work, you could you could see a little flicker in the sky from where we lived, and.

Speaker 2

07:14

Then it would zip away and be gone. Right. It's like you wait a long time for about ninety seconds of wow, look at that, and then it's over. The most interesting thing on the shuttle side in ninety seven is Columbia. So Columbia in April does a mission, STS-83, which is a Spacelab mission, so they're just going up doing experiments and so forth. Columbia didn't have the power, because it was so overweight, to actually get

07:38

to Mir, so it never had that option. But there's a problem with the fuel cells on board Columbia, and they cut the mission short to only three days to turn around and come back, yeah, and then decide to refly it. So they fix the problem, refuel, and only a few months later in July fly it as STS-94, same crew. So a three month turnaround, fairly unusual, wow, for shuttles. They usually take longer than that. On the Mir space station, it was a very eventful

08:05

year. Not a good one either. You told me all good news. So in February, a fire in space. So one of the ways that Mir maintained its oxygen levels is they actually burned lithium perchlorate. This is something they do in submarines as well, because the emissions of burning that is straight oxygen. Wow. Well, something went wrong and it caught fire. Yeah, that's kind of flammable. Yeah, it's dangerous, but they did it all

08:32

the time. But something went wrong. They got the fire out in ninety seconds, because fire in a space station is very, very serious. Yeah, but it took minutes to get the smoke cleared out, so it was a question of whether it was breathable or not. And the reality is if they evacuated, they'd probably never be able to get back into it again. So that was

08:49

very scary. Quickly overwhelmed in June when Progress M-34, which was a Progress cargo supply mission, did an experimental approach for docking, lost control and collided with the Spektr module, damaging the solar panels and breaching the hull of the Spektr module. It was a slow leak, so the crew had minutes to do something about this, not seconds, which was good because they needed to close off the hatch on the Spektr module. The Spektr module having four

09:20

very large solar panels. The power cables that put that power into the rest of the station actually ran through the hatch, so they had to disconnect them without electrocuting themselves, push them into the Spektr module, and close up the hatch to maintain air pressure, which they succeeded in doing.

09:36

A couple of months later in August, they will have an internal spacewalk, so they'll put on the space suits and reopen that hatch, find that the computers and things are all running fine in vacuum, and replace the hatch with one that has a power line pass-through so they can hook the power back up again, and then eventually in October do another one of those internal spacewalks to add in remote control on some things and so forth.

09:58

Spektr will never be repressurized. It used to be the place that the Americans slept in. That's for the rest of Mir, which of course is only until two thousand and one. Yikes. One more bit of bad news on the rocket side, and that is a Delta II failure. I only mention this because Delta II had literally hundreds of successful launches. It was a non-event for a

10:20

Delta II to fly, like Falcon is today. This was a GPS satellite, and twelve seconds into the launch, one of the solid rocket boosters exploded, blew through the tank and destroyed the entire vehicle. But it was almost entirely fueled, so you can find a video clip of this. It

10:38

is the most spectacular failure you can imagine. Wow. And it was a reminder of why they have what they call blockhouses, which are these reinforced concrete boxes that the folks actually operating the mission work from on the pad, because debris rained down everywhere, since it was only twelve seconds into its flight, including setting fire to cars in the parking lot. Like, geez. It was quite an unusual event. Okay, some cooler stuff. The Mars Pathfinder mission launched in ninety six.

11:11

It lands in ninety seven. This is the Pathfinder lander, which used the airbag landing system. They pressurize the airbags, it bounces to a stop, and then it unfolds itself, and inside the Pathfinder lander is the Sojourner rover, about the size of a shoe box. Not very big, but it was the first rover to ever move around another planet. Now to be clear, not the first rover. The Soviets put rovers on the Moon. But, you know, planet. The thing was supposed to operate for thirty days, lasted eighty three days.

11:39

Very successful mission. They called it Pathfinder for a reason: it was a path to lower cost missions to Mars. Cassini launches. This is the huge bus-sized spacecraft going to Saturn. Launches on a Titan IVB. It won't get to Saturn until two thousand and four. It will be the first, I think the only, thing to ever orbit Saturn, and it will also land the Huygens probe on Titan.

12:03

And finally the comet Hale-Bopp. So Alan Hale and Thomas Bopp, both amateurs, find this comet in nineteen ninety five and do some calculations and realize, well, it's a huge, absolutely massive comet. Good thing it didn't hit the Earth, because it got quite close. It'll be visible to the naked eye for eighteen months. It reached its peak brightness in April ninety seven. Yeah, and they calculated its orbital period at twenty five

12:32

hundred years. So hang on to your hats. It'll be back in forty three eighty five.

Speaker 1

12:36

I remember that very clearly. In fact, I was speaking at a Microsoft event in Paris and we were at Euro Disney and they shut down the whole park and it was just hanging there over the roller coaster.

Speaker 2

12:50

Just to watch this amazing thing, right, it was so cool. Yeah.

Speaker 1

12:54

Yeah. And my daughter, who was two, got to see it at two at that point.

Speaker 2

12:58

So on the tech side of things, the web is moving along nicely. The HTTP/1.1 specification is released. Two guys, Sergey Brin and Larry Page, who have a search engine called BackRub, register a new domain: Google.

Speaker 1

13:14

Yeah, I heard of it.

Speaker 2

13:15

The Wall Street Journal introduces the very first paywall onto the Internet, and Macromedia ships Dreamweaver.

Speaker 1

13:24

Yeah, fun memory.

Speaker 2

13:25

Acquired by Adobe in two thousand and five. Intel releases the Pentium II: seven point five million transistors. The Pentium Pro was only five point five, so obviously progress. Ricoh releases the CD-RW, the rewritable CD.

Speaker 1

13:39

Yeah, they didn't work too well.

Speaker 2

13:40

There are problems with original CD readers because the reflectivity of these is relatively low, and eventually newer models have a thing called MultiRead so they can use CD-RWs properly rewriteable one hundred thousand dollars. The IEEE introduces the specification 802.11. You know it as Wi-Fi. Nobody cares until a couple of years later when they modified the specification, called 802.11b. Yeah, it's a lot less expensive to implement.

14:07

That's in ninety nine, really. The release of Winamp. It whips the llama's ass. That's Justin Frankel and Dmitry Boldyrev from Nullsoft, who make an MP3 player for your PC.

Speaker 1

14:18

The plugins were the thing I remember.

Speaker 2

14:20

Yeah, they built a real ecosystem around it was.

Speaker 1

14:22

I remember Jeff macy Lex showed me MilkDrop, which was an amazing graphic plug-in for Winamp. It's just the coolest thing I ever saw.

Speaker 3

14:33

Used to let the visualizers run on the monitor. Super psychedelic.

Speaker 1

14:40

Yep.

Speaker 2

14:41

Just give it a look, and let's see what else. A couple of other good ones. Oh yeah, they'll sell to AOL in nineteen ninety nine for eighty mil, which is a good play because it wouldn't be worth that much later. So well done. This is the year that IBM's Deep Blue, after two years of trying, beats Kasparov. Yeah, it's an RS/6000 computer, twelve gigaflops of processing power, which is exactly the same

15:04

as an iPhone seven a few years later. And on the Microsoft side, Visual Studio, the very first version, which of course included VB five. They announce Windows ninety eight, and Bill Gates becomes the world's richest man for the first time and, in probably a bit of a PR blunder, shows up on a giant screen at Macworld to announce that Microsoft will invest one hundred and fifty million dollars in Apple, and people freak out.

Speaker 1

15:31

Yeah, it was.

Speaker 3

15:31

About the same time I started building my first Windows desktop application. There you go. In VB and WinForms, and...

Speaker 2

15:40

I'll mention two games: the first version of Grand Theft Auto, which is sort of an overhead perspective game where you still killed people and stole cars, and Ultima Online, one of the first graphical massively multiplayer online games.

Speaker 1

15:53

All right, we're running out of time for the beginning here, so I'm just going to read off the top ten albums of nineteen ninety seven: Radiohead, OK Computer; The Verve, Urban Hymns; The Prodigy, The Fat of the Land; Puff Daddy and the Family, No Way Out; Garth Brooks, Sevens; Spice Girls, Spice, and that was continuing the massive global

16:19

sales because it was there last year too; Andrea Bocelli, Romanza; Titanic: Music from the Motion Picture; Celine Dion, Let's Talk About Love; and Shania Twain, Come On Over. Top ten movies of nineteen ninety seven: Hercules, The Fifth Element, My Best Friend's Wedding, Liar Liar (oh my god, that was hilarious, great movie), As Good as It Gets, Air Force One, Tomorrow Never Dies, Men in Black, The Lost World:

16:57

Jurassic Park, and Titanic, as if you didn't know. Yeah. All right, with that, let's roll the music for Better Know a Framework.

Speaker 2

17:07

Awesome, man, what do you got? It's funny you were talking about.

Speaker 1

17:17

Games, because I went looking in the, you know, the trending repos on GitHub, and one of them is for a thing called Robust Toolbox, which is a homegrown gaming engine written in C sharp, and that led me to the game that they wrote it for, Space Station fourteen, which is a remake of Space Station thirteen, but it runs in C sharp on Robust Toolbox. So then I went looking for Space Station thirteen to see what that was all about. And it's an overhead view kind of

17:53

you know, survive and thrive in space. You know, all sorts of problems happen. I don't know exactly what, but in the little demo that they were doing, there was digital blood on the floor, so something nasty must have happened. It looks fun. But the cool thing is that this Robust Toolbox, the homegrown engine written in C sharp that was primarily for Space Station fourteen, is released as its own thing. So spetistration. Yeah, cool. So that's what I got. Who's talking to us today?

Speaker 2

18:30

Richard grabbed a comment off of show nineteen ninety one, the one we did with Andrew Murphy talking about leading teams in the time of AI. We got a lot of reaction on this show, and I'm going to read one of the comments. Here's one from Adam Wright. He says: I'm in the middle of listening to this episode and I feel a bit conflicted. It's possible I'm going through the grief process, as Andrew mentioned. There are quite a few predictions about the future based on this arguably short time frame in

18:50

which these AI tools have been somewhat useful. Richard asked at one point, who are the people that aren't adopting this technology yet? And I would suggest that it's the late adopters. There are sets of developers and companies that don't hop on every bandwagon in the first few years of its use. It takes time for new tech to be overwhelmingly adopted. Absolutely true. Aside from that group, there's likely another group that cares about the potential and actual

19:07

sociological impacts of technology in the real world. Have you seen the price of computer components lately? One of my favorite YouTube channels, Gamers Nexus, frequently calls out the RAM shortage as being due to the reservation of silicon wafers that don't exist, to be used in GPUs that also don't exist, for AI data centers that don't exist. I think it's undeniable that developers using this technology are literally training

19:28

their own replacement. Maybe in some sense it's more like sharpening an axe, but I think more of us should be at least concerned about where the axe will land. The best hope for the technology, as Richard laid out in one of his talks, is the models get smaller and more specialized and work well enough that we put more trust in them. In my opinion, that would be the time when mass adoption would be warranted. Perhaps it will take the current enthusiasts working out the flaws to

19:48

actually get there, though. Thanks for the thought-provoking episode. I mean, it's a whole conversation coming around now about this idea that the job wasn't to write the code anyway, it was to provide solutions to customers. And you know, the solution is as much about data collection as it is about implementation, so that we have tools that will do implementation is an important part of the equation. It just makes us all the faster.

Speaker 1

20:11

And also guiding of the AI. You know that to achieve the goal that the customer wants, that's not an easy task.

Speaker 3

20:19

I think we're also seeing it drive demand. So when demand increases, how much software hasn't been built because the barrier was too high. Yeah, So now we're seeing you know, more work because there's more demand. So that's keeping things a little bit more stable than people predicted.

Speaker 2

20:35

I mean, certainly from the machine learning perspective, this was true for radiologists, you know, when the first image models really got popular, and there's like seven hundred approved by the FDA now for doing recognition. I think even Hinton himself kind of said you're a fool if you're a radiologist.

20:51

That job is over. And in reality, of course, the demand for medical imaging has gone through the roof, and the demand for radiologists has gone through the roof, right, because they can move faster with these new tools and there's far more people to test. So I think that cycle is going to go on for quite some time. And it just seems like, you know, history doesn't necessarily repeat itself, and I'm talking about the history of like five years ago, but it does rhyme, and I think,

21:12

you know, you're onto something there too, Ed, with: there is increasing demand. There's more kinds of software that need to be built. I mean, get into this sort of sense of, are we all going to surround ourselves with our own custom software? Because I know a lot of developers are getting good with these tools, and that's what they're doing. I already do that, and where the developers go, others go.

Speaker 1

21:30

Yeah.

Speaker 2

21:30

So Adam, thank you so much for your comment. A copy of Music to Code By is on its way to you. And if you'd like a copy of Music to Code By, write a comment on the website at dotnetrocks.com or on the Facebooks. We publish every show there, and if you comment there and we read it on the show, we'll send you a copy of Music to Code By.

Speaker 1

21:40

And if you can't wait for that, go to musictocodeby.net and you can get the whole collection in MP3, WAV, and FLAC formats. Okay, we've wasted enough time, and it's not a waste of time, of course. Let's bring on Ed. I'll officially introduce him. Ed Charbeneau is a principal developer advocate for Progress Software, a ten-time Microsoft MVP, and video author on Dometrain.

22:04

Ed works at the intersection of modern .NET developer experience and artificial intelligence. At Progress (I'm sorry, Richard. Progress. Nice.) Ed helped drive early work around Blazor and the creation of Telerik UI for Blazor, helping bring modern .NET web development to a broader developer audience.

22:25

More recently, his work has focused on AI and agentic RAG systems, where he helps guide product strategy, evaluate emerging technologies, and lead development of developer tools that bring advanced AI capabilities into the .NET ecosystem. I can't wait to hear what you've got to say.

Speaker 3

22:43

Ed.

Speaker 1

22:43

Welcome, Thanks for having me.

Speaker 3

22:45

Appreciate you guys having me on. I've been a longtime listener and we've all bumped into each other a few times and it's been a pleasure.

Speaker 1

22:53

Sure, yeah, sure, you.

Speaker 2

22:54

Brought me a lovely bottle of whiskey. I think it was last year, actually. From the... was it red rye?

Speaker 1

23:00

Well?

Speaker 3

23:01

That was a Jeptha Creed, red, Bloody Butcher corn, red corn. Yeah, it was a wheated bourbon. But it's absolutely one of my favorites. And it's got an odd nose on it. It smells a little cologne-like.

Speaker 2

23:19

It's threatening on the nose. But boy, was it good in the mouth. And I don't have it anymore. I drank it.

Speaker 1

23:25

Okay, that's the best testimony right there.

Speaker 2

23:28

Yeah, all right, what have you been working on? My friend? You've been up to stuff. I get that.

Speaker 3

23:33

I've been a busy, busy person, so I love to be on the bleeding edge of all tech all the time. It's, you know, part of being a developer advocate, I think, is keeping your ear to the ground. But over the years, I've done a lot of technical sessions, and I love this quote by H. G. Wells from War of the Worlds that I always use that kind of describes what drives me to keep up with all of these technologies.

24:08

And I don't know if you're familiar with the beginning of War of the Worlds, but it's: with infinite complacency men went to and fro over this globe about their little affairs, serene in their assurance of their empire over matter. And the bit about complacency always got me.

24:27

It's like, even when H. G. Wells was writing about this, you kind of realize that people go about their day and put their head in the sand, and we get in these patterns of doing the same thing day in and day out, and we might lose sight of things that are happening around us that we should be watching out for. And with my career, it's always been: what's the technology that's going to come take my food away?

Speaker 1

24:54

Right? Yeah, what's the.

Speaker 3

24:55

Next big thing that's coming. I need to pay attention to.

Speaker 2

24:58

Who's going to move your cheese?

Speaker 1

25:00

That's it.

Speaker 3

25:01

That's kind of what's driven me over the years. So I think that quote I started using when responsive web came out and I started talking about responsive web and how people needed to watch out for the mobile factor of doing web development, and everything's going to need to adapt.

Speaker 2

25:20

Yeah, the mobile factor. It wasn't just the mobile factor, it was tablets. We were happily doing our m-dots back in the day, and then the flippin' iPad comes out. You're like, what, a third factor?

Speaker 3

25:31

Oh no, now we can't predict the screen size. And yeah, I tried to get ahead of that trend at the company I was working at at the time, and we were like one of the first industrial manufacturers that had a web presence that worked on any screen size. Nice, and it caught some attention because of that fact. There were plenty of you know, marketing sites that were starting to do this type of thing, but nobody in the sector I was in was paying attention to it.

Speaker 2

26:00

Yeah. So the oldest media query setup I ever saw was Audi. Audi was on it really early on. It used to be one of my talking points. It's like, look, a car company.

Speaker 3

26:11

So I've always kind of been a full stack dev. But a lot of the stuff that I've done publicly as a developer advocate has always been UI facing. So I've done a lot of web development, a lot of UX talks, and a lot of UI stuff. But you know, I've always been a full stack .NET dev prior, so you know, I would own the whole thing. So from time to time you'll see me talking about, you know, odd back end things like the internals of Entity Framework and that type of thing.

Speaker 1

26:49

Hey, why don't we take our break now and then get into retrieval augmented generation. Is that a good idea? All right, sure, we'll do that. We'll be right back after these very important messages. And we're back. It's .NET Rocks. I'm Carl Franklin, that's Richard Campbell, and that's Ed Charbeneau. We're about to dive into this topic here of RAG and agentic RAG. So let's just define it.

27:19

I mean, I'll take a stab at it. I haven't used it yet, but from what I understand, it's taking an existing LLM and adding data to it and encoding it in a database so that data can be retrieved from it. And you can think of it like taking, you know, user manuals, help files, documents about your company, anything that you would want to be able to look up later in an index. Is that a good...

Speaker 3

27:49

A pretty good primer. So when we talk about retrieval augmented generation, we're talking about augmenting the context of the model that you're working with and injecting some information and then reshaping the prompt based on that information. So there's there's actually a couple of different processes that are used within a RAG system. And it's kind of interesting because the whole RAG part, the thing that gives it its title,

28:19

is only just a small piece of it. So, uh, the RAG concept on its surface is you know, injecting that that context for you. But the retrieval part has a lot of steps to it, and in order to get retrieval, you have to have something to retrieve, So there's a lot of a lot of different subprocesses in here. So on the retrieve side of it, you have a database, and there there's quite a few different vector databases out

28:52

there that can be used for this. But with the product that I work on, we use NucliaDB, which is a database that we own. It's an open source database, but it's our vector database, and you want to store your documents in that database and they get vectorized. So what that means is when we feed the documents into the system, those documents are going to be broken down into bite sized pieces so we can manage the

29:26

size of those documents better. So they get chunked, and those chunks get broken down and then sent through an embedding model. So the embedding model is going to take those chunks of text and is going to look for the semantic meaning in that chunk of text and store that semantic meaning as a number, So that's your vector that you can then search on. These vector searches are very powerful because they search on meaning. It's not like a keyword search. You can find things that are relative

30:04

to the thing that you're searching for. One of the easy ways to describe semantic search is: if you are storing things like milk and cheese in your database and you search for dairy, you should be able to pull back those items that are milk and cheese because they are semantically similar. But what's really interesting about the algorithm for that is the cosine similarity. You can invert that as well. So if you say, you know, I want the opposite of dairy, you might get something

30:38

like capsaicin. Right, So you have hot peppers and things like that. So if you think about it in a larger query, you might say, what, you know, what can I use to put out this fire in my mouth after I've eaten hot peppers? And the answer might be to drink milk because that is the opposite of the pepper and it will put out the fire. So it's

31:03

a really powerful search tool. And when we run that search and the vector database comes back with an answer, it retrieves that chunk of text and it augments the original question and then feeds that to a large language model to generate a new answer. So you're taking the context and you're expanding it with this knowledge that you've just retrieved from your vector database. And that's where that term rag comes from.
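
For readers following along, the chunk, embed, search, and augment steps Ed describes can be sketched in a few lines of Python. This is only a toy under loud assumptions: the three-number word vectors are hand-made and hypothetical, standing in for a trained embedding model, the index is a plain in-memory list rather than a vector database, and none of the function names come from NucliaDB or any Progress API.

```python
import math

# Hypothetical hand-made 3-d "embeddings": (dairy, spicy, vehicle).
# A real system would get its vectors from a trained embedding model.
TOY_WORD_VECTORS = {
    "milk": (0.9, 0.0, 0.1),
    "cheese": (0.8, 0.1, 0.0),
    "dairy": (1.0, 0.0, 0.0),
    "peppers": (0.0, 1.0, 0.0),
    "capsaicin": (0.0, 0.9, 0.0),
    "truck": (0.0, 0.0, 1.0),
    "engine": (0.0, 0.1, 0.9),
}


def chunk(text, max_words=8):
    """Split a document into bite-sized pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Average the toy vectors of the known words; unknown words are ignored."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    vectors = [TOY_WORD_VECTORS[w] for w in words if w in TOY_WORD_VECTORS]
    if not vectors:
        return (0.0, 0.0, 0.0)
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(question, index, top_k=1):
    """Return the top_k chunk texts whose vectors are closest to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine_similarity(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def augment(question, retrieved):
    """Build the expanded prompt that would be sent to the language model."""
    context = "\n".join(retrieved)
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"


documents = [
    "We stock milk and cheese in aisle four. The truck engine needs service on Monday.",
    "Capsaicin is what makes hot peppers hot.",
]
# Ingest: chunk every document and store each chunk next to its vector.
index = [(piece, embed(piece)) for doc in documents for piece in chunk(doc)]

question = "Where do you keep the dairy?"
top = retrieve(question, index)
prompt = augment(question, top)
print(prompt)
```

Note that the question never uses the words milk or cheese; the match comes from the vectors pointing in a similar direction, which is the difference from a keyword search. A production system would swap in a real embedding model and a vector database for the list, and would send the augmented prompt to an LLM.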

Speaker 2

31:31

So what vector databases are we talking about here? Is there a particular product? Like, I automatically think of something like Elastic, but.

Speaker 1

31:38

He was saying that they have their own, so you guys.

Speaker 3

31:41

Yeah, we have one called NucliaDB, right? And the reason it's called NucliaDB is because the company that created it was Nuclia, and Progress acquired that company last year and we named the product Progress Agentic RAG. So you will see the term Nuclia used within the APIs and things like that in the database itself because it originally came from the company name. But yeah, we use NucliaDB for our database.

Speaker 1

32:14

Okay, So is there any limit to the size of data or documents that can be stored or is it just limited by disk space?

Speaker 3

32:22

So this is one of those situations where our product actually has an advantage. So we actually have had customers with millions of documents in their database. So it's a RAG as a service. So you're talking about something that's deployed in the cloud. You can do a hybrid mode, on prem, things like that with it as well, but generally people use it in the cloud, so you're really just limited by the cloud itself, which, you know, we can just keep adding storage to, so

32:56

the storage can get quite large. There are other ways to do RAG. There's OpenAI, for example, that you can do some file activities with, and I think their limit is like ten k files, so ten thousand files and then it's maxed out. With our solution, like I said, you can just keep scaling. And what's interesting about the architecture that we're using is it

33:26

takes a lot of the complexities out of RAG. Jeff Fritz and I were having a conversation on one of his shows the other day and I was telling him, you know, the best way that I like to explain this is, have you ever rolled your own auth system?

Speaker 2

33:47

Only once. Yeah, I'll never do that again.

Speaker 3

33:51

And it was a mistake, all right. There are plenty of tools out there that will let you build RAG systems from scratch. There are a lot of moving pieces. You've got large language models, embedding models that you have to choose from. You have to choose a database, you have to have a system to ingest data, and you have to have a storage place to store that vector data. And you can go off the shelf and get solutions from every big tech vendor out there for each of those pieces

34:24

and glue them together and make your own solution. But there's a lot of complexity to each one of those pieces, and unless you have a data scientist or two on your team, it's probably the better idea to go with something that's already been prescribed. So that's something that we do well with this product. It's a RAG as a service platform with a lot of different ways to ingest data. So you can ingest PDFs, Word documents, text data, all sorts. But you can also do things like MP4

34:59

files, MP3s. So we have providers for all sorts of data types that are going to take that data into the system and vectorize it so you can perform not just search, but, you know, get intelligence from that data.

Speaker 1

35:15

All right, so you said before that most people are going to use it in the cloud. But you know, one of the great things about using RAG systems is that you can vectorize your sensitive documents, you know, that should be kept on premises. And so would a great big Ollama server be possible to use with this product?

Speaker 3

35:39

Yeah, you could do hybrid. So you can do hybrid, or you can do it fully on prem if you wanted to.

Speaker 1

35:45

Now what does hybrid look like?

Speaker 3

35:46

So with hybrid, you would take your data and store it locally, okay? So you could host just the database portion of it and then you could run the models in the cloud. Okay, so you have your choice of models as well. So that's another aspect of this being an entire solution that's end to end.

Speaker 1

36:09

Is uh.

Speaker 3

36:11

There's actually a back office for it that you can log into and then you can control various aspects of the product. So I can go into my back office and I can choose all the models, you know, from all the models that are out there, from Azure to AWS and Gemini and all of those things. And I can provide custom endpoints and keys if I wanted something that's completely custom and hosted on, you know, my own servers.

Speaker 1

36:43

So yeah, I was going to say you could just run it all locally if you wanted to, with something like Ollama, right?

Speaker 3

36:51

Yeah, so it's plug and play with whatever you would like to bring to it.

Speaker 1

36:56

Cool.

Speaker 3

36:57

So it is extensible and modular, and like I said, you can fully take it offline or you can host the pieces that you want in the cloud. So it does have a hybrid aspect to it as well.

Speaker 1

37:10

And is this a free tool? Is it all encapsulated in NucliaDB, or what is it?

Speaker 3

37:15

It is a commercial product, so it does have a license involved with it. It is open source.

Speaker 1

37:21

But so RAG as a service is the commercial product, and what's called NucliaDB is the vector database part of it, correct?

Speaker 3

37:28

Yeah, the product itself is called Progress Agentic RAG. The database portion is NucliaDB, and the RAG as a service is a commercial product. All of it is open source, though. All of the code is up on GitHub in various.

Speaker 2

37:51

Repos. But that doesn't mean, you know, it's trivial to operate this stuff, which is the reason it's a service.

Speaker 3

37:56

No, right. So, you know, just the data ingestion alone is a big portion of it. Like I said, it takes in all of these different sources of material.

38:05

If you were to try to roll something like this yourself, you know, this is something that demos great, a you-can-build-it-over-a-weekend type of thing, but then when you try to scale it, you know, as soon as somebody from upper management's like, all right, now we want to ingest some MP3s and then this video would be really nice, then you're off trying to figure out solutions to ingest that, chunk

38:29

it up, and get it into your vector database, right? Another thing that this does is you can scrape web content very easily. I can put in a site map and it will take in that entire site map. And I can also specify selectors like CSS selectors and XML selectors so that it only ingests the pertinent information. Then you're not ingesting like a bunch of menus and footers and all that stuff to kind of cloud up your data. So there's a lot of infrastructure built on just the

39:04

ingestion process alone. And then also you probably noticed it has the word agentic in it. It's not just a regular RAG system. There are agents that are working alongside it. One of those types of agents is on the ingestion side, So as you ingest things, you can assign agents to do tasks with that data. So these are kind of like task runners that might summarize or categorize or tag and do various activities on the data as it's being ingested,

39:36

and store that metadata in the database as well. So one of the interesting things that we have a demo of is we ingested a bunch of financial documentation from

39:52

you know, the big Fortune five hundred companies. So we got their annual reports in PDF format, pulled them in, ingested those, and then there's an agent that runs on that PDF, and it asks the large language model to inspect the PDF for data that would go well in charts, and it extracts from that unstructured data into a format that is easily chartable in bar charts. And then when you retrieve that information out, you can just ask it for structured data and you don't have

40:28

to do any kind of special plumbing or anything. You just give the large language model the shape that you want out, and it recognizes that it has this adjacent data that's chartable in its database, and it pulls that out and puts it in your structured data for you. So now you've got these charts that are displaying on screen that were generated by an agent just by ingesting a PDF document that had no structured financial data in it.

40:53

It was just data that was pulled out. And what's nice about doing it at ingestion time is you don't have to keep repeating that over and over again. You know it's stored in the database now, so you don't have to run that generation every single time somebody retrieves charting information.
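
The ingestion-time idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DocumentStore`, `extract_chart_points`, and the regular expression are hypothetical names invented here, and the regex stands in for the large language model call that the real agent would make. The point it shows is that the extraction runs once at ingestion and later reads hit stored metadata.

```python
import re


def extract_chart_points(text):
    """Stand-in for an LLM agent: pull (year, revenue) pairs out of prose."""
    pattern = r"(\d{4}) revenue was (\d+(?:\.\d+)?)"
    return [
        {"year": int(year), "revenue": float(value)}
        for year, value in re.findall(pattern, text)
    ]


class DocumentStore:
    """Toy store that runs every agent once, when a document is ingested."""

    def __init__(self, agents):
        self.agents = agents      # agent name -> callable(text) -> metadata
        self.records = {}         # doc_id -> {"text": ..., "metadata": {...}}
        self.agent_runs = 0       # counts how often an agent actually executed

    def ingest(self, doc_id, text):
        metadata = {}
        for name, agent in self.agents.items():
            metadata[name] = agent(text)
            self.agent_runs += 1
        self.records[doc_id] = {"text": text, "metadata": metadata}

    def get_metadata(self, doc_id, name):
        # Retrieval only reads what ingestion already computed.
        return self.records[doc_id]["metadata"][name]


store = DocumentStore({"chart_points": extract_chart_points})
store.ingest("report-1", "In 2022 revenue was 10.5 million. In 2023 revenue was 12 million.")
first_read = store.get_metadata("report-1", "chart_points")
second_read = store.get_metadata("report-1", "chart_points")
```

Both reads return the same chartable list, and `store.agent_runs` stays at 1 no matter how many times the chart data is requested.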

Speaker 1

41:10

What are some of the things that can go wrong with RAG systems?

Speaker 3

41:13

So the search portion of it is difficult. When we talk about vector search, I gave you the very simple example of, you know, the dairy and whatnot, but when you search in a vector database, the vector search is very much a semantic search. So when you phrase your question, if that question isn't phrased the same way the data is written and ingested, the semantic meaning might be different, but the data still might contain the answer

41:48

that you want. So if you try to roll your own and you're just using an off the shelf semantic search and nothing else, you might not get really great search results. You might let a lot of answers slip through the cracks.

Speaker 2

42:04

Interesting, because, you know, one of the things we talked about here was that whole sort of milk and cheese versus dairy thing. Like, this is supposed to be the strength. It is.

Speaker 3

42:14

But it depends on how you phrase that question. You know, I phrased it as what might help, you know, put out the flames of a hot meal or something like that. For a rudimentary vector search, that probably wouldn't return the results we want. For a system like this, we wanted to have that in depth knowledge to be able to pull those results back. So the search that we implement isn't just a standard vector search. It's actually a

42:47

multi part search. So we use a keyword search on top of the vector search, and then we also implement something called named entity recognition, and this gives us a knowledge graph. So a knowledge graph is really important in a system like this because it associates other concepts that a large language model might not be able to infer on its own. So for example, it might have named entities in it, like a company, a person. You could think of it that way, as other nouns that it

43:32

is associating with the data that's there. So you might have like a company sells a product, and you might have the company name and the product name in there. And it's kind of similar in a way to relationships in a relational database, but instead of tying things together by a key, you're tying them together by a concept. So you might have a concept of the company Progress and it sells the product Telerik UI for Blazor, and the thing that connects those two dots is the selling portion

44:08

of it. So this company sells X. And then you may have a query later where you're trying to find out what companies have products that compete with this product, and since it has that knowledge graph built in, it can reverse that concept and figure out what other companies it has in its database that have similar products, where a basic semantic search probably wouldn't pick those things up. So this helps with things like people and places, locations,

44:42

all of that type of thing. All those types of things can get connected through the knowledge graph, and that really increases the accuracy of the search, so you're going to get a lot better results.
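
As a rough Python sketch of the knowledge graph idea above: entities are connected by a concept (a relation) rather than a key, and the same relation can be followed in reverse to answer a "who competes with this" question. Everything here is an assumption for illustration. "Acme Corp" and "AcmeGrid" are invented, the `is_a` category relation is hypothetical, and the weighted blend of keyword and vector scores is just one common way to combine them, not a description of how the product does it.

```python
from collections import defaultdict

# (subject, relation, object) triples. The Acme entries are made up.
TRIPLES = [
    ("Progress", "sells", "Telerik UI for Blazor"),
    ("Telerik UI for Blazor", "is_a", "Blazor component library"),
    ("Acme Corp", "sells", "AcmeGrid"),
    ("AcmeGrid", "is_a", "Blazor component library"),
]


def build_index(triples):
    """Index triples in both directions so a relation can be reversed."""
    forward = defaultdict(set)   # (subject, relation) -> objects
    reverse = defaultdict(set)   # (relation, object) -> subjects
    for subject, relation, obj in triples:
        forward[(subject, relation)].add(obj)
        reverse[(relation, obj)].add(subject)
    return forward, reverse


def competing_companies(product, forward, reverse):
    """Companies that sell a different product in the same category."""
    companies = set()
    for category in forward[(product, "is_a")]:
        for other_product in reverse[("is_a", category)]:
            if other_product != product:
                companies |= reverse[("sells", other_product)]
    return sorted(companies)


def keyword_score(query, text):
    """Fraction of query words that literally appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_score(keyword, vector, keyword_weight=0.3):
    """Blend a keyword score with a vector-similarity score."""
    return keyword_weight * keyword + (1 - keyword_weight) * vector


forward, reverse = build_index(TRIPLES)
rivals = competing_companies("Telerik UI for Blazor", forward, reverse)
```

Here `rivals` comes back as the one invented competitor, found by walking "sells" backwards from a product in the same category.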

Speaker 2

44:54

Do you get into situations where stuff coming out of the vector database is conflicting with the large language model? Like, does it become a competition for accuracy or proper responses?

Speaker 3

45:06

So you have full control over the large language model and the prompting, but by default the prompts don't allow the training data to conflict with the data that's being pulled from the vector database. So there's some guardrails that are already in place to help prevent this sort of thing. If you go creating custom prompts, you might be able to trip that up a little bit, but we yeah, so we do have some things in place already for

45:41

you to try to eliminate that. So generally, if you ask a question to Progress Agentic RAG and it doesn't have that data in its storage anywhere, so it goes into the vector database and can't find anything, it may have plenty of stuff in training about it, but it doesn't have documents to support it. And that's something we probably should touch on too.

Speaker 1

46:04

Another reason to use a hybrid model. Keep your data.

Speaker 3

46:08

Yeah, it'll come back and it'll say, you know, I don't have the context to provide an answer for that, rather than hallucinating something, right? And hallucination is a big part of this as well, right? So that's something to be argued about large language models. Actually, I always say that one person's hallucination is another person's creativity. So if you want.

Speaker 2

46:37

Good line, I don't know that it's true, but excellent line.

Speaker 3

46:41

If you want to tell a creative story, you know, you might want to create a fictional story with a large language model, you want it to hallucinate as much as possible. You don't want it to ground itself in reality. But if you are asking it facts about data that you've ingested, you don't want it to do that. You want it as grounded as possible, to hallucinate as little

47:06

as it can. So with Progress Agentic RAG, and in general with RAG systems, you want citations, so that is something that's provided. So when you do a vector search, it will pull a bunch of information up and it may not even use that information in the final answer. So what you end up with is a list of resources. These resources are things that hit the search but might not have been relevant enough to include in the answer. And then on top of resources, you'll have citations, and citations are

47:41

the resources that did get used in the answer. And then you will also get back information about what string of text was used in that answer, so it'll give you, for example, it pulled the answer from this PDF, and it's coming from this paragraph in this PDF, so you can really trace back, you know, the resource.
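
The distinction between resources and citations can be pictured with a small Python data model. This is a sketch only: the class and field names are hypothetical and do not reflect the actual response types in the product or its SDKs. It shows resources as every search hit, citations as the subset used in the answer, and a citation pointing back to a specific paragraph.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    resource_id: str
    title: str
    paragraphs: List[str]


@dataclass
class Citation:
    resource_id: str        # which resource the answer drew from
    paragraph_index: int    # which paragraph inside that resource
    quoted_text: str        # the string of text that was used


@dataclass
class Answer:
    text: str
    resources: List[Resource]   # everything the search hit
    citations: List[Citation]   # only what made it into the answer


def uncited_resources(answer):
    """Resources that matched the search but were not used in the answer."""
    cited_ids = {citation.resource_id for citation in answer.citations}
    return [r.resource_id for r in answer.resources if r.resource_id not in cited_ids]


def trace(answer, citation):
    """Follow a citation back to the paragraph it came from."""
    by_id = {r.resource_id: r for r in answer.resources}
    return by_id[citation.resource_id].paragraphs[citation.paragraph_index]


answer = Answer(
    text="Milk can soothe the burn from hot peppers.",
    resources=[
        Resource("peppers.pdf", "Pepper guide", ["Peppers contain capsaicin.", "Milk soothes the burn."]),
        Resource("menu.pdf", "Lunch menu", ["Today: chili."]),
    ],
    citations=[Citation("peppers.pdf", 1, "Milk soothes the burn.")],
)
```

In this made-up example, `menu.pdf` hit the search but was not cited, while the single citation traces back to the second paragraph of `peppers.pdf`.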

Speaker 2

48:06

And that's, I think, what people want. Like, when the software emits a definitive statement of some kind, it's like, here's where that was said. I'm not fabricating this. That's powerful stuff.

Speaker 3

48:18

And all of the quality metrics are driven by the agentic framework that's inside of the system as well. So I mentioned the agents are there in multiple capacities.

48:31

One of them is for something that we call REMi, which is the system that continuously monitors the information and the queries that are being run in the system, and these are weighing the groundedness and the accuracy of the information there, and it's displayed to you on a dashboard with a graph that shows where those scores are being hit.

Speaker 2

49:03

Now, this seems like the thing you'd have to build yourself if you're doing RAG from scratch. Absolutely.

Speaker 3

49:08

Yeah, if you want to validate how your system is working, you're going to have to come up with a bunch of evaluators and you're going to have to have those evaluators continuously running against your data as you ingest new data and as users are hitting queries on it. Yeah, so that's part of the system.

Speaker 2

49:27

I see a lot of value in that, just like there are patterns to how you do those evaluations.

Speaker 3

49:32

So you've got being able to.

Speaker 2

49:34

Get them ready to go for you.

Speaker 3

49:37

Difference. You've got context relevance, which is how relevant the retrieved information was to the query. You've got answer relevance, which is how close the AI came to generating a quality answer, and then groundedness is how well the answer is supported by the context.
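
To make the three scores concrete, here is a deliberately crude Python sketch. It assumes a simple word-overlap proxy for each metric, which is not how a production evaluator works (those typically rely on a judge model); the function names are hypothetical and the numbers only illustrate what each score compares against what.

```python
import re


def _words(text):
    """Lowercased set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(source, target):
    """Fraction of the words in source that also appear in target."""
    source_words = _words(source)
    if not source_words:
        return 0.0
    return len(source_words & _words(target)) / len(source_words)


def evaluate(query, context, answer):
    return {
        # Did retrieval bring back text related to the question?
        "context_relevance": overlap(query, context),
        # Does the answer address the question that was asked?
        "answer_relevance": overlap(query, answer),
        # Is the answer supported by the retrieved context?
        "groundedness": overlap(answer, context),
    }


scores = evaluate(
    query="does milk help",
    context="milk soothes capsaicin burn",
    answer="milk soothes the burn",
)
ungrounded = evaluate(
    query="does milk help",
    context="milk soothes capsaicin burn",
    answer="drink lemonade instead",
)
```

In the first call three of the four answer words appear in the context, so groundedness is 0.75; the second answer shares nothing with the context and scores 0.0. Running a check like this on every query is the kind of continuous evaluation being described.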

Speaker 2

49:54

As soon as you're ingesting a bunch of data from a bunch of places, like, you're going to get conflicting results, and so it's like, what does the tool do when it retrieves two facts that are not the same about the same query, right? So whatever tooling I have to help surface that problem, like, that's what's going to get you into trouble here.

Speaker 3

50:17

It's part of a hybrid search to try to shake those things out.

Speaker 1

50:22

So what's the pricing?

Speaker 3

50:23

Like, the pricing. I'm never the sales guy in this situation, so I don't remember the pricing off the top of my head. First of all, there's a fourteen day free trial, so you can kick the tires on it, which doesn't.

Speaker 2

50:38

Seem very long when you're putting together RAG data. Like, that's not a trivial.

Speaker 3

50:41

Effort. So it isn't, is it? We could talk about that in a minute. I'll go over some of the pricing tiers that I can kind of remember. I think it's, I'm looking at it, free for fourteen days, then six hundred or something like that.

Speaker 1

50:53

Seven hundred dollars is the introductory price per month, right, and then there's like an almost two k a month for another level.

Speaker 3

51:02

Yeah, so you would think fourteen days isn't a long time, and I would like to see this be longer myself. As a developer advocate, I'm always erring on the side of the consumer that's using it, so I do agree with you there. But one of the things that supports the fourteen days is how quickly you can actually turn this around. It's actually really surprising, and I'm not just saying that because I work for the company that has it. I'm actually a big fan of the product. Like it.

51:34

It's very quick to get your data in, and as soon as that data is ingested, there is that back office that I told you about that has a search function built in that you can just start keying in searches to see what type of results you're going to get back, and it's got the citations and everything right there where you can see it. And there's even a RAG lab inside where you can test different prompts against

51:58

different models, all sorts of stuff. So the turnaround time is very, very quick on this, and you can even deploy like a JavaScript widget on your page to get started and have the search functionality raised like immediately on your web platform. But when you want to customize this, there's also SDKs in every flavor, and we are on dot Net Rocks, so you probably want to hear about the dot Net SDK a little bit. But once you ingest your data, you can just hop

52:33

on one of the SDKs. For the dot Net SDK, I wrote it to be very dot Net friendly. I've been a dot Net developer for twenty years, so I'm using all of the patterns that dot Net developers are used to. The platform itself is written in Python, but since it's software as a service,

52:59

you don't need to concern yourself with that. As a dot Net developer, you're mainly working with the SDK, which is talking to the REST APIs, and on the dot Net side, you've got all of the types that come back from the REST APIs in nice C sharp strong types, and the APIs follow similar methodologies to things like Microsoft Extensions AI. You have dependency injection, so you go into your RAG system, you ingest your data, you go to your SDK, you pull in the NuGet

53:35

package and you do dependency injection. You say, use NucliaDB, here's my keys, here's my endpoint, and then you have a service that you can call search on and that's it. Just like three or four lines of code and you're talking to this system that's giving you intelligent answers.

Speaker 1

53:55

So this sounds really cool. A couple more things about the pricing layers. The starter layer, that's seven hundred monthly, is text files only, max seven hundred and fifty megs per file, five gigs of indexed data or fifteen thousand resources. The pro, and these are cloud offerings, the pro is nineteen twenty five monthly and that has all file types, twenty

54:23

five gigs of indexed data or eighty thousand resources. But then for a customized quote or enterprise, you've got to contact Progress, and that's the only one where you can have on prem options. So if you're going to do it on prem, you've got to call, yeah.

Speaker 2

54:42

Yeah.

Speaker 1

54:42

The other two are in the cloud only, so there you.

Speaker 3

54:46

I think on the enterprise side, you're looking at companies that are doing, like I said, we have some that are doing millions of documents, right? And the pricing model probably shifts for somebody that's scaling to that large of a scale. Yeah, but the time to market on using something like this is what really shocked me.

55:06

When the folks at Progress were like, you know, we're making this acquisition, we want you to take a look at it from a developer perspective, especially given the front end capabilities that you know about, and see what you think of this product. And I was like, wow, this is really amazing. The only thing it's missing for me is a C sharp SDK, and I immediately got to work on building a C sharp SDK for it, not because I was tasked to, but because I saw the value there.

Speaker 1

55:38

I was like, this is cool.

Speaker 3

55:40

I got to bring this to dot Net devs, and the uniqueness of it, I think, is something that we're bringing to the table for dot Net. There are other, you know, RAG type things out there, but like I talked about with the OpenAI file APIs, those have been locked under a preview and you have to kind of sign off through pragma warnings to even use it. It's like, this is not supported. You're going into uncharted territory. Put these, you know, put these flags

56:13

in so you're allowed to even test this. Please don't deploy it to production. We went V one on the SDK last week. It's been fully tested and it covers the entire REST API. So you've got this solution that you can spin up extremely quick. I could get a demo of this running in probably less than twelve to twenty minutes, and you're.

Speaker 2

56:43

I pulled a list of the demos from the Telerik GitHub repository, one for MAUI, one for Kendo, and one for Blazor. Nice.

Speaker 3

56:53

Yeah, yeah, that charting example is in there. Yeah, so that one mashes up Progress Agentic RAG with Kendo UI or Blazor. MAUI as well, I think, has the chart example. And when you dig into the source code for that, you're going to be surprised at how little business logic is there. It's mostly markup. Most of it is just the presentation.

Speaker 1

57:19

So that was one.

Speaker 3

57:20

Of the nice fits here when we did this acquisition. Like, it was very much a back end service. There are, like I said, widgets in there that do, like, some vanilla JavaScript stuff to get you started, but when you're ready to build an application, you're really going to want to customize the UI, and we have done UI for, you know, almost twenty years now at Progress with Telerik.

Speaker 2

57:54

It's all about how you present the information, right, Yeah.

Speaker 3

57:57

And I was like, now we've got the ability to do these really intelligent queries and results and use our UIs with it, like this is bread and butter, peanut butter and jelly. Yeah, you know, these two things go together great.

Speaker 2

58:14

It also speaks to how there's less and less interface, fewer buttons and knobs and switches, and more and more screen space given to visualization. You know, it's just a box to describe what you want and then the rest of the screen is dedicated to showing you what you want.

Speaker 3

58:32

I know I've said search quite a few times too, but I think search really undersells what we're doing here in the RAG space in general, not just for this particular solution, but all of it. Search is like the Hello World, right? It's the easy go-to.

Speaker 2

58:49

Well, and name me a company that doesn't have a search problem. Yeah, it's just the normal: all search sucks.

Speaker 1

58:55

Yeah.

Speaker 3

58:56

I'll give you some interesting things around search too. But one of the things that this does well is it can do like product recommendation, which is kind of search-like, but you can come at this from a different point of view. Instead of asking it a direct question, you say, I'm a dot Net developer, what products do you have that can help me with this problem? And

59:25

do product recommendation. And you can also use this as a competitive advantage if you want to, say, bring in data from a competitor and bring in your own data and then do a competitive analysis, and you can do battle cards in real time for your salespeople. You can say, all right, you know, what product is my competitor selling, and what is their audience, and how do we compete against them? And this can do that analysis for you and shape battle

59:59

cards in a UI and display them nice. One of the more interesting demos that I worked on with our sales folks, and I can't name names of particular products or companies, so I'll be a little bit vague, is we did one that was a food laboratory. So we were working with somebody that did a product that had to do with food additives. And what we had come up with as a solution is, you know, you're in a laboratory working on the next donut, let's say, and you want to know what you can use as a

01:00:43

food coloring for that donut. Why not make a mobile application where you snap a photograph of your prototype, and the AI analyzes that photograph for color scheme, whatever other keywords you want to tag that with, and then goes off to the RAG system and searches for products that you

01:01:08

have that can be used for that result. And then it's going to come back with not only an answer, but a bunch of data sheets that you can click through and find data on, and then you can ask follow up questions on it, things like how will this coloring affect the taste of my product? And that goes further, you know, beyond a search that's not a normal search.

01:01:30

Now you're doing like a research role and a development, you know, R and D role inside of this tool, and it's doing it in several minutes rather than, you know, something that would take hours or days.

Speaker 1

01:01:44

Nice, it's just good stuff, Ed. Where are we going to see you next?

Speaker 3

01:01:49

I am going to be at Stir Trek. Nice. And that is on May first, and I'll also be at CodeStock. CodeStock's a good one in Tennessee. I think that one's mid April. That one went away for a little bit and came back. There are just some really good people in Tennessee. I'm happy to see that they brought that back. And then I'll also be at the MVP Summit. This is likely to air after that, though. So no, you are.

Speaker 1

01:02:20

Correct. Yeah, yeah, April ninth. Yeah, that's when this will air. Yeah, so all right, dude, good stuff. Thanks very much for sharing.

Speaker 3

01:02:28

Yeah, you can find out more about this at progress dot com. You can actually kick the tires on it right from the main screen there. There's a search box that deploys the product behind the scenes, so you can just ask questions about anything Progress related. And it's using that, we're dogfooding it hardcore right there. Great, and then you can find it on the NuGet repository as the Progress dot Nuclia package. And yeah, we.

Speaker 1

01:02:57

Got a few more links on the page at dot NetRocks dot com. So Ed, thanks very much and we will talk to you next time on dot net rocks.

Speaker 4

01:03:06

Thanks for having us.

Speaker 1

01:03:18

Dot net Rocks is brought to you by Franklins.Net and produced by PWOP Studios, a full service audio, video, and post production facility located physically in New London, Connecticut, and of course in the cloud online at pwop dot com. Visit our website at d O T N E t R O c k S dot com for RSS feeds, downloads, mobile apps, comments, and access to the full archives going back to show number one, recorded in September two thousand and two. And make sure you check out our sponsors.

01:04:00

They keep us in business. Now, go write some code, see you next time.

Speaker 4

01:04:05

You got trade Middle vans by the

Speaker 1

01:04:07

Sam Is Home and my taxes in line red

Transcript source: Provided by creator in RSS feed: download file
