Applied Large Language Models with Vishwas Lele

00:01

How'd you like to listen to .NET Rocks with no ads? Easy: become a patron for just five dollars a month. You get access to a private RSS feed where all the shows have no ads. For twenty dollars a month, we'll get you that and a special .NET Rocks patron mug. Sign up now at patreon.dotnetrocks.com. Welcome back to .NET Rocks. I'm Carl Franklin. And I'm Richard Campbell, and we are in our respective places on opposite sides of the North American continent, absolutely, as we usually are.

00:45

But, I don't know, you caught us in the middle of three or four conferences here, listener. Yeah, well, the last few shows, and again, time shifting being hilarious, were all shows we did in Porto, and some awesome conversations. Really neat folks, folks we'd never talked to before. When we're going to do in-person shows, I always try to go after guests we've never had on before. Right. Yeah,

01:08

new stuff, all good. I would point out, you know, it's getting towards the end of the year, so you know what I'm working on: the end-of-the-year Geek Outs. Yeah, yeah. I know it's close to the end of the year when I have to open up my script files and start pulling stuff forward for space and for energy, and I can't wait. There's been so much, so much good stuff. They're both going to be, I'm

01:33

just committing now. They're going to be long. I mean, you could just do a whole one on the JWT, right? Oh, JWST and the crisis in cosmology, like the current proposal that's floating around: maybe we're wrong about the age of the universe. Maybe it's twice as long as we thought. It's twenty-six billion years instead of thirteen billion. Wow. Which just is stunning. I mean, I love the fact that this is why that instrument went up there, right? It was to test all

02:01

of the assumptions we've been making, and it has delivered. Like, did we talk to Amber Strong about that? Like, this was one of the site and she's just like, okay, there are things we're finding that cannot be there. There's something we don't know and it's a big something. And Vishwas is looking at us like, did I just parachute in here from another planet? Did I land on the right show? Anyway, don't get me started. Oh wait, you did. I did. I'm sorry. I can't wait. I can't

02:30

wait, I can't wait. So, uh, yeah, well, let's get started with Better Know a Framework. Awesome. All right. Well, I've talked about this show I've done with Brian McKay, the AI Bot Show, before. We had him on. Yes, yeah. And we have just started getting into code and using C# with the API to do, you know, the chaining, you know,

03:02

get the results of one prompt, chain it into another. But we attempted to do one last week, and by last week, I mean, you know, a few weeks ago, I think it was the twenty-fifth of October, we recorded with a guest, and for some reason Riverside.fm, which is what we record with, decided to completely trash, or not finish uploading, the screen share. So on top of that, we're going to record it again. That's the thing. So episode eleven, watch for that, because it's going to be

03:36

amazing. We're basically trying to break, or push, GPT-4V image recognition to its limits. Oh wow. That's what we're trying to do. But in the course of doing that, Brian actually got some results, and we're going to rerecord it, so I don't know what's going to happen the second time. The first time, he got results that he didn't expect, and he tried these same prompts with the same images and got completely different responses.

04:06

And it turned out either somebody's watching or it's just learning, like, no, that wasn't a good response. Like there might be something else looking at the responses and tuning the algorithm, like in a matter of twenty-four hours. Well, there's also another possibility, which is that generative AI models have a variability that allows for a range of answers. Yeah, and I kind of got into it with him and his guest as well. You know, the whole idea of, do these large language models reason? And my

04:40

stance right now is no, they don't. And you know, when presented with the facts, this is what they do: they take this, they take this. How is that not reasoning? My position is, well, because it gives you inconsistent answers. Well, more importantly, it's just a statistical model, right? Like, there's no reasoning to be had. It's not weighing

04:58

any factors at all. It gives a statistical result. Right, and, you know, if you want to just look at it empirically, when you ask somebody what's one plus one and on Tuesday they say two and on Thursday they say nine, well, you ask the same question over and over again, it'll give you different answers. Like, yeah, that's it. Yeah, and so therefore you can't count on it. It's not reliable and it's not what you think it is. You don't ask a sarcastic parrot for facts.
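To make the variability point concrete, here is a minimal Python sketch of temperature-based sampling over next-token scores. It is an illustration under stated assumptions, not a description of how any particular model or API is configured: the scores, the token names, and the `sample_next_token` helper are all invented for this example.

```python
import math
import random


def sample_next_token(scores, temperature, rng):
    """Pick one token from a dict of raw scores (hypothetical values).

    With temperature 0 the highest-scoring token is always chosen; with a
    positive temperature the choice is random, weighted by softmax(scores / T).
    """
    if temperature <= 0:
        return max(scores, key=scores.get)
    scaled = {tok: s / temperature for tok, s in scores.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scaled.values()]
    return rng.choices(list(scaled.keys()), weights=weights, k=1)[0]


# Made-up scores for the answer to "what's one plus one?"
scores = {"two": 2.0, "three": 1.0, "nine": 0.5}
rng = random.Random(0)
greedy = [sample_next_token(scores, 0.0, rng) for _ in range(5)]
sampled = [sample_next_token(scores, 1.0, rng) for _ in range(5)]
```

In this sketch the `greedy` list repeats the same answer every time, while the `sampled` list can differ from run to run, which is one plausible reason the same prompt produces different responses.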

05:25

That's not what it's meant for. I'm sure Vishwas has lots of things to say on this, but anyway, tune in to aibotshow.com. Starting at about episode nine, I think, is when we started using code. But episode eleven is where we start really pushing the boundaries with GPT-4V. Love it, don't learn it, love it. Cool.

05:46

Who's talking to us today, Richard? Grabbed a comment off of show 1829, which we did just earlier this year with our friend Vishwas, when we were talking about fusion development, which apparently is a term that's now fallen out of favor. But this is about the idea that the Power Platform presents an awesome way to build clients against cloud back ends, and that there's a role for traditional .NET developers in there, in building some of those back-end services and

06:12

even building client-side components. And that was the idea of fusion, that we can all work together. I don't know how well it caught on with folks, but, not that hip. But we did get a comment off that show from Jim, who I believe is a regular. He said: every time I hear someone talk about how easy it is to build with Power Apps for a power user, I shake my head and I have to think to myself that we've

06:31

lost something in the past twenty years. I went from IT support dude with no real software development training to a software development leader, in large part because of Visual Basic for Applications, which made it easy to learn to program. We didn't need a solution that put hard-to-maintain Excel formulas into a web app. We had a problem that needed to get solved and get back quickly

06:53

to developing more apps. We need tools that get us to the equivalent of the late-nineties and early-2000s desktop development experience, not things that try and mimic Access. I mean, it's one click to create a project that I could start dropping UI elements into and then writing clean code with a minimum of ceremony, not hiding logic behind textbox dates out of textboxes. And I'm going to turn off my rant before it turns into a blog post. Now,

07:15

I mean, Jim, I would say you're misremembering some stuff too. Like, first thing that I came to appreciate, and I said this the other day on another show, is that I think WinForms is the exception. It's the only product that ever had a great designer. Every other attempt to make a great designer, looking at you, FrontPage or Dreamweaver, much less WPF, every other attempt to make a great designer has fallen on hard times. Only WinForms had a great designer. There is a reason for

07:46

that. The reason is WinForms is a pixel-based grid, and all of these other things are dynamic. Yeah, and that's why you can't have a visual designer when you're not pixel based. Well, and the downside to the pixel based is then you go to higher-res monitors and, oops. Or you want to work on mobile, oops. Or you want to scale. Yeah, all of those problems are introduced, and so that's why we have this one draggy, droppy designer that people fell in love with back

08:15

in the day. Yeah, but the idea that VB ever built clean code of any kind, and with minimum ceremony? Like, it was good at getting stuff done, but boy, you had to practice carefully to make maintainable projects. Yes, you did. It's a bit of a nostalgic effect to think all things were so much simpler then. Well, they were, because we had VGA displays, one type of machine to deal with, and, you know, it more or less worked, in that you were writing code on the same kind of

08:41

computer you were going to run it on. Like life was simpler in that respect. You know. I had a flashback to the nineties when I had my little kids and we used to play DVD games right on the computers of the day. Dug out some of those DVDs, you know, like played them, brought them up, and I was thinking, oh, this is going to be great. You know, we get to do these cartoony mathey things, right. Yeah, it's like the size of a postage stamp on

09:07

my screen. Your screen's just a little bit higher resolution now. Yeah, and you blow it up and it does not look good, you know. For the History of .NET, I have all of those great slides of the conferences from two thousand, like the original PDC and TechEd when .NET was first announced, and like MIX06 and so forth. You know, I had to get all of those graphics remade because they were like three twenty by two eighty, yeah, six hundred. Like, back then, screens weren't

09:33

the same resolution, and they looked terrible. Pretty terrible on 1080. So I literally sent copies of them to designers, said, remake these as 1080 for me. So I have the only ones. I have the only good-looking MIX06 graphic. Like, it's me, I made it, it's mine. That's very cool. Anyway, Jim, thank you so much for your comment, and a copy of Music to Code By is on

09:52

its way to you. And if you'd like a copy of Music to Code By, write a comment on the website at dotnetrocks.com or on the Facebooks. We publish every show there, and if you comment there and I read it on the show, we'll send you a copy of Music to Code By. And, you know, you can follow us on Twitter if you like, or X, or whatever the hell it is today. It's going to be Z tomorrow, I'm sure. But yeah, we're still there, but the cool kids

10:15

are hanging out on Mastodon, from my perspective anyway. I'm @CarlFranklin at techhub.social, and I'm @RichCampbell at mastodon.social. Yeah, send us a toot. Maybe you'll get a copy of Music to Code By. There you go. It's another way that you can leave a comment. And that brings us to our good friend Vishwas. Vishwas Lele has been on the show some thirteen or fourteen times. He is an Azure and AI expert. His bio is long, but I won't belabor that here.

10:48

He's a real great guy as well. Welcome back, Vishwas. Thank you. What have you been up to, what's doing? Well, like everybody else, got started with ChatGPT, OpenAI, Azure OpenAI specifically. For Azure developers like me, it was an easy sort of way to get into generative AI. I've been working on some AI/ML stuff previously, but generative AI, of course, has changed everything that we knew before. Now, you're in Washington, DC, I think. I think a lot of your customers are governments. Like,

11:28

aren't they pretty careful about adopting new technologies? I'm always impressed that you're in these things so early on. Yeah. So, yes, Richard, that's true. But I do work with government and commercial customers both. You know, the idea is, can we try out some solutions in the commercial world and bring some of those solutions to the government community? So that's been the pattern. But you'd be surprised by the government customers.

11:56

And, you know, Microsoft has the Azure OpenAI Service that was FedRAMPed, right, so it is available in a certain secure setting. That was an announcement that came a few months ago. So there are civilian agencies who have lots of needs for automation of processes and processing of documents, and they're definitely looking at generative AI models to be able to expedite some of that work.

12:24

Well, I know the government is currently sharpening its claws to put some restrictions on AI. They've been sort of befuddled about this whole thing from the very beginning, and I'm not so sure how well they understand it. But clearly there's some legislation that needs to happen, especially around copyright protection and that kind of stuff, licensing. But what are your thoughts on that? Have you heard

12:48

anything, being in DC, about what they've planned to do? Well, you know, like you, there's been news that the White House is preparing for a major announcement. I don't know if it's a good thing or a bad thing. You're right, I think we do need some guidance there, but I always worry about what kind of guidance, and will it impede the amazing amount of progress overall? I mean, there are negatives and there are positives, like everything else.

13:18

I do think that this is just an amazing development that we will look back five years from now and say, oh, something new started in twenty twenty three. So I'm hoping that the legislation that they're coming up with, I'm hoping with the industry leaders is a responsible step but does not curtail the progress and the amazing possibilities that we see from this technology. Yeah, when I think about the logical roles, it's like it would be good to have some

13:50

privacy legislation. We've been needing that for ages, you know, large language models notwithstanding. But to me, the one area that you want, from a public perspective, to care about is source of data, full stop. If you're going to make an LLM available to the public, even for a fee, you have to tell them where you get your data from. That's an easy one, and, you know, there's nothing profound there. It's not a

14:13

secret, presumably. It certainly starts to tackle the copyright issue, but it's just like, hey, you know, you want to compete, make distinctions between different LLMs? Talk about source of data. Right. So, Richard, I think source of data, and there's been a debate about, you know, what do open source models mean, right? Is it just publishing the weights of your model? Does that make it open source? Or is it announcing to the world what sources your models were trained on?

14:41

Yeah, that's the definition of open source. So you're absolutely right. These companies have to at some point tell us what the sources of data are. But that's different from citations. Once the response is being generated, at a sentence level, that's a very hard problem to do well. And that's what you really need, right? I mean, nobody wants a, you know, fifty-terabyte list of sources, general sources that it's been trained on. But what I want to know is, when it tells me something,

15:13

where did you get that fact? You know? But as you said, it's very hard, because it's a statistical model that predicts the next word that's going to come. How does it know it? Arguably, it's not pulling from any one source. It's the points of the data that it processed against. Right. Yes, that is true. We can sort of break, in my mind, the generative AI technology into two parts. The ability to comprehend what you're saying, you know, so you're throwing it a prompt and it is

15:41

trying to comprehend and trying to give you an answer. And the second part is the general knowledge that it has captured through the training process. Right. I think of generative AI in two terms, right, and I think we're going to get into this more. I look at it from the perspective of, how can generative AI make me a better developer? How can I solve everyday problems that are there in the businesses,

16:07

in IT? How can I solve that? So from my perspective, I tend to break it, like I said, into two parts: the part about comprehending and sort of listening and reasoning, and then the part about general knowledge. I tend to leave the general knowledge part out if I can. When I'm asking for a response, I'm saying, I'm going to provide you a context and you should only reply based on the information that I am providing you, commonly referred to as the RAG pattern, and that is very important to

16:38

me. So Richard, your question about sources and citation is very important. But in the kind of solutions that I'm building, I am providing it a context and saying, I'm going to give you these ten pages of information and you need to process that. And that way, hallucination is reduced and the accuracy is much higher to begin with, because I'm controlling the source. I agree, and I think that code in general, you know, code generation or help

17:08

with coding is where AI. You know, this large language model absolutely shines. Sure, and nobody's going to complain that, oh, that algorithm was lifted out of my open source project. Well it probably wasn't it was probably put together just by statistics, but but I've found it. That's that's where it really shines. And that's not where people are complaining. People are complaining

17:37

because it's taking their art, artwork and appropriating it. You know, artists, like all the artists are crying about this, right, you know, and this is why people don't trust it. But as you know, anyone who's writing complex documents that are in or code that are in a specific as

17:56

you said, context, right, that's where it really, really shines. And before I stop talking here, the other thing that I like about it is the new ChatGPT Bing web interface, so instead of using the data that it's trained on, you can do essentially what is a web search, but it goes out, finds multiple sources, puts things together, and if it finds the answer, it sort of gives you a nice concise answer instead of

18:32

having to scrape it out of different sites. And also if it can't find an answer, it'll say, you know what, I searched all over the web and I couldn't find that thing that you're looking for. Maybe it doesn't exist, and maybe here's some documentation that you can go look at and see if you can figure it out yourself. Yeah, that integration because all of these language models have a knowledge cutoff date December twenty twenty one or some date

18:56

like that. So if you ask a question that goes beyond that, of course, having this Bing integration where you can go do a web search, you could do that. I guess with ChatGPT too, with some plug-ins, but Bing sort of gives you that. What I've found is it generally works, but there are situations where the citations are not complete, and that merging of real-time lookup versus what the model has

19:21

been trained on does not work as seamlessly. You're right, as you would expect. And especially for things like C#, like, you're better off using the default model that it was trained on to do some code, you know, write a routine that does X, than you are using the Bing stuff. The other thing I like about it now is I can say, here's a URL to a PDF, digest this, and then tell me

19:48

why, you know, my fridge doesn't work. That's right. And it will actually do a good job of diagnosing, and you can take pictures and upload them and it'll understand what it's looking at in the picture most of the time. This is what we're finding out anyway. Carl, that's a very good

20:03

example. Give it a fifty or one hundred page document and let it read it, and then have a conversation with it, and say, because you know this, this fifty page document presumably is talking about different problems with the fridge. You don't want to go through that. You can say, Okay, now that you've read it, I'm going to tell you two symptoms that I'm seeing with the problem that I'm having, and can you narrow it down? And then having a conversation with that. That's just amazing. It is

20:30

amazing. That's aig. It speaks to the strength of this model. It is a good summarizer, able to read through a lot of information. And plus, you can validate, you know, it got it from those hundred pages. So once it gives a statement like, you should do this, you can take that phrase and search in that document and probably find exactly the reference you need to say, yeah, the document really does say this. I did this once with my amplifier and it actually gave me the answer and said,

21:00

on page fifty-three, in this section, here's what it says. So it actually told me where in the document the answer lay. And this was a real experiment that I did. I had a real problem with a piece of gear, and I literally told it to read the PDF of the manual, and in like five seconds it gave me the answer. The challenge is that, you know, let's say with your one-hundred-page PDF document, either it can read it

21:25

completely, that's one thing. But, you know, what if you have thousands of fifty-page documents? Yeah. Then you're trying to sort of come up with an answer, and you're eventually going to run into a limit. Even though GPT-4 gives you 32K models, which is what, about two

21:40

hundred and fifty pages or so, what if you had more pages? Now you can't feed that into the model, right, and now you have to do some sort of a search, a vector search of some sort, get the right content in, and pass that as context to your prompt. You said the V word: vector databases. I find these baffling, and I'm not so sure that, you know, Joe web user is going to be able to figure them out. Can you briefly explain what they are? Sure, sure,

22:08

sure, and I shouldn't have used the words without explaining them first. No, of course you should. How else are we going to ask you questions? I mean, yeah, what's our role here? No, true. So let's take a step back. So you started out with a great example of, you know, having this great fifty-page PDF about your amplifier, and you had the model read that completely, and you were able to do that because that fifty pages fit into the context size that the model

22:44

supports, which is 32K, right? What if you had thousands of fifty-page documents and you're trying to build an answer? You can't feed all of those to the model, at least not now. Maybe GPT-5 or GPT-6 would allow us to do that, who knows. So in the absence of that, what you have to do is you have to go do a

23:03

search against those thousand documents. And the problem is that if you did some keyword search, as we've been used to doing, right, that keyword search is really not conveying the intent of what you're trying to search. You might search for some words and you might get good hits, no question about it. But what if you're trying to just describe the problem? You're just saying, my amplifier works this way in certain conditions, but I'm

23:33

noticing these problems. I can guarantee you there will be no keyword hits when you write a statement like that. But you're describing the problem. And what if you were able to take this two lines' worth of description and find a paragraph of text that closely resembles this text? And that's

23:56

what a vector database is. What we are trying to do is take every piece of text in that large corpus of documents and assign some statistical values to it so that we can place certain sentences in a certain manner. So think of every word having a certain number of attributes, and then words that are closely aligned will have similar attribute values.
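The "attributes per word" idea can be sketched in a few lines of Python. This is only a toy, under loud assumptions: the three attributes and every number in `WORD_VECTORS` are made up for illustration, whereas a real embedding model learns hundreds or thousands of dimensions from data. Cosine similarity, used below, is a standard way to measure how closely two such vectors point in the same direction.

```python
import math

# Hypothetical three-attribute vectors: (audio gear, electronics, finance).
# The values are invented purely to show the mechanics.
WORD_VECTORS = {
    "amplifier": [0.9, 0.1, 0.0],
    "speaker": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(word):
    """Return the other word whose vector points in the closest direction."""
    target = WORD_VECTORS[word]
    others = (w for w in WORD_VECTORS if w != word)
    return max(others, key=lambda w: cosine_similarity(target, WORD_VECTORS[w]))
```

With these invented numbers, "amplifier" lands much closer to "speaker" than to "invoice". A vector database applies the same comparison, at scale, to embeddings of whole passages rather than hand-written word vectors.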

24:22

Things like that. So now you take that sentence of your description of the problem, you create an embedding out of it, and, you know, you can use any of the models to create an embedding, which is, you're getting a mathematical representation of those two sentences. And then, and this is going back to high school math, you're doing a cosine search of this mathematical representation

24:47

against the mathematical representation of the entire corpus. Maybe you get two or three hits which generally talk about this problem, and then you bring that text back and only send that text to the model because now you've compressed that information. So that's if you're able to do that, then you can get answers to the question even if you have one thousand documents. And I'll talk about later on the product or application that I'm working on, which basically uses the pattern

25:19

that I just talked about. I mean, we don't have enough time for you to explain this, but in general, how does one go about creating a vector database from a tome of documents? Right. So, you know, let's just take the simple example of those two lines of description of your problem. Right, you can go to any number of embedding models that are available. You can say, hey, please give me a vector representation of these three lines of text, and it literally gives you back a vector representation

25:56

of those three lines of text. So, because these models have been trained, like I said, you play this prediction game what comes next? What comes next? You play this prediction game, and by playing this game, millions and billions of times you have assigned a certain attribute to a certain word in a certain phrase of that sentence. And then by placing these values in a multi dimensional space, you can say these words together mean something, These

26:29

words together mean something. And you know, we all remember the twenty fourteen word2vec paper that Google came out with, you know, twenty fourteen, and we were so amazed when it could do simple things like king minus man plus woman equals what? Oh, queen. And we were so happy that, you know, these models now have some semantic understanding of the English language. Right, that

26:52

was twenty fourteen when the word2vec paper came out. And of course, you know, given the complexity of the models, the transformer model, the availability of GPUs, now we can do so much more. We were embedding words literally; now we're embedding paragraphs and pages, essentially. So, to your listeners, essentially you have taken your corpus of text and you've created a mathematical representation that tells a model. The model is not sentient, it

27:26

does not know what it means. But because of the mathematical representation, it can take the problem description and say, there are these four paragraphs out of these thousand documents that seem to be talking about something similar, and you go and pull those paragraphs, and then you go to your prompt and say, I'm going to give you four paragraphs' worth of relevant information about my amplifier, and I'm going to give you some symptoms. Your job is to take that symptom and try

27:56

to come up with a plausible answer for the problem that I'm having. That's essentially what's happening. I get that with the paragraphs that you generate for the prompt. But how do you, you know, if I have thousands and thousands of pages of documents, is there like literally an app that I can just drag and drop those into and, poof, it's a vector database? Or is there a lot of manual work on the part of the user? That's a great

28:25

question. So Microsoft 365 Copilot is going to come out. The release date is November first or something like that, if I've read correctly. So let's just use that as an example to answer your question, Carl. Sure. So with Microsoft 365 Copilot, inside Word, PowerPoint, Excel, whatever, you will be able to ask. You can go into Word and say, hey, can you please generate me an invoice based on the last five invoices that I sent out? Okay, you'd be

29:02

able to do something like this. How would you be able to do that? Well, maybe you have thousands of invoices stored on your OneDrive somewhere, and when you start training the semantic index, which is, you know, a component of Microsoft 365 Copilot, what they would do is they would go through your OneDrive, open up every document, and, most commonly this is what people are doing, break up every document at the page break. Take the one page or so, or some fixed number of words

29:32

or tokens. Read that from your document, create an embedding, store it. Take the next page, create an embedding using the AI model that I talked about, it could be any text model, and store it. And that's how Microsoft internally is creating the semantic index. Of course, Cognitive Search now has a vector database built into it. Right, Cognitive Search used to be

29:56

text search, and now has a vector database built into it. So now you've gone through this whole process of ingestion, which can take lots of hours, yes, because you could have thousands of documents. But now you've taken your entire corpus and broken it down into a set of embeddings. So I take it, then, this isn't something that Joe programmer would do with a bunch of data? Like, you need some sort of powerful tool on the back end to do this

30:23

for you. I'm trying to understand your question. Maybe this is the time to sort of talk about the application I'm building and see if I answer that question. And I'll come back. Right. All right, but hold that thought. We'll do that right after this break for these very important messages. And we're back. It's .NET Rocks. I'm Carl Franklin, that's Richard Campbell, and that's Vishwas Lele, and he was just about to start telling us

30:52

about this amazing application that he's working on. Please do go ahead, sir. Thank you. So, I don't know about amazing, but there's an application that we've been trying to build the last six months that has taken up a lot of my time. So we talked about Microsoft 365 Copilot. We are building a similar capability, but to help proposal developers. So imagine you're getting a fifty- or one-hundred-page RFP or RFI or RFQ or something like that,

31:23

and that's a very time-consuming and expensive process. Right? Wouldn't it be nice if you were able to take all your past performances, so you've been responding to these kinds of RFPs over years prior, and take those documents and create some sort of a vector database and then use that to generate a draft

31:47

of a future RFP. So, going back to your question, Carl, as a developer, what we had to do was to write those ingestion routines where we take those past proposal responses that you may have submitted to the government or some agency or a state or local or what have you, and write that ingestion routine where we take that document and split it up. Now, there is a lot of work that is happening which says, should you be breaking

32:19

the document on the page boundary? Is that enough? Because think about it, if you just break it on a page boundary, maybe the important idea or concept is transcending that page boundary. Right, break it up by topics. Right, break it up by topics. But then how do you break it up by topics? Table of contents. That's a good point, but sometimes, you know, in our case at least, with the PDF documents that usually make up these responses, the tables of contents are not that great,

32:55

and PDF is not a great parsable format in any case. Right. So you have to use, in some cases, language models to be able to even parse them out and to break the document down into some sort of a section in some classification format so that you can use it for embedding. So you store, maybe not do it on page boundaries, maybe do it on some sort

33:22

of a lexical boundary. But then you also think about what if the table of contents has some sort of inheritance built into it, is it important for you to capture that inheritance because that might come in handy as you're trying to respond to a question. Right, So those are all important things to do.
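One way to picture splitting on a lexical boundary rather than a page boundary is the Python sketch below. It is an assumption-laden illustration, not the ingestion routine described in the conversation: it treats blank lines as paragraph boundaries, counts words instead of model tokens, and the `chunk_by_paragraph` name and `max_words` limit are hypothetical.

```python
def chunk_by_paragraph(text, max_words=200):
    """Group whole paragraphs into chunks of at most max_words words.

    Paragraphs are never split, so an idea that stays inside one paragraph
    stays inside one chunk. A single paragraph longer than max_words becomes
    its own oversized chunk rather than being cut in the middle.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = []
    count = 0
    for paragraph in paragraphs:
        words = len(paragraph.split())
        # Close the current chunk if adding this paragraph would overflow it.
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current = []
            count = 0
        current.append(paragraph)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Each returned chunk would then be sent to an embedding model and stored alongside its vector. Carrying the section heading or table-of-contents path with each chunk is one possible way to keep the document structure available at query time.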

33:44

So, going back to your question: yes, as a developer, you have to think about where your embedding boundaries lie, because when it comes time to generate content, the more precise the context that you can provide to the language model, the better the response you're going to get. And you get more precise context by thinking deeply about how you're going to break your corpus into the

34:07

right set of embeddings. Isn't Microsoft Copilot going to do a lot of this for us? Like, this is a solved problem if you just are all in on M365 and pay whatever per month they're going to ask for. That's a great question. And let me tell you the difference. So Microsoft 365 Copilot is going to be a general purpose answering engine, right, so you can ask it and say, hey, please draft me the text for an invoice based on the last five invoices, and I'm sure

34:40

it'll do a great job. You've seen demos of that. It also has access to the graph, so it knows which people you're collaborating with, and it can take that into account. But if you have a complex writing problem where the questions are not single line questions, right, the RFP questions, sometimes the questions themselves are maybe a page long or half a page long. Right, how do you sort of create a prompt? So in case

35:12

of Microsoft 365, you're essentially writing a prompt. You're saying, hey, generate me an invoice which is based on the past five invoices. Right, there is so much context in that statement that needs to be known, that you have in mind. Right, in our case, the context is enormous, right, Richard? Because if you're trying to respond to a complex question,

35:34

you have to infuse a lot of attributes into that. Right. You have to think about what are the pain points that the buyer is expressing in this description? What are the win themes? How am I the one that they should be selecting and not somebody else? Is the buyer cost conscious? Have we done this work? Like, you know, it's great to say that we are the greatest company that can do agile development, but wouldn't it be nice if you can provide some proof points that we have done this and we've

36:02

had success. So I've just given you three examples. There are many such attributes, so when you are crafting a prompt, you need to take these considerations into account. So in case of M365 Copilot, you are writing the prompt, but in our situation you're writing a prompt that is far more complex. So we are trying to do some prompt engineering on behalf of the user, where they can tell us at a high level what they want to do with their response, and for us, it is very much about

36:39

That's why it's a copilot. It's not a pilot or an autopilot. You're leveraging the experience of experienced proposal writers. The experienced proposal writers who spent ten twenty thirty years writing this, they know exactly how you should be responding. We are simply augmenting them and providing the first draft. Right. All of the innovativeness of our engine will not help you win any proposals. Our engine will help you do sort of the mundane steps quickly so that you can spend

37:07

more time adding innovation to your response. So, you know, if you have to generate this prompt, we think that every user is not going to be able to generate those levels of prompts. And I'll tell you why: because as good as GPT is, if you give it a prompt of a certain size, it starts forgetting some instructions that you give in the middle, right, so you have to repeat them. So we have to have sort of layers

37:37

of prompts that go in. And this is why, you know, the team that I'm working with has data scientists in it, and people ask me sometimes, hey, generative AI is pre-trained, do I need a data scientist? Well, you need a data scientist if you're crafting these sophisticated

37:52

prompts on behalf of the user by taking some input from them. So, Richard, in that manner, we think that this is a very specialized example of how we are trying to use the language model, and this is sort of the difference with the M365 Copilot. And there are a few other differences. For us, hallucination is very, very important. Please don't go out and say that you did a five thousand SharePoint node migration when you've only

38:22

done fifty or five hundred, right, that will be a problem. Right, so we do explicit accuracy verification, and there are many techniques that your listeners may be aware of. You know, you take the generated response, you try to extract entities from that, and large language models... Right, so any definitive statements in there? Yes, yes, and language models

38:46

are great at extracting entities. And then you take those entities and then you do a search against your knowledge repository and say, do I see a five hundred node migration ever mentioned here? And if it is not, then let's

38:59

just flag it and let a human proposal writer go check this. We're doing those kinds of things here. I'll just extend this by saying that, I don't know if your listeners have seen this study from MIT, there are many studies out there which said, okay, let's just try

39:16

to measure the productivity improvements of something like ChatGPT. Right, so what they did was they took four hundred people, college degrees, good writers, broke them into two groups of two hundred each, gave them a writing assignment, and the people who were evaluating them had no idea whether they were using some tools or not. So they gave them one writing assignment where nobody was

39:38

allowed to use any tools. So they got the baseline. Then they gave them another writing assignment, but this time only half the group was allowed to use ChatGPT. The other group was not allowed to use ChatGPT. And once again the evaluators had no idea who was doing what. And if you look at the results, pretty amazing results. Right. The group that was using ChatGPT completed their work in thirty or thirty three percent less time. And I can point you to the study. So my numbers may

40:09

be off. But more importantly, the people who were using ChatGPT scored thirty three or thirty five percent better grades than people who were not using it. Okay, so a third better grade in a third less time. A third less time. I mean, that's amazing. And then, was it fact checked? Because that's my problem. You might be able to spit out

40:31

beautiful text, but is it really right? Yes, and human evaluators evaluated that, right. It was just like, you know, people testing that. That's always the case, right. And we know that human evaluators make mistakes as well. Well, and we don't know exactly what they were trying to write. Like, as soon as you get away from non factual writing, like just exploratory fiction based writing and stuff, it's not the same set of problems, right,

41:00

right. And you know what is interesting is that, you know, remember I said that they were all given a task where nobody was allowed to use any external tools. So that was to establish a baseline, right. What is interesting in the study is that folks who did not have great writing ability, right, even they benefited when they were put in a pool where they could use ChatGPT. Their grades went up significantly.

41:28

I mean, I would argue they would benefit more. They would benefit more, right. But you know, there's a very interesting thing about that study that I'd like to bring back to your question, Richard. It was very interesting in the study that there's something called, and I'm going to use a fancy term that I did not know until I read this MIT study, the human machine complementarity. Okay. So what they did was they evaluated the grades

41:55

at two levels. So, remember the team that was using ChatGPT? They took the output of ChatGPT and graded it, and then they took the output where humans did some prompt engineering on top of ChatGPT and generated an output, and evaluated that too, okay, and they found that there was not much difference, right. So what is interesting is,

42:20

so where is this human machine complementarity coming in? So people looked at the ChatGPT output, made a few tweaks, and then they submitted it.

42:30

So, tying it back to your question, Richard, my theory is that the reason there was not much difference between the ChatGPT output and the human machine output was that people were not thinking carefully about the prompts and actually writing detailed prompts, right. And we think that by creating these layers of prompts for them you can get a benefit on top of that. So everybody is going to get that thirty percent benefit. That's going to be at some point table stakes, right.

43:04

Was it Steven Sinofsky? I was reading a post from him recently that said every textbox is going to be LLM enabled soon, and it's going to be like, you start writing and it will just start giving you some stuff. So now everybody is going to benefit from thirty percent or twenty eight percent or whatever that is, right. So how do you go beyond that? Well, you go beyond that through sophisticated prompt engineering. So that's been my journey to build

43:32

this tool. Take your past responses, and I have to say this, people ask me this all the time: what we are building is not cheating as a service, right. I tell people, if you don't have good past responses, if you don't have good architects who can come up with a good solution, if you don't have creative solutions, then what we produce will be pretty, you know, poor and

43:59

not something that you want. But if you have good past performances, that you've done this SharePoint migration work, and you have architects who can come up with a solution, then this is going to benefit you. So really, from a responsible AI standpoint, it is about taking your knowledge and then generating that quickly and then accurately checking that you're not hallucinating and saying you did a five thousand node server migration, right. I love it as a

44:30

service, jeez. But also, you know, the old comp sci comment of GIGO, right, of garbage in, garbage out. Yes. Like, we still have data quality problems. Like, this comes down to the quality of the data you load the set with and how well you load it. But it is an automation tool, and all automation suffers from the same problem, which is it's only going to amplify what you tell it to amplify. And if you tell it to amplify your

44:57

stupidity, it's going to amplify your stupidity. You're absolutely right, Richard. In my experience of six months working on this project, it's clear that you spend a lot of time doing data wrangling. You talked about garbage in, garbage out. You do worry about what data is getting in. So be prepared to do those kinds of things, even though these models are great. But I would say one thing: these models are far more accepting

45:28

of the mistakes. So I was just doing this example recently where, you know, we had to calculate some accuracy or some data, and we went out to three or four external services to go validate this data. And each of the services returned the data in a different format, right, one returned it in a

45:51

certain JSON structure. Somebody gave us XML back, and previously I would have to write a lot of data wrangling code to sort of parse out that data, get this, and then compare it, sort of reformat it. Yeah, to get them into... Very imperative code, very much imperative code that I had to write. Yeah, right, and then immediately throw away. Like, you're only going to throw it away because it's going to... And in case of language models, you can say, hey, you know, I'm interested in validating this

46:22

source information that I have. I went out and made a call to four external data sources, and they're going to give us some data back. And this data may not be exactly what you want, might be wrapped in some XML or JSON. Please try to make sense of the data and it actually tries to extract entities whether it is XML or JSON right and does the comparison for you. So so I found that writing that data wrangling code was much

46:49

quicker than was previously possible. So there are these examples. So it was able to discriminate a tag from content? It will, yeah, as is the case with the imperative program. So I do feel that as a software developer, as we think about these problems, and Carl, you talked about, you know, being a coding assistant is important, and most of the time I tend to use ChatGPT for my coding stuff because, you know, of course I can use

47:28

GitHub Copilot, and with GitHub Copilot, you know, the benefit of course is that you're based in the IDE right there as you're coding. It is showing. But for me, an even more important thing is I go to ChatGPT and say, I'm trying to write this routine and this is generally my idea about this routine. I can describe it in plain English. It comes back with some code, and maybe I take that code, I paste it in there, and then I'll test it, and maybe it'll give me an error, or

47:52

maybe the logic is not quite accurate. I can go back and have a conversation: hey, you gave me some code, but it is not handling this edge condition. Oh, let me handle that edge condition for you. Go modify these three lines of code from here, and then bring it back to my IDE and change it. So in this manner, it's an iterative development model. So that is a significant timesaver. I had this exact experience

48:20

you're talking about, without going too far into it. Kelly, my wife, and I like to do word search puzzles, right, and we were getting one in the paper once a week, and we're like, you know, maybe I had to go get a book, and the books were, like, terrible. So I actually wrote a word search using ChatGPT to help me do it, and it was completely iterative, and I probably should publish the conversation

48:45

that I had to get the code. But it started just by putting them all in a line, you know, and I'm like, can you mix them up? Some backwards, some diagonal, some backwards diagonals? Like, sure. So we just went through this iterative thing and then it actually was great. I mean, first of all, I used the GPT API to search for words around the topic of X, camping, right. And I tried, first of all, I tried all these other libraries that try to do semantic, you know,

49:20

matching and searching and stuff, and they just fell flat. But, you know, the GPT API was like, no problem. It just produced these words, and I could tell it maximum word length, minimum word length, how many words to come back with, you know, don't use dashes or apostrophes. Like, it was really, really easy, and it was just a single sentence of English that I had to perfect in this API as a query. So yeah, I know exactly what you're saying. The iterative process of using GPT,

49:52

ChatGPT in particular, to come up with some solution is really, really powerful. And, you know, the model is able to reflect sometimes. Say, hey, I asked you to write a poem, and let's say I wanted to make sure that this does not rhyme, and you give me an output which is rhyming, and I say, hey, this does not make sense, and it will say, hey, sorry, I apologize, I don't think I met your requirement. It's very polite. It's polite. But Richard, I do

50:20

want to go back to what you said, that it is a stochastic parrot. Right, I would challenge that a little bit, and I put a video out. I'll send you the link. That is not based on my work. That's based on some great research that was done, something called Othello-GPT, and your

50:38

listeners may know Othello as a game called Reversi. It's an eight by eight board, right. So somebody did this awesome research, and I have an eight or ten minute video where I talk about it, if you want to reference that, because, you know, we always think, hey, this is a stochastic parrot. You know, it just comes up

50:57

with something based on what it has heard. And then you start typing and you start talking or you give a response, and it says, oh, these eight words, the ninth word must be this, and the tenth word must be this. So what the creators of the Othello-GPT paper did was they said, okay, let's constrain the problem as any good engineer would do.

51:19

Rather than having a fifty thousand word vocabulary of English and then having models with billions of parameters, let's just use the same technology, use a transformer network. But let's constrain the problem down to the game of Othello. So Othello is an eight by eight board and, you know, has a limited number of moves, as you can imagine. Right, let's just take the sequences. Let's just go to the World Championship Othello website and let's just download all the game scripts.

51:51

And now your vocabulary is not fifty thousand words, it's just the limited number of moves that you can make in that game. And now train the model. Train a language model based on the scripts of Othello games. Okay. And now there's a technique in neural networks that has been around for some time called probes, where you can go to a certain layer of this neural network and say, what have you learned? What up until now?

52:22

What is your understanding? And what the creators of this paper did was, so let's say two players are playing this game, and let's say they are halfway through the game or at whatever level they are. And what they wanted to know was, does the model really develop an emerging representation of

52:43

the board? It's not just predicting the next move, right. And to prove that point, what they did was: what if I suddenly change the board positioning? I can move these pieces around, and does it now change the prediction based on that? So two things, I said: apply a probe, figure out the emerging representation of that Othello game.
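
To make the "constrained vocabulary" point from the Othello discussion concrete, here is a small Python sketch of how game scripts could be turned into tokens. It assumes standard square notation ("a1" to "h8") and ignores passes and padding; the function names are invented for the example, and the paper's own preprocessing may differ.

```python
# Othello is played on an 8x8 board. The four centre squares are occupied
# before the first move, so only the other 60 squares can appear as moves.
COLUMNS = "abcdefgh"
CENTRE = {"d4", "d5", "e4", "e5"}


def build_move_vocabulary() -> dict[str, int]:
    """Map every playable square to a token id (assumed ordering: row by row)."""
    squares = [f"{col}{row}" for row in range(1, 9) for col in COLUMNS]
    playable = [square for square in squares if square not in CENTRE]
    return {square: token_id for token_id, square in enumerate(playable)}


def encode_game(moves: list[str], vocab: dict[str, int]) -> list[int]:
    """Turn a game script (a sequence of squares) into a sequence of token ids."""
    return [vocab[move] for move in moves]


vocab = build_move_vocabulary()
opening = encode_game(["f5", "d6", "c3"], vocab)  # the first three moves of a game
```

A model trained on such sequences only ever sees move tokens, never the board itself, which is why finding a board-like representation inside it with a probe is the interesting result.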

53:13

So now it is predicting based on that. Now go in and manually change the position of these pieces and now see if it predicts something accurately. So what this paper tries to do is try to say that, you know, it's not just a stochastic prediction. In training this over and over and over again, it has developed an emerging representation of the world, and can we just try and apply this to ChatGPT? In some ways it is developing an

53:46

emerging representation of the world. Now there are people who absolutely disagree, Yann LeCun being one of them. But there are data scientists who think that these models are learning something more than we're giving them credit for. So that's a debate. But, you know, I'll point to that Othello-GPT if you want to read the paper. I found the link and I'll include it in the

54:09

show notes. Yes, on your post, and yeah, it's in. I mean, yeah, you know me, I fall firmly in the LeCun space of this: there are no emergent properties here except the ones that we project upon them. It's software. But you can't argue with its ability to summarize. And so is this a kind of summarization, right? And a good summary also represents an emerging understanding. Right? But keep in mind, you know, the model has no notion that you're playing a game. It has

54:43

no notion of who's winning, who's losing. Right, are any of those things important? Right? And therein lies part of the problem. Like, this is as much the opposite of common sense as you could possibly imagine. Right. But at the same time, the eight by eight board is complex enough that it cannot memorize the moves. You can't keep it in your head. You cannot keep it in your head. So there is some

55:04

emerging representation of the rules of the game. Some strategy is emerging. You know, anybody who's played Othello knows that you want to take the corner positions quickly to block your opponent. It is developing those emerging strategies. Nobody said you should do this. So there's something to be said. I found that paper so fascinating that I had to read, for your listeners, I had

55:28

to read it five times to understand that paper. These papers are complex to read, and then I tried to synthesize the meaning as I understood it in a ten or fifteen minute video. Looks like you found a link to that. Yeah, it's twenty minutes, but okay, it was complicated, so

55:45

it was a little longer. But I appreciate your thinking on this, and it's great to see you building stuff here because it does give us all encouragement that there is space for some useful products if you put the proper parameters around it, like the caveats are important. Caveats are important, and think about it in this manner. Right, again, going back to, I'm always looking for how can I build applications faster. So think about the three application types that

56:13

are emerging. Right, of course, M365 Copilot: you have productivity tools they are adding LLMs to. That makes sense, right. But then maybe you have some sort of a line of business application or complex claims processing system that most of your listeners can relate to. There's one piece of functionality

56:30

that we've never been able to, you know, imperatively code. Maybe for that functionality we can embed LLMs into your existing line of business application. Right, right. You know, we get some large discounts, because if we keep track of all of the invoices and certain products from different vendors, if we can somehow put them together, maybe we can come up with a way to negotiate a contract with somebody. Right, very hard problem to do. We

57:00

have never been able to add that to our invoice system. Maybe now we can inject LLMs to do that. So that's the other type of application. And then of course the third type is native, you know, problems that we have not been able to solve without generative AI, and now people are coming up with applications, and new applications come out every day. No, and I appreciate all of that, sort of the truth that as we narrow in on what these tools are reliable for, it is a new class of automation and it

57:29

will take certain workloads off the table that just were bloody hard. I think the easy win is the raising of the average. Like you talked about on the writing side, that, yeah, LLMs will mean that your ordinary writer is a better writer. One would argue the very best will still be better than any tool, but most of us aren't the very best. But far more interesting

57:52

will be this true "we couldn't have solved this any other way." Right, you were mentioning Othello and I immediately go to the DeepMind stuff that went after Go. And while that's interesting, DeepMind's also tackling protein folding and is now producing results that we're going to take years to validate because

58:12

they've tackled such a complicated problem. Like, at this point they're like, well, it's now come back with some protein folding routines we've never seen before, and until we prove them, we don't know if they're right or not, and it's actually hard for us to even prove it. What's interesting: is the IBM Watson chess program an AI? Well, it's a decision tree

58:30

model built in the nineties. Like, it's way more primitive. But if you look at it, it predates the deep learning models that we use in generative AI, but it uses just pure brute force to go through those trees in every possible combination. Yeah, well, it has a

58:47

triage because you can get too deep. But in one of the variations on AlphaGo, they actually used the same training model to train against chess, and it slaughtered the best digital chess players out there, and it learned in no time at all. Like, it just, it was an indication that for this kind of

59:07

problem space, it was wicked good at it. And also that, unlike the decision tree model, which was taught against historical games, that Alpha model literally was just given the rules and played itself, right, which is what they ended up with for Go. Originally, with the original version of AlphaGo, they trained against classical games and then made a much more efficient version that just trained

59:30

against itself. So, I mean, these are the spaces where these generative AI models make sense, none of which is fact based, especially when it comes to extracting facts from the Internet. You know, that place that really lacks

59:46

a lot of facts. Yeah, so, you know, it's as long as you use a tool appropriately and you don't fall into the trap of trusting it as an authority, right, any more than you trust your calculator being an authority. Calculators? Well, at least the calculator comes up with the same answer every time.

01:00:04

Oh, if the inputs are the same, for sure. Don't discount the user error issue with calculators, right. I mean, you know, you have to follow the same sequence of keys, but yeah, you get... Well, how many times have we run into companies where they based their actions on a spreadsheet that had bad math in it? You know, it's easy to do. The hour flies by with you, Vishwas. I know, kind of like I want to keep going, but I could easily go another

01:00:30

hour. Oh well, thank you for this. I'm really grateful for the work you've done. Yeah, me too. And especially, I really appreciate your business oriented mind of using AI, and that's something that I'm just now starting to, you know, wrap my mind around. How are real businesses going to use this? I mean, you mentioned the Azure thing, but also Google Bard now can search Drive and Gmail and everything else.

01:01:01

Gmail search was always really good, but all these business tools are coming out, and therein lies the opportunity, I think. Yeah. So thanks again, Vishwas, and dear listener, we will talk to you again next time on dot net rocks. Dot net Rocks is brought to you by Franklins.Net and produced by PWOP Studios, a full service audio, video and post production facility located physically in New London, Connecticut, and of course in the cloud online at

01:01:53

pwop dot com. Visit our website at d-o-t-n-e-t-r-o-c-k-s dot com for RSS feeds, downloads, mobile apps, comments, and access to the full archives going back to show number one, recorded in September two thousand and two. And make sure you check out our sponsors. They keep us in business. Now go write some code. See you next time.

Transcript source: Provided by creator in RSS feed
