¶ Intro / Opening
Hi everyone, my name is Patrick Akio, and if you're interested in generative AI, large language models, and some of the interesting challenges and applications we've seen so far, this episode is for you. Joining me today is Rens Dimmendaal. He's Principal Data Scientist over here at Xebia, trailblazing with these technologies as we speak. I'll put all his socials in the description below, check him out. And with that being said, enjoy the episode.
Beyond podcasting, have you done any, like, hosting or stuff or
¶ Hosting and being a guest
something like that? Yeah, I have. Like I said, I sat there at the head of the table, and then I had a customer and somebody from my team, and I had to make sure there was a nice flow to the conversation. It was fun. How was it? Was it good? Yeah, it was fun. I really like speaking, but I also really like being the host there, taking that type of role to make sure the flow is right.
Yeah, I agree. The first time I went on a podcast as a guest was a bit weird, because I was like, OK, I'm just gonna go in blank and they're gonna ask questions, and then it turned out to be a good flow of dialogue. But I do realize how hard it sometimes is when you're speaking and you're still trying to figure out what exactly you want to say, and you don't want to come across as rambling. It's quite hard. Was that before or after you started hosting your own show? After. OK.
So you really know what it's like on the other side of the table. Yeah, yeah. Also, I did an intro call, like I always do intro calls, like I did with you, for potential guests, and other people also do that with me. And in there the person was like, OK, what do you want to talk about? And I was like, this is your show, what do you want to talk about? So yeah, I had to be like, no, no, you direct this. I'll be here for the ride.
Exactly. I'll be here for the ride mostly. Nice. Yeah. But what do you do more on a day
¶ Day to day of Rens
today, nowadays? So day-to-day: I was hired here as a consultant, so I was mostly at clients. And then I also became more of a leader, a principal data scientist over here, taking on some management responsibilities. That was still like 10 to 15% of my time. But since March or April, I don't remember exactly anymore, I'm not full-time at a client anymore. I'm full-time helping Xebia figure out:
what are we gonna do with gen AI? How is it gonna turn into a proposition for us? How does it change the work that we do as tech consultants ourselves, those types of things. And that means, for me, figuring out sales propositions. Giving talks at conferences is also part of this, to spread the word.
And making marketing materials, those types of things. Kicking off new developments: we have lots of consultants who would like to work with gen AI, and I make things to inspire them, maybe give them some ideas, but they don't need that much of that. What is really helpful is, when somebody makes something great, making sure that it actually gets seen and gets follow-up. That's a lot of fun to do, I think.
How did you get into that position? Like, was it an open thing and you applied, or did you really push for it and kind of create it yourself? The gen AI part, you mean? No, I was asked. Stijn, my manager, he said: Rens, I think this is going to change the world, I want you to focus on this, pull this off. And I wouldn't have thought in advance that that would be a possibility.
Yeah. And then I thought, OK, well, I'm going to do it. Good stuff. And what is your opinion on gen AI, where it stands and where
¶ Generative AI: Where do we stand?
it's going? That's a big one. That's a big one. I agree. So where gen AI stands today and what my opinion of it is, that's the question, right? I think right now it's a really interesting time, and that's because the capabilities of what's possible have really increased, let's say in the last 12 months. But expectations have also really increased.
Those two uncertainties, what's actually possible plus what people are expecting of it, multiply and just create a whole lot of uncertainty, I think, in what people believe is possible and what's not
possible. And my opinion is that I'm looking forward to when it settles a little bit and we can actually start to build stuff with people. That's also the feedback that we get from our customers when we speak to them: OK, now we kind of get the sense and the nonsense, what's possible today and what maybe will be possible in a year, or maybe ten years. We don't know, because the progress is so fast.
Yeah. And my opinion is that there are too many hot takes today. Maybe that's my hot take: there are too many hot takes. People saying the world is totally going to change, these people coming maybe from the NFTs or whatever, everyone's going to be jobless, the full hype train. Then you get into the doom scenarios: we need to stop now.
Yeah, I really don't like those types of people. But then there are also the people that say it's totally useless. I read an article yesterday, a colleague shared it, where they made a comparison of large language models with psychics: how they basically delude their audience, but psychics also delude themselves. So the people working with LLMs delude themselves that they're more capable than they actually are.
Yeah, I don't think that's the case either, because I see people using it for useful stuff all the time. There are lots of examples of it being applied quite usefully, so that's not the case either. But then the nuanced story, that something new is happening, and it's useful in some cases but you need to find out how, and not useful in other cases, that's less sexy to sell. Exactly. It's a bit harder, I can imagine.
I can't imagine there was ever a time that this kind of new technology came out and everyone was playing with it, talking about it: have you tried it, see what pops out, being amazed by it. I do feel like it opens up a lot of doors, but, maybe it's just me, I
¶ Honeycomb.io using Gen AI to increase conversion
haven't seen anything concrete that has landed with an organization that has adopted it yet. I feel like everyone's still trying to figure that out and trailblazing, playing with it. Yeah. So I think maybe you can compare it with other early technologies that did have fast, broad adoption, maybe like the Internet. The Internet came out and people started to make personal websites and that type of stuff.
But that was even broader than gen AI, because it was quite hard to make applications with it. Or maybe mobile app development: there was lots of hype and you could see the possibilities and how it could be applied. But then, about places where it's already really being used: of course lots of companies are now making little proofs of concept and trials with it, to figure it out.
But I've also seen some examples in industry that I thought, hey, that's actually pretty cool, and they are running it live. One example is from, I heard about it on the MLOps Community podcast, I'm also organizing the Amsterdam chapter, that's how I got there. It's a company called Honeycomb, I believe, and they're an observability platform.
Yeah, nice. So maybe you could tell me what they actually did, because I only heard about the gen AI side of it. It is an observability platform. I had Jessica Kerr on, and I think she's either in DevRel there or one of the principal engineers, but I don't know more than that it's an observability platform. OK, so then we both know part of the truth. That's OK. So what the product manager
said: on their website, first, people have to submit all their data. The second step is they need to write queries to get useful stuff out of the data they've sent. And they shared that their first hurdle is getting customers to upload the data, and the second hurdle is getting customers to write a good query, to get what they need. And that's important for them because they have a product-led strategy: you get people in for free and then they convert. And they found that people who write complex queries are more likely to convert to paid users, but writing complex queries is a hurdle. So what they made is, just below the normal query editor, a query assistant where you can write what you want in natural language, and in the background it transforms that into their own query language to get the insight. And what did they find when they put that live? The people who had access to this new tool were much more likely to write these complex queries. And they know from previous experiments that that will probably trickle down into more paid users. And they're using that right now. I think that's a pretty cool win, and a really good example, because it shows you're not going to put chat everywhere. Basically, that's a lazy approach. They integrated it into the product in a place where they knew it could be high impact. So I think that's pretty cool.
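Honeycomb hasn't published how their query assistant works, but the pattern described above, natural language in, a query in the product's own query language out, can be sketched roughly like this. Everything here is an assumption for illustration: the prompt format, the keyword set, and the validation rule are all made up.

```python
# Sketch of the "query assistant" pattern: translate a natural-language
# request into a domain query language, then validate before showing it.
# Everything here is illustrative; Honeycomb's real system is not public.

SYSTEM_PROMPT = (
    "You translate natural-language questions into our query language. "
    "Respond with only the query, no explanation."
)

# Tokens of the (made-up) query language that are not column references.
KEYWORDS = {"SELECT", "WHERE", "GROUP", "BY", "COUNT", "AVG", "P99"}

def build_prompt(user_request, schema_columns):
    """Assemble the prompt: instructions, available columns, the request."""
    return (
        f"{SYSTEM_PROMPT}\n"
        f"Available columns: {', '.join(schema_columns)}\n"
        f"Request: {user_request}"
    )

def is_valid_query(query, schema_columns):
    """Cheap guardrail: the generated query may only reference known columns."""
    tokens = query.replace("(", " ").replace(")", " ").split()
    referenced = [t for t in tokens if t.isidentifier() and t not in KEYWORDS]
    return all(t in schema_columns for t in referenced)

def assist(user_request, schema_columns, llm):
    """Call the model, validate its output, fall back to None on failure.

    `llm` is any callable prompt -> text, so the model stays swappable.
    """
    query = llm(build_prompt(user_request, schema_columns)).strip()
    return query if is_valid_query(query, schema_columns) else None
```

The validation step is the interesting design choice: because the feature is scoped to one query box, "did the model return a valid query" becomes a checkable signal rather than a vague quality judgment.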
¶ Plagiarism and chat bots
It's super targeted, and I always feel like it's gonna go into the realm of assisting users in doing something, right? Taking something that is or might be complex and making it easier. Having a kind of translation into natural language, or something like that, is where I've seen it most applicable. Or just guiding someone through and getting the information they need. But then you're still talking about a kind of digital-assistant type of thing? Yeah.
There are more use cases than that. I think there are three buckets that I sometimes split it into. One is basically content generation: copywriting, or maybe helping you with your email, something like that. Is that legal, by the way? Like, if it generates something, can that be plagiarism as well? Yeah, OK, so let's say you use such a model to, for example, write a blog post. These are language models.
They generate text, and some of that text might indeed contain an idea that was originally thought of by somebody else. So if you start to say, hey, that's my own idea, that's a problem. That's a problem, yeah. So you probably want to look things up, you don't want to take new ideas from there. But honestly, most people don't use these models for that. What I see mostly is that people have good ideas themselves, they can source material themselves, and they need help with, for example, writing the text in the desired tone of voice, those types of things. So content generation is one side. The second one is more search-related, like the query example from before: it's becoming easier to write your queries in natural language, or to find data in your sources in a less strict way, where you don't need exactly the right keywords to get your search results. And the third one is: with language models, maybe we're starting to get a glimpse of finally being able to get chatbots that actually work. Like, I've never had a good chatbot experience.
Especially not from a company. ChatGPT is fun to play with, but it's not really a service chatbot for a company or something. It does make me optimistic that maybe we'll get there at some point. But I mean, actually, I would really advise companies not to immediately go down that chatbot route, because I think that chatbots by themselves are kind of a UI failure.
If you need assistance, then something went wrong. This is also something that I saw in a different talk from the MLOps community. Linus Lee, he works at Notion, you know, the notes app. And he said, basically: good UI is obvious.
¶ Integrating different solutions
It doesn't make you think; it's intuitive. And when you're working with ChatGPT, you're basically working on an empty slate. You have to think about what has to be done, you have to describe how it's going to be done, and then maybe the model will give you something useful back. To give a slightly different example: lots of people are now putting code into ChatGPT to ask for help. You have to take your code, copy-paste it into some other tool, hopefully get something back, and put it back in. That's not really a nice flow. I mean, OK, I don't have to go to Stack Overflow anymore and search through five results, because ChatGPT is sometimes more helpful for me, but I still have to switch tools all the time. Wouldn't it be great if, in my IDE, the context of what I'm already doing was automatically loaded in, so it's taken into account when a suggestion is created? Where's my cursor? Which other tabs do I have open? What was the last test that I ran, that failed? Imagine all that context automatically being passed to the LLM before it suggests anything, so your flow doesn't get broken.
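As a rough sketch of that idea, here is what folding editor state into a prompt could look like. Every field and function name here is invented for illustration; this is not how Copilot or any real IDE plugin actually works.

```python
from dataclasses import dataclass, field

@dataclass
class EditorContext:
    """Illustrative container for the signals an IDE could feed an LLM."""
    current_file: str
    cursor_snippet: str                    # code surrounding the cursor
    open_tabs: list = field(default_factory=list)
    last_failed_test: str = ""             # output of the last failing test

def context_prompt(ctx, user_request):
    """Fold editor state into the prompt so the user never pastes code by hand."""
    parts = [
        f"File: {ctx.current_file}",
        f"Code around cursor:\n{ctx.cursor_snippet}",
    ]
    if ctx.open_tabs:
        parts.append("Open tabs: " + ", ".join(ctx.open_tabs))
    if ctx.last_failed_test:
        parts.append(f"Last failing test:\n{ctx.last_failed_test}")
    parts.append(f"Request: {user_request}")
    return "\n\n".join(parts)
```

The point is only that the signals are already sitting in the editor; the user shouldn't be the transport layer between them and the model.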
And that's what I mean when I say chat is kind of a UI failure: if you have to do all these things manually to put them in there, then you weren't able to capture all these other important signals as part of the product you're making. Yeah, exactly. It's like having two products next to each other, and you have to copy information from the one product into the other, get the output, and then do something in the first product again. Whereas if it's integrated, it has all the context of where you currently are, and it can use that in any suggestion it makes. Yeah. And going back to the Honeycomb example: they could have made a chatbot, chat with your data, ask for queries. But they already know that people want to search for a query, and they want that result shown somewhere so they can add it to their dashboards. You don't make a chat button for everything; you integrate it, constrained, into the place where you want it. And that's not just useful because it's nicer for the user, so you can make an optimized UI, basically; it's also easier to test whether it's doing
¶ Testing AI solutions
what it's supposed to do. Because if you have a free-range, general assistant that can do anything, good luck figuring out whether it's doing what it's supposed to do. But if you say, hey, we have this query tool that's supposed to make things easier, then we can see: are people using it more than the old one? Are the generated queries valid, as a baseline? Is the model returning valid query language, would it actually produce a result? And maybe, do people give a thumbs-up afterwards? Those are really useful metrics. I can think of those for a query assistant; I cannot think of them for a generic assistant. Maybe "do people keep using it" and thumbs-ups, but not the lower-level stuff that gives you a faster feedback loop. And that's really important if you want to grow faster. I hadn't thought about it like that before, but it's a problem with any toolbox, right? If it claims to solve every problem, then you're like, OK, when do I use this? Do you use it for every problem? No, because it doesn't really solve every problem. It solves a lot. But for tools that are very specific, you're like: OK, when I have this issue, I know exactly that this tool solves this problem. It's hyper-specific, and that makes it easier to show the value. Yeah. And I think, when you go back to all these companies now trying to adopt this technology, it's better to start with a use case where you know: OK, specifically, this is the flow, this is how things are going today, and here we see an opportunity to make things easier, cheaper, faster, better for the user. Then let's see if we can make that work, and let's see if we can get a feedback loop running.
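The metrics discussed here, adoption, valid-query rate, thumbs-up rate, are simple enough to compute from raw usage events. A sketch, with an event schema that is entirely made up for the example:

```python
def assistant_metrics(events):
    """Compute feedback-loop metrics for a scoped assistant feature.

    Each event is a dict like
      {"used_assistant": bool, "query_valid": bool, "thumbs_up": True/False/None}
    where thumbs_up is None when the user gave no rating. The schema is
    illustrative, not taken from any real product.
    """
    used = [e for e in events if e["used_assistant"]]
    rated = [e for e in used if e["thumbs_up"] is not None]
    return {
        "adoption_rate": len(used) / len(events) if events else 0.0,
        "valid_query_rate": (
            sum(e["query_valid"] for e in used) / len(used) if used else 0.0
        ),
        "thumbs_up_rate": (
            sum(e["thumbs_up"] for e in rated) / len(rated) if rated else 0.0
        ),
    }
```

None of the lower-level rates here would even be definable for an open-ended "chat with anything" box, which is exactly the argument being made in the conversation.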
And that's also what is mostly happening around me right now. Yeah, exactly. I mean the way I started using
¶ Slack GPT
ChatGPT was, when it came out, I made an account immediately. Whenever I had a question or wanted to play around and see what the tool did, I would log into their site and fill in my information; usually I would have to re-log in, actually. And then within Xebia, I don't know if you were involved, but we now have this Slack GPT plugin, and I love it. It's basically a Slack plugin that does the exact same thing, except when I see some stuff on Slack and I think, let me double-check if that's correct, or if this is how you would do that, I go to Slack, right? I open up my messages with Slack GPT, I ask my question, and the answer pops up. I like that. Plus, all my information stays more private, which makes a lot of sense in that way. Yeah, thanks for saying that. It was originally my idea, actually. And together with colleagues, with Kyle, with Ismail, we made the first version. And it's getting quite some use. I love it. Fun. Is it open source? We haven't open-sourced the repo. Maybe we just should, because there's nothing secret in there. I was gonna say, yeah, we should. I do think it's a cool initiative, and I talk about it to friends and they're like, can I use it? And I'm like, I don't think so, maybe not yet. Yeah. So if anybody wants it: if one person tells me to open-source it, I will.
In the comments, yeah. Good stuff.
¶ Sensitive information
It was also important for us to make that, because, one, it's really cool to see that all our colleagues are using it, and the positive feedback is really nice. But also because otherwise people go to chat.openai.com, where their data is being used to train models. Spilling your secrets. Yeah. And that's really not OK.
There was the Samsung case, famously: they put some sensitive data in there, and that's really harmful, because, I believe it was in the same news article, the company then said, OK, we're shutting down everything, no more gen AI for anybody. And that's maybe not the solution either.
So being able to have tools that you can use responsibly, that you feel safe and comfortable using, but also managing users' expectations well, that's really important. I'm curious what you think about that. You say that you use it; what do you think of the message that we send you every day, about what's happening with your data? I ignore it. But do you know what it says? I read it once, probably at the very beginning. I was like, OK, don't fill in any private information, anything that I wouldn't fill in on Google, anything that would be sensitive. I'm also on a project where there's a non-disclosure agreement, so any information about what we're doing and the way we're solving it, don't put it in there. And I was like, OK, that all makes sense. But usually, when I'm programming in Go and I have to do stuff with bytes, I'm like, OK, how do I bit-shift this? Or: if this is my input and this is my output, how do I get there? And it creates a function, and sometimes, to my frustration, it's wrong, over and over and over again. And then I'm like, we're not there yet, I'll just write this myself. But most of the time it really helps me speed up. Like, I had this huge switch statement, for example, because we have quite a complex data model and it was from protobuf generation.
We had an enum, and let's say the enum had like 30 fields, and I had to make both mappings: from a specific switch statement to one, and then the reverse. And I'm quite handy with my IDE, like I can duplicate my cursor, I could do it. But then I was just like, OK, let me try this, right? Just with a dummy example: would it do that? And it did. And then I was like, OK, this is my spec, can you implement all of these? So I didn't have to do it manually, and it did everything. And I was like, OK, this is quite nice. It saved me a bunch of time. Yeah.
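The task in that anecdote, a generated enum plus a forward mapping and its reverse, is exactly the kind of mechanical boilerplate being talked about. In Python terms, with an invented three-value enum standing in for the 30-field one:

```python
from enum import Enum

class Status(Enum):
    # Illustrative stand-in for the proto-generated 30-field enum.
    UNKNOWN = 0
    ACTIVE = 1
    SUSPENDED = 2

# The forward mapping you'd write (or have an LLM write) entry by entry...
STATUS_TO_LABEL = {
    Status.UNKNOWN: "unknown",
    Status.ACTIVE: "active",
    Status.SUSPENDED: "suspended",
}

# ...and the reverse mapping, derived mechanically instead of retyped,
# which is where the duplicated effort disappears.
LABEL_TO_STATUS = {label: status for status, label in STATUS_TO_LABEL.items()}
```

In a language without dict comprehensions, like the Go in the anecdote, both directions have to be spelled out, which is precisely why generating the second half from a spec saves real time.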
¶ GitHub Copilot, Sweep AI and new ways of coding
Cool, yeah. Do you use Copilot, stuff like that? I haven't dabbled with Copilot yet; I don't know, I just haven't tried it. I feel like I'm still super effective, so I was like, OK, would it make me even more effective, or is it going to be distracting, generating or suggesting stuff that I don't want to use? So I very consciously step out of something, very specifically ask certain questions, and then integrate the answer back and get back to it.
Nice. What about you? Yeah, so I use Copilot in my IDE sometimes. I like it for the autocomplete, basically. I would really like it if I could also chat with my context, like what I said before: a chat window that already knows the context, that's what I think I would really like, so I could stay in the flow more. And what I came across, and what I'm actually curious about, is how this is gonna unlock new ways of coding, basically. So there's this product I came across, it's called Sweep, Sweep AI or something like that. And basically what they say is: you write an issue, and in the background an LLM makes a pull request.
Yeah, I've heard of that. So if you just write, hey, I need to fix this function, or write the reverse mapping, whatever, like you said, a simple task, maybe it will be possible. And I think that's interesting, because it shows how the workflow can change. I'm not saying it's good enough today, but you see the progress that's being made, and I think we're gonna get there. And I'm really curious how the way of working is gonna change because of it. Yeah. I talked to a guy, a YouTuber, and he loves to do research. When ChatGPT came out he was skeptical, and then he did a lot of research, and then he was like, OK, this can change things in exactly the way that you say: you have kind of this AI team member that can pick up something that maybe a junior team member would do as well. And you can still review it and be like, OK, no, this is not really upholding our conventions in this case, or this is actually wrong. And then it can still fix that, and the output is tremendously faster. Yeah.
¶ Way of working and quality control
What I think is interesting is: let's say it goes that way. For you to be able to review whatever this AI thing creates, you have to have the knowledge. So then, are you going to hire more junior people and educate them to get to that level? Or are you going to solve the easier stuff with AI, and then it becomes a big hurdle for people to get into this field, where they need, let's say, more senior-level knowledge? That scares me, because that means the pipeline is gonna be a bit finite, I feel like, unless people use gen AI to educate themselves faster. But that's a hard thing; you need more hands-on experience, I feel like. Yeah, I think that's interesting. I'm not scared about that, because of the dynamics. OK, so let me put it this way: were you scared when auto-formatters came? No. And pre-commit hooks? No. Those types of things.
I think this is kind of the same thing, but maybe a little bit further, right? Could be. And junior developers, you could still get them. Maybe you get a meta thing where you have, let's say, Sweep, some tool that makes automatic pull requests on issues if they're small enough, and then the challenge is: OK, I have senior team members and they can totally review that code, but the junior team members who would maybe have written that pull request now don't have anything to do. So why would I hire them? And then you get into the squeeze of only having senior people, but no new senior people coming up, because the juniors don't get to grow.
I think in practice, at least if that happened to my team, I would tell the juniors: OK, you take a first pass at reviewing that code, I'll take a second pass. Stuff like that, I think, would happen. People always adapt their way of working. I mean, in car manufacturing plants, people did stuff manually, then stuff got automated, and people moved more into quality control. Those people also still had to
¶ AI will change the way we learn
be trained. So I think it will also change how people get trained. Like, I did learn how to invert matrices by hand, but never really did it afterwards, because it's really painful. So I never needed it, but I still learned matrix algebra, and because I didn't have to do matrix inversion by hand, I could go way further into all these things. And I think that will happen here as well. Junior developers who have access to all this tooling can maybe learn software engineering and systems design faster, get to the higher-level concepts of how to make things work together in the big picture, because they don't have to worry so much about the relatively low-level stuff. Yeah, I like that. So their role would change, and probably the way they learn would also change: maybe less hands-on and more quality control.
I think that's a good example: doing less manual work and more quality control. But still, I feel like, because the field is quite broad and there are a lot of knowledge aspects, right, you can go deep in a lot of areas, I do think it's going to be a bigger hurdle getting into the field even now. I mean, I'm on subreddits a lot, and people also reach out to me like: OK, how many projects does my portfolio need? And I have a real hard time answering that, because I don't have a portfolio. Like, yeah, I can give a number, but then someone's really gonna go for it and be like, nah, seven didn't do the trick, and I'm gonna feel bad. There's no cookie-cutter mold yet. People do struggle to kind of
¶ Distinguish yourself when applying for jobs
land their first job, for example, especially if they don't have the educational background, if they've done something else. And I feel like if an AI tool already solves the easier problems, then it might be a bigger hurdle for them to get into the field as well. Yeah, because the skills that maybe distinguished them before are not gonna distinguish them anymore. But that happens all the time, right?
Fair enough. Yeah. So then the creative question is: how do you distinguish yourself in that new world, where maybe the basics are not special anymore? I think that's right. Five years ago, seven years ago, it was really special if you could train a machine learning model. That's not special anymore. So you have to distinguish yourself in different ways. The way I think about it, in the big picture, if I can get a little bit hand-wavy, is that let's say
¶ Three types of knowledge and AI
there are three types of knowledge: factual knowledge, knowing how to bit-shift or whatever you were talking about before; application, knowing how to apply things, how to write code that works; and wisdom, knowing why something is a good idea or not. A simpler example that I've given to other people: factual knowledge is knowing that a tomato is a fruit. Application is knowing how to make a salad. Wisdom is not putting tomato in a fruit salad.
And what I think is that with generative AI specifically, we're getting to the next step. With the Internet, factual knowledge became less distinguishing. Knowing all the capitals of the world? Not so special anymore, it's at your fingertips. Knowing the whole map of New York as a taxi driver? Less important, because you just have Google Maps. Stuff changed: knowing how to apply things became more important.
Driving around well became relatively more important than knowing all the streets. Now, with gen AI, it can maybe write more code, do more things for you, the automatic driving as well, and it becomes relatively more important for people to have the wisdom: knowing whether something should be done in the first place, or whether the thing is well done or not. Quality control, curation, evaluation, all different words for that type of thing.
So going back to your question about being a new developer wanting to make a portfolio: maybe what we'll see as a trend is that it becomes more important to show that you have high-quality judgment, of what is good and what is bad, what should be built and how it should be made, rather than showing that you can produce beautifully formatted code. And I think that's actually pretty optimistic.
That's quite interesting, because that means we need people with good quality judgment. Yeah, I'd like to see more of that in the world. So I agree with what you say.
¶ Learn by doing
But judgment, for me, and maybe this is just how I learn, judgment really came when I got hands-on experience, when I saw that, OK, this thing that we did was not as effective as the next thing. Or when I moved from project to project and saw, OK, we actually have fewer processes and we're going faster, so processes are not always good. Those things you learn from actually experiencing them. And I feel like it's hard: if the application or an AI solution knows best practices and really good fundamental knowledge, the basic stuff, and you come in more as quality control, then you're gonna go off hearsay, or you're gonna experience it in a different way. And maybe it's just that I can't really foresee how that would go, and I only have my own experience. Yeah, OK, maybe. I think there are two sides to that.
So one is about how to learn things, right? I don't know if you had graphing calculators in high school? You were not allowed to use those until a certain point, because you had to learn the foundations first. I think we're gonna get the same here. Second: because we had those, you could go faster and further in high school than people did before. So that's the topic of learning; I think that will stay.
I think that will be there. The other part is about when you
¶ Doing more as a single person
say evaluation — when I say that wisdom aspect of what we have to do, it's not just about quality control, it's also deciding what exactly should be built, and the why. The why, and for example designing, figuring out if something is a good use case, something worth building. Yeah, I think that's it. And then to go back to the use cases: I think that today it's easier than before, as a single person maybe, to make a web product that does something really neat, yeah.
For example, auto-transcribing YouTube talks: you go from a YouTube video and automatically take from the video all those frames that are actually different slides from a conference talk, use Whisper or something to get the transcript out, and automatically go from YouTube to blog post, interspersing the slides with the transcription of what the speaker says. Yeah. Five, seven years ago — even two, three years ago — that was pretty hard to make. Nowadays, with Whisper plus this type of large language model to maybe format it a little bit — Simon Willison wrote a cool blog post about this — this is pretty makeable. I did it myself, actually, yeah. And if you talk about portfolio stuff: all of this is possible right now. So I think this is actually a really good time to make a cool portfolio showing that you're creative with how to use these tools.
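The YouTube-to-blog-post idea described here boils down to two timestamped streams — slide frames and transcript segments — merged in time order. Here's a minimal sketch of that merge step; getting the frames and the Whisper transcript is assumed to have happened already, and the input shapes are my own simplification:

```python
# Sketch of the "YouTube talk -> blog post" pipeline: once you have
# slide frames and Whisper transcript segments, each with timestamps,
# producing the post is just an interleave by time.

def interleave(slides, segments):
    """Merge slide images and transcript text into one markdown string.

    slides:   list of (timestamp_sec, image_path)
    segments: list of (timestamp_sec, text)
    """
    events = [(t, "slide", path) for t, path in slides]
    events += [(t, "text", text) for t, text in segments]
    events.sort(key=lambda e: e[0])  # order everything by time
    lines = []
    for _, kind, payload in events:
        if kind == "slide":
            lines.append(f"![slide]({payload})")
        else:
            lines.append(payload)
    return "\n\n".join(lines)
```

So a slide captured at 0:00 lands before the paragraph transcribed at 0:05, and so on through the talk.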
Yeah, it's like we have a lot of building blocks, and rather than figuring out how to create this building block that does this thing, you can use them and hook them together and create something that solves a problem or just show something cool. Yeah, and that is easier than I think it's ever been. What did you make? So I suck at writing summaries
¶ Patrick generated podcast summaries
for my podcast, and I've stopped doing it. I just do the outline, which is the timestamps and what we talk about. So my idea was, on an innovation day: OK, if I have my audio file from the podcast, I wanna give that to Whisper and it's gonna give me the transcribed version, and then I'm going to feed that into ChatGPT and say: OK, this is the podcast, this is the context, can you write me a summary? And it did the thing, it just
took a long time. Also, I didn't want to pay, so I had a limited amount of characters and stuff like that. But this was early on, and a summary came out, and I was like, this is actually fairly doable. If you tweak this, or use it as a base template, you can do a lot of cool stuff with this. It was my first stab at playing around with Whisper AI and stuff like that. Cool, yeah. If you need access to a bigger model, like
32,000 tokens — that's like 50 pages of text. Yeah, we have that. So okay, let me know, I'll do a continuation. Yeah. We need to do a Slack GPT. I think so. Yeah, that was before Slack GPT was there. And it wasn't really that I was actually going to use it. It was more like, let's see what is possible. Yeah. Like, how far along is this? It was an idea. And then we actually did it and I was like, this is some cool stuff.
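The character-limit workaround Patrick describes is usually done as a map-reduce: split the transcript into chunks that fit the context window, summarize each, then summarize the summaries. A sketch, where `call_llm` is a hypothetical stand-in for whatever API you actually use:

```python
# Chunked "transcript -> summary" flow for when the whole transcript
# doesn't fit in the model's context window. `call_llm` is a made-up
# placeholder for a real API call (ChatGPT, etc.).

def chunk_text(text, max_chars):
    """Split text into chunks of at most max_chars, breaking on spaces."""
    words, chunks, current = text.split(), [], ""
    for w in words:
        candidate = (current + " " + w).strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)  # current chunk is full, start a new one
            current = w
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def summarize(transcript, call_llm, max_chars=8000):
    """Map-reduce: summarize each chunk, then summarize the summaries."""
    partials = [call_llm(f"Summarize this podcast excerpt:\n{c}")
                for c in chunk_text(transcript, max_chars)]
    return call_llm("Combine these into one summary:\n" + "\n".join(partials))
```

Passing `call_llm` in as a function also makes the flow easy to test without paying for API calls.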
Yeah, cool. Nice. Exactly. When it comes to, I want to jump
¶ Easy to make something with LLMs
back into organizations, because let's say before Slack GPT or ChatGPT, we had organizations that wanted to make data-driven decisions, right? But before you can make data-driven decisions, you have to have data — you have to have collected data, you have to have a history of a lot of data already. So when organizations wanted to do that, sometimes they saw that okay, if we want to do that, we can only do it months down the line, because now we're going to start collecting the data we need to be predictive in that way. Are organizations ready to adopt large language models, you feel — or LLMs rather — or are there still challenges with integrating them into the organization and then a product in that way?
There are definitely challenges. Are they the same or are they different? Yeah. So, okay, some things are the same. You need to have your basics in place to be able to get access to your own data and all those things. But I think it's more interesting to talk about the differences. So one difference is that before, getting to your first model that you could maybe somehow integrate into a product, service, or decision-making process
took you quite some effort: get the data, make sense of the data, train the model, host it somewhere as an API, or maybe set up some pipeline as a batch process. Yeah. And it would have to be done by a data scientist, somebody who knew how to do those things. Now, with these large language models, to get to your first thing — I'm not saying this is how you should do it, but you
can do it this way — you can go to OpenAI, or to GCP with their models, use the PaLM model, whatever, write a prompt, and you have an endpoint that basically gives you the thing back. So the effort for that quick win of seeing the first result, that has gone down by a lot, even if that's not how it works in reality. So that's one difference. But the other difference is that,
¶ Long term LLM challenges
well, it's so easy now to make something like a first result; making it actually viable and desirable for the long term — there are some different parts there. For example, costs: those large language models that you can use as an API are quite expensive. Okay. So you have to really think: okay, is this really a value-add for my service? Is it worth the benefit? Latency: these models are quite slow to respond. Like, these are huge models.
It's really impressive how fast they can return things. But compared to a recommendation model just giving you, OK, here are the five next things you should watch on your favorite streaming website — they're much slower. And maybe you've heard of it: websites like Booking and Amazon, they do studies of how much a, let's say, 100-milliseconds-slower website means for their sales, and stuff like that. Yeah, it's insane.
But then if you look at the response times of these large language models, you say, okay, this is not going to work real-time. No. So that's already a case of: how are we going to solve that? That's different, because your own models are usually way faster. And then there's the question of how you actually make sure that one version of your LLM system is better than another.
Because you have your prompts, and maybe some other settings you can pass to the API, like temperature. There are all those little things you can tweak to change what you get back as a response from the model. But being able to really test that A is better than B is quite tricky, and the reason is that it's really hard to get good
metrics for this. Okay, maybe to go back to the more traditional predictive AI, machine learning type of things we've been doing for a long time. Let's say spam classification, or whatever. What you usually do is get lots of example emails that are spam or not spam, with labels: spam, not spam. Then you can train a model that predicts, for a new given text, is this spam or not. And then you can see how accurate that model is on a held-out data set.
OK, and because it's just ones and zeros, you can compute accuracy, precision, recall, all this stuff. But now let's say we're making a tone-of-voice assistant that can turn any text into the tone of voice of your favorite company. OK, with input text, and now we get output text. How do we judge that? What metrics do we use for how good it is? That's quite tricky. And then, OK, well, we're going to have people label this.
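The spam example works precisely because the labels are binary, so the standard scores fall out directly from counting. A minimal version of what libraries like scikit-learn compute for you:

```python
# Evaluation for the spam-classifier case: with binary labels on a
# held-out set, accuracy / precision / recall are simple counts.

def binary_metrics(y_true, y_pred):
    """y_true, y_pred: lists of 0/1 labels (1 = spam)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return {
        "accuracy": correct / len(y_true),          # fraction right overall
        "precision": tp / (tp + fp) if tp + fp else 0.0,  # of flagged, how many spam
        "recall": tp / (tp + fn) if tp + fn else 0.0,     # of spam, how many caught
    }
```

Nothing like this exists off the shelf for "is this output in our tone of voice?" — which is exactly the gap being described.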
OK, sure. You're going to get people to label the tone of voice of the output text. Great. And now we have a new model — are you going to let them also label all that text? How are you actually going to manage that? Which one do we pick? And do you have a good process for letting people evaluate the tone of voice of your company — like training annotators, thinking about stuff like that?
That's just a whole lot harder than previous business cases, at least the predictive cases where you have labels that are maybe generated naturally — people marking yes spam, no spam, whatever. This type of stuff is usually a lot harder to evaluate, and I think that's really an open challenge, because there was a right and a wrong in most cases, except now it can be more opinion-based, right? If it gives you a tone-of-voice thing, a lot of marketers are
going to have an opinion. But if both texts are in line with the tone of voice, then both could be correct. So then you have to pick which one is actually better. Yeah, and I don't know if it actually matters — if both are good, does it matter, and is there going to be a significant difference between them? But when we're talking about measuring, it is a hard thing to measure which one is actually better. You'd have to A/B test that,
¶ Metrics to focus on for LLMs
I guess. Yeah. And that's the challenge. So I like to think about three types of metrics. There are success metrics: stuff that you really see once you're in production, like the feedback from the user — really the things you care about. Then you maybe have driver metrics, which are the things that you as a development team can see going up or down, maybe without going live. Yeah. And you know that they're kind of related.
So for example, if my spam classifier gets more accurate, then probably we'll get more users, or people will be happier using my email service. And then you have guardrail metrics, and those are the metrics that you don't necessarily want to go up — they should just not drop below a certain level. Yeah. And I think with large language models, what I find relatively easy is the success metrics. Stuff in production, okay:
Do people like the service? Do I get thumbs-ups on results? Did it help them? Guardrail metrics are also kind of OK, because you can have, let's say, checks afterwards: hey, is this text biased, toxic, whatever — there's some relatively standardized stuff for that. Or, for tone of voice for example: does it have any of these forbidden words? Okay, scratch those. That's okay. But that's a really low bar to pass.
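The forbidden-words guardrail mentioned here is about as simple as checks get — which is also the point about it being a low bar. A toy version, with a made-up blocklist:

```python
# Post-hoc guardrail check: before returning generated text, verify it
# contains none of the forbidden words. The blocklist below is a
# hypothetical example of brand-voice rules.

FORBIDDEN = {"cheap", "guaranteed"}

def passes_guardrail(text, forbidden=FORBIDDEN):
    """True if no forbidden word appears in the text (case-insensitive)."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return not (words & forbidden)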
So those are more the guardrail metrics. But that middle part — being able to iterate on your system without going live — I think that's quite hard in cases where, indeed, as you say, there are multiple ways to be right, and it's ambiguous between people. Yeah, that's a bit of an open challenge, I think. Yeah, yeah, I'm interested to
¶ Rens's greenfield LLM implementation step by step
see where it goes. I mean, when you were laying out the differences — LLMs being really fast to get up and running, right, because you have something you can query and you get a certain output, though latency might be longer — you also mentioned that this is probably not the right way to do it. If you had a clean slate and an open field, how would you implement it then? Would you create something yourself, or how would you go about it?
Yeah, there it also depends on the use case, and the big qualifiers are: what data are you using? And what agreements do you have with, let's say, a large language model provider on privacy and all of those things? Major cloud providers have stuff like: okay, your data never leaves your cloud perimeter. So as long as you're okay with storing your stuff on their cloud, you should probably also be okay with putting it in their large language model. Okay. That's nice.
Sounds like a good deal, yeah? But some companies have data that they are not comfortable putting on their cloud service provider, and maybe they still want to use LLMs on it, for example. So that's one big qualifier. But maybe we can park that one for now. What I would do if I could have a clean slate, say a greenfield project: I would probably start with an open source — no, sorry, not an open source one, a provided model from a cloud provider or from OpenAI.
Use that as an API just to get it up and running quickly. And from there I would become very rigorous about collecting feedback, finding a way to test — again without going live — whether I'm making it better or not. And make sure that those improvements really translate to things getting better live, not just in a sandbox. Yeah, not just in a sandbox. And once I'm capable of testing those improvements,
then it's looking for where the bottleneck is, right? That's classic product development, basically. And that can be: hey, maybe it's too slow — okay, then we can try different things. Is the quality maybe not okay? Well, then we're going to try other things. Yeah. And I really do believe that being able to start quickly is the big benefit; it being hard to test, and knowing how to iterate on your product like this,
that's the hard part. Those are open challenges, yeah.
¶ How to leverage open source LLMs
And so it's interesting — let's say for the real machine learning expert colleagues that I have — what open source models can do to take away some of the hurdles you have now with just using a model as an API. Because you can use smaller models: once you have gathered your data, you can take a model that's maybe a factor of ten smaller and train it to specialize for your task, and that's going to make it faster, no?
It's gonna make it faster, yeah. And also cheaper, because you're not using a super expensive commercial endpoint anymore. Yeah. It's also going to make it easier to deal with your data issues, for example. Yeah. And so I think that's pretty interesting, I could imagine. I like the comparison you made with the standard way of product development, right? Testing and iterating as fast as possible and seeing if it gains adoption.
If it has the benefits that you envision, right? If the benefits outweigh the costs in that way as well — and even live, doing the same, iterating over and over and making it better. I like that you put open source in there as well, that it is an option.
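A common way to bridge from the expensive API to the smaller specialized model described above is to log the big model's prompt/response pairs as you go, so you later have training data for fine-tuning. A sketch of the logging half; the JSONL shape is a common convention, but the exact fields a fine-tuning framework expects vary:

```python
# Collecting training data for a smaller specialized model: every call
# to the commercial endpoint is also a free labeled example. Storing
# (prompt, response) pairs as JSONL is a common fine-tuning input
# format, though field names differ per framework.
import json

def to_jsonl(pairs):
    """pairs: list of (prompt, response) tuples collected from the API."""
    return "\n".join(
        json.dumps({"prompt": p, "completion": r}) for p, r in pairs
    )
```

Once enough pairs are collected, a much smaller open source model can be fine-tuned on this file to take over the task.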
¶ Monopolising organisations gathering data for AI
Because I was wondering what your take is, now that you have these quite substantial large companies — yeah, they kind of own these large language models, and it is not actually open, right? Is that an issue that you foresee? Yeah, there are multiple issues there. Right, right. There's cost, latency, it not being transparent how it was
created, or whatever. So if you are in a situation where you need to be very transparent about how your system works, and one of the components in your system is an entire black box created by a company that doesn't even say what data the model was trained on — yeah, good luck explaining that. So those are definitely issues. And then also being able to tune this, that's like what I said
before. I think that's a big one. Yeah. A lot of people I talk to were more talking about the ethical side of it, right? That if you truly want this technology that's going to transform a lot of fields, then the best result would be for everyone to kind of use collective intelligence in that
way, right? And if there's backing, and there's a lot of capital — capitalism — and it's driven by that sort of mindset, it might steer into a direction that would not be the same as if it were open, let's say,
for everyone. And I feel like that's a hard one, because we're never going to know this other variant, unless some open source variant comes out which has the same backing, the same financials I guess, but is not driven toward the same financial gain. Yeah. I think we see this with more technologies, right?
I mean, like phones: you have the closed versions, and people making the Fairphones and whatever, which are more hackable. So I think those things are important. I'm less concerned about, let's
¶ Big companies play unfair in the AI playing field
say, companies making these models and making them available, because they have an incentive: they make something great, a service people can use. What I'm more concerned about is, one, whether everybody's playing by the same rules. That means, on the one side, for example: a big company can scrape the Internet, take all this data that nobody gave them permission to use, and just do it anyway — and then afterwards they start lobbying,
for example, to stop this exact practice, because they already have it. Or the other thing: that it's infeasible for open source collectives to spend millions on training a gargantuan model. Yeah, I'm less concerned about that, because I think, well, if it's really that valuable, then a second competitor will step up and try to out-compete the first one. If the rules of the game are fair,
then that's okay. But the challenge is that that's not the case. Like, you know with OpenAI, they scraped the Internet and everything. And Copilot has issues with GPL code being in there — if you can generate that code, that's not okay. So those are the things I am more concerned about on the
¶ Generative AI steals art from other artists
ethics side. The playing field is unfair. Yeah, yeah. There's a nice example on the image side of that, where these huge models like DALL-E and Stable Diffusion are trained on lots of images from the Internet. And then artists are really upset, because these models are really good at copying their style — the style that they spent years honing as a craft. And now, with the click of a button, somebody else can just steal it. I understand they are upset.
And I also really like one example. The point is, they're suing, but it's really hard as a small artist to make a dent, to make a stand. And there was this one initiative where people started making images with Mickey Mouse using Stable Diffusion, doing awful things — with a hatchet and blood and everything. Mickey has done it again. And the idea was basically to bully or tease Disney
into suing these companies that are making those models, because Disney is really good at suing companies that infringe on their trademarks, basically. Yeah. And I'm really curious how this will play out. For me personally, I think it's really important to make sure that your own use case doesn't return those results — so to have good monitoring, testing, all those things. But
I'll also be honest that for me, the temptation to use the technology is too large to say no — the possibility that some of the generated code might include GPL code isn't going to totally stop me from using it. Although within Xebia we do have Copilot, and there we do have, for example, a post-hoc filter: if this looks like code that's actually in known GPL databases, then don't return it. And that's also a way to
mitigate it, basically. Yeah, it's a gray area for sure. Yeah. And I feel like — maybe it's because I haven't seen something like this before, but for me it's never been like this. I feel like the adoption went really fast, and it's quite widespread. A lot of people know about it, at least the people I talk to. Maybe it's my generation, more so. But in any case, I feel like it's also very accessible.
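The post-hoc filter mentioned a moment ago — don't return a completion if it looks like known GPL code — can be illustrated with token n-gram overlap. This is a toy; production filters are far more sophisticated, and the threshold here is an arbitrary choice:

```python
# Toy version of a post-hoc license filter: block a completion if it
# shares too many token n-grams with a known-GPL corpus. Illustrates
# the shape of the check, not a real implementation.

def ngrams(code, n=5):
    toks = code.split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def looks_copied(completion, known_gpl_snippets, n=5, threshold=0.5):
    """True if enough of the completion's n-grams appear in any snippet."""
    comp = ngrams(completion, n)
    if not comp:
        return False  # too short to judge
    for snippet in known_gpl_snippets:
        overlap = len(comp & ngrams(snippet, n)) / len(comp)
        if overlap >= threshold:
            return True
    return False
```

The key property is that it runs after generation: the model is untouched, and the filter just decides whether the result may be shown.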
You can show it to other people. I showed it to my parents, showed it to my aunts and stuff, and they were all like, this is interesting — can you do it in my language? That was the one I got: do it in Turkish. And I was like, I'll give it a try, and some stuff came out, and they were like, nah — it's not as good as the English version. Yeah, which is fair. But because it is so widespread, all of these gray areas simultaneously pop up, like the
artist example. I mean, that's a painful one for the artists, right? Those are the consequences — you could see that coming. But then, what can you do about it? I feel like also, because the
¶ The future of AI looks bright
possibilities are so cool — there are a lot of possibilities we've never thought of before, or that we always thought were possible but are now more at our fingertips. Stuff is moving fast, and because of that you're going to have a lot of gray areas. But hopefully the end result — the things we're going to build and implement — is going to be worth it. That's kind of my hope. Yeah, I think I agree.
It's a tool. You need to use it in the right way. I mean, a hammer in the wrong hands is also not a great idea. Yeah, yeah, for sure. Cool, man. This was a lot of fun. How did everything go, you think? That was nice. Thanks for coming on, man. This was a lot of fun. I'm gonna round it off here then. Renz Dimadal — I'm gonna put all his socials in the description below, check him out, let him know you came from our show.
And with that being said, thanks for listening. We'll see you in the next one.