Become a LLM-ready Engineer | Maxime Beauchemin (Airflow, Preset)

May 14, 2024 · 41 min · Ep. 37

Episode description

If you’ve worked on data problems, you probably have heard of Airflow and Superset, two powerful tools that have cemented their place in the data ecosystem. Building successful open-source software is no easy feat, and even fewer engineers have done this back to back. In Part 1 of this conversation, we chat about how to adapt to the LLM-age as engineers.

 

Segments:

(00:01:59) The Rise and Fall of the Data Engineer
(00:11:13) The Importance of Executive Skill in the Era of AI
(00:13:53) Developing the first reflex to use AI
(00:17:47) What are LLMs good at?
(00:25:33) Text to SQL
(00:28:19) Promptimize
(00:32:16) Using tools like LangChain
(00:35:02) Writing better prompts

 

Show Notes:

- Max on Linkedin: https://www.linkedin.com/in/maximebeauchemin/
- Rise of the Data Engineer: https://medium.com/free-code-camp/the-rise-of-the-data-engineer-91be18f1e603
- Downfall of the Data Engineer: https://maximebeauchemin.medium.com/the-downfall-of-the-data-engineer-5bfb701e5d6b
- Promptimize: https://github.com/preset-io/promptimize

 

Stay in touch:

👋 Make Ronak’s day by leaving us a review and let us know who we should talk to next! [email protected]

Transcript

Being a SQL monkey is probably not going to cut it anymore when AI is a better SQL monkey than we are. The thing that's lacking is the executive skill and the memory, the long-term memory and the business context that are, for now, private from the LLM and need to be squeezed into a context window for it to make sense and be useful. It's been kind of a learning journey, because at first I was just trying things and it just, like, doesn't work. And I was like, yeah, this LLM thing, man.

It's all hyped, like, shit doesn't work. And then it took me a while to realize, like, okay, I'm actually just really bad at prompting. It's kind of like Googling back in the days, right? Like, if you don't do the right keywords, the result is not super great. So for you, right? You're doing like 8 to 10 prompts every day. Like, did you see that gradual, like, improvement in terms of results for yourself? And like, how do I get better at this? Like, I do want to get better. On LangChain, I think it's

really interesting because, like, when I found it, I was like, I don't understand why this exists. Not because I don't understand the problem space.

I got familiar with the problem space. I was like, oh yeah, this is like everything I need. This is super great. But then I started to try to use it. And then I was like, oh, it does kind of what I wanted to do, but not exactly. And then I cannot use the methods that are here exactly in the way I want to use them.

And then I just want to use them. Welcome to the Software Misadventures podcast. We are your hosts, Ronak and Guang. As engineers, we are interested in not just the technologies, but the people and the stories behind them. So on this show, we try to scratch our own itch by sitting down with engineers, founders, and investors to chat about their path, lessons they have learned, and of course, the misadventures along the way.

Welcome to the show, Max, super excited to have you here. Well, excited to be on the show too, and excited to catch up with the episodes you have so far too, so I'll make sure to, like, catch up on that. Thank you. Excellent.

So just getting right into it. At the beginning of 2017, you wrote this post called The Rise of the Data Engineer, which both helped define the role as well as bring more attention to it. I was a data engineer, you know, when that came out, and was like, oh my gosh, this is, you know, what it's all about.

But then at the end of the same year, right, you wrote a sequel to this, The Downfall of the Data Engineer, which summed up pretty much all my struggles also as a data engineer at the time. So I guess, what led you to kind of write a sequel? Well, it's like what led me to write the original too. So, trying to get back in the context at the time: when I left Facebook to join Airbnb, I think that was in 2014.

And internally, they were still calling themselves, I think, ETL people, like ETL engineer and business intelligence engineer. And I was coming out of Facebook, where I think we had started calling the team the data engineering team. And to me, I came out of Facebook after two years there just thinking differently about my role and about the industry and who I wanted to be and what I wanted to do.

And I wanted to make a chasm with the past, like, just basically, I don't want to use these GUI tools anymore. I want to do, you know, pipelines as code, move away from the GUIs, bring some of the concepts of software engineering into data processing, data engineering, and serving people with data in general.

And I think I wanted to take a strong stance for that. And internally at Airbnb, even saying, like, oh, if we do job postings externally to go and hire people, we should, you know, put up a job posting saying data engineer. But then people are like, what is a data engineer, you know, what does that mean? So I think I decided to write the blog post. I think I had read, maybe, and we should dig out that post.

But I think there was a similar post called the rise of the data scientist coming out of someone internally at Facebook. Yeah. I see. I see. I didn't know that. So, yeah, which was similarly kind of declaring like, hey, there's a new role. It's disruptive. It's fun. It's, you know, it's, it's new and exciting.

So I wanted to do something similar for data engineering. So that's where it came out of. And then, you know, I think personally I was struggling with the role too, where I wanted to go even further than, you know, being the data engineer and be more of a software engineer and a tool builder. And then I was like, oh, here's what all the problems and challenges around the role are, and maybe that's what we're going to need to break through to make this kind of fun or successful.

Or, you know, this is the reason why I don't want to be a data engineer anymore, maybe, you know, a mix of the two. And I think, like, you know, we've done returns on that post. It's called The Downfall of the Data Engineer, and it's interesting to revisit it, you know, year after year with practitioners to this day.

Like, is that still an issue? Yes or no? Because there's probably like five or six things in there that are like, ah, this is where it sucks to try to be influential in that role or to be successful in that role. What was the reception like, especially I guess after the downfall one? Did people, were people like, oh my gosh, yes, let's solve these problems, or what was that like?

Yes, it's interesting, you know. Like, you write a blog post, it's as if you walk up to a microphone in an empty room and you say things. And then some people might mention it in a podcast five years later. So the reception was not like, oh, there were hundreds of people at my door the next day, like, ringing the doorbell, trying to get interviews.

So, no, so I think, I mean, people reacted. You know, if you think about the people that read and review this stuff: usually when I blog, I think this one was mine, it was not on behalf of a company, whereas blog posts I've written before were under, let's say, the Airbnb or Lyft or Preset umbrella. So those get reviewed, they get a little bit more attention and review.

And certainly kind of just peer review. On the peer-review front, I think people agreed generally, and I think it resonated over the years. I've heard people saying, like, hey, I read the posts and they really resonated, similar to what you said. There have been a handful of times where we've done either podcasts, or we did an article with Monte Carlo where we did the return on the downfall.

Like, is that still an issue, or have we, you know, moved forward? And for me, it's never clear, like, oh, is it my experience, how generalizable is this, is it the same at all organizations? Anecdotally, maybe I talk to, you know, like 30 data engineers a year about these struggles, you know, here and there. But I guess it's hard to say, like, oh, is that universal or is that limited to my experience?

I remember there was a joke about, like, oh yeah, you know, data science is data engineering until you have the data. So I know you got your start in, like, BI analytics and things like that. Have you thought about, at some point, just, you know, giving up and going to do data science, which has, like, way more coverage and sort of support from leadership and things like that?

Yeah, and I think it was called, like, the sexiest job in America for, like, five years, you know. I was like, ah, that sounds kind of good, I could work on that. No, but no, actually not really. And I don't know, I guess, like, the big difference was AI and ML, right? Like, that was a really exciting thing and that was the draw for a lot of people.

And in retrospect, I think it was and is, but, like, with generative AI now, I think some of the skills learned in that era are, I think, useful and transferable. But yeah, I think, like, the draw for me was more software engineering in general than data science. I don't really know, I think maybe it's the potential impact.

It seemed difficult to have, like, huge impact as a data scientist. And then there's always, like, the data science thing where people wanted to go into it to solve problems using, you know, ML and AI, and then there were just kind of data analysts that live in San Francisco who wanted to call themselves data scientists. I mean, that's an old joke: what is a data scientist? A data analyst living in San Francisco.

Yeah, I've heard that one before. Yeah, I mean, clearly, I think, I'd say at Airbnb we had a data science team of like a hundred people at some point. And I think a lot of them were doing, you know, a lot of what data analysts would have done or what analytics engineers are doing today.

Either to support, you know, the kind of stuff they wanted to do in data science, because, like, 80% of the work is data wrangling and then preparing the data, as, you know, is kind of well known. Or because it's what the company needed. And at some point, because there's a limited, maybe, number of problems you can apply ML to.

And if you want to work on, like, actually creating models and doing that kind of stuff, it seemed like there was a lot more impact to be had in terms of doing very basic data science and applying it at scale. So that's more like data science engineering, you know, or data science infrastructure type stuff, which is different skills.

I think over time, with almost every engineer, or even data scientists at this point, I've seen people move down the stack one level. It's like, I've done enough of that, let me now build the infrastructure or the platform, if that's the right word, to make it easier for others to kind of just plug and play, for example. And on the skill set thing,

I always thought about, like, okay, yeah, if you want to be a data scientist, you should pick up the skill sets of the stack below you, so like data engineering skills. And then if you want to be a good data engineer, you need to pick up, like, the stack below you, so like data infrastructure skills. What do you think of that?

Yeah, I think a few things on that. So the first thing is, like, to see people expand or move lower down the stack over their career is a pretty natural progression, a natural draw, where, as you solve the problems in your layer, you want to go meta.

Like, you want to go more, you know, general and say, I want to solve the problem that creates the problem in the other layer. Like, I want to get deeper, like, solve it at a deeper level. I think it's a natural progression. And I think expanding, like widening your skill set in general, is a natural draw too.

Is it better to go down the stack or up the stack? There are different kinds of biases there. If you want to be closer to users and use cases and the business, you know, you can evolve in that direction. If you want to get closer to the meta problem of how things are done, doing things in a more reproducible way, like, that's a normal draw too. But I think overall, if you think about just how people's skills evolve:

Do you get deeper into a vertical or get wider? And then I would say, like, all of the paths are valid, as long as you gain surface, right? Like, you want to expand your surface either left or right, you know, or up the stack, down the stack, deeper in certain areas. So do you want to be very specialized, very deep in an area, or wider? That's a really interesting question. Overall, I think my stance on that is it's better to go wide than go deep, especially in the era of AI.

I think, with what we're going to see with these LLMs and some of these skills getting commoditized, it's better to be a generalist. Because then you have a bunch of little agents you could use eventually, right? And it's as if you have an army of very smart interns, you know, with some context, but, like, not a lot of good executive skill. At least that's what working with LLMs feels like today.

So it's good to have good executive skills and coordination-type skills. And then you can be wider and get, you know, get help from different AIs to help you coordinate and build things. And there's always the question of, like, is an AI going to solve this for me? Or, if the AI is good at it, maybe I don't need to learn it.

So speaking of gen AI, do you see kind of a parallel with, like, you know, back in the days with data science, where that was sort of getting a lot of the coverage while the engineering was sort of powering it, versus, like, today with gen AI, what do you reckon would be the equivalent of, like, engineering?

The effect of, say, having this new tech on the role, you mean? I think data science was, you know, an important thing that's transformative. What we're dealing with here, though, is, like, something that's changing, you know, everything and everyone and every role and every skill.

So I think this is fundamentally different from anything we've seen before. Right. I guess it can only be compared to, like, the internet or something in terms of, like, the level of disruption and how it's going to affect everyone's lives. And it's, like, one of these things where it's hard to see at what pace, or what it's going to look like on the other side, and how fast we're going to get there.

But I think, for me, one advice I give everyone is you should develop, like, a first reflex to try to do it with AI, or have AI do it for you. The same way that we all developed, you know, first reflexes of, like, let me Google that, around, like, 2000 to 2005, or maybe as we got our first iPhones or first smartphones: we're like, oh, we're having, you know, a debate about something, let me look that up, right? I have that first reflex.

I think we need to develop that very, very quickly with AI. So like, don't try to, you know, do it on your own. Try to do it with AI first, and if it sucks at it, then do it on your own. What does that mean technically? Or like, technically would that be just, like, using ChatGPT, trying to solve the problem first, or trying to come up with a prompt, or how?

Yeah, I mean, I think, like, if you look at your daily workflows, and I don't know what's on your to-do list for today, Guang, say this podcast, but you can look at, like, okay, I've got some technical tasks, some things I'm trying to do. Before I even get started, I might try to ask my assistant, and that's probably ChatGPT or Claude, and say, like, I'm just going to write down what I'm thinking about doing for that and see if I can get any or some assistance.

And then, depending on whether it seems like this thing is going to be able to help you or not, you can, you know, paste the right code snippets or input documentation or things you're trying to write, whether you're trying to, you know, write an email or a message or a PR.

Or a design, you know, a data model or something like that. Like, to write down your thoughts and work with your assistant on getting a feedback loop without disturbing anyone is so glorious, and then you can figure out where it can and cannot help. But like, that first reflex: for most tasks, I think you should try to do it with assistance from AI. That's what I do.

If you were to look at my, and I definitely would not pull up my ChatGPT history live on a podcast, it's a big sum. And be cautious with privacy, because, like, you know, it does overflow too. And, like, you know, for me, even for, like, founder advice or, like, legal input or everything, the vast array of things that, say, a founder does at a startup: I definitely have that first reflex for most tasks to ask, you know, ChatGPT and see how it can help.

And it's good at things you would not originally think it might be good at, right? That's a good analogy, because, like, you wouldn't also just pull up your Google history to be like, haha. But what I was going to say, the point around that, is, like, the statistics of how many times a day, and for what kind of tasks, I use this stuff: I would say it's now, like, five to twelve prompts a day, or, like, sessions.

And across a variety of what it means to be a founder, you know, the kind of tasks that a founder might do. Even, like, yesterday I had my immigration interview, I'm going to be an American citizen. So I'm Canadian originally, I've been on a green card. So I went and did the interview, but, like, I didn't know, apparently there's a hundred questions they might ask you and all that stuff. And I did, like, audio sessions with ChatGPT on the drive to the Bay Area this week.

And I was doing role play with it, it was asking me questions, I practiced. By the time I got to the interview, I had practiced the interview many times and reviewed, you know, what the three branches of the government are and who the current, you know, Secretary of State is, and all this stuff. All the stuff that they were likely to ask me that was tricky, I'd reviewed in role play with GPT over audio in the car, which is, like, a random use case, right?

That is pretty cool. But yeah, I was going to say, even, like, for ideating on, like, oh, I feel like I want to start a new open source project around, like, data access policy. Here's some ideas that I have, and just having a conversation around it, you know, instead of, like, writing into the void. You're kind of, you know, talking with someone smart that has infinite time and attention for you, until you run out of GPT-4 requests for the day.

But it's surprising how good it is at just being a brainstorm, you know, friend kind of deal. And keep the emotions out of it: not a friend, a useful assistant. Thinking about that, what is the most unusual thing you've asked ChatGPT, if you remember? Considering, for example, the use case you just mentioned on brainstorming and interview practice for American citizenship, I would not have thought of that. That is really cool. That's really good. Especially over voice.

Yeah, so when I say unusual, I don't mean in a bad way, but just, like, something which you didn't expect it to be good at, but you're like, oh, this is really good at this thing too. I think the stuff I've been most amazed with is writing really intricate blog posts on the edge of discovery, like what I think is kind of new, let's say,

and brainstorming around, say, the creation of a new project. I think it's extremely good at, like, marketing and product marketing, like messaging and positioning for startup founders, something you might not think about if you're not a founder. But saying, like, hey, we're coming up on this new product launch, you know, or we're thinking about a new product that we want to launch, you know,

and here's how we want to position it and here's what we think it should do: it's an extremely good product marketer. But I was going to say, one thing I worked on recently, we could take the tangent eventually, is just thinking about semantic layers, you know, in the BI world, and thinking about the intricacies of what exists and what the world needs.

And at some point we did a hackathon project around what the ideal semantic layer might look like, you know, and its properties, and then just going back and forth. Some of it's like the rubber duck effect, like just having someone to talk to that just bounces back ideas. So there's a lot of value in just having someone who listens carefully and spits back words that are related, you know. But even, yeah, so like, can you, you know,

give me some related ideas, or I'm thinking of this thing and that, what, you know, what do you think? And it's been an extremely good partner to work on these things at, call it, the edge of innovation and discovery. So some of the aspects you mentioned before were, like, hey, start with ChatGPT first, similar to how we would go, well, let me Google that first.

In a way, you're saying it increases your productivity: anything you're trying to do, it might already give you some aspect of the solution, so you can do more as an engineer, for example. Now, putting yourself in the founder's seat, how does it, how do you think about your team size and hiring at that point? Because now you're saying, what would have taken me X amount of time to do, now with this copilot of sorts, I can do a little more efficiently, and so can your team.

So have you thought about this in the context of team size? Yeah, I mean, yeah, definitely. I think as a founder, you always think about throughput and productivity: how do we do more overall, and how do we do more with what we have? I think recently, over the past year and a half, we're a lot more resource constrained than we were before. Like, before, there was just no ceiling: you want to raise infinite money?

Take it. You want an infinite valuation? Take it. I think now we've been really pushed to think about efficiency in general. I think it's always really hard to objectively measure throughput in software development, right? And it's always hard to do estimates. It's always hard to count lines of code, count PRs; you can count features, you can, I don't know, look at customer satisfaction.

But I think we're all a lot more productive than we used to be. One thing that's for sure is, like, telling everyone in the company to build that first reflex. Like, you know, first, everyone should have it: we'll pick up the bill for your ChatGPT or Claude, or, you know, get the best AI you can, whatever they want to work best with, get Copilot, get all the tools, right?

If you need to produce some images, get Midjourney. Like, just go, like, that stuff is so cheap for what it does, it's just a no-brainer. So enabling people with it. Now, in terms of the socioeconomic, you know, changes over time, it's really, I mean, it will have major impact, and we don't know exactly how. People are looking at, like, layoffs in tech, and I'm like, how many of these layoffs are, you know, related to AI or won't be replenished?

Or like, oh, it's just a normal dip, and, like, markets go up and down, but the swing back with AI might be very different this time around. I think that's fundamentally true.

In general, if you look at, like, did the printing press lead to less, you know, text being written or read? Right? Or are there fewer journalists because of the printing press, you know, or fewer writers? No, there are more. But this is different. So as a founder, I can tell you, I think it's good to get the pulse on the microcosm: like, if you get the take on how founders think about their companies individually, then maybe in aggregate that gives you a sense of what's going to happen

at the more meta, economic layer. But I would say, I think currently, yeah, you know, startups are always incentivized to grow as much as possible. And now the advice is to be efficient, but, as, you know, if I double my revenue, I probably want to double my expenses too, because we want to grow as fast as possible.

So there's clearly that. But yeah, I mean, I think we're going to start seeing very, very small companies going really far. Like, we're going to see the less-than-10-people unicorn becoming more of a thing in the future too. So fewer people can accomplish as much in a lot of cases.

I remember a thing I saw the other day, I forget where it was from, but it's, like, how many people it takes to build, let's say, a hundred-million-dollar company or a billion-dollar company, for example, and that number keeps going down with the advances we're seeing with LLMs, for example. And it might eventually come down to maybe a one-person company that is still valued at these higher numbers, for example.

And it's probably been the trend overall, just productivity going up, but there's a big kind of step change happening, and just a lot of things are going to be different on the other side. And as I said, it's unclear what it's going to look like during the transition, how fast the transition is going to go, and yeah, where we're going to land.

And on the topic of LLMs, and there are a bunch of other things you also talk about, you have this open source project, Promptimize. Can you tell us more about that? Yeah, so I mean, that was, I think, like a year ago or so.

We were building text-to-SQL features inside Superset as a differentiator for Preset. So for context, for people who are not super familiar with what I do: I started Apache Superset after I started Apache Airflow, have been really dedicated to Apache Superset, and started a commercial open source company where we offer Superset as a service, essentially, right? And Superset is an open source competitor to the Tableaus and Lookers of the world. So a BI tool.

And it's fully open source. It's amazing. It works super well. There's no reason why people should pay for vendors, you know, other than if you want a hosted solution around it. If you haven't checked it out, you can check it out. Just go to Apache Superset and you can see, you know, what it does, what it is, and you can play with it, you know, you can get set up quickly, use it.

Try it. And then Preset is just a cloud service around it with some bells and whistles and some improvements, some of which, and I won't go into the exact pitch, just in the context of what we're talking about: we built an AI assistant within Preset to augment Superset, and that's a differentiator, because we need to, you know, make money and have a commercial offering as well, right, on top of the cloud service.

So we were working on, like, text-to-SQL, and it's a tough problem. It's really deceptively easy to work with these LLMs.

You work with it, like, here's a few tables, here's the schema, can you write SQL that does this? And, oh my god, this thing is good at SQL, which has deep implications for the data world that we haven't talked about. But, you know, being a SQL monkey is probably not going to cut it anymore when AI is a better SQL monkey than we are.

The thing that's lacking is the executive skill and the memory, right, the long-term memory and the business context that are, for now, private from the LLM and need to be squeezed into a context window for it to make sense and be useful. So we started working on this problem, saying, like, oh my god, this thing is so good at writing SQL if you provide it the right context. So we started looking at, you know, vector databases to store your data models.

And just in general, you know, some of the challenges we hit, like working with different SQL dialects, making sure that it is able to generate the right dialect; it gets a little confused around that.

And then providing, just overall, the right context as to what you're trying to do and what the models it can use are. And as we started working on that, what we realized is, you know, you can use GPT-3.5 Turbo or GPT-3.5 or GPT-4, and you can add something in your prompt that says, like, you know, make sure to capitalize the reserved words, or if it's BigQuery, do this, right? So you can start really changing your prompt.

And then it changes the outcome in really tricky ways. And what we're trying to solve is this big, fuzzy problem where people might ask anything and your data schema might look like anything. So how do we measure the quality of our prompt, or, you know, even something as simple as, should we use 3.5 Turbo or 4 Turbo or 4, right? And how much better is it performing?
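
A minimal sketch of the kind of prompt shaping described here, squeezing schema context and dialect hints into a text-to-SQL prompt; the function, hints, and schema are illustrative assumptions, not Preset's actual implementation:

```python
def build_text_to_sql_prompt(question: str, schemas: dict[str, str], dialect: str) -> str:
    """Assemble a prompt that gives the model the table DDL and dialect rules it needs."""
    # Hypothetical hints, the kind you might add after observing failure modes per dialect.
    dialect_hints = {
        "snowflake": "Use Snowflake functions such as DATE_TRUNC to change the date grain.",
        "bigquery": "Quote table names with backticks and use BigQuery date functions.",
    }
    schema_block = "\n\n".join(schemas.values())
    return (
        f"You are an expert writing {dialect} SQL.\n"
        f"{dialect_hints.get(dialect, '')}\n"
        "Make sure to capitalize reserved words and only use the tables below.\n\n"
        f"Tables:\n{schema_block}\n\n"
        f"Question: {question}\n"
        "Return only the SQL query."
    )

# Example usage:
prompt = build_text_to_sql_prompt(
    "How many orders were placed last month?",
    {"orders": "CREATE TABLE orders (id INT, created_at TIMESTAMP, amount NUMERIC)"},
    dialect="snowflake",
)
```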

So early on we found this decent, good dataset around text-to-SQL called the Spider dataset. It's out of, I forget if it's like MIT or, sorry, I don't want to misquote, I'm not going to say; anybody can research this, the Spider dataset.

It's a list of prompts, simple schemas, and then the good answers for them. And there's a bit of a contest where people are like, you know, different teams working on this problem said, we did 82%, or we did like 87% with ChatGPT on this test set. So it's a published test set. And there was no way at the time to just write kind of unit tests, or a framework for someone to take unit tests and measure the outcome.

And so Promptimize, the idea behind it was, like, oh, let me write a little toolkit where you can write your prompt cases, which are like test cases if you're familiar with those: take some of the ideas from unit testing frameworks and apply them to prompt engineering and prompt testing.

So that we could say, like, okay, take these 2,000 tests and run them against GPT-3.5, or run them against GPT-4, and compare the output: the percentage of success, where one succeeds over the other, what it's good at, what it's bad at, how much it costs, how long it takes on average, the P90 of how long it takes for the prompt to come back. So we wanted to apply the scientific method and just rigor to prompt engineering.
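
A rough sketch of that idea, treating prompts like unit tests and comparing two models on success rate and latency; this is illustrative only, not Promptimize's actual API:

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable

@dataclass
class PromptCase:
    prompt: str
    evaluate: Callable[[str], bool]  # returns True if the response is acceptable

def run_suite(cases: list[PromptCase], ask_model: Callable[[str], str]) -> dict:
    """Run every prompt case against one model and report success rate and P90 latency."""
    latencies, passed = [], 0
    for case in cases:
        start = time.time()
        response = ask_model(case.prompt)  # ask_model wraps whatever LLM call you use
        latencies.append(time.time() - start)
        passed += case.evaluate(response)
    return {
        "success_rate": passed / len(cases),
        "p90_latency_s": statistics.quantiles(latencies, n=10)[-1],
    }

# Usage: plug in callables that hit, say, GPT-3.5 Turbo and GPT-4, then compare
# the two reports side by side.
# report_35 = run_suite(cases, ask_gpt_35_turbo)
# report_4 = run_suite(cases, ask_gpt_4)
```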

And Promptimize is, you know, a little toolkit to allow you to do that with some amount of structure. It's quite cool, and I saw that you guys also have, like, LangChain support. And for me also, LangChain, when I first started looking at it, I guess this was last year, I was like, why do you need a library to do this?

Don't I just, like, write this text and then it just sort of works? And then, I think, right, as I started trying to write better prompts and, you know, do more use cases, I was like, oh my gosh, yeah, it's such a mess without, sort of, these libraries and things. The exact same thing with Promptimize, right, where it's like, once things get to kind of the production level, where, like, it's actually dollars on the line, you actually want the same engineering, sort of, like, the best practices that we've developed, right, to actually have that trust.

And then you can actually have that trust, transfer it over, instead of just kind of throwing your hands in the air. Yes, it's like trying to have some empirical measurement in a very fuzzy, unknown world, right? Because, like, you're working on your prompts, and you can add literally a hint in there to say, like, please don't do this, or you look at your 10% of failures, say, on text-to-SQL generation.

And then you realize, like, oh, all the failures are related to trying to run that stuff on Snowflake, because it's not good at speaking the Snowflake dialect. So then you might add a thing that says, oh yeah, but if you're using Snowflake, and specifically a date function to change the date grain of the thing, you know, here's some function definitions that you can use, right, or, like, be cautious around this.

And by doing this, you might be playing whack-a-mole there, because then, you know, you might have made, like, the BigQuery support worse, right? So then it's really hard to know this. So you need empirical, you know, you need more rigor around that. And that was, like, the general idea with Promptimize. On LangChain, I think it's really interesting, because, like, when I found it, same thing, I was like, I don't understand why this exists.

Not because I don't understand the problem space. I'm familiar with the problem space. I was like, oh yeah, this is, like, everything I need. This is super great. But then I started to try to use it. And then I was like, and no disrespect to the toolkit, I think it's just something that matured very quickly, but then I started using it, and I was like, oh, it does, like, kind of what I wanted to do.

But not exactly. And then I cannot use the methods that are here exactly in the way I want to use them. So then you kind of fall off, or, you know, for me, I was like, oh, it's harder to try to bend this toolkit into submission than the value I get from it is worth.

In some ways, right? So it has a lot of convenience methods to, say, break text into chunks with some amount of overlap, and some of those things are really useful. But then, say, they didn't have support for the particular vector database

that we wanted to use at the time, or not the kind of support that we needed. So then it gets you, like, 80% of the way, but then you have to monkey patch some stuff to make it work. So at some point you're like, let's just write a little bit of Python that does text processing. Like, we can write that with AI in, like, five minutes. It's easier.
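
The "little bit of Python that does text processing" can be as small as this hypothetical helper, splitting text into overlapping chunks the way such toolkits do; it is a sketch for illustration, not Preset's actual code:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into chunks of at most chunk_size characters, overlapping by `overlap`."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# e.g. chunk_text(long_document, chunk_size=500, overlap=50)
```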

So is that what you guys internally do, just kind of having your own sort of set of, like, utils and so forth to handle this stuff? I think we do use some of it. LangChain is, it's a toolkit, you know, so you can kind of think of it as, you know, a bunch of utility tools around

AI and ML. And I think, over time, we gravitated to using just specific portions of the toolkit. It's like we use the hammer and the screwdriver, but we don't use anything that saws or cuts, you know. So we picked some parts of it that stuck around, and for some things we're like, okay, we can just do our own thing, because it's harder to bend this tool into doing what we

need to do than it is to just do it on our own for some use cases. And so, internally, some of what I'm trying to work on is a lot of, like, summarization and then trying to do kind of, like, style transfer for text. And I remember it's been kind of a learning journey, because at first I was just, like, trying things and it just, like, doesn't work, and I was like, yeah, this LLM thing is all hype, like, shit doesn't work.

And then it took me a while to realize, like, okay, I'm actually just really bad at prompting. It's kind of like Googling back in the days, right? Like, if you don't sort of do the right keywords, the result is not, like, super great. So for you, right, you know, you're doing like eight to ten prompts every day: did you see that gradual, like, improvement in terms of results for yourself? How do I get better at this? Like, I do want to get better.

Yeah, I mean, I think it's just, you know, you have to approach it a little bit more like a fuzzy, you know, like a human, maybe. Like, maybe it's like, oh, you approach someone you don't know very much, but maybe they're a graduate that, you know, you know they're smart.

They have accumulated a lot of knowledge in different areas, right? But then you don't know how to work with them and you don't know how good they might be at different things. So I don't think the answer is to over-engineer your prompts; it's to say, just, like, what do I need to tell it for it to help me?

You know, and then in some cases, I think I've gotten more sloppy with the way I interact with GPT in general in some areas, right? Like, in some areas, I'll just open a session, and if I'm doing some coding, I might have just an error message or, you know, a problem, and I'll just, like, copy-paste a big thing in as text, just throw it in, and see what it's going to say.

It might have some good pointers, you know. I think, fundamentally, the first thing is, like, well, what context does it need to help me, and what context does it have from, you know, learning from the entire internet? So you have to say, okay, it doesn't know anything about things that are specific to my business or my use case.

So what's not generalizable, what is it going to need? And then, you know, you can certainly try more things, like, what if I tell you this, can you help me more? And so it's progressive disclosure until you prove whether it's going to be able to help you or not.

But yeah, in terms of, you know, text-to-SQL and Promptimize: I think what I realized is a lot of use cases for AI are not as empirical, as measurable, as the one we have. In some ways we're blessed with text-to-SQL, because if I ask, can you write this query on this database, it's pretty much, I mean, it's not always, like, you know, 100% boolean on whether it succeeded or not.

Sometimes it might, I don't know, alias columns in a weird way or give you more than what you asked for, but it's useful, right? So sometimes it's not a pure boolean, like correct or not correct, but at least we have something where generally we can say, this is a good answer, this is a bad answer.

If you say, hey, please summarize this text in a paragraph, it's harder to evaluate whether it succeeded or not, right? Or if you have a CS-type, like a customer-success-type question, you're writing a CS bot, which is a huge family of use cases, right? People want to automate support.

So if I can simulate in Promptimize a chat session where someone, you know, puts in some information, I need help with this and that, it's harder to read the answer and give it a score. You could then use an AI to do that, but then, yeah, I don't know. I don't know what you'd do there, but yeah, you have to trust the end-of-the-line system beyond a point.

I mean, you could, I think you can, and I've talked with people that use Promptimize in more fuzzy use cases that are less, like, this boolean of, did the AI succeed at the answer or not.

For instance, I think the examples that are really interesting, the Promptimize examples that I wrote when I originally wrote the project, were like writing some Python function. So you can actually ask the AI to write a Python function, take the Python function, and run unit tests on it to make sure it actually works. It's like, write a function that, you know,

tells you if it's a prime number or not, then it generates the code, you actually put it in an interpreter and test it. That's an empirical use case. But when you get to less empirical, like less true-or-false use cases, it gets more subjective and hard to evaluate. But this is pretty cool, like, this is more like test-driven development, right? You specify what you want, you describe the test, but you let the AI write the code.

And then you check whether the code you got back is actually doing what you asked it to do. Yeah, the blog post, originally when I wrote the thing, was very much like: bring TDD and, you know, rigor and what we've learned in software engineering, you know, unit tests, test-driven development, to prompt engineering.
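
A small sketch of that test-driven prompting idea, with a stubbed model call standing in for a real LLM; the stub and names are illustrative, not code from Promptimize:

```python
def ask_model(prompt: str) -> str:
    # Stub: in practice this would call an LLM with the prompt; hard-coded so the example runs.
    return (
        "def is_prime(n):\n"
        "    if n < 2:\n"
        "        return False\n"
        "    for i in range(2, int(n ** 0.5) + 1):\n"
        "        if n % i == 0:\n"
        "            return False\n"
        "    return True\n"
    )

generated = ask_model("Write a Python function is_prime(n) that returns True if n is prime.")
namespace: dict = {}
exec(generated, namespace)  # run the generated code in an isolated namespace
is_prime = namespace["is_prime"]

# The "unit tests" that decide whether this prompt case passed.
assert is_prime(7) and is_prime(13)
assert not is_prime(1) and not is_prime(9)
print("prompt case passed")
```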

The project is super cool, and we'll definitely link it in our show notes. We recommend people check it out, not just the project but also Preset, Superset, and Airflow. Hey, thank you so much for listening to the show. You can subscribe wherever you get your podcasts and learn more about us at softwaremisadventures.com. You can also write to us at hello at softwaremisadventures.com. We would love to hear from you. Until next time, take care.

This transcript was generated by Metacast using AI and may contain inaccuracies.