Welcome to another episode of the Ruby Rogues podcast. I'm your host today, Valentino Stoll, and we're joined by a very special guest, Scott Werner. Scott, do you want to introduce yourself? Tell us why you're Ruby famous.

Ruby famous, I wish. Someday, I hope. Thanks for having me, Valentino. It's super exciting to be here and be a part of this. So, I'm the CEO and co-founder of Sublayer, where we're building a self-assembling AI agent framework in Ruby, and the supporting tools and infrastructure to make something like that possible. I've basically been programming with AI and Ruby, and as we were talking about earlier, basically exclusively trying to get AI to write all my code for me. And when I am writing code, it's primarily Ruby.

That's awesome. So just so we're clear, Sublayer is the project?

Sorry, yes, Sublayer is the gem. And as I said, I've been describing it as a self-assembling AI agent framework. What I mean by that is that each of the components we've built is designed so that it's easy for an LLM to generate the new, additional components you might need.
So the framework is very minimal, and it's really about defining the conventions and the interfaces, then letting you loose to expand on that for whatever you want in your application.

I'm super curious here. Where do you even start with something like that? How did you get involved in this? How do you just stop coding?

It was tough. I think a lot of us were really inspired when we saw GPT-4 come out and people using ChatGPT, posting videos like: I built this product through this prompt to ChatGPT, I copy and pasted the code into Xcode, and I have a running app, and I've never coded before. I saw that and immediately thought, oh, those people are going to want a way to add functionality and modify that software over time.

And I realized that a lot of the stuff we do in XP, around TDD and the different refactorings, lends itself really well to working with LLMs. Break things out into small pieces: the smaller and simpler the request you make of the LLM, the more likely it gets you something back that's good.

I also kept having this conversation with people: you would be mad at me if I gave you a thousand-line PR, but at least you could come over to my desk and yell at me, right? If you got a thousand-line PR from ChatGPT, that would be a big problem. That line of thinking is where we started our journey.

That's really funny. I just think of the cost associated with that, too, for something that you don't even want. Right.
That's really cool. So I've definitely checked out Sublayer. I've used it to generate specs, because I don't like writing specs. I'm not a test-driven person, but I do drive tests, so I'm more of a backwards test-driven developer, whatever you'd call that. I have a little script that runs Sublayer, and I say, hey, generator, write the RSpec tests for this Ruby file. And for the most part, it does pretty well.
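For context, a script like that is a thin wrapper around a single Sublayer generator. Here's a minimal sketch of what one can look like; the class name and prompt are mine, and the `llm_output_adapter` DSL follows the shape described in Sublayer's docs, so treat this as an illustration rather than the exact current API:

```ruby
require "sublayer"

# A single-purpose generator: send one prompt, get one string back.
class RspecGenerator < Sublayer::Generators::Base
  llm_output_adapter type: :single_string,
    name: "generated_spec",
    description: "RSpec tests exercising the given Ruby source"

  def initialize(ruby_code:)
    @ruby_code = ruby_code
  end

  def generate
    super
  end

  def prompt
    <<~PROMPT
      You are an expert Ruby developer.
      Write RSpec tests for the following Ruby code:

      #{@ruby_code}
    PROMPT
  end
end

# Usage: puts RspecGenerator.new(ruby_code: File.read("lib/foo.rb")).generate
```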
Yeah, and I love the syntax: hey, here's a generator, use this prompt to do it. It's super cool. So why don't you give us the high level: how does this all work?

Yeah, a little bit of background on how and why it works: we started calling it promptable architecture. The core idea was that there were going to be patterns, styles, ways you build your application that lend themselves well to LLMs taking a short request and giving you something back. Some of the things we started with are things we already know: clear interfaces, the single responsibility principle, loose coupling. We came to those through reasoning like: it's going to be much easier to add code to your app if you can just drop a new file in, rather than having to update a bunch of other files. And other things, like having conventions you can throw into the prompt, where the LLM will just believe you that that's your convention and follow it. A lot of our trial and error has been figuring out what these patterns are, and how to structure our applications differently to take advantage of this.

So one of the main things about the Sublayer gem is that every single component is very loosely coupled. You could have 100 generators, you could have 1,000 generators, and they won't affect each other. One of the ideas we've been playing with is that maybe don't-repeat-yourself, maybe duplication in your code base, isn't that bad, if, as you can see in the docs, you can generate a generator in about ten seconds. Maybe it's not that bad if you and I both work on the same story, if it's just us describing it and getting the code, and it doesn't affect anything else in the code base. So that's promptable architecture, and it's been the driving idea for this gem.
And one of the things we found works really well with that is that you really only need to give the LLM one example to generate something new, a variation off of that, which is where our main tool, Blueprints, comes in. What it does reflects what I think is a pretty common experience among the developers I've talked to: nine times out of ten, when you're working on a new user story, there's somewhere else in the code base where you do something similar, or there's a pattern you probably want to reuse. So you go look at that file, repurpose some of the code, maybe copy and paste, and keep the code base very similar to what you're doing.

We realized that's a manual, human version of RAG. I joked to somebody that my development style is retrieval-augmented generation: I'm very good at googling and looking around the code base. I was mostly joking, but then I took a step back and thought, actually, maybe there's something to that.

That's really funny. I never thought of it like that, but yeah, we're just doing RAG.
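The mechanical version of that habit is small enough to sketch. This is illustrative only: the trigram `embed` below is a toy stand-in for a real embeddings API (OpenAI, Gemini, a local model), and the paths and task are made up, but the loop — embed your existing examples, find the nearest one, paste it into the prompt as the pattern to follow — is the whole idea:

```ruby
# Toy embedding: character-trigram counts. Real systems call a model.
def embed(text)
  vec = Hash.new(0)
  text.downcase.scan(/.{3}/) { |tri| vec[tri] += 1 }
  vec
end

def cosine(a, b)
  dot  = (a.keys & b.keys).sum { |k| a[k] * b[k] }
  norm = ->(v) { Math.sqrt(v.values.sum { |x| x * x }) }
  dot.zero? ? 0.0 : dot / (norm.(a) * norm.(b))
end

# Index every existing component once...
examples = Dir["app/generators/*.rb"].map { |f| [f, embed(File.read(f))] }

# ...then retrieve the closest one for the new story.
task = "a generator that writes a changelog entry from a diff"
nearest, _ = examples.max_by { |_, vec| cosine(vec, embed(task)) }

prompt = <<~PROMPT
  Here is an example component from our codebase:
  #{File.read(nearest)}

  Following the same conventions, write: #{task}
PROMPT
```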
Right. And so we realized you could actually operationalize that. One of the things that makes it take so long to ramp up when you join a new team is that experience of: oh yeah, when we run into that problem, we use this pattern over here, it's in this file, on this line. Normally you get that intuition from months or years of working on that team. Whereas if you index these things, put them in a vector database, and query them, you can find them.

When we built that, we originally thought we were going to build a whole bunch of different components and generators for the Sublayer gem. But at our product planning meeting one Monday, we figured we'd do about a week's worth of work, and had the idea: maybe we'll just generate one of these generators. And a whole week's worth of work was done in a day. It was like, take a step back, what did we just discover?

So we found that building in this method, with very loosely coupled, small components, can be very powerful and quick.

I didn't realize that you have a list of predefined blueprints on the site. That's really cool.
Yeah, that's super cool. I was going to say, this mentality makes so much sense. I've been at places that had generators in general, like Rails generators. You want to start a new Rails app, you use a generator; you want to create a new model that has all the conventions, you use a Rails generator for that.

It definitely helped in a consultancy, for example, where you have all this boilerplate you just want to build off of. You use the pre-existing stuff, and at least you have the shell there, hooked up and wired correctly.

And then you build off of that. But this is the next step after that, right? Because then you can start filling out the stuff in that boilerplate, adapting it to whatever it may be. Which is really cool. So: how well does it work? I think that's what makes a lot of people hesitate to jump into all this. OK, you have all this source material for it to use, but isn't it just making mistakes?

So, one of the things I've realized, and have been trying to figure out how to describe, is that by intentionally designing your classes and interfaces a certain way, you can almost target the hallucinations to where you want them. It makes mistakes, but a lot of the time they're mistakes in the style you would probably want anyway.
Which I know doesn't really make sense at first, so...

What do you actually mean by that?

Well, if you go and generate something on the docs site, you'll see we have this idea of an LLM output adapter, and right now there's a symbol for a single string. So basically the simplest generator you can make is: here's a string, I'm going to send a prompt to an LLM, and I want this specific string back. Like you said, your example: the string could be RSpec tests for some Ruby code I've got. But if you ask it to build a generator that gives you, say, a list of refactors to make in that Ruby file, it might hallucinate that it needs an LLM output adapter that generates a list.

And while that adapter might not exist, it'll still be in the conventions and style of the library, and it might point to: okay, here's the next feature we need to build; we need to flesh out what it takes to get a list back. So there's nothing different from the base example to the new example you're trying to make except the places you designed to vary; you can basically structure your code so those are the pieces that change. The output adapter case is a little more complex; the thing I think you run into a lot is that it just generates a new prompt for what you're trying to do and doesn't touch anything else in the rest of the code.
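Concretely, a "targeted hallucination" might look like the sketch below, riffing on the earlier generator. The `:list_of_strings` adapter type is hypothetical here; the point is that the invented piece lands exactly in the slot designed to vary, while everything else stays in the library's conventions:

```ruby
class RefactoringSuggestionsGenerator < Sublayer::Generators::Base
  # The model may invent this adapter type. If it doesn't exist yet,
  # that's your next feature, not a mess smeared across the code base.
  llm_output_adapter type: :list_of_strings,
    name: "suggested_refactorings",
    description: "Refactorings to apply to the given Ruby file"

  def initialize(ruby_code:)
    @ruby_code = ruby_code
  end

  def generate
    super
  end

  def prompt
    "Suggest refactorings for this Ruby code:\n#{@ruby_code}"
  end
end
```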
So I think about it as targeting the hallucination into that prompt, or into the initializer, rather than anywhere else in the code base or the file.

Gotcha. Yeah, that makes sense. I have so many questions around the Blueprints aspect of it. Where does somebody even get started? Is Blueprints a good place to start with the generic code generation, or is it more of a first test of the waters with Sublayer, getting used to the generators? Say I've got a team of people, we have conventions in place, and we want to start adopting this process. Where do you start?

So, one place you can start: as you saw on the site, we just rolled out some new functionality where you can sign up, put some code examples in privately, and start generating from them to see how it acts. That could be a great place to play if you're curious about trying it with your own code. And then on the Sublayer docs, and sorry, this is very meta,

we've actually created this interactive docs product, which we're going to be rolling out as well, where, live in our docs, you can go to any of the components, we have generators and actions right now, and see it in action: just write in what you're trying to do, and it will give you reasonable code. Depending on what you're trying to do, some things might not be implemented yet.

One of the things we're doing: there's a guide on the left about running a local LLM with Llamafile, and in the demo of setting it up, what Andrew on our team does is actually go to the docs site, generate the code for the demo, and run it in the video.

You're doing a lot of cool stuff. I was on the site earlier, because I'd looked at Blueprints a while ago and hadn't seen Sublayer's latest stuff, and this is so cool. You just go onto the site and you can generate this stuff. Sorry if we're inflating your bill here, but that's maybe why it's there.
So I'm curious: local inference is definitely a hot topic, it's getting faster, and MacBooks are running it quicker. Is that something you recommend people start with, though? Is it fast enough to make the development flow smooth?

It's not yet. I'd probably recommend starting with something like, for some of the stuff we're doing, five dollars in OpenAI API credits can get you started. I think Google has a free tier, and even Gemini 1.5 Flash is super fast. We're using some of those models on our site, and you can also just go play with private blueprints there. As for local models, I do think that's where it's probably going, for a lot of stuff at least.

The prohibitive thing right now is that you need a very powerful laptop to run the local models fast enough, and models powerful enough, to actually get good results back. It's $4,000 or $5,000 on a MacBook Pro versus $5 in OpenAI API credits. I remember when I first saw the first open models, I went and installed one on my Raspberry Pi,

thinking, oh yeah, I'll just throw everything at a chain of Raspberry Pis. It turns out you can't really spread inference like that; it's not there yet.

And it's also super slow on anything, even on a nice M1, right? It's kind of funny: you sit there watching, and it reminds me of the old days of waiting for an image to load over dial-up. You're waiting for the sentence to complete.

Yeah. And I think the risk you run is that it gives you the wrong impression. If you're just getting started, it's definitely worth throwing $5 at some of the frontier models, like Claude 3.5, GPT-4, or Gemini, just to see what's possible. Because you can get the wrong impression if the local thing just goes off the rails and starts talking about something unrelated, right?
Yes, I'm interested: have you found certain models perform better for certain tasks related to code generation? Ones specifically better at generating Ruby, or better at particular code generation tasks, like generating the specs for X, where some models just do a better job than others? Or is it just: use any one of the top ones?

My opinion, and my approach, is really to just use the top ones, because these are the slowest, most expensive, worst models we're ever going to have again. It's just getting better, and it's just scaling up. So get used to what's possible, and even if you run into an issue, it's more about the UX and what it implies if you can do this.

I was talking to somebody about this: remember when everybody said AI image generation was never going to work because it kept giving people seven fingers? I can't remember the last time that was part of the conversation. But at the time it was all: here's a great way to generate an image, ignore the hands for now.

That reminds me, I saw recently some university has the "third thumb," where you use your feet to control an extra prosthetic thumb on your hand, so you can hold stuff while doing something else with that hand. They show peeling a banana one-handed. That guy's clearly AI, right? Oh my gosh, so funny.

But more to your point, and this is something we've been trying to figure out the right way to present, maybe it'll become something interactive on the Blueprints site: we've found that different attributes of code quality we've talked about for a long time are actually kind of quantifiable now. In one of the experiments we ran, with really descriptive, good names, generating off of that blueprint, GPT-3.5 did really well. When you made those names worse, less descriptive, it had a harder time figuring it out, but GPT-4 still could. That follows the scaling of the other models too: give it an easier problem and it can solve it, but as soon as the problem gets harder,

you need a more powerful model. Which, and I've been kicking this around with some people, means we might be able to put a dollar value on technical debt, because one model costs you a dollar per thousand tokens versus ten cents per thousand tokens.

That's really funny. We give the jobs we don't want to do to AI. It's all a cost per token at this point. Always has been.
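As a back-of-the-envelope sketch, using the illustrative rates from the conversation rather than any provider's real prices, the "dollar value of technical debt" is just the gap between the model your clean code allows and the model your messy code forces:

```ruby
CHEAP_PER_1K   = 0.10 # $/1K tokens: enough for the well-named version
PREMIUM_PER_1K = 1.00 # $/1K tokens: needed once the names degrade

tokens_per_task = 2_000
tasks_per_month = 500

cheap   = tokens_per_task / 1000.0 * CHEAP_PER_1K   * tasks_per_month
premium = tokens_per_task / 1000.0 * PREMIUM_PER_1K * tasks_per_month

puts "debt premium: $#{(premium - cheap).round(2)}/month" # => $900.0/month
```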
That's an interesting idea. Refactoring definitely makes a lot of sense. I have a special GPT I made that has a bunch of resources I've collected over the years at its disposal to help me refactor stuff, specific to Ruby, and it works great. Having that extra knowledge base is awesome.

I use it all the time just for rubber duck sessions, right? I know the next step would be to have it actually change the stuff, which I haven't spent the time to do.

But now that you've made it so easy with Sublayer, I'm definitely going to revisit it, because that's a task where you could just say, hey, generate a pull request for me. All the GitHub APIs are there for it to use. Have you played with tool usage in that way yet, as far as the Sublayer connection?

Yeah, and actually, that kind of task is where we see things going. We look at it like: those automations, once you have an easy way to get some data, send it to an LLM in a specific way, and get something structured back, you can build things with that, and very quickly. Take these refactoring suggestions and make a pull request, or really anything you can think of. We've been toying with one idea where it sits in the video chat for our planning meeting at the beginning of the week, watches the video, manages the transcript, and analyzes it. You know how somebody asks a question about a story, you get an answer, then you forget to put it in the story, and a few days later you're deciding it all over again?

You can just build an automation to pull that out and put it in the story for you. That's a super quick task that doesn't take much work anymore. So, very long-winded way of saying: we've explored tool usage, but I think we look at it a little differently than how the providers are offering it.

We absolutely rely on function calling, but only for the structured output aspects of it, and we rely on the things we're good at, things we can do deterministically and reliably: calling APIs, making GET requests, making POST requests, writing files locally, that kind of thing.

We use the LLM to get the parameters we're going to send to those APIs, and then we make the calls deterministically. Have the LLM do what it's good at, and have us do what we're good at.

Yeah, that makes a lot of sense.
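A sketch of that split, hitting an OpenAI-style chat completions endpoint directly (request shape as documented at the time of writing; verify against current docs — and the `open_pull_request` schema and prompt are invented for the example). The model only fills in parameters; ordinary deterministic code makes the real call:

```ruby
require "net/http"
require "json"
require "uri"

# Function calling used purely as a structured-output contract.
schema = {
  type: "function",
  function: {
    name: "open_pull_request",
    description: "Parameters for a GitHub pull request",
    parameters: {
      type: "object",
      properties: {
        title:  { type: "string" },
        body:   { type: "string" },
        branch: { type: "string" }
      },
      required: %w[title body branch]
    }
  }
}

uri = URI("https://api.openai.com/v1/chat/completions")
res = Net::HTTP.post(uri, {
  model: "gpt-4o",
  messages: [{ role: "user", content: "Draft a PR applying these refactoring suggestions: ..." }],
  tools: [schema],
  tool_choice: { type: "function", function: { name: "open_pull_request" } }
}.to_json, { "Authorization" => "Bearer #{ENV["OPENAI_API_KEY"]}",
             "Content-Type"  => "application/json" })

args = JSON.parse(
  JSON.parse(res.body).dig("choices", 0, "message", "tool_calls", 0, "function", "arguments")
)

# The deterministic half we're good at: a plain GitHub API call, e.g. via Octokit:
# Octokit::Client.new.create_pull_request(repo, "main", args["branch"], args["title"], args["body"])
```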
Do you see it moving more in the direction GitHub is pushing, with configuration steps? I'm curious if you've played with the opposite of the object-oriented approach: here's a prompt, and it can have all these extra functions and things wrapped into the code, versus the configuration outline.

GitHub seems to be going that way to fit better into their CI structure, to be honest, where you just have a YAML file that has prompts in it and can perform actions. Where is that flexibility more advantageous: in your code, or in something like a configuration file?

That's a good question. We were definitely coming at it from a different perspective. I think a lot of the multi-function calling, the GitHub Workspace style, is very conversation-driven. You have those functions as structured ways for it to get more information to do the next step in the conversation, which can be very powerful. We've seen a bunch of great examples of the power that's possible.

But I think, and maybe this is just wishful thinking about the future of software engineering, that a lot of the principles we've found while building, while not exactly right when working with an LLM, still matter.

One thing I'm nervous about, and why I haven't really gone the route of multiple different functions to choose from, is the same reason it's bad to have a dozen parameters in your function. There's a quote that goes something like: if your function has ten parameters, you probably forgot one.

A lot of those things, while folksy, intuition rather than anything backed by science, still feel right to me. I still feel like you want to know that this is going to happen, then this, then this, even if there's some fuzzy, non-deterministic thing happening in the middle.
I agree with that. I'm personally against the step-by-steps with LLM usage in general, mostly because you want observability into it, and a way to intercept those steps. As an example, I see the usage over time becoming more reactive than input-output, especially once you start using agents more, or things that can communicate with each other.

It becomes an observability thing, right? You end up having things that are watching what's happening and can help redirect, or interrupt, if things start going wrong. When you have that step-by-step process, it gets harder to change, because it's locked into the code flow. You tell it to go to a different point in the code flow, and it's like instruction jumping. What are we, back to C again?

I use Ruby so that I don't have to know assembly, right? Aaron Patterson is notorious for his joke: I write assembly so you don't have to. And I see prompt usage going in that direction too. People write natural language,

and it does stuff for them. That's the whole driving force of all of this. We had a product manager come in and make some changes to get better outputs, and then our end users see the result of that. Very specific tweaks that non-developers can make. I work at Doximity, a social network for physicians, and they go on ChatGPT and use it to solve problems for themselves.

They'll go generate, say, a denial appeal letter for some insurance provider, and then they just use it. They use natural language to make that happen. Obviously there are issues, there's a lot of handholding, but the whole point is to help drive people to use the natural language, and then, as developers, to introduce the advances, to extend it.

ChatGPT, all these models, give you text. Because of that, you can generate any number of things, because we're coders, right? We can start pipelining and obfuscating that input-output the LLMs give us, mutating it in different ways. The users don't have to see that, but they can still give it the same inputs we as developers would.

That's where I see it: we've got to get away from the step-by-step instruction. That's a programmer mentality. The design evolution is going to be really interesting to watch for developers, I think. I'm curious what your thoughts are on that aspect. Where do you see the design driving force of all this moving?
So, my previous startup was actually a prototyping tool for voice apps. This was right as Google Home and Alexa were coming out. We built a tool for prototyping different conversation flows, that same step-by-step, back-and-forth conversation. And one of the things we realized is that conversation is just one modality.

We got acquired by Adobe and built it into XD, because we firmly believed you want multiple modalities for inputting information into a machine. Sometimes it's voice, because your hands are busy and you can't press a button, you're cooking or something.

Other times you're on the subway, and asking your phone out loud for your bank account information is probably not a great idea, but pressing buttons is fine. So, all that said, I don't think we're going to throw out the last however many decades of UX research into interfacing with machines.

There are going to be a lot of cases where you have buttons and forms that people can understand and intuitively grasp. Can you imagine going to Facebook and having to write a SQL query, or asking the Facebook bot to add somebody as a friend, versus just clicking the friend button?

So I think it's going to be a mix, even if behind the scenes there's this non-deterministic natural language API we've been using; from a user standpoint, they shouldn't be required to understand the right way to prompt.

That makes sense. Yeah, that's interesting. I hate to say prompt engineering, but trying to get the thing to do what you want is still frustrating to me. It reminds me of when I first started to learn how to code, to be honest, which is a little funny.

So much fun to bring that excitement back, right?

It really is. And I don't know, are there other tools you use for that prompt design aspect that you'd recommend to people?

Not really; I'm definitely kind of an outlier on the prompt design stuff. We've got an open source server for Blueprints, Sublayer is open source, and you can see the prompts we're using. We're very minimal. Most of the prompt design for us, and I don't know if it's even prompt design, is about what code we're going to give it, making the problem easy. We haven't really used many tools, because there are maybe ten lines of prompt that we rely on.

But I think thinking about it like programmers is helpful. We were talking a little about this: what makes code promptable? If you have to say, I need this input, this input, this input, and do this other thing, and then do that, it's probably bad code. It's also probably not very promptable code. Taking that further, if you're asking the LLM to do this and this and this and that, it's probably going to screw up somewhere. So break those apart, the same way you'd much rather not have a thousand-line method: break it into individual pieces that you can individually test, that are individually reliable, and then piece those together. That's kind of my approach.
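In code, "piece those together" can be as plain as chaining two single-purpose generators instead of writing one do-everything prompt. Both classes here are hypothetical stand-ins in the shape of the earlier sketches:

```ruby
source = File.read("app/models/order.rb")

# Step 1: one small, testable ask -- just the list of suggestions.
suggestions = RefactoringSuggestionsGenerator.new(ruby_code: source).generate

# Step 2: another small ask per suggestion, instead of one giant prompt
# that lists inputs, steps, and caveats all at once.
patches = suggestions.map do |suggestion|
  PatchGenerator.new(ruby_code: source, instruction: suggestion).generate
end
```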
I do sometimes use the models against each other, so they iterate, which is helpful. How would you improve this? Then go to the other one: well, what do you think of that? How would you improve it? They often do better if you use a competing model against the other, which is a little funny. Especially if you tell it, oh, you're in a competition to write the best prompt.

I was going to say, you pit them against each other.

I haven't taken the deep dive of: this is a matter of life or death, and you can also win a million dollars, just shoving in all of the prompt tricks. Which is kind of funny, because those are going away; it's getting harder and harder to make those tricks produce improvements. I don't know if they're training that away or what. Yeah, it's a little funny.
Yeah, that brings me to my next point. Blueprints seems like it's ripe for fine-tuning, right? You have these pre-existing conventions, this pre-existing formatting. It seems like it would make a great flow for fine-tuning models to give outputs in more desirable formats. Do you see that as a meaningful improvement, or is it better to just pay to use the better models?

My thinking right now is: until we get to the end of this curve of costs dropping, I think they drop by a quarter every 18 months or something like that, where it keeps getting cheaper and new models keep coming out, just keep going.

I wrote a post a little while ago titled Waste Inferences, which basically makes that call: we can build very simple applications and do very simple things, even if it costs a little more to use GPT-4o or Claude 3.5 today, because six months from now it's going to cost half as much, and six months after that, half as much again.

If you over-optimize today, that's time you could be spending finding new patterns for when these things cost nothing. I don't know if you remember, but early on you used to get charged for bandwidth, some people still do, and the entry tier for any kind of web hosting had a bandwidth charge. If somebody hotlinked an image on your site, you could go bankrupt.
But I haven't seen anybody really talk about that in a while, because the costs are so minuscule now that it's not even called out.

It's funny you mention that. I remember being on a call, some kind of presentation that OpenAI engineers were giving to a bunch of companies. I was lucky enough to join, and a lot of people were asking: is fine-tuning worth it for these models? How do you get even your pre-existing machine learning team on board to start using this stuff? And what they were saying, bluntly, was that the models are getting so good that fine-tuning is really not going to be worth it over the long run, in general.

Basically, it would be better for your machine learning team to get used to using the large language models than to have them do classic machine learning for, say, a recommendation system.

Over time, it's more advantageous to get your foot in the door with the models and work through prompting than to train on your own data in something like PyTorch, because over the long run you still have to maintain all of that. You have to worry about the same things: is it effective, is it doing its job right,

and then also improve those aspects, whereas the models themselves are already improving. And you can even get a model to improve itself iteratively. So it's interesting. I'm kind of torn on it, though, because I have worked with some fine-tuning, to get a model to conform to very specific outputs, as an example,

and it does perform much better in those narrow cases. Obie's new book is great at describing the narrow path, which is the perfect example of how all of this works: the smaller and narrower you can make the scope of the task, the better it performs. The problem now is that GPT-4 is such a vast knowledge base, right?

It starts to perform poorly the bigger the task you ask of it, or the more you ask of it. Sure, it can distill and do a great job most of the time, but the more you start to ask for... as an example:

which of these categories does this content fall into? It's going to make a lot of mistakes, because you're asking it to match up too many things at once. But if you just ask, is it this category, yes or no? It's going to get that right almost 100% of the time.
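That narrowing translates directly into code: trade one wide multi-class prompt for a series of yes/no questions. More API calls, but each is a question the model rarely misses. `CategoryCheckGenerator` is a hypothetical single-purpose generator in the style sketched earlier, whose prompt asks "does this content belong to this category, yes or no?":

```ruby
CATEGORIES = %w[cardiology oncology billing general]

# One narrow question per category, instead of "pick one of twenty."
matches = CATEGORIES.select do |category|
  answer = CategoryCheckGenerator.new(content: post_body, category: category).generate
  answer.strip.downcase == "yes"
end
```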
And so it's more about thinking through those aspects. Which is kind of funny, because the pricing model is perfect for them: the more you use it, the more money they make, and the more you need to use it, the more the call count increases, just by the nature of the design. Which is kind of funny. Though I just realized I should put a disclaimer on that: I'm not being paid.

I guess the one thing I would say on that, even to your question, and maybe I should have said this a lot earlier: a lot of these things are so new. There are a lot of papers coming out around a lot of these techniques,

but I think everybody really just has theories about why these things work. They can get it to work, and that might work for them, for this case. There might be pieces of what people are doing that are right or wrong; I don't think anybody knows for sure. And especially with how uncertain the future is, these opinions are mostly mine, from the way we're approaching it. I could be completely wrong.

I've seen successes, and I've seen evidence that we're right about some things, but that doesn't mean there aren't a whole bunch of other techniques and patterns and ways to do it. Which is, you know, also kind of terrifying.

But it's also super exciting. It's so fun to expand the way you're thinking about it when you see somebody try something you wouldn't have thought to try, or try it and fail and then see them succeed with it, and expand your understanding of these things.

So for anybody listening: if something interests you, it's super easy to try out. Don't let somebody saying "this doesn't work" or "this does work" stop you from just trying, because usually it's a two-cent API call to find out if your idea has something there.

Yeah, for sure. There has never been a greater time to experiment, because you can prove people wrong so easily, or prove yourself right. It's very easy to test, which is honestly great with all the new tools out there. Perfect time to plug the Ruby AI Builders Discord: people drop in the wildest stuff, and you're like, oh, I could do that. Definitely come join us.

Yeah, a couple of weeks ago somebody found you could send it Base64-encoded information and it would just work with it. That's the kind of fun thing you end up on on a Friday night: how is it able to do this? That's awesome.

Yeah, I wonder if you could skirt around some of the token limits with compression, I don't know. I wonder if it makes it any slower in responding. I'm nervous to find out.
So, I wanted to briefly touch on, for those that don't know: you started hosting what seems to be a recurring event in New York City, the Ruby AI meetup. How did that get started? What prompted it? I attended the last one and thought it was really great to meet new Rubyists in the AI space, and even non-Rubyists were there, which was pretty cool to see. How did you get involved in that?

I've got a backstory, of course. We got asked a lot, as we were building this startup, why Ruby? And, as you can maybe see from some of the books here, I'm a little bit of a software historian, and I really love digging deep into the reasons why we do what we do.

I realized that every time there's a new platform shift, a new kind of thing we can do with computers, there's kind of a pendulum swing. You see object-oriented programming and the GUI coming out of Smalltalk, the more informal, fuzzy-reasoning kind of approach, which then got more formalized through the 80s and 90s with C++ and Java.

Then database-backed web applications became a thing, and we found that the dynamic languages, Ruby, Python, JavaScript, made it possible to test out a whole bunch of ideas and find the patterns, find what it means to have these dynamic web applications.

Then through the 2010s or so it became more formalized with TypeScript and Rust. We're not trying all those different variations anymore; we know what we want to build, and we want it to be stable and scalable.

But now, with LLMs, nobody knows, right? So my thinking was: I need a dynamic language that's going to get out of the way, where I can test out an interface or an idea super quickly. And I started thinking back to when I was getting into Ruby, how there was this explosion of different gems, people trying things out, why the lucky stiff putting things out.

And out of that, something like Camping comes about, which inspires Sinatra. I don't know if this is a direct line, but in my mind, this is my head canon: Camping inspires Sinatra, which inspires Flask and Express, which is kind of what a lot of things are built on now.

And we're kind of in that period now. Like I said before, maybe I'm right about some things and wrong about some things, but we need to get together and share those ideas, bounce them off each other, argue, and show each other the cool stuff we're building. That's where the happy hour came from. I felt like Ruby is ripe for this.

From talking to people one on one, there's definitely that feeling of: where did that magic go? So let's see if we can bring some Rubyists together in New York and test that theory out. And we did, and it was a huge success. I got a chance to meet you; we had people coming from all over the East Coast for it.

And we've got sponsors for the next one, coming up July 24th, Test Double and FireHydrant, who saw the excitement, saw the energy. We're trying to make the next one even bigger, bringing more people together.

Another head canon, and I don't know if this is exactly right, but the story you hear about GitHub starting is that it happened at a booth in a sports bar after a meetup, where they were like, we should try this. I wasn't there, I don't know the conversation, but that's the story I've heard, and I wanted to see if we can recreate some of that magic.
Yeah, that's awesome. I appreciate it, and I know a lot of people appreciate it. That was the first Ruby event I'd been to in quite a while, just because it was close. It reminded me of the unconference tracks: just come hang out at a conference and don't go to any talks. It removes all the barriers to meeting people, which is kind of funny,

because you supposedly go to the conference so you can listen to the talks.

But then somebody ends up giving what amounts to lightning talks on the side, which is kind of funny. The meetup has that vibe, where people are just talking about all of their excitement, because all this stuff is so exciting. Where do you start? What are people working on? You don't know until you're in the room, and the meetup was

definitely the culmination of everybody just saying: here's what we're working on, this is exciting. It was super cool to see. I'm looking forward to going again.

Yeah. I guess I don't have the access to make comments on this, but I'll send you the link to the event for anybody interested.

Awesome, yep, we'll drop it in here. Come join us; it was so much fun. And like you said, people traveled for it, which was really cool to see.

It's funny, you don't realize how tough the last few years have been, being remote and hybrid; you don't really get the chance to just hang out as much anymore. We got a lot of feedback along the same lines you mentioned: I haven't been to a Ruby event in a long time; I haven't seen this many people at a Ruby event in years.

There's definitely something we've lost by going mostly remote and hybrid that being in person brings back, for me anyway.

Yeah, I remember my first Ruby conference was in Maryland, and it's definitely intimidating in a larger setting. Smaller groups make it easier to chat with people and socialize, I think. But we had this moment when we first met: I know your name, but I really know you from your avatar, right?

I'm going to bridge that gap a little bit. To be honest, if X just had a print-your-avatar feature, I feel like it would make these social connections easier to spot. I'd just slap that on.

That's funny. So where are you going next with this? Where's Sublayer heading, where's Blueprints? Where do you see your next phase in all of this?
So the next thing, and the topic of this episode is that Ruby is a sleeping giant for AI application development: I think we've laid the foundation here with the framework. There's a new version; I was trying to get it out last night, but I had a couple of things I had to do.

So we have the foundation with the framework, the foundation with Blueprints, and then it's really about showing, and spreading the word about, how easy it is to build these LLM-powered applications. It changes the way you think about what's possible. It's still hard to go from a place where a project, even in 2018,

would take a team of about 15 people a year, to it turning into a 10-cent API call. That mindset shift is wild, right? I'm sure you've experienced it with some of the things these models can do. So really, our next steps are bringing more and more of that to

the forefront: what are all these things you can do that take half a day to automate, where previously you probably just ignored them because they were so costly and so boring? So: building out the framework more, bringing more use cases to the forefront, and then expanding Blueprints to make that even faster, so the more examples you have, the more things it can do and generate.

And then also experimenting with the interactive docs piece we're getting ready to roll out, making it possible for anybody with API docs to have what we have in ours. We'll see if there's interest there, so that even if you're not on the forefront of trying to turn million-dollar projects into 10-cent API calls, you can still see some benefit today.

You know what I would love to see? Being able to give it a GitHub link, down to specific lines, and say: do something with this. I think that could make a really cool example. Hey, here are the lines to some code, change this, or make it say something silly, I don't know. There's so much stuff you could do like that; it's so easy.

Yeah, I would love to see more ease of use. It's already easy, but people still don't know where to start. Which is a little bizarre at the same time, right? It is as easy as just typing. I hate to distill it down to just that, but at this point you can go to Microsoft's site and use ChatGPT for free, you know?

It is that easy to get started. Unfortunately, you've got to buy into that ecosystem, right?

Yeah, I mean, I think that's the thing. One of the things I've been trying to figure out is how to... I've been leaning on that example of: you're already doing RAG. It's just you googling for the docs, finding the docs, and using them in your task.
I feel like there are a lot of very big, scary technical terms where, when you're getting ready to do a side project, you think: I'm going to have to read all these docs to figure out what these things mean, when what they really mean is: put an example in the prompt.

In-context learning is just giving some examples in the prompt, and it figures out how to do it. That simplification is something I really want to find, or work with people to help find, because...

I've been talking about this with Rails: there used to be all this stuff you had to know about database normalization and third normal form and this and that, but it really boils down to: you mean, customers have many purchases? Got it. So I think that's one of the things that needs to happen next: the easier descriptions, distilling it down. There are very deep

explanations to take it further and make it more and more powerful, but to get started it's just: you mean I just copy and paste that into it? Right, yeah, the dummies' guide to prompting. To be honest, I never understood why they came up with the phrases they did, like zero-shot or few-shot. What does all that mean? It's almost nonsensical to say that when you just mean examples, right? Or no examples.
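For anyone who hasn't seen the terms unpacked, the difference really is just whether examples appear in the prompt, nothing more. A made-up illustration:

```ruby
# "Zero-shot": no examples, just the ask.
zero_shot = "Classify the sentiment of: 'The gem works great.'"

# "Few-shot" / in-context learning: a few examples pasted into the prompt.
few_shot = <<~PROMPT
  Classify the sentiment of each line.
  'This API is a joy.' => positive
  'The docs are a mess.' => negative
  'The gem works great.' =>
PROMPT
```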
Yeah, I think that's one of the things Andrei and I have been talking about, Andrei from Langchainrb. A lot of this stuff is coming out of research, out of academia, and those terms do matter for those domains. But from the application engineering side of things, the more applied side,

we can have our own descriptions of these things and why they work, separate from the science and the theory, more about the practice. In theory there's no difference between theory and practice, but in practice there is.

Well, is there anything else you wanted to cover today before we jump into the picks?

Yeah, if anybody listening is going to Madison Ruby, I'm going to be giving a talk going deeper into using LLMs, and a little bit of the philosophy of how to deal with LLM-generated code. The talk is called Going Postel. I won't give any spoilers away, but it leans very heavily on Postel's law, which Obie talks about in his book: building your systems so that they're liberal in what they accept and conservative in what they send. Just a little more liberal in what you accept than what you're used to.

And then, there are a lot of things Ruby can do easily, you can do anything, right, but the metaprogramming you can do in Ruby to change your application at runtime I think is underexplored, so I'll be going into that a lot in the talk.

Yeah, I share your sentiment there. I've definitely tried having an LLM generate Ruby methods, run them, and then reuse them. It's a lot of fun. Very dangerous.
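For the curious, the trick being alluded to can be as small as this, and the danger is the same as any `eval` of untrusted input, so treat it as a sketch, not a pattern to ship. `llm_generated` stands in for text produced by a generator like the ones above:

```ruby
# Method body written by an LLM, arriving as a plain string.
llm_generated = <<~RUBY
  def median
    sorted = to_a.sort
    mid = sorted.length / 2
    sorted.length.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
  end
RUBY

# Runtime metaprogramming: define it on a live class. Never do this with
# code you haven't reviewed or sandboxed.
Array.class_eval(llm_generated)

puts [3, 1, 2].median    # => 2
puts [4, 1, 2, 3].median # => 2.5
```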
Yeah, definitely underexplored, for sure. That's cool. People, go check out Madison Ruby; if you're not going, I'm excited to watch that talk remotely at least. So if people want to reach out to you or find you on the web, where can they do that?

Primarily scott@sublayer.com, if you'd like to email me, or Scott Werner on Twitter.

And, like we talked about, I'm in the Ruby AI Builders Discord; the big discussions right now are on that ARC-AGI challenge, which we've all been racking our brains about. And on the Sublayer site there's a link to our Discord, which is a little more about our releases, and maybe some less Ruby-specific stuff, more thinking about how we use these AI tools for building applications in general.

Yeah, I've been enjoying that channel as well. It's super interesting, the stuff that gets posted, and you would think it would get shared in more places. A lot of the papers that get shared there... maybe people just aren't reading as much as I am, I don't know. It feels like there's an unlimited amount of stuff to read. There's so much to read, I wake up wondering where I left off.

Yeah, I feel like the paper I read last week is already dated. How is that possible? I know.

All right, well, let's jump into picks. We've been talking about so much great stuff. Picks are a segment where we pick literally anything. It could be code if you want; it doesn't have to be, but that's typically what I pick. And if you need a minute, I can go first. So: I've decided to fine-tune a large language model for Ruby.

Mostly just for the experiment more than anything, but I'm calling it RubyLang.ai. I'm going to start building it in the open, learning how all this stuff works, and seeing how it might be used to source all of the awesome, incredibly designed open source Ruby that's out there, take advantage of it, and make a new language model that is very Ruby-centric. We'll see how it goes. I'm hoping I'm successful, but I may not be.

There may just be a bunch of lessons learned, but we'll see. You can follow my progress at RubyLang.ai.

Yeah, I saw that you posted about it either yesterday or the day before, and I'm excited to see where that goes.

Yeah, I've got this stupid massive GPU server at home that I was just using for inference, and I thought, it's just wasting away over here running inference; I should fine-tune something, and this seems like a perfect use case.
So here we go. Yeah.

That's great. So I guess I've got two things. One: I talked a little bit about the formal-informal pendulum swing. One of the places I got that from was an Avdi Grimm talk, The Soul of Software. It was a long time ago at this point, but there are a couple versions of it up on YouTube.

He goes into the informal mindset, and how there are those two splits in software, and how neither one is absolutely right; each gives you different things depending on what you need. That was very impactful for me, especially now, given the approach we're taking.

The other thing: I read a lot of Substacks, and there's one, Strange Loop Canon, which had a post a little while back about what LLMs can't do. That's also a trap a lot of the time, because as soon as you say something isn't possible, everybody on Twitter tries to prove you wrong. In the post he said it can't do Conway's Game of Life, and I think within a week somebody had proven him wrong.

He had tried fine-tuning, all of these different things, and couldn't figure it out, and the collective wisdom of Twitter came up with a solution, I think. But then he has a recent one...

That's awesome.

Yeah, it's very good. I highly recommend the Substack in general. That post was very good, and the latest one, Seeing Like a Network, was very, very good as well.

That's awesome. Well, I appreciate you coming on, Scott, and talking about all this code generation stuff and the love of Ruby AI. It definitely is a sleeping giant, and I think we're just going to see more and more of why that is. We'll have to have you on again after you start creating companies on top of companies, meta on meta, with all this Ruby code generation. I can definitely foresee a future where you just say, oh, create a Rails app that does this, and it just does it.

Yeah. Seems easy, you know. It's simple.

Yeah, no, thank you for having me. This has been a lot of fun. It's been awesome chatting and catching up, and I'm excited to see you in person in about a month.

Yeah, totally. All right, well, until next time, folks. I'm out of here; come visit us next time.