Leveraging Ruby for Effective Prompt Engineering and AI Solutions - RUBY 643 | Ruby Rogues podcast

00:04

Well, welcome to another episode of the Ruby Rogues podcast. I'm your host today, Valentino Stoll, and we're joined by a very special guest, Scott Werner. Scott, do you want to introduce yourself and tell us why you're Ruby famous? Ruby famous? I wish, someday. Thanks for having me, Valentino. It's super exciting to be here and be a

00:25

part of this. Yeah, so I'm the CEO and co-founder of Sublayer, where we're building a self-assembling AI agent framework in Ruby, plus the supporting tools and infrastructure to make something like that possible. And yeah, as we were just talking about earlier, I've basically been programming with AI and Ruby, almost exclusively trying to get AI to write all my code for me instead of me actually writing the code, and primarily in Ruby. That's

01:06

awesome. So just so we're clear, Sublayer is a project? Sorry, yes, Sublayer is the gem. And so, yeah, I've been describing it as a self-assembling AI agent framework, and what I mean by that is that each of the components we've built is designed so that it's easy for an LLM

01:34

to generate new additional components that you might need. So the framework is very minimal, and it's really about defining the conventions and the interfaces and then letting you loose to expand on that for whatever you want in your application. I'm super curious here: where do you start something like that? How did you get involved in this? Like, how do you

02:02

just stop coding? It was tough, you know. I think a lot of us were really inspired when we saw GPT-4 come out and people using ChatGPT and posting videos of, you know, "I threw this prompt into ChatGPT, I copied and pasted the code into Xcode, and I have a running app, and I've never coded before." And I saw that and immediately it was like, oh, well, those people are going to

02:38

want a way to add functionality and modify that software over time. And I kind of realized that a lot of the stuff that we do in XP around TDD and the different refactorings we do lends itself really well to working with LLMs. So breaking things out into small pieces: the smaller and simpler the request you have for the LLM, the more likely it is to get

03:10

you something back that's good, you know. And I also had this conversation with people, like, you know, you would be mad at me if I gave you a thousand-line PR, but at least you could come over to my desk and yell at me. Right? If you got a thousand-line PR from ChatGPT, that would be a big problem. And so that line of thinking is where we kind of

03:31

started our journey. That's really funny. And I just think of the costs associated with that too, for something that you don't want. Right, right. That's really cool. So I've definitely checked out Sublayer. I've used it to generate specs because I don't like writing specs. I'm not a test-driven person, but I do drive tests, so I'm more of a backwards test

04:01

driven developer, whatever you would call that. And so I use it. Yeah, I have a little script that runs Sublayer and I say, hey, generate, you know, RSpec tests for this Ruby file, and for the most part it does pretty well. So I love the syntax for just, like, hey, here's a generator, it uses this prompt to do it. It's super cool. Ah, thanks. So why don't you give us the high level: how does this all work?

04:34

Right? Yeah, a little bit of background on how and why it works: we started calling it promptable architecture, but the core idea was that there were going to be patterns, styles, ways that you build your application that lend themselves well to LLMs

04:56

taking a short request and giving you something back. And some of the things we started out with are things we kind of already know: clear interfaces, the single responsibility principle, loose coupling. We came to those through reasoning that it's going to be much easier to add code to your app if you can just drop that new file in rather than having to update a whole bunch of other files. And other things, like having these

05:28

conventions that you can throw into the prompt, and the LLM will just believe you that that's your convention and follow it. And so that's been a lot of our trial and error: figuring out what are these

05:42

patterns? How are we going to structure our applications differently to take advantage of this? And so one of the main things about the Sublayer gem is that every single component is very loosely coupled, so you could have one hundred generators, you could have a thousand generators, and they won't affect each other. And so one of the things that we've been playing with

06:05

is, like, maybe "don't repeat yourself," or duplication in your code base, isn't that bad. If, you know, in the docs you can see that you can generate a generator in like ten seconds, maybe it's not that bad if you and I both work on the same story, if it's just us describing it and getting

06:26

the code, and it doesn't affect anything else in the code base. So there's promptable architecture, and that's been our drive for this gem. And one of the things that we've found works really well with that is that you really only need to give the LLM one example to generate something new, to generate a variation off of that, which is where our main tool, Blueprints, comes in. And basically what it does, which I think

06:57

has been a pretty common experience for a lot of the developers

07:02

that I've talked to, is this: nine times out of ten, when you're working on a new user story, there's somewhere else in the code base where you do something similar, or there's a pattern you probably want to reuse. So you go, you look at that file, you repurpose some of the code, maybe copy and paste, and keep the code base very similar to what you're doing, which we kind of realized is a manual, human version of RAG. I made a joke to somebody that

07:33

like, my development style is retrieval-augmented generation: very good at googling and looking around the code base. And I was mostly joking, but then I took a step back and was like, actually, maybe there's something to that. That's really funny. I never thought of it like that, but yeah, we are just doing RAG.
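To make the "manual RAG" idea concrete, here is a minimal Ruby sketch of the workflow described above: find the existing file that looks most like the new request, then hand it to the model as the single example. Everything in it is an assumption for illustration. The file names, the `closest_example` and `build_prompt` helpers, and the word-count similarity are hypothetical, and the word counting is a simple stand-in for the embeddings and vector database a real setup would more likely use. It is not how Blueprints is implemented.

```ruby
# Hypothetical sketch: pick the existing snippet most similar to a new request,
# then use it as the single example in a prompt. Plain word counts stand in
# for real embeddings.

# Split text into lowercase words and count how often each appears.
def term_counts(text)
  text.downcase.scan(/[a-z]+/).tally
end

# Cosine similarity between two word-count hashes (0.0 when either is empty).
def cosine_similarity(a, b)
  dot = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = ->(counts) { Math.sqrt(counts.values.sum { |v| v * v }) }
  denominator = norm.call(a) * norm.call(b)
  denominator.zero? ? 0.0 : dot / denominator
end

# Return the [name, code] pair whose code is most similar to the request.
def closest_example(request, examples)
  request_counts = term_counts(request)
  examples.max_by { |_name, code| cosine_similarity(request_counts, term_counts(code)) }
end

# Build a one-example prompt around the retrieved file.
def build_prompt(request, example_code)
  <<~PROMPT
    Here is an existing file from our code base:

    #{example_code}

    Following the same conventions, write a new file that does this:
    #{request}
  PROMPT
end

# Made-up stand-ins for files in a code base.
EXAMPLES = {
  "invoice_mailer.rb" => "class InvoiceMailer\n  def send_invoice_email(user)\n  end\nend",
  "csv_exporter.rb" => "class CsvExporter\n  def export_rows_to_csv(rows)\n  end\nend"
}.freeze

request = "send a welcome email to a new user"
name, code = closest_example(request, EXAMPLES)
puts name
puts build_prompt(request, code)
```

The point of the sketch is only the shape of the flow: retrieve one close example, then ask for a variation of it.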

07:58

Right, and so you can actually kind of operationalize that. One of the things, I think, when you join a new team, and what takes so long to ramp up, is that experience of, oh yeah, when we run into that problem, we do this pattern over here, that's in this file on this line. But you only get that from months or years of working on that

08:22

team, that intuition. Whereas if you index these things, put them in a vector database, and query them, you can find them. And, you know, we built that. Originally, on the Sublayer gem, we actually thought we were going to build a whole bunch of different components and

08:43

generators and stuff. But in our product planning meeting one Monday, we basically thought we were going to do about this much work this week, and had the idea of maybe we'll just generate one

08:56

of these generators. And then a whole work week's worth of work was done in a day, and it was like, take a step back, what did we just discover? So we've found that building with this method of very loosely coupled, small components can be very powerful and quick. I didn't realize that you have a list of predefined blueprints on the site. That's really cool.

09:30

Yeah, that's super cool. I was going to say that this mentality makes so much sense, right? I feel like I've been at places before where they have had generators in general, like Rails generators, right? Where, okay, you want to start up a new Rails app, you use a generator. You want to create a new model that has all of the conventions, you use the Rails generator

09:54

for that. It definitely has helped, like in a consultancy, for example, where you have all this boilerplate stuff that you just want to build off of. You use the pre-existing stuff, and then at least you have the shell there and it's all hooked up and wired correctly, and then you can build off of that. But this is like the next step after that, right? Because you can then start filling out the stuff in that boilerplate to adapt to whatever

10:22

it may be, which is really cool. And so how well does it work? Right? I think that's maybe what makes a lot of people hesitate to jump into all this. Like, okay, so you have all this source material for it

10:41

to use. Isn't it just making mistakes, though? So it can. Actually, one of the things that I've kind of realized, and have been trying to figure out how to describe, is that by intentionally designing your classes and interfaces a certain way, you can almost target

11:01

the hallucinations to where you want them. It makes mistakes, but a lot of times they are mistakes in the style that you would probably want anyway, which I know doesn't really make sense. But, like, so in the, actually, I think... What do you mean by that?

11:24

So, well, one of the things: if you go ahead and generate something on the docs site, you'll see that we have this idea of an LLM output adapter, and right now there's a symbol for a single string. So basically the simplest generator you can make is: here's a string.
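As a rough illustration of what "a generator with a single-string output adapter" could look like, here is a self-contained Ruby sketch. It is an assumption-heavy stand-in, not the Sublayer gem's actual code: `SketchGenerator`, `FakeLLMClient`, and `RspecFromCodeGenerator` are hypothetical names, the `llm_output_adapter` declaration only mirrors the wording used in the conversation, and the client returns a canned string so the example runs without any API key.

```ruby
# Stand-in for a real LLM client. It returns a canned string so the sketch
# runs offline; a real client would send the prompt to a model.
class FakeLLMClient
  def complete(prompt)
    "# (model output for a #{prompt.length}-character prompt would go here)"
  end
end

# Hypothetical base class. This is NOT the Sublayer gem's implementation.
class SketchGenerator
  class << self
    attr_reader :output_adapter

    # Declares the shape of the value the generator expects back.
    def llm_output_adapter(type:, name:, description:)
      @output_adapter = { type: type, name: name, description: description }
    end
  end

  def initialize(client: FakeLLMClient.new)
    @client = client
  end

  # Only the single-string shape is supported in this sketch. Any other type
  # fails loudly, which is where a missing feature would surface.
  def generate
    adapter = self.class.output_adapter
    unless adapter[:type] == :single_string
      raise NotImplementedError, "no output adapter for #{adapter[:type].inspect}"
    end
    @client.complete(prompt)
  end
end

# The only parts that vary per generator: the adapter declaration,
# the initializer, and the prompt.
class RspecFromCodeGenerator < SketchGenerator
  llm_output_adapter type: :single_string,
                     name: "rspec_tests",
                     description: "RSpec tests for the given Ruby code"

  def initialize(code:, **options)
    super(**options)
    @code = code
  end

  def prompt
    <<~PROMPT
      Write RSpec tests for the following Ruby code:

      #{@code}
    PROMPT
  end
end

specs = RspecFromCodeGenerator.new(code: "def add(a, b) = a + b").generate
puts specs
```

Because each subclass touches only its own declaration, initializer, and prompt, a generated variation has few places to go wrong, and an adapter type that does not exist yet shows up as an explicit error.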

11:46

I'm going to send a prompt to an LLM and I want this specific string back. And like you said, in your example, the string could be RSpec tests for some Ruby code that I've got. But if you ask it to build a generator that gives you a list of refactors to make in that Ruby file, it might hallucinate that it needs an LLM output adapter that generates a list. And while that might not exist, it'll still be in the conventions and the style of

12:22

the library, which might exist but also might point to, okay, well, here's the next feature we need to build: we need to flesh out what it means to get a list back. And so, having those spots where there's nothing different from the base example to the new example you're trying to make, and then the places where it is different, you can basically design your code so that those are the pieces that change. The output adapter might be a little more

12:50

complex. The thing that I think you run into a lot is that it just generates a new prompt for what you're trying to do and doesn't touch anything else in the rest of the code. And so I'm thinking about it like I'm targeting the hallucination to that prompt, or to the initializer, rather than anything else in the code base or in the file. Gotcha. Yeah, that makes sense. I have so many questions around the Blueprints aspect of it, and I guess, where does

13:22

somebody even get started? Is Blueprints a good place to get started with the general code generation aspects of it? Or should you first test the waters with Sublayer and get used to the generators aspect of it? Where do you suggest somebody start who says, hey, I've got a team of people, we have conventions in place, we want to start adopting this

13:45

process? Where do you start? So, yeah, I guess one place you can start: as you saw on the site, we just rolled out some new functionality where you can sign up, start to put some code examples in privately, and start to generate from them to see how it acts. That could be a great place to start to play with promptability. We've got a whole bunch of features we're going to roll out, like being able to score or point out

14:18

areas where you might make the code more promptable. So that's one aspect, if you're curious about playing with your own code. And then on the Sublayer docs, and I'm sorry, this is very meta, we've actually created this interactive docs product, which we're going to be rolling out as well, live in our docs, where you can go to any of the components we have, generators and

14:48

actions right now, and see it in action. Just write in what you're trying to do and it will give you something reasonable. Depending on what you're trying to do, there might be some things that aren't implemented yet. But one of the things that we're doing: there's a guide on the left about running a local LLM with Llamafile, and what Andrew on our team does in the demo of setting it up is actually go to the docs site, generate the code

15:26

for the demo, and run it on the video. So I'm sorry, we're doing... I was on earlier, because I had looked at Blueprints a while ago and I hadn't seen Sublayer's latest stuff, and this is so cool. Yeah, you just go on to the site and you can just generate this stuff. It's so cool. Sorry if we're inflating your bill

15:52

here, but that's maybe why it's there. Yeah, so I'm curious: definitely local inference is a hot topic, and it's getting faster, and MacBooks are running it quicker. Is that something that you recommend people start with, though? Is it fast enough to make

16:21

the development flow smooth? It's not yet. I would probably recommend starting with something like... I think for some of the stuff that we're doing, five dollars in OpenAI API credits can help you get started. I think Google has a free tier, or even Gemini 1.5 Flash, which is super

16:51

fast. We're using some of the models on our site, and you can also just go and play with stuff with private blueprints. For the local models, I do kind of think that's where it's probably going, for at least a lot of stuff. The prohibitive thing about that is that you need a very powerful laptop to really be able to run local models fast enough and powerful enough to actually get

17:37

good results back. Yeah: four or five thousand dollars for a MacBook Pro, or five dollars in OpenAI API credits. I remember when I first saw the first open models, and I go and install them on my Raspberry Pis, thinking, oh yeah, I'll just throw everything at a chain of Raspberry Pis. And it turns out you can't really spread inference like that. It's

18:03

not really there yet. And it's also super slow on anything, even on a nice M1, right? So it's kind of funny: you sit there watching, and it reminds me of the old days of waiting for an image to load on dial-up or something, like you're waiting for the sentence to complete. Yeah, and I think the risk you run is that it gives you the

18:33

wrong impression if you're just getting started. It's definitely good to throw five dollars at some of the frontier models, like Claude 3.5, GPT-4o, or Gemini, just to see what's possible, because you can get the wrong impression if you're like, this thing just went off the rails and started talking about something unrelated. Right, right. Yeah. So I'm interested: have you

19:02

found that certain models perform better for certain tasks related to code generation? Are there ones that are specifically better at generating Ruby code, or generating certain kinds of Ruby code? Or are there certain tasks within code generation, like generating the specs for X or something like that, where some models just do a better job than others? Or is it kind of like,

19:30

just use any one of the top ones? My opinion and my approach is really just to use the top ones, because those are going to be the slowest, most expensive, worst models we're ever going to have again. It's just getting better and it's just scaling up. So it's about getting used to what's possible, and even if you run into an issue, it's more about the UX and, like,

20:00

what does this imply if you can do this? I was talking to somebody about, you know, remember when everybody was like, oh, AI image generation is never going to work because it keeps giving people seven fingers? And I can't remember the last time that was part of the conversation. It was all about, like, here's a great way to generate an image; ignore the hands for now. That

20:23

reminds me. I saw recently that some university has the third thumb, where you actually use your feet to control another thumb, like a prosthetic on your hand, so that you can hold stuff while you're doing something else with one hand. They showed peeling a banana with one hand. You know, that guy's clearly AI-generated. Oh my gosh, it's so funny. But I

20:56

guess it's more to your point. And actually, one of the things: we've been trying to figure out the right way to present this, and maybe it'll come out

21:03

as something interactive on the Blueprints site. But we've found that different attributes of code quality that we've talked about for a long time are actually kind of quantifiable now. One of the experiments we were running was having really descriptive, good names and then generating off of that blueprint, and GPT-3.5 did really well. When you made those names worse or less descriptive, it had a

21:41

harder time figuring it out, but GPT-4 could. And that kind of follows a lot of the scaling for the other models as well, where if you give it an easier problem to solve, it can solve it, but as soon as that problem gets harder, you need a more powerful model. Which is an idea I've been testing out with some people: we might be able to put a dollar value on technical debt, because it costs you a dollar per thousand tokens versus, you know, ten cents per

22:14

thousand tokens. That's really funny. Now, outsourcing the jobs we don't want to do to AI, it's all a cost per token at this point. It's always been. That's an interesting idea. Definitely refactoring makes a lot of sense. I have a special GPT I made that has a bunch of resources that I've collected over the years at its disposal to help me refactor stuff specific to Ruby,

22:48

and it works great. Having that extra knowledge base is awesome, and I use it all the time just for rubber duck sessions. I know the next step would be to then have it actually change the

23:00

stuff, which I haven't spent the time to do. But now that you've made it so easy with Sublayer, I'm definitely going to revisit it, because that's definitely a task where you could just say, hey, generate a pull request for me, and all the GitHub APIs are there, it could use them at its disposal. Have you played with tool usage in that way yet, as far as the Sublayer connection? Yeah. And actually, I guess even

23:33

that task is kind of where we see things going. Yeah, we're kind of looking at it, like, you know,

23:45

those kinds of automations. Once you have that LLM and you have an easy way to get some data, send it to an LLM in a specific way, and then get something structured back, you can build things with that, and you can build them very quickly: take these refactoring suggestions, make a pull request, or kind of anything

24:10

you can think of. We've been toying with one idea where it sits in our video chat for our planning meeting at the beginning of the week, watches the video chat, manages the transcript, and analyzes the transcript. And then, you know how somebody asks a question about a story, and you get an answer, and then you forget to put it in the story, and then a few days later... we decided, hey,

24:38

you can just build an automation to pull that out and put it in the story for you. And that's a super quick task that doesn't take that much work anymore. And so, a very long-winded way of saying: we've explored the tool usage, but I think we're looking

24:59

at it a little differently than how the providers are offering it. We absolutely rely on function calling, but only for the structured output aspects of it, and rely on things that we're good at and that we can do deterministically and reliably, which is calling APIs, making GET requests and POST requests, editing files locally, that

25:33

kind of thing. But use it to get the parameters that you're going to send to those APIs, and then we can do that part deterministically. Have the LLM do what it's good at, and have us do what we're good at. Yeah, that makes a lot of sense.
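The split described above (the model produces structured parameters, ordinary code does the rest) can be sketched in Ruby roughly as follows. This is an illustrative assumption, not Sublayer's implementation: `FAKE_LLM_RESPONSE` is canned JSON standing in for a structured-output response, the helper names and the `example-org/example-repo` values are hypothetical, and the request is only built, never sent. The endpoint path follows GitHub's REST API for creating a pull request.

```ruby
require "json"
require "net/http"
require "uri"

# Canned JSON standing in for what a structured-output (function calling)
# response might contain. A real value would come from an LLM provider.
FAKE_LLM_RESPONSE = <<~JSON
  {
    "title": "Extract PriceCalculator",
    "head": "refactor/price-calculator",
    "base": "main",
    "body": "Moves pricing logic out of Order."
  }
JSON

REQUIRED_KEYS = %w[title head base body].freeze

# Parse and validate the model's output before anything else touches it.
def parse_pull_request_params(raw_json)
  params = JSON.parse(raw_json)
  missing = REQUIRED_KEYS - params.keys
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  params.slice(*REQUIRED_KEYS)
end

# Deterministic part: plain Ruby builds the HTTP request. Nothing is sent here.
def build_pull_request_request(owner:, repo:, params:, token:)
  uri = URI("https://api.github.com/repos/#{owner}/#{repo}/pulls")
  request = Net::HTTP::Post.new(uri)
  request["Authorization"] = "Bearer #{token}"
  request["Accept"] = "application/vnd.github+json"
  request.body = JSON.generate(params)
  request
end

params = parse_pull_request_params(FAKE_LLM_RESPONSE)
request = build_pull_request_request(
  owner: "example-org",        # hypothetical repository owner
  repo: "example-repo",        # hypothetical repository name
  params: params,
  token: "placeholder-token"   # placeholder, not a real credential
)

puts request.method
puts request.path
puts request.body
```

The model never decides which API to call or how. It only fills in a small, validated set of fields, and a malformed response fails at `parse_pull_request_params` before any request exists.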

25:56

Do you see it moving more in the way that GitHub is pushing, with configuration steps? I'm curious if you've played with the opposite of the object-oriented aspect, where, okay, here's a prompt and it can have all these extra functions and stuff wrapped into the code, versus the configuration outline, where it seems GitHub is going that way to fit better into their CI

26:27

structure, to be honest, where you just have this YAML file and it has prompts and can perform actions and stuff in that way. Where is that flexibility more advantageous: putting it in your code, versus in something like a configuration file? That's a good question, you know.

26:57

We were definitely coming at it from a different perspective. I think a lot of the multi-function calling and GitHub Workspace are very conversation-driven, and so you have those functions that are structured ways for it to get more information to do the next step in the conversation, which can be very powerful, and I mean, we've seen

27:29

a great bunch of examples of the power that is possible. But I think, and maybe this is just wishful thinking about the future of software engineering, that a lot of the principles that we've found while building, while not exactly right when working with an

27:56

LLM, still matter. And so one thing that I'm nervous about, and why I haven't really gone the route of multiple different functions for it to choose from, is kind of the same reason why it's bad to have a dozen parameters in your function. Right? There's a quote that's something like, if your function has ten parameters, you

28:22

probably forgot one. A lot of those things, while folksy and kind of intuition and not backed by any science, really, I don't know, still feel right to me. I still feel like you want to know that this is going to happen, this is going to happen, and this is going to happen, even if there's some fuzzy, non-deterministic thing happening here. Yeah, I agree with that. Uh

28:56

huh. I'm personally against the step-by-steps with LLM usage in general, mostly because you kind of want observability into that and a way to intercept those steps. As an example, I see the usage over time starting to become more reactive than

29:23

input-output. Especially once you start using agents more, or things that can communicate with each other, it becomes an observability thing, right? You end up having things that are watching what is happening and can help redirect if things start going wrong, or interrupt. And when you have that step-by-step process, it starts to make it harder to change that, because it's got the code flow and, like,

29:53

okay, well, you tell it to go to a different point in the code flow. Then it's like instruction jumping. Like, what are we, back to

29:59

C again? Right? Like assembly. I use Ruby so that I don't have to know assembly, right? I think Aaron Patterson is notoriously known for his joke, like, you know, I do assembly so you don't have to. And so I see the prompt usage kind of going in that direction too, right? People write natural language and it does stuff for them. That's

30:26

the whole driving force of all of this. You know, we had a product manager come in and make some changes to get better outputs, and then our end users see the result of that, and those are very specific tweaks that general people can make. I work at Doximity, which is like a social network for physicians, and they go on ChatGPT and use ChatGPT to solve problems

30:53

for themselves, right? They'll go and generate a denial appeal letter for some insurance provider, and then they just use it. So they use natural language to make that happen, and obviously they have some issues. There's a lot of hand-holding there, right? And so the whole point

31:14

is to help drive people to use the natural language. And then we as developers are there to introduce the advances of extending it, right? Like, okay, ChatGPT, all these models, yeah, they give you text. Well, because of that, you can have them generate any number of things, because we're coders. We can start pipelining and obfuscating that input and output that the LLMs give us to mutate it in different ways. And the users don't have to see

31:45

that, but they could still give it the same inputs that we as developers would give it. That's where I see, like, all right, we've got to get away from this step-by-step instruction. That's a programmer mentality, right? The design evolution is going to be really interesting to watch for developers, I think. I'm curious what your thoughts are on that aspect. Where do you see the design

32:15

driving force of a lot of this stuff moving? You know, my previous startup was actually a prototyping tool for voice apps, and this was right as Google Home and Alexa were coming out. We built a tool for prototyping out different conversation flows, and that

32:37

same thing, a step-by-step, back-and-forth kind of conversation. And one of the things that we realized is that conversation is just one modality. We got acquired by Adobe and built that into XD because we firmly believe that

33:00

you want multiple modalities for inputting information to a machine. Sometimes it's voice, because your hands are busy and you can't actually press a button, or you're cooking or something. But other times you're on the subway, and asking your phone for your bank account information is probably not a great idea, but pressing the buttons on your phone is probably fine. And so, all that being said,

33:30

I think where things are going, I don't think we're going to throw out the last however many decades of UX research that we have into interfacing with machines. There are going to be a lot of times where you have buttons and forms that people can understand and intuitively grasp. Can you imagine going to Facebook and having to write a SQL query or ask the Facebook bot to add somebody

33:59

as a friend, versus just clicking a friend button? So I think it's going to be a mix, even if behind the scenes there's going to be this non-deterministic natural language API, the way that we've been using it, but from a user standpoint, not requiring them to understand the right way to prompt. That makes sense. Yeah, that's interesting. I mean, I hate to say prompt engineering, but trying to get the right thing to do what you want is

34:36

still frustrating to me. It is. It reminds me of when I first started to learn how to code, to be honest, which is a little funny. It's so much fun to bring that excitement back, right? It really is. Yeah. And I don't know, are there other tools that you use for that prompt design aspect that you'd recommend to people? I'm pretty radical; I'm definitely kind of out there on the prompt design stuff. You know,

35:15

we've got an open source server for Blueprints. Sublayer is open source, so you can see the prompts that we're using. We're very minimal. Most of the prompt design for us, and maybe it's prompt design, is all about what code we're going to give it and making the problem easy. And so we haven't really used many tools, because there's

35:44

maybe ten lines of prompt that we rely on. But I think thinking about it like programmers is helpful. We've talked a little bit about this, about what makes code promptable: if you have to say, I need this input, this input, this input, then do this other thing, and then do that, it's probably bad code. It's also

36:13

probably not very promptable code. And taking that further, if you're asking the LLM to do this and this and this and that, it's probably going to screw up somewhere. So break those apart, the same way that you would much rather not have a thousand-line method. You break it out into individual pieces that you can individually test, that are individually reliable,

36:37

and then piece those together. Yeah, that's kind of my approach too. Sometimes I'll use the models against each other to build and iterate, which is helpful: like, how would you improve this? And then go to the other one and say, well, what do you think about that? How would you improve this?

36:57

They often will, you know, do better if you use a competing model against the other, which is a little funny, especially if you tell it, oh, you're in a competition to make the best prompt, right? Yeah, I mean, I haven't tried to take the deep dive to be like, this is a matter of life or death, right, and you can also win a million dollars, right? Like, just shove in all of the, you know,

37:30

prompt tricks, which is kind of funny. Those are going away; it's getting harder and harder to make those improvements work. But yeah, I don't know if they're doing that training, but yeah, a little funny. Yeah, that brings me to my next point. Uh, you know, Blueprints seems like it's ripe for fine-tuning, right? You have these pre-existing conditions and pre-existing formatting. Uh, you know, it seems like this would make a great flow

38:00

to fine-tune the models to give the outputs in more desirable formats. Do you even see that as being a meaningful improvement, or is it better to just pay to use better models for this one? I mean, my thinking right now is, you know, until we get to the, you know, the end of this curve of

38:25

you know, the costs dropping. I think they drop by the quarter every eighteen months or something like that, where you know, it keeps getting cheaper and new models come out, which I think until until that stops or slows

38:45

down, to really just keep going. And you know, I wrote a post a little while ago titled "Waste Inferences," which is basically kind of that call: you know, we can actually build very simple applications and do very simple things, even if, yeah, it costs a little bit more to, you know, use GPT-4o or Claude 3.5 today. But six months from now it's going to cost half as much, and then six months

39:15

from that it's going to cost half as much as that. And so if you over-optimize today, that's time you could be spending finding new patterns for when these things cost nothing. I don't know if you remember, I mean, I remember early on you used to get charged. Some people still get charged for bandwidth, but the entry tier for any kind of web hosting also had a bandwidth charge, and if somebody linked to an image on your site, you could go bankrupt.
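To put rough numbers on the "half as much every six months" claim, here is a small Ruby sketch. The halving period is the speaker's informal estimate, not a measured figure, and the two-cent starting price and the `projected_cost` helper are assumptions made only for illustration.

```ruby
# Projected per-call cost if prices halve every `halving_period_months`.
# Both the starting price and the halving period are illustrative assumptions.
def projected_cost(cost_today, months, halving_period_months: 6)
  cost_today * (0.5**(months.to_f / halving_period_months))
end

[0, 6, 12, 24].each do |months|
  puts format("month %2d: $%.5f per call", months, projected_cost(0.02, months))
end
```

Under these assumptions the cost falls to a quarter within a year and a sixteenth within two, which is the arithmetic behind not over-optimizing today.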

39:50

But I haven't seen anybody, you know, really talk about three. But the costs are so minuscule now that it's not even called out. Funny you mention that. I remember being on a call, there was some kind of, I don't know, presentation or something that a lot of

40:08

the OpenAI engineers were giving to a bunch of companies. I was lucky enough to join in, and, you know, a lot of people were asking, is fine-tuning worth it for these models, and how do you even get your pre-existing machine learning teams on board to start using this stuff, right? And what they were basically saying, bluntly, was that the models are getting so good that fine-tuning is really not going to be worth it over the

40:43

long run in general, and that basically it would be better for your machine learning team to get used to using the large language models than it would be to have them, you know, do machine learning for a recommendation system or something like that. Over time, it would be more advantageous to basically get your foot in the door with the models and work through prompting than it would be to get training data

41:16

on some kind of, you know, PyTorch or something like that for whatever you're trying to do, because over the long run you still have to maintain all that. You have to worry about the same things, like is it being effective, is it doing its job right, and then also improving those aspects of

41:35

it, whereas the models themselves are already improving. And you can make it, like you are, right, where you can get it to improve itself iteratively. So it's interesting to see. You know, I'm kind of torn on it though, because I have worked with some fine-tuning to get it to conform to very specific outputs, as an example, and it does perform much better for those very, you know, Obie's book is great

42:14

at describing the narrow path, which is the perfect example of how all of this works, really, right? The smaller and narrower the scope you can make the task, the better it performs. And, you know, the problem with it now is that GPT-4 is such a vast knowledge base, right, that it starts to perform poorly the greater the tasks that you ask of it, or the more

42:40

that you ask of it. So sure, it can start to distill and do a pretty good job most of the time, but the more you start to ask of it, as an example, you

42:50

know, which of these categories does this content fall into, right? It's going to make a lot of mistakes because you're asking it to match up too many things at once, right? But if you're just like, well, is it this category, yes or no, it's going to get it almost one hundred percent of the time, right? And so it's more thinking about those aspects of it, which is kind of funny because the pricing model is perfect for them, right? The more that you use it,

43:17

uh, you know, the more money they make. But also like you know, the the more that you need to use it, the times wise, right, starts to increase just by the nature of the design, which is kind of funny. I just realize I put a disclaimer on that

43:35

waste not being ab I that's funny. But I guess the one thing I would say on that, and even to your question, and, you know, maybe I could have said this a lot earlier, is that a lot of these things are so new, right, that a lot of what we're doing, there's a lot of, you know, papers coming out around a lot of these techniques. But I think everybody's just really, you know, they have the theories of

44:13

why these things work. They can get it to work, and that might work for them, it might work for this. There might be pieces of what people are doing that are right or wrong. I don't think anybody knows for sure, especially with how uncertain the future is. I think, you know, these opinions are mostly mine, from the way that

44:36

we're approaching it, and I could be completely wrong. I've seen successes and I have seen evidence that we're right about some things, but that doesn't mean that there aren't a whole bunch of other techniques and patterns and ways to

44:49

do it. And so that's, you know, it's also kind of terrifying, but it's also super exciting, right? It's so fun to, you know, expand the way that you're thinking about it when you see somebody try something that you wouldn't have thought to try, or that you tried and failed at, and then see them succeed with it, and, you know, expand your understanding of these things. So, you know, for anybody listening, if something interests you and, like,

45:22

you, it's super easy to try out. So don't let somebody saying this doesn't work or this does work stop you from just trying it, because it's usually, like, a two-cent API call to find out if there's something there in your idea. Yeah, for sure. There's never been a greater time to experiment, because you can just prove people wrong so easily, or prove yourself right. You know, it's very easy to

45:50

test that out, which is honestly great. With all the new tools that are out there, you know, perfect time to plug the Ruby AI Builders Discord. You know, people just drop in the wildest stuff and you're like, oh, I could do that. You know, definitely come join us. Yeah, it was a couple of weeks ago and somebody found that you could send it Base64-encoded information and it would be fine. And so that was a really fun thing

46:24

on a Friday night: how is it able to do this? That's awesome. Yeah, I wonder if you can skirt around some of the token issues with compression. I don't know, I wonder if it makes it any slower at responding. I haven't noticed. So I wanted to briefly touch on, you know, for those that don't know, you've started hosting what seems to be a repeat event in New York City, uh, the Ruby AI meetup. How did you get started with that?

47:06

Like, what prompted it? You know, I attended the last one. I thought it was really, really great to meet new Rubyists in the AI space, but even non-Rubyists were there, which was pretty cool to see. Yeah, how did you get involved in that? You know, I've got a backstory, of course. So, you know, one of the, we got asked a lot, you know, as we

47:30

were building this startup: why Ruby? And I am, as you can maybe see from some of the books here, a little bit of a software historian, and I really love digging deep into the reasons for, uh, why we do what we do. And I kind of realized that, you know, every time there's a new platform shift or a new kind of thing we can do with computers, or as things change, there's kind of a pendulum swing. And, you know, you kind of see object-oriented programming

48:02

and the GUI kind of coming out of Smalltalk, and, you know, that's more of the informal, kind of fuzzy reasoning approach, which then, you know, got more formalized through the eighties and nineties with C++ and Java. And then, you know, database-backed web applications became a thing, and we found that the dynamic languages,

48:28

Ruby, Python, JavaScript, made it possible to test out a whole bunch of ideas and find the patterns and find what it means to have these dynamic web applications. And then through the 2010s or so, it's become more formalized with TypeScript and Rust, and we're not really trying out all these different variations. We know what we want to build and we want that to be stable and scalable. But now with LLMs,

48:52

nobody knows, right? And so my thinking was, you know, I need a dynamic language that's going to get out of the way so I can test out an interface, a DSL, an idea super quickly. And I started thinking back to when I was getting into Ruby, how there was just this explosion of different gems, people trying things out. Like, why the lucky stiff was putting things out rapid-fire, and, you know, something like Camping comes

49:22

about, which inspires Sinatra. I don't know if this is direct, but this is my head canon: that Camping inspires Sinatra, which inspires Flask and Express, which is kind of what a lot of things are built on now. And we're kind of in that period now where we want to bring people together and, like what I said before, I don't know, you know, maybe I'm right about some things, I'm wrong about some things, but we need to get together

49:52

and share those ideas, bounce those ideas off each other, get into arguments, and show each other cool stuff that we're building. And so that's where the happy hour came about. It was like, I feel like Ruby is right for this. I feel like there is, uh, you know, from talking to people one on one, there

50:13

is definitely that feeling of, like, where did that magic go? So I was like, ah, let's see if we can bring some Rubyists together in New York and, uh, you know, test that theory out. And so we did, and I feel like it was a huge success. Like, you know, I got a chance

50:32

to meet you. We had people coming from all over the East Coast to come to it. And yeah, uh, you know, we've got sponsors for the next one coming up on July twenty-fourth: Test Double, Infield, FireHydrant. They saw the, you know, saw the excitement, saw the energy, and, uh, yeah, we're trying to make the next one even bigger, bringing more people together. I think, another head canon, I don't know if

51:02

this is exactly right, the story you hear about GitHub starting was that it, you know, happened at a booth at a sports bar after a meetup, where they were like, you know, we should try this. And I don't know, I wasn't there, I don't know the conversation, but that's the story you hear, and I wanted to try to, yeah, see if we can recreate some of that magic. Yeah, that's awesome. You know, I appreciate it. I know a lot

51:29

of people appreciate it. Like, yeah, that was the first Ruby event

51:32

I've been to in quite a while. And just because it was close and, you know, it reminded me of the unconference tracks, right, where, you know, they're just like, come hang out at a conference and not go to any talks, and it removes all of the barriers to meeting people, right? Which is kind of funny, because you pay for the conference so that you can listen to the talks, but then they end up having, you know,

52:02

it ends up being like lightning talks, but side tracks, right, which is kind of funny. But it has that kind of vibe to it, right, where people are just talking about all of their excitement, which is, like, all this stuff is so exciting, and where do you start, and what are people working on? Like,

52:20

you know, you don't know until, like, that. The meetup was definitely the culmination of everybody just being like, here's what we're working on, this is exciting, you know, and it was super cool to see. So I'm looking forward to going again. Yeah, yeah. Actually, I guess I can, I don't have access to the comments, but I can send you the link to the event to share with everybody. Oh, awesome. Yep, we'll drop it in here. Yeah, come join us.

52:54

It was so much fun and yeah, like you said people travel for it. You know, it's really cool to see. Yeah, it's funny. You know you don't realize like how yeah, how tough the last few years have been being you know, remote and hybrid that like you don't really get a chance to like as much, I guess anymore, just like meet and hang out. And yeah, got a lot of feedback of just like like that same you know, your same comment, like I haven't been to a

53:30

Ruby event in a long time. I haven't, you know, seen this many people at a Ruby event for years. There's definitely, you know, yeah, something we've lost by going mostly remote and hybrid that being in person kind of brings back, for me anyway. Yeah, yeah, I remember my first RailsConf. Was it in Maryland? And, uh, yeah, it's definitely even intimidating in a larger setting, and having smaller groups, you know, it definitely is easier to chat with people and

54:07

socialize, I think. But, you know, one thing, I think we had this moment when we first met. It was like, I know your name, but I know you from the avatar on Discord, right? Right, uh, bridge that gap a little bit, yeah. I mean, to be honest, if extras had, like, a, you know, print your avatar, I feel like it would make, you know, these social connections easier to see. Uh, just slap that on. That's funny. So,

54:44

where are you going next with this? Where's Sublayer heading? Where's Blueprints? Where do you see your next phase in all of this? I think the next thing, and, you know, with the topic of this being Ruby as a sleeping giant for AI application development, I think it's, you know, we've kind of laid the foundation here with the framework. I was trying to get the new version out last night.

55:10

But I had a couple of things I had to do. Lay the foundation with the framework, lay the foundation with Blueprints, and then really show and spread the word of how easy it is to build these LLM-powered applications. That changes a lot of, you know, changes the way we think about what's possible, right? I think it's still hard to go from a place where, you know, for a project in 2018 even, you're like,

55:50

okay, that'll be about a team of like fifteen people. It'll take you know, a year or two to do, turning into like a ten cent

55:57

API call, like that. In my mind, it's wild right now. I'm sure you've experienced it with some of the things these models can do. And so really, our next steps are bringing more and more of that to, uh, the forefront: what are all of these things that you can do that take you half a day to automate, where previously you probably just ignored it because it was so costly and so boring

56:30

and mind numbing. And so building out the framework more bringing bringing more use cases to the forefront, and then you know, expanding blueprints to make that even faster so that more and more of these examples that you have, the

56:45

more things they can do and more things they can generate. And then we're trying to experiment with the interactive docs piece, getting ready to roll that out, making it possible for anybody with API docs to, uh, have what we have in our docs, and see if there's interest there. So even if you're not on the forefront of trying to turn million-dollar projects into ten-cent API calls, you can still see some benefit today.

57:16

You know. What I would love to see is uh, just being able to give like a GitHub link to like lines or something like that and say, like do something with this. I think that could make a really cool like example of hey, like here are the lines to some code, like you know, change this or something or make it say something silly. I don't know, Yeah, I don't know. Yeah, like there's but there's

57:43

so much stuff you could do like that, Like it's so easy. Yeah, I would love to see more, like more ease of use, right, Like it's it's already easy, but like still people don't you know know where to start, uh, which is which is a little bizarre at the same time, right, like, uh, it is as easy as just

58:01

typing, right? I hate to distill it down to just that, but, uh, you know, at this point you can just go to, you know, Microsoft dot com and use ChatGPT for free, you know? It is that easy to get started, which, unfortunately, you've got to buy into that, right? Right. Yeah, I mean, I think that's the thing, and, you know, one of the things that I've been trying to figure out is how to, I've been trying

58:32

with that example of, like, you know, you're already doing RAG: RAG is just you googling for the docs, finding the docs, and using

58:40

it in your task. I feel like there are a lot of very big, scary technical terms where, you know, when you're getting ready to do a side project, it's like, I'm going to have to read all these docs to figure out what these things mean, when what they really mean is, like, put an example in the prompt. Or in-context learning: it's just, give it some examples in the prompt, right? Right, and it figures out how to do it like that.
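The "RAG is just you googling for the docs and using them" description can be sketched in plain Ruby. This is a toy illustration under loud assumptions: `DOCS` is a made-up in-memory lookup table, `retrieve` uses naive word overlap rather than embeddings or a vector store, and no model is actually called.

```ruby
# Hypothetical in-memory "docs"; a real app might search files or a vector store.
DOCS = {
  "Array#each_slice" => "each_slice(n) yields consecutive slices of n elements.",
  "String#squeeze"   => "squeeze returns a copy with runs of the same character collapsed.",
  "Hash#dig"         => "dig(key, ...) walks nested hashes and arrays, returning nil if a step is missing."
}.freeze

# "Googling for the docs": rank entries by how many question words they share.
def retrieve(question, docs, limit: 1)
  words = question.downcase.scan(/\w+/)
  docs.max_by(limit) do |title, body|
    (words & "#{title} #{body}".downcase.scan(/\w+/)).size
  end
end

# "Using it in your task": paste what was found into the prompt.
def build_rag_prompt(question, docs)
  context = retrieve(question, docs).map { |title, body| "#{title}: #{body}" }.join("\n")
  "Use these docs to answer.\n\nDocs:\n#{context}\n\nQuestion: #{question}"
end

puts build_rag_prompt("How do I dig into nested hashes safely?", DOCS)
```

The retrieval step could be swapped for anything smarter later; the shape stays the same: find relevant text, then put it in the prompt ahead of the question.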

59:12

I think that simplification is what I really want to try to find, or work with people to help find, because I've been talking about this with Rails, where there used to be, like, oh, you have to know all this stuff about database normalization and third normal form and this and that, but it really boils down to, oh, you mean customers have many purchases? Got it. Right? So I think that is one of the things that needs to happen

59:44

next. So just the easier description, like, distill this down. There are, you know, very deep explanations to take it further and make it more and more powerful, but to get started, it's just like, you mean you just copy and paste that into it? Right. Yeah, the dummies' guide to prompting, you know. But to be honest, I never understood why they came up with the phrases they did, like, you know, zero-shot or few-shot.

01:00:15

Like, what does all that mean? It's almost nonsensical to say that when you just mean examples, right, or no examples?
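In code, the zero-shot versus few-shot distinction comes down to whether the prompt carries worked examples. A minimal sketch, where `prompt_with_examples` and the sample sentiment data are hypothetical and only the prompt text is built:

```ruby
# Build a prompt with zero or more worked examples.
# examples: array of [input, output] pairs; an empty array means "zero-shot".
def prompt_with_examples(instruction, input, examples: [])
  shots = examples.map { |ex_in, ex_out| "Input: #{ex_in}\nOutput: #{ex_out}" }
  [instruction, *shots, "Input: #{input}\nOutput:"].join("\n\n")
end

instruction = "Classify the sentiment as positive or negative."

# Zero-shot: no examples, just the instruction and the input.
zero_shot = prompt_with_examples(instruction, "I love this gem")

# Few-shot: the same call with a couple of examples pasted in.
few_shot = prompt_with_examples(
  instruction,
  "I love this gem",
  examples: [["Great docs", "positive"], ["It crashed twice", "negative"]]
)

puts zero_shot
puts "---"
puts few_shot
```

The only difference between the two prompts is the `examples:` argument, which matches the point that the terms just mean "examples" or "no examples".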

01:00:23

Yeah, I think that's one of the things that, you know, Andrei and I have been talking about, Andrei from Langchain.rb, is that, you know, a lot of this stuff is coming out of research and coming out of academia, and those terms do matter for those domains. But from, you know, an application engineering side of things, which is more, you know,

01:00:53

applied, we can have our own descriptions of these and why they work that are separate from, you know, the science and the theory, and more about the practice. Right. What was it? In theory, there's no difference between theory and practice, but in practice there is. Well, is there anything else you wanted to cover today before we jump into the picks here?

01:01:21

Yeah. Well, you know, if anybody listening is going to Madison Ruby, I'm going to be giving a talk, uh, going more into using LLMs in, uh, a little bit of the philosophy on, you

01:01:37

know, how to deal with LLM-generated code. The talk's called "Going Postel." You know, I won't give any spoilers away, but, you know, it leans very heavily on Postel's law, which Obie talks about in his book: building your systems so that they are liberal in what they accept and conservative in what they send, uh, just a little bit more liberal in what you accept than what you're used to. And then, you know, there's a lot

01:02:13

of things that Ruby can do easily. Any language can do anything, right, but the metaprogramming things that you can do in Ruby to change your application at runtime, uh, I think are underexplored, so I'll be going into that a lot in the talk. That's awesome. Yeah, I share your sentiment there. I've definitely tried to have the LLM generate Ruby methods and run them and then reuse them. It's a lot of fun, also very dangerous. But yeah, definitely underexplored

01:02:54

for sure. So that's cool. Yeah, people, go check out Madison Ruby if you're not going. Uh, I'm excited to watch that talk remotely at least. Yeah, hear what you've got. So if people want to reach out to you or find you on the web, you know, where can they do that? Sublayer dot com, uh, primarily. Scott at Sublayer dot com if you'd like to email me. Uh, at Scott Wernerd, or, at Twitter, Scott

01:03:30

Wernerd at Twitter, and let's see. You know, I'm also, like we talked about, in the Ruby AI Builders Discord, uh, you know, sharing. The big discussions right now are on that ARC-AGI challenge, which we've all been kind of racking our brains about. And then, you know, the link here on the

01:03:58

Sublayer site: we have access to our Discord, which is a little bit more, you know, about our releases and maybe some less Ruby-specific stuff, and more the thinking on how do we think about these AI tools and use them for building applications just in general. Yeah, I've been enjoying that channel as well. It's super interesting, the stuff that gets posted, and you would think it would get shared in more places, to be honest. A lot of the,

01:04:33

yeah, a lot of the papers that get shared. Maybe people just aren't reading as much as I am, I don't know. I don't think so, but it feels like an unlimited amount of stuff to read. There's so much to read. Wake up every morning, where did I go? Yeah, I feel like the paper I read last week is already, you know, dated. Like, how is that possible? All right, well, let's jump into the picks. We've been talking about so much great

01:05:06

stuff. Picks is our segment where we just pick literally anything. It could be code if you want it to be. It doesn't have to be, but that's typically what I pick, and if you need a minute, I can go first. So I've decided to fine-tune a large language model for Ruby, mostly just for the experiment more than anything, and I'm calling it Ruby Lang dot

01:05:39

AI. I'm going to start building it in the open and just learning how all this stuff works and how it might be used to source all of the awesome open source Ruby that's out there that is designed incredibly, and make use of that and take advantage of it to make a new language model that is very Ruby-centric. Uh, so I'm going to see how it goes. I'm hoping that I'm successful in it, but I may not be. There may just be a bunch of lessons learned, but we'll see. You can follow my

01:06:15

progress at Ruby Lang dot AI. Yeah, I mean, I saw that you posted about it either yesterday or the day before, and yeah, excited to see where that goes. Yeah. You know, I've got this stupid massive GPU server at home and I was just using it for inference, and I'm like, uh, you know, it was just wasting away over here running inference, and so I thought, I should fine-tune something, and this seems like a perfect use case. So here we go. That's great.

01:06:51

Yeah, so I guess I've got two things. One, you know, I talked a little bit about the formal-informal pendulum swing. You know, one of the places where I got that from was an Avdi Grimm talk, The Soul of Software, from, uh, it was a long time ago at this point, but there are

01:07:15

a couple versions of it up on YouTube. He goes into more of the kind of the informal mindset and how there are those two splits of software and how neither one is particularly absolutely right, but more it's each each one gives

01:07:39

you different things depending on what you need. So that was very impactful for me, especially now, given the, you know, the approach we're taking. And the other thing: I read a lot of Substacks, and there's one called Strange Loop Canon, which had a post a little while back about what LLMs can't do, which, you know, is also a trap a lot of the time, because as soon as you say it's not possible to do this, everybody on

01:08:06

Twitter tries to prove you wrong. And in his post he had one about how it can't do Conway's Game of Life, and I think within a week somebody had proven him wrong. He had tried fine-tuning, all these different things, couldn't figure it out, and the collective wisdom of Twitter came up with a solution, I think. But then he has one recently that's very good. I highly recommend the Substack in general. That one is very good, and the latest one, Seeing Like a Network,

01:08:39

was very, very good as well. That's awesome. Well, I appreciate you coming on, Scott, and talking about all this code generation stuff and the love of Ruby, and, you know, it definitely is a sleeping giant. And I think we're just going to see more and more of why that is. And, you know, we'll have to have you on again after you, you know,

01:09:09

start creating many companies that monetize all of this Ruby code generation. Uh, you know, I could definitely foresee a future where you're just like, oh, you know, create a Rails app that does this, and it just does it. It seems easy enough, simple. Yeah. No, thank you for having me. It's been a lot of fun. It's been awesome, uh, you know, chatting and catching up, and I'm excited to

01:09:33

see you in person in a month. Yeah, totally. All right, well, until next time, folks, I'm out here, and come visit us next time.

Transcript source: Provided by creator in RSS feed
