Streamlining AI Integration - JSJ 616 | JavaScript Jabber podcast

00:06

Hey, welcome back to another episode of JavaScript Jabber. This week, on our panel, we have Dan Shappir. Hey, from nice, nice weather in Tel Aviv. I'm Charles Max Wood from Top End Devs. It actually snowed here over the weekend, so we finally have a few inches of snow outside just here. No, we're wearing a T-shirt. Yeah, well, we go skiing. But yeah, anyway, we have a special guest this week. It's Ismael. I didn't get your last name on here, but

00:41

oh, here we go. Oh, that's a hard one. You're gonna have a hard time with it. Pelaseyed. That is perfect. That is perfect. So, Ismael, do you want to introduce yourself real quick? Let people know who you are and why we're excited to have you here. Sure, sure. So, as Charles said, my name is Ismael. I am the co-founder of an open source framework called Superagent.

01:08

Superagent is a framework which allows any developer, regardless of their skill set, to create and integrate AI assistants into whatever type of application, stack or environment

01:25

that they are using to build their apps. So, as you guys know, AI can be pretty complex, and there aren't that many machine learning engineers out there. There are actually very few, and Superagent allows any developer to basically become a machine learning engineer and leverage all of the fantastic technology that's being developed every

01:49

day without having to actually know that much. So we abstract away all of the tiny machine learning pieces that need to fit together in order to create an accurate, production-ready application, so that the developers can, you know, focus on their users, focus on creating amazing user experiences, and you

02:16

know, tap into the vast possibilities of AI. That's so cool. And I have to say that a few years ago we started a show on machine learning called Adventures in Machine Learning, which is much more on the engineering side of building models and stuff like that, and I thought, oh, this is stuff that I really want to pick up. But I figured out that I am much more interested in building things with the engines that other people built

02:46

than I am in, oh, I've got to wrangle this data too and somehow make it jive with this other data so that I can get a model that gives me the answer I want. So anyway, this is cool stuff, and this is something I've wanted to talk about for a while. Cool. I mean, originally, I came from an open source background, so I've been a contributor to many of,

03:08

you know, the machine learning frameworks out there that people use. And one of the things that frustrated me a lot was that, you know, regular developers were having a really hard time using these frameworks because it required so much, you know, knowledge of, you know, deploying this stuff and running it in production, and how to get it accurate, how to get the responses you want, how to attach, you know, data to your AI model,

03:38

how to have that model interact with, you know, third party APIs, everything like that. It was just a mess. So I just decided one day, I think it was in May, and my co-founder was in Vegas playing poker and I had a week off, and I just said, fuck it, man, let's sit down and do this. And so I took a week and I just coded everything, like the first version in a week. And when it was ready, I was like, this feels good. You know, perhaps other people would want to use this.

04:13

So I just decided to slap an MIT license on top of that and open source it. And you know, the feedback I got was amazing, and it just blew up, and now we are taking it basically to the next level. So that's how it started. It wasn't meant to be, but it was meant to be, so to speak. I love those weekend projects, the best things in life, you know. I'm trying to decide if we should start with the chatbot side or the AI engine,

04:51

LLM or whatever side. I think the best thing to start with actually would be... First of all, I think it's worthwhile mentioning that before we started recording, you had a bit of interesting news to tell us about the project, and I think it's worthwhile to highlight that up front. Yeah, let's do that. So I'm happy to announce, you know, that Superagent will be a part of the Y Combinator Winter twenty four batch, which is

05:20

like an amazing thing, one could say a dream come true. I think it was like twenty four thousand applicants or something, and you know, we were chosen as one of I don't know how many, but not that many. So it's an amazing thing. And I'm situated in Sweden. I've lived here almost all my life and it's a lot of snow here as well as you might imagine. And I will be relocating to the States and San Francisco

05:54

on Thursday. So I'm packing. I'm packing, and I just found out that my passport is going to expire in two months, so I have to fix that as well prior to going. But that's on my side. So, really excited about that. You know, one of the great things with Y Combinator is that you can focus a lot on building something that's actually viable and can become something big and viable in the

06:23

future. And I'm really excited about that. You know, sitting here all the way, you know, across the Atlantic basically and only chatting with people on our discord channel or having like the zoom calls is great. But you know, I think that being where, you know, in the mosh pit

06:45

of AI in person would be even greater. And I could, you know, achieve more and talk to people who do this stuff all day, all of the developers and machine learning engineers and stuff that are working on this and basically, you know, building the future. That's my dream, you know, to talk to those guys and learn and adapt and try to incorporate

07:15

whatever works into our framework. So, oh yeah, so obviously congratulations and kudos. And I wanted to add that one of the great things about being part of something like Y Combinator is that it's definitely not dumb money. It's as smart as you can get in that regard.

07:35

So it opens a whole lot of doors. And it's not just getting, you know, the funding, it's also being part of the program, you know, really pushing your project forward in terms of recognition and adoption and stuff like that. So I'm really excited for you guys. And as I said, kudos. And the second thing that I think is a good starting point is to kind of

08:05

understand what Superagent might be used for. So if you can give a concrete example of one or two or maybe even three things that, you know, Superagent makes easy to do that would otherwise be, you know, much more challenging. Yeah, sure, I'll give you two examples. I'll give you a personal example and then I'll give you like an enterprise example as well. So personally, I use Superagent every day. And as

08:46

any, like, open source maintainer will tell you, the community is everything. When you're building an open source project, the community is everything. And that's one of the main reasons why you open source. You want feedback, you want contributions, you want a bunch of stuff from the target audience that you're trying to, you know, build a product or a framework or whatever it

09:11

is for. And so when you come to when you get to a certain stage and when you have a critical amount of contributions being made, it's like having fifty employees. It takes a lot of time to go through contributions, to talk to users to see what developers want, how we should build whatever they want, and what ideas they have, and what code they've written,

09:37

and so all of that is really time consuming. And one of the most time consuming things is doing code reviews, so reviewing code from other people who are not actually a part of your team, they're just open source contributors. And I would estimate that it takes me around ten to fifteen hours a week to only do that, to review other people's code. So what I did was

10:03

that I set up an assistant, as we call it. An assistant is basically an agent or a language model that has a bunch of tools connected to it and can be trained on different data. So I trained my assistant, which I call Shuriken. It's a Japanese name because Superagent has this ninja thing and a Japanese aura around it. Everything we do we call it,

10:28

you know, Japanese names basically. So Shuriken, she's an assistant that I built using Superagent, and her sole purpose is to do code reviews for contributions that we get from open source, you know, people, and she does that so well that even the contributors don't know that it's an assistant. It's an AI assistant. So it has passed the Turing test multiple times, I would say. And the way we do that is that basically it's super simple.

11:03

With a couple of clicks in our UI or with code in our SDK, you can set up an assistant and attach as many files or data sources as you want to that assistant, and that assistant will learn all of that data and you can instruct it to do specific tasks. So in this case, I've taught her everything there is to know about our codebase, and so when a contribution comes in, she knows exactly how our codebase

11:33

is done. She knows how we like to write code, she knows all of our rules, all of our internal setup, and she can give feedback to contributors on what they could do better or what they should change in their code review, and then the developers do that and then I can go in and merge that in. That's just one simple example of what you can build. So if you would generalize that, that's like building a team member. So to answer your question, Dan, one of the most exciting use cases

12:07

I see is to augment your team and increase the productivity of your team. That's one of the main, I think, you know, unique selling points that AI has, which I believe will be a huge thing in the future, even more prominent than it might be today. So that's one. Yeah, yeah. So before we move to the other one, a few questions about that.

12:35

So first of all, what model do you use for Shuriken? So in the Shuriken case, we use an OpenAI model, the GPT three point five model, that is fine-tuned on our codebase basically. So that's a proprietary model, we use an OpenAI proprietary model for that specifically. So the starting point was OpenAI three point five, and then you further trained it on your own codebase. That's correct, yes, fine-tuned

13:09

it on our own codebase. And because this is definitely not my area of expertise, I would imagine there's some sort of a lower limit on, you know, how small your codebase could be before you would actually get some value after training on that specific codebase. So yes and no. You know, generally I can say that having more data doesn't necessarily amount to a better performing

13:46

model. Uh, it's the quality of the data that's important. So you can actually have quite small training data sets, but if the quality is high, and the quality is of course depending on what it is the assistant or language model is supposed to solve. But depending on the task and depending on the data quality, you can actually get away with just a couple of thousand you know, rows in a spreadsheet to train your model to be very effective

14:18

on a specific task. So I would say quality is, you know, a better thing to focus on when it comes to data sets than quantity, and that has also been proven by open source model developers that have, you know, basically generated synthetic data, small synthetic data sets that have a really high quality, and have been able to train models that are as good as GPT three

14:50

point five with less data, less compute, less money. So quality is super important when you do that, and there are a bunch of you know, bunch of stuff that you need to learn before you can train your model. Of course, how should the data, what should the data look like, what type of data for what type of task, et cetera, et cetera. That's all of the questions that regular developers ask themselves every day, but they don't have any you know, they don't have any prior experience working

15:24

in this field. So that's also one of the reasons why we decided to build a framework for the mainstream regular developer and abstract away all of this stuff that we are discussing now, just abstract that away and make it as easy as uploading a file. Like, okay, you want your assistant to be trained on ten PDFs? Cool, upload them and we'll do the training in the background. You won't notice. We'll send you an email when

15:50

it's done and you can use your assistant. That's it. So in your case, you kind of pointed it at a GitHub repo and that's how you trained it? Exactly, at multiple repositories. We have multiple repositories, so pointing it to those multiple repositories, and then we have a bunch of technology that, you know, extracts the necessary data chunks, it splits it and fine-tunes

16:19

a model for that specific use case. So one thing that I'm curious about with this is, what you're talking about, at least to me, sounds a lot like continuous integration or, you know, some of the steps that people put into their GitHub Actions, where it basically says, you know, all the tests pass, you passed the linter, and hey, you know, your style matches our style, or hey, will you

16:52

accept, you know, the automatic cleanups that come out of the linter and stuff like that. So I'm a little curious as to what Shuriken gives you that you don't get from, you know, just kind of a standard, what do you call it, a standard CI setup? Yeah. So CI usually looks at syntax and, you know, code conventions like linting. You know, it's basically a code convention. You might write sloppy code, but it will pass the linter. So that's the thing with Shuriken. A

17:30

linter will catch bugs. It will catch, you know, could be type errors or something like that, you know, but it won't catch sloppy code. If it works, even if you build it, it won't catch that because it will build. It won't throw any errors. So what Shuriken does is that she actually looks at the code itself and, you know, gives feedback on the actual code, like you should not write this function like this, write it like this, or use this package instead of that package.

18:03

You know, that kind of feedback, the ocular feedback that a code reviewer usually gives his or her employees. That's what Shuriken can do in our repository. So I'll actually push on that. Can you give a really super concrete example of the type of feedback that you might get? I can show you if you want to, if that's possible. Well, you can, and that's nice, but we are primarily audio. Most of our audience is, you know, just

18:41

listening. So it would be better if you just describe it as best you can. Yeah, sure. So let's say that you are writing a piece of code that should, you know, do a specific thing, and the code works. The code that you wrote works, and it follows the code conventions and it's passing all of the tests and all of that stuff that we, you know, run on the codebase. But it might not be optimized, it might be really slow, it might be poorly

19:17

written, so to speak, and that happens all of the time. So what Shuriken does is that she makes sure that the code not only, you know, follows our code conventions and the linters and all of that stuff, but she actually makes sure that the code is as effective and, you know, fast and proficient as it could be by giving you small tips, hints, pointers on what you could do better when writing that specific piece of code.
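Editor's illustration of the point above: a hypothetical pair of functions, not taken from the Superagent repository or from Shuriken's actual comments. Both versions would pass a linter and the same tests, but the first rescans the list for every element while the second does a single pass. This is the kind of difference a reviewer might flag and a CI check typically would not.

```python
from typing import Hashable, List, Sequence, Set


def find_duplicates_slow(items: Sequence[Hashable]) -> List[Hashable]:
    """Works and lints cleanly, but is O(n^2): each item rescans the rest."""
    duplicates: List[Hashable] = []
    for index, item in enumerate(items):
        # Slicing and scanning the remainder of the list on every iteration.
        if item in items[index + 1:] and item not in duplicates:
            duplicates.append(item)
    return duplicates


def find_duplicates_fast(items: Sequence[Hashable]) -> List[Hashable]:
    """Same set of duplicates in a single pass, using sets for O(1) lookups."""
    seen: Set[Hashable] = set()
    reported: Set[Hashable] = set()
    duplicates: List[Hashable] = []
    for item in items:
        if item in seen and item not in reported:
            duplicates.append(item)
            reported.add(item)
        seen.add(item)
    return duplicates
```

The two functions return the same duplicates, although the order in which they are reported can differ, so a comparison between them should sort the results first.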

19:51

Let's say it's a function for logging in. You can write log in functions, you know, in a hundred different ways, but usually there's only one way that's efficient, effective and proven to be like the way to do it. She will if you haven't written it in that way, she can you know, give you feedback that that's the way you should write it, please make this necessary change, and then you know, the developer goes in and does that, and then I can feel confident that, Okay, someone has

20:23

looked at this and I can merge it into the main branch. Basically, so very specific feedback on the actual code, which is something a human being does today, is done by an AI assistant that I built in like three minutes in our repository. And for me, you know, when I look at it, I'm amazed that this actually works. I'm amazed by it. Sometimes, you know, I get emails when Shuriken mentions me in a comment on GitHub, and I get blown away, like this

21:00

is not actually a human being. This is just AI that's trained to do

21:03

a specific thing. But in that context, in that small, you know, narrow vertical context, she does an amazing job, as well as any human might be able to do it. Is there a GitHub repo for Shuriken that we can look at? Yeah, so the Superagent repository that you posted here in the chat, if you go to the issues tab, or sorry, the pull requests tab, which is where people, you know, contribute code, and you open up a pull request, you will see that there is

21:41

a contributor that's named Shuriken Bites, and she basically gives comments. Depending on what type of code you have written, she'll give you different types of comments. If she doesn't find anything weird, she'll just say, good job, you know, good job Charles, thank you for the contribution, thumbs up, love, and that's that. Gotcha.

22:07

But what I'm wondering about is, you know, is there like a walkthrough to set up something like this? Yeah, so I have a YouTube channel actually that has a bunch of videos and stuff where I set up a bunch of different types of assistants, and she's one of them. And actually the code is open source, so anyone can grab Shuriken and make her their own code reviewer if they want to. Does she also, like, okay, have you seen her occasionally

22:37

block pull requests, like not approve pull requests? No. I trained her specifically to only give feedback and not block anything, because, you know, I wanted to try it out. I didn't want it to be so definitive. So she only gives feedback. You know, it's up to the user or the contributor to do the change, and up to me to make sure that the code, you know, works still when I merge it in. But she only

23:07

she tries to be good vibes. I can say as much as that. So no blocking, no, you know, this is bad. So she's trained to be humble, nice and give, you know, feedback. So it would be amusing to train her on Linus Torvalds' responses. Yeah, yeah, yeah, yeah. And that's actually a good, you know,

23:34

that's actually a good use case. If you want the language model to act as a specific individual, if you have data on that specific individual, Uh, you know, you could train the model to to try to resemble the way like Linus would would answer a code review. I'm guessing it wouldn't be

23:57

as much of a good vibes that I have sued. So basically what you're saying is if you use the entire Linux kernel codebase as the training data set, you will get very snippy comments whenever anybody tries to do a pull request. You could get that, or yeah, you could absolutely do that. And I know that there are companies out there doing exactly this, training models

24:25

to resemble you know, Arnold Schwarzenegger or whatever it might be. Right, So, right, you go pull Elon Musk's tweets and say this is how you're supposed to respond. Right, That would just that would just be so awesome. What is interesting to see? Yeah, to see stylistically, like, okay, go pull this person's tweets, right, as long as it's not like a corporately handled right, they don't have pr people running their account,

24:53

right, it's them straight into the phone. Yeah. I recall like a few years ago, even before the whole machine learning craze, I think Microsoft or somebody tried to release some sort of a chatbot into Twitter, and she became a pornographic neo-Nazi within like two days or something like that. Yeah, and they had to pull her off. They got so much heat that it basically killed the whole AI thing due to the heat they got for that. So that's the downside of

25:32

it, could be could be the downside. But I'm a positive guy. I don't like you know, Yeah, I don't like to be like a doomer on this kind of stuff. I believe that it's good for humanity.

25:41

I believe that you know, it's a must. It's a must. Well, and I think you can intelligently because I don't know what parameters they put onto that Twitter bot from Microsoft. They kind of said, well, we just turned it on all of Twitter, right, And yeah, I think it probably fed off of a bunch of other bots, right that in some corner of Twitter that I just never see. But I don't know. But this is interesting because you you can specifically pick the data set that you want

26:17

to emulate and then do it that way. Right. So you said that you have two examples. You gave us one, right, So the second one is is more of a you know, enterprise use case. And so if you think of a regular company, let's say a you know, a producing company or a company that is not a tech company, just a regular company, you know, And so they have a bunch of data. You know, they have a bunch of data in their management systems. They have

26:49

a bunch of files, they have a bunch of PDFs, Excel sheets. They have a sales or CRM system, they have a help desk system, they have a bunch of data about their users. They have a bunch of other types of data, and there are a lot of people working with extracting knowledge from this data. So if you take like a marketing manager as an example, one of the key tasks of that marketing manager is to look at different data and try to figure out what works and what doesn't work.

27:25

That's one of the jobs that they have to do in order to be successful. So the second use case is that you can take like in my case, I took our GitHub repositories, but in an enterprise use case, you could take all of your enterprise data and do the exact same thing. You can feed it to the model. You can train the model on that data, and you can start asking questions about your data. You can ask questions like which you know, what are my top five best customers, and

28:00

it will give you an answer within seconds. You can ask it to plot charts on sales and it will plot a chart on your sales for, you know, whatever time period that you might have in mind. Usually how this works is that a person at a company wants some data. They go to another person at the company that is tasked with, like, extracting

28:23

that data, a so-called analyst. And so what you can do is that you can increase productivity for all your employees by just training a model on your data and making it available to whoever should have access to that data, so they can ask questions without having to go to an analyst or without having to wait one week to get an answer. You can just ask and the

28:52

model will answer that question. And if you think about how much data a company of a thousand employees has, which is not a big company, more like a medium-sized company, and how much knowledge they can extract using this technology by just training a model on it. So that's the second use case that we see a lot of. Basically, that's the use case that people use Superagent for. Yeah. So two

29:25

questions about that really. Question number one: as we've seen with general machine learning models, ChatGPT and whatnot, occasionally they quote unquote lie, they provide misinformation, and they do it in a way that seems really reasonable and self-assured, which can really lead people down the wrong path. So my first question in that context is when you, let's say, train the agent on your internal corporate data, how likely is it that when you ask the

30:08

question, you'll potentially get misinformation? Yeah, so it is likely, and we call that hallucination. So the model can hallucinate, you know, and give a false answer to any query or question. Now the question is how do you solve that problem? Because if you think about it, if I ask you something, Dan, if we were working at the same company and I would ask you a question, how do I know

30:37

that you are correct? The only way to know that you are correct is to see the sources from which you derived the answer to whatever question I ask you. If I ask you how many customers do we have today and you just come up with a number, you know, that could be wrong. So usually what you do is that you make a PowerPoint presentation and you show the actual data, and then you answer that question by

31:11

showing the underlying data. And so what Superagent does is the exact same thing. Instead of just, you know, blindly accepting a text that is generated by a model, we also, you know, make the underlying data transparent. So if you ask a question about a contract, not only do we answer that question, we also show you that specific contract. Not only the specific contract but also the section that was used to answer

31:45

the question. So you can ocularly look at the source, look at the answer, and, you know, make up your mind if this, you know, makes sense or not. Either way, you have the possibility to rate the response. So if it's a good response, you give it a thumbs up. If it's a bad response, you give it a thumbs down. Every time you give it a signal, our engine fine-tunes the model on those signals so that the responses get better and better over time.
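Editor's illustration: a minimal sketch of the idea described above, returning an answer together with the document section it was drawn from and recording a thumbs up or thumbs down on it. Every name here (`Section`, `Answer`, `answer_with_sources`, `rate`) is hypothetical and is not Superagent's API. A naive word-overlap match stands in for the language model and retrieval step, and the fine-tuning on feedback signals is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Section:
    document: str  # e.g. a file name
    heading: str   # the section within that document
    text: str


@dataclass
class Answer:
    text: str
    sources: List[Section]                            # shown next to the answer
    ratings: List[int] = field(default_factory=list)  # +1 thumbs up, -1 thumbs down


def _words(text: str) -> Set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {word.strip(".,?!").lower() for word in text.split()}


def answer_with_sources(question: str, sections: List[Section]) -> Answer:
    """Pick the section sharing the most words with the question and cite it."""
    query = _words(question)
    ranked = sorted(sections, key=lambda s: len(query & _words(s.text)), reverse=True)
    if not ranked or not (query & _words(ranked[0].text)):
        return Answer(text="No supporting section found.", sources=[])
    best = ranked[0]
    return Answer(text=best.text, sources=[best])


def rate(answer: Answer, thumbs_up: bool) -> None:
    """Record a feedback signal; a real system could later train on these."""
    answer.ratings.append(1 if thumbs_up else -1)
```

The point of the design is that the `sources` list travels with the answer, so a reader can check the cited section before trusting the text.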

32:22

But I think the main thing is we are all used to using, like I am at least, ChatGPT. One of the issues with ChatGPT is that it doesn't actually show you the underlying source. So if you could visualize that, you know, with charts, with data, with tables, with deep links to documents, then that's a completely different game. And that's the way humans are used to communicating data and stuff to each other. Like if you go on Twitter right now, somebody writes something, how do

32:57

you know it's true? You don't until you actually check out the data or the source. And that's what we are trying to visualize, not only like the answer from the model, but also the underlying data that is used to answer that question by the model. Cool. That's that's really insightful. And my second question in this context is how do you provide or what safeguard do

33:27

you provide against inadvertently leaking private or secure data? I mean, if I train a model on something, it's quite possible that somebody asks a question that will reveal data that, you know, I don't want revealed. Right, and so this is a big question. And this is the primary reason why Superagent is open source. So when it's open source, it allows you to do two things. The first thing is that you can deploy Superagent to your

34:01

own infrastructure, completely isolated in your own environment. Nobody else has access to your data or anything like that. The second thing is that we allow you to run Superagent with your own language models. Usually these language models are open source models like Llama 2 that you've deployed to your own environment, or Mistral, which is a new model which is very effective, which you have deployed to your own model. I myself am not a believer of proprietary models at

34:36

all. I don't believe that. If the end game is to get enterprises to adopt this technology, I don't believe that they can adopt it if it's black boxed. I don't believe that that can happen. And I don't see any other examples actually where enterprises give away their data to a black box and don't have any control over that data. That very rarely happens. And for me, as a developer, if I'm developing some kind of app, it's very, you know, it's very rare that the underlying

35:15

core technology is outsourced to some third party black box company. It almost never happens. Almost never happens. One example is like Google Maps. I don't know if you guys remember, but initially when the iPhone was launched, we only had Google Maps. There was no Apple Maps, there was no app. What happened? Well, Apple found out, like any sane company would, like, we can't give away all of the, you know, traffic and data to Google on our platform. So what are we supposed to do?

35:51

The only way to solve that is to build your own maps, you know, app and deploy that as an alternative. So Superagent allows anyone to take the platform. It's open source, it's free. Deploy it to your own environment, use your own model to run it without having to leak any data to any external party, including us. So that's like the way to mitigate that issue, which is a big issue. And I believe that you can build, you know, small hobby projects on top of OpenAI and GPTs, which they

36:32

have just released. But I don't think that an enterprise grade, like, healthcare company could leverage that. I don't think a legal firm or law firm could leverage that due to the nature of it being black box and proprietary. So that's

36:47

why I believe in open source. So two clarification or follow-up questions on that. First of all, when you're saying you use your own data or your own model, from my own understanding, what you're actually saying is that you start with a general model, but then you refine it with your own data, and that refined model stays within your organization and never leaves. Is my understanding correct? Yes. And you also probably

37:20

make a distinction between which data you use for internal versus external services. So for example, you know, in our case, Next Insurance, let's say we might train... I work at an insurance company. You know, we do a lot of stuff with machine learning and whatnot. That's one of the things about us, that we do a lot of stuff online and, you know,

37:46

use machine learning. So there would be a difference. You know, you might use that internal data about the policies that people have for the internal operation of the company, but you won't expose that to external users, whereas you might train a model for external use based on, you know, the questions that you might ask and various, you know, general terms that are in the public domain about, you know, types of insurance and stuff like

38:22

that. That would be externally available. Right. And the way we do that is that, you know, every agent or assistant that you create with Superagent gets its own little brain, its own little memory, its own data. It's completely decoupled from other assistants. So you can actually deploy hundreds of these assistants that are trained on different data sets. Some of them might be for internal use, some of them might be for

38:54

external use. A good example is we are working with an ISP in the UK and they have one assistant that's trained on internal data that they use to, you know, educate employees on different things inside of their company, and then they have a customer support assistant which is only meant to help their users or customers with general queries like how do I reset my router, how do I do this, how do I do that? And that's only trained on, you

39:29

know, public information that they have on their website. So the answer to that is that it's very easy to deploy a lot of these assistants. And I would say that, on average, each user has around five different assistants that they run on our platform. So, if my understanding is correct, you basically deliver three main things. One is the actual agent itself: clone our repo, run our agent wherever you want to, you know, run as many instances of it as

40:09

you like. The second thing that you provide is the ability to take an existing, quote unquote, standard open source language model and then refine it using your own data. And the final thing that I understand you're giving is the way to attach those agents to various APIs or input/output sources. Is my understanding correct that that is the stuff that you provide?

40:40

Those are exactly the three things. We have a different way of explaining them, but I would say the first thing is the model itself, the brain. The second thing is the perception, which is the data that you feed to it. So you have the brain, you have the perception. And the third thing is what we call tools.
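As a rough illustration of the three pillars named here (brain, perception, tools), the following TypeScript sketch models an assistant as a plain data structure. Every type, function, and value in it is a hypothetical stand-in invented for this example; it is not Superagent's SDK, and the real API may look quite different.

```typescript
// Hypothetical shapes for illustration only; these are NOT Superagent's SDK types.
type Brain = { provider: string; model: string }; // the language model
type Perception = { sources: string[] }; // the data the assistant is fed
type Tool = { name: string; run: (input: string) => string }; // an API call or predefined code

interface Assistant {
  name: string;
  brain: Brain;
  perception: Perception;
  tools: Tool[];
}

// Build an assistant with its own copies of the data sources and tools,
// so nothing is shared between assistants.
function createAssistant(
  name: string,
  brain: Brain,
  sources: string[],
  tools: Tool[] = []
): Assistant {
  return { name, brain, perception: { sources: [...sources] }, tools: [...tools] };
}

// Run one of the assistant's tools by name.
function useTool(assistant: Assistant, toolName: string, input: string): string {
  const tool = assistant.tools.find((t) => t.name === toolName);
  if (!tool) {
    throw new Error(`Assistant "${assistant.name}" has no tool named "${toolName}"`);
  }
  return tool.run(input);
}

// Placeholder values (assumptions), loosely modeled on the ISP example from the conversation.
const brain: Brain = { provider: "example-provider", model: "example-model" };
const resetGuide: Tool = {
  name: "reset_guide",
  run: (device) => `Steps to reset your ${device}`,
};

const internalAssistant = createAssistant("employee-onboarding", brain, ["internal-handbook"]);
const supportAssistant = createAssistant("customer-support", brain, ["public-website"], [resetGuide]);
```

The point of the sketch is the separation: the internal assistant and the support assistant share a model choice but have their own data sources and their own tools, which mirrors the idea that each assistant is decoupled from the others.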

41:06

And a tool could be an existing API, it could be a third-party service such as, you know, Salesforce, but it could also be code that you want the assistant to run. So you could build, you know, automation workflows with assistants that run code, predefined functions, or whatever it might be. We call them tools. So it's the brain, the perception,

41:32

and then the tools. These are the three main pillars that Superagent is built on, and we make it easy for any developer to orchestrate them in order to create an assistant that can do basically anything that you would want it to do. So if I wanted to build an assistant, I guess the things that I'm thinking about here are, you know, one, it sounds like, yeah, I can hook it up to OpenAI, or

42:04

I can, you know, pull in my own model. That's probably, you know, beyond having it set up and being able to access it through standard methodologies. You handle all of the stuff as far as providing it more data or providing it specific data and then extracting the responses. I guess the other part of a chatbot, or an assistant, though, is the delivery, right? So whether it's in some kind of user

42:36

interface. I'm thinking like code assistants, right, where you have them plugged into Visual Studio Code and it'll highlight code and give you feedback on the code and things like that. But I'm also thinking of chatbots, right? So maybe a Discord bot or, you know, an embedded bot on your website that people can ask questions of and things like that. So how do you interface with the delivery of these systems? So we do that in two ways.

43:04

We do that in a so-called no-code way, which is basically: you create your assistant and we'll give you an embeddable chat or an embeddable, you know, user interface which allows your users to interact with this assistant. Very similar to how ChatGPT is formed. Basically super simple but still

43:30

powerful. So that's the first way. That's for developers that are, you know, just starting out, trying to prototype something, trying to get a feel for how their assistant actually works and how accurate it is. The second way, which is the most used way, is that we give you a set of three SDKs and a REST API, so you can use any of these SDKs or the REST API to orchestrate your own assistants and then build whatever type of UI you want

44:06

for the delivery part. So, you know, I think something that people are missing is, when you look at ChatGPT... and I come from a design background. I've designed and built, you know, front-end types of applications all my life, basically. And the one strange thing is that when ChatGPT launched, it was like all of the other UI components that have been refined for, you know, a century just got thrown away and got replaced by a chat markdown

44:42

box. And, you know, when I think about that, that's really not the way the Internet usually works. When you consume software, it has a bunch of different components which make it feel valuable, where you can extract more value than just text. One good example of this is a company called Perplexity AI. I don't know if you guys know about them, but they have built basically the new Google Search. And it's not just the chat.

45:20

It has a bunch of other types of UI components that make that user experience ten times better than what Google Search is today. So I believe that if you are going to deploy, you know, an AI assistant chatbot, you need to use existing UI components that people are used to, not only chat. It's very limiting in what you can accomplish with

45:52

only chat, and it doesn't work well for all use cases. It might work well if you are trying to chat with an AI, but if you're trying to extract knowledge, then chat alone might not be the best way to do that. There might be other components that you would want to visualize for the user in order for them to be able to quickly extract information. A simple example would be a chart that a user can interact with, which is very common in any other

46:30

software. Right, all dashboards have charts, but ChatGPT doesn't, you know. And so I believe that in the future we will see, you know, if you think about all of the UI component libraries on npm, the registry for Node.js, there's a bunch of awesome UI components out there. I believe that AI and these UI components will merge eventually so that you get

47:00

the power of AI in those user-facing UI components. So having an AI dynamically generate the user interface needed for the user to absorb or extract the information that they are looking to extract from the assistant, I believe that's the future. That's something that we are working on actively as well, making the user experience more dynamic and richer, not just chat. I believe that's a big thing for adoption, especially if you're looking at enterprises

47:43

that are trying to adopt this, you know, technology. By the way, what is Superagent implemented in? The back end is Python, concurrent Python, so async, which was quite a mess but eventually worked out, and it runs on FastAPI, which is an open source framework for running concurrent Python, threaded Python basically, and then we have some services that are built in Rust

48:17

specifically for memory. So, and this is interesting, if you just take a language model off the shelf and try to chat with it, it won't actually remember your previous conversation. It's like talking to someone that doesn't remember stuff, you know, and that's just not

48:37

feasible. So you have to build memory. And that memory usually is some kind of key-value store, some kind of Redis database, you know, and then you need some way of integrating your model into that key-value store. So that part of our service is built in Rust: the memory. It's both a short-term memory and a longer-term memory, so you can ask questions about stuff you chatted about, you know, a month

49:05

ago. So that's built in Rust. And then the UI is built on Next.js, which is an open source, React-based framework, in TypeScript. And yeah, so that's the stack, basically. Infrastructure-wise, we rely heavily on GCP and AWS, of course, and so that's where the infrastructure is at this point. Okay, so I missed this because I came in late. But how is this different from Ollama? How is it

49:39

different philosophically, and how is it different technically? So technically, Ollama runs locally on your machine. Or anywhere you put it. Uh, no, it doesn't, actually, because you need to put the model somewhere, right? So it runs the model locally. Right, right, but what I mean is wherever you put it, right? Like, I can put it on DigitalOcean. If I want to pay a billion dollars a month, I can put it on

50:07

AWS. But like, yeah. Yes, yes, yeah. So it's similar to Ollama in the sense that you can run different types of models, it's open source, and that kind of stuff. What differs is that we focus on a specific type of agent, the knowledge assistant, as we call it. So we feed your model and fine-tune your model on the data that you want the model to have access to. We both fine-tune the language

50:40

model. But something that people miss a lot is that there is actually another type of model involved in, you know, fetching data from third-party sources. That model is called the retrieval or encoder model. We also fine-tune that model on your data. There are actually two models in play, usually, when you use ChatGPT or anything else and you upload a file to it. So we fine-tune these models on your specific data, making them

51:14

super accurate for the specific use case that you have for your assistant. And so it's an orchestration layer. Is that what's referred to as LangChain? Because I was looking into this, because I wanted to figure out how to do it with Ollama. Yeah. And what was coming up was LangChain. And it's like, it's not part of Ollama, but there's an API you feed more data into. Is that right? Is that what that

51:37

is? Right, yeah. So LangChain is an open source framework (I'm actually a contributor there; that's where I started off, contributing to that framework) that allows you to build these types of assistants, any type of assistant. It has similarities to Superagent, and we actually utilize LangChain in parts of our infrastructure

52:07

as well. The main difference is that LangChain is built for machine learning engineers, people that know what the heck they're doing, people who know how to accurately fine-tune a model, how to accurately orchestrate the whole assistant, you know. Superagent documentation seemed like they were using very, very technical terms that, once I understood them, seemed like things that could have been explained in a sentence.

52:38

Yeah. So Superagent is like, you know, the version of LangChain that's meant for the mainstream developer, people that don't have any skill set or any knowledge or background in this type of technology. In short, you know, if you want to explain Superagent, it's like Stripe for payments,

53:02

but it's for building AI assistants instead. Think, you know, prior to Stripe, how hard it was to set up payments, recurring payments, subscriptions on whatever service you were building. It was a pain in the butt. Most of these open source frameworks give you the building blocks to build whatever you want, but you need to know what you're doing. In the Superagent case,

53:30

you don't need to know anything. You just instruct it to do specific things

53:35

with text. You give it a prompt, as we call it, you feed it with data, and we take care of the heavy lifting on our side so as to make sure it's... You actually use a chatbot to generate the other personas? Uh, yeah, we use language models to generate training data and, you know, generate all of the stuff that needs to go into the model in order for it to be accurate for the specific use case that you have, and you don't have to think about any of that as a developer.

54:07

That's on our side. That's the value we bring to you as a developer. You know, if you talk to a Superagent user and you ask them, why do you use Superagent? The number one thing they tell us is: Superagent allows me to focus on my product. I don't need to become some other kind of engineer and learn something new. I can focus on the stuff I'm building, like an iPhone app, and I can just integrate all of this wonderful technology with a simple SDK, like in twenty lines of code.
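To make the "simple SDK in twenty lines" claim a little more concrete, here is a minimal TypeScript sketch of what calling a hosted assistant over a REST API could look like. The endpoint path, payload fields, and response shape are all assumptions made up for this example, not the documented Superagent API, and the HTTP transport is passed in as a function so the sketch stays self-contained.

```typescript
// Hypothetical endpoint and payload, for illustration only.
// Check the real API reference before relying on any of these names.
type HttpPost = (
  url: string,
  headers: Record<string, string>,
  body: string
) => Promise<string>;

interface InvokeRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Assemble the request for asking an assistant a question.
function buildInvokeRequest(
  baseUrl: string,
  assistantId: string,
  apiKey: string,
  input: string
): InvokeRequest {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${root}/assistants/${encodeURIComponent(assistantId)}/invoke`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ input }),
  };
}

// Send the request with whatever HTTP client the app already uses
// and pull the answer out of the (assumed) JSON response.
async function askAssistant(post: HttpPost, request: InvokeRequest): Promise<string> {
  const raw = await post(request.url, request.headers, request.body);
  const parsed = JSON.parse(raw) as { output?: unknown };
  if (typeof parsed.output !== "string") {
    throw new Error("Unexpected response shape");
  }
  return parsed.output;
}
```

In an app, `post` would wrap the platform's HTTP client (for example `fetch`), and the UI layer would render the returned string however it likes, which is the "build whatever type of UI you want" part of the pitch.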

54:39

So that's the value prop that we bring to developers. Cool, cool. All right, well, that's nice, thank you. I don't want to shut down the conversation, but I do have another podcast scheduled in like eighteen minutes. You're living a busy life, Chuck. Yeah. So I'll just ask one last, final, really quick question. Do you usually run models locally or in the cloud? In most cases, one hundred percent cloud. One hundred percent cloud? That's so expensive, though. How can you... But it isn't. No,

55:25

it isn't. It isn't expensive. It doesn't have to be on our cloud. It could be on your cloud, you know. That's the thing. You can deploy your model to your own cloud, or you can deploy your model to these serverless-like infra providers that are out there, and you can get a model running for, you know... The price of running a model is going to go to zero in a very short time, that's the thing. So, is that with the cloud yet? So far, so

55:52

far, everything that's introduced is more expensive than the previous thing. Prices have not

55:58

gone down in a decade. They've gone up. Yeah. But if you think about, you know, it might not be zero now, but if you think about the trend, even OpenAI, you know, they slashed their prices by like... Yeah, if you're talking about OpenAI, yes, yes, yes, because they are going to optimize it and they may eventually be able to get the price down, you know, but they're not having to pay what you would have to pay for Azure, right,

56:25

and they're getting a very different rate. Yeah. And that's the thing: the other providers, Azure, AWS, they are right now, as we speak, deploying technology, and this has already happened, where you can basically host your model in a serverless environment and only pay, you know, the couple of zero point zero zero two five cents per token that you would pay OpenAI. So this transition is already happening. It's a requirement, otherwise nobody will

57:01

be able to run this technology in production, you know. If the price is high, no business can run it. You cannot have a chatbot that, you know, costs a thousand, ten thousand dollars a month when three people are using it. So the whole industry is pushing for the prices to go down, and we already see it now. Even the prices

57:24

of the hardware are going down, GPUs. So if I want to run Superagent, you said it's open source, so I could just pay a flat forty dollars a month, get eight cores, eight vCPUs, and run it on that, and that would be good enough for, you know, a few people using it at a time? Yes, right, and completely free. You don't have to pay us anything, since it's open source. No, what I mean is, like, I'd have to host it somewhere. Yeah, you have to pay for that.
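The flat-fee versus per-token trade-off being argued here comes down to simple arithmetic, sketched below in TypeScript. The forty dollars a month comes from the conversation; the per-token price is an illustrative assumption (expressed per million tokens to keep the numbers readable), since the exact figure quoted in the discussion is unclear.

```typescript
// Monthly cost when paying per token. The price is quoted per one million tokens.
function perTokenMonthlyCost(tokensPerMonth: number, usdPerMillionTokens: number): number {
  return (tokensPerMonth / 1e6) * usdPerMillionTokens;
}

// Tokens per month at which a flat hosting fee and per-token pricing cost the same.
function breakEvenTokens(flatMonthlyUsd: number, usdPerMillionTokens: number): number {
  if (usdPerMillionTokens <= 0) {
    throw new Error("Per-token price must be positive");
  }
  return (flatMonthlyUsd / usdPerMillionTokens) * 1e6;
}

// Which pricing model is cheaper for a given monthly volume.
function cheaperOption(
  tokensPerMonth: number,
  flatMonthlyUsd: number,
  usdPerMillionTokens: number
): "flat" | "per-token" {
  return perTokenMonthlyCost(tokensPerMonth, usdPerMillionTokens) > flatMonthlyUsd
    ? "flat"
    : "per-token";
}

const FLAT_MONTHLY_USD = 40; // the flat server price mentioned in the conversation
const ASSUMED_USD_PER_MILLION_TOKENS = 2; // illustrative assumption, not a quoted price

// With these inputs the two options cost the same at 20 million tokens per month.
const breakEven = breakEvenTokens(FLAT_MONTHLY_USD, ASSUMED_USD_PER_MILLION_TOKENS);
```

Under these assumed numbers, a handful of users sending a few million tokens a month would pay less per token, while heavy usage favors the flat server, ignoring whether an eight-vCPU machine can actually serve the chosen model at acceptable speed.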

57:57

It's like forty bucks a month for, you know, and then you don't have to pay per token. You just, yeah. Okay, I would love to learn how to host this, because I've got some people that are... I was actually going to build something with Ollama for some people, but yeah. And I think, let's do that. And I think Ollama is great. And the reason why I think it's great is because it's completely open source. It allows you, as a developer, to have complete control over

58:23

what you're doing, where you're deploying it, what models you're using. It's up to you. It's not up to some, you know, company to decide where your data is, how you can extract the data, and all of that stuff that comes with, like, proprietary black-box models. As I said initially, I am strongly opposed to that type of model, and I don't think it will work. How do I pay you for support and to

58:57

make sure that Superagent is still around? Like, the number one reason I pay for things is that I want the project or the product that I'm paying for to be successful, so that when I wake up tomorrow, the web page is still there and the download button is still there, and the support email is still there. Because they don't need your money, they've got Y Combinator money. But yeah, I agree with... Yeah, the bill comes due, and we don't want to

59:27

live off of VC money. That's not the plan. We want to create... and we talked about this initially. The thing with open source is that it's great for building a community. You can get a lot of users, but it's very hard to create a business out of that. A good example of that is, you know, let's take Django, the Python framework. How many businesses have been built on Django? A lot of them. How many people pay for Django? Zero. Same thing with FastAPI, it's one of

01:00:05

the best Python frameworks out there. How many people pay for it? Zero. How many people use it? A lot. So it's really hard to take something open source and make a commercial business out of it. That's what we're working on now, to be able to do that. There are examples of people who have succeeded with that by offering, you know, support plans and stuff like that. There are ways of doing it, but it is... The solution is

01:00:35

hosting. I don't think that will help, because the whole premise here is that you would want to host it yourself. You don't want to give your data to us, to Ismael. You don't want me to have all your financial data. You want to have all your financial data. So hosting might work for the absolute, you know, smallest companies, but if you're working with a healthcare company, they won't host with us at this point. So what we are doing there is to give

01:01:08

them services such as, as AJ mentioned, support packages. You know, we can run instances for them on their cloud. There are a bunch of different business models you can run, but it is hairy and it isn't, you know, as straightforward as just having a B2B SaaS company that is proprietary, where you charge per month, people trust you, et cetera, et cetera. So that was what I initially said: it is a hard thing to monetize and a lot of people have failed. I mean, the

01:01:43

greatest have failed at it. So I'm just hoping you put a buy button on it, and I'm going to buy this thing in whatever form, because I have to build something. I'm going to be using some mix of either Ollama or OpenAI. There's still a little bit of, you know, what am I going to do? With your solution, I can choose to switch between Hugging Face, which is where I get the Ollama models, and OpenAI,

01:02:07

which is where our prototype is deployed. So to me, this provides a layer of abstraction, and for the things that we want to do for this project, this is the right tool. So you put the buy button there, I'm going to buy it. It doesn't even really matter what you charge. I'm going to pay it because I have a time crunch and I've got to get a... Sorry. But yeah, no, you know, in a lot of ways I agree with AJ, and this reminds me of

01:02:38

something that I'm going to throw out in my picks. But yeah, let's go ahead and do picks, and yeah, I've got like ten minutes and then I've got to roll into the other show. Dan, what are your picks? I'll make it short and sweet this time. So my first pick is going to be Prometheus, the monitoring solution. I've been using it a lot. As people who've been listeners to this podcast know, I do stuff related to performance, to analyzing how applications execute, and Prometheus has been

01:03:20

an amazing tool for this purpose. So collecting all sorts of performance data and execution profiles, getting them in there, building dashboards using Grafana, running all sorts of PromQL queries. I'm now actually looking to try to solve a

01:03:39

really hairy challenge within our organization. We've got something like fifteen microservices, some of them having hundreds of endpoints, and I want to be able to analyze the entire system to catch performance degradations without having to manually configure and specify, you know, limits for each and every endpoint and dependency and whatnot. And it'll be really interesting to see if I can get something like that up

01:04:12

and running just based on the capabilities that are built into Prometheus. But, you know, so far, so good, so we will see. But I have to shout this out as a monitoring tool. And by the way, I also contributed back to the PromQL Node.js client, which was really cool, and I have some ideas for additional contributions that I want to make back to that project. We probably should have the owner of that project on our show. It's a really cool project. So that, yeah,

01:04:46

for sure. So that would be my first pick. My second pick: if you've listened to our past episode, you know that we did it around a bunch of polls that I ran on my X, or Twitter, account. I've recently run another one, which is getting a lot of votes even as we speak, which is basically trying to see which framework is the un-React. Like, if you think about React, what's the opposite of React

01:05:23

in front-end frameworks? And the options that I gave were... well, you know, X only allows like four options max, so that's kind of limiting. But the ones that I gave are HTMX, Svelte, Solid/Qwik, and other. And I asked the people who answered other to specify what they meant, and I got some really interesting responses. So I'll probably share the link to that tweet. People like Jack Herrington jumped on, Ryan Carniato, Carson Gross himself, and others.

01:06:01

It's becoming a really interesting discussion. I'll just throw it out there: Ryan Carniato's choice for the most un-React framework out there, you know what it was? Can you guess? No. React is the most un-React one, of course. Yeah. He basically made a distinction between the current React and where React was ten years ago, and said that current React is the most un-React compared to React of ten years ago, and vice versa. Uh,

01:06:41

So it was a really interesting choice. Anyway, it's a fun poll. We'll see how it comes out in the end. I won't spill the beans about who's in the lead, although you might guess. And those would be my picks for today. Awesome, what are your picks? Well, first and foremost, I got a his-and-hers bidet. Well, I think all the bidets are his-and-hers bidets. I think they all have the female button as well. But it was only thirty bucks. I

01:07:15

don't know why I didn't get one years ago. I guess I thought it was going to be complicated or something. The only thing complicated about it was that the people who built my house, of course, couldn't spend the extra two dollars to put the adjustable pipe on there, so I had to go down the street to Lowe's and grab a four-dollar-and-sixty-nine-cent connector piece to put it on. Like, it came with everything in the box that it

01:07:39

should have come with in the box. It's just that the toilet was installed in a way that it was permanent, and I had to replace a fixed-length tube that could not be moved in any way because it was solid rather than flexible. But anyway, yeah, so now instead of being like an eighty American and spending one hundred billion dollars on toilet paper to do the job that the good Lord figured out how to do for us millions of years ago. Yeah, yeah, that's the exciting news. But I think, like

01:08:16

warm water everywhere, like, going on? It's just cold water. It's not fancy. It's just like a cheap thirty-dollar thing. It seems to work perfectly. It's not leaking as far as I can tell, at least not... Yeah, I mean, it's day one. But it's the reason I was late, because I had an hour. It's a ten-minute project. It's like, undo this one thing, put this thing in place of it, you're done. I mean, it gets a little cramped

01:08:41

depending on your bathroom size, trying to reach around. But then, because of the whole thing with the tube, it's like, oh crap, I can't put this back together. I have to make a run to Lowe's. I have to wait. And oh crap being the operative word here. Yeah, yeah, yeah. And then I had a nice crap after it all

01:09:00

and was able to confirm that it works. So anyway, yeah, don't be a dumb American; be like the people in the rest of the civilized world and use the same thing that people have been using for thousands of years to wipe your heinie. Okay, get some water in there. Goodness knows. And then, uh, there's a couple other things, but I'll try to just keep this one short for today's purposes. And later, huh? Charles has got to go in four minutes, so fast, right.

01:09:35

So the other thing I'll pick is Ollama, because I have used Ollama and I've gotten value out of Ollama, and it's super easy to get some of the models. It automatically downloads them from Hugging Face for you. You don't

01:09:47

even have to know about Hugging Face. But I've used Mistral and I've used codeup, and I've used a couple of other models. And depending on the model you pick, they're better than ChatGPT is for a specific... because the benefit of Ollama is that rather than getting a, you know, billion, billion, billion data point model of everything, you're getting models that are more fine-tuned for specific things. And then there are ways, with LangChain,

01:10:13

to load stuff in. It automatically does the history for you between the different requests and whatnot. And so I'm just gonna put the webinstall.dev/ollama link there, because that's where the installer is. That makes it dead

01:10:27

simple so you don't have to think about it. And then, uh, yeah, I will just mention that I started using Home Assistant and I now have a thermostat that is connected to a Google Calendar-like calendar, and I'm gonna put this in my wall soon and I'll pick that next time. So that's all. All right, I'm gonna jump in here with a couple of things. Now, I always do a board game or a card game.

01:11:00

This one is one that my wife got us for Christmas. I think technically Santa brought it for Christmas, but my eight year old's not here to split hairs on that. And yeah, so, Disney Chronology: it's a really simple game. I think we play it in like fifteen or twenty minutes, especially if my daughter goes before my wife, because my daughter can't get them right, so my wife steals them. So essentially, you pull the card.

01:11:29

It has a year, and it has like, Disney released Steamboat Willie to blah blah blah blah, right, and then the answer's like nineteen twenty-eight or

01:11:36

whatever, right, and it has the month of the year. And so you have three of those cards in front of you and you pick either before all the cards, in between two, you know, any two of the cards that are next to each other, or after your last card, right? And so, yeah, if you don't get it right, then the person next around the circle gets a chance to, you know, put it into their front

01:12:00

and steal it. That's the whole game. But if you're kind of a Disney nut like my wife is, then it's kind of fun to see if you can guess them. And if you've got the fourteen year old going before you, that can't get them right, then she wins because she steals them all. Anyway, it was fun, a lot of fun. So I'm gonna pick that it's a super simple game, and yeah, it's it's fun

01:12:25

because Disney, not because the game, because the game's idiotically simple. Another pick that I'm going to throw out is, so a lot of you know that I spend a lot of my time writing Ruby on Rails, and so I follow David Heinemeier Hansson, or DHH, and he and Jason Fried are doing something a little bit different going forward in kind of the SaaS space. I'm just

01:12:49

going to put the link in for once.com. The idea is, way back in the day, you used to buy software and then you could install the software wherever you wanted and use it however you wanted. And the way you guys were talking about Superagent was kind of the same idea in some ways, where it's, hey, you can take this code and you can run it wherever you want and just own it and love it

01:13:13

and, you know, whatever. And so they're pushing forward this idea out there on the Internet where they deliver effectively what would be SaaS apps, right, so you can install as many instances of Basecamp (I don't know if they're doing it with Basecamp), but as many instances of whatever as you want, because you now own the code; you own the version of the code you bought. So I think that's cool, and I think it'll be interesting to see

01:13:42

how much of a difference it makes out there in the market. I think it could be disruptive. I think in other ways it may take a while for people to catch on and go, oh, this is a one-time payment and this is a good deal. But yeah, I really love the idea, so I'm going to pick that. Next thing you know, people will want to purchase and own music and movies. No, some of us still do. Yeah, yeah, yeah. I think with that one, people just have to... you have to unlearn what you've learned.

01:14:15

Because the whole thing is, people think that running a command to start a service on Linux is like something that requires a four-year degree. It's like

01:14:25

it's a five-minute thing, if that. Yeah, I think the other thing that people are going to run into is they will have purchased movies and had them on some streaming service that they, you know, got access to them on, right, so you enter the code or whatever, and then it's not going to be licensed to whatever service that is, and they'll lose access to it. And then they're going to go, what the heck? And then, right, they'll start looking at: okay,

01:14:55

how do I own this again? I think Plex servers might take off for some of that. I've also seen another, and I'll see if I can find it because I've been wanting to play with it. But I've seen a service out there that will attach to your Audible account and download all of your audiobooks. OpenAudible. OpenAudible. It's like twenty bucks. It's the best thing ever. I love it. I have to cut in. I actually have to drop off, guys. So Ismael, it was

01:15:23

great speaking with you. I learned a whole bunch. Best of luck. Thank you, thank you. And it's an awesome thing that you're doing, and I'm really loving the approach. So again, it was great having you on, and bye. Appreciate it. I'm going to send an email to, uh, Ruby Rogues real quick and just let them know I'll be a few minutes late. But Ismael, what are your picks? Uh, so, first off, TV show: I started rewatching Fargo. I don't know if you guys

01:15:58

have watched it, but it's an amazing show. It's like five seasons and they are not interconnected in any way, so you can jump into any season, pick the season you want and the actors you like the most, and it's just amazing. So that's on the show side, the fun side. On the dev side, I have this open source project that I would like to highlight, and you should bring that guy on. His project is

01:16:26

called outlines.dev. He's a French guy called Rémi, and what he does is allow you to create models that can answer in formats other than just text, without having to write a bunch of prompts and stuff. So it's a technique, basically an extension you plug into your language model, which allows you to control its output, which is interesting because it gives you ideas of what other types of extensions you might want to plug

01:17:06

in there that would do other things. So that's really interesting. It's state-of-the-art stuff. outlines.dev. I would pick that one. All right, cool. All right, well, let's go ahead and wrap it up. Thanks for coming, Ismael. Thank you for having me, guys, it was a pleasure. My Twitter is homanp, so h-o-m-a-n-p. Thanks, bye.

Transcript source: Provided by creator in RSS feed
