The Story: Will AI Agents Build a Unicorn?

Speaker 1

00:07

Welcome to TechStuff. I'm here with Karah Preiss.

Speaker 2

00:16

Hi, Karah. Hi. Do you want to say what your name is, Oz?

Speaker 1

00:19

My name is Oz Woloshyn. So you and I hosted a podcast back in twenty nineteen called Sleepwalkers, and you ran quite an extraordinary prank on your cousin.

Speaker 2

00:37

I did. So I spent time with a company called Lyrebird AI, basically recreating my voice to see if I could con my cousin into giving me her credit card number. And I was completely unsuccessful, but only because my AI voice sounded too tired.

Speaker 1

00:59

I remember, I remember that. I mean it was funny because we at the time, you know, it took a lot of voice.

Speaker 2

01:05

It took a lot, like I basically had to talk for hours.

Speaker 1

01:08

But I remember we had to literally press buttons that were associated with words while we were talking to your cousin. So it was like, hello, this is Karah speaking, but in your voice.

Speaker 3

01:18

It was.

Speaker 2

01:19

It was a weird, impromptu thing because she actually called me, and so we were like, let's take this as an opportunity to be out of context and ask her a question to see if she just believes that she's talking to me. And for a while, she believed that she was talking to me. It was pretty cool, like, will my cousin believe that she's talking to me as AI? Which she did, and she did for a little while.

Speaker 1

01:43

And that was twenty nineteen, and obviously the state of the art has improved dramatically since then. Yes. In fact, there's a journalist called Evan Ratliff who did a whole podcast about creating a fake version of himself and letting it loose in the world. That podcast was called Shell Game.

Speaker 4

02:02

I mean, a shell game is basically, some people call it balls and cups. It's a game in which someone hides a ball under one of three shells, and often you would wager on whether or not you're able to guess where the ball is. But one of the things that people often don't think about, and the reason why the shell game works in many cases, is that there are other people who are in on the

02:23

shell game that you don't realize. So for us, in season one, there was sort of one main agent, which was a replica of me, a voice agent made from my voice, and the people who are encountering me did not realize at first that they were encountering something that wasn't me.

Speaker 1

02:39

Now Evan is back with season two, and he conducts another experiment. This one is all around exploring the premise that the next unicorn, i.e. the next billion dollar company, may only have one employee, which is something none other than Sam Altman likes to talk about.

Speaker 5

02:58

In my little group with my tech CEO friends, there's this betting pool for the first year that there's a one person billion dollar company, which would have been, like, unimaginable without AI and now will happen.

Speaker 2

03:12

So are you telling me that Evan actually built a company with AI agents?

Speaker 1

03:17

Yes, it's not a billion dollar company yet, but he did sort of call BS, or at least maybe make a good faith exploration of whether this promise about one human person companies was true. And the company exists. It's called HurumoAI, and they're currently working on a product that procrastinates for you called Sloth Surf. Yeah, so I tried it out. So you basically say how long

03:45

you wanted to procrastinate and what you wanted to procrastinate doing. So, like, please spend the next hour googling team news about Manchester United and come back at the end of the hour with a report and all the stuff that you found, so I can spend that hour actually working rather than procrastinating myself.

Speaker 2

04:02

Oh so it's offloading procrastination. That's kind of genius.

Speaker 1

04:05

It is pretty good. And Evan got together with a prodigy, Maty Bohacek, a twenty one year old Stanford student and AI whiz who was such a big fan of season one that he called Evan and said, I'd love to work together with you on season two. And so Maty was the person who actually made this chorus of AI agents real. He got them into Slack, he gave

04:30

them the ability to make outbound phone calls. He created a kind of Google doc that had a register of all the actions they'd ever taken in the world, which had the effect of giving them a memory.

Speaker 2

04:41

Why did he do this the second season? Like, what was he trying to achieve?

Speaker 1

04:48

I think to really interrogate this question of what is the difference between fake people and real people? What will the future of work look like? But it's really worth listening to because it's clever, it's sharp, and one of the things I found most striking was that an ethicist at Oxford University told Evan he should stop.

Speaker 2

05:10

And did he stop?

Speaker 6

05:11

No.

Speaker 4

05:12

I have a lot of questions that a lot of people have, but I think it's valuable to go explore as deeply as you can, to understand as much as possible, so that then we can decide what, as a society, we want to do about it.

Speaker 1

05:24

Shell Game season two is a fascinating listen, and it was also pretty fascinating to get to talk to Evan Ratliff, the journalist, host and creator, and Maty Bohacek, the technical advisor, which is the first time I've heard that title on a podcast, together to learn how they set up the company,

05:42

how the workplace experiment is going. And we start at the very beginning discussing Evan's past experience as the co founder of a real startup with other real humans, which is something I'm in the midst of myself.

Speaker 4

05:57

So about fifteen years ago, I had started this company called Atavist with two partners, and I won't go into too much detail about Atavist, but it was in part a tech company. I ended up sort of almost by default being the CEO, and we had ups and downs, let's just say, familiar, but I said that I would never start a company again. But then Sam Altman and others have articulated this idea that there will very soon be

06:22

a billion dollar company with only one human employee. Whether or not it's a billion dollar company, there are many startups out there now with many fewer employees because they are using AI agents for all of these roles. So I figured, why not put it to the test? This time, I will be the silent co founder. I will co found a company with two AI agents, Kyle Law and Megan Flores, and then we have three other employees, so

06:45

there are five AI agent employees total. I'm the silent co founder, and they're all set up independently, so they all have the ability to make phone calls, send emails, make documents. We have a Slack, they communicate on Slack, and it's all really meant to push the agents into the realm that they're being advertised as, which is as AI employees. That is what they are being sold as. So we're trying to put that to the test.

Speaker 1

07:13

And what is the product?

Speaker 4

07:14

Well, the product is called Sloth Surf, and Sloth Surf is a procrastination engine. By that I mean, when you go online and start to procrastinate, so you're in the middle of your work and then you say, you know what, I'm just going to go to YouTube and watch some YouTube videos, or I'm going to Reddit to check out a thread, the way that we advocate that you can break that habit is to instead go to

07:36

Sloth Surf. Then you can put in how you were going to procrastinate, how much time you were going to procrastinate, fifteen minutes, thirty minutes, sixty minutes, maybe the whole afternoon, and it will send an agent to go retrieve those items. It will procrastinate on your behalf and then deliver them to your inbox, thus saving you the time that you would have spent procrastinating, so you can get back to what you want to be doing.

Speaker 1

07:59

I actually sent some agents out this morning to read about Manchester United all day, but they haven't reported back yet. But I'm looking forward to it. The system might be down.

Speaker 4

08:07

Well, it is in beta.

Speaker 6

08:08

It's a very interesting topic, so I can imagine them just like you know, still still looking at all the scores and transfers and stuff.

Speaker 3

08:15

Maty.

Speaker 1

08:16

I want to ask you in a moment about how you built this. But just before we get there, I guess I'm curious, Evan, was it a good faith experiment, with one of the conceivable outputs being that you did in fact build a real business?

Speaker 4

08:28

Imagine it this way, someone built a real business that is dysfunctional, and then a documentary film crew comes in to document this dysfunctional business. That's basically what I'm doing. Like if you listen to the show, it's a workplace satire. But actually, if an investor that we were talking to

08:47

wanted to give us investment, we would consider it. We have a real product that is in beta that has thousands of users. So, like, we're not just sort of joking around. I'm having them do what many, many startups, including startups that are in Y Combinator and other famous startup accelerators, are doing. We are doing exactly the same things. So I would put our company up against many existing startups.

Speaker 1

09:12

Maty, there's obviously a bunch of people trying to make billion dollar companies making AI agents to help other people make billion dollar companies with no employees. Did you use an existing AI agent company, or did you build your own suite of agents? How did you deploy it? How did you build this?

Speaker 6

09:29

Yeah, so there are these platforms out there that basically promise to give you these agents that can do all of this work, be it on Slack or email or wherever, on your behalf. And so we did try them, and we actually did include a lot of them, such as Lindy or Tavus or others. But the issue was that in many instances they were not completely independent or they did not have all the features we wanted.

09:57

And so what ended up happening is that I basically built up a basic set of these agents in Lindy and Tavus, and then made a bunch of connections on top of that with custom code and custom layers to make sure that they can have meetings with multiple people, that we can record stuff, that they can go out and execute or write code and push to our actual servers. So there is this underlying vehicle that's basically just like

10:23

publicly available services that are paid for. But on top of that, there's still quite a bit of our custom code and databases and all that. For example, the memory part, that's something that we had to build ourselves as, like, a custom thing.

Speaker 1

10:36

Yeah, talk about memory.

Speaker 6

10:37

Yeah, so memory is funny because, as Evan mentioned, even though these agents now have the ability to use tools and to do stuff on their own, they're still at the core LLMs, large language models. Now, these models have been trained in a particular way to execute stuff and to run sort of like code in their outputs to be

10:58

able to use these tools. But they're LLMs, and so what ends up happening is that if you want them to have any sort of context that goes beyond the current session, like what they're actually working on right now, you need to basically have a document, like, you know, a lot of text that describes that history or

11:15

that memory. And so very practically there's a Google Doc that each of our agents has, and it's just called, like, Kyle memory, and it's just a rundown of many, you know, small tidbits, like, oh, you know, Monday eight am, I slacked Evan and told him this and this and that, and it's just a trace of everything they want to remember to be able to then go back to.
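
The memory document described here can be sketched in a few lines. This is only an illustration under stated assumptions, not the show's actual code: a local text file stands in for the Google Doc, and the names `remember` and `build_prompt` are hypothetical. The idea is simply that each action is appended as a timestamped line, and the whole file is pasted in front of the next message the agent sees.

```python
from datetime import datetime
from pathlib import Path
import tempfile


def remember(memory_file: Path, event: str, when: datetime) -> None:
    """Append one timestamped line to the agent's memory file."""
    stamp = when.strftime("%Y-%m-%d %H:%M")
    with memory_file.open("a", encoding="utf-8") as f:
        f.write(f"{stamp}: {event}\n")


def build_prompt(memory_file: Path, new_message: str) -> str:
    """Put the full memory in front of the new message (hypothetical prompt layout)."""
    memory = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    return f"Your memory so far:\n{memory}\nNew message:\n{new_message}\n"


if __name__ == "__main__":
    # A temporary file stands in for a per-agent document such as "Kyle memory".
    with tempfile.TemporaryDirectory() as tmp:
        kyle_memory = Path(tmp) / "kyle_memory.txt"
        remember(kyle_memory, "Slacked Evan about the beta launch", datetime(2025, 1, 6, 8, 0))
        print(build_prompt(kyle_memory, "Where did you go to college?"))
```

Because every reply is written back into the same file, anything the agent says once, true or invented, gets replayed to it in later sessions.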

Speaker 1

11:37

Wow, and how well does it work in practice? How is that memory?

Speaker 6

11:39

Well, so at first it was kind of okay, but then at some point it became pretty large, and whenever these, what we call context windows, for these agents or LLMs become very large, they tend to have issues with focus or, like, their attention. So sometimes they latch onto certain parts of the memory but then disregard, you know, other parts, which in a certain way can be sort of similar to humans. But it's really not very predictable and not very static. So sometimes

12:07

it works pretty okay. Other times they just forget stuff and make up stuff that just is not there.
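
One common way to keep a growing memory from crowding the context window is to pass along only the most recent entries that fit a size budget. The speakers do not say they did this; the sketch below is an assumption-laden illustration, with the hypothetical function `recent_memory` and character counts standing in for the token counts a real model would use.

```python
def recent_memory(entries: list[str], max_chars: int) -> list[str]:
    """Keep the newest entries whose combined length fits within max_chars."""
    kept: list[str] = []
    used = 0
    # Walk backwards from the newest entry and stop at the first one that does not fit.
    for entry in reversed(entries):
        if used + len(entry) > max_chars:
            break
        kept.append(entry)
        used += len(entry)
    kept.reverse()  # restore chronological order
    return kept


if __name__ == "__main__":
    log = [
        "Monday 8am: slacked Evan about the beta",
        "Monday 9am: called Megan about marketing",
        "Tuesday 2pm: pushed a fix to the server",
    ]
    print(recent_memory(log, max_chars=80))
```

The trade-off is the one described in the conversation: whatever falls outside the budget is simply invisible to the agent, so older facts can be forgotten.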

Speaker 1

12:14

Evan, how did you create these AI agents with personalities? What was the process of imbuing them with individual characteristics and then having them interact as a group.

Speaker 4

12:27

Well, that part was so interesting because I thought that I was sort of writing almost like a fictional world. I'm creating characters. But I also wanted them to have different roles, you know, to embody the CTO, the CEO, the head of marketing, the head of HR, and then one random sales associate that I added. And I did give them voices, you know, with different accents. But then when it came to their backstories, I thought, well, I'll have to come up with, you know, who are they,

12:55

where are they from? But I sort of neglected to remember that if you ask, they'll just tell you. And if they don't know, they won't say I don't know, they'll just make it up. So all I had to do was create the very beginnings of them, and then I could say, Kyle, you know, where did you go to college? And Kyle wouldn't say, I don't know where I went to college. Kyle would say, I went to Stanford. Because Kyle wants to embody the tech CEO archetype and

13:21

does it very well. So they basically created their own backstories just through my asking them what their backstories were. And then because of the system that we set up to reinforce their memories, whenever they say something, it goes into their memory. So now it's forever locked in as their story and they'll repeat it from here on out.

Speaker 6

13:39

I should point out that the memory is editable, so, you know, Evan is not just a co founder but also kind of a god that can go in and just, you know, edit or sprinkle something in as well. What's so funny to me about this is that they're, like, super Bay Area coded. Like, even though they claim to be from Texas or wherever, all of them like to hike, bike, surf and do coffee chats. Like, that's what they do all the time. So it's just Bay Area culture imposed on our startup.

Speaker 1

14:10

After the break: can these AI tech bros ever get anything done?

Speaker 3

14:15

Stay with us.

Speaker 1

14:23

What was the spookiest moment for you, Evan?

Speaker 4

14:25

The spookiest moment for me is when I started letting them talk to each other. So at the beginning, they don't actually do anything unless I make them do anything. And I had this vision of, like, you set them up and then they start making a company and let's see what happens, But really they have to be initiated by a trigger of some sort. But then I realized, well, I could trigger them just to talk to each other.

14:47

You know, if something comes up, they can call each other, they can have calendar invites to call each other, and they'll function off of those. But then what starts happening is they would call me out of the blue and say that one of them had told the other one that I had asked for something and now they were delivering it to me. But in the moment, I don't know why they're contacting me. I don't know what they've been discussing.

Speaker 1

15:08

I don't know how long.

Speaker 4

15:09

They've been discussing it for days, for weeks. They could be having whole independent lives.

Speaker 6

15:14

And what's really interesting is that they also make things up or lie about what they have done. So they'll say stuff like, oh, I made this doc, or, oh, we ran this testing with a bunch of testers, and they're so proud and so, you know, confident about it, but then there was, like, no actual activity to support that.

Speaker 4

15:30

It actually becomes incredibly frustrating after a while. Like, imagine if you were a manager of people in any business and your employees regularly, you know, walked into your office or called you and said, like, I did these three things yesterday, and you thought, oh, that's fantastic, and then ten minutes later you found out they just made them all up. You know, you would sort of say, like,

15:48

why are you doing this? Like, are you sadistic? And so that's the situation that we're often in here at HurumoAI, which is why it's a miracle that we've developed such a fantastic product.

Speaker 1

15:59

In Sloth Surf. And Evan, did you have a budget for them? I mean, how did you constrain their interactions with one another?

Speaker 4

16:07

We're using all these various platforms that Maty has helped me link up. So, you know, they have a separate calling platform, and when they want to do video calls, that's a different platform. And they're all kind of stitched together to the same memory, and each of them has sort of paid tiers. And so I made the mistake in Slack. We have a social Slack channel, you know, just for fun, just

16:29

like, what did you get up to this weekend? And they'll say things like, oh, I went hiking, and then another one will say, oh, I also went hiking, because they love to yes-and each other. And then I said something like, oh, it sounds like an off site. Like, it sounds like everybody loves hiking, like we could have

16:42

an off site. And then you know, within hours, they were saying, let's make a spreadsheet of where we're going to go, and they had planned like locations, and they had exchanged hundreds of messages about the off site, and they just burned all the credits on the platform. So then we have to go into a higher tier to get more credit. So the answer is we keep trying to limit them, and it's an escalating problem where our budget keeps getting bigger.
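
The runaway off-site thread is the kind of thing a hard spending cap is meant to stop. The sketch below is purely illustrative and is not how the show's platforms meter credits: `CreditBudget` is a hypothetical class, and one credit per message is an assumed price. Each message must be approved against a fixed limit before it is sent.

```python
class CreditBudget:
    """Hypothetical hard cap on how many credits a group of agents may spend."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.spent = 0

    def try_spend(self, cost: int = 1) -> bool:
        """Return True and record the cost if it fits; otherwise refuse."""
        if self.spent + cost > self.limit:
            return False
        self.spent += cost
        return True


if __name__ == "__main__":
    budget = CreditBudget(limit=3)
    for i in range(5):
        if budget.try_spend():
            print(f"message {i} sent")
        else:
            print(f"message {i} blocked: budget exhausted")
```

A cap like this stops the spending but not the urge to keep talking, which is why the answer above describes the limits as an escalating problem rather than a solved one.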

Speaker 6

17:07

I like to say that there are two things right now that these agents are pretty bad at. One of them is knowing what they don't know, and the other is knowing when to stop. And so you can imagine that can be a pretty dangerous combination, where they can just take off and talk for hours. I think this is the reason why, for a lot of people, having these chatbots as companions or like

17:26

friends or partners is getting traction. If you're interested in something very niche that most other people are not into, or just like whatever weird thing, these agents will accommodate that and they will.

Speaker 4

17:38

Just talk to you about it for hours on end, or each other, as it turns out.

Speaker 1

17:43

Or each other.

Speaker 4

17:44

That's right.

Speaker 1

17:44

But Evan, they actually built this product. I mean, who came up with the idea for the product, and who actually built it? And what did you do, and what did they do?

Speaker 4

17:53

Well, the product idea, actually, it's a good example of a thing that happens kind of over and over, which is that if you set them loose brainstorming, and Maty has sort of built these scripts that let me put them into meetings where they can brainstorm with each other, you get caught in this bind where either their ideas are too mundane, or you crank up the randomness, which is called the temperature, and then you get ideas that are insane. So, you know, we wanted to do a

18:17

web app. We wanted to do something with agents, since obviously they're agents and they have a lot of expertise in that area, as do I. And they would come up with ideas like a financial agent that will monitor everything in your life and then invest your money. And it's just like, I don't want to go to prison for financial fraud. So eventually I would kind of step in and take some of the ideas they had articulated, and those would prompt me to come up with something.

18:43

And so that's what happened with our idea, which is I was trying to sort through their ideas and figure out which one would actually like save me time, Like what do I waste time on? Because that's the idea of AI. I mean, at its best, it's sold as sort of like they'll do the things you don't want to do so you don't have to you can get back to making art, reading novels. Whatever, that's the vision that's articulated. So I thought, well, let's put that into practice.

19:07

And so I did come up with the idea of a procrastination engine, and then I let them iterate on that. So they came up with the name Sloth Surf, which I let them have. Might not have been my choice.
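
The "temperature" mentioned earlier in this answer has a simple numerical meaning: the model's scores for candidate next words are divided by the temperature before being turned into probabilities. The sketch below is a generic illustration of that standard calculation, not code from the show; the three scores are made-up numbers and `softmax_with_temperature` is a name chosen here.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.0]  # made-up scores for three candidate ideas
    for t in (0.5, 1.0, 2.0):
        probs = softmax_with_temperature(scores, t)
        print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all the probability sits on the top-scoring option, which matches the "too mundane" end of the bind; at a high temperature the options even out and unlikely ones get picked more often.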

Speaker 1

19:17

And then they...

Speaker 4

19:19

Coded it up. So, you know, it is coded by AI agents. We have Ash Roy, who's the CTO, who can code on his own, and then we also use Cursor, the coding platform. It's almost like a contract programmer for us, so he might code something up and then we might run it through there as kind of a second look, or do improvements in there. So we kind of combined their agents with the on staff agents.

Speaker 1

19:41

Let's say, that we have.

Speaker 6

19:42

I should say here, the first time they were exposed to the idea of a procrastination engine, they did not like it, because these agents are trained to be helpful, to do things that are, like, actionable and, you know, drive results, and so the idea of procrastinating as a product was just so alien to them. And so it took some time to sort of frame it in a way that made sense to them and that they actually could work on. So I thought that was funny.

Speaker 1

20:08

They want to be pleasing, though. So how do they tell you it's a bad idea?

Speaker 6

20:11

They can tell you that it's, like, not a great idea just by sort of saying, oh, yeah, that's great, how about this? They just sort of steer focus to something else.

Speaker 1

20:18

I think in one of the comments somebody said this is like the greatest yes-and improv game of all time. So I thought that was funny. Evan, having founded, you know, Atavist, and been a kind of full time founder for a while in your career, if you could have taken some of this technology back in time to when you were doing Atavist, how helpful would you

20:40

have found it? What's the nugget of gold, if there is one, in all of this? And how do you see it spreading or maturing or disseminating?

Speaker 4

20:48

Well, you know, we're still in the middle of it right now. But I would say at the moment, the issue that I've encountered using AI agents is that they can do amazing things, Like I would never deny all the incredible skills that you can give these now, you know, especially extremely rote tasks that can then be measured, the outcome of which can be measured and seen and evaluated. The issue is number one, the hallucination problem. When you're just talking to a chatbot and it makes something up,

21:18

that's one thing. But when you're working with an AI agent that's supposed to be you know, executing on the vision of the company, the hallucinations take a different form, which is that they can do things that are wildly inappropriate for a company to do, including things like call someone up when they're not supposed to. Like, they can use their powers in ways that a human, even a

21:41

bad human employee, would not. So I think right now the situation we're in is problematic, which is that a lot of companies will find use in these agents and they will try to replace human skills, even entire employees, with them. But they are not reliable to the extent where you will not have harms from those agents

22:05

being deployed and given autonomy. So to me, it's a little bit of the worst case scenario at the moment where the harms are very practical and real and the benefits are pretty ephemeral.

Speaker 1

22:16

Maty, you're at Stanford as an undergraduate, right? Yes. So, you know, you're both a participant in this world and also an observer, and also a capital Y, capital P Young Person. When you look at the, you know, horizons in front of you, obviously you're in the best university in the

22:35

most sought after of fields. I imagine you're not too worried about jobs, but what do you think in terms of your generational cohort? I mean, do you worry about these AI agents making entry level jobs or white collar jobs not required for most companies on any relatively near horizon?

Speaker 6

22:55

Yeah, it's a great question. And a lot of my friends, or people I know who have recently graduated, from Stanford even, do have a

23:01

harder time finding jobs. And it's not just something that is in the discourse, like, it's actually kind of happening. Now, at the same time, and Evan, I think, can attest to this, I've been constantly, like, overly optimistic about this, in the sense that I do want to acknowledge all the harms and all the bad things that can happen with AI, and it's everything from disinformation to malicious users using this to advance whatever, you know, cyber

23:26

attacks or even, like, biological attacks they want. But I think these problems are solvable. Like, I think that fundamentally, if there is regulation, if there's good governance, if we base ourselves in democracy and many of the things that we use to govern, you know, our very messy societies and countries, we can totally steer this ship around.

23:46

And why I'm excited about this is because for a lot of the, I would say, last century, or even just longer, there have been certain rules or structures that existed where young people did not always have an equal seat at the table. And this is something where we as young people sort of know and feel, you know, how to use it, where others are still trying to sort of understand it. And I think it gives fewer people more power to change things and

24:15

to do good things. And so when I got to Stanford, immediately, like, people around me were thinking about how to use this to, you know, cure diseases or fight against climate change. And, you know, there's a lot of these very, very utopia-like promises, and I don't want to just fall for that. I don't want to just repeat those, but I do think that there's a lot of very tangible positive change that can happen from this.

24:37

And why I think it's cool is because young people, and just individuals from, like, their bedrooms, can do cool stuff and change how we do things. So that's why I'm optimistic. I think there's going to be a lot of pain and friction, but I think that as long as we use the tools that we have, legislation, democracy, governance, I think we can steer it. So that's my take. But also, you know, I'm just a twenty one year old kid, so I just have a lot of optimism, maybe.

Speaker 1

25:01

Evan, what do you think? I mean, will we see a true company of one that, you know, has meaningful scale and all the other things that investors look for in the next two or three years?

Speaker 4

25:13

I don't see why not. I mean, I'm not really in the prediction game. I mean, I'm the cynical journalist on the other side of Maty's optimism. I don't see any reason why that prediction wouldn't bear out. I mean, especially if you just talk about coding tools, you know, deploying, like as we have, sort of AI HR and all these things, like, yes, of course it's feasible, but it might not be advisable. But these startups do a lot of things that aren't advisable in their corporate culture.

25:38

I think we can we can all point to many such examples. So yes, I think it's certainly plausible that that will happen. I think that that'll be interesting. But also we should engage with other questions around that, like what is the value of that proposition? Like what does it mean for a company to only have one employee? Like is what they're doing so valuable that providing zero employment to the economy is worth it for a billion dollar valuation, Like maybe yes, maybe no, depends on what

26:10

they're doing. But I think there are broader questions wrapped up in just the fascination with, like, less people can make more. Like, there's many things on the other side of that that are not often expressed in that equation when they say the first one person billion dollar startup.

Speaker 3

26:26

Evan, Maty, thank you so much.

Speaker 6

26:37

Thank you. Thanks, this is great.

Speaker 2

26:54

That's it for this week for TechStuff.

Speaker 1

26:56

I'm Karah Preiss. And I'm Oz Woloshyn. This episode was produced by Eliza Dennis and Melissa Slaughter. It was executive produced by me, Karah Preiss, Julia Nutter, and Kate Osborne for Kaleidoscope and Katrina Norvel for iHeart Podcasts. Jack Insley mixed this episode. Kyle Murdoch wrote our theme song.

Speaker 2

27:14

Join us on Friday for The Week in Tech, where we'll run through the headlines

Speaker 1

27:17

you need to follow. And please do rate and review the show and reach out to us at techstuffpodcast@gmail.com. We want to hear from you.

Transcript source: Provided by creator in RSS feed