
67. Attack of LLM bots

Feb 27, 2025 · 32 min · Ep. 67

Summary

Ilya and Arnab discuss the LLM bot attack on their website, detailing the potential $7k cost in Vercel image optimization and their strategies to mitigate the issue. They also discuss their YC application, the benefits they hope to gain, and their approach to using AI in their podcast app. They wrap up with book and podcast recommendations.

Episode description

Our website got slammed by LLM bots from Amazon and Anthropic, which could've cost us $7k in Vercel image optimization costs!

Get the Metacast podcast app for Android and iOS at https://metacast.app.
Join the r/metacastapp subreddit.

Segments

  • [00:20] Introduction and episode overview
  • [01:31] New features and updates
  • [03:02] Why do we keep applying to YC?
  • [08:53] The LLM bot invasion incident
  • [21:26] We re-allowed LLM bots to crawl the site
  • [28:22] Book & podcast recommendations


Transcript

In a galaxy far, far away, the bots invaded our website. In total, what, we had about 60,000 requests within 24 hours or something like that. 43,000 requests from the Amazon bot, 13,000 or so from the Anthropic bot, 5,000 from the other, unidentified one. Hello, and welcome to the unprepared edition of the Metacast Behind the Scenes podcast, episode 67. I'm Ilya Bezdelev. And I'm Arnab Deka.

It is actually in some ways more prepared than any other episode. But do you want to explain why you say it's unprepared? Because as I started doing the intro... I'm like, oh, we should talk about YC, which we did not plan to talk about. It's not that we planned to not talk about it. Actually, we don't have a plan for this episode, except that we just said, let's talk about the operational incident that we had last week.

Okay, so let's set out the blueprint for this episode, right? No, let's just keep rambling and make it like totally random. We're going to talk about two things. You just added the first topic, which is YC, but we were going to mostly talk about, and I'm going to hyperbolize this a little bit, the invasion of LLM bots on our website. Episode 67, the attack of the bots.

And we need to start playing Star Wars music here. Yeah, we're going to talk about what happened, what could have happened if we were not paying attention, and how did we stop whatever was happening from happening. The defense against the bots. So quickly, yeah, the YC topic, let's start with that. So usually we also start with new features, but this time we just had a massive upgrade.

of dependencies and there aren't really many new features. There are some improvements, but it's not like something that we can talk about for many minutes. So we will just skip that part today. If you're interested, you can check them out in our Reddit. There's a list of the small improvements and whatever has been improved. And probably tomorrow, I'm going to push a PR to the web app with the changelog.

Yeah, I'm excited about that. Yeah, I'm just now writing like a markdown parsing pipeline and all that stuff so that we can easily publish those. Are you going to like retroactively publish previous versions too? Actually, I might, because what I'm doing is I'll create a way to add markdown files to the source code, and then it will render those changelogs. And we have the changelog in the same format every time we publish. So we should have all of those changelogs in...

So I can just copy paste them. It should be straightforward. The change log would be also like with images and stuff like you do on Reddit. I'm thinking about this differently, actually. Maybe we should just have bullet points.

And if we have something that needs images and all, we write a blog post about it and link to it from the changelog. Because I feel like a changelog should be a very straightforward format. Actually, when I looked at our FAQs, because I'm now refactoring our FAQs into the same format too, I'm just driving this

pipeline with the FAQs. When you have images in a wall of text, it's kind of disruptive. I would rather have it in a standard format. We should also talk about SEO. I guess it will be part of the discussion of the bot invasion. Anyway, so let's talk first about the incubators. So since we were on air last time, we have applied to two incubators. We have applied to South Park Commons. Yeah, an incubator based out of San Francisco and New York. Four or five locations across the

world. Right, yeah. We applied to that. Today is February 12th when we're recording this. Yesterday, a few hours before the deadline, we applied to YC. That was a very intense application, our fourth YC application. And it would have been the fifth if we didn't skip one of the cycles. Which we were wondering, why did we skip that? But anyway, I think it makes sense. So I posted on LinkedIn yesterday.

Like, we applied to YC. Wish us luck. And somebody asked us, what's the benefit of applying to YC? And I responded there. I don't know if you saw the comment, but let me ask this question to you, assuming that you didn't read my comments. I'm curious what you will say. For us or for the general person? For us specifically, I think the benefit is the networking.

That would be, I think, the primary benefit is the connections we would get from YC people who are in YC as well as other companies in YC. Well-known fact that YC companies help each other quite a lot. And I think we could probably accelerate our growth quite a lot if we got some sort of support like that. Because just on our own, I think we're improving the tech, the app, step by step.

We are progressing on marketing, organic marketing and SEO, and we'll talk about a little bit more. And we are seeing a healthy growth rate. But it's not enough to, I think, make this thing sustainable by any means within the next few months.

And that's where I think an accelerator thing comes in. So the primary benefit, I would say, is the networking. And the secondary would be, of course, the money. It's not a huge amount of money that you get from YC, but it would still be good enough to extend our runway by at least a year and a half, two years, maybe even more. What do you think, Ilya? I think we just passed the test on internal alignment because it's the same thing. So we are LLMs now?

No, I was thinking about the corporate alignment. We are synchronized. You know, LLMs have this alignment problem. So we are a really good LLM together now. Right. Let's hallucinate something. I wanted to share the founder's video moment that we had. I want to keep that for posterity. We were basically on the call. We were writing the script together. We approached this like there are a few things that you want to communicate and there are some things you say, some things I say.

We're taking turns, and I think it goes you, me, you, me, two times. And we wrote something that we were pretty content with. It was mostly about the company and the product. And then you were like, let's watch a few of the successful ones. And we looked at the DoorDash one from, I think, 2013, the old one.

We looked for a couple of other ones. And then I noticed a comment on the YouTube video. They said everything YC needs to know in the first seven seconds. And I'm like, interesting, because I remember what they said in the beginning. So we watched this again.

And what the guys say there is, we are computer science majors at MIT. We have known each other for four years. It's our second startup. And it's exactly at the seven-second mark. And we're like, okay, so maybe we should actually change our approach.

And we completely rewrote the script. And we started with, hey, I'm Arnab. I'm a former principal engineer at Amazon. So we were like, here's a credibility bomb, like right there. I go next and I'm like, hey, I'm Ilya, I'm a former product manager at Google and Amazon.

There's another credibility bomb. And then we are saying that we worked together for five years at Amazon and we started AWS Chatbot together. Founded and launched. Yes. And we worked together for five years and we've known each other for many years. And then everything else is about the product and all of that.

It just changed our mindset. Those like six words we read in the comments just completely changed our approach and just flipped it. And this video is very different from the previous three that we submitted. So let's see how that pans out. It is very different. Reflecting back.

Two years ago, around this time is when we probably submitted to YC the first time. Around March, April. Now it's February, okay? Approximately two years ago. And at that time, our mindset was that we will at least get an interview. Because we're like coming from AWS and Google and we have work history together. Like you don't just reject former principals at Amazon.

No, I mean, YC in their communication says that they like people who have previous experience and who are committed to this thing and have worked together before. So we're like, at least we'll get an interview. And here's our fourth time applying. We haven't even gotten an interview. We just apply, and then whenever the deadline is, we hear back saying, oh, you didn't get through. Here are a few resources or something to look at. Keep applying.

So that's what we do. So let's see. I'm not very hopeful, but you never know. Right. But we also threw AI under the bus in this application because I think now it's trendy to be sort of a little bit edgy about AI. Yeah, I guess I can't just verbatim repeat what we said because it's also about some other companies. The context of us talking about the AI. But basically what we said is we use AI under the hood. We don't want to throw AI in the face of the users because it's beside the point.

To be fair, that's always been how we have thought about it. We've never marketed or talked about Metacast as like an AI podcast app. However, I think in the last application, we did market it like that to YC. We did, okay. Internally, I think we've always thought about AI as powering something rather than the end in itself.

Yeah, it's just a means. Yeah, we'll give you an update on SPC NYC in about a month, month and a half, I think, is when we'll hear back from them. Yeah, so in a galaxy far, far away, the bots invaded our website. We have written a long post-mortem on our blog, which we'll link in the show notes. But here's what happened. Just to set the context, we have a web app. It's written on top of Next.js and hosted on Vercel. I think at the end of January,

we extended the podcast directory. So what we had previously, we had podcast pages, the pages of podcasts that we have on the platform. We also had the episode pages, but they were not discoverable by bots because links to those pages were generated client-side. So we've changed that and we made those episode pages discoverable.

And the whole purpose of this is SEO. Google and everybody discover our content, and when people are searching for, let's say, whatever-the-heck podcast, our link, over time, is one of the first few links that you see. You click into it, you find the podcast, but you also start to learn about Metacast, right? Basically, for us, it's the top of the funnel play. So somebody comes in and some of them may convert to Metacast. And I think we talked a little bit about this in our last episode,

in 66, but we are starting to see our traffic from Google increase. It's still very small, but the trend is very healthy. Without question, that is one of the things that is bringing us new people. Right now, actually, Google is very slow to index our site. I think it has only indexed 4,000 pages or so out of one and a half million. And that number is very important because that's why we had the problem. It's a very large number of pages,

which isn't a problem by itself. But what was happening is every page has an image on it. It's a cover of the podcast or cover of an episode if the episode has an episode-specific cover. When that image is displayed... It's actually downloaded from the podcast hosting provider, let's say Transistor or Libsyn or Buzzsprout, onto Vercel, resized, compressed, and then served from our site.

So instead of having a 3,000 by 3,000 pixel image that is like two megabytes, the end user sees a 300 by 300 pixel image, which is like 20 kilobytes. It's very efficient, very nice, very snappy, really, really cool. And it caches that image so that it only does this once. Yes. However, every 1,000 images optimized costs $5. Vercel calls this image optimization.

And is it on by default, or do you have to turn it on, or do you have to use a specific React component? It's on by default as long as you use the next/image component. I mean, it's a React component, but it's the Next.js-specific image component. It's from Vercel. Yeah.

is optimized by default unless it's turned off. By default, you cannot optimize external images, so you have to allow them. And that was the other problem that we had. If I may think through how it probably would have happened: when we started building the site, we only had the landing page in the beginning. Then we started adding more and more pages. Then we added podcast and episode pages. And for the landing page and those other pages, the about us page, our FAQs,

it makes sense to have this image optimization on. You want those pages to be blazing fast. You want the artifacts downloaded into the browser to be super small so that it's fast even on a slower network. And that's why we probably started using the image tag with the optimization, and then when we started doing podcast pages and episode pages, we just kept doing the same thing.
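For readers following along: in a Next.js app, external images have to be allow-listed before next/image will optimize them, and an allow-everything configuration, which is roughly the situation described next, might look something like this (a hypothetical sketch, not Metacast's actual config):

    // next.config.ts, a hypothetical sketch of an "optimize images from anywhere" setup
    // (the same shape works in next.config.js on older Next.js versions).
    import type { NextConfig } from 'next';

    const config: NextConfig = {
      images: {
        remotePatterns: [
          // The '**' wildcard hostname lets next/image fetch, resize, cache, and
          // serve images from any host on the internet. Convenient when podcast
          // covers can live anywhere, but every new source image counts against
          // the image optimization quota.
          { protocol: 'https', hostname: '**' },
        ],
      },
    };

    export default config;

Locking this down means listing only the hosts you trust, or skipping optimization for those images entirely, for example with next/image's unoptimized prop.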

Didn't really think about it too much, I think. But we explicitly allow-listed all hosts to be optimized. That's the thing that would have been impossible unless we put the configuration in. But for us, because a podcast could be hosted anywhere on the internet, you can't just allow

at least a few sites and be done with it. You pretty much have to open yourself up to optimizing any images. It's kind of a secondary problem, but we turned our website into an image optimization API. Well, luckily nobody used it as such, because, I guess...

Nobody discovered it. But what was happening is we started to see a lot of bot traffic. It was the Amazon bot. That was last Thursday and Friday, February 6th and 7th. Looking at the chart, it was going on for a few days prior to that, but then it spiked on Friday.

Like on Friday, it like doubled or tripled compared to the previous days. And Vercel has 5,000 image optimizations included for free with the Pro plan that we have. We did not start seeing increases in spend until, I think, it had been going on for like four days.

And then we got an alert that we've exceeded our spend limit. Not exceeded, I think we're like at 50% or something. And we were like, we're not supposed to be spending any money on it on top of what we are paying monthly for the plan itself. And then we saw those image optimizations. So we decided to turn it off.

but we turned it off incorrectly. We turned off the optimization for those specific images on podcast and episode pages, but we still left the image optimization API on our website up and running. So we still allowed those optimizations to happen for external images. And I think...

what happened is bots have crawled HTML of pages in the previous days, and then they started to also download images that they extracted from those pages. Yeah, or what could have also happened is typically crawling is like... You can't at the same time crawl stuff and parse everything from it and whatever pipeline you need to like extract the information from. So typically in crawling, I actually built a crawler back in AWS for the AWS website a while back.

But what you do is the discovery pipeline is independent from the actual parsing and data extraction pipeline. The discovery pipeline is basically just going super fast across all the website's links and putting them into a queue. And then another process will come, extract the HTML from each and store it. And then the final step is actually extracting the data from the HTML of that page.
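A toy sketch of that kind of decoupled pipeline (hypothetical code, not any real bot's): one stage only discovers links, another fetches and stores HTML, and a third extracts data such as image URLs later, which is why image requests can show up days after the HTML was crawled.

    // Hypothetical three-stage crawl pipeline; each stage drains its own queue.
    const fetchQueue: string[] = [];             // filled by discovery, drained by fetching
    const htmlStore = new Map<string, string>(); // filled by fetching, read by extraction
    const imageUrls: string[] = [];

    // Stage 1: starting from seed URLs, walk pages quickly and only collect links.
    async function discoverLinks(url: string): Promise<void> {
      const html = await (await fetch(url)).text();
      for (const [, href] of html.matchAll(/href="([^"]+)"/g)) fetchQueue.push(href);
    }

    // Stage 2: download and store the raw HTML for later processing.
    async function fetchAndStore(url: string): Promise<void> {
      htmlStore.set(url, await (await fetch(url)).text());
    }

    // Stage 3: extract data (here, image URLs) from stored HTML, possibly days later.
    function extractImageUrls(url: string): void {
      const html = htmlStore.get(url) ?? '';
      for (const [, src] of html.matchAll(/<img[^>]+src="([^"]+)"/g)) imageUrls.push(src);
    }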

Yes. And then basically they have been able to extract those image URLs. Yeah. They found all our links in the previous few days, but they had not found the content yet. Right. Yeah. And then they started hitting those image optimization.

like crazy. And unfortunately, we don't have enough observability to say whether it was the Amazon bot or the Anthropic bot, because these were the two biggest ones sending the requests. Or a third bot that was disguising itself as a Chrome browser, which is basically an unidentified

bot. So one of those three; we will not point fingers at which one because we don't know. They were downloading images by the thousand. That's when we were like, but we disabled those. But then we realized that those pipelines are sort of separate and that's what was happening. And in total, what, we had about... 60,000 requests within 24 hours or something like that? Yeah, within 24 hours on Friday last week, we had 43,000 requests from the Amazon bot, 13,000 or so from Anthropic's

ClaudeBot, 5,000 from the other, unidentified one. Yeah, and OpenAI hadn't found us yet. And so we're talking about overall about 60,000 requests over 24 hours. And the Anthropic bot and the Amazon bot are probably doing some of the pages that the other one has already done. So let's take out some of that, right? It's not a full 60,000, but let's say it's at least 25,000 to 30,000 requests to

pages that had never been cached before, which means about 15,000 to 20,000 image optimizations on top of what we probably had. I do have the actual number. So there were about 10,000 image optimizations. At the end of it, it was 13,800 or something like that. Yes. And I think the total number for the month was about 20,000. Yeah, 5,000 was included, so we didn't care about that part. After that, that's when we started seeing it, because...

Out of that 18,000, every 1,000 costs us $5. And this hits immediately, as the traffic volume is spiking up. Yeah, so imagine if, instead of 60,000 requests, they had sent like 500,000 requests. The napkin math for this: 1.5 million images, actually it would have been less than 1.5 million because some podcasts don't have any episode covers, but let's say it's a million images. A million at $5 per thousand is $5,000 that we would have had to pay Vercel.
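Spelled out, the napkin math from this exchange looks roughly like this (round numbers, and assuming roughly 1.4 million pages actually have covers, which appears to be where the $7k figure in the episode title comes from):

    1,000,000 images / 1,000 per billing unit x $5  ≈ $5,000
    1,400,000 images / 1,000 per billing unit x $5  ≈ $7,000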

That's a lot of money. I mean, they would probably forgive us, but I would hate to even have this conversation, ask them to forgive our bill, and be stressed about it. So what we did first: I was debugging this, it was very early for you. And I looked at the logs in the Vercel firewall and I saw that it was the Amazon bot and the Claude bot, and I'm like,

Holy shit. Because it's not even user traffic. If it was real users, it would have been a different calculation probably, but it was them collecting data for their needs. Then we would happily pay $5,000, I think, yeah. If it was like real user traffic, it would have meant some real success, I guess, going on. So I just blocked all of them. I blocked the Amazon bot, blocked ClaudeBot, and I blocked GPTBot from OpenAI. So for people who may be in a similar situation, how did you block them immediately?

In the Vercel firewall, by user agent. I looked at the user agents that sent the most requests in the Vercel firewall. I set up firewall rules like, if the user agent contains Amazonbot, deny requests for these paths. Because we only denied them access to the podcast directory. For everything else, we kept the access open. You could not have denied the other bot that was masquerading as a Chrome browser with the user agent.

Not easily in Vercel, but we could have added Captcha or something like that there. Vercel has some good protection against all that, yeah. Yeah. At the same time, I also posted about it on LinkedIn. So people sent me to Cloudflare. A few people recommended Cloudflare.

where they actually have, they call it AI independence. So they have a thing like one checkbox that you click and then it starts to challenge all of those bots and deny them access to your website, which is pretty cool. So basically we stopped the bleeding first, stopped the spend that they were generating.
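The blocking itself was done with rules in the Vercel firewall dashboard, but for readers who want to see the same idea in code, a rough Next.js middleware equivalent might look like this (a hypothetical sketch with illustrative paths, not what Metacast actually shipped):

    // middleware.ts, hypothetical sketch: deny known crawler user agents access
    // to the directory pages while leaving the rest of the site open.
    import { NextRequest, NextResponse } from 'next/server';

    const BLOCKED_AGENTS = ['Amazonbot', 'ClaudeBot', 'GPTBot'];

    export function middleware(request: NextRequest) {
      const userAgent = request.headers.get('user-agent') ?? '';
      if (BLOCKED_AGENTS.some((agent) => userAgent.includes(agent))) {
        return new NextResponse('Forbidden', { status: 403 });
      }
      return NextResponse.next();
    }

    // Only run on directory pages (paths are illustrative, not Metacast's real routes).
    export const config = { matcher: ['/podcasts/:path*', '/episodes/:path*'] };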

But then our image optimization API was still exposed. So we needed to do something about it. Basically, we removed those configurations that allowed any image on the internet to be resized through our website. Yeah, it's kind of a scary thing to say. Here's the interesting thing. You mentioned this in the PR, but I'll try to describe it. Because we are serving podcast HTML from RSS feeds, somebody could have written up an RSS feed

with some image URLs that they wanted to be optimized and whatnot in it, in the links. And then it'll come through our website, we will optimize it and serve the smaller versions of it, and they could have downloaded it. Or you could have just looked up the image URL

and just substituted it, because basically it looked like slash underscore next slash image, and then whatever image URL, the image URL is right in there. You could have just plugged in your own image URL along with the dimensions, and you'd just get the image. It was a crazy loophole in our website.
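For illustration, a request to the Next.js image optimizer generally looks something like this, with the source image URL, target width, and quality passed in the query string (exact parameters can vary by version):

    /_next/image?url=https%3A%2F%2Fsome-podcast-host.example%2Fcover.jpg&w=640&q=75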

Obviously, like no user data would be affected by this, but it's just like our cost. Right. Also, we have been very careful about the website. Well, it has read-only access to some specific public parts of the data, which is podcasts and episodes right now. There is no user-facing data. There is no way to even access it from the website. Even for the podcast and episode data, it accesses the data through a middleware cloud function. Yes, and we can turn off that middleware anytime.

Yeah, you can change that endpoint at any time. And it only does it from our server side. So none of those endpoints are exposed to the public through client code. Except for the image optimization API that we had on the client side. Right. Somebody made a joke. He said that it could be a good pivot for us to create an image optimization API instead of a podcasting app. He's like, you paid $7,000 for image optimization and created a service for it.

My knee-jerk reaction was to just block all of those bots in robots.txt. I sent a PR for that because I went down the rabbit hole reading up on all of those bots, the user agents, and also looked at the SEO bots like SEMrush and

Ahrefs. So apparently, actually, those SEO bots crawl the entire internet in search of backlinks to the websites of their customers. Apparently, they're very aggressive. They haven't hit our website yet. So I created a list. Actually, it's also in the blog post. I have a list of all

those user agents with links to where those companies describe what those bots do. But then you called me and you were like, hmm, maybe we should actually allow LLMs to crawl our website. Yeah, and you can describe your reasoning. Actually, I was telling you that we should allow open...

AI and Anthropic and all that, because taking a step back, LLMs have a huge hallucination problem. They very frequently spit out information that they just made up because ultimately it's just a token generation machine. So they have tried to prevent that by adding citations into it. So very frequently, if they're stating a fact, nowadays you will see that, oh, I found this information in this website. And so we thought that if they're finding information from our website,

and they're telling their users about it. Not directly, but indirectly, they may say that they found this information in our website. Or over time, they may say that on Metacast, I found this information. And like, okay, that's a good way for the user to discover us. Let's keep that in. The one that I wasn't sure about was the Amazon bot because I didn't really think there would be any benefit about it. But you mentioned that Alexa actually uses that data.

And what I know is, on our home Alexa, very frequently, if I ask it something and it doesn't know, and Alexa doesn't know a lot of things compared to, like, actual AI agents, it would say, on this website I found this information, and it'll read that information. We're like, okay, we should probably keep that in too. Yeah, it's a lot of requests, but pages are cached. So far, it hasn't cost us anything beyond what we're already paying.

Right. Once we turned off the image optimization, all of these 1.5 million pages being requested, cached, and all that doesn't cost us anything on top of the Pro plan that we're paying for anyway. So in our robots.txt, we actually allowed all bots to crawl the website, except for a couple of paths, like an API endpoint,

which I don't think we use for anything anymore. It's kind of redundant. But anyway, so we have a couple of paths that we've disallowed, but we've also blocked, well, asked, SemrushBot, MJ12bot, and DotBot (the first time I heard about those was last Friday)

to not crawl our website at all. Because with those bots, you don't get any benefit from them unless you are their customer, and actually you sort of support the overall effort. If you allow SEMrush on your website, you're just helping somebody else know if there are backlinks to their website. Why would I pay for this? I mean, unless we start using SEMrush ourselves, then we can enable it. But right now there is no benefit to allowing SEMrush. Yeah, we use Ahrefs and we did not disallow Ahrefs.
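The general shape of the robots.txt being described, simplified (illustrative paths, not the exact file):

    # Everyone may crawl, except a couple of internal paths
    User-agent: *
    Disallow: /api/

    # SEO crawlers we get no benefit from
    User-agent: SemrushBot
    Disallow: /

    User-agent: MJ12bot
    Disallow: /

    User-agent: DotBot
    Disallow: /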

Yeah. And then one last disclaimer about it, we have said multiple times on this podcast now that using robots.txt, we allowed or disallowed specific things. But at the end of it, it's more like a... Please read this and honor these rules rather than any enforcement. Like robots.txt on its own cannot enforce anything. Bots can do whatever the heck they want to do in spite of what's in the robots.txt. But...

Our intention is that these well-known bots from OpenAI and Claude and Anthropic and Amazon, they're going to honor what's in there, because it would be a huge PR sort of problem for them not to honor what's in the robots.txt. Other bots can get around it, like the one you said that basically pretended to be a Chrome browser. And those will get around it. And that's fine. I think when that problem comes, we'll tackle it. But now we know more about how to tackle those kinds of problems. Yeah.

We always have Cloudflare as our sort of backup if we have to. Can you explain, like in a minute, how you would take a website that's running on Next.js and Vercel and put Cloudflare on top of it? Like, is there any downtime? Do we have to do anything else? Oh, yeah, putting me on the spot. I don't know. But my hypothesis is that you probably sign up for a Cloudflare account, it's like a CDN, right? So you probably give it the Vercel endpoint, and then you point DNS to Cloudflare. Okay, makes sense.

I wanted to comment that you were talking about those AI companies and their reputation as if they care. The social media response was very interesting. I basically typed it up in like a few minutes. I put the alarm emoji and said, like, our website is getting slammed by Amazon and Claude, with a screenshot and a few words about how it's unfair that we are paying for the infrastructure that they use right now.

Also, we weren't clear what was going on, we were just seeing costs rise, so it was this kind of emotional thing. It started to go viral. So we had 400,000 impressions on this one. I think 130 or so reposts, a couple of thousand likes. And loads of comments. Yeah.

I would say half or more were the technical stuff, like use this or use that, or like poison the well, give them some junk data, all of that stuff. It's called an LLM tar pit. It's a new term I learned. But the other half was like, it's data theft. You should sue them. They don't respect.

They don't care. They pretend to be something else, even if you disallow them in robots.txt. And actually, the Cloudflare post, they have this Cloudflare AI independence blog post. It talks about Perplexity disguising itself as what it's not, without

actually saying this. So they say something like, some bots have been shown to, blah, blah, blah. And it links to the New York Times or some other places where people wrote about that. So if you click those links, you know it's Perplexity. But they actually said that Perplexity is doing this without

saying the word Perplexity, which is very smart on Cloudflare's side. So I was actually thinking that there is no direct benefit from allowing the LLMs to crawl our site, like we've mentioned. And most people think that way, right? On the other hand, though, the fact that we got attacked, you posted on LinkedIn, it got 400,000 page views. If that had brought us traffic from people, would you say that that's a benefit from the AI bots crawling us or not? So we had...

two premium signups during that period, right? So they might have come from the post. I mean, obviously, the blog post, the post-mortem, will have a longer effect because we also posted about it on Reddit, so more may come from it. In the moment, probably a couple of people came from it, and the 70% of five...

we paid for those. It's a very high customer acquisition cost. And this is, I think, generally what I feel about LinkedIn, and I've heard other people say it too: LinkedIn in and of itself is great for talking about things and going viral and all that, but people are finding it very hard to actually convert from LinkedIn into customers. Right. What I did find really cool, though, and I think we should call this out: so we have a friend of the company, Brian.

I think he works for a company, but he's also indie hacking on the side. I don't think he's even a user because I think his use case is, like, so simple. He may be a paying customer, but I don't think he's a user. But I got tagged. I'm like, oh, I've been tagged in a comment, so I'm going there. And somebody was asking for some transcript stuff in a podcast app that they wish they had.

And Brian is like, oh, go check out Metacast that Ilya is working on. I'm like, this is really nice. So that person might go and check it out. And I think eventually it is beneficial. But like you said, I think measuring individual posts is sort of beside the point. Anyway, so let's wrap this up.

What are you listening to? So I am in still the same book series that I talked about last time, Red Rising. I am in book four now. So what started me reading this again is because I suggested it to my daughter. She loved the first.

book, but she reads the paper books. So she's in the third book now, and I have reached the fourth book in the audiobook format now. I realize that I haven't read the sixth book in the series yet, so I'm looking forward to that. That'll be the first time listening to one.

That's very cool. I'm also continuing to listen to that Russian book about those two Russian guys traveling the U.S. in the 1930s. I have like two hours of the book left. They've gotten from New York across the U.S. The place where I currently am is in California. And it's so interesting to hear them describe the Golden Gate Bridge before it existed. So they are taking a ferry ride

across the bay because they came from the north to San Francisco. And one of the benefits of taking the ferry versus the San Mateo Bridge is that you can see the Golden Gate. And so they took the ferry. So they saw a lot of cables there, cables and all of that stuff. So it was being built at the time. And also they were given a tour of the San Francisco Bay Bridge. I watched a video about those two bridges. Actually, the Bay Bridge is more complex than the Golden Gate Bridge. I think it's...

longer, it's like deeper water, unstable bottom. It's actually more complex from an engineering perspective, but it's less famous for obvious reasons. So, and they were given a tour and they were like walking up on scaffolding or something like that. Hearing all of that just gave me chills. I'm like, wow. We were born like 50 years after it was built, but they were there as it was happening. But the funny part about the book is towards the end of the book,

it gets a lot more ideological. So they started to criticize the US a lot more and admire the Soviet Union a lot more. I don't know if it's related to just the fact that they've spent so much time there and started to long for the homeland. Or maybe it's just that they lured you into the story by telling you all of these good things to just, like, slam you at the end with all the bad parts.

Anyways, I'm enjoying the book. I don't know how it translates to English, but in Russian, it's like, wow, the language is just so good. I'm also recognizing that the contemporary comedians, the people who we know now, like in comedy clubs and all that stuff, the language they use is very much derived from the language they...

created. So those two people, they were so talented that they set the trend for what comedy is in our culture. So it's very cool. So is it like the two of them talking to each other? Is that the style of the book? Or how does it go?

They just describe the things. The narrative, yeah. They also have dialogues in there, conversations. But like the punchlines, how they would describe something and then have the punchline that basically flips the whole thing upside down. How they use the language, how they reference things.

I'm like, wow. Because I like writing. So I'm listening to them and I'm like, wow, this is so good. Very talented. I'm enjoying that part too. You're listening to it in Russian? Yes. And they also have three narrators in there. Because on the trip...

So there are the two authors who narrate. And there is an American gentleman and his wife who actually drive the car and help them out. And there are two separate voices. So there is a voice for the narrators and there is a voice for the man and a voice for the woman. It's kind of also funny to listen to, with

some music in there. The whole thing is very theatrical. How big is it? Like how many hours is it? I think it's 20 hours. It'll probably be 16 hours in English. You've been reading it for about a month now? Maybe longer. So you're clearly not rushing through it. Not rushing through it. No. Not rushing through it. That's what I was setting up for over the last 30 seconds. Right, right.

Okay. Well, this was a fun episode of the attack of the LLM bots and the defense that we played. Let's see what the next few weeks have in store for us. We'll be back soon. Yeah. Bye.

This transcript was generated by Metacast using AI and may contain inaccuracies. Learn more about transcripts.