
#197 - AI in Gmail+Docs, MiniMax-01, Titans, Transformer^2

Jan 20, 2025 · 1 hr 24 min · Ep. 236

Episode description

Our 197th episode with a summary and discussion of last week's big AI news! Recorded on 01/17/2025.

Join our brand new Discord here! https://discord.gg/nTyezGSKwP

Hosted by Andrey Kurenkov and guest-hosted by the folks from Latent Space

Read our text newsletter and comment on the podcast at https://lastweekin.ai/.

Sponsors:

  • The Generator - An interdisciplinary AI lab empowering innovators from all fields to bring visionary ideas to life by harnessing the capabilities of artificial intelligence.

In this episode:

 - Google and Mistral sign deals with AP and AFP, respectively, to deliver up-to-date news through their AI platforms.
 - ChatGPT introduces a tasks feature for reminders and to-dos, positioning itself more as a personal assistant.
 - Synthesia raises $180 million to enhance its AI video platform for generating videos of human avatars.
 - New U.S. guidelines restrict exporting AI chips to various countries, impacting Nvidia and other tech firms.

If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form.

Timestamps + Links:

Transcript

Andrey

Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode we will summarize and discuss some of last week's most interesting AI news, and you can also check out our Last Week in AI newsletter at lastweekin.ai for stuff we are not gonna cover in this episode. You can also look in the description of this episode for the links to the stories we will cover. I am Andrey Kurenkov, one of your usual hosts.

My background is having studied AI in grad school, and now I work at a generative AI startup. And, not as usual, we do not have Jeremy. He's off doing some policy business, maybe having to do with the new administration, I don't know. But we do have some great co-hosts from Latent Space, the number one AI engineer podcast and a community reaching over two million technical readers. They do a lot of interviews with very cool AI people.

So I think they'll be great co-hosts, and I'll just let them introduce themselves.

Alessio

I'm Alessio. Yeah, I co-host Latent Space with Sean. I run a venture capital firm called Decibel, where we do early-stage investing in technical founders. And I'm a pretty active open source and AI engineer. So yeah, on my GitHub and Twitter, there are a lot of the projects that I work on.

Shawn

Yeah. And even though Alessio is a VC, I was very surprised to hear that he was at the Vercel hackathon last night, and I was not. This is like a complete reversal situation. I was cooking

Alessio

Last night, I can tell you.

Shawn

Yeah, so this is Swyx. Hi, I am also co-host of Latent Space, and I also run the AI Engineer Summit that's coming up in New York in February, which is going to be entirely focused on agents and AI engineering leadership. And I am also the founder of Smol AI, and we run AI News. We're kind of in the same space; like, I've been a listener for a long time, so I'm just really glad to be on when, you know, your A-player is out of commission.

Andrey

It's always nice to have some new co-hosts that are also in the space. Maybe you can also give the listeners a bit more of an idea of what Latent Space is about and what they can expect if they do check out your podcast.

Shawn

We try to serve the AI engineer in a broad sense. So Latent Space is about the transition to the productization of AI by software engineers. I think that a lot of the existing AI and ML media or coverage, or even attitudes and philosophy, is very focused on research.

And I think, you know, especially with the movement that started since ChatGPT, a lot of things are now going into production, and the bar has been lowered a lot for, you know, basically guys like me to be able to build AI products. We do interviews; that's definitely a big part of our podcast. And then we also do essays. We do conference coverage. So we cover NeurIPS, ICML, ICLR, and cover what the latest research is.

Alessio, did I, did I miss out anything?

Alessio

No, I think you covered most of it. And then we have a weekly paper club, and we have a weekly AI in Action club on Friday. So the Discord is active. We have about 5,500 people on the Discord now. So if you're interested in research, Wednesday is your day. If you're interested in applied AI, Friday is your day. Go check it out. And we're trying to do more on YouTube, so check us out on YouTube and let us know if you like the formats. That's our 2025 plan.

Andrey

Yeah, there you go. And as listeners can hear, you're obviously also very knowledgeable in the space of AI. You're kind of living in it day to day, so to speak. So it'll be a fun episode. Just to give a quick preview, as we always do: the episode is actually going to be a bit light this week. There aren't too many huge news stories, and not any huge news on the tools and apps side. Some minor updates on the business side. The OpenAI drama is kind of quieting down.

So there are just some small updates. Primarily some pretty cool projects and open source and research stories, similar to last week. And then some more updates on the actions that the Biden administration is taking before they close out. So yeah, a bit of a light news week, and we'll probably be a bit shorter than usual, but it'll give us some time to chat.

And then one more thing: we do want to acknowledge any listener comments and corrections. We're seeing some more comments on YouTube, which is always nice. And we are having a bit of a chat on the new Discord, so the link is still in the description. You can go to the Last Week in AI Discord, kind of different from Latent Space's. And it's still kind of forming; we'll see how it goes.

But the idea is I'll at least try to post some news stories and papers as we see them, rather than as we cover them on the podcast, and then we can chat about them there. So feel free to check it out. It's a fun spot, and if nothing else, it's fun for me to see the sorts of people who listen to the podcast.

And before we get to the news, as usual, we do want to acknowledge our sponsor, which, as has been the case for a little while now, is The Generator, Babson College's interdisciplinary AI lab focused on entrepreneurial AI, with Babson being the number one school for entrepreneurship in the U.S. There was this initiative where professors from all across Babson partnered with students to launch The Generator.

It's organized into various groups, like entrepreneurship and business innovation, AI ethics and society, the future of work and talent, and so on. They are peer-training all of Babson, making sure that the faculty there is up to date on AI concepts and AI tools, and are, you know, I guess, guiding people in learning how to do entrepreneurship, innovation, and creativity with AI. So a very cool initiative.

And I guess Babson as a college is going all in on AI. And now onto the news, starting with tools and apps as usual. To start with, we have kind of a duo of stories where neither is too huge, but I figured we can just pair them up. So first, Google is making AI in Gmail and Docs free, but raising the price of Workspace. So there are these Workspace apps you can pay for as a business, like Gmail, Docs, and Sheets, and they've had the ability to pay for AI features at $20 per user per month.

That was for the Gemini Business plan. So Google is now making that free but raising the base cost of the subscription from $12 to $14, which kind of makes you wonder, I guess, if people were really into this Gemini Business plan. And then, just as that happened, Microsoft did something sort of similar. They have

Copilot for business, and they did some rebranding. The free Copilot for business is now Microsoft 365 Copilot Chat, a rebranding of what was Bing Chat Enterprise, and they are emphasizing the agent angle here. And again, trying to encourage people to sign up for the paid Microsoft 365 Copilot.

So yeah, a real duo of stories there, of Google and Microsoft both having their kind of business offerings for people to pay for. And it makes me, yeah, I do often wonder how many people find these things useful. The AI tools on Google kind of do email summaries, some automated meeting notes, some writing tools; nothing too crazy. So hard to say, I guess, how excited business people are about these subscriptions.

Alessio

Yeah, we have a group that we call the AI Center of Excellence at the firm, which is about 200 Fortune 500 execs and AI buyers. And yeah, most people will tell you that copilots are a scam, because people are paying, you know, 12 bucks a month for Workspace, and then they're being asked for 20 bucks for the AI stuff, and the math doesn't work out. What you're basically seeing happening is kind of an AI washing of the P&L.

So what those companies are basically doing is like, hey, look, you're already paying me 12 bucks; I'll just charge you 14 bucks. But then the way they're going to look at it internally is, you know, instead of two bucks going to AI, it's like six bucks are going to AI and eight bucks are going to the previous product. So then the reporting of the AI revenue, and kind of the ROI on the AI investments that these companies are doing, looks better.

So a lot of it is financial engineering. But some of the, you know, maybe more midsize, 10 to 30 billion market cap companies that I talk with, all of them started with their AI stuff for free. They're not trying to charge for AI separately, because they don't really have a lot of leverage. In Microsoft's case, people are not moving out: if you're a Microsoft shop and you're using Teams, you're using SharePoint, you're using the stuff.

You're not going to move out, but it doesn't mean that you're going to pay for AI. But obviously Microsoft is under a lot of scrutiny to show that all this money they're pouring into AI and data centers is generating ROI. So you'll see more of this AI bundling and pricing.

Shawn

I'm a little bit more positive, I guess, on this. You know, we always wanted this vision of AI that is too cheap to meter, and having a really good free tier and all these things just makes AI sort of everywhere. I also, I mean, I definitely agree that AI washing is going on. The consulting firms are extremely guilty of this. But this is the future you want: that AI is free and everywhere in small ways, and then we'll hopefully step it up over time.

One thing: I was just looking at their intro video, the Copilot Chat intro video, and noticed, if you look at the UI, it doesn't say what model they use. It doesn't say OpenAI anywhere. And this is a question of how much Microsoft wants to collaborate with OpenAI in the future. A little bit unknown, but you know, it's interesting that Microsoft used to have a very, very close relationship with OpenAI and now maybe less so.

Andrey

Right. Yeah, I think I kind of agree, where it's a bit of both. Like, as a user of Gmail and Google Sheets, I do look forward to a future where some of the boring stuff I do on spreadsheets is automated and I can just ask Gemini to do it. And as you said, I think this is a real advantage of Google and Microsoft, where they do have people already using their suite of products and they can sort of just encourage them to upgrade and pay.

They are seemingly iterating on how to get people to do that. With Microsoft's latest rebranding, the free version is Copilot Chat; the paid version is just Copilot. They have various things that you pay $30 a month for, and then they also have these agents where you pay as you go, where they do research for you on the web and their knowledge graph. So they're definitely expanding the product suite aggressively and trying to get people to buy into it.

Okay. And next story, we have Google signing a deal with AP to deliver up-to-date news through Gemini. So AP is the Associated Press; they do a lot of delivery of breaking news. And now we are seeing yet another deal with a news organization. We've seen OpenAI already doing a deal with AP back in 2023, and now Google has done the same thing, presumably paying to integrate and get that breaking news. So we've talked a lot about this as a trend with OpenAI over the past year.

And we are now starting to see, it seems other companies following in their footsteps.

Shawn

We have this thesis on our podcast called the four wars of AI, and this is kind of the data war that people are fighting. I don't know if this is the first example, but it's one of the earlier examples of a news provider not being exclusive: they do a deal with OpenAI and with Google. I don't know if that was a deal point for OpenAI in doing their deal, whether or not they would pay extra for AP to not deal with anyone else.

But clearly, now this is possible because AP has done it. And clearly it's in the news organizations' interest to strike a deal with every single LLM trainer on the planet. Yeah, I mean, I really wonder how much they charge.

Alessio

Yeah. And I think the structure now, with ChatGPT search and things like that, is changing. Before, it was like: if you use my news to train your model, I don't really get anything out of it. Versus now they're moving more towards these products with source attribution, where, well, first I'm getting money for access, and then I'm getting clicks to my product.

So Barron's is doing the same thing with the Wall Street Journal and some of these more financial services publications. At the end of the day, if Google, so to speak, is going away, and the social media algorithm that Elon has put in place downranks links all the time, you need to find some other way for people to come to your product. So yeah, these deals can get quite expensive.

Just based on what I've heard from people working on this, order of magnitude, at the early stage, early companies are paying about $750K to $1.5 million for some of this data. I'm sure that, you know, OpenAI is maybe paying $50 million, $30 million, something like that, depending on the source and some of the more fine-grained terms, like, for example, can you also use it for training. There are a lot of parameters that go into it.

Andrey

Yeah. My impression from having covered some of these stories is we didn't get any concrete numbers, but just from, I guess, various examples of this with OpenAI, my impression was it's easily in the millions, in some cases in the tens of millions, depending on how big the publisher is. So it's a very lucrative deal for the publishers. And I do recall, actually, as we've been covering some of these developments, that it was noted that these are non-exclusive deals.

And perhaps that's also the case with others, and we'll get more stories like this.

Shawn

And just broadly, you know, I think part of the war is that there's this sort of fight-or-flight, partner-or-lawsuit decision going on. For someone that made the other decision, of partnering, you can always listen to the interview The Verge's podcast did with The Atlantic on why they did the licensing deal.

And I think it's really interesting to view it from their perspective: they actually don't see that much difference between suing them and doing a deal with them.

It's just purely economic: how can I get the most money out of my content play? On the LLM lab side, I think it's kind of incriminating. Like, you know, I'm not a legal strategist or anything, but you can say that AI training on data is fair use, and therefore you don't have to pay for any of it; but by paying and licensing, you kind of admit guilt. So I don't think you can have it both ways.

But you know, obviously I think this will be played out in court.

Andrey

Yeah. And that's one thing we did note with all these OpenAI stories: they have put forward this fair use argument for training on all the publishers' data, as they have also been signing all these deals, partly to train on new data and have breaking news, but it seems also to have the back catalog. So yeah, as you say, we'll see how that plays into some of the lawsuits going on. And onto the next story.

Actually, it's very related: we have Mistral signing a deal with AFP, the Agence France-Presse, to offer up-to-date news in Le Chat. Le Chat is their chatbot, basically their ChatGPT or Gemini, and very similarly, it will now be able to give you access to stories as you chat with it.

So another example of that data war, and almost a search war going on as well. Or I don't know if it's a war, but everyone is trying to get you to use their chatbot to ask about stuff that's going on right now. And Mistral trying to also compete in that chatbot space is pretty curious, because I feel like I don't know if a lot of

people are using Le Chat, and I don't know if they'll be able to compete with ChatGPT and Gemini and so on, which are already so dominant, but it's interesting to see them try. No comment there. And moving away from those stories to something a little different, dealing with OpenAI, we have ChatGPT getting reminders and to-dos. So they're adding this beta feature called Tasks in ChatGPT, which is what it sounds like.

It can schedule future actions and reminders, similar to stuff that Google Assistant or Siri can do. It is available to paying users: Plus, Teams, and Pro subscribers. And you can have these one-time or recurring tasks, like daily weather reports or reminders. So again, kind of surprising to me from a product perspective to see them doing this kind of thing. I always think of these chatbots as places you go to do research or your work.

And here we're doing this thing that is very much like Siri, kind of positioning ChatGPT as more of a personal assistant, which I don't think we've seen so much of in their product direction so far.

Alessio

Yeah. This is basically: how do you get batch completions in the ChatGPT interface? I think it's obviously great for them, because you can kind of predict traffic. I'm sure they pre-compute a lot of these things, because most of the time it's not live data. And then, yeah, they got cron jobs, you know? If you're a developer, obviously you want to schedule things. But I think for maybe the non-technical folks, this is a big upgrade, just being able to

do the same thing every time at a specific time. It's good. I don't know what the adoption is going to look like, because it's kind of limited, right? You don't have function calling, you don't have code interpreter. We'll see what the numbers look like.
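To make the cron-jobs-for-completions framing concrete, here is a minimal sketch of a recurring scheduled prompt in Python. This is our own illustration of the concept, not how OpenAI implements Tasks; the schedule library and the example prompt are stand-ins we chose.

```python
# Minimal sketch of a recurring "task": a prompt run on a timer, like a cron job.
# Illustration of the concept only, not OpenAI's actual Tasks implementation.
import time

import schedule  # pip install schedule
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def daily_briefing():
    # One completion per day, which a provider could precompute and deliver later.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": "Give me a two-sentence morning news briefing."}],
    )
    print(response.choices[0].message.content)

# The equivalent cron entry would be: 0 8 * * *
schedule.every().day.at("08:00").do(daily_briefing)

while True:
    schedule.run_pending()
    time.sleep(60)
```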

Shawn

Yeah, but they can easily add that, you know, over time. So I think there's this dual story: depending on who you are, this is the biggest launch of the whole week, or this was a complete disappointment from OpenAI. You know, you said you set out to build AGI, and now you're building reminders. And I think it's just a starting point today. They want to get somewhere with the agent stuff.

And this is a core loop, you know, of what an agent needs to do, and it needs to be asynchronous. LangChain happened to also release what they call ambient agents as OpenAI released Tasks, and I think both of them are basically exploring spaces where you don't have to initiate the chat in order for the AI to do something. And I think the general theme of ambient agents is something I can bet on, or follow and explore, throughout 2025.

I can see that ChatGPT Tasks is on that path. It's just a very, very small step on that path.

Andrey

Right. It is trying to compete with, basically, I guess, what you get with the Google or Microsoft suite of products, where it's built into your productivity tools, into your email, et cetera. And I think this is a feature that many people foresee with AI being a sort of personal assistant that is ambiently there for you, doing stuff for you all the time, not just when you are interacting with it. So yeah, I think it's another example of trying to position ChatGPT for that.

And also to differentiate it, because in my view, there's still not much of a distinction between ChatGPT and Claude and Gemini; you know, one chatbot is not too different from another. And so there's no stickiness there, and whoever is cheaper and faster and better is probably what you're going to go to. I've personally been using Claude a lot more. So there is that product angle too, of how you get people to stick around and commit to you.

And moving on to applications and business. Not too many exciting stories here, and we start with something pretty different from our usual stories related to chatbots or LLMs. I think this one is pretty good to highlight: Palmer Luckey's AI defense company Anduril is building a $1 billion plant in Ohio. So Palmer Luckey, I guess, is being highlighted here as the pretty prominent founder of this company.

Previously, he did VR and started Oculus, now part of Meta. And after that he started Anduril, which is a company building drones, really AI-enabled drones, for the military. For military applications, they have the Fury and Roadrunner drones and Barracuda missiles, the idea being that these are AI-enabled, more advanced, and competing with, I guess you could say, legacy contractors and R&D shops for the military.

So clearly they're getting a lot of business, considering they're building a $1 billion factory named Arsenal-1. And I think it really highlights that often we are not talking about AI and its military implications. We haven't seen any sort of advanced AI so far be a major player in wars or ongoing battles, but I think it's safe to say that it is going to happen at some point, and certainly people are investing and moving in that direction.

So an important thing to keep in mind in this whole push for AI investment and rapid improvement of AI is that it is going to play into this aspect of technology as well.

Alessio

Yeah, it's definitely already happening. You know, we do a lot in the cybersecurity space, and there are two types of warfare, right? There's the digital warfare, which is more resource denial, and then there are the drones and things like that, but most of it today is pretty autonomous. And I think that's why, you know, we've talked about sovereign AI on the podcast before and things like that. The more warfare depends on technology, the more

owning the infrastructure for a lot of these things matters, and where the data centers are located matters, and what model is running matters. So yeah. I think Anton from Chroma did this fake protest, basically saying it's illegal to not build autonomous warfare systems because you're putting human lives at risk instead, and so you should just send the robots to fight for you.

So I think you'll see more of this in the next few years, now that AI is in the zeitgeist.

Andrey

And I should actually highlight, for Anduril, that it's not just military applications. One of their successful products is the Sentry tower, which is a surveillance system used, for example, along the U.S. southern border, that integrates AI pattern recognition to, I guess, see people crossing, presumably.

So that's another angle here as well, where AI certainly is already massively being used for surveillance, or in this case, I guess, detection of people. That is massively the case in China, from what I've heard, and this company is now offering that to some degree in the U.S. as well. And one more thing I'll mention: my impression is that Silicon Valley as a whole is warming up to the idea of working with the military and the defense sector.

We've seen Anthropic and Meta both announce that their products can be used for military applications by the DOD. And in the past, Silicon Valley type software engineers were very resistant to that sort of thing. We've seen, you know, protests at Google when their Cloud AI division was working on some stuff for the military; I think it was Project Maven, and that was met with a lot of outcry.

My general vibe check on Silicon Valley is that people are becoming more open to it, if nothing else because presumably it's very lucrative.

Alessio

Yeah. I mean, these things are hard to talk about. There are always people that speak out, and then people that think certain things. I don't think those numbers really change; it just changes who feels like they can talk publicly about it. So I think a lot of people in the Bay have been pro-defense for a while.

I think now there's a lot more maybe VC interest, which I think then makes people want to talk more about it, because then they want to raise money to go do a company in that space. So there's more of that at play. But yeah, I don't know. I mean, I'm becoming a U.S. citizen in five days. So my, my view...

Andrey

Congrats! That's always a major milestone. Totally.

Alessio

So it's funny, you know, I grew up in Italy, and people in Italy don't work really hard, and my citizenship interview is at 7:45 AM at the immigration office. I do not know of a single federal office in Italy that is even open at 7:45 AM, let alone running. So the U.S. is a very different country.

Andrey

And next story, going back to publishers and news: OpenAI is apparently bankrolling Axios's expansion into new markets. So Axios is a sort of, I guess, media company, and OpenAI is partnering with them to support their expansion into local newsletters for Pittsburgh, Kansas City, Boulder, and Huntsville. This is a three-year deal, and it is the first time OpenAI is directly funding the newsroom operations of a media company it has partnered with.

We've seen them engage in content sharing and licensing deals, but this is the first time they are actually, I guess, helping a publisher expand. I would imagine that's to get closer ties, to become more closely aligned with these media companies. Another interesting development in this whole story of, I guess, what you'd call the data wars, and not one I would have expected: you know, paying for media companies and their initiatives.

And on to another type of business story, one we see every once in a while: a new startup from a famous AI person, focused on AGI. This time it's François Chollet. He has a new startup called Ndea, and the story is that they are going to shoot for AGI through, it seems, program synthesis. It's co-founded with Mike Knoop, and they are already hiring. Chollet's background is having created Keras, which is a very popular, I guess,

package that a lot of people used for coding deep learning, back like a decade ago. He worked at Google for a while; Keras was adopted by Google. And it's part of a trend, in a way, where we've seen, obviously, people leaving OpenAI and doing AGI startups. This is another example of that. And I guess it's just the feeling now that you're close to AGI and you can kind of shoot for it, perhaps.

Alessio

You know, François and Mike have been working on the ARC-AGI challenge before this. So I wonder, it's funny, when you run a challenge for AGI and then you see the o3 model score really, really well. I don't know if them starting a lab means they think people are on the wrong path to get to a hundred percent and they have better ideas, or if they also feel we're going to get there in a reasonable timeline, and so they kind of want to run their own thing.

But Mike and I actually play soccer together on Thursday nights, and I already messaged him about doing a podcast episode with us. So we'll learn more; well, they'll talk more about it publicly soon.

Andrey

Yeah, it makes sense. And we did cover, just recently with o3, the ARC-AGI challenge and the rumblings or conversations people have been having about whether you could call o3 AGI, whether maybe we're already there. So I think people are very bullish. In this example, it's program synthesis, learning-guided program inference, that will create AGI that can invent, adapt, and innovate.

But in general, people are aiming to do this on a one-to-two-year horizon, not super far in advance. And speaking of VC-type things, we do have a couple of stories related to fundraising and valuations. First, we have the company Synthesia, which is an AI video platform. They're focused on AI-generated videos of human avatars, things that you can use for, I don't know, marketing materials or maybe internal videos, things like that. They have raised $180 million.

Their valuation is now at $2.1 billion. So not in the LLM space, not in the chatbot space, but we have seen Synthesia before; I think we previously covered them. And I do think this is one example where AI video will be pretty successful, with this kind of synthesizing of videos of people talking, or perhaps doing some things in commercials. Much cheaper than hiring people to do that for you, much easier to adapt to multiple languages and to quickly iterate on messaging.

We've seen this done for politics, for example, in South Korea; I think we covered that story. So clearly, based on this fundraising and the valuation of this company being at 2 billion, the presumed expectation of the companies bankrolling this is that many businesses are going to go for it.

Shawn

Yeah, I mean, I'm actually also very much thinking about having Latent Space do an AI-generated podcast, which we'll clearly mark as AI, because we want to make sure that people know whether they're listening to us or listening to an AI. But I think this sort of AI creator space is going to keep growing. You know, Synthesia pre-existed this current AI wave.

And, you know, I think they do much broader things than just helping people in the creator economy; this is much more like support or sales or anything like that. But I think it will keep growing. HeyGen is the other one that's kind of up and coming in this area. And I think we as creators, like literally us in this room as podcasters, should also think about using it.

Andrey

And another story, another startup being valued at around $2 billion. This time it's the makers of Cursor. They raised $105 million in a Series B round with some of the major Silicon Valley VCs: Andreessen Horowitz, Thrive, and others. Cursor is one of the major AI coding initiatives out there. It's an integrated development environment with a built-in coding assistant, similar to GitHub Copilot, which helps you by predicting what you'll be writing next.

Cursor has a few more features that set it apart, that are more advanced. I'm personally a user, and a lot of people at my company are users. So it does seem like it is positioned to perhaps lead in this very, very competitive space. There have been many startups betting on doing, I guess, coding assistants, and a lot of fundraising for that; I think you saw YC having multiple startups in the space in a single batch. So very competitive.

And it seems right now that this company is a major competitor and potentially the one to come out on top.

Shawn

Yeah, I mean, I feel like they really had their moment when they got the Lex Fridman interview. I think it was the first time Lex has interviewed four people at once, and also the first time Lex has, I think, interviewed an early-stage startup, which was pretty notable. Yeah, I think definitely, you know, I've been an adopter of Cursor. We're also pretty close friends with Codeium, which launched Windsurf. And yeah, it was a bit of a meme

last year, people forking VS Code and trying to compete with Cursor, with their really good execution. And they were basically first. We actually had them on the podcast; I think we were the first podcast they ever did, and we've seen them execute. And then they also came and spoke at my conference last year. Very heads-down, determined, with a very strong point of view as to how people should be coding.

Like, they didn't promise, you know, super AGI agents or anything like that. They just said, we will make a better IDE for you with AI in it. And I think typically what people usually say is that challenges like this would force the incumbent, VS Code, to do something. And VS Code and GitHub have done something; they just are usually copying Cursor. And I don't know if that's the strategy.

Alessio

And Cursor is now at over a hundred million in revenue. So, I know there's a lot of AI hype, blah, blah, blah, but a hundred million in revenue is pretty good.

Andrey

Yeah. There are not too many startups that can claim to actually be making money, I think, and Cursor is one of them.

Alessio

I mean, I would pay three times as much for Cursor. I use Composer every day; it's nuts. You know, like Sean said, I was at the Vercel hackathon last night. I used v0 for the whole thing. I think v0 is kind of closer to Bolt.new and some of these products. It's really nice when it's self-contained, you know, you see the preview, but

I think Composer is just as good on the coding side, and you're still within your environment, which has a lot of advantages. So I'm curious to see where things go from here.

And then there's Replit, which has kind of been left behind a little, at least in the zeitgeist. You know, they're obviously making a lot of progress on the product, but it's a competitive space. It's hard to get developer mindshare, and when you only have the IDE form factor, it's easy for people to switch around. You know, I can open the same repo in VS Code and Cursor at the same time. It's the same thing, you know?

So yeah, I'm sure the wars will get even more intense this year.

Shawn

I'm curious if there is room for multiple players here, or if Cursor is just going to eat everything. I've also said that OpenAI should acquire Cursor, because they are, you know, at this point, a really good source of coding data. And, you know, I think everyone needs one.

My thesis has basically died, because now it's pretty clear that Cursor is rumored to be Anthropic's biggest customer, something like that. You know, they definitely ran into limits with Anthropic; multiple startups claim this, so I don't know how true that is, or what the exact positioning of this is. But yeah, I mean, I think applications, you know, we're in sort of the applications layer of the podcast or whatever.

They typically want to be multi-model, and then the LLM labs all want to be sort of vertical, right? They want to build their ChatGPT. They want to build their tasks and reminders and all that. I think it'll just be a fight back and forth between the applications and the foundation model labs.

Andrey

And onto the last fundraising story: we have Harvey, which is a company offering AI for legal work, for legal companies. They are seemingly trying to raise $300 million at a $3 billion valuation. They are said to have quadrupled their revenue; they were getting something like $30 million as of their last round, a Series C, in July.

So again, another example of a more domain-specific company aiming to dominate a specific space, in this case AI for legal applications, where it's obviously very important to not hallucinate and to give accurate information as you're doing, for instance, research for a certain legal case. Harvey was one of the early players in that space and is seemingly already getting a lot of revenue. We saw them make some deals with major law firms; we covered that a while ago.

So another one that seems likely to stick around, in my view.

Shawn

Yeah. One thing that's kind of interesting: so we did mention Cursor's valuation last time. It wasn't published anywhere except for this New York Times article I found. Cursor is valued at $2.5 billion with a hundred million ARR. And that's interesting compared with Harvey, which is rumored to have $50 million ARR now and is valued higher. So they have half the revenue and are valued higher.

And is this because the legal space is super hot, or is this because the investors are very desperate to get into this thing? What's

Alessio

what's going on? Yeah, it's hard to say. I think some of the perception is that Cursor is doing very well. And going back to your point about whether there's going to be one winner, there's maybe a question of, okay, can all these companies survive? I think everybody's obviously getting a ton of traction early on. But yeah, Harvey's expanding. I mean, I think most of this has been made public.

I'm not leaking anything, but Harvey's kind of broadening from just legal to all professional services. And the same thing is happening with Hebbia, which started with more of a financial services focus, and now it's going towards all kinds of service businesses and knowledge work. They have a bunch of overlapping investors, so I know some of the investors are a little nervous about the collision path that they're on.

But yeah, I think theoretically the market is bigger if you do all professional services work; in practice, it seems hard to build a company that serves every professional services market. So, yeah, I don't know. But I think since they were already investors in Cursor, maybe they've got to stick with it. And you always have to do the math on how much dilution the founders are actually taking, so it's not always just about the valuation.

Shawn

Yeah. I'm also curious where the money's going. Do you need $300 million to build, you know, GPT wrappers? I don't know. They all claim to have their own models, but also, let's be real, most of the traffic is actually going to the Claudes and the GPTs. And yeah, I don't know. But obviously good for them; I'm not criticizing them at all.

I'm just saying, there's supposed to be this thesis of, you know, you raise once and you're done, or you're profitable, or you're a 10-person unicorn startup. You know, that's what Gumloop is touting. But then both Cursor and Harvey are kind of following the traditional path. I'm sure they haven't touched the money from their previous round; they all say this, and, you know, the money's just sitting in the bank.

So what is it for? Sales and marketing? I don't know.

Andrey

It's certainly a part of it, I think.

Alessio

Yeah, I don't know how many people Cursor has. I'm looking on LinkedIn: Harvey has 74 employees. Wow. Cursor doesn't even have a LinkedIn organization. But, so, I've been to their office; it's very small. It's under 32 people. And when you search for Anysphere, which is kind of a common name for random tech things in the world, you get all the investors and then a bunch of other stuff. So yeah, it's definitely a much smaller team.

I'm sure Cursor is mostly spending it on compute.

Shawn

Right. And that's the other thing about this ARR: a lot of it is sort of passed through to the model labs. So what is the margin?

Andrey

Yeah, exactly. I'm curious to know if they are actually profitable, because there's also kind of a price war aspect here, where certainly for LLM providers there has been a movement towards offering lower dollar-per-token rates, aggressively lowering them. And in these competitive spaces, especially for coding, you would expect a similar kind of paradigm, where you're not going to go above the $20 per month.

Because that's the pattern that was established by GitHub Copilot early on, and if you're going to try to go above that, it has to be very expensive. And something like Cursor is doing ambient completions for you: as you code, it's just offering stuff all the time. It's also looking ahead to suggest things beyond where you are. It's able to do inference over multiple files and answer long questions.

Yeah. You've got to wonder if one $20 subscription per month is actually going to pay for what a typical programmer who uses Cursor every day is querying. And moving on to projects and open source, we've got a couple of fun stories here, starting with MiniMax-01, which is all about scaling foundation models. So they have MiniMax-Text-01 and MiniMax-VL-01, the vision-language one. And the big story is that they're meant to handle longer context.

So they use lightning attention and mixture-of-experts to have a total of 456 billion parameters, with 45.9 billion active per token, pretty big for this class of models. And they're saying that these can handle context windows of up to 1 million tokens during training, and they can even take 4 million tokens during inference. So this is one of the big stories since ChatGPT came out: it used to be that the context windows were tiny.

4,000 tokens, 8,000 tokens as the default, where the context window is how long of an input you can handle, right? So can you handle a book's worth of content? Can you handle multiple documents? So this is an example where, by using a newer form of attention (typically that's one of the major tricks for being able to do this kind of long context), you are able to scale.

And so MiniMax says that they offer context windows that are 20 to 30 times longer than other open models, and these are publicly available. So another class of open-source model that is competitive with top-of-the-line models like GPT-4o and Claude 3.5 Sonnet.
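For intuition on how linear-attention variants like lightning attention make million-token contexts feasible, here is a toy sketch of the underlying recurrence: instead of building an n-by-n attention matrix, you carry a small running state, so cost grows linearly with sequence length. This is our own generic illustration, not MiniMax's actual implementation.

```python
# Toy sketch of (causal) linear attention. Full attention costs O(n^2 * d);
# this recurrence costs O(n * d^2) because it carries a small running state.
# Generic illustration only; lightning attention is a far more optimized variant.
import numpy as np

def linear_attention(Q, K, V):
    n, d = Q.shape
    phi = lambda x: np.maximum(x, 0.0) + 1e-6  # simple positive feature map
    Qf, Kf = phi(Q), phi(K)
    S = np.zeros((d, V.shape[1]))  # running sum of outer(k_t, v_t)
    z = np.zeros(d)                # running sum of k_t, for normalization
    out = np.zeros_like(V)
    for t in range(n):             # causal: position t only sees the past
        S += np.outer(Kf[t], V[t])
        z += Kf[t]
        out[t] = (Qf[t] @ S) / (Qf[t] @ z + 1e-6)
    return out

q = k = v = np.random.randn(128, 16)
print(linear_attention(q, k, v).shape)  # (128, 16), no 128x128 matrix ever built
```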

Shawn

Yeah, very big. I would say the meta-narrative that I'm interested in is just the rise of this lab called Hailuo, i.e. MiniMax. Not really something that was announced last year; they just quietly started producing video models and now text models. And who are they? What are they?

Andrey

I can't say I've done a deep dive, but we've seen more and more announcements of this kind from primarily Chinese organizations, and I think this one is Chinese anyway, I would guess. And they're playing in the open-source space. So I guess you could say that's another interesting trend, where we also saw and covered recently Qwen's

QVQ. These are now major players in the space of models that are competitive with Llama and other things that have been out there, also DeepSeek, stuff like that.

Shawn

Yeah, that's true. I would say, I mean, the other thing is that we can definitely see a trend of an increasing number of experts in MoEs. So 32 experts here; the standard used to be eight, and now it's 32, and I think it's gone up to 150, 160 before. And it seems to be the consensus in terms of more efficiency, for sparse inference. But having that many parameters also means it's really hard to serve. I think this is also very true for DeepSeek V3.

So mostly these are just kind of training models that you could distill into smaller ones that you actually use. It's just interesting that they choose to launch their text models like this. Usually it's the other way: you remember Mistral launched their 8x7B, and then they went to 8x22B. Here, the Chinese labs are doing the opposite. They're just going big.

Andrey

Yeah. And on the benchmark side, obviously they're showing that they are doing very well on the long-context side, kind of competitive-ish with Claude and GPT-4o, much better than other open-source models out there. Another interesting aspect of this is they did release a report on arXiv with a lot of details on the training, on the architecture, et cetera. Similar to

Llama or other models that basically lay out for you all of the nitty-gritty of training models, of optimizing them, of dealing with long context windows. So I guess another interesting aspect, in addition to there being pretty competitive open-source offerings for various LLM tasks, now also for vision-language tasks, is that there's no kind of secret formula for having a good LLM, as far as I know. You just need good data and you need to have the infrastructure.

And then the details of how you train and set up your model are not very much of a secret at all. And onto the next model. This one is called MinMo, and it covers another aspect of the space that is less seen in open source: a multimodal large language model that is more focused on voice integration and audio. Again, this is coming from China, from Tongyi Lab at Alibaba. This is an 8 billion parameter model, and it is focused on voice comprehension and generation. So this is dealing with speech recognition, speech-to-text and text-to-speech.

So this is dealing with speech recognition, speech to text and text to speech. Again, another example, I think we've mentioned recently how vision language is Area of it hasn't seen as many open source models and they're starting to come out. Now we have pretty impressive example of speech to text. It's apparently competitive with whisper large V3. It's getting 85 percent accuracy in language identification and 98 percent accuracy in dialect and other tasks. So. Pretty impressive.

And again, pretty interesting that we're seeing more and more models of this sort being developed. In this example, they are saying that the code and models will come soon; so far, we just have the paper. It's another example, I guess, of something we've seen where previews and papers come out before a full launch. Here we're not seeing a model or code yet, but we do see the numbers and the paper. So let's see if they follow through, I guess.

And onto the very last example here, a benchmark. That's another space where there are many, I guess, things happening all the time, and many new benchmarks for different sides of the AI inference game, you could say. In this example, it's HALoGEN, and the paper or report has this fun title: "Fantastic LLM Hallucinations and Where to Find Them." That's referring to the Harry Potter spinoff movies. Anyway.

This one is focused on benchmarking LLMs for how many hallucinations they have: how often they say things that are essentially false. It has almost 11,000 prompts and automatic verifiers that look at the LLM outputs to see how often they produce reliable knowledge and facts. So it's interesting; I think they show that even top-performing models can have hallucination rates as high as 86 percent in some domains.

My personal experience, and kind of my impression, is that hallucination is increasingly not a real problem. You know, when I do work on some sort of coding, or I just want to find out about some topic, I'm not super worried about hallucination, about it, you know, importing packages that don't exist or doing some silly function call for something that isn't there.

But at least in this example, they are saying that there are still some pretty major hallucinations on things like coding, text simplification, historical events, and so on.
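As a toy example of what one of those automatic verifiers could look like (our own sketch for a single failure mode, not HALoGEN's actual code), here is a check for generated code that imports packages that don't exist:

```python
# Toy sketch of an automatic hallucination verifier for one narrow domain:
# does every package an LLM's generated code imports actually exist on PyPI?
# Our own illustration, not HALoGEN's implementation.
import ast
import urllib.error
import urllib.request

def imported_packages(code: str) -> set[str]:
    """Extract top-level package names from import statements."""
    pkgs = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            pkgs.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            pkgs.add(node.module.split(".")[0])
    return pkgs

def exists_on_pypi(pkg: str) -> bool:
    """Query the public PyPI JSON endpoint (note: stdlib modules will 404)."""
    try:
        urllib.request.urlopen(f"https://pypi.org/pypi/{pkg}/json")
        return True
    except urllib.error.HTTPError:
        return False

llm_output = "import numpy\nimport totally_made_up_package\n"
for pkg in sorted(imported_packages(llm_output)):
    print(pkg, "ok" if exists_on_pypi(pkg) else "possible hallucination")
```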

Shawn

Yeah, I kind of disagree on the "hallucination is not a real problem" thing. I think it just differs by use case. In the code generation examples that I see, LLMs still generate a lot of code for APIs that don't exist, just because they think they should exist. Also, a very simple example: with Cursor, for a long time, when you tried to write OpenAI code, it would refer to the old completions API and not the new chat completions API.

And obviously it wouldn't know anything about that sort of beta API. So that was very annoying. But also for large report generation, like the Gemini Deep Researches of the world, or a firm called Brightwave that generates reports from a lot of data sources. And I run AI News, which also generates a report from a lot of sources that gets sent out every day. Yeah, every single day there are hallucinations.

It doesn't understand what sources are tied to what, so sometimes it will confuse things, and sometimes it will not fact-check something that was obviously wrong. Even a little thing like, let's say, Eric Schmidt running Meta. I'm like, no, that's wrong, you know, but the model doesn't know that. So we've caught it; I get reports from my readers every single time this happens. And I'm like, yeah, sorry, man.

Like, I'm trying, but the best I can do is just have another agent check the agent.

Andrey

Yeah, and to be fair, I guess I say I'm not too worried about it, but then again, I would be hesitant to make a claim or cite an LLM for something factual, because, you know, if you're talking about some historical detail, I could easily see that being fabricated. So...

Shawn

Yeah, this is why I thought that something like Perplexity would take a lot longer to become a thing. And I think really, people don't mind as long as you have citations. And if they look and see something off, they click through to the source and they're like, ah, the LLM made a mistake, but isn't that funny? And then you move on with your day. It's not that big of a deal, as long as you provide sources.

Andrey

And onto research and advancements. We have at least one paper that generated a lot of excitement in the AI space, on Twitter/X and so on. The title is "Titans: Learning to Memorize at Test Time," coming from Google. And the gist of this is that LLMs, usually as we use them, are kind of frozen post-training. You train them, they have some sort of knowledge that goes up until a certain date, and that's where the knowledge cuts off. And there's been a lot of work

in the research space on how to get beyond that, how to be able to update the models as you use them so that they keep updating as they go. One example of that is TTT (test-time training), which we covered back in mid-2024. Well, now there's a new variant of that that people are pretty excited about. It is called Titans, and they propose a few ways to integrate it.

So at a high level, what they propose is a formulation of a novelty loss: being able to decide what you want to put into your long-term memory based on how novel it is. They also propose a specific way to do forgetting. They have this decay mechanism where, over time, you can let go of information that's not relevant. And they also look into how to integrate memory into the architecture. So there are different ways you can have memory in what you're doing.

They have one variant called memory as context, where you have multiple parts of your model. You have persistent memory, which is your typical LLM knowledge that is there forever. You have your long-term, neural memory, which is the memory that gets updated as you go. Memory as context means appending into your input: you take some input sequence, you add some information from your persistent memory, and then you also add some knowledge

from your long-term memory into the input, based on retrieving some of it with respect to the input. And then you do attention over all of that, and as part of your output, you update the neural memory with this novelty metric. Another variant is memory as gate, where you add memory into your attention calculation: you take your input, and as you calculate the typical attention over the input, you take into account the long-term memory.

And then another variant they have is memory as a layer, which, I guess, is more traditionally what you do with things like recurrent models, where you pass the input through your memory, then you update the memory, and you have kind of the middle layers of the model be the memory.

So anyway, they have this test-time memory where you actually update the model based on the novelty of the inputs, and, as you might expect, they show that in comparison to various memory models, like Transformer++, Mamba, DeltaNet, TTT, they outperform all of them by a decent margin. So this is one of those problems that's unsolved for LLMs, right?

Where, if you have an agent that's deployed out there, if you have an assistant that you're using, the fact that it can't remember things long-term, things that you've seen, that you've chatted about, is a real hindrance, and a real, I think, reason that AGI is not possible if you're not learning as you go. There's been more and more research on this, and this paper builds on a lot of it; they have detailed comparisons to various things like Mamba and TTT.

And people were pretty excited, because there's a lot of detail in this paper, and the numbers on various benchmarks, of course, look very solid.
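For a concrete feel for the mechanism, here is a minimal sketch of a test-time memory update driven by a surprise loss, with weight decay as forgetting. This is our own simplified reading of the idea; the hyperparameters are made up, and the paper adds refinements such as momentum on the surprise signal.

```python
# Toy sketch of Titans-style test-time memory: a small MLP whose weights are
# updated online from a "surprise" gradient, with decay acting as forgetting.
# Simplified reading of the paper; hyperparameters here are made up.
import torch

class NeuralMemory(torch.nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, dim), torch.nn.SiLU(), torch.nn.Linear(dim, dim)
        )

    def retrieve(self, query: torch.Tensor) -> torch.Tensor:
        return self.net(query)  # read: map a query to remembered content

    def memorize(self, key: torch.Tensor, value: torch.Tensor,
                 lr: float = 0.01, decay: float = 0.001):
        """Test-time write: one gradient step on how surprising (key, value) is."""
        surprise = torch.nn.functional.mse_loss(self.net(key), value)
        grads = torch.autograd.grad(surprise, self.net.parameters())
        with torch.no_grad():
            for p, g in zip(self.net.parameters(), grads):
                p.mul_(1 - decay)     # forgetting: decay old memories
                p.add_(g, alpha=-lr)  # write: move toward predicting value from key

memory = NeuralMemory(dim=64)
k, v = torch.randn(8, 64), torch.randn(8, 64)
memory.memorize(k, v)          # weights change at inference time
recalled = memory.retrieve(k)  # later retrievals reflect what was memorized
```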

Shawn

I think it was a really good summary. Yeah.

Alessio

I think the only thing I've not understood about these things is how they're supposed to work with multi-tenancy. So if I'm deploying a model to serve, does this only work if you have one customer per model? Basically, you need to replicate. Which seems interesting, you know, but it's also kind of hard in practice.

Andrey

Yeah, it's hard because you do need to update the weights of the model, some of the weights. That's one of the aspects here: you have part of the model, the neural memory, which you're updating, and other parts of it are frozen. So in terms of actually deploying it, you will need, per person, per chatbot, its own memory. So far, what people do is more retrieval: you store away some information, like ChatGPT memory. That's how it works.

You store some stuff and it gets added to the input. Here, you're actually updating the weights at test time, and that's much harder.

Shawn

Yeah, I don't think this is as big of an issue as you're fearing, because we already have prompt caching, right? It's already caching per person. It's a form of memory. It's not as advanced as this one, but it's a form of memory, so infrastructure-wise, it's not an issue.

Alessio

I don't know if that's necessarily true, because if you put my memory in a model, then on a net new response, you cannot serve that model to anybody else.

Shawn

Yeah, it's basically caching, right? You can load the memory module back in whenever you activate your endpoint again.

Andrey

I think, yeah, from the infrastructure side, which is a big aspect of this, I guess there is a difference from prompt caching and retrieval in general. It's one thing to append some extra stuff to the input of the model; here you have a customized model, so different weights, and, I don't know, I guess you'd have millions of variants of your model that you now need to store and load onto your GPUs, which is presumably not quite the same.

So, again, as with any of these papers that look into alternative architectures: we've talked about Mamba a lot, and that addresses a similar problem, where you can have ongoing memory over a lot of time. It's not clear when we'll see some of these things actually have an impact on things that are out there in the real world, but it's always exciting to see developments in the core architecture of how we build these large language models.

And what they do say is you're able to scale to context window sizes larger than 2 million tokens and maintain high accuracy in things like needle-in-a-haystack scenarios. It's a real surprise for me personally that we've been able to scale context so well without using some sort of memory like this.

Shawn

With a reminder that needle in a haystack is just the most basic version of what memory utilization looks like.

Andrey

And onto the next paper, another exciting or interesting development from Sakana AI. We've covered a few of their papers, so they're putting out a lot of neat things. In this case, it's Transformer², self-adaptive LLMs. This is looking at adapting to tasks in real time by selecting, I guess, the best variant of the model. They essentially have different experts that deal with things like math or coding specifically.

And what they do is, given your input, they have a few different ways to decide what type of task it is, you could say, and then they pick the right sort of set of weights to use. Then they put your input through the model a second time, and that gets you better performance.

So another example, in a way, of what we would call adaptation over time, although in this case I was a little disappointed in the general idea, because they do need to pre-train different specific variants of the model, what they call task-specific expert vectors, which are then mixed into the weights. So it's not quite as dynamic as something like memory; you still have just a few variants and you pick between them.

They do say that this outperforms things like LoRA and is more efficient, takes fewer parameters. And LoRA is one of the very important techniques out there for taking a base model and customizing it to a certain task. So in that sense, this is pretty cool. And then there are some pretty nitty-gritty details of doing SVD and so on. But anyway, another example of research on adapting LLMs.
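As a rough sketch of those nitty-gritty details, based on my reading of the approach rather than Sakana's code: each expert is just a vector z that rescales the singular values of a frozen weight matrix, so an expert costs only min(m, n) numbers per layer, far fewer than a LoRA adapter. The expert values below are invented for illustration.

```python
import torch

def decompose(W: torch.Tensor):
    """One-time SVD of a frozen weight matrix."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    return U, S, Vh

def apply_expert(U, S, Vh, z: torch.Tensor) -> torch.Tensor:
    """Mix an expert vector into the weights: W' = U diag(S * z) Vh."""
    return U @ torch.diag(S * z) @ Vh

# Two-pass inference, schematically: pass 1 classifies the task,
# pass 2 runs with the matching expert's z mixed into each adapted layer.
W = torch.randn(512, 256)
U, S, Vh = decompose(W)
z_math = torch.ones_like(S) * 1.1   # a (hypothetical) pre-trained math expert
W_adapted = apply_expert(U, S, Vh, z_math)
```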

Shawn

So this is coming out of Sakana, and it's kind of cool that they are sort of ramping up their research. I think the previous ones, like AI Scientist or the evolutionary merge stuff, were always kind of tinkering around the edges of research, and I think this is kind of in that category. Philosophically, it makes a lot of sense. You're also seeing this two-pass mechanism in embedding models, with the sort of cde-small that was released out of Columbia University.

But one thing that I don't really know is how this relates to representation fine-tuning, because I just searched the paper and it doesn't really mention ReFT, but ReFT was one of the best papers at NeurIPS last year. And it's kind of the same thing, in the sense that people understand that LoRA is good, but tuning representations over the entire model is better.

And Transformer² kind of does that: it uses the inference pass to do the sort of representation discovery, which is kind of cool. But in all other respects, it feels like just representation fine-tuning. And yeah, I guess people are exploring this, but at the same time I'm not so sure it'll get mass adoption yet, because we haven't seen ReFT get mass adoption.

Andrey

And then just one more paper this week, onto yet another trend in research: inference-time scaling. But this time it's not inference-time scaling for LLMs, this is actually for diffusion models. Diffusion models are what typically get used for image generation. Diffusion means that you start with a bunch of noise and then you do multiple steps to move your noise, to mutate the image towards something that is correct given some text.

And so typically one way to get better images is just to do more denoising steps, more iterations of refining your image towards the best, I guess, denoised variant of it. And that's how you train it, typically: you take a real image, you add noise, and you train the model to do these denoising steps. Well, this paper is looking at what else you can do beyond just doing more denoising steps to get better outputs.

One example of what they find is you can search in the space of what noise you are adding. So you don't just do more denoising, you also do different types of denoising and find the best output for your diffusion model that way, which is, I suppose, a somewhat big deal, right? Inference-time scaling is the hot thing now in LLMs, and image generation is not seeing that much competition, I think. But this could be one of the differentiators for ChatGPT, for Gemini, for the multiple places competing for text-to-image. It could be a significant part of a model that gets you that last kind of delta in the quality of your output.
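A minimal best-of-N sketch of the noise-search idea, assuming a diffusers-style pipeline and some scoring function; both `pipe` and `scorer` are stand-ins, not from the paper:

```python
import torch

def best_of_n(pipe, scorer, prompt: str, n: int = 8, steps: int = 30):
    """Search over initial noise: same step budget per sample, keep the best."""
    best_img, best_score = None, float("-inf")
    for seed in range(n):
        g = torch.Generator().manual_seed(seed)  # a different starting noise
        img = pipe(prompt, num_inference_steps=steps, generator=g).images[0]
        score = scorer(prompt, img)              # e.g. a CLIP similarity score
        if score > best_score:
            best_img, best_score = img, score
    return best_img
```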

Shawn

I'm a little bit more skeptical. I don't know what I learned from this paper. Diffusion models are the original inference-time compute model anyway, so I guess they're trying to improve it, but the title itself is very cynical, like, let's stuff the current hot keyword into the title to make it hot. But, I don't know, they do focus on the results.

Andrey

Yeah, I don't know. I wouldn't say it's so cynical, because ultimately it's similar to what you often do, I suppose, with inference-time scaling, where you search in the space of possible outputs. In this case, the outputs are...

Shawn

You know, that's what diffusions have done, right? With CLIP-guided diffusion. That is, yeah...

Andrey

But usually they take a linear path where you just denoise, right? Here they're saying, well, maybe you can denoise in different ways and get better outputs that way. So I don't know, it is novel. Maybe I would say it's not going to get you much better outputs compared to the inference-time scaling of LLMs. And on to policy and safety, and yet another story about restrictions on chips. So there are these new guidelines and restrictions for exporting US-made AI chips.

This is known as the Interim Final Rule on AI Diffusion, and it's doing some new, I guess, categorization. So there are now three groups: strong allies like Japan and South Korea; adversaries like China and Russia; and a third group that is most countries. Each of these groups has varying levels of restrictions on AI chip purchases. So countries like Mexico, Portugal, and Israel now face a cap of 50,000 GPUs per country, which is kind of surprising.

I mean, we've seen most of the restrictions on China, for example. Now it's on almost any other country. Nvidia has criticized these proposed rules, which are not yet in effect, as, quote, unprecedented and misguided, because it does seem like a very big move to yet again restrict the ability of Nvidia to sell to other countries in a very significant way.

Swyx

Yeah, I don't know what's actually going to go into effect, but our friends at SemiAnalysis did a very, very good post breaking down this thing line by line, the actual impact, like which data centers are actually getting hit in different countries. As you know, there's a lot of global build-out. For example, Malaysia is a country where there's been a ton of investment from Oracle and Nvidia for data center build-out.

Question mark on how some of these things will play out. And the other thing is the restrictions are not geography-based but ownership-based. So for example, if you are Volvo, which is a carmaker in Sweden, they're actually majority-owned by a Chinese company, so they would also be limited in GPU access. There are a lot of implications to this thing. But also, you know, we're going to get a new administration in four days, right? Yeah. It's hard to

Andrey

say what will happen with all of this policy. You know, this is coming on the executive front; these are just guidelines and restrictions. So when Trump comes in, it's a real question mark as to what happens with all of this stuff. And speaking of, I guess, the last set of stuff that the Biden administration is doing, they also did another thing: signing an executive order to accelerate the development of AI data centers powered by emissions-free electricity.

So they're involving the DOD and the Department of Energy to lease sites for these data centers. And yeah, another example, I think; it seems like they're clearing out their priorities, signing a lot of executive orders, and this is addressing one of the well-known limitations, how hard it is to get permits and so on for data centers. Again, we'll be curious to see what happens as the new administration comes in.

Shawn

Trump will probably just expand this; it's not about reversing it. Trump is pretty pro-AI anyway, so hopefully they've talked. I don't know. Yeah, I think,

Andrey

I think that's the expectation, exactly. And speaking of that, the next story is that OpenAI has new, I guess, guidance. They have this economic blueprint that is putting forth what they think is a good version of AI regulation. And this, as you might expect, is dealing with wanting increased federal investment in power and data infrastructure to support AI data centers.

They are saying that we should streamline industry engagement with national security and establish export controls for these models. They are also advocating for being able to train models on publicly available information, so weighing in on the copyright question and so on. So yeah, another example; regulation in the US is an open area, pretty much in flux.

So it's not too surprising that OpenAI is trying to influence wherever it goes, and it's interesting that they released this blueprint to time it with the new administration coming in. And one last story in the section, dealing more, I guess, with societal impacts, not policy and not quite safety, but I think sort of related: this is about a Pew Research Center survey that found that 26 percent of U.S. teens aged 13 to 17 have used ChatGPT for schoolwork, which is double what it was two years ago.

Although I don't know what two years ago was, because that's when ChatGPT came out, like a couple of months after that, so I guess that tracks. Anyway, over half of the teenagers also believe that it's acceptable to use ChatGPT for researching new subjects. A smaller share, more like 20 percent, also approve of it for essay writing. So it's interesting to get a glimpse.

It's often surprising to me and to people in the AI space how little uptake ChatGPT and these LLMs are seeing, how many people are not using them or not even aware of them. But to see that a quarter of US teens are using them, clearly it has a massive impact on education already, and that's one of the areas where it is very disruptive. So that really paints a picture of education and, I guess, schoolwork as an area that needs to evolve now that we have stuff like this.

Swyx

I'm always surprised that the numbers are so low. Why is it not 100 percent? I don't know. Strange.

Andrey

And moving on to the last story, just one more in this section, synthetic media and art, and this one is dealing with AI copyright disputes. So this case, Kadrey v. Meta Platforms, is one of the ongoing challenges by authors and IP holders over the use of copyrighted content, and there is now a new deposition. I guess we talked about the LibGen dataset last week; this is dealing with the deposition of Mark Zuckerberg.

So there are some statements he made as part of this. I guess the YouTube part is the headline of the story. He said, for example, "YouTube, I think, may end up hosting some stuff that people pirate for some period of time, but YouTube is trying to take that stuff down." This is, again, dealing with LibGen, which had pirated content. And he also said, "and the vast majority of the stuff on YouTube, I would assume, is kind of good and they have license to do."

So I don't know, I guess the YouTube comparison is not really worth highlighting. There were some other things that happened in the deposition. Zuckerberg claims he hadn't really heard of LibGen. He says, "I get that you're trying to get me to give an opinion of LibGen, which I haven't really heard of... it's just that I don't have knowledge of that specific thing."

Yet another thing he said: "so would I want to have a policy against people using YouTube because some of the content may be pirated? No." Lots of statements by Zuckerberg in this deposition, nothing particularly surprising, but it does point to this LibGen case actually giving us some updates and some interesting things, where we haven't seen many of these legal battles make progress or give us any updates on what's going on in them. And that's it for this episode.

Thank you to our listeners for listening to yet another episode of Last Week in AI. As always, you can go to lastweekin.ai for the newsletter and the notes for this podcast, all the links, if you want to open them in your browser. Thank you, Sean and Alessio, for co-hosting once again. Latent Space, very cool resource, check it out. And yeah, again, thanks for co-hosting.

Swyx

Yeah. Thanks for having us.

Shawn

Yeah, it's a real pleasure. I think you are doing very important work. I refer to your notes all the time. So it's very special to be contributing for the first time.

Andrey

Once again, thank you to the listeners for any comments you leave, and to the people who joined the Discord; it's fun to see you chatting a little bit and hearing your backgrounds. But more than anything, thank you for listening and for tuning in week to week, and enjoy our AI-generated outro song.