#220 - Gemini 2.5 Flash Image, Claude for Chrome, DeepConf

Sep 01, 2025 | 53 min | Ep. 260

Episode description

Our 220th episode with a summary and discussion of last week's big AI news!
Recorded on 08/30/2025

Check out Andrey's work over at Astrocade, sign up to be an ambassador here

Hosted by Andrey Kurenkov and co-hosted by Daniel Bashir
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:

  • Google's newly released Gemini 2.5 image editing model showcases remarkable advancements, enabling highly accurate modifications of subjects while retaining their original features.
  • Anthropic expands Claude with an AI browser agent for Chrome and adds features to remember past conversations, enhancing the user experience and personalization.
  • NVIDIA and AMD will share revenue from AI chip sales to China with the US government, marking a notable shift in export control policies and trade practices.
  • AI companion apps are experiencing substantial growth, with revenues projected to reach $120 million in 2025, raising questions about social implications and user engagement.

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Transcript

Intro

Hello and welcome to the Last Week in AI podcast, where you can hear us chat about what's going on with AI. In this case, it's a bit more like the last month in AI. Unfortunately, we've had to skip a few weeks. Jeremie has been busy as always with exciting reporting and his national security work, and I've been traveling. So, as I always say, sorry if we missed weeks; we'll try to be back on a regular schedule going forward.

In this episode we will summarize and discuss some of last week's most interesting AI news, and a bit of the week before as well. You can go to lastweekin.ai for our text newsletter, which goes out every week (most weeks), for other stuff we are not covering in this episode. I'm one of your regular co-hosts, Andrey Kurenkov. Jeremie once again could not make it, so we have one of our regular guest co-hosts, Daniel Bashir. Hey. Yeah, I'm Daniel.

You may have heard me on this podcast before. If you have explored the Last Week in AI Substack world, you might have also listened to The Gradient; if you haven't, we have lots of interview episodes, which are, I think, pretty cool. Would love for you to check those out. Yeah, great to be here. And Daniel and I were just chatting: there's an idea being floated of reviving The Gradient podcast, which has been on ice for a little while now.

So, yeah, Last Week in AI listeners, you might hear some news on that in a few weeks. We'll see. But in this episode we'll be covering primarily some exciting news regarding new tools and apps, with some releases from Google and Anthropic; not any major applications or business stories; some pretty cool open source stuff; and just a couple of notable policy stories. It's not been a super busy month so far, luckily, and we haven't missed too much.

Tools & Apps

So starting out in tools and apps, the first story has got to be the new image editing model by Google. They have released Gemini 2.5 Flash Image, which is by far the most impressive model for editing images that has been available so far. It was kind of hyped up for a while: it was being used under the pseudonym Nano Banana, and after a little bit of that, it was revealed to be, in fact, a Gemini model.

And, sadly, this is an audio format, so I'll have to describe what you can do with it. But the gist of it is you can very accurately take a subject, like a person, and then change the clothing of this person, change their posture, change their setting. And it very convincingly retains the features of this person and very successfully follows your instructions about what you want to do. You can combine different images as well.

It's far beyond anything else we've seen, to the point that some people are saying Photoshop is in trouble now. We've had very powerful models for image gen for a while, but to me this one is still next level. Yeah, and this is also coming just off the heels of Genie 3, which was released earlier this month, also by Google DeepMind, and which is really quite impressive as well.

It's got this ability for you to look into a world that is actually quite stable and maintains some of its physical properties. If you make some change to a part of the environment, like painting a wall, then turn away and turn back, it's still there. And this interacts with the notion of a world model, which is pretty hotly debated in AI circles: whether models have them, what a world model is, things like this.

So I'm pretty excited to see more work like this that forces us to think about these notions. Yeah, there have been many fun examples of things you can do with this. It's a multi-turn, conversational model, of course. So you can take an empty room and decorate it: paint the walls, add a couch, add a table, and the room will be successfully populated without all the details changing, with just the specific thing you wanted executed.

And on the note of the world model, there was a fun example I saw online where someone gave the model an image of a road in Dallas or something and asked it to show the opposite view, what's behind the viewer. And the model apparently was able to show the view from the other side of the same area, which definitely speaks to this being sort of world-model-like, being able to understand physical properties, locations, things like that.

Next big piece of news on this front is that Anthropic has launched a Claude AI agent that lives in Chrome. And this is pretty much what you'd expect. The AI lives in a sidecar window. It maintains context of what's happening in the browser. The agent can perform tasks on behalf of the user, which is pretty exciting. There are a lot of AI companies out there developing similar AI-powered browser solutions. I feel like this is an interesting direction.

And maybe there is a question out there of, what would an AI-native browser look like? How does the interaction design for that look different from how we use browsers today? Is it like this agent in a sidecar, the way Anthropic is doing it at the moment? Or is there a world where browsers actually look pretty different in a more fundamental way?

That feels pretty unclear, but I think we're in the beginning stages of something that could be really interesting. Right. Yeah. This is launched as Claude for Chrome, so it's an extension, and it's coming pretty quickly after similar launches. I think OpenAI launched their agent mode maybe a month or two ago, and that is very similar: you give it an instruction and it does web stuff for you.

OpenAI's thing has its own little dedicated environment; it creates its own browser and does it all in the ChatGPT interface. Here you have this plugin for Chrome and you actually use it within the Chrome browser, which is a little different. And to your point, this is also coming pretty soon after Perplexity launched their browser, I think called Dia or something like that, which is also pitching this sort of agentic browsing. So it's yet another competitive area.

We've seen this with search, with deep research, with every single kind of use case of AI: OpenAI and Anthropic and a few others going head to head. And yeah, I think these are going to be pretty powerful. I have one fun example where I had a spreadsheet with some links, where you have to open each link and check the website for some quality assurance. In the past, I would have had to do this myself: click, go look, and do this very manual labor.

I was able to use the ChatGPT agent: tell it, go to this Google doc, open it, click on these links, look at the site, check for these things. It took like half an hour, a long time to get through it, but I was able to do it. So I think, similar to the chatbots, these agentic web browsing agents are going to be used in a million different ways to speed up all sorts of boring stuff. And speaking of Anthropic, they have another slightly notable update.

The Claude chatbot can now remember your past conversations, something that's been available on ChatGPT for a long time. You can activate it by going to settings and enabling "Search and reference chats". It's interesting to see Anthropic adding this a long, long time after OpenAI did. As far as I remember, it must have been last year when ChatGPT started remembering details from your conversations to personalize it to each user.

And that speaks, I think, to OpenAI having had a more consumer-oriented focus and Anthropic targeting much more of a coding, enterprise, and business audience. But yeah, I'm sure this makes it a more compelling offering. Yeah, there's a coupling with another story here: Google's Gemini will also be more personalized by remembering details automatically. As with other chatbot offerings, you have options like temporary chats where you can have private conversations.

Those won't be saved or used for personalization or AI training. My take on this, and you can kind of see it if you look at the pieces we're referencing here, is that Anthropic has taken a slightly different pattern, where they will only use chat memories if you prompt them to. And I think we're in the pretty early innings of what memory and personalization are going to look like for these systems. I think there are a lot of different contentious issues that come up here.

I think that memory in its current form is not perfect in any of its implementations, and I do think there's going to have to be a lot of consideration and hard work on questions like: how does the interaction pattern it affords right now make a difference to the model behavior in a way that's relevant to the different things users care about, and what sorts of principles should we have about how that evolves?

Again, it feels like a very early discussion, as these things are just beginning to be rolled out, but it's something to pay attention to. Yeah, and this is interesting to me as a topic because I've started realizing, first of all, the magnitude of ChatGPT usage, right? We've covered how it has 700 million active users. And in my mind I was sort of assuming, or not even imagining, how people are using it. Like, I use it for work.

I use it to help brainstorm and write some code and whatever. But many people use it in many different ways. Some people use it as a therapist or a life coach. Some people use it just to talk to and think through problems. So for people who do talk to ChatGPT a lot, like really talk to it, that kind of memory feature I think probably matters a lot more.

And so Gemini launching this in particular, where Gemini is becoming, I think, the main competitor to ChatGPT as far as chatbots people actively use, could matter quite a bit. And speaking of Gemini, there is another launch from Google. They launched Guided Learning, which is available within Gemini and is designed to teach rather than simply answer questions. So it's meant to have you learn things, have you build a deep understanding, help you work through problems step by step, all that sort of stuff.

Again, we keep saying this, but I find it interesting: this is happening very, very soon after ChatGPT launched study mode. We know that all of these services are used heavily by students. I don't know what percent of high school and college students aren't using ChatGPT at this point; it must be in the low single digits. So it makes a lot of sense for this to launch, and hopefully these study-oriented things will make it so students actually try to learn as opposed to just having the AI do the work for them.

Yeah, I hope so too. I think there's a really interesting set of questions here. Some of them are around how we ask people to still do the hard and effortful work that is learning and developing a deep understanding of things. Because I think that for really causing the sorts of changes in your brain, and for the time you need to mull over something to really get it and have deep intuition and understanding, there just isn't a shortcut.

I think that in the way our education systems work, there are different forms of legibility that indicate what it looks like for a student to have attained mastery or to have a deep understanding of something. And I don't think it's news to anybody that these forms of legibility are pretty imperfect and don't always indicate that. And increasingly they can be gamed.

And when you game them as a student, you lose out on something. You lose out not just on a generalization of understanding that might come to matter later on, but on a sort of satisfaction you might get personally from deeply understanding something in such a way that it might intellectually stimulate you, or make you want to consider different paths later on in your life. And so that deep, effortful work feels quite important also just for the development of a person. This is getting too long, so I'll stop there.

I mean, it's a deep topic, I think, and it's very interesting to consider how, if you're on the younger side, just starting to grow up, you wouldn't remember a time before AI, before chatbots. Your experience will be very different from ours, where we had the internet at least, but we had no AI to learn with, so it was very different. Yeah. Moving on to a couple of stories about OpenAI. We've had, what now, Anthropic and Google as the main guys of this section so far.

Next we have news that Apple Intelligence will be integrating with GPT-5, starting with iOS 26. Siri already integrates with OpenAI's GPT-4o: as far as I know, you ask Siri a question and then it decides whether to pass that on to ChatGPT. So it's perhaps not surprising that they're going to be upgrading it to GPT-5 relatively soon, but it does speak to the continued partnership between Apple and OpenAI.

One more piece of news on OpenAI: they are adding new features to Codex, their coding assistant. They're introducing an IDE extension, an extension to the standard coding tool, which is also something that Claude Code has. They're introducing GitHub code reviews. Generally, they're expanding the feature set of their Claude Code competitor.

And I guess for non-programmers this might not be very exciting, but I think Claude Code has really seemingly made a huge impact in the programming world. These agentic coding tools are being adopted pretty rapidly and making a big shift. So OpenAI managing to compete, managing to get some user share with Codex after, on a rare occasion, entering this space later than Anthropic, is a pretty significant struggle.

Yeah, a lot of stuff going on in the coding world right now, as you've seen from the many startups involved in this.

Applications & Business

Our applications and business story for today is also about a startup sort of in this space. It's about a company called Lovable, which TechCrunch refers to as a vibe coding startup. If you haven't seen Lovable before, basically it's used to create full-stack web applications and websites, so that's the specific area they're in, and they are projecting some pretty big numbers. They're aiming to achieve $1 billion in annual recurring revenue within the next 12 months, which is quite soon.

It's currently growing that ARR by at least $8 million each month. It has already surpassed $100 million in ARR just eight months after reaching its first $1 million, which again goes to show that, while obviously many of these companies have lots and lots of spend, the kind of user and revenue growth they can experience is on quite a different level from what we've been seeing before.
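As a rough back-of-the-envelope check on those figures (not something worked through in the episode), the sketch below compares a straight-line projection at $8 million per month against the constant month-over-month growth rate that the $1 billion target would imply. It assumes ARR is exactly $100 million today and that growth is either purely linear or purely compounding; the function names are illustrative.

```python
def linear_projection(start_arr: float, monthly_add: float, months: int) -> float:
    """ARR after `months` if a fixed dollar amount is added every month."""
    return start_arr + monthly_add * months


def implied_monthly_growth(start_arr: float, target_arr: float, months: int) -> float:
    """Constant month-over-month growth rate needed to go from start to target."""
    return (target_arr / start_arr) ** (1 / months) - 1


# Figures quoted in the episode; treating them as exact is an assumption.
linear = linear_projection(100e6, 8e6, 12)
needed = implied_monthly_growth(100e6, 1e9, 12)
past = implied_monthly_growth(1e6, 100e6, 8)

print(f"ARR in 12 months at +$8M/month: ${linear / 1e6:.0f}M")
print(f"Monthly growth needed to reach $1B: {needed:.1%}")
print(f"Monthly growth from $1M to $100M in 8 months: {past:.1%}")
```

Under these assumptions, adding $8 million a month lands near $196 million, well short of $1 billion, so the target only works if growth keeps compounding at roughly 21% per month. That is still much slower than the roughly 78% per month implied by the earlier climb from $1 million to $100 million.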

Yeah. Lovable has been kind of a clear winner so far in this entire space. They did launch quite a while ago, but they pretty much took off this year as AI got good enough to be usable basically without knowing code or reading code. Lovable is one of these very user-friendly tools; I don't know if they even allow you to see the code.

There are some competitors like Replit, which are more friendly to technical users and expose much more techie stuff, and it's a very busy space, as you said. So Replit is one competitor. There's also Bolt, there's v0 from Vercel, there's Base44. There are at least 10 significant players at this point, I think. And it's probably going to be a major market, assuming the economics of it start working, because I think the speculation is these companies are acquiring all this revenue by burning through cash and not even trying to be profitable at this point.

And speaking of big numbers, the next one is about a raise by Decart, the company that we recently covered as having launched this real-time video-to-video model that was very powerful.

You can give it a normal stream of regular real-world video, and it can turn it into GTA or, I don't know, The Simpsons, or any sort of art style, with real-time streaming, which would mean that if you're playing a game, it can completely change the art style, for instance. Or you can even have a very low-res game and then make the graphics whatever you want. So they have raised $100 million and have now hit a $3.1 billion valuation, and that's pretty significant.

There's no large set of users for this yet, and this entire idea of streaming video-to-video, with their model MirageLSD, is again still sort of at a preview stage. So investors seem to be pretty optimistic about this having a lot of potential. Yeah. It's one of those where it feels quite early to say anything substantive. We have another story here that's also about a pretty big raise, from a company you've surely heard about before that is not too new.

Cohere has raised $500 million from investors at a new valuation of $5.5 billion. Lots of different players are involved here. Cohere is hoping to use those funds for accelerated growth. They plan to expand their technical teams and develop enterprise AI solutions. Again, unlike many AI startups, Cohere is less focused on consumer applications and much, much more on customizing AI models for enterprise clients like Oracle and Notion, hoping to develop this sort of cloud-agnostic AI platform.

So this is again a pretty different approach that some of these labs are taking with their technical talent, where they are trying to look at different enterprises and businesses and think about how AI can be useful for a given vertical. And you're seeing both general versions of that, like Cohere, and also ones that want to develop deep expertise in a very specific area. Next up, we have a story about Pony AI, which is not active in the US and is aiming to roll out to the European market.

So this report from Bloomberg is saying that this is their aim. So far, apparently, they've already rolled out 200 Gen-7 robotaxi vehicles just over the past two months. They're aiming to get to a total of a thousand vehicles. And this is notable because in the US we've definitely seen a speed-up of competition and deployment of robotaxis. This year in particular, Waymo is entering new markets, and Tesla's robotaxi service just launched and is also at least aiming to expand rapidly.

It's very clearly going to be a huge deal. This problem is starting to be at the point where it's solved, where robotaxis are quite reliable. People seem to prefer them to Ubers in general, from what I've seen in discussions. So Pony AI, being another significant player coming from China, has the potential to really break into the European market. And if that's the case, that's going to be a big deal, right?

Last story on applications is about another big lab and a bit of a changing of the guard. Igor Babuschkin, who is a co-founder of Elon Musk's xAI, and who I recognize as having some kind of bird as his X profile photo. I don't know if it's a bird. I can't remember if it has wings, but it's a memorable profile photo. Anyway, that's beside the point. He has announced his departure from xAI to start his own venture capital firm, Babuschkin Ventures, which will focus on supporting AI safety research and backing startups that aim to advance humanity and explore the universe.

This was inspired by a discussion with Max Tegmark about building AI systems safely for future generations, and it also follows several scandals at xAI involving the chatbot Grok, which included controversial responses and inappropriate content generation.

Many of you, if you are extremely online or spend basically any time on X, probably remember the Grok 4 release and what happened around then. Yeah, it's been a tumultuous few months for xAI, to be sure. There were a lot of impressive results with the Grok 4 launch; it's just a very impressive LLM. xAI in general, since launching, I think towards the end of 2023, since the team came together, has just caught up incredibly rapidly. It'll be fun to speculate if this means that xAI is not doing so well.

Typically you don't see people departing from startups they've co-founded in less than two years. But here it's obviously hard to say if Babuschkin just wanted to go off and start this venture initiative, or if it indicates anything about xAI internally. Still, it's significant to have a shake-up in leadership in general, and xAI is at an interesting time in its life.

Projects & Open Source

Moving on to projects and open source. First, we have an open source release from Meta AI. This was, I think, from a couple of weeks ago. The release is DINOv3, a state-of-the-art vision model trained with self-supervised learning, which is able to generate high-resolution image features.

So basically it allows you to process any given image and output a representation of it that's useful for all sorts of stuff, and that you can use for things like object detection, semantic segmentation, video tracking, et cetera, without any fine-tuning. And this is a pretty large model. It has 7 billion parameters, which is unusually large for pure image models, and it's trained on 1.7 billion images. This is very much just taking the image processing model to the biggest place it's been.

We don't talk too much about pure image models for things like semantic segmentation, object detection, video tracking. These are semi-solved problems at this point; a decade ago, these were significant tasks in computer vision. But it's pretty important to remember that, as far as using AI and applying it, object detection, segmentation, and general video and image understanding tasks are pretty significant.

So having a really cutting-edge model that is free for academic use, that has a commercial license as well, and that comes with a lot of code could be very useful for certain people. Yeah, these sorts of models clearly have pretty important impacts out there in the world. For this specific model, a few orgs like the World Resources Institute and NASA's Jet Propulsion Laboratory have been using it.

This has also improved the accuracy of some pretty specific tasks like forestry monitoring and supporting vision for Mars exploration robots. And the fact that you can do this with minimal compute overhead, and that you don't have to rely too much on web captions or curation, so that you are able to apply this universal feature learning when you're bottlenecked by annotation, is a really good advancement, I think. Next up we have a specific set of foundation models, GLM-4.5.

This is an LLM with 355 billion parameters designed to excel in agentic, reasoning, and coding tasks. It employs a mixture-of-experts architecture, which is pretty familiar to a lot of people who spend some time in ML research, but basically lets it select different subsets of its parameters for different tasks, which is quite good for efficiency and performance. What this also means is that when you hear the number of parameters in the model, that's not quite the same as the effective number of parameters, the number of parameters that are actually being used when the model makes an inference about something.

The training is sort of multi-stage here. It pre-trains on a diverse data set. This is followed by fine-tuning on specific tasks to improve its capabilities. Nothing too crazy here. There's RL thrown into the training process, especially when it's working on decision-making and problem-solving sorts of tasks. Just a pretty interesting model. Yeah, it's kind of interesting.
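The distinction drawn above between total and active parameters in a mixture-of-experts model can be illustrated with a toy top-k router. This is a minimal sketch with made-up sizes, not GLM-4.5's actual routing scheme or parameter breakdown; `top_k_route` and `parameter_counts` are hypothetical helper names.

```python
import math


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over only the selected experts so their weights sum to 1.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def parameter_counts(shared_params, params_per_expert, num_experts, k):
    """Total parameters stored versus parameters used for a single token."""
    total = shared_params + params_per_expert * num_experts
    active = shared_params + params_per_expert * k
    return total, active


# Hypothetical router scores for one token over four experts.
routes = top_k_route([0.1, 2.0, -1.0, 1.5], k=2)
print(routes)  # experts 1 and 3 are selected for this token

# Hypothetical sizes (in billions): 2 shared, 10 per expert, 8 experts, top-2.
total, active = parameter_counts(2, 10, 8, 2)
print(f"total={total}B, active per token={active}B")
```

Every expert's weights have to be stored and served, but each token only passes through the shared layers plus its k selected experts, which is why the headline parameter count overstates the compute spent per token.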

We have a figure here, Figure 3, and there's pre-training on a general corpus, then pre-training on a code and reasoning corpus. Then there's mid-training, which has repo-level code data, synthetic reasoning data, and long-context and agent data, and then there's RL and so on. So there's a lot going on, and this is very much following in the footsteps of R1.

R1 sort of introduced, I think, this approach, at least in terms of published research, of having these multiple stages for training agentic and reasoning models. And the notable thing about this model, aside from being big, is that they are doing quite well: they're claiming on the benchmarks to be beating Opus 4, to be almost up there with o3 and Grok 4, and to be quite performant at a smaller number of parameters.

So, you know, 355 billion parameters is a lot, but it's less than DeepSeek R1 and less than Kimi K2. On coding tasks, they are similar on the benchmark front. So this is very much a continuation of a trend you've seen all throughout this year of open source models coming out of China. Starting with R1 and really proceeding ever since, these models are getting better and better, getting really on par with the closed source offerings from Anthropic and OpenAI for many things, which is new, right?

Like, until this year you could not get an open source LLM that was anywhere near competitive with Claude or ChatGPT. Now that's different. And speaking of open source releases from China, the next story is about DeepSeek releasing its V3.1 model. So this is, you know, a bump in the version, as is probably clear from the title. It has a longer context window and is not any sort of substantial jump in any sense.

But I think it's notable to see DeepSeek continuing to release and continuing to update the R1 model incrementally and still being competitive. Although apparently DeepSeek fans are waiting for the release of R2, which would be the successor to R1. So this is kind of leading up to that. And speaking of open-weight LLMs, we have kind of an interesting story about the overall market.

So, Artificial Analysis did a benchmark evaluating the performance of gpt-oss-120b, the recent open-weight release from OpenAI. And they evaluated the performance of this model across different providers on the cloud. So you can run these open source models through various companies like Cerebras, Fireworks, DeepInfra, Together AI, Groq, Amazon, Azure; a bunch of them.

And the funny thing they found is that on a particular benchmark, AIME, there are very different outcomes across the different providers. So on some of them, Cerebras, Nebius, DeepInfra, they get a very high score, 93%. Then you go to Groq, Amazon, Azure, and they go down by 10%, maybe even more than 10%. It's hard to say what these providers are doing: are they making smaller versions? Are they quantizing? Are they using different hardware? But it's definitely a surprising result.

You would think that if it's the same model and all these people are serving it, letting you use it via their hardware, you'd expect roughly the same performance. But apparently that's not the case. Our last story on this front is an open source text-to-speech model from Microsoft called VibeVoice-1.5B. This is capable of generating up to 90 minutes of speech with four distinct speakers, supporting cross-lingual synthesis and singing.

It's primarily trained on English and Chinese and available under an MIT license. There's a decent amount of work going on right now in audio synthesis, and I think that this is a pretty exciting advancement. 90 minutes of speech is quite a long time. I think there are still questions about the general coherence of the audio over that stretch of time, but it does seem as though, again, we're making pretty quick advancements.

Yeah. And this is one of these notable things where audio in general, historically, has kind of lagged behind on the open source front in terms of data sets and in terms of models. It's just this kind of area where you don't have as many options as, for instance, image generation. So having powerful text-to-speech means that, on the one hand, as a company, you can use it and fine-tune it for various applications. On the other hand, we know now that people use these kinds of things for scams and so on.

And that would just mean that you have to really be on the lookout whenever you hear someone in audio these days. It's at a point where you cannot tell the difference between AI generation and actual recorded audio.

Research & Advancements

And on to research and advancements. Just a couple of stories for this episode. The first one is Deep Think with Confidence, which is a new approach that basically makes test-time scaling more efficient and more effective. So they're looking at the type of test-time scaling where you want to do several parallel reasoning paths. You want to have a model try to solve the problem multiple times and get to different results.

And then you might take sort of a majority output or a combined output of your various reasoning traces. And this paper introduces a fairly straightforward idea. So it's titled Deep Think With Confidence.

As you are doing your rollouts of different reasoning paths towards getting to an answer, you can evaluate, roughly speaking, the confidence of the model using what they call token confidence, which looks at the probabilities of the tokens it's actually outputting, and they also define an average trace confidence that they call self-certainty. And basically they evaluate this thing as you roll out the model. And if you have low confidence, they kill the run.

They kind of stop it. So you end up being able to do many parallel runs and cut off ones that seem unpromising. Then, for the ones that get to high confidence, you're able to combine the results from multiple runs and get to a combined, confident output. And in benchmarks, they show that with this method they're able to improve performance pretty substantially, by 10% on some of these benchmarks like AIME.

A couple percent boost for gpt-oss, a 5% boost for DeepSeek. Basically, for things where you're not reliably getting the right output, you are now going to get to the right answer a more significant fraction of the time. And yeah, it speaks, I think, to where we are with reasoning and test-time scaling. There's probably a lot of low-hanging fruit in this whole area of test-time scaling in terms of ways to do it more reliably and efficiently.
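The idea described above (score each reasoning trace by how confident the model was in its own tokens, cut off low-confidence runs, then vote among the survivors) can be sketched in a few lines. This is a simplified illustration, not the paper's exact definitions: it assumes confidence is the mean log-probability of emitted tokens over a sliding window, and the threshold, window size, and toy traces are all made up.

```python
import math
from collections import Counter, defaultdict


def run_trace(token_logprobs, threshold, window=4):
    """Replay a trace token by token, stopping early if confidence drops.

    Returns the mean log-probability of the full trace, or None if the
    trace was cut off because a recent window fell below `threshold`.
    """
    seen = []
    for logprob in token_logprobs:
        seen.append(logprob)
        if len(seen) >= window:
            recent = seen[-window:]
            if sum(recent) / window < threshold:
                return None  # low confidence: kill this run
    return sum(seen) / len(seen)


def confident_vote(traces, threshold, window=4):
    """Confidence-weighted vote over the traces that survive filtering.

    `traces` is a list of (answer, token_logprobs) pairs.
    """
    votes = defaultdict(float)
    for answer, logprobs in traces:
        confidence = run_trace(logprobs, threshold, window)
        if confidence is not None:
            # exp(mean logprob) is the geometric-mean token probability.
            votes[answer] += math.exp(confidence)
    return max(votes, key=votes.get) if votes else None


def plain_majority(traces):
    """Baseline: unweighted majority vote over all traces."""
    return Counter(answer for answer, _ in traces).most_common(1)[0][0]


# Toy traces: (final answer, per-token log-probabilities). Made-up values.
traces = [
    ("42", [-0.1, -0.2, -0.1, -0.1, -0.2]),
    ("42", [-0.3, -0.2, -0.4, -0.2, -0.3]),
    ("17", [-0.2, -2.5, -3.0, -2.8, -2.9]),
    ("17", [-0.5, -0.6, -0.4, -0.5, -0.6]),
    ("17", [-0.2, -0.1, -2.9, -3.1, -3.0]),
]

print(plain_majority(traces))
print(confident_vote(traces, threshold=-1.0))
```

In this toy setup the plain majority picks "17" because three of five traces end there, but two of those traces are cut off for low confidence, so the filtered, weighted vote picks "42". In a real system the early stop would also save the tokens that the abandoned runs never generate.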

This one is a fairly straightforward algorithmic method that can be applied widely. While we're thinking of test-time scaling and a lot of these improvements, maybe a natural question to ask is what happens to jobs? And as it happens, a couple of days ago, a Stanford study found that the adoption of generative AI is significantly affecting job prospects for young US workers, particularly those aged 22 to 25. And this came out quite recently and there's been a lot of commentary.

I would actually recommend taking a look at Noah Smith's recent blog on this specific paper, and also, of course, reading the paper itself because it's worth trying to understand and contextualize those claims. But just to get up a bit on a soapbox about this paper, I feel like despite the fact that the people who wrote this paper are pretty careful economists and very deserving of respect, it does feel like this finding is a bit of a specification search.

As job markets rise and fall, there's always some group of people who are doing worse than the rest, and it's a little bit unclear that it is always justified to tie this to there being a new technology on the block, like AI. What's worth saying is that, sure, it's possible that AI is impacting job prospects to some extent, but it's a little bit hard to disentangle this entirely from other economic factors.

One really great thing that Noah Smith does in this post is he looks at the data about how AI exposure relates to job prospects for people at different ages. The finding is specifically for people aged 22 to 25. But the workers in their thirties, forties, fifties who were judged to be most heavily exposed to AI have actually seen robust employment growth since late 2022. And you can maybe square this with the story about AI destroying jobs.

But again, it's kind of unclear, like, why would companies be rushing to hire new 40-year-old workers in AI-exposed occupations? Again, just a lot of question marks here. The paper is titled Six Facts About the Recent Employment Effects of Artificial Intelligence. So they're examining the effects of AI on the labor market, on unemployment, on people being able to get jobs.

The first fact is they uncover substantial declines in employment for early-career workers aged 22 to 25, as we said, in occupations most exposed to AI, such as software developers and customer service representatives. The second key fact is that overall employment continues to grow, but employment growth for young workers has been stagnant since late 2022. The third fact is not all uses of AI are associated with declines in employment.

Fourth, they find that employment declines for these workers remain after conditioning on firm-time effects. So they do try to be careful. As you said, this is analysis from labor data. We are not doing experiments here. We're just looking at various statistics and trying to conclude what effect AI may have had. So they try to account for these other factors that could explain the statistics.

Fifth, they say that the labor market adjustments are visible in employment more than compensation, and sixth, the above facts are largely consistent across various alternative sample constructions. So, as you said, economics research is tricky. There's no careful experimentation going on here. They are working with data that can have various interpretations.

Like in the case of software development, for instance, which is one of the major areas where employment has been much harder for early-career professionals. Obviously there's many factors going on. During COVID, there was arguably over-employment. Many of the big tech companies really hired like crazy, and then there was a large amount of layoffs going on in software development over the last couple of years. There's economic conditions, all sorts of stuff.

So this is a very early piece of research and they do, to be fair, kind of position it as such. They call this canaries in the coal mine to indicate that this might be a sign of what's happening, but it's kind of still early and it's hard to tell. But as far as analysis, as far as sort of actual research that is able to tell us anything about employment with AI, to my knowledge this would be the first sort of major work. Obviously I'm not an economist.

Maybe there's been some prior research on this, but this is coming from a Stanford group that is pretty oriented on this. One of the lead authors is Erik Brynjolfsson, who has done previous research on AI and economics. So as you said, Daniel, if you find this interesting, probably we have to follow up and see some more deep analysis and possible interpretations of this.

Policy & Safety

Yeah. Our next story is in the policy and safety space, and this one's actually really interesting. It's about an unpublished report on AI safety from the US government. Back in October, a red teaming exercise was conducted at a computer security conference in Arlington, Virginia, where AI researchers stress tested some advanced AI systems. They identified 139 novel ways these systems could misbehave, like generating misinformation, leaking personal data, things like this.

The key upshot of that exercise was it revealed significant shortcomings in a new US government standard designed to help companies test AI systems. But the National Institute of Standards and Technology didn't publish a report on those findings. The reason for that, according to some sources, was that it, along with other AI documents from NIST, was withheld because there were some concerns about conflicting with the incoming administration's policies. Wired now has this unpublished report.

And I guess one of the key takeaways here is that this is an area that feels like it should be nonpartisan and ideally not too influenced by politics, but it seems like there have been challenges faced in publishing AI research under the Biden administration. So just an interesting story in terms of the confluence of politics and AI safety. Right.

This is a report from NIST, the National Institute of Standards and Technology, which was tasked with this kind of thing, with creating standards and technology for AI. They created this NIST AI 600-1 framework to assess AI tools. This is the Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile. So this red teaming exercise basically was to evaluate this framework that they published like a year ago, I think mid 2024. So, probably, yeah.

Not too surprisingly, you know, the Trump administration reversed Biden's actions on AI and recently published their own agenda on AI. And it's very likely that these kinds of AI security initiatives are gonna see less interest, less promotion with the current administration. And another story about the US government, and kind of a surprising one. The US government is going to take a cut of NVIDIA and AMD AI chip sales to China.

So we have talked quite a lot about export controls, about restrictions for Nvidia to be able to sell GPUs to China. It's been a very evolving area under the Trump administration. There was a time where the H20 chip, which for a long time was the one that Nvidia sold to China specifically, was blocked suddenly from being sold. And so this is kind of reversing that. Now Nvidia apparently is able to sell the H20 again, but will have to pay the government.

So Jeremy unfortunately would be the guy who would give the most insight on this development, but it seems a bit surprising as far as the approach to export restrictions. Moving on to something unrelated to the government, going to another topic we've talked about quite a lot: lawsuits ongoing about copyrights for the major LLM providers. So Anthropic has settled a high-profile AI copyright lawsuit brought by book authors.

So this was initiated by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of using their books without permission. There were some, let's say, conflicting developments in here. A California district judge ruled Anthropic could use books under fair use, but found that the acquisition method via shadow libraries constituted piracy. And this is just one of multiple lawsuits ongoing.

Basically for years now, that would have major ramifications about basically how you can use, how you can acquire data for training AI models. OpenAI and Anthropic and others kind of took the maximally permissive approach of using a bunch of data without asking any permission. And so this settlement, it's hard to say as non-lawyers how significant of an effect it has on other ongoing legal developments.

But it does mark one kind of piece of progress in this long ongoing story where, you know, at least this lawsuit has reached an end. Our last story is about AI companion apps, which are on track to pull in $120 million in 2025. In the first half of the year, these apps already generated $82 million, with downloads up by 88% year over year, reaching 60 million, and the top 10% of these apps account for 89% of the revenue, with 33 apps surpassing $1 million in lifetime consumer spending.

The popular ones in this space include Replika, Character.AI, PolyBuzz, and Chai, with a significant portion of users seeking AI girlfriends. You may have also seen commentary on Twitter about AI boyfriends being very popular. This is a really interesting, hairy space to me because I think it represents, or it sort of portrays, something pretty fundamental about the kind of companionship that people seek and are willing to accept and the different ways in which it can be met and not met.

Personally I find AI companions a bit troubling, for numerous reasons, but I won't get up on the soapbox about it here. Yeah, well, I did include it in the policy and safety section very much because it has pretty, let's say, concerning or significant implications for society, for people's psychology. We know in the modern age there's been very much a degradation in the amount of socializing and the amount of close connections people have.

It's arguably one of the major health crises of the modern age, like people's ability to have friends and close connections. And so this market is growing significantly, getting a lot of revenue. According to this report, there have been 112 apps published just in the first half of 2025, with the names of those apps having "girlfriend" in 56 of them. Fantasy, boyfriend, anime, soulmate, lover, waifu. A lot of, yeah, clearly romantic interaction apps.

And it's coming also in this current paradigm of dating apps. I think the general consensus is it's a hard and unenjoyable process to try and find a human girlfriend or soulmate. So, I mean, it's a little concerning, I think it's fair to say. On the one hand you can treat it as a video game, as like a role-playing exercise, as a fun thing.

By the way, Character.AI, one of the players in the space for a while, which isn't focused on girlfriends, which is general roleplay, still has millions of monthly active users. Like, this is a very big space. So it's likely to keep growing. I mean, xAI recently launched Ani and, like, their own Grok-based companions. I don't know. It's an interesting phenomenon for sure.

And, fun fact, the movie Her, which was all about this thing, directed by Spike Jonze, where the main character played by Joaquin Phoenix falls in love with an AI character, is set in 2025. Lots of people are saying that this movie was incredibly prescient, and I think that's fair. If you haven't seen Her, I highly recommend it. Well, that is it for this episode. As I've said, hopefully we are going to get back to a weekly schedule. Thank you Daniel for fulfilling the guest co-host duties.

Always fun to have you on here. Thanks for having me. I always really love doing this. And thank you to the listeners. As always, we appreciate you tuning in and bearing with us as we skip some weeks at an unpredictable rate. Always appreciate it if you leave reviews, if you share it with friends and more than anything if you just keep tuning in.

Transcript source: Provided by creator in RSS feed