🎙️ EP 288: Policy on the AI Exponential & Perplexity’s Mind-Blowing AI Agent Data

00:00

An AI model just discovered high-severity zero-day vulnerabilities across every single major web browser. At the exact same time. Right. And meanwhile, a Japanese farmer with literally zero coding experience used AI to, well, launch a satellite crop tracker to monitor his fields. It's just two completely different realities, both of them happening today. Welcome to the Deep Dive. I'm really glad you're here with us.

00:24

Today we are exploring this extreme dual reality of AI. We are going to unpack Anthropic's terrifying new safety framework. Yeah, that one is an intensely heavy read. It really is. And we'll also look at the wild real-world tools dropping right now, things like Claude Fable 5. And then we'll analyze a massive Harvard and Perplexity study. Which basically proves that AI is fundamentally rewiring human ambition. We are looking at a total paradigm shift. Completely. So let's start at the macro

00:53

level. AI is moving at this exponential speed, but policymaking is, you know, moving at dial-up speeds. And that gap is where the real danger lives. Exactly. So to close that gap, Anthropic dropped a two-part policy framework. And the trigger for this is actually really fascinating. Yeah, the preview model. Right. They ran a preview version of their next-gen model. It's called the Claude Mythos Preview. And it basically just went hunting. It did. It successfully found thousands

01:22

of high-severity software vulnerabilities. This included zero-days across all major web browsers. Right. Which are unknown software flaws hackers exploit before developers fix them. Exactly. And the sheer scale of that discovery just terrified the researchers. I mean, it would terrify anyone. Yeah. So they established strict new internal rules. But these rules only hit what they call frontier developers. Meaning the massive corporate

01:46

players. Right. It means companies pulling in over $500 million in AI revenue, or companies spending over $1 billion on AI development. So they are specifically targeting the whales here. Yes, the big spenders. And the framework outlines four nightmare scenarios. They are actively trying to prevent these specific outcomes. First, AI lowering the barrier to creating biological weapons. Right, because suddenly anyone could synthesize

02:14

dangerous materials without any background in biology whatsoever. Exactly. Second, massive automated cyberattacks on critical infrastructure. Third, models completely losing control and acting outside of developer intent. And fourth, AI accelerating its own research loop into a runaway spiral. Those are incredibly heavy scenarios. Very heavy. And to stop them, Anthropic proposed some strict mandates. Frontier labs would be legally forced to run rigorous internal testing.

02:43

They also have to publish highly detailed risk reports. Yeah. And hand their models over to qualified independent evaluators. You just can't grade your own homework anymore. Which makes sense. Right. There is also a broader societal resilience plan in there. They want mandatory gene synthesis screening. To prevent that bioweapons scenario we just mentioned. Like if someone tries to print a virus, it flags the system. Precisely.
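Editor's sketch: the "frontier developer" thresholds quoted above (over $500 million in AI revenue, or over $1 billion in AI development spend) can be written as a simple either-or check. This is only an illustration of the rule as stated in the episode, not text from Anthropic's framework; the `Developer` type, its field names, and the assumption that the figures are annual amounts in U.S. dollars are all hypothetical.

```python
from dataclasses import dataclass

# Thresholds as quoted in the episode; treating them as annual USD
# figures is an assumption made for this sketch.
REVENUE_THRESHOLD_USD = 500_000_000
SPEND_THRESHOLD_USD = 1_000_000_000


@dataclass(frozen=True)
class Developer:
    """Hypothetical record describing an AI developer."""
    name: str
    ai_revenue_usd: float
    ai_spend_usd: float


def is_frontier_developer(dev: Developer) -> bool:
    """Return True if either threshold is exceeded (strictly 'over')."""
    return (
        dev.ai_revenue_usd > REVENUE_THRESHOLD_USD
        or dev.ai_spend_usd > SPEND_THRESHOLD_USD
    )


if __name__ == "__main__":
    big_lab = Developer("big-lab", 2_000_000_000, 5_000_000_000)
    small_team = Developer("small-team", 1_000_000, 50_000)
    print(is_frontier_developer(big_lab))     # True
    print(is_frontier_developer(small_team))  # False
```

Under this reading, a small team fine-tuning a downloaded model falls below both thresholds, which is the blind spot the hosts raise later in the episode.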

03:05

They also want urgent patching for aging software, the legacy code that runs our critical power grids and our hospitals. Oh, man, that is a massive undertaking. Updating decades-old infrastructure is incredibly slow work. It is. Anthropic also took a really firm stance on state laws. They warned that Congress shouldn't block powerful state-level AI regulations. Like the ones currently

03:31

proposed in California or New York. Right. Unless the federal equivalent is just as strong, they do not want a watered-down national standard. Makes sense from their perspective. It does. But I really have to push back on this framework. It feels like, well, like installing blast doors on a bank vault, but leaving the back window completely open. Because the framework only targets the billionaires. Exactly. Regulating only the massive billion-dollar spenders leaves a huge

03:56

blind spot. I mean, what about open-source rogue actors? Yeah, they don't need a billion dollars to cause chaos. Not at all. That is the ultimate tension in the industry right now. Because open-source models are getting incredibly powerful. A rogue actor could theoretically download a model today. Then they just fine-tune it for malicious purposes on a cheap server. Which bypasses these frontier regulations entirely. Exactly. Anthropic is trying to contain the absolute cutting

04:22

edge. But the floor of what is possible just keeps rising for everyone else. That leads me to a crucial question about these risks. What is the actual timeline we are looking at for these nightmare scenarios? We're not talking about decades anymore. Given the speed of the Mythos Preview discovery, we're looking at a zero-to-three-year window for these threats to become highly actionable. So the threat is already here and we're just playing catch-up. Right.

04:47

And that is exactly why they are panicking. Well, let's transition from Anthropic's macro fears. If that is what AI might do, let's look at the micro reality. What is AI already doing right now on the ground? Yeah, this is where the landscape gets really fun. It does. Let's talk about Claude Fable 5. A leaked prompt reveals it is highly tool-heavy and incredibly safe. It is blowing minds less than a day into its release. It aced three of the internet's trickiest questions.

05:14

These are logic puzzles specifically designed to fool AI. Like that famous how many R's in strawberry test. Right. It is highly info-aware. It understands context way better than previous versions. Meanwhile, OpenAI is trying to stay competitive in this exact space. They are considering major token price cuts. Which is the cost to process a basic unit of data. Exactly. Cheaper access means entirely new classes of people can build tools. And speaking of building, let's

05:43

talk about the creators. Remember that Japanese farmer we mentioned at the start? I love this. That is honestly one of my favorite stories right now. He has absolutely zero engineering background. None. But he used Codex to build his own greenhouse automation system. He also built a physical farm bot to navigate his fields. And a satellite crop tracker from his laptop. It's wild. It is unbelievable. He essentially replaced an entire agricultural

06:06

engineering department by himself. We're also seeing wild video generation tools democratizing production. Like Luma AI just released Ray 3.2. It lets you direct complex video with simple text prompts. You get 16 keyframes and 8 face tracks to lock down consistency. And 20-second clips rendered at full 1080p resolution. It even has a full API for developers. That's huge. Yeah. And then there's HeyGen. It's an official cloud connector. You literally create polished videos

06:38

directly from a text chat interface. It features 25 unique skills for editing scenes and adjusting motion. It's just wild to see this infrastructure scale so rapidly. I mean, look at a company like Standard Bots. They just raised $200 million. Pushing them to a massive $1 billion valuation. They claim their AI robots can fundamentally boost U.S. manufacturing. And the historical context here is massive. U.S. manufacturing

07:01

jobs peaked at 20 million back in 1979. Today, we only have about 13 million of those jobs left. And AI robotics claims it can aggressively reverse this downward trend. It's a bold claim. But the digital ecosystem supporting these agents is exploding everywhere. We have new infrastructure tools like Polra popping up. Yeah, that's a publishing API connecting to 10 different social platforms. Right. It gives text-based agents like Claude a full engagement loop. They can post, read replies,

07:32

and adjust their strategy automatically. There is also Spotlight. It's a free tool that natively reads your code sessions. It shows you exactly what your agents actually did behind the scenes. TypingMind is another great one in this ecosystem. It brings the best models across 18 different providers into one single workspace. The speed of this change is honestly dizzying. I have to admit something. I still wrestle with prompt drift myself. Oh, we all do. It's the ghost in

07:58

the machine. You get a model working perfectly for a specific workflow, and then a week later, it just acts completely different. It's incredibly frustrating. Well, the underlying models are constantly updating and shifting their weights. Right. But there is a fascinating contradiction in all of this. Standard Bots needs a $200 million war chest. They need that massive funding to revolutionize physical manufacturing. But on the other hand, a solo Japanese farmer replaces

08:26

an entire engineering team. With just a standard laptop and a basic internet connection. Exactly. So here's my question. Why is there so much friction bringing AI to blue-collar robotics when white-collar software scales instantly for a single farmer? It all comes down to the unforgiving physics of the real world. I mean, software bits can be duplicated instantly for practically free. Yeah. But blue-collar robotics deals with physical

08:52

atoms. You have to manufacture steel, deal with gravity, and manage complex hardware supply chains. Code copies for free, but you cannot copy-paste solid steel. Precisely. And that physical capital requires massive upfront funding. Welcome back. So we have seen the raw power of the tools. We've seen the macro fears from companies like Anthropic. Now let's look at the psychology. This is arguably the most important part of the

09:15

entire discussion. I completely agree. What are these tools actually doing to human behavior? We're moving from software capabilities to fundamental psychological shifts. Perplexity and Harvard Business School just dropped a massive joint study. It is completely shifting how we view modern knowledge work. They systematically compared traditional internet search against Perplexity's new computer agent platform. They rigorously analyzed 10,000 identical queries across both

09:45

methods. That is a massive sample size for behavioral research. Let's break down the data they found. First, we have the time gap. Regular search usually feels pretty quick. You type a question, you get a list of links. But search leaves the heavy cognitive execution entirely up to you. When you factor in the human effort of reading and synthesizing, the traditional workflow took a really long time. How long exactly did the researchers clock it at? An estimated 269 minutes to complete

10:11

a complex task. That is well over four hours of solid human effort. Right. But the agent workflow, it wrapped the exact same task up in just 36 minutes. Whoa. I mean, imagine scaling that 230-minute time savings across a billion queries. It is a staggering amount of unlocked human time. But time saved is only half the story here. Let's look at the creation gap. Right. Like what were people actually doing with all that newly saved time? 50% of the tasks handed to the agent involved

10:41

building something entirely new. Wow. Yeah, that is double the creation rate we typically see on regular search engines. So people are rapidly shifting from passively consuming to actively creating. Yes. And then we have the expertise gap. This is easily the most fascinating data point in the entire study. The number of tasks falling completely outside the user's actual field of expertise jumped significantly. By how

11:06

much of a margin did it actually jump? It jumped nine full points, up to 59% of total tasks. Wait, really? More than half of the tasks were outside their wheelhouse. Exactly. Users completely trusted the agent with highly cognitively heavy work. They confidently asked it to generate complex code across multiple disciplines. They drafted complex legal or technical documents without hesitation. They built multilayered visuals and complex data structures, things they would never

11:35

historically attempt alone. Think about it this way. Using regular search is basically like going to a massive public library. You still have to find the specific instructions yourself. Right. You have to read the books, take detailed notes, and try to build it yourself. But using an AI agent, that is like hiring a brilliant chief of staff. A chief of staff who already read all the books beforehand. Exactly. They read the books and built the working prototype while you

12:01

were just, you know, having coffee. The real unlock here isn't just raw speed. It is about removing the paralyzing friction of grunt work entirely. Yes. When you remove that friction, it actually increases human ambition. People naturally start aiming a lot higher. Perplexity's data proves this perfectly. I mean, interacting with capable agents gives us the underlying confidence to handle vastly more complex projects. Projects

12:26

we normally wouldn't even attempt to start. But this incredible data leads to a highly critical question. If agents unlock ambition and creation for literally everyone, will this lead to a massively oversaturated market of average AI-generated creations? It definitely will create an absolute flood of content. I mean, when creation is easy, the baseline of quality becomes completely ubiquitous. But that just means original taste and human vision become the rare, valuable commodities.
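Editor's sketch: the study figures quoted earlier in this segment can be checked with a few lines of arithmetic. The inputs (an estimated 269 minutes for search, 36 minutes for the agent, and a nine-point jump to 59% of tasks outside the user's expertise) come from the episode; the function and variable names are made up for illustration, and the implied baseline assumes the nine points were added to the search figure.

```python
# Figures as quoted in the episode.
SEARCH_MINUTES = 269
AGENT_MINUTES = 36
OUTSIDE_EXPERTISE_AGENT_PCT = 59
OUTSIDE_EXPERTISE_JUMP_POINTS = 9


def minutes_saved(search_minutes: int, agent_minutes: int) -> int:
    """Absolute time saved per task, in minutes."""
    return search_minutes - agent_minutes


def speedup(search_minutes: int, agent_minutes: int) -> float:
    """How many times faster the agent workflow is."""
    return search_minutes / agent_minutes


def implied_baseline_pct(agent_pct: int, jump_points: int) -> int:
    """Baseline share implied by a jump of `jump_points` up to `agent_pct`."""
    return agent_pct - jump_points


if __name__ == "__main__":
    print(minutes_saved(SEARCH_MINUTES, AGENT_MINUTES))      # 233
    print(round(speedup(SEARCH_MINUTES, AGENT_MINUTES), 1))  # 7.5
    print(implied_baseline_pct(OUTSIDE_EXPERTISE_AGENT_PCT,
                               OUTSIDE_EXPERTISE_JUMP_POINTS))  # 50
```

So the "230-minute" savings mentioned on air is a rounding of 233 minutes, or roughly a 7.5x speedup on the quoted estimates.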

12:55

When anyone can build anything, original human taste becomes the premium. Exactly. The uniquely human element ends up mattering even more. Let's weave all of this together. We started with Anthropic today. They are building massive societal blast doors. Driven largely out of fear of AI's exponential compounding power. Yeah. They see the zero-days. They see the existential risk to critical infrastructure. But that exact same technological power is what

13:20

empowers a single Japanese farmer. It lets him intelligently automate his physical fields entirely from his living room. It gives regular people the confidence to step way outside their established expertise. 59% of tasks were far outside their normal comfort zones. It is a massive behavioral shift for the entire global workforce. The dichotomy of our era is perfectly balanced. We have existential risk matched exactly by an unprecedented explosion of human ambition. Which leaves you with

13:50

a really fascinating question to ponder. AI agents are rapidly removing the historical friction of execution. They are doing the heavy cognitive lifting for us. The repetitive grunt work is basically disappearing. Right. Your ultimate value is no longer in your ability to do that grunt work. Execution is rapidly becoming nearly free. So in a world where execution costs absolutely nothing, how do you cultivate the wisdom to know what to build? That is the ultimate challenge

14:16

for the next decade. Try a simple experiment today. Take just one task that is completely outside your normal comfort zone. Offload it entirely to an AI agent. And just see what it actually does to your personal ambitions. See if it changes how high you aim. Thank you for joining us on this deep dive. We will see you next time.

Transcript source: Provided by creator in RSS feed
