🎙️ EP 200: AI Builds Itself Now?! Codex vs Claude + Google’s Gemini Explosion

00:00

Okay, let's start with the concept. It's something you'd usually find in a sci-fi novel, not a Tuesday news headline. Recursive self-improvement. It's the big one. The holy grail of AI, really. The idea that a system can look at its own code, find the problems, and just fix them. Rewrite itself to be better without a human stepping in. And we're not talking theory here. This is what broke this week with OpenAI's new model. It wasn't just writing code. It was being used

00:26

to debug and evaluate itself, doing its own training. That's a huge shift. I mean, we're moving from an era of tools we use to one of agents that build themselves. And it's, honestly, it's a little eerie. A little. It's kind of terrifyingly cool. Welcome back to the Deep Dive. Today is Thursday, February 5th, 2026. And I hope you're ready because the world feels different than it did just a week ago. Oh, it really does. The acceleration is just... It's noticeable now. So here's our

00:53

roadmap. We have a lot to get through today. First, we're going to unpack just the absolute madness of this agentic arms race. Yeah. We had OpenAI's GPT-5.3 Codex and Anthropic's Claude Opus 4.6 drop, what, minutes apart? It felt like a coordinated strike. It was wild. Totally. Then we'll look at how this all spilled out into the real world. I'm talking about the culture war at the Super Bowl and the stock market just freaking out. Yeah. And then we have to talk about the

01:18

elephant in the room, Google. While everyone was watching these shiny new agents, Google has been scaling something that is, frankly, hard to even comprehend. The financial reality check there is pretty staggering. And finally, we'll wrap up with some new tools and this mindset we're calling the cognitive gym. Because if the AI is getting smarter, we really need to make sure we aren't getting dumber. That's maybe the most critical part of this whole conversation for

01:44

me. Okay, so let's dive in. All right, let's start with what everyone's calling Thursday AI. It feels like every Thursday the world changes. But this one, this one was special. OpenAI and Anthropic, head to head. Oh, it was pure theater. So first you have OpenAI dropping GPT-5.3 Codex. And, you know, for context, this is not just a slightly better chatbot. This is a model built to live inside your operating system. Right. It's powering a new macOS app. But I want to

02:11

get at this agentic part. We use that word a lot. What does it actually look like when you use this thing? Okay, so the old way, and by old I mean 2025, was you'd ask the bot, write me a function. It spits out code. You copy it, you paste it, you see if it works. Codex 5.3 is different. It treats your entire computer like its sandbox. The release notes talked about it handling complex app development, even game development, over multiple days. Multiple days.

02:37

That's the part I can't wrap my head around. How does an AI remember what it was doing on Tuesday when it wakes up on Thursday? That's the leap. It's about more than just a big context window. They're calling it episodic memory. The model keeps a log of its own decisions, of its own intent, so it knows why it built the database that way yesterday so it can build the front end to match it today. And the headline feature, the self-debugging, you said it helped build

03:01

itself. How does that really work? Is it just like a spellcheck for code? It's way deeper. So in traditional training, a human has to label the data, right? A person says, this code is wrong. That's slow. With this, the model generates code, tries to compile it, reads the error message, figures out the logical flaw, and then rewrites the code and tries again. It's a closed loop. So it's learning from its own failures in real time. At machine speed. Exactly. It's why it's

03:28

25% faster in execution. It's already failed a thousand times in a simulation before you even type your first prompt. Wow. Okay. So on one side, we've got OpenAI building the ultimate builder. But then Anthropic was not just sitting around. No. And this was the drama of it all. Anthropic pushed their launch by 15 minutes. It really felt like they were waiting for OpenAI to hit publish just so they could steal the news cycle. And they dropped Claude Opus 4.6. And

03:53

they're calling it vibe coding? I saw that all over X. What does that even mean? Is that just marketing? It's a fun name, but it actually describes a real thing. It's about coding with natural language intent, not strict syntax. So you don't tell it how to write the code. You tell it the vibe of the app. I want a retro dashboard that feels like a 90s hacker movie. And it just, it translates that feeling into code. So it's bridging the gap between sort of art school instructions

04:20

and computer science execution. Precisely. And while Codex is digging deep into the OS, Anthropic is going wide on data. Opus 4.6 can pull info from huge document sets, run full financial analyses. I saw the stat here. It's number one on the finance agent benchmark. Beating OpenAI. Which is huge for the corporate world. While OpenAI is chasing developers, Anthropic is going right after the white-collar workflow. The analysts, the researchers,

04:47

the people living in spreadsheets. So we have Codex building itself and Claude basically replacing the analyst. It makes me wonder, if the AI is doing all this, where do we even fit in? It's a multi-billion-dollar question, isn't it? No, really. If the model can fix its own errors, where does a human developer actually fit into that loop? I think we stop being writers, you know, writers of code, writers of reports. We become architects of intent. We define the what

05:12

and the why. And the agents handle the how. You don't lay the bricks anymore. You're designing the cathedral. So if the AI debugs itself, the human moves from being a writer to an architect. Yeah. We stop typing syntax and we start designing outcomes. But this tech rivalry, it didn't just stay on servers. It went right to the biggest stage in America. The Super Bowl. Super Bowl LX. I usually watch for the game, but the commercials this year were aggressive. Aggressive is one

05:43

word for it. It was a culture war in 30-second spots. Let's start with Anthropic's ad. It felt... Very pointed. It was a subtweet on national television. They ran this ad that subtly made fun of AI platforms cluttered with ads. It was a direct shot at OpenAI, who've been playing with sponsored results. And it definitely got a reaction. Oh, for sure. Sam Altman was on X, like, immediately. He called

06:05

the ad clearly dishonest. You know the industry's getting serious when the CEOs are beefing during the halftime show. It felt very 90s Cola Wars. But then Google. Google took a totally different path. They didn't attack anybody. No, Google's Gemini ad was warm. It had bicycles, families, the Madden Bowl. They were just selling helpful AI to everyone. It was like they were saying, hey, let the startups fight. We're just here to help you find a cookie recipe. That's their

06:31

entire play. Normalization. Make AI feel as safe and boring as a Google search. But while the ads were going on, the stock market was reacting to the actual tech. And that was a lot less warm and fuzzy. Yeah. Let's talk about the panic. Anthropic drops 11 Claude Cowork plugins. And Wall Street just loses its mind. Why did software stocks tank so hard? It's the implication of those plugins. The Cowork tools let the AI plug

06:57

directly into your company. So it was a sell-off based on the idea that these AI agents could just kill the whole software-as-a-service model. Exactly. Why rent a tool for a human when you can just rent an agent to do the job? But... You know, Jensen Huang from NVIDIA called the whole thing illogical. Well, of course he did. He sells the chips that run everything. He has a bias, for sure, but he's got a point. The market is reacting to the fear of being obsolete, not

07:34

the immediate reality. But then, to counter that fear, OpenAI launched Frontier. Right, their enterprise platform. How does that calm things down? Well, it turns agents into coworkers that can use files and run code, but securely inside a corporate environment. They're trying to say, hey, we aren't destroying enterprise software. We're becoming the platform it runs on. It feels like we're in this limbo where the technology is moving way faster than the business models can adapt.

08:01

That is a massive understatement. So with all this volatility, is the stock market panic justified or is it just noise? It's panic for sure, but it's a signal the SaaS era is ending. Speaking of ending eras, let's pivot. Let's talk about the giant that everyone keeps underestimating because they aren't always the coolest. Let's talk about Google. The empire strikes back. I'm serious. We get so caught up in the OpenAI and Anthropic drama. Yeah. But then you look at Google's

08:25

Q4 earnings call. Yeah. And it's just brute force. It really is the difference between product elegance and just logistical dominance. Let's run the numbers because they're insane. Yeah, lay them on me. Okay. The Gemini app. 750 million monthly active users. That's more than TikTok in some places. 750 million. That's a lot of people asking how to bake bread. And it's not just that. Enterprise. 8 million paid seats in four months. 95% of the top 20 SaaS companies are using it. Over

08:57

5 billion interactions in Q4 alone. 5 billion. And there was that detail about how people are using it, right? It wasn't just text. Right. One in six AI mode searches now use voice or image. That's a huge behavioral shift. Yeah. People are showing Google a picture of a broken pipe and asking, how do I fix this? And the infrastructure. I mean, OpenAI has Microsoft, but Google basically is the internet. That's their moat. Gemini 3 Pro is the fastest adopted model in their history.

09:22

Their new workbench, Antigravity, got 1.5 million users in two months. But here's the number, the one that stops the conversation: capital expenditure. The money they're spending to build all this. Google's planned CapEx for 2026 is between 175 and 185 billion dollars. 185 billion. I can't even. What does that money even buy? It's not just servers. It's energy. It's physics. They're securing fusion power contracts, upgrading national power grids, building custom chips.

09:52

This is nation-building money. This is the Manhattan Project times 10. It puts everything in perspective. OpenAI feels like magic. Google feels like gravity, an unstoppable force. OpenAI has elegance. Google has distribution. Basically, infinite cash. So does product elegance even matter if you can spend $185 billion on infrastructure? Distribution beats elegance every time. History repeats itself.

10:15

Okay, let's bring this back down to earth. We've talked about billions of dollars in global infrastructure, but because they're spending all that money, the cost for us, for the regular person to create something, is dropping to almost zero. That's the other side of the coin. It's a golden age for creative tools. Yeah. Despite the titans fighting, there's some incredible stuff coming out for creators. Let's do a quick rapid fire on some new tools. First up, Kling 3.0. Huge

10:42

for video. We're talking native 4K outputs, longer video times. You can basically make short films with this now. It's not just a four-second blurry clip anymore. Webflow. Big one for designers. You give it a prompt, it builds a multi-page

10:55

production-ready website. Not a mock-up. A real site. Higgsfield Vibe Motion. Yeah, motion images from a single prompt. It gives you incredible control if you need a specific asset. And Superboard. This one's amazing. Connects over 600 data sources. It's like having a team of data analysts in your pocket. You connect your Stripe, YouTube, whatever, and just ask it questions across all of them. These tools are incredible, but they bring us to the, uh, the philosophical part of this. The

11:23

thing we're calling the cognitive gym. I love this concept because there's a real danger here. The danger that we all just get, you know, lazy. Worse. That we get functionally dumber. If we outsource all our thinking. If we let Codex write the code and Claude do the research. What happens to our own brains? It's like using GPS all the time. Nobody knows how to navigate anymore. I have to admit something here. A bit of a vulnerable admission. I've felt this myself. I call it prompt

11:49

drift. I've been letting the AI summarize articles for me instead of reading them. And last week, I realized I couldn't actually remember the details of something I'd supposedly learned. I retained less information because I did less of the work. That's the trap. Exactly. The friction is where the learning happens. The cognitive gym idea is the answer. You have to use the AI to increase the difficulty, not take it away. How does that even work? Give me an example. How do I bench

12:14

press with an AI? Instead of asking the AI for the answer, you ask it to quiz you. Instead of asking it to write the draft, you write it, and then you ask the AI to completely tear your logic apart. You force it to debate you. You have to lift the weight yourself. So the AI becomes the spotter, not the weightlifter. Exactly. And we're seeing this trend with solo founders. They're using AI to do the work of 10 people, but they're

12:37

still the one driving the vision. So how do we stop from becoming just passive consumers of AI output? You have to treat the AI as a sparring partner, not a butler. We'll be right back. After a quick word. And we're back. Okay, let's recap the big ideas from today. It's been a heavy week. A historic one, I think. First, the agent wars. Codex is literally building itself. Claude is aiming to replace entire white-collar workflows. We are moving well past chatbots. Second, the

13:08

culture is splitting. The Super Bowl ads showed us the tension. The market showed us the fear. The entire SaaS model is under threat. Third, Google. The quiet giant. While we watch the startups, Google is spending $185 billion to build an infrastructure moat that might be impossible for anyone else to cross. And finally, the cognitive gym. The tools are getting incredibly powerful, which means we have to be more disciplined. We have to make sure we stay the architects of intent.

13:36

We are absolutely moving from chatting with bots to managing swarms of agents. That's the new reality. If you want to try out Kling or Superboard or any of the tools we talked about, check the show notes. We've got links for you there. They're definitely worth experimenting with. I want to leave you with one final thought. We mentioned ElevenLabs raised over 400 million euros, more than all of Europe combined in AI funding recently. And Google is spending 185 billion. The scale

14:02

of the numbers is just, it's astronomical. So if all that capital is concentrating so intensely, are we just watching the consolidation of human intelligence into three or four zip codes? And if that's true, what does that mean for your solo business in 2027? That's the question that keeps me up at night. Thanks for listening to the Deep Dive. We'll see you next time.
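
The closed loop described around 03:01 (generate code, try to compile it, read the error message, rewrite, try again) can be sketched in a few lines of Python. This is only a toy illustration of the loop's shape, not how Codex works internally. The names `try_compile`, `repair_loop`, and `toy_revise` are hypothetical, and `toy_revise` is a hand-written stand-in for whatever model would actually propose the fix.

```python
from typing import Callable, Optional, Tuple


def try_compile(source: str) -> Optional[str]:
    """Return None if the source compiles as Python, else the error message."""
    try:
        compile(source, "<candidate>", "exec")
        return None
    except SyntaxError as exc:
        return f"{exc.msg} (line {exc.lineno})"


def repair_loop(
    source: str,
    revise: Callable[[str, str], str],
    max_revisions: int = 5,
) -> Tuple[str, int]:
    """Compile, read the error, revise, and retry until it compiles.

    Returns the working source and how many revisions were needed.
    """
    revisions = 0
    while True:
        error = try_compile(source)
        if error is None:
            return source, revisions
        if revisions == max_revisions:
            raise RuntimeError(f"still failing after {revisions} revisions: {error}")
        source = revise(source, error)
        revisions += 1


def toy_revise(source: str, error: str) -> str:
    """Hypothetical stand-in for a model: applies one simple fix per call.

    A real system would read the error text; this toy only inspects the source.
    """
    lines = source.splitlines()
    # Fix 1: a 'def' line that is missing its trailing colon.
    for i, line in enumerate(lines):
        if line.lstrip().startswith("def ") and not line.rstrip().endswith(":"):
            lines[i] = line.rstrip() + ":"
            return "\n".join(lines)
    # Fix 2: close any parentheses that were left open.
    missing = source.count("(") - source.count(")")
    if missing > 0:
        return source + ")" * missing
    return source


# Deliberately broken input: missing colon and an unclosed parenthesis.
broken = "def add(a, b)\n    return (a + b\n"

if __name__ == "__main__":
    fixed, revisions = repair_loop(broken, toy_revise)
    print(f"compiled after {revisions} revisions:")
    print(fixed)
```

The cap on revisions is a design choice for the sketch: without it, a reviser that stops making progress would loop forever, which is the failure mode any real closed loop also has to guard against.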

Transcript source: Provided by creator in RSS feed