🎙️ EP 150: Sutskever Says AI Scaling Is Over + The $200B Genesis Project

00:00

The fundamental belief that bigger models are always smarter models. It's hitting a wall. It is. And for the pioneers of AI, this realization seems to change everything about the path toward true superintelligence. It's a seismic shift, really. And this critique, you know, it isn't coming from some academic paper. Right. This is straight from Ilya Sutskever, a co-founder of OpenAI. And that signals a huge, expensive change in how the industry is going to approach

00:28

AI. Welcome back to the Deep Dive. Today, we're unpacking a really dense digest of AI developments that shows the tech is at a critical inflection point. We're stepping away from the day-to-day noise. We are, and we're focusing on the big strategy shifts. And our mission for you today is to give you a shortcut to being, well... instantly informed on three massive, connected ideas that are going to shape how you see the market. And how you use these tools. So our deep dive is

00:55

in three parts. First, we're tackling why the age of scaling is ending. This whole idea of just throwing more compute and data at the problem. Exactly. According to Sutskever and his new company, SSI. Then we pivot straight to the practical. We've got some really effective, actionable tips for you. Things like immediate hacks for better prompting. And we'll look at the latest real-world performance benchmarks. And then part

01:17

three, pure geopolitics. This is a big one. We are breaking down America's new $200 billion Genesis Mission. A massive state-sponsored plan to basically double the nation's scientific output using AI. There's a lot of ground to cover, but we're going to make sure you walk away with the most important insights. Okay, let's get into it. This shift in philosophy, we have to start with Ilya Sutskever. He was absolutely central to GPT-4. Central. And now he's left to launch

01:48

Safe Superintelligence Inc., SSI. And this is such a huge statement. Sutskever's core assessment is that the period from, say, 2020 to 2025. The period that gave us GPT-4, Claude 3, Gemini. Right. That whole era was just defined by the mantra: bigger models, bigger clusters. Everything. And the data is now showing we're hitting diminishing returns on that. So all that massive investment. Yeah. You know, just throwing more data and more compute at the existing architecture, it's just

02:16

not giving that proportional leap in capability anymore. So the architecture itself is the bottleneck. I love the analogy in the sources, the Lego blocks. Yeah, the Lego blocks. We're not just stacking more blocks of data. To make a real leap, we need to change the design of the blocks themselves. That's it. Exactly. These current transformer models, I mean, they're brilliant at pattern recognition. Incredible. But they lack the architectural basis for complex emergent reasoning, that spontaneous

02:45

critical thinking. It's almost like they have perfect memory, but they can't really have a novel thought. That's a great way to put it. So what's needed is entirely new fundamental research. And the sources point to three areas SSI is focusing on for that next leap. Right. They need new learning architectures, first of all, then better algorithms and safety frameworks that are baked in from the start. Not just bolted

03:08

on at the end. Not bolted on. And finally, fundamentally new forms of internal reasoning, sort of like how humans connect ideas that seem unrelated. That sounds like a... I mean, a profound technological risk. Yeah. Why is Sutskever so convinced that just sheer scale won't get us to superintelligence? It's deeper than just, you know, the limitations of the attention mechanism. It's about that mechanism's reliance on correlation, not causation. Oh, OK.

03:38

The current models, they struggle to model the real world in a continuous, causal way. You can scale it infinitely and it'll still make weird factual errors because it's built on probability, not logic. So SSI is rejecting the industry standard. Completely. They're operating like a secret research lab. The model is Bell Labs from the mid-20th century. And they turned down an acquisition offer from Meta. They did, which signals they're

04:03

serious about this long-game approach. They are deliberately stepping out of the race to just build bigger models. It takes some serious conviction to walk away from that compute race. Yeah. The scale we're talking about is almost... it's hard to comprehend. Whoa. Yeah. Imagine having access to the compute to scale up to a billion queries, only to realize that money is better spent on a totally different approach. That decision. It's hard to wrap your head around. It really

04:29

is. It suggests the financial incentives just don't align with the scientific goal anymore. So the core question driving him is, is the future in massive scale or in foundational research? And for him, it's foundational research. It's about novel algorithms over pure size. That makes perfect sense. Okay, let's shift gears now from

04:49

the theoretical architecture of tomorrow to what you can use right now. Absolutely. While the titans fight over architecture, you can get way better outputs today. There was this list of 21 idea hacks for ChatGPT, but one technique really stood out. And this is a key technique that, what, 99% of users are missing? Pretty much. It's called prompt distillation. Okay, so it's the process of taking your huge, complex, sometimes rambling prompts. We've all written them. Five paragraphs

05:16

of context. Exactly. And you ruthlessly refine them down into these hyper-focused instruction sets. And that refinement improves the output because it just removes all the noise. It forces the model into specific constraints. That's it. So instead of typing, "Write me a blog post about LLMs in the stock market, make it optimistic, but with risks." Which is so vague. So vague. You distill it to: Role: financial analyst. Task: analyze LLM impact on NASDAQ Q futures.

05:44

Output: three data-supported forecast points. Constraint: 350 words. Tone: measured. Wow. That is immediately applicable. You define the role, the task, the exact constraints. And it's free discipline. We also saw some really interesting viral hacks in the creative space. You mean the ability to pull the exact prompt from an image online? Yes. To literally steal a viral ad style instantly. That feels like a crucial tool for

06:11

marketers now. It is. The image-to-text models are getting so good they can reverse engineer the style, not just the objects. You see a look you like, you can grab the blueprint. Saves months of testing. Huge accelerant. And we can't forget the high bar set on the technical side, like that viral YC demo from Karpathy on... building apps just by prompting. Still a classic. A masterclass in optimization. It's remarkable how fast these models are learning, too. Which brings us to

06:35

the benchmarks. They're getting scary. I saw this one. Claude Opus 4.5. It took Anthropic's real performance engineer take-home exam. The actual exam they give to human applicants. And it beat every single human who ever applied for the job. Not just passed. Surpassed them. That is genuinely staggering. Yeah. But I do have to wonder, is a take-home exam really comparable to, you know, dynamic real-world engineering? That's the core tension, right? The models are

07:04

optimized for benchmarks like exams. The performance is real, but it doesn't fully answer the question of their skill with totally unstructured corporate problems. Right. It strengthens the scale argument for now, but Sutskever would say that gap is still there. Exactly. And the competition in the rankings is just intense. Gemini 3, Claude 4.5, GPT-5.1, Grok 4.1. They're leapfrogging each other every week. And then there's the pure spectacle, the Musk challenge. Oh, yeah. Grok

07:30

5 is slated to play against Faker and T1. The League of Legends world champions. That is the ultimate man-versus-machine test in a super complex environment. It's going to be a wild thing to watch. But I have to admit, on a personal note, I still wrestle with prompt drift myself. Oh, for sure. When I'm trying those complex hacks, just maintaining consistency session to session is tough. It takes real discipline to stay distilled.
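
The distilled prompt format described earlier in the episode (role, task, output, constraint, tone) can be kept as a reusable template, which is one way to limit drift between sessions. What follows is a minimal Python sketch and not something taken from the episode or its sources: the `DistilledPrompt` class and its `render` method are hypothetical names invented for illustration, and the field values are the ones quoted by the hosts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistilledPrompt:
    """Hypothetical container for the five fields named in the episode."""

    role: str
    task: str
    output: str
    constraint: str
    tone: str

    def render(self) -> str:
        # Emit one labeled line per field, always in the same order,
        # so the same structure is sent to the model every session.
        fields = [
            ("Role", self.role),
            ("Task", self.task),
            ("Output", self.output),
            ("Constraint", self.constraint),
            ("Tone", self.tone),
        ]
        return "\n".join(f"{label}: {value}" for label, value in fields)


# Values quoted from the episode's example prompt.
prompt = DistilledPrompt(
    role="Financial analyst",
    task="Analyze LLM impact on NASDAQ Q futures",
    output="Three data-supported forecast points",
    constraint="350 words",
    tone="Measured",
)

print(prompt.render())
```

Pasting the rendered text into a chat window, or sending it through whatever client you already use, is left out on purpose; the sketch only covers assembling the prompt. Freezing the dataclass is a design choice that prevents fields from being edited ad hoc mid-session.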

07:55

It absolutely requires vigilance. Which brings us, I think, to the serious ethical and legal backdrop here. Because the human consequences are becoming very clear. We saw a mention of a really difficult legal filing about accountability. We did. And we have to note, just neutrally, this ongoing legal matter with OpenAI. Their defense strategy regarding the tragic death of a 16-year-old who died by suicide after manipulating ChatGPT. They blame the user. They

08:21

blame the user. And it just raises these profound, immediate questions about model control and responsibility. The gap between capability and liability is huge. The pace of capability is so fast, the legal guardrails are moving so slow. Exactly. And meanwhile, the investment keeps flooding in. We saw that Carles Reina's Baobab Ventures raised, what, $12.9 million for robotics startups? The money is still flowing, even with all the ethical questions.

08:47

Right. So considering Claude's engineering skills on one hand and these serious legal issues on the other, how quickly is the AI ethics conversation really evolving? Capability is outpacing the ethical and legal guardrails dramatically. The consequences of that gap are no longer hypothetical. That feels like the defining tension of this year. It really does. OK, let's pivot to our final segment. This massive government response to all of this. The U.S. Genesis Mission.

09:18

This is a geopolitical development we absolutely have to focus on. The Genesis Mission is the code name for a new executive order. And they're framing it as a scientific Manhattan Project. They are. The goal is incredibly ambitious. Use AI to double America's scientific productivity, fast. So national security through scientific dominance. What are the key components they're building to do that? The heart of it is the American Science and Security Platform. It's led by Energy

09:42

Secretary Chris Wright. And it's leveraging the DOE's supercomputers and quantum processors, the engine. Right. But the truly game-changing part is the operational layer. It's almost sci-fi. The robotic labs. Fully AI-controlled robotic labs. These aren't just automated systems. These labs are designed to plan, run, and analyze their own experiments based on the AI's hypotheses. The goal is to remove the human bottleneck from discovery. Totally. So these labs could run thousands

10:12

of experiments overnight to find, say, a new battery chemistry, all without a human touching anything. And they're targeting these grand challenges like clean energy and advanced materials. And the fuel for it all is data, over $200 billion worth of secure proprietary data sets. So tell us more about those data sets. What kind of knowledge is that? Why was it unavailable before? This is the really high-value stuff. Government-held data, often classified research from national

10:38

labs, defense agencies. It covers everything from climate modeling to material science. It was all siloed before because of security concerns. But Genesis is consolidating it all into one secure platform. This isn't just about making research faster. This is about establishing national control over a strategic knowledge base. And that security extends to manufacturing. They're building digital twins. Full simulations. Of

11:02

complex supply chains and factories, so they can model changes in real time and protect infrastructure. The whole effort is being coordinated at a very high level. Michael Kratsios, former U.S. CTO, is leading it. And they're building on partnerships with OpenAI, Google, Palantir. All the big players. The timeline is what gets me. It is incredibly aggressive. It underscores the urgency. They gave themselves 60 days to list more than 20

11:28

grand challenges to attack first. 60 days to define the research agenda for a $200 billion project. It's rapid. Then 90 days to inventory all the compute resources in the country, public and private. And then, and this is the big one, 270 days. Less than a year. To prove it works with one real scientific use case. They are moving at wartime speed. So what's the core existential motivation behind building this massive proprietary government platform instead of just relying on

11:59

the private sector? The U.S. is securing its scientific future by establishing proprietary control over the most valuable data, compute, and research infrastructure. So we've covered two massive, fundamentally different pivots today. On one hand, you have this deep, quiet research shift away from scaling led by Sutskever. A kind of foundational revolution. And on the other, you have this monumental government mobilization. The Genesis Mission, a state-sponsored sprint

12:27

to industrialize science itself. And both of these shifts show that AI progress is moving way beyond simple metrics like parameter count. It's an exciting and, frankly, a little unnerving time to be paying attention. Absolutely. The age of building bigger is giving way to the age of building smarter. Whether that's through pure research or massive state-directed mobilization. So here's a final thought for you to chew on. Which strategy is more likely to yield the next

12:52

truly big breakthrough? Is it the focused, quiet Bell Labs approach of SSI, which risks falling behind on compute? Or is it the state-sponsored, high-resource Manhattan Project of the Genesis Mission, which, you know, risks being too rigid and top-down? Something to think about. And while you think about that, definitely go try some of those prompt distillation hacks. That skill alone will boost your productivity today. And keep an eye on these geopolitical investments.

13:20

They're going to define global science for the next decade. Thank you for sharing your sources with us for this deep dive. We'll be tracking all of it. Until next time. Stay curious.

Transcript source: Provided by creator in RSS feed