🎙️ EP 109: GPT-5 Just Helped Prove a Quantum Theorem + DeepSeek’s Big AI Hack

00:00

The contrast in the AI world this week is, well, it's staggering. Yeah. On one side, you have these large language models suddenly cutting their operational costs in half, just like that. Right. And at the exact same time, those same models, now more efficient, are helping solve conceptual problems, problems that stumped, you know, the world's leading quantum computing minds. It's kind of wild. It really is. Welcome to the

00:25

Deep Dive. You've shared a stack of sources this week, and they really prove AI is getting radically smarter and much more economically viable, both at once. This feels like the moment the ceiling gets raised, you know, on both capacity and capability. Our mission today is pretty clear, then. We're going to unpack DeepSeek's cost -cutting secret, this sparse attention thing, and figure out why it actually matters for, say, API prices. Okay. Then we're diving into the conceptual frontier.

00:51

We'll look at how GPT -5 actually provided a genuine breakthrough in quantum mechanics. That's the Aronson story, right? Fascinating stuff. Exactly. And finally, we'll hit those critical shifts happening across the industry. Copyright fights, talent moving around, lots going on. All right, let's do it. Where do we start? The efficiency angle. Yeah, let's start with the money problem. It's fundamental to scaling AI, isn't it? Training, running these huge models.

01:16

It's hitting a wall. Totally. And in the U .S., the major labs, you know, they often just throw more hardware at it, more NVIDIA GPUs, especially when dealing with long prompts, long conversations. That's the brute force method. Yeah. Yeah. More compute, more cost. Simple as that. Pretty much. But DeepSeek, being based in China, operates under different... let's say, resource constraints. They were kind of forced to find a smarter way to scale up. A different path. And they found

01:44

a massive one. They just dropped this model, DeepSeek V3 .2 XP, and the headline here is huge. It cuts our operational costs by 50%. 50 % without losing performance. That's what they claim. No loss in quality compared to their previous models, which is, frankly, enormous news for the whole field. Okay, that is genuinely massive. So what's the technical trick? What lever do they pull for that kind of efficiency gain? The secret, or maybe the not -so -secret -anymore secret,

02:13

is sparse attention. Sparse attention, okay. Yeah. And to get why it's such a big deal, you've got to remember how the original LLM architecture works, that 2017 transformer model. Right, the foundation for most of this stuff. Exactly. It uses this process where basically every single word looks at every other single word in the whole sequence. It compares everything to everything. Like that analogy of Lego blocks. Yeah. Every block has to talk to every other block to figure

02:38

out the structure. That's a good way to put it. And it works fine for short bits of text. But imagine a document with, say, 10 ,000 words. Uh -oh. Yeah, that comparison process just explodes. It hits what's called quadratic staling. Basically, if you double the length of the input, the computation cost goes up by four times. It gets exponentially slower and way more expensive as your inputs

03:00

get longer. Which explains why trying to summarize a really long document with AI can feel sluggish and why it often costs more credits or tokens.

03:09

That quadratic thing. is the bottleneck it's the economic and technical bottleneck absolutely now sparse attention which deep seek is using here it completely changes that process how so well instead of comparing every word to every other word it's smarter it's an llm process that basically picks out only the keywords that actually matter for the comparison at that moment so it skips the noise yeah focuses on the relevant connections precisely it intelligently ignores

03:36

the irrelevant stuff And how did they engineer that? How do you make the model know which words to skip? They built this tiny specialized system. They call it the lightning indexer. Lightning indexer, okay. This little indexer helps the main model prioritize which connections are important, which words need to talk to which other words, and that shifts the scaling away from that awful dollar. Towards something more manageable, like linear. Closer to linear, yeah. Much, much more

04:02

manageable. And that's what suddenly makes really long context economically viable. And they published benchmarks showing it holds up like performance is still good compared to their old dense model V3 .1 Terminus. Yep. They claim no performance drop, but half the compute cost. OK, but you mentioned the original Transformers from 2017. Is sparse attention brand new science? That's the interesting part. Not really. OpenAI was actually pioneering sparse transformers back

04:28

in 2019. And Google released something similar called Reformer in 2020. So the idea has been around for years. Exactly. But DeepSeek seems to be the first major lab to really openly publish their specific implementation, the results, the cost savings. They kind of put it all out there, made the tech public in a way. And the impact is already happening. Oh, yeah. API prices for handling long inputs, they've already been slashed

04:52

up to 50 % in some cases. And this matters for you, the listener, because systems like, say, ChatGPT, they often still reprocess all the previous words in your conversation every time you add something new. Right, which is why those chat sessions can feel like they're slowing down the longer they go on. It's that silent tax on interaction length. Sparse attention helps fix that. Okay,

05:12

so here's the probing question then. If this tech is so good at cutting costs, and it's not exactly brand new, Why has OpenAI been so quiet about using it, or if they're using it, in GPT -4 or GPT -5? That's a really good question. And it hints at maybe some economic friction, you could say. While the tech is proven, fully adopting it might mean rethinking or even partially abandoning massive investments already made in GPU clusters designed for the old, dense way

05:40

of doing things. So yeah, efficiency remains this kind of hidden scaling constraint for all the big players. Okay, so we've tackled the money problem, the efficiency leap. But what about the actual brainpower, the capability side? Right. Let's talk about Scott Aronson, the quantum computing legend. We're shifting gears now from saving money to actual conceptual breakthroughs. Yeah, this is where it gets really, really interesting for me. Aronson was deep into this notoriously

06:03

tricky quantum proof. It concerns something called QMA. QMA. OK, for those of us not deep in complexity theory. What is that in simple terms? Simple terms. OK, think of it like this. You know, MP problems, problems where if someone gives you a solution, you can check it quickly on a regular computer. QMA or quantum Merlin Arthur is basically the quantum version of that. It deals with proofs that need a quantum computer to verify quickly. Got it. Quantum MP. Kind of. Kind of. Yeah. And

06:35

Aronson was stuck on a specific part. He was trying to prove that a new method he found for amplifying the certainty of these QMA proofs was truly optimal. He couldn't quite nail down the perfect mathematical verifier function. So a technical roadblock in his own mathematical work. A human stuck point. Exactly. And so he turned to GPT -5 thinking for help. Just ask the AI. Yep. And what happened next is pretty remarkable, especially given who Aronson is.

07:01

Apparently, the model gave a couple of unhelpful ideas at first. Okay. Typical AI sometimes. Right. But Aronson gave it some course correction, nudged it a bit, and then GPT -5 suggested a specific mathematical function. And this function? Mm -hmm. It worked. It broke his mental block completely. And the key thing here is Aronson's reaction, right? What did he say about it? He called the suggestion non -obvious and genuinely useful. Not obvious. Yeah. And he even added this quote,

07:28

which I think just says it all. If a grad student had given it to me, I'd have called it clever. Wow. OK, clever. Yeah. From Scott Aronson about an AI suggestion on quantum complexity theory. Right. That means the AI didn't just compute something he asked for. It didn't just run through known patterns. It actually found a novel step, an elegant shortcut, maybe something a top human mind working on the problem had missed. Whoa.

07:53

I mean, just imagine that AI actually contributing a genuinely clever step to advanced scientific theory. That feels. different. It feels very different. This is probably the first really clear example we've seen of this kind of thing, but you can bet it's going to be the first of thousands. It points towards true AI co -creation and science. Beat. So does a story like this confirm that AI is moving beyond just simulation, beyond pattern matching, towards genuine co -creation

08:21

in really high level research? I think it strongly suggests that, yeah, AI is becoming an invaluable partner for generating novel, non -obvious research insights. Okay, let's pivot then. Broader industry shift. Application adoption. Maybe some friction points. Quickfire. Let's do it. First up, institutional adoption seems to be accelerating like crazy. USC just gave full access to ChatGPT to all students, staff, faculty. Big deal. Apparently $1 .5 million

08:50

for one year. That's scale. That is huge scale. And while adoption like that speeds up, the legal fights are also heating up. No surprise there, really, especially around content. OpenAI launched Sora 2. Yeah, the video app. Yeah. Makes those short. like 10 second clips. They look pretty amazing. They do. But here's the controversy, the big friction point. Sources are saying it uses copyrighted material unless the owners actively opt out. Ah, the opt out model. That flips the

09:17

burden entirely, doesn't it? Completely. It puts the massive job of protecting content onto the creators, not the AI company scraping the data. You're kind of presumed in unless you fight to get out. Guilty until proven innocent, almost. You can see it that way. And it forces big media companies to constantly police what the AI is learning from. We already saw Disney apparently bail on that kind of arrangement. Yeah, I saw that. This feels like it's going to define copyright

09:42

law for the next decade. I think so, too. OK, shifting from consumption. to creation tools. Anthropic put out an important paper making a distinction between context engineering and prompt engineering. Okay. That sounds useful. We hear prompt engineering all the time, but context engineering, what's the difference they're highlighting? Good question. Because honestly, I still wrestle with prompt drift myself sometimes. Yeah, me too. You spend ages crafting this perfect persona

10:11

or instruction set for the AI. And then like two turns later, it's completely forgotten and gone off track. Is context engineering meant to fix that? That's basically the pain point it addresses, yeah. Yeah. So prompt engineering is just what you ask the model, the actual question or command. Context engineering is about building the stuff around the prompt before you even ask. It's like setting up the instruction manual,

10:33

the environment, the guardrails for the AI. Ensuring the model really understands its role, its constraints, the background, before it even starts thinking about your specific question. So it's like setting the stage properly, not just delivering the lines. Exactly. And Anthropic argues pretty convincingly that it leads to much more consistent, reliable AI behavior. Less drift. Makes sense. Okay, what

10:59

else? Infrastructure. Yeah, quick one. Google Drive is now using AI, apparently, to spot ransomware attacks and help users quickly restore files that got scrambled. Oh, that's practical. Security becoming a core AI application. Good to see. Definitely. And then there's the talent shift. This was noteworthy. Sources mentioned about 20 top AI researchers have left the big established labs, OpenAI, Google, Meta. 20? Wow, where'd they go? To start a new company together. A new

11:26

AI venture. That is a massive brain drain from the incumbents. 20 top people. Yeah. It shows incredible confidence in the market for specialized new ventures, right? Especially considering how insanely expensive it is to start a frontier AI lab from scratch these days. For sure. Funding spotlight Assort Health. They secured $76 million for their voice AI platform. The scale numbers were impressive. 14 languages handled 42 million patients. Eightfold revenue growth. Shows health

11:52

AI is scaling fast. Voice AI and healthcare, big area. And finally, tool update. Right. Cursor, the AI -first code editor, now supports controlling your browser, grabbing screenshots, debugging client -side issues, all integrated with Cloud Sonnet 4 .5, apparently, which is making the developer workflow smoother. Interesting. Tools getting more integrated. So that talent exodus.

12:14

Yeah. It does raise a question, doesn't it? What does this movement of top talent away from the big labs really suggest about where future AI innovation might be heading? Yeah, that's a good point. I guess it suggests that frontier innovation is increasingly driven by these focused, specialized startups, maybe. More agility, perhaps. Seems plausible. Yeah. Okay, let's try and recap the big ideas from this deep dive. Sounds good. So we saw two really massive threads changing the

12:40

AI. landscape just this week it feels like first the whole economic reality shifted deep seek basically proved you could potentially have your operational costs using these sparse attention techniques like their lightning indexer it tackles that critical quadratic scaling problem right making ai cheaper and feasible for much larger tasks and second the conceptual ceiling got pushed way higher. Yeah, the Aronson story. GPT -5 providing that genuinely clever, non -obvious step for

13:09

a quantum proof. That confirms we're really moving into an era of AI co -authorship, even in really hard science. Definitely. And then layered on top of that, you have all these... critical industry shifts happening fast. Yeah, the rapid adoption, like at USC, the looming copyright battles, especially over that Sora 2 opt -out model. Right, and that talent fragmentation, top researchers leaving

13:30

big labs for nimbler startups. So the whole AI landscape, it feels like it's simultaneously maturing, becoming more efficient, more stable, and accelerating, getting dramatically more capable, tackling harder problems. It's kind of doing both at once. Yeah, that's a good way to put it, maturing and accelerating. So for you listening, maybe explore one of those quick hits we mentioned. Good idea. You can look into the details of Anthropic's context engineering paper, see if it helps your

13:56

own AI interactions. Or maybe dig into the implications of that SORA2 opt -out model for copyright and creative work. Yeah, lots to dig into. But here's the final, maybe provocative thought we want to leave you with. Okay. If AI can demonstrably help break conceptual roadblocks in something as complex as quantum theory today. What previously intractable scientific problem, maybe in medicine or material science or fundamental physics, what's it going to tackle next year? That is something

14:27

to think about. Where does this capability actually lead us? A big question. Well, thank you for sharing your sources with us for this deep dive. Always fascinating.

Transcript source: Provided by creator in RSS feed: download file

Episode description

Transcript