The Unknown Tech that Enables AGI: Claude Mythos and NVIDIA's Next Generation

⁠¶ The Rise of Claude Mythos

00:00

Josh: A couple of weeks ago, we covered the Claude Mythos release, the model that found decade-old security flaws overnight and scared the hell out of basically anyone who is following the AI story. So much so that the federal government is involved. But the part that we didn't get into is the backend that powered this model. Mythos was built on a chip from March 2024 that Jensen pulled out of his pocket on stage at GTC, which was the Blackwell chip.

00:22

Josh: It had 208 billion transistors. Everyone treated it like the future had arrived. And yet it took two years of fabrication for us to get the first manifestation of that, which is Claude Mythos. 24 months from keynote to a working model. It happened with Hopper, it happened again with Blackwell, and it's going to happen again with our future models.

00:39

Josh: But the difference is we have a series of future models that exist today that we can kind of map out to where we're going to be heading, based on this trajectory that we've seen with the previous chips. And it's pretty awe-inspiring to see where we are going to go, considering there are three generations of chips that have already been announced since Blackwell. We have Vera Rubin, Rubin Ultra, and Feynman.

00:58

Josh: Each one, many multiples more powerful than the last. And when you look at what Blackwell already produced in the very first version, it gets impossible to imagine a world where we don't reach AGI on hardware that's already been designed. Everything that's been announced that is going into production almost certainly is going to produce models indistinguishable from AGI. At least that's what it seems like on a surface level?

⁠¶ The Power of Hardware

01:18

Ejaaz: Yeah, so the story here in a single sentence is AGI-like AI models are already here. We just haven't distributed them because we haven't powered up the GPUs that enable them. So everyone is obsessed with AI models. We talk about our favorite models, how we prompt them, how intelligent they are. But very few people are talking about the fact that the hardware is the thing that powers these things. They train these things. They inference these things.

01:43

Ejaaz: And it's still about 70% of the influence on how intelligent your model is. And the prime example, the most recent example of that, has been Anthropic's Mythos release, right? You just mentioned it. It's discovered a bunch of different cybersecurity flaws. It is this all-powerful thing that the governments around the world, including the U.S.

02:00

Ejaaz: Government, Federal Reserve, they're sharing meetings with the top banks to talk about the craziness of this model and how we must prepare. There's a lot of doomer news out there about the future. Little do you know that this was powered by a GPU, or rather trained by a GPU, that was built 20 months ago. So we're talking about almost two years ago. It's called Blackwell. And I want to give you guys an idea of the timeline of what this looked like.

02:24

Ejaaz: So in March 2024, at NVIDIA GTC, which is like their developer conference, Jensen Huang comes on stage and he presents this gargantuan scrap of metal. It looks very pretty, by the way. And he goes, this is Blackwell, GB200, GB300, a brand new GPU. We can train frontier models on it. Everyone gets so excited. Their stock price absolutely ascends, right? The thing is, people couldn't get their hands on this until exactly a year later.

02:52

Ejaaz: So to give you guys an idea of the timeline, he announces it in March 2024. Then by the middle of the year, they discover there's a bit of a design flaw and they amend that. And then by the end of 2024, early 2025, they start shipping these units of Blackwell GPUs out to the top frontier AI labs. But there's an important nuance here, which is it's just the GPUs sitting in a data center.

03:15

Ejaaz: They aren't actually powered up. It's not until 6 to 12 months after that fact that these GPUs were finally powered up and used to train models, which is why we now start to see these new AGI-like models like OpenAI SPUD and Claude Mythos come to fruition. So the point is, there is a long gap between the frontier GPUs being announced and rolled out and them actually being powered to train the models.
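The lag described here can be checked with some rough month arithmetic. The sketch below is illustrative only: the March 2024 announcement and the 6 to 12 month power-up window come from the conversation, while pinning "end of 2024, early 2025" to January 2025 is an assumption, and the helper names `add_months` and `months_between` are made up for this example.

```python
# Rough timeline arithmetic for the Blackwell rollout as described in the episode.
# Dates are approximations of what the hosts say, not official NVIDIA dates.

def add_months(year: int, month: int, delta: int) -> tuple[int, int]:
    """Return (year, month) shifted forward by `delta` months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1

def months_between(start: tuple[int, int], end: tuple[int, int]) -> int:
    """Whole months from `start` to `end`, both given as (year, month)."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])

announced = (2024, 3)   # GTC keynote, March 2024 (from the episode)
shipping = (2025, 1)    # "end of 2024, early 2025"; January 2025 is an assumption

# The episode says GPUs were powered up 6 to 12 months after shipping.
powered_early = add_months(*shipping, 6)
powered_late = add_months(*shipping, 12)

lag_low = months_between(announced, powered_early)
lag_high = months_between(announced, powered_late)
print(f"Keynote to powered-up training: roughly {lag_low} to {lag_high} months")
```

Under these assumptions the keynote-to-training gap lands in the same range as the "20 months" and "two years" figures the hosts quote.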

03:40

Ejaaz: We've talked about Elon Musk and xAI a lot on this show before. They actually have the largest arsenal of these Blackwell GPUs. They bought about a million of them. The crazy part about this now is there are not one, not two, but three new NVIDIA GPU models that have been announced at the recent NVIDIA GTC.

⁠¶ The Evolution of AI Models

03:57

Ejaaz: So there is a major lag between frontier hardware and the new AI models that are being released. And people don't understand this. And we want to tell you the story.

Josh: Just remember GPT-4, how long ago that was and how that felt like the huge, most pivotal model that OpenAI ever released. I mean, that was the big one right after ChatGPT came out. That was trained using the Hopper chips. You know, the most recent model.

04:20

Ejaaz: Hopper's a word I haven't heard in a while, Josh.

Josh: Yeah, well, you know, GPT-5.4, the most recent model that we're using every single day on ChatGPT, that was also trained on Hopper chips. The same chips are training models from GPT-4 to GPT-5.4. And it's a testament to how the efficiency gains of software can actually increase the throughput of hardware.

04:42

Josh: And I think I want to use that as an example because what we just got recently with Mythos through Anthropic, that seems to be the first real implementation of a true Blackwell model. And rumors are that SPUD, the new OpenAI model, is going to kind of be the same in terms of power, as it relates to the first Blackwell model.

05:01

Josh: And even if we don't actually iterate on the hardware, the amount of progress we're going to get from Blackwell models alone makes it difficult to imagine it doesn't become some sort of AGI, right?

05:13

Josh: It's like when you think about the difference in intelligence between GPT-4 and GPT-5.4 and how far we've come. That, applied to Blackwell at this new scale, seems crazy. But that's not even the crazy part, because we have an entire roadmap of these three generations of chips that are coming that we can very clearly map to the gains that we're going to see. And I think that's when things

05:34

Josh: get, like, particularly disturbing, because on the chart that we're looking at on screen now, we have Blackwell. That's where we are right now. Blackwell is a significant improvement over the previous model. But then we have Vera Rubin, which jumps from 20 petaflops to 50 petaflops. That's a two-and-a-half to five times multiple on the compute. Then we have Rubin Ultra, which is scheduled for the second half of 2027.

⁠¶ Accelerating Towards AGI

05:56

Josh: That is a 14 times multiple. And then we have Feynman in 2028, which is an estimated 30 to 50 times multiple on the current chip stack that we have today, assuming that we get no software progress at all. And what we saw with the Hopper chips is that we got a tremendous amount of progress just from software.

06:18

Josh: So when you combine this 30 to 50 times multiple with maybe another 100 times multiple on software, if we make another breakthrough, we're looking at some pretty insane improvements here that are really hard to wrap your head around.

Ejaaz: I want to point out that these improvements, these multiples that you just mentioned, are just on the speed and power of these hardware modules, right?

06:39

Ejaaz: So it's going to work 3x harder or 14x harder, but it's also going to cost you a lot less to be able to train the same type of intelligence or model. So the intelligence per density, which is a unit that we completely made up, and we don't know if it exists, but it somehow rhymes in my head at least, is improving and it's going to be cheaper with each successive model.
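To make the multiples in this stretch of the conversation concrete, here is a minimal sketch that combines the quoted hardware multiples with Josh's hypothetical 100x software multiple. The figures are the ones spoken in the episode, not verified specifications, and treating hardware and software gains as simply multiplicative is an assumption made for illustration.

```python
# Compute multiples relative to Blackwell, as quoted in the episode.
# These are the hosts' figures, not verified NVIDIA specifications.
HARDWARE_MULTIPLE = {
    "Vera Rubin": (2.5, 5.0),
    "Rubin Ultra": (14.0, 14.0),
    "Feynman": (30.0, 50.0),
}

# Josh's hypothetical "maybe another 100 times multiple on software".
SOFTWARE_MULTIPLE = 100.0

def combined(hw_range: tuple[float, float], software: float) -> tuple[float, float]:
    """Combine a (low, high) hardware multiple with a software multiple.

    Assumes the two gains multiply, which is a simplification.
    """
    low, high = hw_range
    return low * software, high * software

# The 20 to 50 petaflops jump quoted for Vera Rubin, expressed as a ratio.
petaflops_ratio = 50 / 20

for chip, hw in HARDWARE_MULTIPLE.items():
    low, high = combined(hw, SOFTWARE_MULTIPLE)
    print(f"{chip}: {hw[0]:g}-{hw[1]:g}x hardware, {low:g}-{high:g}x with software")
```

Under these assumptions, the 30 to 50x Feynman figure combined with a 100x software gain would be a 3,000 to 5,000x multiple over today's Blackwell baseline.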

07:00

Ejaaz: But if you want to get a bit of context as to what that looks like in terms of the models that you use today and what it's going to look like tomorrow, we have this other table here, which kind of maps it out. So with Blackwell today, you get about a 2 to 3x more intelligent, crazier model, right? That's what Claude Mythos is supposedly meant to be. It's, like, a larger size. It's trained on these Blackwells.

07:21

Ejaaz: You're going to see a bunch of similar models come out from OpenAI and xAI over the next couple of months.

Josh: Just to pause you there, these are already models deemed too dangerous to release to the public.

Ejaaz: Yes. Just some emergency meetings literally being called by the Federal Reserve chair, top banks. Actually, I read something yesterday that the NSA is using or conferring or re-engaged with Anthropic, as well as the Pentagon and the U.S.

07:46

Ejaaz: Defense Department, after banning and blacklisting Anthropic because it's so powerful.

Josh: And that's where we are today.

Ejaaz: That's today. So that's right here. 2026, 2 to 3x, right? Yeah, crazy. Now, you might notice that by next year, we have a larger multiple on the original multiple. By next year, we're going to have a 10 to 15x improvement purely through Vera Rubin GPUs. Now, I must emphasize...

08:10

Ejaaz: This does not include post-training. This doesn't include all the fine, fancy techniques that AI labs themselves will implement to make a smart model. This is just the hardware. It's like buying the hardware and training a model today versus next year: you're going to get a 10 to 15x more intelligent model. But it gets even scarier. 2028, 30 to 50x.

08:31

Ejaaz: 2029, 100 to 200x. Now, I haven't seen these multiples in any other industry for any kind of performance or hardware improvement. So I can't wrap my head around this, because it looks like just a few small numbers that are getting larger, but these are multiples of its predecessor, which means that we're probably going to get AGI, honestly, by the start of next year. And they're trained on hardware that currently exists and is rolling out.

08:59

Ejaaz: I don't know. I'm just kind of scared reading all of this, to be honest, because what happens if we have universal access to this? There's going to be a load of malicious actors who can use these models for various different things. But also, I don't know what these models are going to be capable of. They're going to be so much smarter than humans themselves.

09:13

Josh: The disturbing thing is that this technology is here. Like, it's no longer an engineering problem or a physics problem necessarily. It's just a matter of actually producing the thing and plugging it into an outlet and putting it online. And this is coming. Like, there are no novel breakthroughs required to make this a reality.

09:30

Josh: Now, what that looks like on the other side, I don't know, but I think it's safe to assume the velocity of improvement we're going to get is certainly not slowing down. It is starting to more closely resemble a vertical line than anything else.

⁠¶ Defining AGI and Its Implications

09:42

Josh: And I think it begs the question, like, at what point do we reach AGI and how do we even define that? Because I'm not sure we've spoken about that much on the show. But Ejaaz, when you say AGI, what do you mean by AGI? What would you be looking for to declare, okay, we have finally reached AGI?

Ejaaz: Okay, so this is like my own made-up definition, but it's what will make me go, okay, this is AGI.

10:06

Ejaaz: It would be a single AI model, not many, but a single AI model that advances the frontier of three key major industries autonomously. So I'll pick these industries as examples. Financial industry: it trades better than the average... sorry, than the best hedge fund or investor. It is able to make assessments better than any of the financial analysts, the top experts, et cetera, in that industry.

10:33

Ejaaz: In science, it has discovered a bunch of medical cures for some major diseases such as cancer, Alzheimer's, and stuff like that, that scientists, top scientists at their top level, could not figure out. It accelerates their research. And maybe one other industry that I can't think of right now. But it's when these models start doing things that the best of the best humans right now couldn't figure out themselves and couldn't have seen themselves.

10:57

Ejaaz: Do you have a similar definition or?

Josh: Yeah, I think that sounds right. And again, it's very fuzzy. Everyone kind of has their own custom definition of what they believe AGI is going to be. But for me, it's just AI that's smarter than the smartest human at pretty much any cognitive task that exists.

11:12

Josh: So you can go to this model and it will be better than anyone else who you can ask on planet Earth about anything. And the problem with models today is they're very spiky. Like, you can do this for code, probably, and it can code better than every human on Earth. But if you ask it, you know, a generalized question about something that you really know a lot about, there's a lot of times where it's not completely accurate or it will respond

11:33

Josh: as if it has the intelligence of a three-year-old. It fails the reasoning tests of a lot of simple things. It still feels like it's this very spiky entity. Once it is fully developed, once it is actually better at every cognitive task, and that includes physical things too, that includes understanding the physics of the real world, world models, that feels like AGI. And then artificial superintelligence, ASI, feels like

11:54

Josh: it is smarter than all humans combined. So it's like if we put all of our brains together, no matter how long we tried, we could never come up with the things that artificial superintelligence will come up with. And, I mean, will we get there using this chip architecture? Possibly. I'm seeing a 50x multiple, not including the software multiples.

12:13

Josh: And with those compounding on top of each other at the rate that we're moving, it seems like the only real constraint is going to be physical. It's going to be actually rolling out these models and powering them on.

Ejaaz: Well, another crazy thing is, I think a lot of people, including myself, would assume that with every chip upgrade, it's going to be more expensive, and it's going to be bigger.

12:35

Ejaaz: It's going to be clunkier, right? Like, the data centers are going to get bigger, it's going to be more expensive. I wish I had a chart to show this, but it's actually the complete inverse. And I'll give you some examples, some numbers to explain that, right? So a reasoning task that costs $1 on Blackwell costs $0.20 on Vera Rubin, which is rolling out as we speak or later this year.

12:57

Ejaaz: And it'll only cost $0.07 on Rubin Ultra, which starts to get released by the start of next year. So the cost is going down pretty massively. Now, for 2028, Jensen announced the Feynman GPU, right? A single rack of that, so we're talking about just a couple of those stacked on top of each other, will process more compute than was required to train GPT-4 that you mentioned earlier, Josh.
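The per-task cost figures Ejaaz quotes can be turned into reduction factors with a few lines of arithmetic. This is a minimal sketch using only the dollar amounts spoken in the episode, which are unverified; the function names are invented for this example.

```python
# Cost of the same reasoning task on each chip generation, as quoted in the episode.
# These are the hosts' illustrative numbers, not published pricing.
COST_PER_TASK = {
    "Blackwell": 1.00,
    "Vera Rubin": 0.20,
    "Rubin Ultra": 0.07,
}

def reduction_factor(chip: str, baseline: str = "Blackwell") -> float:
    """How many times cheaper `chip` is than `baseline` for the same task."""
    return COST_PER_TASK[baseline] / COST_PER_TASK[chip]

def savings_percent(chip: str, baseline: str = "Blackwell") -> float:
    """Percentage of the baseline cost saved by moving to `chip`."""
    return (1 - COST_PER_TASK[chip] / COST_PER_TASK[baseline]) * 100

for chip, cost in COST_PER_TASK.items():
    print(
        f"{chip}: ${cost:.2f} per task, "
        f"{reduction_factor(chip):.1f}x cheaper, "
        f"{savings_percent(chip):.0f}% saved vs Blackwell"
    )
```

On the quoted numbers, Vera Rubin is about 5x cheaper per task than Blackwell and Rubin Ultra about 14x cheaper.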

13:23

Ejaaz: So the point is, less is more, but somehow more powerful, but also somehow cheaper relative to the intelligence that you're building. And if you assume this intelligence is going to reach this ASI, AGI-like state, it's going to make you money as well. So you end up just having, I guess, I'm afraid to say this, but the best of both worlds. I don't know what humans are going to be doing, but it's great for AI.

13:47

Josh: Basically, yeah, there's no world in which things don't get better. And it feels like right now we're really just constrained by this compute power. There's this great meme that I saw online. It said Mythos is too powerful for public release, but the reality is that they're just completely out of compute and Anthropic can't actually supply the tokens required to give Mythos to the world. These optimizations, these cost structures...

14:07

Josh: Yeah, there it is, we've got it on screen now. Great meme, great meme. But these cost structures that are going to come from these new models are going to completely destroy that factor, at least for now, until whatever that next generation of model is that is so powerful that it's constraining GPUs. And the interesting thing is that OpenAI has the same exact thing going on. All these models are kind of converging

14:29

Josh: on the same spot, but they all seem to be compute constrained.

Ejaaz: I think what critics will push back on though, Josh, for everything that we've said so far is, okay, cool. You can buy these new hardware things, but why would you do that if you could just wait a few months or six months and buy the next thing? Jensen's just shipping out these products. He's making a load more money. It doesn't make sense. These things are depreciating assets.

14:52

Ejaaz: By the time you've bought the first one and you've ramped that up with power and are training your next model, there are already three other new chip architectures.

⁠¶ The GPU Market Dynamics

14:59

Ejaaz: And he would be right, that critic would be right, except that they're massively, massively wrong. And we have proof for that, right? GPUs have now become this anti-depreciation machine.

Josh: One of the most amazing things about this phenomenon, and it feels like a narrative violation, is the idea that the GPUs that were released three years ago are actually more valuable today than they were at the time they launched, which is a pretty bizarre idea.

15:25

Josh: We have this artifact on screen that shows a chart. An H100 from NVIDIA cost $30,000 when it launched in 2023.

15:32

Josh: At its peak, because of the scarcity, because everyone needs these things, it was selling for a four times multiple at $120,000 per H100. This is kind of outrageous. It was a little exorbitant. We don't need to be paying that much money. But now that they are old, they've not depreciated. There's much better hardware out there, yet they're still holding their price at $30,000. In fact, you can see a rebound that happens in late 2025 where the

15:56

Josh: cost of these H100 GPUs actually ticks upwards. And I think a lot of people, Michael Burry most famously, who is the guy behind The Big Short, created an entire short thesis around the idea that the depreciation schedule of these GPUs wasn't aggressive enough and they were actually going to lose their value, and therefore the market was going to deflate because the companies weren't marking these down properly.

16:19

Josh: The reality is that not only are they not going down, they're starting to trend back up, because the incremental cost for a token is so low with these and everyone's so desperate for compute that they're like, well, might as well spend some extra money, get the H100s and start generating inference tokens with them. It's this pretty amazing phenomenon that's happening.
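The H100 price points Josh cites can be set against a textbook depreciation schedule to show why the result looks like a "narrative violation". In the sketch below, the $30,000 launch price, $120,000 peak, and $30,000 current price are the episode's figures; the five-year straight-line useful life is a purely hypothetical accounting assumption chosen for illustration, not something stated in the conversation or attributed to any company.

```python
# H100 price points as quoted in the episode (not verified market data).
LAUNCH_PRICE = 30_000    # USD at launch in 2023
PEAK_PRICE = 120_000     # USD at the scarcity peak
CURRENT_PRICE = 30_000   # USD, "still holding their price"

# Hypothetical accounting assumptions, for illustration only.
USEFUL_LIFE_YEARS = 5
YEARS_ELAPSED = 3        # "released three years ago"

def straight_line_value(price: float, life_years: int, elapsed_years: int) -> float:
    """Book value under straight-line depreciation to zero."""
    remaining = max(life_years - elapsed_years, 0)
    return price * remaining / life_years

peak_multiple = PEAK_PRICE / LAUNCH_PRICE
book_value = straight_line_value(LAUNCH_PRICE, USEFUL_LIFE_YEARS, YEARS_ELAPSED)
premium_over_book = CURRENT_PRICE / book_value

print(f"Peak price was {peak_multiple:.1f}x the launch price")
print(f"Hypothetical book value after {YEARS_ELAPSED} years: ${book_value:,.0f}")
print(f"Quoted market price is {premium_over_book:.1f}x that book value")
```

Under this assumed schedule, a card that still sells at its launch price is trading at about 2.5x what a straight-line write-down would imply.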

16:36

Ejaaz: Yeah, so if you're wondering why this is happening, explicitly, it's that AI demand is growing faster than chip supply can expand. We don't have enough fabs or the manufacturing prowess or the energy grid to support creating and generating more GPUs to satiate the demand that we're seeing in AI across all these different industries, right? It's a very pervasive bit of technology. Now, the data that we're showing you on the screen right now isn't siloed to

17:01

Ejaaz: like a few research papers. This is happening in the market right now, and it's incredibly liquid. So a new phenomenon of companies in AI whose stocks have all skyrocketed are these things called neoclouds, right? Think of it as like AWS. They supply compute to train your AI models by setting up their own data centers, and they kind of provide it to you in a cloud or data center specific structure.

⁠¶ The Role of Neoclouds

17:26

Ejaaz: Examples would be CoreWeave, for example. The idea here is that for these data centers, or these GPU providers, 70% of the GPUs that they're running are old GPUs that we're showing you on our screen right now. And they're booked out, I'm not exaggerating, 6 to 12 months in advance.

17:44

Ejaaz: In fact, that's done through contracts, and the same providers renew the contracts three months before the contract needs to be renewed just to make sure that they get access to these older GPUs. So the point I'm trying to make, and you mentioned this just now, Josh, is all that matters is, can I get AI tokens generated to do the thing that my company needs or answer the prompt that I have?

18:06

Ejaaz: And if the answer is yes, and it's for a reasonable price, I'm down to go for that, because the value that you can build and earn on top of that is invaluable, right? They can have a large markup on that. So it makes sense that these assets are kind of in high demand. And to your earlier point, Michael J. Burry shorted the entire market saying that these are depreciating assets, and he got that completely wrong.

18:26

Ejaaz: And his thesis specifically was based on the idea that they can't train frontier models. And he's actually right. The older GPUs can't train frontier models. But what they are being used for is one thing very specifically, inference, which is, if someone has a question, how do I get them the answer? How do I process the prompt? That's what the older GPUs are being used for. And they're really damn good at it.

18:47

Ejaaz: And the reason why it's important and essential for AI labs specifically, who are training models and who you might think would want the expensive chips, is that they have a ton of inference. They use inference to even train the new models. So it's this new paradigm where all these old GPU architectures are being re-found or repurposed for this really important thing that is inference.

⁠¶ Inference and Its Importance

19:07

Ejaaz: So important context to understand if you're investing in some of these companies, for example.

Josh: Yeah. And why is it so valuable? Well, it's a testament to the software improvements, right? So we have those software efficiency improvements that we didn't have three years ago. So that same hardware generates a lot more value.

19:20

Josh: And if we scroll down to the value multiplier section of this artifact, it shows that the cost of chatbot inference in 2023 was $3 an hour, and now autonomous agents completing these complex tasks are $30 to $300 per hour. The value that you can charge for these tokens is significantly higher than it was in the past. And the amount of tokens that you're able to generate efficiently at higher quality is much higher as well.

19:45

Josh: So there are all these converging forces that are just making the market desperate for compute. Nobody has the compute that they want. And NVIDIA is trying to put it online as fast as they can, but it's not fast enough.

⁠¶ Future Prospects and Challenges

19:56

Josh: And I assume as we go through this, we're going to continue to see varying bottlenecks, and the efficiencies will move to where there are bottlenecks, which creates new bottlenecks. Right now we're seeing some convergence around CPUs, and CPUs seem like they're going to be hitting a shortage somewhat soon because, well, we're out of GPUs, let's move to CPUs. And it's this really interesting dynamic. But that is the idea

20:16

Josh: on this NVIDIA episode, or just the chip episode in general: that it is hard to imagine a world in which we don't reach AGI given the currently announced infrastructure. It doesn't require any breakthroughs. If NVIDIA just does what they announced on stage through Jensen Huang, through these next three chips, it is almost impossible to imagine what the world of intelligence is going

20:36

Josh: to look like. And I think what's important to understand is that Mythos is trained on a two-year-old chip and no one's really talking about that. So it blew my mind. Hopefully it blew yours as well, or you at least found it a little bit fascinating. And that is our episode today. Thank you guys so much for watching. We really appreciate it.

Ejaaz: And I know some of you are probably thinking, oh, there's a bunch of challenges

20:53

Ejaaz: here, and Josh actually just mentioned one of them, which is, like, you've got CPUs, we don't have enough energy, we don't have enough memory. And that's like another episode that we can get into. So all of those things, it's assumed, will be leveled out at some point. And we're going to see all those industries grow versus being constrained. People are throwing trillions of dollars into this industry. So all of those problems should theoretically be fixed.

21:18

Ejaaz: But rest assured, we will be the first show to cover it and give you those thoughts before it happens, by the way. And Intel is a sneaky one to get into. But we'll talk about that another time. Thank you so much for listening. If you are not subscribed to us, please subscribe. It helps us out massively. We are having banger weeks on YouTube, Spotify, Apple, and wherever you listen to us.

21:36

Ejaaz: Please rate us. Leave us a comment. We love hearing your feedback. There are like thousands of newbies that are listening to the show. Welcome. And also give us feedback about stuff that we may not be covering that you want to hear more of. We're always open to feedback. But until then, I guess we'll see you on the next one.

Transcript source: Provided by creator in RSS feed
