🎙️ EP 187: xAI’s Supercomputer Eats a City’s Power. DeepMind’s Cheap Spy Drops.

00:00

Imagine a single machine, just one. But this machine draws more electrical power than the entire city of San Francisco. Wow. And that's not, you know, science fiction. As of right now, January 19th, 2026, it is a physical reality humming away on the grid. That is the kind of stat that just stops you in your tracks. Yeah. I mean, we are not talking about a server farm anymore. We're talking about energy use on the scale of a, well, a small country. Welcome to

00:26

the Deep Dive. Today we're navigating a world of some pretty wild extremes. We've reached a point where the infrastructure for intelligence is becoming truly titanic. A literal city-sized brain. Exactly. And yet at the exact same moment, people are, what, fleeing to the countryside to knit sweaters. It's just, we have the biggest computers in history and also a massive spike in yarn sales. It's a fascinating dichotomy. It is. And we're going to unpack that tension.

00:52

We have a stack of reports today covering the activation of xAI's Colossus 2, this strange cultural thing called the analog backlash, and some really interesting research from DeepMind on how to keep these massive systems safe without going broke. Plus, on a much more practical level, we really need to talk about why copying and pasting from an AI still completely ruins your formatting. Yes. Let's start with the heavy metal,

01:17

the hardware. We're looking at these reports on xAI, Elon Musk's AI company, and this new cluster, Colossus 2. Cluster just feels like such a small word for what this actually is. I mean, to set the scene, xAI has just activated a one-gigawatt supercomputer. One gigawatt. Yeah. And it's purpose-built for one thing, training the Grok series of models. And just for context, that's roughly the output of a large nuclear

01:42

reactor, right? Exactly. They have built a machine that effectively requires a nuclear reactor's worth of juice just to turn on. And the timeline is just... while Colossus 1 took, what, about 122 days to bring online, Colossus 2 followed immediately after. Just bang, bang. And looking at the specs here, it's not even stopping at one gigawatt. Oh, no, that's just the starting line. The reports are saying it's already set to expand to one and a half gigawatts by April and then hit two

02:08

gigawatts soon after that. The logistics of that are, well, they're mind-bending. It has to sit on dedicated grid lines. It has its own on-site substations. And the heat, just imagine the thermodynamics of, what, 555,000 GPUs running at full tilt. Yeah, the cooling systems alone must be an engineering marvel. It's liquid cooling on just a ridiculous scale. And you mentioned the GPUs, 555,000 of them. The procurement number is just eye-watering. The cost for that hardware alone is estimated

02:37

at roughly $18 billion. $18 billion. That suddenly explains why xAI blew right past their original $15 billion funding goal. Yeah, if you want to play at this table, the buy-in is astronomical. They ended up raising a $20 billion Series E, backed by, you know, all the heavy hitters, Fidelity, Nvidia, Cisco. And the stated mission here, the philosophy behind spending all this money, is to understand the universe. They claim that to do that, you need the world's biggest brain.
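Show-notes aside: the per-GPU numbers implied by the figures quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch. It takes the episode's figures (one gigawatt, 555,000 GPUs, roughly $18 billion) at face value and assumes, for simplicity, that the whole power budget and the whole hardware cost are divided evenly across the GPUs, which folds cooling, networking, and everything else into the per-GPU number.

```python
# Figures as quoted in the episode; they are not independently verified here.
POWER_WATTS = 1e9           # one gigawatt, facility-level
GPU_COUNT = 555_000         # GPUs in the cluster
HARDWARE_COST_USD = 18e9    # estimated hardware cost, roughly $18 billion

# Simplifying assumption: divide the facility totals evenly across GPUs.
watts_per_gpu = POWER_WATTS / GPU_COUNT
cost_per_gpu = HARDWARE_COST_USD / GPU_COUNT

print(f"Power budget per GPU: {watts_per_gpu:,.0f} W")
print(f"Hardware cost per GPU: ${cost_per_gpu:,.0f}")
```

Under those assumptions the budget works out to about 1.8 kW and a little over $32,000 per GPU, all-in.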

03:05

Right. And practically speaking, this machine is what's going to power Grok 4, Grok Voice, and the upcoming Grok 5. Technologically, though, what stands out to me is that they've kind of hit a ceiling, but not on hardware. It's a software ceiling. Right. The reports say that the global batch sizes for training are now limited by optimization, not by compute. They effectively have enough hardware to do continuous trillion-parameter training. It's just there. It's brute force on

03:33

a celestial scale. Which, I have to ask: does building a brain the size of a city actually guarantee understanding? Or are we just building the most expensive pattern-matching machine in human history? That is the multibillion-dollar question, isn't it? But I guess when you have half a million GPUs, the strategy is pretty clear: that quantity might just have a quality all its own. The hope is that understanding just kind of emerges from the math if you make the math

03:56

big enough. OK, so while xAI is building the digital equivalent of a Dyson sphere, the rest of us, the human world, we seem to be reacting in a very, very different way. Yeah. Welcome to what some are calling the 2026 paradox. On one side, the digital economy is absolutely on fire. OpenAI has revealed they hit $20 billion in annualized revenue for 2025. That's triple the year before. The numbers are vertical. Totally. And Gemini API traffic doubled in just five months.

04:26

But then you look at the cultural data. And you get yarn. You get yarn. I'm not kidding. Sales of knitting kits at Michaels have jumped 1,200%. Wow. Searches for "analog lifestyle" have just exploded. It's this weird vibe where for a lot of people, 2026 feels a lot like 1996. It feels like a retreat. Like while the algorithms are eating the Internet, people are just craving something tactile, something they can, you know, control from start to finish without a prompt

04:52

window. Well, there's some real economic anxiety under that, too. Economists are warning that 2026 is the year this all hits the political radar. AI isn't necessarily taking everyone's job overnight. What it is doing is deleting the on-ramp. The entry-level positions. Exactly. All the junior work is being done by the models now. So it's getting harder and harder for beginners

05:13

to break into these industries. You have this squeeze where the digital elite are generating massive revenue, and a lot of other people are feeling locked out. And they knit. They knit. It's a way to feel competent, I think. When you knit a scarf, you know exactly how it was made. There are no hallucinations in wool. No hallucinations in wool. I like that. But for those of us who do have to work in this digital storm, we can't just go knit all day. We have to use these tools.

05:39

And honestly, it's still kind of a mess. It really is. I was reading through the source material on perfectly formatted reports, and it just... it hit home. We talk about superintelligence, but most of us are just struggling to get a document to look right. You spend hours on a prompt, you get great text, you paste it into Google Docs, and it just explodes. The formatting nightmare. Giant bold text where a header should be. Bullet points that turn into random dashes. It just

06:08

looks so amateur. I still wrestle with prompt drift myself. You spend like three hours trying to get the headers just right, and the model just gives you bold text instead of proper H1 tags. It's infuriating. The friction is so real. The whole promise of AI is speed. But if you spend 20 minutes reformatting font sizes and bullet points, you haven't actually saved any time. So what's the fix? The sources mentioned

06:30

some new AI-powered tools. Well, Google has rolled out this personal intelligence thing in Gemini. It's supposed to act like a, quote, weirdly well-informed best friend that remembers your context. But for the formatting problem specifically, the solution seems to be learning how to force the output. What do you mean by force it? You have to stop asking nicely. You need to be incredibly specific about the underlying code. You have to tell it, output this in Markdown that maps

06:56

to the specific H1 and H2 tags. Or you use these new bridge tools like Noodle Seed or Flow Genie that structure the data before it even hits the document. So we have to learn to speak the machine's language just to get it to speak ours properly. Pretty much. Until the AI can intuitively understand your company's brand style guide, you have to manually override its default chaos. It really brings up a question of utility then. If we have to fight the AI to format a simple page, are

07:24

we really more productive yet? Only if you master those force-formatting prompts. Otherwise, yeah, it's often just a very messy copy-paste job. Now, speaking of looking under the hood, let's pivot to something arguably more critical than font sizes: safety. With models as big as Colossus 2, the question of control is, it's massive. Yeah, and DeepMind just dropped a paper on this that is, honestly, a game changer. This is

07:47

the activation probes paper? Yes. And if you care about AI safety, but you also care about, you know, not lighting money on fire, this is huge. The problem they're really solving here is cost, right? Monitoring these huge models for safety usually costs a fortune. An insane amount. Traditionally, to check if an AI is being toxic or dangerous, you have to run its output through another AI to check it. It's like hiring a supervisor to watch every single employee 24/7.

08:17

And that supervisor demands a big salary. A huge one, in compute power. So DeepMind's solution is to stop watching the output and start watching the brain itself. So as it's thinking. Precisely. As the AI is processing a request, it's doing all this internal matrix math. So DeepMind developed what they call activation probes. And you can think of them like tiny sensors or like a tiny brain scan. They scan the model's internal activity while it's thinking. So it's like reading the

08:42

mind before the mouth opens. That's the perfect analogy. It's eavesdropping on the inner monologue. And because it's just reading data that is already being computed anyway, it's incredibly cheap. How cheap are we talking? We're talking 10,000 times cheaper than running a full safety monitor on top. 10,000 times. That effectively makes safety free or close to it. Basically. And it works using a few clever layers. They use these multi-max probes to pick out signals in huge

09:08

prompts, up to a million tokens. And then something called rolling mean attention to filter out all the noise and spot, you know, the toxic thoughts. And what if a probe isn't sure about something? Then, and only then, does it call in the supervisor. It's a cascade system. If the probe sees a red flag, it sends just that little snippet to a lightweight model like Gemini Flash to get a second opinion. So you cut your overall safety costs by 50x or more without really losing reliability.
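Show-notes aside: the cascade idea described above can be illustrated with a toy example. This is a minimal sketch and not DeepMind's actual architecture. The linear probe, the simple rolling mean over per-token scores, the two thresholds (`low`, `high`), and the `expensive_monitor` callback are all hypothetical stand-ins chosen for illustration. The one point it is meant to show is that the cheap probe handles the confident cases and the costlier second opinion is only called when the probe is unsure.

```python
import math
from typing import Callable, List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def token_scores(activations: Sequence[Sequence[float]],
                 weights: Sequence[float], bias: float) -> List[float]:
    """Hypothetical linear probe: one raw score per token activation vector."""
    return [sum(a * w for a, w in zip(vec, weights)) + bias for vec in activations]


def max_rolling_mean(scores: Sequence[float], window: int) -> float:
    """Largest mean over any contiguous window (smooths single-token noise)."""
    if not scores:
        raise ValueError("scores must be non-empty")
    window = max(1, min(window, len(scores)))
    running = sum(scores[:window])
    best = running / window
    for i in range(window, len(scores)):
        running += scores[i] - scores[i - window]
        best = max(best, running / window)
    return best


def cascade(activations: Sequence[Sequence[float]],
            weights: Sequence[float], bias: float,
            expensive_monitor: Callable[[], bool],
            low: float = 0.2, high: float = 0.8,
            window: int = 4) -> Tuple[str, float, bool]:
    """Return (decision, probe_probability, escalated).

    Assumed policy: confident probe results are final; only the uncertain
    band between `low` and `high` triggers the costlier second opinion.
    """
    p = sigmoid(max_rolling_mean(token_scores(activations, weights, bias), window))
    if p < low:
        return "allow", p, False
    if p > high:
        return "block", p, False
    return ("block" if expensive_monitor() else "allow"), p, True
```

For example, `cascade([[0.1, 0.0]] * 8, [1.0, -1.0], 0.0, lambda: True)` lands in the uncertain band, so the callback is consulted; clearly benign or clearly harmful toy activations never reach it.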

09:35

That's a brilliant piece of engineering. It's moving safety from an external policing layer to an internal awareness layer. We're essentially installing a conscience into the AI that costs almost nothing to run. A lightweight moral compass embedded right there in the matrix math. I want to take just a brief pause here. We've covered city-sized computers, the return of knitting, the struggle of formatting a simple document, and now the internal conscience of the machine.

10:00

We'll be right back. And we are back. We've definitely covered a lot of ground today. Let's try to synthesize this for a moment, connect some of these dots. Okay, the big picture. We've got this massive convergence happening. On one side, you have a company like xAI. They are building physical infrastructure at a scale that challenges the power grids of major American cities. 1.5 gigawatts. $18 billion. Pure brute force. And then on the complete other end of the spectrum, you have

10:31

DeepMind. Right. They're looking at the microscopic level. They're finding these tiny efficiencies, inserting tiny, cheap probes into the math to make these things safe without bankrupting the companies that run them. So you have this macro expansion and this micro optimization happening at the exact same time. Exactly. And who is sandwiched in the middle of all this silicon and code? Us.

10:51

The humans. Yeah, the humans who are generating billions in revenue for OpenAI, but also the humans who are feeling totally overwhelmed by it all. The humans buying all that yarn. It's a real tug of war. We're building these, you know, gods in the cloud. But down here on Earth, we just want a nice sweater and a document that formats correctly. And honestly, the fact that we can't get the document to format correctly is probably why we're all knitting the sweater.

11:17

It's a need to feel competence. To feel like we can actually finish a task. I think that's going to be the theme in 2026. Competence in an age of automation. Before we wrap up, for anyone listening who is still stuck in that formatting loop. Yes, please do yourself a favor. Look up the formatting guides for these models. Learn the force-formatting prompts. I'm telling you, it's a small skill, but it will change your daily

11:40

quality of life. And as you head out into your week, I want to leave you with one final thought on that analog backlash. Is this surge in yarn sales just a temporary fad, a little blip? Or are we seeing the beginning of a permanent bifurcation, a real split between a digital elite who live entirely in the stream and a kind of analog resistance that chooses to disconnect? That is something to think about. Are we heading for a world where true luxury just means being offline? I think

12:08

we might be. Thanks for diving in with us. See you in the deep end.
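Show-notes aside: the force-formatting advice from the episode (spell out that headings must be real Markdown headings that map to H1 and H2, not bold text) can be sketched roughly as below. This is one possible illustration, not a method from the sources. The rule wording, `build_prompt`, and `find_fake_headings` are hypothetical names made up for this example, and the bold-line check is a simple heuristic that will miss some cases.

```python
import re
from typing import List

# Hypothetical formatting rules appended to every request.
FORMAT_RULES = (
    "Return the report as Markdown only.\n"
    "Use exactly one '# ' line for the title (maps to H1).\n"
    "Use '## ' lines for section headings (maps to H2).\n"
    "Use '- ' for bullet points.\n"
    "Never use a bold-only line as a heading."
)

# A line that consists solely of **bold** or __bold__ text is treated as a
# likely "fake heading". Heuristic only: bold text containing * or _ is skipped.
BOLD_ONLY_LINE = re.compile(r"^\s*(\*\*|__)([^*_]+)\1\s*$")


def build_prompt(task: str) -> str:
    """Attach the explicit formatting rules to a plain-language task."""
    return f"{task.strip()}\n\nFormatting rules:\n{FORMAT_RULES}"


def find_fake_headings(markdown: str) -> List[str]:
    """Return the text of bold-only lines that were probably meant as headings."""
    found = []
    for line in markdown.splitlines():
        match = BOLD_ONLY_LINE.match(line)
        if match:
            found.append(match.group(2))
    return found
```

A check like `find_fake_headings(model_output)` can run before pasting into a document, so a bold-only "heading" gets caught and the request retried instead of being fixed by hand.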

Transcript source: Provided by creator in RSS feed.