How CoreWeave Sees the Market for Compute Right Now

Speaker 1

00:02

Bloomberg Audio Studios. Podcasts. Radio. News.

Speaker 2

00:18

Hello and welcome to another episode of the Odd Lots podcast.

Speaker 3

00:21

I'm Joe Weisenthal. And I'm Tracy Alloway.

Speaker 2

00:24

Tracy, I'm envisioning this future where we have to do a state of the AI inference market episode, like, once a month, you know, where things are moving so rapidly and there's so much change, either in terms of what models people are using or what they're being used for, et cetera, that in the same way we would do, you know, the occasional regular stock market episode or whatever, we would just do, okay, what are we seeing right now in inference trends, because

00:53

it just feels like the moment we do an episode, a few weeks later it may be out of date.

Speaker 3

00:58

We should just bite the bullet and do a weekly episode, transform Lots More into a market update on compute.

Speaker 2

01:04

We could do inference in, I don't know, we'll have to workshop it. No, no, we'd have to. But anyway, this is lots of lots of inference. This is like the story of the moment, and we know that, you know, a couple of years ago, everyone was sort of dabbling around with various things and experimenting and using AI like, oh, write a poem for me about this, et cetera.

01:30

That phase of AI is long over, and we know that companies specifically are spending a ton on compute, so much so that CFOs around the world are getting sticker shock about their compute budgets. And there was even a headline of, like, Uber saying, okay, a max of fifteen hundred dollars per employee, like, don't spend more than that in a month on tokens. So this is a very fast-moving area.

Speaker 3

01:54

Yeah, you're starting to get headlines about, I guess, a corporate reckoning with AI as more people experiment and spend money on it. The Uber headline that you mentioned: apparently Uber burned through its entire twenty twenty-six AI budget in four months, basically, and what's more important is the COO was actually asking whether or not that was worth it, like whether they saw productivity gains or whatever as a result of that. The other very amusing headline that I saw,

02:22

and it was citing an unnamed source. It's from Axios, so you know, oh yeah, not entirely sure it's true, but reportedly great headline. It was a great headline. An AI consultant told Axios that one of their clients recently spent half a billion dollars in a single month after failing to put usage limits on this.

Speaker 2

02:40

Yeah, it's because there's everyone that's like, oh, I just have a simple question. I want to look up our guest's title. I'm going to use the most advanced model to do that, et cetera. I have a theory, and we'll get into this with our guest, and we've talked about this with Goldman's Marco Argenti, but one of the things I predict is that companies, clearly, you know, they're going to keep using it more and more, would be

03:04

my guess. But there's probably going to be a lot of investment made in sort of optimal model routing. Because some models are like one hundredth the cost per query of a frontier model, and probably a lot of people don't know what the efficient-frontier model usage is. So actually routing the query to the most efficient model, I have a feeling we're going to see a lot of investment in that area specifically.

Speaker 3

03:27

Well, there's also just the question of whether or not the models get cheaper overall as they advance, right? And we have seen some. I think Nvidia has a new system or chip out or something that is supposed to reduce token usage. We can get into that as well.

Speaker 2

03:41

And you know, we did that live episode recently with Iain Dunning of Hudson River Trading, and he said a lot of interesting things in that. But one of the things he said is that the scarcity is increasingly just the real estate component. Finding a suitable place to plug in your GPU, at least from his perspective, right now, is as much, if not more so, of a challenge than securing the GPUs themselves, so like...

Speaker 3

04:08

Which is different to what it was like three years ago.

Speaker 2

04:10

Yeah, yeah, so just, like, where you plug it in. We know there's all the anti-data-center politics out there, so it's like, yeah, we've got to take the pulse of this market.

Speaker 3

04:17

All right, consider this our inference update.

Speaker 2

04:20

Yeah, well, I'm really excited to say we really do have the perfect guest. Someone we spoke to, it truly feels like, eons ago. They've always had a lot of chips. But I think the first time we ever linked up with this company was still in the era where people were excited about Nvidia chips being

04:36

used for, like, crypto mining and stuff like that. But we are now in this very different era, and this is truly one of the companies of the moment, and that is, of course, CoreWeave, one of the so-called neoclouds, offering both training and inference services for all sorts of different AI workloads. I'm very excited to say back on the show, we have Brannin McBee, CoreWeave's co-founder and chief development officer. So thank you so much for coming on Odd Lots.

Speaker 4

05:02

Appreciate being invited back, guys, and that was a fantastic intro. We look forward to hitting these topics today.

Speaker 2

05:08

All right, here's my question. So we know that at the tail end of last year and then in the first quarter of this year, everyone started using Claude Code, and there was clearly a key inflection moment for overall AI demand. And then we get into Q2 and suddenly the CFOs are like, oh my gosh, we're spending this much on inference, we've got to figure things out. Just straight up, like, in

05:36

the last month or whatever, do you see any signs of that happening yet, of these companies, which are all still eager AI adopters, trying to get a little bit of a handle and maybe slowing the rate of growth? Is that happening yet?

Speaker 4

05:57

Yeah, I think you see headlines there, that there are surprises on spend, et cetera. I'd say our interpretation of it is entirely: look at the authentic and foundational demand that is out there, right? Like, all we're really doing is talking about how much consumption there is of AI and use for it. And I think that was a real question in the market twelve, eighteen, twenty-four months ago: will there be demand for AI? Where

06:27

is this inference demand that everyone's been talking about. And I think you're absolutely correct January or so with this kind of like next group of models that were coming out, everyone all of a sudden and all at once said this is what we've needed, like this is the real

06:44

product breakthrough. But I think it's worth keeping in mind that product breakthrough was for a limited set of people at the end of the day, right? We're talking coding professionals, finance professionals, but it's a relatively small group of people that are using infrastructure at this enormous scale. And so where we see this moving next is broader enterprise use, likely not seeing this whole token-

07:11

maxing approach, and I think that that is unsustainable. But do we see adoption in other sectors and how this can continue to spread out? Absolutely. I mean, you know, on our end, I think we have ten clients over one billion dollars at this point, and our financial services client backlog is into tens of billions of dollars at this point. And so we're now talking about things outside

07:41

of AI labs, outside of hyperscalers. And look, as you guys know, we support nine of the top ten AI labs on the planet, if you exclude China and everything that's going on over there. Like, we have a lot of visibility into what people are doing, and we're not seeing any pullback on what they're doing on inference today.

08:05

If anything, it just remains this unrelenting demand for access to the best technology solution in the market for running artificial intelligence, and that's CoreWeave's solution in the market.

Speaker 3

08:19

Wait, say more about the customer mix now versus, say, three years ago. So you have hyperscalers, you've got startups, you've got various businesses. How has that, I guess, composition shifted over time?

Speaker 4

08:32

Yeah, it's shifted enormously towards a more diverse customer base. Right? We got a lot of flak for this in our IPO. People were noting that we only had a handful of large clients, that our clients were just the hyperscalers and an AI lab or two. And I think that we have made tremendous progress in driving diversification. So I'd say it's broadly across three buckets today. Right? We have hyperscaler clients who continue to grow with us. We have AI

09:05

lab clients. As I said, nine of the top ten AI labs on the planet choose CoreWeave. And then we have this enterprise base. And the enterprise base just doesn't grab as many headlines as you would expect because it's not these massive, multi-billion dollar contracts that are being signed. But I think in Q4 alone, we added twice as many logos to our client base as we had ever done in any previous quarter. Right? And

09:37

that enterprise base is one that's growing so much. And there was a point you guys hit on in the intro that I think is really worth acknowledging, and it was this concept of model routing and the idea that like, not everyone needs just the latest model, that it's different

09:54

types of models that can hit different use cases. And this is something we've been talking about for a while as it relates to the infrastructure side of things as well, right? Because you don't need that latest model for everything, and accordingly you don't need the latest piece of infrastructure to support every single inference or training query

10:14

that's out there. You can kind of conceptualize this matrix of different sizes of workloads mapped to the different sizes of GPUs, and all of a sudden that tells you, my god, like, H100s could last six, seven, eight years, A100s are going to last longer. And it totally changes the entire conversation around depreciation and useful life of infrastructure, as that was a really popular topic

10:40

during twenty twenty-five. People were saying, like, oh, this stuff will last two years, it's worth zero afterwards, and

10:46

like, we've never seen any semblance of that, because of the point you guys are accurately making, which is users are going to need to find the way to use the appropriate model for their prompts, and that'll be solved by model routing, to your point. But that just further enables this concept that infrastructure is going to be used longer, and we see that every day in our portfolio, extending all the way back to A100s.

Speaker 2

11:17

I just want to ask a specific question about the broadening out of the customer base. And you mentioned, for example, financial services clients. When you talk about, say, a financial services client as being a distinct client from one of the major AI labs, does that mean... I'm just making it up, I don't know if these relationships exist. Let's say a

11:39

Citigroup has an enterprise license with Anthropic. Does that count as Anthropic as a customer or Citi as a customer? And when you talk about this broadening out, are there essentially more types of entities who are building some type of model, not necessarily an LLM per se, but some type of internal, house-specific model from which they want to run inference?

Speaker 4

12:05

It's a great question. In the scenario you presented, Anthropic would be our client.

Speaker 2

12:09

Okay, got it.

Speaker 4

12:09

So what I'm highlighting, and I want to correct a number I said earlier, are financial services clients, and this is direct to those financial services firms. They're approaching ten billion in backlog. Okay, so a good example of this, and it's one we made recently, is with Jane Street. Okay, right. That's not Jane Street coming through OpenAI or Anthropic to get to us. That is Jane Street coming directly to us and using our platform, and that for...

Speaker 2

12:34

A model that they're building. So it's a Jane Street... No, no, no, I'm not saying it's setting inside training, but it would be inference of a model that's Jane Street's model of something, rather than Jane Street's contract and enterprise relationship with one of the major labs.

Speaker 4

12:53

At the end of the day, we don't know what exact workloads these entities are running, especially for entities like Jane Street. I would imagine that's highly secretive. Yeah. But the point I would make is more that this is not them coming through an AI lab to us. They are interfacing with and managing the infrastructure directly on our platform. And that's a really important distinction as we grow this diversified client base. And again, I think that we've just done a wonderful job of executing that over the past...

Speaker 2

13:26

As you've talked about, including in earnings releases, and as you can just tell from these huge token budgets, inference demand is booming. But model training is still important. So, to say, okay, relatively, if you have a pie chart, the part that's inference is getting bigger. But I assume the training is

14:03

also growing as well. But I'm curious from the perspective of like say, the AI labs when they think about growth, has there been a subtle shift from investing to push the pure model frontier, having the absolute best state of

14:19

the art model, versus investing in, say, better harnesses? Because a big reason we're excited, and all talking about AI right now, is really the excitement that happened with Claude Code in the final quarter of twenty twenty-five, and it's like, oh, this harness has really unlocked a

14:38

bunch of capabilities. Has there been a shift in investment from rather than just the purest, most advanced model to let's invest more in tooling capacity and other things that allow companies and clients to get more juice from an advanced model.

Speaker 4

14:55

I don't think that we're exposed to that decision making with the AI labs as counterparties to us. The observation I would make on a behavior change for the AI labs is they want access to more infrastructure for longer duration, right? And I'll qualify it a little bit, which is a year, two years ago, we were signing three-year committed contracts. The type of contracts we sign are basically take-or-pay contracts, which is the best way to finance

15:29

the infrastructure that we are building for our clients. Last year it was four year contracts. Right. They were saying, we want explicit access to Hopper for four years or Blackwell for four years. Now they're coming and saying, well, actually we want it for five years. We don't want any interruption of use. We'll commit to the exact same economics throughout the full duration of the contract. You can't upgrade or change the infrastructure within it. You cannot cancel

15:59

the contract. It's for five years, and they want it

16:02

at more scale. Right? The deployments are getting larger and larger. So that's probably the best characterization we can offer on the decision making that AI labs are going through right now. As they look at it from an infrastructure perspective, it absolutely seems like tooling is important, but scaling laws are still holding, right? Like, your ability to advance your frontier model through accessing more infrastructure at scale holds, and that will hold through

16:31

Vera Rubin, we expect, and seemingly it's not stopping anytime soon.

Speaker 3

16:36

Oh yeah, what's the deal with Vera Rubin? Can you explain that to us?

Speaker 4

16:40

Which aspect of it?

Speaker 3

16:41

What is it?

Speaker 4

16:42

Oh?

Speaker 2

16:42

Yes, basically yeah.

Speaker 4

16:44

So it's just Nvidia's next architecture that's coming out, right? Like, the current architecture that we're deploying today is Blackwell. Blackwell, we deploy predominantly in an NVL72 configuration, which was an entire architecture change, right? If you recall, Hopper came before Blackwell. With Hopper, you could deploy these forty-two-U racks, which was typically like eight GPUs in a server case. You would take it, plug

17:10

it in, largely air cooled as well. Right? We ran some liquid cooling just so we understood the requirements of liquid cooling, because Blackwell for our deployments is overwhelmingly liquid cooled in its deployment configuration, and instead of eight GPUs in a forty-two-U configuration, it's in this larger seventy-two-GPU rack. It's like an entire chassis that's being brought in, and it just looks entirely different in the data center. It's like this giant tower thing that

17:42

you've seen in pictures floating around on X. So Vera Rubin will be the next architecture that comes out, and we've started receiving testing racks for Vera Rubin.

Speaker 3

17:57

But the basic idea is, like, the new configuration makes the whole system more efficient, like more tokens per energy use and that sort of thing.

Speaker 4

18:07

Yes, yeah, I think that's kind of where you're getting to with it. But that doesn't necessarily mean, going back to the point earlier, that everyone only wants the latest generation of GPU. Right? We have massive demand for Ampere, Hopper, Blackwell, et cetera. And it just varies by use case, model,

18:28

and type of client as well. Like, I would qualify that AI labs are probably the ones who are lining up first to secure access to the latest generation GPUs, whereas enterprise clients are probably very focused on the current generation, right, like Hopper and Blackwell.

Speaker 2

18:50

Right now, I'm going to be honest for a second. You know, I try to keep up on a lot of things AI related, I really do, and every single day, the one thing that I just do not keep in my mind... I liked it in the old days, when it was like one eighty-six, two eighty-six, three eighty-six, four eighty-six, Pentium, and then Pentium two, et cetera. There was just this numerical sequence that I could keep

19:14

track of in my head. And so if someone asked me, like, Joe, Vera Rubin, Hopper, Blackwell, what was the sequence? I'd be like, I've got to be honest with you, I don't exactly remember, and I will prioritize that at some point. But speaking of silicon, yesterday Microsoft came out with a big... They really want to be in the game too. They don't want to just be connected to the labs. They want to have advanced models too,

19:37

and apparently it's a good model. And they announced the MAI Thinking one model, but they said it's optimized on the Maia 200 chip, which is their own chip. And this is a thing where, again going back to our recent conversation, even a place like Hudson River Trading is thinking about getting into the customized hardware game. How much juice for the squeeze is there in aligning the model with custom silicon, from your vantage point?

Speaker 4

20:09

What we could offer is what we hear from our clients on that. And it's important to keep in mind we can run any type of silicon on our platform. Right? We are entirely customer led in what we build. Like, we don't go commit to capex and speculatively hope people come and use infrastructure, right? We wait until a client says, we want you to go do this specific build, here's what we want it to look like, and then we go commit to that capex, right, more like a

20:37

success-based capex approach. And the client isn't asking for anything but Nvidia infrastructure. And I think a large contributor to that is, I mean, they built this incredible ecosystem around their chipset. They have been dedicated to that for, I think, over fifteen years at this point, through the CUDA architecture. And Nvidia, from what we hear from our clients, that platform just remains the most efficient, the most scalable, the most reliable set of infrastructure that

21:11

is in the market. Right? So I think, if you think over the past few years, there's always been talk about these other chips, and at the end of the day, people are still using Nvidia infrastructure, they're committing to Nvidia infrastructure for five-plus-year contracts in these billion, multi-billion dollar commitments, because they know that that is going to be a

21:40

critical part of how they scale their business. We really don't see demand on a material basis for anything but that Nvidia compute, and that's what we are building today.

Speaker 2

21:53

Obviously, just to push back on this a little bit, and I'm not really in any position to push back, I can only relate what past guests have said and my own reading. So what one of our guests said is that absolutely Nvidia has the lock on model training, that if you want to train a model, yes, Nvidia chips are the only game in town, but that for inference, in his view, and this is Iain

22:20

Dunning again, his view is there really were options. And then of course we had someone who was much more biased. We interviewed the CEO of Cerebras, the company that makes the gigantic plate, or sorry, the gigantic chip, and of course he was going to say, yeah, the CUDA moat is vastly

22:37

overrated for inference, it barely exists. Now, of course he's going to say that, you know, he's a competitor, but we've also heard it from a user of inference, and intuitively it makes sense, like, training is very complicated, all that stuff. But what you're saying is that from the customer standpoint, you see the demand for Nvidia on both the training and the inference as being steady, and that you perceive that advantage to be consistent through both aspects.

Speaker 4

23:09

So I believe in our last quarterly report, our CEO qualified that inference workloads represent well in excess of fifty percent of infrastructure utilization on our platform. It's the exact same infrastructure we use for training. Right? Going back to my comment that it's very fungible between those different types of workloads, those customers are choosing Nvidia

23:33

to work with. Okay, I think what you're going to see is people will want to try other types of silicon at small scale, but the reliable, proven, and, from our perspective, still most efficient infrastructure to use is Nvidia today. Does that change over time? Who really knows. But I think we've seen Nvidia battling this concept for years, and every year they show up and, like,

24:06

they remain the de facto choice for AI infrastructure. I think we're going to be one of the first people in the market to see it, because that will be a tone shift from our clients, asking us to run something else. That hasn't happened.

Speaker 3

24:21

Okay, so have the constraints on your business changed at all? Three years ago we were talking about GPUs and how hard they were to actually get. I imagine securing GPUs is still competitive, to say the least. But are you seeing other constraints emerge? Like Joe mentioned in the intro, just land usage, just places to actually build data centers.

Speaker 4

24:44

Land usage specifically, I wouldn't say is as much of a concern. Having a powered shell is the bottleneck today. And let me qualify what a powered shell is. A powered shell is effectively an empty data center that is energized. Right? It has all the power and associated components. I can come into it and deliver electrons into a rack. It has the cooling system built within it, like, it has the whole thing. Right?

25:13

Powered shell is the industry term for it. That is the bottleneck because of all of the supply chains that come into that, right? Like, not only do you have electricity, you have the land, et cetera, but you have the backup battery supplies, you have the transformers, you have personnel. Right? Just think about the electricians for these sites and getting the accreditation on the electrician side to be

25:39

able to participate in these builds. I mean, I think it's a five-year-plus apprenticeship to be able to go through that program. Right? We can't just make new electricians leveraging a supply chain, right? That's a trade that you can't really scale efficiently. So that is

25:56

absolutely the bottleneck for us and, I think, our peer set that's out there right now. On access to chips, I think we have a phenomenal relationship with Nvidia, where we've just proven to be the best operator of this infrastructure on the planet. You know, a bottleneck that existed for us previously, I think, was access to financing.

Speaker 3

26:18

Yeah, right, which, I don't know, doesn't seem to be an issue anymore.

Speaker 4

26:22

I would agree with that broadly. But that's years of work and execution that has delivered that ability for us. I mean, year to date, we've raised over twenty-one billion dollars of financing for our business. Like, you don't get to do that and just go from, you know, zero to twenty right out of nowhere. And I think

26:46

that's largely driven by our track record of execution. Right? Our investors, our creditors can see this deep set of experience over the years of consistently delivering on these builds. I mean, we have over a gigawatt in active power at this point, right, like a gigawatt at the

27:08

data center level with GPUs delivering into clients. And I think that there's been kind of a misunderstanding of the market where people are conflating the concept that like, you know, something on paper is equivalent to being physically done and delivered. And all I can say is there's an enormous gap between you know, signing for power for delivery in twenty thirty versus actually delivering that into billable GPU hours, and that gap of execution is what has driven down our

27:42

cost of capital so aggressively. That gap is where our business sits and why it has been so successful. I mean, that's the secret sauce: our ability to take these data center deployments and these customer relationships and deliver billable GPU hours into them.

Speaker 2

28:03

You know, speaking of financing, I just want to say, during last year, like maybe six months ago, that might have been the near peak of the Michael Burry-inspired "these chips only last two years" stuff. And one of the viral charts that you would see on Twitter was the CoreWeave CDS chart. Those have come way in. So it is, you know... yeah, that's right, that's the thing about CDS. No one ever posts charts of credit

28:33

default swaps when they're coming in. People love to post them when they're blowing out. They have come in. So, you know, that does speak to some of this point about these anxieties having been alleviated, at least somewhat, since the start of the year. You know, it occurred to me, like, we're talking about credit default swaps,

28:52

we're talking about financing. I'm sort of gearing up to write a big thing maybe, but I'm writing it in my head currently, that there really are a lot of analogies between the business of data centers and the business of banking. And one of the things in banking, as we all learned from SVB, was the risk of industry and depositor concentration, that if you have, like, one depositor that gets too big, or all your depositors are in the same industry, then

29:20

you have this risk of correlated withdrawals. And that's what obviously did in SVB. When you think about planning and you think about, okay, here's an investment, et cetera, how much does this come up, sort of thinking about, I guess, tenant diversification? Yeah, tenant diversification as something that you think about in your multi-year planning.

Speaker 4

29:43

It's a critical aspect of it. Right? As I said earlier too, this was a key criticism of us coming into our IPO, right, where we had that customer concentration in our revenue, and we have made enormous progress there. And I think the best way to think about it is we could take all of our unallocated capacity, and I say that very specifically: it's not unsold capacity,

30:08

implying that there's no demand for it. It's unallocated. There's intense demand for it, we're figuring out where it should go. And that customer piece of it, I think, honestly, like, we could allocate all that capacity to single-name clients, right? There is a pretty significant number of single-name clients we can go allocate it out to. But I don't think that is the business we are supposed to be building here.

30:33

I think the business we are supposed to be building is a diversified cloud that is supporting the leading AI consumers and producers on the planet. I don't think we're supposed to be supporting just one or two companies.

Speaker 3

31:04

When it comes to financing, Can you say a little bit more about what changed to make the market more comfortable with this, because like this is the big story in markets, just how much AI is now being issued through the corporate bond market. The equity market as we

31:20

know is basically all big tech at the moment. Like, what changed on the part of investors? Was it just pure return and performance, or were there, I guess, efforts to make the contracts more robust or increase visibility into demand and that sort of thing?

Speaker 4

31:37

Seeing the inference aspect of it really emboldened investors, But like that was really just January, right, or maybe late Q four where you started seeing this just massive inflection of demand driven by inference for us, right, It's tough for me to speak about you know, other companies, but for us, like why why have we been underwritten at

32:04

such scale and at a decreasing cost of capital? I think it goes back to that track record of execution, right? The market has watched us execute and watched us deliver on these contracts. And tell me if I'm going into too much detail here, but the way that we finance our business, you kind of break it into two broad buckets, right? You have ParentCo financing and AssetCo financing. And AssetCo financing is where all of the GPUs get financed. Right?

32:34

It's where all of our client contracts sit. And we can take these financings and put them into SPVs, or we'll just call it a box, so to say. And you...

Speaker 2

32:48

Keep going. Lots of connotations, but keep going.

Speaker 4

32:54

We put them into SPVs, and these SPVs, they have the infrastructure, they have the data center costs, and they have the debt agreements within them. And so you're able to pair this, like, five-year take-or-pay contract to an amortization schedule on the debt, and you have the revenue come into the box, pay down the amortization schedule, pay down the operating costs of the data center, and it still has a twenty-five percent contribution margin of profit up to the

33:29

ParentCo. Right? Like, these are highly profitable agreements down to the SPV stack. And so you take that SPV out to the credit market and say, look at this instrument. It's a discrete set of contracts with counterparties, like, entities who want to consume GPU compute, you have the data centers within it, et cetera. And you know, one of the latest ones we did was what we call DDTL 4. This was an investment-grade-rated, first-of-its-class deal. No one had done this before for GPU financing,

34:02

non-recourse HPC infrastructure financing, and it got done at SOFR plus two twenty five. Like, that is a phenomenal cost of capital for us. And importantly, we were able to bring in the insurance tranche of capital, which is a massive tranche of capital out there that is looking to do allocations into the space. So we're kind of continuously making progress through these different stacks of capital, unlocking

34:29

access to more and more types of investors. It's why you've seen this move into the convertible note market, into the unsecured market as well, along with taking direct strategic equity investments. But for us, it's really important for the entire investor space to understand this business, because this business largely didn't exist before, right? Like, people weren't making loans into the hyperscalers to go create these buildouts.

34:57

Right, it's on CoreWeave, honestly, to be building this path into how do you finance the AI hyperscaler effectively. And I think we've just done a terrific job of it over the past few years.
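The "box" described above is essentially a payment waterfall: contract revenue comes in, debt amortization and data center operating costs are paid, and the remainder flows up to the parent. A minimal Python sketch of that arithmetic follows. The payment order is taken from the description above; the function names and all dollar figures are hypothetical, chosen only so the residual works out to the twenty five percent contribution margin mentioned. The helper for a "benchmark plus spread" rate is likewise an illustration, with the benchmark level assumed.

```python
def run_waterfall(revenue, claims):
    """Pay each (name, amount) claim in order from revenue.

    Whatever is left after all claims is the contribution that
    flows up to the parent company. All inputs are hypothetical.
    """
    remaining = revenue
    paid = {}
    for name, amount in claims:
        payment = min(remaining, amount)  # cannot pay more than is left
        paid[name] = payment
        remaining -= payment
    paid["to_parent"] = remaining
    return paid


def all_in_rate(benchmark_rate, spread_bps):
    """Floating rate quoted as benchmark plus a spread in basis points."""
    return benchmark_rate + spread_bps / 10_000


# Hypothetical single period: 100 of take-or-pay revenue.
period = run_waterfall(
    100.0,
    [("debt_amortization", 45.0), ("data_center_opex", 30.0)],
)
contribution_margin = period["to_parent"] / 100.0

# A spread of 225 basis points over an assumed 4% benchmark.
rate = all_in_rate(0.04, 225)

print(period)
print(f"contribution margin: {contribution_margin:.0%}")
print(f"all-in rate: {rate:.2%}")
```

The ordered list of claims is the design choice that matters here: because the contract and the debt sit in the same box, lenders can see exactly which cash flows service the loan before anything reaches the parent.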

Speaker 2

35:12

You used to be, in a prior lifetime a trader.

Speaker 4

35:15

Right, yes, as a commodity trader.

Speaker 2

35:17

So I'm curious, like you know, there's a lot of interest in and I don't know if it's going to materialize in GPU capacity trading and there's gonna be a new contract. We recently interviewed the CEO of Compute Exchange and they're very close to having something listed on the CME. From your perspective, because I don't have a view on this yet, the one like you see, like okay, a big AI company does a five year contract. As you say,

35:45

the duration is lengthening, we're gonna lock this in. I don't know, like, what the need is for tradable compute in that environment, et cetera. What's your guess? Like, do you anticipate that there will be a sufficient ecology of hedgers and speculators such that there will be a liquid market for tradable compute?

Speaker 4

36:05

I think it's very much a timeline question that's out there. Short term, no. Let me offer why no short term, and then I'd say maybe in the long term. And it all comes back to fungibility, right? If you think about gold, gold is defined by its chemical composition, right, and there's no question of what is gold and not gold, et cetera. Compute really isn't, right,

36:33

especially GPU compute. GPU compute today is not fungible. And I think that this is well understood by our client base, by our suppliers, by, you know, third party consultants like SemiAnalysis. And it's this idea that an H one hundred deployed in one cloud doesn't have the same

36:55

performance as an H one hundred in another cloud. And the metrics that people use are things like goodput or model FLOPS utilization, MFU, and there are these measurements of, like, how much more performant is one, the exact same GPU by the way, versus another GPU deployed in another facility. And so in order for something to be commoditized, it has to be fungible, right? Otherwise there's just too much, you know, murkiness, and there isn't, like, an exact data point in there.

Speaker 2

37:27

Can I push on that a little bit further? So, I mean, I think that seems like a reasonable view. Is the non fungibility related to configuration of like how they literally like the configuration of the GPUs within physically, Like what is it? Is it about power? I mean, I think they're all like, you know, there are plenty of places that will say you only have nine nines or however many nines you need in your industry or whatever.

37:54

What is it in your view that would cause significant changes in the performance of an H one hundred in one cloud versus another?

Speaker 4

38:03

It could be in some part configuration. Right, we build everything to the DGX reference spec, which is outlined by Nvidia. It's the most performant way to build, operate, and deliver GPUs. But the rest of it, honestly, is just how you operate the GPUs, and that is the CoreWeave software stack. That is, how do you keep these GPUs online? Right? Like, what happens if the GPU fails? Can you predict if a GPU is about to fail and swap in other infrastructure so that the client doesn't

38:37

have downtime on that component? And there's an immense suite of software solutions and infrastructure management solutions that we have built to have the best goodput, to have the best MFU in the industry. None of that is off the shelf, right? And so I wouldn't say it comes down to the strict components. That's kind of like a minimum starting point, right? Like, you have to start with the DGX reference spec. But where does differentiation come from there?

39:06

I mean, that's the CoreWeave product you're describing right there.

Speaker 2

39:10

By the way, Tracy, I'm just looking up these terms of art. Goodput measures the fraction of peak hardware performance that the training job can extract. This is according to Google. And MFU is model FLOPS utilization, a hardware metric for evaluating real world efficiency of LLM training. So two new terms. I actually hadn't heard of MFU or goodput before this. I just learned two new terms today.
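To make the two terms concrete, here is a rough Python sketch of how they are often computed. It rests on assumptions not stated in the conversation: training FLOPs are approximated as six times parameters times tokens (a common rule of thumb for dense transformers), goodput is treated as the share of wall-clock time spent on productive training (one common usage, and not necessarily the definition read out above), and the peak throughput figure and both "clouds" are invented round numbers, not measurements of any real GPU or provider.

```python
def training_flops(params, tokens):
    """Approximate training FLOPs with the 6 * N * D rule of thumb."""
    return 6.0 * params * tokens


def mfu(params, tokens, seconds, num_gpus, peak_flops_per_gpu):
    """Model FLOPS utilization: achieved FLOP/s over theoretical peak FLOP/s."""
    achieved_per_second = training_flops(params, tokens) / seconds
    return achieved_per_second / (num_gpus * peak_flops_per_gpu)


def goodput(productive_seconds, total_seconds):
    """Share of wall-clock time the job spent making useful progress."""
    return productive_seconds / total_seconds


# Hypothetical job: 7B-parameter model, 1B tokens, 8 identical GPUs,
# with an assumed peak of 1e15 FLOP/s per GPU (illustrative only).
PEAK = 1e15
cloud_a = mfu(7e9, 1e9, seconds=10_000, num_gpus=8, peak_flops_per_gpu=PEAK)
cloud_b = mfu(7e9, 1e9, seconds=12_000, num_gpus=8, peak_flops_per_gpu=PEAK)

print(f"cloud A MFU: {cloud_a:.4f}")
print(f"cloud B MFU: {cloud_b:.4f}")
print(f"goodput with 5% of time lost to failures: {goodput(9_500, 10_000):.2f}")
```

In this toy comparison the same chip finishes the same job in different times in two facilities, so the MFU differs even though the hardware is nominally identical, which is the non-fungibility argument in miniature.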

Speaker 3

39:32

We got to create an AI glossary. Yeah, we do. Brandon, when Joe asked you that question about compute markets earlier, you said it was a timeline question, which in my mind implies that it's inevitable, like it's just a question of how long it takes. But then when you describe the fungibility problem, it seems like this is an actual issue that will be very difficult to solve.

Speaker 4

39:55

Yes, I think that characterization is absolutely correct. Right, like, if you just take general commodity theory, and I traded natural gas, electricity, agriculture products for over a decade, like, it suggests that it should become that at some point. But what is the reality today? The reality is this stuff

40:15

isn't getting easier to operate, right? We've moved from these kind of relatively simple forty two U air cooled racks of Hopper to these immensely complex Blackwell deployments, moving into Vera Rubin following that. Like, it's not getting easier to build, operate, provision, deliver these GPUs. It's getting more difficult, and I think until it starts becoming easier, you don't

40:44

really have a path to commoditization. You will have to continue to prioritize working with the world class and world leading operators of infrastructure. That's where we sit.

Speaker 2

40:57

First of all, this is helpful, and I like that we're getting multiple perspectives, because I do think this is gonna be, like, one of the big questions for financial markets, because let's say if they took off, then you could imagine that might even improve financing conditions, because then the lender can hedge against that. Yeah, so like, there would probably be some good things for the industry if this took off. So I appreciate it's good to have your

41:19

perspective on this. Why is it, you know, I'm an inference user, by the way. So I made a little machine learning model in one of my hobby projects, and I provide inference, oh, Havelock dot AI, or I'm a user of inference or whatever, I have a model, whatever.

Speaker 3

41:38

Why is it that impressive if you were providing I'm trying to.

Speaker 2

41:42

I guess I'm a consumer of inference. I use a, anyway. Why is it that I am actually very easily able to get, not a huge allocation of, you know, like, GPU access? So I was like, how do I train this model? It's a model called BERT that Google released in twenty eighteen or twenty nineteen. I fine tuned it for my purposes, and then literally using Claude Code, I was able to in ten minutes sign up. I started using this company called Modal, and I was able

42:12

to start training a model. I was surprised that there's like and it didn't cost me very.

Speaker 4

42:18

Much and I have like no volume.

Speaker 2

42:19

But nonetheless, evidently there was a little GPU capacity out there that I could get and it cost me like five dollars or something for the whole thing. Given what you always hear about like a utilization is slammed, why is it actually not that hard to find GPU capacity for someone like myself?

Speaker 4

42:37

You know, I think it's the scale of difference right there. Finding ones or tens of GPUs, I think that's way more accessible out there. Okay, our clients are focused on the hundreds of thousands of GPUs.

Speaker 2

42:51

I'm not there yet, but I'm not there yet, not yet.

Speaker 4

42:54

I'm sure you'll get there, yes. And that's where it kind of decommoditizes itself with scale as well. Right, like, as you're in the hundreds of thousands component, there's just not that many deployments, right? It's handfuls of deployments at that size. But getting access to ones of GPUs, I think that there is a lot more ability to go secure that sizing in the market.

Speaker 3

43:19

So Joe and I are heading to Hong Kong very soon, and I expect that AI in China is going to be a big topic of conversation. How would you characterize I guess the difference between the US and the Chinese market at the moment. I'm sure this is something you think about, even though you don't participate in the Chinese market directly.

Speaker 2

43:38

Yeah, that's Tracy asking for questions.

Speaker 3

43:40

Yeah. Basically it's.

Speaker 2

43:41

Like questions that we can ask people when we're over there.

Speaker 4

43:44

Yeah, that's likely going to be my response, Tracy, is like, we just do not participate in that market. I think that there's opportunity for us to be expanding. As you guys know, we operate in Canada, Europe. I think moving further east makes a lot of sense for us, but we're trying to be very methodical in the way that we expand. So unfortunately, I'm not gonna be able to be

44:07

helpful with specific questions in that market. But I would imagine you're going to encounter a lot of the same things that you're seeing in the US, which is just insatiable, unrelenting demand for AI. And like, you know, we just kind of keep coming back to this. It's like there is no solution in sight for being able to satiate demand, right? There's just too many supply chain... there's no path to solving demand in the near term or even the medium term.

Speaker 2

44:37

Frankly, you mentioned, to Tracy's question about land use, you said it really was an issue. But like, the first time we talked to you in twenty twenty three or whenever that was, there was not a major growing movement of people who were just, like, anti data centers in America. Maybe there were a few fringe people, but it was not something that was on the mind of politicians and activists and so forth. You do see these headlines, you know, about some projects really having been shelved. It was

45:06

like a big one. Northern Virginia is a huge hotspot for it, and there was a big project that they pulled the plug on because they couldn't get an agreement with the local government. That must affect you. What are you seeing in terms of, like, your capacity to build? How has it changed specifically in light of, or have you seen a change? Would you be able to build faster in a world where this had never become a political hot button issue?

Speaker 4

45:31

I believe it has become that hot button issue. It's something that we're quite proactive about in market and I think you just kind of go through the checks on the diligence process to make sure you're going through it correctly. I think that there's misconceptions out there, like water usage.

Speaker 2

45:49

Yeah, setting aside, the misconcept like setting aside, I know, setting aside, the whole debate about but just in terms of, like operationally, what's it changed for you in terms.

Speaker 4

45:59

Of, yeah, no, I would say our greatest challenge is still just getting that delivery of our, like, the construction and all the goings and getting everything in there. Like, that is truly more of the bottleneck that's in the market today.

Speaker 2

46:14

Brandon, thank you so much for coming back on Odd Lots. I'll have you back next month for another market update. No, or at least, or maybe in three years. Not three years, not three years.

Speaker 4

46:24

But really that's an eternity.

Speaker 2

46:26

Yeah, I know, thank you so much.

Speaker 4

46:29

Thanks guys, appreciate it.

Speaker 2

46:43

I'm very excited about whether compute futures will take off. I think this is an exciting, like, story. You know, it's not the biggest story in the world, but it is actually a very exciting story.

Speaker 3

46:52

I've said this before. Even if you're not that interested in AI, this is a really interesting market structure story, right? It's basically the creation of a brand new market, and it poses all these interesting philosophical questions about how you do that. And I thought Brandon's point about fungibility, I mean, that is a real issue, and it does seem like it's a challenging one to fix at the moment. I don't know if it's inevitable in the future, but who knows.

Speaker 2

47:22

No, No, I mean it makes a lot of sense. This was also Lewis Hart's point that it's like the fun you know, it's in the word commodity, right if it's if it's not a commodity, you're not going to get a commodity market for it. And of course a number of entities are betting that it will be commoditized.

47:38

But if it's true that, like, you know, they're getting more difficult to work, that the technical demands on the inference provider, on the data center company, are getting greater in order to get the maximum, you know, juice, then maybe it doesn't become commoditized. But I think that's, like, a fascinating thing.

Speaker 3

47:59

Like if you do see those efficiency improvements and new designs and things like that, you could imagine that, like the demand is there for standardized GPU as well.

Speaker 2

48:09

Yeah, I don't know.

Speaker 3

48:10

Like, I'm really torn. It feels like it could go either way.

Speaker 2

48:12

Well, and even in his answer, he talked about how they configure their own GPUs to a spec, largely, that Nvidia itself has come up with, so in theory, like, there is a spec that everyone can match to. So that's a really interesting question. I also really want to do more on all of these. So Google has TPUs, Amazon has Trainium. Microsoft has its

48:42

own hardware. Maybe even the Jane Streets and the Hudson River Tradings will have their own hardware, if they're not. Like, I want to understand better why, right? Because, like, they presumably have some reason, and they at least, like, the Microsoft will say, well, this will run better on our custom hardware. I want to understand why that would be.

49:02

How much difference in performance is there? And then the degree to which demand materializes from users for non-Nvidia silicon. And that's, like, a really big question.

Speaker 3

49:14

Yeah, why custom chips?

Speaker 2

49:15

Yeah, and what can you get out of that if you align model and chip to optimally work together. I have no idea, but I feel like it's an episode I would like to do.

Speaker 3

49:24

Yeah, we should, all right? Shall we leave it there in the meantime, Let's leave it there?

Speaker 4

49:28

All right?

Speaker 3

49:28

This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me at Tracy Alloway.

Speaker 2

49:33

And I'm Joe Wisenthal. You can follow me at the Stalwart. Follow our guest Brandon McBee at Brandon McBee. Follow our producers Carmen Rodriguez at Carmen Arman, Dashiell Bennett at Dashbot, Kail Brooks at Kail Brooks, and Kevin Lozano at Kevin Lloyd Lozano. For more Odd Lots content, go to Bloomberg dot com slash odd lots, where we have a daily newsletter and all of our episodes, and you can chat about all these topics twenty four seven in our Discord, discord dot gg slash odd lots.

Speaker 3

49:59

And if you enjoy Odd Lots, if you want us to do an episode on custom chips, then please leave us a positive review on your favorite podcast platform. And remember, if you are a Bloomberg subscriber, you can listen to all of our episodes absolutely ad free. All you need to do is find the Bloomberg channel on Apple Podcasts and follow the instructions there. Thanks for listening.

Transcript source: Provided by creator in RSS feed