Hello, and welcome to another episode of the Odd Lots Podcast.
I'm Joe Weisenthal and I'm Tracy Alloway.
Tracy, have you looked at Nvidia's stock chart lately? And by lately, I don't mean like over the last two years. I mean just like over the last two weeks or two months.
I don't need to look at it because everyone keeps talking about it. So I know, I know what's happening.
You know what I'm pretty happy about? Could I just say, you know, we did that episode like two months ago, yes, with Stacy Rasgon, and we were like, what's up with Nvidia, like, well, you know, I know it's at the center of the AI chips boom and whatever. And then we did that episode and it came out, and then a week later they just knocked it out of the park.
Yeah, so you know, we were early.
We were at least like, you know, a good like two weeks earlier.
Hey, hey, two weeks I'll take it.
I'll take it.
So clearly, something that, you know, and we talked about this with Stacy, like, you know, something that Nvidia has is like everyone's trying to buy it. Everyone's trying to get it. But then it raises the next question of like, okay, but what is that market? Like how do you buy a chip?
Yeah, how do you buy a chip? And then, I guess, what do you actually do with it once you have it? Because my impression is that for a lot of these AI applications, the way you use the chips, the way you set up the data centers, is very, very different from what we've seen in the past. And I think also what Nvidia is doing now is kind of different. But maybe we can get into this with our guest. My impression is they're trying to create a sort of, like, holistic approach for customers, where they provide not just the hardware, but also some services to go along with it.
Yes, right, and like all the software, and Stacy talked about that with the CUDA ecosystem, how dominant that is. But right, like what do you do with it? Like how do you get one? Like, you know, what would we do, Tracy, if a big pallet of Nvidia chips wound up here?
Do you want to know a secret? Yeah, my basement is filled with H100 chips. Just got a pile of them. It came with the house.
It was on that ship that was stuck off the Chesapeake, and instead of getting your cargo, you got it.
I just got a pallet of H400s.
That, well, we're manifesting that into reality. So anyway, how this world works, essentially, like, the trading and dealing of these, like, the hottest commodity in the world right now, which is these advanced chips for AI, and how that works and who can get one, I still think is a sort of mystery that we need to delve further into.
I agree. And there is also, there's a lot of excitement around it right now, for the obvious reasons of everyone's really into generative AI and Nvidia's stock is exploding, as we already talked about. But we're also seeing a lot of previous, I guess, consumers of chips, like the crypto miners, start to pivot into the space, and I'd be curious to see what they're doing in it as well, and how much of that is just, you know, desperation versus a real business opportunity.
In the video game market.
Yeah, oh totally, I forgot about that.
Which was like the other thing. It's like, for years I thought of Nvidia as the video game company. Yeah, because they had their logo on Xboxes.
And how realistic is that pivot? What proportion of those types of chips can be used for AI?
Now, well, I'm very excited. We do have, I believe, the perfect guest. We are going to be speaking with Brannon McBee. He is the chief strategy officer and co-founder of CoreWeave, which is a specialized cloud services provider that's basically providing this sort of, like, high-volume compute to AI-type companies. They recently raised over four hundred million dollars and have been in this space for a little while. So Brannon, thank you so much for coming on Odd Lots.
Thanks for the opportunity. Guys, really excited to chat with you all today.
So let's just, let me... Sorry, if Tracy and I, like, I don't know why they would do this, but if like some VC was like, you know, we want you to launch Odd Lots GPT. We want you to do a large language model off of all the work you've done. We want you to compete with OpenAI. And they gave us, I don't know, some, like, you know, one hundred million dollar raise, and they said, go start your startup. Could I call Nvidia and buy chips? Would I be able to get in the door there?
Gosh, I mean, I think you and everyone else is asking that question, and you're going to have a huge problem doing that right now. It's mostly just around how much in demand this infrastructure became, right? I mean, you could argue it's one of the most critical pieces of information technology resources on the planet right now, and suddenly everyone needs it. And, you know, I like to contextualize it in that, you know, the pace of software adoption for AI is like one of the fastest adoption curves we've ever seen, right? Like, you're hitting these milestones faster than any other software platform previously, and now all of a sudden you're asking infrastructure builds to keep up with that, right, a space that traditionally takes more time. And it's created this massive supply-demand imbalance, both on in-place infrastructure today and on the infrastructure that's available to purchase, and it's an issue that is going to be ongoing for a bit as well, we think.
So can I ask the basic question, which is, CoreWeave, what do you do exactly? Joe mentioned the capital raise, which I think has you valued at something like two billion dollars, so congrats, but what exactly are you doing here?
Yeah, thank you. So CoreWeave is a specialized cloud service provider that is focused on highly parallelizable workloads. So we build and operate the world's most performant GPU infrastructure at scale, and predominantly serve three sectors. That's the artificial intelligence sector, the media and entertainment sector, and the computational chemistry sector. So we specialize in building this infrastructure at supercompute scale.
It's like, quite literally, you know, a sixteen-thousand-GPU fabric, and we can get into all the details and how complex that is. But we build that so that entities can come in and train these next generation foundation machine learning models on it. And, you know, we found ourselves in a spot where we can do that better than literally anyone else in the market, and do it on a timeline that's faster. We're, I think, the only entity with H100s available to clients at scale globally today.
So you have an actual basement full of H100 chips. Well, can you talk to us, you know, when you say infrastructure, we help clients build out the infrastructure, help us conceptualize this. What does the infrastructure for this type of AI actually look like? And how does it differ from infrastructure for other types of large scale technology projects?
Yeah, totally. So, you know, I think during the last Nvidia quarterly earnings call, Jensen put this a really great way in the Q&A section. He said that we are at the first year of a decade-long modernization of the data center, or like making the data center intelligent, right? You could suggest that the last generation, or the twenty-tens data center, was comprised of CPU compute, storage, and these things that didn't really work together that intelligently. And the way that Nvidia has positioned itself is to make it a smart data center, that's like smart routing of data packets between different pieces of infrastructure in there. That's all focused on how do you expand the throughput and communicability of and between pieces of infrastructure, right? It's just an amazingly different approach to data center deployments. And so the way that we're building it, and we're working with Nvidia infrastructure, we design everything to a DGX reference spec. DGX is Nvidia's, like, how do you draw the most performance out of Nvidia infrastructure as possible, with all the ancillary components associated with it. So all this stuff is going into what's qualified as a Tier three or a Tier four data center. We colocate within these things, so we're not quite building in a basement, even though, like, in our past history we certainly, you know, had time doing that. But this is within, you know, just amazing colocation sites that are operated by our partners, such as Switch, right? So a Tier three or a Tier four site is something that's qualified based on its ability to serve workloads with an extremely high uptime. So we're talking like a ninety nine point nine nine percent uptime rate, and that's guaranteed by its power redundancy, its Internet redundancy, and its security, and then ultimately, like, its connectivity to the Internet backbone, right? So as a first step, you're housed within these data centers that are just critical parts of the Internet infrastructure, and then from there you start building out the servers within there. And I can go into that detail.
So you mentioned, actually, I want to just sort of define some terms. Can you just, real quickly before we move on, Tier three, Tier four, what do you mean by this?
Yeah, so Tier three, Tier four, this all goes back to, like, the quality of the data center that you're in. It's all about the reliability and uptime that you should be able to achieve out of that data center. It's another way to qualify the services around it. It's like power: you get redundant power, right, like multiple power services, in case one goes offline there's another one. You get, you know, redundant cooling, you get redundant Internet connectivity. It's all these services that, like, have extra fail-safes that allow you to operate at the highest uptime and security level possible.
Is higher tier better? Like tier three four? Is that better than Tier one and tier two?
That's correct.
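A quick back-of-the-envelope sketch of what that uptime guarantee actually allows. The 99.99% figure is the one from the conversation; the other percentages are comparison points added for context, not numbers anyone cited in the episode.

```python
# What a given uptime percentage permits in downtime over a year.
# 99.99% is the figure mentioned above; the other rows are illustrative.

HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def downtime_minutes_per_year(uptime_pct: float) -> float:
    """Minutes of allowed downtime per year for a given uptime percentage."""
    return HOURS_PER_YEAR * 60 * (1 - uptime_pct / 100)

for uptime in (99.0, 99.9, 99.99, 99.999):
    print(f"{uptime:>7}% uptime -> {downtime_minutes_per_year(uptime):8.1f} minutes of downtime/year")

# 99.99% works out to roughly 52 minutes of downtime per year.
```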
Okay, so quick follow-up question then. You know, we're interested in, like, okay, where the rubber hits the road. The scarcity is here. Let's say Tracy miraculously opens her basement and there really are, like, you know, all these pallets of these Nvidia chips. Is there capacity at the data centers right now? She's like, you know what, we want to colocate with you. You guys have great power, pretty well connected to the internet. You have like good security guards, so it's operated twenty-four-seven. We want to set something up. Like, is there space there?
Yeah, it's a fantastic question. It's an issue that didn't really pop up until really the last eight weeks or so.
Oh, it's really happening that fast.
It's happening that fast, Joe.
And it's, okay, so...
That's why we said the two-week lead time on Nvidia was very important, Joe.
Yeah, you're right, you're right. Wow. Wait, what happened?
Wait, what happened, describe sixteen weeks ago versus eight weeks ago?
Sure, or even last year, right? So this is a space, the data center space, colocation space, that's been fairly chronically underinvested in, because the hyperscalers just built out their own data centers instead, right. But what's happened is the infrastructure changed. The type of compute that we're putting in these data centers is different than the last generation, right? So we're predominantly focused on GPU compute instead of CPU compute, and GPU compute is about four times more power dense than CPU compute, and that throws the data center planning into chaos. Right, because ultimately, let's say you have a ten thousand square foot room in the data center, right, and you have a certain amount of power, let's call it one hundred units of power, that goes into that ten thousand square feet. Well, because I'm four times more power dense, it means that now I take those hundred units of power, but I only require about twenty-five percent of that data center footprint, or in other words, twenty-five hundred square feet within that ten thousand square foot footprint. So that then leads to, like, not only is the space in the data center being used inefficiently now, because you theoretically have to run more power into the data center to use that full ten thousand square feet due to the power density delta, but now you have cooling issues, right, because you designed that footprint to be able to cool ten thousand square feet spread out across that entire area. But now you're dropping...
Sorry, I just want to back up, because this is extremely interesting, so I just want to get this detail right. Just sorry, just to get this and then move on.
But let's say, given an X amount of power, say one hundred units of power, what you're saying is that with this next generation of compute, that's now only sufficient for a quarter of the data center. In other words, to then power the whole space, you really would need like four X the power.
That's accurate. Okay, the complication really arises out of the cooling that's required from that, right? So if you imagine you can cool a ten thousand square foot space and you designed for that, that's one thing. But now, if you have to cool a much more dense area, that's a different type of cooling requirement. And so that's led to this issue where there's only a certain subset of Tier three and Tier four data centers across the US that are currently designed for, or can quickly be redesigned and changed, to be able to accommodate this new power density issue. So now, not only, like, if you had all those H100s in your basement, you might not have a place to plug them into. And that's become a pretty big problem for the industry very quickly, and truly has only arisen in the last eight weeks or so, and it's going to persist for a few quarters.
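The arithmetic behind the power-density problem described above, as a rough illustrative sketch. The ten-thousand-square-foot room, the hundred units of power, and the roughly four-times density factor are the numbers used in the conversation; the calculation itself just restates them.

```python
# Same power budget, much denser compute: the GPU deployment exhausts the
# room's power in a fraction of its floor space, and all that heat now has
# to be rejected from that smaller area.

ROOM_SQFT = 10_000          # data hall designed around CPU-era compute
POWER_UNITS = 100           # total power provisioned for that room
GPU_DENSITY_MULTIPLE = 4    # GPU compute ~4x more power dense than CPU

# Floor space the GPU gear can occupy before the power budget is used up.
usable_sqft = ROOM_SQFT / GPU_DENSITY_MULTIPLE
print(f"{POWER_UNITS} power units are exhausted in {usable_sqft:,.0f} sq ft "
      f"({usable_sqft / ROOM_SQFT:.0%} of the room)")

# Power needed to fill the whole room at GPU density.
print(f"Filling all {ROOM_SQFT:,} sq ft at GPU density needs "
      f"{POWER_UNITS * GPU_DENSITY_MULTIPLE} power units vs. {POWER_UNITS} provisioned")
```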
So you were describing the difference between CPU and GPU. How do you actually connect these newer types, or these different types, of chips together? Because I imagine, you know, in old data centers you just have a bunch of, like, Ethernet cables or something like that. But for this type of processing power, do you need something different?
That's exactly correct, Tracy. So the legacy, the generalized compute data centers, are really what the hyperscalers look like, you know, Amazon, Google, Microsoft, Oracle. They predominantly use something that's called Ethernet to connect all the servers together. And the reason you use that was, you know, you don't really need to have high data throughput to connect all these servers together, right? They just need to be able to send some messages back and forth. They talk to each other about what they're working on, but they're not, you know, necessarily doing highly collaborative tasks that require moving lots of data in between each other. That's changed. So today, what people are focused on and need to build are these effectively supercomputers, right? And so we refer to the connectivity between them, the network between them, as a fabric, right, it's called a network fabric. So if we're building something to help train, like, the next generation GPT model, typically clients are coming to us saying, hey, I need a sixteen-thousand-GPU fabric of H100s. So there's about eight GPUs that go into each server, and then you have to run this connectivity between each one of those servers. But it's now done in a different way, to your point. So we're using an Nvidia technology called InfiniBand, which has the highest data throughput, to connect each of these devices together. And, you know, taking this sixteen-thousand-GPU cluster as an example, there's two crazy numbers in here. One is that there are forty-eight thousand discrete connections that need to be made, right, like plugging one thing in from one computer to another computer. But there's lots of switches and routers that are in between there. But you need to do that forty-eight thousand times, and it takes over five hundred miles of fiber optic cabling to do that successfully across the sixteen-thousand-GPU cluster. And now again, you're doing that within a small space with a ton of power density, with a ton of cooling, and it's just a completely different way to build this infrastructure. It's just because the requirements have changed, right? Like, we've moved into this area where we are designing next generation AI models and it requires a completely different type of compute, and it's just caught the whole sector by surprise, so much so that, you know, it's really challenging to go procure it at the hyperscalers today, because they didn't specialize in building it. And that's, you know, where CoreWeave comes in, is we only focus on building this type of compute for clients. It's our specialty. We hire all of our engineering around it, all of our research goes into it, and it's, you know, it's been a fantastic spot to be. But our goal at the end of the day is just to be able to get this infrastructure into the hands of end consumers so that they can build the amazing AI companies that everyone is looking forward to using and incorporating into enterprises and software companies.
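To make the scale of that fabric concrete, here is a small sketch using the figures from the conversation: sixteen thousand GPUs, roughly eight GPUs per server, about forty-eight thousand discrete connections, and over five hundred miles of fiber. The per-server and per-connection averages are simply derived from those numbers for intuition; they are not figures quoted in the episode.

```python
# Rough scale of a 16,000-GPU InfiniBand cluster, from the numbers above.

TOTAL_GPUS = 16_000
GPUS_PER_SERVER = 8
DISCRETE_CONNECTIONS = 48_000
FIBER_MILES = 500
FEET_PER_MILE = 5_280

servers = TOTAL_GPUS // GPUS_PER_SERVER
print(f"Servers (nodes) in the cluster:   {servers:,}")                          # 2,000
print(f"Connections per server (average): {DISCRETE_CONNECTIONS / servers:.0f}")  # 24
print(f"Fiber per connection (average):   "
      f"{FIBER_MILES * FEET_PER_MILE / DISCRETE_CONNECTIONS:.0f} ft")             # ~55 ft
```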
You know, you mentioned these special or purpose-built connections that Nvidia is making, and this kind of leads nicely into my next question, which is, what exactly is your relationship with Nvidia? And in order to provide this type of service, you know, vast amounts of processing power that is well suited to a particular type of technology, in this case AI, do you have to have a really good relationship with Nvidia to make that work? Like, do you have to have special access to H100s and other chips?
It's a great question, and I'll try to offer it from Nvidia's perspective, and it goes a little bit back to the answer I just provided as well, in that I would think, from Nvidia's seat, what's most important is empowering end users of their compute to be able to access their compute in the most performant variant possible, at scale, and to be able to access it quickly, right? Like, a new generation comes out, they want to be able to get their hands on it, right? And we've built CoreWeave around hitting every single one of those checkboxes, right? We build it to the DGX reference spec, we build it at scale, and we bring it online on a timeline that's, you know, within months of a next generation chipset launch, as opposed to, you know, the more traditional legacy hyperscalers that take quarters at a time. So us being in a position to do that has enabled us fantastic access within Nvidia, and we have a history of consistently executing on exactly what we say we'll do, right? We under-promise and over-deliver as a business, and I think that's just put us in this place where Nvidia has the confidence in allocating infrastructure to us, because they know it's going to come online, they know it's going to get to consumers faster than anyone else in the market, and they know it's going to be delivered in the most performant configuration that exists.
You know, I was thinking as I listened to some of these answers, I keep having these, like, imaginings, like, you know, there's probably some random industrial company that's, like, traded, you know, on the S&P 400 that makes some cooling fluid whose sales are going to be up ten X. So I'm like googling while we're talking, like, what is a company that makes cooling fluid? Or, like, who is some company that's really good at making these, like, InfiniBand connections? Because it just, like...
Right, yeah, like, what are the... anyway.
Right?
But, like, right, like, you know, there's going to be some, yeah, tertiary plays that are like thirty X up. But, you know, I want to get a sense from you, so it's really changed a lot, you know, in the last several months. We could see it from Nvidia's results.
What you're describing, like, how big is the market getting? And the way, I think, you know, I know, like, with AI, there's training, where they sort of build the model, and then there's inference, and the inference is how they spit out the results. Can you talk a little bit about what you're seeing in terms of the growth of both of those aspects of AI? Which is bigger and which is growing faster? And how do they compare to, like, the size of the installed compute base that already exists?
Oh, absolutely. So this is one of my favorite topics, because it's just mind-blowing, the scale that's going to be needed to support AI and scale this infrastructure. So, okay, so today, most of the funding that's going into the AI space is to fund training of next generation foundation models, right? So when a company's raising a bunch of money, at the end of the day, most of that money is going into cloud compute to go train this next generation foundation model, to build that intellectual property. So they have this model and they can go bring it into the inference market. And what I would say is, we're having a supply-demand issue, like a chip access crunch, in the training phase, where in reality the scale of the inference market is where all the demand truly is going to sit. So what I'd offer to help contextualize that is, let's take, you know, there's some well-known models in the market today. Let's say there's a trained, in-market model, and it took about, let's say, ten thousand A100s or so to train. The A100 is the last generation Nvidia chip, but, you know, it still applies in terms of relative scale here. So that company that used ten thousand to train their model, our understanding is they're going to need about a million GPUs within one to two years of launch to support the entire inference demand.
So you could train the model on ten thousand of these chips, ten thousand of these systems, whatever they are, and then if they're actually going to be in the market and sell something, or provide some service to make it worthwhile, they're going to need a million.
A million, and I think that's just within the first two years of launch. So, like, we're talking about something that's going to continue growing afterwards. And so what does a million GPUs mean? Obviously, right, so, you know, I think it was like end of last year, all the hyperscalers combined, right, Amazon, Google, Microsoft, Oracle, you can throw CoreWeave in there, it was about, you know, five hundred thousand GPUs globally available across those platforms. I'd say at the end of this year it'll be closer to a million or so. But that's suggesting, then, that one AI company with one model could consume the entire global footprint of GPUs. And now you start to think, wait, aren't there a bunch of other companies training these models in market right now? And I would say, yes, there are. So it can imply that there is, in the short term, demand for several million GPUs just to support the inference market. And there's just nowhere near enough of this infrastructure globally, and it's going to be a big challenge for the market as we exit this training phase and move into the productization, or really just the commercialization, of these models, like, how do you generate revenue off of them? And it's something that I don't think many people truly understand, just the amount of scale and construction that needs to take place. And now you put that in the same framework of the data centers that we were talking about, right? So there's this lack of data center space, there's a lack of chipset supply, like, it's going to be an issue for years, we think.
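The training-versus-inference gap laid out above, restated as simple ratios. The ten-thousand-GPU training figure, the roughly one-million-GPU inference estimate, and the five-hundred-thousand and one-million global fleet figures are the ones from the conversation; the ratios are just derived from them.

```python
# One well-known model: ~10,000 GPUs to train, ~1,000,000 GPUs estimated
# for inference within one to two years of launch, versus the approximate
# global GPU fleet across the big cloud platforms.

training_gpus = 10_000
inference_gpus = 1_000_000
global_fleet_last_year = 500_000   # hyperscalers plus CoreWeave, approx.
global_fleet_this_year = 1_000_000

print(f"Inference vs. training demand:   {inference_gpus / training_gpus:.0f}x")
print(f"One model vs. last year's fleet: {inference_gpus / global_fleet_last_year:.1f}x")
print(f"One model vs. this year's fleet: {inference_gpus / global_fleet_this_year:.1f}x")
```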
So when it comes to scale, you know, you keep mentioning the hyperscalers, which is a great term, but people like Amazon, Google, I guess, Microsoft, IBM, et cetera. How quickly, or what is your impression of how quickly, are they able to ramp up in this space? Like how fast could they react to some of the trends that you've been outlining?
Yeah, so I can offer what I'm seeing today. You know, the H100s started to be distributed globally to all of us, right, like, all the entities that have these, you know, kind of upper tier relationships with Nvidia, back in March, right? So we started getting that infrastructure online in April, really scaling in May, and, you know, we have builds going on at ten data centers across the US right now, and we're delivering it to clients. The guidance that we're seeing from the hyperscalers is that they're not going to begin delivering scale access to the H100 chipset until late Q three, maybe mid Q four, and some of them are even beginning to guide into Q one. And it's all driven by the fact that this is just a different type of compute that they're building relative to the last generation, right? You're no longer just running Ethernet, to your point, between all these devices, you're not just plugging in CPU blades. You're having to deal with, like, totally different data center power density and cooling requirements. You're having to build supercomputers instead, with five hundred miles of fiber and all these connections. It's just a completely different way to build the cloud, and it's taking them some time to catch up, because you have to retrain entire organizations to do this. So, you know, as of now, I'd say the direct answer is three quarters after a chipset launch, but it's seeming it might take longer. And I think that's all going to contribute to this just kind of slower ability to scale infrastructure than what's being dictated by the adoption rate of AI software, and it's going to lead to this supply-demand imbalance that will just last for a while.
You know, you keep mentioning, or we both keep mentioning, the H100 for obvious reasons. But do you look at other chips, or what would happen to, you know, your own business if, for instance, a new chip was developed that could do the same thing or better than an Nvidia H100? Like, for instance, I hear a lot of excitement about some of the stuff that AMD is developing. And I'm not a chips expert, except maybe when it comes to Fritos or Lay's. But like, how big a difference would that make to you if we suddenly saw a different chip manufacturer gain prominence in AI?
Sure. So I'd offer kind of two broad responses. One, typically, when you train a model, you're going to use the same chips for inference on that model as well, right? So GPT-4, for example, was trained on A100s; they're predominantly going to use A100s going forward, or you might fit some kind of newer generation hyper-efficient chips in there, but it's not like you need, quote, a GPU with more VRAM on it, right? Like, you're going to need your forty-gig or your eighty-gig RAM chip, because that's the size of the model that you trained, right? You're not going to need, like, the next multiple generations. You're not going to really be able to adopt them to change the efficiency of serving that model. So what we view is that a chip's lifespan is like this: the first two to three years is spent training models, and then its next four to five years is spent doing inference for those models that it trained. And then within there as well, you do this thing called fine tuning, which is updating the model with new information, right? Like, how do you keep a model up to date with what's happened on Twitter or what's happened in the media? Right, you have to keep retraining it, right, and you'll use those same chips to do that. But so, your question on other chipsets, and this is something that we have a particularly interesting view into, because we have, like, you know, call it six hundred and fifty AI clients, right, and we're having conversations with them daily to ensure that we're meeting their scaling demands. So it gives us a look six to twelve months into the future at what type of infrastructure they expect to need, and it's overwhelmingly people still want access to Nvidia chips. And the reason for this is something that dates back, I think it's nearly fifteen years, to when Nvidia and Jensen made the decision to open source CUDA and to make this software set accessible to the machine learning community. And, you know, today, if you go to GitHub and you search a machine learning project, they all reference CUDA drivers. And he's established this utter dominance of ecosystem around his compute within the ML space, really similar to, like, the x86 instruction set for CPUs versus ARM, right? Like, x86 is used predominantly. ARM has been trying to find its way into the space for a while now, and it's just really struggled, because all the engineers and developers are used to x86. Similar to how all the engineers and developers in the AI space are used to using CUDA. So it's something that, like, obviously AMD is highly incentivized to find a way into the sector, but they just don't have the ecosystem, and it's a huge moat to deal with. And, you know, kudos to Nvidia for establishing themselves and having the patience to stick with it and to continue to support that community over the last fifteen years, and it's really paying off for them in spades today. You know, if the demand comes for that infrastructure at some point, you know, we can run other pieces of infrastructure within our data center. But I also find that Nvidia has such an advantage on the competition with not only its GPUs, but all its components that support the GPUs, like the InfiniBand fabric, that it's gonna be a really difficult company to displace from the market in terms of the best standard for AI infrastructure.
Can I ask a question, and I want to ask this politely, because it's not intended to be accusatory or anything like that, so I don't want you to, you know, hear it that way. But, like, when you're talking about hyperscalers and you're like, you know, Amazon, Google, Microsoft, and, you know, kind of CoreWeave, it's like, okay, those are trillion dollar companies and you're a two billion dollar company. Like, why, like, I still don't think I can wrap my head around, like, and I know they're all talking about AI, et cetera. Like, can you just explain to me a little bit, like, why aren't they just gonna, frankly, like, steamroll you? Or, let's put it this way, okay, maybe it'll take a few quarters to re-evaluate things, but, like, you know, eventually this just becomes the sort of de facto offering from these big companies that have these huge cloud budgets that must be orders of magnitude larger than yours.
Yeah. Yeah, I would really love to be able to have access to their cost of capital, that's for sure. So, look, the way I like to talk about this is, we don't have a silver bullet necessarily, right? I can't point to, like, a super secret piece of technology that we put inside of our servers or anything along those lines. But the way I like to broadly contextualize it is referencing another sector, and it's that, like, Ford should be able to produce a Model Y, right? Like, they have the budget, they have the people, they have the decades of expertise. But in order to ask them to produce a Model Y, you would have to ask them to foundationally change the way that they produce a vehicle, all the way from research to servicing, and that entire mechanism. Like, it's a giant organization. Now you have to go ask that huge organization and its people to change the way that they go about producing things.
And I get that, but just to push back a little bit. And this is like a theme that comes up in various flavors on Odd Lots a lot, which is that, like, it's really hard to replicate sort of like tacit knowledge within a corporation. And we see that with companies that make semiconductor equipment. We see that with companies that make airplanes. We see that with real estate developers that know how to turn an office building into a condo. And so I think this is like a deep point. But, you know, they are offering AI stuff. Like, I can look at Google right now, like, there's Cloud AI, and there's Azure AI, and they all have their announcements. So I'm still trying to understand, like, what is it that you're offering? All the hyperscalers say they have an AI offering, so what is the difference between sort of like what you have and what they say is, like, their, you know, AI compute platforms?
Absolutely, and this will really depend on how much technical detail you'd like me to get into. But broadly, through infrastructure differentiation, like literally using different components to build our cloud, and through software differentiation, we use different pieces of software to operate and optimize our cloud, we're able to deliver a product that's about forty to sixty percent more efficient on a workload-adjusted basis than what you find across any of the hyperscalers. So, in other words, if you were to take the same workload, or, like, go do the same process at a hyperscaler on the exact same GPU compute versus CoreWeave, we're going to be forty to sixty percent more efficient at doing that, because of the way that we've configured everything relative to the hyperscalers. And it comes back to this analogy of, like, why Ford can't produce the Model Y. Again, like, they can. These are trillion dollar companies we're talking about. To your point, they have the budget, they have the personnel, and they certainly have the motivation to do so. But, you know, it's not just one singular thing they have to change. It's a completely different way of building their business that they would have to orchestrate. And it's, what's the analogy, however many miles it takes to turn an aircraft carrier, right? Like, it's going to take them a while to do that. And I think if they do get there at some point, which, you know, I don't disagree with you, they're certainly motivated to, it's going to have taken them some time, literally years, to get there, and they're going to look really similar to us. And meanwhile, I've dominated market share and I've really established my product in market, and I'll continue to differentiate myself on the software side of the business as well.
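One way to read that forty-to-sixty-percent figure, as an illustrative sketch only: treating "more efficient on a workload-adjusted basis" as the same job completing in proportionally fewer GPU-hours. That interpretation, and the hourly rate below, are assumptions for illustration, not details spelled out in the episode.

```python
# If the same reference job needs fewer GPU-hours, the effective cost per
# completed job falls even at an identical price per GPU-hour.
# BASELINE_GPU_HOURS and HOURLY_RATE are hypothetical placeholders.

BASELINE_GPU_HOURS = 1_000   # GPU-hours the job takes on the baseline cloud
HOURLY_RATE = 2.00           # hypothetical $/GPU-hour, same on both clouds

for efficiency_gain in (0.40, 0.60):
    gpu_hours = BASELINE_GPU_HOURS / (1 + efficiency_gain)
    print(f"+{efficiency_gain:.0%} efficiency: {gpu_hours:6.0f} GPU-hours, "
          f"${gpu_hours * HOURLY_RATE:,.0f} vs. ${BASELINE_GPU_HOURS * HOURLY_RATE:,.0f} baseline")
```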
Since we're on the topic of adaptation, can I ask about, you know, your own evolution as a company? Because I think I've read that you started out in Ethereum mining, and at one point I'm pretty sure crypto mining was a substantial, if not the biggest, portion of your business. But you have clearly adapted or pivoted into this AI space. So what has that been like, and can you maybe describe some of the trends that you've seen over your history?
Yes, absolutely, and you're right. We did start within the cryptocurrency space back in twenty seventeen or so, and that was spawned out of, just frankly, curiosity from a group of former commodity traders. So myself, my two co-founders, we ran hedge funds, we ran family offices, so we traded in these energy markets. So we were always attracted to supply-demand mechanics. But what attracted us to cryptocurrency was there's this arbitrage opportunity that was a permissionless revenue stream, right? Like, I knew the cost of power, I knew what the hardware could generate in terms of revenue using a power input. Thus it's effectively an arbitrage, right? So we explored that. We had some of the infrastructure operating literally in our basements, as you said. Then that quickly turned into scaling across warehouses, and at some point in, I think it was twenty eighteen, maybe late twenty eighteen, we were the largest Ethereum miner in North America. We were operating over fifty thousand GPUs, we represented over one percent of the Ethereum network. But during that whole time, we just kept coming back to the idea that there's no moat, there's no advantage that we could create for ourselves relative to our competitors, right? Like, sure, you could maybe focus on power price and just kind of chase the cheapest power, but that just felt like chasing to the bottom of the bucket, right? You know, I think an area we could have gone into is producing your own chips, right, because if you produce your own chips and you run the mining equipment before anyone else has access to it, then you have an advantage for that period. But, you know, we weren't going to go design and fab our own chips. So what we kept coming back to was this GPU compute. Man, what if we could do other things, right? Like, what if we could develop uncorrelated optionality into multiple high-growth markets, right? And those markets are where we predominantly sit today: artificial intelligence, media and entertainment, and computational chemistry. And the original thesis was, well, whenever our compute isn't being allocated into those sectors, we'll just have it mining cryptocurrency, and we'll build out this fantastic company that has a one hundred percent utilization rate across the infrastructure, because it could switch immediately from being released from an AI workload into going back into the Ethereum network. And we did get a brief glimpse of being able to operate that way in twenty twenty one, as we had our cloud live and we had AI clients in place, but Ethereum mining effectively ended during the Merge in Q three of twenty twenty two. But I'd say the other thing that we never appreciated was the utter complexity of running a CSP, forgetting about the software side of the business, which in and of itself, you know, we spent about four years developing the software to build a modern cloud, to do infrastructure orchestration and actually be a cloud service provider.
The components themselves that the sector broadly used for crypto mining were these retail grade GPUs, right, the kind of things that you plug into your desktop to go play video games.
They were like selling them on StockX. Yes, yes it was.
It was crazy during that period to get your hands on that infrastructure for crypto mining.
And all the video gamers hated the crypto people, right, because they're like, I want to play this game, and they would line up at, what is it, GameStop and, like, the geeky hardware shops and all that, or whatever it is, and they couldn't get it, because you got it, not you.
But yeah, they're arbitraging access to the chips first and getting more value out of them so that you could bid them up.
We were certainly part of the problem, and that's absolutely correct. But, you know, what we found ultimately is, like, those chips, that's not what you run enterprise grade workloads on. That's not what's supporting, you know, the largest AI companies in the world. And starting in twenty nineteen, we stopped buying any of those chips and only focused on purchasing enterprise grade GPU chipsets, of which, you know, Nvidia has probably about twelve different SKUs that they offer, including A100 and H100 chips, and really oriented our business around it. So I don't expect to see much repurposing of this kind of older retail grade GPU equipment that was used for crypto mining, because in crypto mining, you want to buy the cheapest chip that can do the thing, right, that can participate in crypto mining. But there's a huge difference in price between a retail, plug-it-into-your-computer-so-you-can-play-video-games chip and an enterprise grade chip you can run twenty-four-seven, where there's not going to be downtime, you're going to have a low failure rate. Like, there's a large technology difference and there's a large pricing difference between those. And the crypto miners only needed the retail grade chip, because, you know, if it went down for two percent, five percent of the time for a failure rate, that's not a big deal. But the tolerance, the uptime tolerance, for these enterprise grade workloads is measured in the thousandths of a percent, and it's a different type of infrastructure, so we don't expect to see the components really being reused, if at all. And then the other variable, going back to the very beginning of our conversation, is the data centers in which these are housed. So, Joe, to your point earlier, you know, we sit within Tier three, Tier four data centers, and that's basically the broad industry standard for being able to serve these kinds of workloads. The crypto miners sat within Tier zero, Tier one data centers, and these things are, like, highly interruptible. They do really interesting things, like helping load balance the power markets in places like ERCOT, right? Like, they'll shut down when power prices go too high, and it load balances the grid. But enterprise AI workloads don't have a tolerance for that. Their tolerance, again, is measured in the thousandths of a percent in terms of uptime. So not only does the infrastructure from crypto mining not work, but the data centers that they built within don't work either, the way that they're currently configured. Now, they could potentially convert their sites into Tier three and Tier four data centers. I'll tell you, that in and of itself is an extremely challenging task, and it takes a lot of proprietary knowledge and industry expertise to do so. It's not just throwing a few fans in a room and a few air conditioning units. Honestly, it feels like walking into a spaceship.
Tracy, this is, this is an episode...
I don't know about you, Tracy, there's like another six follow-on episodes. No, seriously, like the whole data center market and the coolant and all, you know, the electricity. Like, there's so many different rabbit holes you could go down, just with the infrastructure you're talking about.
For sure. And I think the estimates that I've seen on repurposing crypto GPUs, I think I've seen like five to fifteen percent, so to Brannon's point. But I'm sure there will be people out there who try.
You've got to try, right, because what if it works, right? If you can make that work, that's amazing. But, you know, coming as an entity that was an extremely large operator of that infrastructure and has built, you know, one of the largest cloud service providers for AI workloads, I can tell you it's gonna be really, really hard to do it, because we've had exposure in both those places, and at the end of the day, they're just very, very different businesses, both from the type of engineering and developers that you employ, to the infrastructure, to the data centers that you sit within.
So can I just go back, you know, yeah, just sort of like big picture, and I guess it sort of goes back to, like, who gets access to what?
Who gets access to chips?
And I imagine that, you know, not only do you need a lot of money to, like, build a relationship with, like, Nvidia, you also probably need, like, you know, an expectation you're going to be back the next year, back the next year, back the next year, and that you actually, like, have a relationship and so forth. But I have to imagine, like, planning is really tough, when, like, you know, you have this sort of like AI, machine learning, whatever, like, industry, and then something like ChatGPT comes out and, like, suddenly everyone's like, oh, I need to, like, have AI access. Talk to us about, like, the sort of, like, challenge of just planning to build when it can move that fast, and, like, everyone is just sort of guessing how big this market is going to be in two to three years.
Oh my gosh, it's been utterly insane, right? Like, you know, back to last year, you know, the supply chain and the ability to get your hands on components. You know, you would call your OEM. The OEM is the original equipment manufacturer, like, those are the Supermicros, the Gigabytes of the world, who actually, you know, build the nodes, build the servers, and you're buying through them, and then they buy the GPUs from Nvidia and build all the components together, right? So if you called them and said, hey, I need this many nodes to be delivered, they'll say, great, we'll start assembling. Takes us, you know, a week to two weeks to get the parts in and assemble them, and then it's another week for them to ship them to you, and then it takes us two to three weeks to plug them in and put them online, get them going. Right now, that's completely changed, as, you know, the whole supply chain has gotten thrown off, so much so that, you know, Nvidia is fully allocated, like, they've fully sold out their infrastructure through the end of the year, right? You can't call them, you can't call the OEM and just say you need more compute chips like that, that's not possible. So much so that, you know, when clients are coming to us today and they're asking for, like, a four-thousand-GPU cluster to be built for them, we're telling them Q one, and increasingly it's moving towards Q two at this point, because Q one is starting to get booked up right now. So it's something that a lot of time has been added to. And then there's other supply chain variables within there as well. You know, we had a client earlier this year that we were in negotiations with on the contract, and, you know, we really wanted to perform well on timing for it. So we knew, because of our orientation within the supply chain, that there were some critical components that needed to be ordered ahead of time, so that it would reduce our time to bring the infrastructure online. And at that point it was the power supply units and the fans for the nodes that the OEMs were putting together, and if we hadn't done that, it would have been another, I think, eight weeks on top of the build process, just because not all the components would have been there at the same time. So you're navigating this, you know, within other kinds of global supply chain disruptions and inflation and all these other things that are going on right now, and it's just an insanely complex task. And I think, you know, the generation of software developers and founders that we're working with today were used to being able to go to a cloud service provider and just get whatever infrastructure they needed, right? You go to your hyperscaler and say, all right, I need this, and it was just there and available. And that just doesn't exist today, because of the pace of demand growth that we've been on and just the lack of this infrastructure's availability, and it's just caught everyone by surprise. Again, you're asking infrastructure to keep pace with the fastest adoption of a new piece of software that's ever occurred.
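Summing the lead times walked through above gives a feel for how the "old normal" compares with a missed long-lead component. The week ranges and the roughly eight-week penalty are from the conversation; breaking them into named stages is just for illustration.

```python
# "Old normal" pipeline: parts -> shipping -> rack and bring online,
# plus the ~8 extra weeks a missed long-lead part (PSUs, fans) would add.

old_normal_stages = [
    ("OEM sources parts and assembles", (1, 2)),  # weeks (low, high)
    ("Shipping to the data center",     (1, 1)),
    ("Rack, cable, bring online",       (2, 3)),
]

low = sum(weeks[0] for _, weeks in old_normal_stages)
high = sum(weeks[1] for _, weeks in old_normal_stages)
print(f"Old normal, end to end: {low}-{high} weeks")

MISSED_LONG_LEAD_PENALTY = 8  # extra weeks if PSUs/fans were not pre-ordered
print(f"With a missed long-lead component: {low + MISSED_LONG_LEAD_PENALTY}-"
      f"{high + MISSED_LONG_LEAD_PENALTY} weeks")
```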
Brannon McBee, CoreWeave, thank you so much. That was a great conversation. Like I said, I always sort of measure the quality of a conversation by, like, do I get seven...
How many additional episode ideas?
Like, that is a pretty good proxy for a good conversation. Do you get like eight ideas for future episodes? We got a bunch there. So thank you so much for coming on the podcast.
Always happy to chat with you guys, and thank you for the invitation.
Tracy.
I want to find that company that makes the coolant for the data... no, seriously, for the data centers, that allows them to pack more compute and more energy into this space, because it feels like they're probably going to make a fortune over the next...
Joe, I think you just want to talk to an HVAC contractor that's, like, installing it all.
Can we talk to one?
Just some random, like... I love the... it was such a funny thought, like, these really advanced data centers, like, oh, do we have, like, a local air conditioning guy who can, like...
But I imagine, actually, that would have been a good question for Brannon, wouldn't it? Like, the labor constraints in building and adapting those data centers. But there was so much in there. One of the things that I was thinking about was the point about how, well, okay, if you train a model on one type of chip, you're going to keep using that type of chip. And I guess it's kind of obvious, but it does suggest that there's some stickiness there. Like, if you start out using Nvidia H100s, you're going to keep using them, and in fact, you're going to consume even more, because the processing power required, the compute required, for the inference is higher than for the actual initial training.
Which, I knew that was the case because Stacy said so as well, but I did not realize quite the scale of, like, how much more. Like, okay, if you train a model and then we try to take it to market, productize it as a business, how much more computing power we would need for the inference aspect. And meanwhile, we have to keep training it all the time to keep it up to date with fresh data and stuff like that.
Yeah, totally. And the other thing that I was thinking about, and again, Stacy mentioned this in our discussion with him as well, but this idea of Nvidia building a kind of large ecosystem around the hardware. So you have the open source software, CUDA, which we talked about a little bit, and then you have these sort of high-touch partnerships with companies like CoreWeave, where they're trying to make it as easy as possible for you to use their chips and set them up in a way that works for you. It feels like, maybe it feels almost like what Bitmain used to do.
Do you remember that? Uh, no.
Maybe they're still doing it anyway, but it does feel like they're trying to build this, like, ecosystem moat around the chip technology.
Yeah.
No, absolutely true.
And, you know, I really do take that point that Brannon made about, like, every company has a sort of knowledge that cannot be written down on a piece of paper. Yeah, which is a Dan Wang point that we've been talking about for years. And so it's like, to your point, you know, like, you have to use different types of connectors and different types of power and all this stuff. Like, the ease with which any sort of traditional cloud provider or data center provider can, you know, sort of switch to it, like, you know, it's not trivial even with lots of...
But coming away, I'm coming away from that conversation thinking, like the big question here is how quickly can those other hyperscalers adapt and like how big a moat can Nvidia build around this business?
And then, I mean, the other question I have is, like, what if none of these companies make any money building AI models? Like, I still don't think that's been proven. And so you can have this huge boom and, like, hey, we've got to go build an AI model is what we're going to do, like, you know, Odd Lots GPT for, like, data stuff and whatever. But it all is somewhat predicated on these companies being successful and making a lot of money. And if they're not, and if it turns out that, like, the monetization of AI products is trickier than expected, then that also raises this question about, like, how long this...
Like, I'm sorry, Joe, so you're saying that tech companies should make money? Is that it? Are you sure?
That's right. That's real post-ZIRP thinking, isn't it?
I know, all right? Shall we leave it there?
Let's leave it there.
This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me on Twitter at Tracy Alloway.
And I'm Joe Weisenthal. You can follow me on Twitter at The Stalwart. Follow our guest Brannon McBee.
He's at Brannon McBee.
Follow our producers Carmen Rodriguez at Carmen Arman and Dashiell Bennett at Dashbot. And check out all of the Bloomberg podcasts under the handle at Podcasts. And for more Odd Lots content, go to Bloomberg dot com slash odd lots, where we have transcripts, a blog, and a newsletter that comes out each Friday. And check out our Discord. We have an AI channel and a semiconductor channel in there, so people talk about these topics twenty-four-seven.
Maybe they'll be talking about them in both of those rooms when this comes out. Discord dot gg slash odd lots.
And if you enjoy Odd Lots, if you appreciate conversations like the one we just had with Brannon McBee, then please leave us a positive review on your favorite podcast platform. Thanks for listening.