Why Cerebras CEO Andrew Feldman Built The World's Largest Computer Chip

⁠¶ Intro / Opening

Speaker 1

00:02

Bloomberg Audio Studios, Podcasts, Radio News.

⁠¶ AI's Expanding Market Influence

Speaker 2

00:18

Hello and welcome to another episode of the Odd Lots podcast.

Speaker 3

00:22

I'm Joe Weisenthal. And I'm Tracy Alloway.

Speaker 2

00:25

Tracy, I have to say, unfortunately, I don't have AI psychosis.

Speaker 4

00:29

I'm certain of that. Debatable.

Speaker 2

00:31

I'm pretty sure. I'm pretty sure I don't have AI psychosis. I do have to say, unfortunately, it feels like AI-related questions, and there are many of them, are sort of swallowing up the other thoughts that I have in my head, whether it's questions about which model's best and why, or what the economics of inference are, or how much training is pre-training versus post-training for each model.

Speaker 4

01:02

Like, it's just sort of like this blob that's growing, that's taking up more and more of my thoughts.

Speaker 3

01:07

What is your definition of AI psychosis? Because one would argue that maybe thinking about AI literally all the time would be a form of psychosis.

Speaker 2

01:16

Well, let's just say, like, I'm not the type who thinks that the AI is a friend, for one thing. I'm not in love with the AI models. I don't think that, in collaboration with ChatGPT, I'm stumbling on a unified theory of physics and things like that.

Speaker 3

01:34

So, like, but you do spend a lot of time inputting instructions, pressing the button, seeing what comes out.

Speaker 4

01:41

See what comes out.

Speaker 2

01:41

I'm just saying I think I'm aware that I'm talking to a machine and that we're not establishing any great breakthroughs of which we are collaborators and partners and friends.

⁠¶ Introducing Cerebras and Giant Chips

Speaker 3

01:51

Recognizing you have a problem is the first step towards healing, Joe. Seriously, though, there's a good reason to think about AI more and more, which is that a huge chunk of not just the market, but the real economy, is now revolving around AI, right? Totally.

Speaker 2

02:05

So anyway, again, within the AI conversation, there are a lot of subcategories. One of the subcategories happens to be another Odd Lots favorite topic, which is chips. Of course, chips are used in multiple different ways. The chips are used in different parts of the AI supply chain, different types of chips for different roles, and so we have to learn more.

Speaker 3

02:25

We have to learn more, and I have to say I'm particularly interested in the company we're about to speak to, partly because the two things I know about them are, number one, they just had a huge IPO. Yep. Right, raising something like five point five billion dollars at a kind of insane multiple. I can't even do a price-to-earnings multiple because they're not profitable yet, but I think just on a sales basis, it was like sixty-seven

02:49

times forward earnings, which is pretty juicy, pretty hot. And the second thing I know about the company is they make giant wafers, which is just a fun image in your head.

Speaker 5

03:02

That's right.

Speaker 2

03:02

So if you were thinking, okay, there is a hot entrant in this space, what is their differentiator? Well, one fact about them is their chips are just enormous. About the size of a dinner plate. One might think you're reading an Onion article, but in fact it's real, and apparently it actually has some real technical advantages.

Speaker 3

03:22

And it's different from what everyone else is doing. So everyone else is, I guess, doing this sort of modular networking thing where you get together a bunch of chips and you connect them together, and that's how you get more compute, more memory, more power, basically. But this company has done something different in the form of the giant wafers.

Speaker 2

03:40

The giant wafer, and if you figure that to get maximum performance, you sort of want to lessen the distance between things, then put it all.

Speaker 4

03:48

On one wafer.

Speaker 2

03:48

Anyway, we're gonna learn a lot more. I'm very sad about giant wafers.

Speaker 5

03:53

More.

Speaker 2

03:53

I'm very excited to say we do have the founder and CEO of Cerebras on the podcast, Andrew Feldman.

Speaker 4

03:59

Truly the perfect guest.

Speaker 2

04:00

So, Andrew, thank you so much for coming on the podcast on the week of your IPO.

Speaker 5

04:05

Well, thank you so much for having me. What a pleasure.

Speaker 2

04:07

Absolutely. Why don't you just start us

Speaker 4

04:09

Off the big giant chip.

Speaker 2

04:11

They're apparently real. They're as big as a dinner plate.

⁠¶ The Wafer-Scale Architectural Advantage

04:14

What is the technical reason why this actually makes sense as a superior form of architecture for at least some aspect of AI?

Speaker 6

04:25

I think larger chips process more information in less time, okay, and

Speaker 5

04:29

That produces faster results.

Speaker 6

04:32

And everybody had gone to bigger chips. Nvidia had moved from four hundred square millimeters to eight hundred square millimeters over

Speaker 5

04:41

The course of five or six years for.

Speaker 6

04:43

This exact reason, and in the compute industry wafer scale, which is building a chip for.

Speaker 2

04:50

By the way, for those who are just listening, Andrew's now holding up the chip, and yes, it actually looks bigger than a dinner plate, to be honest.

Speaker 4

04:57

But that is a big that's a big chip.

Speaker 5

05:00

That's fifty think.

Speaker 6

05:03

It's fifty-eight times larger than any other chip that had ever been built. Wow. And what it did was it allowed us to use a different type of memory. There are two types of memory. There's memory that can store a lot, but it's really slow, and there's memory that can't store very much per square millimeter, but it's blisteringly fast. And historically all graphics processing units use this memory that could store a lot but was really slow, and that's the

05:37

reason they do inference so slowly. So if you're using Claude right now, or you're using anything but ChatGPT, what you frequently feel is you'll enter your prompt and you'll wait for an answer, right? And that's because the memory is slow and they have to move a ton of information from memory to compute. Now, by going to

05:58

wafer scale, we could use this fast memory. Now, we couldn't make that memory store more information per square millimeter, but we could add square millimeters, and so by building this big chip, we were able to stuff it to the gills with this fast memory. And that's why we're fifteen times faster than the fastest GPU. That's why on some problems we're fifty, one hundred, even one thousand times faster than graphics processing units.
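
The memory argument above can be put as a rough back-of-the-envelope bound: if generating each token requires streaming the model's weights from memory to compute, the ceiling on single-stream speed is roughly memory bandwidth divided by bytes read per token. The Python sketch below is illustrative only. The function name, the 70-billion-parameter model, and both bandwidth figures are assumptions chosen for round numbers, not specifications of any Cerebras or Nvidia product.

```python
def max_tokens_per_second(bandwidth_bytes_per_s: float, bytes_read_per_token: float) -> float:
    """Upper bound on single-stream decode speed when memory traffic is the bottleneck."""
    if bandwidth_bytes_per_s <= 0 or bytes_read_per_token <= 0:
        raise ValueError("bandwidth and bytes per token must be positive")
    return bandwidth_bytes_per_s / bytes_read_per_token


# Hypothetical model: 70 billion parameters stored at 2 bytes each,
# all of which are read once per generated token.
model_bytes = 70e9 * 2

# Hypothetical memory systems: a slower, high-capacity memory and a
# faster memory with ten times the bandwidth.
slow_tps = max_tokens_per_second(3e12, model_bytes)
fast_tps = max_tokens_per_second(3e13, model_bytes)

print(f"slower memory: about {slow_tps:.0f} tokens per second")
print(f"faster memory: about {fast_tps:.0f} tokens per second")
```

Under this simplified model the speedup tracks the bandwidth ratio exactly. Real systems also depend on batching, caching, and interconnect, which the sketch ignores.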

⁠¶ Overcoming Wafer-Scale Engineering Challenges

Speaker 3

06:28

Wait, can you explain how you actually managed to do this? Because I know there have been previous attempts to do wafer scale, and I seem to remember there was even like an early attempt in the nineteen eighties or something to do it. How are you able to pull this off?

Speaker 6

06:42

Yeah, it was an ambitious undertaking, that's for sure. Every previous effort in the seventy-five-year history of our industry had failed, including Gene Amdahl, who's sort of on the Mount Rushmore of compute in our industry. He failed sort of spectacularly in the mid-eighties at a company called Trilogy. Not only that, but after we succeeded, people who had visited us, who'd been in our labs, tried

07:08

to copy us, and they also failed. And so what we were able to do is solve a set of really fundamental problems, and those problems cut across a wide swath of technology. They cut across lithography, so we had to collaborate closely with TSMC, and they turned out to be a great partner. We had to make inventions in material and packaging. That's how you put a piece of silicon on a motherboard and deliver power and I/O to it. We had to make

07:40

inventions in power delivery. Right, when you build a giant chip, you're going to deliver way more power to it than if you do a chip the size of a postage stamp. We had to invent ways to cool it. We had to write new types of

Speaker 5

07:52

Software that ran on it.

Speaker 6

07:54

All of these had never been done before, and it was a decade-long process. It took us five years and about five hundred million dollars to deliver the first one, and it's been an extraordinary run since. In December, we signed a deal with OpenAI north of twenty billion dollars, one of the largest contracts ever signed in Silicon Valley, and then in March we signed a deal with AWS where they would deploy our systems in their AWS data centers, and so it's just

08:27

been an extraordinary run. But it took a long time. It took extraordinary engineering, and there were certainly long periods of time when it wasn't clear we were going to make this work.

⁠¶ Cerebras's Focus on AI Inference

Speaker 2

08:38

Obviously you've hit this remarkable milestone. You have in fact IPO'd and so forth, and right now the market's valuing your company at sixty-four billion dollars, early days of the IPO. Just for the listener to understand, are the chips solely for inference as opposed to, you know, training? When we think about AI, I think about, okay, there's training, training the model, and then answer giving, that's the inference.

Speaker 4

09:03

Are the chips just for inference?

Speaker 5

09:05

So, a couple things. I think you framed it exactly right.

Speaker 6

09:08

Training is how we make AI, and inference is how we use AI. And so what happened was that in sort of twenty twenty five, in the first part of twenty twenty five, the models we made were smart enough to be useful, and there was an explosion of use. And we use AI by doing inference. So there was this sort of tidal wave of demand on inference, and that has continued in twenty twenty six, and we think it will continue for years and years to come. And

09:38

so that's what had happened. In twenty fifteen, when we began thinking about the company, we knew that AI was on the horizon and that it would eat a huge amount of compute, right? And we made sort of two fundamental bets. We bet that it would need dedicated silicon. Right, graphics had needed dedicated silicon, that's how you got

Speaker 5

10:02

The graphics processing unit.

Speaker 6

10:04

Mobile compute had needed dedicated compute.

Speaker 5

10:07

That's where you got ARM processors.

Speaker 6

10:09

We made that bet, and we made a bet that modifying the GPU architecture wouldn't be right. You needed to start with a clean sheet of paper. And so what we started with was a new vision, and that vision could do training and it could do inference, and it

10:25

was orders of magnitude faster at both. But right now what we're seeing is such an explosion in demand for inference that a lot of the business this minute is inference, even though we're the same amount faster than GPUs on training.

⁠¶ The Indispensable Role of Speed in AI

Speaker 4

10:43

That's interesting.

Speaker 2

10:44

Maybe we'll get more to the theoretical training market a little later.

Speaker 4

10:47

Just real quick on inference.

Speaker 2

10:49

Ben Thompson, who writes a newsletter about tech, wrote a piece in which he distinguishes between answer inference and agentic inference. So answer inference is like, format my resume or whatever, or write me an essay on X or Y, or answer some questions, and then agentic inference is like, okay, here's this thing that's going to go around and do services for you, not producing visual answers.

Speaker 4

11:15

Do you distinguish between those two? Is that a real divide in your view? And can your chips do both?

Speaker 6

11:22

Our chips can do both. I think it is a divide, Okay. I think speed.

Speaker 5

11:27

Matters equally in both.

Speaker 4

11:29

Okay.

Speaker 6

11:30

I think if you are engaged with the AI, if you're writing code, which is agentic, if you're writing code or you're doing work, nobody wants to wait. I mean, we could just turn the question around and say, well, how big is the market for slow search? Zero. How big is the market for dial-up internet? Zero.

Speaker 5

11:49

Why is that? Because nobody wants to wait, right?

Speaker 6

11:52

So, if you're engaged with the AI, speed is of the essence. But if the AI is doing agentic work and your competitor gets three times, five times, ten times as much work done in twenty minutes as you do, you're gonna get smoked. And so this notion that's somehow been proposed, that speed isn't very important in agentic flows, is dead wrong. Speed is important in all aspects of productive work, and your ability to get more done in less time is a fundamental advantage that accrues

12:29

over time. Right? If, while your competitor is doing one unit of work, you can do three, and in the next time they do one unit of work, you do six. Sure. Right, this adds up over time and you beat them in any line of work. And so speed, which is sort of our specialty, is important across the board.
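
The compounding claim above is simple arithmetic: at a constant speed advantage, the gap in completed work widens every period. A minimal Python sketch, assuming a fixed rate per period (the function name and the four-period horizon are hypothetical, chosen only to mirror the one-versus-three example):

```python
def cumulative_work(rate_per_period: int, periods: int) -> list[int]:
    """Total units of work finished by the end of each period at a constant rate."""
    return [rate_per_period * n for n in range(1, periods + 1)]


# The competitor finishes one unit per period; you finish three.
competitor = cumulative_work(1, 4)
you = cumulative_work(3, 4)

# The lead is the difference in finished work at the end of each period.
lead = [y - c for y, c in zip(you, competitor)]

for period, (c, y, gap) in enumerate(zip(competitor, you, lead), start=1):
    print(f"period {period}: competitor={c} you={y} lead={gap}")
```

The ratio between the two stays at three, but the absolute lead keeps growing, which is the sense in which the advantage "adds up over time."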

⁠¶ Economics of Fast AI Tokens

Speaker 3

12:53

What do giant wafers and speed in general actually mean for, I guess, the economics of tokens? Because one way I think about it, I have this sort of vision in my head, like, okay, if I'm out shopping for toothpaste, I know I need toothpaste every once in a while, and I go into like a CVS store, I get one thing of toothpaste, and then maybe a week later I get some more toothpaste. Or I could go to Costco and buy a giant thing of toothpaste and take it home, probably at a cheaper cost. And that's

13:22

sort of how I think of the giant wafers. Maybe it's a bad analogy, but what does speed actually mean for the cost of tokens?

Speaker 6

13:30

Well, I think there are a couple observations. I think people have chosen so far to price speed a little higher. For example, Anthropic offered a premium service in which they offered tokens twice as fast and charged six times as much, and they sold it out and they couldn't meet the demand. Now, just to give you an idea, we're fifteen times faster than their twice-as-fast, and so people value speed because it allows them to do more work and they value their time. And when you can do more work

14:07

in less time, you are making people more productive. That's why people have chosen to price them at a premium.

Speaker 5

14:13

They don't cost more to make.

Speaker 6

14:16

In fact, the GPU architecture is an extremely good architecture and extremely efficient at building very slow tokens. And if you don't mind slow, the cost per token on a GPU is extremely low. But the GPU has a characteristic that as you try and go faster, the cost and the power used per token increase, sort of like as you go faster in your car, your miles per

14:46

gallon decrease. Right, so what happens is as you try and get fast enough to be useful, fast enough to be interesting, fast enough to keep users intelligence focused on this product, they become extremely expensive and extremely power hungry. And so the question is not just what people are paying for a token, what people are choosing to price them at, but what they actually cost to make.

Speaker 5

15:14

And GPUs make very

Speaker 6

15:16

Slow tokens very cheaply, and they're unbelievably expensive at fast tokens. We make fast tokens vastly less expensively than GPUs, and we use a tiny fraction of the power.

⁠¶ Navigating AI Chip Supply Constraints

Speaker 2

15:43

Let's say we stipulate that this is all true and everyone wants the fastest, and everyone's like, you know what, this is the solution, the Cerebras technology, one big chip. This is really where it's at. How much of your market share for the inference market, when you look out next year, the year after, et cetera, how much is your market share going to be dictated by your ability to get capacity at TSMC fabs? How much is that a gating mechanism for growth?

Speaker 5

16:15

You know, TSMC is a huge part of the supply chain. Yeah, but we have some real advantages.

Speaker 6

16:21

There are three areas right now that are limiting vendors in building AI compute.

Speaker 5

16:28

Number one is HBM memory.

Speaker 6

16:31

It's this memory we described earlier that can store a lot, but it's really slow. That's made by approximately three companies, Samsung, SK Hynix, and Micron, and it's under unbelievable supply pressure. It's extremely difficult to get. There are very long lead times. It's unbelievably expensive right now. We don't use it. The second part that's limiting is a process inside of TSMC called CoWoS, and this is the process that Nvidia and other GPUs use.

Speaker 5

17:02

We don't use it.

Speaker 6

17:03

The third thing is that at TSMC, the factory that is under most pressure is their three nanometer factory.

Speaker 5

17:12

We don't use it. We use five nanometer.

Speaker 6

17:14

So we have managed to avoid some of the most binding supply constraints. Now, TSMC still has to give us a meaningful allocation, and they've been an extraordinary partner from the get-go, and they are the greatest manufacturing company on earth by far. A fab is sort of a modern pyramid. It's an unbelievable thing. And I highly recommend you or any of your listeners, if you get a chance to go to Taipei, go and see them.

Speaker 5

17:42

They are just extraordinary.

Speaker 3

17:44

Can you do fab tours? You can't. Actually, you can't, but you can go, and they have a Museum of Innovation, and it is an extraordinary thing.

Speaker 6

17:53

They are sort of the national champion of Taiwan. But I think today TSMC has given us as many wafers as we've needed. Business today is constrained by data centers, and that's the grand irony, right? You invent technology that has been unbuildable, never been invented for seventy-five years in the history of compute. You write software that is extraordinary. You build a product that is vastly faster.

Speaker 5

18:19

Than the incumbent. And what are we all constrained by? Buildings, right?

Speaker 6

18:24

The data centers. Right now everybody in the entire industry is constrained by powered buildings. So, real estate.

Speaker 5

18:31

It is an amazing thing right now.

Speaker 6

18:33

And that is true sort of across the board, and that will not change for the next fifteen or eighteen months for sure.

Speaker 3

18:41

I mean, since we're talking physical constraints, I guess I should ask you. We did an episode about helium recently, a helium shortage given the situation in the Strait of Hormuz, and one of the things that helium is used for is lithography on semiconductor chips. Has that affected you at all, or is that something that you're monitoring?

Speaker 6

19:01

We monitor, but there's not a lot we can do, and there's plenty of stuff to worry about that we can't affect. We obviously are in communication every day with TSMC.

Speaker 5

19:12

We're in communication with our entire.

Speaker 6

19:14

Supply chain every single day, and we stay abreast of the various issues. But it has had no impact on us, and we put that in the bucket of things that our manufacturing partners worry about also and that we can't help.

⁠¶ Open vs. Closed Source AI Models

Speaker 2

19:31

You know, so, in addition to manufacturing these chips, you actually, I didn't realize this, you have your own cloud. We do. You have your own cloud services, which I have a bunch of questions about. You have your own cloud services through which a user can actually get access to various open source models and so forth.

19:52

Visually, it looks a lot like the OpenRouter interface, roughly the same environment, except, like, open source. Something I'm curious about, and maybe you could speak to this. You know, in traditional software, open source, one nice thing about open source is you don't have to pay for it, so it's free.

Speaker 5

20:13

It's a little bit.

Speaker 2

20:14

Different when we're talking about AI. There's really no such thing as free AI software, because even if it's free, you still have to pay for the depreciation of the chips and you have to pay for the electricity to run them. So there's no real such thing as free open source AI software. But what I am curious about, in your experience as a cloud vendor, is: are the open source models cheaper on a per-unit-of-intelligence basis?

20:40

If we had some way of saying levelized cost of intelligence, which I don't know if the industry has yet, are open source models cheaper per IQ point, or however we want to measure intelligence?

Speaker 5

20:54

Yes, by a lot.

Speaker 6

20:55

Really? Yeah, I think in the closed source world you're paying a lot for that extra little bit of intelligence. Right, the open source models. There are no open source models that are as good as

Speaker 5

21:06

The closed source models.

Speaker 6

21:07

Okay, think of it as three, four percent, five percent different. Okay, something in that range, and it could be a little more, could be a little less, but then there's the cost to you of using them. You can jump up right now and run Kimi K2. It's a one trillion parameter model. It's an open source model on Cerebras, where we're ten or fifteen times faster than others. And what you're paying for is the cost of our power and some

21:35

cost of the compute that it took to calculate it. What you're not paying for is the cost to train it. And that's a battle that

Speaker 5

21:43

Is underway in the market.

Speaker 6

21:45

You have OpenAI with their coding software, you have Anthropic with their coding software. And you've got companies like Cursor and Cognition that are using open source. We power OpenAI and we power Cognition. You have a battle underway between closed source and open source, and I think that the winner of that battle is yet to be determined. What is clear is that closed source is strictly better by a little bit, by how much varies, and it's more expensive.

Speaker 3

22:19

Yeah, I think we've talked about this before, but like I've heard of a lot of big companies in the US who have been like very quietly shifting from some of the closed source models to the open source models, like the Chinese ones, like Kimi and Qwen. I'm sorry to press you on this point, but if you had to make a bet, like in twenty years, is the dominant AI model going to be a cheap open source thing or a more expensive, incrementally better closed source model?

Speaker 5

22:46

I don't think there's going to be one. Right, There's not one SaaS software. Right. There's some big dogs right.

Speaker 6

22:52

There's Salesforce, there's some other sort of giant players, and there are lots of other specialists. I can't think of many markets where we've sort of settled onto one player.

Speaker 5

23:02

Right. If you look at the semiconductor market, you've got x eighty six, where you've got two major players.

Speaker 6

23:07

In AMD and Intel, and then you've got a whole adjacent market owned by ARM and the companies that build ARM parts, and then you've got custom silicon around that. I think that's the way you're going to have this. You know, OpenAI is going to continue to do extraordinary things. There will be competitors to them, and there will be open source. I don't think any of those go away.

⁠¶ Debunking Nvidia's CUDA Dominance

Speaker 3

23:31

Since we're on the topic of software, one of the things you often hear when talking about, you know, new chip entrants going up against Nvidia, is this idea that, well, you know, Nvidia chips, they're great and all, but the real moat of Nvidia's business is CUDA, right, the software stack that goes with it. What's your take on that? Like, is that a realistic concern for someone who's trying to

23:57

go up against a company as big and, I guess, as embedded in the software system as Nvidia currently is?

Speaker 6

24:04

Nvidia is probably the greatest company in the first part of this century, right? You know, Jensen's one of the great CEOs of our era, along with Hock Tan at Broadcom and maybe Lisa at AMD.

Speaker 5

24:15

Just extraordinary.

Speaker 6

24:17

And CUDA was really important in the creation of the AI landscape, but it's not important now, and it has no role whatsoever in inference.

Speaker 5

24:27

If you want to move from running.

Speaker 6

24:30

A model on GPUs today to running it on us, we can move it in ten keystrokes. Just point to our API. So that's the first part. The second part is that a year ago, every major frontier lab model had been built on a CUDA foundation, and today two of three haven't, so they lost seventy percent market share. There are three leading frontier models: Gemini, Claude, and GPT. Gemini, built by Google on TPUs, trained on TPUs, served on TPUs.

Speaker 5

25:11

No CUDA.

Speaker 6

25:12

Anthropic's models, trained on Trainium, no CUDA, served on TPUs, on Trainium, and on GPUs. And OpenAI's GPT, trained on GPUs in the CUDA environment. So two of the three leading models today used no CUDA. That's hemorrhaging share. And so I think what was true three or five years ago, in which CUDA had a dominant position and was central, has shrunk significantly. It's not important at all in inference and shrinking in its role in training.

⁠¶ Financialization of Compute Markets

Speaker 2

25:51

You know, since we're talking about the economics of inference and all this stuff, I would love to get your take on one of the things where, literally in the last couple of weeks, there's been this flurry of announcements of these attempts to financialize the market for compute. And so it's like, oh, you're going to buy some capacity, the H one hundred benchmark, et cetera. And people want

26:18

maybe theoretically hedging it. I'm not entirely convinced. It still seems to me like, maybe. But on the other hand, an inference provider can lock in a very long-term relationship bilaterally with the data center and so forth, with no need for these spot hedging markets. Do you think the market is going to evolve in such a way that there will be significant demand for financial instruments that allow inference providers to hedge their price exposure?

Speaker 5

26:50

I don't know. I'm not a financial engineer. That's the first thing. Okay. But we can look a little bit at history.

Speaker 6

26:55

The guys at CoreWeave were enormously innovative in how to fund some of their massive deployments. They were some of the first to use a debt instrument that had a backstop with the GPU, and this enabled them to really leap out and sort of have first mover advantage in the neocloud space.

Speaker 5

27:19

And that was an innovation in financial.

Speaker 6

27:21

Engineering, and extremely creative. Others followed, and now there's a big and active debt market in funding the building and the fit-out of data centers. When you have a market that is that big and that active, you have people who want to make bets on either side, and I think over time those bets normalize and regularize, and you can wrap them up and you can make it easy to make the bet. When sort of CO two was one of the first to loan money against GPUs for

27:54

CoreWeave, this was really innovative. And not only does CoreWeave get credit for the creation of the instrument, but so does the other side of the deal for doing it and making a successful innovative bet. And as sort of more and more people jumped in, these could be regularized, they could be more easily priced, and then once it's regularized and you have a market, then

28:19

derivatives of that market are easy to make, historically. And that's sort of the way I see this unfolding, that as this market for data centers and compute matures, there'll be people making bets on either side, and financial instruments

Speaker 5

28:36

Will be created to do it.

Speaker 6

28:38

Whether it's a good idea or not, I have no opinion at this time.

⁠¶ G42 Partnership and Data Sovereignty

Speaker 3

28:41

Since we brought up finance, I was looking through the IPO filing and looking at some of the actual numbers in there, and I know you have the OpenAI deal now, but a huge chunk of your revenue comes from this company called G forty-two in Abu Dhabi, and I think they're both like your biggest customer and also a major investor. What does G forty-two actually do with all these chips?

Speaker 6

29:07

Sure. Last year they were a really important chunk of our business, a lot of it.

Speaker 5

29:14

They're a minority investor.

Speaker 6

29:17

They are the national champion, the national AI champion of the UAE, and they build a cloud that is used across the UAE's ecosystem.

Speaker 5

29:29

So it's used by leading universities there. It's used by leading.

Speaker 6

29:33

Companies there, companies like ADNOC, their leading oil company. It's used by G forty-two's nine operating companies.

Speaker 5

29:44

The deployments to date have been in the US.

Speaker 6

29:47

We have massive data centers that run equipment for G forty-two here in Santa Clara, but also in Minneapolis and Dallas, stag this soon in Toronto, and so they're doing training and they're doing inference. In the training they're doing, they have pioneered some of the leading English-Arabic models. They've done genomic work. They are serving models, and they're operating as a cloud, particularly for the UAE ecosystem, but also for global companies.

Speaker 2

30:40

Do you think that over time, corporate users, and perhaps individual users, but corporate users, will want inference served from a company that's separate from the model maker, such that they can be certain that they are not revealing their data to, and thus training, the company that might replace them?

Speaker 4

30:59

I mean, look.

Speaker 2

31:00

Anthropic every couple of days announces some new thing. Oh, we have a new markdown file that could do this for tax, or that could do this for whatever.

Speaker 4

31:09

And then a bunch of companies.

Speaker 2

31:10

File. Like, are companies that use AI increasingly going to want to use data centers and inference providers that aren't the model makers themselves?

Speaker 6

31:23

Well, first, I think there is a type of professional, a type of job that is most directly under threat from AI, and they're almost always white collar, and they require you to have expertise over a body of knowledge. Right, that's what an accountant is, right? You have expertise over a body of knowledge of rulings, of previous examples, of tax case law, et cetera. That's exactly what AI is good at right now. Exactly. So lawyers, accountants.

32:01

There's sort of these professionals who have stood between the ordinary person, who doesn't know anything about IRS tax rules, and the tax rules. That is under threat, and that is something that it will be very easy for companies like OpenAI and Anthropic to chew through. There are other areas, like, say, drug design, genetics, genomics, where companies like GlaxoSmithKline have remarkable and

Speaker 5

32:37

Unique data sets.

Speaker 6

32:40

This is true for one of our large customers, Mayo Clinic. It's true for GlaxoSmithKline and others of our pharma customers.

Speaker 5

32:47

They have unique.

Speaker 6

32:48

Data and they will be able to find insight in that data, and they will be able to get value from that data, and they will certainly.

Speaker 5

32:59

Not want to share that data with the foundation.

Speaker 6

33:03

Model makers unless they are guaranteed that it will not sort of make the general model smarter. And these are companies that have spent twenty or thirty years spending tens of billions of dollars a year gathering data, right, patient care records or test results for drug design. They're going to mine the insight in this work and they're going to find extraordinary things, and those are much more protected because the insight's in the data and they have the data.

⁠¶ Challenges in US Chip Manufacturing

Speaker 3

33:37

You know, you were talking about fabs in Taiwan earlier, and I'm now regretting not going on a fab tour when I was in Taipei, but it just didn't cross my mind at that time. Next time, yeah, hopefully. There have been various efforts under the CHIPS Act and some other industrial policies to try to build more chip making capacity in the US. In your view, what's the big, I guess, impediment to actually doing it? Yeah, A, is it happening? And then B, why does it seem so difficult to actually make happen?

Speaker 5

34:11

Right?

Speaker 6

34:12

The first thing is it's difficult because it's a difficult problem. They're hard in that they cost thirty or forty billion dollars and take five or six years to build. So that amount of money and that amount of time cuts across administrations, right? And that's a problem with the politics in the US: it's hard to make policy that's durable across administrations and across time.

Speaker 5

34:37

The first thing.

Speaker 6

34:39

The second thing is these are remarkably complicated buildings, and we have a sort of a hodgepodge, a sort of strange lattice work of local regional building codes.

Speaker 5

34:55

That a fab maker has to negotiate. Third is we're trying.

Speaker 6

35:02

TSMC has dedicated tens of billions of dollars to their fabs in Arizona and have committed hundreds of billions more. Samsung has dedicated tens of billions of dollars and committed

35:14

hundreds of billions more to their fabs in Texas. But they take a long time, and we have to remain committed to building not just the fab, but the surrounding ecosystem, not just for three or five years, but for twenty years or twenty five years, because you want not just one fab, but you want a whole trajectory of fabs. You want them working at today's cutting edge, but at tomorrow's and next year's and in ten years' cutting edge as well.

35:43

And those are things that have proven really challenging in the US, and I think we need it, they're strategic assets, and I think we need to find ways to collaborate with those that have the expertise and to find ways to build policy that is durable over a length of time, that can build a vibrant ecosystem in the fab and the associated elements.

⁠¶ Navigating Semiconductor Export Controls

Speaker 3

36:09

So the other big political economy theme I guess when it comes to semiconductors is this idea that they are in fact a strategically important technology, and so the US should place some limitations on their use abroad. And so we've seen things like export controls, export restrictions. You're an actual chip company, and so I'm very curious at an operating level what your experience of these kind of export controls has actually been, Like how much time does that

36:41

take up for you? And then also given that one of your biggest customers is an international firm in Abu Dhabi, like, how important is the trajectory of those export controls to your future business?

Speaker 6

36:53

I think, you know, three or four years ago, I would have said not important at all. I think today they're really important. In the administration, I got to know the leadership in the Department of Commerce and in the BIS division of Commerce, which oversees the licensing. I think this is an extraordinarily difficult job, and we saw really hard working, smart people doing a job.

Speaker 5

37:18

That is very, very difficult.

Speaker 6

37:20

I got to know the people in this administration and I found the same: every single one of them is earning a tiny fraction of what they could earn in the private sector, and is doing this because they believe that this is an important mission. The problem is that there are differing views about the right way to do this, and there are differing views on the right way.

Speaker 5

37:40

To achieve the goal, which is.

Speaker 6

37:43

To not give your most precious technology to your industrial enemy. And I think we can agree that today, in today's environment, China is an industrial enemy.

Speaker 5

37:56

Good well meaning people can.

Speaker 6

37:58

Disagree on whether the right strategy is to limit them from gaining access. Others argue, as those at Nvidia have argued, that the right strategy is to give them access and to keep them working on our product, on US-made, on US sort of designed product. I come down on the other side of that argument. I understand there are

38:20

good arguments in both directions. I think limiting the distribution the diffusion of our most precious technologies makes sense, and I think we have to do it thoughtfully and we have to recognize that means some markets will be foreclosed to us, and I'm okay with that.

⁠¶ AWS Partnership and Market Access

Speaker 2

38:42

Just quickly, on the sort of like current business stuff, you mentioned the deal with AWS. How does that work? Could customers right now, like could customers of AWS pay them to have inference served specifically on one of your chips?

Speaker 5

38:58

Not yet, but soon, okay, they will be.

Speaker 6

39:00

It will be served in Bedrock, which is their AI as a service offering, and they will, yes, be able to go down the clickdown menu and get super fast inference, which will be delivered via what's called a disaggregated solution, which is using some Trainium for some of the inference work and using the Cerebras technology in our systems, called the CS-3, for other parts of the work.

Speaker 2

39:28

And presumably someone who scrolls down and selects that, they would pay some premium for that ultra fast inference.

Speaker 5

39:35

I think they will pay a premium.

Speaker 6

39:37

We will see. This is entirely as Amazon wishes to price it. It's their product.

⁠¶ IPO, Wealth, and Long-Term Vision

Speaker 2

39:41

So you IPO'd this week. It's May twenty six. This is not the first time that you've tried to or looked towards going to the IPO market. There were headlines going back to twenty twenty four about wanting to try for the IPO market, and then there were headlines last year, especially because of the relationship with G42, about CFIUS and some of the national security concerns, and maybe that was an issue with the IPO. And then but also last September, you got one of your, looks like

40:15

G round, G round. One of the participants in the G round was 1789 Capital, which is of course the firm that's associated with Donald Trump Junior, which is a lot of things, and then the IPO happens. I'm a cynic, so I wonder if the participation, if Donald Trump Junior's investment in your company, made it easier to get the green light on these national security concerns to do an IPO.

Speaker 5

40:43

I wish it were that easy. No, it had no role at all.

Speaker 6

40:47

We resolved all CFIUS issues in March of twenty twenty five. I believe that was before we took money from 1789. Okay, moreover, I wouldn't ask. That's not who I am and that's not the way we roll. So we took money because they are a thoughtful venture firm, and we don't believe that there's only one point.

Speaker 5

41:10

Of political view. There are lots of political views.

Speaker 6

41:13

They all have some merit, they all have some weaknesses, and so we have some right leaning investors, we have left leaning. The fact that this firm had some right leaning sort of investors, we were looking only at their ability to help us.

Speaker 5

41:31

Build an extraordinary company.

Speaker 6

41:33

And we have never asked, nor will we ever ask, for political access or anything of the kind.

Speaker 3

41:42

What's it like to become a billionaire in a single day? This is something I assume will never happen to me, so I might as well ask you.

Speaker 6

41:49

No, I think the honest truth is it was a big nothing for me. I had some wealth before and have some wealth after. Right, I think this is a very difficult way to make money, right, being a tech CEO. I think what you have to do is you have to love the work, you have to love the people, and you have to think every day about how to make your team rich. And far more important than sort of some change in my wealth was we made more than eight hundred millionaires. Nice, and that's something I'm proud

42:22

of every minute of every day. And at my last company we made a hundred millionaires. And at this company, through our IPO, we made more than eight hundred. And that's something that makes you wake up feeling good about yourself.

Speaker 5

42:38

Every single day.

Speaker 3

42:39

That was going to be my last question, but actually you just reminded me in that answer. You know, the idea of getting here. I said you became a billionaire in a day, but obviously this was the outcome of years and years and years of work. And if we think about technological hardware, one of the things most people associate with it is really long lead times and really big

43:03

research and development budgets. Now that you're a public company, how do you sort of balance that quarter to quarter financial performance pressure with the idea that you still need to be investing in capex, in new you know, new ways of designing chips, new improvements to the existing ones.

Speaker 6

43:23

Well, first, we think, on the opportunity for innovation based on our wafer-scale engine, the best work is still ahead of us.

Speaker 5

43:32

Number one, we see an.

Speaker 6

43:34

Opportunity for extraordinary innovation in the years ahead, to make leaps every bit as big and often bigger than what we made by building the largest chip on earth. When you love building hardware, the fact that it takes time is part of the deal, right? What we do can't be done in a week or a month or a year. And that's what you sign up for, and that's true.

Speaker 5

44:00

In every profession.

Speaker 6

44:02

You sign up for the good and the challenging, and you have to sort of make peace with that. If you're a person that wants to dive in and sort of begin iterating right away and fail quickly and code up something and look at it and throw it out in the market and see if it wins, Godspeed, that's great. And that's not for me. You know, in our business, we measure twice before we cut once. And you have to put that in your soul, and you

44:34

have to like it. You have to like that mistakes in our business are really expensive, and you have to like the fact that you breathe life.

Speaker 5

44:42

Into a chunk of silicon and you get it to do.

Speaker 6

44:46

Things that nobody else has ever been able to make a chunk of silicon do. And if that's for you, then this process that takes time and money, you love that too. And so I think I would love it less if you could do it in a week. And I think the people that I love to work with they feel the same way. And they like being engineers not because it's a path to money. They like being engineers because they like building things, and they like building

45:14

hard things. And I like working with them for exactly that reason.

Speaker 2

45:19

Yeah, you mentioned breathing life into a chunk of silicon. My dad, who's a physicist, always likes to point out how carbon and silicon are right next to each other on the periodic table. They are, and they're sort of like, here are the two things that we have closest to life, and they're literally touching each other.

Speaker 4

45:35

Maybe there's something deep in that.

Speaker 6

45:37

I think that's a really thoughtful thing your father said, thank you. And I think that's really cool. And nobody pointed that out to me. I've stared at periodic tables for a long time. But I think to the extent we can make artificial life, we need silicon.

Speaker 4

45:52

Yeah, and they're right next to each other.

Speaker 6

45:54

Right, carbon, carbon is the heart of all other life, and artificial life will be founded, at least the intelligent part will be founded, on silicon.

Speaker 2

46:02

Right below silicon is germanium.

Speaker 4

46:04

Maybe the next I don't.

Speaker 5

46:05

Know what what does that mean.

Speaker 2

46:07

Let's keep, yeah, let's keep an eye on germanium next. Andrew, thank you so much for coming on Odd Lots. Fascinating conversation, right in the sweet spot of what we're interested in.

Speaker 4

46:16

Really appreciate you taking your time.

Speaker 6

46:17

Hey, thank you guys for having me, and I really appreciate it.

Speaker 5

46:20

Look forward to seeing you again soon.

Speaker 4

46:34

That was really fun.

Speaker 2

46:35

I'm super interested in this topic, and it does feel to me like the economics of inference in particular and the market for inference, inference capacity, speed.

Speaker 4

46:47

Like it's still day one, you know what I'm saying.

Speaker 3

46:50

I just like looking at the giant.

Speaker 2

46:51

It's so cool. It really does seem like an Onion thing, doesn't it?

Speaker 4

46:55

It's like, company solved inference.

Speaker 3

46:58

With a giant building the biggest But.

Speaker 4

47:01

It is interesting.

Speaker 2

47:01

We did that episode, of course, with Ray Wang from SemiAnalysis, talking about the role of, like, memory as being this really important part of the sort of cutting edge chipsets, and it's interesting to think, it's like, okay, well, here is a bottleneck that they don't run into, that they don't have, and the idea that, at least as he described it, they're not fighting to get the smallest nanometer chips, and so maybe that gives them a little bit of breathing room on capacity there too.

Speaker 3

47:29

Yeah, I mean, I do imagine there are some downsides to having giant chips, you know, just as there are upsides that Andrew laid out. The other thing I was wondering: I know he made the case for the reason speed is very important, but like I can also imagine a world where maybe it's not that important, you know? Like I think at some point, like the incremental speed factor just starts to become less important when weighed against like the incremental cost of generating that speed.

Speaker 2

48:03

I think it really, this is like one of those things where it probably really depends what you're using it for. Right, so it's like if you're like, you know what, I'm really curious why pterodactyls aren't actually dinosaurs?

Speaker 4

48:16

Can you explain it to me? Then it's like I don't care about that, Like that fraction of a second.

Speaker 3

48:21

I would wait five minutes for the chatbot to tell you you're wrong, Joe.

Speaker 2

48:24

You just, you just don't really care that much. But if you're doing some sort of like agentic coding thing or whatever, et cetera, then like, yeah, that definitely adds up. And I will say, like, as you use it more, like, it's just like everything else, the hedonic treadmill of expectations. Here's some task that you can do in thirty seconds, which maybe several years ago would have taken you thirty minutes, and you get impatient in that thirty seconds, and you want it in ten seconds. And

48:53

that's just like that competition to shave down seconds. I think it's always going to be there, so no one ever gets satisfied with this, is my point. It always eventually becomes like it feels like waiting.

Speaker 3

49:04

But to me, this feels like the crux of the AI valuation argument, which is like how much of a premium are we going to place on a model that's maybe a closed source model, that is maybe slightly better than an open source model? How much premium are we going to place on compute that is slightly faster than this other type of compute, or like other use of compute like that? To me, it's an unanswered question.

49:30

And Andrew is pretty upfront about closed versus open source, but I think on the speed question too, like we're going to find.

Speaker 2

49:37

Out. We're going to find out, and you know, I think one of the things that is going to happen, and there have been all these stories about sort of like token shock, like how much these companies are spending on tokens. My guess is one of the things that will happen at some point is there's going to be a lot more discussion about why are we using this ultra premium model when we could have done this. Like there is a lot of just like, throw it at the AI, rack up.

Speaker 4

50:04

Those bills, et cetera.

Speaker 2

50:05

And at some point there's going to be this like, okay, what really needs to be served fast? What really needs to be served on the most premium closed source models? And companies are probably going to get a lot more skilled at allocating from, you know, different forms of inference depending on the need.

Speaker 3

50:23

Yeah, I think that's exactly it. And at that point, like we could well see some of the dynamics in the market start to change in terms of valuation. Shall we leave it there?

Speaker 4

50:32

Let's leave it there.

Speaker 3

50:33

This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me at Tracy Alloway and.

Speaker 4

50:38

I'm Joe Weisenthal. You can follow me at the Stalwart.

Speaker 2

50:41

Follow our producers Carmen Rodriguez at Carmen Arman, Dashiell Bennett at Dashbot, Kail Brooks at Kail Brooks, and Kevin Lozano at Kevin Lloyd Lozano. And for more Odd Lots content.

Speaker 4

50:52

Go to Bloomberg dot com.

Speaker 2

50:53

Slash odd lots, where we have the daily newsletter and all of our episodes, and you can chat about all of these topics twenty four seven in our Discord, discord dot gg slash odd lots and.

Speaker 3

51:03

If you enjoy Odd Lots, if you like it when we talk about giant wafers, then please leave us a positive review on your favorite podcast platform. And remember, if you're a Bloomberg subscriber, you can listen to all of our episodes absolutely ad free. All you need to do is find the Bloomberg channel on Apple Podcasts and follow the instructions there. Thanks for listening.

Transcript source: Provided by creator in RSS feed: download file
