Episode 8 - OpenAI x Broadcom and the future of compute | OpenAI Podcast

⁠¶ Announcing the partnership

00:00

Hello, I'm Andrew Mayne, and welcome to the OpenAI Podcast. Today, we're excited to be breaking some news involving Broadcom and OpenAI. Joining me from OpenAI are Sam Altman and Greg Brockman, and from Broadcom, Hock Tan and Charlie Kawwas. A lot of ways that you would look at the AI infrastructure build-out right now, you would say it's the biggest joint industrial project in human history. We're defining civilization's

00:25

next-generation operating system. That is a drop in the bucket compared to where we need to go. That's a big drop. So what are we talking about today? What brought you all together? So today we're announcing a partnership between Broadcom and OpenAI. We've been working together for about the last 18 months designing a new custom chip. More recently, we've also started working on a whole custom system. These things have gotten so complex, you need the whole thing.

00:49

And we will be starting late next year, deploying 10 gigawatts of these racks, of these systems and our chip, which is a gigantic amount of computing infrastructure, to serve the needs of the world to use advanced intelligence. So this is going to entail both compute and chip design and scaling out?

01:08

This is a full system. So we closely collaborated for a while on designing a chip that is specific for our workloads. When it became clear to us just how much capacity, inference capacity, the world was going to need, we began to think about whether we could do a chip that was meant just for that kind of very specific workload. Broadcom is the best partner in the world for that, obviously. And then to our great surprise, this was not the way we started.

01:34

But as we realized that we were going to really need the whole system together to support this, as this got more and more complex, it turns out Broadcom is also incredible at helping design systems. So we are working together on that entire package, and this will help us even further increase the amount of capacity we can offer for...

01:58

our services. So, Hock, how did this come about? When did this start? When did you guys first talk about working together on this? Well, other than the fact that Sam and Greg are great people to work with, it's a natural fit.

02:13

Because OpenAI has been doing, and continues to do, the most advanced models, frontier models in generative AI out there. But as part of it, you continue to need compute capacity, the best, latest compute capacity, as you progress along a roadmap towards better and better frontier models and towards superintelligence. And compute is a key part, and that comes based on semiconductors. And as Sam indicated, more than semiconductors. And we are, even though I say it myself,

02:51

probably the best semiconductor company out there. And more than that, AI is a very, very exciting opportunity for us, in that my engineers are pushing the innovation envelope on newer and newer generations of semiconductor technology. So for us, collaborating with the best

⁠¶ The scale of AI infrastructure

03:16

generative AI company on that is a natural fit. And this isn't just chips, it's going out to scale, like 10 gigawatts. And I have trouble even understanding that. What does that even mean when you're talking about 10 gigawatts? First of all, you said it's not just chips, and Hock touched on this too, but the vertical integration point is really important. We are able to think from, like, etching the transistors

03:38

all the way up to the token that comes out when you ask ChatGPT a question, and design the whole system: all of the stuff about the chip, the way we design these racks, the networking between them, how the algorithms that we're using fit the inference chip itself, a lot of other stuff, all the way to the end product. And...

03:57

One of the many reasons I'm so excited about that is by being able to optimize across that entire stack, we can get huge efficiency gains. And that will lead to much better performance, faster models, cheaper models, all of that. As you get...

04:12

that better performance and cheaper and smarter models, one thing that we have consistently seen is people just want to use way more. So we used to think, like, oh, we'll optimize things by 10x and we'll solve all of our problems. But you optimize by 10x and there's 20x more demand.

04:27

So 10 gigawatts, 10 incremental gigawatts, this is all on top of what we're already doing with other partners and all the other data centers and silicon partnerships we've done. 10 gigawatts is a gigantic amount of capacity. And yet, if we do as good a job as we hope, even though it's vastly more than the world has today, we expect that with very high-quality intelligence delivered very fast and at a very low price,

04:53

the world will absorb it super fast and just find incredible new things to use it for. So what is the hope with this? The hope is that the kinds of things people are doing now with this compute, you know, writing code, automating more and more of enterprises, generating videos in Sora, whatever it is, they will be able to do that much more of it and with much smarter models. It's amazing. So Greg and Charlie, when you think about...

05:19

historically, when people have tried to develop chips or hardware to suit whatever was the current mode for using computing at that point, what examples have you looked upon historically to figure out how to plan forward? What's been inspiring you when you think about this?

05:34

Well, I'd say the number one thing, honestly, is working with good partners. I think it's very clear that we, as a company, are not able to do everything ourselves, and getting into actually building our own chips for our own specific workloads was not something we could do from a total standstill

05:49

without working with Hawk and Charlie and Broadcom. So this has been really incredible to lean on their expertise together with our understanding of the workload. And it's actually very interesting to see the places where OpenAI is able to do things very differently.

⁠¶ Collaboration and innovation in chip design

06:03

from the rest of the industry or the way that things would historically be done. For example, we've been able to apply our own models to designing this chip, which has been really cool. We've been able to pull in the schedule. We've been able to get massive area reductions, right? You take components that humans had already optimized and just pour compute into it. And the model comes out

06:22

with its own optimizations. And it's very interesting. We're at the point now where I don't think any of the optimizations we have are ones that human designers couldn't have come up with. Like, usually our experts take a look at it later and say, yeah, this was on my list, but it was, like, 20 things down, and it would have taken them another month to get to. And that's actually really, really interesting. We were coming up on a deadline

06:43

working with Charlie's team, and as we were running optimizations, we had a choice: do we actually take a look at what those optimizations were, or do we just keep going until the deadline and then take a look after? And we decided, of course, you've got to just keep going. And so we've really been...

06:57

building up this expertise in-house to understand this domain. And that's something we actually think can help lift up the whole industry. But I think that we are heading to a world where AI intelligence is able to help humanity make new breakthroughs that just would not be possible otherwise. And we're going to need...

07:12

just as much compute as possible to power that. Like, one example of something very concrete is that, you know, we are in a world now where ChatGPT is changing from something that you talk to interactively to something that can go do work for you behind the scenes. If you've used features like Pulse, you wake up every morning and it has some really interesting

07:31

things that are related to what you're interested in. It's very personalized. And our intent is to turn ChatGPT into something that helps you achieve your goals. The thing is, we can only release this to the Pro tier, because that's the amount of compute that we have available. And ideally, everyone would have an agent that's running for them 24/7 behind the scenes, helping them achieve their goals. And so ideally, everyone has their own accelerator, has their own...

07:55

compute power that's just running constantly. And that means there's 10 billion humans. We are nowhere near being able to build 10 billion chips. And so there's a long way to go before we are able to saturate not just the demand, but what humanity really deserves. So, Charlie, being very deeply technical and being with a company that's been at a number of forefronts of some of these revolutions, what's it been like working with a company like OpenAI and working with Greg on this?

08:19

So for us, it's been absolutely exciting and refreshing, because the beauty of the work we do together is the focus on a certain workload. We started actually first looking at the IP and the AI accelerator, which is what we call the XPU. And then we realized very quickly that we can now actually go from the workload all the way down to the transistor. And as Greg was just explaining, we can both work together to go customize that platform for your workload,

⁠¶ Historical context and future vision

08:49

resulting in the best platform in the world. Then we realized, as Sam was saying earlier on, it's not just that XPU or accelerator. Actually, it's the networking that's needed to scale it up, scale it out, and scale it across. And so suddenly we started seeing that we actually can drive the next level of standardization and openness that not only benefits us; I think it actually will benefit the entire ecosystem, and it gets gen AI

09:16

to AGI much faster. So I'm very excited about the technical capabilities of the teams we have, but also the vision and, I think, the speed at which we've been moving. I'm still kind of wrapping my head around the scale of it, because from trying to design something like a chip, and figuring out how you're going to get the maximum efficiency on it, to just the size of it, the infrastructure, what's involved in this is a global effort.

09:41

And what comparisons have you been able to draw for this to other examples in history? I always think the historical analogies are tough, but this is... As far as I know, I don't know what fraction building the Great Wall was of global GDP at the time. But a lot of ways that you would look at the AI infrastructure build-out right now, you would say it's the biggest joint industrial

10:07

project in human history. And this requires a lot of companies, a lot of countries, a lot of industries to come together, and a lot of stuff has to happen at the same time, and we've all got to kind of, like, invest together. But at this point, given everything we see coming on the research front, given all of the value we see being created on the business front,

10:32

I think the whole industry has decided this is like a very good bet to take, but it is huge. You go to one of these, even one gigawatt data centers, and you look at the scale of what's happening there. It's like a tiny city. It's a big complex thing. So it is just like incredible scale.

10:51

To the point of this being a massive collaborative project, I feel like whenever I call Charlie, he's in a different part of the world trying to secure capacity, trying to find a way to help us build what we're trying to do together. Exactly. The coolest thing, actually, I was thinking about is what we're doing together in this wonderful partnership. We're defining civilization's next-generation operating system. And we're doing it, as you're saying, at the transistor level,

11:16

building new fabs, building new manufacturing sites, all the way to building these racks, and ultimately the data centers you're talking about, 10 gigawatts of data centers.

11:26

Yeah, I think an important thing to keep track of is that often people get fixated just on the chips themselves. And it's kind of like thinking the National Highway Project was about selling asphalt or railroads are about steel. In reality, it's the things that become possible on top of that. And you've probably thought a lot about that.

11:40

Like, what happens? Well, I think this is like railroads, the internet. That's what I think this is becoming over time: critical infrastructure, or a critical utility. And more than just a critical utility for, say, 10,000 enterprises. This is a critical utility over time, right, Sam, for 8 billion people globally. That's, I think...

12:05

It's like the industrial revolution of a different sort coming forth. But it cannot be done with just one party or, as we'd like to think, with two. More than that, it needs

⁠¶ Role of compute in AI development

12:18

a lot of partnerships. It needs collaboration across an ecosystem. And because of that, much as we talk about developing chips for specific workloads, applications and LLMs, it's also important to create standards that are open and more transparent for all to use, because you need to build up a whole infrastructure at the end of the day to become a critical utility for six billion people in the world. And we're very excited, frankly, which is why

12:53

we think we make great partners, because I think we share the same conviction. And more than that, it is about scaling computing to create breakthroughs in superintelligence and models. It's building the foundation of that. You guys have a lot on your plate. Why design chips now? Well...

13:18

You know, this project, we've probably been working on it for 18 months now, and it's moved incredibly quickly. We've hired some really amazing people. And I think what we found is that we have a deep understanding of the workload. And we work with a number of parties across the ecosystem. And there's a number of chips out there that I think are really incredible.

13:41

There's a niche for each one. And so we've really been looking for specific workloads that we feel are underserved. How can we build something that will be able to accelerate what's possible? And so I think that ability to say that we are able to do the full vertical integration for something we see coming, but that is hard for us to work through other partners on, that's a very clear use case for this kind of project. Yeah, actually, more than that, and Greg, you put it very well.

14:08

Really, why you want to do your own chip is that computing is a big part of what's gating this journey towards superintelligence, towards creating better and better frontier models. A lot of it really comes down to computing, and not just any computing: computing that is effective, high performance, and efficient, especially on power. And what Greg is saying is exactly what we learned and saw here. For instance, if you want to train, you design chips that are much stronger

14:43

in computing capacity, measured in TFLOPS, as well as in network, because it's not just one chip that makes it happen. It's a cluster, as Charlie put it. But if you want to do inference, you put in more memory and memory access relative to compute. So you are actually, over time, creating chips optimized for particular workloads and applications
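The trade-off described here, more raw compute for training versus more memory and memory access for inference, can be illustrated with a simple roofline-style estimate. What follows is only a rough sketch: the two chip profiles, their names, and the workload intensities are hypothetical numbers chosen for illustration, not figures from the conversation or from any real product.

```python
from dataclasses import dataclass


@dataclass
class Chip:
    """A hypothetical accelerator profile (all figures are made up)."""
    name: str
    peak_tflops: float   # peak compute in TFLOP/s
    mem_bw_tbps: float   # memory bandwidth in TB/s


def attainable_tflops(chip: Chip, flops_per_byte: float) -> float:
    """Roofline estimate: throughput is capped either by peak compute
    or by how fast memory can feed the compute units."""
    return min(chip.peak_tflops, chip.mem_bw_tbps * flops_per_byte)


# Two invented design points: one spends its budget on compute,
# the other on memory bandwidth.
compute_heavy = Chip("compute-heavy", peak_tflops=1000.0, mem_bw_tbps=2.0)
memory_heavy = Chip("memory-heavy", peak_tflops=500.0, mem_bw_tbps=6.0)

# Assumed arithmetic intensities (FLOPs per byte moved): high for a
# training-like workload, low for an inference-like one.
workloads = {"training-like": 600.0, "inference-like": 50.0}

for label, intensity in workloads.items():
    for chip in (compute_heavy, memory_heavy):
        tflops = attainable_tflops(chip, intensity)
        print(f"{label:15s} on {chip.name:13s}: {tflops:7.1f} TFLOP/s")
```

Under these assumed numbers the compute-heavy profile comes out ahead on the high-intensity workload and the memory-heavy profile on the low-intensity one, which is the general shape of the argument for workload-specific chips.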

⁠¶ Optimizing for specific workloads

15:11

as we go along. And that, at the end of the day, is what will create the most effective models. It's a platform that you want to create end-to-end. A piece of historical context is that when we started OpenAI, we didn't really have that much of a focus on compute. We felt that the path to AGI is really about ideas. It's really about trying out stuff. Eventually, we'll...

15:36

put the right conceptual pieces in place, and then AGI. And about two years in, in 2017, the thing that we found was that we were getting the best results out of scale. It wasn't something we set out to prove. It was something we really discovered empirically, because everything else didn't work nearly as well. And the first results were scaling up

15:55

reinforcement learning in the context of the video game Dota 2. Did you guys pay attention to the Dota 2 project back in the day? Yes. It was a super cool project. And we really saw you scale up by 2x and suddenly your agent is 2x better. It's like, okay, we have to push.

16:10

this to the limit. And at that point, we started paying attention to the whole ecosystem, right? There were all sorts of chip startups with novel approaches that were very different from GPUs. And we started giving them a ton of feedback saying, here's where we think things are going. It needs to be models of this shape. And honestly, a lot of them just didn't listen to us.

16:25

Right. And so it's like very frustrating to be in this position where you say we see the direction the future should be going. We have no ability to really influence it besides sort of, you know, just like sort of trying to influence other people's roadmaps. And so by being able to take some of this in-house, we feel like we are able.

16:40

to actually realize that vision. And again, in a way that like we hope that we can show a direction and other people will fill in because the amount of compute required to bring our vision of AGI to the world, 10 gigawatts is not enough. That is a drop in the bucket compared to where we need to go. That's a big drop. The bucket's really big. What becomes possible with this when you're building your own chips for inference and for training? Where can you take this?

17:08

To zoom out a little bit, if you simplify what we do in this whole process to, you know,

17:16

melt sand, run energy through it, and get intelligence out the other end. You're not literally melting sand. Like, it's a nice visual. That's a good one. That's all we have to do. What we want is the most intelligence we can get out of each unit of energy, because that will become the gate at some point. And I hope what this whole process will show us is that, you know, from the model we design to the chip to the rack, we will be able to wring out so much more intelligence per watt.

17:54

And then everybody that's using these models in all of these incredible ways will do so much with it. That's what I hope for. And you control your own destiny. If you do your own chips.

⁠¶ Journey towards AGI

18:08

You control your destiny. Yeah, it's interesting to think about how the things that we're doing today are pretty amazing, remarkable, but we're using stuff that wasn't actually designed specifically for the way we're doing it. Oh, I mean... The GPUs of today are incredible, incredible things. I'm very grateful and we will...

18:29

continue to really need a lot of those. The flexibility and the ability to let us do fast research is amazing. But you are right that as we get more and more confident in what the shape of the future is going to look like, a system very optimized to the workload will let us wring out more per watt. That's great. And it's a long journey that takes decades. So if you go back to Hock's example, take railroads, it took about a century to roll it out

18:58

as a critical infrastructure. If you take the internet, it took about 30 years. This is not going to take five years. It's going to take a long time. So I think as we collectively, especially with this partnership, continue to figure out ways to wring more tokens out of it, we'll discover that, oh, for this training or research, maybe a GPU is great. Or maybe, you know what, we can...

19:21

take whatever we're doing with Greg. It's actually a platform that allows you, like a Lego block, to take things in and out. And now suddenly we can get another XPU or an accelerator for next-gen that's targeted at training or inference or research. Yeah, and to the point that Sam made, that GPUs have really come an incredible way: in 2017, when we started looking at all these other accelerators, it was actually very non-obvious

19:49

what the landscape would look like in 5, 10 years. And I think it's really a testament to companies like NVIDIA and AMD how much the GPU has just moved forward and continued to be the dominant accelerator. But at the same time, there's a massive design space out there. I think that what we see is workloads that are not served through existing platforms, and that's where that full vertical integration

20:16

is something unique. It's interesting, too, because the idea that you'd want to put inference close to the user is something relatively new. You know, we understood training, but then you think about the number of people every day using these products and how much they need compute to do fun things or serious things. And when you start thinking about the scale of it, like we talked about before, I keep coming back to: it's a very big thing.

20:40

Where, you know, where does it keep going? Is it just a thing that we're going to continuously find new things to use compute for? The first cluster OpenAI had, the first one that I can remember the energy size for, was two megawatts. Adorable. We got things done with those two. I don't remember when we got to 20. I remember when we got to 200. You know, and we will finish this year

⁠¶ Future of AI and compute capacity

21:08

a little bit over two gigawatts, and these recent partnerships will take us close to 30. And the world has done far more than I thought they were going to do. Turns out you can, like, serve, you know, 10% of the world's population with ChatGPT and do the research and do Sora and do our API and a few other things on two gigawatts. But think about how much more the world would like to do than they get to do right now.
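To put these power figures side by side, here is a back-of-the-envelope calculation. It is a sketch under loud assumptions: the figures are rounded from what is said in the conversation ("a little bit over two gigawatts", "close to 30"), the 8 billion world population is the figure Hock uses earlier, and dividing the whole two gigawatts across users overstates the per-user share, since the same power also covers research, Sora, and the API.

```python
# Rounded power figures from the conversation, in watts.
FIRST_CLUSTER_W = 2_000_000              # "two megawatts"
END_OF_YEAR_W = 2_000_000_000            # "a little bit over two gigawatts"
WITH_PARTNERSHIPS_W = 30_000_000_000     # "close to 30" gigawatts

# Assumptions for illustration only.
WORLD_POPULATION = 8_000_000_000         # figure mentioned earlier in the episode
SERVED_PERCENT = 10                      # "10% of the world's population"

# How much larger the fleet is than that first cluster.
growth_so_far = END_OF_YEAR_W // FIRST_CLUSTER_W
growth_planned = WITH_PARTNERSHIPS_W // FIRST_CLUSTER_W

# Upper bound on power per served user, if all capacity went to serving them.
users_served = WORLD_POPULATION * SERVED_PERCENT // 100
watts_per_user = END_OF_YEAR_W / users_served

print(f"Growth from first cluster to this year: {growth_so_far}x")
print(f"Growth from first cluster to ~30 GW:    {growth_planned}x")
print(f"Users served (assumed):                 {users_served:,}")
print(f"Upper bound on watts per user:          {watts_per_user}")
```

With these rounded inputs, the jump from two megawatts to two gigawatts is a factor of 1,000, and the per-user share works out to a few watts at most.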

21:37

If we had 30 gigawatts today with today's quality of models, I think you would still saturate that relatively quickly in terms of what people would do, especially with the lower cost we'll be able to do with this. The thing we have learned again and again is, let's say we can push GPT-6 to feel like, you know, 30 IQ points past GPT-5, something big.

22:02

And that it can work on problems not for a few hours, but for a few days, weeks, months, whatever. And while we do that, we bring the cost per token down. The amount of economic value and sort of surplus demand that happens each time we've been able to do that goes up a crazy amount. So, to pick a, I think, well-known example at this point: when ChatGPT could write a little bit of code,

22:26

people actually used it for that. They would very painfully paste in their code and wait, and they would say, do this for me, and paste it back in and whatever. And models couldn't do much, but they could do a few things. The models got better. The UX got better. Now we have Codex. Codex is growing unbelievably fast and can now do, like, a few hours of work at a higher level of capability. And when that's possible, the demand increase is crazy.

22:50

Maybe the next version of Codex can do, like, a few days of work at kind of the level of one of the best engineers you know, or maybe that takes a few more versions, whatever, it'll get there. Think how much demand there will just be for that, and then do it for every knowledge work industry.

23:04

One way I like to think of it is that intelligence is the fundamental driver of economic growth, of increasing the standard of living for everyone. And what we're doing with AI is actually bringing more intelligence and amplifying the intelligence of everyone. And so as these models get better, I think everyone's going to become more productive. The output of what is possible

23:26

is going to just be totally different from what exists today. It's interesting, too, going from GPT-3, which was, you know, comparatively expensive, to where you're at the level of GPT-5, and the fact that you can provide that free to people. Is that a motivating factor for you, the fact that every time you create these new efficiencies, it just benefits so many more people?

23:47

Yes. Absolutely, absolutely, absolutely. And from our side on hardware, compute capacity, where to some extent the rubber hits the road on this, it's really...

⁠¶ Wrap-up and future projects

24:02

incumbent on us to keep optimizing, pushing the envelope on leading-edge technology. And there's still room to go. And there's room to go even from where we are, as we go from two nanometers going forward to smaller than two nanometers, as we start doing all kinds of different technology. It is really great, exciting times, especially for the hardware and the semiconductor industry. What Broadcom has done here is really quite incredible.

24:37

It used to be extremely difficult for a company like ours to think about making a competitive chip. In fact, so hard we just wouldn't have done it. And I think a lot of other companies wouldn't have done it as well. And all of this customized chip and system to a workload just wouldn't be a thing in the world. But the fact is that they have pushed so hard and so well on making it so that a company can partner with them, and they can do a miracle-of-technology chip

25:04

quickly and at scale. Unfortunately, they do it for all of our competitors too, but hopefully our chip will be the best. Yes, of course. Thank you. It's really quite incredible. And I think also not just what they can do for us today, but looking at the upcoming roadmap, it's just so exciting, the kinds of technologies that they're going to be able to bring to bear for us to be able to utilize. Well, it's just the excitement of enabling,

25:27

jointly and collaboratively, models, ChatGPT 5, 6, 7, on and on. And each of them will require a different chip, a better chip, a more developed chip, an advanced chip that we haven't even begun to figure out how to get to. But we will. And actually, the GPTs are definitely going to be an increasing part of that.

25:51

Yes. We're actually looking forward to that, because my software engineers now already use that from a software point of view. And it's delivering efficiencies of dozens of engineers. Really? Yes. Great. On the hardware side, we're not there yet. But, you know, the good news is we'll get there. We should talk. Yes, we should absolutely leverage this. But I was going to say, with respect to compute: when we started building these XPUs, the maximum you can build is

26:18

a certain amount of compute in 800 square millimeters. That's it. Now, today, we're actually working together to ship multiple of these in a two-dimensional space. The next thing we're talking about is stacking these into the same chip. So now we're actually going in the, you know, Y dimension, or Z dimension if you want to think three-dimensionally.

26:37

And then the last step we're actually also talking about is that now we're going to bring optics into this, which is actually what we just announced, which is 100 terabits of switching with optics integrated all into the same chip. So these are sort of the technologies that will take compute, the size of the cluster, the total performance and wattage of the cluster, to a whole new level. I think it will keep doubling at least every six to 12 months.
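A claim like "doubling at least every six to 12 months" compounds quickly, and a short calculation makes the range concrete. This is only a sketch: the 36-month horizon is an assumption picked for illustration, and the function name is hypothetical.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Multiplicative growth after `months` if the quantity doubles
    once every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


HORIZON_MONTHS = 36  # assumed horizon, for illustration only

slow_case = growth_factor(HORIZON_MONTHS, 12)  # doubling every 12 months
fast_case = growth_factor(HORIZON_MONTHS, 6)   # doubling every 6 months

print(f"Doubling every 12 months over {HORIZON_MONTHS} months: {slow_case:.0f}x")
print(f"Doubling every 6 months over {HORIZON_MONTHS} months:  {fast_case:.0f}x")
```

Over the assumed three years, the two ends of the stated range differ by a factor of eight (8x versus 64x), so the doubling period matters a great deal to where the cluster ends up.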

27:03

What kind of timeframe are we talking about? When are we going to first start to see what's coming out of this relationship? End of next year, and then we'll deploy very rapidly over the next three years. Absolutely. Greg and I are talking about this at least once a week. We just had a chat earlier today on this. Yes, good progress today. Yes, exactly. But yeah, we're really excited to get silicon back. Starting soon, actually. Yes, very soon. Yeah, I think that...

27:27

My view of this whole project is, it's not easy, right? It's easy to just say, oh yeah, 10 gigawatts. But when you look at what is required to actually design a whole new chip and to actually deliver this at scale, get the whole thing working end to end, it's just an astronomical amount of work. And I would say that we're very serious. You know, our mission is to ensure...

27:43

that AGI benefits all of humanity. We're very serious about "benefits everyone." Like, we really want this to be a technology that is accessible to the whole world, that lifts up everyone. And you can really see that in trying to make the world be one of compute abundance. Because I think by default, we're heading towards one that is quite compute scarce.

28:02

You ask my wife when she's trying to get more Sora credits. It feels very scary. No, no. We feel it so concretely. Teams within OpenAI, their output is just a direct function of how much compute they get. And so the amount of intensity on who gets the compute allocation is so extreme. And so I think that what we really want is for it to be a world where, if you have an idea, you want to create, you want to...

28:29

go build something that you have the compute power behind you to make it happen. Gentlemen, thank you very much for sharing this with us. This is going to be very exciting to see where this goes, and I hope we can keep talking about this as it continues to develop. Thank you. Thank you guys for the partnership. Thank you for the partnership. We're really enjoying it. We are too. Thank you.

This transcript was generated by Metacast using AI and may contain inaccuracies.
