Latest Interview of Elon Talking About AI Grok 3!!!

Speaker 1

00:00

Welcome to the Grok 3 presentation.

Speaker 2

00:03

The mission of xAI and Grok is to understand the universe. We want to understand the nature of the universe so we can figure out what's going on. Where are the aliens, what's the meaning of life, how does the universe end?

Speaker 1

00:14

How did it start? All these fundamental questions.

Speaker 2

00:17

We're driven by curiosity about the nature of the universe, and that's also what causes us to be a maximally truth-seeking AI, even if that truth is sometimes at.

Speaker 1

00:28

Odds with what is politically correct.

Speaker 2

00:31

In order to understand the nature of the universe, you must absolutely rigorously pursue truth, or you will not understand the universe; you'll be suffering from some amount of delusion or error. So that is our goal: figure out what's going on. And we're very excited to present Grok 3, which is, we think, an order of magnitude more capable than Grok 2, in a very short period of time.

Speaker 1

00:54

And that's thanks to the hard work of an incredible team.

Speaker 2

00:58

I'm honored to work with such a great team, and of course we'd love to have some of the smartest humans out there join our team.

Speaker 1

01:05

So let's go, Hi Ron.

Speaker 3

01:07

My name is Igor, lead engineering at xAI. I'm Jimmy Ba, leading research. I'm Tony, working on the reasoning team. All right, I'm Elon, and I don't do anything. I just show up occasionally.

Speaker 1

01:19

Yeah.

Speaker 3

01:20

So, like I mentioned, Grok is the tool that we're working on. Grok is our AI that we're building here at xAI, and we've been working extremely hard over the last few months to improve Grok as much as we can so we can give all of you access to it. We think it's going to be extremely useful. We think it's going to be interesting to talk to, funny, really, really funny, and we're going to explain to you how we've improved Grok over the last few months.

01:41

We've made quite a jump in capabilities. Actually, we should maybe also explain: why do we call it Grok?

Speaker 2

01:46

So Grok is a word from a Heinlein novel, Stranger in a Strange Land, and it's used by a guy who was raised on Mars. And the word grok means to sort of fully and profoundly understand something. That's what the word grok means: to fully and profoundly understand something.

Speaker 1

02:02

And empathy is important. True.

Speaker 4

02:04

So yeah, if we chart xAI's progress in the last few months: it's only been seventeen months since we started kicking off our very first model. Grok 1 was almost like a toy by this point, only two hundred and fourteen billion parameters. And now, if we plot the progress, with time on the x-axis and the performance on our favorite benchmark numbers, MMLU, on the y-axis, we're literally progressing

02:32

at an unprecedented speed across the whole field. And then we kicked off Grok 1.5 right after Grok 1 was released, after November 2023, and then Grok 2.

Speaker 1

02:43

So if you look at where all the performance is coming from.

Speaker 4

02:46

We have a very crack engineering team and all the best AI talent. The only thing we need is this: big intelligence comes from a big cluster. So we can re-plot the entire progress, now replacing the benchmark on the y-axis with the total amount of training FLOPs. That is how many GPUs we can run at any given time to train our large language models to compress the entire internet.

Speaker 1

03:12

Or all human knowledge, really. That's right.

Speaker 2

03:16

Yeah, the internet is part of it, but it's really all human knowledge, everything.

Speaker 4

03:19

Yeah, the whole internet fits onto a USB stick at this point. It's like all the human tokens, and very soon we go into the real world. We had so much trouble actually training Grok 2 back in the day.

Speaker 1

03:29

We kicked off the model around February.

Speaker 4

03:31

And we thought we had a large number of chips, but it turned out we could barely get 8K training chips running coherently at any given time, and we had so many cooling and power issues.

Speaker 1

03:45

I think you were there in the data center.

Speaker 2

03:47

Yeah, it was really more like 8K chips on average at eighty percent efficiency, so more like sixty-five hundred effective H100s training for several months. But, you know, now we're at 100K.
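As a rough check of the arithmetic in this exchange, here is a minimal Python sketch. It is not from the presentation: the function name is made up for illustration, and the chip count is assumed to be 8,000 (reading the quoted figure as "8K"). Under those assumptions the product comes out near the "sixty-five hundred effective" figure mentioned, which suggests the quoted numbers were rounded.

```python
def effective_gpus(chip_count: int, efficiency: float) -> float:
    """Hypothetical helper: chips that are effectively doing useful training work."""
    return chip_count * efficiency


# Assumed inputs taken from the remarks: about 8K chips at about 80% efficiency.
chips = 8_000
efficiency = 0.80

effective = effective_gpus(chips, efficiency)
print(f"{chips} chips at {efficiency:.0%} efficiency is about {effective:.0f} effective GPUs")
```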

Speaker 1

04:00

So yeah, that's right, that's right. So what's the next step?

Speaker 4

04:06

Right, so after Grok 2, if we want to continue to accelerate, we have to take matters into our own hands. We have to solve all the cooling, all the power issues, and everything.

Speaker 3

04:16

So in April of last year, Elon decided that really the only way for xAI to succeed, for xAI to build the best AI out there, is to build our own data center. So really, we realized we had to build the data center in about four months. It turned out it took us one hundred and twenty-two days to get the first 100K GPUs up and running, and that was a monumental effort. We believe it's the biggest fully

04:41

connected H100 cluster of its kind. We actually decided that we needed to double the size of the cluster pretty much immediately if we want to build the kind of AI that we want to build.

Speaker 1

04:52

So we then had another phase.

Speaker 3

04:55

Which we haven't talked about publicly, so this is the first time that we're talking about this, where we doubled the capacity of the data center yet again. And that one only took us ninety-two days. So we've been able to use all of these GPUs, all of this compute, to improve Grok in the meantime. And basically today we're going to present the results of that, the fruits that came from that.

Speaker 4

05:16

Yeah, so all the paths, all the roads lead to Grok 3: 10x more compute, more than 10x really, maybe 15x, compared to our previous-generation model. Grok 3 finished pre-training in early January, and, you know, the model is still currently training, actually. So this is a little preview of our benchmark numbers. We evaluated Grok 3 on, you know, three different categories: on general mathematical reasoning, on general knowledge about STEM

05:49

and science, and then also on computer science coding. So AIME, the American Invitational Mathematics Examination, is hosted, you know, once a year. If we evaluate model performance, we can see that Grok 3 across the board is in a league of its own; even its little brother, Grok 3 Mini, is reaching the frontier across.

Speaker 1

06:14

All the other competitors.

Speaker 4

06:15

You would say, well, at this point, with all these benchmarks you're just evaluating, you know, the memorization of the textbooks, memorization of the GitHub repos. How about real-time usefulness? How about when we actually use those models in our product? So what we did instead is we actually kicked off a blind test of our Grok 3 model, code-named Chocolate, pretty hot chocolate. It's been running on this platform called

06:42

Chatbot Arena for two weeks. I think the entire X platform at some point speculated this might be the next generation of AI coming our way. So how Chatbot Arena works is that it strips away the entire product surface, right? It's just a raw comparison of the engines of those AIs, the language models themselves, placed in an interface.

Speaker 1

07:04

Where the user will submit one single query and you.

Speaker 4

07:07

Get shown two responses. You don't know which model they come from, and then you make the vote. So in this blind test, Grok 3, an early version of Grok 3, already reached like fourteen hundred. No other model has reached an Elo score, in head-to-head comparison with all the other models, at this level.
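As a rough illustration of how an Elo score can be built up from blind pairwise votes like the ones described here, below is a minimal Python sketch of the classic Elo update. It is not from the presentation and is not claimed to be the rating method the arena platform actually uses; the model names, starting ratings, K-factor, and votes are all made-up assumptions.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return new (rating_a, rating_b) after one head-to-head vote."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Hypothetical models and votes, for illustration only.
    ratings = {"model_x": 1000.0, "model_y": 1000.0}
    votes = [("model_x", "model_y", True),
             ("model_x", "model_y", True),
             ("model_x", "model_y", False)]
    for a, b, a_won in votes:
        ratings[a], ratings[b] = update(ratings[a], ratings[b], a_won)
    print(ratings)
```

Each vote moves rating points from the loser to the winner, and an upset against a higher-rated model moves more points than an expected win.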

Speaker 1

07:27

And it's not just one single category.

Speaker 4

07:30

It's fourteen hundred aggregated across all the categories: chatbot capabilities, instruction following, coding. So it's number one across the board in this.

Speaker 2

07:40

Blind test, and it's still climbing, so we actually have to keep updating it. So it's fourteen hundred, but fourteen hundred and climbing.

Speaker 3

07:47

Yeah, and in fact, we have a version of the model that we think is already much better than the one that we tested here. We'll see, I guess, but that's the one that we're talking about today.

Speaker 1

07:59

Yeah.

Speaker 2

07:59

So actually, one thing: if you're using Grok 3, I think you may notice improvements almost every day, because we're continuously improving the model. So literally, even within twenty-four hours, you'll see improvements.

Speaker 1

08:11

Yep.

Speaker 4

08:13

But we believe, here at xAI, that getting the best pre-trained model is not enough. That's not enough to build the best AI. The best AI needs to think like a human: to contemplate all the possible solutions, self-critique, verify all the solutions, backtrack, and also think from first principles.

Speaker 1

08:33

That's a very important capability.

Speaker 4

08:35

So we believe that as we take the best pre-trained model and continue training with reinforcement learning, it will elicit additional reasoning capabilities that allow the model to become so much better and to scale, not just at training time but actually at test time as well. We've already found the model extremely useful internally, saving hundreds of hours of coding time. So, you are a user of these models: what are the use cases?

Speaker 3

09:03

So, like Jimmy said, we've added advanced reasoning capabilities to Grok, and we've been testing them pretty heavily over the last few weeks. In order to give you a little bit of a taste of what it looks like when Grok is solving hard reasoning problems, we've prepared two little problems for you. One comes from physics, and one is actually a game that Grok is going to

09:20

write for us. When it comes to the physics problem, you know, what we want Grok to do is to plot a viable trajectory to do a transfer from Earth to Mars and then, at a later point in time, a transfer back from Mars to Earth. And that requires, you know, some physics that Grok will have to understand. So we're going to challenge Grok to, you know, come up with a viable trajectory, calculate it, and then plot it

09:43

for us so we can see it. And yeah, this is totally unscripted, by the way. This is the Grok interface, and we've typed in this text that you can see here: generate code for an animated 3D plot of a launch from Earth, landing on Mars, and then back to Earth at the next launch window. And we've now kicked off the query, and you can see Grok is thinking. So part of Grok's advanced reasoning capabilities are these thinking

10:08

traces that you can see here. You can even go inside and actually read what Grok is thinking as it's going through the problem, as it's trying to solve it.

Speaker 2

10:15

Yeah, I'd say we are doing some obscuring of the thinking so that our model doesn't get totally copied instantly. So there's more to the thinking than is displayed.

Speaker 3

10:27

And because this is totally unscripted, there's actually a chance that Grok might make a little coding mistake and it might not actually work. So just in case, we're going to launch two more instances of this, so if something goes wrong, we'll be able to switch to those and show you something that's presentable. So we're kicking off the other two as well. And like I said, we have a

10:47

second problem as well. Actually, one of our favorite activities here at xAI is having Grok write games for us, not just any old game, not any game that you might already be familiar with, but actually creating new games on the spot and being creative about it. So one example that we found was really, really fun is: create a game that's a mixture of the two games Tetris and Bejeweled.

Speaker 2

11:12

So this is maybe an important thing: obviously, if you ask an AI to create a game like Tetris, there are many examples of Tetris on the internet.

Speaker 1

11:21

Or a game like Bejeweled, whatever; it can copy it.

Speaker 2

11:26

What's interesting here is it achieved a creative solution combining the two games that actually works and is a good game. Yeah, it's creative. We're seeing the beginnings of creativity.

Speaker 3

11:40

Fingers crossed that we can recreate that; hopefully it works. Actually, because this is a bit more challenging, we're going to use something special here, which we call Big Brain. That's our mode in which we use more computation, more reasoning, for Grok, just to make sure that, you know, there's a good chance here that

11:56

it might actually do it. So we're also going to fire off three attempts here at creating this game that's a mixture of Tetris and Bejeweled. Let's see what Grok comes up with.

Speaker 2

12:10

I've played the game. It's pretty good. Like it's like, wow, okay, this is something.

Speaker 3

12:16

Yeah, so while Grok is thinking in the background, we can now actually talk about some concrete numbers. How well is Grok doing across tons of different tasks that we've tested it on? So we'll hand it over to Tony to talk about that.

Speaker 5

12:30

Yeah, okay, so let's see how Grok does on those interesting, challenging benchmarks. So, reasoning again refers to those models that actually think for quite a long time before they try to solve a problem. In this case, you know, around a month ago the Grok 3 pre-training finished, and after that we worked very hard to put the reasoning capability into the current Grok 3 model. But again, this is very early days, so the model is

13:00

still currently in training. So right now, what we're going to show people is this beta version of the Grok 3 reasoning model. Alongside, we are also training a mini version of the reasoning model. So essentially, on this plot you can see the Grok 3 Reasoning Beta and then Grok 3 Mini Reasoning. The Grok 3 Mini Reasoning is actually a model that we trained for a much longer time, and you can see that sometimes it actually performs slightly

13:25

better compared to the Grok 3 reasoning. This also just means that there's huge potential for the Grok 3 reasoning, because it's trained for.

Speaker 1

13:32

Much less time.

Speaker 5

13:34

All right, so let's actually look at how it does on those three benchmarks. Jimmy already introduced them, so essentially we're looking at three different areas: mathematics, science, and coding. For math we're picking these high school competition math problems. For science we actually picked these PhD-level science questions.

Speaker 1

13:56

And for coding, it's also actually pretty challenging.

Speaker 5

13:59

It's competitive coding and also some LeetCode, which is interview problems that people usually get when they interview for companies. So on those benchmarks, you can see that Grok 3 actually performed quite well across the board compared to other competitors. Yeah, so it's pretty promising. These models are very smart.

Speaker 1

14:19

So Tony, what are those shaded bars? Yeah, okay, I'm glad you asked this question.

Speaker 5

14:25

So for those models, because they can reason, they can think, you can also ask them to think even longer. You can spend more of what we call test-time compute, which means you can spend more time to reason, to think about the problem, before you spit out the answer. So in this case, the shaded bar here means that we

14:46

just asked the model to spend more time. You know, it can solve the same problem many, many times before it tries to conclude what the right solution is. And once you give this compute, this kind of budget, to the model, it turns out the model can perform even better. So that is the shaded part in those.

Speaker 4

15:07

Plots. Right, so I think this is really exciting, because now, instead of just doing one chain of thought with the AI, why not do multiple at once? Yes, so that's a very powerful technique that allows us to continue to scale the model's capabilities after training. And, you know, people often ask: are we actually just overfitting to the benchmarks?
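As a rough illustration of the test-time compute idea in this exchange (solve the same problem many times, then conclude on one answer), here is a minimal Python sketch using majority voting over repeated samples. It is not from the presentation and is not claimed to be how the shaded bars were produced: the "solver" is a made-up random stand-in for a model, and the answer value and success probability are arbitrary assumptions.

```python
import random
from collections import Counter


def noisy_solver(rng: random.Random, correct_answer: int = 42, p_correct: float = 0.6) -> int:
    """Hypothetical stand-in for one sampled model attempt at a problem."""
    if rng.random() < p_correct:
        return correct_answer
    return rng.randint(0, 99)  # a wrong (or accidentally right) guess


def majority_vote(answers):
    """Pick the most frequent answer among the sampled attempts."""
    return Counter(answers).most_common(1)[0][0]


def solve_with_budget(n_samples: int, seed: int = 0, p_correct: float = 0.6) -> int:
    """Spend more test-time compute by sampling n_samples attempts, then voting."""
    rng = random.Random(seed)
    answers = [noisy_solver(rng, p_correct=p_correct) for _ in range(n_samples)]
    return majority_vote(answers)


if __name__ == "__main__":
    for budget in (1, 5, 25):
        hits = sum(solve_with_budget(budget, seed=s) == 42 for s in range(200))
        print(f"budget={budget:2d} samples -> correct in {hits}/200 trials")
```

In this toy setup, wrong answers are scattered while correct ones agree, so a larger sampling budget tends to make the vote more reliable.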

Speaker 1

15:28

So how about generalization?

Speaker 5

15:30

So yes, I think this is definitely a question that we are asking ourselves: whether we're overfitting to those current benchmarks.

Speaker 1

15:38

Luckily, we have a real test.

Speaker 5

15:41

So about five days ago, AIME 2025 just finished. This is where high school students compete on this particular benchmark. So we got this very fresh, new competition, and then we asked our two models to compete on the same benchmark, the same exam. And it turns out, very interestingly, that Grok 3 Reasoning, the big one, actually does better on this particular new, fresh exam. This also means that the generalization capability of the big model is stronger,

16:12

much stronger, compared to the smaller model. If you compare to last year's exam, it's actually the opposite: the smaller model kind of learns the previous exams better. So yeah, this actually shows some kind of true generalization from the model.

Speaker 4

16:28

Right, so seventeen months ago our Grok 0 and Grok 1 could barely solve any high school problems. That's right. And now we have a kid that has already graduated. Grok is ready to go to college.

Speaker 1

16:39

Is that right? I mean, it won't be long before it's simply perfect.

Speaker 2

16:43

The human exams won't be hard; they'll be too easy.

Speaker 4

16:46

Yeah, and internally, as Grok continually evolves.

Speaker 1

16:51

We're going to talk about, you know what we're.

Speaker 4

16:52

Excited about, but very soon there will be no more benchmarks left.

Speaker 1

16:57

Yeah. Yeah. One thing that's quite fascinating.

Speaker 3

16:59

I think that we basically only trained Grok's reasoning abilities on math problems and competitive coding problems. So, very, very specialized kinds of tasks, but somehow it's able to work on all kinds of other different tasks, including creating games and, you know, lots and lots of different things. And what seems to be happening is that basically Grok learns this ability to detect its own mistakes in its thinking, correct them, persist on a problem, try lots of different variants,

17:27

pick the one that's best. So there are these generalizing abilities that Grok learns from mathematics and from coding, which it can then use to solve all kinds of other problems.

Speaker 1

17:37

So that's yeah, that's pretty I mean, reality is the instantiation of mathematics. That's right.

Speaker 4

17:43

And one thing we're actually really excited about, going back to our founding mission, is: what if one day we have a computer, just like Deep Thought, that utilizes our entire.

Speaker 1

17:53

Cluster just for the one very important problem.

Speaker 4

17:56

At test time, all the GPUs turned on, right? So I think back then we were building the GPU clusters together, you were plugging in cables, and I remember that when we turned on the first initial test, you could hear all the GPUs humming in the hallway.

Speaker 1

18:12

That almost felt spiritual.

Speaker 3

18:14

Yeah, that's actually a pretty cool thing that we're able to do, that we can go into the data center and tinker with the machines there. So for example, we went in and we unplugged a few of the cables and just made sure that our training setup was still running stably. That's something that, you know, I think most AI teams out there don't usually do. But it actually unlocks a new level of reliability in what you're able to do with the hardware.

Speaker 1

18:40

So okay, so when are we going to solve Riemann?

Speaker 4

18:43

So the easiest solution is to enumerate over all possible strings, and as long as you have a verifier and enough compute, you'll be able to do it.
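As a rough illustration of the generate-and-verify idea in this remark (enumerate all possible strings and keep the first one a verifier accepts), here is a minimal Python sketch. It is not from the presentation: the alphabet, the toy verifier, and the candidate limit are made-up assumptions, and a real proof search would need a real proof checker and vastly more compute.

```python
from itertools import count, product


def enumerate_strings(alphabet: str):
    """Yield every finite string over the alphabet, shortest first."""
    for length in count(0):
        for chars in product(alphabet, repeat=length):
            yield "".join(chars)


def search(alphabet: str, verifier, max_candidates: int):
    """Return the first string the verifier accepts, or None if the budget runs out."""
    for index, candidate in enumerate(enumerate_strings(alphabet)):
        if index >= max_candidates:
            return None
        if verifier(candidate):
            return candidate


def toy_verifier(candidate: str) -> bool:
    """Hypothetical check: a 4-character palindrome containing exactly two 'b's."""
    return len(candidate) == 4 and candidate == candidate[::-1] and candidate.count("b") == 2


if __name__ == "__main__":
    print(search("ab", toy_verifier, max_candidates=1_000))
```

The number of candidates grows exponentially with string length, which is why the remark hinges on having "enough compute".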

Speaker 1

18:52

Okay, what would be your projection, your guess? What does your neural net calculate?

Speaker 4

18:58

So my bet: three years ago I told you this, and I think now, two years later, two things are going to happen.

Speaker 1

19:06

We're gonna see machines win some battles.

Speaker 4

19:09

Yes: Turing Award, Fields Medal, Nobel Prize, with probably some expert in the loop.

Speaker 3

19:16

Right, so the expert uplifting. So this year or next year, that's what it comes down to. Yeah, so it looks like Grok finished all of its thinking on the two problems. So let's take a look at what it said. All right, so this was the little physics.

Speaker 1

19:29

Problem we had.

Speaker 3

19:31

You know, we've collapsed the thoughts here, so they're hidden. And then we see Grok's answer below that. It explains that it wrote a Python script here using Matplotlib, then gives us all of the code. So let's take a quick look at the code. You know, it seems like it's doing reasonable things here, not totally off the mark. "Solve Kepler," it says here,

19:53

so maybe it's solving Kepler's laws, Kepler's equation, numerically. Yeah, there's really only one way to find out if this thing is working.
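The script Grok generated is not shown in the transcript, so the following is only a rough illustration of what solving Kepler's equation numerically can look like. It is a minimal Python sketch using Newton's method on M = E - e sin(E); the function name, tolerance, and sample inputs are assumptions, and it is not claimed to match the demo's code.

```python
import math


def solve_kepler(mean_anomaly: float, eccentricity: float,
                 tol: float = 1e-12, max_iter: int = 50) -> float:
    """Solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E (radians)."""
    # A common starting guess: M itself works well for low-eccentricity orbits.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        residual = E - eccentricity * math.sin(E) - mean_anomaly
        slope = 1.0 - eccentricity * math.cos(E)
        step = residual / slope
        E -= step
        if abs(step) < tol:
            return E
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    e_mars = 0.0934  # approximate orbital eccentricity of Mars
    M = 1.0          # an arbitrary mean anomaly in radians, for illustration
    E = solve_kepler(M, e_mars)
    print(f"E = {E:.6f} rad, residual = {E - e_mars * math.sin(E) - M:.2e}")
```

Given E, a plotting script can turn it into a position on the elliptical orbit for each animation frame.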

Speaker 1

20:01

I would say, let's let's give it a try. Let's run.

Speaker 3

20:03

Let's run the code. All right, and we can see, yeah, it's animating two different planets, Earth and Mars, here, and then the green ball is the vehicle, the spacecraft that's transiting between Earth and Mars. And you can see the journey from Earth to Mars, and it looks like, yeah, indeed the astronauts returned safely, you know, at the right moment in time. So obviously this was just generated on the spot, so we can't tell you if that

20:33

was actually a correct solution. So we're going to take a closer look now. Maybe we're going to call some colleagues at SpaceX and ask them if this.

Speaker 2

20:40

Is legit. That's pretty close. I mean, there are a lot of complexities in the actual orbits that have to be taken into account, but this is pretty close to what it looks like. In fact, I have that on my pendant here: this has got the Earth-Mars Hohmann transfer on it. When are we going to install Grok on a rocket?

Speaker 1

21:04

Well, I suppose in two years three years.

Speaker 2

21:08

Everything is two years away. Well, the Earth-Mars transit window occurs every twenty-six months. We're currently in a transit window, approximately. The next one would be November of next year, roughly the end of next year, and if all goes well, SpaceX will send Starship rockets to Mars with Optimus robots and Grok.
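As a rough check on the "every twenty-six months" figure, here is a minimal Python sketch that computes the Earth-Mars synodic period and the one-way time of an idealized Hohmann transfer. It is not from the presentation; it assumes circular, coplanar orbits and uses rounded textbook values for the orbital periods and radii, so the results are approximations only.

```python
# Approximate orbital data (assumed, rounded textbook values).
EARTH_PERIOD_DAYS = 365.256
MARS_PERIOD_DAYS = 686.980
EARTH_ORBIT_AU = 1.000
MARS_ORBIT_AU = 1.524


def synodic_period_days(inner_period: float, outer_period: float) -> float:
    """Time between successive identical Earth-Mars alignments (launch windows)."""
    return 1.0 / (1.0 / inner_period - 1.0 / outer_period)


def hohmann_transfer_days(r1_au: float, r2_au: float) -> float:
    """Half the period of the transfer ellipse touching both circular orbits."""
    semi_major_axis_au = (r1_au + r2_au) / 2.0
    # Kepler's third law for orbits around the Sun: T[years] = a[AU] ** 1.5
    transfer_period_years = semi_major_axis_au ** 1.5
    return 0.5 * transfer_period_years * EARTH_PERIOD_DAYS


if __name__ == "__main__":
    window = synodic_period_days(EARTH_PERIOD_DAYS, MARS_PERIOD_DAYS)
    trip = hohmann_transfer_days(EARTH_ORBIT_AU, MARS_ORBIT_AU)
    print(f"launch window spacing: {window:.0f} days (about {window / 30.44:.1f} months)")
    print(f"idealized one-way Hohmann transfer: {trip:.0f} days")
```

Under these assumptions the window spacing comes out at roughly 780 days, which is consistent with the figure of about twenty-six months quoted here.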

Speaker 1

21:37

Mh.

Speaker 3

21:38

I'm curious about this combination of Tetris and Bejeweled. Looks like the Tetris, as we've named it internally. So okay, we also have an output from Grok here. It wrote a Python script.

Speaker 1

21:52

That is what it's been doing.

Speaker 3

21:53

If you look at the code, there are some constants that are being defined here, some.

Speaker 1

21:58

Colors, and then the tetrominoes.

Speaker 3

22:01

The pieces of Tetris are there. Obviously it's very hard to see at one glance if this is good, so we've got to run this to figure out if it's working.

Speaker 1

22:10

Well, let's let's give it a try.

Speaker 3

22:12

Fingers crossed. All right, so this kind of looks like Tetris, but the colors are a little bit off, right? The colors are different here. And if you think about what's going on here: Bejeweled has this mechanic where if you get three jewels in a row, you know, then they disappear and also gravity activates. Right, so what happens if you get three of the colors together? Okay,

22:38

so something happened. So I think what Grok did in this version is that, you know, once you connect at least three blocks of the same color in a row.

Speaker 1

22:51

Then.

Speaker 3

22:53

They disappear, and then gravity activates and all the other blocks fall down. I'm kind of curious if there's still a Tetris mechanic here where, if the line is full, it actually clears it, or what happens then. It's up to interpretation.
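The game code Grok generated is not shown in the transcript, so the following is only a rough illustration of the mechanic described here: runs of at least three same-colored blocks in a row disappear, then gravity pulls the remaining blocks down, possibly triggering further clears. It is a minimal Python sketch; the grid representation (a list of rows, top row first, `None` for empty cells) and all function names are assumptions.

```python
def find_horizontal_matches(grid, min_run=3):
    """Return the set of (row, col) cells in horizontal runs of min_run+ equal colors."""
    matched = set()
    for r, row in enumerate(grid):
        start = 0
        for c in range(1, len(row) + 1):
            if c == len(row) or row[c] != row[start]:
                if row[start] is not None and c - start >= min_run:
                    matched.update((r, k) for k in range(start, c))
                start = c
    return matched


def apply_gravity(grid):
    """Let blocks fall straight down within each column, in place."""
    rows, cols = len(grid), len(grid[0])
    for c in range(cols):
        stack = [grid[r][c] for r in range(rows) if grid[r][c] is not None]
        padded = [None] * (rows - len(stack)) + stack
        for r in range(rows):
            grid[r][c] = padded[r]


def clear_and_fall(grid):
    """Repeat clear-then-fall until nothing matches; return how many blocks were cleared."""
    cleared = 0
    while True:
        matches = find_horizontal_matches(grid)
        if not matches:
            return cleared
        for r, c in matches:
            grid[r][c] = None
        cleared += len(matches)
        apply_gravity(grid)


if __name__ == "__main__":
    board = [
        [None, None, None, None],
        ["B", "G", "G", "Y"],
        ["R", "R", "R", "G"],
    ]
    print(clear_and_fall(board), board)
```

In the sample board, clearing the three "R" blocks lets the row above fall, which lines up three "G" blocks and causes a second clear.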

Speaker 2

23:11

So who knows. I mean, it'll do different variants when you ask it.

Speaker 3

23:15

It doesn't do the same thing every time, exactly. We've seen a few others that work very differently, but this one seems cool.

Speaker 4

23:22

So are we ready for a game studio at xAI?

Speaker 2

23:27

Yes, so we're launching an AI gaming studio at xAI. If you're interested in joining us in building AI games, please join xAI. We're launching an AI gaming studio; we're announcing it tonight.

Speaker 1

23:39

Let's go. Yeah, big games, that's an actual game. Yeah, all right.

Speaker 4

23:45

So I think one thing that is super exciting for us is that once you have the best pre-trained model, you have the best reasoning model.

Speaker 1

23:55

Right, so we already see that we.

Speaker 4

23:58

Actually give the capability for those models.

Speaker 1

24:00

To think harder, think longer, think more broadly.

Speaker 4

24:04

The performance continually improves, and we're really excited about the next frontier: what will happen if we not only allow the model to think harder, but also provide more tools, just like how real humans solve those problems. For real humans, we don't ask them to solve the Riemann hypothesis

24:21

just with a pen and a piece of paper and no internet. So all the basic web browsing, search engines, and code interpreters, together with the best reasoning model, build the foundations for the Grok agents to come. So today we're actually introducing a new product called DeepSearch, which is the first generation of our Grok agents, that not only helps engineers, researchers, and scientists do coding, but actually helps everyone to answer questions that.

Speaker 1

24:56

You have day to day.

Speaker 4

24:57

It's kind of like a next-generation search engine that really helps you to understand the universe. So you can start asking questions like, for example, hey, when is the next Starship launch date? So let's try that and get the answer. On the left-hand side we see a high

25:17

level progress bar. Essentially, you know, the model is not just going to do one single search like the current RAG systems, but actually thinks very deeply about: hey, what's the user intent here, what are the facts it should consider at the same time, and how many different websites it should actually go and read the content of. Right, so this can save hundreds of hours of everyone's Googling time if you want to really look into certain topics.

Speaker 1

25:42

And then on the right.

Speaker 4

25:44

Side you can see the bullets of what the model is currently doing: what websites it's browsing, what sources it's verifying. And oftentimes it actually cross-validates different sources out there to make sure the answer is correct before it outputs the final answer. We can, you know, at the same time, fire up a few more queries. How about, you know,

26:04

you're a gamer, right? Sure. Yeah, so how about: what are some of the best builds and most popular builds in Path of Exile hardcore, right, hardcore league?

Speaker 2

26:15

You can technically just look at the hardcore ladder; it might be a faster.

Speaker 1

26:19

Way to figure it out.

Speaker 4

26:20

Let's see what the model does. And then we can also do, you know, something more fun. For example, how about making a prediction about March Madness?

Speaker 1

26:30

Yeah, so this is.

Speaker 2

26:31

Kind of a fun one, where Warren Buffett has a billion-dollar bet: if you can exactly match, I think, the entire winning tree of March Madness, you can win a billion dollars from Warren Buffett. So it would be pretty cool if AI could help you win a billion dollars from Buffett.

Speaker 1

26:51

It seems like a pretty good investment.

Speaker 4

26:53

Let's go. Yeah, all right, so now let's fire up the query and see what the model does.

Speaker 1

27:00

So we can actually go back to our very first one.

Speaker 4

27:03

How about the... it wasn't counting on this, that's right. Okay, so we got the first one; the model thought for around one minute. Okay, so the key insight here: the next Starship launch is going to be on the twenty-fourth or later, so no earlier than February twenty-fourth. It might be sooner. So yeah, I think we can, you know, scroll down to see what the model does. So it does a little research on Flight 7, what happened, that it got grounded, and actually.

Speaker 1

27:28

Looks into the FCC filing, you know, from its data collection.

Speaker 4

27:34

And then it makes a new conclusion. If we continue to scroll down, let's see. Right, so it makes, you know, a little table. I think inside xAI we often joke that the time to the first table is the only, you know, latency that matters. Yeah, so that's how the model makes inferences and looks at all the sources, and then we can look into the

28:01

gaming one. So for this particular one, we look at the builds, and, you know, it's a lot better, so that's what it infers. But if we go down, there's the surprising fact about all the other builds. It looks into the twelve classes, and we'll see that the minion build was pretty popular when the game first came out. And now the Invokers of the world.

Speaker 1

28:29

Took over. Monk Invoker, for sure. Yeah, that's right.

Speaker 4

28:33

Yeah, followed by the Stormweavers, and that's really good at mapping. So yeah, and then we can see the March Madness one.

Speaker 1

28:42

How about that?

Speaker 4

28:43

So one interesting thing about DeepSearch is that if you actually go into the panel.

Speaker 1

28:49

Where it shows, you know, what are the subtasks.

Speaker 4

28:52

You can actually click the bottom left of this, right, and then in this case you can actually read the mind of Grok: what information does the model actually think is trustworthy or not? How does it actually cross-validate different information sources? So that makes the entire search experience and information retrieval process a lot more transparent to our users.

Speaker 3

29:16

This is much more powerful than any search engine out there. You can literally just tell it to only use sources from X, and, you know, it will try to respect that. So it's much more steerable, much more intelligent than, I mean.

Speaker 1

29:28

It really should save you a lot of time.

Speaker 2

29:29

So something that might take you half an hour or an hour of researching on the web or searching social media, you can just ask it to go do that and come back ten minutes later.

Speaker 1

29:37

It's done an hour's worth of work for you.

Speaker 2

29:40

That's really what it comes down to, exactly and and maybe better than you could have done it yourself.

Speaker 1

29:44

Yeah.

Speaker 4

29:44

Think about an infinite amount of interns working for you: you can just fire up all the tasks and come back a minute later.

Speaker 1

29:51

This is going to be an interesting one.

Speaker 4

29:52

So March Madness has not happened yet, so I guess we'll have to follow up with the next livestream.

Speaker 2

30:00

Yeah, it seems like a pretty good deal. A forty-dollar subscription might get you a billion dollars, that's.

Speaker 1

30:06

Right, I mean, it might work. So yeah, when are the users going to get their hands on Grok 3?

Speaker 3

30:12

So the good news is we've been working tirelessly to actually release all of these features that we've shown you. The Grok 3 base model with amazing chat capabilities, that's really useful, that's really interesting to talk to, the deep search, the advanced reasoning mode, all of these things. We want to roll them out to you today, starting with the Premium Plus subscribers on X. So that's the first group that will initially get access.

Speaker 1

30:37

Make sure to update your X

Speaker 3

30:38

app if you want to see all of the advanced capabilities, because we just released the update now as we're talking here. And yeah, if you're interested in getting early access to Grok, then sign up for Premium Plus. And also we're announcing that we're starting a separate subscription for Grok that we call SuperGrok, for those real Grok fans that want the most advanced capabilities and the earliest access to new features.

Speaker 1

31:04

So feel free to check that out as well.

Speaker 2

31:06

This is for the dedicated Grok app and for the website. Exactly.

Speaker 3

31:10

So our new website is called grok.com. Yeah, and you can also find our Grok app in the iOS App Store, and that gives you an even more polished experience that's totally Grok-focused, if you want to have Grok easily available, one tap away.

Speaker 2

31:29

Yeah, and the version on grok.com, you know, in a web browser, is going to be the latest and most advanced version, because obviously it takes us a while to get something into an app and then get it approved by the app store. And then, if something is in a phone format, there are limitations on what you can do. So the most powerful version of Grok, and the latest version, will be the web version at grok.com.

Speaker 3

31:51

Yeah, so watch out for the name Grok 3 in the app. That's the giveaway, exactly, that's the giveaway that you have Grok 3. And if it says Grok 2, then Grok 3 hasn't quite arrived for you yet, but we're working hard to roll this out today, and then to even more people over the coming days.

Speaker 4

32:08

Yeah, make sure you update your phone app too, where you're going to get all the tools we showcased today, with the thinking mode, with the deep search. So yeah, really looking forward to all the feedback you have.

Speaker 2

32:20

Yeah, I think we should emphasize that this is kind of a beta, meaning that you should expect some imperfections at first, but we will improve it rapidly, almost every day.

Speaker 1

32:31

In fact, every day I think it'll get better.

Speaker 2

32:34

So if you want a more polished version, I'd say maybe wait a week, but expect improvements literally every day. And then we're also going to be providing a voice mode where you can have conversations. In fact, I was trying it earlier today. It's working pretty well, but it needs a bit more polish: the sort of thing where you can just literally talk to it like you're talking to a person. That's awesome. It's actually, I think, one

32:58

of the best experiences of Grok. That's probably about a week away.

Speaker 3

33:03

So with that said, I think we might have some audience questions. Sure. All right, let's take a look. Yeah, let's take a look at the audience questions from the X platform.

Speaker 2

33:16

Yeah.

Speaker 1

33:17

Cool.

Speaker 3

33:18

So the first question here is: when is the Grok voice assistant coming out? As soon as possible. Just like Elon said, it's just a little bit of polishing away from being released to everybody. Obviously, it's going to be released in an early form, and we're going to rapidly iterate on that.

Speaker 4

33:34

The next question is: when will Grok 3 be in the API? So this is coming: the Grok 3 API, with both the reasoning models and deep search,

Speaker 1

33:44

is coming your way in the coming weeks.

Speaker 4

33:46

We're actually very excited about the enterprise use cases of all these additional tools that Grok now has access to, and how test-time compute and tool use can really accelerate all the business use cases. Another one is: will voice mode be native or text-to-speech? So I think that means: is it going to be one model that is understanding what you say and then talking back to you, or is it going to be some

Speaker 3

34:08

system that has text-to-speech inside of it? And the good news is it's going to be one model, like, a variant of Grok 3 that we're going to release, which basically understands what you're saying and then generates the audio directly from that. So very much like Grok 3 generates text, that model

34:26

generates audio, and that has a bunch of advantages. I was talking to it earlier today and it said, "Hi, Igor," you know, reading my name, probably from some text that it had. And I said, "No, my name is Igor," and it remembered that, you know, so it could continue to say Igor, just like a human would, and you can't

Speaker 1

34:44

achieve that with text-to-speech.

Speaker 4

34:46

So yeah, here's a question for you, pretty spicy, you know: is Grok a boy or a girl?

Speaker 1

34:56

I think Grok is whatever you want it to be. Yeah.

Speaker 2

34:59

Yeah. Single? Yes. All right, the shop is open. So honestly, people are going to fall in love with Grok. It's, like, probable.

Speaker 1

35:10

Uh.

Speaker 4

35:11

The next question: will Grok be able to transcribe audio into text? Yes, so we'll have this capability in both the app and also the API. With that, Grok should just be your personal assistant looking over your shoulder, right, and follow you along the way, learn everything you have learned, and really help you to understand the world better.

Speaker 1

35:28

It becomes smarter every day.

Speaker 2

35:30

Yeah, I mean, the voice mode for Grok isn't simply, it's not just voice to text. It understands, like, tone, inflection, pacing, everything.

Speaker 1

35:38

It's wild. I mean, it's like talking to a person. Yep.

Speaker 4

35:42

So, any plans for conversation memory? Absolutely, we're working on it right now.

Speaker 1

35:47

That's right.

Speaker 4

35:50

Let's see, what are the other ones? So what about, you know, the DM features? Right, so if you have personalization, so that, you know, Grok remembers your previous interactions, yes: should it be one Grok or multiple different

Speaker 2

36:07

Groks? It's up to you. You can have one Grok or many Groks. I suspect people will probably have more than one.

Speaker 1

36:14

Yeah, I want to have a doctor Grok. Yeah, the Grok doc. That's right. Right, cool.

Speaker 3

36:22

So in the past we've open-sourced Grok 1, so somebody is asking us if we're going to do it again.

Speaker 2

36:29

Yeah, I think our general approach is that we will open-source the last version when the next version is fully out. So when Grok 3 is mature and stable, which is probably within a few months, then we'll open-source Grok 2.

Speaker 4

36:48

Mm-hmm, okay, so we probably have time for one last question. What was the most difficult part about working on this project, I assume Grok 3, and what are you most excited about? So I think, for me, looking back, you know, getting the whole model training on 100k H100s coherently, that's almost a battle against the final boss of the universe, entropy. Because at any given time, you can have a cosmic ray beaming down and flipping a bit in your, you know, transistor, and

37:20

now the entire gradient update, if it flips a mantissa bit, the entire gradient update is out of whack.
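Editor's illustration of the bit-flip point above: a minimal Python sketch, assuming gradients are stored as 32-bit floats and averaged across workers. The helper name `flip_bit`, the value 0.5, and the 1,000-element list are hypothetical and not from the presentation. It suggests how much the damage depends on which bit is hit: a low mantissa bit barely moves the value, a high mantissa bit changes it noticeably, and an exponent bit can swamp the whole average.

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = lowest mantissa bit, 31 = sign) of value's float32 encoding."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return flipped

# Hypothetical gradient values, one per worker (illustrative numbers only).
grads = [0.5] * 1000
clean_mean = sum(grads) / len(grads)

low_mantissa = flip_bit(0.5, 0)    # lowest mantissa bit: barely changes the value
high_mantissa = flip_bit(0.5, 22)  # highest mantissa bit: 0.5 becomes 0.75
exponent = flip_bit(0.5, 30)       # top exponent bit: 0.5 becomes 2**127

# A single corrupted value among 1000 is enough to wreck the averaged update.
corrupted = grads.copy()
corrupted[0] = exponent
corrupted_mean = sum(corrupted) / len(corrupted)

print(clean_mean, low_mantissa, high_mantissa, exponent, corrupted_mean)
```

This is only a toy model of the failure mode the speaker describes; it says nothing about how the actual training system detects or recovers from such errors.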

Speaker 1

37:27

And now a hundred thousand of those.

Speaker 4

37:28

Orchestrate them every time. At any given time, GPUs can go down, and

Speaker 2

37:35

I mean, it's worth breaking down how we were able to get the world's most powerful training cluster operational within one hundred and twenty-two days. Because when we started off, we actually weren't intending to do a data

37:49

center ourselves. We went to the data center providers and said, how long would it take to have one hundred thousand GPUs operating coherently in a single location? And we got time frames from eighteen to twenty-four months. So we're like, well, eighteen to twenty-four months, that means losing is a certainty. So the only option was to do it ourselves. So if you break down the problem, I guess I'm doing, like, reasoning here, it takes

38:15

your thing. Yeah, exactly. So, well, we needed a building. We can't build a building, so we must use an existing building.

Speaker 1

38:23

So we looked, basically, for factories that had been

Speaker 2

38:29

that had been abandoned, but where the factory was in good shape, like the company had gone bankrupt or something.

Speaker 1

38:33

So we found an Electrolux factory in Memphis.

Speaker 2

38:36

That's why it's in Memphis, home of Elvis, and also one of the oldest, I think it was the capital of ancient Egypt. And it was actually a very nice factory that, for whatever reason, Electrolux had left, and

38:54

that gave us shelter for the computers. Then we needed power: we needed at least one hundred and twenty megawatts at first, but the building only had fifteen megawatts, and ultimately, for two hundred thousand GPUs, we needed a quarter gigawatt. So we initially leased a whole bunch of generators, so we have generators on one side of the building, just trailer after trailer of generators

39:20

until we get the utility power to come in. But then we also need cooling, so on the other side of the building it was just trailer after trailer of cooling, So we leased about a quarter of the mobile cooling capacity of the United States on the other side of the building. Then we needed to get the GPUs all installed, and they're all liquid cooled, so in order to achieve the density necessary, this is a liquid cooled system.

Speaker 1

39:41

So we had to get all the plumbing for liquid cooling.

Speaker 2

39:44

Nobody had ever done a liquid-cooled data center at scale, so this was an incredibly dedicated effort by a very talented team to achieve that outcome.

Speaker 1

39:55

And you might think, now it's going to work. Nope.

Speaker 2

39:57

The issue is that the power fluctuations for a GPU cluster are dramatic. So it's like this giant symphony that is taking place, like having a symphony with one hundred thousand or two hundred thousand participants, and the whole orchestra will go quiet and loud in, you know, one hundred milliseconds, and so this caused massive power fluctuations, which then caused the generators to lose their minds, and they

Speaker 1

40:28

weren't expecting this.

Speaker 2

40:30

So to buffer the power, we then used Tesla Megapacks to smooth out the power.

Speaker 1

40:38

So the Megapacks had to be reprogrammed.

Speaker 2

40:41

So with xAI working with Tesla, we reprogrammed the Megapacks to be able to deal with these dramatic power fluctuations, to smooth out the power so that the computers could actually run properly. And that worked.
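Editor's illustration of the buffering idea described above: a toy Python sketch, assuming a load that swings between two levels while the generators hold a steady output and a battery absorbs or supplies the difference. The function name `battery_flow` and all megawatt figures are hypothetical, and this is not a description of how the Megapacks were actually programmed.

```python
def battery_flow(load_mw, generator_mw):
    """Per-step battery power in MW: positive = discharging, negative = charging."""
    return [load - generator_mw for load in load_mw]

# Hypothetical cluster load that swings between loud and quiet phases.
load_mw = [250, 50, 250, 50, 250, 50]

# Generators hold the average load, so they never see the swings.
generator_mw = sum(load_mw) / len(load_mw)

flow = battery_flow(load_mw, generator_mw)
print(generator_mw, flow)
```

In this simplified picture the battery's charging and discharging cancel out over a full cycle, which is why a buffer can flatten the demand seen by the generators without needing a net energy source of its own.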

Speaker 1

40:57

It was quite tricky. And then, but

Speaker 2

41:00

even at that point, we still had to make the computers all communicate effectively, so all the networking had to be solved, and debugging a bazillion network cables, debugging NCCL at four in the morning. We solved it at, like, roughly four twenty a.m. What we figured out, well, there were a whole bunch of issues. One was, like, a BIOS mismatch.

Speaker 1

41:25

The BIOS was not set up correctly.

Speaker 3

41:28

We had to compare lspci outputs between two different machines, one that was working, yeah, one that was not working. Many, many other things.

Speaker 1

41:38

I mean, yeah, exactly.

Speaker 2

41:38

This would go on for a long time if we actually listed all the things. But it's, like, interesting. It's not like, oh, we just magically made it happen. You had to break down the problem, just like Grok does for reasoning, into the constituent elements, and then solve each of the constituent elements in order to achieve a coherent training cluster in a period of time that is a small fraction of what anyone else could do it in.

Speaker 3

41:59

And then once the training cluster was up and running and we could use it, we had to make sure that it actually stays healthy throughout, which is its own grand challenge. And then we had to get every single detail of the training right in order to get a Grok 3 level model, which is actually really, really hard.

42:14

So we don't know if there are any other models out there that have Grok 3's capabilities, but whoever trains a model better than Grok 3 has to be extremely good at the science of deep learning and at every aspect of the engineering.

Speaker 1

42:26

So it's not so easy to pull this off. And this is not going to be the last cluster we build and the last model we train.

Speaker 2

42:33

Oh yeah, we've already started work on the next cluster, which will be about five times the power, so instead of a quarter gigawatt, roughly one point two gigawatts.

Speaker 1

42:42

What's the Back to the Future one? What's the power

Speaker 2

42:49

of the Back to the Future car? Anyway, the Back to the Future power, it's roughly on that order, I think. So, you know, this will be sort of the GB200/GB300 cluster. Once again, it will be the most powerful training cluster in the world. So we're not, like, stopping here, and

Speaker 4

43:05

our reasoning model is going to continue to improve by accessing more tools every day. So yeah, we're very excited to share any of the upcoming results with you all.

Speaker 3

43:14

Yeah, the thing that keeps us going is basically being able to give Grok 3 to you, and then seeing the usage go up, seeing everybody enjoy, you know, Grok.

Speaker 1

43:23

That's what really gets us up in the morning. So thanks for tuning in. Thanks, guys.

Transcript source: Provided by creator in RSS feed: download file
