Jack Morris on Finding the Next Big AI Breakthrough

Speaker 1

00:02

Bloomberg Audio Studios, Podcasts, Radio News. Hello and welcome to another episode of the Odd Lots podcast.

Speaker 2

00:21

I'm Joe Weisenthal and I'm Tracy Alloway.

Speaker 1

00:23

Tracy, have you played around with GPT-5 much?

Speaker 2

00:26

Not really, I've been Perplexity-pilled. Oh, that's your main? Yeah, that's my main one at the moment. But is it good? I hear mixed.

Speaker 1

00:35

I use it because I use GPT every day. It does not strike me as, like, obviously better for my uses than the o3 models, which I've been very impressed by. Because, you know, I want to establish I'm

Speaker 3

00:46

No hater or anything like that.

Speaker 1

00:48

But like, it did not strike me as like, oh, this is like an.

Speaker 2

00:50

Amazing Yeah, this is the thing.

Speaker 3

00:53

Step function or whatever.

Speaker 2

00:54

It feels like the sort of breakthroughs awe inspiring breakthroughs are kind of behind us, and a lot of the progress on the models feels very incremental at this point, even though people are spending a lot of time and resources on doing it.

Speaker 1

01:07

The one thing GPT-5 does is prompt me and say, oh, that's a great question. Would you like to follow up more on that?

Speaker 3

01:12

But it's like does it.

Speaker 2

01:13

Say, oh, Joe, you're so smart? That's such a smart question.

Speaker 3

01:16

Say you know what it did? Say?

Speaker 1

01:17

I asked to follow up, and it started an answer with love it and then love it? Do you want me to look into that?

Speaker 4

01:23

Yes?

Speaker 2

01:24

They are very flattering, aren't they. Actually, that's one thing I like about perplexity is it doesn't really flatter you. It just spits out an answer.

Speaker 1

01:30

So anyway, there's so many questions I have about AI, and we talk about the business a fair amount, and Nvidia and all that stuff. We actually don't really talk that much about the pure research side. But it's pretty important, I think, because I think a lot of people would agree that if the scaling is, like, slowing down, or if there were a wall or something like that, that might change some of these business model calculations,

01:51

et cetera. So I think it's good to get an update on just sort of the state of the art, the science of AI.

Speaker 4

01:57

Yeah.

Speaker 2

01:57

Also, it would be nice just to understand what's possible in terms of the AI models and what people are actually researching, what they're working towards. Like, is it mostly about price? Is it mostly about the output? Is it mostly about energy use? All those things?

Speaker 1

02:12

All those things, Well, I'm really excited to say we have the perfect guest, someone who is an AI researcher. We're gonna be speaking with Jack Morris. He's currently about to finish his PhD.

Speaker 3

02:20

At Cornell in AI.

Speaker 1

02:22

He's been affiliated with Meta professionally, so presumably he already has a one hundred million dollar pay package in the bank. Now he's shaking his head, he's not. That's a joke. But Jack, thank you so much for coming on Odd Lots.

Speaker 4

02:36

Yeah, thanks for having me. This is gonna be fun.

Speaker 1

02:38

Why don't you explain to me, like, what you're up to, because I don't really understand how

Speaker 3

02:41

It works where people are.

Speaker 1

02:42

They're at a university and they're also at a company, and this isn't how it works in much of the world, right? People get their degree and then they get a job. I get the impression that in the AI world it's a little fuzzier in terms of one's affiliations between industry and education and stuff like that.

Speaker 4

02:58

Yeah, that's definitely true. I think it might be on the way out, but I can tell you about my situation. So there's kind of a public research world and, like, a private research world, and all the academic institutions do public research, and the AI labs like OpenAI, Anthropic, Google DeepMind, they essentially do private research, where they have these people in house that are running experiments and learning more about their systems, but they don't publish anything or

03:25

share any of their knowledge. And so a cool thing about getting your PhD right now is you can do research, write about it, and then publicize it, like put it online, tweet about it. I kind of, like, can talk to you about it. And there's a few places left that will still kind of moment, we're never.

Speaker 3

03:41

Going to hear from you again.

Speaker 4

03:44

Yeah, I'll make sure they have a clause in my contract that I can still talk to Joe and Tracy.

Speaker 2

03:49

The Odd Lots clause. Yes, that would be important. So when we say AI research or an AI researcher, what exactly does that entail? Can't the AI models just research themselves? Just let them do it?

Speaker 4

04:02

Yeah, that's actually a very smart idea, and, like, people are really worried about that, actually. Like if we get to the point where the AI can improve itself through research, yeah, then it sort of gets smarter and then it improves itself again, and it ends up being this kind of exponential improvement that ends up with all of our demise. But I think right now it's not quite there yet. Like maybe you can talk to ChatGPT. Oh, good. Yeah. And good news for me too, because it means I

04:30

can still get a degree and be gainfully employed. But I think it's still helpful, but we still need, like, humans to make these improvements. And in terms of what the actual day to day work looks like, I think it really varies. Like there's some people working on trying to make the models run faster, or trying to make the hardware that runs the models run faster and more efficiently.

04:50

There's people that try to work on the data, like what should we train on: more coding problems, or more textbooks, or more Reddit posts? What works best to make the model? And then there's a lot more people working on different areas of the stack, like training algorithms. I kind of have my own little niche. There's this old field of information theory from, like, the twentieth century, where they talk about bits. Like, a zero or a one is a bit, and you can add them

05:16

up and have kilobytes and megabytes. And so I've been trying to think about what that means in, like, the ChatGPT world: if you train a model on a certain number of bits, how many bits does it actually learn? And, like, can you look at the model and figure out, if you have one slice of the model,

05:30

how many bits that is and stuff like that. So maybe the easiest way to explain is if you had, for some godforsaken reason, to use ChatGPT as, like, a flash drive, like you had a certain set of data and it had to memorize all that data, how much data could it actually store? That's the kind of area I've been working in. And then, you know, once you're there, you kind of realize we could do this, or maybe next semester, if we have time, we could

05:53

try this other thing. And so there's it kind of branches out and there's a lot of little problems that you can try.
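
A minimal sketch of the "model as a flash drive" arithmetic described above. The unit conversion is standard (8 bits per byte, decimal kilobytes and megabytes). The capacity estimate is only an illustration: the function `storage_capacity_bits`, the idea that capacity is simply parameters times a per-parameter figure, and the example numbers are all assumptions, not values stated in the conversation.

```python
BITS_PER_BYTE = 8


def bits_to_units(num_bits: float) -> dict:
    """Convert a raw bit count into bytes, kilobytes, and megabytes (decimal units)."""
    num_bytes = num_bits / BITS_PER_BYTE
    return {
        "bits": num_bits,
        "bytes": num_bytes,
        "kilobytes": num_bytes / 1_000,
        "megabytes": num_bytes / 1_000_000,
    }


def storage_capacity_bits(num_parameters: int, bits_per_parameter: float) -> float:
    """Hypothetical estimate: assume each parameter stores a fixed number of bits."""
    return num_parameters * bits_per_parameter


# Illustrative numbers only, not measurements of any real model.
capacity = storage_capacity_bits(num_parameters=1_000_000, bits_per_parameter=2.0)
print(bits_to_units(capacity))
```

The open research question is what the per-parameter figure actually is for a given model, which this sketch takes as an input rather than answering.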

Speaker 1

05:57

I mentioned GPT-5 seemed fine to me. It does not strike me as, like, you know, because actually, the first time I used ChatGPT I was genuinely blown away, like most people. And then actually I was pretty blown away by the o3 models, in part because of how well they could do document search, superior to Google Search in many respects, and also just the organization of a lot of unstructured data, et cetera. Like, I didn't have some oh-my-god-wow moment with GPT-5.

06:25

It's like, this seems like, how do we measure whether AI is getting better all the time.

Speaker 4

06:32

Yeah, that's a huge question, right.

Speaker 1

06:35

Well, let me ask you, Okay, let me ask you actually a more specific question. How do the entities that test AI models as their job or as their function? What does the formal testing process look like to rank the quality of AI models?

Speaker 4

06:52

Okay, yeah, that's more tractable. We can start there, and then we can talk about o3 and GPT-5. So there's essentially two ways people do this kind of model evaluation. The main one is just by testing them on different data sets. So, for example, there's this data set called SWE-bench that's a bunch of software engineering related coding problems, and they all have a human-written solution and tests, and so you can ask GPT-5, can you write the code for this, and

07:20

then run the tests and see if it's right? And still the models are pretty bad at that. I think they can do about half of them. They're very hard. They're like entire days of work for professional software engineers. But when a new model comes out, they can say, oh, look, we actually got a higher score on SWE-bench. And there's a ton of different data sets like that. So when GPT-5 comes out, they say, you know, it's

07:39

better at these types of coding tests. And a big one that specifically OpenAI has been advocating for is math, like they did the International Math Olympiad, and they said essentially GPT-5 scored at the level of the best high school mathematicians, which is pretty cool. But you raise a good question of how does that actually map to real world usage? And I think this is, like, a really hard problem that people still haven't figured out.
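
A minimal sketch of the test-based evaluation loop described above: each task ships with tests, a candidate solution is run against them, and the benchmark score is the fraction of tasks solved. This is a toy stand-in, not the real SWE-bench harness; the task names, the test-case format, and the functions `passes_all_tests` and `benchmark_score` are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A hypothetical task: a name plus (input, expected_output) test cases.
Task = Tuple[str, List[Tuple[int, int]]]


def passes_all_tests(solution: Callable[[int], int], cases: List[Tuple[int, int]]) -> bool:
    """A task counts as solved only if every test case passes without an error."""
    for arg, expected in cases:
        try:
            if solution(arg) != expected:
                return False
        except Exception:
            return False
    return True


def benchmark_score(tasks: List[Task], solutions: Dict[str, Callable[[int], int]]) -> float:
    """Fraction of tasks whose submitted solution passes all of its tests."""
    solved = sum(
        1
        for name, cases in tasks
        if name in solutions and passes_all_tests(solutions[name], cases)
    )
    return solved / len(tasks)


tasks: List[Task] = [
    ("double", [(2, 4), (5, 10)]),
    ("square", [(3, 9), (4, 16)]),
]
# Stand-ins for model-written code: one correct answer, one buggy answer.
model_solutions = {
    "double": lambda x: x * 2,
    "square": lambda x: x * 2,  # wrong: doubles instead of squaring
}
print(benchmark_score(tasks, model_solutions))
```

A score like this says how many tasks passed their tests, which is exactly why it can diverge from how useful a model feels in everyday use.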

Speaker 2

08:06

Does anyone try to capture that sort of, like, je ne sais quoi, I guess, when it comes to AI models? Is one of the tests asking it to, I don't know, come up with a stupid limerick or something?

Speaker 4

08:18

Yeah, there are a lot of tests like that. There's some creative writing benchmarks and some poetry related ones. But I think you point out something interesting. For example, I mostly use Claude from Anthropic, and I think Claude does have this something to it that's, like, a little bit different, and it's very difficult to characterize. It's just sort of the way it speaks to you and the way it thinks of itself. I like it a lot better, but I don't know how you would design

08:45

like a data set that can really capture that. The second way they do the evaluation is by what they call Elo scores, like in chess. So they, for example, ask two models to write a limerick, and then they have humans rank which one is better, and they make this kind of ladder of Elo rankings for models. So I think right now Claude or GPT-5 or maybe the Google model is top on this ladder.
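
A minimal sketch of the Elo-style ranking described above, using the standard chess Elo update. The starting rating of 1000, the K-factor of 32, and the model names are assumptions chosen for illustration; real model leaderboards may use different constants or a different fitting procedure.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation: probability that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one human preference judgment."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Hypothetical models and judgments: (winner, loser) for each pairwise comparison.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
judgments = [("model_x", "model_y"), ("model_x", "model_y"), ("model_y", "model_x")]

for winner, loser in judgments:
    ratings[winner], ratings[loser] = update_elo(ratings[winner], ratings[loser], a_won=True)

ladder = sorted(ratings.items(), key=lambda item: item[1], reverse=True)
print(ladder)
```

Each update is zero-sum: whatever one model gains, the other loses, so the ladder only encodes relative preference, not absolute quality.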

Speaker 1

09:10

The algorithm made famous in The Social Network, the one Mark Zuckerberg used to rate his colleagues, is still the workhorse model for comp evaluation.

Speaker 2

09:19

That's some good trivia, Joe, very good and no comment. Well, I assume just on the hard number evaluation. People are also ranking these on data usage, energy, that sort of.

Speaker 4

09:31

Thing as well.

Speaker 2

09:32

Right speed, speed would be a definitely.

Speaker 4

09:35

The AI companies like to use price as a metric, which is kind of interesting because there's a lot that goes on behind the scenes, including just sort of like free money that drives the prices down, but they also do benchmark speed, and I think you make a good point that the benchmarks can be pretty misleading, Like, for example, there's a bunch of recent open source models that came from different Chinese AI labs that have really, really high

09:59

scores on certain benchmarks, but people kind of think they're not as good for real world usage for whatever reason.

Speaker 1

10:06

I've seen people talk about this. Isn't part of the problem with testing AI or evaluating AI that a lot of these problems exist in the real world already, right? You see this a lot, and people are always finding this, which is that here's an AI model that is amazing at math on the Math Olympiad, and yet it gets tripped up by questions like, which is heavier, a pound of steel or two pounds of feathers? And they'll say

10:33

that that's a trick question, a pound of steel weighs the same as two pounds of feathers. Which is clearly, like, it has clearly been trained in some sense to recognize this steel versus feathers thing or whatever it is. I forget if it's steel. But it also clearly can't measure whether one or

Speaker 3

10:49

Two is bigger.

Speaker 4

10:50

Yeah, that's a really good example. I think they kind of successively include these kinds of things in more rounds of training data, and so every time a new model comes out, they kind of patch little holes that appeared in the previous models. So you're pointing to this, like they probably started with the classic riddle that's like a pound of bricks or a pound of feathers, and they're equal, but then, like, the models got that wrong, and so they added to.

Speaker 1

11:13

Something a very efficient way to achieve intelligence, like, oh yeah, we should have included that.

Speaker 3

11:20

Oh yeah, we got to include that trick. Oh yeah, we gotta have right.

Speaker 1

11:22

Like ever, like going that does not speak to me of a line towards something that we would call anything resembling human intelligence.

Speaker 4

11:32

I definitely agree. I think one counter example is people said this for a long time about self driving cars, Like everyone was really excited about them for a long time, and then they kind of didn't really work, like eight or so years ago, and there was this period where they were saying, oh, the models can't do green cones. We're going out there trying to take videos of green cones, and yeah, they can't do snow. I'm saying that it

11:55

worked for them, and so it might be possible. But in the case of language models, there's something a little more interesting happening, because we now have two ways to learn. If you guys are ready, we could get into something a little technical, which I think gives you some insights. So there's essentially two ways you can teach

12:13

machines to learn from data. One is called supervised learning, where the computer will copy what you did, which is, like, basically what we're talking about now, and the other is called reinforcement learning, where the computer just does something and then you give it a reward if it does something well. And so for a long time, like, the original ChatGPT was mostly just trained with supervised learning, like it would just copy the text from all of the Internet, and so the best it could ever do

12:39

is emulate Reddit posts very well. And there was a tiny bit of reinforcement learning, but people didn't know how to do it right. And then you mentioned this o3 model, which is kind of in some ways like a big jump, like it made the models much better at math, much better at certain things. And the way they did that is actually through reinforcement learning. They found a way to kind of, like, let the model think for a while and then give it a reward when it gets the

13:05

answer at the end. It's kind of scary.
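
A minimal sketch contrasting the two ways of learning described above, on a toy problem with three possible actions. In the supervised step the policy is pushed toward an action a demonstrator chose; in the reinforcement step the policy samples its own action and is only told a number (the reward). This is a textbook-style softmax policy with a REINFORCE-like update, chosen for illustration; it is not a description of how any lab actually trains its models, and the learning rate, step counts, and reward function are assumptions.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into action probabilities."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_step(logits, demonstrated_action, lr=0.5):
    """Copy the demonstration: raise the log-probability of the action a human took."""
    probs = softmax(logits)
    return [
        logit + lr * ((1.0 if i == demonstrated_action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def reinforcement_step(logits, reward_fn, rng, lr=0.5):
    """Try something, receive a number, and reinforce the sampled action in proportion to it."""
    probs = softmax(logits)
    action = rng.choices(range(len(logits)), weights=probs)[0]
    reward = reward_fn(action)
    return [
        logit + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


CORRECT_ACTION = 1  # hypothetical "right answer" for this toy problem

supervised_logits = [0.0, 0.0, 0.0]
for _ in range(50):
    supervised_logits = supervised_step(supervised_logits, CORRECT_ACTION)

rng = random.Random(0)
rl_logits = [0.0, 0.0, 0.0]
for _ in range(300):
    rl_logits = reinforcement_step(
        rl_logits, lambda a: 1.0 if a == CORRECT_ACTION else 0.0, rng
    )

print(softmax(supervised_logits)[CORRECT_ACTION], softmax(rl_logits)[CORRECT_ACTION])
```

The reward here really is just a higher number: the update never sees the correct action directly, only whether the action it happened to try scored well.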

Speaker 2

13:07

Yeah, when you say give it a reward, is.

Speaker 3

13:10

It like take a cookie paying robots?

Speaker 1

13:13

Yeah?

Speaker 2

13:14

Well no, genuinely, like what is the reward? You just tell it it did a good job.

Speaker 4

13:18

You just give it like a higher number. Okay, and that makes you happy, all right.

Speaker 2

13:22

I'd get a little bit worried when we're like giving it cupcakes or something like here you go, good job. Just going back to the intro, you know, we were talking about how it feels like a lot of the progress on AI models is a little bit more incremental, and I guess it's hard to tell whether that's just personal bias because now we're used to them and the sort of wow moment has passed. But what does it

13:44

feel like to you in terms of improvements? Are we seeing the improvement cycle accelerate or decelerate at this point?

Speaker 4

13:52

I think it's kind of like the market, where it's like it always gets faster for a little while, and then it feels like things have slowed down, and the progress is never quite in the areas that you expect. As one example, people really thought this year was the year when the assistants would start being able to act like actual assistants, like the year of agents. People actually coined that term, I think, like the year of agents, and it really didn't happen, for whatever reason. Maybe

14:19

it will in the next three months. But the agents are still pretty bad the ones that you can use. But they did get way better at competitive math, Like now they can do these like world class proofs that they couldn't do before. So it's almost unpredictable, like which areas the AI will kind of conquer next, But it does feel like progress is continuing.

Speaker 1

14:39

Actually, what happened with agents? I've never had a successful agent experience, even basic things like come up with a list of every past Odd Lots guest, yeah, and put it in a file or something like that, which, just, there's an RSS feed that exists for Odd Lots. This should be ray stick for it all around, and then something will happen or it'll get lazy. Here's, like, here's fifteen. And what is actually, this is, thought leaders love this stuff. They love talking about the agents. So

15:06

what actually happened with agents? Maybe they'll get there, but what is the roadblock there?

Speaker 4

15:11

I don't think there's any conceptual roadblock, Like there's no reason why you couldn't collect data for that and train them either in a supervised way or using reinforcement learning. It just hasn't happened yet. So I think maybe behind the scenes it turned out that the problem was harder than people thought, Like getting data from all those scenarios

15:28

is really hard. And there have been some stories from, like, people that I've heard of that founded these little companies in San Francisco, and they build these tiny environments for the AI labs to do reinforcement learning on for agents. Like, for example, doing a calendar. They'll build, like, a little calendar app, but make it have rewards so you can do reinforcement learning, and they can just sell that

15:49

for like hundreds of thousands of dollars. So I think the progress is ongoing behind the scenes, Like there's a whole ecosystem built around it. It just hasn't really manifested in the products that we use.
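
A minimal sketch of what "a little calendar app with rewards" could look like as a reinforcement learning environment. Everything here is hypothetical: the class `ToyCalendarEnv`, the action format, the working-hours range, and the reward values are invented for illustration and do not describe any real product sold to AI labs.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCalendarEnv:
    """Hypothetical environment: the agent must book a meeting in a free working hour."""

    busy_slots: set = field(default_factory=lambda: {9, 10, 14})
    booked: dict = field(default_factory=dict)

    def step(self, action: dict) -> float:
        """Apply an action such as {"title": "sync", "hour": 11} and return a reward."""
        hour = action.get("hour")
        title = action.get("title")
        if not isinstance(hour, int) or not title:
            return -1.0  # malformed action
        if not 8 <= hour <= 17:
            return -1.0  # outside the assumed working hours
        if hour in self.busy_slots or hour in self.booked:
            return 0.0  # well-formed, but conflicts with an existing event
        self.booked[hour] = title
        return 1.0  # successfully booked a free slot


env = ToyCalendarEnv()
for attempt in [{"title": "sync", "hour": 9}, {"hour": "noon"}, {"title": "sync", "hour": 11}]:
    print(attempt, "->", env.step(attempt))
```

The point of wrapping an app this way is that every attempt an agent makes yields a score automatically, which is what reinforcement learning needs and what ordinary software does not provide.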

Speaker 2

16:00

I was going to ask, how much of the difficulty is, you know, the actual development of the models, the thinking part, versus just getting them to plug in seamlessly with other applications.

Speaker 4

16:11

Yeah, I think the second thing is probably the biggest barrier in terms of time. Like, it just takes a really long time to figure out what data you need and collect it properly and actually train the models on that data. But at the same time, there are people like me who are trying to work on better, like, conceptual frameworks for training the models. So to go back to the o3 example, doing reinforcement learning on ChatGPT, like, that seems to me like a huge breakthrough. Like, we

16:40

didn't know how to do that before. It unlocks all sorts of doors and ways to train the models. So even if maybe you don't think that model was that much better than the previous one, it seems like it will give us huge improvements in the future.

Speaker 1

17:10

So you mentioned at the intro that it's possible, hopefully you'll get a clause, but you might end up in a situation in which you go to work for some frontier AI lab and we never hear from you again, or you just post cryptic tweets like, oh, no idea what's coming, oh, it's gonna

Speaker 3

17:26

Be so over or whatever. Yeah, an the death Star, Yeah, it's very annoying.

Speaker 1

17:30

The way they all tweet. It's possible. Talk to us about, like, why not work on an open source project? And this is of course when people talk about DeepSeek and a lot of the Chinese models that the US competes with, a lot of those are open source. Presumably you could keep coming on Odd Lots over and over again. Why, like, what is even the case for the best and the brightest to work on closed source frontier models?

Speaker 4

17:54

Yeah, it's a really hard question. Like, I've struggled with this in my own personal decision making. I was originally thinking, oh, I'd love to become a professor and mentor younger students and get a whole, like, group of these ideas going and start working on similar related problems to the stuff I was talking about. And I still think that would be fun. But there's a big gap in terms of the things we can do at Cornell and the things that you can do at OpenAI.

18:20

Like they just have like crazy infrastructure for training models really easily and data and a ton of really good data. And so I think as that gap has widened, I've felt like a lot of what we're doing is like kind of devising these toy scenarios where we can study interesting things, but I feel a bit disconnected from the real like progress of humanity. You know, like if you really agree that this is like the biggest problem of

18:47

our time. I don't want to say it's like the Manhattan Project, but it's more like trying to go to the Moon in the sixties, the space race. It's kind of like a space race going on in these different private labs. You want to be a part of it. Like, there's crazy energy, and it has huge implications for the future of society. So I think I am interested

19:05

in participating in that. My big question is, like, if you think that the reinforcement learning thing was the most recent big scientific breakthrough, like o1, and then o3, what's next? And then, like, where will that actually be happening? That's kind of what I'm thinking about right now.

Speaker 2

19:22

Just on the data point, I was reading your excellent Substack, and you argue that there's probably an upper bound to what you can get out of a given data set, and at some point, like, the training starts to look pretty similar, right, and the data becomes the differentiating factor. How important are data sets to AI research? And I guess, like, how do you go about finding really cool ones and

19:50

what's left? Because I feel like, you know, using the space race analogy, everyone has been running so fast on this. It feels like all the data sets must have been explored by now, but I guess they haven't.

Speaker 4

20:02

Yeah, yeah, I think this is really getting to the heart of what people are trying to figure out right now in all these different labs. So I think you're pretty much right that all of the public data sets we have are pretty much used to train o3 or GPT-5 or whatever. If there is a really good website that should have been scraped and downloaded into the model, it has probably been used. But there apparently is a

20:30

much larger amount of private data than public data. I mean, you all work for Bloomberg, so you're probably intimately familiar with this. But if you think about the different AI labs that exist, they actually now do kind of have different data-related moats. Like xAI, they have all of the Twitter data that's basically impossible to get elsewhere. ChatGPT now has all of the user conversations with ChatGPT, which are really useful. Claude has a ton of coding data

20:55

that other people don't have. Google has YouTube, which some people think might be like the next source of making really good models, and none of those things are really included, at least not much in today's models.

Speaker 3

21:06

This is really important.

Speaker 1

21:07

Like, once a lab builds some sort of base, whether it's Anthropic in coding or maybe Cursor in coding, even though they're not, like, a core lab, et cetera, like, they become a source of their own data that literally nobody else has.

Speaker 4

21:23

Yeah, actually Cursor is a great example. So they are very technical, they have really smart people. They're very small, so they haven't quite scaled to at least in terms of the number of people.

Speaker 1

21:33

But I think about this like every time I was like, when I've played with this is like this is good, this is bad. I'm constantly teaching their model to get better, right right, right right.

Speaker 4

21:41

They're in a problem where they have the data. They just have to take the right algorithms and scale it up to train the model to be as good as Claude is. But that actually seems a lot more feasible than other companies that have no data and want to train good models, even if they know how, it seems very difficult.

Speaker 2

21:56

How closely are AI researchers working or talking to I guess other parts of the AI ecosystem, so you know, chip makers, maybe cloud providers, that sort of thing. Is there a lot of dialogue or not really.

Speaker 4

22:11

I think certain people talk all the time to the chip makers, Like there's a big community of people. You know, the AI models all run on GPUs, and there are a lot of people that are getting really good at writing fast GPU code. It's called kernels, and all those people who work on kernels talk to the chip makers all the time. Like Amazon's making their own chips, Google

22:32

has their own chip. Now all the hyperscalers are making chips, and I think they're all trying to talk to the people that actually write the fast code that runs on chips to figure out, I think they call it hardware-software co-design. Like, everyone's kind of getting together and trying to figure out what the best way is to design the next round of GPUs.

Speaker 1

22:48

So you mentioned, okay, Google might have an advantage because it owns YouTube and there's just tons of obviously just tons of.

Speaker 3

22:57

Data in there.

Speaker 1

22:57

So one way you could get access to the YouTube data is to literally be Google and own it. But another way that maybe you could get access to YouTube data is operate in China where there are no laws about this type of thing, or no, they're not beholden

23:13

to US copyright, and just sort of scrape it all. Again, since most of the Chinese AI labs are open source, why isn't this just a huge advantage for the Chinese labs, that they're really not going to be, hey, OpenAI, they get sued by the New York Times, all these, DeepSeek isn't having to deal with all these headaches?

Speaker 4

23:33

Yeah, I think the American AI labs will probably do things behind the scenes that they wouldn't tell you about to get good data solution. Just don't so Yeah, Like I think they wouldn't release the models that are potentially trained on scraped or copyrighted data. But if that's the way to get better math Olympiad scores, then people will

23:55

I think, I would guess, do that. But you're right that, like, the Chinese model makers can sort of take all the books that they can pirate from the Internet and train on them and they're not violating any laws and they can release the model to the public and it's all fine, which is honestly great for us, because then people like me could probably download a model that's better than we would get otherwise.

Speaker 2

24:15

What was your impression of DeepSeek when it came out? And now?

Speaker 4

24:20

I was pretty surprised at how much of a splash they made. The model is really good, and I think a lot of people are building on it, including me, and, like, most people that are at AI companies that aren't super, super big are building on DeepSeek. But it was surprising, like, what a huge deal it was to people. Like, my mom's asking me about DeepSeek. I think my grandma knew about DeepSeek, and she barely knew about ChatGPT.

Speaker 2

24:47

So that's when you know it's gone mainstream when starts.

Speaker 4

24:50

Asking you and there was nothing else so far. I think in the AI space that's made quite that much news.

Speaker 1

24:55

But it sounds like what you're saying is that it's a very good model, but that on the actual specs, from your perspective, it didn't quite deserve as much attention. Like, it was good, but in your view, it's not so good that everyone needed to be talking about it.

Speaker 4

25:12

Yeah, I think it's really useful because they released all their model weights and they said exactly what they did to train it, although they didn't say what the data was. But it gave me the impression that they're maybe six to twelve months behind the American AI labs in terms of how well they can do the training and stuff. But it still was a pretty big update for me to know that, wow, there are one hundred people that don't have PhDs working at a Chinese hedge fund that

25:36

are training these like cutting edge models. Like it is incredible and they work very hard, they're very good.

Speaker 2

25:57

Do you have pressure, or do you feel pressure, or do AI researchers in general feel pressure to consider monetization when they're researching things? Or is it, you know, mostly still curiosity driven, that sort of old school Silicon Valley we're-improving-the-world kind of thing? Or is it much more mercenary, given that all of these big companies seem to be competing in the same space?

Speaker 4

26:22

Yeah. I think that over time it's gotten harder and harder to do things that are just, like, cool ideas or seem cute but don't have any necessary application, and things are getting closer and closer to products, you know. Even, like, the language models that power ChatGPT, I was working on those before ChatGPT, and they had some uses, but also they're intellectually interesting and, like, fun to build.

26:47

But now if I came up with a better way to train ChatGPT, that's like a multi-billion-dollar innovation.

Speaker 2

26:54

The stakes are higher.

Speaker 4

26:55

Yeah, I'd be like an asset to like the United States government or something if I knew how to do that. So I guess it depends on what kind of problems you work on. Like, I'm more interested in understanding how things work, so it becomes a bit less financially dire.

Speaker 1

27:10

I think that six to twelve month gap between, what was that, that was the January DeepSeek moment. Yeah, it was in December that they first got attention, then for some reason it really hit in January. Is that a sustainable gap? Is there something either in access to data, access to talent, access to compute, access to chips, whatever, access to energy that in your view will allow US frontier labs to maintain some sort of six to twelve month gap for a while?

Speaker 4

27:39

It's pretty unclear to me. I think there are different beliefs you can have. You can believe that the ideas and the people are really the thing that differentiates the models, and in that case, I think we haven't so far seen a lot of, like, the top US AI researchers going to work at Chinese labs, so that seems stable. You could think that chips really matter, and in that case the chip race is really happening between big American companies.

28:02

Like, I think, actually, China has a pretty big deficit coming up in terms of, like, the GPUs we're exporting. Or you can think that the data matters. And I guess actually any of these point in favor of the US. I think if you think the data really matters, maybe the data they gather through, like, deepseek.com usage is really good and they can use it to, like, bootstrap a better model. But I think the American companies really do have an advantage. Like, you all might

28:28

have heard this story just as an anecdote. Apparently at Anthropic they've been buying and scanning thousands of old books for several years, so they have this division. I think they're based in New York that buys like shipping containers full of old manuscripts, cuts off the spines and puts them in these scanning machines and then they turn them into like really high quality text. And so I'm noting

28:49

Claude has this like weird aspect to it. Maybe part of the reason is they've gathered like trillions of words worth of like old book data over many years, and that's pretty hard to replicate elsewhere. So I think that head start really does mean a lot.

Speaker 2

29:06

What are you most excited about at the moment? The book thing sounds very cool, but what is getting all your attention right now?

Speaker 4

29:14

Thanks for asking. I think I mentioned before I'm really trying to figure out what's coming next. There are some obvious things, like we can get computer usage data and train better agents, or we can get more coding data and make them better at coding or writing GPU code or whatever,

29:30

But like, what are the non-obvious advancements? And my personal opinion is that the next round of improvements in AI models will come from some type of personalization and online learning, which means like models that, one, are trained like per person or per company. So like you could think of how ChatGPT is the same model that gets served to everyone, so it has to store information about

29:57

random restaurants and like countries you never go to. But instead, if you had a ChatGPT that's specific to Bloomberg or specific to your work, it might be able to like use more of its brain to do work for you.

30:10

And then the second thing is if it was updated every day. So like if you ask it to make your Odd Lots calendar, yeah, or RSS feed, and you're like, no, that was wrong, like you did it wrong for this reason and this reason, and you try again tomorrow, it'll still break tomorrow because it doesn't like continuously improve its capabilities. So, yeah, I think that's the direction things are going.

Speaker 3

30:31

I've heard people talk about this now.

Speaker 1

30:33

Granted, models are getting better over time, but you know, people might compare a coding model to a beginning software engineer and say, the coding model is better, but that software engineer is going to start getting better the next day they're on the job, and every day for the rest of their career, they're probably going to be a better software engineer than they were the day before, whereas at least that version of the model will not be better.

Speaker 3

30:56

Is that right? Yeah?

Speaker 1

30:58

Yeah, that seems like an issue that people talk about in your world.

Speaker 4

31:01

Yeah, yeah, I think this is a big problem. It's like we have to wait six months for ChatGPT five point one to come out, and then maybe they'll include your problems in the training data, and so

31:11

maybe it'll get better, but it might not. And instead, I think people need to think about ways to do that update more dynamically, like every time you talk to it, or maybe every night when you go to sleep, the model kind of like gets to work and studies what it was talking to you about and crafts better tests for itself and then learns and then when you wake up, the model's actually better.

Speaker 1

31:30

The other big question that I have, and it's kind of related to this, especially when we're talking about AI replacing humans in certain forms of labor, is, like, do we need really, really advanced AI? Like, in other words, again, the existing models are extremely impressive. Like, in your view, do we need to get a lot better technically for them

31:54

to have economic impact? And since these are in many cases businesses at the end of the day, is it necessary that there's so much work being done towards advancing the cutting edge?

Speaker 4

32:06

Yeah, yeah, that's a great question, Like we could have really good interns without ever getting better scores on the Math Olympiad, Like that's not necessarily something that we ever had to go after. I think part of the reason for that is that AI labs are engaged in this kind of neck and neck race to have the smartest model. But I totally agree that AI could be economically transformative without having a higher ceiling in terms of what it

32:32

can do. It's more like it needs to be more consistent or like dependable than actually smarter.

Speaker 2

32:37

This might be a weird question, but once you've made a sort of foundational improvement to a particular model, how easy or difficult is it to rewind if you need to? And one of the reasons I ask is because, you know, some people have been complaining that they've been training ChatGPT to, I don't know, be their boyfriend or whatever, be their therapist topic. Yeah, and then it gets upgraded and all of that training suddenly disappears and the personality of the model changes.

Speaker 4

33:07

Yeah, that was a really interesting story. So I think the model before GPT five was 4o. And they said that they thought internally, like all the scientists and coder people, that the new model was superior in every way. It gives you shorter responses, it's a bit nicer, it's much smarter. And then people got really upset because they had spent so much time talking to the old model that they felt like they'd experienced like a serious loss in their life.

Speaker 2

33:33

Joe would miss the "love it, love it." No.

Speaker 1

33:37

But for real, unironically, this strikes me as another example for open source, which is that if I'm going to form a... I don't see it, I'm forty-five, I'm too old for that. But if someone is going to form like some sort of friendship with an AI model, I don't want it to be at the whim of Sam Altman deciding, like, oh, there's an upgrade. I would like to be friends, so weird, to be friends with a model that I know I can run in perpetuity and it will never change.

Speaker 4

34:06

Yeah. I think that's definitely a good argument for why open source is important. And if you ever fall in love with a model, you should fall in love with an open source one.

Speaker 2

34:16

That's good life advice, practical life.

Speaker 3

34:18

Advice, really good life advice.

Speaker 2

34:19

Well, speaking of open source, you know, I know programmers tend to like open source for obvious reasons, but are there any downsides to open source for AI specifically?

Speaker 4

34:31

I think if you're running a company, there are a lot of downsides potentially to open source. If you have some brand new, fancy way of doing computation inside the model that's actually better, you might want to keep that information to yourself. And when you release the model, to make it runnable, you have to release all the code to run the model, which might contain like your secrets, and so I think that's why people are hesitant to

34:52

do it. The other reason is because when you release the model, it actually contains quite a lot of residual information about how you actually trained it, Like you might be able to infer what the data set was and what the training process was, or even reconstruct the entire training data set given just the weights of the model. And so if you're worried about people finding out that a certain thing was in your training data, you probably can't release that model open source.

Speaker 2

35:19

That reminds me how much of an AI researcher's day to day life is just like looking at other model, other people's models, and trying to, like I guess, pull them apart and figure out how they were made and sort of work backwards.

Speaker 4

35:34

That definitely happens from time to time. I think usually the scientific process is something like you start with other people's models, and you run them and you see what happens, and then you decide on some part of that process that you think could be improved or could be explored further, and you make some tiny changes to it, and then you run it again and you compare like numbers, or you make graphs of what happened before and what happens after.

35:57

So actually quite a bit of it. Like, for example, people pulled the GPT two model from OpenAI, which was twenty nineteen or something, their first kind of really larger scale chatbot. Like I've spent hundreds of hours kind of like playing with that code and talking to the model and stuff like that. So thank goodness for open source.

Speaker 1

36:15

For that reason, I joked in the beginning about you having one hundred million dollar salary, but for real, as you think about your career, and I hope you do get a hundred million dollar salary, but as you think about your career, what excites you?

Speaker 3

36:30

And how much is it money?

Speaker 1

36:32

But the reason I think about this is, like, there are huge checks out there, but maybe some things are more... Maybe achieving AGI is more exciting than making an ad network more efficient. Maybe there's something more exciting than shaving off a billionth of a second in terms of a trade execution, all these things. Like, how much is it about exploring the frontiers of science, the new space race, landing on the Moon, versus the paycheck?

Speaker 4

37:00

It's all about the paycheck. I'm just kidding. No, no,

37:03

not at all. Yeah, it's funny you ask. So this hasn't happened to me, But just in the past two weeks or so, a good friend of mine has been dealing with this problem because she got an offer on the order of like tens of millions of dollars per year from a big AI company and she wasn't sure if she wanted to work there, and I think originally she said no, and then they doubled her offer, and then like it's the exact same amount of cash, but twice as much per year for certain number of years.

37:31

And you know, we were talking amongst ourselves like what does this even mean at this point, Like you're, you know, a twenty eight year old computer scientist that's been coming from a PhD. So you make more on the order of tens of thousands of dollars per year. I honestly think personally, the marginal difference between having like ten and twenty million dollars is like very low, Like I don't even know what I would do with this.

Speaker 1

37:53

This is my experience. For me, making ten million or twenty million has basically.

Speaker 3

37:58

Been the same to me.

Speaker 4

37:59

Yeah, congratulations, but so yeah, I think there's more of a desire to like be there the next time something really interesting happens, and that kind of supersedes the money. Like any of these places will pay you what's like a really good salary to live on, and so it's actually not a big consideration. It only becomes complicated when you have like one option that's going to pay you like forty times more than the other option, and then things get confusing.

Speaker 2

38:26

No, this isn't this should actually I was just thinking about making twenty million.

Speaker 3

38:30

No, I think.

Speaker 1

38:32

Because I think about, okay, what if you have this great salary and you can, like, live very easily in New York City and have a really great life, or you could make ten times that, which is a stupid insane salary, right, but you don't, right, like your job.

Speaker 3

38:46

But it's so.

Speaker 1

38:46

Much money that it strikes me as like not a trivial life choice. You only live one time. There's like a different... so it could be a difficult question.

Speaker 4

38:55

Yeah, yeah, but you can remind yourself that, like the job you take once isn't the job that defines you forever. Maybe maybe the right thing to do is to take it for a few years but not the whole time, and then go do something.

Speaker 1

39:06

Everyone says they're going to do that and then.

Speaker 2

39:09

They get locked in. Speaking of insanely large salaries, we know that people are earning these salaries because they're like star AI researchers. How much does personality play into where you want to go work? Would you want to go work somewhere specifically because there's an absolutely amazing researcher, or does it tend to be again more about the paycheck, maybe more about the data that's available to you, or maybe more about the specific project that you're going to be working on.

Speaker 4

39:38

Yeah, I think different people assign different amounts of weight to each of those things. In my experience, like most of the people I know come from academia, which means they already kind of gave up more of a salary to do study things more deeply for several years. So I think people that I know are more biased against money. But like people do care about that. But I think that the ego thing really matters. Some people want to feel like they're very important and they're working on a

40:02

problem that matters. One way some companies are able to pull researchers away from other companies is by saying, we'll assign you more importance in your role and we'll give.

Speaker 2

40:11

You we'll give you a really big title.

Speaker 4

40:12

Yeah, exactly. Seriously, the title is like, okay, maybe before you were like a researcher, and now you get to be like a head researcher. You get to have people under you, or you're a chief scientist, and all these things do matter to people.

Speaker 3

40:23

It's a very good book about it.

Speaker 1

40:26

Pursuing a mission in the realm of like a driven visionary even when it's commercially.

Speaker 2

40:32

Just say it, just say yeah, that's right. No.

Speaker 1

40:35

I think about this all the time. Do you want to work for Ilya or do you want to work for Sam? And which one is the Ahab and which one is just trying to make an honest living selling ads? I find this to be like a genuinely interesting, interesting question for any individual to have to reckon with in this career.

Speaker 4

40:51

Oh. Absolutely, And sometimes it can be.

Speaker 1

40:53

Very difficult to tell. Jack Morris, thank you so much for coming on. Please pursue a career that will allow you to come back on Odd Lots.

Speaker 2

41:00

Or insert the Odd Lots clause when you're negotiating your one hundred million dollar salary, or.

Speaker 1

41:06

Take the fifty so you know what, fifty million, but let me I don't need one hundred million, fifty million.

Speaker 3

41:10

But keep the album.

Speaker 4

41:11

Yeah, that would be fine with me.

Speaker 3

41:13

All right, great, Well, thank you so much.

Speaker 2

41:15

Yeah, thanks, thank you so much.

Speaker 4

41:16

That was great.

Speaker 1

41:29

Appreciate I think about that sometimes, like what if you got like an insane salary like that, you just could you would be insane to say no to But like I don't know, that's I mean.

Speaker 3

41:39

It's not our problem, but like, wouldn't it be fun?

Speaker 1

41:42

You know? It's like, oh, but you're gonna be working on ad optimization or whatever and you're not going to be there when they land.

Speaker 3

41:49

On the moon. But you got paid ten times.

Speaker 1

41:51

More than the people at the Bay station working on landing on the moon. That strikes me as kind of a tough life choice.

Speaker 2

41:56

I think you're using up a lot of brain power and energy on a problem which, as you yourself said, is not yours.

Speaker 3

42:02

That's exactly right.

Speaker 2

42:03

No, that conversation was really fun. Nice to talk to an actual researcher just doing stuff in the space. One thing I thought was very interesting was this idea that everyone gets excited about a specific improvement in AI, and then it seems like that particular one doesn't materialize and instead something else emerges, as like the big breakthrough. So instead of agents, we have math.

Speaker 1

42:27

And math, which none of us will ever... I would really like for an agent to do something simple. I'm going to a city, book me a trip or whatever. Or change my flight. Oh my god, I tried to.

Speaker 2

42:37

That would be amazing.

Speaker 1

42:38

Recently change my flight. Here's my information. I don't I would like that. I do not need the math olympiad. I am very impressed.

Speaker 3

42:46

I don't need it.

Speaker 2

42:47

Also, I am now very, very intrigued by reinforcement learning and how you actually reward the computers for doing good stuff. I feel like, actually, that would be a really interesting area to mine, which is motivating the models to do better.

Speaker 1

43:04

Yeah, I've thought about that, like in chess, like how do how do the computers know.

Speaker 3

43:08

They want to win?

Speaker 4

43:09

Yeah?

Speaker 3

43:09

You know, like why do they care?

Speaker 2

43:10

You know, all they're saying anyway, why are they here? Why are we here?

Speaker 3

43:14

That's the thing with AI conversations.

Speaker 2

43:16

That's existential fact, something.

Speaker 1

43:17

We didn't talk about, which I am interested in. No one really talks about AI safety anymore. If you notice, like, very little, for better or worse. You don't hear people... it's just all money, and they don't really talk about whether the AI will kill us all one day.

Speaker 3

43:30

But one thing I did wonder about.

Speaker 1

43:32

So when DeepSeek came out, one of its breakthroughs was it showed the whole chain of thought, right, you could see that, which prior to that OpenAI or ChatGPT's chain of thought model didn't show you.

Speaker 4

43:42

That, right.

Speaker 1

43:42

And it does strike me that if there are certain things that are, for safety reasons or whatever, held back, or they don't want to do this, the nature of competition means all the guardrails are coming off. Actually, like, if there's some guardrail you have on, someone's going to open source whatever it is and they're all going to give it up.

Speaker 2

44:00

Yeah, both on the guardrails and on the data use. Yes. All right, well, shall we leave it there?

Speaker 3

44:06

Let's leave it there.

Speaker 2

44:07

This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me at Tracy Alloway.

Speaker 1

44:12

And I'm Joe Weisenthal. You can follow me at The Stalwart. Follow our guest Jack Morris, he's at jxmnop. Follow our producers Carmen Rodriguez at Carmen Arman, Dash Bennett at Dashbot, and Kail Brooks at Kail Brooks. For more Odd Lots content, go to Bloomberg dot com slash odd lots, with the daily newsletter and all of our episodes, and you can chat about all of these topics twenty four seven in our Discord, discord dot gg slash.

Speaker 2

44:36

Odd Lots. And if you enjoy Odd Lots, if you like it when we talk about twenty million dollar salaries that will never be ours, then please leave us a positive review on your favorite podcast platform. And remember, if you are a Bloomberg subscriber, you can listen to all of our episodes absolutely ad free. All you need to do is find the Bloomberg channel on Apple Podcasts and follow the instructions there. Thanks for listening.

Transcript source: Provided by creator in RSS feed
