Why You Should Wait Out AI’s Super-Spending False Start

Speaker 1

00:02

Bloomberg Audio Studios, Podcasts, Radio News. Welcome to Merryn Talks Money, the podcast in which people who know the markets explain the markets. I am Merryn Somerset Webb, and this week I am speaking with Dr. Janusz Marecki, who is an AI partner at Aaron Innovation Capital. Now, as you know, on this podcast, we like to talk about the big forces affecting our economy and markets in general, so it's really no surprise that we keep coming back to the impact

00:40

of AI. We've talked at different times about the consequences for jobs, for inflation, for interest rates, for tech companies, and whether politicians and indeed policymakers, let alone ordinary workers and investors, are ready for any of this. But what we've never asked is: is it actually working? Janusz, welcome to Merryn Talks Money.

Speaker 2

01:01

Thank you so much for having me.

Speaker 1

01:03

And we're going to start with a brief explanation of exactly what it is that we mean when we say AI.

Speaker 2

01:10

Yes.

Speaker 1

01:11

Everyone talks about AI all the time. They say that it is going to change the world, it's going to solve all our problems, it's going to destroy our jobs, etc. But what do we actually mean when we say AI?

Speaker 2

01:21

So these days, what we mean by saying AI is a system which is approximating a certain process. It might be a system which is approximating language. It might be a system which is approximating images. It might be a system which is approximating how a robot moves. At the end of the day, the current generation of AI techniques, those neural networks, are function approximators. They approximate things. They don't solve intelligence, they approximate. So that's why you

01:53

may have an illusion that those systems are intelligent. At the end of the day, they are approximating intelligence.

Speaker 1

01:59

And I suppose there are two ways to look at this: what AI, as you've just described it, needs from us, and what we really need from it to make it work for us. So if you look at the way that the big hyperscalers are approaching things at the moment, they're building massive data centers to build out their capacity, and that requires vast amounts of energy. It requires lots of coolants, it requires a very large volume of very different types of chips, right, yes, and all those things. Obviously there

02:25

are troubles at the moment in getting all these things, with the war in the Middle East, etc. So there are supply restrictions, but nonetheless none of these things are really our long-term problem. Given the correct policy choices, all those material elements can easily be, well, not easily, but

02:40

can be found and built. Then there's the second bit that we talked about when we last met, which is the data that you require to train a model, and that can't be created in the volumes that are required. And we've hit it. We've hit a supply problem with data.

Speaker 2

03:00

We have hit a supply problem with diverse data. To clarify, because you can just create an infinite amount of data by randomly generating new words, so you can create it. But we're talking about creating high-quality, diverse data. And you see, we've run out of data, diverse data, not yesterday, not a month ago. We ran out of data three and a half years ago. You have to understand that the last frontier model was GPT-4, which was not a combination of agents, etc. Just a basic LLM.

03:34

The last LLM that was the most potent was GPT-4. It was released in January 2023. However, the training of that model finished at the end of 2022, so it's three and a half years ago. We have trained a model that used all the publicly available data on the Internet. There is nothing more out there to use to train the model. We hit the data ceiling not a month ago, but three and a half years ago. It's extremely important.

04:06

And yes, what we're doing right now is we're trying to put together multiple LLMs. We're trying to have synthetic data, but the performance isn't really there. We hit diminishing returns not a month ago, but three and a half years ago.

Speaker 1

04:19

Okay. But surely new data is created all the time on the Internet. You know, we talk about diverse data, and there was an awful lot more created in the past than there is now. But everyone's use of the Internet surely creates huge volumes of new data all the time.

Speaker 2

04:33

That's even worse. Humans are creating new data on the Internet, but that data falls into certain patterns. How many conversations on the weather can you have every day?

Speaker 1

04:42

A lot, actually a lot. I think I've already had three today, to be honest. You know, it's very cold in Edinburgh.

Speaker 2

04:48

You can have a lot, but there's a bigger, more profound problem I want to mention, which is that if you look at the data that is currently being created on the open Internet, it is data created by those LLMs, which number one is inaccurate because it suffers from hallucinations, and number two, it feeds into itself, so that the new models now train on the entire Internet. They're training on the output from other models, which, as I said,

05:17

are making mistakes and hallucinations, and that, using a technical term, slowly ends up leading to something called model collapse, where the models themselves are actually getting dumber.
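
Editor's illustration: the feedback loop described above can be sketched with a toy simulation. This is not a real language model. It assumes a made-up 50-token vocabulary, a long-tailed starting distribution standing in for human-written data, and a "model" that is simply refit by counting tokens in a finite sample drawn from the previous generation. The names `next_generation` and `simulate` are hypothetical.

```python
import random
from collections import Counter

def next_generation(probs, sample_size, rng):
    """Draw a finite 'corpus' from the current model, then refit by counting."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    corpus = rng.choices(tokens, weights=weights, k=sample_size)
    counts = Counter(corpus)
    # Tokens that were never sampled get no probability in the refit model.
    return {t: c / sample_size for t, c in counts.items()}

def simulate(generations=20, vocab_size=50, sample_size=200, seed=0):
    rng = random.Random(seed)
    # Generation 0: a long-tailed (Zipf-like) stand-in for human-written data.
    raw = {f"tok{i}": 1.0 / (i + 1) for i in range(vocab_size)}
    total = sum(raw.values())
    probs = {t: w / total for t, w in raw.items()}
    support = [len(probs)]  # how many distinct tokens the model can still emit
    for _ in range(generations):
        probs = next_generation(probs, sample_size, rng)
        support.append(len(probs))
    return support

support_sizes = simulate()
print(support_sizes)
```

In this sketch a rare token that fails to appear in one generation's sample can never return, so the number of distinct tokens can only stay flat or shrink. That is a narrow, stylized version of the loss of diversity the speaker attributes to training on model-generated text.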

Speaker 1

05:29

Okay, so if you train a new model on newly created data, you're effectively training on its own nonsense or nonsense created by other similar models.

Speaker 2

05:39

Yeah, well, not only nonsense. We have to understand that those LLMs using artificial neural networks are correct, let's say, 95 to 99 percent of the time. So most of the content is correct, but you no longer know what content is incorrect, which is a significant degradation of the quality of data. It's almost like having access to a calculator which claims to be correct one hundred percent of the time, but in reality it is correct

06:04

95 to 99 percent of the time. How much would you pay for that calculator? Would you use the output from that calculator to produce a new calculator? No, you wouldn't do that. But we are doing it right now, and we are seeing right now that the performance of those models on benchmarks is not increasing anymore. It plateaued three years ago. And there's an even bigger, more profound question. We are training those models now using higher-quality data,

06:30

so we're no longer using the entire Internet. Now we are filtering out a lot of garbage from the Internet, so we're using a smaller training data set, thus producing smaller LLMs that have the same quality. But as I mentioned, they are smaller. So what is the consequence of it? Well, as we speak, I'm running those models on my laptop. I don't use any data center for it. I never will. I'm running them on my laptop. So the models have

06:55

gotten better. But by saying better, I mean they are more power efficient.

Speaker 1

07:00

All right. So do you mean that they're more specific, that they're geared toward a more specific task because they're using a narrower range of data, so they can't be a type of general intelligence? They're specific and...

Speaker 2

07:12

Remarkably, no, they are general purpose. I mean, you can just go online and download Qwen 3.5 and 3.6. This is a general-purpose model, which is even accessing the Internet to give you summaries. It's reasoning. It's doing this thing on your laptop. That's today. Of course, on top of that, if you want to, you can fine-tune your model on your proprietary data. But you see, even the fine-tuning process these days can be done on your laptop, which again raises the question:

07:41

why do you need to pay for compute? Why do you need all this expansion of the data centers? You really don't. And what's going to happen in a year, two, three years from now, when the majority of laptops out there will be able to run the most potent general-purpose language models with access to the Internet? I'm just not seeing it. And that's on top of one fundamental thing that I want to mention up front, because you mentioned what does the AI need from us, and

08:09

what do we need from the AI? So what the AI needs from us are two things, as you mentioned. Number one, compute, and yes, we can provide more and more compute as long as Oracle's CDS is not going down, which it is right now. And number two, we need to provide more data. We've used all of it, so to some extent we cannot produce better AI. Now, what do we want to have from the AI? We want to have at least two things, and just bear with me here. Number one, we want to have systems which are continually learning.

08:42

So just like during this podcast today, I hope that I will be able to learn something from you. You will be able to learn something from me and remember it tomorrow, or maybe remember it after Easter. Current-generation systems, neural networks in general, are not learning anything new when you interact with them, which is a significant limitation. So we're not getting from the AIs what we want. They're not learning from us. It's a fundamental limitation that's not solved.

09:10

And number two, I want to mention this thing up front. Those systems are stochastic. They are probabilistic. You cannot trust them. They roll the dice whenever they produce output, so to some extent you cannot trust their output. Can you make them deterministic? Yes, of course you can, by making sure that they always produce the most likely token. Now, I used the word token. But the problem with this thing is that then they will be just copying the data from

09:38

the training set. So just imagine those lawsuits when you see verbatim copies of all the podcasts and books produced as the output of a Gemini or OpenAI system. So those systems, number one, are not continually learning, which basically for me is just a no-go. And number two, those systems are not to be trusted, because whenever they produce output, that output is stochastic. It's not deterministic. It's basically rolling the dice to get you output.
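
Editor's illustration: the difference between "always produce the most likely token" and "rolling the dice" can be sketched as two ways of picking from the same next-token distribution. The three-word distribution and the function names `greedy` and `sample` are made up for this sketch. Production systems add refinements such as temperature and top-k or top-p filtering that are not shown here, and the sketch says nothing about whether greedy decoding reproduces training data.

```python
import random

def greedy(dist):
    """Deterministic decoding: always return the single most likely token."""
    return max(dist, key=dist.get)

def sample(dist, rng):
    """Stochastic decoding: draw one token in proportion to its probability."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]

# A hypothetical next-token distribution after the prompt "The weather is".
next_token = {"sunny": 0.6, "cloudy": 0.3, "purple": 0.1}

rng = random.Random(42)
greedy_runs = {greedy(next_token) for _ in range(100)}
sampled_runs = [sample(next_token, rng) for _ in range(100)]

print("greedy picks:", greedy_runs)
print("sampled picks:", sorted(set(sampled_runs)))
```

Greedy decoding returns the same token on every run, while sampling can return any token with non-zero probability, including the unlikely one.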

Speaker 1

10:05

Yeah. Can we talk a bit more about that, about how the hallucinations or errors build up?

Speaker 2

10:10

Absolutely. Let me try to explain using simple terms. When you use ChatGPT or Gemini or Copilot (Copilot is actually an OpenAI system), it just produces text, so you have the impression that it generates one word at a time deterministically. That's what's on the screen, just one word after another. But in reality that's not what those systems produce. If you are a developer like me, and my colleagues are developers as well, you can look at

10:42

the developer output of a large language model. Do you know what it gives you? It gives you a vector of fifty thousand elements, actually fifty-two thousand elements, so fifty thousand elements, and each element has a certain probability of being correct: zero point nine five, zero one, zero zero one, zero, zero point three. It's an entire vector. It's not that one word has probability one and everything else is zero. No,

11:07

there is a little bit of error there. There has to be. And so now this is what happens when you're producing one word at a time, or one token at a time. You can think of a token as a word split in two or three. So when you produce output one token at a time, this system is rolling

11:24

the dice all the time. It's making a small, tiny error every time it produces a word. At the beginning, you may not perceive the error, but over time, after three hundred, five hundred, a thousand words, the error is going to be so big that it's going to result in a critical failure of the system. You cannot circumvent it, because those systems are probabilistic. It's not like in an Excel spreadsheet, where you can have a chain of ten, twenty, one hundred, one thousand formulas and you

11:54

know that the formulas are going to produce the correct result. Here, if you have a chain of words, every time you produce a word, you accrue a little bit of error and you see this error manifest itself later on.
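
Editor's illustration: the compounding argument can be made concrete with a deliberately crude model. Assume each generated token is acceptable with the same fixed probability, independently of the others, and that a single bad token spoils the whole output. Both assumptions are simplifications introduced here, not claims from the speaker (real models can sometimes recover from a slip). The function name `chance_all_correct` is hypothetical.

```python
def chance_all_correct(per_step_accuracy, steps):
    """Probability that every one of `steps` independent steps is correct."""
    return per_step_accuracy ** steps

for p in (0.95, 0.99, 0.999):
    for n in (10, 100, 1000):
        print(f"per-step accuracy {p}, {n} steps: {chance_all_correct(p, n):.4f}")
```

Under these assumptions, a 99 percent per-token accuracy leaves roughly a 37 percent chance of an error-free 100-token output, and the chance keeps falling as the output gets longer. A spreadsheet formula chain corresponds to a per-step accuracy of 1.0, which stays at 1.0 however long the chain is.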

Speaker 1

12:08

Okay, so is this solvable? I guess that's the key question. With the type of models that we're using at the moment, which are super hyped, is it possible for that hallucination problem or compounding error problem to be solved, or is it simply a systemic shortcoming that is unresolvable?

Speaker 2

12:28

So here I'm again speaking from my experience, having two PhDs in mathematics and computer science and twenty years of experience at DeepMind and IBM Watson Research. No, it's impossible. You have to use a different technique for it. Yes, there are different techniques that are emerging right now. In full disclosure, I'm also a co-founder of a startup working on one of those techniques. We call it Fractal Brain. Yes,

12:53

there are new techniques on the horizon. However, the existing techniques have a built-in mechanism so that every time you produce an output, another word, you have a little bit of error. You cannot eliminate this thing. And there's also another fundamental thing. You mentioned hallucinations. What do we mean by hallucinations? Number one, it's producing those small errors one word

13:14

at a time. But there's also another reason for hallucinations, when the system just doesn't remember what was mentioned yesterday, a week ago, or a month ago, and that goes back to the initial question. You need both. You need to have a system which, like humans, is continually learning. And number two, you need to have a system which does not make errors when it produces the next word. It shouldn't roll the dice. You need those two. And so now to answer your question, can we solve this problem

13:43

using artificial neural networks? No. There have been attempts to circumvent it. If you want to have continual learning, you can maybe try to use something like continual backpropagation from Rich Sutton, who got the Turing Award in 2024. You can, but it's not solving the problem. At least he's attempting to find solutions to it. You can retrain the system, of course, right? I mean, why not take GPT-4 after the end of the podcast and retrain the entire system?

14:12

It's going to cost you five million dollars. But you could do that. You could retrain the entire system, or you can fine-tune your system. But when you fine-tune your system, the system is forgetting what it was trained on before. So you're suffering something which is called catastrophic forgetting or catastrophic interference. Long story short: no, you cannot solve the outstanding problems of hallucinations and lack of continual learning. Unfortunately. I wish you could, but you

14:41

need to have different things for it. And it's not just me saying it. Look at the landscape of researchers, leading researchers in the field, my colleagues. I know all of them personally. They have all jumped ship. This is important. Look at, for example, Yann LeCun from Meta. He jumped ship. He's working on his new startup, AMI Labs, not working on LLMs, saying LLMs are a dead end on the road to AGI. Look at my colleague Dave Silver from DeepMind. He just left DeepMind, and a couple of weeks

15:09

ago he formed Ineffable Intelligence. I think they're raising a billion dollars. The same thing: he is not a believer in using LLMs for artificial general intelligence. And I can go on and on: Ilya Sutskever, Andrej Karpathy. And so it's not just me who is saying that we need to go back to research. The leading researchers in the field have already jumped ship a few months, maybe

15:34

even years ago, and are working on the next-generation things. The market still believes we can solve hallucinations, but the leading researchers have jumped ship. It's unbelievable to me that we keep pouring money into bigger data centers, knowing we've used all the data already, and knowing that even if you have more data, you will not solve continual learning, and you will not solve hallucinations. So why are people doing it?
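
Editor's illustration: catastrophic forgetting, mentioned above in connection with fine-tuning, can be shown in its smallest possible form. The sketch below assumes a model with a single weight `w`, trained by plain gradient descent first on a made-up task A (targets `2 * x`) and then only on a conflicting task B (targets `-1 * x`). The tasks, learning rate and function names are all hypothetical, and real networks are vastly larger, so this shows only the mechanism, not its severity in practice.

```python
def mse(w, data):
    """Mean squared error of the one-weight model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def train(w, data, lr=0.05, epochs=200):
    """Plain full-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

xs = [0.5, 1.0, 1.5, 2.0]
task_a = [(x, 2.0 * x) for x in xs]    # original training task
task_b = [(x, -1.0 * x) for x in xs]   # later fine-tuning task

w = train(0.0, task_a)
loss_a_before = mse(w, task_a)         # error on task A right after learning it

w = train(w, task_b)                   # fine-tune on task B only
loss_a_after = mse(w, task_a)          # error on task A after fine-tuning

print(f"task A loss before fine-tuning: {loss_a_before:.6f}")
print(f"task A loss after fine-tuning:  {loss_a_after:.6f}")
```

Because the same weight has to serve both tasks, fitting task B overwrites what was learned for task A, and the error on task A rises sharply.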

Speaker 1

16:02

Okay. So if we know clearly, and it sounds from what you say that we do know very clearly, that using more and more computing power and more and more already mildly corrupted data isn't going to get us anywhere, then the enormous capex spend on vast data centers, the hundreds of billions of dollars that have already gone into this and are still projected to go into this, is a catastrophic misallocation of capital.

Speaker 2

16:27

Well, that depends on how you look at it. So obviously you can say the biggest winner of it is Nvidia, because it's producing those GPUs. During the gold rush, you should invest in companies that produce the shovels. You can still make money. For example, I'm a partner at Aaron Innovation Capital in the UK and we are investing in airs mission two companies, one called high Verge and second

16:51

called Hydra. These companies improve the cooling of data centers, or produce better algorithms to run on data centers. So you can still allocate your capital wisely, but you shouldn't allocate it to companies which are spending on this compute. You should allocate your capital to companies that are allowing those data centers to run efficiently, because those data centers, who knows, maybe they will be used for a different purpose

17:19

at some point. So again, there are going to be winners and losers of the current gold rush in GPUs and AI. I would say companies that have not invested massively in data centers and in frontier models are going to be the winners. There are some companies, without mentioning names, that have been accused of not training their own language models. I think these, to me, are going to be the winners in today's market.

Speaker 1

17:47

Are we talking about Apple?

Speaker 2

17:49

I will let you determine that. But you can see some companies that, number one, have not invested in LLM frontier models, and whose research teams have kept publishing papers saying that those LLMs don't reason, that they make mistakes. So I'll let you find those companies. And there are some companies that have borrowed a lot of money to expand those data centers. And it's not me. Look at the market, look at the CDS on Oracle. The market is just flashing red, saying this is foolish.

18:17

So we're seeing those signals already. But I want to point out that, if you want to make money in today's market, given that governments have to pay eight or nine percent of their revenue on servicing the debt, you don't know if the market is going to go up or down. Maybe there will be a capital injection, a liquidity

18:33

injection from central banks. You don't know those things. So I would not recommend you short or go long any investment. I would recommend maybe doing an arbitrage: buy companies that have not wasted money on LLMs and short companies that have borrowed a lot of money to expand data centers. That would be my suggestion, but I might be wrong.

Speaker 1

18:51

Again. Can you tell us about any of the startup companies that you're invested in or interested in that are taking us to this new frontier in AI, away from the LLM model and towards a different model?

Speaker 2

19:24

Absolutely. So in our investment fund, Aaron Innovation Capital, we have access to a lot of companies, actually maybe twenty or thirty companies, that are developing the next frontier models, which are not necessarily using artificial neural networks. And I can speak about three or four of them which are very exciting. One company you can have a look at is based in Switzerland. They're called Innate AI. Again, I'm advertising them, and we're not investing in them. We're

19:48

not investing in them yet. They are developing a new version of neural networks which is inspired by the brain. This is an effort that was going on in Europe for more than a decade, the Blue Brain Project. They are developing something new which is not an artificial neural network. So that's one kind. Look at another company, for example, Pathway AI in the Bay Area. Again, another example. As I mentioned up front, you need to solve continual learning. You need to solve that.

20:18

If you don't have it, forget about the solution to AGI. And so they have been developing systems that can learn using something called Hebbian learning, which is a local learning technique that happens in the brain, not using backpropagation and gradient descent. So this is another example. Another company, which I'm actually a co-founder and the CEO of, is called Fractal Brain AI. Have a look at the thing. It's also based on the prefrontal cortex, and it's this idea

20:45

that those networks are continually growing and rewiring themselves. So you no longer have a fixed network with a fixed number of parameters. No, the networks are growing and expanding themselves. Like today, you're probably going to form new connections after this podcast. Those networks do the same. They create new connections all the time, and on top of that, they are continually learning and thousands of times more power efficient

21:10

in addition to being data efficient. So these are only some of the examples of companies that I'm personally very excited about. But as Ilya Sutskever said the other day on one of those podcasts, we have gone back from the age of scaling to the age of research. So researchers have gone back to developing the new things. It's just that, on my end, my team and I started developing, for example, Fractal Brain twelve years ago.

21:41

We knew about those outstanding limitations of artificial neural networks more than a decade ago, so we wouldn't invest our time in it.
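
Editor's illustration: Hebbian learning, mentioned above, is often summarized as "units that fire together wire together". The sketch below shows the plain textbook rule, in which each weight changes by the learning rate times the product of the activity on its two ends. It is not a description of what Pathway, Fractal Brain or any other company named here implements, and the layer sizes, learning rate and function name `hebbian_update` are assumptions made for the example.

```python
def hebbian_update(weights, pre, post, lr=0.1):
    """Textbook Hebbian step: weights[i][j] grows by lr * post[i] * pre[j].

    The update for each connection uses only the activity of the two units
    it links, so no error signal is propagated back through the network.
    """
    return [
        [w + lr * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]

# Two output units, three input units, all connections starting at zero.
weights = [[0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 1.0]    # inputs 0 and 2 are active
post = [1.0, 0.0]        # only output 0 is active

weights = hebbian_update(weights, pre, post)
print(weights)
```

Only the connections between co-active units are strengthened. The contrast with backpropagation is that nothing here requires a global loss or a backward pass, which is what "local" means in the passage above.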

Speaker 1

21:48

Nevertheless, somehow we've got caught up in this sort of super-bubble hype, despite the fact that good scientists like you aren't.

Speaker 2

21:54

But this is good. I like the hype, because you see, to some extent that hype and the LLMs allowed us to understand that it's possible to approximate human language. So now that you know that you can approximate human language with LLMs, you can try to find ways to actually solve the idea of human language. If you can approximate something, you can see the size of it. So it's almost like if someone showed you, hey, there is a rocket there, it flies, you already know the

22:24

size of the rocket. You know it can fly, and you can start to crack the details of the engine of the rocket. So to some extent, I like the current hype. I like the current-generation systems because they allowed us to understand the size of the problem and approximate it. Now that we know that we can in principle approximate human language, let's just solve it.

Speaker 1

22:44

And the other thing I suppose we should say is that while we've spent quite a lot of time criticizing this generation of LLMs, they're still great, still really useful. It's not like we have a totally pointless technology. It's something that can remove entry-level jobs across the board, which of course comes with its own problems, but nonetheless has enormous use for productivity and business.

Speaker 2

23:04

Absolutely. I love, maybe not just LLMs, I love generative AI. For example, not sure if you've noticed, behind me I have this amazing landscape of London, but you can tell it's all fake here, like this building is all tilted here. So those systems, they produce very pretty graphics. It's inaccurate, but it's okay for me. It still gives me a very nice background. Same thing with text. They will produce a beautiful-looking poem. They can summarize a document. Yes,

23:32

there are errors. They are like this building here. You can tell it's all tilted a little bit, but I'm okay with that. So those systems are very good for creating templates, templates of data, so-called boilerplate code, and creating nice graphics. They're not good for the details you put in there. And it's interesting, because about two or three years ago, I was giving a talk to high school students. They were asking me, what do we use those LLMs for? I told them: for generating templates,

24:02

templates of presentations, etc., and for summarizing documents. But I misled them. I don't think you should be using those systems for summarization, for two reasons. Reason number one: in that summary there might be errors and mistakes, so if you summarize a document, don't throw away the original. And number two, you know, when you're summarizing something, you should know what you care about. For example, if you are to summarize today's podcast, maybe you only care about

24:33

this tilted building here, which is fake. Maybe that's what you're looking for. The LLM doesn't know it. So when it produces a summary of text, when it compresses graphics, it doesn't really know what you care about, so it's going to produce a summary maybe lacking the details that you want to know later on.

Speaker 1

24:51

And I suppose the other thing with the errors is that you should only really be using it for things where you know that you will be able to spot the errors at the end.

Speaker 2

24:59

So this is actually very interesting. I think that the killer use case for generative AI is producing output that you yourself can check for correctness. So for example, if I want to compute one hundred plus one hundred, the LLM will give me an answer, and I know I can check the answer myself for correctness. I like it this way. Unfortunately, people are using those LLMs to answer questions to which they themselves don't know the answer. This is a recipe for an absolute disaster.

25:30

In the worst case, you can use those systems to check whether the output that you produce yourself is correct. You can do it that way. But people are using it the other way around. They are asking those LLMs to produce an output when they don't know what the output should be, and the output can have one or five percent error. Right? Why would you even do that?

Speaker 1

25:50

So are we worrying unnecessarily about the job market? In most of the talks and podcasts and panels, etc., that I do, the question is always: what on earth does my child do for work in an age of AI? Are we worrying too much about that, because human input will remain absolutely compulsory for the next couple of decades?

Speaker 2

26:08

I'm having the same problem. As I mentioned, my son is thirteen years old, so I must, as the father, give him advice on what to do in the future. I guess being a scuba diving instructor is a good job. Yeah, it is a great job. You're going to need them. I worry about interns and entry-level software engineers, because

26:27

to some extent you can automate. It's not replace; you can automate most of the tasks that you delegate to interns today, like, for example, writing boilerplate code or template code, or checking if there are errors in that code. You can automate that. But what's going to happen then is that we are not going to have interns anymore, or a significantly smaller number of interns and entry-level software engineers. So what's the consequence? What's going to happen with the entire

26:55

promotion cycle? If you're in senior management or middle management, you're going to get promoted. Who's going to replace you? Who's going to become a software architect if we're not training the new entry-level software engineers? So there's this entire skills gap right now. Some people that I know have chosen not to pursue studies in computer science for the very reason that they worry that we're not going to

27:20

need software engineers. Yes, we are going to need software engineers, it's just that you need to jump immediately to being an architect of a software engineering system. And that's very hard. It is hard. Typically what you do is you gain this experience on the job. You go to Google, spend the first two or three years writing code, but your peers are software architects, and so you learn

27:43

from them. If we don't have this experience that we're giving to entry-level software engineers, they won't be able to have this experience. So this is what worries me more. It's not about replacing software engineers, it's about us not having a pipeline of software architects and senior software engineers. We are not having this pipeline anymore. That worries me quite a bit, to be honest.

Speaker 1

28:04

Yeah, and the pipeline problem is being discussed in a lot of other professions as well, most obviously in the legal profession.

Speaker 2

28:10

Absolutely. Again, for lawyers, this is very interesting, and this is not my profession, so you can discount what I say quite a bit. Right now, you can get the initial blueprints of legal documents very quickly, and we do it all the time at our startup. You can get a blueprint of a legal document, but I would never use that document to get an investor on board in my startup. I wouldn't do that. I still need to send it to an actual human being to at least

28:34

proofread it. So we're going to have to have those lawyers who can proofread documents produced by generative AI. But aren't they doing it already? They already have hundreds of template documents saved on their computers. It's just changing the names of companies and investors in those documents. So the legal profession is not going to go away, because those generative AI systems don't have the notion of true and false. They confuse them. It's all probabilistic, so they will

29:04

make an error. Sometimes after twenty thirty forty legal statements, they will just change one true into false. So we're still going to need to have lawyers for it. This is one of those professions where I don't think it's going to be automated fully by AI. But as I said, there are other professions that have already been automated. For example, content creators. So you can go online, go to YouTube.

29:25

You're going to see a lot of videos summarizing AI, the AI bubble, summarizing the conflict in Ukraine or Iran, all generated using generative AI. So some jobs have already gone away. The jobs that require you to be one hundred percent accurate, those jobs are not going to go away using the current generation of AI systems. Remember, there are new generations on the horizon. My startup is working

29:51

on it, other startups are working on it. But with the current generation of tools, they will not displace those jobs yet.

Speaker 1

29:57

Yeah, it sounds to me like we shouldn't necessarily be frightened of the current generation of tools, but we should probably be pretty frightened of the next generation.

Speaker 2

30:05

I would give a very simple example here. So my colleagues and I built AlphaGo at DeepMind in 2015, which is the system that won at Go against the world champion, and so back then it was a state-of-the-art system. These days, you yourself can win against that system. I can do it as well. Why? Because those systems have stayed the same, and humans have identified flaws in those systems. They have identified hallucinations, errors. And now when you play against that system,

30:39

you just exploit this system, exploit its weaknesses. The system is not adapting itself, not rewiring itself. So to answer your question quickly, I'm personally not worried about existing AI systems, because they are not adapting themselves. However, I do have to mention, I do worry that people are going to be using existing systems in domains where they should not be used, for example, for identifying targets for bombing. Just don't do those things. Those systems make errors

31:09

and hallucinations. So I worry about misuse of existing AI tools. I don't worry about these tools themselves being malicious. No, I worry about inadvertent misuse of those tools without understanding what they are not good for.

Speaker 1

31:26

Okay, interesting, always something to worry about. Can I ask you one last thing? One of the things that you mentioned earlier was the extraordinary energy inefficiency of the current systems. And it is true, isn't it? The human brain is

31:39

remarkably energy efficient. And when you look at these models, the amount of energy that they will use to simply have the same thought that the human can have on a couple of words, that's an extraordinary problem. Is that something that will be diminished in your new generation that we talked about earlier?

Speaker 2

31:55

Absolutely. The systems that I know that fractal brain is coming up with, that innate AI is coming up with, those systems don't use backpropagation to produce the next word. They don't have to load from memory all of those one point five trillion parameters just to produce the next word. They don't do that. They load maybe one hundred, maybe one thousand parameters: three, four orders

32:24

of magnitude better power efficiency. And we know this thing, and we know that we've trained the first version of our fractal language model using, I think, zero point one percent of the electricity that OpenAI used for GPT one and two, so we know it's possible to do that. I find it interesting. Before this podcast, we had a conversation. If

32:42

you recall, about Easter dinner and cooking potatoes. So as we're talking right now, I can assure you that for me and for you to produce the next word in our conversation, you are not thinking about potatoes for Easter. You're not doing that thing. You don't need those connections. When we talk about AI, we don't do those things.

33:02

And maybe now you think about your Easter dinner. But the point is that those systems should produce the next word only loading up parameters which are important for the next word. And this is not billions of parameters, nor hundreds of thousands; at most maybe a thousand, two thousand parameters. So yes, the next generation systems, in addition to being deterministic,

33:23

so you can trust their output, have power and data efficiency which is remarkable. We're talking three orders of magnitude better. It's coming. This thing is coming. And again, the world may not be ready for it yet. Plus you need to understand my own motivation for building those systems, and the dangers. We have systems which are adapting themselves, rewiring themselves, so to some extent you cannot outsmart them permanently.

33:50

They'll learn from your mistakes. It's like your kid. Try using some technique on your kid. It's going to learn how to adapt itself. Those systems do it as well. So I personally do not know if it's a good time to release those systems to the general public. I don't. That's why those systems are not being published. Because think about it: if you have a solution for a system which is learning all the time, adapting its flaws, it

34:13

becomes spooky to release that system. So I worry more about potential misuse of those next generation systems, and worry less about their performance, because they're already beating existing systems on common benchmarks.

Speaker 1

34:27

Okay, thank you very much. One last thing, Janusz, before we go. I think our listeners will have been absolutely fascinated by all this, and what I often ask people, I'm going to ask you as well, and I hope that we'll be able to understand your suggestion: what are you reading at the moment?

Speaker 2

34:40

Well, I wish I could tell you that I'm reading books on AI, but I don't. In fact, four years ago I started learning Spanish without any necessity, zero necessity, nothing. The reason I started doing that is I wanted to again learn a language the way humans do that: without billions of words, but with a small number of words.

35:02

So as a mental experiment, to learn whether our fractal language model learns language in the same way as I'm learning Spanish, I started learning Spanish, and because of that, I'll disappoint you. I'm reading kids' books, reading Diary of a Wimpy Kid in Spanish. Of course, I'm reading Harry Potter as well in Spanish. I'm reading lots and lots of books in Spanish, but these are elementary school books. I'm sorry to disappoint.

Speaker 1

35:25

You're reading the same things in Spanish that my son is. So there we go, something in common.

Speaker 2

35:29

Yes, absolutely, this is actually good, because if you want to understand, for example, how language works, at least try to learn yet another language. Then you can do an introspection, you can see how you're learning a language, and so it's remarkable you can learn a language after, in my case, about seven or eight thousand hours. I guess my Spanish is better than my English right now. So eight thousand hours, how many tokens are we talking about?

35:51

A couple million tokens, a couple million tokens of a training set rather than a couple trillion tokens. So that's what fascinates me. And yeah, maybe next time we can speak in Spanish on the podcast.

Speaker 1

36:04

No, I'll have to get one of my kids in for that.

Speaker 2

36:07

Okay, sounds good.

Speaker 1

36:09

Janusz, thank you so much for joining us today.

Speaker 2

36:11

It's a pleasure. Thank you so much for having me.

Speaker 1

36:18

Thanks for listening to this week's Merryn Talks Money. If you like our show, rate, review and subscribe wherever you listen to podcasts, and keep sending your questions or comments to merrynmoney@bloomberg.net. You can also follow me and John on Twitter or X. I'm

36:30

@MerrynSW and John is @John_Stepek. This episode was hosted by me, Merryn Somerset Webb, was produced by Sammersadi and Moses, and sound designed by Blake Maples and Aaron Kasper. Special thanks, of course, to Janusz Marecki.

Transcript source: Provided by creator in RSS feed
