Hello and welcome to Skynet Today's Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode, we will summarize and discuss some of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for articles we did not cover in this episode. I am one of your hosts, Andrey Kurenkov. I finished my PhD focused on AI last year and I now work at a generative AI startup.
And I'm your other host, Jeremie Harris. I'm the co-founder of Gladstone AI, which is an AI safety company. We do stuff with AI in national security, export controls, alignment, weaponization, all that fun, fun stuff that has us think about AI as a potentially WMD-like or WMD-enabling piece of tech. What a week, what a week, what a week. Andrey, like, this is, I feel like
we're seeing this more and more. When we came back from the holidays, there was like this lull and there weren't that many papers. You know, everybody's kind of taking a little bit of time off. Most of the big research was actually China-based, which I found really interesting, kind of what you might expect; there's kind of a disproportionate amount of China-based research. Now the Western world has kind of woken up and we have ICLR.
We have a whole bunch of different conferences coming around the corner. So, here we go. Like, I mean, I feel like we're just hitting the ground running in 2024.
Yeah, we're going to get it going. And I feel like also the news in general is getting going in a sense. Like, we had sort of a slight acceleration coming with the election and more and more incidents related to AI happening with that. And now, as we'll discuss, this Taylor Swift story happened, and it might be the biggest story of the year so far. Surprisingly, not something I would have predicted. But there you go. That's how it happens.
Before we get into the news, let's just quickly take care of our sponsor. And once again, we are sponsored by the Super Data Science Podcast, which is a great resource to learn about machine learning, AI, data, careers, all that stuff. It's interviews with all sorts of people working with data science or AI. Hosted by Jon Krohn, chief data scientist and co-founder of the machine learning company Nebula and the author of the bestselling book Deep
Learning Illustrated. They post twice a week and have 700, 800 episodes now, like a crazy amount, with all sorts of people. So if you would like to hear from people in the AI space, in addition to hearing about the news as you do on this podcast, we do think the Super Data Science Podcast is a great option for you to check out.
Yeah, and everybody already knows that I'm a giant Jon Krohn fanboy. Jon Krohn, very, very good interviewer. Very gifted. Awesome, awesome guests. Can't recommend the podcast highly enough. Please check it out.
As always, you can start with Jeremie's episodes and learn more about his views on there.
I keep forgetting that we plug that every time. I feel like it's like, yeah, it's like an ad for me somehow, I feel like.
Well, with that, let's go ahead and get started. And I want to do something a little bit different this time, just because it does feel like the biggest news and it's going to come back a little bit. So instead of starting with tools and apps as we do usually, I figure let's start with synthetic media and art. And as I mentioned, maybe the biggest story of the last couple weeks, or at least seemingly the most widespread story, and
the story that seems to have some impact: we are starting with the deepfakes of Taylor Swift on Twitter or X. So in case you haven't heard, there were AI-generated explicit images of Taylor Swift that were widely circulated on X. For instance, a prominent post with these images had 45 million views and hundreds of thousands of interactions before the user's account was suspended for violating the
platform's policy. And, yeah, there were just a lot of them, to the point that at some point Twitter restricted search for Taylor Swift, as some reported. And some reporting turned up that with many of these, it appeared that they were actually created with Microsoft Designer, Microsoft's tool that you're not supposed to be able to do this sort of thing with.
But there was a conversation among some people that was found via Telegram, and it seems that they basically found a way to hack it, to make the model do things it wasn't supposed to do via prompt hacking. And they generated these images and posted them to X. And there was this crazy storm where suddenly Taylor Swift and AI were in the same sentence a lot. And as we'll cover in the rest of the episode, there are actually some pretty serious consequences to this happening.
Yeah, and I feel like this is another instance where we're realizing that as a society, as a civilization, we just haven't come to terms with this problem of, you know, language models, image generation models and all that being vulnerable to exploits, being vulnerable
to jailbreaks. Right. Like these companies are putting in tons of effort, millions and millions of dollars, tens of millions of dollars to secure these models and have them say no when you ask them for help, you know, to bury a dead body or to make an explicit image that has Taylor Swift in it or what have you. And these things always, always, always have workarounds.
And I think as a society, we are still in the phase where we're somehow pretending that companies actually have the technical chops to prevent misuse of these tools. And we're leaning on that an awful lot. The reality is, once somebody discovers a jailbreak, which might take some time to uncover, it then spreads at the speed of software, and you can have these very rapid takeoffs of very risky and unfortunate
behavior. In this instance, the number one post hit 45 million views and had 24,000 shares. So tons of interaction, all this stuff. And it was only removed 17 hours after it was initially posted. And this is kind of another dimension of it, which is, you know, like Mark Twain said, a lie gets halfway around the world while the truth is still putting on its shoes, or something like that. You know, in the age of AI that is extra true, right? In the age of social media, you post something, it goes viral.
The correction is never as viral as the original post. The correction always comes after people have already gotten riled up and the damage has been done, and you really can't undo it. You can take down the tweet, but you can't undo history, and that data is still online forever. So kind of an interesting situation for Twitter to be in, especially given they are currently being investigated by the EU over claims that X is being used to disseminate illegal content and disinformation.
And so this is sort of a bit of hot water for them to be in, in particular in light of those investigations. So, you know, difficult times ahead, I think, for everyone in tech right now trying to figure out who owns responsibility for certain misuses. You know, is it Microsoft Designer, the software that was used to produce these things? Is it the platform that was used to share it? Is it the individuals who
shared it? Like, we've kind of got to sort this out, and we've got an election around the corner.
And this is a real reminder that this exact thing, you know, nonconsensual deepfakes, often being pornographic or explicit, has been one of the concerns with AI now for a while, really since the beginning of deepfakes. Going back to like 2019, 2020, there were already tools being created to make it easy and accessible to do this sort of thing. And now it's getting easier and easier with the advance of technology.
So still a very real kind of problem category for AI, and something that now, with this Taylor Swift story, I guess really hit the mainstream, with a lot of discussion and accusation and actions being taken, I think, to really address this situation more rapidly than we had in the past years, even though the notion of deepfakes and the notion of deepfake pornography has been around for a while.
Next story. Speaking of celebrities, this is a follow-up on a previous story we had, and the story is that YouTube deletes a thousand videos of celebrity AI scam ads. So this is following up on this investigation by 404 Media that pointed out all these scam ads that were being posted with celebrities like Taylor Swift and Steve Harvey and Joe Rogan and various other famous people. After that investigation came out, YouTube did address, I guess, its findings.
And so, according to the story, there were, you know, about a thousand videos deleted and these videos had almost 200 million views in total. So I guess, yeah, another example of a platform dealing with deepfakes and having to address them rapidly after they come to light.
Yeah, and the interesting angle that pops up here too is, you know, you start thinking about the liability that YouTube might have over hosting videos that are, I mean, these are reputationally damaging, right? If you're Taylor Swift, if you're Steve Harvey, and you see a bunch of videos coming out where you're promoting some Medicare scam, that does brand damage in a very serious way.
And if these videos stay up long enough, as you said, to collect 200 million views, you can assign a dollar value to each of those views in terms of how much damage is potentially being done to their brand. So yeah, I mean, I have no idea what the liability regime is going to end up looking like around this stuff. To my understanding, this is something that's still, to some degree, up in the air and unresolved.
So, I think, I don't know, 2024 seems like a year where, in addition to a whole bunch of precedent being set on election interference, we'll probably also see a lot of interesting precedents set in the courts as people try to figure out where responsibility ultimately lies on some of these issues.
And just a couple more stories in this section, moving on to a lightning round for some quick ones. We have yet another deepfake story. Actually, this next story is that Iceland has had its own AI George Carlin moment and considers a law against deepfaking the dead. So we've covered this before: there was this AI-synthesized comedy special starring the famous comedian George Carlin that kind of hit the news a couple weeks ago. And now a similar thing has happened in Iceland.
There was a video that featured comedian Hemmi Gunn, I think that's how it's pronounced, who passed away in 2013. And there was, yeah, this video that aired during a television event, created with assistance from the Icelandic startup Overton, which creates AI voiceovers, or has that capability. So, interestingly, yeah, this led to a lot of conversations.
And now there is a whole consideration of what do we do about deepfakes in response to this, in a way similar to the Taylor Swift situation.
Yeah. I just want to pause and highlight the fact that this is the most 2024 thing I've ever heard. A headline that says Iceland considers law against deepfaking the dead. That's kind of like, you know, we're in the Black Mirror phase of human history right now, which is pretty insane. Yeah. And deepfaking, like reanimating deceased people or deepfaking them in whatever form, you know, deepfakes are also not the only
way that can be done, right? Like, you can have chatbots, you can have audio generation alone, all those things. And it all kind of seems like something, again, as a civilization, we have to make decisions about now. It's like, you know, philosophy is checking our homework after 10,000 years, and we don't really have much to show for it at this point.
And let's round out the section and this initial slate of stories with something a little bit more fun that I figured I'd just throw in. So there was a story about how Guns N' Roses, the famous rock band, has shared an AI-generated video for a new song titled The General. So you can go ahead and check that video out and take a look. It's kind of interesting. They really went sort of low-tech, so to speak, for AI. It's very obvious that it's AI; it's a lot of filters, it's kind of very wavy.
They're not trying to make a very beautiful image or anything. It's really more like a filter; you could go back like a year and see this sort of stuff. But I do think this kind of highlights how, for a mainstream band like Guns N' Roses, a really old school band you could say, them using AI in this video is yet another pointer to how it is becoming more mainstream, and I suppose how AI tooling presumably is making its way into more and more creative professionals' toolboxes.
And then moving on to tools and apps. And it's a bit of a continuation, I guess, in a sense, of some of the stuff we've been talking about. The first one is Microsoft makes Swift changes to AI tool. And so this is in response to the Taylor Swift stuff that we talked about, obviously the case of Designer being used potentially to create some of this explicit imagery of Taylor Swift, these nude images, which, by the way, did come out of 4chan and a Telegram channel.
So that was kind of the place, as is so often the case, where these things were shared initially. This is basically Microsoft's response to this. They're saying, look, we've introduced a whole bunch more protections into Designer, so hopefully those stick. They are explicitly kind of on the back of this Taylor Swift situation, sort of issuing these standard, you know, corporate reassurances: we're investigating these reports and are taking appropriate action to address them,
says Microsoft. And then they reinforce the fact that their code of conduct already prohibits, you know, the use of their tools for this sort of thing. And then they're highlighting that they actually have large teams working on the development of guardrails and other safety systems. But, you know, one of the challenges really is the technical one.
You can have as large a team as you want working on this, but we're facing down a situation where there are fundamental technical constraints that companies are apparently facing right now. Like nobody knows how to solve, for example, the problem of AI alignment, the problem of getting AI systems to reliably do what we want them to do.
And as long as that remains the case, that not only creates potentially, you know, catastrophic risk in the long term, but the way it gets expressed today is that you can't prevent jailbreaks. You can't predict how these systems are going to behave in a wide range of circumstances under a wide range of prompts. And so, really the idea of developing guardrails, that's good, it's good, it's helpful. But there is kind of this fundamental limit to how far that can go.
At least there seems to be at this point. Until we make some, some really basic breakthroughs in the science of understanding AI.
That's right. Yeah. This story has some good examples going into a bit more detail of how this happened. So Designer was not supposed to allow you to do this, as you said. You know, it was meant to prevent you from generating images of Taylor Swift. But in this article they show how if you type Taylor Swift, it would prevent you.
But if you type Taylor Singer Swift, it would go ahead and generate an image of Taylor Swift. And, you know, it would prevent you from explicitly describing sexual acts or sexual kind of scenarios, but if you just used suggestive wording and kind of indirect descriptions, it would still go ahead and do that for you. So that's kind of what we meant by prompt hacking: basically tweaking the prompt a little bit to get
around the guardrails. And as we said, now I guess it's updated to prevent that. And yeah, another reminder that, in general for AI products, you are going to have to be really careful about this sort of thing. Next story is that OpenAI has dropped prices and supposedly fixed the lazy GPT-4. The price drop here is for GPT-3.5 Turbo, one of their most popular APIs; the prices have actually been reduced by 50% for your input to the model and 25% for outputs.
Pretty substantial drop. There was also an update of the model, which is, they say, improved in various ways, and there is now a new preview model of GPT-4 Turbo, along with some fixes that address an issue we covered a couple weeks ago with GPT-4 supposedly being lazy and refusing to do the work, essentially, in its replies. So if you tell it, you know, write me a little
short story, sometimes GPT-4 would do something like say, okay, here's your first paragraph, and then tell you to fill in the remainder yourself, or something along those lines. So along with this slate of updates, they say that they have addressed those kinds of concerns.
Yeah. And a lot of their updates have to do with two new text embedding models that they're releasing. So text embedding is this thing where you take a piece of text and feed it to your AI system, and rather than predicting an output in the usual way (you know, when you use ChatGPT, you just kind of get some sort of text output), this instead will give you a list of numbers that represents essentially the meaning that's encoded in that piece of text.
So you kind of turn it into a numerical representation that allows you to do math, if you will, on the meaning behind those words. And that's known as an embedding. The embedding is sort of interesting.
Like, it's very useful for a lot of back-end applications if you want to compare the meaning of different things, for example for the purpose of making a ranking of product reviews: which product reviews are the most positive, which are the most negative, that sort of thing. So what's happening right now is OpenAI is releasing a new small text embedding model that's designed to be very efficient, you
know, not very costly. And there's a commonly used benchmark for this sort of thing, for multi-language retrieval, called MIRACL. The score on that has apparently gone up by over 10 points, from about 31% to 44%, for their new text embedding model relative to the old one. So it's pretty clearly a big kind of upgrade. And the pricing for that model has gone down by a factor of
five. So we're seeing not only better quality but also better pricing, something sort of mirrored by the larger new text embedding model they're also releasing. And anyway, they've got all kinds of really exciting developer tools in there. We won't get into the details too much, but this is actually a pretty big set of updates, like a bit of a smorgasbord of different things. And as you said, there have been a lot of complaints about this idea of GPT being very lazy.
Apparently that's been fixed. The way the laziness would manifest, if you recall from previous episodes, is people would ask GPT to do, I don't know, some task, and it would kind of go, well, you know, you could probably do it by doing these steps. And what you're asking it is to do those steps, but it would just kind of tell you what steps it should do rather than executing them.
That's a kind of common way it would manifest. Apparently that's fixed; no information about how exactly it was fixed, but good to know that that's no longer an issue.
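To make the embedding idea a bit more concrete, here's a minimal sketch of the kind of back-end comparison described above, ranking reviews by how close they sit to a "positive review" anchor. It assumes the standard OpenAI Python client and the newly announced text-embedding-3-small model name; it's an illustration, not code from the episode or from OpenAI's docs.

```python
# Minimal sketch (assumption: openai>=1.0 Python client, OPENAI_API_KEY set,
# and the text-embedding-3-small model name from this announcement).
from openai import OpenAI
import numpy as np

client = OpenAI()

reviews = [
    "Absolutely love this blender, it works perfectly.",
    "Broke after two days, complete waste of money.",
]
anchor = "a glowing, very positive product review"

def embed(texts):
    # One API call can embed a whole batch of strings.
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [np.array(item.embedding) for item in resp.data]

def cosine(a, b):
    # Cosine similarity: how aligned two meaning-vectors are.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

vectors = embed(reviews + [anchor])
anchor_vec = vectors[-1]

# Rank reviews by similarity to the "positive review" anchor.
for text, vec in sorted(zip(reviews, vectors[:-1]),
                        key=lambda pair: cosine(pair[1], anchor_vec),
                        reverse=True):
    print(f"{cosine(vec, anchor_vec):.3f}  {text}")
```

The point is just that once text becomes vectors, "which review is most positive" turns into simple vector math.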
That's right. Yeah. In the release notes here for GPT-4 Turbo, they said this model completes tasks like code generation more thoroughly than the previous preview model, and is intended to reduce cases of, quote, laziness where the model doesn't complete a task. So I assume they're working on it throughout their model slate, including GPT-4 Turbo, and with this GPT-3.5 update it has various improvements. On to the lightning round, and one more story about OpenAI.
And it is that ChatGPT has now added @-mentions of GPTs, a new feature that allows you to basically mention specific GPTs. So it's a beta feature, and it allows you to converse with multiple GPTs from the GPT store in the same chat window by basically addressing them with an @, you know, @MusicGPT, @TeacherGPT, etc., etc. So yeah, it's in beta, and I guess they are working toward integrating more and more of the store in various ways.
Yeah. And the example they give here is they have a Biden GPT and a Trump GPT talking to each other in the same thread. So you can kind of summon the Biden GPT by basically doing, like, @BidenGPT and then getting it to generate an output, and doing that back and forth. This is really interesting because it's a fundamental shift in the way that we interact with these systems, right? Normally with ChatGPT, you have to give it a prompt.
That prompt, you can think of it as a thing that activates a particular version of ChatGPT, right? You're telling it, for example, hey, I want you to act like Elon Musk helping me to solve some problem in rocketry or something. And then it'll do an impression, effectively, an impression of Elon Musk in that context. So every time we give a prompt to ChatGPT, we summon, in a sense, a different version of ChatGPT. And that's one way to do it.
You can also do fine-tuning to get different behavior. What this allows you to do is, in a single dialog, invoke all of these different versions that have been pre-built, these GPTs from the GPT store, so that they can interact with each other, so that it's more convenient, and so on and so forth. This is deeply related, by the way, to a paper we'll be talking about in the research and advancements section on meta-prompting. And I think that's not a coincidence.
I think the whole field increasingly is moving in this direction of, you know, how do we benefit from both the generality of these models and the kind of expertise of more specialized versions? Right. You have a specialized Trump bot, a specialized Biden bot, for example. You want to be able to benefit from that, but also not lose the value, the generality, of the base model. And this way of interacting in the same chat window with a bunch of these
different ones, whether the one leading the interaction is a human, as in this case, or, as we'll see later, an AI for meta-prompting, is just this way of kind of navigating the trade-offs between generality and specificity. This is a big strategic play for OpenAI. You know, Sam Altman was talking to Bill Gates on a podcast, and he was saying, you know, customizability and personalization are really key things on OpenAI's development roadmap.
You know, that really maps onto this, right? You have a whole bunch of customized and tailored bots, and you can get them to sort of orchestrate some interaction through this new chat window. So kind of an interesting plot twist on the way that we are interacting with these sorts of systems.
Next, we have a story that's a little bit more inside baseball, maybe something you wouldn't see in the New York Times, but it is fun if you're a regular listener. We have a story that ChatGPT finally has competition: Google Bard with Gemini just matched it on the Large Model Systems Organization's Chatbot Arena. So this was discussed on Reddit and Twitter, like in AI circles, where Google Bard got an update that possibly added RAG to it.
So possibly there's sort of some cheating here with retrieval of extra information. But anyway, the story is that on this leaderboard, Google Bard now matches ChatGPT, and that is the case for the first time. So yeah, there was some excitement at seeing, you know, finally a competing chatbot seemingly perform on par, at least.
Yeah. And just for context, the LMSYS Chatbot Arena leaderboard is kind of the thing that's being used to assess that yes, in fact, Bard does seem to perform better, or this version of it anyway. It's kind of an interesting tool. We've talked a lot about the Hugging Face leaderboard, usually for open source models and tracking specific benchmarks; that's a really good one. The way this one works,
yeah, you've got two models pitted head to head at any given time. So you'll write a prompt, the prompt gets sent to two models, but you don't know which ones, and after the responses are shown to you, you pick which one is best. And so over many, many rounds, you kind of end up aggregating these scores. So not actually so dissimilar from other approaches to this, but notable that the process is human driven; it's not like AI evaluation or anything like that.
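For what it's worth, arena-style leaderboards like this typically aggregate those pairwise human votes with an Elo-style rating update. Here's a rough sketch of that idea, illustrative only and not LMSYS's actual implementation:

```python
# Illustrative Elo-style aggregation of pairwise "which answer was better" votes.
# Not LMSYS's actual code; just the basic mechanism behind this kind of leaderboard.
K = 32  # how strongly a single vote moves the ratings

def expected_score(rating_a: float, rating_b: float) -> float:
    # Elo's predicted probability that model A beats model B.
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def record_vote(ratings: dict, model_a: str, model_b: str, a_won: bool) -> None:
    ea = expected_score(ratings[model_a], ratings[model_b])
    sa = 1.0 if a_won else 0.0
    ratings[model_a] += K * (sa - ea)
    ratings[model_b] += K * ((1.0 - sa) - (1.0 - ea))

ratings = {"bard-gemini-pro": 1000.0, "gpt-4-turbo": 1000.0}
# Each anonymous human vote from the arena nudges the ratings a little.
record_vote(ratings, "bard-gemini-pro", "gpt-4-turbo", a_won=True)
print(ratings)
```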
You do see Bard performing really well. And this is, by the way, the first time that Bard has ever actually beaten a version of GPT-4. So it is an interesting kind of development as Google and Microsoft, or really Google and OpenAI, go to war on who has the best chatbot.
It's still in second spot; the first spot is actually GPT-4 Turbo. But, you know, it's pretty close, and it does beat the base GPT-4, like you said. And yeah, this was big enough news, I guess, that people kind of discussed it, partially because Jeff Dean, one of the leads of AI at Google, even posted about it on Twitter; supposedly there is a Gemini Pro-scale model in there. Very ambiguous. But whatever they did, they made quite a bit of a jump. And yeah, maybe Bard is kind of good
now. Next story, going back to something we had last week, we have some more browser updates with AI integrations. And this story is covering specifically how two browsers, Brave and Arc, are integrating AI stuff. Arc is this minimalistic browser, kind of a smaller one, and they are adding the ability for users to switch their default search engine to the Perplexity AI-driven search that we've covered quite a bit. So that is the extent of it:
you can just always do Perplexity by default. Whereas Brave is possibly a bit more mature, privacy-focused browser, and this week they announced that Leo, their AI browser chatbot assistant, is getting upgraded to Mixtral 8x7B, this cutting-edge or top-of-the-line open source model that is pretty much the best you can do as far as open models. So yeah, browsers continue to move along and add more AI.
Yeah, this is really interesting from the standpoint of Perplexity, right? Because strategically they seem to really be leaning into this idea of using partnerships as a way to drive adoption, you know, recognizing that they're not going to be competing head to head with, like, google.com for the search market. So we've seen them, you know, now with this Arc integration,
but also earlier we talked about, you know, their partnership with Rabbit, and various other companies that are bringing in users, bringing in eyeballs. And those partnerships seem to be the way they're trying to leverage their way up to a position of, you know, maybe competing with Google at some point. So kind of interesting, and it does seem to be paying off right now.
I mean, it's something to note that we have seen their name come up an awful, awful lot on Last Week in AI in the last, well, couple of weeks.
One last story for the section: it's that Baidu's Ernie AI chatbot will power Samsung's new Galaxy S24 smartphones. So I think just last week we were talking about how these smartphones will have live translation capabilities powered by Gemini Nano, and this story is highlighting how in China it actually will not be Gemini that will be in the phone. Instead, it will be Ernie AI from Baidu, Baidu being a massive company in China, somewhat like the Google of China, you could say.
So yeah, this is really highlighting Baidu still being a very serious player in the AI space, especially with this Ernie chatbot being one of the top ones in usage. And now I guess it will be deployed even more.
Yeah. And I think we covered a few weeks ago how Ernie Bot had actually reached the 100 million user threshold, though there were a bunch of caveats around whether that was, you know, monthly active users or what specifically that meant. But the reason they were reporting that was that ChatGPT was famous for being the fastest piece of software ever to reach 100 million
users. In that case, they were, you know, pretty transparent about what that meant; in this case, maybe less so. But it is interesting. It's also, you know, a sign that China's strategy of creating their Great Firewall is helping their domestic companies kind of bootstrap themselves up to be credible,
credible players in this space. Obviously, if OpenAI had free rein to compete in that market, they would very likely, you know, completely trounce all these Chinese companies, because they are consistently about, depending on who you ask and how you calculate, 12 to 18 months ahead, certainly on the language modeling side. But what we're seeing here is, you know, the domestic companies like Baidu taking advantage of this wide open market and following
up with some pretty solid tech. I mean, this is, you know, market ready, it seems.
And on to applications and business. Speaking of markets, the first story is that AI companies lost $190 billion in market cap after Alphabet and Microsoft reports. So it is the season for financial reports, and the latest ones have been a little disappointing. Looking at the story, there were a couple of companies that lost out after sharing their numbers. So for instance, Alphabet dropped 5.6% after hours and missed expectations on ad revenue.
There was even a little bit of a drop in Nvidia, 2%, a drop in AMD, 6%, and even a tiny drop in Microsoft, which has really been doing crazy good on top of its AI push; Microsoft dropped 0.7%. So yeah, it's worth, I guess, remembering that AI is still not at a point where it's printing money for most companies, except maybe Nvidia.
I mean, yeah, Nvidia was down as well, it seems, a little bit. But I think these things are all relative. You know, when we think about, oh well, it was a disappointing performance on the whole, we see these relatively, yes, significant in-the-near-term drops. In the long term, if you think about where these companies were at before the ChatGPT era, you know, on the promise of AI, their stock
value has just rocketed up, and this might just be a slight correction. I think it's interesting that we're seeing a mix of not just cloud providers but also AI hardware designers, AMD and Nvidia in particular, plus model developers. Really, it's the whole gamut of AI-related companies. Usually when you look at a space like AI, it doesn't rise and fall universally, right? You see, you know, there's a lot more consumer adoption than
expected, so, you know, companies that actually serve models might do better; or there's some issue with the AI hardware supply chain, and as a result the companies that specialize in the hardware get hit but the others don't. This is interesting because we're seeing kind of this more universal blanket drop on all these things, which, you know, may tie more into broader market sentiment than indicate that there's a fundamental issue. But it's hard to know.
Again, one big thing that people are pricing in when they look at these kinds of companies is absolutely the prospect that, you know, AI may be somewhere fairly radical three years from now, for example. And so you start to look at valuations in the trillions of dollars as being maybe small through that lens, right? So it's an expected value calculation, as all investment decisions are. But I think that's kind of the big question.
These small micro-moves I don't think are the big thing to be tracking. In the long run, the question is who's going to be able to automate a big chunk of human labor? And that's a big part of what's propping up these prices for, you know, everything from the model developers, or maybe more like Google, which does hardware too, all the way down to, you know, AMD and Nvidia.
Yeah. That's a good point. Like, in the big picture, $190 billion in market cap sounds dramatic, but these are relatively small movements, and they maybe reflect, as was noted, you know, things like the ad revenue of Alphabet or the revenue of AMD that they reported, even though they did project strong sales for their AI processors. So yeah, you know, there's still a lot of regular business that needs to fund
all these efforts, and it seems investors are still a little sensitive to whether the money being brought in is significant. And on to the second story, where we are touching on something we haven't talked about in a little while, and that is Cruise. So just a bit of background: Cruise has been in the news for some months now, I think since about October, November, when there was a major incident Cruise
was involved in. Cruise, for context, being one of the leading self-driving car companies along with Waymo, they offered a commercial offering in San Francisco where you could hail a robotaxi and it would take you anywhere. So last year, they were involved in an accident where a human driver did a hit and run on someone crossing the street. Very bad. Unfortunately, by some bad luck, the pedestrian was then launched onto a Cruise car, and the Cruise car did a pull-over maneuver that dragged this person.
And then afterwards there was a meeting, and the Cruise people at this meeting apparently failed to fully inform the regulators, and that led to a whole bunch of trouble, like a lot of fallout for Cruise ever since that happened, a lot of bad stuff. So this news is about how there was a report by the law firm Quinn Emanuel Urquhart & Sullivan, an outside firm hired by Cruise to investigate what happened here and why there was this breakdown in communication
and so on. And it's somewhat interesting. So it appears that Cruise, whoever was in this meeting, did try to inform the regulators of what happened, and they were unable to because of bad internet. And so what happened was, yeah, they, you know, wanted to just play the video, like here's a video that shows everything, and bad internet prevented the video from fully playing.
And apparently Cruise had this plan to let the video speak for itself to whoever they were meeting with, and that, according to this report, is essentially what happened. They intended not to hide this aspect of the incident, the dragging of the person, that they originally failed to mention. And so because they didn't verbally go over it, and because they were unable to play the video due to bad internet, they, yeah, didn't actually get that across, and a whole bunch of bad stuff happened.
And just one more thing: ultimately, it's a long report, and the firm does say that, you know, it was fundamentally flawed and probably not a good idea to assume that the video can speak for itself and not, like, explicitly disclose the details to regulators.
Yeah. And the report that was filed here is kind of chalking this up to Cruise, you know, having too much of an us-versus-them attitude with respect to regulators, which may perhaps serve to explain why they were not necessarily so forthright with flagging the dragging incident. One of the things that the article notes is that there were over 100 Cruise employees who were aware of the pedestrian dragging incident prior to the meeting, the meeting that was held in the mayor's
office in this context. Anyway, the bottom line is, you know, there was an awareness at Cruise at the time that these things maybe ought to have been disclosed, but Cruise, yeah, chose not to verbally say anything about it. They were just like, well, let the video speak for itself. But then if the video itself does not show that crucial, decisive step in the story here, and you're not mentioning it verbally, you know, it's just kind of hard to see.
I'm not a lawyer, but it's hard to see rationally how you can justify that. Certainly the report seems to take a pretty dim view of that particular approach, so who knows? I mean, I guess, saved by the bell, saved by the bad internet, may not be enough of a legal defense for Cruise here, I don't know.
And the full report, it's almost 200 pages, and you can actually read the full PDF. It's a very exhaustive breakdown of what happened. So, you know, kind of interesting to see. In a way, it's a dramatic story, right? Because in the aftermath of this, GM just recently announced they're slashing spending on Cruise by about half, by like $1 billion. Cruise stopped all testing in the US. They stopped their commercial offering. It was catastrophic,
what happened. And you look at the details here, and it turns out there wasn't some cover-up plan; it was just that Cruise was very focused on pointing out that the initial incident was not because of them, it was a human driver who hit someone, and then as a result they kind of downplayed too much the actual part Cruise did play of dragging a person and creating additional harm.
So, yeah, in a way, as a narrative, it's quite interesting to think of how, you know, over the course of a few days, because of some bad decision making, a lot of bad stuff happened.
Yeah, it kind of feels like relitigating, reliving the same discussion that we had earlier in the context of the Taylor Swift piece, right? Like, where does the responsibility live? Does it live with a human who made a bad judgment call early on in the process? Or how much of it lives with Cruise itself, and with the downstream consequences of what they did or what their car did? You know, I don't think we just have these answers ready to go.
So again, 2024, I think, is going to be a big year for liability. And there's obviously legislation before Congress that people are considering now too. So I think we'll have a much clearer set of answers to these things. They may not be satisfactory, but we're definitely gonna see a lot of this stuff getting chewed on in public in the future.
Onto the lightning round. The first story is that Hugging Face teams up with Google to accelerate open AI development. So this is an announcement of a strategic collaboration that would give developers on Google Cloud a streamlined way to use models on Hugging Face. Teams that use these models could train and serve them with Google Cloud more easily. This is just announced; it's not yet available. These capabilities are probably going to come out in the first half of 2024.
But yeah, an interesting collaboration, I guess, for Google and Hugging Face.
Yeah. You know, it's always been the case that, you could make a lot of money by building wrappers around cloud infrastructure. Because cloud infrastructure is so painful to use. It takes so much expertise to know how to use, you know, especially like AWS, which is notoriously rough for this.
But, you know, Google Cloud and other services like this too. Famously, back in the day, there was a company called Heroku that basically was just a wrapper around Amazon Web Services that made it easier to use, because AWS is just such a crap fest if you're a developer; it's such a pain to learn.
And so this is, you know, maybe a version of that kind of play, where Hugging Face is making itself the user-friendly face of Google Cloud, essentially, through this deal, or at least that's in part what they're doing. Except instead of doing what Heroku did and facing software engineers, what they're doing is facing machine learning engineers, AI developers, that sort of thing.
But there's a dimension of this that has a lot of overlap with the Hugging Face play. Hugging Face, by the way, became, I think, a $7 billion company. So there's a lot of value you can unlock just by making things easier to use, and that's, you know, basically it.
And a good opportunity for Google as well to draw in more usage of their products without having to do all the work involved in making it user friendly. So all in all, kind of an interesting collaboration, and certainly a good move for Hugging Face, because that deeper partnership with Google, you know, is going to make a lot of AI hardware essentially available to their users.
By the way, I should mention for context, if people don't know, Hugging Face is a big player in the AI space as kind of a host of AI models. So they are a repository, and they currently have over 500,000 AI models and 250,000 datasets. So if you are developing an AI model, you train a neural net, you may post your model on there, and they have been offering the ability to serve models, to say, okay, here's an open model that's on here,
in just a few clicks, I can go ahead and deploy it on one of several cloud providers. So this is, I guess, extending their offerings to now also include Google Cloud and Google's TPUs, tensor processing units, and various cool hardware. So really an extension of what Hugging Face is already doing.
Up next, we have Elon Musk's xAI eyes $6 billion in funding to challenge ChatGPT maker OpenAI. Okay, so xAI, of course, is Elon Musk's AGI play. Roughly speaking, you can think of this as the thing that Elon Musk did because he was worried that after OpenAI sort of parted ways with him, he no longer had an AGI play. And he, of course, thinks of himself as being a big player in the
kind of AGI space. So he spins up xAI, hires Igor Babuschkin and a whole bunch of other former DeepMind and OpenAI people, and they start up xAI. There was a time when xAI had submitted some documents with the US SEC where they'd set a target of $1 billion in fundraising that they were going to try to reach. This suggests that they've now exceeded that, or at least that they're planning to. So rather than $1 billion in funding, it looks like they might be
looking to raise $6 billion. It's a little unclear right now; there's nothing kind of concrete. And then there's the valuation of $20 billion. Now, the valuation of $20 billion is interesting, right? Because when we think about the other companies in the space, OpenAI is obviously way ahead; they've got a valuation that seems like it might be in the $80 billion range plus. But when you think about some of the other players, Anthropic has raised at around an $18 billion valuation, if my memory serves.
And so this would actually place xAI slightly ahead of Anthropic's latest valuation. That raises an interesting question, because so far we don't really have that many proof points from xAI other than, you know, Grok, to the extent that their work has led to that. So we don't have the kinds of proof points that I would argue Anthropic has so far with Claude and Claude 2 at scale. So it's an interesting question, you know, how they defend that
valuation. Part of the answer may be who they're engaging to raise funds from. So it seems like xAI has been focusing on family offices in Hong Kong and, like, sovereign wealth funds in the Middle East, that sort of thing. This raises interesting national security questions, right?
Because we've had situations in the past that we've reported on in the last couple of weeks where the U.S. is taking a dim view of, for example, Sam Altman trying to raise funds for his chip initiatives from folks in the Middle East, and from the UAE in particular, because of their affiliations with Chinese-based organizations, Chinese-
funded organizations. Well, here is xAI apparently engaging with family offices in Hong Kong, which is now very much not just in the Chinese sphere of influence but absorbed into China at this stage. So kind of interesting from a national security perspective: what are the implications of raising funds from these sorts of sources? Last thing to note is apparently Morgan Stanley is coordinating this whole
fundraise. They've obviously got a whole bunch of experience doing tons of international, kind of big-money stuff, and they were also involved in the acquisition of X, or Twitter as it was then, when Elon acquired it. So anyway, really interesting situation: a big fundraise, lots of interesting counterparties, like folks who are actually throwing money at this initiative from countries that raise interesting national security questions.
And I'm really curious how this will all play out and what the implications for xAI will be going forward.
This, by the way, is according to reporting by the Financial Times; they cited various sources saying that this is what's happening. And Musk in response posted on Twitter or X that they were not raising capital at all, basically denying all of this. So hard to say. Maybe the reporters got it wrong, but either way, this is what was covered and what was reported.
Yeah, it seems weirdly specific.
Yeah. So I don't know; we don't know, but that's what was said. And on to the last story: AI chip startup Rebellions snags funding to challenge Nvidia. This is about South Korean AI chip startup Rebellions Inc., and they have secured $124 million in Series B funding to develop their next-generation AI chip, Rebel, which is specifically designed for running large language models. So a pretty significant raise, and this company was only founded in 2020.
They are partnering with Samsung to fabricate their chips using four-nanometer technology. So a very high-tech bet with this
one. Four nanometer, the significance of that too is, you know, we talked about this before on the podcast, so sorry if that sounds familiar, but the Nvidia A100 GPU is at seven nanometers, the seven-nanometer process, and the H100 is a five-nanometer process. So when you're looking at four-nanometer technology, that truly is kind of next-generation stuff. This sort of thing usually takes a really long
time to mature. So the fact that Rebellions was started in 2020... you know, right now the timeline for sub-five-nanometer tech is, I'm trying to remember now, I think Nvidia is looking at, was it 2025, 2026 maybe, for that next node size. These are called nodes, by the way, the X-nanometer or whatever. So the four-nanometer, the three-nanometer node size, is 2025-ish. Sorry, let me take that back.
The availability of three-nanometer node sizes for AI purposes will probably be unlocked kind of on that time horizon. The three-nanometer node already exists; it's being used for the iPhone. But anyway, it's a whole thing. So this would be very advanced tech. It would not be, on my reading, truly cutting edge when it comes out.
But it does suggest that you have this, you know, new entrant who is trying to play an important role here and partner with Samsung, which is sort of right now in second place relative to TSMC, the Taiwan Semiconductor Manufacturing Company, in terms of its ability to make cutting-edge chips. So, yeah, it's still early days. Believe it or not, a $124 million raise in this space is not actually that big, and the valuation is $650 million.
So, yeah, not an enormous, I know it sounds weird, but not an enormous amount of money inbound, because this activity is so capital expensive; you just need that much money to get a chip project off the ground these days. So worth tracking. There are a whole bunch of other early-stage similar companies, Tenstorrent, for example, that we've talked about on the podcast before. So we'll just see where this goes and what the proof points end up being.
And on to the projects and open source section, where we have a fun trio of stories all dealing with code generation. The first one is about AlphaCodium, which was inspired by DeepMind's AlphaCode but is open source and now seemingly surpasses it. So they announced improvements, and one of the cool things that got some attention was that this one has a neat thing called flow engineering.
So instead of just kind of generating text as you do with large language models, they have a whole kind of little architecture of how to generate code with iteration: something that generates things, a discriminator-type, adversarial model that provides code integrity through testing, reflection and spec matching, etc. So, yeah, this is a cool approach that is now quite good, and it is developed by Codium AI, a Tel Aviv-based startup.
Yeah, it is a specialist. So there's this thing that happens in a lot of these announcements where people will pitch their model as being really exciting in that it beats GPT-4, and then you'll be like, well, wait a minute. Like, are we talking about beating GPT-4 with a general-purpose model or a model that specializes in, for example, coding? And
that's actually what's going on here. So we have the specialist approach, AlphaCodium, that is able to outperform GPT-4 at coding by an interesting margin. I mean, it's 19% to 44% on the CodeContests benchmark, so yeah, that's not nothing, and that's a legit benchmark. But it is a specialist versus a generalist. So I think an important thing to flag.
Another interesting thing is, you know, as you said, Andrey, the strategy here for AlphaCodium is basically built around a fancy prompt engineering technique, right?
Where they have the system generate code, they have other instances of the system kind of review the code, and it goes back and forth in this, again, inspired way, like inspired by generative adversarial networks, where you have one network that generates a thing and then another that kind of critiques it, if you will, in a sense. And they go back and forth.
But yeah, they're calling this flow engineering, and you can almost see them, at least my read on the article and a lot of the quotes was, they're really trying to make flow engineering a thing. And, you know, I don't know how truly unique it is; I mean, I feel like I've seen a lot of schemes like this on the back end, but hey, it works. You know, there's no arguing with the 44% performance on the CodeContests benchmark. So really interesting.
Yeah, it's a new thing. I wonder if flow engineering will stick and we'll be using that term a lot more in the future.
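As a rough illustration of what a "flow" like this can look like, here's a sketch of a generate/test/reflect loop. The helper functions (call_llm, run_tests) are hypothetical stand-ins, and this is only the general shape of the idea, not AlphaCodium's actual pipeline:

```python
# Rough sketch of a "flow engineering" style loop for code generation:
# restate the problem, generate tests, generate code, run, reflect, retry.
# call_llm and run_tests are hypothetical stand-ins, not AlphaCodium's pipeline.
def solve(problem: str, call_llm, run_tests, max_iters: int = 5) -> str:
    # Have the model restate the problem and surface edge cases for itself.
    spec = call_llm(f"Summarize this problem and list tricky edge cases:\n{problem}")
    # Have it propose extra test cases beyond any public ones.
    tests = call_llm(f"Write input/output test cases for:\n{spec}")

    code = call_llm(f"Write a solution for:\n{spec}")
    for _ in range(max_iters):
        report = run_tests(code, tests)  # execute the candidate against the tests
        if report.all_passed:
            return code
        # Reflect on the failures and revise, instead of starting from scratch.
        code = call_llm(
            "This solution failed some tests.\n"
            f"Problem:\n{spec}\n\nCode:\n{code}\n\nFailures:\n{report.failures}\n"
            "Return a fixed version of the code."
        )
    return code
```

The design point is that the compute goes into repeated generate-and-check iterations rather than one big prompt.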
And the second in this trio of models related to code is coming from Meta. It is Code Llama, and they announced an update to the largest version of it, Code Llama 70B, that is quite a bit better. So the numbers that we do have are that it scored 53% accuracy on HumanEval, outperforming GPT-3.5, though not outperforming GPT-4. Not many details on other specific coding benchmarks, at least in this news story. But yeah, another big code-specific specialized model.
Yeah. And it's another one based on Llama 2 as well. So we continue to see that one be used as one of the default ones, at least when you reach into your kind of bag of open source, frontier open source models. And Code Llama 70B, the one that they're making here, is apparently still available for commercial and research uses. So they haven't made the license any more restrictive, which is good to
know. And the full 70-billion-parameter model was trained apparently on 1TB of code and code-related data. So it's actually, you know, a decent amount. Anyway, kind of an interesting development and another feather in Meta's cap here as they look to promote the Llama 2 series, with Llama 3, I guess, on the horizon too. It's about that time.
And to round it out, the last of these models is DeepSeek Coder. So this is yet another large language model that specializes in programming, and this one ranges from 1.3 billion to 33 billion parameters and is trained on, this time, 2 trillion tokens. So a little bit more. They show on various benchmarks that it is state of the art among open source models and is close to, or actually better than, Codex and GPT-3.5, although not quite at the level of GPT-4 Turbo. So yeah, there you go.
Three separate announcements of code-related models. And yeah, this is a major space. I think if you look at sort of where a lot of the generation of text by AI models is already happening, a large chunk of it is probably in programming, because I think the adoption has been pretty rapid, and when you code, it's pretty much just constantly generating these suggestions for you of what to code next. So I guess it does make sense in that vein why
there's a lot of focus on it. It's just already being used a lot.
And moving on to our research and advancements section. This is an idea that I'm really excited about; I think it's very interesting. It's a paper called Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding. Okay. Maybe it makes sense to... oh, by the way, one of these authors is from OpenAI, the other is from Stanford. So they're very closely tracking the space, obviously.
But one of the things that you find when you try to get more juice out of language models: you know, you can think about prompting models directly to answer a question, like, write me, I don't know, an essay about Shakespeare or something like that. You can also think about prompts that try to more explicitly turn your model into a specialist. So you might imagine saying something like, you are a PhD in the history of Shakespearean English,
you answer questions in such and such a way, in such and such a style, and blah, blah, blah. And then you ask a question, right? So now you're really adding a lot of context to turn that model into a specialist. You can also imagine fine-tuning the model and so on and so forth. But there's this kind of challenge that comes up where, if you're going to turn that model into a specialist with a very specific prompt, for example, you're going to lose some of its generalization ability.
And so one of the fundamental questions that people are facing right now, as we speak, at the frontier of language models is how do we navigate this tradeoff between generality and specificity? How do we add experts, expert large language models that specialize in certain domains? How do we benefit from that expertise without sacrificing the general kind of knowledge of the base model? And this is an attempt to answer that question.
So one of the ways that they approach this is to say, okay, you can actually think about the process of getting an answer from a language model as having two parts. The first is figuring out what the right prompt is for that model. And the second is actually running that prompt and getting the output from the model. Now, historically, humans are the ones who think of the prompt that we're going to give to the model, right?
We're the ones who are responsible for figuring out, like, how am I going to get this model just so, so that it answers my question the way I want? And this is an attempt to change that, to say, well, actually, even the process of finding the prompt is something that in principle we could automate; it's just a subtask that we could ask the LLM to do. And so what we're going to do is break down, with meta-
prompting, this process into a series of steps. First, we'll take a high-level instruction, where the human won't bother to, like, be very specific about how exactly this task needs to be solved. We'll trust a high-level sort of conductor language model, like an orchestra conductor, the main high-level meta model, to break down that task into smaller, manageable pieces. We're then going to assign each of these pieces to a specialized
expert model. And those expert models, they're going to have prompts that turn them into experts. And those prompts are going to be created by the meta model at the top, like the orchestra conductor that you originally asked your query to. So you're now getting this parent model to, like, come up with the specific prompts that will turn other versions of itself actually into the experts
that it can then query. So you're benefiting from the general knowledge of that parent model, but also from the specificity, the specialized nature, of the experts that it will itself create and instantiate with a kind of custom prompt. And then that parent model will also oversee communication between these models, and it'll also kind of apply critical thinking. So for example, it might ask a math model to figure out, you know, how do we translate Celsius to Fahrenheit or something like that.
And then another model like, okay, well what does that mean about the weather? It's likely to rain tomorrow and kind of combine those two results together. And it would apply its own critical thinking to kind of glue those pieces together. So essentially you have this this kind of overarching model that is the orchestra conductor that it also creates on its own, kind of spawns new models that have, prompts
that turn them into specialists. And you, the user, never have to tell it which, specialist models to prompt or how to prompt them. You just have to sit back, ask your simple question at the top and see this thing kind of go through. And there's a whole bunch of interaction between these systems on the fly. And the last thing I'll say, I just think this is such a fascinating thing, not just because it works really well, but also because of what it tells us about the future of AI.
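To make that conductor-and-experts loop a bit more concrete, here is a minimal sketch in Python. This is not the paper's actual implementation; `call_llm` is a hypothetical stand-in for whatever chat-completion API you have access to, and the sub-task parsing is deliberately naive.

```python
from typing import Callable, List

def meta_prompt(user_query: str, call_llm: Callable[[str], str]) -> str:
    """Sketch of the meta-prompting idea. `call_llm` is a hypothetical
    wrapper around whatever chat-completion endpoint you use."""
    # Step 1: the "conductor" model breaks the task into sub-tasks and writes
    # a specialist persona prompt for each one.
    plan = call_llm(
        "Break this task into sub-tasks. For each sub-task, write a short "
        "expert persona prompt (e.g. 'You are an expert Python programmer...'), "
        "separated by blank lines.\n"
        f"Task: {user_query}"
    )

    # Step 2: the same underlying model is re-prompted as each expert.
    # (A real implementation would parse `plan` into structured sub-tasks;
    # splitting on blank lines is just for illustration.)
    expert_outputs: List[str] = []
    for expert_prompt in plan.split("\n\n"):
        expert_outputs.append(
            call_llm(f"{expert_prompt}\n\nOriginal task: {user_query}")
        )

    # Step 3: the conductor reviews the experts' answers, applies its own
    # critical thinking, and synthesizes the final response.
    return call_llm(
        "Several experts were consulted on the task below. Combine and "
        "sanity-check their answers into one final response.\n\n"
        + "\n---\n".join(expert_outputs)
        + f"\n\nOriginal task: {user_query}"
    )
```

The point of the structure is that the user only ever supplies the top-level question; the prompts that create and coordinate the specialists are generated by the model itself.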
I think we talked about this as early as last year, but what we're starting to see increasingly is that more and more of the compute budget that goes into getting good results from language models or other AI systems is going not into the training phase, but rather into the inference phase. In other words, we train this model once, right? We train the parent model once, and then we're allowing it to generate not just one output but a whole bunch of outputs.
For example, we're allowing it to figure out, okay, what are all the experts I need to summon? What are all the prompts I need to write for those experts? How should I make those experts interact with each other to get the final result? So there's a lot more computation going on at the answer generation phase, the so-called inference phase of the process, relative to the training phase. And this is what you might expect: as the systems get more and more context
awareness during training, you can kind of trust them to figure more stuff out during the inference stage. They have enough context to come up with new solutions. So I thought that was really interesting as an instance of that trend. We're really seeing a whole lot of different independent inferences that are required to do this. This was not possible, you know, back in the day when we had way less efficient models like GPT-3, models that also had less world knowledge.
You couldn't lean on them as much during the inference phase. But because compute has gotten cheaper, and again, this is Last Week in AI, we talk about hardware all the time, this is why we do it, that makes it possible to do things like this. And at the same time, these models have more world knowledge, so you can actually rely on them to reason sensibly at inference time.
So I thought this is just a fascinating little data point on this journey we seem to be on towards more and more general purpose models.
Right. Yeah, I think it's very much a trend to, as you said, focus more on the inference stage and to do various things like information retrieval, right, that's a big one. You can have things like Bard or Perplexity that behind the scenes do some sort of search through some database, give that information to the language model, and the language model then synthesizes a response. This is in a sense similar, right?
It's the same basic idea of having tools that the model can draw on and kind of get these things behind the scenes that inform its answer that you're not aware of. So it's no longer just a large language model, right? It's not just a neural net that, given some text, spits out some new text. There's a lot of stuff happening in the background. And just to give some concrete examples, they have various experts that they play around with for various tasks.
So for instance, there's this Game of 24, which I guess is some tricky number-puzzle kind of thing, and they have things like an expert mathematician, an expert in Python, and an expert in problem solving that work on that one. They have another task that is sonnet writing, and they have an expert poet and an expert essayist. Then they have, yeah, quite a few of these. So another example is word sorting, and they have an expert linguist, an expert proofreader and an expert
essayist. And for all of these various tasks, and there are about a dozen of them, using this meta prompting approach, with this kind of combination of different models and one model taking in all the various outputs and synthesizing a final output, works better than any other prompt engineering kind of hack you could try.
So yeah, also, I think to me this is interesting in the sense that, you know, there's always been a question of, is it just down to scaling? Can you get one giant neural net that just does it all if you make it big enough and train it on enough data? This, in some sense, could be said to be neuro-symbolic, or at least it's not just one big model scaled up.
It's a whole system where you make these individual neural parts and you set up the orchestration between them and how they interact and so on. So there's a sort of cognitive architecture, you could almost argue, and this seems to be a point in favor of the idea that you will need some sort of architecture of different components and not just one big giant crazy neural net.
No, for sure. I mean, the idea of this being on a continuum to neuro-symbolic reasoning is a really good point. It's not at all something that's obvious, too, because, you know, if you approach it from a philosophical standpoint, to throw in some terminology, connectionism is like the other end of that spectrum, right? Where you're like, everything is one big neural
network. And one of the interesting things about this approach too is that it is much more interpretable: you have intermediate readouts that are human readable, and for safety reasons some people think that's a really good thing. There's disagreement there. But anyway, that's a really interesting philosophical point: where is the line between neuro-symbolic reasoning and whatever the hell this is?
And onto the next story. It's about EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty. That's the name of the paper, and it is about inference. So when you want to actually generate an output from your large language model, there are various ways to speed it up.
And one of them is this idea of speculative sampling, basically doing something like what CPUs do: you speculatively decide what the model might want to go for and speed up the sampling that way by not necessarily querying the model quite as much. Without getting super technical, they show that with some rethinking of how you do this speculative sampling, you can get a pretty major speedup, without any sort of loss in performance, no fine tuning, nothing.
You can get roughly a three-times speedup with this sampling over vanilla decoding, and it's about two times faster than other proposed approaches like this. So yeah, it's a pretty significant speedup in the ability to get outputs just by changing how you generate that output, compared to a more naive way of doing it.
Yeah. And, you know, just by way of background, when you talk about speculative sampling, one way to think of it is in contrast to what's known as greedy sampling. So you can imagine, like, your model is going to take in a prompt and then it's got to figure out what the next word is that it's going to generate, based on that prompt or based on the previous words
it's generated. And it's going to have a whole bunch of predictions, like, is the next word 'the', is the next word 'happy', or whatever. And for each of those possibilities it's going to assign a probability. Right. So language models are really just that: they're engines that produce a distribution of probabilities over words. And then one option is to say, okay, well why don't we just pick the most likely word, right? If the most likely word is 'the', then let's just go with
that. That's called greedy sampling. The alternative is to say, well, wait a minute. If we just anchor on the very next word, and we keep doing this at every next step, like, all right, what's the single highest probability prediction, we can get locked into a particular train of thought. But sometimes it's hard to tell which outputs are ideal before you've actually generated a number of words in that output.
So you can kind of go, oh, I see where this is going. Maybe the first word didn't look right, but when I actually let this play out, it's like, damn, that's a good sentence. So this is what speculative sampling is meant to do. It's meant to allow you to explore different possible continuations at each step of the text generation process and
then sample from those. So it's kind of the difference between saying, let me just do a one step look ahead, versus, let's see how this would play out if we actually followed a couple of different trains of thought. So anyway, that's kind of what's at the heart of the speculative sampling thing that's been discussed.
Yeah. To get into a little bit more detail, specifically the way this works is: you have a second model producing these probabilities, like, what is the probability of any given next word. So you have a smaller model that gives you these probabilities as a proxy for the big model, which is very expensive to get these probability estimates out of.
And you can basically go ahead and say, okay, let me just go ahead with what the small model said, and verify with my big model that that was actually a good call, but I'll also continue going forward, hoping that it is the correct answer. So in this paper, they essentially show how you can get this small proxy model to work better and get very nice speedups.
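To make that draft-and-verify idea concrete, here is a minimal sketch of a greedy version of speculative decoding. It is not EAGLE's actual algorithm; `draft_next` and `target_next` are hypothetical functions returning each model's most likely next token, and real systems work with full probability distributions plus an accept/reject rule that preserves the target model's output distribution, verifying all drafted positions in one batched forward pass.

```python
from typing import Callable, List

def speculative_decode(
    prompt: List[str],
    draft_next: Callable[[List[str]], str],   # cheap "draft" model
    target_next: Callable[[List[str]], str],  # expensive "target" model
    k: int = 4,            # tokens the draft model guesses per round
    max_tokens: int = 64,
) -> List[str]:
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_tokens:
        # 1) Draft model cheaply proposes k tokens in a row.
        proposal: List[str] = []
        ctx = list(tokens)
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)

        # 2) Target model verifies the proposals. In a real system this is one
        #    batched forward pass over all k positions, which is where the
        #    speedup comes from.
        accepted = 0
        for i, t in enumerate(proposal):
            if target_next(tokens + proposal[:i]) == t:
                accepted += 1
            else:
                break

        tokens.extend(proposal[:accepted])
        # 3) On the first disagreement, fall back to the target model's token.
        if accepted < k:
            tokens.append(target_next(tokens))
    return tokens
```

In this greedy simplification the output matches what the big model would have produced on its own; the saving comes from how many of the small model's guesses the big model can accept per verification step.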
And moving on to our lightning round. So let's try to do some sound bites. It's a low budget show, folks. We can't afford a sound guy. We don't have a sound guy. Anyway, the paper here we're talking about is called Are Vision Transformers More Data Hungry Than Newborn Visual Systems? And if you look at that title and go, newborn visual systems, are we talking about chickens?
The answer is yes. So what they're doing is they're basically trying to settle an argument that people have been having over whether transformer models, the same sort of models as the T in ChatGPT that we all know and love, when applied to vision, are as efficient, or similarly efficient, as biological systems. And the reason you might want to ask this question is, first off, just fundamental curiosity.
And also, like, how much further can we go in terms of squeezing more efficiency out of these systems, if biological systems are proof that we can actually go further? That's sort of an interesting angle: maybe we can squeeze more out of less. And the other piece, kind of more on the neuroscience side, is that transformers are sort of related to some current hippocampus models in computational neuroscience.
And they can reproduce the precisely tuned spatial representations of the hippocampal formation, which is a lot of syllables, but basically says, you know, their behavior mimics some of the behavior that we do see in biological systems.
People have run tests like this before, where they validated that convolutional neural networks, which are sort of the spiritual predecessor of vision transformers, or not even spiritual, they were the things that you used to use if you wanted to solve computer vision problems, actually do perform roughly comparably to newborn chicks.
But a lot of that was assumed to come from the fact that convolutional neural networks have a structure that is really designed to explicitly lean into some of the symmetries that images contain. We won't get into the details of that too much, but if you know, you know: convolutional neural networks have what's known as a strong inductive bias. They're very much designed for vision, whereas vision transformers don't really have that; they're more general
purpose. And so what they do in the study is they raise a bunch of newborn chicks in a really controlled, extremely boring environment that has one object. And then they simulate the chick's experience. So they create a first person chick simulator that simulates what the world would look like if you were that chick moving around in that very, very simple, very boring chamber with that one object. And then they train a vision transformer on that data, and they test it the
same way they then test the chicks. So they essentially expose the chicks to the same visual data that the vision transformer is exposed to. And the long and the short of it is, they end up finding actually surprisingly similar performance for these two systems, the chick and the vision transformer, which is pretty remarkable. They look at how many training samples you need to get the vision transformer to match chick performance,
and the estimate, you know, ends up being around 80,000 images or so used to train these models, which, by the way, does map pretty reasonably, based on their calculations, to an estimate of the amount of visual experience that a chick might have over the course of its life. So you actually find sort of similar performance. And they test these two kinds of entities, these two intelligent systems, if you will,
by having them look at a new object that they hadn't seen before, or the same object they'd seen but from a different viewpoint, to try to see if they can recognize that same object and distinguish between the two. So a really interesting piece, especially if you're interested in the question of AI versus natural intelligence, you know, biology, what's the delta there? This is a big update for me. I honestly didn't expect vision transformers
to be this efficient. So yeah, interesting thing to know.
Yeah. That's right. I think it's probably unexpected that the answer to the question in the paper's title, Are Vision Transformers More Data Hungry Than Newborn Visual Systems, at least in this experiment, is: no, they are not more data hungry, they're roughly comparable. Now of course, this isn't learning to see in the same way humans do. This is on this specific task of, like, can you recognize this thing that you've seen before but from a new perspective?
So this is under very specific experimental conditions. But I do agree with you that it's kind of interesting to see that they are sort of comparable in this setting.
They also, by the way, just kind of slip this in, but I believe they're casually introducing a new vision transformer in the context of this paper, which I think is just worth mentioning super quickly because it does have a pretty simple operating method. So the way they set this up is, you imagine a video, which is really a collection of images, time ordered. And what they do is they train the system to look at one frame,
and they're like, well, you know what, from one frame of video to the next, things shouldn't really have changed that much, right? The substance, the meaning behind the scene, should not have changed that much.
And so let's see if we can train the system to recognize frames that are close in time to a given frame, and distinguish those from frames that are further away, and make sure that frames are represented in the neural network in similar ways if they're close together and represented differently if they're far away in time. And that approach is called contrastive learning.
And we've seen variants of that going all the way back to CLIP, I think that was the first time I remember seeing it, back in like 2021, from OpenAI. But yeah, they're applying it to video here. And the reason they're using this is that this is believed to be how animals also learn to recognize scenes and do a lot of vision learning. Our brains kind of assume that things haven't changed much
over short periods of time, and then kind of distinguish that from later periods where a new scene might be arising, and that helps inform how we represent the world internally within our brains. So a really interesting neuroscience and AI paper all bundled together into one.
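For the curious, here is a minimal sketch of that time-contrastive idea: an InfoNCE-style loss where a frame and a nearby frame are pulled together in embedding space, while the other frames in the batch act as negatives. It is a generic illustration rather than the paper's exact training recipe; `encoder` stands in for any per-frame vision model, such as a ViT.

```python
import torch
import torch.nn.functional as F

def time_contrastive_loss(encoder: torch.nn.Module,
                          anchor_frames: torch.Tensor,   # (B, C, H, W)
                          nearby_frames: torch.Tensor,   # (B, C, H, W), a few frames later
                          temperature: float = 0.1) -> torch.Tensor:
    # Embed and L2-normalize both views of each clip.
    z_a = F.normalize(encoder(anchor_frames), dim=-1)    # (B, D)
    z_p = F.normalize(encoder(nearby_frames), dim=-1)    # (B, D)

    # Similarity of every anchor to every "nearby" frame in the batch.
    logits = z_a @ z_p.t() / temperature                 # (B, B)

    # For anchor i, the temporally adjacent frame i is the positive; frames
    # from other times/videos in the batch act as negatives.
    targets = torch.arange(z_a.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)
```

Minimizing this pulls temporally close frames together and pushes temporally distant ones apart, which is the representational structure described above.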
Next paper: Circuit Component Reuse Across Tasks in Transformer Language Models. So this is going to get a little bit technical. This is in the field of interpretability, and specifically mechanistic interpretability, which is a thing that Anthropic and certain researchers have pushed hard on, where essentially you try to discover interpretable parts of a big, messy neural net, so that you can say, okay, this combination of elements of the neural net does this thing.
And that, roughly speaking, is what a circuit is: a combination of these smaller parts that together do whatever the model is doing. And so this paper is looking at a specific so-called circuit. There was a previous paper from 2022 where they discovered a circuit for indirect object identification, so a set of units in the network that seemed to really specialize in this task. And this paper looked and saw that in another task, the same circuit was actually reused.
So this is meant to showcase the usefulness of this for interpretability, where if you identify a bunch of circuits, you can then, for various situations, be able to understand what a neural net is doing in terms of the circuits being involved.
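To give a flavor of the kind of tooling involved, here is a toy illustration of zero ablation, one of the basic operations used in this sort of circuit analysis: knock out a single attention head and check whether behavior on the task changes. It operates on a plain PyTorch nn.MultiheadAttention layer rather than on GPT-2 itself, and the papers use more careful techniques such as path patching, so treat this only as a sketch of the idea.

```python
import torch
import torch.nn as nn

def zero_ablate_head(attn: nn.MultiheadAttention, head: int) -> None:
    """Remove head `head`'s contribution by zeroing its slice of the output
    projection, which is equivalent to zeroing that head's output before the
    heads are mixed back together."""
    head_dim = attn.embed_dim // attn.num_heads
    start, end = head * head_dim, (head + 1) * head_dim
    with torch.no_grad():
        attn.out_proj.weight[:, start:end] = 0.0

# Usage sketch: compare the layer's output before and after ablating head 3.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=8, batch_first=True)
x = torch.randn(2, 10, 64)                 # (batch, sequence, embedding)
baseline, _ = attn(x, x, x)
zero_ablate_head(attn, head=3)
ablated, _ = attn(x, x, x)
print((baseline - ablated).abs().max())    # nonzero means the head contributed here
```

In circuit work, the interesting question is whether ablating a component hurts performance on one task, on several tasks, or on none, which is exactly the reuse question this paper is probing.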
Yeah, this is actually pretty significant from a safety standpoint. One of the big hopes for people who are pessimistic about how hard it will be to align AI systems, align superintelligent AI systems, AGIs, and make them safe, one of the hopes is, look, we may not be able to do that.
That may not be possible, but at least we may be able to understand the reasoning of these systems so that they don't take us by surprise, and therefore analyze, you know, what the goings-on are inside these networks. So mechanistic interpretability, which, by the way, is something that Anthropic is really big on, and they've done some excellent work here, is really focused in part on answering that question.
And one concern has been, look, we are able to identify circuits that are associated with certain ideas. You know, that's great for this AI safety question. But usually the way you do that is you identify those circuits in the context of a very narrow problem set. So you look at, like, identifying cars, and you're like, oh, okay, these neurons all fire when there's a car in the picture.
But the worry here is, does that actually generalize? Like, in the more general case, where we just have a system and we don't know what its reasoning process is, it could be a very open-ended, you know, language model reasoning on a wide range of tasks. Will the things that we have learned by studying one specific use case actually generalize? And in the most pessimistic case, every different task could be handled uniquely by the model, like, in a different way.
And if that's true, then having a circuit for every task would, as they put it in the paper, leave us no better off from an interpretability standpoint than having the full model itself. Like, basically everything is bespoke for every single task, so your interpretability work doesn't generalize and it gets kind of useless. This is why they're really so focused on the idea of re-use of these circuits, right? Can we show that in two different tasks
we actually find re-use of a given circuit? The tasks they choose are related, or at least they believe that they're somewhat related, but they're not exactly overlapping. And you can sort of see how early on we are in mechanistic interpretability, as people have to come up with very, very controlled experiments to validate that in fact there is re-use of a particular circuit.
There's a lot more to say about this paper. I think this is something really, really fascinating. At some point I want to do a deeper dive into this whole field; maybe this is the stuff of a special episode. But for now, just worth flagging: this is all research that was done on a version of GPT-2, right? So a lot of mechanistic interpretability research, you'll see it done on smaller, sort of older models that are easier to work with.
This is a big part of the reason. So, the space is moving fast, to be clear. Interpretability research is moving by leaps and bounds, especially now that there's so much attention on it for safety reasons. But still, you can tell how early days it is by how tightly controlled these experiments have to be and how limited the generalization ability of a lot of these tools, at least, is suspected to be at this point.
And one last story for this section: A Shocking Amount of the Web Is Already AI-Translated Trash. That's the headline of the story. And yeah, this is about a study by researchers at Amazon who found that over half of the sentences on the internet have been translated into two or more languages, often with poor quality, due to AI or machine translation. Thus, the AI-translated trash.
So they generate a corpus of 6.38 billion sentences from the web and found that 57.1% of the sentences were translated into at least three languages. To be a little bit more detailed, the trash here shows up more often than not when translation is towards low-resource languages, so languages that are not spoken as much, or that we don't have as much data for, unlike English or French.
So for low-resource languages, like many African languages, which are very low resource, they say the translations are basically much worse, which, you know, kind of makes sense. You would expect the models to be worse at translating for these low
resource languages. And, yeah, so I guess it's another indicator of the web and just our digital world being, kind of, maybe you could say spammed, or just generally filled with AI generated stuff, as I'm sure will be the case going forward pretty much indefinitely.
Yeah, I think there's also this implicit question here about selection bias, right? Like, the worse the generation is, the easier it is to detect as AI generated content. This sort of makes me wonder, you know, are we potentially undercounting the amount of AI generated, high quality content that's out there? Really hard to know, because of course, in the limit it starts to look exactly like human generated text. And so I think this is just one of those additional challenges we have.
Like, it's difficult to even look around and sort of concretely, comprehensively assess what the internet is made up of today, because AI has gotten so good.
And onto the policy and safety section, starting out with a throwback to the very first thing we touched on in the episode, Taylor Swift once again. So the news is that the Taylor Swift AI images have prompted a US bill to tackle nonconsensual sexual deepfakes. So yeah, this is hot off the press.
Yesterday evening, just before we started recording, a bipartisan group of senators introduced a bill Tuesday that would criminalize the spread of nonconsensual, sexualized images generated by AI. So this is coming from Dick Durbin, the US Senate majority whip, and Senators Lindsey Graham, Amy Klobuchar, and Josh Hawley. The bill itself is the Disrupt Explicit Forged Images and Non-Consensual Edits Act of 2024, or, for short, the DEFIANCE Act.
And yeah, they explicitly call out Taylor Swift. To quote here: this month, fake sexually explicit images of Taylor Swift generated by AI swept across social media. Although the imagery may be fake, the harm to the victims from distribution is very real. So that's from the press release. And presumably, I guess, this was in the works before, but possibly because of the story coming out, they went ahead and introduced it soon after.
Yeah, I was just realizing, so, first of all, the titles of draft bills are always hilarious. They have to have an acronym for it, like, you know, the CHIPS Act. Anyway, it's a very common thing. It's also the first time I've actually realized this is the same thing that AI people do with their papers as well. Like, we come up with elaborate reasons that it has to be called, you know, LLaMA, and the acronym is all bungled, and you can tell people are trying really hard.
So Congress has something in common with AI researchers. But yeah, really, it's interesting to see this come so recently off the heels of the Taylor Swift thing that obviously made a really big impact. It's so hard to predict which of these things will land and which ones won't, because this set of issues has been live for a long, long time, right? Like, this is not the first time we've been talking about, you know, deepfakes
or anything like that. But also interesting is the bipartisan consensus around this; the bipartisan kind of nature of these issues varies a lot from one to the
next. And, you know, on stuff like this you tend to see a lot of consensus. There tends to be less consensus on sort of anticipatory stuff, before certain risks manifest, obviously including election interference stuff, but also physical risk from AI systems that we, you know, may have plenty of reason to think is coming. There tends to be a lot more resistance to that sort of thing.
So this is a nice way to sort of flex those bipartisan muscles, and hopefully something like this will end up being incorporated into US law.
And I do think it's interesting: just last week we covered the Preventing Deepfakes of Intimate Images Act, which was also bipartisan and also introduced, actually reintroduced, last week. So this seemingly is a totally separate effort that was just done in parallel, also dealing with sexually explicit deepfakes. As we covered last week, that other act was inspired by a totally different event, dealing with deepfakes at a
high school. Here, I suppose, you know, they do call out Taylor Swift as an example, but it probably is a broader pattern. So the details here are probably different in terms of the specifics. But, broadly speaking, this will explicitly allow people to sue for spreading nonconsensual AI generated imagery.
Yeah. And the liability piece, I'll tell you about another bill in a minute, but the liability piece is so, so important, because that's what causes corporations to move, right? Like, if they have the sense that, we actually have legal exposure for doing X, Y or Z, you know, then they move and cover that base. It's why you see a lot of lobbying against liability, and for the idea that, hey, you know, we don't know enough yet to impose liability.
Maybe we have a sort of safe harbor, where as long as the corporation does certain safety related actions, then they're let off the hook. But the problem is that we don't know what safety related actions are required to prevent things like jailbreaks, as we've talked about earlier this episode. And so as a result, all you can really do is say, hey,
there's a liability regime. I don't care how you do it, but it's now your responsibility to throw, you know, the billions and billions of dollars that you have at solving this problem. And I'm sure things will have a magical way of sorting themselves out once the incentive is in place. But that's sort of the position, at least, of a lot of the people who advocate for these sorts of liability frameworks.
Next story: OpenAI and Google will be required to notify the government about AI models. This is an announcement from the US Secretary of Commerce, Gina Raimondo, who said that there will be this new requirement that mandates companies to share details every time they train a new large language model. And these companies will have to share risk and safety data for review, which is part of the AI executive order from last year from the Biden administration.
We discussed at the time how there were some of these details in there, like, if you're training a very, very big model, there are some additional responsibilities or actions you have to take. And so this is kind of an update on that, so to speak, where once again it's being reiterated that this will be a component and something that the companies will have to abide by.
Yeah. And you can see the administration reiterating the grounds that they claim to have to impose this requirement on companies. So, for context, the vehicle they're using to allow the government to do this is, in the US, highly unusual, right? Having a government step in and tell private companies that they will report on certain activities, especially in the United States, where we have a tradition of, you know, free market sort of liberalism, this is highly unusual.
The thing that makes the government able to do this is something called the Defense Production Act. This was invoked in the White House's executive order that Biden signed fairly recently, and it's going to come up again in a couple minutes. But, for context, this is an act that came out in the 50s, and it was meant to respond to basically the start of the Korean War.
It was part of an exercise to mobilize civil defense and kind of broader society to support the war effort. It's something that you're only really meant to invoke when there's a national security emergency, and it gives the president the power to require businesses to do certain things. The last time it was invoked was in 2021, actually, so fairly recently, in the context of Covid, basically to order companies to start producing pandemic related protective equipment.
So this is something that usually needs to be heavily justified. Of course, the current administration believes that it is, on the grounds of a number of things, including the weaponization potential of these systems. We've already seen that these systems are, you know, able to carry out autonomous cyber attacks, and AI scaling suggests that that's going to get potentially much, much worse very soon. So you kind of need to get ahead of the ball on
this. But still, the fundamental basis for this reporting requirement is something that is increasingly coming under fire, and you can see the administration now trying to be clear about the justification for it, a justification which, you know, I actually think is quite strong, just on the basis of some of the national security work we've been doing. I think this is absolutely sensible, but it does mean we have to have a conversation about it.
And I do understand arguments that say, you know, no, this is unprecedented, it's maybe not appropriate in this context, or whatever. Ultimately, it boils down to how well we understand or buy into the risk picture with AI, and that's sort of the core here.
Is there a national security emergency? So Gina Raimondo, the Secretary of Commerce, is now sort of having to answer questions about this and talk about additional requirements, like the know-your-customer requirements that are being imposed on cloud service providers, which are now going to require folks to ask a whole bunch of questions of their customers before they let them do big training runs. And this is a way of being consistent with their export control
policies. So basically saying, look, as she says, you know, we use export controls on chips. Those chips are in American cloud data centers, and so we also have to think about closing down that avenue for potential malicious activity, as she puts it. Basically, you know, China potentially using the cloud to bypass the controls on high end chips that already exist. So it's a really interesting package that just blends politics and technical stuff all
together. And this is absolutely, I mean, this is the debate right now on the Hill, in terms of what's appropriate, what authorities can be invoked to manage this stuff in the short term, before we have legislation, which Congress has to pass, probably, hopefully, we'll see, in 2024.
And speaking of that debate, starting out the lightning round, the story we have is The Campaign to Take Down the Biden AI Executive Order, which is all about that same topic. It really goes into how the use of the Defense Production Act is facing opposition from lawmakers, tech lobbyists, and conservative groups that argue that basically this is overreach and not a legitimate way of doing it. So there's quite a bit of detail here on this push by different groups
against it. One specific detail is that the Republican Senate Commerce staff are reportedly slowing down all AI regulation going through their committee.
Another detail is that the Americans for Prosperity Foundation, a nonprofit founded by the Koch brothers, has filed Freedom of Information Act requests and a lawsuit against the Commerce Department and another government agency, demanding agency records on the Defense Production Act and AI. So yeah, some controversy, some pushback over the ability to do this going on.
Yeah. And this really comes down to that question we were talking about just a couple minutes ago, of whether we think there is a genuine national security concern here. This is certainly the position of the White House, and there was actually a quote explicitly saying that from Ben Buchanan, who's the White House special adviser on AI. He says, yeah, we invoked the Defense Production Act emergency power because, and this is a quote, there is a no-kidding national
security concern. You can debate this, we can debate whether that's true or not, but if it is true, then, you know, quickly you fall into the territory where this can be justified. I definitely get the arguments against invoking the DPA, against this whole line of reasoning. I'm myself pretty libertarian, you know, I'm a tech guy, Silicon Valley, I've started a lot of companies, and I don't like regulation. I think it's generally a bad thing.
But we definitely do need regulation for a narrow range of issues in the world, when there are risks generated that can't be controlled by the market. The vast majority of problems have free market solutions. But the question is whether this is one of them.
And certainly, I think, you know, having spent all the time that we have working in the national security space, looking at what these models can do already, and then looking at what scale seems poised to deliver, like, in the next year or two, it's really difficult for me to say that there is not a national security
emergency here. I mean, this is something that's evolving super fast, and, you know, if you don't get ahead of the ball, these models can't be deleted once they're shared on the internet. Once people buy, you know, very powerful AI processors, you can't take them back. Right now we don't have an easy way of tracking them all, and if there's an algorithmic breakthrough that makes it possible for them to be used for very dangerous things, you can't put the
toothpaste back into the tube. So we do have Senator Mike Rounds, who's a Republican senator, coming out and saying, you know, there's not a national emergency on AI. He's a South Dakota Republican, and he's worked with Chuck Schumer on some AI legislation, but he's opposed to the use of the Defense Production Act, because he says, really, this is not necessarily what the DPA was made for in the first place.
And again, I really understand that. I think this is a very challenging debate, and everybody's trying to be thoughtful about it. From my perspective, it pretty clearly is a significant risk, and arguably, I would say, an emergency. But there's a lot of room here for people to figure out what makes sense and how we cover these bases.
It may be through the Defense Production Act, or it may be something else, but somehow we'll probably want to cover these bases.
And once again, another story on US politics. The next story is: Representative Jeff Jackson introduces a bipartisan CLOUD AI Act to stop China from remotely using American technology to build AI tools. So, as we've covered quite a lot, there is an export ban so that China cannot buy GPUs and other chips that are used for running and training AI models. But you can still pay to use this hardware via the cloud, right? And that's what most companies actually do.
They don't necessarily buy a ton of hardware; they just pay Amazon or Google or Microsoft or many other potential companies to use their GPUs on the cloud, and do their own training and inference and so on, without having to actually set up their own server farms. So there's now this CLOUD AI Act, the Closing Loopholes for the Overseas Use and Development of Artificial
Intelligence Act, that is basically, yeah, closing that loophole, I guess you could say, of saying, you know, if you're in China, you will not be able to access these GPUs via the cloud.
Yeah. The premise here is, you know, if there's a ban on exporting certain GPUs to China, then there should also be a ban on Chinese domiciled organizations or individuals accessing those same GPUs if they just happen to be served up on the cloud. That's kind of the premise here. There is actually a debate here, by the way; this is not viewed by everyone as strictly a loophole.
One of the reasons that this, the cloud compute loophole, could be not such a bad thing is that cloud use is trackable. It means that the US can actually monitor Chinese use of AI systems to the extent they use the cloud. It also has the effect of reducing the Chinese domestic incentive to develop an independent compute supply
chain, right? Because you're essentially taking business away that otherwise would be going to, like, Huawei, and would allow them, for example, to ratchet up their production and ultimately compete more with the Western market. The flip side of this is that, you know what, China has actually made it a policy priority to develop a domestic AI supply chain, so maybe this doesn't matter at all, and at the end of the day they're
going to do it anyway. They're going to prevent people from using Western domiciled cloud services, or try to artificially inflate their domestic cloud. So it's an interesting debate. It's not 100% clear either way. It's an interesting piece of legislation, and it's definitely one way to go on this. By the way, you can read it, I did, it's surprisingly short.
It's like four pages long, and it talks a lot about the risk of weaponization and all kinds of stuff like that. So worth giving it a read if you're a policy nerd.
Next up, a follow-up to a story we covered last week. Last week, we heard about how there was a seeming deepfake robocall happening out in New Hampshire, telling people not to vote in the New Hampshire primary, and the voice on the call sounded like Joe Biden, AI generated audio. So the story is that the AI startup ElevenLabs has banned the account that has been blamed for that audio. It seems that the audio in that robocall was generated with ElevenLabs, the leading
creator of text-to-audio tools for voice synthesis. We also covered last week how they have reached unicorn status. And yeah, so it has been blamed on, or at least sourced to, them, and now the user's account has been suspended. So yet another sign of how the platforms that enable you to generate various media with AI will have to take on the responsibility to really prevent misuse of this sort.
And we're wrapping up with How the West Can Match Russia in Drone Innovation. So this is an article by the very talented and bright Sam Bendett, who, full disclosure, I know, and Jay Palace, a senior official responsible for a lot of AI testing and evaluation stuff in the US Department of Defense. Both very thoughtful people, deeply tracking, and in Jay's case actually deeply involved in, the state of US DoD policy
and the Russia situation in particular. So, really quick overview. You know, it's a long piece; it has a lot of good detail if you're
interested in it. But one of the key differences between Russia and the United States when it comes to the automation of warfare is this idea that the West is focused more on having systems that autonomously observe and orient themselves and maybe make some simple decisions, but they don't tend to outright act on the battlefield in a very liberal, uncontrolled way without human intervention, whereas Russia is trying to automate the whole, what's known as the kill
chain. So the chain from kind of observing its environment, to orienting itself, to making decisions, and then actually acting on, you know, a decision to take out a target. So Russia is really trying to automate the full stack. And one of the challenges the United States faces is that they don't really want to automate all that quickly, because they have a set of responsible AI principles that they're tied to as a sort of liberal democracy. You know, they have the values you might imagine.
This is enshrined to a significant degree in DoD Directive 3000.09. We've talked about that one in the past. It's sometimes interpreted, slightly falsely, as saying that you always need a human in the loop; it's not quite right to interpret it that way, but it definitely does impose a lot of requirements, ethical requirements, testing and evaluation requirements, on US DoD use of autonomous systems.
The other thing is, apparently Russia has a system called Sturm 1.2. It's a heavy quadcopter drone, and supposedly it can drop projectiles without involving a human at all. So, you know, it's also used as a kamikaze drone and all this sort of thing. They've got a whole bunch of examples of these autonomous systems that Russia is fielding that the US would never field, at least currently,
based on the constraints that they're facing. One of the most amazing things to me in this article was this reference to Russia actively testing their commercial systems in live military operations. So, you know, obviously you would do that if you were unbound and unrestricted by kind of ethical, moral considerations around this stuff, if you just wanted to say, hey, you know what, we got a new system, let's see how it does, let's just ship it.
This also helps them with their development process, right? They're just able to throw these things out without really doing that much testing and evaluation. And needless to say, this is not something the US military would ever do. So one of the challenges that they uncover here is the big need for the US DoD to modernize its acquisitions process and blend that in with testing and evaluation, because they've
got to move faster, right? If the bar is higher for the Department of Defense, because they have these ethical and moral guidelines that they're trying to follow, then they just need to get faster at meeting those bars. And I will say, it's something we've seen firsthand just how dedicated the US government is to safety. We've deployed GPT-4 powered applications and stuff with the US government. I think we were actually the first ones ever to do that, by
the way. A little bragging point. But the amount of focus on safety and sort of testing, evaluation, and integration of those systems is really impressive. And it is in sharp contrast to the approach that seems to be used right now by Russia. So that's a really big handicap for the US going into this conflict. And it's something where, structurally, you know, either that bar has to be lowered or we need to get faster at meeting that bar.
But ultimately you get into a situation where, when your adversaries are willing to move faster than you are and be more reckless, it creates a race to the bottom, and we need to avoid that, of course, at all costs, especially in these DoD applications.
Yeah, we've been doing this podcast since, like, March of 2020, and in that span there's been a couple of news stories on AI guided drones, drones that autonomously, you know, find a target and go for it. It has sort of flown under the radar. To this day, there's no significant automation happening, as far as is generally understood. But I think it's very much something that is kind of there, waiting to happen, so to speak.
And it'll be interesting to see if this year we'll start seeing much more automation of these drone attacks with AI, or if, yeah, it's going to be more of a conversation about AI enabled weaponry and regulations and so on around that. And with that, we are done with our episode of Last Week in AI. You can find the articles we discussed here today and subscribe to our weekly newsletter with similar ones at lastweekin.ai. You can get in touch with us by emailing contact at
lastweekin.ai, or Jeremie at hello at gladstone.ai. Jeremie, that is.
Hello. That's not my first name, though. That's just, that's just hello.
Hello at Gladstone AI, I like the sound of that. And as always, we appreciate it if you share it with your friends, if you give us nice reviews, and generally, you know, make us feel nice about how the podcast is going. But more than anything, we love to see that people are listening and benefiting from us recording all this for two hours, so please keep doing it.