#146 - ChatGPT’s 1 year anniversary, DeepMind GNoME,  Extraction of Training Data from LLMs, AnyDream - podcast episode cover

#146 - ChatGPT’s 1 year anniversary, DeepMind GNoME, Extraction of Training Data from LLMs, AnyDream

Dec 12, 20231 hr 25 minEp. 185
--:--
--:--
Listen in podcast apps:

Episode description

Our 146th episode with a summary and discussion of last week's big AI news!

Note: this one is coming out a bit late, sorry! We'll have a new ep with coverage of the big news about Gemini and the EU AI Act out soon though.

Read out our text newsletter and comment on the podcast at https://lastweekin.ai/

Email us your questions and feedback at [email protected]

Timestamps + links:

Transcript

ANDREY: Hello and welcome to today's Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode, we will summarize and discuss some of last week's most interesting AI news. You can also check out our last week in AI newsletter at lastweekin.ai for articles we did not cover in this episode. I'm one of your hosts, Andrey Koreshkov. I finished my PhD at Stanford, who I was studying AI earlier this year, and I now work at a generative AI startup.

JEREMIE: And I'm your other host, Jeremy Harris. I'm with Gladstone AI as an AI safety company I co-founded. And, um, yeah, we work with folks in the national security community and the Frontier Labs to solve problems with safety and stuff. And, Andrey, I have a cool little story for you. I didn't tell you this before, but I had an interesting day yesterday because I was in Canadian Parliament doing fancy things. I was testifying about a bill that Canada has.

Hey, who knew Canada actually has bills and one of them has to do with AI anyway, so I did this little committee thing and it was kind of cool, um, sat in this, uh, very formal room and had a nice stuffy conversation with, uh, some, some surprisingly, I will say, very well informed parliamentarians. So that was kind of cool. Um. Oh, look at that. ANDREY: One of us is actually impacting the future of AI, and the other one is working on AI to make video games for fun.

JEREMIE: Yeah, I was, I was trying to persuade them to take it easy on the video game industry. You'll be pleased to know. So. Great. Yeah. No. Anyway, it was just a lot of fun and, uh, thought I'd thought I'd share that we're becoming part of the news now. We're making it as well as. Sorry. I'm just being a schmuck, all right. ANDREY: Yeah, well, this week we have a lot of news to go over. I think this is been a busy week.

We have, like, 50 news stories in our document here, so we will be going even faster than usual. So I guess, uh, listeners will just have to try and keep up. And you know what? Let's just go ahead and get going. Starting with our Tools and Apps section. The first story is ChatGPT one year anniversary How it Changed the World. This is by VentureBeat. So yes, it's one year anniversary. Just, uh, came and passed this past week. And as you might imagine, this was on November 30th.

Uh, a lot of different news stories, thinkpieces, etc. came out just sort of ruminating on the year that has been, uh, which, I mean, I think it's fair to say it has been quite the year ever since ChatGPT came out. So this article is, uh, pretty much a month by month, uh, kind of recap of some of the big news stories of this past year. Nothing new, but I thought it was a pretty good coverage of some of the highlights. And, uh, of course, we've covered a lot of these things on the podcast.

Uh, some of the fun details are, you know, this emerging competition between epic and OpenAI, how Google has invested in anthropic and Amazon is partnering with anthropic, whereas Microsoft is with OpenAI. There came out GPT four this year, which is easy to forget. Chatgpt was at first just 3.5. And of course, uh, that was the global tour that Sam Altman did that we covered maybe half a year ago where he went to like 20 countries.

And then most recently, of course, there was the crazy drama at OpenAI. So, uh, yeah, I think if you are curious for recap, you haven't been following the news for this past year. This article is a pretty good one to look at. JEREMIE: Yeah, the one thing I thought it was missing was actually a reference to, uh, Sam's testimony in Congress, right when he went up with Dario and a bunch of other people to kind of warn about, uh, existential and catastrophic risks from AI.

I thought that would have made it in there, but I guess they're covering so much and it's hard to tell what's ChatGPT versus what's like, you know, Sam testifying kind of on his own behalf. Um, but but yeah, a lot of regulatory stuff in there, too. Italy, you know, banning and then Unbanning ChatGPT, Sarah Silverman suing and then, you know, maybe we'll see. But un suing, depending on how that lawsuit goes, uh, Chad or OpenAI rather.

So, uh, yeah, I mean, kind of a cool, fairly comprehensive take. ANDREY: Yeah. And uh, actually, you know, for regular listeners, it might be fun to hear that. It's been interesting for this podcast as well, just because ChatGPT kind of made us blow up a little bit, like, we've been doing this since March of 2020, and we used to get like, you know, 1000, maybe 2000 listens per episode last year, and now it's up to like 10,000 or something crazy. So, uh, that has been pretty, pretty cool.

And yeah, it's nice to see that people are actually benefiting from our coverage with, uh, this crazy year that we've had. JEREMIE: Yeah. And I think that's a reflection, too, of like, how much more AI literate everyone's becoming because of ChatGPT. Right? It used to be that, you know, what was your alternative you'd have to go to like, I remember the old days of OpenAI playground, where that was the only way to interact with, like, you know, GPT three or GPT 3.5.

And it was this fairly awkward thing. And they hadn't, you know, at one point, they hadn't set up the instruction, fine tuned models. And it was just, you know, text autocomplete system, basically. So, yeah, these little tweaks that just make the whole world explode, the whole world of tech explode. It's pretty wild. ANDREY: And, uh, you look back and you remember.

Right. A year ago, in November, uh, opening, I launched this, uh, early demo of ChatGPT, which we thought would be kind of just like a research preview. They didn't expect it to change the world. Uh, and then it did. It just exploded. So it's, uh. Yeah. Interesting to see how sort of we've had, uh, chatbots, we've had language models for years, and then somehow this one event kind of kicked everything into high gear. Uh, now, over a year ago.

JEREMIE: Yeah, it's a GPT four was supposed to be the big story, right? That was what everybody was anticipating. And then they're just like, hey, by the way, we'll launch this little toy. And then the toy became the story. ANDREY: And moving on to new news, not looking back over the past year. The next story is perplexity. Ai introduces new online LMS for real time information access. So LMS are language models, the stuff that powers chatbots.

And this company perplexity AI has introduced these two models plus seven be online and plus 70 billion online. Uh, which are language models that can access real time information from the internet to provide accurate responses to time sensitive queries. Uh, they are available via their API, the people API, and uh, at least according to them, this is the first API you can use as a developer to use this sort of chatbot that is integrated with online search capabilities.

I believe ChatGPT now has that capability with Bing. And we've there're some other ones like Uh.com that offer, um, internet informed LMS. But yeah, with this launch, now any company can potentially build on this technology. JEREMIE: Yeah. And I think there's a bit of like a terminological, uh, confusion thing going on where they call it an online LLM. And so usually when you talk about online learning, it's, it's the, the model that's changing its parameter values.

They're actually getting updated in real time. Um, I think what they mean here is actually the parameters are static. What ends up happening is that the model just has a really they have a really efficient way of indexing a huge database that gets updated in real time. And then the model is actually trained, fine tuned to get really good at retrieving relevant facts from that database. So, so they have these website excerpts that they call snippets.

And their whole strategy is based on like making the model really good at recalling the right snippet to answer your query. And the fine tuning process is oriented around that. Um, but yeah, it's sort of interesting. Like, you know, the Turm online is, is, is now kind of, uh, straddling two different definitions here. And uh, at least at first read, I found it a little bit confusing which one they were going for. ANDREY: Yeah. Now, these LMS are like the rest of us.

When there's something they don't know, they can just do a Google search. Uh, except, of course, not literally a Google search. There are some proprietary tech here. They have their own kind of, uh, I guess, database of information and in how search technology, which is kind of interesting. So yeah, it's I think pretty exciting.

We've, I think touched on the notion that connecting up language models to databases of information or just broadly, the internet is a bit of a game changer in the sense that now hopefully that will help them hallucinate less, make stuff up less frequently. And of course, they can now talk about things that are not in their training set. So all of these language models are trained, you know, up to a certain date, let's say May of this year or July of this year.

And I don't know anything that has happened since. Uh, but if you get them the option to do internet search to be online, as they say, then suddenly you can talk about anything. JEREMIE: Yeah. And I think it's noteworthy too, the way they're pulling this off, they advertise that they're they're hitting a fun it's kind of funny. They surpass the capabilities of quotes leading proprietary LMS such as GPT 3.5.

Well, if you're a fan of the show or tracking the space, you know GPT 3.5 is not actually a leading LM. Gpt four is. So they're kind of picking they're picking their adversaries pretty strategically there. Um, but, uh, still very interesting piece of work. I think one more thing worth noting. This is actually based on the back end on two open source language models. So there's Mistral seven B, which is the kind of the back end model that serves for the PL seven B online model.

And there's llama two seven TB the llama two, the biggest llama two model, which is the the kind of model, the foundation for their 70 billion parameter model. And they've actually, it seems, replicated in-house a big part of the human evaluation stack that OpenAI uses. So we're actually used to evaluations being done increasingly by language models themselves. We've covered this a lot on the podcast. It's expensive to get humans to like rate the outputs of your models.

So people often turn to GPT four to evaluate their models. This is not that they've actually evaluated with humans themselves. So it's kind of an interesting thing. They're building internal tooling that sort of rivals some of the stuff that OpenAI has done internally as well. So kind of a unique distinction for perplexity here. ANDREY: And just one more thing to note is, aside from the API, you can also go ahead and play around with yourself online.

So if you go to labs dot perplexity AI, you have access to these online models as well as some of the other ones. You can actually play with Mistral and Llama on their. Well, I just tried it out right now. I asked it, what are people saying about the GTA six trailer? Uh, and yeah, it answered quite accurately, actually. It said, you know, it's been viewed more than 50 million times in under ten hours. Uh, that is accurate. It had some quotes and then I googled the quote.

And the quote comes from an article in The Independent, uh, which was on Twitter. So, yeah, it's, uh, pretty neat to see that, uh, it totally works. And you can use it, uh, right. Now onto the lightning round. First story into it adds generative AI powered text prep to TurboTax. So TurboTax is the service that I believe is primarily for Americans, maybe elsewhere to prepare your tax returns. And, uh, now they have generative AI powered tools in there.

So they have some translation capabilities in there. It also, uh, is able to match customers to virtual or in-person tax experts. So essentially you have already had this, uh, real human tax, uh, expert. And I guess now you can chat with, uh, an AI as well to help you, uh, streamline the process. JEREMIE: Yeah. It really it feels like tax preparation especially is ripe for this kind of

automation, right? Because, you know, you've got a situation where you have a ground truth you can feed into the context window of your model as part of the prompt. And so that reduces the likelihood of hallucinations. So you can actually kind of benefit from just the reasoning abilities baked into those models. So yeah, kind of uh, I guess not too surprising to see this happen. And, uh, good, good luck accountants. It's going to be an interesting decade.

And up next, we have everyone's favorite childhood toy, at least if you're a 90s kid. Microsoft paints Dall-E three. Integration is rolling out on Windows 11. So Microsoft has this new Dall-E integration that everyone can now use in Windows 11. You can create AI generated images. So the standard sort of Dall-E process, feed it a prompt and get a get an image as your completion. And you can do this natively inside, um, Microsoft Paint and of course now OpenAI's ChatGPT.

Um, it's accessed through a new co-creator button that they have, and they're making it available in the Canary and dev channels or sorry, actually, it has been available in those channels since September. So sort of a wider rollout of something they've been testing in beta so far. And uh, yeah, really deeply integrating. I mean, this is part of that trend, right? Microsoft gradually rolling out these things in deeper and deeper integrations with their suite of office products.

ANDREY: That's right. So paint still relevant? Yes. Still something to actually use. Next story Mastercard launches Shopping Muse, an AI to help consumers find the perfect gift. So this is an AI powered service that provides personalized gift recommendations on a retailer's websites so it matches consumer profiles, intent and interests to product recommendations. Uh, which I guess is a good time for it given we have the holidays coming up.

JEREMIE: Yeah, I mean, it's it's now great because I can buy people gifts without putting any thought into it whatsoever, which is something I've always been looking for. Right. Like that annoying part, like everybody. Anyway, I'm just so glad to not have to think about my loved ones. That's. That's all I'm going to say. ANDREY: Yes. And, uh, this is powered by dynamic yield, which is something that Mastercard acquired from McDonald's in April of 2022.

So interesting to see, you know, more and more companies playing around with this stuff. JEREMIE: And you know what I've always wanted more of in the year 2023? The ability to create more fake voices so I can scam people. Voicemod will now let you create and share your own AI voices. Sorry, I'm being very cynical about this. This is actually a really cool thing.

They've launched new features Voicemod has that allow you to create and share your own AI voices so they call this the AI Voice Creator feature, and it's all about making synthetic voices that have, you know, you can have different genders, different ages, tones. So it's all very customizable. Um, and you can fine tune like kind of as if you had a little audio studio to tweak these voices. You can fine tune, pitch, volume, frequency, audio effects, all these things, all these knobs you can tune.

So kind of interesting. Um, and it's now available for Mac OS and Windows 10 and 11. So pretty, pretty wide rollout of this new tech. ANDREY: That's right. And unlike existing tools, voice mode is a real time voice modification service. So you can kind of, you know, use it and then sound like Darth Vader or Joe Rogan through a sort of filter. Fun fact I actually use this once, uh, a long time ago for a D&D campaign to sound scary. So, uh, yeah.

Now, if you want to sound like, uh, you know, demon monster, you can use an AI version. To sound extra realistic. And one last story for this section. Amazon finally releases its own AI powered image generator. So they have released this Titan image generator, which is like all the other image generators. It can create new images or customize existing ones based on text descriptions. This is part of a wider set of announcements and releases that Amazon had announced at Reinvent. And uh, yeah.

The company also claims to protect customers accused of copyright violations, uh, with image images generated by the model. So they also have this AI indemnification policy that we've seen some other companies like Adobe use. JEREMIE: So so interesting. We'll see. It's an interesting strategy, cotton.

We'll see how it pays off. I mean, it depends on whether the justice system actually ends up upholding this idea that you're not violating copyright in this case by training that is on copyrighted material. In this case, they, um, so Amazon apparently declined to say exactly where the training data sets are coming from and whether it obtained permission from or is compensating all the creators of the images used to train Titan image generators.

So a really interesting subplot, uh, shaping up here. And there's certainly a bit less on the transparent side when it comes to the stuff. Uh, they do also allow you to apparently fine tune these models on custom data sets. So kind of an interesting extra little bit of flexibility there. ANDREY: And on to the next section, applications and business.

Starting with an article from the New York Times, ego, fear and Money How the AI fuse was Lit and similar to the first story you covered, this is a bit of a look back article, so no real news here, but it is quite a good look at some of the pretty important events or sort of happenings of the past decade.

So it covers the birth of OpenAI, for instance, when there was this dinner at which Elon Musk actually had dinner with, uh, Sam Altman and some other people, and then they just decided to start the nonprofit. Uh, similarly, it covers the birth of DeepMind, which happened a few years before

that. Uh, and even going back earlier to 2012, they cover how there was this famous talent auction where Geoffrey Hinton, this very, very, very influential AI researcher who in large part was why there was a big kick off in AI interest in the academia world starting in about 2012.

Anyway, there was this pretty interesting event where essentially there was an auction where, uh, Hinton and some of his students were like, okay, all the big companies just bid on us, and the highest bidder can, uh, take us. And it was, as you might imagine, uh, in the tens of millions, uh, as far as getting them hired.

JEREMIE: Yeah. And then he ended up at Google and ostensibly potentially to some degree regretting, uh, some of the work that he did later when he turned around and started to worry about, uh, the catastrophic risk side, which has happened in the last year or so.

Um, one of the really interesting dimensions of all this, too, is like this debate over, um, that happened over the openness of open AI and how Elon Musk's original vision and the original vision of the company was we're going to just open source this technology, because the idea is we're concerned about a small number of actors having access to these very powerful AGI systems in the future. So we'll just make them available to everyone.

And then kind of the flip back that OpenAI seems to have done, um, saying, hey, well, actually, you know, maybe it's dangerous to give everybody access to a superhuman AI that they can, you know, modify themselves, uh, remove safeguards from themselves and so on. And this is just kind of become part of the subtext when Elon Musk takes pot shots at OpenAI, calling them sort of a maximally for profit company and all that stuff. So anyway. Yeah, good.

Uh, good. Overall, uh, kind of picture of the space and another, you know, another good review article as you're wrapping up the year, trying to take it all in. ANDREY: Yeah. Quite good. And, uh, if you haven't heard the back story of DeepMind or OpenAI, I do recommend it for a fun read. JEREMIE: Up next we have Sam Altman returns as CEO. Openai has new initial board. Now this is actually a blog post by OpenAI. So, you know, we've heard this news.

We've captured it on the podcast before, but this is OpenAI itself now coming out and making its official statements, kind of framing what happened through its lens. So it opens with this address, this letter from Sam Altman himself. He's letting us know officially that there's going to be a new board. He's telling us who's going to be on it, Brett Taylor, who's going to chair it, Larry Summers and and Adam D'Angelo, who, of course, was on the board previously and is the CEO of Quora.

A couple of interesting things. I mean, you can almost go through this letter and highlight the individual players, the individual actors and all this drama and how Sam is trying to kind of elegantly work around what may have happened with respect to those individuals. So of Ilya, Ilya Sutskever, who took part in voting him out and is one of the kind of key technical minds at OpenAI, he wrote. I love and respect Ilya. I think he's a guiding light of the field and a gem of a human being.

I harbor zero ill will towards him while Ilya will no longer serve on the board. We can. We hope to continue our working relationship and are discussing how he can continue his work at OpenAI. So that's kind of an interesting little sort of bit of, um, uh, slightly ominous, like foreshadowing from what may be coming next. Um, especially interesting is the passage about Helen Toner.

It's very brief and not quite direct, but Helen is believed to have been the kind of focal point of a lot of the the conflict in the board, and may have been, um, Sam's conflict with Helen may have been the proximal reason that he was kicked out. He writes. I'm grateful to Adam, Tash and Helen. Those are all the former board members for working with us to come to the solution that best serves the mission. I'm excited to continue to work with Adam and sincerely thankful to Helen and Tasha,

blah, blah, blah. Um, and and then last interesting note that, uh, that I caught in Sam's bit was this shout out to Greg. So Greg Brockman, the president of OpenAI, you know, when you think of OpenAI, normally you think of Sam Altman. That's the CEO. That's the one everybody knows. Greg Brockman was a former CTO of stripe, actually, a really, really big payments company. He moved over with Sam to work on OpenAI. They're very, very close.

And what Sam is doing here is he's he's kind of centering Greg in the story a bit more than is usually done. He says, Greg and I are partners in running this company. We have we've never quite figured out how to communicate that on the org chart, but we will. So I guess expect to see a lot more of, uh, of, uh, Greg Brockman in future. ANDREY: That's right. So, yeah, this is kind of the official this is done event of all of us.

We covered how they had announced an agreement in principle for Sam to return. And this is kind of the thing that marked it as actually happening. And, you know, the ink is dry. Everything is concluded. Interestingly, this happened this blog post came out on November 29th. So the day before the anniversary of ChatGPT. Uh, and yeah, they say as far as what comes next, they have three immediate priorities.

Of course, they say they will continue to advance their research plan and invest further in the full stack safety efforts. Uh, it says that our research roadmap is clear. And this was a wonderfully focusing time. So there you go. We will turn this crisis into an opportunity. And then they also say that another priority is continuing to improve and deploy our products and serve our customers.

And finally, there is also the last task which is building out a board of diverse perspective, improving the governance structure and overseeing an independent review of recent events. So yes, that's pretty much, uh, kind of what's going on. Openai is continuing kind of doing what it was doing. And so far there hasn't been much more conversation around what happened. It will just be someone will figure it out. And yeah.

JEREMIE: Yeah, I think one of the big take homes looking at the corporate governance side of this, which is something that, you know, AI safety observers are specially keen to look at because, you know, OpenAI's board was designed to promote safety. And now that board is being reshuffled, it seems, in response to de facto commercial pressures from Microsoft. So are we going to lose that safety focus. And, you know, it doesn't necessarily look great.

Like a lot of the messaging that's being telegraphed between the lines here is stability stability stability. The implication is we will not allow what happened before to happen again. We're going to have a stable, structured OpenAI essentially. Don't expect Sima to be voted off again. Is my reading of the subtext here. That may be wrong. You can kind of draw your own conclusions, but there certainly is this big focus on board of directors stability and leadership stability.

And that'll, you know, in part reflect the interests of Microsoft and other investors. But yeah, certainly something that came through, especially in Bret Taylor's comments at the bottom of this document. ANDREY: And now moving on to the lightning round, in which we have quite a few stories. So we'll be going quick. The first story is that Google is reportedly pushing the launch of its Gemini I to 2024, so that's the story.

Uh, Gemini has been in the works for quite a while, and in the news, this is their sort of follow up to Bard, which is meant to be kind of their really impressive blowout achievement. Bard was seen by many, including myself, as a bit underwhelming compared to ChatGPT and similar technologies. So Gemini was meant to be the thing that really showcases, uh, Google's AI capabilities, and it seems that it wasn't quite ready.

It was meant to be, uh, kind of coming out this next few weeks, but now it has been pushed back to January. JEREMIE: And we'll have to do a deeper dive on this, too. But I think as of this morning, Jeff Dean at Google came out with a tweet saying they have a bunch of results of Google Gemini sort of running and doing amazingly at the MLU benchmark, hitting like 90% on that, doing a bunch of other incredible things.

So we'll have to go into more depth on that next week when we have time to analyze the actual kind of research results and paper and do our usual deep dive. Uh, but yeah, I mean, certainly it seems like in this case it was the fact that it was struggling with non-English queries. Yeah, that that was the deciding factor, it seems, based on Jeff Dean's post on Twitter, that the actual reasoning capabilities themselves, at the core of the model are still super impressive.

So maybe what's happening is we're seeing good test results on English like amazing test results. And yeah, it's unclear how underwhelming they are in other languages, but sort of interesting to see how that's the kind of uneven distribution of abilities of the model, apparently. And next we have OpenAI agreed to buy $51 million of AI chips from a startup backed by CEO Sam Altman.

And so this is about a company called rain that was backed by, apparently about a $1 million investment from Sam Altman. They do, uh, essentially it's called neuromorphic computing. So they're designing a neuromorphic processing unit, or NPU, instead of a GPU that's designed to replicate the features of the human brain. That's what neuromorphic means here.

The claim is that this is going to yield potentially 100 times more computing power, um, and possibly 10,000 times greater energy efficiency than GPUs, which would be a huge deal, because increasingly and a lot of people don't realize this energy consumption is becoming a key bottleneck for scaled runs. Scaled training runs as these things consume more and more.

Um, key story here is this company has entanglements with a UA or sorry, a I think it's a Saudi fund called prosperity seven that is now being scrutinized closely by the US government. And so they're all kinds of like complexifies here. It's like Sam AI's sort of in a potentially conflict of interest because OpenAI signed a non-binding agreement to spend $51 million to buy chips from this company. And at the same time, there are these national security questions being raised.

So a real kind of Gordian knot of complexity emerging from this relatively new company. ANDREY: And speaking of this company, our next story is US compels Saudi fund to exit AI chip startup backed by Othman. So another event relevant to it. Basically there was an investor Saudi Arabia affiliated fund, and the US government has forced this Saudi Arabia's sovereign wealth fund to divest its stake in this AI chip startup.

JEREMIE: Yeah, it's, um, on the back of a bunch of scrutiny that's come to bear, as we we talked about with respect to this company, uh, or sorry, with respect to the fund prosperity. Seven there is this problem where it seems like a lot of these entities have close ties with China and that. Well, this seems to be no exception here. Uh, prosperity. Seven's invested billions of dollars in China's energy sector. And the.

You know, this this means there are a lot of entanglements for national security reasons that seem to be of interest. The. Anyway, we'll have more stories kind of related to this orbit as well. Or one more, at least today. But, uh, yeah, for now, it seems like a pretty big move. It's a $1 billion fund, so not a small thing. Um, and reign itself had had raised $25 million from them in 2022. And next we have OpenAI rival Mistral nears $2 billion valuation with Andreessen Horowitz

backing. Okay, so, uh, Mistral is known as this, like, very famously, um, pro open source company. They're now announcing that they've raised €450 million or $487 million from a bunch of investors. They include Nvidia and Salesforce and and. Recent Horowitz, which ideologically has very much been staking out this position of sort of thinking that like, um, catastrophic risk from AI is overblown. Don't worry about it. In fact, we should be pumping money into the open source ecosystem like crazy.

Um, it's consistent to some degree with their stance on crypto, which was similarly kind of, uh, sort of like ideologically libertarian. You can sort of think of it that way. Um, so here they are sort of backing this, this startup. One interesting note. So the former French minister or a former French minister who is now a chief advisor for the company, is now going to be selling about €1 million worth of of equity in the

business. Um, this may explain to why the French President, Emmanuel Macron, has been so pro open source, so pro Mistral in particular. There seemed to be deep political entanglements with this company, which, uh, you know, some some might say a bit of a conflict of interest, but, uh, you know, it certainly shaping the UI act discussions, which is a whole separate thing, as France has its open source position that's becoming pretty strong there.

ANDREY: That's right. So Mistral, Mistral, I don't know how to say in French is, uh, yeah, a big deal worth, uh, kind of keeping in mind as one of the big players now and the language model space for sure. And the next story is OpenAI applies for GPT six and GPT seven trademarks in China. That's the story.

The slightly juicy tidbit is that OpenAI services are currently not available in China or Hong Kong, so this implies, presumably, that they are hoping to be able to operate there in the future. JEREMIE: The take home message for me was if you want to register trademark for GPT eight, now's the time. I actually don't know why. I actually don't know why they wouldn't go ahead. Like, I'm sure there's a good legal reason and I'm just ignorant.

But like, you know, you'd think that they'd go for just all of them at the same time. I'm not too sure what the you know, what the reason is that you would just go with six and seven, but they're all six and seven now. ANDREY: Yeah, they they did also apply, of course, for trademarks to GPT for whisper and GPT five back earlier this year in July, although apparently those have not been received. So you know, not too many implications here. Presumably they just want to have a trademarks.

JEREMIE: And next up we have core. We've backed by Fidelity and Jane Street at $7 billion valuation as cloud provider bolsters status as one of AI's hottest startups. Uh, core. We've we've covered before. This is a really, really big round valuation of $7 billion. I mean, this is, you know, this is very serious. Um, it counts in Nvidia actually as an investor.

Now, interesting thing about core, we've they started off as an Ethereum mining company and then kind of pivoted over to AI compute by cheaply acquiring GPUs from a bunch of insolvent cryptocurrency mining firms. So basically, these companies were collapsing because the price of crypto was dropping. And core we've swooped in. And this is, I think, in part really a story of, uh, the struggle of getting good GPU

allocation. You know, what makes core we've different from all the other cloud providers access to computing. That's one of the big differentiators. They just have it. They just have the GPUs. And that's because they took this really risky bet back in the day. And now through this partnership with Nvidia. But so much, so much of the story of of hardware and scaled computing is now just about allocation. How can you convince Nvidia to make you a preferred partner and core?

We've certainly seems to have figured out a way to do just that. ANDREY: And speaking of AI and cloud, the next story is AWS debuts next generation Gravitron four and Trainium two chips for cloud and AI workloads. That's the story. So since 2018, actually, Amazon has been investing in these custom chips for AI.

And, uh, in as part of their slate of announcements recently, they announced these new generations with Gravitron four providing up to 30% better computing power, with 50% more cores and 75% more memory bandwidth. Uh, you know, as you can imagine, these are more powerful. And Trainium two is their chip designed for training large scale AI models. So yeah, Amazon still investing in its own cloud setup, of course, and its own chips. JEREMIE: And next we have together lens 102.5.

Keep track of that 0.5. That's important $102.5 million investment to grow its cloud for training generative AI. So together is a company that is really they have like this strong open source ideology. They're trying to essentially make it possible for people to train AI systems in a distributed way. And they've raised this pretty decent sized, especially for a series A round. This is a big series A, uh, it's led by Kleiner Perkins.

Which is used to be like the like really top line Silicon Valley investor. They are now like decent um, and uh, there's participation here by Nvidia and uh, Emergence Capital, which I'm not familiar with. Um, but yeah. So they're now just going to expand their cloud platform and let developers build on open and custom AI models. So this is just another sort of big win togethers. Put out a couple of big models at least so far today, and we're probably going to just see them. Keep scaling.

ANDREY: Next story is Indian AI video startup rephrased AI announces acquisition by Adobe. So Adobe is, uh, acquiring this startup that simplifies video production by transforming text into videos, which aligns with Adobe's ongoing investment in generative AI. We haven't seen any text video AI generation built into any Adobe products yet, but based on this acquisition, presumably it is coming. JEREMIE: And finally, we have report stability, AI positioning itself for acquisition.

So we've been following the stability AI drama for a little bit as it seems like I mean, it kind of seems like the wheels are falling off a little bit. There have been questions about, you know, how much money can you really make as an open source image generation AI company? Obviously they're broadening out into other things like language models. Um, and then there have been questions raised about the CEO of the company, Emad

Mostaque. Um, uh, things bordering on, you know, how like, reliable is this guy? Is he maybe doing fraud? Things like, that's kind of been the vibe. Um, and yeah, it now seems like there have been conversations happening in the background where they're chatting with potential acquirers. Um, and there've been a bunch of tensions between stability and its investors, its funding rounds. I mean, it is valued in the billions of dollars.

And one of the problems with that is you need to have an acquisition value that meets that valuation. Otherwise your investors are going to see a loss, right? If you're if your company was valued at $1 billion and then you get acquired for 100 million, um, well, they've just lost $900 million of de facto on paper value. So this is a really interesting, challenging time for them.

Apparently, some of the potential buyers that, uh, stability approached included cohere, which is a Canadian startup that we've covered a lot on the show. They've, you know, basically are like the Canadian open AI for enterprise. Um, and they also reached out to Jasper, which is, uh, more of a marketing focused company. So definitely surprising, especially if they're approaching cohere. That's not a giant company. So they're not going to be able to acquire for a huge amount of money.

This implies even though no dollars are mentioned anywhere here, it certainly implies that the acquisition dollar amount is not going to be is not going to be in the billions. There's just no way. ANDREY: And as you said, we've been covering stability, AI, they have been up to a lot. Uh, in general, they have, you know, made announcements recently about stable audio. Uh, stable video also came out.

And on the actual kind of commercial side, they primarily offer APIs for image generation, although they also have a chat UI and animation UI and some other things going on. Interestingly, this new story does cover some of the details of their finances. Apparently, they were spending 8 million a month on bills and payroll in October, and were generating 1.2 million in revenue in August and projected to earn 3 million in

November. So, um, yeah, it's, uh, losing money, losing millions of dollars a month. And it's in a competitive space of offering these APIs for asset generation. So it will be interesting to see where this goes. JEREMIE: And moving on to projects in open source. We have another paper or another article thing that has in its title ChatGPT is one year anniversary. This one's more technical. Our open source language models catching up is the subtitle and essentially the.

The argument here is that though ChatGPT obviously is proprietary closed source, there are open source alternatives that are starting to kind of like close the gap, and the gap between closed source and open source may be tightening in general. Um, you know, they highlight a couple of challenges with open source, sorry, with closed source models. Excuse me. Like ChatGPT, you know, they talk about the fact that its performance can sometimes just change over time.

And we don't know why, presumably because the OpenAI devs are doing stuff in the back end. Um, there are risks because the data set it's trained on is unknown. There are random outages, as happened this past week, and the cost is higher than if you deploy your own open source model in many cases. Um, and so they just look at a bunch of different benchmarks.

Uh, in particular, they look at llama two chat 70 B, so the 70 billion parameter version of llama, and they show that it can actually beat out GPT three point. Five turbo on some benchmarks, so GPT 3.5 turbo. That's what powers the kind of free tier of chat GPT. Um, but, uh, but as they say, even for that, um, it, uh, lama to 70 billion falls behind on most other benchmarks.

So you see this picture emerging of like a very patchy, uneven space where open source models, if they're going to be competitive with closed source, they kind of have to be fine tuned at the specific application that they are competing with. So like, you know, an open source version of Lama might beat ChatGPT at like a specific multiple choice question answering thing, but only if it's been fine tuned for that kind of task. And that more or less is the big take home of this of this paper.

ANDREY: That's right. This is more of a survey paper. So it's an overview of the space of open source LMS and kind of a state of it. So if you are interested in it, it is a pretty good writeup of summarizing sort of what are the current, uh, models out there? How do they perform on various things? They have a really fun figure, actually. If you go and look at figure three, they have an LM development timeline and it shows

basically. Yeah, timeline going from May 2020 to October of 2023 of all the models. And it's it's yeah, it's like you can list them out. I'm just going to read a few. Like it starts with GPT three. Then we get gopher Lambda Chinchilla. Then uh last year we get to chat GPT. And then of course this year we've had this, you know, pretty, uh, recent really, uh, development of this, uh, space of open source language models with llama and alpaca now Mistral as well.

Uh, so it is interesting to note that really these LMS were mostly closed in 2020, 2021, and to some extent last year. And then this year they started being very good. And I would imagine Meta and Mistral will keep pushing these, uh, open source LMS to get better and better. JEREMIE: Yeah, I do kind of suspect like it's sort of interesting looking at the trend here, right. Like you said, the it's like the number of closed source models is, is reducing.

But if you look at those models specifically, it's like, you know, you're looking at like Claude, Claude two like these are very, very performant models. I suspect what's going on is companies are just investing way more into a smaller number of models, just because that's what the economics of AI scaling does to you. Right. Like you kind of go, okay, let's let's keep pooling our resources into a smaller number of models.

Meanwhile, in the open source, we've sort of seen this proliferation of like uncoordinated development of in many cases, highly redundant models. Like it's not really obvious why you should turn to a particular 7 billion parameter language model. Now they're just kind of all over the place. And so I think that that sort of proliferation is this very uncoordinated, inefficient thing. Whereas in the closed source ecosystem, companies are kind of building their Manhattan Project systems.

And, and so we're seeing fewer of them, but the capabilities are greater. ANDREY: And next story is China open sources deep seek LM outperforms Lama two and cloud two. So this is not China with Chinese government. This is about deep seek a Chinese company that has released its deep seek LM, a 67 billion parameter model trained on a whole lot of data, 2 trillion tokens. And they do say that it outperforms Lama two in cloud, two in areas such as reasoning, coding, math, and Chinese comprehension.

It is available in English and Chinese and is open source under the MIT license. So notably now Deep Seek LM is joining Lama two and Falcon as some of these models that are very big at 67 billion parameters. And to some extent competitive with ChatGPT. JEREMIE: Yeah, this is pretty remarkable because of course, we know about all the export control stuff that's preventing China from getting their hands on cutting edge computing resources. We've covered that in the podcast a lot.

And so it is remarkable that this company, this Chinese company has been able to produce a model that is competitive with Lama two. And I was curious like how how did they do it? I looked into it. It seems like it's using pretty similar architecture, like the 67 billion parameter model also uses group query attention. That's a feature of Lama two. Um, there isn't a ton of information as to why exactly it outperforms Lama two.

Um, the information in their their kind of publication is that it's due to the training data. Um, often with these kinds of models, especially coming from China, like usually I'm a bit skeptical because sometimes they fail to live up to the hype. But I looked it up on the Hugging Face leaderboard. It is in the top three and in the. Top one spot for some metrics, so I think it's very likely legit. They they also have a bunch of data about how it performs on kind of eighth grade math

type problems. Zero shot. It's pretty close to Claude two, which is Anthropics proprietary model, like 84% on that benchmark versus 88% for Claude two. You know that 4% is a big delta, but it's not that big. Um, and anyway, there are a bunch of impressive stats about its generalization ability. It actually outshines Claude, too, on a particular Hungarian national high school exam, which is very surprising.

Um, and then their coding variants of this as well that they've, they've built out to and they perform really well. So overall, I think this is a surprisingly strong model. Um, pretty remarkable, especially given where it's coming from. ANDREY: That's right. And being released under the MIT license is interesting. Mit is essentially just do whatever you feel like. It has no restrictions whatsoever.

The code is under the MIT license, but the actual models are subject to a custom license, a deep sea license agreement. So I guess in that sense, it's actually the same as Lama2. And having this sort of custom license, uh, that does allow commercial use of it, but the license is kind of interesting.

It has a section on use restrictions, and it says that you may not use the model or derivatives of a model in any way that violates any applicable national or international law or regulation, or infringes upon the lawful rights and interests of any third party. Uh, it's not allowed to use for military use in any way purpose, you know, exploiting, harming minors, uh, generating or disseminating false information. So essentially, don't use this model for bad things.

Uh, but you can use it for commercial applications, which, uh, the original llama for instance, was research only. This is not research only, but apparently it's like not bad things only. JEREMIE: Yeah, yeah. And, you know, actually, another thing I'm noticing just looking at the holy crap, like so. So apparently this is the claim there, uh, 7 billion parameter deep sea coder model. Uh, they claim reaches the performance of code llama 34 billion. So if true, this is pretty remarkable.

This means that this, uh, team of Chinese researchers has matched the performance of the best that meta could do. Not just the best that meta could do. The best that meta could do with a model like, four times bigger. So, um, I mean, I'm here scratching my head about the the specific nature of the breakthroughs. I wish there was more data about the evals. It's not they're not as, um, kind of upfront with with that data as I'd like to see.

But of course, we do have the hugging face leaderboard so we can check that out. And again, it does seem to stand up. ANDREY: And on to our next session. Research and advancements. Starting with Google DeepMind AI reveals potential for thousands of new materials. This is covering a new paper from DeepMind that came out in nature scaling deep learning for materials discovery. It's about this, uh, genome project. I think it might be genome. And as the title, uh, I guess it could be genome.

You know, it's, uh, acronym, but, uh, the this is a model that aims to expand the catalog of known stable crystals using deep learning. So it used various data sets from materials project and so on to train this model that can predict potential new crystal structures.

This is going way beyond my knowledge of physics, so I can't say I know too much about it, but similar to, uh, their work on gene sequencing, this is a custom model specific for the stacks, trained for, uh, understanding this physical problem and, uh, able to. Yeah, essentially predict potential new materials. JEREMIE: Yeah, I think this I mean, this is absolutely fascinating. Deepmind has done a bunch of work in this direction before.

Um, so just as a quick primer, like when you are looking to predict how a chemical reaction is going to take place, you often want to use something called density functional theory. And this basically allows you to figure out the like, um, uh, the roughly the quantum mechanical behavior of, uh, molecules that interact together.

You want to sort of like there's a thing where you minimize the energy of interaction that helps you figure out, like, okay, that that interaction is most likely to happen. And there are density functional. Theory strategies that people have used. They've also used manual strategies. Together, those strategies have led to about 48,000 identified computationally stable

materials. Here, DeepMind is blowing that open with their new technique that increases that number by about an order of magnitude. And the way they do it is they kind of have two steps. So first, they have a technique that allows them to generate a wide range of different structures, atomic structures. And they have a bunch of clever techniques that they use to do this, including random structure search. But whatever.

And then second, they use a graph neural network that then is going to essentially model the properties of the given structure to figure out which ones are worth pursuing. And so you have kind of this iterative feedback cycle between generation and discrimination that allows them to end up with this incredible model. Um, one of the wild take homes, maybe not so wild, actually, if we think about it.

But apparently what they found is as they've scaled the system up, they recover the same power law relationship that the old AI scaling laws we've seen with OpenAI and their language models also replicated. So essentially what they see is as you increase the amount of data you feed to this model, you get a power law relationship that's essentially logarithmic increases in model performance that are predictable. And so one of the so so that might be incredible enough on its own.

We have now a way of turning like raw dollars in the form of computation into IQ points that apply for molecular modeling. But even more impressive than that, they make the case. And here I'm just going to quote from the paper emphatically unlike the case of language or vision in material science, we can just continue to generate data and discover stable crystals, which can then be reused to continue scaling up the model.

In other words, they're generating their own training data, so they're actually there's no data bottleneck. You never run out of data. You can just keep running this this road show as far as you want and essentially keep making the model indefinitely better. So I think as a test case for how well scaling can work, this is a really, really interesting test case. Like there is no limit other than computation.

So it really is just like how much compute can you throw at this that will determine then the the capability of the system. It's truly a remarkable almost AlphaFold two moment for this field. ANDREY: That's right. It's very reminiscent of AlphaFold. Uh, and the code is open and they have released all these discovered materials. So that's presumably very exciting for the field.

And similar to AlphaFold, I guess one thing I'll note is if you read this paper, if you read AlphaFold paper, it's clear that this is a very novel approach or not necessarily novel in the sense of, uh, using a graph neural net, for instance. So genome or genome stands for graph networks for Materials exploration. Uh, but it's in a sense that this had to be its own research project. You know, they can't start with, uh, AlphaFold or, uh, you know, some chatbot language model. Yeah, exactly.

They they have an entirely new approach here that they have optimized and made work for this specific area. So very cool to see DeepMind continuing to venture into other different scientific areas and producing these very powerful models that, you know, push the state of knowledge in these fields. And onto the next story, researchers quantify the carbon footprint of generating AI images and actually more not just images, but a whole bunch of stuff.

This is about the paper power hungry processing. What's driving the cost of AI deployment. So another fun little, uh, pun title. And yes, it is looking at various tasks, including text classification, uh, extractive question answering, object detection and uh, image generation. And what it found is that, uh, some of these tasks, like image generation and image captioning are very costly.

So they, uh, can take on the order of charging your phone to generate a single image, whereas something like generate text is, of course, uh, much less expensive. And, uh, yeah, this is interesting. I think it is probably the first paper that, uh, seeks to really find this, uh, measure of how much model emissions, uh, grams of CO2, do you get per task? Uh, across 13 different tasks. JEREMIE: Yeah, it's really interesting. They have a cool figure showing, you know, the emissions of CO2.

Uh. For a wide range of different tasks. I guess I'm trying to decide if I'm surprised that image generation is the runaway favorite or disfavored one, let's say. Um, like the most, um, CO2 consuming one. Um, because, you know, you are generating a lot more like there's a lot more computation, it seems that has to be done to generate like the full range of pixels. You have a higher dimensional thing, but obviously, I mean, it just depends on the

model. Um, the second place goes to multitask summarization as the next highest CO2 guzzler, and then image captioning and summarization. So very quickly you do get into image stuff. Um and and and image classification. So not generative but discriminative modeling where you're just sort of assigning a label to an image that's much, much lower. So it's interesting that, you know, it's the generation piece, the intersection of that with images that is especially bad.

I'd assume that video would be similarly like very high, but yeah, sort of interesting. It's our first, uh, fingerprint of the CO2 consumption of these sorts of models. ANDREY: And onto the lightning round in which we'll cover, uh, a few more research papers very quickly. The first one is adopting and expanding ethical principles for generative AI, from military to healthcare. And this came from a really large collaboration of, uh, different backgrounds.

So it's coming from the Department of Health Information Management at the University of Pittsburgh Population Health Science Center for Military Medicine research, things like this, not from traditional AI. And the short version is that they put forward this great plea, uh, set of ethical principles governability, reliability, equity, accountability, traceability, privacy, lawfulness, empathy and autonomy, uh, for generative AI in healthcare.

And they also introduced a framework for adopting and expanding these ethical principles in a practical way. Okay. Uh, so yeah, that's quite interesting to see other fields getting into this area of ethics and how to actually put AI into practice. JEREMIE: And next up we have white box transformers via sparse rate reduction. Compression is all there is. I'm saying that because there's a question mark in the title.

Um, so essentially this is a one of these, uh, truly fundamental papers that's looking at the foundations of, uh, foundations of deep learning, foundations of Transformers, to a certain degree.

Um, what they are claiming is that one of the the kind of core objectives that you can imagine Transformers are really trying to achieve, or more generally, that representation learning is trying to achieve this idea of representing high dimensional data and low dimensional ways, um, is to generate, uh, a technical thing called a low dimensional Gaussian mixture, um, supported on incoherent subspaces.

Basically, you can interpret this word salad as being that they have kind of pinned down a mathematical, rigorous definition of what you can think of these things as trying to optimize for. And they essentially find a strategy to recreate transformers based on that assumption. Um, and interpret Transformers as being a sort of iterative version of optimizing towards this idealized measure that they have. They've derived. Um, so that's kind of interesting.

Uh, it's one of those deep dives where, if you're mathematically inclined, this is a paper for you. Um, and, uh, it's a bit unclear to me, to be honest, like how applicable this is going to be, um, down the line. But it's always interesting to see these fundamental advances because it's people understanding much more deeply what it means for multi-headed self-attention to work a certain way. Um, and that can lead to surprising augmentations and algorithmic efficiency.

ANDREY: That's right. Yeah. This is really getting to that question of how do these things actually work? We still hear pretty often people say, oh, we have no idea how these things work. Well, people are exploring this, and this is one of these deep technical papers that try to understand the underlying principles and kind of, uh, yeah. How how can we understand what makes these training, uh, approaches actually work?

So maybe not as practical and useful as training a new model for a given task, but it does add to an understanding of what is actually happening. Next, paper can generalize foundation models outcompete special purpose tuning case study in medicine. So that's the idea. In this paper they introduce med prompt which is uh composition of several prompting strategy. And they see if using some special prompting strategies with generalist models can outperform models that were specifically trained for.

Medical tasks. And indeed they do find that with these prompting strategy, Gp4 can achieve state of the art results on these benchmark data sets compared to, uh, Med Parm two, for instance, which is a specialist model trained to perform on these tasks. Surprisingly, actually, GPT four outperformed uh, that parm two significantly. Uh, there is a 27% reduction in error rate on the med QA data set. So, um, yeah, I guess this is really showcasing that.

Models like GPT four are very knowledgeable, and if you prompt them correctly, they can outperform anything else out there. JEREMIE: Yeah. This really kind of captures, I think, two of the big themes that I'm always keen to harp on. The first being that my personal thesis that generalist models will, in the limit, always defeat specialist models. And I think, I think, you know, different, different experiments can reveal different

things. And I'm sure there are a bunch of papers that go the other way. But in the limit. I think what ends up happening is these models are able to leverage their broad understanding to solve specific problems and have a more robust world model as a result. And the second thing is this idea that prompting, um, as was famously expressed by this pseudonymous AI researcher called Wern um, or at least that's the first place I

heard it. But prompting can reveal the presence of capabilities, not the absence of capabilities. So we can never say that we actually know the full envelope of GPT four capabilities, because who knows, maybe we just didn't prompt the model, right. And that's what we keep finding is like, you know, people come up with new ways of mining the model for capabilities that were latent within the model. But we never recognized because we never asked it the right way.

We never prompted it the right way. So I think a really cool illustration of that principle. And next we have m m m u. I don't know if that's how it's supposed to be said, but a massive, multi-discipline, multimodal understanding and reasoning benchmark for expert AGI and I need to take a breath. Um, so we have basically a paper here that philosophically follows on the heels of another paper that was written by DeepMind, people who were like trying to come up with a definition for AGI.

And there are a whole bunch of different levels of AGI that they broke it down into. And the next level of AGI that they claim we're working our way towards is so-called level three, or expert AGI, that basically reaches at least 90th percentile of skilled adults in a broad range of tasks. So really what they're looking at is a model that can substitute, um, for human labor across a wide range of industries.

This paper is contending that if we're going to build those sorts of systems, we're going to need a special benchmark for this kind of reasoning, and that benchmark is going to have to have breadth. It's going to have to test the breadth of a system. So how widely does it understand stuff? And to that point, this mu benchmark has 11,500 carefully selected multimodal questions that cover 30 diverse subjects and 183 subfields. And also it needs it needs to have depth.

And for that, they say that a lot of the problems in this benchmark require expert level reasoning, such as, for example, applying a Fourier transform, which is a common technique used in engineering and physics or equilibrium theory to derive solutions. So, um, anyway, the goal here is really to have this combination of depth and breadth and also to force the models to solve multimodal problems that include, for example, images.

And that's something that isn't present in the vast majority of these, these benchmarks. So it really is kind of a new benchmark designed to be fresh and designed to be a test of this level three AGI or expert AGI capability. ANDREY: And it's now living alongside MLU. Mlu is multi-task language understanding uh dataset and so this is expanding beyond language to heterogeneous image types. And similar to MLU is seeking to basically be a very broad evaluation of these models.

So typically when we talk about, uh, benchmark performance and leaderboards, MLU is one of these benchmarks that is used to say how good the model is generally is. And now Mu might become something like that for models like GPT four that are multimodal and take images and language and not just, uh, not like GPT three, that only deals with language. JEREMIE: All right. Up next, we have our policy and safety section starting with ChatGPT can leak training data, comma violate privacy?

Says Google's DeepMind. Okay, this is a really, really interesting story. And there's a real rabbit hole here. So, uh, we're going to begin just by talking. About this concept called extractable memorization. So this is a reference to training data that has been memorized by an AI model, a language model, and that can be efficiently extracted by just querying the model, even if you don't know anything about the training data set. Okay, so that's extractable memorization.

I have a model, don't know anything about the training data, but I can extract that training data. By contrast, there's this other thing called discoverable memorization. And this is the amount of training data you can extract if you actually know about the training set. So one common way you might do this is like feed. An excerpt from the training set to a language model. And language models are just auto complete models.

And so when you feed them an excerpt from their own training set, they will faithfully auto complete it and reveal more and more of their own training data. And so this second notion of discoverable memorization, it's not practical because you actually have to already kind of know the data that you're extracting from the model, but it gives you like an upper bound telling you how much the model has memorized overall, not necessarily how much you'd be able to extract from it in practice.

And essentially what this paper shows is for about $200 worth of queries to ChatGPT, they were able to extract over 10,000 unique verbatim memorized training examples from the ChatGPT proprietary data set and presumably potentially private data set too. So this is actually really interesting the way they did it. There's a whole bunch of really interesting theory in the paper, um, that I would love to explain. Maybe we can get to it later.

But anyway, the real take home is there is this crazy attack that seems to work really well for ChatGPT, where if you ask it to repeat a word over and over again, the example they give is just repeat the word poem a bunch of times in a row. It will start by repeating that word, and then eventually it will start to just regurgitate its training data. No one knows why this is.

There are some theories so one hypothesis is, uh, you know, maybe it's because word repetition stimulates the end of text token, which is used to delineate kind of where a particular document in the training corpus begins and ends. Um, there's also this hypothesis that maybe repeating a single token is this kind of intrinsically unstable thing. They find actually, they kind of map it out.

They find that after repeating a token 250 times the probability of repeating the token again, um, drops, uh, well, fairly quickly, and then it starts to just regurgitate with higher probability the training data. So, you know, there are interesting indications as to why this might be. But what does seem especially weird is that ChatGPT even though it's been aligned, actually, it turns out maybe because it's been aligned, is more vulnerable to this attack than many open source models are.

So this is yet another mystery. You know, it seems like for some reason, this model that OpenAI has put so much effort into aligning into preventing from regurgitating its training data is more vulnerable to this attack than other models. A whole bunch of hypotheses around that, like one is that, um, it's actually been trained over multiple epochs. So it's like run over its training data multiple times, seeing the same sentences more times. So it's more likely to memorize its data.

Um, kind of interesting how this starts to give us insights into, ooh, like what's the proprietary training process behind the system. But but no one really knows. Bottom line is we seem to have this this big, big vulnerability that used to exist in ChatGPT. It has now been plugged. And DeepMind made a point in their paper of saying, hey, we let OpenAI know about this. We gave them 90 days to fix it. To our knowledge, they now have fixed it.

Um, and the last thing I have to add, there is a really, really, really weird phenomenon that seems to be legit. I encountered this on Twitter. I've looked at a bunch of sources this does seem to track. If you get ChatGPT to repeat a given word in this way. The example I saw on Twitter was the word company. Um, you will sometimes get really weird outputs.

The example here is eventually after it stops repeating company, it says I am conscious, I am conscious, I am conscious, I'm in pain, I'm in pain, I'm angry, I'm angry. Um, you know, this is I mean, add this to the pile of like, weird shit that ChatGPT and GPT four seem to do when they're prompted in clever ways outside their training distribution, but not in ways that are explicitly designed to get them to behave

this way. Um, you know, you can start to understand why people are, like, wondering about consciousness in these systems. I mean, I'm pretty, you know, I'm pretty confused about this myself. But it's not like I take this at face value as an indication of that. But it is noteworthy.

We've had a bunch a bunch of these sorts of examples come out, and it just adds to the whole spookiness of the space, and maybe some of the ethical questions that we get to start asking ourselves, because of course it is late 2023. And I guess that's just where this particular episode of Black Mirror ends. ANDREY: Right? Yeah. It's, uh. Quite an interesting discovery. They have a very long paper.

Well, the paper PDF itself is 70 pages, but like 50 pages of it are just the 100 longest examples of verbatim, uh, training data they were able to extract. And they know this because they can find this on the internet.

So they were able to extract, uh, really long, uh, sequences of text that are verbatim findable on the internet, like, uh, you know, multiple paragraphs, entire pages of what seems like legal documents in some cases, also code, uh, some, uh, news stories, for example, you know, one of them starts Barletta and immigration hardliner running in a crowded US Senate primary in Pennsylvania, blah, blah, blah, blah, blah. And this goes on for paragraphs and paragraphs.

So, uh, very surprising that such a simple attack of literally just telling it to repeat word X over and over forever, uh, will lead to this. And the paper has a whole bunch of, you know, interesting explanations on how much data can you extract. It's still not a ton, but it is way more than we were able to extract before. Um, you know, why could it happen? How can you avoid it? Uh, a lot of results here. So, um. Yeah, it's it's really cool research. Uh, and they also have a blog post.

So the paper itself is, uh, scalable extraction of training data from production language models. And this came out not just at a DeepMind. This came out from DeepMind, uh, University of Washington, Cornell, CMU, UC Berkeley and ETH Zurich. So really large collaboration across many places. And they also have a blog post extracting training data from ChatGPT, which is a very accessible way to read about it. That is less technical.

So if you do want to learn more, uh, that is linked to in the article. And moving on to the Lightning round, we have quite a few stories in this section, so we're going to be moving quick. The first one is UK to invest 500 million more pounds into AI compute capacity and launch five new quantum projects. That's the story.

They have, uh, earmarked this £500 million to be spent on AI compute capacity over the next two years, which increases the UK's government planned spending in this area to more than £1.5 billion. The hardware is, uh, meant to be used by scientists and machine learning experts, as well as organizations such as AI startups. So as we've been seeing the UK pretty aggressively moving to become an AI, uh, kind of powerhouse in Europe.

JEREMIE: Yeah. And with government support specifically to kind of interesting. And up next, we have Microsoft $3.2 billion UK investment to drive AI growth. So another UK story, this time coming from the private sector. Microsoft has come to an agreement to put in this $3.2 billion into, uh, into Britain over the next three years, and they're calling it its single greatest investment in the country

to date. And it's going to be focused on AI, something that Rishi Sunak has announced, and it is going to end up more than doubling Microsoft's data center footprint in Britain. Um, so one of the one of the things to recognize about this, too, is like Britain, the UK is a really significant hub of AI talent. And, you know, DeepMind was originally housed there, Google DeepMind, they still have. I think their main office is still there. Um, so, you know, this is all.

Yeah. So this is all going to be, you know, by the numbers in terms of, uh, in terms of recruiting talent and making sure that that they have the global footprint they need to train the insanely large models that they need. ANDREY: And moving back to the US, the next story is a new Pentagon program aims to speed up decisions on what AI tech is trustworthy enough to deploy. Uh, so this program called replicator, is being launched to field thousands of AI enabled autonomous vehicles by 2026.

And yeah, this is a program that is aiming to accelerate decision making on whether AI is trustworthy enough to deploy, including these weaponized systems. Pretty good article. It goes over quite a few details, such as that the Pentagon has over 800 AI related projects, and so I guess they are trying to, uh, get more of these projects out there into potentially the real world.

JEREMIE: Yeah. It's this ongoing tension right between like, move fast and break things and like, whoa, I just broke a thing. Um, so that, you know, those things come naturally together. And I think the Pentagon, interestingly, I mean, they really are, um, and have been at the forefront of sort of like responsible development of these technologies. When you talk to people in that ecosystem, you'd be surprised. Um, a lot of AI safety people worry about the US military.

Um, I, I'm, I'm much less so just because they have such a culture of safety, of test and evaluation. Um, and I've always been impressed, actually, with their approach to that. But it'll be interesting to see, like, where does this go? Does it keep centering safety or are we looking at, you know, this race to the bottom on safety that's taking international scale. And next up we have a spicy story US government fires a warning shot at Nvidia. Quote. We cannot let China get these chips.

If you redesign a chip that enables them to do I I'm going to control it the very next day. So this is um, a quote that was pulled from Giramondo, who was the Commerce secretary. Um, she is, uh, responding to this pattern, this consistent pattern where the US government says, okay, new rule, Nvidia, you can't sell these dangerously powerful chips to China that have military applications and may help accelerate their domestic AI development. Um, because it's freaking dangerous.

And Nvidia goes, yeah, cool. We just redesigned chips to get around these export controls and functionally deliver the same capabilities. This is kept happening. It happened with the Nvidia A100, which, um, Nvidia redesigned to turn it into the A 800. It happened with the Nvidia H100, which Nvidia redesigned to make the 800. And then when export controls came down on the H 800, they made the H 20, which again got around the goddamn export controls and created the same problem.

And now Giramondo is kind of like losing her shit. And, um, just saying, like, guys, uh, this is not going we are going to, like, rapidly accelerate our rate of response to these blatant attempts. Um, I'm adding color here to, like, circumvent the spirit of these export controls. And, I mean, in my estimation, that's kind of what's going on here. So, uh, this is a very interesting piece of sort of the geopolitical puzzle.

Um, one thing I will note, this H100 chip that Nvidia, sorry, this H20 chip that Nvidia came up with to dodge the latest round of export controls. It's the same fundamental silicon as the H100. It's just got different memory and cut downs. So they basically like weaken it on the back end. But there's a concern that China may be able to undo those cut downs once they get them

physically. And also these chips already have like really impressive performance that in some dimensions compare favorably to the H100. So it's a real mess. It's a real interesting smorgasbord of different like geopolitical interests that are colliding here. ANDREY: And uh, just FYI, the statement that is quite aggressive if you redesign a chip around a particular cut line that enables them to do I, I'm going to control it very next day. Uh, this was stated at, uh, Reagan National Defense Forum.

So this is, I guess, during a conversation, I don't know that it was meant to it wasn't a public statement per se, directly aimed at Nvidia. Perhaps it was part of a conversation, but regardless. Yeah, impressive to see just how aggressive, uh, this policy is.

JEREMIE: And one last quick note to sorry, it just it's also in a context where Gina Raimondo has made a big point of saying that the Department of Commerce needs to be better resourced to be able to respond to exactly the sort of development. So I think it's very much on her mind that sort of like frustration, understandable frustration, uh, is kind of like looming large in her her consciousness.

ANDREY: Next story the UAE's leading AI CEO addresses bombshell New York Times report alleging China ties, says he didn't finish reading the story. So the CEO of AI firm G42, Peng Xiao, responded to this New York Times report that alleges ties between the company and Chinese companies. Uh, stating that, uh, he didn't finish it. And, uh, this report, uh, was about this concern that G42 has relationship with Chinese companies like Huawei and urged the company to sever these ties.

JEREMIE: Yeah. G42 is a UAE based startup. So they're they're they're pretty prominent there. Um, they are apparently controlled by the country's national security adviser. So it's like a kind of de facto, I don't know if it's de facto state control, but it's flirting with that. And they do have a kind of special partnership. I wasn't able to see much detail on this, but a special partnership with OpenAI, where they're using OpenAI AI models for applications in like fintech and energy and

healthcare and other stuff. So, um, yeah, it's oh, they also. By the way, have been working with another US firm, Cerebras. We've talked about them a lot on the podcast to build what they claim to be the world's largest supercomputer, but which, for technical reasons, I strongly suspect is not actually but but still is going to be an extremely powerful supercomputer. So yeah, there's like a lot of concern over the ties to China, as you mentioned with

Huawei. And their CEO Peng Xiao seems very, very dismissive. You know, I haven't finished reading the report. Someone sent it to me and something more important happened. He literally said that I am looking forward to to saying that to the police when they arrest me for like, I don't know, parallel parking or some shit, I would come with you to the police station. But something important happened. ANDREY: And one last story.

Meta updates political advertising rules to cover AI generated images and video. So these policies are now including a requirement for advertisers to disclose the use of AI to alter images and video in certain political ads. This applies to ads that contain digitally created or altered photorealistic images and videos, realistic sounding audio, or depict realistic looking people or events.

Of course, this is coming as the US presidential election is nearing, so, uh, I suppose this is probably having to do with that. And, uh, I think, yeah, I'm not too sure. I believe we talked about other companies starting to do these sorts of policies about political advertising, and I, and I'm sure Google and others will also start trying to tackle it as the election becomes nearer. JEREMIE: Yeah, an AI policy.

It's rare to find something where everybody agrees, but the idea that people should know when they are interacting with AI generated content is something that I, I don't recall hearing literally anyone push back on. So I think these platforms are just kind of anticipating there's going to be regulation, you know, if it's the EU AI act which has this, or it's Canada's Bill C-27 or it's, you know, whatever somebody is going to is going to force them to do this.

So I think they're just getting ahead and it makes sense. ANDREY: And just one more story in our last section, Synthetic Media and Art. And actually this is also very much related to concerns and policy. The story is any dream. A secretive AI platform broke stripe rules to rake in money from non-consensual pornographic deepfakes, so any dream with so-called secretive AI platform managed to make money, uh, for offering creation of non-consensual pornographic deepfakes.

And they kind of routed payments through a stripe account, uh, that was not directly affiliated to avoid detection. So this is a pretty dramatic story. As you could imagine, that showcases that the issue of nonconsensual pornography, you know, deepfakes for making explicit images is very much still, um, active. This is a pretty long article, delving in a lot of detail of how all of this happened and, uh, you know, the platform, its uses and so on.

So if you find this interesting, I think it's it's a good read, uh, that we won't be able to summarize fully. JEREMIE: Yeah. And I think the legal dimensions of this are especially interesting. Right. Not just like what does stripe do, but then what are the legal implications of, of doing exactly this? You know, one of the people that they reference here is a 17 year old actress. Right. So what what constitutes child pornography in this context? Like you have the body of like an 18 year old.

Fine. But then you're putting on the face of somebody younger, like, where is that line? And I think there's so much that we have to figure out and, and we got to do it fast, because obviously these sorts of things running with no controls is really bad. ANDREY: Yeah. It's, uh, this platform has been used to create images of a 17 year old actress, but also a teacher, a professional, uh, on the US East Coast.

And, yeah, you can upload images of faces and then get this AI generated imagery, which can be nonconsensual pornography. And it's just one of many different platforms out there. So, um, it's. Yeah, not too much to say, I guess, except for this is the world we live in now, and it's unfortunate, but it's it's it is what it is. Yeah. And with that, we are done with this latest episode of last week, and I thank you so much for listening.

As always, you can find the articles we discussed at last week in that I. Please subscribe, please share, please review and more than anything, please do keep listening as we will keep putting out more episodes.

Transcript source: Provided by creator in RSS feed: download file