Hello and welcome to the Skynet Today's Last Week in AI podcast, where you can hear us chat about what's going on with AI as usual. In this episode, we will summarize and discuss some of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for articles that we did not cover in this episode. I am one of your hosts, Andrey Kurenkov. I finished my PhD focused on AI at Stanford earlier this year, and I now work at a generative AI startup.
And this week, Jeremy is not around. He is off talking to politicians somewhere, I guess. So we have a guest host.
Hey, I'm Daniel Bashir. I am a machine learning compiler engineer at Amazon Web Services. I also co-run another publication, a good friend of Last Week in AI, called The Gradient. Um, that's me.
Yes, that's right. We've had Daniel on before. He, uh, records The Gradient podcast. So he interviews a ton of people in AI. It must be, what, like 80 people now at least, right?
Quite a few. Yeah. We just dropped episode 106. If you're interested in, uh, and this will be a lot for you, I'm sure, the philosophy of language and, uh, propositional attitudes, I have a two-hour, ten-minute conversation with a professor at UT Austin about this that came out Thursday.
Right? Yeah. That is, uh, definitely going deep. And I guess that's generally true for The Gradient. It's another project I'm also involved in. It's, uh, a digital publication and also a newsletter and a podcast. And here, on Last Week in AI, we cover broadly kind of everything, whereas on The Gradient you really go deep and get
pretty technical. So not for everyone, but if you like going deep on, you know, certain topics and really getting into the weeds of stuff, then you might want to check it out. And before we get into discussing the news, I just want to have a quick shout-out to a bunch of new reviews on Apple Podcasts. Uh, I guess people heard our appreciation last time, and so we got like six new ones. Uh, which is great. Yeah, we really love seeing it.
A couple of you mentioned that we recorded while we were sick last time, so it's cool to hear that, uh, you know, that inspires you, or that you think it shows we're committed to this, which I guess we are. And, uh, yeah, thank you, everyone, for the reviews. One of you mentioned, uh, wanting longer segments on arts and entertainment with AI, and actually, we'll have some new stories on that this week. So that will be, I guess, uh, nice for you to hear.
All righty. Kicking off our first section, tools and apps, with OpenAI's custom GPT store, which is now open for business. So this, I guess, would be the news story of the week: OpenAI has opened their store for custom chatbots. Uh, this is after the GPT, uh, builder program, which was announced in November, was launched, and there have now been 3 million bots created by users. So originally this store was supposed to launch earlier, but it has now launched.
So essentially, instead of just chatting with one ChatGPT, you can now chat with all these various GPTs, right? Customized versions of ChatGPT that users on the platform can create. And this is now available to ChatGPT Plus users, enterprise users, and to this new, uh, subscription tier that we'll cover next after this.
Yeah, it generally sounds really exciting, and I think that the idea of getting to chat with GPTs other people are creating seems really exciting. I think that some of us are going to have areas of expertise that others just don't have, or are willing to put in the work to create a certain sort of GPT. But it is definitely interesting to note that they do have a review system in place for these custom GPTs, so they want to make sure they meet brand guidelines and usage policies.
There's also a reporting system, and kind of seeing a little bit around the Twitterverse, I've seen at least a few people who have made custom GPTs that ultimately got taken down and seem pretty unhappy about it. I'll be honest, I haven't looked too deeply into the guidelines just yet, so I'm not entirely sure if they're being applied consistently, but it's, um, interesting to see the response right now.
Right? Yeah. And, uh, it'll be interesting to see how this grows, because you can use ChatGPT for a lot of stuff as is. Right. Uh, and these customized ChatGPTs are not too different. I think you customize them similarly to other chatbot platforms: you kind of prompt them, give them some examples. So it's not a ton of work to create a custom GPT, but it does seem like they might actually train on interactions.
So these customized GPTs will diverge from the kind of main ChatGPT over time, potentially, as people use them, which would mean that this will be a true repository of, I guess, you know, thousands or hundreds of thousands of different chatbots trained on different data and different interactions. So, uh, yeah, seems like probably a big deal, really.
Yeah, that's pretty valuable, I think, especially for people who are trying to build businesses on top of these and are maybe not super happy with what GPT-3.5 right now, for example, is offering them, and maybe it's a little bit too broad, or the kind of trade-offs that are implicit in using it just don't really work for their use case. And I've heard at least a few people complain about this sort of thing before.
So I'm curious if the GPT store is going to really change the game in that regard.
That's right. And, uh, it seems like at some point there will also be monetization for the creators of these chatbots. So it'll be a whole kind of platform for chatbots, similar to something like Character.AI, that allows you to create your own chatbots to just chat with. Services are going
in that direction very much. And already, when you go to the store page, you can kind of browse, um, different applications for image generation, for writing, productivity, programming, education, etc., etc., etc. So if you're a fan of ChatGPT now, you might want to go and look if there's a customized version for your needs, I guess, or create one for yourself. And onto the second branch of OpenAI developments of this week.
The next story is on how OpenAI has released a new way to subscribe to ChatGPT aimed at small teams, and this is, uh, called ChatGPT Team. It is kind of in between a single user and enterprise, seemingly. This is, uh, a workspace for teams of up to 149 people, and it introduces admin tools for team management and, uh, you know, all the usual access to all the tools, and also has a guarantee that they will not be, uh, using your data to train, similar to the enterprise tier.
And this is priced at $30 per user per month, uh, for monthly billing, or $25 per user per month if billed annually. So a bit more than the standard ChatGPT Plus if you are a single user. But yeah, it's interesting that I guess they're expanding their offerings to now cover small businesses, seemingly.
Yeah, I guess I was just kind of noting at the end of our discussion of the last story what the GPT store could offer businesses. And this is even more in the realm of small and medium-sized businesses. Maybe very small tech startups right now are pretty interested in the differentiation that could kind of happen with training their own GPT, or achieving the equivalent effect through some other mechanism. And again, that's like really, really hard.
And I think that for many of the things that businesses want to do, there just aren't a lot of good solutions out there. And there's still a lot of research problems that need to be solved, it feels like. But it does seem that OpenAI is still targeting this market in a pretty important way.
Yeah. And it's also, I think, in a way interesting that this is now similar to Google's G Suite and Microsoft's Copilot thing, where everyone now has a program where you can pay $30 per month per user, or 25, in this case, if you pay for a whole year. Um, so everyone is going after the enterprise and now also small businesses in this case.
For our first lightning round, we'll start with a story about something called the Rabbit R1. There's an AI startup called Rabbit out there that has launched a standalone AI device. It's priced at $199, and it's an AI-powered gadget that can actually use your apps for you. It's about half the size of an iPhone, with a 2.8-inch touchscreen, a rotating camera, a scroll wheel for navigation, and a 2.3GHz MediaTek processor, along with four gigs of memory and 128GB
of storage. It runs on Rabbit's own operating system called Rabbit OS, which is based on what they call a large action model. This acts as a universal controller for apps: it can control music, order cars, buy groceries, send messages, and do more through a single, uh, through a single interface. And again, this is a pretty interesting move. I think that there are a lot of people who are really interested in building more agentic products and things based off of large language
models right now. So it's very interesting to see Rabbit actually kind of come out there with a device that is looking to serve this sort of need.
It's, uh, pretty neat looking, I guess. If you go to the article, and as always, we'll have links here, uh, it looks like a little kind of square with a screen and a camera and a scroller thingy. It's, uh, actually not too clear, so it's, uh, most similar to the AI Pin that we have covered before, in that it is sort of an AI-first device that is meant to be a sort of hardware smart assistant, a device that can potentially augment or replace your smartphone, with AI built in.
It was just announced, and I guess a lot of people on the Twitterverse and elsewhere got hyped about it. The initial 10,000 units sold out already, so we'll see. Yeah, it's just been announced. I don't know when the units will even come out, but people seem pretty hyped.
Our next story is about one of the tech giants. Amazon's Alexa has gotten some new generative AI-powered experiences. These are developed by Character.AI, Splash, and Volley, and they're all available in the Amazon Alexa Skill Store. I promise I'm not marketing for Amazon. I'm just telling you what's happening here. Uh, for some of the examples here, Character.AI's experience allows Alexa users to have real-time conversations with
different personas. These include fictional characters and historical figures. You might have seen that Meta recently launched this sort of thing as well in their Messenger app, so this seems to be the sort of experience that a lot of companies, especially in the social space, seem to want to build around. Splash launched a free Alexa skill. This enables users to create songs using their voice. They can choose a musical genre, add lyrics, and either rap or sing along.
Perhaps good if you're kind of interested in creating music but have no capacity for actually composing stuff, like me. Volley has introduced a generative AI-powered 20 questions game. This uses AI to interact with users by asking questions, providing hints, and answering yes-or-no questions.
Seems like a bit of a no-brainer. And, uh, maybe a good strategy for Amazon to partner with already-established, uh, other companies like Character.AI, which we've covered as being very popular. You can't talk to all the characters, uh, it seems; you can talk to, like, Elon Musk or William Shakespeare, but for others you'd still have to actually go to Character.AI. But a lot of them are now available on Alexa. So yeah, now if you have one, you can play around with some fun AI stuff. And one last story for this
section. Google is working on an advanced version of Bard that you will pay for. In that story, it's supposedly going to be called Bard Advanced, and this will be something you pay for through Google One. This will presumably be powered by Gemini Ultra, the, uh, version of Gemini, their flagship model that's akin to ChatGPT, that, uh, is yet to be out. So yeah, not too surprising here, I guess. Something we probably all expected, but, uh, it will be very interesting to see when this does come out,
if it will measure up to GPT-4 and other, I guess, paid-tier chatbots.
Everybody right now has been talking about how it feels like Google is really sleeping in the AI game when it comes to shipping advanced chatbots, how long it took them to finally get Gemini out. And it'll be interesting, because Google, kind of as an incumbent, has certain sorts of natural advantages. They have distribution and things like this.
So the question I guess for Google is will they be able to deliver something that is enough of an improvement over everything else and is distributed in the right way so that they can recover some of that market share from the competitors? I think that's a really big question for them right now.
And on to the next section, applications and business, starting with one of our favorite topics in business: hardware and Nvidia. And the story is that Nvidia's newest chips are designed to run AI at home and are probably going to compete with Intel and AMD. Nvidia announced three new graphics cards, the RTX 4070 Super, RTX 4070 Ti Super, and RTX 4080 Super, all priced between $600 and $1,000.
Relatively cheap relative to the high-end GPUs people use for AI training and things like that. And these will have tensor cores for running generative AI applications. So this is kind of moving away from the business, enterprise level of GPUs that cost tens of thousands of dollars each, towards more of a consumer bent. And as we've covered, AMD and Intel have both had some, uh, hardware announcements of their own aimed more at runtime, at
inference, not training. And so with these announcements, Nvidia is, uh, moving into that category as well.
This is pretty big. They are really jumping into, again, taking market share here in something that's going to be pretty important going forward, as Andrey just kind of pointed out. This is, again, not on the side of training; individuals like you and me might not be training our models, but the games we play, the programs we use, Photoshop, for instance, are more and more going to start integrating generative AI features.
I think as soon as GPT-3 came out, for instance, we already saw people experimenting with using it to generate dialogue for characters. Um, and also the chips can be used for tasks like generating images with Photoshop's Firefly generator, or removing backgrounds in video calls. Um, and Nvidia is also developing tools for integrating generative AI into games. So this is something that's going to be pretty huge going forward, and I'm not at all surprised to see Nvidia jumping onto that.
Seems like, uh, as always, hardware is where the money is at, or has been in large part so far. So Nvidia, uh, still might be the reigning giant. Uh, we'll have to see if Intel and AMD do manage to make a dent, uh, in the fray with these new chips.
Our next story kind of ties in naturally to what we were talking about with games. So Valve has recently updated its rules for game developers publishing AI-based games on Steam, which requires them to actually disclose when their games use AI technology. And this is really a move that is coming,
I think, in a lot of ways right now. This isn't just happening in games, but in the case of Valve, this is aiming to increase transparency around the use of AI in games, protect against the risks of AI-generated content, and allow customers to make informed decisions about purchasing AI-based games. Um, these rules are coming after a lot of developers complained that Valve was rejecting game submissions containing AI-generated assets due to copyright concerns.
And so again, this is just a case where having that transparency is going to be pretty important. There are a lot of things people will run into when they're developing AI generated media, where they come into conflict with things like copyright and having some knowledge about what's going on, that there is a generative AI system being used, and probably some of the technical details of that system are going to be pretty important to both understanding and mitigating these concerns.
It seems that so far, the policy has been to basically reject, uh, games that use AI, and this explicit update to the policies in Valve's blog post essentially opens the floodgates, so to speak, it seems. So now you are officially allowed to use AI, you just have to disclose it. Pretty much necessary for Valve, uh, given that probably very many games are being submitted with AI-generated content, uh, and they probably don't even know
in many cases, uh, because it's not like you can necessarily tell. So yeah, interesting to see in the gaming space, which is of course huge, with asset generation being a huge kind of application of AI. Steam, for those who don't know, I guess we should probably mention, is a major marketplace for video games. So if you want to buy a game on a PC or Mac or any kind of non-, uh, console platform, you would usually go to Steam. Uh, so it's a huge deal for them to allow it.
It's kind of like, I don't know, YouTube allowing AI in their videos, so to speak.
Onto the lightning round. So not too long ago we were talking about some of the big questions for Google as it tries to push out generative AI systems. And our next story is about an AI-powered search engine called Perplexity that really wants to make Google dance. They recently raised $73.6 million in a funding round. This is led by IVP, with participation from NEA, Databricks Ventures, and Nvidia and Jeff Bezos, among others.
Those are some pretty big names in the investment landscape, so this is like a pretty serious fundraise. The round values the company at $520 million post-money. That's a lot of money. Uh, so Perplexity was founded in August 2022 by a couple of engineers with backgrounds in AI, distributed systems, search engines, databases, basically all the stuff you need to put together to create a search engine like this.
And Perplexity offers a chatbot-like interface that allows users to ask questions in natural language and responds with a summary containing source citations, which is again a really powerful alternative to something like Google. When ChatGPT came out, people were really excited about the fact that you don't have to go to Google and then look at like the top ten links to figure out the information you were looking for. You could just have it delivered to you.
And so when you develop a system that is capable of doing a lot of the job of that search for you, to deliver you the information and deliver it correctly with links and sources and all of that, you've got something really powerful, and it lowers the amount of work a user has to do to find what they're looking for.
So what they offer is quite similar to Google's Bard, for instance, where you ask a question of a chatbot and it provides an answer and also links to these, uh, articles, as you said. You.com is another version of this. Uh, and ChatGPT also can do it. So it seems like there's now a kind of bet on a search paradigm where you ask a question and it searches the web for you and provides you with a summary. And, uh, yeah, this is a big player in that space.
They claim to have 10 million active users, and they're now valued at over 500 million. So we'll have to see. But, uh, yeah, if you're looking to try out a tool, then Perplexity might be something you're interested in. And the next story is about self-driving cars and about how Waymo will start testing robotaxis on Phoenix highways. Waymo has been testing, and actually running, commercial services in several cities for a while now. They've been in Phoenix for a very long time.
They've been in San Francisco for a while, but this has always been in the city itself, on the streets; they've not been allowed to drive on highways. So this will be kind of expanding that to allow the cars to use them. And, uh, yeah, once they start testing, presumably, you know, sometime after, it will expand to allow that in the commercial offering as well.
It's pretty exciting, and it does seem like, at least in limited cases, they are going to deliver more and more advanced features. I actually grew up in Phoenix, and I remember the first time I ever saw a Waymo on a street was when I was visiting home for the summer for a break during college.
I think this was sophomore year or so, and we were driving home from Sky Harbor Airport, and I just saw random Waymos on the street, and I'm pretty sure that was one of the first times I'd ever seen a self-driving car actually going on the road. So it's pretty interesting, exciting, I think, to see how far they've come in that regard. So I'm definitely very curious to see how things work for them going forward.
Yeah, I personally use Waymo now in SF whenever I'm there, instead of Uber. So I've used it like 20 times now or something. And, uh, yeah, it's really good. I've never had any issues. So I personally am excited for it to use highways, because then maybe I could actually go from outside SF, from Palo Alto or Mountain View or wherever, up there. So this could be a big deal. And the next story is on stock photos. The story is that Getty and Nvidia are bringing generative AI to stock photos.
They are launching Generative AI by iStock, a text-to-image platform that they are going to allow you to create stock photos with. This is building on their previous AI image generation tool, but is designed for individual users rather than enterprise solutions, and this was done in collaboration with Nvidia, trained with Nvidia's Picasso model on Getty's library and, uh, also iStock's stock photo library.
So yeah, it's, uh, expanding, I guess the range, uh, of users that can create stock photos beyond just big businesses to small and medium sized businesses.
This is definitely one of those pretty obvious markets, I feel, that was kind of going to happen eventually, as so many of us, I mean, I think you and I, have had to use stock photos at times. So I'm not at all surprised to see Getty getting into this. Um, also, I guess, interesting is that contributors whose content was used to train the model can participate in a revenue-sharing program, which is a pretty important detail for something like this.
Right. This is, um, similar to Shutterstock, which also offers, uh, a service like this. And they are going to price it at $15 for 100 prompts, with each prompt generating four images. Compared to buying stock photos, where each stock photo usually costs a couple bucks at least to license, uh, this could be something a lot of people would like to use, I guess.
Next up, for research and advancements, our first story is something coming out of DeepMind getting back into robotics. They're developing multiple research projects to create robots that can make decisions faster and work in real-world scenarios. Again, this is a very, very hard problem in AI: getting robots to do things that are actually useful in a robust way. The first project is a system called AutoRT, which combines large foundation models with a robot control model.
This allows robots to gather training data in new environments and multitask. The goal is that it can simultaneously direct multiple robots to carry out diverse tasks in a range of settings. It's been tested in real-world evaluations over several months, and DeepMind has integrated a robot constitution into AutoRT to ensure that the robots follow specific safety rules, including, you guessed it, Isaac Asimov's Three Laws of Robotics.
This is, uh, pretty fun. There's also another system they developed called SARA-RT, Self-Adaptive Robust Attention for Robotics Transformers, to improve the efficiency of robotic transformer models. And you might have heard of the Robotics Transformer projects DeepMind has come out with recently. I think that the development of foundation models really offered some very interesting research directions for grounding the outputs of these models.
So, for example, putting a language model together with something like a robot arm and grounding the language model's suggestions for achieving a task, like, I want you to move this block from this part of the room to another part of the room, based off of what the robot arm could actually do. So lots of hard problems like this to explore, and it seems like DeepMind is going down that route again.
They have been working on this, you know, forever, kind of; uh, they've done robotics research, but this is covering a blog post in which they sort of bundled a few different things, as you mentioned. So there's the AutoRT research paper, where the full title is AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents. Uh, and then there is the, uh, self-adaptive robust attention as well. It's interesting; it kind of,
I guess, starts to highlight the growing amount of work they have in this direction and the, I guess, growing capabilities of AI models for robotics specifically. Right. So these are foundation models; as part of that AutoRT title, uh, they say embodied foundation models, models that are trained to really control robots. And the numbers here are pretty impressive.
So they say that they had AutoRT proposing instructions to over 20 robots across multiple buildings, and collected 77,000 real robot episodes via both teleoperation and autonomous robot policies, and that allows them to collect a lot of data and therefore train these models, uh, to, you know, continually, I guess, expand the amount of data and control more and more robots all over the place.
So it seems like a pretty exciting time for robotics, in terms of getting these models that allow you to do low-level control, you know, moving a robot to pick stuff up and then move around your environment. Now we also have these higher-level control things, like AutoRT, that are orchestrating and making decisions on what different robots should do and kind of doing the high-level
decision making. So once we get these foundation models trained, and DeepMind seems to be very much pushing on that front, then you could get to a point of general-purpose robotics. And, uh, within the next year or two, that is seeming more and more plausible given the pretty rapid advancements in that space.
Our next story is about a pretty recent paper that is called MoE-Mamba, and I'll talk about what this actually means for a little bit. So as you might be aware, Transformers are a really, really powerful architecture, but not the most efficient architecture in the world. When you feed a transformer a bunch of words, that's the context for the transformer to do inference and deliver words to you, to expand on that context and to generate text. It's actually pretty computationally expensive.
And that inference cost actually scales with the square of the length of the context you gave the transformer. So if you give it a hundred words, then you can think of the computation it takes as roughly 100 squared. Not the best explanation in the world, but it looks something like this. And that's not the most computationally feasible thing to do, especially when you scale to super long context lengths.
So one thing researchers are doing right now is they're trying to figure out how to, uh, work on transformers so as to mitigate that issue. But also, there is a line of research exploring something called state space models, which offer linear-time inference with respect to the context length. Again, that's much less expensive here. And they also have this pretty efficient training process via a hardware-aware design. Uh, state space models are pretty complicated math-wise.
They're inspired by a lot of things like control theory, but basically there have been a number of deep state space models introduced. Um, and they are currently, especially with the recent state space model called Mamba, apparently challenging the dominance of Transformers. So people are really looking into this research area. And despite the fact that all of the big architectures today are based on transformers, it's kind of another important line of research.
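To make the quadratic-versus-linear point concrete, here's a tiny back-of-the-envelope sketch, purely our own illustration and not something from the paper, of how the rough cost of self-attention compares with a linear-time state space scan as the context grows:

```python
# Rough illustration only: self-attention compares every token with every other
# token, so its cost grows roughly with the square of context length, while a
# state-space-model-style scan does a fixed amount of work per token (linear).
for context_length in (1_000, 10_000, 100_000):
    attention_cost = context_length ** 2   # ~quadratic: pairwise token interactions
    ssm_cost = context_length              # ~linear: one scan step per token
    print(f"context={context_length:>7,}  attention~{attention_cost:>16,}  ssm~{ssm_cost:>8,}")
```

So at 100,000 tokens of context, the pairwise-attention picture is on the order of ten billion interactions versus a hundred thousand scan steps, which is the gap Mamba-style models are trying to exploit.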
This paper, MoE-Mamba, combines the recent Mamba model with something called mixture of experts, which is a class of techniques that allows drastically increasing the number of parameters in a model without much impact on the number of operations required for the model's inference and training. Basically, again, making the model a lot more powerful without having to substantially increase the computational cost of running that model. So this paper basically combines these two techniques and claims that to unlock the potential of state space models for scaling, they should be combined with this mixture of experts technique. And so they showcase it on the recent Mamba model and find that it outperforms both the original Mamba model and Transformers with mixture of experts, achieving the same performance as Mamba in much fewer training steps while preserving the inference performance gains of Mamba against the Transformer.
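For a rough sense of what mixture of experts means mechanically, here's a minimal PyTorch sketch of top-1 expert routing. This is our own illustration of the general idea, not the MoE-Mamba implementation, and the layer sizes and expert count are made up:

```python
import torch
import torch.nn as nn

class Top1MoELayer(nn.Module):
    """Toy mixture-of-experts feed-forward layer (illustrative, not MoE-Mamba):
    a small router picks one expert per token, so total parameters grow with the
    number of experts while per-token compute stays close to a single expert."""

    def __init__(self, d_model: int = 512, d_ff: int = 2048, n_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model); each token is routed to its highest-scoring expert.
        expert_idx = self.router(x).argmax(dim=-1)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = expert(x[mask])  # only the selected tokens pass through expert i
        return out

# Example: 16 tokens of width 512 go in, each through exactly one of 8 experts.
tokens = torch.randn(16, 512)
print(Top1MoELayer()(tokens).shape)  # torch.Size([16, 512])
```

The point is that with eight experts you have roughly eight times the feed-forward parameters, but each token still only pays for one expert's forward pass, which is the efficiency trade the paper is leaning on when it swaps this kind of layer into a Mamba-based model.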
That's right. Yeah. So if you're a regular listener, you've heard us cover Mamba, you've heard us cover mixture of experts with Mixtral and some other things. So this is basically kind of gluing the two things together. If you look at the paper, it's not anything too, I guess, technically complex. Conceptually, they just add mixture of experts to Mamba, and the finding is that it seems to make it a lot more efficient. Uh, so these two promising techniques
are better together. Yeah, that's like an exciting, uh, finding, given that we cover, again and again, that it costs, you know, crazy amounts of money to train these models, millions of dollars. And here, Mamba is making it so the model itself is more efficient, and mixture of experts makes it so you can train with half the kind of computation to get the same performance,
seemingly. And that means that we could unlock a lot of efficiency, and potentially that would enable a lot of scaling, which would mean that we can make our models even better. So yeah, exciting times in the AI, you know, architecture, technical-detail space. For a while, for a couple of years, it was all transformers all the time, and nothing seemed to really, you know, cross that threshold of being good enough to replace Transformers. But we might be getting there.
Mhm.
And on to the lightning round, starting with PixArt-δ: Fast and Controllable Image Generation with Latent Consistency Models. PixArt is one of these text-to-image generation models. PixArt-α is something that had existed before, which could generate these high-quality images, uh, of up to roughly 1,000-pixel resolution with an efficient training process.
So PixArt-δ is basically a delta, a next step on that, which introduces some extra tricks into the process, the latent consistency model and ControlNet, uh, just combining some existing concepts, and that significantly, uh, speeds things up. It produces high-quality images in just a few computation steps. So that means that it takes only half a second to generate a thousand-by-a-thousand-pixel image. Uh, that's a sevenfold speedup.
And it is also meant to train within a single day on high-end GPUs. So yeah, it's following up on a lot of progress also in this space of being able to generate images quickly. And, uh, yeah, for a lot of these businesses and applications that have generative, uh, text-to-image capabilities, soon we might be able to see images being generated in under a second, just super quick.
Yeah, I think the big thing to focus on there, as Andrey was saying, is that the primary upshot for a lot of this technical work is that when these models eventually get integrated into things like consumer applications and business, or delivered to you in apps that you might be using yourself, everything is going to be a lot faster. It's going to be higher quality, it's going to be more controllable. You're going to get the types of images you were actually looking for a lot more easily.
And the next paper is InseRF. That's a cute little fun title: text-driven generative object insertion in neural 3D scenes. So we haven't covered NeRFs in a little while, but NeRFs are still very popular. NeRF is a technique for generating 3D models and 3D scenes from images, and InseRF is, as the title of the paper says, a way to insert objects into 3D scenes constructed by NeRFs. So similar to, I guess, being able to edit a 2D image and inpaint something in there,
now you can do that with 3D scenes. And as I've said before on this podcast, I think 3D generation and editing of 3D is going to be a big trend and an area of improvement throughout this year.
I think one of the cool features of this is that it allows for controllable and 3D consistent object insertion, without requiring explicit 3D information as input. So again, I think these methods, as they get more powerful, they're just going to require less and less effort on the part of the people using and developing things, which is really exciting.
Just to be extra clear, this requires a bounding box in the 3D scene and a text prompt. So the way you might think of using this is: you're looking at a 3D scene on your screen, you're kind of, you know, rotating around, you see a floor and you want to add a table onto the floor. You give it a bounding box, say add X here, and it does that. And, uh, yeah, if you're curious about the outputs in this space, they have a project page with a nice little video, and it's pretty seamless. It's pretty impressive.
Next story is about the impact of reasoning step length on large language models. For some context here, a lot of you have probably heard of chain of thought prompting before. This is pretty crucial in enhancing the reasoning abilities of large language models. I just did air quotes around reasoning abilities, because there is a lot of back and forth over whether these things actually reason. I tend towards the more skeptical side, but what have you.
And so again, the relationship between chain-of-thought effectiveness and the length of the reasoning steps in prompts isn't super well understood. So in this paper, researchers conducted experiments to explore that relationship I just mentioned. They manipulated the length of reasoning steps in chain-of-thought demonstrations while keeping other factors constant. Again, this sort of length-of-reasoning-steps thing,
uh, has a bit to do with how complex the issue you're trying to get your language model to reason through might be, and they found that lengthening the reasoning steps in prompts, without adding new information, actually significantly improves LLMs' reasoning abilities across multiple datasets, uh, possibly because there's some more context or something like
this added to them. And they also found that shortening the reasoning steps, even while preserving key information, considerably reduces the models' reasoning abilities.
So this is in the context of chain-of-thought prompting, where you tell the model, you know, think through what you should do and then give me the solution. Reasoning steps here is basically how much, uh, budget you give it to work with in terms of how much reasoning, how much kind of prelude to its answer in terms of these intermediate steps,
it is allowed. And we've covered quite a few papers here on this whole prompt-engineering genre of research, where you're like, okay, how do I alter my prompts to get the model to be, uh, more accurate, chain of thought being one of the big ones. So this is pretty useful in terms of understanding how to use chain of thought well. And, uh, we've covered quite a few papers that integrate it in data generation and sort of, uh, factuality checking and various things like that.
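As a made-up illustration of the manipulation the paper is doing, here's a toy pair of chain-of-thought demonstrations, one short and one lengthened by restating steps without adding any new information; the example and wording are ours, not the paper's:

```python
# Hypothetical demonstrations for illustration only -- the idea is that the
# lengthened version adds reasoning steps without adding new information.
short_demo = (
    "Q: Roger has 5 balls and buys 2 cans with 3 balls each. How many balls does he have?\n"
    "A: 5 + 2 * 3 = 11. The answer is 11."
)

lengthened_demo = (
    "Q: Roger has 5 balls and buys 2 cans with 3 balls each. How many balls does he have?\n"
    "A: Let's think step by step.\n"
    "Step 1: Roger starts with 5 balls.\n"
    "Step 2: He buys 2 cans, and each can holds 3 balls, so that is 2 * 3 = 6 new balls.\n"
    "Step 3: Restating what we know: he has 5 existing balls plus 6 new balls.\n"
    "Step 4: 5 + 6 = 11.\n"
    "The answer is 11."
)

# The paper's finding, roughly: prompts built from demonstrations like the second
# one tend to produce better reasoning than prompts built from the first.
prompt = lengthened_demo + "\n\nQ: <your actual question here>\nA: Let's think step by step."
print(prompt)
```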
So it's pretty significant to, I guess, understand a little bit better how to use chain of thought properly. And one paper, which we aren't going to go too far into because we have already covered this topic, is on Mixtral. But I think it is worth highlighting that, uh, the company Mistral has released the full paper. So they originally released the model Mixtral 8x7B, which we covered back when it came out, this mixture-of-experts,
uh, variant of, uh, a chatbot that was very good and is very popular, with people building on top of this open-source, uh, release of the model. So they've released, uh, a paper now that you can go look through for a whole bunch of results, uh, you know, lots of analysis on training and performance and so on, that, uh, um, yeah, provides more details that kind of corroborate what we already knew, which is that Mixtral is quite good.
And using a mixture of experts seems to really improve training and accuracy, uh, in pretty impressive ways.
Next up is our policy and safety section. And our first story here is, uh, well, this is going to be a fun one. So I am decidedly not a fan of some of the directions that AI companionship is going, and this kind of story is a good reason for that sort of thing. So recently, Meta and OpenAI have managed to spawn a wave of AI sex companions, and you might know already where this is going. There is a website where users can interact with, uh, and chat with AI bots, called Chub AI.
An interesting name there. It offers a variety of scenarios, including a brothel staffed by underage girls, which is raising concerns about AI-powered child pornographic roleplay. The uncensored AI economy includes a lot of sites like Chub AI, and this was initially spurred by OpenAI and later accelerated by Meta's release of its open-source Llama model. Again, technologies that can be used for good, but they're always dual-
use. You release something open source that anybody can use, and things like this are just going to happen. Um, and experts are warning, again, that these activities, uh, pose real-world dangers for minors and raise questions about legal and ethical boundaries, as well as tech companies' accountability for uncensored AI. Um, Chub AI is actually, for context here, an uncensored clone of Character.AI, which Andrey mentioned earlier.
It again allows users to engage in roleplay scenarios with AI characters. And as we mentioned, some of these involve child pornography. Not a fan.
The title, I guess, is highlighting Meta in particular because, as we've covered, they released Llama 1 and Llama 2, which are very powerful models that you can use for nefarious activities, and this is highlighting an example of that. Uh, OpenAI is, I guess, highlighted in the title because you can jailbreak ChatGPT to do things that it's not supposed to, although, uh, I assume per the terms of service, you know, OpenAI can come after you if you are doing that and, uh,
can kind of stop you. Whereas with, uh, open-source models that are released, you can basically take them and, uh, do whatever you want, uh, to some extent, with them. So yeah, not surprising, I suppose. But we do live in a world now where these models are in the wild and have been in the wild for a little while.
So there's a lot of this uncensored AI, so to speak; a lot of people are kind of putting a lot of effort into having models that you can use to do whatever you want, even, let's say, roleplay with underage, uh, what term should I use here, uh, underage sex workers, let's say. Sure enough, that, uh, is, uh, just happening. And if you're interested, I guess, in the details, or at least
in a real-world example of this already happening, Chub AI is, uh, discussed in a decent amount of detail in this story as, uh, a prominent example. And apparently it's making a lot of money. It's, uh, generated over $1 million in revenue since launching this chat service, so, mhm, yeah. We'll have to see if Meta does try to kind of fight, uh, uses of its open-source models that are this problematic.
Yeah. And before we jump into our next story, I'll just very quickly highlight that part of why this sort of thing is even able to happen is that users have figured out ways to jailbreak the chat bots to obtain unmoderated responses, which is what's leading to the emergence of these uncensored bots.
So that's again kind of highlighting this research area where there is a back and forth: companies like OpenAI, like Anthropic, like Meta, are training their models not just to predict the next word, but adding these techniques on top of them in order to ensure that they output things that are reasonable, that meet certain guidelines that they're going to want. And this does mean that users using ChatGPT might have a more frustrating experience.
But it also guards against use cases like these. And so there is this back and forth between the training techniques used to make these models have safer outputs, more aligned with principles, but then people, on the other hand, who are inevitably going to come and figure out ways to get around that.
Now, I will mention it's also worth noting that Llama 2 isn't fully, fully open-sourced. There are also, uh, certain restrictions in the license agreement. So it does say explicitly that your use of Llama must comply with applicable laws and regulations and adhere to the acceptable use policy for Llama materials. And there are quite a few details in that use policy. So potentially Meta could go after this organization if they are indeed using Llama 2 as a backbone.
But as an organization, you could also just not say anything, potentially. Right? You could just build that service off of any open-source chatbot, and Meta may not be able to go after you necessarily. So there are a lot of kind of dimensions to open source, and should you open source, and whatever. We're not getting into it. We're just kind of highlighting something that is now happening because of the proliferation of open source.
Um, next story. We've seen a lot of cases, and we discussed one a little bit earlier, about the use of copyrighted material in AI models and what that looks like. OpenAI has recently stated that it would be impossible to create AI tools like its chatbot ChatGPT without access to copyrighted material. Um, and this is, again, as AI firms like OpenAI are facing increasing scrutiny over the content they use to train their products.
Um, as we all know, a lot of the data from the internet that ChatGPT and image generators like Stable Diffusion are trained on is covered by copyright. And just recently, the New York Times sued OpenAI and Microsoft, accusing them of unlawful use of its work to create their products.
In fact, as this was going on, a lot of users went into ChatGPT and realized that yes, it was actually quite easy to get ChatGPT to just verbatim spit out quotations and actually pretty extended quotations of multiple paragraphs of New York Times articles just by prompting it in the right ways.
OpenAI pretty quickly covered this up. I went, a couple hours after seeing some of these posts, to try it myself, and pretty obviously OpenAI had sort of patched this up. But it's definitely pretty concerning, and interesting, that you were just able to get these verbatim repetitions.
Right. So this is kind of a development on top of that, uh, New York Times lawsuit we covered last time. And this ties into a broader story of the general, I guess, argument OpenAI is making that training models, uh, on copyrighted material is a case of fair use. So fair use basically covers cases where there is an exemption to the copyright on the data. Uh, you know, there are examples like: you can use copyrighted materials for educational purposes.
And OpenAI has been making the case that it is a case of fair use to use copyrighted material to train a model. Uh, and this is kind of an instance where they submitted this, I guess, uh, argument to the House of Lords. So, uh, yeah, it'll be very interesting and impactful to see where this goes. Legally, I guess, there will be a big fight over the question of whether data is fair use when it is used for training a model. The New York Times, obviously, is saying that it isn't, but, uh,
it's still an open question. And it is kind of a very important question that will probably be addressed, uh, sometime this year.
Mhm. And OpenAI is pretty openly supportive of independent analysis of their security measures. They've agreed to work with governments on safety testing their most powerful models before and after deployment. And I think that in a lot of cases, Sam Altman has kind of spoken to regulators and said, yeah, we're we're open to regulation.
Um, and again, in cases like this, there might be questions, you might have different opinions, but at least it's kind of interesting to see that they appear to be willing to work with people. As to the details of what that actually looks like, it's hard to say very much right now.
And on to the lightning round, with another story originating in England. The story is that judges in England and Wales have been given cautious approval to use AI in writing legal opinions. So yeah, the Courts and Tribunals Judiciary in England and Wales has given its official permission to use AI in writing these, uh, rulings. Uh, they still restrict it and say that AI should not be used for research or legal analysis due
to hallucinations. And we've seen several news stories already of cases where, in actual court cases, lawyers, uh, cited kind of fake precedents or fake information because of chatbots. So, uh, this is among the first news stories I've seen where there's an official kind of policy of: use chatbots to maybe write your rulings, but do not use them for research.
It's noted specifically, judges can use AI as a secondary tool for specific tasks. So this is like writing background material, summarizing known information, quickly locating familiar material. But it shouldn't be used for finding new unverifiable information or for providing analysis or reasoning.
And I think that these safeguards and limitations on that use, uh, at least just the rules around it, are pretty important, especially when you're talking about, well, court statements people might make that could be legally binding in some way. And so I think it's actually a pretty big question for the law, for judges, maybe for companies, when they issue statements that are legally binding and maybe some of those statements are AI-generated: what does that
look like, what are the parameters, how do you deal with that?
And on a related note, we actually have, uh, a little research paper in this non-research section, but it's very relevant, because this is a story from the Human-Centered AI Institute at Stanford, and it is about a study from a Stanford lab about the amount of hallucination and kind of incorrect information found in chatbot responses when you have specific legal queries. And the finding is that there's a huge amount
of mistakes. So they found that between 69% and 88% of responses to these legal queries can, uh, result in hallucinations in state-of-the-art LLMs. And they also highlighted that these LLMs often lack self-awareness about their errors and can reinforce incorrect, uh, information, which is, uh, yeah, something that if you use an LLM, you might have found: that sometimes if you point out something wrong, it will just say that, uh, it is in
fact correct, and so on. So this very much ties into that previous story, in terms of: given that we know that current LLMs seem to make a lot of mistakes when you ask legal questions, and I guess when they do analysis and research, it might be a good idea to restrict their use for this, uh, case, uh, just as we heard England did.
Yeah. And some of these findings are, I guess, pretty unsurprising if you spend time with these chatbots, but also, again, very important for usage in this context. Given the prevalence of certain sorts of cases in their training data, for example, these chatbots might favor things like well-known justices or specific types of cases. The study also found something that is not just true in a legal context:
again, if you've spent any time with a language model, you might have asked these sorts of trick questions where, included in the prompt, you lead it towards the wrong answer, and the model is very likely to take you up on that and just kind of go along with what you said. So the study found that LLMs are susceptible to counterfactual bias, or the tendency to assume that a factual premise in a query is true, even if it's flatly wrong.
Like if you ask it, why was peanut butter invented in 2020, or something. I can't promise you that prompt is going to work out, but when you make statements like this, when you query it, when you ask things that implicitly assume or just say something that is totally wrong, the model, more often than not, just kind of goes along with you.
And to be clear, this is for general-purpose chatbot services, specifically GPT-3.5, Llama 2, PaLM 2. This is not, uh, covering chatbots sort of specialized for legal applications, and there are startups like Harvey that are looking into, you know, making chatbots usable for research and analysis. Uh, but the upshot is, if you're in law, or if you are, I guess, talking to a lawyer, they should probably not be using ChatGPT or anything like ChatGPT to do research,
uh, unless it's, I don't know, something super well known or something, but in general they should be very careful.
Um, the next section is on synthetic media and art, and we're kicking off this section with something that, again, kind of concerns copyright and the law. Recently, a list of 4,700 artists whose work was used to train an AI art generator has gone viral, revealing names such as Norman Rockwell and Wes Anderson.
This list was used in a court exhibit in a lawsuit against the companies Midjourney, Stability AI, DeviantArt, and Runway AI, again, all very big names, accused of misusing copyrighted works to train their AI systems. Many artists have accused Midjourney specifically of stealing their work without permission, and a spreadsheet listing almost 16,000 more artist names as proposed additions to a Midjourney style list was also shared on social media.
This has again highlighted artists' frustrations with the lack of regulation around AI-generated art. Questions have been raised about the fairness of profiting from mass-produced images when the AI models that create them are trained on and imitate styles created by real-life artists. Again, if you're somebody out there who has put out a lot of paintings in your name, then somebody could prompt an image generator to ask for an image in your style
and get the image they want to see. For you as an artist, that might significantly impact your livelihood. Um, the document that contains the list was publicly accessible until it went viral, but an archive of the spreadsheet does remain available online.
This kind of came out of a court case, and then, uh, went viral and generated a lot of discussion. It's still the case that in the art world, these text image models are very controversial. A lot of people really viscerally hate them. I don't know if you ever interact with those communities online, as I guess more of an AI person. Yeah, there's strong feelings there.
It's very much still the case. And this kind of spreadsheet, which highlights how thousands of artists, uh, have been sourced, so to speak, in terms of the data used to train the models, for many of these people kind of is adding fuel to the fire, it seems.
Mhm. And again, it's pretty hard to deal with this, because OpenAI said, in that previous story we covered, that it's impossible to train their chatbots to be as good as they are without training on copyrighted material. And in a specific case here, Midjourney founder David Holz noted in 2022 that he didn't seek consent from artists who are still alive or whose works are still under copyright, citing, again, the difficulty of tracking the origin of 100 million images.
It's just really, really hard to get all the data you might want to make these models perform well, and imposing rules like getting consent from every artist whose work is still under copyright is really, really hard. So then the questions: there aren't a lot of good answers here. Do you just not train the model on them? In that case, you don't get the good model. Or do you seek permission? That doesn't scale very well.
It's, um. We're definitely not at a good point where we have a great resolution on things like this.
And onto a slightly different topic, which is deepfakes. The story is that YouTube is cracking down on AI-generated true crime deepfakes, and this is once again an example of the weird sort of sci-fi, we're-living-in-the-future-right-now situation we have gotten ourselves into with AI. So YouTube is banning content that uses AI to simulate the victims of crimes, including minors, narrating their own deaths or violent experiences. Uh, this is crazy.
So apparently there's a genre of true crime content, uh, using AI to create disturbing depictions of victims. Uh, and yeah, now there's an explicit policy that says that you're not allowed to do that. It's, yeah, pretty insane. But just another example of the sort of stuff we're going to get from AI when it's out in the wild.
Yeah. With these powerful models, again, we've talked about how they're going to get easier to use. They already are pretty easy to use to create things. And I guess, among all of the types of content people consume, there are some pretty interesting issues out there. And inevitably, given how easy it is now to create content like this, somebody is going to go and do things like what we're seeing here. Families of victims depicted in these videos have criticized them as
disgusting. And again, YouTube is imposing some, uh, some actual recourse here. So a violation of this updated policy results in a strike that removes the offending content and temporarily restricts the user's activities on the platform, and penalties increase for further violations within 90 days, which could potentially lead to the removal of an entire channel.
And actually, on a related note, our next story, starting off the lightning round, is also on AI-generated content on YouTube. This time it's not, uh, true crime, it's comedy. The story is that this AI-generated George Carlin comedy special was slammed by the comedian's daughter. And yeah, there's a whole YouTube video called, uh, George Carlin: I'm Glad I'm Dead. George Carlin, if you don't know, is kind of a legendary stand-up comedian who is just very well known and very well regarded.
So this, uh, was released and, uh, yeah, met a lot of criticism, including from George Carlin's daughter.
And this AI special covers current topics that Carlin might have addressed in his comedy if he were alive today, so, you know, shootings and billionaires like Jeff Bezos and Elon Musk. This is another example of where you're taking the art that somebody has created, in a style that is uniquely that person's, and then creating the content you want out of it. The AI does clarify at the beginning of the special that it is an impersonation of George Carlin, developed by listening to all of his
material. But again, these are thorny questions. People obviously want to consume these things, there's a market for it, but is it a good thing to have? Do we want this sort of thing? Hard to say.
I think it's pretty fair to say it's in poor taste, personally, but yeah, agreed. Next story: SAG-AFTRA signs a deal with a voice-over studio for AI use in video games. So going back to SAG-AFTRA: as we've covered, they had a strike last year that, uh, wound up with some, uh, agreements on the use of AI for your visual appearance. And now there is a deal with this AI voice-over studio, Replica Studios, that sets the terms of use for AI in video games.
And these terms include informed consent for the use of AI to create digital voice replicas and, uh, requirements for safe storage of digital assets. So very much, yeah, expanding on the deal that, uh, was already in place, uh, that dealt with digital replicas and AI versions of actors.
Yeah. This is again kind of trying to rebalance. Now we have AI systems that are naturally just going to take away a lot of work from people, and they're looking for agreements to create new employment opportunities. Specifically, this agreement is expected to create new employment opportunities for voiceover performers who wish to license their voices for use in video games. And again, this applies just to digital replicas and not AI training to create synthetic performances.
Um, and so again, this is going to be another back and forth that we're all going to have to pay attention to going forward.
And we are going to wrap up with kind of just a fun story, not something so serious, as we've had a lot of, uh, kind of pretty downer stories. The story is that Mickey Mouse is now in the public domain and AI is on the case. So if you're online and you're in the AI spaces, you may have already come across this as a sort of meme.
Uh, three early Mickey Mouse cartoons entered the public domain, the black-and-white ones, and a lot of people immediately started training on them and generating images of the 1928 design to, uh, yeah, mess around and make funny things with AI Mickey Mouse, AI 1928 Mickey Mouse.
And to get into a small amount of technical detail, one person fine-tuned a version of Stable Diffusion XL with stills from the three 1928 cartoons. These were Steamboat Willie, Plane Crazy, and The Gallopin' Gaucho. It's been used to create humorous and controversial images of Mickey Mouse, which again demonstrates the potential for parody and satire now that Mickey Mouse is in the public domain. Um, and the use of Stable Diffusion
XL doesn't make these images 100% legal, because, as we mentioned earlier, the base model still incorporates copyrighted work in its training data. Uh, but again, this is, uh, a more fun, interesting use of these things.
Yeah. So if you go to the link, we have the story in the show notes, you can see some examples of, uh, drawings of Mickey Mouse watching TV or eating pickles or whatever. So yeah, kind of just fun. And with that, we are going to wrap up. Thank you so much for listening to this week's episode of Skynet Today's Last Week in AI podcast.
As always, you can find the articles we discussed here today and subscribe to our weekly newsletter with similar ones at lastweekin.ai. Thank you, Daniel, for guest co-hosting.
Of course, great to join in as always.
And as always, we would appreciate it if you leave us a review or get in touch at contact at lastweekin.ai with any thoughts or suggestions. But more than anything, we would love it if you keep tuning in.