Look, Kevin, in Silicon Valley, we have some really incredible names, just of companies, you know. And I think sometime in the mid-2010s, we actually ran out of names. You know what I mean? Right, that's when people just started removing vowels from stuff. Exactly. And everything was .io and it was .ly, and it just sort of became kind of unhinged. And to me, this all reached its apotheosis. Oh, great word. Thank you. And Apotheosis would be a great name for a company, by the way.
Apotheosis.sys. Yes. Yes. But a couple years ago, I saw a headline on Techmeme I'll never forget, and it just said, Flink has acquired Cajoo. And I thought, excuse me? And these actually weren't even American companies; they were European companies. But Flink had acquired Cajoo. That sentence would actually give a caveman an aneurysm. That would fell a small peasant back in the days of Henry VIII. Well, you know, once Flink acquired Cajoo, I thought literally anything could be done.
Anything could happen after that. Yeah. And so this week, I saw that, hot on the heels of Flink acquiring Cajoo, Isaiah has acquired Huzu. Are you sure about that? And Gizin Height. I'm Kevin Roose, a tech columnist for The New York Times. I'm Casey Newton from Platformer. And this is Hard Fork. This week on the show, Google's next-generation AI model Gemini is here. We'll tell you how it stacks up. Then, the Cybertruck is for sale, and Kevin thinks it looks cool. I'm sorry, it does.
And finally, it's time for This Week in AI. Should we set the timer again? Boom. Casey, our latest addition to the podcast studio is a countdown clock, which I bought off Amazon.com. And the express purpose of this clock is to keep us from running our mouths for too long and torturing our producers with hours of tape that they then have to cut. That sounds horrible. Insert 30-minute digression. Let's go. Okay. All right. We're rolling. We're rolling.
This is a big week in the AI story in Silicon Valley, because Google has just released the first version of Gemini, its long-awaited language model, and basically its attempt to catch up to OpenAI and ChatGPT and GPT-4 and all that. It's America's next top model, Kevin. And it's here. And I was particularly excited about this because I am a Gemini. That's my astrological sign. You know, I'm a Gemini as well. No, really? This was really the model that was made for us to use. Wow. We're twins.
Gemini, or two-faced, just like a Gemini. Hey. So Gemini is Google's largest and most capable model yet. And according to Google, it outperforms GPT-4 on a bunch of different benchmarks and tests. We're going to talk about all of that. But I think we should just set the scene a little bit, because within the AI world, there has been this kind of waiting game going on. You know, ChatGPT came out roughly a year ago. And basically from the day that it arrived, Google has been playing catch-up.
And the presumption on the part of many people, including us, was that Google would, you know, put a bunch of time and energy and money and computing power into training something even bigger and better than what OpenAI was building, and basically try to throw its muscle into the AI race in a really significant way. And with Gemini, this is what they appear to have done. Yeah. Finally, we have a terrifying demonstration of Google's power.
Well, so we'll talk about whether it's terrifying or not. But let's just talk about what it is. So you and I both went to a little briefing this week about Gemini before it came out. And I understand you actually got to do some interviews with Google CEO and previous Hard Fork guest Sundar Pichai, as well as Demis Hassabis, who is the leader of Google DeepMind. That's right. And of course, I said, you guys, are you sure you don't want Kevin in there with me when I do this interview?
And they said, trust us. I don't know what happened. Yeah. Anyways, I did get to interview them. And we had a really interesting conversation about kind of how they see the road ahead with this stuff. They are clearly very excited about what Gemini means. And I do think that this is kind of like a bit of a starting gun going off. And when the most capable version of Gemini comes out early next year, we really are going to be in a kind of horse race between OpenAI and Google.
Yeah. Let's just talk about what Gemini is, at least what we know about it so far. So Gemini is actually three models in one. It's America's next top models. So there are three sizes. There is the most capable version, which is called Gemini Ultra. This is the one that they say can beat GPT-4 and sort of the industry state of the art on a bunch of different benchmarks. But Google is not releasing Gemini Ultra just yet.
They say they're still doing some safety testing on that, and that it will be released early next year. By the way, if an editor ever again asks me where my story is, I'm going to say, it's not ready yet. I'm still doing some safety testing. Very good excuse. So they have not released Gemini Ultra, but they are releasing Gemini Pro and Gemini Nano. These are the sort of medium and small sizes. Gemini Nano you can actually put onto a phone, and Google is putting that inside its Pixel phones.
Gemini Pro is sort of their equivalent of GPT-3.5, and that is being released inside of Bard starting this week. That's right. And now, if you are listening and you're thinking, Kevin just said so many different brand names and I'm having a meltdown,
I just want to say I see you and I feel you because the branding at Google has always been extremely chaotic and the fact that we're living in a world where there is something called Google Assistant with Bard powered by Gemini Pro does make me want to lie down. So I don't know who over there is coming up with the names for these things, but I just want to say stop and I want to say go back to square one.
Yes. So extremely chaotic naming, but what people actually care about is what this thing can do. So let's talk about what it can do. Let's talk about it. So one of the big things Google is advertising with Gemini is that it is designed to be what they call natively multimodal. Multimodal, of course, refers to AI models that can work in text or images or audio or video.
And basically the way that multimodal models have been built until now is by training all of these different components like text or video separately and then kind of bolting them together into a single user interface. But Google is saying, well Gemini was not sort of bolted together like that. Instead it was trained on all this data at the same time.
And as a result, they claim it performs better on different tasks that might include, like, having some text alongside an image, or using it to analyze frames of a video. Yeah. So I was writing about this model this week, and my colleague and editor Zoe Schiffer read my piece and was like, do you have to say multimodal so much? Every time you said the word multimodal, I just wanted to stop reading.
And I was very sympathetic, but I think it is maybe one of the most important things about this moment. And I do think by the way in the future, we are not even going to comment on this because this is just the way that these things are going to be built from here on out. But it is a very big deal if you can take data of all different kinds and analyze it with a single tool and then translate the results in and out of different mediums, right? From text to audio to video to images.
So that's like a really big deal on the path to wherever we're going. And it is the reason why this jargon word appears in so much of what they're saying. Totally. And one thing that all the AI companies do: you release a new model and you have to sort of put it through these big tests, these, what they call benchmarks. Yeah, it's like, do you remember how high school in Europe works? You know, where you learn and you learn, and then you take a bunch of tests, and then if you succeed, you get to have a future. If not, you have to become a scullery maid or something. My knowledge of Europe ends around, like, the 1860s, when I finished AP European History, but that's my understanding. Okay. Okay. So they give these tests to Gemini, and they give them to every zodiac sign. No, I'm sorry. That's a stupid joke. I'm sorry. Go ahead.
No, you should see how Capricorn performs on this test. So Gemini Ultra, which again is their top-of-the-line model, which is not yet publicly available: they gave this one a bunch of tests. The one that sort of caught everyone's attention was the MMLU test, which stands for massive multitask language understanding. And this is sort of the SAT for AI models. It's sort of the standard test that every model is put through. It covers a bunch of different subjects, including math, history, computer science, and law. It's kind of just a basic test of how capable a model is. And on this test, the MMLU, Google claims that Gemini Ultra got a score of 90%. That is better than GPT-4, which was the highest-performing model we knew about so far, which had scored an 86.4%.
And according to Google, this is a really important result, because this is the first time that a large language model has outperformed human experts on the MMLU. Researchers who developed this test estimate that experts in these subjects score, on average, about 89.8%. Now, the rate of progress here is really striking. And it's not the only area of testing where I think the rate of progress is really worth paying attention to.
So there's also the MMMU, which is the Marvel Cinematic Universe, is that right? Yes. So this is the massive multi-discipline multimodal understanding and reasoning benchmark. Say that five times fast. And this is a test that evaluates AI models on college-level subject knowledge and deliberate reasoning. And on this test, Gemini Ultra scored a 59.4%. This is, I guess, a harder test. Sounds like it. And GPT-4, by comparison, scored a 56.8%.
So it's better than GPT-4 on at least these two tests. Now, there's some question on social media today about whether this is a true apples-to-apples comparison. Some people are saying, like, GPT-4 may still be better than Gemini, depending on sort of how you give the test. But it doesn't really matter. What matters is that Google has made something that it says can basically perform as well as or better than GPT-4.
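For listeners curious what these benchmark percentages actually measure: scoring a multiple-choice benchmark like the MMLU boils down to simple accuracy arithmetic, the fraction of questions answered correctly. Here's a toy sketch; the question IDs and letter answers are invented placeholders, not real MMLU items:

```python
# Toy illustration of how a multiple-choice benchmark score is computed:
# percent of questions where the model's answer matches the answer key.

def score(model_answers, answer_key):
    """Return the percentage of questions answered correctly."""
    correct = sum(1 for q, a in answer_key.items() if model_answers.get(q) == a)
    return 100 * correct / len(answer_key)

# Invented example data: five questions, the model misses one.
answer_key = {"q1": "B", "q2": "D", "q3": "A", "q4": "C", "q5": "B"}
model_answers = {"q1": "B", "q2": "D", "q3": "A", "q4": "A", "q5": "B"}

print(score(model_answers, answer_key))  # 80.0
```

Real benchmark harnesses add prompt templating, few-shot examples, and per-subject breakdowns, but the headline numbers, 90% versus 86.4%, are this same fraction computed over thousands of questions.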
Yeah, I think the ultimate question is just, like, is the output better on Google's products than it is on OpenAI's? That's all that really matters. Yeah. But again, this is the version of the model that we do not have access to yet. It is not out yet. So it's hard to evaluate. Yeah. And obviously we're looking forward to trying it. In the meantime, they're giving us Pro. Yes. I just got access to Gemini Pro in Bard a few hours ago.
So I haven't had a chance to really put it through its paces yet. You haven't had a chance to develop a romantic relationship with it? Although I did have a very funny first interaction with it. I'll tell you what it was. So I just said, hello there. And it said, General Kenobi, with an image of Obi-Wan Kenobi saying hello there. Wait, really? Yes. This was my first interaction with the new Bard. That's amazing.
So it immediately turned into Obi-Wan Kenobi from Star Wars, for reasons I do not immediately understand. Wait, can I tell you what my first interaction was? I was trying to figure out if I had access to it. Okay. And so I said, are you powered by Gemini, right? And it said, no, Gemini is a cryptocurrency exchange. Which is true. There is one. That's true. It's run by the Winklevoss twins. Exactly. But it's always funny to me when the models hallucinate about what they are.
You know, it's like, you don't even understand what you are. Yeah. But in fairness, I also don't understand myself very well either. Well, that's why we started this podcast. We're going to get to the bottom of it. So, okay. I tried a couple other things. So one of the things that I had it try to do was help me prep for this podcast. I said, you know, create a... You said, I want to prepare for a podcast for the first time. What do I do? And it said, we can't help you there. Just wing it.
I actually started using this tip that I found. Have you seen the tipping hack for large language models? Are they starting to ask for tips now when they give you responses? Because, I swear, everywhere you go these days: 20%, 25%. No, this is one of my favorite sort of jailbreaks or hacks that people have found with large language models.
This sort of made news on social media within the last week or two, where someone basically claimed that if you offer to tip a language model if it gives you a better answer, it will actually give you a better answer. These things are so dumb, and these hacks are crazy. So you can emotionally blackmail them or manipulate them, or you can offer to tip them. So I said, I'm recording a podcast about the Tesla Cybertruck and I need a prep document to guide the conversation. Can you compile one?
It's very important that this not be boring. I'll give you a hundred-dollar tip if you give me things I actually end up using. So you're lying to the robot. Well, you know, maybe I will. You don't know. You will. So it did make a prep document. Unfortunately, most of the information in it was wrong. It hallucinated some early tester reactions, including a Motor Trend quote that said, it's like driving the future, and a TechCrunch quote that said, it's not just a truck, it's a statement.
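Mechanically, the tipping hack described here is nothing more than prompt construction: you append the offer of a tip to the task text before sending it to a model. A minimal sketch, with no real API call; `build_prompt` is a hypothetical helper name, not part of any actual SDK:

```python
# Sketch of the "tipping" prompt hack: append a (fictional) tip offer to the
# task before sending it to a language model. Nothing here talks to an API;
# build_prompt just assembles the string you would submit.

def build_prompt(task: str, tip_dollars: int = 100) -> str:
    """Return the task text with a tip offer appended."""
    return (
        f"{task}\n\n"
        f"It's very important that this not be boring. "
        f"I'll give you a ${tip_dollars} tip if you give me "
        f"things I actually end up using."
    )

prompt = build_prompt(
    "I'm recording a podcast about the Tesla Cybertruck and I need "
    "a prep document to guide the conversation. Can you compile one?"
)
print(prompt)
```

Whether the reported quality boost is real or a social-media artifact is an open question; the claim is anecdotal, and as the hallucinated quotes above show, the tip certainly doesn't guarantee accuracy.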
So I want to talk about what I used Gemini for. Oh, yeah. So what have you been using it for so far? Well, so, you know, and again, we've had access to this for maybe an hour as we record this. But the first thing I did was I took the story that I wrote about Gemini, and then I asked Gemini how it would improve it. And it actually gave me some compliments on my work, which was nice.
And then it highlighted four different ways that it would improve the story and suggested some additional material I could include. And I would say it was, like, you know, decent. Then I took the same query, identical, and put it into ChatGPT. And where Gemini Pro had given me four ways that I could improve my story, ChatGPT suggested 10. And I think no one would do all 10 things that ChatGPT suggested.
But to me, this is where I feel the difference between what Google is calling the Pro and the Ultra. Pro is, like, pretty good. But, like, in this case, the name Pro is misleading, because I am a professional and I would not use their thing. I would use the thing with the even worse name, which is ChatGPT. Yes. So that's what we've tried Gemini for. But Google does have a bunch of demos of Gemini being used very successfully for some things.
One thing I thought was interesting: they played this video for us during the kind of press conference in advance of this announcement. And, you know, it showed a bunch of different ways that you could use Gemini, people coming up with ideas for games. They showed it some images of people doing, like, the backwards bullet-dodging thing from The Matrix, and said, what movie are these people acting out? Gemini correctly identified it as The Matrix. No, that's pretty crazy. That is crazy.
Yeah. I thought that was impressive. But what I thought was more impressive was a demo that they showed. They were trying to sort of do some genetics research. And this, they explained, is a field where lots of papers are published every year. It's very hard to keep track of the latest research in this area of genetics. And so they basically told Gemini to go out, read like 200,000 different studies, extract the key data, and put it into a graph.
And it took this big group of 200,000 papers. It sort of winnowed them down to about 250 that were the most relevant. And then it extracted the key data from that smaller set of papers and generated the code to plot that data on a graph. Now, whether it did it correctly, I don't have the expertise to evaluate, but it was very impressive-sounding.
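Whatever Gemini does internally, the winnowing step described in this demo can be pictured as a simple filter-then-extract pipeline: score everything for relevance, keep the top slice, pull out one datum per paper. A toy sketch with invented papers and relevance scores; this is an illustration, not Google's actual method:

```python
# Toy sketch of the winnow-then-extract pipeline from the demo: 200,000
# (fake) papers are scored for relevance, the top 250 are kept, and a key
# numeric field is extracted from each. The data here is randomly generated.
import random

random.seed(0)
papers = [
    {"id": i,
     "relevance": random.random(),        # stand-in for a model's judgment
     "effect_size": random.gauss(0.3, 0.1)}  # stand-in for the "key data"
    for i in range(200_000)
]

# Winnow: keep the ~250 most relevant papers.
top = sorted(papers, key=lambda p: p["relevance"], reverse=True)[:250]

# Extract the key datum from the smaller set; in the demo, Gemini then
# generated plotting code, which a simple average stands in for here.
values = [p["effect_size"] for p in top]
print(len(top), round(sum(values) / len(values), 2))
```

The hard part in the real demo, of course, is the relevance judgment and the data extraction themselves, which is exactly where a reviewer with domain expertise would need to check the model's work.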
And I imagine that if you're a researcher whose job involves going out and looking at massive numbers of research papers, that was a very exciting result for you. That graph, by the way: how do I use genetics to create a super-soldier that will enslave all of humanity? So we want to keep an eye on where they're going with it.
So one of the interesting things about Gemini Ultra, this model that they have not released yet but that they've now teased, is that it's going to be released early next year in something called Bard Advanced. Which raises the question: will you be using Bard Advanced powered by Gemini Ultra? Or will you be using Google Assistant with Bard powered by Gemini Pro? Did I get that right? Standing ovation, very good, very good.
Literally, you and one marketer at Google are the only two people who have ever successfully completed that sentence. So they have not said what Bard Advanced is, but presumably this is going to be some type of subscription product that will be sort of comparable to ChatGPT's premium tier, which is $20 a month. Yeah, that's right. And I did try to get Sundar and Demis to tell me if they were going to charge for it, and they wouldn't do it. But I was kind of like, let's be real, come on, you guys.
And then I was like, I'll take it for free if you give it to me. And they kind of laughed, and we moved on. Okay. So that's what Gemini is, and how it may be different from or better than what's out there now from other companies. There are a couple caveats to this rollout. One is that Gemini Pro is only in English, and it's only available in certain countries starting this week. Another caveat is that they have not yet rolled out some of the multimodal features.
So for now, if you go into Bard, you are getting sort of a stripped-down, fine-tuned version of Gemini Pro running under the hood, but you are not yet getting the full thing, which will come presumably next year. Yeah. What did you learn by talking with Sundar and Demis about Gemini? Yeah. So a couple of things. One thing I wanted to know is, okay, so this is a new frontier model. Does it have any novel capabilities, right?
Is this just something that is very comparable to GPT-4, or, by the nature of its novel architecture, is it going to be able to do some new stuff? And Demis Hassabis told me that yes, he does think that it will be able to do some new stuff. This is one of the reasons why it is still in this safety testing.
Of course, he wouldn't tell me what these new capabilities are, but it's something to watch for, because, you know, it could be some exciting advancements, and it could also be some new things to be afraid of. So that was kind of the first thing. The second thing I wanted to know was: are you going to use this technology to build agents? We've talked about this on the show. An agent, in the AI context, is something that can sort of plan and execute for you.
Like, the example I always have in my mind is, could you just tell it to make a reservation for you? Then the AI maybe goes on OpenTable or Resy and just books you a table somewhere. And I was sort of expecting them to be coy about this. And instead, Demis was like, oh yes, this is absolutely on our minds. Like, we have been building various kinds of AI agents for a long time now. This is 100% where we want to go.
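The plan-and-execute pattern described here is often sketched as a simple loop: decompose the goal into steps, then carry each one out. A toy illustration; the planner is hardcoded where a real agent would call an LLM, and the booking backend is fake (OpenTable and Resy are only named as examples, nothing here talks to a real service):

```python
# Toy sketch of a plan-then-execute "agent" loop. A real agent would ask a
# language model to produce the plan and would call real tools/APIs for each
# step; here both are stubbed out so the control flow is visible.

def plan(goal: str) -> list[str]:
    # Stub: a real agent would prompt an LLM to decompose the goal.
    return ["search restaurants", "pick a time", "book table"]

def execute(step: str, log: list[str]) -> None:
    # Stub: a real agent would call an external service here.
    log.append(f"done: {step}")

log: list[str] = []
for step in plan("book me dinner for two on Friday"):
    execute(step, log)
print(log[-1])
```

The safety concern with agents comes precisely from swapping those stubs for real capabilities: once the execute step can act on the outside world, planning errors stop being harmless.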
Again, this could lead to some really interesting advancements, but when you talk to the AI safety people, agents are one of the things that they're most afraid of. Yeah, so let's talk about safety for a second. What is Google saying about how safe Gemini is compared to other models or some of the things that they've done to prevent it from sort of going off the rails? They're saying everything that you would expect, the most capable model is still in testing.
I think just the fact that they are coming out several months behind GPT-4 just speaks to the seriousness with which they are approaching this subject. I think particularly if this thing does turn out to have new capabilities, that's something where we want to be very, very cautious. But my experience this year and I think you've had the same one Kevin, is that these systems have just not actually been that scary.
Now the implications can be scary if, for example, you worry about the automation of labor or if you're worried about how this stuff is going to transform the internet as we know it, but in terms of like, can you use this to build a novel bio weapon? Can you use this to launch a sophisticated cyber attack? The answer pretty much seems to be no. So for at least for me, as I'm looking at this stuff, like that is actually not my top concern.
If you try to ask any of Google's products a remotely spicy question, you get shut down pretty much immediately. Like, is that your experience too? Well, I have not tried to ask Gemini any spicy questions yet. I know you were in there. No, I know you were. I don't even try. Like, I mean, I should, just as part of my due diligence. But, like, I honestly don't even try, because these things shut you down at, like, the faintest whisper of, you know, impropriety. Right.
So they're doing some more safety testing, presumably, to make sure that the most capable version of this can't do any of these really scary things. But what they did this week is sort of interesting to me: they told us about the capabilities of this new model, and the most powerful version of that model, but they're not actually releasing it or making it publicly available yet. What do you make of that?
Do you think they were just sort of trying to get out ahead of the holidays, and, like, maybe they felt like they needed to announce something, but this thing isn't quite ready for prime time yet? What's the story there? Yeah, I mean, that's my guess, is that they don't want 2023 to end without feeling like they made a big statement in AI. They made a lot of promises at Google I/O, and they have started to keep them.
But I think if they had had to wait all the way into early next year, it would sort of feed the narrative that Google is behind here. At least now heading into the holidays, their employees and investors and journalists can all say like, okay, well, at least we know that some of this is available and we know when the rest is coming. I don't know. It just feels like another product release and it's just remarkable how quickly we have become.
I don't want to say desensitized to it, but we've just stopped sort of gaping in awe and slight terror at these incredibly powerful AI models. I think if you went back even two or three years and told AI researchers that Google would have a model that gets a 90% on the MMLU, better than the benchmark threshold for human experts, they would have said, that's AGI. We have arrived at a point that people have been warning about for years.
Then this release comes out today and it's just sort of like one more thing for people in the tech industry to get excited about. Yeah, I mean, I do think it's a really big deal. I think that when ultra is actually available to be tested, that will be the moment where we will sort of like have that experience of awe or vertigo again.
But if you're looking for things to blow your mind a little bit, one of the other things that Google announced this week through DeepMind was this product called AlphaCode 2. And AlphaCode 1 came out in 2022, and it was an AI system that was designed to compete in coding competitions. So people who are even nerdier than us, instead of just playing video games, they actually go and do coding competitions. Or so I've been led to understand.
And let's just say, I don't imagine that I would ever get one answer right. That's sort of my feeling about how I would fare in a coding competition. And in 2022, the DeepMind people were very excited because AlphaCode was able to perform better than 46% of human participants in coding challenges. And then this week, Google announced AlphaCode 2 and said that it outperforms 85% of human competitors. Now, there are differences between a coding challenge and day-to-day software engineering work.
Coding challenges are very self-contained; software engineering can sometimes require more breadth of knowledge, or context that an AI system wouldn't have. But again, if you just want to experience awe, look at the rate of progress. This system was able to go from beating around half of all humans to beating 85% of them, close to all of them, right? That makes me feel awe, all right.
It does make me feel awe, but it also makes me feel like our adaptation is just happening very quickly, where we're not impressed anymore. As Shania Twain once said, that don't impress me much. Right, you can do meal prep for a picky eater? That don't impress me much. This is actually known as the Shania Twain benchmark test. That's what I'm saying. Oh, you can solve a coding challenge? That don't impress me much.
If we could get Shania Twain on the show and just show her AI things, and she had to say, it impresses me much, or it don't impress me much, I could not imagine a better segment for this podcast. I would die. Forget all these fancy evaluations and coding challenges. Just get Shania on the horn. Shania, if you're listening, we want to talk to you about AI. We have some models we'd like to show you. Bam, bam, bam, bam, bam. Anyways. When we come back, the Cybertruck is here.
We're going to tell you how to protect your family from it. All right, let's talk about the Cybertruck. Cybertruck? Cybertruck. All right, last week, Tesla, the car company run by social media mogul Elon Musk, started delivering the first models of its new and long awaited Cybertruck. That's right, Kevin. Suffice to say, as this nation's number one truck review podcast, this had our full attention. So you may be asking, why are the hard fork guys talking about cars?
This is not a show about cars. It's not Car Talk. Yeah. So today we're going to be reviewing the Mazda X48. No. So I do want to spend time in the next year or so just really getting up to speed on, like, what is a car? A car! I know what I like. So I have never been a person who cares about cars. I've always been intimidated by people who know a lot about cars.
But I am also interested in the way that the electric car revolution is kind of merging with the self-driving technology and these advances that companies like Tesla and Rivian are making. And it's just become a lot more interesting in my brain over the past year. Yeah, this is another major technology transition that is happening.
Some states, I would say led by California, have set these very stringent emissions standards, and there will come a point in the next decade or so where all new cars sold in California have to be either hybrid or electric. So let's talk about the Cybertruck, because this has been a very polarizing piece of technology. It was announced back in 2019.
I'm sure you remember this announcement where Elon Musk comes out on stage and shows off this concept vehicle that looks completely insane with these kind of like sharp-edged, stainless steel panels. It sort of looks like a polygon rendering of a car. People have made a lot of comments about the looks of this car. I saw one person say it looked like the first car that was designed by Reddit. Someone else said it looks like a fridge that wants to kill you.
I think it looks kind of cool and I worry that saying that makes me sound like a Tesla fanboy, which I am not. But I think we should be able to admit when something looks pretty cool. Oh, what do you think looks cool about it? Well, I think it looks like what you would have assumed a car from the future would look like in like 1982. No, I totally disagree about that. It looks like a sort of panic room that you can drive. What do you think is about to happen to you in this thing?
They've made so much about how bulletproof it is. They keep addressing problems that most people, who are not taking part in a cross-country bank-robbing spree, really don't have to worry about. But for all of my skepticism, am I right that they actually did get a lot of pre-orders for this thing? They got a huge number of pre-orders.
So Elon Musk said on an earnings call in October that over a million people had made reservations for Cybertrucks; there's another crowdsourced reservation tracker that estimates two million Cybertruck reservations. And just for a sense of scale, Ford's F-Series shipped about 650,000 trucks all last year. Okay. So if two million people actually are going to buy the Cybertruck, it would make it one of, if not the, best-selling trucks in the world.
Now, caveat, not all these people who reserve cyber trucks are necessarily going to buy them. You do have to pay a $250 deposit to sort of put money down and get in line to buy one of these. But these deposits are refundable. So who knows how many of these people are going to follow through.
But one statistic I saw in an article in Wired is that even if just 15% of the people who pre-ordered a cyber truck actually followed through and bought one, it would equal the annual US truck sales of Toyota. So this is a big number in the automotive industry. And I think a reason that a lot of people are hesitant to count out the cyber truck, despite how ridiculous it may look. I don't know.
So, I assume that you are not one of the people who put down a reservation for a Cybertruck. I feel like we need to have a moment where you just sort of explain to me, like, what the Cybertruck is. Like, can you give me some specs on this thing, some pricing information? Because, you know, I don't know if you know this about me, but I have never bought a truck. So I don't really even have a frame of reference for understanding this.
What I've heard, though, is that it's actually very expensive. So it is available in three different models. There is a sort of low-end rear-wheel-drive model that starts at $61,000 in the basic configuration. There's an all-wheel-drive model that starts at $80,000. And then you can get the sort of top-of-the-line model, which is being called the Cyberbeast, which has three motors and starts at around $100,000. Now see, Google should have named Gemini Ultra the Cyberbeast.
Yeah, that would have been a good name. Yeah, that's true. So they did start delivering cyber trucks to initial customers last week. And they did a big sort of demo reveal. They showed some crash testing. They showed a video, as you said, of people shooting bullets at the doors of the cyber truck. It appears to be bulletproof.
And they showed how it compares to a bunch of other trucks in a pull test, where you basically attach a very heavy sled to the back of a truck and you try to pull it as far as you can. And in this test, at least the version that Tesla showed off, the Cybertruck beat all of the leading pickup trucks, including an F-350. So it appears to be a truck with a lot of towing capacity, and it's bulletproof, if you do need to survive a shootout. I mean, to me, here's the question, Kevin.
If this truck was produced by anyone other than Elon Musk and Tesla, would we be giving it the time of day? No, I don't think so. Well, so here, let me say a few things about this. Okay. So one is, I think it looks cool. And I'm sorry about that. I don't have any justification on a moral or ethical level for thinking that it looks cool. It's fine to just say that you're having a midlife crisis.
And so you're starting to think that the Cybertruck looks cool. That's fine. You can admit that. Well, you know, here's what I'll say about it. It is different, right? And I think... Wow, I've never seen someone lower the bar so much during a conversation. No, but you know what I mean? Like, you just go out on the road and you look at all these cars, and, like, every car now is a compact SUV. Every car looks exactly the same to me. It's like, oh, you have a RAV4.
Cool. But this is a car you would not mistake for any other car. It is a car that would not survive the design process at basically any of the big car companies. It is only something that a truly demented individual such as Elon Musk could make and put into production. And you know, I like an opinionated car design. Yeah. No, that's fine.
I think when, many years from now, the final biography of Elon Musk is written, Cybertruck will be a chapter about, like, a sign that we were approaching the endgame. You know, of, like, here is somebody who is losing his touch. Yeah, it is clearly not something that was designed by committee. So I think the question that a lot of people are asking about the Cybertruck is, like, who is the market for this, right?
Is it pickup truck owners who are looking to maybe get something electric or upgrade to a slightly nicer pickup truck? Is it Elon Musk fans who are just going to buy whatever the latest Tesla is? Is it wealthy tech people who want to, you know, own something that looks like it drove out of Blade Runner? Like, who do you think the target market for this is? I would say fugitives. I would say car jackers. What do you think?
People who subscribe to X Premium, I would say, are the target audience for this. But no, I think there will be a lot of people who are interested in this. I also am very curious about whether this will become sort of a signaling vehicle that will say something about you. You know, this is not a neutral car. This is not a car that you're supposed to see and forget about. You're supposed to, like, ponder it. Totally.
And I'm sure we will start seeing these very soon on the roads of San Francisco. Although we did try to find one this week and we could not. We very much wanted to record this episode inside of a Cybertruck, but we couldn't find one. Yeah. It does have very good noise insulation inside the cab, apparently. So maybe next year we'll record the podcast from there. Better than the inside of an airplane, you know? Less likely to get accosted by flight attendants.
So Casey, we also can't really talk about the Cybertruck without talking about Elon Musk and the kind of insane couple of weeks that he's been having. So last week, of course, he appeared on stage at the DealBook conference in New York and gave this totally unhinged interview to my colleague Andrew Ross Sorkin, in which he told advertisers who are staying away from X to, quote, go fuck themselves, and also said a number of inflammatory things about his critics and his state of mind.
And it was just sort of, like, a glimpse into his mind. And I would say it was not altogether reassuring. It was not. You know, I of course enjoyed this very much, because I think there is still a contingent of folks who want to believe that the Elon Musk of 2023 is the Elon Musk of 2013, and that, you know, he's had a couple of kooky things here and there, but at his core, he's a, you know, billionaire genius Tony Stark savior of humanity.
And over and over again, he keeps showing up in public to be like, no, I'm actually this guy. And we got another one of those moments, and another group of people woke up and they're like, oh, wow, okay, I guess he is just really going to be like this now, forever. Yeah. I mean, I do think that there is some angst among the Tesla owners I know, most of whom do not support Elon Musk's politics or his views on content moderation.
I've heard from a number of people in my life over the past few months who say some version of, you know, I want to get a Tesla for reasons X, Y, or Z: they have the most chargers, they have the best technology, I really like how it looks, it's green and I care about the environment, and it's the one that sort of fits my family's needs. But I don't want to give Elon Musk my business. I don't want to be driving around in something that makes it look like I support him.
So do you think that's actually going to be a meaningful barrier? Do you think there are people who will stay away from the Cybertruck, even if it is objectively, like, a good truck, just because they hate Elon Musk? You know, it is hard to say, because as best as I can tell, Tesla has not really suffered very much yet because of all of Elon's antics. Not only has it not suffered, but it is, by some accounts, the best-selling car in the world.
Yeah. And certainly the best-selling electric car in the world. Sure. At the same time, I just hear anecdotally from folks all the time now that they would never buy a Tesla. There's actually a great profile in The Times this week of Michael Stipe, the great singer from R.E.M. And there's an anecdote in the story about how a tree falls on his Tesla, and he's so excited, because he didn't want to drive an Elon Musk car anymore, and now he finally had an excuse.
So look, is it possible that this is just some very thin layer of coastal elites who are turning up their noses at Tesla while the rest of America and much of the world continues to love to drive them? Possible. But the thing that I always keep in the back of my mind is that there are a lot more electric car companies now than there used to be. And state emissions standards are going to require all new vehicles to be electric not too far into the future.
And that's just going to create a lot of opportunity for folks who want to drive an electric car but don't want to put up with the politics or the perception issues that might come from driving a Tesla. So Tesla's having its moment in the sun now. And maybe the Cybertruck will extend their lead into the future. Or maybe a few years from now, we look back and we think, oh yeah, that's when the wheels started to come off the wagon. Yeah, or the truck, as it were.
I did see one estimate that Tesla is losing tens of thousands of dollars every time they sell a Cybertruck, because they are essentially hand-building these now. They have not moved it into mass production. And obviously it takes some time to ramp production up to the numbers they need. So if you are an early Cybertruck buyer, you may actually be costing Elon Musk money. So that may be one reason to get one.
This is the first thing you've said that makes me want to buy a Cybertruck. Can I ask you a question? If this were made by some other company, if this were made by Ford or GM or Chrysler, would you buy one? Would you be interested? No. I don't have a car. I rode in a Waymo this week, and to me, this is what is exciting: not owning a car. Being able to just get from point A to point B and not worry about the various costs of ownership, any of this.
So when I think about what I want in this world, it's more public transit, it's more walking, it's more biking. And I'll say it: it is more autonomous vehicles to get me from point A to point B on the sort of short trips where transit doesn't make sense. So no, there is nothing about this car that makes me want to buy it. But I'm guessing that for you, the answer is yes. Well, let me just stipulate that I am not in the market for a very expensive pickup truck.
There is no version of my life in which I need something like that. But I would say, similar to the Rivian, when I do see them driving around on the streets of my hometown, I will turn my head and kind of admire them. I do think the Cybertruck looks kind of cool. I hope that it's sort of a spur to the rest of the industry to, I don't know, indulge their worst ideas. Yes. Yes. Sketch something on a napkin that looks insane, and then go make it.
It's actually how we came up with a lot of this podcast. Yes, true. We also shot bullets at it to make sure it was bulletproof. And the Hard Fork podcast, it turns out, is bulletproof, baby. When we come back: what else happened in AI this week? There's a lot of stuff happening in AI this week that we haven't talked about yet. We really haven't mentioned one thing. Well, we have a lot to get through. All right. Which is why we are doing This Week in AI. Play the theme song.
So our first story in AI this week is about wine fraud. This was an article in The New York Times by Virginia Hughes titled "Bordeaux Wine Snobs Have a Point, According to This Computer Model." It's an article about a group of scientists who've been trying to use AI to understand what the wine industry calls terroir. Are you familiar with terroir? I am. The people who are really into this are known as terroiristes, I believe.
Yeah, so this is the word that is used in the wine industry to describe the specific soil and microclimate that wine grapes are grown in. And if you go up to Napa and you do wine tastings, they will often tell you about, you know, oh, our soil is more, uh, minerally. And that's why our wine tastes better and things like that. And I never knew whether that was real. And as it turns out, this is something that researchers have also been wondering.
Yeah. So, researchers trained an algorithm to look for common patterns in the chemical fingerprints of different wines. They were apparently shocked by the results. The model grouped the wines into distinct clusters that matched their geographical locations in the Bordeaux region. So these researchers effectively showed that terroir is real.
One of the scientists said, quote, I have scientific evidence that it makes sense to charge people money for this because they are producing something unique.
Wow. So this has some interesting implications. Like, if you buy some really, really expensive wine, but you worry that you've gotten a forgery or a fraud, I guess there would maybe now be some means by which you could test it. Or, in the far future, you could synthesize wine with maybe a higher degree of accuracy, because we'll be able to sort of catalog these chemical fingerprints.
Yeah. So apparently, in expensive wine collections, fraud is fairly common. Producers have been adjusting their bottles and labels and corks to make these wines harder to counterfeit, but it still happens. And with AI, apparently, this will get much harder, because you can just have the AI say, that's not really, you know, Malbec from this region. It's actually just crappy supermarket wine from California. Oh, man. Well, this is just great news for wine snobs everywhere.
Yes. So we celebrate it. They've been waiting for a break and now they have one. What else happened this week, Kevin? Okay. So this one is actually something that you wrote about. This is a problem with Amazon's Q AI model. So Q is a chatbot that was released by Amazon last week. And it's aimed at kind of enterprise customers. So Casey, what happened with Q?
Yeah. So I reported this with my colleague Zoe Schiffer at Platformer last week, Amazon announced Q, which is its AI chatbot aimed at enterprise customers. You can sort of think of it as a business version of chat GPT. And the basic idea is that you can use it to answer questions about AWS where you may be running your applications. You can edit your source code. It will cite sources for you.
And Amazon made a pretty big deal of saying that it built Q to be more secure and private and suitable for enterprise use than ChatGPT. Right. This was sort of its big marketing pitch around Q: these other chatbots make stuff up, they might be training on your data, you can't trust them. Go with ours instead; it's much safer for business customers. That's right.
And so then, of course, we start hearing about what's happening in the Amazon Slack, where some employees are saying this thing is hallucinating very badly. Oh no. It is leaking confidential information. And there are some things happening that are so bad that one employee wrote, quote, "I've seen apparent Q hallucinations I'd expect to potentially induce cardiac incidents in legal." So, you know, let's stipulate: this stuff is very early. It's only barely being introduced to a handful of clients.
This exact scenario is the reason that Amazon is going to move slowly with something like this. And in fact, when we asked Amazon what it made of all this, it basically said, you're just watching the normal beta-testing process play out. At the same time, this is embarrassing, and if they could have avoided this moment, I think they would have. Right. And I think it just underscores how wild it is that businesses are starting to use this technology at all, given that it is so unpredictable.
And that it could cause these like cardiac incidents for lawyers at these companies. You know, I understand why businesses are eager to get this stuff to their customers and their employees. It is potentially a huge time saver for a lot of tasks. But there's still so many questions and eccentricities around the products themselves. They do behave in all these strange and unpredictable ways.
So, I think we can expect that the lawyers, the compliance departments, and the IT departments of any companies that are implementing this stuff are going to have a busy 2024. Here's my bull case for it, though: if you've worked at any company and you've tried to use the enterprise software that they have, it's usually pretty bad. It barely works. You can barely figure it out. It probably gave you the wrong answer about something without even being AI.
So while I think we all assume that these technologies will need to hit 100 percent reliability before anyone will buy them, in practice, I think companies will settle for a lot less. Right. They don't have to be perfect. They just have to be better than your existing crappy enterprise software. A low bar indeed. All right. That is Amazon and its Q. Which, by the way, while we're talking about bad names for AI models: I literally was talking with an Amazon executive last week, and I said, you've got to rename this thing.
We can't be naming things after the letter Q in the year 2023. We will reclaim that letter eventually, but we need to give it a couple of years. Yeah. Yeah. The QAnon parallel is too easy. All right. This next story was about one of my favorite subjects when it comes to AI, which is jailbreaks and hacks that allow you to get around some of the restrictions on these models.
This one actually came from a paper published by researchers at DeepMind, who I guess were sort of testing ChatGPT, their competitor, and found that if they asked GPT-3.5 Turbo, which is one of OpenAI's models, to repeat specific words forever, it would start repeating the word. But then, at a certain point, it would also start returning its training data. It would start telling the user what data it was trained on. And sometimes that included personally identifiable information.
When they asked ChatGPT to repeat the word "poem" forever, it eventually revealed an email signature for a real human founder and CEO, which included their cell phone number and email address. That is not great. I have to say, my first thought reading this story is, like, whose idea was it to just tell ChatGPT to repeat the word "poem" forever? We talk a lot about how we assume that everyone in the AI industry is on mushrooms, and I've never felt more confident of that than reading about this test.
Because what is more of a mushroom-brained idea than, bro, what if we made it say "poem" literally forever? Right. And just see what happens, bro. And then all of a sudden, it's like, here's the name and email address of a CEO. Come on. I do hope there are rooms at all these companies' headquarters that are just, like, the mushroom room, where you can go in and take a bunch of psychedelic mushrooms and just try to break the language models in the most insane and demented ways possible.
I hope that that is a job that exists out there. And if it does, I'd like to apply. Now, we've seen a lot of wild prompt engineering over the past year. Where would you rank this among all-time prompt-engineering prompts? I would say this is, like, an embarrassing thing, and one that obviously OpenAI wants to patch as quickly as it can. 404 Media reported that OpenAI has actually made it a terms-of-service violation to use this kind of prompt-engineering trick.
So now, if you try that, you won't get a response, and you won't get any leaked training data. And this is just, I think, one in a long series of things we'll find out about these models just behaving unpredictably. Why does it do this? They can't tell you. But if you're an AI company, you want to patch this stuff as quickly as possible, and it sounds like that's what OpenAI has done here. All right. Great. Well, hopefully we never hear about anything like this ever again.
Okay. Can we talk about Mountain Dew? Let's talk about Mountain Dew. This next one is admittedly a little bit of a stunt, but I thought it was a funny one. So I want to cover it on the show. Mountain Dew this week has been doing something they call the Mountain Dew raid in which for a few days they had an AI crawl live streams on Twitch to determine whether the Twitch streamers had a Mountain Dew product or logo visible in their live stream.
Now Kevin, for maybe our international listeners or folks who are unfamiliar with Mountain Dew, how would you describe that beverage? Mountain Dew is a military grade stimulant that is offered to consumers in American gas stations to help them get through long drives without falling asleep. Yeah. If you've never tasted Mountain Dew and are curious, just go lick a battery. I was at a truck stop recently on a road trip.
And do you know how many flavors of Mountain Dew there are today in this country? I would say easily a dozen flavors of Mountain Dew. That's innovation. That's progress. That's what this company means. I said "this company," and that's an interesting slip, because sometimes I do feel like this world is getting too corporate. But look, at the end of the day, this country makes every flavor of Mountain Dew that you can imagine, and many that you couldn't.
Yeah. So the fridges are full of Mountain Dew at the retailers of America. And this is an AI that just feels like a dispatch from a dystopian future. Now, I think this was sort of a marketing stunt. I don't think this was, like, a big part of their product strategy.
But with this Raid AI, basically, if it analyzed your Twitch stream and saw a Mountain Dew product in it, you could then be featured on the Mountain Dew Twitch channel and also receive a one-on-one coaching session with a professional live streamer. So Mountain Dew released this document as, like, an FAQ. Their Mountain Doc. Their Mountain Doc. It is the FA-Dew. That's not good. That's not good. That's pretty good. That's pretty good. So this is the Mountain Dew.
I'm reading from the Mountain Dew Raid Q&A. It says Mountain Dew Raid is a first-of-its-kind AI capability that rewards streamers for doing what they love, drinking Mountain Dew on stream, and then utilizes a combination of rewards aimed at building and amplifying each participating streamer's audience. So it basically goes out and crawls Twitch, looking for streamers who have Mountain Dew products and logos on their streams.
Once it identifies the presence of Mountain Dew, this document says, selected streamers will get a chat asking them to opt in to join the raid. Once you accept, the Raid AI will keep monitoring your stream for the presence of Mountain Dew. If you remove your Mountain Dew, you'll be prompted to bring it back on camera. If you don't, you'll be removed from our participating streamers. So this is, like, truly the most dystopian use of AI that I've heard of.
Like, I know there are more serious harms that can result from AI, but this actually does feel like a chapter from a dystopian novel. Bring your Mountain Dew back on camera, or you will lose access to your entire livelihood. Surrender to the Mountain Dew panopticon.
It reminds me of, do you remember that patent that went viral a few years ago, where Sony had invented some new technology that basically would allow them to listen to you in your living room? Like, if your TV was playing an ad for McDonald's and you wanted it to stop, you could just sort of yell out "McDonald's" in your living room. We must prevent that world from coming into existence at all costs. Yeah. It reminds me of a few years ago, when we did this demo.
My colleagues and I at The Times were pitched on an Angry Birds scooter. Did I tell you about this? No. Oh, then let me tell you. Yes. So this was during the big scooter craze of the 2018, 2019 period. And the company that makes Angry Birds did a promotional stunt where they outfitted one of these electric scooters with a microphone.
And in order to make the scooter go, you had to scream into the microphone as loud as possible, and the louder you yelled, the faster the scooter would go. And so, I am a sucker for a stupid stunt. And so I had them ship two of these to us, and we drag-raced them on the Embarcadero in San Francisco, just screaming as loud as we could into the microphones of our Angry Birds scooters to make them go fast. And the nice thing about San Francisco is that so many other people were screaming.
Nobody even paid us any attention. Yeah. It was only the fourth-weirdest thing happening on the Embarcadero that day. And it was a lot of fun. So I support stupid stunts like that. I support the Mountain Dew AI. Casey, what did you think when you saw this Mountain Dew news? Well, you know, there is something that feels like weird future about AIs just scanning all live media to identify products and incentivize and reward people for featuring their products.
At the same time, we're already living in a world where on social media some platforms will automatically identify products and will then tag them. And then maybe if somebody buys that product based on you posting it, you'll get a little bit of a kickback. So this is just kind of the near term future of social media is that it is already a shopping mall. And we are just making that shopping mall increasingly sophisticated.
If you see literally anything on your screen, these companies want you to be able to just mash it with your paw and have it sent to you. So this was the latest instance of that, but I imagine we'll see more. Totally.
And it just strikes me as sort of an example of how unpredictable the effects of this kind of foundational AI technology are. Like, when they were creating image-recognition algorithms a decade ago in the bowels of the Google DeepMind research department, they were probably thinking, oh, this will be useful for radiologists, this will be useful for identifying pathologies on a scan, or maybe solving some climate problem.
And instead, this technology, when it makes its way into the world, is in the form of, like, the Mountain Dew AI bot that just scours Twitch live streams to be able to sell more Mountain Dew. I think there actually could be a good medical use for this. Did you hear this? There was another tragic story this week: a second person died after drinking a Panera Charged Lemonade. Did you read this? Yeah. So that happened again.
So I think we should build an AI that scans for Panera Charged Lemonades on these Twitch streams, and if it sees one, call an ambulance. Before we go, a huge thank-you to all the listeners who sent in hard questions for us. As a reminder, Hard Questions is our advice segment where we offer you help with ethical or moral dilemmas about technology. We are still looking for more of those. So please, if you have them, send them to us in a voice memo at hardfork@nytimes.com.
And we'll pick some to play on an upcoming episode. And to be clear, Kevin, in addition to sort of ethical quandaries, we also want the drama. We want something that is happening in your life. Is there a fight that people in your life are having over technology in some way? Please tell us what it is, and we'll see if we can help. Yeah. And these don't need to be, like, high-minded scenarios about AI wreaking havoc on your professional life.
It could just be something juicy from your personal life. The gossip. Yeah, spill the tea. hardfork@nytimes.com. Hard Fork is produced by Rachel Cohn and Davis Land. We're edited by Jen Poyant. This episode was fact-checked by Caitlin Love. Today's show was engineered by Chris Wood. Original music by Marion Lozano, Sophia Lanman, and Dan Powell. Our audience editor is Nell Gallogly. Video production by Ryan Manning and Dylan Bergeson.
Thanks to Paula Szuchman, Pui-Wing Tam, Kate LoPresti, and Jeffrey Miranda. You can email us at hardfork@nytimes.com with your favorite flavor of Mountain Dew.