TechSupport: Pixel Peeping - How to Spot AI Video | TechStuff podcast

Speaker 1

00:07

Class. Welcome to TechStuff. I'm Karah Preiss. Today's interview is all about Sora, the video generation tool and invite-only social media app that OpenAI released at the beginning of October. If you're on TikTok, Instagram, or X, you've likely seen videos made by Sora plastered all over your feeds. These videos range from the absurd, like cats dancing by a dumpster with sunglasses on, to the hyperrealistic, like

00:38

Queen Elizabeth trying jerk chicken in Jamaica. When I first saw these videos, I was entertained by the absurdist ones and kind of floored by the realistic ones. To me, Sora signals that we have officially entered the post-bunny-trampoline internet. Yeah, I'm talking about the AI video of the horde of bunnies jumping on a trampoline in the dark. I was very convinced that this video was real, and so were many people, which led to a mini panic. Is it even possible to detect what's fake and what's

01:06

not anymore? That's where my guest today comes in. His name is Jeremy Carrasco, and he runs multiple social media accounts under the name Show Tools AI.

Speaker 2

01:16

The idea that we can't tell what's real or not because of AI video is so far definitely not the case.

Speaker 1

01:21

He has only been a full-time creator for four months, but he has become a trusted source for dissecting viral AI videos and explaining the tells.

Speaker 2

01:31

There is a physical truth to shooting a video with a camera. That physical truth isn't going away, and AI does a version that, to our eyes, looks like that physical truth. But upon examination you can figure out that these things break down. And I do think that any normal person with decent eyesight can zoom into these AI videos and figure that out.

Speaker 1

01:57

So Jeremy wants his social videos to be educational. He wants more people to get excited by what he calls pixel peeping, and he wants to improve people's media literacy and hopes his accounts can help people tune their AI vibe checker.

Speaker 2

02:11

I'm not naive to the fact that people aren't going to be pixel peeping on the videos that they watch. So it's just about trying to tune people's initial impressions so that they have something in their head that says, hey, something might not be right here, and then they can use, hopefully, other media skills that I teach them in order to dive a little bit deeper.

Speaker 1

02:28

I talked to Jeremy about so many things: how video generation tools work, how to pick up on AI tells, why Sora is an inflection point for the Internet, and what this signals for the future of social media. I started out by asking Jeremy to clarify what Sora is and what it does.

Speaker 2

02:47

So, Sora was originally released as OpenAI's first video model. In October 2025, they reused the Sora name to launch their social media app. A lot of the hype has been around the Sora app, which is currently invite only, and then there's the Sora 2 model that you can already access if you have API access or if you're a developer. Or even if you're a normal person, there are tools that let you generate a video with the

03:14

Sora 2 video model without an invite. The Sora app experience is very unique in some ways and very familiar in others. It does feel like a TikTok For You page just for AI videos. You can scroll, and it has an algorithm to suggest videos. But what's gotten a lot of the attention is the ability to cameo someone. But really, these are just deepfakes. Like, you're creating deepfakes of your friends. You're creating deepfakes of whoever lets you create a deepfake of them. And you have

03:43

different levels of permissions. So, for example, Jake Paul and Sam Altman let anyone deepfake them, whereas I let no one deepfake me because I'm not comfortable with that.

Speaker 1

03:54

What does it look like to let someone deepfake you on Sora?

Speaker 2

03:57

It looks like a version of you doing whatever they prompted you to do. Now, there are safety features in place, so you can't have them do anything violent, you can't do anything sexual. But it's really up to OpenAI to set those boundaries. And I don't think it's completely accurate. I've made versions of myself that I think don't look very much like me. I've made other versions

04:19

of myself that look a lot like me. That's really up to luck, because as we'll learn, these models aren't deterministic. There's a part of this that is random, so it's not repeatable. So Jake Paul is a very good example. There are a ton of AI videos of Jake Paul right now. All of them look a little bit different, but have his likeness, so you have to give permission for someone to make a video of you through the cameo feature.

Speaker 1

04:44

So would you say that AI video generation scares you, Like, is it something that keeps you up at night?

Speaker 2

04:51

It's not, because I'm doing something about it now, but it really was, and I think it is keeping people up at night, because so much of our time is spent on these short-form video platforms, like, for better or worse. I do think that it is the primary way that people get information now. That was probably never the best format for that information in the first place, but here we are. So I think what keeps me up is really general media literacy skills, and I think

05:16

of AI video as an extension of that. A lot of people are kept up by what I think are irrational fears about AI video. Like, in my opinion, it's probably not going to be framing you for a crime anytime soon, but it might turn the court of public opinion against you. It might be spreading disinformation.

Speaker 1

05:34

Like.

Speaker 2

05:34

It's an extension of other media literacy problems, and it's a very believable one, because when people are scrolling, they're just there to tune out and scroll. They're not there to pixel peep and really pay attention.

Speaker 1

05:46

Right. I mean, you don't think that we are living in a world where soon people could be framed for something they didn't do using manipulated video?

Speaker 2

05:54

Well, I think that. I'm not a lawyer, but I've done some looking into this, and the reality is that in order for something to be admitted into evidence, at least in the United States, it has to have an extensive metadata trail. It has to be authenticated. You have to get the person who filmed the video into the courtroom to say that they filmed it. And we have to understand that while our perception might be getting tricked, there are procedural and mathematical ways that these can be detected.

06:22

So it is not undetectable yet. And anyone who says it's undetectable is probably either selling you something or doesn't have a good eye. And anyone who says it will be undetectable does not know that, and frankly doesn't understand the technology that's making these AI videos very well, in my opinion.

Speaker 1

06:36

And right now, your likeness is not shared?

Speaker 2

06:41

No, I have a strong, strong bias against this because I believe that once your likeness gets out there and is deepfakable, so to speak, it's really hard to pull that back. Not because you can't, like, you can tell people to stop, but once it's out there, I think you lose a sense of trust. It's a line that I just don't want to cross. I'm not comfortable crossing it, and I've actually told my followers I will never cross that line because it's just not what I'm interested in.

Speaker 1

07:10

So I was hoping that you could show me how to make a video using the Sora app.

Speaker 2

07:15

Sure, so this is the Sora desktop app. It is not the vertical experience that you have, you know, on the phone. It is, however, showing a lot of the same content. So this is essentially the For You page of Sora. And the thing to note here is that there are Sora watermarks over each one of these videos.

07:33

In the mobile experience, those watermarks go away, but they don't let you screen record in the mobile version, whereas theoretically anyone could do what I'm doing right now. Like, I can share my screen here, I could record my screen. When you see Sora videos on social media, this is how they're being made.

Speaker 1

07:49

So let's try to make a Sora video. Let's do skiing with candy.

Speaker 2

07:56

Skiing with candy. You want me to just say that and see what it comes up with? Yes, let's do it. I think that's a great idea.

Speaker 1

08:03

Why do you think it's a good idea?

Speaker 2

08:04

Because, so, something that people aren't talking enough about with Sora is that you can have a very simple prompt and it can come up with something really creative. That's really what, in my opinion, distinguishes it from other video models. Google Veo 3 was how a lot of AI content was made a few weeks ago. If you don't give Google Veo 3 a good prompt, it's just boring, whereas Sora will go through some attempts to at least make it entertaining anyway.

Speaker 1

08:33

It's just incredible to me that in a given three weeks the world sort of changes.

Speaker 2

08:38

I think that there is a misconception that the world just changed because video AI made a huge, undetectable leap. It did make a step towards more realism. What Sora 2 really improved were a lot of the human parts of video AI, such as hand movement, or if they have a missing limb, or if their teeth look weird, or if their eyes look uncanny, or hair. Like, there were all these little things that people would pick up on, again, a lot of them subconscious. Sora made a step towards

09:14

improving those things. It still has a lot of background issues. It is actually a noisier or muddier looking model, in my opinion, than Veo, but a lot of people aren't looking for that. A lot of the videos that go viral that are AI generated are security cams, are body cams, are GoPro-looking cameras, things that people aren't looking at every day. But it really made improvements in

09:41

how good the outputs are to watch, like, story-wise. If you were to release Google Veo 3 as a social media app, it would fail just entirely, because people would get on there and, unless you're a good prompter, you're not going to come up with anything interesting. Sora made it so that, for anyone getting into AI, it's possible to come up with something interesting with a very basic prompt. That's a really, really big innovation that they didn't talk about.

10:09

But I think that's why it's had such an impact: there's a huge volume of somewhat meaningful Sora videos out there, whereas there really wasn't with Veo when that came out. So, all right, it came up with skiing with candy. Let's see what it did here.

Speaker 1

10:25

Look what.

Speaker 2

10:27

Go mid slip snack classy and.

Speaker 3

10:29

A peppermint for the win.

Speaker 1

10:31

Nothing like sweet fuel to keep the turns smooth. Catch you at the bottom.

Speaker 2

10:35

All right? What are your impressions?

Speaker 1

10:37

I just don't... I'm sorry, this is... Is it okay that this is blowing my mind?

Speaker 2

10:42

It should.

Speaker 1

10:43

Okay, good. It should blow your mind. Because I feel daft, like I can't wrap my head around this. I'm assuming this woman in the video with her ski mask on is not a real person?

Speaker 2

10:54

No, she's not a real person. And we don't know how they invented her. They just came up with that.

Speaker 3

10:58

What so.

Speaker 2

11:00

So there are things about this that stick out to me as obvious AI video. And then there are things about this that I just have to say, wow, that is incredible. So if I can just explain what I see here someone who watches these, so please. She starts out by skiing down the hill, but she's kind of skiing like it's snowboarding. Then she stops. She has some peppermints in her hand, she has some bags of candy in her hand, and there are some weird things going

11:27

on here. But what it did is, without any input, it basically made a social media video with it. It's like she's promoting this candy. There's someone responding to her in the background. It invented a straw for her, exactly. She talks like an influencer.

Speaker 1

11:45

I just... it really trips me up that she's not a real person, that this person does not exist in the world. It's really weird.

Speaker 2

11:51

Same. I mean, I have to tell myself it's not a real person.

Speaker 1

11:54

I mean, it would be like if you didn't exist.

Speaker 2

11:56

Yeah, that's the thing: it visually feels the same as talking to another person online. Of course, there are tells, so I'll get into those. So first of all, you have just the context. Why is she skiing down the hill with a bag of candy, and why is she just putting it in her mouth with the wrappers? Then there are some artifacts that I can see, especially at the beginning of the generation. Her jacket and her pants are incredibly pixelated when it starts.

12:24

But the other thing here is that it's very noisy. If we actually zoom in there's a lot of artifacts in the mountains back there.

Speaker 1

12:34

It is weird how she's eating the candy. That's a little uncanny.

Speaker 2

12:37

It's weird. Yeah, she's eating wrapped candy, and the bag there is just stuck to her knee. Yeah, you know, so at first it's a ziplock bag, then it's not a ziplock bag, then it sticks to her knee. Her feet are backwards, like, her foot there is literally backwards. In this version, she doesn't have a foot. Like, you know, you get into it. It's kind of funny.

Speaker 1

12:58

But this is why you have such a large platform, because, like, I look at this at first and I'm like, oh, it's perfect. Like, in a way, if I see the trappings of what I think I'm seeing, I don't really look for the detail that's wrong.

Speaker 2

13:11

Especially when you're just scrolling on TikTok or Instagram. You're not looking for anything wrong.

Speaker 1

13:15

Right, which is how they want you to look at it, or scrolling on Sora.

Speaker 2

13:18

Or scrolling on Sora. A lot of them are leaving Sora and making it out to all these platforms. Yeah, you're not going to be looking for these things. I'm totally aware of that. I mean, on first watch, are you gonna pick out everything that's wrong with this? No. But if you watch it five times and start zooming in, you're gonna start noticing that her feet are literally backwards. So yeah, when it comes down to it, I think what's really very important about Sora is that it did

13:44

all that work for you. You didn't need to know how to prompt the video AI. If you were to put skiing with candy into Google Veo, it's just going to be boring. I'll just tell you that right now.

Speaker 1

13:56

So if I wanted this video, this exact video, from Veo 3, what would I have to prompt it to do?

Speaker 2

14:04

You'd have to act like a camera director. You'd have to say, video starting with a woman skiing down the slope. She is wearing a pink and yellow top, a turquoise bottom. She's holding a bag of candy in her right hand, peppermints in her left hand. And you'd have to go shot by shot to give it that. I can actually show you something that I came up with that more clearly demonstrates

14:27

this point. So this is a video I made yesterday with the prompt "epic anime of Diego Maradona scoring a goal in the World Cup."

Speaker 1

14:36

Weaving past one, still going, two defenders beaten. He won't announcers.

Speaker 2

14:42

This is him dribbling through an entire defense. It is an epic-looking anime. Anime people would say it doesn't look great, but normal people probably wouldn't notice it. And what blew me away about this is that it created Diego Maradona's most famous goal and it added the announcers. I didn't tell it to do any of that. Now, if I compare that to what Google Veo did with the exact same prompt, it did this.

Speaker 1

15:15

This one is B-team.

Speaker 2

15:16

It is. The quality of the video is actually better, but it didn't make it interesting. So again, that's why you're seeing so much Sora: you don't need to be very creative.

Speaker 1

15:27

What are the implications of a social media app being designed and housing videos full of fake people? Like it's just crazy to me that I can watch a video of someone who doesn't exist.

Speaker 2

15:37

I think that we don't know the implications, and I would push back on it being like our inevitable future a bit, but I would say that it is normalizing deep faking, and I don't think we know what that will mean for us. But I don't think it'll be good. I think it might be entertaining, I think it might be interesting. It is certainly a technical achievement, but I don't consider it to be a technological advancement. I'm not

16:07

so sure it is progress. But it is a pretty incredible thing that they've been able to pull off, and I think that it is rational for people to look at these videos and be pretty freaked out. And that's what a lot of my comments are, because what isn't clear is how this is going to improve social media in any way, to improve our media literacy skills in any way.

16:32

There are definitely tech advancements here that can improve advancements towards artificial general intelligence. Like, there are technical reasons that this could be helpful in the future. But the step that OpenAI took to release this in a social media app was a huge jump, in my opinion, in the wrong direction. But the technology is here to stay for sure.

Speaker 1

17:03

After the break, will we become desensitized to deepfakes?

Speaker 3

17:08

Stay with us.

Speaker 1

17:27

One thing that I can't really get over about Sora 2 is that Sam Altman is letting anybody use his likeness. He opened his likeness to any Sora user, so I could say "Sam Altman building a snowman," for example. Why do this, like, as the head of the company?

Speaker 2

17:45

I can only guess. I think that it is generally just an attempt at normalizing deepfaking people, and I think people should be really scared of crossing that line. I think it's a serious thing to do, and I think OpenAI pushing everyone in that direction before anyone was even asking for it is really frightening. You could create deepfakes of people before; there was the technology to do it. There was a lot of friction and social pressure not to do it. That friction was helpful in keeping our

18:18

information economy healthy. Even with safety features on the Sora app, like letting you set permissions, people are gonna mess that up. People won't know that they can be deepfaked, and of course that's their responsibility to know. But you've just opened up an entire can of worms. There are other issues here. Like, currently you can't delete your Sora account without deleting your entire ChatGPT account.

Speaker 3

18:40

Wow.

Speaker 2

18:41

And again, like, you can't pull this back. Like, in theory you could stop people. But if you are a public figure and you open up this can of worms, it could really backfire. So it's Sora accelerating this deepfake idea into a space that just hasn't been that fully explored yet. And I don't think I'd want to be an early adopter of this, because there's a lot of negative, like, downside risk that I just don't think we've figured out yet.

Speaker 1

19:06

So you have a video where you talk about how Sora is actually costing OpenAI about one dollar per post. Can you explain that calculation and what it means for Sora long term?

Speaker 2

19:17

This was an educated guess that ended up being right. Every video you create is basically on OpenAI's dime. So, for example, two weeks ago, if I, as a creator, wanted to post an AI video to TikTok or Instagram, I would have to pay a subscription to make that video and download it, or pay per post. So there are commodity prices for these video models. For Google Veo it's a dollar fifty to three dollars. Sora is currently

19:46

around a dollar. But the Sora application is free, and anytime you create an AI video on that, it is free to you. So as always I would ask the question, if it is free, are you the product? And in this case they are taking your data, they're taking your face scans, they're taking your prompts. Right, so there's that question of why are they doing this? Of course they're

20:08

also doing it to get users. But imagine you were TikTok or Instagram and every single time someone posted a video on your site you needed to pay a dollar. How quickly is that going to add up? For Sora?

Speaker 1

20:21

Very quickly.

Speaker 2

20:22

Would advertisers be able to make up that difference? Are you going to need subscribers to help make up that difference? I mean, video takes a ton of compute. It is costing them GPU compute, it is costing them

20:35

opportunity costs. The GPUs could be used for other things, right, So the fact that they chose a video social media app where every time someone posts on your platform it's costing you money is pretty confusing to me as someone who understands that those advertiser clicks are not even close to worth that much.

Speaker 1

20:53

My question is, if you're Sam Altman, you oversee the most popular AI tool on the market, why are you going into social media?

Speaker 2

21:06

You're asking the right question, one that I think even OpenAI's own employees are asking. There has been some reporting on even OpenAI people being confused by this. At the end of the day, TikTok is releasing an AI generator, I get ads for that all the time. YouTube is putting Google Veo 3 into YouTube Shorts. Everyone's looking at this as, how do we build, like, the AI video feed? And it appears to me the rationale would be to generate some sort of advertiser revenue. I think that would

21:38

be the simple answer. But whether or not that actually works is a huge open question.

Speaker 1

21:44

So in the future, say, Sora the app is running ads between videos?

Speaker 2

21:48

Yeah, absolutely.

Speaker 1

21:51

Interesting. So in one of your videos, you say that AI will end social media. What do you mean by that?

Speaker 2

21:57

I think it has the potential to end the For You page as we know it, unless the social media companies figure out a way to filter AI content. Again, we do not know how people are going to react to this when it's deployed much wider. But it is a rational thing to not want to only see AI slop in your feed. And I say AI slop because it's bad. Let's assume even that it's better. Let's assume

22:24

that AI video were indistinguishable. If that were the case, would you actually want more of it in your feed, or would you want to turn it off even more. I don't think that we know the answers to these questions, but it's very likely that if companies that are running these platforms can't figure out a way to filter out AI content, there's a part of the population that's going to start tuning out. There's also advertisers that might be scared by that, So I do think it's an existential

22:55

threat to the For You page. I think it actually might be a boon for the subscriber or Substack-type communities. Like, I think that's interesting, when people start rushing towards people that they trust. I think that that could be a really, really positive thing. I'll say for me, one of the things that I would be looking at if I were an AI creator is the fact that because Sora 2 is so good at making videos, it lowered the barrier of entry so far that I don't think OpenAI

23:22

is that far from generating their own feed. You know, if you can make an interesting video with only two sentences, well, ChatGPT can make two sentences. They're collecting everyone's prompts, they're seeing what gets likes and engagement on Sora. I don't understand why they would need a human in the loop soon.

Speaker 1

23:41

I believe there's... Actually, we were just covering a story in the Financial Times about Gen Z being less on social media, and I think a lot of it has to do with the sort of enshittification of the feed. And I see a lot of people kind of resigned to the fact that going on Instagram means scrolling through a lot of shit, and a lot of shit that's AI generated. It's no longer social media. It's like watching fake video. Yeah, it's hyper-enshittification. It is the

24:09

most enshittified feed you could possibly have. And I am totally agreeing that there will be people who are super down with that and who are going to enjoy it.

Speaker 2

24:20

Again, there are people who enjoy this. I don't want to say that they're doing the wrong thing by enjoying AI video.

Speaker 1

24:26

A fruit cutting another fruit, something like that.

Speaker 2

24:29

Yeah. Like, I'm not here to judge what people are watching. But if you play this out to its logical conclusion here, it looks like social media companies generating their own videos, without creators in the middle, for a hyper-enshittified feed.

Speaker 1

24:47

So five to ten years is a huge difference. So let's just say, five years from now, what do you think the state of AI video looks like, and what does it mean for the Internet, for politics, and just us generally as a culture.

Speaker 2

25:00

If we project the current growth out, it is indistinguishable and everywhere. If we take a contrarian view, we can see that people might not be into it and it might lose a lot of money. We don't know which direction it's going to go, and I don't claim to be able to tell which direction we're going in. But in that first scenario where it's indistinguishable, it'll still be distinguishable by machine learning algorithms. It'll still be detectable by experts.

25:30

I still don't think it presents legal problems, but it presents massive disinformation problems. I'm very scared about that. And then there's another scenario which I think is a little bit more optimistic, which I actually subscribe to, which is that AI content becomes its own genre. There are companies that figure out a way to monetize it. It stays separate from our real feeds to whatever degree the viewer wants.

25:58

And I think that this is the optimistic vision that a lot of the tech community believes in too, and that Sam Altman would probably say. You know, he's been asked about this. He's been asked, how do we tell what's real or fake? And I actually didn't hate his answer. He said, well, just like we've always told: we follow the people we trust. Like, we have human

26:13

communication networks. Now, I think that his accelerationist view is kind of running against that a little bit, but I do believe that at its core, that's how we're going to figure this out, and it might push people less online. Like I just think that there's just so many unanswered questions. But yeah, there's a few different scenarios that right now, I think we just have to flip a coin on which one we believe in.

Speaker 1

26:37

So you said the reason that you got interested in understanding AI video was as a tool for production. When that was the case, what were you excited about and sort of why has that now changed for you.

Speaker 2

26:50

I was excited about it lowering the grounds to doing creative things. I have a green screen studio in my basement. I was excited about it, you know, putting me in different types of studios and different types of environments. I was excited about it improving my graphics workflows. What started steering me away from it was some of the ethical concerns. I did realize that at the end of

27:12

the day, like this was mostly stolen information. It was actually not that much more useful than the actual room I'm in right now, Like I can make a decent studio myself. And really what made me turn was just using the tools. I think a lot of the people who are using them, who come from my background, realize that they aren't very fun tools to use. It's not a creative process for me. It's really frustrating.

Speaker 1

27:42

Well, you just type something in.

Speaker 2

27:43

You just type something in, and you hope it comes back the way you want it. Because I have a history as a director, it is like every time I needed to tell the actor exactly what to say, exactly how to deliver it, over and over and over. And as a creative person and as a director, I just want to collaborate with people who bring something to the table. I don't want to bring everything to the table myself. I don't want to tell everyone how

28:06

to do everything, right? That's not what the process of creating ever was. It was always about collaboration. It was always a fun process. I find the idea of just sitting in my basement creating AI videos with text is just, it's exhausting. It doesn't feel creative at all. But I'm not saying that people should hate every AI video they see. Like, some of them can be creative. But yeah, it's just taking that opportunity to train yourself

28:32

to see what these video models look like. Because if you're into it, that's totally fine, but then you're at least ready for when it is used for disinformation, which I think is inevitable at this point.

Speaker 1

28:41

Well, thank you so much, Jeremy. I will be tuned into your feed. You are... I don't know what I would call you. Is it vigilante justice? I don't think so. But you're doing some kind of public service education.

Speaker 3

28:55

You're an educator, Yeah, there you go.

Speaker 1

28:57

You're an AI educator.

Speaker 3

28:58

Yeah.

Speaker 1

29:22

For TechStuff, I'm Karah Preiss. This episode was produced by Eliza Dennis, Melissa Slaughter, and Tyler Hill. It was executive produced by me, Oz Woloshyn, Julia Nutter, and Kate Osborne for Kaleidoscope and Katrina Norvell for iHeart Podcasts. Kyle Murdoch mixed this episode and wrote our theme song. Join us on Friday for The Week in Tech. Oz and I will run through the headlines you may have missed. Please rate, review, and reach out to us at techstuffpodcast@gmail.com.

Transcript source: Provided by creator in RSS feed
