Pushkin, you're listening to Brave New Planet, a podcast about amazing new technologies that could dramatically improve our world, or, if we don't make wise choices, could leave us a lot worse off. Utopia or dystopia, it's up to us. On July sixteenth, nineteen sixty nine, Apollo eleven blasted off from the Kennedy Space Center near Cape Canaveral, Florida.
Twenty five million Americans watched on television as the spacecraft ascended toward the heavens, carrying commander Neil Armstrong, Lunar Module pilot Buzz Aldrin, and Command Module pilot Michael Collins. Their mission: to be the first humans in history to set foot on the Moon. Four days later, on Sunday, July twentieth, the lunar module separated from the command ship and soon fired its rockets to begin its lunar descent. Five minutes later,
disaster struck about a mile above the Moon's surface. Program alarms twelve oh one and twelve oh two sounded loudly, indicating that the mission computer was overloaded. And then, well, every American knows what happened next. Good evening, my fellow Americans. President Richard Nixon addressed a grieving nation. Fate has ordained that the men who went to the Moon to explore in peace will stay on
the Moon to rest in peace. These brave men, Neil Armstrong and Edwin Aldrin, know that there is no hope for their recovery, but they also know that there is hope for mankind in their sacrifice. He ended with the now famous words: for every human being who looks up at the Moon in the nights to come will know that there is some corner of another world that is forever mankind. Wait a minute, that never happened. The Moon mission was
a historic success. The three astronauts returned safely to ticker tape parades and a celebratory thirty-eight-day world tour. Those alarms actually did sound, but they turned out to be harmless. Nixon never delivered that speech. His speechwriter had written it, but it sat in a folder labeled In Event of Moon Disaster until now. The Nixon you just heard is a deep fake, part of a seven minute
film created by artificial intelligence deep learning algorithms. The fake was made by the Center for Advanced Virtuality at the Massachusetts Institute of Technology as part of an art exhibit to raise awareness about the power of synthesized media. Not long ago, something like this would have taken a lot
of time and money. But now it's getting easy. You can make new paintings in the style of French Impressionism, revive dead movie stars, help patients with neurodegenerative disease, or soon maybe take a class on a tour of ancient Rome. But as the technology quickly becomes democratized, we're getting to the point where almost anyone can create a fake video of a friend, an ex lover, a stranger, or a public figure that's embarrassing, pornographic, or perhaps capable
of causing international chaos. Some argue that in a culture where fake news spreads like wildfire and political leaders deny the veracity of hard facts, deep fake media may do a lot more harm than good. Today's big question: will synthesized media unleash a new wave of creativity, or will it erode the already tenuous role of truth in our democracy? And is there anything we can do to keep it in check? My name is Eric Lander. I'm a scientist
who works on ways to improve human health. I helped lead the Human Genome Project, and today I lead the Broad Institute of MIT and Harvard. In the twenty first century, powerful technologies have been appearing at a breathtaking pace related to the Internet, artificial intelligence, genetic engineering, and more. They have amazing potential upsides, but we can't ignore the risks that come with them. The decisions aren't just up to
scientists or politicians. Whether we like it or not, we, all of us, are the stewards of a brave new planet. This generation's choices will shape the future as never before. Coming up on today's episode of Brave New Planet, I speak with some of the leaders behind advances in synthesized media. You could, certainly, by the way, generate stories that could be fresh and interesting and new and personal for every child. We got emails from people who were quadriplegic and they
asked us if we could make them dance. We hear from experts about some of the frightening ways that bad actors can use deep fakes. Redditors would chime in and say, you can absolutely make a deep fake sex video of your ex with thirty pictures. I've done it with twenty.
Here's the thing that keeps me up at night, right: a video of Donald Trump saying I've launched nuclear weapons against Iran, and before anybody gets around to figuring out whether this is real or not, we have global nuclear meltdown. And we explore how we might prevent the worst abuses. It's important that younger people advocate for the Internet that they want. We have to fight for it. We have to ask for different things. Stay with us. Chapter one,
Abraham Lincoln's Head. To begin to understand the significance of deep fake technology, I went to San Francisco to speak with a world expert on synthetic media. My name is Alexei, or sometimes Alyosha, Efros, and I'm a professor at UC Berkeley in the Computer Science and Electrical Engineering Department. My research is on computer vision, computer graphics, machine learning, various aspects of artificial intelligence. Where'd you grow up? I
grew up in Saint Petersburg in Russia. I was one of those geeky kids playing around with computers or dreaming about computers. My first computer was actually the first Soviet personal computer. So you actually are involved in making sort of synthetic content, synthetic media? That's right. Alexei has invented powerful artificial intelligence tools, but his lab also explores the ability to use computers to enhance the human experience. I was struck by a remarkable video on YouTube created
by his team at Berkeley. So this was a project that actually was done by my students, who didn't even think of this as anything but a silly little toy project of trying to see if we could get a geeky computer science student to move like a ballerina. In the video, one of the students, Caroline Chan, dances with the skill and grace of a professional despite never having studied ballet. The idea is, you take a source actor
like a ballerina. There is a way to detect the limbs of the dancer, have a kind of a skeleton extracted, and also have my student just move around and do some geeky moves. And now we're basically just going to try to synthesize the appearance of my student driven by the skeleton of the ballerina. Put it all together, and then we have our grad student dancing pirouettes like a ballerina. Through artificial intelligence, Caroline's body is puppeteered by the dancer.
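To make that pipeline concrete, here is a minimal sketch of the idea in Python with PyTorch. It is not the Berkeley team's actual code: the toy network and the random tensors standing in for skeleton renderings and video frames are illustrative assumptions (the real system uses an off-the-shelf pose estimator and a pix2pix-style GAN).

```python
# Sketch of pose-guided video synthesis, the "Everybody Dance Now" idea.
# Illustrative only: a toy encoder-decoder stands in for the real GAN,
# and random tensors stand in for rendered skeletons and video frames.
import torch
import torch.nn as nn

class PoseToAppearance(nn.Module):
    """Maps a rendered skeleton image to a photo of the target person."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, skeleton):
        return self.net(skeleton)

model = PoseToAppearance()
opt = torch.optim.Adam(model.parameters(), lr=2e-4)

# Training pairs: frames of the TARGET person with their own extracted skeletons.
skeleton = torch.randn(1, 3, 64, 64)  # stand-in for a rendered skeleton frame
frame = torch.randn(1, 3, 64, 64)     # stand-in for the matching video frame
loss = nn.functional.l1_loss(model(skeleton), frame)
opt.zero_grad(); loss.backward(); opt.step()

# Transfer: feed skeletons extracted from the SOURCE dancer instead, and the
# network renders the grad student performing the ballerina's moves.
fake_frame = model(torch.randn(1, 3, 64, 64))
```

The key design point is that the network only ever learns one person's appearance; the skeleton is the neutral intermediate that lets any dancer's motion drive that appearance.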
We weren't even going to publish it, but we just released a video on YouTube called Everybody Dance Now, and somehow it really touched a nerve. Well, there's been an explosion recently in new ways to manipulate media. Alexei notes that the idea itself isn't new; it has a long history. I can't help but ask, given that you come from Russia: one of the premier users of doctored photographs, I think, was Stalin, who used the ability to manipulate images for
political effect. How did they do that? Can you think of examples of this, and, like, what was the technology then? The urge to change photographs has been around basically since the invention of photography. For example, there is a photograph of Abraham Lincoln that still hangs in many classrooms. That's fake. It's actually Calhoun with Lincoln's head attached to it. Alexei's referring to John C. Calhoun, the South Carolina senator and
champion of slavery. A Civil War portrait artist superimposed a photo of Lincoln's head onto an engraving of Calhoun's body because he thought Lincoln's gangly frame wasn't dignified enough. And so they just said, okay, we can use Calhoun, let's slap Lincoln's head on his body. And then, of course, as soon as you go into the twentieth century, as soon as you get to dictatorships, this is a wonderful toy for a dictator to use. So again, Stalin was a
big fan of this. He would get rid of people in photographs once they were out of favor, or once they got jailed or killed. He would just basically get them scratched out with reasonably crude techniques. Hitler did it, Mao did it, Castro did it, Brezhnev did it. I'm sure US agencies have done it also. We have always manipulated images with a desire to change history. This is Hany Farid. He's also a professor at Berkeley and
a friend of Alexei's. I'm a professor of computer science and I'm an expert in digital forensics. Where Alexei works on making synthetic media, Hany has devoted his career to identifying when synthetic media is being used to fool people, that is, spotting fakes. He regularly collaborates on this mission with Alexei. So I met Alyosha Efros ten, twenty years ago.
He is a really incredibly creative and clever guy, and he has done what I consider some of the most interesting work in computer vision and computer graphics over the last two decades. And if you really want to do forensics well, you have to partner with somebody like Alyosha. You have to partner with a world class mind who knows how to think about the synthesis side, so that you can synthesize the absolute best content and then think about how
to detect it. I think it's interesting that if you're somebody on the synthesis side and developing the forensics, there's a little bit of a Jekyll and Hyde there, and I think it's really fascinating. You know, the idea of altering photos, it's not entirely new. How far back does this go? So we used to have, in the days of Stalin, a highly talented, highly skilled, time consuming, difficult process of manipulating images: removing somebody, erasing something from the image, splicing faces together.
And then we moved into the digital age, where now a highly talented digital artist could remove one face and add another face, but it was still time consuming and required skill. In nineteen ninety four, the makers of the movie Forrest Gump won an Oscar for visual effects for their representations of the title character interacting with historical figures like President John F. Kennedy. Congratulations. How do you feel? I gotta pee. Now computers are doing all of the heavy lifting of what used to be relegated to talented artists. The average person now can use sophisticated technology to not just capture the recording,
but also manipulate it and then distribute it. The tools used to create synthetic media have grown by leaps and bounds, especially in the past few years, and so now we have technology broadly called deep fake, but more specifically should be called synthesized content, where you point an image or a video or an audio to an AI or machine learning system and it will replace the face for you. I mean it can do that in an image, it can do that in a video, or it can synthesize
audio for you in a particular person's voice. It's become straightforward to swap people's faces. There's a popular YouTube video that features tech pioneer Elon Musk's adult face on a baby's body, and there's a famous meme where actor Nicolas Cage's face replaces those of leading movie actors, both male and female. You can put words into people's mouths and make them jump and dance and run. You can even resurrect powerful figures and have them deliver a fake speech
about a fake tragedy from an altered history. Chapter two, Creating Nixon. The text of Nixon's Moon disaster speech that we heard at the top of the show is actually not fake. As I mentioned, it was written for President Nixon as a contingency speech and thankfully never had to be delivered. It's an amazing piece of writing. It was written by Bill Safire, who was one of Nixon's speechwriters.
This is artist and journalist Francesca Panetta. She's the co-director of the Nixon fake for MIT's Moon Disaster team. She's also the creative director at MIT's Center for Advanced Virtuality. I was doing experimental journalism at the Guardian newspaper. I ran the Guardian's virtual reality studio for the last three years. The second half of the Moon Disaster team is sound
artist Halsey Burgund. My name is Halsey Burgund. I am a sound artist and technologist, and I've had a lot of experience with lots of sorts of audio enhanced with technology, though this is my first experience with synthetic media, especially since I typically focus on authenticity of voice and now I'm kind of doing the opposite. So together, Halsey and Francesca chose to simulate a tragic moment in history that
never actually happened. I think it all started with it being the fiftieth anniversary of the moon landing last year, and add on top of that an election cycle in this country, and dealing with disinformation, which is obviously very important in election cycles. It was like lightbulbs went on and we got very excited about pursuing it. It's possible to make mediocre fakes pretty quickly and cheaply, but
Francesca and Halsey wanted high production values. So how does one go about making a first rate fake presidential address? There are two components: there's the visuals and there's the audio, and they are completely different processes. So we decided to go with a video dialogue replacement company called Canny AI, who would do the visuals for us, and then we decided to go with Respeecher, who are a dialogue replacement company, for the voice of Nixon. They tackled the
voice first, the more challenging of the two mediums. What we were told to do was to get two to three hours' worth of Nixon talking. That was pretty easy because the Nixon Library has hours and hours of Nixon, mainly giving Vietnam speeches. The Communist armies of North Vietnam launched a massive invasion of South Vietnam. That audio was then chopped up into chunks between one and three seconds long. We found this incredibly patient actor called Lewis D. Wheeler.
Lewis would listen to the one second clip and then he would repeat that and do what I believe was right. Respeecher would say to us things like, we need to change the diagonal attention, which meant nothing to us. Yes, we have a whole lot of potential band names going forward. Yeah, Synthetic Nixon is another good one. So once we have our Nixon model made out of these thousands of tiny clips, it means that whatever our actor says will come out
then in Nixon's voice. So then what we did was record the contingency speech of Nixon, and it meant that we got Lewis's actual performance but in Nixon's voice. What about the video part? I mean, the video was much easier. We're talking a couple of days here and a tiny amount of data, just with Lewis's iPhone. We filmed him reading the contingency speech once, a couple of minutes of him just chatting to camera, and that was it. Fate has ordained that the men who went to the Moon to explore
in peace will stay on. You know, we were told by Canny AI that everything would be the same in the video apart from just the area around the mouth. So every gesture of the hand, every blink, every time he moved his face, all of that would stay the same,
but just the mouth basically would change. So we used Nixon's resignation speech. To have served in this office is to have felt a very personal sense of... It was the speech of Nixon that looked the most somber, where he seemed to have the most emotion in his face. So what actually went on in the computer? Artificial intelligence sometimes sounds inscrutable, but the basic ideas are quite simple. In this case, it uses a type of computer program
called an autoencoder. It's trained to take complicated things, say spoken sentences or pictures, encode them in a much simpler form, and then decode them to recover the original as best it can. The encoder tries to reduce things to their essence, throwing away most of the information but keeping enough to do a good job of reconstructing it. To make a deep fake, here's the trick: train a speech autoencoder Nixon-to-Nixon, and a speech autoencoder actor-to-actor, but force them to use
the same encoder. Then you can input the actor and decode as Nixon. If you have enough data, it's a piece of cake.
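Schematically, the shared-encoder trick looks something like the following PyTorch sketch. This is a toy illustration under stated assumptions, not the Moon Disaster team's or Respeecher's actual code: tiny fully connected networks and random tensors stand in for real speech features and real models.

```python
# The shared-encoder deepfake trick in miniature (toy PyTorch sketch).
import torch
import torch.nn as nn

dim = 256                                        # toy size of a flattened input
encoder = nn.Sequential(nn.Linear(dim, 64), nn.ReLU())  # shared: the "essence"
decode_nixon = nn.Linear(64, dim)                # reconstructs Nixon only
decode_actor = nn.Linear(64, dim)                # reconstructs the actor only

params = (list(encoder.parameters()) + list(decode_nixon.parameters())
          + list(decode_actor.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

nixon_clips = torch.randn(8, dim)                # stand-ins for training clips
actor_clips = torch.randn(8, dim)

for _ in range(100):
    # Each decoder learns to rebuild its own person from the shared encoding.
    loss = (nn.functional.mse_loss(decode_nixon(encoder(nixon_clips)), nixon_clips)
            + nn.functional.mse_loss(decode_actor(encoder(actor_clips)), actor_clips))
    opt.zero_grad(); loss.backward(); opt.step()

# The swap: encode the ACTOR's performance, but decode it as NIXON.
fake_nixon = decode_nixon(encoder(actor_clips))
```

Because both identities are forced through the same encoder, the encoding captures what is said and how, while each decoder supplies whose face or voice says it.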
Around their carefully created video, the Moon Disaster team created an entire art installation: a nineteen sixties living room with a fake vintage newspaper sharing the fake tragic news while a fake Nixon speaks solemnly on a vintage black and white television. Some people, when they were watching the installation, watched it a number of times. You'd see them: they'd watch it once, then they would watch it again, staring at the lips to see if they could see any lack of synchronicity. We had some people who thought that perhaps Nixon had actually recorded this speech as a
contingency speech for it to go onto television. Lots of folks who were listening, viewing, and even press folks just immediately said, oh, the voice is real, or whatever. They said these things that weren't accurate because they just felt like there wasn't even a question. I suppose that is what we wanted to achieve, but at the same time, it was a little bit eye opening and, like, a little scary, you know, that that could happen. Chapter three, Everybody Dance. What do you see as just the wonderful
upside of having technologies like this? Yeah, I mean, AI in art is becoming a whole field in itself, so creatively, there is enormous potential. One of the potential positive educational uses of deep fake technology would be to bring historical figures back to life to make learning more durable. I think one could do that with bringing Abraham Lincoln back to life and having him deliver speeches. Film companies are
really excited about re-enactments. We're already beginning to see this in films like Star Wars, where we're bringing people like Carrie Fisher back to life. I mean, that is at the moment not being done through deep fake technologies; this is using fairly traditional techniques of CGI at the moment. So we still have to see our first deep fake big cinema screen release. But this is yet to come,
like, the technology is getting better and better. Not only will we be able to potentially bring back actors and actresses who are no longer alive and have them star in movies, but an actor could make a model of their own voice and then sell the use of that voice to anybody to do a voiceover of whatever is wanted, and so they could have twenty of these going on at the same time, and the sort of restriction
of their physical presence is no longer there. And that might mean that, you know, Brad Pitt is in everything, or it might just mean that lower budget films can afford to have some of the higher cost talent. At that point, you know, the top twenty actors could just do everything. Yes, there's no doubt that there will be winners and losers from these technologies, but the potential of synthetic media goes way beyond the arts. There are possible
medical and therapeutic applications. There are companies that are working very hard to allow people who have either lost their voice or who never had a voice to be able to speak in a way that is either how they used to speak or in a way that isn't a canned voice that everybody has. Alexei Efros and his students discovered potential uses of synthetic media in medicine quite unintentionally while working on their Everybody Dance Now project that could
turn anyone into a ballerina. We were kind of surprised by all the positive feedback we got. We got emails from people who were quadriplegic and they asked us if we could make them dance, and it was very unexpected. So now we are trying to get the software to be in a state where people can use it, because, yeah, somehow it did hit a nerve with folks. Chapter four, Unicorns in the Andes. The past few years have seen amazing advances in the creation of synthetic media through
artificial intelligence. The technology now goes far beyond fitting one face over another face in a video. A recent breakthrough has made it possible to create entirely new and very convincing content out of thin air. The breakthrough, called generative adversarial networks or GANs, came from a machine learning researcher at Google named Ian Goodfellow. Like autoencoders, the basic idea is simple but brilliant. Suppose you want to create
amazingly realistic photos of people who don't exist. Well, you build a GAN consisting of two computer programs: a photo generator that learns to generate fake photos, and a photo discriminator that learns to discriminate, or identify, fake photos from a vast collection of real photos. You then let the two programs compete, continually tweaking themselves to outsmart each other. By the time they're done, the GAN can generate amazingly convincing fakes.
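That competition can be written down in a few dozen lines. Here is a deliberately tiny PyTorch sketch of the training loop; the one-dimensional "photos" and the small networks are illustrative stand-ins, not the architecture behind any real face generator.

```python
# A GAN in miniature: a generator G and a discriminator D tweak themselves
# to outsmart each other. Toy data; real face GANs are vastly larger.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))  # noise -> fake
D = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))   # real or fake?
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for _ in range(200):
    real = torch.randn(32, 64) + 3.0        # stand-in for a batch of real photos
    fake = G(torch.randn(32, 16))
    # Step 1: train D to label real photos 1 and generated photos 0.
    d_loss = (bce(D(real), torch.ones(32, 1))
              + bce(D(fake.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Step 2: train G so that D labels its fakes as real.
    g_loss = bce(D(G(torch.randn(32, 16))), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

The generator never sees a real photo directly; it only learns from how well it fools the discriminator, which is why the fakes get sharper as the detector gets smarter.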
You can see for yourself if you go to the website thispersondoesnotexist.com. Every time you refresh the page, you're shown a new uncanny image of a person who, as the website says, does not and never did exist. Francesca and I actually tried out the website. This young Asian woman, she's got great complexion, envious of that, neat black hair with a fringe, pink lipstick, and a slightly dreamy look as she's kind of
gazing off to her left. Oh, here's a woman who looks like she could be a neighbor of mine in Cambridge, probably about sixty five. She's got nice wire framed glasses, layered hair. Her earrings don't actually match, but that could just be her distinctive style. I mean, of course, she doesn't really exist. It's hard to argue that GANs aren't creating original art. In fact, an artist collective recently used
a GAN to create a French Impressionist style portrait. When Christie's sold it at auction, it fetched an eye popping four hundred and thirty two thousand dollars. Alexei Efros, the Berkeley professor, recently pushed GANs a step further, creating something called CycleGANs. By connecting two GANs together in a clever way, CycleGANs can transform a Monet painting into what's seemingly a photograph of the same scene, or turn a summer landscape into a winter landscape of the same view.
Alexei's CycleGANs seem like magic. If you were to add in virtual reality, the possibilities become mind blowing. You may be reminiscing about walking down Saint-Germain in Paris, and with a few clicks, you are there, and you're walking down the boulevard, and you're looking at all the buildings, and maybe you can even switch to a different year. And I think that is, I think, very exciting as a way to mentally travel to different places. So if you do this in VR, I mean, can you imagine
classes going on a class visit to ancient Rome? That's right. You could imagine, from how a particular city like Rome looks now, trying to extrapolate to how it looked in the past. It turns out that GANs aren't just transforming images. I spoke with a friend who's very familiar with another remarkable application of the technology. My name is Reid Hoffman. I'm the podcaster of Masters of Scale. I'm a partner at Greylock, which is where we're sitting right now, co-founder of LinkedIn,
and then a variety of other eccentric hobbies. Reid is a board member of an unusual organization called OpenAI. OpenAI is highly concerned with artificial general intelligence, human level intelligence. I helped Sam Altman and Elon Musk stand it up. The basic concern was that if one company created and deployed that, that could be destabilizing in
all kinds of ways. And so the thought is, if it could be created, we should make sure that there is essentially a nonprofit that is creating this and that can make that technology available, at selective time slices, to industry as a whole, government, etc. Last year, OpenAI released a program that uses deep learning to write language from a short opening prompt. The system, called GPT two, can spin a convincing article or story. Instead of a deep
fake video, it's deep fake text. It's pretty amazing, actually. For example, OpenAI researchers gave the program the following prompt: In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English. GPT two took it from there, delivering nine crisp paragraphs on the landmark discovery.
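The full-powered model was withheld at first, but the smaller version OpenAI did release can be tried in a few lines. Here is a sketch using the Hugging Face transformers library, which hosts that public release; the sampling is random, so every run invents a different unicorn story.

```python
# Continue a prompt with the publicly released (smaller) GPT-2 model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = ("In a shocking finding, scientists discovered a herd of unicorns "
          "living in a remote, previously unexplored valley in the Andes "
          "Mountains. Even more surprising to the researchers was the fact "
          "that the unicorns spoke perfect English.")
result = generator(prompt, max_length=200, num_return_sequences=1)
print(result[0]["generated_text"])  # the model makes up the rest of the story
```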
I asked Francesca to read a bit from the story. Doctor Jorge Perez, an evolutionary biologist from the University of La Paz, and several companions were exploring the Andes Mountains when they found a small valley with no other animals or humans. Perez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow. Perez and the others then ventured further into
the valley. By the time we reached the top of one peak, the water looked blue, with some crystals on top, said Perez. Perez and his friends were astonished to see the unicorn herd. Tell me some of the great things you can do with language generation. Well, say, for example, entertainment: generate stories that could be fresh and interesting and new
and personal for every child. Embed educational things in those stories, so the child is drawn in by the fact that the story is involving them and their friends, but it also now brings in grammar and math and other kinds of things as they're doing it. Generate explanatory material, the kind of education that works best for this audience, for this kind of people. Like, we want to have this kind of math or this kind of physics, or this kind of history or this kind of poetry, explained in the right way,
and also in the style of language, right, like, you know, native to city X. When OpenAI announced its breakthrough program for text generation, it took the unusual step of not releasing the full powered version because it was worried about the possible consequences. Now, part of the OpenAI decision to say we're going to release a smaller model than the one we did is because we think that
the deep fake problem hasn't been solved. And by the way, some people complained about that, because they said, well, you're slowing down our ability to make progress. And part of the answer is to say, look, when these are released to the entire public, we cannot control the downsides as well as the upsides. From art to therapy to virtual time travel, personalized stories and education, synthetic media has amazing upsides. What could possibly go wrong? Chapter five, What could possibly
go wrong? The downsides are actually not hard to find. The ability to reshape reality brings extraordinary power, and people inevitably use power to control other people. It should be no surprise, therefore, that ninety six percent of fake videos posted online are non consensual pornography videos, almost always of women manipulated to depict sex acts that never actually occurred. I spoke with a professor who studies deep fakes, including
digital attempts to control women's bodies. I'm Danielle Citron and I am a law professor at Boston University School of Law. I write about privacy, technology, automation. My newest work, and my next book, is going to be about sexual privacy. So I've worked in and around consumer privacy, individual rights, civil rights. I write a lot about free speech and then automated systems. When did you first become aware of
deep fakes? Do you remember when this first crossed your radar? I did. So there was a Reddit thread devoted to, you know, fake pornography movies of Gal Gadot, Emma Watson. But the Reddit thread sort of spooled out, not just celebrities but ordinary people, and so you had redditors asking each other, how do I make a deep fake sex video of my ex-girlfriend? I have thirty pictures. And then other redditors would chime
in and say, look at this YouTube tutorial. You can absolutely make a deep fake sex video of your ex with thirty pictures. I've done it with twenty. In November two thousand seventeen, an anonymous redditor began posting synthesized porn videos under the pseudonym deepfakes, perhaps a nod to the deep learning technology used to create them as well as the nineteen seventies porn film Deep Throat. The Internet quickly adopted the term deep fakes and broadened its meaning
beyond pornography. To create the videos, he used celebrity faces from Google image search and YouTube videos and then trained an algorithm on that content together with pornographic videos. Have you seen deep fake pornography videos? Yes. So, still pretty crude, so you probably can tell that it's a fake, but
for the person who's inserted into pornography, it's devastating. You use the neural network technology, the artificial intelligence technology, to create out of digital whole cloth pornography videos, using probably real pornography and then inserting the person in the pornography so they become the female actress, if it's a female; it's usually a female in that video. My name is Noelle Martin and I am an activist and law reform campaigner in Australia. Noelle is twenty six years old and
she lives in Perth, Australia. So the first time that I discovered myself on pornographic sites was when I was eighteen and, out of curiosity, decided to Google image reverse search myself. In an instant, like in less than a millisecond, my life completely changed. At first, it started with photos, still images stolen from Noelle's social media accounts. They were then doctoring my face from ordinary images and superimposing those onto the bodies of women, depicting me having
sexual intercourse. It proved impossible to identify who was manipulating Noelle's image in this way. It's still unclear today, which made it difficult for her to seek legal action. I went to the police soon after. I contacted government agencies, tried getting a private investigator. Essentially, there's nothing that they could do. The sites are hosted overseas, the perpetrators are probably overseas. The reaction was, at the end of the day, I think you can contact the webmasters to try and
get things deleted. You know, you can adjust your privacy settings so that nothing is available to anyone publicly. It was an unwinnable situation. Then things started to escalate. In twenty eighteen, Noelle saw a synthesized pornographic video of herself. And I believe that it was done for the purposes of silencing me, because I've been very public about my story and advocating for change. So I had actually gotten an email from a fake email address, and, you know,
I clicked the link. I was actually at work. It was a video of me having sexual intercourse. The title had my name, the face of the woman in it was edited so that it was my face, and, you know, all the tags were like Noelle Martin, Australia, feminist. And it didn't look real, but the context of everything, with the title, my face, with the tags, all points to
me being depicted in this video. The fakes were of poor quality, but porn consumers aren't a discriminating lot, and many people reacted to them as if they were real. The public reaction was horrifying to me. I was victim-blamed and slut-shamed, and it's definitely limited the course of where I can go in terms of career and employment. Noelle finished a degree in law and began campaigning to
criminalize this sort of content. My advocacy and my activism started off because I had a lived experience of this, and I experienced it at a time where it wasn't criminalized in Australia, the distribution of altered intimate images or altered intimate videos. And so I had to petition, meet
with my politicians in my area. I wrote a number of articles, I spoke to the media, and I was involved in the law reform in Australia in a number of jurisdictions in Western Australia and New South Wales, and I ended up being involved in two press conferences with the Attorney generals of each state at the announcement of
the law that was criminalizing this abuse. Today, in part because of Noelle's activism, it is illegal in Australia to distribute intimate images without consent, including intimate images and videos that have been altered. Although it doesn't encompass all malicious synthetic media, Noelle has made a solid start. Chapter six, Scissors and Glue. The videos depicting Noelle Martin were nowhere near as sophisticated as those made by the Moon Disaster team.
They were more cheap fakes than deep fakes, and yet the porn didn't have to be perfect to be devastating. The same turns out to be true in politics. To understand the power of fakes, you have to understand human psychology. It turns out that people are pretty easy to fool. John Kerry was running for President of the US. His stance on the Vietnam War was controversial. Jane Fonda, of course, was a very controversial figure back then because of her
anti-war stand. What have we become as a nation if we call the men heroes that were used by the Pentagon to try to exterminate an entire people? What business have we to try to exterminate a people? And somebody had created a photo of the two of them sharing a stage at an anti-war rally with the hopes of damaging the Kerry campaign. The photo was fake.
They had never shared a stage together. They just took two images, probably put it into some standard photo editing software like Photoshop, and just put a headline around it, and out to the world it went. And I will tell you, I remember the most fascinating interview I've heard in a long time was right after the election. Kerry, of course, lost, and a voter was being interviewed and asked how they voted, and he said he couldn't vote
for Kerry, and the interviewer said, well, why not? And the gentleman said, I couldn't get that photo of John Kerry and Jane Fonda out of my head. And the interviewer said, well, you know, that photo is fake, and the guy said, much to my surprise, yes, but I couldn't get it
out of my mind. And this shows you the power of visual imagery. Like, even after I tell you something is fake, it still had an impact on somebody. And I thought, wow, we're in a lot of trouble, because it's very, very hard to put the cat back in the bag. Once that content is out there, you can't undo it. So seeing is believing, even above thinking? Yeah,
that seems to be the rule. There is very good evidence from the social science literature that it's very, very difficult to correct the record after the mistakes are out there. Law professor Danielle Citron also notes that humans tend to pass on information without thinking, which triggers what she calls
information cascades. Information cascades is a phenomenon where we have so much information overload that when someone sends us something, some information, and we trust that person, we pass it on. We don't even check its veracity, and so information can go viral fairly quickly, because we're not terribly reflective, because we act on impulse. Danielle says that information cascades have been given new life in the twenty-first century through
social media. Think about the twentieth century phenomenon: where did we get most of our information? From trusted sources, trusted newspapers, a trusted couple of major TV channels. Growing up, we only had, you know, we didn't have a million, and they were adhering to journalistic ethics and commitments to truth and neutrality, and the notion that you can't publish something without checking it. Now, where are we getting information? Most people say we're
relying on our peers and our friends. Social media platforms are designed to tailor our information diet to what we want and to our pre-existing views, so we're locked in a digital echo chamber. We think everybody agrees with us. We pass on that information. We haven't checked the veracity. It goes viral. And we're especially likely to pass it on if it's negative and novel. Why's that? It's just, like, it's one of our weaknesses. We know how gossip
goes like wildfire online. So, like, Hillary Clinton is running a sex ring. That's crazy. Oh my god, Eric, did you hear about that? I'll post it on Facebook. Eric, you pass it on. We just can't help ourselves, and it is much in the way that we love sweets and fats and pizza. You know, we indulge. We don't think. In some sense, this phenomenon is an old phenomenon, right? There is the famous observation by Mark Twain about how a lie gets halfway around the world before the truth gets
its pants on. Yeah, the truth is still in the bedroom getting dressed. And we often will see the lie, but the rebuttal is not seen; it's often lost in the noise in the wake of the defamatory statements. That is not new. But what is new is a number of things about our information ecosystem that are force multipliers. Chapter seven, Truth Decay. Many experts are worried that the rapid advances in making fakes, combined with a catalyst of information cascades, will undermine democracy.
The biggest concerns have focused on elections. Globally, we are looking at highly polarized situations where this kind of manipulated media can be used as a weapon. One of the main reasons Francesca and Halsey made their Nixon deep fake was to spread awareness about the risks of misinformation campaigns
before the twenty twenty US presidential election. Similarly, a group showcased the power of deep fakes by making videos in the run-up to the UK parliamentary election showing the two bitter rivals, Boris Johnson and Jeremy Corbyn, each endorsing the other. I wish to rise above this divide and endorse my worthy opponent, the right Honorable Jeremy Corbyn, to be Prime Minister of our United Kingdom. Back Boris Johnson to continue as our Prime Minister. But you know what? Don't listen to me.
I think I may be one of the thousands of deep fakes on the Internet, using powerful technologies to tell stories that aren't so. This just kind of indicates how candidates and political figures can be misrepresented, and you just need to feed them into people's social media feeds for them to be seeing this at times when the stakes are pretty high. So far, we haven't yet seen sophisticated
deep fakes in US or UK politics. That might be because fakes will be most effective if they're timed for maximum chaos, say close to election day, when newsrooms won't have the time to investigate and debunk them. But another reason might be that cheap fakes made with basic video editing software are actually pretty effective. Remember the video that surfaced of House Speaker Nancy Pelosi, in which she appeared intoxicated and confused? We want to give this president the
opportunity to do something historic for our country. Both President Trump and Rudy Giuliani shared the video as fact on Twitter. The video is just a cheap fake: it just slowed down Pelosi's speech to make her seem incompetent. But maybe elections won't be the biggest targets. Some people worry that deep fakes could be weaponized to foment international conflict. Berkeley professor Hany Farid has been working with the US government's Media
Forensics program to address this issue. DARPA, the Defense Department's research arm, has been pouring a lot of money over the last five years into this program. They are very concerned about how this technology can be a threat to national security, and also, when we get images and videos from around the world in areas of conflict, how do we know if they're real or not? Is this really an image of a US soldier who has been taken hostage? How do we know? So what do you see as
some of the worst case scenarios? Here's the thing that keeps me up at night, right: a video of Donald Trump saying I've launched nuclear weapons against Iran, and before anybody gets around to figuring out whether this is real or not, we have a global nuclear meltdown. And here's the thing. I don't think that that's likely, but I also don't think that the probability of that is zero. And that should worry us, because while it's not likely,
the consequences are spectacularly bad. Lawyer Danielle Citron worries about an even more plausible scenario. Imagine a deep fake of a well known American general burning a Koran, and it is timed at a very tense moment in a particular, you know, Muslim country, whether it's Afghanistan. It could then lead to physical violence. And you think this could be made? No general, no Koran actually used in the video, just programmed? You can use the technology to mine existing photographs.
Kind of easy, especially with someone like, take Jim Mattis when he was our defense secretary: of Jim Mattis, you know, actually taking a Koran and ripping it in half and saying all Muslims should die. Imagine the chaos in diplomacy, the chaos for our soldiers abroad in Muslim countries. It would be inciting violence without question. Well, we haven't yet seen spectacular fake videos used to disrupt elections or create
international chaos. We have seen increasingly sophisticated attacks on public policymaking. So we've got an example in twenty seventeen where the FCC solicited public comment on the proposal to repeal net neutrality. Net neutrality is the principle that internet service providers should be a neutral public utility. They shouldn't discriminate between websites, say slowing down Netflix streaming to encourage you to purchase
a different online video service. As President Barack Obama described in twenty fourteen: there are no gatekeepers deciding which sites you get to access; there are no toll roads on the information superhighway. Federal communications policy had long supported net neutrality, but in twenty seventeen, the Trump administration favored repealing the policy. There were twenty two million comments that the FCC received, but ninety six percent of those were
actually fake. The interesting thing is the real comments were opposed to repeal, whereas the fake comments were in favor. A Wall Street Journal investigation exposed that the fake public comments were generated by bots. It found similar problems with public comments about payday lending. The bots varied their comments in a combinatorial fashion so that the content wasn't identical. With a little sleuthing, though, you could see that they were generated by computers.
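The pattern is easy to illustrate. Here is a minimal sketch of how a comment bot can mix and match phrase slots to produce many near-duplicate messages; the template and phrases below are invented for illustration, not the actual bot's text.

```python
# Combinatorial comment generation: a few slots yield many distinct comments.
# Every phrase here is invented for illustration.
import itertools

template = "{lead}, I {verb} the plan to repeal net neutrality. {close}"
slots = {
    "lead": ["As a citizen", "As a consumer", "Speaking for my family"],
    "verb": ["support", "strongly support", "fully endorse"],
    "close": ["Please move forward.", "Do not delay.", "Act now."],
}
comments = [template.format(lead=a, verb=b, close=c)
            for a, b, c in itertools.product(*slots.values())]
print(len(comments))  # 27 distinct comments from 3 x 3 x 3 slot choices
```

The giveaway that sleuths exploited is the rigid shared skeleton beneath the variations; fully generative language models leave no such skeleton behind.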
But with the technology increasingly able to generate completely original writing, like OpenAI's program that wrote the story about unicorns in the Andes, it's going to become hard to spot the fakes. So there was this Harvard student, Max Weiss, who used GPT two to kind of demonstrate this. And I went on his site yesterday and he's got this little test where you need
to decide whether a comment is real or fake. So you go on and you read it and you decide whether it's been written by a bot or by a human. So I did this, and the ones that seemed to be really well written and quite narrative, discursive, generally I was picking them as human. I was wrong almost all the time. It was amazing and alarming. In our democracy, public comments have been an important way in which citizens can make their voices heard, but now it's becoming easy
to drown out those voices with millions of fake opinions. Now, the downfall of truth likely won't come with a bang, but a whimper: a slow, steady erosion that some call truth decay. If you can't believe anything you read or hear or see anymore, I don't know how you have a democracy. I don't know, frankly, how we have a civilized society if everybody's going to live in an echo chamber believing their own version of events. How do we have a dialogue if we can't agree on basic facts?
In the end, the most insidious impact of deep fakes may not be the deep fake content itself, but the ability to claim that real content is fake. It's something that Danielle Citron refers to as the liar's dividend. The liar's dividend is that the more you educate people about the phenomenon of deep fakes, the more the wrongdoer can disclaim reality. Think about what President Trump did with the Access Hollywood tape. You know, I'm automatically attracted to beautiful...
I just start kissing them. It's like a magnet. Just kiss. I don't even wait. And when you're a star, they let you do it. You can do anything, whatever you want. Grab them by the... I can do anything. Initially, Trump apologized for the remarks. Anyone who knows me knows these words don't reflect who I am. I said it, I was wrong, and I apologize. But in twenty seventeen, a year after his initial apology and with the idea of deep fake content starting to gain attention, Trump changed
his tune. Upon reflection, he said, they're not real. That wasn't me. I don't think that was my voice. That's the liar's dividend. In practice, the Trump comment about Access Hollywood was remarkable. Slightly more subtle than that, he said, I'm not sure that was me. Right. Well, that's the corrosive gaslighting. Chapter eight, A Life Stored in the Cloud. Deep fakes have the potential to devastate individuals and harm society. The question is, can we stop them from spreading before
they get out of control? To do so, we'd need reliable ways to spot deep fakes. So the good news is there are still artifacts in the synthesized content, whether those are images, audio, or video, that we, as the experts, can tell apart. So when, for example, The New York Times wants to run a story with a video, we can help them validate it. What are the real sophisticated experts looking at? Yeah, so the eyes are really wonderful forensically, because they reflect back to you what is in
the scene. So I'm sitting right now in a studio, there's maybe about a dozen or so lights around me, and you can see this very complex set of reflections in my eyes. So we can analyze fairly complex lighting patterns, for example, to determine if this is one person's head spliced onto another person's body, or if the two people standing next to each other were digitally inserted from another photograph. I could spend another hour telling you about the many
different forensic techniques that we've developed. There's no silver bullet here. It really is sort of a time consuming and deliberate and thoughtful process, and it requires many, many tools, and it requires people with a fair amount of skill to do this. Hany Farid also has quite a few detection techniques that he won't speak about publicly, for fear that deep fake creators will learn how to beat his tests. I don't create a GitHub repository and give my code to all
my adversaries. I don't have just one forensic technique; I have a couple dozen of them. So that means you, as the person creating this, now have to go back and implement twenty different techniques, and you have to do it just perfectly, and that makes the landscape a little bit more tricky for you to manage. As technology makes it easier to create deep fakes, a big problem will
be the sheer amount of content to review. So the average person can download software repositories, and so it's getting to the point now where the average person can just run these as if they're running any standard piece of software. There are also websites that have popped up where you can pay them twenty bucks and you tell them, please put this person's face into this person's video, and they will do that for you. And so it doesn't take a
lot to get access to these tools. Now, I will say that the output of those is not quite as good as what we can create inside the lab. But you just know what the trend is. You just know it's going to get better and cheaper and faster and easier to use. Detecting deep fakes will be a never ending cat and mouse game. Remember how generative adversarial networks, or GANs, are built by training a fake generator to outsmart a detector? Well, as detectors get better, fake generators
will be trained to keep pace. Still, detectives like Hany and platforms like Facebook are working to develop automated ways to spot deep fakes rapidly and reliably. That's important because more than five hundred additional hours of video are being uploaded to YouTube every minute. I don't mean to sound defeatist about this, but I'm going to lose this war. I know this because it's always going to be easier to create content than it is to detect it. But here's where I will win. I will take it out
of the hands of the average person. So think about, for example, the creation of counterfeit currency. With the latest innovations brought on by the Treasury Department, it is hard for the average person to take their inkjet printer and create compelling fake currency. And I think that's going to be the same trend here: if you're using some off the shelf tool, if you're paying somebody on a website, we're going to find you, and we're going
to find you quickly. But if you are dedicated, highly skilled, and put in the time and the effort to create it, we are going to have to work really hard to detect those. Given the challenges of detecting fake content, some people envision a different kind of techno fix. They propose developing airtight ways for content creators to mark their own original video as real. That way, we could instantly recognize
an altered version if it wasn't identical. Now, there are ways of authenticating at the point of recording, and these are what are called controlled capture systems. So here's the idea. You use a special app on your mobile device that, at the point of capture, cryptographically signs the image or the video or the audio. It puts that signature
onto the blockchain. The only thing you have to know about the blockchain is that it is an immutable distributed ledger, which means that that signature is essentially impossible to manipulate. And all of that happens at the point of recording.
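A minimal sketch of that controlled-capture idea, assuming Python's cryptography package; here the signing key stands in for one held inside the camera app, and a plain list stands in for the blockchain ledger.

```python
# Controlled capture in miniature: hash the recording at the moment of capture,
# sign the hash, and record the signature in an append-only ledger.
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

device_key = ed25519.Ed25519PrivateKey.generate()  # held by the capture device
ledger = []                                        # stand-in for a blockchain

def capture(recording: bytes) -> None:
    digest = hashlib.sha256(recording).digest()    # fingerprint of the content
    ledger.append((digest, device_key.sign(digest)))

def is_authentic(recording: bytes) -> bool:
    digest = hashlib.sha256(recording).digest()
    public_key = device_key.public_key()
    for logged_digest, signature in ledger:
        if logged_digest == digest:
            try:
                public_key.verify(signature, digest)
                return True                        # untouched since capture
            except Exception:
                return False
    return False                                   # altered, or never registered

video = b"raw video frames..."
capture(video)
print(is_authentic(video))            # True
print(is_authentic(video + b"edit"))  # False: any edit changes the hash
```

Because even a one-pixel edit changes the hash, any altered version is instantly recognizable, which is exactly what makes the campaign idea that follows workable.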
If I was running a campaign today and I was worried about my candidate's likeness being misused, absolutely every public event that they were at, I would record with a controlled capture system, and I'd be able to prove what they actually said or did at any point in the future. So this approach would shift the burden of authentication to the people creating the videos rather than publishers or consumers. Law professor Danielle Citron has explored how this solution could
quickly become dystopian. We might see the emergence of essentially an audit trail of everything you do and say all of the time. Danielle refers to the business model as immutable life logs in the cloud. In a way, we've sort of already seen it. There are health plans where, if you wear a Fitbit all the time and you let yourself be monitored, it lowers your insurance, you know, your
health insurance rates. But you can see how, if the incentives are there in the market to self-surveil, whether it's for health insurance, life insurance, car insurance, we're going to see the unraveling of privacy by ourselves. You know, corporations may very well, because the CEO is so valuable, they may say you've got to have a log, an immutable audit trail of everything you do and say.
So when that deep fake comes up the night before the IPO, you can say, look, the CEO wasn't taking the bribe, wasn't having sex with a prostitute, and we have proof, because we have an audit trail, we have a log. So we were imagining a business model that hasn't quite come up, but we have gotten a number of requests from insurance companies, as well as companies saying we're interested in this idea.
So how much has to be in that log? Does this have to be a whole video of your life? That is a great question, one that terrifies us. So it may be that you're logging geolocation, you're logging videos, you see people talking and who they're interacting with, and that might be good enough to prevent the mischief that would hijack the IPO. Your whole life online? Yes, stored securely, all locked down, protected in the cloud. It is, at
least for a privacy scholar. There are so many reasons why we ought to have privacy that aren't about hiding things. It's about creating spaces and managing boundaries around ourselves and our intimates and our loved ones. So I worry that if we entirely unravel privacy: A, in the wrong hands it's very dangerous, right, and B, it changes how we think about ourselves and humanity. Chapter nine, Section two thirty. So
technofixes are complicated. What about passing laws to ban deep fakes, or at least deep fakes that don't disclose they're fake? So video and audio is speech, and our First Amendment doctrine is very much protective of free speech, and the Supreme Court has explained that lies, just lies themselves without harm, are protected speech. When lies cause certain kinds of harm, we can regulate them: defamation of private people, threats, incitement, fraud,
impersonation of government officials. What about lies concerning public figures like politicians? California and Texas, for instance, recently passed laws making it illegal to publish deep fakes of a candidate in the weeks leading up to an election. It's not clear yet whether the laws will pass constitutional muster. As you're saying, in an American context, we are just not going to be able to outlaw fakes. Yeah, we can't have a flat ban, and I don't think we should.
It would fail on doctrinal grounds, but ultimately it would prevent the positive uses. Interestingly, in January twenty twenty, China, which has no First Amendment protecting free speech, promulgated regulations banning deep fakes. The use of AI or virtual reality now needs to be clearly marked in a prominent manner, and the failure to do so is considered a criminal offense. To explore other options for the US, I went to
speak with a public policy expert. My name is Joan Donovan, and I work at the Harvard Kennedy School's Shorenstein Center, where I lead a team of researchers looking at media manipulation and disinformation campaigns. Joan is head of the Technology and Social Change Research Project, and her staff studies how social media gives rise to hoaxes and scams. Her team is particularly
interested in precisely how misinformation spreads across the Internet. Ultimately, underneath all of this is the distribution mechanism, which is social media and platforms. And platforms have to rethink the openness of their design, because that has now become a territory for information warfare. In early twenty twenty, Facebook announced
a major policy change about synthesized content. Facebook has issued policies now on deep fakes, saying that if it is an AI generated video and it's misleading in some other contextual way, then they will remove it. Interestingly, Facebook banned the Moon Disaster team's Nixon video even though it was made for educational purposes, but didn't remove the slowed down version of Nancy Pelosi, which was made to mislead the public. Why?
Because the Pelosi video wasn't created with artificial intelligence. For now, Facebook is choosing to target deep fakes, but not cheap fakes. One way to push platforms to take a stronger stance might be to remove some of the legal protections that they currently enjoy. Under Section two thirty of the Communications Decency Act, passed in nineteen ninety six, platforms aren't legally liable for content posted by their users. The fact that platforms have no responsibility for the content they host has
an upside: it's led to the massive diversity of online content we enjoy today. But it also allows a dangerous escalation of fake news. Is it time to change Section two thirty to create incentives for platforms to police false content? I asked the former head of a major platform, LinkedIn co-founder Reid Hoffman. For example, let's take my view of what the response to the Christchurch shooting should be. It's to say, well, we want you to solve not
having terrorism, murder or murderers, displayed to people. So we're simply going to do a fine of ten thousand dollars per view. Two shootings occurred at mosques in Christchurch, New Zealand in March twenty nineteen. Graphic videos of the event were soon posted online. Five people saw it, that's fifty thousand dollars. But if it becomes a meme and a million people see it, that's ten billion dollars. Yes, right. So what it's really trying to do is get you to say,
let's make sure that the meme never happens. Okay, so that's a governance mechanism there. Yes, you fine the channel, the platform, based on number of views. It would be a very general way to say, now you guys have to solve it. Now you solve it, you figure it out. What about other solutions? If we are to make regulation, it should be about the amount of staff in proportion to the amount of users, so that they can get a handle on the content.
But can they be fast enough? Maybe the viral spread should be slowed down enough to allow them to moderate. Let's put it this way. The stock market has certain governors built in: when there are massive changes in a stock price, there are decelerators that kick in, brakes that kick in. Should the platforms have brakes that kick in before something
can go fully viral. So in terms of deceleration, there are things that they do already that accelerate the process that they need to think differently about, especially when it comes to something turning into a trending topic. So there needs to be an intervening moment before things get to the homepage and get to trending, where there is a content review. So much to say here, but I want to think particularly about listeners who are in their twenties
and thirties, who are very tech savvy. They're going to be part of the solution here. What would you say to them about what they can do? I think it's important that younger people advocate for the Internet that they want.
We have to fight for it, We have to ask for different things, and that kind of agitation can come in the form of posting on the platform, writing letters, joining groups like Fight for the Future, and trying to work on getting platforms to do better and to advocate for the kind of content that you want to see more of. The important thing is that our society is shaped by these platforms and so we're not going to do away with them, but we don't have to make
do with them either. Conclusion, choose your planet. So there you have it. Stewards of the Brave New Planet. Synthetic media or deep fakes. People have been manipulating content for more than a hundred years, but recent advances in AI have taken it to a whole new level of verisimilitude.
The technology could transform movies and television: favorite actors from years past starring in new narratives, along with actors who never existed; patients regaining the ability to speak in their own voices; personalized stories created on demand for any child around the globe, matching their interests, written in their dialect,
representing their communities. But there's also great potential for harm: the ability to cast anyone in a pornographic video, weaponized media dropping days before an election or provoking international conflicts. Are we going to be able to tell fact from fiction? Will truth survive? And what does it mean for our democracy? Better fake detection may help, but it'll be hard for it to keep up, and logging our lives in blockchain to
protect against misrepresentation doesn't sound like an attractive idea. Outright bans on deep fakes are being tried in some countries, but they're tricky in the US given our constitutional protections for free speech. Maybe the best solution is to put the liability on platforms like Facebook and YouTube. Joan Donovan's right: to get the future you want, you're going to have to fight for it. You don't have to be an expert, and you don't have to
do it alone. When enough people get engaged, we make wise choices. Deep fakes are a problem that everyone can engage with. Brainstorm with your friends about what should be done. Use social media. Tweet at your elected representatives to ask if they're working on laws, like in California and Texas. And if you work for a tech company, ask yourself and your colleagues if you're doing enough. You can find lots of resources and ideas at our website Brave New
Planet dot org. It's time to choose our planet. The future is up to us. Brave New Planet is a co-production of the Broad Institute of MIT and Harvard, Pushkin Industries, and the Boston Globe, with support from the Alfred P. Sloan Foundation. Our show is produced by Rebecca Lee Douglas, with Mary Dooe. Theme song composed by Ned Porter, mastering and sound design by James Garver, fact checking by As If Fridman and a Stitt and Enchant. Special thanks to Christine Heenan and Rachel Roberts at Clarendon Communications; to Lee McGuire, Kristen Zarelli and Justine Levin-Allerhand at the Broad; to Mia Lobel and Heather Fain at Pushkin; and to Eli and Edythe Broad, who made the Broad Institute possible. This is Brave New Planet. I'm Eric Lander.