Shh! The Tech is Listening!

Speaker 1

00:04

Welcome to TechStuff, a production of iHeartRadio's HowStuffWorks. Hey there, and welcome to TechStuff. I'm your host, Jonathan Strickland. I'm an executive producer with iHeartRadio and HowStuffWorks, and I love all things tech. And I'm sitting in the audience of a local theater, like a stage theater, not long ago. I'm waiting for the show to start, and there's a song that's playing over the sound system, and I'm really kind of digging the song, but I totally don't recognize it.

00:38

And I glance down at my phone and I see that on the phone, below the time on the locked phone screen, it says that the song is Danger! High Voltage by Electric Six. Now this is obviously a hypothetical example because I would recognize that song anywhere, but you get the point. Anyway, I'm thinking, that's so cool. My phone knows what songs are playing around me. That's so neat.

01:02

I didn't even have to tell it to do anything. And then a couple of hours later, as I think back on this moment, uncertainty and dread start to seep in. Wait a minute, if my phone can identify a song that's playing around me, that means my phone is actually listening to stuff. It wouldn't be able to tell me the song title otherwise. It has to be able to pick up the audio. I didn't activate any app. I didn't turn on Shazam or ask my phone or anything.

01:30

My phone did it by itself. So my phone is detecting the sounds around it even when it's not in an active mode. Now, on a similar note, I'm sure we all have had experiences with the personal assistants out there, whether we use one ourselves or we've been around when someone else uses them, things like Google Assistant or Alexa or

01:52

Siri or Cortana. There are more of them out there. You can activate these assistants with a specific word or phrase, and then you speak to them to carry out some sort of task or to get you some sort of information or something along those lines. We've got a Google Home device in our house, so we might use it to get a quick rundown on the weather report. We might ask it to play a track off an album

02:15

by the jazz fusion band Weather Report. But wait, that means that device is listening too. We didn't have to take any physical action. We didn't have to push a button to make it work. We just spoke the keyword or a key phrase, and off it goes. And then we get into stuff that seems super creepy. And I'm sure most of you have had some sort of experience

02:37

like this. Say you're chatting with friends, maybe you're at a restaurant or you're just hanging out, and you're talking about this new snack food you just heard about, and this is just one part of a conversation that rambles all over the place. But then you talk a little bit about the snack food for a couple of minutes. You're like, you've heard about it, you wanted to try it, you haven't tried it yet. Later on, you pop on over to Facebook, and as you're scrolling through your feed,

03:03

there it is. There's an ad for the very same snack food you mentioned to your friends just a little earlier that day. You've never purchased the snack as far as you remember, you haven't even searched for it on the web, and there's the ad. So is Facebook listening in on your conversation in an effort to serve up

03:22

a laser-focused targeted ad? On this episode, we're gonna take a look at the technology that allows our devices to listen in on us, and we'll explore the studies about whether or not anything hinky is going on and try to separate fact from FUD. F-U-D, that's fear, uncertainty, and doubt. And we'll also chat about some recent news stories about how big companies have been handing over audio messages to third-party human contractors and what that

03:51

means in terms of privacy and ethics. Now, first, let's address a big reason why devices aren't constantly recording or broadcasting all the sounds within an environment that's reachable by microphone. It's because that's truly enormous. Like, that's a huge amount of data. So let's just take Facebook as an example. There are more than two billion people using Facebook every month. At least one and a half billion people pop on

04:21

Facebook every single day. Now that's not necessarily the same one and a half billion people every day, but every day one point five billion people check Facebook, and out of that number, nearly one billion of them are accessing Facebook on mobile devices. So, just from a data management standpoint, there's no way any company, even one as large as Facebook, could be actively monitoring, recording, or even analyzing all that audio that would be coming in from a billion mobile handsets.
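
To put a rough number on that, here's a back-of-the-envelope sketch in Python. Only the one billion handsets comes from the figures above. The bitrate and the hours of recording per day are assumptions picked for illustration, not anything Facebook has published.

```python
# Back-of-the-envelope estimate. Every constant below except the handset
# count is an assumption chosen for illustration.
MOBILE_USERS = 1_000_000_000   # "nearly one billion" daily mobile users (from the episode)
BITRATE_KBPS = 16              # assumed: low-bitrate compressed speech
HOURS_PER_DAY = 24             # assumed: microphones streaming around the clock


def daily_audio_bytes(users: int, bitrate_kbps: float, hours: float) -> float:
    """Total bytes of audio produced per day across all users."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return users * bytes_per_second * hours * 3600


total = daily_audio_bytes(MOBILE_USERS, BITRATE_KBPS, HOURS_PER_DAY)
print(f"{total / 1e15:.1f} petabytes of audio per day")
```

Even with a very modest bitrate, the total lands in the hundreds of petabytes per day, and that's before anyone tries to transcribe or analyze a second of it. Change the assumed constants and the number moves, but it stays enormous.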

04:54

We are in the age of big data, but we still have our limits. Plus, you'd have to figure that, you know, out of that large amount of data, most of it wouldn't be useful to Facebook. Now, don't get me wrong. At the end of the day, you and I are the products being bought and sold on Facebook and Google and other providers out there. We're potential customers for all of the advertisers that use those companies like

05:22

Facebook as a platform. So it benefits the advertisers and Facebook and sometimes even us as customers to match the right ads to the right people. So there's definitely an incentive to learn as much about users as possible to leverage their interests and potentially convert them into paying customers to an advertiser. Now, this is the very basic foundation

05:46

of Facebook's business model. So if Facebook could do this from a technical standpoint, and if the company could get away with it from a public perception standpoint, I think there's little doubt that Facebook would do it. But honestly, it's just way too much information to process and to boil down into actionable plans. We talk about a lot of stuff in our day, you know, and some of

06:12

it we may not really be interested in. We're just talking about something. So it wouldn't do Facebook any good to serve up ads for stuff that we weren't actually really interested in, so it has to pick and choose its moments. Facebook has denied using phone microphones in this way. In a June second, two thousand sixteen blog post on the Facebook newsroom site, a company representative wrote this, and

06:34

here's a quote. Facebook does not use your phone's microphone to inform ads or to change what you see in news feed. Some recent articles have suggested that we must be listening to people's conversations in order to show them relevant ads. This is not true. We show ads based on people's interests and other profile information, not what you're talking out loud about. We only access your microphone if you have given our app permission, and if you are

07:02

actively using a specific feature that requires audio. This might include recording a video or using an optional feature we introduced two years ago to include music or other audio in your status updates. End quote. Now, it's understandable that people would be a bit skeptical regarding Facebook's claims of innocence in this regard. The company has had several high-profile scandals and issues with privacy and security. Zuckerberg himself once

07:29

famously declared that privacy is dead. Also, he simultaneously does his best to preserve his own privacy. But that's commentary for another episode. So I don't blame people for thinking that Facebook might actually be listening in on conversations because the company has already proven it hasn't been the best steward of user privacy in the past. But that doesn't mean the company has actually been spying on people. It

07:56

doesn't have to, at least not in that way. And this is where we get into some troubling territory because it's where we start to learn how services like Google and Facebook and others can glean information about us, whether we have consciously shared that information or not, and it helps explain how these companies can advertise to us so effectively. One way Facebook does this is with an innovation called

08:22

Facebook Pixel. Now, this is a piece of code that Facebook's clients, advertisers really, can put on their own websites. So it's the type of code you would insert into the website for a business. So let's say you own a specialty niche marketing shop. We'll say you sell figurines based off of iconic horror movie monsters and characters, and you're going to advertise on Facebook. The Pixel code is

08:49

one way Facebook can optimize that experience. The code pulls information off of user behavior on your website and sends it to Facebook. If people click over to your site because of an ad on Facebook, Pixel will register it. This helps you see how effective or ineffective your ads are on the site. It also can target your ads to people on Facebook who would be most likely to

09:13

click on those ads. It might analyze the traits common to people who are interacting with your ads, and then extrapolate that to target people who have similar traits and behaviors but they haven't yet seen your advertisements. Facebook, meanwhile, can also use that data to serve up ads from other companies to users based on similar findings, and it can track other stuff too. Let's say you click over to an article on a blog or news site that

09:38

incorporates Facebook Pixel in the site's code. Facebook can see how long you were on that article, which in turn indicates your interest and investment level in that topic. Then Facebook can serve up ads related to the contents of that article to you. In the end, it's all about analyzing user behavior to get the biggest return on investment, and it doesn't require using the microphone to do it.
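
To make the idea of a tracking pixel a little more concrete, here's a minimal sketch in Python of the kind of request such a snippet might send home. This is not Facebook's actual code or API. The endpoint tracker.example and the parameter names id, ev, page, and dwell are all made up for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collection endpoint; not a real Facebook URL.
COLLECT_URL = "https://tracker.example/collect"


def build_beacon_url(site_id: str, event: str, page: str, seconds_on_page: int) -> str:
    """Encode one page event as query parameters on a tiny request."""
    params = {
        "id": site_id,             # hypothetical: which advertiser's site this is
        "ev": event,               # hypothetical: what the visitor did
        "page": page,              # which page they were on
        "dwell": seconds_on_page,  # how long they stayed
    }
    return f"{COLLECT_URL}?{urlencode(params)}"


url = build_beacon_url("shop-123", "PageView", "/figurines/wolf-man", 95)
print(url)

# On the receiving side, the same details fall right back out of the URL.
print(parse_qs(urlsplit(url).query))
```

The point of the sketch is that nothing here involves audio. A page address, an event name, and a dwell time are enough to say a lot about what a visitor is interested in.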

10:02

They can just look at who you are, where you've been, both in real life if it's tracking your location and on the Internet if it's tracking your browsing, and who your friends are. And all of this information combined gives Facebook a ton of data about what kind of ads to target towards you. Now, on top of that, Facebook can purchase information from data brokers to supplement its

10:26

own gargantuan database. There are companies that manage stuff like loyalty programs, which also track what you buy. They have to for the loyalty programs to work, and those purchases are linked to you as a person. They know, oh, Jonathan goes to Starbucks all the time and he always gets those Nitro cold brews, so let's put an ad that targets him based on that information. Now, that data isn't just being used to help you get the best

10:52

deal on whatever it happens to be. That information is valuable. So companies that manage these loyalty programs can and do buy and sell that data. You know, our spending habits are part of this sort of encyclopedia entry about our interests, priorities, and behaviors. Now, none of this needs to use a microphone to spy on us. So in the case of seeing that snack food pop up on the Facebook feed, it could simply be that you exhibit behaviors similar to ones that people who have bought that

11:23

snack food tend to have as well. You've liked the same sort of pages. You may even have a lot of friends who have already bought this stuff. You may live in a region where it has recently been introduced. These are the kinds of points of data that Facebook might use in order to serve that ad up to

11:39

you that have nothing to do with your microphone. So you got the ad not because you talked about the snack food, but because Facebook has sussed out you're the type of person who would like that snack food because, spoiler alert, you're not as special as you think you are, and I'm not as special as I think I am. Now you could argue, and I would agree with you on this, that what Facebook is doing is at least as creepy as listening in on a microphone, perhaps even

12:06

more so. Facebook has filed patents that focus on technologies meant to predict where you're going to go next based on your history of location data. So, in other words, Facebook is trying to figure out where you're going to

12:19

go before you go there. And it's not just you, it's all the people you know who are using Facebook too. And so it's not just predicting where you'll go, it's also predicting which people you may be running into, because it's predicting those people are going to go to that same place and whether or not you might encounter one another. It can also use that to make suggestions to add people on Facebook who are going to those same places so that they become your friends online. Now

12:48

why does Facebook care who your friends are? Because the more people who use Facebook and the more interconnected they become, the more useful the information they generate for Facebook. That ends up becoming more valuable to the company. So it is pretty creepy and invasive, and it doesn't have to use the microphone. But when we come back, I'll talk a bit more about these sound-activated features and what's actually going on, because there is some stuff we've

13:15

got to be worried about. But first, let's take a quick break. When I opened this show, I talked about how my phone could listen in on music and identify the song even when the phone was in its locked mode. Now that's because I have a Pixel 2 XL phone. It's an Android phone. It's actually a flagship Google phone, and there's a feature on the Pixel 2 that's called Now Playing. You have to activate this feature, you have to choose to opt into it. So I want to make

13:51

that clear. I chose to activate this feature. It's not just active by default, and with it active, the phone can identify music that's playing, and it can tell me the title even when the phone is in its locked position. So what gives? Well, this is not as creepy and invasive as it sounds at first glance, because this feature, and this is incredible to me, is actually entirely local to the Pixel 2 phones. It works on the phone itself. It's not consulting the cloud at all, it's not sending

14:24

any information. So how can that be possible? How can all this information exist on the phone already? Well, let's boil it down. First, if you've ever played with any digital sound recording software, you've likely seen sound recorded as a waveform, a visualization of sound, and typically it's pretty simple stuff. Like, if you're using a very basic sound recording system, you're mostly looking at changes in amplitude

14:52

or volume, in other words. So you see a continuous series of peaks and valleys over the course of a sound recording. Those represent the loudest and the quietest parts of the recording, the changes in volume. You can also graph frequency or pitch, and you can, if you zoom way in, see shapes in the waveform that indicate specific phonetics and sounds. Anyone who has worked in audio editing for a while can identify at a glance certain

15:20

distinctive sounds. Tari, my producer, can probably tell you just by looking at a waveform of my recording which moments represent the irritating mouth sounds she removes before publishing an episode. It doesn't take long before you can do this yourself. It's actually pretty easy to identify, say, like, a

15:40

hi-hat cymbal in a music recording, because it's very distinctive. Now, that means that songs have these distinctive features, like a fingerprint, that represent the sound of the song, and if you can recognize the fingerprint, you can identify the song even if you're not listening to the song at that moment. And you could look at a printout of a waveform of a song and you can try and match it against a library of printouts. That's essentially

16:10

what the Pixel 2 is doing. The program runs in the background. It activates when the sound profile indicates that there's music present, so it then analyzes the sound that's coming in through the microphone and it creates one of these digital fingerprints that I was just talking about. Then, just like you would with a crime scene fingerprint, the Pixel 2 will compare the digital analysis of the song that's playing against a local database on the phone of fingerprints

16:38

that represent thousands of popular songs for your region. Now exactly how many hasn't really been released, but it's supposedly in the tens of thousands of songs range. And if the Pixel 2 finds a match between the song that is currently playing and the one that's in the database, it returns the result. This works even if the phone has cellular and WiFi data turned off, because again it's all local. Now the Now Playing feature doesn't run constantly because that

17:06

would drain battery life like crazy. Instead, it samples the audio approximately every sixty seconds, and it takes time to match a song to an entry in the database. The cleaner the audio, in other words, the less background noise and less interference that's present, the faster this process tends to be. This means that when songs transition from one song to another, it can take a little bit before

17:31

the phone registers the change. It all depends on the acoustic quality of the environment and where in this sampling cycle the phone is at any given time, so that's not quite as creepy because everything's local on the device.
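
Here's a toy sketch in Python of that fingerprint-and-match idea. To be clear, this is not how Google's Now Playing actually works under the hood. It swaps in a simple pair-of-peaks lookup, loosely in the spirit of classic audio fingerprinting, and the tracks, frequencies, and vote threshold are all invented for illustration.

```python
from collections import Counter

# Toy stand-in for real audio analysis: each track is a list of the loudest
# frequency (in Hz) per short time slice. Real systems derive these from a
# spectrogram; the numbers here are made up.
LOCAL_TRACKS = {
    "Song A": [440, 660, 550, 440, 880, 660, 550, 990],
    "Song B": [330, 330, 495, 660, 495, 330, 415, 495],
}


def fingerprints(peaks):
    """Turn a peak sequence into a set of (freq, next_freq) pairs."""
    return set(zip(peaks, peaks[1:]))


# Build the on-device index once: fingerprint -> titles that contain it.
INDEX = {}
for title, peaks in LOCAL_TRACKS.items():
    for fp in fingerprints(peaks):
        INDEX.setdefault(fp, set()).add(title)


def identify(sample_peaks, min_votes=3):
    """Return the best-matching title, or None if nothing matches well enough."""
    votes = Counter()
    for fp in fingerprints(sample_peaks):
        for title in INDEX.get(fp, ()):
            votes[title] += 1
    if not votes:
        return None
    title, count = votes.most_common(1)[0]
    return title if count >= min_votes else None


# A short snippet from the middle of Song A, with one wrong peak (noise).
print(identify([550, 440, 880, 660, 123, 990]))
# Unrelated audio, such as conversation, matches nothing.
print(identify([100, 200, 300, 400]))
```

Everything in the sketch lives in local dictionaries, which mirrors the point above: the lookup never needs a network connection, and audio that doesn't match a known fingerprint simply produces no result.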

17:45

It's not sending any data out anywhere else. It's not listening to what I'm listening to and alerting Google to let them know, hey, Jonathan's once again listening to the soundtrack to Be More Chill, which would be an accurate suggestion that it would make because I do listen to that a lot. Anyway, you can use this feature to learn more about the track, the artist, the album,

18:09

including potentially purchasing that music. And those features do connect to the outside world through WiFi or cellular connections, but that requires an extra step on the part of the user. Also, Google pushes out updates to this database with the most popular songs, and these are regionalized to reflect the country you're in, because you're less likely to run into, say a Peruvian pop song when you're in Scotland. The push

18:35

updates do happen over WiFi or cellular connections. But this is just the reference data that analyzed music gets compared against. An app like Shazam, on the other hand, connects to the cloud, but you also have to activate the app to have it listen to the audio, so it's a user choice to have the app listen. So this is more like a push-to-talk device, except

18:59

it's push to listen. Shazam is also analyzing music to suss out a digital fingerprint for the audio, but it can compare the sampled audio against a much larger database consisting of millions of songs, rather than the tens of thousands you would find on the Pixel 2's Now Playing feature. More importantly, I think it's fair to say this isn't a creepy use of the technology, since the listening feature only activates on the user's command rather than just being

19:27

on by default. Now, this isn't that much different than what virtual assistants are doing when you use them. Clearly, the microphone on a virtual assistant like Google Home or Siri or whatever, it has to be active all the time, otherwise you wouldn't get a response when you used whatever the keyword or phrase was to activate the assistant. I'm going to try and avoid saying any of those phrases, by the way, because I don't want those of you who have those devices to deal with the frustration of

19:57

them going off in response to something I say. Now, those words or phrases have a specific sound, just like music does. In this case, we're talking about phonemes, which are recognizable sounds found in language. So in English there are forty-four phonemes. The order and combination of those phonemes are the key. So if you say something that has those phonemes in the right order, or if it's close enough, if it's in a noisy environment, this can

20:26

activate the virtual assistant. It's like a key fitting into a lock. Now, if you're saying other stuff, it's like the wrong key is inserted and nothing happens. It's only when you say something that fits the lock that the assistant activates. This process continues after activation. When you talk to the virtual assistant, it analyzes your speech by phonemes. Software processes those to figure out what words you are actually saying. Well for the first step, that is, because

20:56

it's actually more complicated than that. So, for example, there are homonyms. These are words that have a similar sound but different meanings and often different spellings. An easy example is the number eight and the past tense of to eat, such as I ate an entire bowl of cao. Mm hmm, okay. So those two words, eight and ate, sound

21:22

exactly the same, but they have different meanings. Now that means the software can't rely on just the sounds you're making when you speak to figure out what you mean. It has to actually analyze syntax and context and make judgment calls about what you actually mean when you say these things. Sometimes it gets things right, sometimes it gets things wrong. But don't be too hard on it. Because

21:46

humans misunderstand other humans all the time. Even when we are both communicating in the same language, we can misunderstand each other. Now, this is still just the first step. You can think of this as essentially speech to text. From there, you have to determine what is actually being asked by the speaker, what is the intent behind the words. If someone speaks French very slowly to me, I might be able to spell out what is being said phonetically, but that doesn't mean I understand

22:17

the actual content of what was spoken. And to complicate matters, there are a lot of different ways to ask for the same information. I might say what's the weather for this week? Or will I need an umbrella today, or one of a dozen other ways to inquire about the weather. The software has to be able to determine what the intent was behind my question, and then there's another step,

22:41

which is matching intent with action. The assistant has to respond to my request, and hopefully it does so in a way that's relevant to whatever I was asking about in the first place. So if I ask my virtual assistant for an update on the weather, I'm not going to be impressed if it instead tells me about the traffic, or vice versa. And as assistants get connected into more systems like security systems, lights, apps, and more, the software has to send appropriate commands to these other

23:12

elements to produce the expected results. Now, this is all impressive, and because it's impressive, it could be a little scary when we think about assistants as hanging on our every word. Are they always listening? Are they always paying attention? Now, they're always monitoring sound, but they're not doing so in an effort to broadcast or record information. They are on alert for that initiating phrase or word. They ignore everything else.
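
The lock-and-key idea, and the step from words to intent, can be sketched in a few lines of Python. Treat this as a simplified illustration under stated assumptions, not how any real assistant is built. The wake phrase, the phoneme symbols, the one-mismatch tolerance, and the keyword lists are all hypothetical, and real products use trained acoustic and language models rather than edit distance and keyword sets.

```python
# Hypothetical wake phrase, written as a phoneme sequence. The symbols are
# illustrative, not any vendor's real representation.
WAKE_PHONEMES = ["HH", "EY", "T", "EH", "K"]
MAX_MISMATCHES = 1   # assumed tolerance for noisy rooms


def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def is_wake_phrase(heard):
    """The key fits the lock if the phonemes are close enough, in order."""
    return edit_distance(heard, WAKE_PHONEMES) <= MAX_MISMATCHES


# Different wordings that should map to the same intent (made-up keywords).
INTENT_KEYWORDS = {
    "get_weather": {"weather", "umbrella", "rain", "forecast"},
    "get_traffic": {"traffic", "commute"},
}


def detect_intent(text):
    words = set(text.lower().replace("?", "").split())
    for intent, keywords in INTENT_KEYWORDS.items():
        if words & keywords:
            return intent
    return None


def handle(heard_phonemes, following_text):
    """Ignore everything unless the wake phrase was heard first."""
    if not is_wake_phrase(heard_phonemes):
        return None   # the wrong key: nothing gets interpreted
    return detect_intent(following_text)


print(handle(["HH", "EY", "T", "EH", "K"], "Will I need an umbrella today?"))
print(handle(["HH", "EY", "D", "EH", "K"], "What's the weather for this week?"))
print(handle(["S", "N", "AE", "K"], "What's the weather for this week?"))
```

The first two calls land on the same weather intent, one with a perfect wake phrase and one with a single phoneme off. The third never gets past the gate, so the question that follows is never interpreted at all.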

23:40

More on that a little bit later. Now that being said, there are ways in which someone could hack an assistant or a phone, or really any connected device that has a microphone, in order to eavesdrop using that device's microphone. Edward Snowden revealed that the NSA used such tactics in the agency's surveillance efforts. Apps that have access to your phone's camera and microphone for the purposes of sharing video, audio, and related features can do some disturbing

24:10

stuff if they're compromised. They can also do some disturbing stuff if they're not compromised, but if the party behind it is malicious. Felix Krause made such an app as a proof of concept for iOS devices. The app, like many others, asked the user for permission to access the camera. Krause stated that once a user agreed to this, the app could access both the front and back camera anytime the app was in the foreground of the iOS device.

24:39

It could take videos and pictures with no indication to the user that such a thing was happening, and it could upload that data to a remote server. It could even run real-time facial recognition software. Now does this mean apps like Facebook's Messenger or YouTube are doing this? Well, not necessarily, but it does mean it's at least possible to do, and nothing is stopping a more, let's say, ethically unconcerned app from doing just that. So what can

25:08

you do to protect yourself from bad actors? Uh, here's the bad news. Not much. You could go without using such devices and apps in the first place. That's pretty darn restrictive. Krause recommended using camera covers to obscure the phone's cameras when you weren't actively using them, or revoking camera access to the various apps on the phone. And that's about it. Yikes. Now, when we come back, I'll cover a related topic that's been in the news lately.

25:38

But first let's take another quick break. Okay, so we know it's possible to use cameras and microphones against people, either with malware or what amounts to a security loophole between handset hardware and apps. But there's something else we need to chat about, and that's humans listening in on what were assumed to be private conversations and messages. Now

26:08

here's the context. In August two thousand nineteen, several major media outlets reported an upsetting revelation, namely that Facebook had been sending out audio files that users were creating in Facebook Messenger, for example. And these were audio clips sent through Messenger itself, so it's akin to a private text to a friend. And Facebook was sending these audio files to a third-party contractor to transcribe that audio. So imagine having a private text message thread sent to a

26:40

complete stranger for review. It was similar to that, except it was audio, not text. So what's actually going on? Well, Facebook said this all had to do with users who had opted into having their audio messages transcribed automatically. Essentially, it was all about using the voice to text option

26:59

in Facebook. Now, according to Express Computer, this option didn't really have a warning that let you know that those audio files you were creating through this voice to text feature would go to be heard by any humans out there. In fact, they said that the warning that would pop up, or the notification that popped up said, turn on voice to text in this chat using Facebook Messenger, and above the no and yes buttons where you would choose one

27:34

of these options. Facebook further would describe the option display text of voice clips you send and receive. You can control whether text is visible to you for each chat. So again it makes it sound like, oh, this is all automated. If I use voice to text, I just say a phrase, the text shows up. I might have to make some adjustments to the text, maybe it has misinterpreted one of the words or whatever. But sort of

28:01

a hands free approach to sending messages in Messenger. Lots of apps use voice to text features, and in theory it's a pretty great feature. You can dictate a message to be sent to your friend without having to stare at the screen and type or swipe on a keyboard. Tons of folks use features like this if they want to interact with an app while they're driving, for example, to minimize the distractions they have as they putter around.

28:30

But you'll notice those messages don't seem to indicate anywhere that the voice to text recordings could be sent to a human being for review. Express Computer further explains that even on a supplemental page explaining the voice to text feature, Facebook fails to mention that human beings will be reviewing that material. Instead, the supplemental page talks about how voice to text uses machine learning to get better at interpreting what you're saying, so that it becomes more useful to

29:02

you the more you actually use the feature. So the concept here was that some voice recognition software would transcribe this audio. Google Voice also used to do this for voice messages. I remember getting voicemails from my mother, who has a Southern US dialect as do I, but hers

29:21

is more pronounced. The Google Voice speech to text program had problems interpreting my mother's messages, and frequently the transcription would be hilariously off track, and most of the time I wouldn't even be able to guess what the original message was based off the transcription. It meant that I would listen to the voicemail and then I would shake my head a lot as I would read the transcription at the same time and just see how far off

29:48

it was. This is a big challenge for voice recognition programs. There are a lot of different dialects and accents. People from different regions within the same country can sound very different even if they're speaking the exact same language. If you get someone from Savannah, Georgia, a native of Savannah, Georgia, and a native from Boston, Massachusetts, they're going to be able to have a conversation with each other, but they will end up saying the same words very differently from

30:19

one another. And that's before you even start talking about people who have a different native language, who have learned English and have a foreign accent on top of the English they speak. There's no hard and fast rule you can create for a voice recognition program to follow to interpret speech correctly throughout a language. Because there's so much variation in how the words in that language are said,

30:45

training the model becomes a challenge. So one thing you can do is you have a human being transcribe spoken words and then compare the human transcription against the machine-produced transcription in an effort to train your model to be more effective. Humans are pretty good, though not perfect, at figuring out what some other human says. Assuming both

31:11

parties are fluent in the same language. By comparing these two records against each other and then making corrections to the model, computer scientists can tweak their voice recognition software models to be more accurate. Now, ideally you would do this before unleashing such a system on the public, but that's not really that practical. There is no in lab project that is going to come close to generating the amount of data and the sheer variety that you will

31:39

encounter out in the real world. Improving the model would happen much faster with a larger sample of subjects using the model, and a billion or so people is a pretty darn big sample size. But that means sending these audio files to humans in the first place. And Facebook has said that the files were anonymized so that there was no identifiable name or anything associated with each of the audio files being sent for human review. But hey,

32:09

I hear you say. Earlier in this episode, you pointed out how it's possible to really get an idea about a person just from the other data they provide, and you'd be right. These audio files had all sorts of different types of content in them, some of it was

32:25

likely upsetting, disturbing, or inappropriate. Contractors who had been hired to do the transcription came forward, anonymously, I might add, because they didn't want to get fired from their jobs, and said they felt that the practice was an unethical one. And media outlets looked into it and their conclusions were

32:42

pretty much the same right down the board. Facebook was not transparent about what was happening with this audio, and there were no clear indications to users that their audio files might get sent to some stranger for the purposes of transcription. Now, for its part, Facebook said it halted the practice in early August two thousand nineteen, and third-party contractors have said that that is true, that they

33:06

no longer are doing this work for Facebook. Facebook isn't the only company to come under scrutiny for this kind of thing. Google, Apple, and Microsoft have also been under the microscope for very similar practices. Now, on the one hand, it's understandable that these companies want to improve their voice recognition capabilities. It's what makes these apps and products useful and makes it more useful to a wider variety of

33:29

people by training the models on this stuff. But the privacy concerns remain, and they are troubling not just to users but also to the people actually being paid to transcribe the stuff in the first place. Now, it would be another matter if the companies were transparent about this practice.

33:46

If users knew that there's a chance a real, live human being would be listening in on some of those voice messages for the purposes of quality control for the voice-to-text feature, maybe they wouldn't opt into using voice-to-text in the first place, or they

34:01

might opt in and not care. In some cases, I'm sure there'd be no shortage of people who would actually say truly terrible things, hoping that some poor contractor would have to listen to it all and check the audio against the automated transcription, because some people would just play nasty. Don't be nasty, by the way. There are better ways to entertain yourself than by making some other person's life miserable. Facebook could potentially face some serious charges based on this practice.

34:30

The company had settled with the Federal Trade Commission, or FTC, earlier in the summer of two thousand nineteen. The settlement was for an incredible five billion dollars, and it largely revolved around the company's rather abysmal record with privacy. The charges date all the way back to two thousand twelve, when the FTC brought eight privacy-related allegations against Facebook. And again, this isn't a big surprise. Zuckerberg had already

34:59

cavalierly proclaimed privacy dead a couple of years before that. Now, in the settlement, Facebook agreed to adhere to some rules. Those rules said that Facebook was prohibited from making misrepresentations about the privacy or security of consumers' information, prohibited from misrepresenting the extent to which it shares personal data, and

35:20

it required Facebook to implement a reasonable privacy program. Now I'm no legal expert, not by a long shot, but it seems to me that Facebook's failure to alert users that their voice-to-text data could be sent to non-Facebook employees for review is in violation of this agreement. The fact that Facebook agreed to these terms in July two thousand nineteen and then continued the practice into August is a big problem.

35:47

Whether or not it will result in further legal action against the company is unknown as I record this episode, but it seems like it's at least possible. So I'm gonna wrap this up. We know that microphones can listen in on us without our knowledge. The NSA worked on programs in the United States that did exactly that.

36:06

And while companies with virtual personal assistants tell us that those assistants only activate when certain phrases are spoken, it's also possible that that list of phrases could go well beyond the ones published by the company. So, in other words, I might know that to wake up my hypothetical virtual assistant, I would have to say the alert phrase Skynet, awaken,

36:29

and then it pays attention. But what if there's a whole laundry list of other words or phrases that could wake it up so that it records or transcribes whatever audio follows? What if, for example, the phrase shopping or going shopping activates it so that whatever follows gets registered

36:48

by the device? So if I tell a friend, tomorrow I'm going shopping for some new sneakers, the device has registered the phrase new sneakers because it paid attention once I said the words going shopping, and then I start seeing ads pop up everywhere I go online for sneakers. Now, is that something that's possible? Well, yeah, it's possible. That doesn't mean it's happening, but it could be. It's also possible that my other behaviors have indicated that I'm on

37:15

the lookout for some new kicks. Coincidence is a thing, and it's frustrating because without seeing behind the scenes, it's hard to draw any firm conclusions. Most of us, myself included, have a limited understanding of exactly how much data we're generating in our day to day lives and how that data can be analyzed for patterns and predictions. We may not even be aware that we're heading toward a particular decision before an algorithm draws that conclusion, and it's spooky

37:46

and disturbing. But it doesn't necessarily mean that we're being spied on by a microphone. It may mean we're just broadcasting our decisions before we even know that we've made a decision, and it does indicate that there is some sort of eavesdropping going on, just not necessarily audio eavesdropping. It's more about all of our other behaviors that humans don't pick up on, so we've never had to worry about it before, but machines can analyze it at a

38:14

level that is disturbing. In fact, an actual study at Northeastern University looked into whether or not phones were getting activated by clandestine phrases and listening in on conversations, and it found that there was no evidence

38:30

that this was happening. They did find that a lot of apps were taking screenshots of stuff on phones and sending those screenshots to third parties, though, so, you know, that's also disturbing. But it doesn't appear that these devices are actively listening to you all the time and recording or transcribing or broadcasting that information anywhere. There's a lot

38:54

to lose from taking that approach. The problem is it is something that is possible, and the other problem is that there are other behaviors we're doing that are just as revealing, if not more so, than recording what it is we're saying, and that without being aware of that, we are just giving away more and more information about ourselves and more and more control over our own lives.

39:21

And we're going to see more and more targeted ads that seem super creepy because they're mentioning things that we didn't think anyone knew about, because most people wouldn't pick up on it. Fun times. So I don't think this was a particularly, you know, um, I don't think this show really helps allay any fears. It may just switch

39:44

fears from microphones to everything else. But I did want to cover this because a lot of people have been talking about it for the last few years, and the news about these transcription services has brought the whole conversation back to the forefront. So I wanted to take an opportunity to really tackle it here on the show. If you have a suggestion for a future episode of Tech Stuff, send me an email. The address is tech stuff at how stuff works dot com, or drop me a line by

40:13

going to tech stuff podcast dot com. You will find there a link to all of our archived episodes, as well as links to our presence on social media where you can get in touch with us, and also a link to our online store, where every purchase you make goes to help the show. We greatly appreciate your support and I will talk to you again really soon. Tech Stuff is a production of I Heart Radio's How Stuff Works.

40:42

For more podcasts from I Heart Radio, visit the I Heart Radio app, Apple Podcasts, or wherever you listen to your favorite shows.

Transcript source: Provided by creator in RSS feed