Check out the SoundGuys Podcast on iTunes! This episode “Smart Speakers: A New Legal Frontier” can be downloaded here. Below is a full transcript of the podcast. Be sure to subscribe to iTunes for more episodes!
Unless you’ve been living under a rock, you know smart speakers have taken the world by storm. In households across the globe, these always-listening, always-at-your-beck-and-call personal assistants are used to fill a room with music, read the news, or control your smart home. But these devices also serve a more sinister purpose, with an even more horrifying byproduct for anyone who isn’t a complete exhibitionist: they take away your privacy in your own home… legally.
A smart speaker works by having a microphone that’s always on and always connected to the internet. Once it recognizes a hotword, the device starts recording what you say and sends it to the appropriate service to complete the request. That audio is also transcribed and stored to teach the service more about how people speak. Everything you say to that smart speaker is stored somewhere, along with its text transcription.
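To make that pipeline concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: real smart speakers run hotword detection on raw audio with small on-device models, not on already-recognized words, and the wake phrase, pause marker, and storage here are all invented.

```python
# Illustrative sketch only: the "audio stream" is simulated as a list of
# already-recognized words, and cloud storage is just a local list.

HOTWORD = ("hey", "speaker")  # hypothetical wake phrase

def process_stream(words):
    """Scan a word stream; after the wake phrase, capture the command
    that follows and 'upload' it (here: append to a transcript log)."""
    transcripts = []  # stands in for server-side storage of what you said
    i = 0
    while i < len(words) - 1:
        if (words[i], words[i + 1]) == HOTWORD:
            # Everything after the wake phrase up to a pause marker is
            # treated as the command and sent off-device.
            j = i + 2
            command = []
            while j < len(words) and words[j] != "<pause>":
                command.append(words[j])
                j += 1
            transcripts.append(" ".join(command))
            i = j
        else:
            i += 1
    return transcripts

stream = "hey speaker play some jazz <pause> unrelated chatter".split()
print(process_stream(stream))  # ['play some jazz']
```

The key point the article is making survives even in this toy version: the microphone logic sees *everything*, and whatever follows the trigger (or, in the Google Home Mini bug, whatever the device mistakenly records) ends up retained on someone else’s server.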
Sometimes this happens in error, like when the Google Home Mini was initially sent to journalists and was recording at random intervals 24/7… but who cares, right? There’s gotta be a law against accessing that kind of data, you may assume. You’d be wrong, and in fact the opposite may be true: barring a judgment under federal wiretapping statutes, current US law effectively permits the abuse of that information.
When you’re in your home, it used to be that you were protected from unreasonable search and seizure, as guaranteed by the fourth amendment. However, as time went on, the law had to recognize limits to these protections. For example, if someone is shouting in public that they’re going to burn down a building, it’s reasonable for a law enforcement officer to investigate, right? So rulings over the years have established that the fourth amendment extends only as far as you have a reasonable expectation of privacy. It forces law enforcement to proceed only when there’s an imminent threat, or a damn good reason to believe a crime is being committed (also known as probable cause). This is what prevents many agencies from simply monitoring homes with thermal imaging or other tactics.
While you’re used to being safe in your own home, a smart speaker destroys the legal requirements for your expectation of privacy—making the fourth amendment inapplicable if something you said in front of your Alexa lands you in legal trouble—or could be relevant in a civil suit against you. In United States v. Miller and Smith v. Maryland, the Supreme Court established that a person has “no legitimate expectation of privacy in information [they] voluntarily turn over to third parties.” The instant the data from your smart speaker is communicated outside the home, it’s no longer protected by the fourth amendment. Law enforcement doesn’t need a warrant to obtain that data, nor does a third party. Jay Stanley, an expert at the ACLU, explains:
Yeah, it’s interesting because American courts have always given very, very high privacy protection to the home. There are some things that the police can do outside the home that they can’t do inside the home, like use drug-sniffing dogs, and as Antonin Scalia said in an opinion banning the warrantless use of thermal imagers on a home, the home has always been a sacred space in American jurisprudence. At the same time, when it comes to privacy, there’s a thing called the third-party doctrine, which says that information you share with a third party no longer receives protection under the fourth amendment.
That started off as a common-sense thing where, if you’re having a loud conversation on a sidewalk, you can’t expect to require a warrant for the police to overhear you. But that got extended to your information that’s being held by a bank, or the electric company, or the telephone company. And so with the internet of things, and smart speakers, and smart electric beaters (Ed. note LULZ), a lot of very personal information that is inside the home is streaming to third parties and their servers. On one hand, it should be protected because it’s inside the home; on the other hand, under the third-party doctrine, which we and others are trying to challenge and which really needs to be changed, that information currently does not receive protection under the fourth amendment.
I mean, we think that the police should need to get a warrant in order to access the audio clips of you from your device that are stored on Amazon or Google servers or whoever else’s. We’re worried that police are going to claim otherwise and are going to fight in court not to need a warrant and our wiretapping laws are very, very complex and tangled so it’s pretty unclear to what extent they would apply to that kind of information if the police do want to get your audio. And at the end of the day, it’s either going to have to be hashed out in court, or preferably, Congress would pass a law that clarifies and puts in place some precise and strong privacy protections for this kind of data so we don’t have to feel paranoid in our own living rooms.
While there isn’t much case law established yet specifically concerning smart speakers, there’s very little that can be done to put the toothpaste back in the tube once something sensitive gets out. With a smart speaker or voice assistant active, the device is constantly building a detailed log of who you talk to and what you’re messaging them, and the IP address of the device you’re using can be used to tie together all its traffic and connected accounts… the list goes on. As we saw with Facebook’s scandals in 2018, the amount of personal data these companies are collecting is staggering… and once a leak of this data happens, it’s just out there. Forever.
Smart speakers throw gas on that fire by transcribing the voices of over 47 million Americans daily, and that number is growing. Much of this data contains information you don’t want people to have, like who you are, where you live, what you’re doing in your home, and what you’re saying to your friends and family. Private conversations end up by the truckload on the servers of Google, Amazon, and Apple. The thing to remember is: because all that speech is transcribed into written words, it can be mined for personally identifiable information with disgusting accuracy by someone with a little scripting knowledge. Research has shown that roughly 87% of Americans can be uniquely identified with merely a zip code, birthdate, and sex, and that info can easily be extrapolated from voice data. Even the Apple HomePod with its anonymized data collection isn’t safe on a long enough timeline.
I mean anonymization is always good. It’s not always something that you can depend upon because a lot of things are easy to de-anonymize. You might make reference in your conversations to the town you live in, and your gender might be apparent, and you might make a reference to an interest of yours, like you’re interested in softball, or you might make a reference to something else, and by the end of the day, it’s actually not that hard to figure out who you are, even if you don’t say your address or something more explicit like that.
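The quasi-identifier math Stanley is describing can be sketched in a few lines. The dataset below is invented for illustration; the point is that combinations of coarse attributes (zip code, birthdate, sex) become near-unique fingerprints even after names are stripped out:

```python
# Toy re-identification sketch: count how many records are pinned down
# uniquely by their quasi-identifiers alone. Names and values are made up.
from collections import Counter

people = [
    {"name": "A", "zip": "10001", "birth": "1990-04-12", "sex": "F"},
    {"name": "B", "zip": "10001", "birth": "1985-07-03", "sex": "M"},
    {"name": "C", "zip": "94110", "birth": "1990-04-12", "sex": "F"},
    {"name": "D", "zip": "94110", "birth": "1990-04-12", "sex": "F"},
]

def reidentifiable(people, quasi_ids=("zip", "birth", "sex")):
    """Return the fraction of records whose quasi-identifier combination
    is unique in the dataset, i.e. identifies exactly one person."""
    keys = [tuple(p[q] for q in quasi_ids) for p in people]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(people)

print(reidentifiable(people))  # 0.5: records A and B are uniquely pinned down
```

In a real population the fractions are far higher than this toy example suggests: the 87% figure cited above comes from exactly this kind of counting over US census-scale data.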
But there’s more, too. Not only is what you search for getting transcribed, but that recording of your voice can be analyzed to identify you in other contexts. Sci-fi shows often focus on camera systems picking people out of a crowd, but people are already getting identified by their voice.
One danger is simply that it’s a new vector for surveillance. We know that some banks and other companies are compiling large databases of voice prints so that people can be recognized by their voices. And it means that, once a bank or government agency or anybody else has that kind of data, it can be re-purposed for all kinds of things. You might be walking down the street, and you might be identified by your voice. You might call anonymously to an AM radio show about experiencing sexual abuse or harassment and you think you’re anonymous, but you’re not, because someone has your voice print.
There are also questions about the accuracy of those kinds of voice prints. People may be susceptible to being misidentified. Maybe your voice sounds very similar to the timbre of a terrorist from one continent or another. So I think those are some of the dangers around that kind of technology for civil liberties.
In short, no matter how careful you are, and no matter how well these companies say they’re protecting you, the fact of the matter is you’re creating a huge database on everything that you are—to be used by someone you’ve never met, with motivations you won’t know until something happens. It may not be today, but there’s more that can be done with this data as time goes on.
What’s the worst that could happen? (Spoiler alert: It’s really, really bad)
Basically, the smart speaker bypasses the legal expectation of privacy, which may make you wonder, “OK. What’s the worst thing that could happen?” Well, let’s take a hypothetical scenario.
Say one day, something you love—be it video games, music, beer, whatever—becomes illegal. You’ve been a fan of this your entire life; you’ve bought apparel, you’ve read articles on it, but most importantly, you’ve commanded your smart speaker to search for it. Because law enforcement doesn’t need a warrant to obtain a huge treasure trove of information on people who like this thing, they buy access to this data, and they may gain probable cause to search your home. It’s also possible that your employer, or your social circles, could learn of your newly illegal interest.
Your life would be effectively ruined, even if no formal punishments came your way. Nobody wants to live in a Black Mirror episode, right? Well, it can get a whole lot worse depending on where you are. That worst-case scenario is already happening in China with its nascent social credit system.
First let’s define what a merit system is. According to the Oxford English Dictionary, it is “a system in which a post or promotion is awarded on the basis of competence rather than other criteria such as political affiliation or length of service.” In theory, this is ideal. After all, who wouldn’t want to be rewarded solely for their hard work? Well, a social credit system doesn’t only reward citizens based on merit. It also punishes them. According to a Wall Street Journal article from 2016, Beijing implemented this experimental social credit tool, which gives the government the right to blacklist its citizens from taking out loans, applying for jobs, and traveling.
Tacking onto this, Marketplace published an article detailing Xie Wen’s story: he applied for a bank loan and was rejected… which clued him in to the fact that he had been blacklisted and added to the Chinese Supreme Court’s list of discredited persons. As of the article’s publishing back in February, “Since the blacklist was created in October 2013, 9.59 million people have been added to the list.”
If you follow the rules, you’ll be fine in a merit-based system. But rules can change; your past cannot.
Again, “Why does this matter? I follow the rules and am fine with a merit-based society.” Well, the rules are subject to change at the will of the government in question. An innocuous activity like getting coffee could be deemed undesirable, and you’re instantly guilty of this offense. No trial, no defense, no path to restitution. You’re marked forever.
But before this dystopian-seeming credit system becomes ubiquitous, what are the ways that we’re combatting privacy breaches and ensuring data protection?
Where do we go from here?
Now this penalty applies to the data controller or the processor that is collecting or processing the data, like cloud services. Interestingly, it currently does not apply to governments using personal data for law enforcement, which can get complicated. But going through the process to be removed isn’t a sure thing. You can easily be denied; in fact, chances are you WILL be denied. According to a recent article by Forbes, 90% of requests to delist information were from individuals seeking to remove themselves from news articles, directories, and social media. Of those, only 46% were approved. On top of that, delisting yourself can be denied based on what’s deemed to be in “the greater public interest”. If you go to the EU Privacy Removal request form, there’s a section that states:
“When you make your request, we will balance the privacy rights of the individual concerned with the interest of the general public in having access to the information, as well as the right of others to distribute the information. For example, we may decline to remove certain information about financial scams, professional malpractice, criminal convictions, or public conduct of government officials.” – EU Privacy Removal Request Form
Before we get even further into the weeds, it’s a good idea to define what “personal data” is. According to a press release from the European Commission, personal data is defined as “any information relating to an individual…. It can be anything from a name, a home address, a photo, an email address, bank details, posts on social networking websites, medical information, or a computer’s IP address.” And now the important question: does it even matter? What I mean is: to what extent is this law even working? It’s currently being debated whether or not the “right to be forgotten” applies overseas. Google has agreed to remove personal data from its domains based in the EU, but not abroad. So anyone overseas can still see the data if they search for you, and anyone within the EU can just use a fake IP address or a VPN. So if I’m a citizen in the EU and have my data successfully removed, my friends and family in New York can still see it.
Because politics is politics, Google’s refusal to remove results from domains outside the EU makes me think the only way to get true privacy is if the world says enough is enough (something like United Nations v. Google). Another option would be for Google to decide it’s in its own best interest to comply with this new wave of personal data protection and implement data protection by default, like pseudonymization of personal data: a data management procedure in which personally identifiable fields within a record are replaced by one or more artificial identifiers, or pseudonyms. This isn’t perfect and there are ways around it, but it’s a start.
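For the curious, pseudonymization can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s actual scheme: the secret key, the field names, and the record are all assumptions invented for the example.

```python
# Minimal pseudonymization sketch: direct identifiers are replaced with
# artificial tokens, while non-identifying fields pass through untouched.
import hashlib
import hmac

SECRET_KEY = b"rotate-me-regularly"   # must be stored separately from the data
PII_FIELDS = {"name", "email", "ip"}  # hypothetical fields to pseudonymize

def pseudonym(value):
    """Deterministic keyed pseudonym: the same input always maps to the
    same token, but it can't be reversed without the key. A keyed HMAC
    (unlike a plain hash) resists dictionary attacks on guessable values."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:12]

def pseudonymize(record):
    return {k: (pseudonym(v) if k in PII_FIELDS else v)
            for k, v in record.items()}

record = {"name": "Xie Wen", "ip": "203.0.113.7", "query": "weather"}
safe = pseudonymize(record)
# 'query' survives for analytics; 'name' and 'ip' are now opaque tokens
```

Note the trade-off this illustrates: because the mapping is deterministic, the data stays useful for linking records, but that same property means anyone holding the key (or a leaked lookup table) can re-identify people, which is exactly why the article calls it a start rather than a solution.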