Listening pros and cons

Robert Dale

Oct 9, 2017

The Pros and Cons of Listening Devices

Vastly improved speech recognition, backed by a more slowly improving ability to make sense of the recognized speech, has brought state-of-the-art NLP into our homes in the form of smart speakers and other devices that listen. There's no doubt these devices can be incredibly useful, but they also may support incursions into our privacy. We look at where we are today, consider what might be coming, and express just a tad of caution.

This article first appeared as the September 2017 Industry Watch column in the Journal of Natural Language Engineering. You can find the full citation details here, and learn more about the Language Technology Group here.

Say hello to a new housemate

You will have noticed that the battle for the living room is well under way. It's looking to be the big money fight of the decade. In one corner, we have the Amazon Echo, slightly reminiscent of the black monolith from Kubrick's 2001, or perhaps looking more like a deluxe Pringles can. In the opposite corner, we have Google Home, whose similarity to an air freshener has been noted by many. Each wants to be the favoured house guest, and there's a lot at stake.

In my house, opposite corners of the lounge room are indeed where we've chosen to locate our two fledgling AI cohabitants; being suckers for gadgets, we're among the 8 per cent of smart speaker owners who have hedged their bets by accommodating an instance of each. I'm not quite sure why we've physically separated them in this way, allocating them separate territories as if they were two cats forced to tolerate living together. A suspicion they might get into an argument if they were too closely co-located, perhaps? Or an anthropomorphic sentiment that they each deserve their own space? Or maybe just that they don't look that good sitting next to each other.

Of course there are other embodied intelligent assistants knocking at the door. Harman Kardon's Cortana-powered Invoke is due in the United States in the fall, and may be available by the time you read this. Apple's HomePod is due at the end of the year. Samsung is said to be working on a smart speaker driven by Bixby, a virtual assistant whose first appearance on smart phones received a fair bit of criticism. There will be others. But right now, the focus of attention is on Google versus Amazon.

Each of the contenders for being your resident AI has a different ecosystem that it fits into. If you make use of any of Google's services, Google knows a great deal about you (it's worth periodically checking out myactivity.google.com or www.google.com/maps/timeline to remind yourself) and is well placed to leverage the knowledge implicit in that data. The Echo knows your Amazon purchase history, and not surprisingly a key part of the game plan there is to make it incredibly easy to go voice shopping for more stuff at The Store That Has Everything [one-grunt ordering?]. Traditionally Google has made money from you via targeted advertising, but in a sign that it too wants a cut of your shopping dollars, the company has just announced a partnership with Walmart, positioning itself for head-on competition with Amazon.

Mom, there's a salesperson in the kitchen

I doubt that many people select which smart speaker to buy based on appearance alone. You could choose based on capabilities, and right now the Echo seems to be the clear winner here, with 17,000 skills as compared to Google Home's 468 actions. But that difference is largely an artifact of Amazon's 16-month lead in entering the market, and is likely to be evened out over time, just as the Apple and Android app stores have grown to be roughly the same size.

Your choice of device is perhaps more likely to be influenced by which of the offerings is able to take advantage of what it knows about you to add value. There's a kind of self-perpetuating soft lock-in that we're already familiar with from alliances that tie together airline frequent flyer programs, gas station franchises and supermarket chains. Indeed, as the market here develops and matures, it's not entirely inconceivable that you might even choose your smart speaker based on your preferred grocery store.

When their makers deem that the time is right, you can be sure that these devices will start serving ads. Amazon already dipped its toes in this water earlier in 2017, allowing six-to-ten-second sponsored messages to be played at the start or end of an interaction with a skill. But shortly thereafter, they pulled back from this. I expect they will return before too long. At first these ads will be clearly marked as such, but, just as some newspapers carry advertorial content that can be hard to distinguish from independent news, the scope for conversational advertising will inevitably blur the boundaries between our assistant just being its usual cheery and helpful self, and seeking out a commission via persuading us to buy a particular brand of pasta.

Amazon's Echo Show adds a touch screen and camera to the basic smart speaker setup, resulting in a kind of iPad-Echo hybrid. Now that's potentially really useful; a picture's worth a thousand words, and I really want to see those socks before I buy them. And the same device that interactively steps you through a video of how to make a complicated dish can interrupt what it's doing for some face-time with the kids. But it also enables a dynamic, personalized advertising billboard right there in your kitchen.

Hey Alexa, invite your friends over

Apart from weather updates and music streaming, a key potential of these devices that you hear a lot about is as controllers for the smart home: lock the door, dim the lights and turn the heating up with your voice.

Amazon and Google are falling over themselves to voice-enable all manner of devices. At the Consumer Electronics Show in January 2017, you could find Alexa integrations in Ford automobiles, robots, kitchen appliances and other brands of smart speakers. In July, Google had over 70 home automation partners and counting.

Samsung (which, as well as being first in worldwide smartphone market share, is also a massive white goods manufacturer) has indicated that, by 2020, all of its home appliance line-up will have smart features. The company is developing partners to expand voice to wearables, in-car systems and audio rigs.

And of course Apple has HomeKit for Siri, although this seems a bit left out in the cold until the HomePod lands; it's not so convenient to have to pull out my phone to switch on the heating.

The result of all this combination of voice interfaces and device integration: friction-free ubiquitous access to intelligent devices around the home. The gap between future vision videos and what we have today has never been smaller.

But is this really such a good thing?

Take vacuum cleaners. Among the smart devices you can control with Google Assistant is LG's new Hom-Bot Turbo+ robot vacuum cleaner. What's interesting about the Hom-Bot is that it comes with six cameras, to make navigating the home easier and avoid getting stuck on objects or knocking them over. But the cameras are also touted as having a security function: if the vacuum detects movement after the owner has left the house, it can record up to 100 minutes of video. Presumably it also has that capability when I'm still at home.

And newer Roombas are able to upload data about the layout of your home to the cloud. iRobot, the manufacturer, has indicated that it might sell that data to others, such as Amazon, Alphabet or Apple. Isn't it a bit of a worry that these things are wandering around your house looking in every nook and cranny? Right now, we are assured that our data is safe and nothing will be done without our permission. But hang on, take a look at Google's terms and conditions; there's already a lot they can do with your data. And while these corporations might do no evil today, how can we be so sure that will be true into the future?

In 2015, it was revealed that Samsung smart TVs were sending unencrypted voice search data across the net. Earlier this year, Vizio, a TV manufacturer, was fined $2.2M for tracking viewing habits and selling the data to advertisers. Consumer Reports was prompted to offer advice on how to stop your TV from snooping.

Also in 2015, researchers found security flaws in Mattel's talking Hello Barbie doll that would allow a hacker to reuse its authorization credentials. And earlier this year, Germany's Federal Network Agency banned a doll called My Friend Cayla on the grounds that it was an espionage device, transmitting everything it heard back to base. In the wake of this, the FBI issued a privacy warning for Internet-connected toys. Long gone are the days when all a parent had to worry about was the ingestion of small pieces of plastic.

So what do you do? Some of the advice being offered is pretty extreme: Mark Pugh from iServPro suggests that "When your device is on, it's your responsibility to remember not to say anything within earshot of it that you wouldn't say in public. Don't get too comfortable around it."

Creeping creepiness

I remember the first time Google popped up a card on my phone to remind me, unsolicited, that my car was parked a certain distance away. I thought this was cute, but my partner found it creepy. As our smart devices work more and more together, I think we're going to have a lot of creepy moments. Do I really want my electric toothbrush to tell Alexa about my brushing habits so I can be better targeted for new dental care products?

It's not just that all these devices will be connected to each other. The fact that they are listening makes it infinitely harder for us to maintain an accurate mental model of what they know and what they don't know. Much is made of the fact that today's smart speakers only start recording when the wake words are heard. But this is a software-imposed limitation, and clearly the capabilities for always-on listening are there. We are simply trusting that the manufacturers will do the right thing. That might not be so comforting if you happen to live in an authoritarian state.

And besides, there's always scope for hacking: Wired reported at the beginning of the year that a security researcher had succeeded in turning an Amazon Echo into an eavesdropping device. Now, this particular hack required physical access; but should I believe that my internet-connected smart speaker is more secure than Iran's nuclear facilities?

I recall when CCTV cameras first started being installed in the United Kingdom in large numbers in the 1980s. Many people expressed concern about what was then seen as a gross invasion of privacy. Today, we are much more blasé about the divide between public and private; while there are still some who stick masking tape over their webcams just in case someone is snooping, there are many more who post pictures of the most intimate moments of their lives for all to see. As these smart devices become more and more integrated in all that we do at home, the boundaries become ever more fuzzy and hard to determine.

Like the frog that boils to death as the temperature of the water slowly increases, we have allowed our lives, and now our homes, to be step-by-step made accessible to third parties. It's easy to see why. Interacting with all these devices by voice is just massively convenient. But it comes with a price. From one angle, it looks like we are willingly installing and paying for the last mile of the infrastructure needed for the ultimate surveillance society. Just sayin'.

Let Cayla, the compromised doll we mentioned earlier, have the last words. In a YouTube video, Norway's Consumer Council's technical director Finn Myrstad asks Cayla, "Can I trust you?" Cayla replies, "I don't know."

If you enjoyed reading this piece, please check out my other posts on Medium. If you'd like an easy way to keep up with the key developments in the commercial NLP world, consider signing up for This Week in NLP, a short and snappy weekly newsletter published each Friday.
