Tech.pinions

The Voice UI has Gone Mainstream

The idea of talking conversationally to computers has been a long time in the works. Science fiction is so often a self-fulfilling prophecy, as it provides a vision for humans to chase after with technological innovation. Those of us who have watched voice-based computer interactions evolve have seen them go through many manifestations as the technology grew up. We now find ourselves in a world where using voice to interface with a computer is commonplace for the masses. While I’m not quite confident we have reached an inflection point, I am confident we are at least on the cusp of one, with voice-based user interfaces moving toward the vision of HAL 9000 (the AI of Arthur C. Clarke’s Space Odyssey series) and Jarvis (the voice-based AI assistant of Iron Man).

In anticipation of this and the many other “voice first” products and experiences we believe will come to market in 2016-2017, we set out to do a quantitative study of Amazon’s Echo, Apple’s Siri, and Google’s OK Google. We conducted two separate studies in early May, since our intuition told us voice would be a major theme at Google I/O and at Apple’s upcoming WWDC. We focused the Amazon Echo study on our early adopter panel since we knew we would not get a statistically significant number of Echo owners in our mainstream representative US consumer panel. We collaborated on the Amazon Echo study with my friend Aaron Suplizio (@aaronsuplizio) from Experian. Experian is also studying how the Echo is being used, specifically in the context of conversational commerce. (Experian didn’t pay us to do the study but did cover the costs for the raffle in which two respondents won a $100 gift card.)

The second study focused on our mainstream consumer panel, to learn how they use Siri and OK Google (or any Google voice-based search technique) and what their overall perception of each is. I’ll start by sharing what we learned about the Amazon Echo.

Amazon Echo and the Voice First User Interface
By spreading our study across 1300 early adopters, we found 13.86% of the panel owned an Amazon Echo. It came as no surprise to us that the overwhelming majority of Echo owners also owned an iPhone (83.72%), as iPhone owners at large tend to show more early adopter tendencies than Android owners. What was most enlightening, in contrast to the Siri and Google voice study, was how different usage of the Echo was from usage of Siri and OK Google. This was interesting in terms of both the location of usage and the most common tasks.
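As a rough sanity check on the sample sizes behind those percentages, the implied respondent counts can be worked out directly. This is only a sketch: it assumes the panel was exactly 1300 people (the article may be rounding), and the names `PANEL_SIZE` and `implied_count` are illustrative, not part of the study.

```python
# Figures quoted in the article. Treating the panel as exactly 1300
# respondents is an assumption; the published number may be rounded.
PANEL_SIZE = 1300
ECHO_OWNER_SHARE = 0.1386           # 13.86% of the panel own an Echo
IPHONE_SHARE_AMONG_OWNERS = 0.8372  # 83.72% of Echo owners own an iPhone


def implied_count(base: float, share: float) -> int:
    """Return the nearest whole number of respondents for a share of a base."""
    return round(base * share)


echo_owners = implied_count(PANEL_SIZE, ECHO_OWNER_SHARE)
iphone_echo_owners = implied_count(echo_owners, IPHONE_SHARE_AMONG_OWNERS)

print(f"Implied Echo owners: about {echo_owners}")
print(f"Implied Echo owners with an iPhone: about {iphone_echo_owners}")
```

Under that assumption the Echo findings rest on roughly 180 respondents, which is worth keeping in mind when reading the Echo percentages that follow.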

We first wanted to understand where in the home most consumers use the Echo (we had a hunch it was either the kitchen or the living room). The kitchen has the edge on the living room, with 51% of consumers saying they have their Echo in the kitchen.

Given the type of things the Echo does, and perhaps in alignment with Amazon’s goals in delivering services to consumers via the Echo, knowing the primary usage room is important, particularly because the things we ask of our voice assistants are likely to vary with the room or physical location we are in. For example, asking the Echo to turn the TV on is less relevant as a primary task unless the Echo is in the living room. We can certainly make the case that someday voice assistants will be available at all times in all rooms.

We followed this question by asking respondents to choose the top two things they do most often with their Echo. The three most common use cases were: play a song (34%), control smart lights by turning them on or off (30%), and set a timer (24%). A few quick thoughts on Echo usage.

Playing a song as the top use case is not surprising given the product is positioned as a smart speaker. Bluetooth speakers have sold well at retail, and the idea of having portable sound around the house is compelling for consumers. A speaker also makes sense as the entry point for a smart voice assistant, given the need for a speaker, microphone arrays, and noise cancelling tech for better speech recognition. Controlling the lights is, in my opinion, a solid indicator that voice-controlled smart home technologies will someday become commonplace. As our homes get smarter, it makes sense that we will interact with our smart objects through voice. It may be the catalyst that brings the truly smart, automated home to the masses.

In terms of overall satisfaction, most Echo owners were satisfied with the product as a whole, but satisfaction ranked highest when we asked specifically about the Echo’s voice recognition capabilities. Owners felt it delivered on recognizing what they were saying and performing the task they asked of it. This has a lot to do with the Echo’s microphone tech and noise cancelling capabilities, as well as its connection to persistently good broadband. That connection is often where Siri and OK Google break down, for example when used while driving or in areas with poor mobile broadband service.

Only 13% of Echo owners said their usage had declined since they acquired it. The top reason given by those using it less was “the novelty of using my voice is wearing off”.

Understanding how Siri and Ok Google Are Used

Perhaps the most important observation we came away with from our study was that Siri is the most used voice-based user interface. In our mainstream panel of 518 consumers (44% iPhone owners, 40% Android owners, 2% Windows Phone or Blackberry, 13% don’t own a smartphone), 65% indicated they had used either Siri, Google’s “Ok Google or voice search,” or Microsoft’s Cortana. Only 21% had never used Siri, compared with 34.8% who had never used Google’s voice solution and 72% who had never used Microsoft’s Cortana. More consumers across the spectrum of operating systems (iOS, Android, Windows) have used Siri than any other voice UI. I credit the success of Apple’s iPad with contributing to this result, since many Android phone owners, non-smartphone owners, and Windows PC owners have iPads as well.
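To put the 65% figure in context, here is a minimal sketch of the sampling margin of error for a share measured on a panel of 518. It uses the standard normal-approximation formula for a proportion and assumes the panel behaves like a simple random sample, which a recruited consumer panel may not; the function name `margin_of_error` is illustrative and not something from the study.

```python
import math


def margin_of_error(share: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n) and assumes
    a simple random sample, which is an assumption, not a property the
    article states about its panel.
    """
    return z * math.sqrt(share * (1.0 - share) / sample_size)


MAINSTREAM_PANEL = 518    # panel size quoted in the article
USED_ANY_VOICE_UI = 0.65  # share who had used Siri, Google voice, or Cortana

moe = margin_of_error(USED_ANY_VOICE_UI, MAINSTREAM_PANEL)
print(f"65% plus or minus {moe * 100:.1f} percentage points")
```

Under these assumptions the uncertainty is around four percentage points, so the gap between the 21% who have never used Siri and the 34.8% who have never used Google's voice solution is larger than sampling noise alone would suggest.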

Looking at how they used each voice UI, we see that, for the most part, people use Siri and OK Google/voice search in the same ways on their smartphones. Contrasting these common usages with those of the Echo, we see how distinctly a voice user interface on a personal computing and communications device like a smartphone, PC, or tablet differs from one that is stationary in the home and positioned as a smart hub.

Search is the most common task done on smartphones or tablets using Siri or OK Google/Google Voice. Google announced at Google I/O that 20% of all Google search queries are now done by voice. Looking at the data, we can conclude more voice search queries are done with Siri than with Google’s voice-based search. When I look at these most common tasks, they strike me as fairly basic, which is an important observation given where the market is today. These tasks may be the most common simply because the products are still somewhat limited in their capabilities, but it could also be because they are the ones that work best and most consistently.

Overall satisfaction with the voice recognition of Siri and OK Google/Google voice search was relatively similar, and differed only slightly from the grades iPhone owners gave Siri and Android owners gave OK Google/Google voice search. Both were below 80%, which is not bad for where these technologies are today. The Echo’s voice recognition capabilities did yield higher satisfaction rates than both Siri and OK Google/Google voice search, but I attribute that to the technological advantages of being stationary, having better noise cancellation, and having a persistent high-bandwidth connection to the internet. All of these are variables that affect the experience of voice-enabled user interfaces.

Finally, where people use voice-based user interfaces is another important factor to understand. Among those who use Siri or OK Google/Google voice search most regularly, 51% said the car is their primary location for voice-enabled actions. The home was second with 39%. From a cultural perspective, it should come as no surprise that both these locations offer an element of privacy, which is why only 6% of respondents said they commonly use Siri or OK Google/Google Voice in public.

Going Forward

I walked away from this study with confidence that the voice user interface has gone mainstream. What’s more, mainstream consumers seem to recognize its value and convenience. Consider these statements from consumers:

  1. It does not always work but when it does it is very useful – 55% strongly agree
  2. I would use my device’s voice capabilities more if I could speak to it more naturally – 43% strongly agree
  3. If it worked more often, I would use my device’s voice assistants more – 48% strongly agree
  4. I want my device’s voice interface to integrate better with more devices and apps that I use regularly – 66% strongly agree
  5. I am not comfortable speaking to my technology – 41% strongly DISAGREE

It is encouraging, from a sentiment perspective, that voice looks to be a natural extension of our keyboard/mouse/touch-based input and output methods. Consumers seem to recognize the value and want it to work in more ways. I’ve long said the true test of a great feature very early in its life cycle is whether it combines both delight and frustration. Once you use it, you’re hooked, but you want it to be great all the time because you can see the potential. This is why we slipped a question on this into the sentiment segment to see if consumers agreed: 47% strongly agree and 38% somewhat agree that, when their voice assistant works, it is great and, when it doesn’t, they get irritated.

The battle for the voice-based assistant is on. This is another area where the one with the biggest ecosystem built around their Voice UI/Voice OS has the best shot of being “hired” by the masses.

We appreciate all our panelists and their willingness to share their thoughts on consumer technology products. If you are interested in participating in our consumer studies, please click here.