Deeper Dive on Siri Usage and the Big Picture for Voice

On Tuesday, Bob O’Donnell wrote an insightful, high-level look at voice-based smart assistants. I want to dive a little deeper with some specific research I did on how people are using Siri and other voice assistants.

While my research on activities done most frequently with voice assistants is similar to Bob’s, I got a little more granular in the use cases.

[Chart: voice assistant use cases]

The chart is rather self-explanatory, but there is a pattern. The voice-based assistants we have today are used primarily for automation. Better put, we use voice to shorten the path to the results we want. Automating tasks by voice is the common thread in how we interact with our devices.

Those who wrote in comments for the “Other” segment listed things like searching or launching an app; still, all of these are automation tasks.

When I asked about frequency of usage, the vast majority said they do these things rarely or never. However, 28% said a few times a week and 11% said daily. This relatively low usage tells me a few different things.

The first is my sense that most people don’t know all the things they can do with their voice assistants. Their willingness to explore is constrained because they feel they need a guide. Many consumers I’ve spoken with say they would like a clear list of things they can ask Siri. Apple does provide this, but my sense is it isn’t obvious to many. This point also brings up a bigger one that has been on my mind for some time.

These voice-based assistants, while rooted in natural language, still have a ways to go before they fully understand the nuances of human interaction. To use a voice assistant today, we still need to learn its vernacular rather than it understanding ours. Essentially, I still have to ask in a certain way. Only when humans can talk to their devices in whatever way they feel comfortable will this technology be close to becoming an essential feature.

What is promising about voice technology is that we have a lot of indications this is a killer feature. My benchmark for a killer feature is that when it works, it blows your mind; it’s truly useful and magical. But when it doesn’t work, it makes you angry. Time and time again I see this with consumers: they love it when it works and get mad when it doesn’t. This is actually a very good sign. The experience was so great it set a new expectation, which is very hard to do to begin with. When that expectation is not met, anger or frustration is predictable. Like I said, this is the positive sign of an important experience.

Being able to speak to our devices with all our natural-language nuances and have it work nearly perfectly every time is what will take voice assistants mainstream and increase daily usage.

As we move closer to artificial intelligence, our devices will not just understand us better through our communication with them; they will also get to know us: our likes, dislikes, habits, nuances, relationships, and so on. Right now, Siri knows where I live but not why I live there. It knows my wife’s name is Jen, and it thinks my office is my local tennis courts because I go there so often. This just scratches the surface of what a true assistant needs to know to fill that role. When I say, “Siri, I’m hungry,” it would be great if she listed the places I like to eat as options, offered to order ahead, or used my food preferences and patterns to recommend a new, well-reviewed place for me to try. Once our devices know us, they can start to work on our behalf in the background and offer a much deeper level of value, like a true personal assistant would. This is where we are heading, I believe. This is where the smartphone becomes even more indispensable than it is today.
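
To make that idea concrete, here is a minimal sketch of how an assistant that knows a user might handle an “I’m hungry” request: surface the places I already like, plus one well-reviewed new option that matches my tastes. This is purely my own illustration in Python, not how Siri or any shipping assistant actually works, and every name in it (UserProfile, handle_hungry_intent, the sample restaurants) is hypothetical.

# Hypothetical sketch of a profile-aware "I'm hungry" intent handler.
# Not any real assistant's implementation; data and names are invented.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    favorite_places: list = field(default_factory=list)   # places the user eats at often
    cuisines_liked: set = field(default_factory=set)      # inferred food preferences

def handle_hungry_intent(profile, nearby_places):
    """Suggest the user's favorites first, then at most one well-reviewed
    new place that matches their inferred tastes."""
    favorites = [p for p in nearby_places if p["name"] in profile.favorite_places]
    new_matches = [
        p for p in nearby_places
        if p["name"] not in profile.favorite_places
        and p["cuisine"] in profile.cuisines_liked
        and p["rating"] >= 4.5
    ]
    suggestions = favorites + new_matches[:1]  # favorites plus one new discovery
    return [f"{p['name']} ({p['cuisine']}, {p['rating']})" for p in suggestions]

if __name__ == "__main__":
    profile = UserProfile(favorite_places=["Taqueria Uno"], cuisines_liked={"mexican", "thai"})
    nearby = [
        {"name": "Taqueria Uno", "cuisine": "mexican", "rating": 4.2},
        {"name": "Thai Basil", "cuisine": "thai", "rating": 4.7},
        {"name": "Burger Barn", "cuisine": "american", "rating": 4.8},
    ]
    print(handle_hungry_intent(profile, nearby))

The point is simply that the value comes from the profile, not the voice interface itself; without knowing my favorites and preferences, the assistant can only fall back to a generic search.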

The last element of this that interests me is who cracks this first. This is a key battle for Google, Apple, Microsoft, Amazon, etc., because if the assistant I “hire” is the best personal assistant on the market, I will never fire it. Ask any executive and they will tell you that once you find a good personal assistant, you never let them go. That is why the “battle of the personal assistant” is central to consumer loyalty in the future.

Published by Ben Bajarin

Ben Bajarin is a Principal Analyst and the head of primary research at Creative Strategies, Inc., an industry analysis, market intelligence, and research firm located in Silicon Valley. His primary focus is consumer technology and market trend research, and he is responsible for studying over 30 countries.

15 thoughts on “Deeper Dive on Siri Usage and the Big Picture for Voice”

  1. Isn’t there the same general-purpose vs. specialized dichotomy as for IoT/smart home? That is, do I want a smart assistant to handle all my queries, or do I want it to handle only my agenda, but better? Especially if discovery is hard and usage is low, should providers be looking for a specific killer use case that will motivate users and provide excellent service (and maybe expand from there later), or should they make sure all wants and needs are somewhat addressed, even if the much broader scope lessens quality and makes advertising/publicizing/training more foggy, and hope users will come back?

    One thing I find counter-intuitive is that smart assistants are better suited to complex tasks. “OK Google, call Mom on her Mobile” isn’t significantly faster than taking out my phone, opening the dialer app, and selecting Mom in recent calls. Not a compelling use case. On the other hand “schedule a meeting early next week with Alice Bob and Chuck” is a lot more efficient than doing it myself.

    1. ‘”OK Google, call Mom on her Mobile” isn’t significantly faster than taking out my phone, opening the dialer app, and selecting Mom in recent calls. Not a compelling use case.’

      It’s a compelling case when you’re driving.

      1. Mmmm… calling while driving is a big no-no. It’s about the same as driving drunk, so I don’t do it. Even though I’m fairly sure “this is only true for the average idiot but not for me” :-p

        1. For people like you, a smart assistant would not induce you to start making phone calls while you’re driving. But for the other, larger subset of drivers, I would prefer that the idiot on the road behind me is using a smart assistant to call his mom rather than directly pressing buttons on his mobile.

  2. “Right now, Siri knows where I live but not why I live there.” I contend that she will never ever know, much less understand, the “why”.

    And without that, smart assistants can never make the sort of inferences that human personal assistants regularly perform for their bosses. Nothing wrong with that, but it is a mistake for developers or device makers to overstate their smart assistant’s capabilities.

    And yes, having to train oneself to speak a certain way just to speak to a smart assistant is a big roadblock to its wider use.

    1. I’m wondering:
      1- If smart assistants (or ad engines) are even trying to be smart. Right now assistants and ads seem to be struggling to just understand what we want, let alone anticipate it. At some point smartness means being pro-active, and that’s… Clippy, at worst ^^
      2- If/when they start trying to anticipate, will they manage to strike the balance between helpful and nagging, which I guess depends on frequency, timing, and relevance? Google Now is trying to do that a bit, suggesting web stuff for me to read, with a 30%-ish hit rate, which is way better than my hand-built RSS hit rate.

      As a side note, do we even know why we live where we do? Between accidental reasons, subconscious stuff, external constraints… my friends way too often have more insight than I do about myself. Let’s not set the bar too high for computers :-p

      1. I am actually far from setting the bar too high for computers. In fact, quite the opposite: I think a lot of people are promising way too much about what computers can or will be able to do.

        1. I think the very idea of calling these features “smart assistants” is misleading, and could cause R&D to follow the wrong path. Much like how Japanese anime biased our robotics research efforts to pursue humanoid robots (mostly in vain up till now), instead of something more practical and attainable like the Roomba.

          As Ben mentions, the current state of smart assistants is nothing of the sort. They are simply voice interfaces. I suspect that acknowledging that these assistants will not be smart in the mid-term, and instead focusing on how conveniently one can control a device, will be key. The company that truly understands what to focus on and makes the necessary compromises, I think, will emerge as the clear victor.

            1. That’s an excellent point. Apparently a scientist at MS (and likely at every other company, too) thinks the problem is speech recognition accuracy, and that we are only 5 years away from solving it.

            http://www.businessinsider.com.au/microsoft-chief-scientist-xuedong-huang-on-the-future-of-speech-recognition-2015-12

            Really the difficulty is not accuracy but meaning. AT&T, in the old days of landline telephones, used to gauge voice quality on the idea that people only hear about 70% of what is said anyway. (I have no idea what study or research they referenced for that.)

            But it seems to me that, just as we read words more by shape than letter by letter, there is more to understanding than just vocal recognition accuracy.

            But then, as George Bernard Shaw once said:

            “The single biggest problem in communication is the illusion that it has taken place.”

            Joe

          2. Very interesting, and I agree on all the points you made.

            Of course, scientists and engineers have to be optimistic; their job is hard enough as it is. I think business leaders, however, should be more focused on what is attainable mid-term. Those who don’t focus there fall into the trap that Sculley did with the Knowledge Navigator.

  3. As I also commented on Bob O’Donnell’s piece, I would appreciate a breakdown of “search for things on the Internet”.

    If Google intends to put any kind of advertising on smart assistants, “search for things on the Internet” is going to be where they make their money. Recommendations are similar in that they are basically searches performed automatically on your behalf.

    If “search for things on the Internet” includes things like searching for a place to eat, then yes, there is an advertising opportunity. Better still if the searches were like “find me a nice necktie to wear at the party,” but it’s challenging to browse through merchandise with a voice interface alone. If, however, the searches are more pinpoint and specific, like “what’s the weather today,” then there is very little advertising opportunity (unless you want to obstruct the user from immediately getting what they want; “It’s raining, and by the way, I know a shop that sells really cute umbrellas” is not what you want to hear).

    I do expect that “search for things on the Internet” by voice is very different from the searching we do on desktop PCs and even smartphones, and I would appreciate any data that highlights the differences or similarities.

  4. With Apple shooting itself in the foot with the privacy stance, I highly doubt it will get there faster than Google or Microsoft.
