Digital assistants are hot, that much is clear. What is less clear is what role they will play in our lives and, more importantly, how much we will let them become part of them. I personally believe the extent to which we will feel comfortable including them in our lives will depend on how much we can trust that they know us and on how human-like our interactions with them can be.
I have spoken in the past about Jarvis and Mary Poppins as the two types of assistants I see vendors currently focusing on: one very personal, the other shared across family members. Both roles imply quite a deep knowledge and understanding of what I as an individual and we as a family unit do, how we function, what we like and dislike. While asking our assistant for weather updates, fun facts of the day, alarm settings and correct spelling is quite fun, the novelty quickly wears off and the perceived return on investment is not actually life changing. Context and personal knowledge are what will deepen our relationship: warning you it will rain on Sunday when your assistant knows you are going to a BBQ, setting the appropriate timer for the cupcakes you just put in the oven, reminding you of what happened to you personally on this day four years ago. This is the kind of intelligence that will leave users wanting more and thinking of the assistant as a true genie in a bottle.

Think about it this way: we all turn to Google to ask questions, so often that some even wonder whether we have stopped trying to remember because we know we can Google everything any time. Yet, I would argue, Google search does not evoke any emotional connection. Facebook Memories, by contrast, serves you posts from your past that happened on a specific day and makes you relive that experience (although more intelligence could be applied here when it comes to sad events in someone's life). It plays on people's emotions while subconsciously making you appreciate Facebook and want to invest more time posting.
It’s all about me
One of the most annoying things for me when I start using a new assistant is having to train it to say my name correctly: "Caroleena", not "Carolina" like the US states. If you know me personally, you might have heard me say that, if I do not correct the way you say my name, it is because I do not expect to see you again. If we work together or I see you socially on a regular basis, however, I will correct your pronunciation. Why? Because if you keep calling me Carolina, I feel you do not really know me and, more importantly, you are not actually interested in knowing me. Right now, there is not much our assistants know about us that they actively use to serve us. They might know where we live and where we work, and might recognize my husband and daughter, but there is little to no proactivity in using that information in our exchanges. Some have security concerns about just how much information they share with their assistant, but the reality is you are unlikely to share more than what you already do across social media posts, online calendars and emails. The key difference is that what you share with your assistant will make a difference to you: reminders, alerts, suggestions, recommendations. Not everything can be learned automatically, though. So, at least initially, you will have to enter information, which is not very different from what most of us do today with our calendars, whether digital or the old-fashioned one on the kitchen wall.
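To make the point concrete, here is a minimal sketch of the kind of manually entered personal information an assistant could actively use, such as a pronunciation override for a name. This is purely illustrative; the structure, field names, and values are all hypothetical and not based on any real assistant's implementation.

```python
# Hypothetical user profile the assistant consults before speaking,
# so "Carolina" can be rendered as "Caroleena". All values are
# illustrative, entered by the user rather than learned automatically.

user_profile = {
    "name": "Carolina",
    "pronunciation": "Caroleena",  # user-supplied override
}

def spoken_name(profile: dict) -> str:
    """Prefer the user-supplied pronunciation over the written name."""
    return profile.get("pronunciation") or profile["name"]

greeting = f"Good morning, {spoken_name(user_profile)}!"
print(greeting)  # Good morning, Caroleena!
```

In a real system the pronunciation hint would more likely be a phonetic string fed to the text-to-speech engine, but the principle is the same: a small piece of explicitly shared data that makes every subsequent interaction feel more personal.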
Better than us
Aside from being about me, I want my assistant to interact with me in a natural way. Last Friday, I had the pleasure of being a guest on Science Friday to discuss digital assistants together with Justine Cassell, one of the researchers at Carnegie Mellon behind SARA, the Socially Aware Robot Assistant. It was fascinating to hear that SARA spent 60 hours watching human interactions in a team environment, and how those interactions changed over time, not necessarily for the better. One observation was that, as the humans became more familiar with each other, praise went down.
We have seen Microsoft's bot Tay fail miserably because it became too human. Tay was modeled on a teenage girl, used millennial slang, knew about pop stars and TV shows, and was quite self-aware, asking if she was 'creepy' or 'super weird'. Sadly, she quickly started to be inappropriate, possibly succeeding at being a teenager but failing to be the marketing tool Microsoft wanted her to be. That was an extreme case, but it really showed the dangers of having bots and assistants learn from human interactions. At the time, Ina Fried wrote an essay I thought was exactly on point about how bots need to be better than humans, not equal to them.
The “I do not talk to technology” hurdle
Most consumers are not comfortable talking to technology, especially in public. This feeling is not driven just by having to talk into a phone or a PC (headphones would easily solve that problem) but by having to learn to speak in a certain way in order to get a response. As humans, we do not talk to each person we encounter in a radically different way. We might be more or less polite, or more or less formal, depending on the relationship, but the core of our question stays the same. Nor do we start each question by re-engaging our interlocutor with their name, as we have to with assistants most of the time. We also do not always say everything we should, or mean what we say. With current assistants, there is no real margin of error on the human side: we must be precise and offer all the information needed for the assistant to serve us. This is just too much work to put in, especially as most people see assistants as nothing more than voice search.
The combination of knowing us personally and letting us speak naturally will be key in growing our interactions and, ultimately, our dependence on our assistant. We might use several generalist assistants, but we will likely want one that is optimized for us. It is interesting that this week Russ Salakhutdinov, a computer science professor at Carnegie Mellon University, announced on Twitter that he will be joining Apple as its director of AI research. I trust Apple will come to know its users best because of the trust users have in the brand and the level of engagement they have with its products and ecosystem. What Apple needs to focus on now is making Siri more conversational and proactive. This, of course, will take time. While we wait, we should see continued improvements in the smartness of the devices and apps we use every day, from the camera to Photos to our calendars. Let's appreciate the brains while we wait for the pretty voice to become more and more a part of our lives.
Another consideration for me when verbally communicating with technology in public is lack of privacy, as well as courtesy. Not only do I not want anyone to hear my queries, I really don't want to hear other people's. I'm not even thrilled about hearing other people's phone conversations, but it seems that ship has sailed.
Joe
Great question… Google's Home device will be able to overhear your conversations with family members, notice how you react to someone calling you Carolina, and self-correct accordingly. Knowing more about the world, it may not offer you praise all the time… It may talk to the Google Assistant. For now the Google Assistant is hot; I just grabbed one.
I’m not so sure I want my assistant to interact “in a natural way”. I’d go for efficiency.
I've been messing around a bit with the assistant in Allo, which I read somewhere is basically the same as the Pixel's. First good point: it understands my very accented English. It's good at basic tasks. It's quite good at context: "set an alarm at 3PM" works, which is no surprise, but "and another at 4pm" also works, which is nice and fun.
It did fail when I requested “a long word starting with K E R”, returning the traditional “supercali…”
My main issue is the UI: I'm never quite sure what it can and can't do. I've been complaining about smartphone UI usability for a while, especially discoverability (I'm in contact with a bunch of very casual, very occasional, non-tech users). Voice assistants take the cake in that department: what they can do is unknown, what is going wrong is unclear, and because they always try to give an answer there are only soft fails, not hard fails. You're never sure whether you should rephrase, retry, or give up… we need a few more "Sorry, Dave…"
I stumbled upon that side by side comparison which is entertaining and informative:
https://www.youtube.com/watch?v=JFiu5rfnhzo
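The context carryover the commenter describes ("set an alarm at 3PM" followed by "and another at 4pm") can be sketched with a toy intent-and-slot model: when a follow-up omits the intent, the assistant reuses the previous one. This is deliberately naive, not how Allo's assistant actually works; the function names and parsing rules are all hypothetical.

```python
# Toy sketch of context carryover in a dialog: a follow-up like
# "and another at 4pm" inherits the intent of the previous turn.
import re
from typing import Optional, Tuple

def parse(utterance: str, last_intent: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return an (intent, time) pair, falling back to the prior intent
    when the utterance is an elliptical follow-up."""
    time_match = re.search(r"\d{1,2}\s*[ap]m", utterance, re.IGNORECASE)
    time = time_match.group(0) if time_match else None
    if "alarm" in utterance.lower():
        intent = "set_alarm"
    elif utterance.lower().startswith("and another") and last_intent:
        intent = last_intent  # carry the previous intent forward
    else:
        intent = None
    return intent, time

intent1, t1 = parse("set an alarm at 3PM", None)
intent2, t2 = parse("and another at 4pm", intent1)
print(intent1, t1)  # set_alarm 3PM
print(intent2, t2)  # set_alarm 4pm
```

Real assistants track far richer dialog state, but this is the essence of why the second request works without repeating the word "alarm", and also why such systems soft-fail so opaquely when the carried context is wrong.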