Microsoft launched Kinect in November 2010 in a move to change the man-to-machine interface between consumers and their living room content. While incredibly risky, the gamble paid off with the fastest-selling consumer device ever. I saw the potential after analyzing the usage models and technology for a few months after the Kinect launch and predicted that, at a minimum, all DMAs would have the capability.
The Kinect launch sent shock waves through the industry because the titans of the living room like Sony, Samsung, and Toshiba hadn’t even gotten close to duplicating or leading with voice and air-gesture techniques. With Samsung and LG announcing future TVs with this capability at CES, Microsoft’s living room interaction strategy has officially been affirmed by the show and, most importantly, by the CE industry.
Samsung launched what it called “Smart Interaction”, which allows users to control their HDTVs with their voice, with air gestures, and passively with their face. The voice and air gestures operate in a manner similar to Microsoft’s Kinect in that pre-defined gestures exist for different interactions. For instance, users can select an item by grabbing it, which is the equivalent of clicking an icon with a remote. Facial recognition essentially “logs you in” to your profile like a PC would, giving you your personal settings for the TV as well as the virtual remote.
A Step Further Than Microsoft?
Samsung has one-upped Microsoft on one indicator, at least publicly, with its application development model. Samsung has broadly opened its APIs via an SDK, which could pull in tens of thousands of developers. If this gains traction, we could see a future challenge arise where platforms compete on the number of apps in the same way Apple initially trumped everyone in smartphones. The initial iPhone lure was its design but also the apps, the hundreds of thousands of apps that were developed. That app lead made Google Android look very weak until it caught up, still makes BlackBerry and Windows Phone appear weaker, and arguably was the death blow to HP’s webOS. I believe that Microsoft is gearing up for a major “opening” of the Kinect ecosystem in the Windows 8 timeframe, where Windows 8 Metro apps can be run inside the Kinect environment.
Challenges for Samsung and LG
Advanced HCI like voice and air-gesture control is a monumental undertaking and risk. Changing anything that stands between a CE user and the content is risky in that if it’s not perfect, and I mean perfect, users will stop using it. Look at version 1 of Apple’s Siri. Everyone who bought the phone tried it and most stopped using it because it wasn’t reliable or consistent. Microsoft Kinect depends on many, many conditions to work well, including standing in a specific “zone” for air gestures to register correctly. Voice control only works in certain modes, not in all interactions.
The fallback Apple has is that users don’t have to use Siri. It’s an option, and it can be very personal in that most use Siri when others aren’t looking or listening. The Kinect fallback is a painful one, in that you wasted that cool-looking $149 peripheral. Similarly, Samsung “Smart Interaction” users can fall back to the remote, and most will initially, until it’s perfected.
There are meaningful differences among the consumer audiences for Siri, Kinect, and Samsung “Smart Interaction”. I argue that Siri and Kinect users are “pathfinders” and “explorers” in that they enjoy the challenge of trying new things. The traditional HDTV buyer doesn’t want any pathfinding or exploring; they want to watch content and, if they’re feeling adventurous, they’ll go out on a limb and check sports scores. This means that Samsung’s customers won’t tolerate anything that just doesn’t work and won’t admire the “good try” of a Siri-style beta product.
One often-overlooked challenge in this space is content, or the amount of content you can actually control with voice and air gestures. Over-the-top services like Netflix and Hulu are fine if the app is resident in the TV, but what if you have a cable or satellite box, as most of the population does? What if you want to record something on the PVR or play specific content that was saved on it? This is solvable if the TV has a perfect channel guide for the STB and service provider, with IR-blasting capabilities to talk to the box. That didn’t work out too well for Google TV V1, its end users, or its partners.
This is the Future, Embrace It
The CE industry won’t get this right initially with a broad base of consumers, but that won’t kill the interaction model. Hardware and software developers will keep improving it until it truly becomes natural, consistent, and reliable. At some point in the very near future, most consumers will be able to control their HDTVs with their voice and air gestures. Many won’t want to do this, particularly those who are tech-phobic or late adopters.
In terms of industry investment, the positive part is that other devices like phones, tablets, PCs, and even washing machines leverage the same interactions and technologies, so there is a lot of investment and shared risk. The biggest question is, will a company other than Microsoft lead the future of the living room? Your move, Apple.