Folks who found Apple’s iPhone announcement disappointing, and there were plenty of them, weren’t really paying attention. My colleagues Tim Bajarin and Ben Bajarin have outlined the reasons consumers should be excited about the new phone, despite the fact that it looks identical to its predecessor. I’m going to focus on just one of them, the Siri personal assistant.
It’s a huge mistake to regard Siri as a speech recognition component. Speech recognition has become highly developed, but by itself, it doesn’t do very much. Anyone who has used voice control on an Android phone knows it is very good at letting you dictate messages, but not much else.
Siri cracks a much tougher nut. For it to work, the software, which runs partly on the iPhone 4S and partly on Apple’s servers, must understand not just your words but your meaning. If you ask “should I wear a raincoat today?” and Siri responds with a weather forecast, we are looking at a very significant advance in machine intelligence.
At this point, a couple of very important caveats are in order. Siri looked spectacular in Apple marketing chief Phil Schiller’s demo. But it was a demo, and the people who create demos carefully limit their choices to commands and functions that they are confident will work. Apple didn’t give attendees at the announcement any hands-on time with the phone. So until users have a chance to try out Siri in the wild, we’ll have to reserve judgment on how good it really is. In a move that seems more Googley than Apple-like, Siri is being released with the iPhone 4S on Oct. 14, but it is officially designated as a beta product, perhaps in an effort to temper expectations.
A second question is just how good it has to be for people to find it useful. If it doesn’t truly make the iPhone easier to use, people will abandon it quickly. But if it works anywhere near as well as it did in the demo, I suspect it will revolutionize the way we interact with devices.
While the computers of science fiction have been able to carry on intelligent conversations for decades, it has taken real-world computers about that long just to learn to recognize words reliably. Speech recognition, which companies such as IBM and AT&T began working on seriously in the 1960s, was based primarily on signal processing and statistical analysis. Natural language understanding seemed hopelessly beyond reach, whether the input was spoken or typed.
Siri was developed by a company of the same name that was acquired by Apple. The original research was funded by the Defense Advanced Research Projects Agency, but Apple may have thrown more engineering and computer science muscle into the project than even the Pentagon can afford these days. Siri also had to wait for a dramatic increase in the processing power of mobile devices (one reason it will not be available with iOS 5 on older phones) and for more seamless communications that allow the work to be split between the phone and the server.
As smart as smartphones have become, simple tasks can require an annoying number of steps. Setting up a meeting requires checking a calendar for the proposed time, finding attendees in a contact list, and sending out invitations. If all that can be replaced by pushing a button and saying, “Set up a meeting with Tim Cook for 10 am on Friday,” ease of use will have taken a great leap forward.
One secret to any successful attempt at natural language understanding is restricting the range of commands, known as the domain, that it must make sense of. If you tell Siri, “Write Mr. Smith a script for simvastatin,” your iPhone will probably stare at you blankly (unless, of course, someone uses the Siri application programming interface to create a prescription-writing program). The range of things you can reasonably ask a smartphone to do is still pretty limited.
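To see why restricting the domain matters, here is a minimal sketch of the idea in Python. This is emphatically not how Siri works internally; the patterns, handlers, and responses are all invented for illustration. The point is simply that matching an utterance against a small, known set of intents is tractable, while anything outside that set is rejected.

```python
import re

# A toy "domain": a short list of intents, each pairing a pattern
# with a handler. Everything outside this list is unintelligible.
INTENTS = [
    (re.compile(r"set up a meeting with (?P<who>.+?) for (?P<when>.+)", re.I),
     lambda m: f"Scheduling meeting with {m['who']} at {m['when']}"),
    (re.compile(r"should i wear a raincoat", re.I),
     lambda m: "Checking the weather forecast..."),
]

def handle(utterance: str) -> str:
    """Match an utterance against the domain; fall back if nothing fits."""
    for pattern, action in INTENTS:
        match = pattern.search(utterance)
        if match:
            return action(match)
    return "Sorry, I don't understand that request."

# The meeting request falls inside the domain...
print(handle("Set up a meeting with Tim Cook for 10 am on Friday"))
# ...while the prescription request does not, so the system punts.
print(handle("Write Mr. Smith a script for simvastatin"))
```

A real assistant replaces the brittle regular expressions with statistical language models, but the architectural bet is the same: a bounded repertoire of requests is what makes understanding feasible.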
The critical question is how much of that repertoire of requests Siri will handle well. If it is a reasonable fraction, Siri alone will provide ample reason for the iPhone 4S’s success.