Nuance Exec on iPhone 4S, Siri, and the Future of Speech

Though the iPhone 4S appears nearly identical to the current iPhone 4, it is, as my colleague Tim Bajarin points out, a revolutionary device because of its voice-based Siri interface. For the past 20 years, we humans have learned to point and click, but this has never been a natural way to interact with our environment. Touch and speech, on the other hand, have been around since we were living in caves.

Nuance CTO Vlad Sejnoha

“Speech is no longer an add-on,” says Vladimir Sejnoha, chief technical officer of Nuance, probably the world’s leading speech technology company. “It is a fundamental building block when designing the next generation of user interfaces.”

Sejnoha is faithful to the code of omerta that Apple imposes on its vendors. Although Nuance has supplied technology both to Apple and to Siri, the latter before Apple acquired it in 2010, he declined to discuss Nuance’s role in the iPhone 4S: “We have a great relationship with Apple. We license technology to them for a number of products. I am not able to go into greater detail. But we are very excited by what they have done. It’s a huge validation of the maturity of the speech market.”

But Sejnoha made no effort to hide his enthusiasm for the Siri approach. “It allows you to find functionality or content that is not even visible,” he says. “It provides a new dimension to smartphone interfaces, which have been sophisticated but shrunken-down desktop metaphors.”

It has been a long, hard slog for speech to become a core user interface technology. It took a good thirty years, from the late ’60s to the late ’90s, for speech recognition (the ability to turn spoken words into text) to become practical. “Speech recognition is not completely solved,” says Sejnoha. “We have made great strides over the generations and the environment has changed in our favor. We now have connected systems that can send data through the clouds and update the speech models on devices.”

Recognition alone is a necessary but hardly sufficient tool for building a speech interface. For years, speech input systems have let users do little, and sometimes nothing, more than speak menu commands. This made speech very useful in situations where hands-free operation was desirable or necessary, but left speech as a poor second choice where point-and-click or touch controls were available.

The big change embodied by Siri is the marriage of speech recognition with advanced natural language processing. The artificial intelligence, which required advances in the underlying algorithms as well as leaps in processing power on both mobile devices and the servers that share the workload, allows software to understand not just words but the intentions behind them. “Set up an appointment with Scott Forstall for 3 pm next Wednesday” requires a program to integrate calendar, contact list, and email apps, create and send an invitation, and come back with an appropriate spoken response.

Sejnoha sees Siri in the iPhone as just a beginning. “Lots of handset OEMs are working on it,” he says. “There is a deep need for differentiation in Android, and Apple will only light a fire under that. Our model is to work closely with customers and build unique systems tailored to their visions.” And while a speech interface can drive search, it can also become an alternative to it: “One consequence of using natural language in the user interface is direct access to information. We can figure out what you are looking for and take you directly there. You don’t always have to go through a traditional search portal. It will change some business models.”

Nor do the opportunities stop at handsets. “Speech is a big theme for in-car apps because that is a hands busy, eyes busy environment,” Sejnoha says. “All the automotive OEMs are working on next-generation connected systems. The industry is undergoing revolutionary change.”

The health care market is another hot spot. “Natural language is taking center stage in health care,” Sejnoha says. “We are mining data and using the results to populate electronic health records.” Nuance recently signed a deal with IBM to provide technology for a speech front-end to the health care implementation of its Watson question-answering system.

The key to the next breakthroughs in speech technology, Sejnoha says, is making effective use of the vast amount of speech data that now exists, a challenge that has also attracted Nuance competitors Google and Microsoft. “Most algorithms use machine learning and are very data-hungry,” he says. “No one knows yet what to do with tens of thousands of hours of speech data. The race to do that is on. We are doing fundamental research and have a relationship with IBM Research as well. It requires a broad array of techniques to model speech in a robust way and to learn the long tail statistically and to build techniques that can benefit from large amounts of data. It’s a very exciting time.”



Published by

Steve Wildstrom

Steve Wildstrom is a veteran technology reporter, writer, and analyst based in the Washington, D.C. area. He created and wrote BusinessWeek’s Technology & You column for 15 years. Since leaving BusinessWeek in the fall of 2009, he has written his own blog, Wildstrom on Tech, and has contributed to corporate blogs, including those of Cisco and AMD. He also consults for major technology companies.

691 thoughts on “Nuance Exec on iPhone 4S, Siri, and the Future of Speech”

  1. I always have loved the Apple demos for speech recognition at WWDC and I’m looking forward to trying Siri on my iPhone.

    Way back when, we were working with Apple’s speech recognition system for an automated system. One of the people in the company was from Nigeria and, while his English was impeccable, his accent was extremely thick. The old speech recognition stuff in the 1990s would just look confused whenever he spoke.

    I’ll be curious to see Siri’s “learning ability” with things like this.
