Siri and Apple’s Machine Learning Are About to Get a Lot Better

| Columns & Opinions

Siri, as we’ve known her (or him), has been both a blessing and a frustration. The technology, when it works, is brilliant, but when its limitations are exposed, it can be maddening. Our appetite for a stellar chatbot companion has merely been whetted, and we’re about to get one. From Apple. On its terms. With privacy.


A very special article I came across today is a tour de force of journalism by the esteemed Steven Levy, “The iBrain is Here.” It’s a must-read for several reasons.

First, it’s important to understand Apple’s technical approach to Siri, neural nets, machine learning, artificial intelligence, and user privacy. Second, it’s important to understand that Apple isn’t going to settle for something half-baked or something that ends up being too creepy to bear. The article above explains all that and includes a discussion of Differential Privacy, to be introduced in iOS 10. Finally, it’s reassuring to know that, despite the glitz we’ve seen from other companies like Amazon and Google, Apple is serious, well positioned, and mindful of how this technology will impact users. Apple is taking a back seat to no one.
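Differential Privacy deserves a moment of unpacking. The classic intuition behind it is randomized response: each device deliberately adds noise to its own answer before anything leaves the phone, yet the aggregate statistic across many users can still be recovered. Here’s a minimal sketch in Python — purely illustrative, and far simpler than the mechanisms Apple is actually shipping in iOS 10:

```python
import random

def randomized_response(truth: bool, p_honest: float = 0.75) -> bool:
    """Report the true answer with probability p_honest;
    otherwise report a coin flip. No single report reveals the truth."""
    if random.random() < p_honest:
        return truth
    return random.random() < 0.5

def estimate_true_rate(responses, p_honest: float = 0.75) -> float:
    """Invert the noise over many reports:
    E[reported yes] = p_honest * true_rate + (1 - p_honest) * 0.5."""
    reported = sum(responses) / len(responses)
    return (reported - (1 - p_honest) * 0.5) / p_honest
```

The point: any individual answer is deniable (it might be the coin flip), but with enough users the population-level rate falls out of simple arithmetic. That is the bargain Levy describes — useful training data without a dossier on any one person.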

Alternatively, a troubling thing I’ve also seen discussed is the claim that we should not get our hopes up for pleasing voice interactions with our computers. In this article at TechCrunch, “The new paradigm for human-bot communication,” the author/researchers give themselves some serious wiggle room. It’s almost defeatist.

The good news is that the belief that bots must master human language or replace apps to succeed is false.

I soundly reject that notion. Their lack of vision is staggering.

The Holy Grail is Holy for Siri

The whole point of man-machine voice interaction is to completely duplicate the manner of human-human communication. After all, our kids will be using these systems too. Anything less would not be the result that Apple, legendary for its user interface expertise, would want to achieve. The Steven Levy article above reaffirms that goal. Here’s a great section to ponder.

So Apple moved Siri voice recognition to a neural-net based system for US users on that late July day (it went worldwide on August 15, 2014.) Some of the previous techniques remained operational — if you’re keeping score at home, this includes “hidden Markov models” — but now the system leverages machine learning techniques, including deep neural networks (DNN), convolutional neural networks, long short-term memory units, gated recurrent units, and n-grams…. When users made the upgrade, Siri still looked the same, but now it was supercharged with deep learning.
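Of the techniques Levy rattles off, n-grams are the simplest to illustrate: the model predicts the next word purely from counts of short word sequences seen in training. A toy bigram sketch in Python — illustrative only, nothing like Apple’s production system:

```python
from collections import defaultdict, Counter

def train_bigrams(tokens):
    """Count word-pair frequencies: model[current][next] is how often
    `next` followed `current` in the training text."""
    model = defaultdict(Counter)
    for cur, nxt in zip(tokens, tokens[1:]):
        model[cur][nxt] += 1
    return model

def most_likely_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

Fed the phrase “set a timer set a reminder set an alarm,” such a model learns that “a” is the likeliest word after “set.” The deep neural networks in the quote above do something vastly more powerful, but the n-gram counts show the basic statistical bet: the recent past predicts the next word.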

This groundwork is fundamental to Apple’s near term with the iPhone as well as the future, especially when it comes to communicating in tense situations with an Apple car (rumored) or family service robots (hypothetical).

I expect, given the resources Apple and others are pouring into this research, a day will come when many of the stilted, fictional voice interactions with computers we’ve seen in SciFi to date will seem ridiculous. Right now, they’re often seen as a model to shoot for. In ten years, we’ll laugh and laugh till it hurts to see how primitive our expectations were.

And we’ll be surprised to see ourselves in a new light. Not with hubris, eternally superior to the machine. Instead, we may well come to appreciate how to partner with and learn from this new intelligence. It’s going to be a leapfrog affair that will propel us both.

But this also has to be done carefully, and Apple seems to sense that achieving the goal with class, security, privacy and dignity for customers means a lot more in the long run for human culture than simply selling more sugar water.

Aptly, Mr. Levy closes with: “Skynet can wait.”

7 Comments

  1. “This groundwork is fundamental to Apple’s near term with the iPhone as well as the future, especially when it comes to communicating in tense situations with an Apple car (rumored) or family service robots (hypothetical).”

    Yes, I suspect a strong integration with the car and Apple’s other products. I am also thinking that the car could be in the list of items in Find My iPhone, whether or not the phone is onboard.

  2. John:

    This was a treat to read. Just a couple of thoughts.

    First, regarding the perception by some (perhaps many) that Apple are lagging behind in AI development: to some extent this may be a function of confirmation bias, enhanced by the virtual echo chamber of social media and by commenters and pundits reinforcing the image of SIRI in its original release (eg MS’s adverts lampooning how ineffectual Apple’s devices and AI are compared to the Surface and/or Cortana).

    I have been impressed with what are clear ‘under the bonnet’ enhancements to SIRI over the past couple of years, and the precision and speed with which the AI responds to not only natural voice but user intent. We have become so accustomed to the rapid evolution of our technology that we dismiss what, even a couple of decades ago, was still the stuff of SciFi (I loved your link to Kirk using SIRI aboard the Enterprise).

    The second has more to do with Apple’s business model and quiet approach to R&D. While the tech pundit and financial analyst communities might prefer a public road map, Apple have learnt the hard way what happens when competitors descry their intent. Apple have demonstrated, more than once, the advantage of quietly setting the board before going for checkmate (like 64 bit processing on iOS devices). While it is true that multiple companies are angling to market a state of the art AI device, what they want to do with it, and how Apple intend to leverage it as an integral part of a platform they alone control from concept to rollout, are likely quite different — not only in use case but even in background development. As Levy points out, Apple have figured out how to provide the data deep machine learning requires while preserving user privacy, and will publish those methodologies for the greater tech community. Apple continue to write their own playbook. It remains to be seen how many companies can actually emulate it with as much success.

    It is clear that Apple will not permit perception to alter their business model or development strategy, even though, as Bryan Chaffin points out, they are more proactively taking control of their own narrative regarding development.

    Apple is a different company today than it was even 10 years ago, even though the embarkation on its present course was initiated even earlier still. The company is taking itself and us into uncharted territory; eschewing conventional wisdom (netbooks), boldly discarding old technologies (headphone jacks, floppy drives), and developing new skill sets to embark on whole new enterprises (wearables, automobiles, healthcare). As a result, two things are certain.

    First, those who are wedded to the old Apple will fall away in disillusionment, and will be replaced by new users. Second, critics will continue to decry Apple’s folly and predict doom and calamity, until they see and experience those new products. Apple’s best rejoinder remains what it has always been: products that take us into the future and, by being a joy to use, enhance our creativity and productivity.

  3. aardman

    The problem with voice recognition, and this just happened to me with iMessages when I tried it again after reading this article, is that if you use a new word, it makes a mistake and you have to correct it manually. That defeats the purpose of using dictation. Okay, it’s too much to ask the app to know something it hasn’t heard before, but there must be a way to make the correction also through the voice interface.

    Now that seems to be a big problem for a silicon mind: how does iMessages’ natural language interface distinguish between a command and actual text without reserving words that signal that a command or text is about to follow? Humans easily make this distinction in conversation. A steno easily figures out if what she is hearing is text to be transcribed rather than an aside. And if she makes a mistake, it is easily corrected through conversation. She easily infers intent; she has a good idea of what’s in the head of the person giving dictation.

    I think this is a challenge that pervades all of AI: inferring idiosyncratic intent.
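    To see why reserving words defeats the purpose, imagine the naivest possible router (hypothetical names, nothing like Apple’s actual system):

```python
# Toy reserved-keyword router: an utterance is a "command" only if it
# opens with a reserved word; everything else is treated as dictated text.
COMMAND_WORDS = frozenset({"delete", "replace", "send"})

def route_utterance(words):
    """Classify a tokenized utterance as 'command' or 'text'."""
    if words and words[0] in COMMAND_WORDS:
        return "command"
    return "text"
```

    The failure mode is immediate: dictate “send my love to grandma” and the router fires the “send” command instead of transcribing the sentence. Without inferring what’s in the speaker’s head, keywords just trade one class of error for another.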

    There is a much-reproduced experiment in psychology (a field notorious for irreproducible experimental results) where they show that humans, down to 2 years old even, understand the concept of ‘people other than myself have their own thoughts and own sets of knowledge’ and, more importantly, can make accurate guesses of what another person is thinking of and knows.

    Earlier this week, news came out that this experiment was run on bonobos, chimps and orangutans and the key finding is that they also exhibit this ability. (Well maybe not to the extent that we do.) Here…

    As far as I know computers can’t do this, and as long as they don’t, AI systems will not be able to do the Star Trek-like things that people hope they can do in the future.
