Bill Stasior, who previously led Siri development at Apple, sat down to discuss virtual assistants and how they can improve over the next three to five years (via Business Insider).
Apple didn’t say much about Siri during WWDC19, but she will be getting a new voice using neural text-to-speech. Mr. Stasior says that no virtual assistant yet lives up to the promise of understanding people as naturally as another person can:
“I think everyone learns what commands work with the assistants and what commands don’t work with the assistants. And while that’s improving very rapidly right now, I think there’s still a long way to go.
“When you want to talk to an assistant, you’re opening the door for almost any task or any question. There’s just an incredibly broad variety of language and ways of expressing ourselves. And having that general capability, we’re still a ways away from it.”
Apple also announced multi-user support on the HomePod. Siri will be able to tell who is speaking to her, and the HomePod will play that person’s favorite songs, playlists, and so on.
Mr. Stasior went on to say that people sometimes mistake one person for another, for example when talking on the phone, yet we still expect machines to tell people apart:
“I think that’s interesting because we sometimes expect machines to be able to tell one person from the other just from the sound of their voice. It’s possible that they might be able to do that better than people. But for a starter, we know that [since] it’s hard enough to do as people, that it’s a challenging problem.”