Apple’s A.I. Research Team Is Playing Catch-Up With Siri
OneZero’s General Intelligence is a roundup of the most important artificial intelligence and facial recognition news of the week.
Like every big tech company, Apple is in dire need of A.I. talent. Machine-learning algorithms serve as a foundation for everything from processing photos to make them look brighter and sharper, to powering Siri, to maybe even driving that Apple car.
So, in 2016, the company hired a well-known Carnegie Mellon professor named Ruslan Salakhutdinov to lead its A.I. division and, in a surprising move by the typically tight-lipped company, launched a research blog to publish some of its own work.
Apple makes some of its work public because the backbone of the A.I. field is still academic, and the ability to publish new research is a primary consideration for PhDs entering the world of tech.
“You can’t tell people, ‘Come work for us, but you can’t tell people what you’re doing,’ because you basically ruin their career,” Facebook chief scientist Yann LeCun told Business Insider in 2016.
Four years later, Apple is still publishing on its research blog, giving some up-to-date insight into what the company’s researchers are working on. There’s no promise that this research will make it into an Apple product, but it shows the kinds of ideas Apple is investing in.
Many of these papers focus on bolstering Siri, the company’s virtual assistant that’s commonly seen as inferior to Google Assistant and Alexa.
Apple researchers are trying to make Siri better understand the intent behind questions and are even trying to decode people’s emotions when they say a command. One paper also talks about “acoustic activity recognition,” or listening for specific noises. In a video accompanying the paper, a HomePod listens to noises being made around a kitchen and asks, “What’s that sound?” to which a researcher responds, “A microwave.”
Other Siri improvements have to do with multilingual use of the virtual assistant, with Apple making its own dataset to benchmark how well virtual assistants can answer questions in 26 different languages. There’s also research on the simple task of making Siri “wake up” to listen for a person’s command.
While the specific research is new, this is a well-worn tale to those watching Apple’s A.I. efforts. In 2017, an A.I. insider sent me presentation slides from a closed-door Apple event at one of the industry’s largest conferences, which showed that Apple’s research team was working on A.I. for health, Siri, image processing, and even self-driving cars.
Many of those same themes can be found on the company’s website today. And Siri is still far behind Google Assistant and Alexa, perhaps even more so than three years ago.
An investment in voice assistants and smart speakers made sense as Apple prepared to expand its line of HomePod smart speakers with the HomePod Mini. Google and Amazon have amassed enormous market share with their respective smart speakers, and it’s clear that Apple is playing catch-up.
Here’s a little more detail on what Apple’s A.I. team is up to:
Making Siri smarter
Apple published quite a few papers this summer relating to voice assistants. One is targeted at better understanding user intent, or figuring out what a person wants Siri to do. Apple researchers describe a method that takes into account not only what a person says, but also context, including your location, browsing history, whether you’re driving, and previous Siri requests. Another tries to infer a person’s emotions by analyzing their voice. And yet another introduces a new dataset to make Siri answer questions more reliably in different languages. The study focused on 26 languages, with the goal of better measuring a voice assistant’s ability to answer questions across multiple languages relative to English.
Insulin-glucose prediction for diabetes
In August, Apple published a paper in which researchers attempt to address the trial and error of finding the right insulin doses by combining A.I. algorithms with more traditional models of insulin prediction. The team stresses that they haven’t solved the problem but have introduced a new way of thinking about glucose prediction. The bigger picture, however, is Apple’s interest in health research and evidence that the company is thinking about how its products can serve users with diabetes.
An A.I.-powered accessibility tool
And in an October paper, researchers describe a new tool called Rescribe, which makes it easier to record audio descriptions for videos. “Inline” audio descriptions like the ones detailed in the paper are essentially voice-overs that narrate a video’s visuals for people who can’t see them. Rescribe’s goal is to allow one person to record these audio descriptions effectively, rather than requiring a team of audio engineers, voice-over actors, and producers. Apple’s use for this could be widespread, from building it into its own video-editing software to using it to make its in-house movies and TV shows more accessible.