Apple Opens Up About Siri Data & Privacy

| Analysis


Apple has openly explained its policy on how it collects and stores Siri-related data, revealing for the first time the company's approach to Siri privacy. In a statement to Wired, the company said that it collects and stores the data anonymously and that it will delete that data after two years.

The first thing to understand is that Apple records and stores the vocal interactions customers have with Siri. When it does so, however, it attaches the data to a random number generated just for your Siri interactions. In other words, Apple is not attaching that data to your Apple ID, your name, an email address, or your phone number.

This process is referred to as anonymizing the data because, barring forensic efforts, there's no way to later attach the data to you, the user. At the same time, Apple still has access to the data for analysis, which is key to improving the service both in terms of interacting with users and improving Siri's results.

Apple then stores that information for six months, presumably for additional analysis. At the six-month mark, Apple "disassociates" the data from the random identifying number, eliminating even that option for attaching it to a specific user, but it keeps the disassociated data for another 18 months.

At that point, the two-year mark, Apple deletes the information. Or rather, it will: Siri was introduced only some 18 months ago, which means that everything we, as users, have said to Siri is still sitting in one of Apple's data centers (think Maiden, NC).
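The retention lifecycle Apple describes can be sketched as a simple policy check. Everything here is illustrative: the function names, the use of a UUID for the random identifier, and the day-count thresholds are assumptions for the sketch, not anything Apple has published about its systems.

```python
from datetime import datetime, timedelta
from typing import Optional
import uuid

# Illustrative thresholds matching the described policy
DISASSOCIATE_AFTER = timedelta(days=183)   # ~6 months
DELETE_AFTER = timedelta(days=730)         # ~24 months

def siri_record(utterance: str) -> dict:
    """Store an utterance keyed to a random identifier,
    not an Apple ID, name, email, or phone number."""
    return {
        "id": uuid.uuid4().hex,   # random number tied only to Siri use
        "utterance": utterance,
        "created": datetime.now(),
    }

def age_out(record: dict, now: datetime) -> Optional[dict]:
    """Apply the described policy: strip the identifier at ~6 months,
    delete the record entirely at the two-year mark."""
    age = now - record["created"]
    if age >= DELETE_AFTER:
        return None                      # data deleted outright
    if age >= DISASSOCIATE_AFTER:
        record = dict(record, id=None)   # identifier stripped, data kept
    return record
```

Note that, per Apple's statement, turning Siri off short-circuits this timeline: both identifiers and any associated data are deleted immediately.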

Apple also said that "If a user turns Siri off, both identifiers are deleted immediately along with any associated data."

This information has not previously been available, and the American Civil Liberties Union wants Apple to post it on its Siri FAQ. In our opinion, that's a splendid idea. We are also pleased to see Apple opening up in the first place. While we're at it, Wired gets credit for getting the company to reveal the information.

This is useful information.

For Apple to publicly come out and declare that they are keeping anonymised data for a fixed term is not simply important; it has substantive accountability implications.

I share anonymised data all the time with key research partners. For example, many US Gov’t agencies I collaborate with can only receive anonymised data (e.g. US CDC, NIH). There are stringent regulations on what constitutes anonymised data, and they are not a matter of opinion. There can be no backward reconstitution capability for the end user (in this case, Apple themselves) to re-assign personal or unique identifiers to those data, such that they are once again personalised. It’s permissible for me, as the holder of the original data, to be able to do so; often I send the anonymised file to, say, CDC for them to put in the lab results, and they subsequently return that anonymised file to me, which I reconstitute to its original form with the unique identifiers. For Apple to claim that the data are anonymised means that they should have no way to do this (barring forensics, which is not a high-throughput technology and thus cannot be used for the total population of Siri users).

Many thanks.
