By Robert McMillan
Last year, Apple Inc. kicked off a massive experiment with new
privacy technology aimed at solving an increasingly thorny problem:
how to build products that understand users without snooping on
their activities.
Its answer is differential privacy, a term virtually unknown
outside of academic circles until a year ago. Today, other
companies such as Microsoft Corp. and Uber Technologies Inc. are
experimenting with the technology.
The problem differential privacy tries to tackle is that modern
data-analysis tools are capable of finding links between large
databases. Privacy experts worry these tools could be used to
identify people in otherwise anonymous data sets.
Two years ago, researchers at the Massachusetts Institute of
Technology discovered shoppers could be identified by linking
social-media accounts to anonymous credit-card records and bits of
secondary information, such as the location or timing of
purchases.
"I don't think people are aware of how easy it is getting to
de-anonymize data," said Ishaan Nerurkar, whose startup LeapYear
Technologies Inc. sells software for leveraging machine learning
while using differential privacy to keep user data anonymous.
Differentially private algorithms blur the data being analyzed
by adding a measurable amount of statistical noise. This could be
done, for example, by swapping out the answer to one question (have
you ever committed a violent crime?) with a question that has a
statistically known response rate (were you born in February?).
Someone trying to find links in the data would never be sure
which question a particular person was asked. That lets researchers
analyze sensitive data such as medical records without being able
to tie the data back to specific people.
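The swap described above is a classic technique known as randomized response. The sketch below is an illustration of that general idea, not Apple's or anyone's actual implementation: each respondent answers the sensitive question only some of the time, and the analyst recovers the population rate arithmetically while no individual answer can be trusted on its own.

```python
import random

def randomized_response(truth: bool, p_truth: float = 0.5,
                        cover_rate: float = 1 / 12) -> bool:
    """With probability p_truth, answer the sensitive question honestly;
    otherwise answer a cover question with a known 'yes' rate
    (e.g. 'were you born in February?', roughly 1/12)."""
    if random.random() < p_truth:
        return truth
    return random.random() < cover_rate

def estimate_true_rate(answers, p_truth: float = 0.5,
                       cover_rate: float = 1 / 12) -> float:
    """Invert the mixing: observed = p_truth*true + (1-p_truth)*cover_rate,
    so true = (observed - (1-p_truth)*cover_rate) / p_truth."""
    observed = sum(answers) / len(answers)
    return (observed - (1 - p_truth) * cover_rate) / p_truth

random.seed(0)
true_rate = 0.10  # assume 10% of respondents would truthfully answer 'yes'
population = [random.random() < true_rate for _ in range(100_000)]
answers = [randomized_response(t) for t in population]
print(round(estimate_true_rate(answers), 2))  # close to 0.10
```

Any single "yes" in the noisy data might come from the cover question, which is what protects individuals; only the aggregate statistic is recoverable, and the parameters here (a 50/50 coin, a 1-in-12 cover rate) are illustrative choices, not values from the article.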
Differential privacy is key to Apple's artificial intelligence
efforts, said Abhradeep Guha Thakurta, an assistant professor at
University of California, Santa Cruz. Mr. Thakurta worked on
Apple's differential-privacy systems until January of this
year.
Apple has faced criticism for not keeping pace with rivals such
as Alphabet Inc.'s Google in developing AI technologies, which have
made giant leaps in image and language recognition software that
powers virtual assistants and self-driving cars.
While companies such as Google have access to massive volumes of
data required to improve artificial intelligence, Apple's privacy
policies have been a hindrance, blamed by some for turning the
company into a laggard when it comes to AI-driven products such as
Siri.
"Apple has tried to stay away from collecting data from users
until now, but to succeed in the AI era they have to collect
information about the user," Mr. Thakurta said. Apple began rolling
out the differential-privacy software in September, he said.
Users must elect to share analytics data with Apple before it is
used.
Apple originally used the technique to understand how customers
use emojis or new slang expressions on the phone. It is now
expanding its use of differential privacy to cover its collection
and analysis of web browsing and health-related data, Katie
Skinner, an Apple software engineer, said at the company's annual
developers conference in June.
The company is now receiving millions of pieces of information
daily -- all protected via this technique -- from Macs, iPhones and
iPads running the latest operating systems, she said.
"Apple believes that great features and privacy go hand in
hand," an Apple spokesman said via email.
Google, one of differential privacy's earliest adopters, has
used it to keep Chrome browser data anonymous. But while the
technology is good for some types of analysis, it suffers where
precision is required. For example, experts at Google say it
doesn't work in so-called A/B tests, in which two versions of a
webpage are tested on a small number of users to see which
generates the best response.
"In some cases you simply can't answer the questions that
developers want answers to," said Yonatan Zunger, a privacy
engineer at Google. "We basically see differential privacy as a
useful tool in the toolbox, but not a silver bullet."
Researchers are coming up with "surprisingly powerful" uses of
differential privacy, but the technology is only about a decade
old, said Benjamin Pierce, a computer science professor at the
University of Pennsylvania. "We're really far from understanding
what the limits are," he said.
Differential privacy has seen wider adoption since Apple first
embraced it. Uber employees, for example, use it to improve
services without being overexposed to user data, a spokeswoman said
via email.
Microsoft is working with San Diego Gas & Electric Co. on a
pilot project to make smart-meter data available to researchers and
government agencies for analysis, while making sure "any data set
cannot be tied back to our customers," said Chris Vera, head of
customer privacy at the utility.
The U.S. Census Bureau confronted the problem of links between
data sets a decade ago. By 2005, the bureau was worried large
databases outside its control could be used to de-anonymize census
participants, said John Abowd, chief scientist at the bureau. After
meeting with some of the creators of differential privacy, the
bureau became a proponent.
In 2008, the bureau released its first product to use the
technology -- a web-based data-mapping portal called OnTheMap --
and the bureau is now "making an intense effort to apply
differential privacy to the publication of the 2020 census," Mr.
Abowd said.
Write to Robert McMillan at Robert.Mcmillan@wsj.com
(END) Dow Jones Newswires
July 07, 2017 07:14 ET (11:14 GMT)
Copyright (c) 2017 Dow Jones & Company, Inc.