Types of Differential Privacy

To achieve differential privacy we have to add noise to our data to protect the users' privacy. This noise can be added either locally or globally.

Local Differential Privacy

Local differential privacy is when noise is added to each individual data point in the database, hence the name local. The noise is added by the users themselves before the data is inserted into the database (i.e. before any query is run). Because the curator never sees the raw values, local differential privacy protects the data without requiring any trust in the curator.
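As a minimal sketch of the idea, here is how each user might perturb a single binary value before submitting it, so the curator only ever receives noisy reports. The function name and data are my own illustration, not a standard API.

```python
import random

def locally_private_bit(true_bit):
    """Perturb one user's binary value before it leaves their device.

    With probability 0.5 the true value is reported; otherwise a
    uniformly random bit is reported, so no single noisy report
    reveals the user's true value with certainty.
    """
    if random.random() < 0.5:
        return true_bit
    return random.randint(0, 1)

# Each user applies the perturbation locally; the database curator
# only ever sees the noisy values.
database = [locally_private_bit(b) for b in [1, 0, 1, 1, 0]]
```

Note that every stored value is plausible either way: a 1 in the database may be the user's true answer or a random flip.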

Global Differential Privacy

Global differential privacy is when noise is added to the output of the query rather than to each data point, hence the name global. The noise is added by the database curator. Global differential privacy protects the data if and only if the database curator is trustworthy, since the curator still has access to the raw data.
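A common way to implement global differential privacy is the Laplace mechanism: the curator computes the exact query result, then adds Laplace noise scaled to the query's sensitivity. The sketch below assumes a simple counting query (sensitivity 1); the function names are my own.

```python
import math
import random

def laplace_noise(scale):
    """Draw one sample from a Laplace(0, scale) distribution
    via inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def private_count(database, epsilon):
    """Global DP: compute the exact count, then add Laplace noise.

    A counting query has sensitivity 1 (adding or removing one
    person changes the count by at most 1), so the Laplace noise
    scale is 1 / epsilon.
    """
    true_count = sum(database)
    return true_count + laplace_noise(1.0 / epsilon)

database = [1, 0, 1, 1, 0, 1]
noisy_result = private_count(database, epsilon=0.5)
```

Smaller values of epsilon mean more noise and stronger privacy; larger values mean more accurate but less private answers.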

Differential privacy always requires a form of randomness or noise added to the query to protect from things like a differencing attack, where an attacker compares the results of two queries that differ by a single individual in order to learn that individual's data.
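To see why noiseless queries are dangerous, here is a toy differencing attack. The data and names are hypothetical; the point is that two exact aggregate queries differing by one person leak that person's value.

```python
# Hypothetical salary database (in thousands).
salaries = {"alice": 90, "bob": 75, "carol": 82}

def sum_query(db, exclude=None):
    """An exact aggregate query with no noise added."""
    return sum(v for k, v in db.items() if k != exclude)

everyone = sum_query(salaries)            # sum over all three people
without_bob = sum_query(salaries, "bob")  # sum over everyone but Bob

# Subtracting the two exact answers reveals Bob's private value.
bobs_salary = everyone - without_bob
```

With noise added to each query output, the difference would no longer pinpoint Bob's exact salary.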

Randomized Response

Randomized response is a technique used by social scientists when studying or analyzing high-level trends of a taboo behaviour. For instance, suppose we are performing a study on a minor crime, say jaywalking, and we go around asking people whether they have ever jaywalked, while guaranteeing them that their answers will be confidential. No matter how much we reassure them, there is still a high chance that some of our sample will lie, and that could skew our data.

Now comes a very cool technique, where a certain degree of randomness is added to each individual's answer, granting them plausible deniability.

Plausible Deniability

So, instead of asking the question directly, we tell each individual to flip a coin twice. If the first flip is heads, they answer honestly; if the first flip is tails, they answer according to the second coin. The idea is that half the time individuals answer honestly, while the rest of the time they answer randomly, with a 50–50 chance of the random answer matching the truth. Because each answer is tied to coin flips that only the respondent sees, every individual gains a local degree of randomness: they can answer the question freely, and their privacy is maintained because any answer can be blamed on the coins.
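The two-coin protocol described above can be sketched as a small function. This is my own illustrative implementation, with `random.random() < 0.5` standing in for a fair coin flip.

```python
import random

def randomized_response(truth):
    """Two-coin randomized response.

    First flip heads -> answer truthfully.
    First flip tails -> answer according to the second flip
    (heads = yes, tails = no).
    """
    first_flip_heads = random.random() < 0.5
    if first_flip_heads:
        return truth
    return random.random() < 0.5  # the second coin decides
```

Each respondent runs this locally, so even the researcher collecting the answers cannot tell whether any single "yes" is genuine.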

The cool thing is that the person performing the study can remove this randomness/noise over the aggregate of the population to obtain an accurate estimate of the true rate: since half the answers are honest and the other half say "yes" with probability 0.5, the observed yes-rate is roughly 0.5 times the true rate plus 0.25, which can be inverted.
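Here is a hedged sketch of that de-biasing step: a simulated population answers via the coin-flip scheme, and the researcher inverts the expected relation observed = 0.5 * true_rate + 0.25. The population size and true rate are made up for illustration.

```python
import random

def noisy_answer(truth):
    """Two-coin randomized response: truthful half the time,
    a fair coin the other half."""
    if random.random() < 0.5:
        return truth
    return random.random() < 0.5

# Simulate a population where the true rate of the behaviour is 30%.
random.seed(0)
population = [random.random() < 0.30 for _ in range(100_000)]
answers = [noisy_answer(t) for t in population]

# Observed yes-rate = 0.5 * true_rate + 0.5 * 0.5, so the noise
# cancels out over the aggregate and can be inverted:
observed = sum(answers) / len(answers)
estimated_true_rate = 2 * (observed - 0.25)
```

Over a large enough sample, the estimate converges on the true rate even though no individual answer can be trusted.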

Note

I am writing this article as part of Udacity's Secure and Private AI Scholarship Challenge, as a way to share what I have learned so far.

#60daysofudacity #secureandprivateai

Aisha Elbadrawy
Secure and Private AI Writing Challenge

I am a computer science graduate who is passionate about problem solving, learning, and education, with interests in software development and machine learning.