Announcing CrowdZen

Solomon Rousseau
Published in The Privacy Point
May 23, 2017

We are excited to announce that CrowdZen launched with UCLA Dining on Friday, May 12, 2017, at 11am PST. CrowdZen lets students “Never Wait in Line Again!”

CrowdZen is a research project whose goal is to provide real-time, privacy-preserving open data to help transform UCLA into a data-driven, smart campus.

Why is this interesting and how does it compare to other universities?

To the best of our knowledge, UCLA, through CrowdZen and UCLA Dining, is the first campus to provide its students with real-time, privacy-preserving activity levels. Students can now optimize their schedules, check ahead of time how crowded popular campus locations are, and beat the lines. Because the data is open, students and researchers can analyze it and build further enhancements such as prediction algorithms. These algorithms would estimate how crowded a particular dining hall might be based on the menu, the week of the quarter, and the hour of the day.

In comparison, other universities place a web camera at the entrance. A camera does not protect student privacy, since anyone on the internet can clearly see and identify the students in the video, and it does not provide a privacy-preserving open data API.

How is privacy preserved?

CrowdZen advances the state of the art in several key areas while holding to the latest standards (e.g., no linkable identifiers). The key differentiator is that the actual location is never reported. Rather, the location is privatized before it is transmitted. When the aggregate counts are calculated, the “noise” or “error” introduced by the privatization is estimated and cancelled out, giving a nearly accurate, privatized real-time count.
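To make "estimated and cancelled out" concrete, here is a short worked derivation. It is only a sketch under assumptions the post does not state: each of $N$ participants independently reports being at a given location with probability $p$ if they are actually there and with probability $q$ otherwise. The symbols $N$, $n$, $p$, $q$, and $Y$ are introduced here for illustration and are not CrowdZen's own notation or parameters.

```latex
% n = true (unknown) number of people at the location
% Y = observed number of privatized reports naming the location
\mathbb{E}[Y] = p\,n + q\,(N - n)
\quad\Longrightarrow\quad
\hat{n} = \frac{Y - q\,N}{p - q}, \qquad p \neq q,
\qquad \mathbb{E}[\hat{n}] = n .

% Example with assumed values N = 1000, p = 0.6, q = 0.3, Y = 420:
\hat{n} = \frac{420 - 0.3 \cdot 1000}{0.6 - 0.3} = \frac{120}{0.3} = 400 .
```

The term $qN$ is the expected number of reports that privatization alone would produce, and dividing by $p - q$ rescales what remains into an unbiased estimate of the true count.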

To put this in perspective, Waze users continuously upload their actual location to the Google Waze servers. This raises two issues. The first is privacy: Waze allows users to view current traffic in an arbitrary window, so a malicious user simply needs to move the window to track a specific vehicle. The second is a denial-of-service attack, or rather an induced traffic jam: a malicious user can create and impersonate multiple vehicles in a specific area and induce traffic jams for a targeted vehicle. Since most Waze drivers simply follow Waze routes, and Waze already suggests haphazard side routes, the drivers would be none the wiser. While Google Waze has written a response, a fundamental issue remains: the actual location is reported to Google Waze, rather than the data being privatized first and only the privatized data uploaded.

Computing Over Privatized Data

CrowdZen introduces the first running system which aims to “compute over privatized data”. That is, only privatized data is collected and published, so all algorithms (e.g., machine learning) compute over privatized data rather than the actual data.

A simple example is as follows. Suppose I am at Ackerman. Rather than reporting my actual location, Ackerman, I report that I am at BPlate and Powell Library. That is, I claim to be at two locations at the same time. In this case I never report my actual location, Ackerman, but I introduce error by reporting two other locations. The key is that the “randomization” is performed via Bernoulli trials (biased coin tosses) and everyone follows the same protocol. The coin toss biases are known, so the number of “erroneous” reports can be estimated and then subtracted from the “noisy” aggregate count. That is, error is introduced via the sampling variance, and it is precisely this variance which provides the privacy.

The state-of-the-art gold standard for privacy is a notion called differential privacy. Differential privacy guarantees that the released aggregate output is almost independent of the presence or absence of any individual in the database. That is, regardless of whether a particular individual participated, the aggregate output is pretty much the same. Thus, it is not possible to determine that particular individual’s record.
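The Ackerman example above can be sketched in a few lines of Python. This is a minimal illustration of randomized reporting with Bernoulli trials, not CrowdZen's actual protocol: the function names, the biases `p_true` and `p_other`, and the population sizes are all hypothetical. Each user tosses one biased coin per location, and the aggregator uses the known biases to subtract the expected number of erroneous reports.

```python
import random


def privatize(actual, locations, p_true, p_other, rng):
    """Return the set of locations one user reports.

    One Bernoulli trial per location: bias p_true for the user's actual
    location, bias p_other for every other location (assumed values).
    """
    report = set()
    for loc in locations:
        bias = p_true if loc == actual else p_other
        if rng.random() < bias:
            report.add(loc)
    return report


def estimate_counts(reports, locations, p_true, p_other):
    """Debias the noisy tallies using the known coin biases.

    For a location with n real visitors out of N users,
    E[noisy] = p_true * n + p_other * (N - n), so
    n is estimated as (noisy - p_other * N) / (p_true - p_other).
    """
    total = len(reports)
    estimates = {}
    for loc in locations:
        noisy = sum(1 for r in reports if loc in r)
        estimates[loc] = (noisy - p_other * total) / (p_true - p_other)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    locations = ["Ackerman", "BPlate", "Powell Library"]
    # Hypothetical ground truth, never sent to the server.
    actual = ["Ackerman"] * 6000 + ["BPlate"] * 3000 + ["Powell Library"] * 1000
    reports = [privatize(a, locations, 0.6, 0.3, rng) for a in actual]
    for loc, est in estimate_counts(reports, locations, 0.6, 0.3).items():
        print(loc, round(est))
```

Setting `p_true` to 0 models the variant in the example where the actual location is never reported; the same estimator still applies as long as `p_true` differs from `p_other`. The closer the two biases are, the noisier the estimate becomes, which is the accuracy cost of the added privacy.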

CrowdZen improves upon the notion of differential privacy and provides a stronger form of privacy. In subsequent posts we will examine more of the privacy state of the art and the innovations CrowdZen introduces.
