Disclaimer: These notes are taken for the CS577 Data Privacy course offered by Dr. Erman Ayday in the 2021/2022 Fall Semester at Bilkent University.
The key idea behind differential privacy is that a user is given plausible deniability by adding random noise to their input. In the centralized differential privacy (DP) setting, noise is added to the database. Since the type of noise added is known, statistical queries can still be computed by filtering out the noise while maintaining each user's individual privacy. However, this approach requires users to trust the database maintainer to protect their privacy [1]. Local Differential Privacy (LDP) instead assumes that the data collector is untrusted: each participant locally perturbs her/his raw data with a differentially private mechanism and transfers only the perturbed version to the untrusted server [2].
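To make the local-perturbation idea concrete, here is a minimal sketch of classic binary randomized response (not RAPPOR itself; the function names and the single-bit setting are illustrative assumptions). It shows both the client-side noise and how the collector, knowing the noise distribution, filters it out in aggregate:

```python
import math
import random

def randomized_response(true_bit: int, epsilon: float) -> int:
    """Report true_bit unchanged with probability e^eps / (e^eps + 1),
    otherwise flip it; this satisfies epsilon-LDP for a single bit."""
    keep = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return true_bit if random.random() < keep else 1 - true_bit

def estimate_true_ones(reports: list[int], epsilon: float) -> float:
    """Server-side debiasing: since the keep probability is public,
    E[ones] = n_true * keep + (n - n_true) * (1 - keep), which can be
    inverted to estimate the true count without seeing any raw bit."""
    keep = math.exp(epsilon) / (math.exp(epsilon) + 1)
    n, ones = len(reports), sum(reports)
    return (ones - n * (1 - keep)) / (2 * keep - 1)
```

Because each individual report is noisy, no single answer reveals the user's true bit with certainty, yet the aggregate estimate converges to the true count as the number of participants grows.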
RAPPOR [3], developed at Google, builds on randomized response and Bloom filters to privately collect Chrome settings from millions of users. Given a client's value v, the RAPPOR algorithm, executed on the client's machine, reports to the server a bit array of size k that encodes a "noisy" representation of the true value v. The noisy representation is chosen so as to reveal only a controlled amount of information about v, limiting the server's ability to learn with confidence what v was. The algorithm takes the client's true value v and the execution parameters k, h, f, p, q, and performs the following steps locally on the client's machine (see the sketch after this list):

1. Signal: hash v onto a Bloom filter B of size k using h hash functions.
2. Permanent randomized response: for each bit B_i, create a noisy bit B'_i that equals 1 with probability f/2, 0 with probability f/2, and B_i with probability 1 − f. B' is memoized and reused for all future reports of v.
3. Instantaneous randomized response: allocate a report array S of size k and set each S_i to 1 with probability q if B'_i = 1 and with probability p if B'_i = 0. Send S to the server.
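The sketch below follows these three steps. The parameter defaults and the SHA-256-with-index-salt hash family are illustrative assumptions (the paper leaves the hash choice open), and the memoization of B', which is required in practice, is omitted for brevity:

```python
import hashlib
import random

def rappor_report(v: str, k: int = 128, h: int = 2,
                  f: float = 0.5, p: float = 0.5, q: float = 0.75) -> list[int]:
    """Client-side RAPPOR sketch following steps 1-3 above."""
    # Step 1: Signal. Hash v onto a Bloom filter B of size k with h hashes.
    B = [0] * k
    for i in range(h):
        digest = hashlib.sha256(f"{i}:{v}".encode()).hexdigest()
        B[int(digest, 16) % k] = 1

    # Step 2: Permanent randomized response. Each bit of B' is 1 w.p. f/2,
    # 0 w.p. f/2, and the true bit B_i w.p. 1 - f. In a real deployment B'
    # is computed once per value and reused across all reports of v.
    B_prime = []
    for bit in B:
        r = random.random()
        if r < f / 2:
            B_prime.append(1)
        elif r < f:
            B_prime.append(0)
        else:
            B_prime.append(bit)

    # Step 3: Instantaneous randomized response. Each report bit S_i is 1
    # with probability q if B'_i = 1 and probability p if B'_i = 0.
    return [1 if random.random() < (q if b else p) else 0 for b in B_prime]
```

The two-level noise is the key design choice: the permanent response protects against an attacker who observes many reports over time (averaging never recovers B), while the instantaneous response protects each individual report.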
One limitation of RAPPOR is the "leakage" of additional information when respondents use several clients that participate in the same collection event. In the real world, this problem is mitigated to some extent by the intrinsic difficulty of linking different clients to the same participant. Similar issues arise when highly correlated, or even identical, predicates are collected at the same time; this can mostly be handled with careful collection design. Such inadvertent correlations can lead to the collection of too much correlated information from a single client, or user, and a corresponding degradation of privacy guarantees.
[1] B. Bebensee, "Local differential privacy: A tutorial," arXiv preprint arXiv:1907.11908, 2019.
[2] X. Xiong, S. Liu, D. Li, Z. Cai, and X. Niu, "A comprehensive survey on local differential privacy," Security and Communication Networks, vol. 2020, 2020.
[3] U. Erlingsson, V. Pihur, and A. Korolova, "RAPPOR: Randomized aggregatable privacy-preserving ordinal response," in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, 2014, pp. 1054–1067.