Appendix
Conclusion
Differential privacy has emerged as a robust alternative to traditional anonymisation, offering provable privacy guarantees while still allowing meaningful data analysis. By adding calibrated noise to query results, it limits what any output can reveal about an individual, which guards against re-identification while avoiding much of the loss of quality and analytical value that often results from traditional techniques. It allows sensitive data to be explored across silos, can shorten data access times by reducing friction in data-request processes, and can support many types of use cases.
The academic and industrial communities have developed a range of tools that provide higher-level interfaces and abstract away implementation complexity, and frameworks designed for distributed computing now make scalable, cloud-based deployment feasible. As data privacy concerns continue to grow, the importance of differential privacy will only increase.
Some considerations remain. There is no universal value for epsilon; accuracy trade-offs depend on the data and the query; and tools continue to evolve, so any specific comparison or benchmark should be checked against current releases. For most organisations, the practical path is to start from the use case, choose a tool that fits its accuracy, scale, and security needs, and treat the privacy budget as a governed resource.
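
One way to picture "the privacy budget as a governed resource" is a ledger that records every release and refuses requests once an agreed total is reached. The sketch below is illustrative only: the `PrivacyBudget` class and its method names are hypothetical, and it assumes pure DP with basic sequential composition, where the epsilons of successive releases simply add up. Tighter accounting, such as zero-concentrated DP, would need different bookkeeping.

```python
import math
from dataclasses import dataclass, field


class BudgetExceededError(RuntimeError):
    """Raised when a request would push total spend past the agreed epsilon."""


@dataclass
class PrivacyBudget:
    """Hypothetical ledger; assumes pure DP and basic sequential composition."""

    total_epsilon: float
    ledger: list = field(default_factory=list)  # (label, epsilon) entries

    @property
    def spent(self) -> float:
        # Under basic composition the total cost is the sum of the epsilons.
        return math.fsum(eps for _, eps in self.ledger)

    @property
    def remaining(self) -> float:
        return self.total_epsilon - self.spent

    def charge(self, label: str, epsilon: float) -> None:
        """Record a release, or refuse it if the budget cannot cover it."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance so float rounding does not reject an exact fit.
        if epsilon > self.remaining + 1e-12:
            raise BudgetExceededError(
                f"{label!r} needs {epsilon}, only {self.remaining} remains"
            )
        self.ledger.append((label, epsilon))


if __name__ == "__main__":
    budget = PrivacyBudget(total_epsilon=1.0)  # illustrative value only
    budget.charge("monthly count", 0.4)
    budget.charge("average age", 0.6)
    print(budget.ledger, budget.remaining)
```

Keeping a labelled entry per release, rather than a running total alone, makes the spend auditable, which is the point of governing the budget.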
Privacy Definitions and Mechanisms
| Term | Description |
|---|---|
| Laplace mechanism | Adds noise from the Laplace distribution, scaled to sensitivity / ε. The standard mechanism for pure differential privacy on numeric queries. |
| Geometric mechanism | A discrete analogue of the Laplace mechanism (discrete Laplace) that outputs integers; often the default for integer-valued data. |
| Gaussian / Discrete Gaussian mechanism | Adds Gaussian noise; used with approximate and zero-concentrated definitions. |
| Exponential mechanism | Selects an output from a set in a differentially private way; used for non-numeric outputs. |
| Pure DP | The basic ε-definition, with no failure probability δ. |
| Approximate DP | An (ε, δ)-relaxation that permits a small failure probability δ, enabling mechanisms such as Gaussian noise. |
| Zero-concentrated DP | A relaxation that supports tighter composition, often used with the Gaussian mechanism. |
| Sensitivity | The maximum change in a query’s output when one record is changed. Determines the noise scale, and depends on the mechanism, the query, and the neighbouring definition (bounded or unbounded). |
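
To make the Laplace and sensitivity rows concrete, here is a minimal sketch in standard-library Python. The function names are hypothetical and it does not reflect the API of any tool listed in this appendix. It assumes a counting query under add-or-remove-one-record neighbours, so the sensitivity is 1, and it samples in ordinary floating point, which has known privacy weaknesses. Treat it as an illustration of the sensitivity / ε scaling, not as a production mechanism.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    # random.expovariate takes a rate, so rate = 1 / scale gives mean = scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float,
                      rng: random.Random) -> float:
    """Release true_value with Laplace noise of scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


def private_count(records, predicate, epsilon: float,
                  rng: random.Random) -> float:
    """Noisy count; adding or removing one record changes a count by at most 1."""
    true_count = sum(1 for record in records if predicate(record))
    return laplace_mechanism(true_count, sensitivity=1.0, epsilon=epsilon, rng=rng)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for a repeatable demo only
    ages = [23, 35, 41, 52, 67, 29, 48]  # made-up illustrative data
    print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

Halving ε doubles the noise scale, which is the accuracy trade-off in its simplest form. A real deployment would rely on a vetted library and a secure source of randomness rather than a seeded generator.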
Tools and Code
- OpenDP (Harvard privacy team)
- Tumult Analytics (now maintained under the OpenDP project)
- PipelineDP (Google and OpenMined; the Python framework remains experimental)
- Diffprivlib (IBM)
- Benchmarking experiments and deployment scripts (GitHub)
- Differential privacy mechanisms notebook (Laplace and Gaussian)
- Consolidated whitepaper (Parts 1 to 3)
Recommended Reading
- Damien Desfontaines’ differential privacy series
- Programming Differential Privacy (book)
- Differential privacy for dummies
- Differential Privacy: A primer for a non-technical audience
- The Algorithmic Foundations of Differential Privacy (Cynthia Dwork and Aaron Roth)
- A friendly video on reconstruction attacks (MinutePhysics)
- A practical beginners’ guide to differential privacy
- Differential privacy course (Gautam Kamath, University of Waterloo)