Exercise

Exploring data with a privacy budget accountant

Data exploration systems that provide differential privacy must manage a privacy budget: a cap on the total privacy loss accumulated across multiple queries.

In this exercise, you'll explore the IBM HR Analytics Employee Attrition & Performance dataset while keeping track of your privacy budget. Remember that if a query would exceed the privacy budget specified in the accountant, an error is raised.
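The budget check described above can be sketched in plain Python. This is a toy illustration, not the accountant used in the exercise: ToyBudgetAccountant, BudgetExceededError, and spend() are hypothetical names, and the sketch assumes that the epsilons of successive queries simply add up.

```python
class BudgetExceededError(Exception):
    """Raised when a query would push total spending past the budget."""


class ToyBudgetAccountant:
    """Hypothetical, minimal accountant: tracks epsilon spent so far."""

    def __init__(self, epsilon):
        self.epsilon = epsilon  # total budget available
        self.spent = 0.0        # budget consumed by queries so far

    def spend(self, epsilon):
        # Refuse the query if it would exceed the total budget.
        # The small tolerance guards against floating-point rounding.
        if self.spent + epsilon > self.epsilon + 1e-12:
            raise BudgetExceededError(
                f"query needs {epsilon}, only {self.epsilon - self.spent:.4f} left"
            )
        self.spent += epsilon


acc = ToyBudgetAccountant(epsilon=1.5)
acc.spend(0.1)  # first query
acc.spend(0.9)  # second query
try:
    acc.spend(0.6)  # would bring the total to 1.6, above 1.5
except BudgetExceededError as err:
    print("Query rejected:", err)
```

The rejected query leaves the spent amount unchanged, so smaller queries can still be run afterwards.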

A histogram is a valuable tool for visualizing the data in a differentially private way. The private histogram function has the same syntax as the corresponding function in numpy, plus an additional epsilon parameter.
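To give a feel for what a private histogram does, here is a rough standard-library sketch. It assumes Laplace noise with sensitivity 1 (adding or removing one person changes one bin count by at most 1); the library used in the exercise may use a different mechanism. The function private_histogram() and the sample ages are hypothetical and made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two exponentials with mean `scale`
    # follows a Laplace distribution centered at 0.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(values, bins, value_range, epsilon, rng=None):
    """Return (noisy_counts, bin_edges), mirroring numpy's return order."""
    rng = rng or random.Random()
    low, high = value_range
    width = (high - low) / bins

    # Exact counts; values outside the range are ignored.
    counts = [0] * bins
    for v in values:
        if low <= v <= high:
            idx = min(int((v - low) / width), bins - 1)
            counts[idx] += 1

    # Assumed sensitivity of 1, so the noise scale is 1 / epsilon.
    noisy = [c + laplace_noise(1.0 / epsilon, rng) for c in counts]
    edges = [low + i * width for i in range(bins + 1)]
    return noisy, edges


# Synthetic ages, not taken from the dataset.
sample_ages = [23, 31, 35, 35, 41, 44, 52, 58, 29, 37]
noisy_counts, bin_edges = private_histogram(
    sample_ages, bins=4, value_range=(20, 60), epsilon=0.1
)
print(noisy_counts)
print(bin_edges)
```

A smaller epsilon means a larger noise scale, so the counts become less accurate but more private.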

The full dataset is available as hr and the employees' age attribute as ages. A custom function has been created and loaded as show_histogram() to plot the histogram as you did previously in the course.

Instructions

  • Create a BudgetAccountant with an epsilon of 1.5, using its constructor.
  • Generate a private histogram from the ages column, using an epsilon of 0.1.
  • Compute and show the private average of ages, using an epsilon of 0.9 and bounds from 10 to 100 passed as a tuple.
  • Print the privacy budget that remains available for each of two further queries.
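The budget arithmetic behind these steps can be checked by hand. The sketch below assumes that the epsilons of successive queries add up and that the leftover budget is split evenly across the next queries; remaining_per_query() is a hypothetical helper, not part of the library used in the exercise.

```python
def remaining_per_query(total_epsilon, spent_epsilons, k):
    """Budget left after the spent queries, split evenly over k new ones."""
    remaining = total_epsilon - sum(spent_epsilons)
    return remaining / k


# 1.5 total, minus 0.1 (histogram) and 0.9 (average), leaves 0.5.
left = remaining_per_query(1.5, [0.1, 0.9], k=1)
# Split across two further queries, each can spend half of that.
per_query = remaining_per_query(1.5, [0.1, 0.9], k=2)

print(round(left, 10), round(per_query, 10))  # prints: 0.5 0.25
```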