Data Privacy and Anonymization in Python

Exercise

Generalizing into ranges

K-anonymity can be a good privacy model for specific datasets that don't have many dimensions. The two main anonymization techniques used to transform a dataset into a k-anonymous table are generalization and suppression.
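As a rough illustration of the two techniques, the sketch below generalizes exact birth years into decade ranges and then suppresses any rows whose combination still occurs fewer than k times. The toy DataFrame and the names `df`, `decade`, `group_size`, and `anonymized` are assumptions made for this example; they are not part of the exercise data.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the exercise, values are made up.
df = pd.DataFrame({
    "birth_year": [1981, 1984, 1989, 1992, 1995, 1997, 1973],
    "department": ["IT", "IT", "IT", "HR", "HR", "HR", "IT"],
})
k = 3

# Generalization: replace each exact birth year with its decade range.
decade = df["birth_year"] // 10 * 10
df["birth_year"] = decade.astype(str) + "-" + (decade + 9).astype(str)

# Suppression: drop rows whose combination occurs fewer than k times.
group_size = df.groupby(["birth_year", "department"])["department"].transform("size")
anonymized = df[group_size >= k]

print(anonymized)
```

Decade-wide ranges are only one possible choice. Wider ranges keep more rows but lose more detail, so the bin width is a trade-off to tune for each dataset.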

In this exercise, you will transform a satisfaction rating dataset into a 3-anonymous table. The dataset contains potentially sensitive attributes such as satisfaction_rate and work_hours. Some combinations of attribute values appear fewer than three times. Fix that to make the DataFrame 3-anonymous.

The DataFrame is available as employees. A k value of 3 is also available.

Instructions 1/3

  • Calculate how many times each unique combination of birth_year and department occurs.
  • Use .reset_index() and pass "count" as the name argument so that the newly generated column holding the counts is called count.
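
One way to approach these two steps is sketched below. The small `employees` DataFrame here is a made-up stand-in for the one provided in the exercise, and the variable names `combinations` and `below_k` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's employees DataFrame.
employees = pd.DataFrame({
    "birth_year": [1985, 1985, 1985, 1990, 1990, 1978],
    "department": ["IT", "IT", "IT", "HR", "HR", "IT"],
})
k = 3

# Count the rows in each (birth_year, department) group, then turn the
# resulting Series into a DataFrame whose counts column is named "count".
combinations = (
    employees.groupby(["birth_year", "department"])
    .size()
    .reset_index(name="count")
)

# Combinations that occur fewer than k times violate k-anonymity.
below_k = combinations[combinations["count"] < k]

print(combinations)
print(below_k)
```

Filtering on `count < k` shows which combinations would still need generalization or suppression before the table is k-anonymous.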