Generalizing into ranges
K-anonymity can be a good privacy model for specific datasets that don't have many dimensions. The two main anonymization techniques used to transform a dataset into a k-anonymous table are generalization and suppression.
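As a minimal sketch of what the k-anonymity condition and suppression look like in pandas, the snippet below uses a small made-up table. The DataFrame, its column names, and the choice of quasi-identifiers are all assumptions for illustration, not part of the exercise data.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, 25, 25, 31, 31, 47],
    "city": ["Lima", "Lima", "Lima", "Quito", "Quito", "Lima"],
    "rating": [4, 5, 3, 2, 4, 5],
})
quasi_identifiers = ["age", "city"]  # assumed quasi-identifier columns
k = 2

# Number of rows sharing each combination of quasi-identifier values.
group_sizes = df.groupby(quasi_identifiers).size()
print(group_sizes)

# The table is k-anonymous only if the smallest group has at least k rows.
is_k_anonymous = bool(group_sizes.min() >= k)
print(is_k_anonymous)

# Suppression: drop the rows that belong to a group smaller than k.
row_group_size = df.groupby(quasi_identifiers)["rating"].transform("size")
suppressed = df[row_group_size >= k]
print(suppressed)
```

Suppression loses rows, so it is usually reserved for the few outliers that generalization cannot absorb.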
In this exercise, you will transform a satisfaction rating dataset into a 3-anonymous table. The dataset contains possibly sensitive attributes such as satisfaction_rate and work_hours. Some combinations of birth_year and department appear fewer than three times. Fix that to make the DataFrame 3-anonymous.
The DataFrame is available as employees. A k value of 3 is also available.
This exercise is part of the course Data Privacy and Anonymization in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Count how many rows share each combination of birth_year and department
print(employees.groupby(['birth_year','department']).____)
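One possible way to generalize a numeric column into ranges is sketched below on made-up data. The stand-in employees table, the decade-wide ranges, and the helper names are assumptions for illustration; the course's intended solution may differ.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's `employees` DataFrame.
employees = pd.DataFrame({
    "birth_year": [1961, 1964, 1968, 1972, 1975, 1979, 1983, 1986, 1988],
    "department": ["Sales"] * 3 + ["IT"] * 3 + ["Sales"] * 3,
    "satisfaction_rate": [3, 4, 2, 5, 4, 3, 1, 5, 4],
})
k = 3
quasi_identifiers = ["birth_year", "department"]

# Before generalizing, every (birth_year, department) pair here is unique.
print(employees.groupby(quasi_identifiers).size())

# Generalize exact years into decade ranges such as "1960-1969".
decade = employees["birth_year"] // 10 * 10
generalized = employees.copy()
generalized["birth_year"] = decade.astype(str) + "-" + (decade + 9).astype(str)

# Recount the groups on the generalized column and check the k threshold.
sizes = generalized.groupby(quasi_identifiers).size()
print(sizes)
is_k_anonymous = bool(sizes.min() >= k)
print(is_k_anonymous)
```

Wider ranges make it easier to reach the threshold but discard more detail, so the range width is a trade-off between privacy and utility.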