
Preparing employee data for safe release

When you work with real data, you need to make sure that personal information about customers or other people cannot be traced or exposed. In this exercise, you'll use a simplified version of the IBM HR Analytics Employee dataset to practice suppression and generalization techniques.

To avoid leaking information about the dataset, you will replace the column names with numbers.
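
As a minimal sketch of replacing column names with numbers, the snippet below uses a small made-up DataFrame named toy as a stand-in for hr. The column names and values in it are assumptions for illustration only, not the exercise data.

```python
import pandas as pd

# Hypothetical stand-in for hr; the values are invented for illustration
toy = pd.DataFrame({"age": [41, 49, 37], "department": ["Sales", "R&D", "R&D"]})

# Work on a copy so the original column names stay available
anonymized = toy.copy()

# Replace each column name with its positional index: 0, 1, ...
anonymized.columns = range(len(anonymized.columns))

print(anonymized.columns.tolist())  # [0, 1]
```

Numeric names hide what each column measures, but the values themselves are unchanged, so this step only complements suppression and generalization.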

The DataFrame is loaded as hr; use the console to explore it. numpy is imported as np.

This exercise is part of the course

Data Privacy and Anonymization in Python


Interactive exercise

Try this exercise by completing the sample code.

# Drop unique data and almost unique data
df_dropped = ____(["employee_number", "monthly_income", "monthly_rate", "daily_rate"], axis=1) 

# Drop the rows with NaN values
df_cleaned = ____
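
The two steps in the template can be sketched on toy data as follows. This assumes the blanks are meant to be filled with the pandas methods DataFrame.drop and DataFrame.dropna, which the exercise text does not confirm. The toy DataFrame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for hr; the values are invented for illustration
toy = pd.DataFrame({
    "employee_number": [1, 2, 3, 4],
    "monthly_income": [5993, 5130, 2090, 2909],
    "age": [41, 49, np.nan, 33],
    "department": ["Sales", "R&D", "R&D", None],
})

# Suppression: drop columns that are unique or almost unique per person
toy_dropped = toy.drop(["employee_number", "monthly_income"], axis=1)

# Drop every row that still contains at least one missing value
toy_cleaned = toy_dropped.dropna()

print(toy_cleaned.shape)  # (2, 2)
```

Both methods return new DataFrames by default, so toy itself is left untouched and each intermediate result can be inspected in the console.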