Pre-processing data

Pre-processing can prepare data for more accurate clustering-based segmentation. One type of pre-processing is feature scaling, a technique that rescales the independent features in the data so they share a comparable range, for example a fixed interval such as 0-1 or 0-100 (min-max scaling), or zero mean and unit variance (standardization).
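
As a rough illustration of the two flavors of feature scaling, the sketch below uses only the Python standard library. The helper names `min_max_scale` and `standardize` and the sample scores are made up for this example; they are not part of the exercise. The `standardize` helper mirrors what StandardScaler computes per column (zero mean, unit population standard deviation).

```python
from statistics import mean, pstdev


def min_max_scale(values, low=0.0, high=1.0):
    """Linearly rescale values so the minimum maps to low and the maximum to high."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # A constant column carries no information; map everything to low.
        return [low for _ in values]
    return [low + (v - v_min) * (high - low) / span for v in values]


def standardize(values):
    """Shift values to zero mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical writing scores, invented for illustration only.
writing_scores = [40, 55, 70, 85, 100]

scaled_01 = min_max_scale(writing_scores)          # values now lie in [0, 1]
scaled_0_100 = min_max_scale(writing_scores, 0, 100)  # values now lie in [0, 100]
z_scores = standardize(writing_scores)             # mean 0, standard deviation 1

print(scaled_01)
print(z_scores)
```

Either transformation keeps the ordering of the values; only the units change.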

In this exercise, you will perform clustering on the parental_level_of_education and writing_score columns of the student performance dataset, loaded as performance. First, you will create and run a k-means model without any pre-processing. Then, you will do the same after pre-processing the data with feature scaling.
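
To see why scaling can change a k-means result, it may help to compare Euclidean distances before and after standardization. The sketch below is an assumption-laden toy: the four rows are invented, and it assumes parental_level_of_education has been encoded as an ordinal number from 0 to 5, which the exercise does not state. The helper `standardize_columns` is hypothetical as well.

```python
from math import dist
from statistics import mean, pstdev

# Hypothetical rows: (education level encoded 0-5, writing score 0-100).
students = [(0, 60), (5, 62), (0, 90), (5, 88)]


def standardize_columns(rows):
    """Standardize each column to zero mean and unit population std."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return [
        tuple((value - mu) / sd for value, (mu, sd) in zip(row, stats))
        for row in rows
    ]


# Raw distances: the writing score dominates because its numeric range is larger.
raw_edu_gap = dist(students[0], students[1])    # opposite education, similar score
raw_score_gap = dist(students[0], students[2])  # same education, different score

# After standardization both features contribute on a comparable scale.
scaled = standardize_columns(students)
scaled_edu_gap = dist(scaled[0], scaled[1])
scaled_score_gap = dist(scaled[0], scaled[2])

print(f"raw:    education gap {raw_edu_gap:.2f}, score gap {raw_score_gap:.2f}")
print(f"scaled: education gap {scaled_edu_gap:.2f}, score gap {scaled_score_gap:.2f}")
```

Because k-means assigns points by Euclidean distance, the unscaled data would tend to be split almost entirely along writing_score, while the scaled data lets both columns influence the clusters.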

The differentially private k-means model has been imported as KMeans from diffprivlib.models. The StandardScaler scaler and the PCA dimensionality reduction class have been imported from sklearn.

This exercise is part of the course

Data Privacy and Anonymization in Python

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Build the differentially private k-means model
model = KMeans(____)

# Fit the model to the data
____

# Print the inertia in the console output
print("The inertia of the private model is: ", model.inertia_)