
Segmenting customers

In this exercise, you'll segment customers from the Mall Customer Segmentation dataset using a differentially private clustering model.

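In K-means clustering, you can estimate a suitable number of clusters with the elbow method: plot the within-cluster sum of squares against the number of clusters and look for the point where the curve bends.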

[Figure: elbow-method plot for the non-private model]
The plot shows an elbow at 5 clusters, so that is the number of clusters to use. You'll cluster on Annual Income and Spending Score, which have been loaded as X, and plot the resulting clusters.
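The elbow computation described above can be sketched as follows. This is a minimal illustration using scikit-learn's non-private KMeans, not the course's own code. The synthetic array X here is an assumption: it stands in for the real Annual Income and Spending Score columns, which are not reproduced on this page.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumption: synthetic stand-in for X (income, spending score),
# built from five well-separated blobs of 40 points each.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [85, 20], [85, 80]])
X = np.vstack([c + rng.normal(scale=5.0, size=(40, 2)) for c in centers])

# Fit K-means for k = 1..10 and record the inertia
# (within-cluster sum of squared distances) for each k.
ks = list(range(1, 11))
inertias = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

# The "elbow" is the k after which inertia stops dropping sharply.
for k, inertia in zip(ks, inertias):
    print(f"k={k:2d}  inertia={inertia:10.1f}")
```

Plotting inertias against ks (for example with matplotlib) gives the kind of curve shown in the figure; the bend is read off by eye, so treat the chosen value as a heuristic rather than a computed optimum.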

The full dataset has been loaded as mall_df. For convenience, a custom function show_clusters() to plot the clusters is provided to you. Use ?show_clusters to learn more about it.

This exercise is part of the course

Data Privacy and Anonymization in Python


Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Build the differentially private K-means model
model = ____