
Build a differentially private classifier

In this exercise, you will build and train a private Gaussian Naive Bayes model on the Penguin dataset to classify whether a penguin is male or female.

K-anonymity works poorly on high-dimensional or diverse datasets because of the "curse of dimensionality", a limitation that is both theoretical and empirical: as the number of features (dimensions) grows, the amount of data needed to generalize accurately grows exponentially. This is one of the reasons why differential privacy is currently the preferred privacy model. Its privacy parameter, epsilon, bounds how much sensitive information can leak, and that bound holds regardless of any background knowledge an attacker may have.
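To make the exponential growth concrete, here is a small arithmetic sketch. It rests on simplifying assumptions that are not part of the exercise: every feature has the same number of distinct values, and records are spread over every possible combination of values. The function name `min_records_for_k_anonymity` is hypothetical.

```python
# Illustrative arithmetic only. Assumes each feature has the same number of
# distinct values and that records fall into every possible combination.

def min_records_for_k_anonymity(k: int, values_per_feature: int, n_features: int) -> int:
    """Records needed so that every combination of feature values holds k records."""
    n_combinations = values_per_feature ** n_features  # grows exponentially
    return k * n_combinations


for n_features in (2, 5, 10, 20):
    needed = min_records_for_k_anonymity(k=5, values_per_feature=4, n_features=n_features)
    print(f"{n_features:>2} features -> at least {needed:,} records")
```

Real datasets rarely fill every combination, so this is a worst-case bound, but it shows why generalizing records into groups of size k quickly becomes impractical as features are added.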

The DataFrame is loaded as penguin_df and split into X_train, y_train, X_test and y_test. The private model class has been imported as dp_GaussianNB.

This exercise is part of the course Data Privacy and Anonymization in Python.

Exercise instructions

  • Create a dp_GaussianNB classifier without parameters.
  • Fit the model you just created to the training data, passing no additional parameters.
  • Calculate the score of the private model based on the test data.
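Creating the classifier without parameters leaves epsilon at whatever default the class defines. To build intuition for what epsilon controls, the sketch below uses the Laplace mechanism on a simple mean. This is a swapped-in illustration, not a description of how dp_GaussianNB works internally. The helper names, the bounds, and the measurements are made up for the example.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Epsilon-differentially private mean of values clipped to [lower, upper]."""
    clipped = [min(max(v, lower), upper) for v in values]
    # Changing one record moves the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    scale = sensitivity / epsilon  # smaller epsilon -> more noise
    return statistics.fmean(clipped) + laplace_noise(scale, rng)


def average_error(values, lower, upper, epsilon, trials=2000, seed=0):
    """Mean absolute error of the private mean over repeated releases."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(values)
    errors = [
        abs(private_mean(values, lower, upper, epsilon, rng) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


# Made-up measurements for illustration; not taken from the Penguin dataset.
masses = [3200.0, 3650.0, 3800.0, 4100.0, 4500.0, 4750.0, 5200.0, 5600.0]

for epsilon in (0.1, 1.0, 10.0):
    error = average_error(masses, lower=2500.0, upper=6500.0, epsilon=epsilon)
    print(f"epsilon={epsilon:>4}: average absolute error = {error:.1f}")
```

A smaller epsilon gives a stronger privacy guarantee at the cost of noisier results, which is the same trade-off to keep in mind when reading the accuracy of a private model.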

Hands-on interactive exercise

Complete the sample code below to finish this exercise.

# Build the private classifier without parameters
dp_clf = ____

# Fit the model to the data
____(X_train, y_train)

# Print the accuracy score
print("The accuracy with default settings is", ____(X_test, y_test))