Aan de slagGa gratis aan de slag

Assessing a diabetes prediction classifier

In this chapter you'll work with the diabetes_df dataset introduced previously.

The goal is to predict whether or not each individual is likely to have diabetes based on the features body mass index (BMI) and age (in years). Therefore, it is a binary classification problem. A target value of 0 indicates that the individual does not have diabetes, while a value of 1 indicates that the individual does have diabetes.

diabetes_df has been preloaded for you as a pandas DataFrame and split into X_train, X_test, y_train, and y_test. In addition, a KNeighborsClassifier() has been instantiated and assigned to knn.

You will fit the model, make predictions on the test set, then produce a confusion matrix and classification report.

Deze oefening maakt deel uit van de cursus

Supervised Learning with scikit-learn

Cursus bekijken

Oefeninstructies

  • Import confusion_matrix and classification_report.
  • Fit the model to the training data.
  • Predict the labels of the test set, storing the results as y_pred.
  • Compute and print the confusion matrix and classification report for the test labels versus the predicted labels.

Praktische interactieve oefening

Probeer deze oefening eens door deze voorbeeldcode in te vullen.

# Import confusion matrix
____

knn = KNeighborsClassifier(n_neighbors=6)

# Fit the model to the training data
____

# Predict the labels of the test data: y_pred
y_pred = ____

# Generate the confusion matrix and classification report
print(____(____, ____))
print(____(____, ____))
Code bewerken en uitvoeren