LoslegenKostenlos loslegen

Checking results

In this exercise you're going to check the results of your DBSCAN fraud detection model. In reality, you often don't have reliable labels and this where a fraud analyst can help you validate the results. He/She can check your results and see whether the cases you flagged are indeed suspicious. You can also check historically known cases of fraud and see whether your model flags them.

In this case, you'll use the fraud labels to check your model results. The predicted cluster numbers are available under pred_labels as well as the original fraud labels labels.

Diese Übung ist Teil des Kurses

Fraud Detection in Python

Kurs anzeigen

Anleitung zur Übung

  • Create a dataframe combining the cluster numbers with the actual labels. This has been done for you.
  • Create a condition that flags fraud for the three smallest clusters: clusters 21, 17 and 9.
  • Create a crosstab from the actual fraud labels with the newly created predicted fraud labels.

Interaktive Übung

Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.

# Create a dataframe of the predicted cluster numbers and fraud labels 
df = pd.DataFrame({'clusternr':pred_labels,'fraud':labels})

# Create a condition flagging fraud for the smallest clusters 
df['predicted_fraud'] = np.where((df['clusternr']==21)|(____)|(____),1 , 0)

# Run a crosstab on the results 
print(pd.crosstab(df['fraud'], df['____'], rownames=['Actual Fraud'], colnames=['Flagged Fraud']))
Code bearbeiten und ausführen