
Checking results

In this exercise you're going to check the results of your DBSCAN fraud detection model. In reality, you often don't have reliable labels, and this is where a fraud analyst can help you validate the results. They can review your results and see whether the cases you flagged are indeed suspicious. You can also take historically known cases of fraud and see whether your model flags them.

In this case, you'll use the fraud labels to check your model results. The predicted cluster numbers are available as pred_labels, and the original fraud labels are available as labels.

This exercise is part of the course

Fraud Detection in Python


Instructions

  • Create a dataframe combining the cluster numbers with the actual labels. This has been done for you.
  • Create a condition that flags fraud for the three smallest clusters: clusters 21, 17 and 9.
  • Create a crosstab from the actual fraud labels with the newly created predicted fraud labels.
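The steps above can be sketched on a small toy dataset. This is a minimal illustration, not the course solution: the arrays toy_pred_labels and toy_labels are made up, and the choice to flag the two smallest clusters and to skip the label -1 (which scikit-learn's DBSCAN uses for noise points) is an assumption for this example. It also uses isin as an alternative to chaining == comparisons with |.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: cluster ids from a clustering model and true fraud labels
toy_pred_labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, -1])
toy_labels = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0])

toy_df = pd.DataFrame({'clusternr': toy_pred_labels, 'fraud': toy_labels})

# Count the size of each cluster, leaving out the noise label -1 (assumption)
sizes = toy_df.loc[toy_df['clusternr'] != -1, 'clusternr'].value_counts()

# Take the two smallest clusters as the suspicious ones
smallest = sizes.nsmallest(2).index.tolist()

# Flag every observation that falls in one of those clusters
toy_df['predicted_fraud'] = toy_df['clusternr'].isin(smallest).astype(int)

# Compare the flags with the actual labels
ct = pd.crosstab(toy_df['fraud'], toy_df['predicted_fraud'],
                 rownames=['Actual Fraud'], colnames=['Flagged Fraud'])
print(ct)
```

With more than a handful of clusters, isin keeps the condition short and makes it easy to change which clusters are flagged.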

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create a dataframe of the predicted cluster numbers and fraud labels 
df = pd.DataFrame({'clusternr':pred_labels,'fraud':labels})

# Create a condition flagging fraud for the smallest clusters 
df['predicted_fraud'] = np.where((df['clusternr'] == 21) | (____) | (____), 1, 0)

# Run a crosstab on the results 
print(pd.crosstab(df['fraud'], df['____'], rownames=['Actual Fraud'], colnames=['Flagged Fraud']))
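To read a crosstab like this one, it can help to pull out the individual cells. The sketch below uses a made-up table (the toy values are hypothetical and unrelated to the exercise data) and derives precision and recall from it, which is one common way to summarize how well the flags match the actual labels.

```python
import pandas as pd

# Hypothetical labels and flags, for illustration only
toy = pd.DataFrame({
    'fraud':           [0, 0, 0, 0, 0, 1, 1, 1],
    'predicted_fraud': [0, 0, 0, 1, 0, 1, 1, 0],
})

ct = pd.crosstab(toy['fraud'], toy['predicted_fraud'],
                 rownames=['Actual Fraud'], colnames=['Flagged Fraud'])

# Rows are actual labels, columns are flags
tp = ct.loc[1, 1]  # fraud cases that were flagged
fp = ct.loc[0, 1]  # non-fraud cases that were flagged
fn = ct.loc[1, 0]  # fraud cases that were missed

precision = tp / (tp + fp)  # share of flagged cases that are actual fraud
recall = tp / (tp + fn)     # share of actual fraud cases that were flagged

print(ct)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```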