Evaluating the model's performance
The PyBooks team has been making strides on its book recommendation engine, and the modeling team has provided you with two candidate models: one based on an LSTM (lstm_model) and the other on a GRU (gru_model). You've been tasked with evaluating and comparing these models.
The testing labels y_test and the models' predictions, y_pred_lstm for lstm_model and y_pred_gru for gru_model, are available to you.
This exercise is part of the course Deep Learning for Text with PyTorch.
Exercise instructions
- Define accuracy, precision, recall, and F1 for multi-class classification by specifying num_classes and task (see the sketch after this list for the expected call signature).
- Calculate and print the accuracy, precision, recall, and F1 score for lstm_model.
- Similarly, calculate the evaluation metrics for gru_model.
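As a reference for that call signature, here is a minimal sketch assuming the metrics come from the torchmetrics library (an assumption; the course workspace may import them differently). Each metric is created with task="multiclass" and num_classes, then called with the predictions first and the true labels second; the tensors below are toy values used only for illustration.

import torch
from torchmetrics import Accuracy

# Toy example: 4 samples, 3 classes
preds = torch.tensor([0, 2, 1, 2])    # predicted class indices (logits also work)
target = torch.tensor([0, 1, 1, 2])   # true class labels
accuracy = Accuracy(task="multiclass", num_classes=3)
print(accuracy(preds, target))        # tensor(0.7500)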
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create an instance of the metrics
accuracy = ____(task=____, num_classes=3)
precision = ____(task=____, num_classes=3)
recall = ____(task=____, num_classes=3)
f1 = ____(task=____, num_classes=3)
# Calculate metrics for the LSTM model
accuracy_1 = accuracy(____, ____)
precision_1 = precision(____, ____)
recall_1 = recall(____, ____)
f1_1 = f1(____, ____)
print("LSTM Model - Accuracy: {}, Precision: {}, Recall: {}, F1 Score: {}".format(accuracy_1, precision_1, recall_1, f1_1))
# Calculate metrics for the GRU model
accuracy_2 = accuracy(____, ____)
precision_2 = precision(____, ____)
recall_2 = recall(____, ____)
f1_2 = f1(____, ____)
print("GRU Model - Accuracy: {}, Precision: {}, Recall: {}, F1 Score: {}".format(accuracy_2, precision_2, recall_2, f1_2))