Comparing BoW and TF-IDF representations
You're part of the analytics team at a wearable tech company. Your goal is to help product managers understand customer feedback on the company's new smartwatch. You've already preprocessed the text and created two representations: bow_matrix using CountVectorizer(), and tfidf_matrix using TfidfVectorizer(). In this exercise, you'll visualize and compare the two to better understand how each captures word importance.
Deze oefening maakt deel uit van de cursus
Natural Language Processing (NLP) in Python
Praktische interactieve oefening
Probeer deze oefening eens door deze voorbeeldcode in te vullen.
# Convert BoW matrix to a DataFrame
df_bow = pd.DataFrame(
____,
columns=vectorizer.____
)
# Plot the heatmap
plt.figure(figsize=(10, 6))
sns.heatmap(____, annot=True)
plt.title("BoW Scores Across Reviews")
plt.xlabel("Terms")
plt.xticks(rotation=45)
plt.ylabel("Documents")
plt.show()