Comparing BoW and TF-IDF representations
You're part of the analytics team at a wearable tech company. Your goal is to help product managers understand customer feedback on the company's new smartwatch. You've already preprocessed the text and created two representations: bow_matrix using CountVectorizer(), and tfidf_matrix using TfidfVectorizer(). In this exercise, you'll visualize and compare the two to better understand how each captures word importance.
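For context, here is a minimal sketch of how two such matrices might be built and how they differ. The `reviews` list is invented for illustration and is not the exercise's data; the names `tfidf_vectorizer`, `tfidf_dense`, and `row_norms` are likewise assumptions, and both vectorizers use scikit-learn's default settings.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Hypothetical reviews, made up for this sketch (not the course dataset)
reviews = [
    "battery life is great",
    "battery drains fast",
    "great screen and great strap",
]

# Bag-of-words: each cell is the raw count of a term in a review
vectorizer = CountVectorizer()
bow_matrix = vectorizer.fit_transform(reviews)

# TF-IDF: counts are reweighted so terms found in fewer reviews score higher
tfidf_vectorizer = TfidfVectorizer()
tfidf_matrix = tfidf_vectorizer.fit_transform(reviews)

# Both are sparse matrices of shape (n_reviews, n_terms)
print(bow_matrix.shape, tfidf_matrix.shape)

# Look up column positions of a few terms in each learned vocabulary
great_idx = vectorizer.vocabulary_["great"]
battery_idx = tfidf_vectorizer.vocabulary_["battery"]
life_idx = tfidf_vectorizer.vocabulary_["life"]

# BoW keeps the repeat of "great" in the third review as a count
print(bow_matrix[2, great_idx])

# In the first review every term has a BoW count of 1, but TF-IDF ranks
# "life" (one review) above "battery" (two reviews)
tfidf_dense = tfidf_matrix.toarray()
print(tfidf_dense[0, battery_idx], tfidf_dense[0, life_idx])

# With default settings, each TF-IDF row is scaled to unit L2 length
row_norms = np.linalg.norm(tfidf_dense, axis=1)
print(row_norms)
```

The point of the comparison is that BoW treats every occurrence equally, while TF-IDF discounts terms that are spread across many reviews, so the same review can rank its terms differently under the two representations.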
This exercise is part of the course
Natural Language Processing (NLP) in Python
Hands-on interactive exercise
Try this exercise by completing the sample code below.
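# Imports used by this snippet (the exercise environment may preload them)
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Convert BoW matrix to a DataFrame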
df_bow = pd.DataFrame(
____,
columns=vectorizer.____
)
# Plot the heatmap
plt.figure(figsize=(10, 6))
sns.heatmap(____, annot=True)
plt.title("BoW Scores Across Reviews")
plt.xlabel("Terms")
plt.xticks(rotation=45)
plt.ylabel("Documents")
plt.show()