Get startedGet started for free

Comparing BoW and TF-IDF representations

You're part of the analytics team at a wearable tech company. Your goal is to help product managers understand customer feedback on the company's new smartwatch. You've already preprocessed the text and created two representations: bow_matrix using CountVectorizer(), and tfidf_matrix using TfidfVectorizer(). In this exercise, you'll visualize and compare the two to better understand how each captures word importance.

This exercise is part of the course

Natural Language Processing (NLP) in Python

View Course

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Convert BoW matrix to a DataFrame
df_bow = pd.DataFrame(
    ____,
    columns=vectorizer.____
)

# Plot the heatmap
plt.figure(figsize=(10, 6))
sns.heatmap(____, annot=True)
plt.title("BoW Scores Across Reviews")
plt.xlabel("Terms")
plt.xticks(rotation=45)
plt.ylabel("Documents")
plt.show()
Edit and Run Code