Plot recommendation engine

In this exercise, we will build a recommendation engine that suggests movies based on the similarity of their plot lines. You have been given a get_recommendations() function that takes the title of a movie, a similarity matrix, and an indices Series as its arguments and returns a list of the most similar movies. indices has already been provided to you.
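The exercise does not show how get_recommendations() is implemented, so the following is only a rough sketch of how such a function might work. The name get_recommendations_sketch, the top_n parameter, the toy titles, and the toy similarity matrix are all hypothetical, and the sketch assumes that indices maps each title to its row position in the similarity matrix.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the exercise supplies its own titles and indices.
titles = pd.Series(["Movie A", "Movie B", "Movie C", "Movie D"])
# Assumption: indices maps each title to its row position in the matrix.
indices = pd.Series(titles.index, index=titles)

# Hypothetical 4x4 similarity matrix (symmetric, ones on the diagonal).
toy_sim = np.array([
    [1.0, 0.8, 0.1, 0.3],
    [0.8, 1.0, 0.2, 0.4],
    [0.1, 0.2, 1.0, 0.9],
    [0.3, 0.4, 0.9, 1.0],
])

def get_recommendations_sketch(title, cosine_sim, indices, top_n=2):
    """Return the top_n titles most similar to `title` (illustrative only)."""
    idx = indices[title]
    # Pair every row position with its similarity to the requested movie.
    scores = list(enumerate(cosine_sim[idx]))
    # Sort from most to least similar.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    # Skip the movie itself, then keep the top_n remaining positions.
    neighbours = [i for i, _ in scores if i != idx][:top_n]
    # Map positions back to titles via the index of `indices`.
    return indices.index[neighbours].tolist()

recs = get_recommendations_sketch("Movie A", toy_sim, indices)
print(recs)  # ['Movie B', 'Movie D']
```

The function provided in the exercise may return a different number of results or a different container type; the point here is only the lookup, sort, and slice pattern.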

You have also been given a movie_plots Series that contains the plot lines of several movies. Your task is to generate a cosine similarity matrix for the tf-idf vectors of these plots.

Finally, we will test the engine by generating recommendations for one of my favorite movies, The Dark Knight Rises.

This exercise is part of the course

Feature Engineering for NLP in Python

Exercise instructions

  • Initialize a TfidfVectorizer with English stop_words. Name it tfidf.
  • Construct tfidf_matrix by fitting and transforming the movie plot data using fit_transform().
  • Generate the cosine similarity matrix cosine_sim using tfidf_matrix. Don't use cosine_similarity()!
  • Use get_recommendations() to generate recommendations for 'The Dark Knight Rises'.
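As background for the third instruction, here is a small sketch on a made-up corpus (toy_plots is hypothetical and not part of the exercise data). With its default settings, scikit-learn's TfidfVectorizer L2-normalizes each row, so the plain dot product of two rows already equals their cosine similarity. The sketch checks this by comparing linear_kernel() against cosine_similarity().

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical toy plots, standing in for the exercise's movie_plots Series.
toy_plots = [
    "a masked vigilante protects the city from a criminal mastermind",
    "a vigilante returns to save the city from a masked terrorist",
    "two friends open a bakery in a small seaside town",
]

# Default norm='l2' scales every row of the tf-idf matrix to unit length.
tfidf = TfidfVectorizer(stop_words="english")
toy_matrix = tfidf.fit_transform(toy_plots)

# Euclidean length of each row; expected to be 1 for non-empty documents.
row_norms = np.sqrt(toy_matrix.multiply(toy_matrix).sum(axis=1))

# Dot products between all pairs of rows.
dot_sim = linear_kernel(toy_matrix, toy_matrix)
# Reference cosine similarities, computed with explicit normalization.
cos_sim = cosine_similarity(toy_matrix, toy_matrix)

print(np.allclose(row_norms, 1.0))
print(np.allclose(dot_sim, cos_sim))
```

Because the rows are already unit length, the dot product skips the normalization step that cosine_similarity() would repeat, which tends to matter more as the number of documents grows.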

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Initialize the TfidfVectorizer 
tfidf = ____(____='english')

# Construct the TF-IDF matrix
tfidf_matrix = tfidf.____(movie_plots)

# Generate the cosine similarity matrix
cosine_sim = ____(tfidf_matrix, tfidf_matrix)
 
# Generate recommendations 
print(get_recommendations(____, cosine_sim, indices))