Measuring word vector similarity

In this lesson, we will explore the power of word vectors using real-world trained word vectors. These vectors are taken from a set of word vectors published by the Stanford NLP group. A word vector is a sequence, or vector, of numerical values that represents a word. For example, dog = (0.31, 0.92, 0.13).

The similarity between two word vectors can be measured with a pairwise similarity metric. Here we will use sklearn.metrics.pairwise.cosine_similarity. Cosine similarity is the cosine of the angle between two vectors: it is close to 1 when the vectors point in nearly the same direction and lower when they point in different directions.
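
As a rough illustration of the metric, the sketch below computes cosine similarity by hand with NumPy and compares the result with cosine_similarity from sklearn. Only the dog values come from the example above. The puppy and car vectors and the manual_cosine helper are hypothetical, made up for this sketch, and none of the values are real trained word vectors.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy 3-dimensional vectors stored as 2-D row arrays (illustrative values only)
dog = np.array([[0.31, 0.92, 0.13]])
puppy = np.array([[0.30, 0.90, 0.20]])  # hypothetical: nearly the same direction as dog
car = np.array([[0.95, 0.05, 0.60]])    # hypothetical: a clearly different direction


def manual_cosine(a, b):
    """Cosine of the angle between two vectors: (a . b) / (|a| * |b|)."""
    a, b = a.ravel(), b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# cosine_similarity returns a matrix of pairwise scores; [0, 0] picks the single value
sim_dog_puppy = cosine_similarity(dog, puppy)[0, 0]
sim_dog_car = cosine_similarity(dog, car)[0, 0]

print('Similarity(dog, puppy): ', sim_dog_puppy)
print('Similarity(dog, car): ', sim_dog_car)
print('Manual check (dog, puppy): ', manual_cosine(dog, puppy))
```

With these toy values, the dog and puppy vectors should score higher than the dog and car vectors, and the manual calculation should agree with the sklearn result up to floating-point error.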

This exercise is part of the course

Machine Translation with Keras

Exercise instructions

  • Print the length of the cat_vector using the ndarray.size attribute.
  • Compute and print the similarity between the cat_vector and window_vector using cosine_similarity.
  • Compute and print the similarity between the cat_vector and dog_vector using cosine_similarity.
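
The exercise environment presumably preloads cat_vector, window_vector and dog_vector as NumPy arrays. Their exact shape is not shown here, so the following is only a sketch under assumptions: it uses a hypothetical 50-element random vector (not a trained embedding) to show what ndarray.size reports and how a flat vector can be reshaped into the 2-D layout that cosine_similarity expects.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in for a word vector: random values, not a trained embedding
rng = np.random.default_rng(0)
flat_vector = rng.normal(size=50)        # 1-D array, shape (50,)
row_vector = flat_vector.reshape(1, -1)  # 2-D array, shape (1, 50)

# size counts every element, so it is identical for both layouts
print('Sizes: ', flat_vector.size, row_vector.size)
print('Shapes: ', flat_vector.shape, row_vector.shape)

# cosine_similarity takes 2-D inputs (one row per vector) and
# returns a 2-D matrix of pairwise scores
sim = cosine_similarity(row_vector, row_vector)
print('Result shape: ', sim.shape)
print('Similarity of a vector with itself: ', sim[0, 0])
```

If the preloaded vectors turn out to be one-dimensional, reshaping them with reshape(1, -1) before calling cosine_similarity is one way to avoid a shape error.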

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

from sklearn.metrics.pairwise import cosine_similarity

# Print the length of the cat_vector
print('Length of the cat_vector: ', ____.____)

# Compute and print the similarity between cat and window vectors
dist_cat_window = ____(____, window_vector)
print('Similarity(cat, window): ', ____)

# Compute and print the similarity between cat and dog vectors
print('Similarity(cat, dog): ', ____(____, ____))