Measuring word vector similarity
In this lesson you will explore the power of word vectors using real-world pretrained vectors, taken from a set of word vectors published by the Stanford NLP group. A word vector is simply a vector, i.e. a sequence of numerical values. For example,
dog = (0.31, 0.92, 0.13)
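In practice a word vector is stored as a NumPy array. Below is a minimal illustrative sketch (the three values above are toy numbers, and dog_vector here is a hypothetical name, not one of the exercise's preloaded variables):
import numpy as np
# A toy 3-dimensional word vector; real pretrained vectors have many more dimensions
dog_vector = np.array([0.31, 0.92, 0.13])
# ndarray.size gives the number of elements, i.e. the length of the vector
print('Length of the dog_vector: ', dog_vector.size)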
The similarity between word vectors can be measured using a pairwise similarity metric. Here we will be using sklearn.metrics.pairwise.cosine_similarity. Cosine similarity produces a higher value when the element-wise similarity of two vectors is high, and a lower value when it is low.
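As a quick sketch of how cosine_similarity is called (toy vectors, not the exercise's preloaded ones; note that sklearn expects 2-D inputs, one row per vector):
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
# Reshape each toy vector to shape (1, n) so it is a single-row 2-D array
vec_a = np.array([0.31, 0.92, 0.13]).reshape(1, -1)
vec_b = np.array([0.30, 0.85, 0.20]).reshape(1, -1)
# cosine_similarity returns a 1x1 array; values near 1 mean very similar directions
print(cosine_similarity(vec_a, vec_b))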
This exercise is part of the course Machine Translation with Keras.
Exercise instructions
- Print the length of the cat_vector using the ndarray.size attribute.
- Compute and print the similarity between the cat_vector and window_vector using cosine_similarity.
- Compute and print the similarity between the cat_vector and dog_vector using cosine_similarity.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
from sklearn.metrics.pairwise import cosine_similarity
# Print the length of the cat_vector
print('Length of the cat_vector: ', ____.____)
# Compute and print the similarity between cat and window vectors
dist_cat_window = ____(____, window_vector)
print('Similarity(cat, window): ', ____)
# Compute and print the similarity between cat and dog vectors
print('Similarity(cat,dog): ', ____(____, ____))