Similar words in a vocabulary
Finding semantically similar terms has various applications in information retrieval. In this exercise, you will practice finding the most semantically similar term to the word "computer" from the en_core_web_md model's vocabulary.
The word vector for "computer" is already extracted and stored as word_vector. The en_core_web_md model is already loaded as nlp, and the NumPy package is loaded as np.
You can use the .most_similar() method of the nlp.vocab.vectors object to find the most semantically similar terms. Indexing the output of this method with [0][0] returns the word IDs of the semantically similar terms. nlp.vocab.strings[<a given word>] returns the word ID of a given word, and the same lookup with a word ID returns the word associated with that ID.
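For intuition about what such a lookup involves, here is a minimal pure-Python sketch, assuming a toy three-word vocabulary with made-up word IDs and 3-dimensional vectors. It ranks entries by cosine similarity and then maps the resulting IDs back to words. It is an illustration only, not spaCy's implementation, and the names strings, id_to_word, vectors, and most_similar_ids are hypothetical.

```python
import math

# Hypothetical toy data: made-up word IDs and 3-dimensional vectors.
strings = {"computer": 101, "laptop": 102, "banana": 103}
id_to_word = {word_id: word for word, word_id in strings.items()}
vectors = {
    101: [0.9, 0.1, 0.0],
    102: [0.8, 0.2, 0.1],
    103: [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar_ids(query, n=1):
    """Return the IDs of the n vectors closest to the query vector."""
    scored = [(word_id, cosine(query, vec)) for word_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word_id for word_id, _ in scored[:n]]


# Look up the query vector by word, rank the vocabulary, map IDs back to words.
query = vectors[strings["computer"]]
top_ids = most_similar_ids(query, n=2)
top_words = [id_to_word[word_id] for word_id in top_ids]
print(top_words)
```

In this toy setup the query word itself ranks first, because its own vector is part of the table being searched.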
This exercise is part of the course
Natural Language Processing with spaCy
Exercise instructions
- Find the most semantically similar term from the en_core_web_md vocabulary.
- Find the list of similar words given the word IDs of the similar terms.
Interactive exercise
Try this exercise by completing this sample code.
# Find the most similar word to the word computer
most_similar_words = nlp.vocab.vectors.____(np.asarray([____]), n = 1)
# Find the list of similar words given the word IDs
words = [nlp.____.____[____] for w in most_similar_words[0][0]]
print(words)