Doc similarity with spaCy
Semantic similarity is the process of analyzing multiple sentences to identify how close they are in meaning. In this exercise, you will practice calculating the semantic similarity of several documents to a given document. The goal is to determine which reviews in a given list are relevant to canned dog food.
The canned dog food category is stored in the variable category. A sample of five food reviews has been provided for you in a list called texts. The en_core_web_md model is loaded as nlp.
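For intuition about what a similarity score measures: by default, spaCy's Doc.similarity estimates similarity as the cosine similarity of the averaged word vectors of the two documents. The sketch below reproduces that idea in plain Python. The three-dimensional vectors in toy_vectors are made up for illustration (they are not the vectors shipped with en_core_web_md), and the helper names average_vector, cosine_similarity, and toy_doc_vector are hypothetical.

```python
import math


def average_vector(vectors):
    """Return the component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(components) / n for components in zip(*vectors)]


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy word vectors, invented purely for this illustration.
toy_vectors = {
    "canned": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "food": [0.7, 0.2, 0.2],
    "great": [0.1, 0.9, 0.3],
    "coffee": [0.0, 0.2, 0.9],
}


def toy_doc_vector(text):
    """Average the toy vectors of the whitespace-separated words in text."""
    return average_vector([toy_vectors[word] for word in text.split()])


category_vec = toy_doc_vector("canned dog food")
related_vec = toy_doc_vector("great dog food")
unrelated_vec = toy_doc_vector("great coffee")

score_related = round(cosine_similarity(related_vec, category_vec), 3)
score_unrelated = round(cosine_similarity(unrelated_vec, category_vec), 3)

print("Similarity of 'great dog food' to the category:", score_related)
print("Similarity of 'great coffee' to the category:", score_unrelated)
```

With these toy vectors, the text that shares words with the category scores higher than the one that does not. Real model vectors have many more dimensions, so actual scores will differ.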
This exercise is part of the course
Natural Language Processing with spaCy
Exercise instructions
- Create a documents list containing Doc containers of all texts.
- Create a Doc container of the category and store it as category_document.
- Iterate through documents and print the similarity score between each Doc container and category_document, rounded to three decimal places.
Interactive exercise
Try this exercise by completing the sample code below.
# Create a documents list containing Doc containers
documents = [____ for t in texts]
# Create a Doc container of the category
category = "canned dog food"
category_document = ____(____)
# Print similarity scores of each Doc container and the category_document
for i, doc in enumerate(documents):
    print(f"Semantic similarity with document {i+1}:", round(doc.____(____), 3))