Doc similarity with spaCy
Semantic similarity is the process of analyzing multiple sentences to identify similarities between them. In this exercise, you will practice calculating the semantic similarity of several documents to a given document. The goal is to find out which of the given reviews are relevant to canned dog food.
The canned dog food category is stored at category. A sample of five food reviews has been provided for you in a list called texts. en_core_web_md is loaded as nlp.
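The similarity scores in this exercise come from word vectors: by default, spaCy's Doc.similarity estimates similarity as the cosine of the averaged word vectors. The following is a minimal pure-Python sketch of that calculation. The helper names and the three-dimensional toy vectors are assumptions made up for illustration; they are not values from en_core_web_md, whose real vectors have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # This helper returns 0.0 when either vector has no length.
        return 0.0
    return dot / (norm_a * norm_b)


def average_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


# Hypothetical toy word vectors (made-up numbers, for illustration only).
toy_vectors = {
    "dog": [0.9, 0.1, 0.0],
    "food": [0.2, 0.8, 0.1],
    "puppy": [0.8, 0.3, 0.1],
    "treats": [0.3, 0.6, 0.1],
    "phone": [0.0, 0.1, 0.9],
    "charger": [0.1, 0.0, 0.8],
}

# A "document" vector is the average of its word vectors.
category_vec = average_vector([toy_vectors[w] for w in ["dog", "food"]])
review_a_vec = average_vector([toy_vectors[w] for w in ["puppy", "treats"]])
review_b_vec = average_vector([toy_vectors[w] for w in ["phone", "charger"]])

score_a = cosine_similarity(category_vec, review_a_vec)
score_b = cosine_similarity(category_vec, review_b_vec)

print("puppy treats vs. dog food:", round(score_a, 3))
print("phone charger vs. dog food:", round(score_b, 3))
```

With these toy numbers, the pet-related review points in nearly the same direction as the category vector and scores much higher than the unrelated one, which is the pattern to look for in the real exercise output.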
This exercise is part of the course Natural Language Processing with spaCy.
Exercise instructions
- Create a documents list containing Doc containers of all texts.
- Create a Doc container of the category and store it as category_document.
- Iterate through documents and print the similarity scores of each Doc container and the category_document, rounded to three digits.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a documents list containing Doc containers
documents = [____ for t in texts]
# Create a Doc container of the category
category = "canned dog food"
category_document = ____(____)
# Print similarity scores of each Doc container and the category_document
for i, doc in enumerate(documents):
    print(f"Semantic similarity with document {i+1}:", round(doc.____(____), 3))
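One possible way to fill in the blanks is sketched below, wrapped in a function so that it can be read and tested on its own. The function names similarity_scores and print_scores are hypothetical and not part of the exercise. The sketch assumes that nlp is a callable returning objects with a similarity method, as a loaded spaCy pipeline such as en_core_web_md does.

```python
def similarity_scores(nlp, texts, category):
    """Return each text's similarity to the category, rounded to three digits.

    `nlp` is assumed to be a callable that turns a string into an object
    with a `similarity` method, such as a loaded spaCy pipeline.
    """
    # Create a documents list containing Doc containers
    documents = [nlp(t) for t in texts]
    # Create a Doc container of the category
    category_document = nlp(category)
    # Compare each Doc container with the category_document
    return [round(doc.similarity(category_document), 3) for doc in documents]


def print_scores(scores):
    """Print the scores in the same format as the exercise template."""
    for i, score in enumerate(scores):
        print(f"Semantic similarity with document {i+1}:", score)


# Usage in the exercise environment, where nlp and texts are already defined:
#   print_scores(similarity_scores(nlp, texts, "canned dog food"))
```

Passing nlp in as an argument keeps the helper independent of how the model was loaded; outside the exercise environment it would typically come from spacy.load("en_core_web_md"), which requires that model to be installed.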