Doc similarity with spaCy
Semantic similarity is the process of analyzing multiple sentences to identify similarities between them. In this exercise, you will practice calculating the semantic similarity of several documents to a given document. The goal is to identify which reviews in a given list are relevant to canned dog food.
The canned dog food category is stored in category. A sample of five food reviews has been provided for you in a list called texts. The en_core_web_md model is loaded as nlp.
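To make the idea of a similarity score concrete, here is a minimal sketch of the calculation that spaCy's Doc.similarity performs by default: cosine similarity between document vectors, where each document vector is the average of its token vectors. The three-dimensional vectors in toy_vectors and the helper names are invented for illustration only; the real en_core_web_md vectors are much larger and are not reproduced here.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical word vectors, made up for this sketch (not taken from any model)
toy_vectors = {
    "canned": [0.2, 0.7, 0.1],
    "dog": [0.9, 0.1, 0.0],
    "food": [0.1, 0.9, 0.0],
    "coffee": [0.0, 0.3, 0.9],
}


def doc_vector(text):
    """Represent a text as the average of the toy vectors of its words."""
    return mean_vector([toy_vectors[word] for word in text.split()])


category_vec = doc_vector("canned dog food")
score_related = cosine_similarity(doc_vector("dog food"), category_vec)
score_unrelated = cosine_similarity(doc_vector("coffee"), category_vec)

print("dog food vs. category:", round(score_related, 3))
print("coffee vs. category:", round(score_unrelated, 3))
```

With these toy vectors, the text that shares vocabulary with the category scores higher than the unrelated one, which is the behavior the exercise relies on when comparing reviews to the category.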
This exercise is part of the course
Natural Language Processing with spaCy
Exercise instructions
- Create a documents list containing Doc containers of all texts.
- Create a Doc container of the category and store it as category_document.
- Iterate through documents and print the similarity scores of each Doc container and the category_document, rounded to three digits.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a documents list containing Doc containers
documents = [____ for t in texts]
# Create a Doc container of the category
category = "canned dog food"
category_document = ____(____)
# Print similarity scores of each Doc container and the category_document
for i, doc in enumerate(documents):
    print(f"Semantic similarity with document {i+1}:", round(doc.____(____), 3))
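One possible way to put the three steps together is sketched below. It is not the course's official solution. The helper name similarity_scores is hypothetical, and it takes the pipeline as a parameter so the same logic works with any object that turns text into something with a similarity method. It uses only calls that exist in spaCy: calling nlp on a string to get a Doc, and Doc.similarity.

```python
def similarity_scores(nlp, category, texts):
    """Return each text's similarity to the category, rounded to three digits.

    nlp is assumed to be a loaded spaCy pipeline with word vectors,
    such as en_core_web_md; similarity_scores is a hypothetical helper name.
    """
    # Step 1: a Doc container for every text
    documents = [nlp(t) for t in texts]
    # Step 2: a Doc container for the category
    category_document = nlp(category)
    # Step 3: similarity of each Doc to the category Doc
    return [round(doc.similarity(category_document), 3) for doc in documents]


# Usage, assuming spaCy and the en_core_web_md model are installed
# and texts is the list of reviews from the exercise:
#   import spacy
#   nlp = spacy.load("en_core_web_md")
#   scores = similarity_scores(nlp, "canned dog food", texts)
#   for i, score in enumerate(scores):
#       print(f"Semantic similarity with document {i+1}:", score)
```

Returning the scores as a list rather than printing inside the helper keeps them available for later steps, such as sorting the reviews by relevance.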