Semantic similarity for categorizing text

Semantic similarity measures how close in meaning two words, phrases, sentences, or documents are. For example, the word “car” is more similar to “bus” than it is to “cat”. In this exercise, you will find the sentences most similar to the word sauce in an example text from Amazon Fine Food Reviews. You can use spaCy to calculate the similarity score between the word sauce and each sentence in a given texts string, then report the score of the most similar sentence.
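To make the “car”, “bus”, “cat” comparison concrete, here is a minimal sketch of cosine similarity, the measure spaCy uses by default to compare word vectors. The three-dimensional vectors in toy_vectors are invented for illustration only; real models such as en_core_web_md store learned vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors (assumption: made up for this example, not taken from any model)
toy_vectors = {
    "car": [0.9, 0.8, 0.1],
    "bus": [0.8, 0.9, 0.2],
    "cat": [0.1, 0.2, 0.9],
}

car_bus = cosine_similarity(toy_vectors["car"], toy_vectors["bus"])
car_cat = cosine_similarity(toy_vectors["car"], toy_vectors["cat"])

# With these toy vectors, "car" scores closer to "bus" than to "cat"
print(round(car_bus, 2), round(car_cat, 2))
```

A score near 1 means the two vectors point in almost the same direction; a score near 0 means they have little in common.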

A texts string containing the Text data of all reviews is pre-loaded. You'll use the en_core_web_md English model for this exercise, which is already available as nlp.

This exercise is part of the course

Natural Language Processing with spaCy

Exercise instructions

  • Use nlp to generate Doc containers for the word sauce and for texts, and store them in key and sentences respectively.
  • Calculate the similarity score between the word sauce and each sentence in the texts string (rounded to two decimal places).
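The same pattern can be sketched on a small stand-in text. To keep the sketch runnable without downloading a model, it assumes a blank English pipeline with a sentencizer and three hand-made toy vectors; demo_nlp, demo_text, and the vector values are hypothetical and not part of the exercise. In the exercise environment, nlp is already the en_core_web_md model, which supplies real word vectors and sentence boundaries.

```python
import numpy as np
import spacy

# Assumption: a blank pipeline stands in for en_core_web_md so no download is needed
demo_nlp = spacy.blank("en")
demo_nlp.add_pipe("sentencizer")  # rule-based sentence splitting

# Hypothetical 3-dimensional toy vectors; a real model ships learned vectors
toy_vectors = {
    "sauce": np.array([1.0, 0.9, 0.0], dtype="float32"),
    "pasta": np.array([0.9, 1.0, 0.1], dtype="float32"),
    "dog": np.array([0.0, 0.1, 1.0], dtype="float32"),
}
for word, vector in toy_vectors.items():
    demo_nlp.vocab.set_vector(word, vector)

# Hypothetical stand-in for the pre-loaded texts string
demo_text = "I poured the sauce over the pasta. The dog barked at the mailman."

key_doc = demo_nlp("sauce")   # Doc container for the query word
text_doc = demo_nlp(demo_text)  # Doc container for the whole text

# Score every sentence (a Span) against the query Doc
semantic_scores = []
for sent in text_doc.sents:
    score = round(float(sent.similarity(key_doc)), 2)
    semantic_scores.append({"sentence": sent.text, "score": score})

# The highest-scoring entry is the sentence most similar to the query word
best = max(semantic_scores, key=lambda item: item["score"])
print(semantic_scores)
print(best["sentence"])
```

A sentence's vector is the average of its token vectors, so tokens without a vector (here, most of them) pull the average toward zero without changing its direction. With a full model, replacing spacy.blank("en") with spacy.load("en_core_web_md") would give meaningful scores for arbitrary text.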

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Populate Doc containers for the word "sauce" and for "texts" string
key = ____
sentences = ____

# Calculate the similarity score between each sentence and the Doc container for the word "sauce"
semantic_scores = []
for sent in sentences.____:
	semantic_scores.append({"score": round(sent.____(____), 2)})
print(semantic_scores)