Text processing with spaCy
Every NLP application consists of several text processing steps. You have already learned some of these steps, including tokenization, lemmatization, sentence segmentation and named entity recognition.
In this exercise, you'll continue to practice text processing steps in spaCy, such as breaking the text into sentences and extracting named entities. You will use the first five reviews from the Amazon Fine Food Reviews dataset for this exercise. You can access these reviews by using the texts object.
The en_core_web_sm model has already been loaded for you to use, and you can access it by using nlp. The list of Doc containers for each item in texts is also pre-loaded and accessible at documents.
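As a rough illustration of sentence segmentation outside the exercise environment, the sketch below builds its own small pipeline. It makes two assumptions that are not part of the exercise: it uses a blank English pipeline with spaCy's rule-based sentencizer instead of en_core_web_sm (so it runs without a model download), and the two example texts are made-up stand-ins rather than the actual review data. A statistical model such as en_core_web_sm may split sentences differently.

```python
import spacy

# Assumption: a blank English pipeline plus the rule-based "sentencizer"
# component, used here instead of en_core_web_sm so no model download is needed.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Hypothetical stand-in texts, not the Amazon Fine Food Reviews data.
texts = [
    "I love this coffee. It arrives quickly.",
    "The tea was stale.",
]

# One Doc container per text, mirroring the pre-loaded documents list.
documents = [nlp(text) for text in texts]

# Doc.sents yields one Span per sentence; Span.text is its raw string.
for doc in documents:
    print([sent.text for sent in doc.sents])
```

The same iteration pattern over Doc.sents applies to any pipeline that sets sentence boundaries, whether rule-based or model-based.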
This exercise is part of the course Natural Language Processing with spaCy.
Have a go at this exercise by completing this sample code.
# Create a list to store sentences of each Doc container in documents
sentences = [[____ for sent in doc.____] for doc in documents]
# Print number of sentences in each Doc container in documents
num_sentences = [len(____) for s in sentences]
print("Number of sentences in documents:\n", ____)