Sentence segmentation with spaCy
In this exercise, you will practice sentence segmentation. In NLP, segmenting a document into its sentences is a useful basic operation. It is one of the first steps in many NLP tasks that are more elaborate, such as detecting named entities. Additionally, capturing the number of sentences may provide some insight into the amount of information provided by the text.
You can access ten food reviews in the list called texts.
The en_core_web_sm model has already been loaded for you as nlp.
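As a rough illustration of what sentence segmentation produces, here is a minimal sketch on two made-up reviews. It makes several assumptions that are not part of the exercise: it uses a blank English pipeline with spaCy's rule-based sentencizer component instead of en_core_web_sm (so no model download is needed), it assumes spaCy v3, and the sample_texts list is hypothetical. The en_core_web_sm model sets sentence boundaries differently, so its splits may not match on harder text.

```python
import spacy

# Assumption: a blank English pipeline plus the rule-based "sentencizer"
# stands in for en_core_web_sm, which the exercise preloads as nlp.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Hypothetical sample reviews (not the exercise's texts list).
sample_texts = [
    "The soup was great. I would order it again!",
    "Too salty.",
]

# One Doc container per review.
documents = [nlp(text) for text in sample_texts]

# doc.sents yields one Span per detected sentence; keep the text of each.
sentences = [[sent.text for sent in doc.sents] for doc in documents]

# Number of sentences found in each review.
num_sentences = [len(doc_sentences) for doc_sentences in sentences]
print(num_sentences)
```

Each inner list holds the sentences of one review, so its length is that review's sentence count.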
This exercise is part of the course
Natural Language Processing with spaCy
Exercise instructions
- Run the spaCy model on each item in the texts list to compile documents, a list of all Doc containers.
- Extract the sentences of each doc container by iterating through the documents list, and append them to a list called sentences.
- Count the number of sentences in each doc container using the sentences list.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Generating a documents list of all Doc containers
documents = [____(text) for text in texts]
# Iterate through documents and append sentences in each doc to the sentences list
sentences = []
for doc in documents:
    sentences.append([s for s in ____.____])
# Find the number of sentences in each doc container
print([len(____) for s in sentences])