Model performance on your data
In this exercise, you will practice evaluating an existing model on your data. The aim is to examine model performance on a specific entity label, PRODUCT. If the model accurately classifies a large percentage of PRODUCT entities (e.g. more than 75%), you do not need to train it on examples of PRODUCT entities. Otherwise, you should consider training the model to improve its PRODUCT entity predictions.
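The decision rule above can be sketched in a few lines of Python. This is only an illustration: the helper name needs_training, the default threshold of 0.75 (taken from the "more than 75%" example), and the sample outcomes are assumptions, not part of the exercise.

```python
def needs_training(correct_labels, threshold=0.75):
    """Return True when the share of correct predictions does not exceed the threshold.

    correct_labels is a list of booleans, one per evaluated entity.
    The function name and the default threshold are illustrative assumptions.
    """
    if not correct_labels:
        # Nothing was evaluated, so no accuracy can be computed
        raise ValueError("correct_labels is empty")
    accuracy = sum(correct_labels) / len(correct_labels)
    return accuracy <= threshold


# Hypothetical outcomes: four correct predictions out of five
example_labels = [True, True, True, True, False]
accuracy = sum(example_labels) / len(example_labels)
print(f"accuracy = {accuracy:.2f}, consider training: {needs_training(example_labels)}")
```

With only a couple of reviews, an accuracy computed this way rests on very few entities, so it is a rough signal rather than a reliable estimate.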
You'll use two reviews from the Amazon Fine Food Reviews dataset for this exercise. You can access these reviews by using the texts list.
The en_core_web_sm model is already loaded for you. You can access it by calling nlp(). The model has already been run on the texts list, and documents, a list of Doc containers, is available for your use.
This exercise is part of the course
Natural Language Processing with spaCy
Exercise instructions
- Compile a target_entities list of all the entities in each of the documents, appending a tuple of (entity text, entity label) only if Jumbo is in the entity text.
- For each tuple in target_entities, append True to a correct_labels list if the entity label (the second element of the tuple) is PRODUCT; otherwise, append False.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Append a tuple of (entities text, entities label) if Jumbo is in the entity
target_entities = []
for doc in ____:
    target_entities.extend([(ent.____, ent.____) for ent in doc.____ if "Jumbo" in ent.text])
print(target_entities)

# Append True to the correct_labels list if the entity label is `PRODUCT`
correct_labels = []
for ent in target_entities:
    if ____[1] == "PRODUCT":
        correct_labels.append(____)
    else:
        correct_labels.append(____)
print(correct_labels)
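One possible way to fill in the blanks is sketched below. To keep it runnable without spaCy or the exercise data, it uses hypothetical stand-ins: FakeEnt and FakeDoc are made-up classes that mimic only the text, label_, and ents attributes of spaCy's Span and Doc objects, and the entity texts and labels are invented, not actual en_core_web_sm output.

```python
from collections import namedtuple

# Hypothetical stand-ins for spaCy's Span and Doc; only the attributes used below are modelled
FakeEnt = namedtuple("FakeEnt", ["text", "label_"])
FakeDoc = namedtuple("FakeDoc", ["ents"])

# Invented entities and labels for illustration, not real model predictions
documents = [
    FakeDoc(ents=[FakeEnt("Jumbo Salted Peanuts", "PERSON"), FakeEnt("Amazon", "ORG")]),
    FakeDoc(ents=[FakeEnt("Jumbo", "PRODUCT")]),
]

# Collect (entity text, entity label) for every entity whose text contains "Jumbo"
target_entities = []
for doc in documents:
    target_entities.extend(
        [(ent.text, ent.label_) for ent in doc.ents if "Jumbo" in ent.text]
    )
print(target_entities)  # [('Jumbo Salted Peanuts', 'PERSON'), ('Jumbo', 'PRODUCT')]

# Record True when the label (second element of the tuple) is PRODUCT, otherwise False
correct_labels = []
for ent in target_entities:
    if ent[1] == "PRODUCT":
        correct_labels.append(True)
    else:
        correct_labels.append(False)
print(correct_labels)  # [False, True]
```

In the exercise itself, documents already holds real Doc containers, so the stand-in classes and sample data would not be needed.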