Model performance on your data
In this exercise, you will practice evaluating an existing model on your data. The aim is to examine model performance on a specific entity label, PRODUCT. If the model accurately classifies a large percentage of PRODUCT entities (e.g. more than 75%), you do not need to train it on examples of PRODUCT entities; otherwise, you should consider training the model to improve its PRODUCT entity prediction.
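The 75% rule of thumb described above can be sketched as a small check on a list of True/False outcomes. This is only an illustration: the helper names label_accuracy and needs_training, the default threshold argument, and the sample list are hypothetical and not part of the exercise.

```python
def label_accuracy(correct_labels):
    """Return the fraction of True values in a list of booleans."""
    if not correct_labels:
        return 0.0
    return sum(correct_labels) / len(correct_labels)


def needs_training(correct_labels, threshold=0.75):
    """Suggest training unless accuracy is strictly above the threshold."""
    return label_accuracy(correct_labels) <= threshold


# Hypothetical outcome: one of three PRODUCT entities was labeled correctly.
sample = [True, False, False]
print(label_accuracy(sample))
print(needs_training(sample))
```

With only a handful of entities, as in this exercise, the resulting percentage is a rough signal rather than a reliable estimate.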
You'll use two reviews from the Amazon Fine Food Reviews dataset for this exercise. You can access these reviews through the texts list.
The en_core_web_sm model is already loaded for you. You can access it by calling nlp(). The model has already been run on the texts list, and documents, a list of Doc containers, is available for your use.
This exercise is part of the course Natural Language Processing with spaCy
Exercise instructions
- Compile a target_entities list of all the entities for each of the documents, appending a tuple of (entity text, entity label) only if Jumbo is in the entity text.
- For each tuple in target_entities, append True to a correct_labels list if the entity label (the second element of the tuple) is PRODUCT; otherwise, append False.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Append a tuple of (entity text, entity label) if Jumbo is in the entity
target_entities = []
for doc in ____:
    target_entities.extend([(ent.____, ent.____) for ent in doc.____ if "Jumbo" in ent.text])
print(target_entities)

# Append True to the correct_labels list if the entity label is `PRODUCT`
correct_labels = []
for ent in target_entities:
    if ____[1] == "PRODUCT":
        correct_labels.append(____)
    else:
        correct_labels.append(____)
print(correct_labels)
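To see the same filter-then-check pattern run end to end without loading a model, here is a minimal sketch on made-up data. It assumes that spaCy entity spans expose text and label_ and that a Doc exposes ents; the Ent and Doc stand-ins below mimic only those attributes, and the entity texts and labels are invented rather than actual en_core_web_sm output.

```python
from collections import namedtuple

# Stand-ins that mimic only the attributes used here (assumption: real spaCy
# entity spans expose .text and .label_, and Doc objects expose .ents).
Ent = namedtuple("Ent", ["text", "label_"])
Doc = namedtuple("Doc", ["ents"])

# Made-up entities for illustration, not real model predictions.
documents = [
    Doc(ents=[Ent("Jumbo", "PERSON"), Ent("Amazon", "ORG")]),
    Doc(ents=[Ent("Jumbo Salted Peanuts", "PRODUCT")]),
]

# Keep (text, label) pairs only for entities whose text contains "Jumbo".
target_entities = []
for doc in documents:
    target_entities.extend(
        [(ent.text, ent.label_) for ent in doc.ents if "Jumbo" in ent.text]
    )

# Mark each kept entity as correct when its label is PRODUCT.
correct_labels = [label == "PRODUCT" for _, label in target_entities]

print(target_entities)
print(correct_labels)
```

The list comprehension for correct_labels is a compact alternative to the if/else loop; both produce one boolean per filtered entity.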