
Model performance on your data

In this exercise, you will practice evaluating an existing model on your data. The aim is to examine model performance on a specific entity label, PRODUCT. If the model accurately classifies a large percentage of PRODUCT entities (e.g. more than 75%), you do not need to train it on examples of PRODUCT entities. Otherwise, you should consider training the model to improve its performance on PRODUCT entity prediction.

You'll use two reviews from the Amazon Fine Food Reviews dataset for this exercise. You can access these reviews by using the texts list.

The en_core_web_sm model is already loaded for you. You can access it by calling nlp(). The model has already been run on the texts list, and documents, a list of Doc containers, is available for your use.

This exercise is part of the course

Natural Language Processing with spaCy


Instructions

  • Compile a target_entities list: for each document in documents, go through its entities and append a tuple of (entity text, entity label) only if "Jumbo" is in the entity text.
  • For each tuple in target_entities, append True to a correct_labels list if the entity label (the second element of the tuple) is PRODUCT; otherwise, append False.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Append a tuple of (entities text, entities label) if Jumbo is in the entity
target_entities = []
for doc in ____:
  target_entities.extend([(ent.____, ent.____) for ent in doc.____ if "Jumbo" in ent.text])
print(target_entities)

# Append True to the correct_labels list if the entity label is `PRODUCT`
correct_labels = []
for ent in target_entities:
  if ____[1] == "PRODUCT":
    correct_labels.append(____)
  else:
    correct_labels.append(____)
print(correct_labels)
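
Once correct_labels is populated, the "more than 75%" rule of thumb from the introduction can be checked with a few lines of plain Python. The following is a minimal sketch, not part of the exercise: the values in correct_labels are made up for illustration, and the names THRESHOLD, product_accuracy, and needs_training are hypothetical.

```python
# Made-up values for illustration only; in the exercise, correct_labels
# is built from the model's predictions on the reviews.
correct_labels = [True, True, False]

# Assumed cut-off, taken from the "more than 75%" rule of thumb above.
THRESHOLD = 0.75


def product_accuracy(labels):
    """Return the fraction of True values, or 0.0 for an empty list."""
    return sum(labels) / len(labels) if labels else 0.0


accuracy = product_accuracy(correct_labels)

# Training is worth considering unless accuracy is strictly above the threshold.
needs_training = accuracy <= THRESHOLD

print(f"PRODUCT accuracy: {accuracy:.0%}")
if needs_training:
    print("Consider training the model on PRODUCT examples.")
else:
    print("The existing model may be adequate for PRODUCT entities.")
```

With only a handful of entities, as in this exercise, the resulting percentage is a rough indication rather than a reliable estimate, so a larger sample would be needed before drawing firm conclusions.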