1. Learn
  2. /
  3. Courses
  4. /
  5. Natural Language Processing with spaCy

Connected

Exercise

Model performance on your data

In this exercise, you will practice evaluating an existing model on your data. In this case, the aim is to examine model performance on a specific entity label, PRODUCT. If a model can accurately classify a large percentage of PRODUCT entities (e.g. more than 75%), you do not need to train the model on examples of PRODUCT entities, otherwise, you should consider training the model to improve its performance on PRODUCT entity prediction.

You'll use two reviews from the Amazon Fine Food Reviews dataset for this exercise. You can access these reviews by using the texts list.

The en_core_web_sm model is already loaded for you. You can access it by calling nlp(). The model is already ran on the texts list and documents, a list of Doc containers is available for your use.

Instructions

100 XP
  • Compile a target_entities list, of all the entities for each of the documents, and append a tuple of (entities text, entities label) only if Jumbo is in the entity text.
  • For any tuple in the target_entities, append True to a correct_labels list if the entity label (second attribute in the tuple) is PRODUCT, otherwise append False.