Compatible training data
Recall that you cannot feed raw text and annotations directly to spaCy for training. Instead, you need to create an Example object for each training example. In this exercise, you will practice converting training_data, which holds a single annotated sentence, into a list of Example objects.
The en_core_web_sm model is already loaded and ready for use as nlp. The Example class has also been imported for you.
This exercise is part of the course
Natural Language Processing with spaCy
Instructions
- Iterate through the text and annotations in training_data, convert the text to a Doc container, and store it in doc.
- Create an Example object using the doc object and the annotations of each training data point, and store it in example_sentence.
- Append example_sentence to the all_examples list.
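The steps above can be sketched on a made-up sentence, leaving the exercise data for you to complete. This is only an illustration under a few assumptions: it uses a blank English pipeline from spacy.blank("en") instead of en_core_web_sm so that no model download is needed, and the sentence and its SYMPTOM span are hypothetical.

```python
import spacy
from spacy.training import Example

# Assumption: a blank English pipeline (tokenizer only) stands in for en_core_web_sm
nlp = spacy.blank("en")

# Hypothetical training data: one sentence with one annotated entity
sample_text = "She reported a mild fever."
sample_data = [(sample_text, {"entities": [(15, 25, "SYMPTOM")]})]

all_examples = []
for text, annotations in sample_data:
    # Convert the raw text to a Doc container
    doc = nlp(text)
    # Pair the Doc with its annotations in an Example object
    example_sentence = Example.from_dict(doc, annotations)
    all_examples.append(example_sentence)

# The reference Doc inside each Example carries the annotated entities
for example in all_examples:
    print([(ent.text, ent.label_) for ent in example.reference.ents])
```

Example.from_dict takes the Doc first and the annotation dictionary second, which is the same argument order as in the exercise template.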
Hands-on interactive exercise
Try this exercise by completing this sample code.
example_text = 'A patient with chest pain had hyperthyroidism.'
training_data = [(example_text, {'entities': [(15, 25, 'SYMPTOM'), (30, 45, 'DISEASE')]})]
all_examples = []
# Iterate through text and annotations and convert text to a Doc container
for text, annotations in training_data:
    doc = nlp(____)
    # Create an Example object from the doc container and annotations
    example_sentence = ____.____(doc, ____)
    print(example_sentence.to_dict(), "\n")
    # Append the Example object to the list of all examples
    all_examples.append(____)
print("Number of formatted training data: ", len(____))
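The entity annotations in training_data are character offsets into the text, with the start included and the end excluded. Before building Example objects, it can be worth confirming that each offset pair slices out the intended span. The check below is a plain-Python sketch that does not need spaCy; the spans list is a hypothetical helper name introduced here for illustration.

```python
example_text = 'A patient with chest pain had hyperthyroidism.'
training_data = [(example_text, {'entities': [(15, 25, 'SYMPTOM'), (30, 45, 'DISEASE')]})]

# Collect the text covered by each (start, end, label) annotation
spans = []
for text, annotations in training_data:
    for start, end, label in annotations['entities']:
        # Python slicing is end-exclusive, matching the offset convention
        spans.append((text[start:end], label))
        print(f"{label}: {text[start:end]!r}")
```

If a slice shows a clipped word or a stray space, the offsets likely need adjusting before the data is converted.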