Compatible training data
Recall that you cannot feed raw text directly to spaCy for training. Instead, you need to create an Example object for each training example. In this exercise, you will practice converting training_data that contains a single annotated sentence into a list of Example objects.
The en_core_web_sm model is already loaded and ready for use as nlp. The Example class is also imported for your use.
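To make the role of the Example object concrete, here is a minimal sketch of building one by hand. It assumes spaCy 3.x and, to avoid a model download, uses a blank English pipeline (spacy.blank("en")) in place of en_core_web_sm. The sentence and offsets are the ones used in this exercise; the gold_entities name is only illustrative.

```python
import spacy
from spacy.training import Example

# Assumption: a blank English pipeline stands in for en_core_web_sm,
# since only the tokenizer is needed to build an Example.
nlp = spacy.blank("en")

text = "A patient with chest pain had hyperthyroidism."
annotations = {"entities": [(15, 25, "SYMPTOM"), (30, 45, "DISEASE")]}

# The Doc made from the raw text becomes the "predicted" side of the Example;
# the "reference" side is built from the annotation dictionary.
doc = nlp(text)
example = Example.from_dict(doc, annotations)

# The annotated entities are available on example.reference.
gold_entities = [(ent.text, ent.label_) for ent in example.reference.ents]
print(gold_entities)
```

The Example pairs the unannotated Doc with its annotated counterpart, which is the form spaCy's training loop consumes.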
This exercise is part of the course Natural Language Processing with spaCy.
Exercise instructions
- Iterate through the text and annotations in training_data, convert the text to a Doc container, and store it in doc.
- Create an Example object using the doc object and the annotations of each training data point, and store it in example_sentence.
- Append example_sentence to the all_examples list.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
example_text = 'A patient with chest pain had hyperthyroidism.'
training_data = [(example_text, {'entities': [(15, 25, 'SYMPTOM'), (30, 45, 'DISEASE')]})]

all_examples = []
# Iterate through text and annotations and convert text to a Doc container
for text, annotations in training_data:
    doc = nlp(____)

    # Create an Example object from the doc container and annotations
    example_sentence = ____.____(doc, ____)
    print(example_sentence.to_dict(), "\n")

    # Append the Example object to the list of all examples
    all_examples.append(____)

print("Number of formatted training data: ", len(____))
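For reference, one possible way to complete the loop is sketched below, together with a quick sanity check that each character offset selects the intended text. It assumes spaCy 3.x and substitutes spacy.blank("en") for the preloaded en_core_web_sm pipeline; the spans_by_label variable is illustrative and not part of the exercise.

```python
import spacy
from spacy.training import Example

# Assumption: a blank English pipeline replaces the preloaded en_core_web_sm.
nlp = spacy.blank("en")

example_text = 'A patient with chest pain had hyperthyroidism.'
training_data = [(example_text, {'entities': [(15, 25, 'SYMPTOM'), (30, 45, 'DISEASE')]})]

# Sanity check: slice the text with each (start, end) pair to confirm
# that the offsets cover the intended words.
spans_by_label = {
    label: example_text[start:end]
    for start, end, label in training_data[0][1]['entities']
}
print(spans_by_label)

all_examples = []
for text, annotations in training_data:
    # Convert the raw text to a Doc container
    doc = nlp(text)

    # Pair the Doc with its annotations in an Example object
    example_sentence = Example.from_dict(doc, annotations)

    # Collect the Example for later training
    all_examples.append(example_sentence)

print("Number of formatted training data: ", len(all_examples))
```

Checking the slices first is a cheap way to catch off-by-one offsets before they reach spaCy, where misaligned spans are harder to diagnose.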