Train an existing NER model
A spaCy model may not work well on a given dataset. One solution is to train the model further on our own data. In this exercise, you will practice training an NER model to improve its predictions.
The spaCy en_core_web_sm model is available as nlp, but it does not correctly predict "house" as an entity in the test string.
Given training_data, write the steps to update this model while iterating through the data twice. The other pipeline components are already disabled, the optimizer is ready to use, and the number of epochs is already set to 2.
This exercise is part of the course Natural Language Processing with spaCy.
Exercise instructions
- Use the optimizer object and, for each epoch, shuffle the dataset using the random package and create an Example object.
- Update the nlp model using the .update method and set the sgd argument to use the optimizer.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import random
import spacy
from spacy.training import Example

# test, training_data and epochs (set to 2) are preloaded by the exercise
nlp = spacy.load("en_core_web_sm")
print("Before training: ", [(ent.text, ent.label_) for ent in nlp(test).ents])
# Disable every pipeline component except NER, then create the optimizer
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
nlp.disable_pipes(*other_pipes)
optimizer = nlp.create_optimizer()
# Shuffle the training data with the random package at the start of each epoch
for i in range(epochs):
    random.____(training_data)
    for text, ____ in training_data:
        doc = nlp.____(____)
        # Create an Example, then update the nlp model with sgd set to the optimizer
        example = Example.____(____, ____)
        nlp.____([____], sgd=____)
print("After training: ", [(ent.text, ent.label_) for ent in nlp(test).ents])
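One possible way to fill in the loop is sketched below. It assumes spaCy v3, where Example is imported from spacy.training, and it wraps the loop in a hypothetical helper named update_ner that is not part of the exercise. The losses dictionary is an optional extra, added here only so that the result can be inspected.

```python
import random

from spacy.training import Example


def update_ner(nlp, training_data, optimizer, epochs=2):
    """Run `epochs` passes over training_data, updating nlp one example at a time.

    Hypothetical helper: the exercise writes this loop inline instead.
    """
    losses = {}
    data = list(training_data)  # copy so the caller's list is not reordered
    for _ in range(epochs):
        # A fresh order on every pass over the data
        random.shuffle(data)
        for text, annotation in data:
            # Tokenize only; no pipeline components are run on the text
            doc = nlp.make_doc(text)
            # Pair the unannotated doc with its gold-standard annotations
            example = Example.from_dict(doc, annotation)
            # One optimizer step per example; losses accumulate per component
            nlp.update([example], sgd=optimizer, losses=losses)
    return losses
```

With the exercise's preloaded variables, this would be called as update_ner(nlp, training_data, optimizer, epochs) between the two print statements. Two epochs over a handful of sentences is a very small amount of training, so the "After training" output may or may not change for a given test string.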