Train an existing NER model

A pretrained spaCy model may not perform well on a given dataset. One solution is to continue training the model on your own data. In this exercise, you will practice training a NER model to improve its predictions.

A spaCy en_core_web_sm model is accessible as nlp. It does not correctly predict house as an entity in a test string.

Given training_data, write the steps to update this model while iterating through the data two times. The other pipeline components are already disabled, the optimizer is ready to use, and the number of epochs is already set to 2.
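The exercise does not show what training_data contains. A minimal sketch of the shape that spaCy's Example.from_dict accepts for NER is a list of (text, annotation) pairs, where the annotation holds character offsets. The sentence, the BUILDING label, and the entity_spans helper below are hypothetical and only illustrate the format.

```python
# Hypothetical training item: (text, {"entities": [(start, end, label)]}).
# Offsets are character positions; "BUILDING" is a made-up label.
training_data = [
    ("The house is for sale.", {"entities": [(4, 9, "BUILDING")]}),
]


def entity_spans(text, annotation):
    """Return (substring, label) pairs, checking that offsets fit the text."""
    spans = []
    for start, end, label in annotation["entities"]:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"Offsets ({start}, {end}) fall outside the text")
        spans.append((text[start:end], label))
    return spans


for text, annotation in training_data:
    print(entity_spans(text, annotation))
```

Checking offsets this way before training helps catch annotations that are off by a character, which would otherwise not line up with token boundaries.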

This exercise is part of the course

Natural Language Processing with spaCy

Exercise instructions

  • For each epoch, shuffle the dataset using the random package, then create an Example object for each training item.
  • Update the nlp model using the .update() method, setting the sgd argument to the optimizer object.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

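import random
import spacy
from spacy.training import Example

# test, training_data and epochs are provided by the exercise environment
nlp = spacy.load("en_core_web_sm")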
print("Before training: ", [(ent.text, ent.label_) for ent in nlp(test).ents])
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'ner']
nlp.disable_pipes(*other_pipes)
optimizer = nlp.create_optimizer()

# Shuffle the training data with the random package at the start of each epoch
for i in range(epochs):
  random.____(training_data)
  for text, ____ in training_data:
    doc = nlp.____(____)
    # Create an Example, then update the nlp model with sgd set to the optimizer
    example = Example.____(____, ____)
    nlp.____([____], sgd = ____)
print("After training: ", [(ent.text, ent.label_) for ent in nlp(test).ents])
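For reference, the same update pattern can be sketched as a standalone script. This is an illustration under stated assumptions, not the exercise's official solution: it assumes spaCy v3, uses a blank English pipeline in place of en_core_web_sm so that no model download is needed, and the sentences and the BUILDING label are hypothetical.

```python
import random

import spacy
from spacy.training import Example

# Hypothetical training data: (text, {"entities": [(start, end, label)]}).
training_data = [
    ("The house is for sale.", {"entities": [(4, 9, "BUILDING")]}),
    ("She bought a house last year.", {"entities": [(13, 18, "BUILDING")]}),
]
epochs = 2

# Assumption: a blank pipeline with a fresh NER component stands in
# for the pretrained en_core_web_sm model used in the exercise.
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("BUILDING")

# initialize() sets up the model weights and returns an optimizer.
optimizer = nlp.initialize(
    lambda: [Example.from_dict(nlp.make_doc(t), a) for t, a in training_data]
)

losses = {}
for epoch in range(epochs):
    # Shuffle so the model does not see the items in a fixed order.
    random.shuffle(training_data)
    for text, annotation in training_data:
        doc = nlp.make_doc(text)  # tokenize only, no predictions
        example = Example.from_dict(doc, annotation)
        nlp.update([example], sgd=optimizer, losses=losses)

print("Accumulated losses:", losses)
```

Passing a losses dictionary is optional, but it gives a quick signal that updates are actually being applied. Two epochs on two sentences is far too little for reliable predictions; the point here is only the order of the calls.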