
Training with spaCy

1. Training with spaCy

Welcome! Let's learn how to train spaCy models for an NER task.

2. Training steps

We previously learned that a spaCy model may not perform well on a given dataset. One solution is to train the model on our own data. In this video, we will learn how to train a model on our data after annotating and preparing the training data, and after disabling all other pipeline components. For example, to train an NER component, we need to disable all other pipeline components, such as the POS tagger and the dependency parser. We then feed our examples to the training procedure and evaluate the performance of the new NER model. Let's learn more about each of these steps.

3. Disabling other pipeline components

It is necessary to disable the other pipeline components of an nlp model in order to train only the intended component. For example, if we want to train an NER model, we have to ensure that all other components are disabled. For this purpose, we can use the nlp-dot-disable_pipes() method, given a list of other_pipes (all other pipeline components). other_pipes is compiled by looping through each pipe name in nlp-dot-pipe_names and keeping those that are not equal to "ner".
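As a minimal sketch, assuming spaCy v3 and a blank English pipeline with a few components added by hand (a packaged model such as en_core_web_sm may not be installed), disabling everything except the NER component looks like this:

```python
import spacy

# Hypothetical pipeline: a blank English model with three components,
# standing in for a full pretrained pipeline.
nlp = spacy.blank("en")
nlp.add_pipe("tagger")
nlp.add_pipe("parser")
nlp.add_pipe("ner")

# Compile the list of every pipe except the one we want to train
other_pipes = [pipe for pipe in nlp.pipe_names if pipe != "ner"]
nlp.disable_pipes(*other_pipes)

print(nlp.pipe_names)  # only the NER component remains active
```

Note that in spaCy v3, disable_pipes() is a thin wrapper around select_pipes(); disabled components are skipped when the pipeline runs and no longer appear in nlp-dot-pipe_names.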

4. Model training procedure

In the training procedure, we go over the training data several times. An epoch is one pass of the learning algorithm through the entire training dataset. In each epoch, the training code updates the weights of the model by a small amount, using an optimizer object on randomly shuffled training data. Optimizers are functions that update the model weights so as to lower prediction errors and improve the accuracy of the model. We can create an optimizer object using the create_optimizer() method. In each epoch, we first shuffle training_data using the random-dot-shuffle() method. Next, for each training data point, which is a tuple of a text and its annotations, we construct an Example object from the Doc container of the text and the annotations, using the Example-dot-from_dict() method. This Example object is used to update the nlp model weights by calling the nlp-dot-update() method, passing a list containing the example, the optimizer object, and a losses dictionary to track the model's loss during training. Loss is a number indicating how bad the model's prediction was on a single example. The procedure then continues with the next training data point.
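The loop above can be sketched as follows. The toy training data and the "GADGET" label are hypothetical, and this assumes spaCy v3, where nlp-dot-initialize() sets up a blank model's weights and returns an optimizer (nlp-dot-create_optimizer() also exists for an already initialized pipeline):

```python
import random
import spacy
from spacy.training import Example

# Hypothetical toy dataset: (text, annotations) tuples with
# character-offset entity spans
training_data = [
    ("I bought a new iPhone.", {"entities": [(15, 21, "GADGET")]}),
    ("The iPad is on sale.", {"entities": [(4, 8, "GADGET")]}),
]

nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("GADGET")

# Initialize the model weights and get an optimizer back
optimizer = nlp.initialize()

losses = {}
for epoch in range(10):
    random.shuffle(training_data)          # reshuffle each epoch
    for text, annotations in training_data:
        doc = nlp.make_doc(text)           # Doc container for the raw text
        example = Example.from_dict(doc, annotations)
        # Update the weights and accumulate the per-component loss
        nlp.update([example], sgd=optimizer, losses=losses)

print(losses["ner"])
```

The losses dictionary accumulates a loss value per trainable component, which is useful for checking that training is actually making progress across epochs.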

5. Save and load a trained model

After we have trained a model, the next step is to test it. For this purpose, we need to save the model and later load it. We use the get_pipe() method to get the trained pipeline component. In our example, we trained an NER model, hence we get the NER component and save it to disk using the ner-dot-to_disk() method, passing a model name. Later, we load a spaCy model and create a blank NER component by using the nlp-dot-create_pipe() method. Then, we load the trained NER model from disk by calling the ner-dot-from_disk() method on the newly created NER component. Lastly, we add the loaded NER component to the pipeline by calling the nlp-dot-add_pipe() method and passing the component name, such as "ner".
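A minimal sketch of this round trip, assuming spaCy v3: note that in v3, nlp-dot-add_pipe("ner") both creates the blank component and adds it to the pipeline, replacing the older create_pipe/add_pipe pair described above. The "GADGET" label and the my_ner path are hypothetical:

```python
import tempfile
from pathlib import Path
import spacy

# A minimally initialized NER pipeline stands in for a trained one
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("GADGET")
nlp.initialize()

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "my_ner"
    # Save only the trained NER component to disk
    nlp.get_pipe("ner").to_disk(model_path)

    # Later: create a fresh pipeline with a blank NER component
    # and restore the trained weights and labels from disk
    nlp2 = spacy.blank("en")
    new_ner = nlp2.add_pipe("ner")
    new_ner.from_disk(model_path)

print(nlp2.get_pipe("ner").labels)
```

Saving only the component keeps the artifact small; alternatively, nlp-dot-to_disk() serializes the whole pipeline, which can then be reloaded with spacy-dot-load().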

6. Model for inference

Once a trained model is saved, it can be loaded as an nlp object. Then, we can use the model to find the entities of a given text. The example shows how to apply the NER model to a text and store the entities' texts and labels.
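The inference pattern looks like the sketch below. For a saved model we would call spacy-dot-load() with its path; here, as an assumption, a freshly initialized blank pipeline stands in, so doc.ents will likely stay empty until the model has actually been trained:

```python
import spacy

# Stand-in for spacy.load("my_ner_model"): a blank, initialized pipeline
nlp = spacy.blank("en")
ner = nlp.add_pipe("ner")
ner.add_label("GADGET")  # hypothetical label
nlp.initialize()

# Apply the model and collect each entity's text and label
doc = nlp("Apple is opening a store in San Francisco.")
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

With a properly trained model, entities would contain tuples such as a span's text paired with its predicted label.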

7. Let's practice!

Let's practice what we've learned.