Training multiple labels
Here's a small sample of a dataset created to train a new entity type, WEBSITE. The original dataset contains a few thousand sentences. In this exercise, you'll be doing the labeling by hand. In real life, you probably want to automate this and use an annotation tool, for example Brat, a popular open-source solution, or Prodigy, our own annotation tool that integrates with spaCy.
After this exercise, you will be nearly done with the course! If you enjoyed it, feel free to send Ines a thank-you via Twitter. She'll appreciate it!
This exercise is part of the course Advanced NLP with spaCy.
Try this exercise by completing the sample code.
# Each entity is a (start_char, end_char, label) tuple; the end offset is exclusive.
# Replace each ____ with the matching character offset.
TRAINING_DATA = [
    ("Reddit partners with Patreon to help creators build communities",
     {'entities': [(____, ____, 'WEBSITE'), (____, ____, 'WEBSITE')]}),
    ("PewDiePie smashes YouTube record",
     {'entities': [(____, ____, 'WEBSITE')]}),
    ("Reddit founder Alexis Ohanian gave away two Metallica tickets to fans",
     {'entities': [(____, ____, 'WEBSITE')]}),
    # And so on...
]
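
If you'd rather not count characters by hand, the offsets can be computed with plain Python string methods. The following is a minimal sketch, not part of the course material: `find_span` is a hypothetical helper name, and it assumes the entity text occurs exactly as written and that the first occurrence is the one you want to label.

```python
def find_span(text, phrase, label):
    """Return (start, end, label) for the first occurrence of phrase in text.

    Hypothetical helper for illustration only. The end offset is exclusive,
    so text[start:end] == phrase.
    """
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    return (start, start + len(phrase), label)


text = "Reddit partners with Patreon to help creators build communities"

# Build the entity annotations for the first training sentence.
entities = [find_span(text, name, "WEBSITE") for name in ("Reddit", "Patreon")]
example = (text, {"entities": entities})

# Sanity check: slicing the text with each span recovers the entity string.
for start, end, label in entities:
    print(label, repr(text[start:end]), (start, end))
```

Slicing the text with each computed span is a quick way to confirm an annotation before adding it to `TRAINING_DATA`. It also catches off-by-one errors that are easy to make when counting manually.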