
Training multiple labels

Here's a small sample of a dataset created to train a new entity type, WEBSITE. The original dataset contains a few thousand sentences. In this exercise, you'll do the labeling by hand. In real life, you would probably automate this with an annotation tool, for example Brat, a popular open-source solution, or Prodigy, our own annotation tool that integrates with spaCy.

After this exercise you will be nearly done with the course! If you enjoyed it, feel free to send Ines a thank-you on Twitter. She'll appreciate it!

This exercise is part of the course

Advanced NLP with spaCy


Hands-on interactive exercise

Try this exercise by completing the sample code below.

TRAINING_DATA = [
    ("Reddit partners with Patreon to help creators build communities", 
     {'entities': [(____, ____, 'WEBSITE'), (____, ____, 'WEBSITE')]}),
  
    ("PewDiePie smashes YouTube record", 
     {'entities': [(____, ____, 'WEBSITE')]}),
  
    ("Reddit founder Alexis Ohanian gave away two Metallica tickets to fans", 
     {'entities': [(____, ____, 'WEBSITE')]}),
    # And so on...
]
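
Counting character positions by hand is easy to get wrong. As a minimal sketch, assuming each entity appears verbatim in its sentence, the offsets for an `entities` tuple can be computed with Python's built-in `str.find`. The helper name `find_span` is hypothetical and not part of spaCy. It returns a start offset and an exclusive end offset, so slicing the text with them gives back the entity string.

```python
def find_span(text, phrase, label, start_at=0):
    """Return (start, end, label) for the first occurrence of phrase in text.

    Hypothetical helper for illustration only, not a spaCy function.
    The end offset is exclusive, so text[start:end] == phrase.
    """
    start = text.find(phrase, start_at)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in {text!r}")
    return (start, start + len(phrase), label)


text = "PewDiePie smashes YouTube record"
span = find_span(text, "YouTube", "WEBSITE")

# Sanity check: slicing with the computed offsets recovers the entity text
start, end, label = span
assert text[start:end] == "YouTube"

# Assemble one training example in the same shape as TRAINING_DATA
example = (text, {"entities": [span]})
print(example)
```

A check like the `assert` above is also a quick way to verify offsets you counted by hand. Note that `str.find` only returns the first match, so a sentence that mentions the same name twice would need the `start_at` argument to locate the later occurrence.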