Session Ready
Exercise

NER with NLTK

You're now going to have some fun with named-entity recognition! A scraped news article has been pre-loaded into your workspace. Your task is to use nltk to find the named entities in this article.

What might the article be about, given the names you found?

Along with nltk, sent_tokenize and word_tokenize from nltk.tokenize have been pre-imported.

Instructions
100 XP
  • Tokenize article into sentences.
  • Tokenize each sentence in sentences into words using a list comprehension.
  • Inside a list comprehension, tag each tokenized sentence into parts of speech using nltk.pos_tag().
  • Chunk each tagged sentence into named-entity chunks using nltk.ne_chunk_sents(). Along with pos_sentences, specify the additional keyword argument binary=True.
  • Loop over each sentence and each chunk, and test whether it is a named-entity chunk by testing if it has the attribute label, and if the chunk.label() is equal to "NE". If so, print that chunk.