MulaiMulai sekarang secara gratis

NER with NLTK

You're now going to have some fun with named-entity recognition! A scraped news article has been pre-loaded into your workspace. Your task is to use nltk to find the named entities in this article.

What might the article be about, given the names you found?

Along with nltk, sent_tokenize and word_tokenize from nltk.tokenize have been pre-imported.

Latihan ini adalah bagian dari kursus

Introduction to Natural Language Processing in Python

Lihat Kursus

Petunjuk latihan

  • Tokenize article into sentences.
  • Tokenize each sentence in sentences into words using a list comprehension.
  • Inside a list comprehension, tag each tokenized sentence into parts of speech using nltk.pos_tag().
  • Chunk each tagged sentence into named-entity chunks using nltk.ne_chunk_sents(). Along with pos_sentences, specify the additional keyword argument binary=True.
  • Loop over each sentence and each chunk, and test whether it is a named-entity chunk by testing if it has the attribute label, and if the chunk.label() is equal to "NE". If so, print that chunk.

Latihan interaktif praktis

Cobalah latihan ini dengan menyelesaikan kode contoh berikut.

# Tokenize the article into sentences: sentences
sentences = ____

# Tokenize each sentence into words: token_sentences
token_sentences = [____ for sent in ____]

# Tag each tokenized sentence into parts of speech: pos_sentences
pos_sentences = [____ for sent in ____] 

# Create the named entity chunks: chunked_sentences
chunked_sentences = ____

# Test for stems of the tree with 'NE' tags
for sent in chunked_sentences:
    for chunk in sent:
        if hasattr(chunk, "label") and ____ == "____":
            print(chunk)
Edit dan Jalankan Kode