Tokenizing the Gettysburg Address

In this exercise, you will be tokenizing one of the most famous speeches of all time: the Gettysburg Address delivered by American President Abraham Lincoln during the American Civil War.

The entire speech is available as a string named gettysburg.

This exercise is part of the course

Feature Engineering for NLP in Python

Exercise instructions

  • Load the en_core_web_sm model using spacy.load().
  • Create a Doc object doc for the gettysburg string.
  • Using a list comprehension, loop over doc to generate the token texts.

Hands-on interactive exercise

Try this exercise by completing the sample code.

import spacy

# Load the en_core_web_sm model
nlp = ____.____(____)

# Create a Doc object
doc = ____(____)

# Generate the tokens
tokens = [token.____ for token in ____]
print(tokens)
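
For reference, one possible way to fill in the blanks is sketched below. It is not an official solution. The exercise environment preloads the full speech as gettysburg, so here a one-sentence excerpt stands in for it (an assumption made only to keep the sketch self-contained). The fallback to spacy.blank("en") is also an addition that is not part of the exercise: it is used only when the en_core_web_sm model is not installed, and it provides a tokenizer without the model's other pipeline components.

```python
import spacy

# Assumption: a short excerpt stands in for the full `gettysburg` string
# that the exercise environment provides.
gettysburg = (
    "Four score and seven years ago our fathers brought forth on this "
    "continent, a new nation, conceived in Liberty, and dedicated to the "
    "proposition that all men are created equal."
)

# Load the en_core_web_sm model
try:
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Not part of the exercise: fall back to a blank English pipeline
    # (tokenizer only) when the model is not installed.
    nlp = spacy.blank("en")

# Create a Doc object
doc = nlp(gettysburg)

# Generate the tokens
tokens = [token.text for token in doc]
print(tokens)
```

Iterating over a Doc yields Token objects, and token.text is the verbatim text of each one. Punctuation such as commas and the final period comes out as separate tokens rather than staying attached to the preceding word.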