Summarizing long text
Summarization condenses a long text into a shorter version, helping readers quickly grasp the key points of lengthy articles or documents.
There are two main types: extractive, which selects key sentences from the original text, and abstractive, which generates new sentences summarizing main ideas.
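To make the extractive side of that distinction concrete, here is a minimal sketch of a frequency-based extractive summarizer. It is not part of the exercise and not how the cnicu/t5-small-booksum model works; the function name extractive_summary, the naive regex sentence splitter, and the word-frequency scoring are all illustrative assumptions. It only shows that an extractive method returns sentences copied verbatim from the input, whereas an abstractive model writes new ones.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences=2):
    """Return the highest-scoring sentences of `text`, in their original order.

    Hypothetical helper for illustration: sentences are scored by the
    average frequency of their words across the whole text.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Count how often each lowercase word appears in the full text.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Rank sentence indices by score, keep the top ones, restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Every sentence in the result appears word for word in the input, which is the defining property of extractive summarization. A real extractive system would typically also filter stop words and use a proper sentence tokenizer.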
In this exercise, you’ll create an abstractive summarization pipeline using Hugging Face's pipeline() function and the cnicu/t5-small-booksum model. You’ll summarize text from a Wikipedia page on Greece, comparing the abstractive model's rephrased output to the original.
The pipeline function from the transformers library and the original_text have already been loaded for you.
This exercise is part of the course
Working with Hugging Face
Exercise instructions
- Create the summarization pipeline using the task "summarization" and save as summarizer.
- Use the new pipeline to create a summary of the text and save as summary_text.
- Compare the length of the original and summary text.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create the summarization pipeline
summarizer = ____(____="____", model="cnicu/t5-small-booksum")
# Summarize the text
summary_text = ____(original_text)
# Compare the length
print(f"Original text length: {len(original_text)}")
print(f"Summary length: {len(____[0]['____'])}")
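For reference, the same workflow can be sketched as two small functions. This is one possible way to structure it, not the exercise's official solution: the names build_summarizer and length_report are hypothetical, and the transformers import is done lazily inside build_summarizer so that the reporting helper can be tried with any callable that mimics the pipeline's output. The sketch relies on a summarization pipeline returning a list of dictionaries, each with a "summary_text" key.

```python
def build_summarizer(model_name="cnicu/t5-small-booksum"):
    """Create a summarization pipeline (downloads the model on first use)."""
    # Imported here so that length_report can be used without transformers installed.
    from transformers import pipeline

    return pipeline(task="summarization", model=model_name)


def length_report(original_text, summarizer):
    """Summarize `original_text` and compare character lengths.

    `summarizer` is any callable returning a list of dicts with a
    "summary_text" key, as a summarization pipeline does.
    """
    output = summarizer(original_text)
    summary = output[0]["summary_text"]
    return {
        "summary": summary,
        "original_length": len(original_text),
        "summary_length": len(summary),
        # Fraction of the original length kept by the summary (0.0 for empty input).
        "compression_ratio": len(summary) / len(original_text) if original_text else 0.0,
    }
```

With the exercise's variables in scope, a call such as length_report(original_text, build_summarizer()) would return both lengths and the summary in one dictionary. Note that len() counts characters, not words or tokens.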