Summarizing long text
Summarization condenses long text into a shorter version, helping readers quickly grasp the key points of lengthy articles or documents.
There are two main types: extractive, which selects key sentences from the original text, and abstractive, which generates new sentences summarizing main ideas.
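To make the distinction concrete, here is a minimal sketch of the extractive approach, assuming a simple word-frequency scoring rule. The function name extractive_summary, the scoring rule, and the sample text are illustrative assumptions, not part of the course material. An abstractive model, by contrast, generates new wording rather than copying sentences.

```python
import re
from collections import Counter


def extractive_summary(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order.

    Hypothetical helper: scores each sentence by the average frequency
    of its words across the whole text.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Count how often each lowercase word appears in the full text.
    freqs = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    # Pick the top-scoring sentences, then emit them in document order.
    top = sorted(sentences, key=score, reverse=True)[:num_sentences]
    return " ".join(s for s in sentences if s in top)


# Made-up sample text, for illustration only.
sample = (
    "Greece is a country in southeastern Europe. "
    "Greece has thousands of islands. "
    "Tourism is important."
)
print(extractive_summary(sample, num_sentences=1))
```

Because the output is built only from sentences already present in the input, every sentence in an extractive summary can be found verbatim in the original text.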
In this exercise, you’ll create an abstractive summarization pipeline using Hugging Face's pipeline() function and the cnicu/t5-small-booksum model. You’ll summarize text from a Wikipedia page on Greece, comparing the abstractive model's rephrased output to the original.
The pipeline function from the transformers library and the original_text have already been loaded for you.
This exercise is part of the course Working with Hugging Face.
Exercise instructions
- Create the summarization pipeline using the task "summarization" and save as summarizer.
- Use the new pipeline to create a summary of the text and save as summary_text.
- Compare the length of the original and summary text.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the summarization pipeline
summarizer = ____(____="____", model="cnicu/t5-small-booksum")
# Summarize the text
summary_text = ____(original_text)
# Compare the length
print(f"Original text length: {len(original_text)}")
print(f"Summary length: {len(____[0]['____'])}")
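For reference, one way the completed steps might look is sketched below. The helper names build_summarizer and compare_lengths are hypothetical wrappers introduced here for illustration. The sketch assumes the transformers library is installed and that the pipeline returns a list containing one dictionary with a 'summary_text' key, as the template's final line suggests.

```python
def build_summarizer(model_name="cnicu/t5-small-booksum"):
    """Create the summarization pipeline (downloads the model on first use)."""
    # Imported lazily so that compare_lengths can be used without transformers.
    from transformers import pipeline

    return pipeline(task="summarization", model=model_name)


def compare_lengths(original_text, summarizer):
    """Summarize the text and return (original length, summary length) in characters."""
    summary_text = summarizer(original_text)
    # The pipeline output is a list with one dict per input text;
    # the generated summary is stored under the 'summary_text' key.
    summary = summary_text[0]["summary_text"]
    return len(original_text), len(summary)


# Example usage (requires network access to fetch the model):
# summarizer = build_summarizer()
# original_len, summary_len = compare_lengths(original_text, summarizer)
# print(f"Original text length: {original_len}")
# print(f"Summary length: {summary_len}")
```

Both lengths are character counts from len(), not token counts, so they give only a rough sense of how much the model compressed the text.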