Sequence generation tasks

1. Sequence generation tasks

Now, let's take a look at how pipelines simplify the otherwise complex task of sequence generation.

2. Sequence generation

Sequence generation is the process of producing new text based on a given input. It goes beyond labeling or extracting; it creates something new from the source. This includes tasks like text summarization, text translation, and language modeling. Let's go through each one.

3. Text summarization

Text summarization condenses a long document into a shorter, more digestible version, without losing the essential meaning. It helps highlight the most important points, removing redundancy while preserving the key message. This is especially useful when dealing with lengthy news articles, research papers, reports, emails, or any situation where you need to get the gist quickly without reading everything word for word.

4. Text summarization pipeline

To perform text summarization, we use the pipeline function with the task set to "summarization" and a suitable model. Suppose we have a long input text, such as a descriptive paragraph about the Amazon rainforest. We pass this text to the summarization pipeline. The output is a list containing a dictionary with a "summary_text" field that holds the condensed version of the original passage. In this example, the model highlights the key points: the Amazon's biodiversity, its location, and its role in climate regulation, allowing readers to grasp the essence of the text more quickly.
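The steps just described can be sketched roughly as follows. This is a minimal sketch, not the exact code from the slides: the model name "sshleifer/distilbart-cnn-12-6" is an assumption (any summarization model should work), and the helper names build_summarizer and summarize are hypothetical, introduced only for illustration.

```python
def build_summarizer(model_name="sshleifer/distilbart-cnn-12-6"):
    """Create a summarization pipeline.

    The model name is an assumed example; it is downloaded on first use.
    """
    # Imported lazily so that summarize() below works with any compatible callable.
    from transformers import pipeline

    return pipeline(task="summarization", model=model_name)


def summarize(summarizer, text):
    """Run the pipeline on one text and return only the summary string."""
    outputs = summarizer(text)  # a list with one dictionary per input text
    return outputs[0]["summary_text"]
```

With the transformers library installed, calling summarize(build_summarizer(), long_text) on a passage such as the Amazon rainforest paragraph should return the condensed version as a plain string.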

5. Text translation

Next, let's look at text translation. As the name suggests, translation allows us to convert text from one language to another, which is crucial in multilingual applications like international websites or customer support tools.

6. Text translation pipeline

To create a translation pipeline, we set the task to "translation" and pick a model suited to the desired language pair. Here, we use a model that translates from English to French; other language pairs need other models. We define an English sentence and pass it to the pipeline. The output is a list containing a dictionary with the key "translation_text", which holds the translated sentence in French.
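As a rough sketch of the translation steps, the code could look like this. The model name "Helsinki-NLP/opus-mt-en-fr" is an assumption (one publicly available English-to-French model, which typically also needs the sentencepiece package installed), and the helper names build_translator and translate are hypothetical.

```python
def build_translator(model_name="Helsinki-NLP/opus-mt-en-fr"):
    """Create an English-to-French translation pipeline.

    The model name is an assumed example; pick another model for other language pairs.
    """
    # Imported lazily so that translate() below works with any compatible callable.
    from transformers import pipeline

    return pipeline(task="translation", model=model_name)


def translate(translator, sentence):
    """Translate one sentence and return only the translated string."""
    outputs = translator(sentence)  # a list with one dictionary per input sentence
    return outputs[0]["translation_text"]
```

With the library installed, translate(build_translator(), "Where is the train station?") should return the French sentence as a string.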

7. Language modeling

Finally, let's explore language modeling. Here, the goal is to predict and generate the next words in a sequence based on a given prompt. This is the basis of many NLP applications like autocompletion, story generation, or even chatbot replies.

8. Language modeling pipeline

We use a text-generation pipeline with a suitable model to generate text continuations. We start by defining a prompt, in this case "Once upon a time", as the opening of a story. We pass it to the pipeline, setting max_length=30 to limit the total number of tokens in each output (the prompt included) and num_return_sequences=3 to generate three different completions. The result is a list of three dictionaries, each containing a continuation under the key "generated_text", showing how the model can generate multiple, creative variations from the same prompt.
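A minimal sketch of this text-generation step might look like the following. The model name "distilgpt2" and the helper names build_generator and generate_continuations are assumptions for illustration. The do_sample=True argument is also an assumption: sampling is what lets the three completions differ, and some models may already enable it by default.

```python
def build_generator(model_name="distilgpt2"):
    """Create a text-generation pipeline.

    The model name is an assumed example; it is downloaded on first use.
    """
    # Imported lazily so that generate_continuations() works with any compatible callable.
    from transformers import pipeline

    return pipeline(task="text-generation", model=model_name)


def generate_continuations(generator, prompt, max_length=30, num_return_sequences=3):
    """Return the generated texts for one prompt as a list of strings."""
    outputs = generator(
        prompt,
        max_length=max_length,  # cap on total tokens, prompt included
        num_return_sequences=num_return_sequences,
        do_sample=True,  # assumption: sample so that the completions can vary
    )
    # One dictionary per returned sequence, each with a "generated_text" key.
    return [output["generated_text"] for output in outputs]
```

With the library installed, generate_continuations(build_generator(), "Once upon a time") should return three strings that each begin with the prompt. Because sampling is random, the continuations change from run to run.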

9. Let's practice!

Time to practice!
