Zero-shot classification
Zero-shot classification is a transformer's ability to predict a label from a new set of classes that it wasn't originally trained to identify. This is possible thanks to its transfer learning capabilities, and it can be an extremely valuable tool.
Hugging Face's `pipeline()` also has a zero-shot-classification task. These pipelines require both an input text and candidate labels.
Build a zero-shot classifier to predict the label for the input `text`, a news headline that has been loaded for you.

`pipeline` from the `transformers` library is already loaded for you. Note that we are using our own version of the pipeline function so that you can learn how to use these functions without having to download the model.

This is a part of the course "Working with Hugging Face".
Exercise instructions
- Build the pipeline for a zero-shot-classification task and save as `classifier`.
- Create a list of the labels - "politics", "science", "sports" - and save as `candidate_labels`.
- Predict the label of `text` using the classifier and candidate labels.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Build the zero-shot classifier
____ = pipeline(____="zero-shot-classification", ____="facebook/bart-large-mnli")
# Create the list
candidate_labels = ["politics", "____", "____"]
# Predict the output
output = ____(____, ____)
print(f"Top Label: {output['labels'][0]} with score: {output['scores'][0]}")
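For reference, a completed version of the sample code might look like the sketch below. This is not the graded solution: the headline assigned to `text` is a placeholder of our own, since the exercise loads its own `text` for you, and it assumes the standard `transformers` pipeline API with the `facebook/bart-large-mnli` checkpoint named above.

```python
from transformers import pipeline

# Build the zero-shot classifier
classifier = pipeline(task="zero-shot-classification", model="facebook/bart-large-mnli")

# Create the list of candidate labels
candidate_labels = ["politics", "science", "sports"]

# Placeholder headline; in the exercise, `text` is loaded for you
text = "Government announces new funding for renewable energy research."

# Predict the output; labels and scores are returned sorted from most
# to least likely, so index 0 is the top prediction
output = classifier(text, candidate_labels)
print(f"Top Label: {output['labels'][0]} with score: {output['scores'][0]}")
```

Note that the classifier returns a dictionary whose `labels` and `scores` lists are aligned and sorted by score, which is why indexing both at `[0]` gives the top label and its confidence.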