Inspecting datasets
Datasets on Hugging Face vary widely in size, structure, and features, making it important to inspect their metadata before loading them into your environment.
Let’s explore the "MMLU-Pro" dataset, a benchmark of about 12K multiple-choice questions with answers, covering subjects such as Math and Computer Science, to understand its metadata, including its size and features.
Note: this exercise may take a minute due to the dataset size.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import the function to load dataset metadata
from ____ import load_dataset_builder
# Initialize the dataset builder for the MMLU-Pro dataset
reviews_builder = ____("TIGER-Lab/MMLU-Pro")
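Once the builder exists, its metadata is available through the `info` attribute without downloading the full dataset. The sketch below is one possible way to summarize it, assuming `info` exposes `dataset_size` (in bytes, possibly `None`) and `features` (a mapping of column names to types), as `DatasetInfo` objects in the `datasets` library do. The helper name `summarize_info` and the stand-in values are hypothetical and only there so the example runs offline.

```python
from types import SimpleNamespace


def summarize_info(info):
    """Collect a few metadata fields from a DatasetInfo-like object."""
    size_bytes = getattr(info, "dataset_size", None)
    features = getattr(info, "features", None)
    return {
        # dataset_size is reported in bytes and may be None if unknown
        "size_mb": None if size_bytes is None else round(size_bytes / 1024**2, 2),
        # features maps column names to their types; keep only the names
        "columns": sorted(features) if features else [],
    }


# Hypothetical stand-in values (not the real MMLU-Pro metadata),
# used only so the sketch runs without a network connection.
demo_info = SimpleNamespace(
    dataset_size=5 * 1024**2,
    features={"question": "string", "options": "list", "answer": "string"},
)
demo_summary = summarize_info(demo_info)
print(demo_summary)

# With the builder from the exercise, the call would look like:
# summarize_info(reviews_builder.info)
```

Converting bytes to megabytes makes the size easier to judge before deciding whether to load the dataset in full.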