Preprocess audio datasets
You're enhancing your precision agriculture application by enabling farmers to control their machinery with voice commands. The system should recognize keywords in commands like "Turn on the sprinkler irrigation system."
You'll leverage a keyword spotting dataset with audio clips of keywords like "on." Preprocess the audio files so they can be used with a pre-trained Transformer model!
Some data has been pre-loaded:
- dataset contains a sample training dataset of audio files. It already contains the train split, so you don't need to specify train when using dataset.
- AutoFeatureExtractor has been imported from transformers.
- model is equal to facebook/wav2vec2-base.
- max_duration is defined as 1 second.
This exercise is part of the course
Efficient AI Model Training with PyTorch
Exercise instructions
- Load a pre-trained feature_extractor with the AutoFeatureExtractor class.
- Set the sampling_rate using the rates from the feature_extractor.
- Set the max_length of the audio_arrays using max_duration.
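The max_length in the last instruction is counted in audio samples, not seconds, so it is the sampling rate multiplied by the duration. The sketch below illustrates that arithmetic in plain Python. It assumes a 16 kHz sampling rate (the rate facebook/wav2vec2-base was trained on); the helper names max_length_in_samples and truncate are hypothetical and are not part of transformers.

```python
def max_length_in_samples(sampling_rate: int, max_duration: float) -> int:
    """Convert a duration in seconds to a number of audio samples."""
    return int(sampling_rate * max_duration)


def truncate(audio_array: list, max_length: int) -> list:
    """Keep at most max_length samples from the start of the clip."""
    return audio_array[:max_length]


# Assumed values: 16 kHz sampling rate and a 1-second limit.
sampling_rate = 16_000
max_duration = 1.0
max_length = max_length_in_samples(sampling_rate, max_duration)

long_clip = [0.0] * 24_000   # 1.5 seconds of silence at 16 kHz
short_clip = [0.0] * 8_000   # 0.5 seconds of silence at 16 kHz

print(max_length)
print(len(truncate(long_clip, max_length)))
print(len(truncate(short_clip, max_length)))
```

Truncation only shortens clips that exceed the limit. Shorter clips keep their original length unless padding is requested separately.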
Hands-on interactive exercise
Try this exercise by completing the sample code below.
# Load a pre-trained feature extractor
feature_extractor = ____.____(model)
def preprocess_function(examples):
    audio_arrays = [x["array"] for x in examples["audio"]]
    inputs = feature_extractor(
        audio_arrays,
        # Set the sampling rate
        sampling_rate=____.____,
        # Set the max length
        max_length=int(feature_extractor.sampling_rate * max_duration),
        truncation=True)
    return inputs
encoded_dataset = dataset.map(preprocess_function, remove_columns=["audio", "file"], batched=True)
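With batched=True, the mapped function receives a dictionary of columns (each a list with one entry per example) and returns a dictionary of new columns, while remove_columns drops the listed inputs from the result. The sketch below mimics that data flow in plain Python so the shapes are easy to inspect. It is not the exercise solution: fake_feature_extractor is a hypothetical stand-in for the real feature extractor, and the label column, file names, and sample values are invented for illustration.

```python
SAMPLING_RATE = 16_000  # assumed rate, in Hz
MAX_DURATION = 1.0      # seconds


def fake_feature_extractor(audio_arrays, sampling_rate, max_length, truncation):
    """Hypothetical stand-in: truncates each clip and returns one new column."""
    if truncation:
        audio_arrays = [a[:max_length] for a in audio_arrays]
    return {"input_values": audio_arrays}


def preprocess_function(examples):
    # examples["audio"] is a list of dicts, one per clip in the batch
    audio_arrays = [x["array"] for x in examples["audio"]]
    return fake_feature_extractor(
        audio_arrays,
        sampling_rate=SAMPLING_RATE,
        max_length=int(SAMPLING_RATE * MAX_DURATION),
        truncation=True,
    )


# A batch as a batched map would pass it: one list per column.
batch = {
    "audio": [
        {"array": [0.1] * 20_000, "sampling_rate": SAMPLING_RATE},
        {"array": [0.2] * 12_000, "sampling_rate": SAMPLING_RATE},
    ],
    "file": ["a.wav", "b.wav"],  # invented file names
    "label": [0, 1],             # invented label column
}

outputs = preprocess_function(batch)

# Mimic remove_columns=["audio", "file"]: drop inputs, keep the rest, add outputs.
encoded = {k: v for k, v in batch.items() if k not in ("audio", "file")}
encoded.update(outputs)

print(sorted(encoded))
print([len(a) for a in encoded["input_values"]])
```

The raw audio and file columns are removed because the model consumes only the extracted features, and keeping the original arrays would duplicate the data in the encoded dataset.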