Splitting stereo audio to mono with PyDub
If you're trying to transcribe phone calls, there's a chance they've been recorded in stereo format, with one speaker on each channel.
As you've seen, it's hard to transcribe an audio file with more than one speaker. One solution is to split the multi-speaker audio file into separate files, each containing a single speaker.
PyDub's split_to_mono() method can help with this. When called on an AudioSegment recorded in stereo, it returns a list of two separate mono AudioSegments, one for each channel.
In this exercise, you'll practice this by splitting a stereo phone call recording (stereo_phone_call.wav) into channel 1 and channel 2. This separates the two speakers, allowing for easier transcription.
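To see why splitting by channel separates the speakers, it may help to look at raw samples. Stereo PCM audio is commonly stored interleaved, alternating left and right samples. The sketch below uses only the standard library and a hypothetical helper, split_interleaved, to illustrate the idea. It is not PyDub's implementation, and the sample values are made up.

```python
from array import array


def split_interleaved(samples, n_channels=2):
    """Return one sample array per channel from interleaved samples."""
    # Every n-th sample, starting at offset i, belongs to channel i.
    return [samples[i::n_channels] for i in range(n_channels)]


# Interleaved 16-bit stereo samples: L0, R0, L1, R1, L2, R2
interleaved = array("h", [10, -10, 20, -20, 30, -30])

left, right = split_interleaved(interleaved)
print(list(left))   # [10, 20, 30]
print(list(right))  # [-10, -20, -30]
```

If each speaker was recorded on a separate channel, each resulting array holds only that speaker's samples.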
This is a part of the course “Spoken Language Processing in Python”
Exercise instructions
- Import AudioSegment from pydub.
- Create an AudioSegment instance stereo_phone_call with stereo_phone_call.wav.
- Split stereo_phone_call into channels using split_to_mono() and check the channels of the resulting output.
- Save each channel to new variables, phone_call_channel_1 and phone_call_channel_2.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import AudioSegment
from ____ import ____
# Import stereo audio file and check channels
stereo_phone_call = AudioSegment.from_file(____)
print(f"Stereo number channels: {stereo_phone_call.channels}")
# Split stereo phone call and check channels
channels = stereo_phone_call.____
print(f"Split number channels: {channels[0].____}, {channels[1].____}")
# Save new channels separately
phone_call_channel_1 = channels[0]
phone_call_channel_2 = ____
This exercise is part of the course
Spoken Language Processing in Python
Learn how to load, transform, and transcribe speech from raw audio files in Python.
Not all audio files come in the same shape, size or format. Luckily, the PyDub library by James Robert provides tools for programmatically changing audio file attributes such as frame rate, number of channels, file format and more. In this chapter, you'll learn how to use this library to ensure all of your audio files are in the right shape for transcription.
Exercise 1: Introduction to PyDub
Exercise 2: Import an audio file with PyDub
Exercise 3: Play an audio file with PyDub
Exercise 4: Audio parameters with PyDub
Exercise 5: Adjusting audio parameters
Exercise 6: Manipulating audio files with PyDub
Exercise 7: Turning it down... then up
Exercise 8: Normalizing an audio file with PyDub
Exercise 9: Chopping and changing audio files
Exercise 10: Splitting stereo audio to mono with PyDub
Exercise 11: Converting and saving audio files with PyDub
Exercise 12: Exporting and reformatting audio files
Exercise 13: Manipulating multiple audio files with PyDub
Exercise 14: An audio processing workflow