Splitting stereo audio to mono with PyDub
If you're trying to transcribe phone calls, there's a chance they've been recorded in stereo format, with one speaker on each channel.
As you've seen, it's hard to transcribe an audio file with more than one speaker. One solution is to split the multi-speaker file into separate files, one per speaker.
PyDub's split_to_mono() function can help with this. When called on an AudioSegment recorded in stereo, it returns a list of two separate AudioSegments in mono format, one for each channel.
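To build some intuition for what splitting by channel means, here is a minimal sketch using only Python's standard library. It assumes uncompressed PCM audio, where stereo samples are stored interleaved (left, right, left, right, ...). The helper split_interleaved and the sample values are hypothetical and only illustrate the idea; this is not PyDub's own code.

```python
from array import array

def split_interleaved(samples, n_channels=2):
    """Return one array per channel from interleaved PCM samples."""
    # Channel i owns every n_channels-th sample, starting at offset i.
    return [samples[i::n_channels] for i in range(n_channels)]

# Hypothetical 16-bit stereo data: (left, right) pairs stored back to back.
stereo_samples = array("h", [10, -10, 20, -20, 30, -30])

left, right = split_interleaved(stereo_samples)
print(left.tolist())   # [10, 20, 30]
print(right.tolist())  # [-10, -20, -30]
```

Each resulting array holds the samples of a single channel, which is the same shape of result that split_to_mono() hands back as two mono AudioSegments.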
In this exercise, you'll practice this by splitting a stereo phone call recording (stereo_phone_call.wav) into channel 1 and channel 2. This separates the two speakers, allowing for easier transcription.
This is a part of the course “Spoken Language Processing in Python”
Exercise instructions
- Import AudioSegment from pydub.
- Create an AudioSegment instance stereo_phone_call with stereo_phone_call.wav.
- Split stereo_phone_call into channels using split_to_mono() and check the channels of the resulting output.
- Save each channel to new variables, phone_call_channel_1 and phone_call_channel_2.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Import AudioSegment
from ____ import ____
# Import stereo audio file and check channels
stereo_phone_call = AudioSegment.from_file(____)
print(f"Stereo number channels: {stereo_phone_call.channels}")
# Split stereo phone call and check channels
channels = stereo_phone_call.____
print(f"Split number channels: {channels[0].____}, {channels[1].____}")
# Save new channels separately
phone_call_channel_1 = channels[0]
phone_call_channel_2 = ____