
Multiple Speakers 2

Distinguishing between multiple speakers in one audio file is called speaker diarization. However, as you've seen, the free function we've been using, recognize_google(), can't tell different speakers apart in its transcript.

One way around this, without using one of the paid speech-to-text services, is to ensure each of your audio files contains a single speaker.

This means if you were working with phone call data, you would make sure the caller and receiver are recorded separately. Then you could transcribe each file individually.
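As an illustration of working with separately recorded speakers, here is a minimal sketch for one specific situation: a call saved as a single 16-bit stereo WAV file with the caller on the left channel and the receiver on the right. That recording layout is an assumption made for this example, not something the exercise states, and split_stereo() and the file names are hypothetical. It uses only Python's standard wave and array modules.

```python
import array
import wave


def split_stereo(stereo_path, left_path, right_path):
    """Write the two channels of a 16-bit stereo WAV to two mono WAV files.

    Assumes (hypothetically) that each speaker was recorded on their own channel.
    """
    with wave.open(stereo_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo WAV file")
        frame_rate = src.getframerate()
        # 16-bit samples are stored interleaved: left, right, left, right, ...
        samples = array.array("h", src.readframes(src.getnframes()))

    channels = ((left_path, samples[0::2]), (right_path, samples[1::2]))
    for path, channel in channels:
        with wave.open(path, "wb") as dst:
            dst.setnchannels(1)           # mono output
            dst.setsampwidth(2)           # 2 bytes = 16-bit samples
            dst.setframerate(frame_rate)  # keep the original sample rate
            dst.writeframes(channel.tobytes())
```

Each resulting mono file could then be transcribed on its own, for example split_stereo("call.wav", "caller.wav", "receiver.wav") with those placeholder names. If both voices were captured on the same channel, this approach would not separate them.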

In this exercise, we'll transcribe each speaker from our multiple-speaker audio individually, using one audio file per speaker.

Exercise instructions

  • Pass speakers to the enumerate() function to loop through the different speakers.
  • Call record() on recognizer to convert the AudioFiles into AudioData.
  • Use recognize_google() to transcribe each of the speaker_audio objects.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import speech_recognition as sr

recognizer = sr.Recognizer()

# Multiple speakers on different files
speakers = [sr.AudioFile("speaker_0.wav"),
            sr.AudioFile("speaker_1.wav"),
            sr.AudioFile("speaker_2.wav")]

# Transcribe each speaker individually
for i, speaker in enumerate(____):
    with speaker as source:
        speaker_audio = recognizer.____(source)
    print(f"Text from speaker {i}:")
    print(recognizer.____(____,
                          language="en-US"))
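For reference, one possible way to fill in the blanks is sketched below. It follows the three instructions (enumerate() over speakers, record() to get AudioData, recognize_google() to transcribe) but wraps the loop in a helper, transcribe_speakers(), which is a name introduced here for illustration and is not part of the exercise or the library. main() assumes the SpeechRecognition package is installed, that the three speaker_N.wav files exist, and that a network connection is available for recognize_google().

```python
def transcribe_speakers(recognizer, speakers, language="en-US"):
    """Transcribe each single-speaker audio file and return (index, text) pairs.

    `recognizer` is expected to behave like sr.Recognizer and each item in
    `speakers` like sr.AudioFile (hypothetical helper, not a library function).
    """
    transcripts = []
    for i, speaker in enumerate(speakers):
        # Opening the AudioFile as a context manager gives a readable source
        with speaker as source:
            # record() converts the AudioFile source into AudioData
            speaker_audio = recognizer.record(source)
        text = recognizer.recognize_google(speaker_audio, language=language)
        transcripts.append((i, text))
    return transcripts


def main():
    # Imported here so the helper above can be used without the package
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    speakers = [sr.AudioFile(f"speaker_{n}.wav") for n in range(3)]
    for i, text in transcribe_speakers(recognizer, speakers):
        print(f"Text from speaker {i}:")
        print(text)


# main()  # uncomment to run against the real audio files
```

Returning the transcripts instead of printing inside the loop is a design choice for this sketch: it keeps each speaker's text paired with their index so it can be reused later.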

This exercise is part of the course

Spoken Language Processing in Python

Skill level: Advanced

Learn how to load, transform, and transcribe speech from raw audio files in Python.

Speech recognition is still far from perfect, but the SpeechRecognition library provides an easy way to interact with many speech-to-text APIs. In this section, you'll learn how to use it to start converting the spoken language in your audio files to text.

Exercise 1: SpeechRecognition Python library
Exercise 2: Pick the wrong speech_recognition API
Exercise 3: Using the SpeechRecognition library
Exercise 4: Using the Recognizer class
Exercise 5: Reading audio files with SpeechRecognition
Exercise 6: From AudioFile to AudioData
Exercise 7: Recording the audio we need
Exercise 8: Dealing with different kinds of audio
Exercise 9: Different kinds of audio
Exercise 10: Multiple Speakers 1
Exercise 11: Multiple Speakers 2
Exercise 12: Working with noisy audio
