1. Creating customer call transcripts
Welcome to the case study chapter! Let's now apply everything we've learned so far to a real-life application.
2. Case study introduction
Imagine you're an AI engineer at DataCamp. The support team wants to trial allowing users to submit support queries as voice messages. They would like you to build a chatbot that interprets these voice messages and provides a spoken response back to the user. Because of DataCamp's international learner base, the system must also support a wide array of languages.
Let's break this down!
3. Case study introduction
The chatbot should take these recordings, transcribe
4. Case study introduction
them into text, detect
5. Case study introduction
the language, translate
6. Case study introduction
it into English, generate
7. Case study introduction
a response, and then reply
8. Case study introduction
with spoken audio in the customer's native language.
It also needs a moderation
9. Case study introduction
system to filter out irrelevant messages and ensure polite responses before sending them back to the user.
That sounds like a complex system, so let's break it down into steps.
10. Case study plan
In this video, we'll focus on getting an accurate English transcript.
First, we'll transcribe the audio into text.
Then, we'll detect the language used and translate it into English.
Finally, we'll refine the translated text to correct any misinterpretations, especially with names and terminology.
Let's get started.
11. Step 1: transcribe audio
We've been given an audio recording in mp3 format. To process it, we first open the file in read-binary mode by passing "rb" as the mode argument.
Next, we make a request to the OpenAI audio endpoint for a transcription, specifying the model and passing our audio file.
12. Step 1: transcribe audio
We extract the transcript using the text attribute and print it. Right away, we see that it's not in English.
13. Step 2: detect language
To determine the language, we send a chat completion request, prompting the model to identify the transcript's language and return only the two-letter language code. We provide a few example codes, just so there's no misinterpretation of what we're requesting, and pass the transcript variable into the prompt using an f-string.
The model returns 'uk', the language code for Ukrainian. Now that we know the language, we can move on to translation.
14. Step 3: translate to English
To translate the text into English, we send another chat completion request. This time, we ask the model to translate the transcript while specifying the detected language.
15. Step 3: translate to English
Reading the translation, we see that the customer is asking for learning recommendations. However, we also notice that some technical terms have been misinterpreted.
16. Step 3: translate to English
The model did not correctly recognize names like DataCamp, or technologies like LangChain and AWS, which could lead to confusion in the response. We need to fix this.
17. Step 4: refining the text
We can send another chat completion request, this time asking the model to refine the translation by fixing the terminology. We pass the translated text again and let the model adjust it.
18. Step 4: refining the text
After printing the corrected text, we see that the names have been properly recognized. Now, the customer's request is clear and ready for processing.
19. Recap
Let's recap what we've done. We transcribed the audio to extract the customer's message, detected the language, and translated the text into English. Finally, we refined the output to correct any misunderstandings related to names and terminology. In total, we made four requests to the OpenAI API to complete these steps.
20. Time for practice!
Now it's your turn!