Explaining chat-based generative AI models

1. Explaining chat-based generative AI models

In previous videos, we explored the importance of explainability and the techniques used to make predictions more transparent. Now, let's look at how we can understand and explain the responses produced by chat-based generative AI models, like ChatGPT.

2. Chat-based generative AI models

Unlike traditional predictive models, which output a label or a value, chat-based generative models generate text. When presented with a user's message, or prompt, they construct responses word by word, predicting each subsequent word based on the preceding context and a vast knowledge base. But how can we explain the reasoning behind such responses?

3. Chain-of-thought prompt

One powerful approach is the chain-of-thought prompt, a special type of prompt that encourages the model to articulate its reasoning process step by step. To elicit this reasoning, we explicitly ask the model in the prompt to explain itself; it then outlines the intermediate steps leading to its final answer. This helps us understand how the model forms its answers.

4. Creating a chain-of-thought prompt

Suppose we want ChatGPT to determine the number of apples left in a shop after a sequence of transactions, such as selling and receiving apples. We include this scenario in the prompt and ask for a step-by-step explanation. The prompt is processed by a user-defined function called get_response, which handles interactions with ChatGPT. For this course, the specifics of this function aren't essential; we'll focus on using it to obtain the model's response, and it will be provided to you during the exercises. Printing the response, we see the model clarify each step: starting with the initial count, subtracting the apples sold, adding the apples received, and concluding with the final number. This breakdown shows how the model accounts for each transaction before reaching its conclusion, deepening our understanding of its logic.
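As a minimal sketch, here is what this might look like in code. The get_response helper is provided to you in the exercises; the version below is an assumed implementation built on the OpenAI Python client, and the model name and apple counts are purely illustrative.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def get_response(prompt):
    """Send a single-turn prompt to the chat model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Ask for step-by-step reasoning explicitly in the prompt.
prompt = """A shop starts the day with 23 apples.
It sells 15 apples in the morning and receives a delivery of 9 more.
How many apples are left at the end of the day?
Explain your reasoning step by step."""

print(get_response(prompt))
```

Asked this way, the model typically replies with a numbered walkthrough (23 minus 15 leaves 8, plus 9 gives 17) rather than just the final number.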

5. Self-consistency

Another technique is self-consistency, which evaluates the model's confidence in its responses. It involves asking the model to generate multiple responses to the same prompt, allowing us to assess how consistently it produces specific outcomes.

6. Self-consistency in text classification

This is especially useful in tasks like text classification. For instance, if we prompt the model to classify a review as positive or negative and generate five responses instead of one, we can evaluate the model's confidence by aggregating the answers: we compute the proportion of responses that fall into each category.

7. Self-consistency in text classification

For instance, if four out of five responses are positive, the confidence in a positive classification is 0.8, while for negative, it is 0.2.
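As a quick sketch, using a hypothetical list of five collected answers, the proportion calculation looks like this:

```python
# Hypothetical responses matching the four-out-of-five example above.
responses = ["positive", "positive", "positive", "positive", "negative"]

confidence = {
    "positive": responses.count("positive") / len(responses),  # 4/5 = 0.8
    "negative": responses.count("negative") / len(responses),  # 1/5 = 0.2
}
print(confidence)  # {'positive': 0.8, 'negative': 0.2}
```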

8. Creating self-consistency prompts

To implement this, we write a prompt asking the model to categorize the sentiment of a product review, replying only with the sentiment category, either 'positive' or 'negative', to make calculating proportions straightforward later on. We use a for loop to generate multiple responses, collecting each one in a list after converting it to lowercase. We then calculate the proportion of each category across the total responses to determine the confidence for each category.
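Here is a minimal sketch of that loop, reusing the get_response helper from before; the review text and the exact prompt wording are illustrative assumptions:

```python
# Illustrative review; in the exercises the review text is given to you.
review = "The battery lasts all day, but the camera app crashes often."

# Ask for a single-word answer so responses are easy to aggregate.
prompt = f"""Classify the sentiment of the following product review.
Reply with a single word, either 'positive' or 'negative'.

Review: {review}"""

n_responses = 5
responses = []
for _ in range(n_responses):
    # Lowercase each reply so 'Positive' and 'positive' count as the same.
    responses.append(get_response(prompt).lower())

# Confidence for each category is its share of the total responses.
confidence = {
    category: responses.count(category) / n_responses
    for category in ("positive", "negative")
}
print(confidence)
```

Note that sampling variability is what makes self-consistency informative: with the temperature set to 0, the model would likely return the same answer all five times.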

9. Creating self-consistency prompts

Printing the confidence, we see the model is 60% sure the review is positive.

10. Let's practice!

Time to practice!