Unique data

1. Unique data

Next, we'll look at how to find unique values in our data.

2. Business questions

So far, we've answered “what” and “where” questions, such as “What are our product names?” and “Where are our customers from?”. We're now going to work towards questions like “How many products do we have?” and “How many countries do we work with?”.

3. An impractical approach

Let's look at our products table. In previous queries, we saw a few products in the computing category, but we couldn't see what other categories exist. We could ask to see everything and scroll through the data, but scrolling through hundreds of records isn't practical.

4. Introducing DISTINCT: implied prompt

Instead, we might ask something like, “What categories are in the products table?”. Even though we didn't explicitly ask for a summary, the prompt suggests we want an overview. The AI assistant may interpret this as a request for a summarized list that shows each category only once. This kind of summarization is useful when we want a clean list of values, and the AI assistant may use a SQL keyword called DISTINCT to make that happen.
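To make this concrete, here is a minimal sketch of the kind of query the AI assistant might generate, run against a small in-memory SQLite database. The column names `product_name` and `category` and the sample rows are assumptions made up for illustration; the real products table will differ.

```python
import sqlite3

# Hypothetical sample data: column names and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("Laptop", "Computing"),
        ("Monitor", "Computing"),
        ("Blender", "Kitchen Appliances"),
        ("Toaster", "Kitchen Appliances"),
        ("Desk Lamp", "Home Office"),
    ],
)

# Without DISTINCT, every row's category comes back, repeats included.
all_rows = conn.execute("SELECT category FROM products").fetchall()

# DISTINCT collapses repeated values so each category appears once.
query = "SELECT DISTINCT category FROM products ORDER BY category"
categories = [row[0] for row in conn.execute(query)]

print(len(all_rows), "rows,", len(categories), "distinct categories")
print(categories)
```

The ORDER BY clause is optional; it is included here only so the list comes back in a predictable order.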

5. Introducing DISTINCT: direct prompt

To avoid any assumptions, we can be more specific with a prompt like, "What different product categories do we sell?". This kind of prompt tells the AI assistant we want a summary.

6. Unique pairs

Once we can see unique values, we might want to explore relationships between different fields. What if we want to understand how categories relate to specific products? We can ask something like, "What unique combinations of category and product name do we have?". The AI will generate a SQL query that uses DISTINCT only once, but the keyword applies to both fields together, showing us each unique pairing. At first glance, the result set may appear to contain duplicates, but each row is a unique pair. "Kitchen Appliances" appears multiple times, but with a different product each time.
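A short sketch of how DISTINCT behaves across two fields, again using an in-memory SQLite database. The column names and rows are hypothetical, and one row is deliberately repeated so we can see it collapse.

```python
import sqlite3

# Hypothetical sample data: column names and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [
        ("Blender", "Kitchen Appliances"),
        ("Blender", "Kitchen Appliances"),  # exact duplicate of the row above
        ("Toaster", "Kitchen Appliances"),
        ("Laptop", "Computing"),
    ],
)

# DISTINCT is written once but applies to the (category, product_name) pair.
pair_query = """
    SELECT DISTINCT category, product_name
    FROM products
    ORDER BY category, product_name
"""
pairs = conn.execute(pair_query).fetchall()

for category, product_name in pairs:
    print(category, "|", product_name)
```

The duplicated Blender row appears once in the result, while "Kitchen Appliances" still appears twice because it is paired with two different products.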

7. In industry

This same approach works across different industries and databases. An HR director asking "What job titles exist in our company?" sees each role listed once, not hundreds of duplicate entries. A retail manager asking "What suppliers do we work with in each region?" gets a clean list without repetition.

8. The DISTINCT tool

DISTINCT answers some of the foundational questions every data project starts with. When working with AI, words like "unique," "different," or "distinct" can help signal that we want to see each value only once. But as we saw earlier, even prompts like “What categories are in the products table?” can lead to the same outcome. Being intentional with how we phrase prompts gives us more control over the AI assistant's output.

9. Let's practice!

In our next video, we'll see how to count what we find. But for now, let's practice.