
Uniqueness and transformations

1. Uniqueness and transformations

Welcome back! As we look to wrap up our first workflow, we will examine three tools from three different toolsets.

2. Group Spotify playlist

Imagine you have a joint playlist on Spotify with your friends. You all add songs for each other to listen to. On some occasions, people add the same song multiple times! That can really disrupt your listening enjoyment, so you go through the playlist and remove duplicate songs.

3. Removing duplicates

In data preparation, this is the same as removing duplicates in your dataset. However, it is important to understand what duplicates are in the context of your dataset.

4. Removing duplicates

Let's take this sample data from an online grocery store. We can expect repeated values in some columns, because only a limited set of values is possible.

5. Removing duplicates

For example, we could have multiple customers from the same country, or with the same name or age. In these cases, we want to keep the duplicate values.

6. Removing duplicates

The duplicates we want to remove are those in unique identifier columns. For example, the customer ID should be unique to each customer, but we can see that 101 appears twice.

7. Removing duplicates

Those two rows are identical, so we would want to remove one of them to ensure our raw data is free of duplicates.
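To make the idea concrete outside of any particular tool, here is a minimal Python sketch of removing duplicates on an identifier column. The rows, the column names, and the `split_unique` helper are all invented for illustration; they are not the course's actual dataset. The sketch keeps the first row for each customer ID and sets the repeats aside, while leaving repeated countries, names, and ages untouched.

```python
# Hypothetical rows resembling the grocery-store sample; all values are invented.
customers = [
    {"customer_id": 101, "name": "Ana", "country": "Spain", "age": 34},
    {"customer_id": 102, "name": "Ben", "country": "France", "age": 29},
    {"customer_id": 103, "name": "Ana", "country": "Spain", "age": 41},
    {"customer_id": 101, "name": "Ana", "country": "Spain", "age": 34},
    {"customer_id": 104, "name": "Cy", "country": "France", "age": 29},
]


def split_unique(rows, key):
    """Return (unique, duplicates): the first row per key value, then the repeats."""
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        if row[key] in seen:
            # This identifier has already appeared, so the row is a duplicate.
            duplicates.append(row)
        else:
            seen.add(row[key])
            unique.append(row)
    return unique, duplicates


# Deduplicate on the identifier column only, not on country, name, or age.
unique_rows, duplicate_rows = split_unique(customers, "customer_id")
```

Choosing `customer_id` as the key is the important decision here: deduplicating on `country` or `age` instead would wrongly discard legitimate customers.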

8. Unique tool

In Alteryx Designer, duplicates are removed using the Unique tool, which is found in the Preparation toolset.

9. Unique tool

By removing duplicates, we can enhance the data quality of our overall dataset and improve the accuracy of our analyses. Duplicates can skew or distort any results or insights we eventually deliver, so it is always best to remove them early.

10. Aggregating data

Data aggregation involves summarizing large volumes of data into a simplified, condensed format. This is done using aggregation operators such as counting, averaging, summing, or finding the minimum and maximum values to derive meaningful insights. The key purpose of data aggregation is to reduce complexity and make the data more accessible and understandable for analysis, reporting, or decision-making.

11. Aggregating data

One way to think about it is like gardening. You might prune plants to remove unnecessary parts and remove weeds to encourage healthy growth and blooming. Summarizing data is similar; it involves trimming down the dataset to its most essential elements, removing redundancies, and organizing it to best support insights and decisions.

12. Summarize tool

In Alteryx Designer, the Summarize tool lets us aggregate our data. We can choose which columns to group the data by and which to aggregate, using the functions available. Besides basic aggregate operators such as sum and count, Alteryx Designer offers operators for numeric, string, and spatial data. Even financial calculations can be carried out on the data.
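The group-then-aggregate pattern described above can be sketched in plain Python. This is only an illustration of the concept, not how the Summarize tool works internally; the `orders` rows and their column names are invented. The sketch groups by country and then computes count, sum, average, minimum, and maximum for each group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical order rows; column names and values are invented for illustration.
orders = [
    {"country": "Spain", "amount": 20.0},
    {"country": "France", "amount": 15.0},
    {"country": "Spain", "amount": 30.0},
    {"country": "France", "amount": 5.0},
    {"country": "Spain", "amount": 10.0},
]

# Group: collect the order amounts for each country.
amounts_by_country = defaultdict(list)
for order in orders:
    amounts_by_country[order["country"]].append(order["amount"])

# Aggregate: reduce each group to one summary row.
summary = {
    country: {
        "count": len(amounts),
        "sum": sum(amounts),
        "avg": mean(amounts),
        "min": min(amounts),
        "max": max(amounts),
    }
    for country, amounts in amounts_by_country.items()
}
```

Five input rows become two summary rows, one per country, which is the reduction in complexity that aggregation is meant to deliver.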

13. DC High School bulletin board

The bulletin board in the main hall of DC High School is a great source of information for students and teachers. There are regular updates about school events and clubs, information about important safety procedures, and even notices from the janitors.

14. Comment tool

Like the bulletin board, Alteryx Designer has a tool allowing you to document and communicate important information within your workflow. The Comment tool allows the user to add comments and descriptions within the workflow.

15. Comment tool

With this tool, you can document important information about the steps in the workflow. When you share files and workflows, the Comment tool helps other users understand which processes have been included and why.

16. Let's practice!

Time to get back to some exercises!
