OLAP: CUBE operator
1. OLAP: CUBE operator
We will now zoom into business intelligence and introduce on-line analytical processing (or in short OLAP) as key techniques to gain insight on your data.2. Introduction to OLAP
The idea behind OLAP is to interactively analyze the data and to summarize and visualize it. Various OLAP operators can be used to interactively analyze the data and look for interesting patterns. Data is often aggregated to get a better overview. For example, we can count how many movies each customer rented, or we can look at the average rating of movies for each genre and each country. To facilitate the data aggregation, we will describe and illustrate the CUBE, ROLLUP and GROUPING SETS operators. In this chapter, we will discuss how to generate information presentable in a Pivot table with SQL. Pivot tables are a popular tool to represent different levels of aggregation for business reports. Let's first have a closer look at how they are constructed.3. Table rentings_extended
In a pivot table we show aggregated data. To keep the demonstration as simple as possible we will use this table to demonstrate the aggregation levels. We call it rentings_extended, because it has an additional column for the country of the customer. It only holds records of customers from Austria or Belgium for Drama or Comedy movies.4. Pivot table - number of movie rentals
We now count the records from the table rentings_extended for the categories country and genre. So we get this pivot table for the number of movie rentals. The numbers in the red part show the number of movie rentals for each genre and each country. In blue we have the total number of movie rentals for each country. In green we obtain the number for each genre by adding up horizontally. Finally in yellow we have the total number of movie rentals, which is 22.5. Pivot table and SQL output
The representation of this pivot table in SQL is shown here on the right. For each value in the pivot table, we have a row in the SQL output. First, we have 4 rows for all unique combinations of country and genre. The next two rows represent the counts for each country - the column genre is null since it is ignored for these aggregations. In green we have the same for Comedy and Drama, where country is ignored and is, therefore, null. Finally, for the total number of movie rentals both categories are null. Now, let's obtain these results from pivot tables by using an SQL query. With the CUBE operator we can obtain all these aggregation levels with a simple query.6. GROUP BY CUBE
To create a pivot table from the data in renting_extended we use the expression GROUP BY CUBE. The CUBE operator is always used in combination with the GROUP BY clause. We need to use parenthesis' around country and genre in the GROUP BY CUBE clause. In the select clause we use for the aggregation COUNT star. With this query we group the data by each combination of country and genre as well as by country and by genre alone. Finally, we count all movie rentals in the table with the total aggregation. As a result we obtain the table from the previous slide - here are its first rows.7. Number of ratings
Alternatively, we can count the number of ratings with COUNT of rating. Since the ratings of many records are null we get much lower numbers for each category.8. Now it's your turn to GROUP BY CUBE!
Now, you are ready to run your first OLAP queries using GROUP BY CUBE!Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.