Filtering and ordering

1. Filtering and ordering

A main part of querying data is to filter those records which are relevant for a certain research question. For future applications, it is also essential to bring tables in a particular order.

2. WHERE

When a WHERE clause is added to the SQL statement, certain rows of a table are selected where a selection condition holds. This example query selects all columns from the customers table only for customers from Italy by using `WHERE country = 'Italy'`. We can see the first few selected rows of the resulting table.

3. Operators in the WHERE clause

Several operators can be used in the WHERE clause, such as comparison operators, which compare the value of a column to a given value. These can check if the values are equal or not or whether one is greater or less than the other. Then there is the BETWEEN operator, the IN operator, and the NULL operator. Let's illustrate this with some examples.

4. Example comparison operators

Here are examples of comparison operators. In the first example, we select all columns from `movies` where the genre is not Drama. In the second example we select those movies with a renting price greater than or equal to 2.

5. Example: BETWEEN operator

Let's now look at the BETWEEN operator. We select all columns of `customers` where the date when the account was created is between the beginning of January and end of September 2018. The first of January and the end of September are included in the selection. Pay special attention to the way dates are used here. Dates can be compared if they are before or after a given date but it is necessary to use single quotation marks as for strings. Unless explicitly specified otherwise in the settings, only single quotes are accepted in most SQL flavors. Using double quotes will give an error in Postgres.

6. Example: IN operator

With the IN operator we can compare the values of a column to multiple values. Here we compare the nationality of actors to both USA and Australia. If the nationality is either USA or Australia then the actor is selected.

7. Example: NULL operator

NULL values are missing or unknown values in the data. We already know that not every customer gives a rating after watching a movie. Therefore, there are many NULL values in the column "rating". With the statement WHERE rating IS NULL we select all records where the customers didn't give a rating. With the statement WHERE rating IS NOT NULL we select all records where the customers gave a rating.

8. Boolean operators AND

With the statement AND you can connect two or more conditions in a WHERE expression. Here we ask for customers who are from Italy and have created their MovieNow account between the beginning of January and end of September 2018. It is also possible to use the boolean operator OR to connect two conditions. Then all rows are selected where one of the two conditions holds.

9. Boolean operators OR

An alternative to connecting conditions in the WHERE clause is to use the OR operator. Here we select all customers who are from Italy or who created their account between the given dates.

10. ORDER BY

By using the ORDER BY statement, SQL allows us to order the results of a query by the values of one or more columns. In this example we order by rating. By default the order is ascending.

11. ORDER BY ... DESC

To see the best ratings first, we can change to descending order by using DESC. This shows us that the first five rows all have the best rating: 10.

12. Let's practice!

Now it's time to put all this into practice.