1. Binning Data with Case
In this lesson, you will work with the CASE statement, which is commonly used to evaluate conditions in a query.
2. Changing column values with CASE
The CASE statement allows us to evaluate records like an if statement. We can use the CASE statement to check if a column contains a value and WHEN it does THEN we can replace the value with some other value of our choice, ELSE replace it with any other default value.
A CASE statement must have at least four keywords: CASE, WHEN, THEN, and END. The ELSE is optional, but it often makes sense to include it.
3. Changing column values with CASE in T-SQL
In this query, if the Continent is not Europe or Asia, the value in the newContinent field will be Other.
4. Changing column values with CASE in T-SQL
Take a look at how the case statement is evaluating the values in the Continent field to determine what value to put into it. In this query, we create a new column called NewContinent in which the value is Eurasia if the corresponding row in Continent is either Europe or Asia. If the Continent is not Europe or Asia, then the value in NewContinent does not change.
5. Using CASE statements to create value groups
A CASE statement is a good way to arrange data into smaller groups. This is a common technique for Data Science.
This example shows how to create groups, which are often called bins, so that we can evaluate larger groups of ages rather than discrete values.
The Case statement looks at each value of the age, and depending upon the value, substitutes the age in the lifeExpcolumn with one of 5 numbers representing groups of ages.
6. Let's practice!
Now it is time to practice with CASE!