Get startedGet started for free

Extracting and transforming date/ time data

1. Extracting and transforming date / time data

In a previous video, you learned about the AGE function and how to use it to calculate the difference between two timestamps. Next we'll look at some additional built-in functions that will help us transform timestamp and interval data types and create new fields that will help us prepare data for analysis.

2. Extracting and transforming date and time data

Let's start by taking a look at how we can use the EXTRACT, DATE_PART and DATE_TRUNC functions to manipulate timestamp data and create new columns by extracting sub-fields from existing date and time values. This type of data manipulation is useful when the precision of a timestamp is not useful for analysis and you want to use date parts like year or month in your queries but the underlying data only contains a standard timestamp value. You may also not care about certain precision like time of day in some analyses and truncating timestamps may be necessary.

3. Extracting and transforming date / time data

This is where the EXTRACT and DATE_PART functions come in very handy. To use these functions in your queries you will need to pass two parameters. The field identifier and the source. The field parameter is an identifier (or string if you are using DATE_PART) that indicates what sub-field that you want to extract from the source. The various field identifiers include year, month, quarter, day of week, etc. The source parameter needs to be a valid timestamp, time, or interval data type. Both EXTRACT and DATE_PART will produce identical results and can be used interchangeably with only slight variations in how you pass in the field and source parameters. Now, let's get into some examples and see this in action.

4. Extracting sub-fields from timestamp data

In our DVD Rentals database, every customer rental has a corresponding record in the payment table and each transaction is recorded with a timestamp in the payment_date column as the snippet below highlights. This level of detail is certainly necessary for an e-commerce application, but there will no doubt be times when you will want to be able to aggregate this data to use for training a model, reporting and/or trend analysis.

5. Extracting sub-fields from timestamp data

For example, you may want to identify the highest revenue by quarter. To do this we'll want to aggregate the amount column from the payment table and use the EXTRACT function to extract the quarter and year sub-fields from the payment_date column. Here you'll see that we also introduce a technique with the GROUP BY clause that allows us to specify the fields in the SELECT clause using a numeric reference which comes in handy when using functions to derive new columns. And you see the results of the query here which aggregates the amount column grouped by quarter and year.

6. Truncating timestamps using DATE_TRUNC()

The DATE_TRUNC() function will truncate timestamp or interval data types to return a timestamp or interval at a specified precision. The precision values are a subset of the field identifiers that can be used with the EXTRACT() and DATE_PART() functions. For example, to truncate a date by year we pass the year identifier as the first parameter of the DATE_TRUNC function, as you see here and get the following result. Or we can truncate the same timestamp using month as the parameter and get this result. Unlike these functions, DATE_TRUNC() will return an interval or timestamp rather than a numeric value.

7. Let's practice!

Let's put these new skills to use in the exercises!