Get Started

Transactions

1. Transactions

Let's take a quick look at transactions and their implications in Redshift.

2. Motivation for using transactions

Let's imagine we've got a data intake table that stores their data points' names and priorities, as shown here. In a case where we want to use a consistent timestamp across multiple queries, these two queries, even if they were fast, would have different SYSDATE timestamps with multiple microseconds in between. It could have been seconds of difference between them if they weren't nearly instantaneous, giving us inconsistent data results.

3. Statement grouping example

For this example, we could use the SQL statements presented here if we need to swap the Idaho samples and monitoring locations in priority; however, if we lose connection to the database server between our two statements executing. What would the result be?

4. Errored table results

As we can see here, both datasets became the number one priority, which might make our project manager super happy. However, it differs from what we intended and could result in an improper course of action. Errors like this and the prior data query timing difference are instances where transactions can help us.

5. Transaction advantages and considerations

Transactions wrap a series of SQL statements to ensure they all operate as one unit. Transactions provide several advantages, including ensuring consistent data outcomes, grouped queries succeed or fail together, and enabling concurrent operations. By default every SQL statement is a transaction of it's own. Transactions also affect some functions differently. For example, SYSDATE will set its timestamp value when the transaction starts by looking at a few date functions. The GETDATE function, on the other hand, sets its timestamp for each statement regardless of it being in a transaction or not.

6. Transactions Structure

The structure of a transaction typically involves opening with a BEGIN or START TRANSACTION statement, followed by one or more SQL statements ended with a semicolon, and closing with an END or COMMIT statement to commit or undo any changes. You can mix and match the keywords as you see fit. So if you wanna START TRANSACTION and END or BEGIN and COMMIT that's fine. For this course we'll consistently use BEGIN and END. Undoing changes is commonly called a rollback, but the scope of this is outside the bounds of this course. All these semicolons are important to ensure it works as expected.

7. Getting consistent query results

We can use transactions to ensure consistent query results. Here's an example of using transactions to execute multiple queries with the same timestamp, ensuring consistency in the data outcomes by taking advantage of SYSDATE, staying a consistent value throughout a transaction set at the beginning. So that all queries that access SYSDATE will get the same value.

8. Function behavior in transactions

Additionally, if we repeat the prior example, but this time with the GETDATE function, which gets a new timestamp for every statement, we can see that the transaction has a different effect. If we experience an unexpected behavior from a transaction, then we should check AWS documentation for that function to see how the function interoperates with transactions.

9. Let's practice!

Okay, it's time to practice before our product manager adds another number-one priority for us.