Get startedGet started for free

Time Travel - Part II

1. Time Travel - Part II

Welcome back! We just learned about retention time in Snowflake. We learned how to check the retention time with SHOW TABLES and to set a new retention time with ALTER TABLE. Undropping a table is only possible because of these retention policies. Now we’re going to move on from undropping tables to what I think is even cooler – We’re going to learn three ways of querying past data from Snowflake tables. The first way is querying a table as of a particular timestamp. The second way is querying a table as of a certain number of seconds in the past. And the third way is querying a table before a certain past query was run. Let’s make it happen. We’ll start by [messing](https://quickstarts.snowflake.com/guide/tasty_bytes_zero_to_snowflake_transformation/index.html?index=..%2F..index#0) up some data on purpose, so we can go back and recover the correct data. First, we’ll clone one of our tables, which effectively copies one of our tables so we don’t mess with the real thing – and don’t worry about this part, because we’ll discuss cloning in detail in a future video. CREATE OR REPLACE TABLE frostbyte_tasty_bytes.raw_pos.truck_dev CLONE frostbyte_tasty_bytes.raw_pos.truck; Let’s run a query on that cloned table just to make sure it worked: SELECT t.truck_id, t.year, t.make, t.model FROM frostbyte_tasty_bytes.raw_pos.truck_dev t ORDER BY t.truck_id; This shows all the food trucks in TastyBytes’ fleet – If we look at the query details, it looks like there are 450 rows. Let’s pretend for a second that we were asked to add a new column to this dataset that gives the age of each truck. The correct way to calculate the age would be something like this: SELECT t.truck_id, t.year, t.make, t.model, (YEAR(CURRENT_DATE()) - t.year) AS truck_age FROM frostbyte_tasty_bytes.raw_pos.truck_dev t; You can see that the truck_age results look plausible. We’ll do two things here to make our lives easier later – Because we’ll want to show how you can do time travel relative to a particular timestamp, as of the time a particular past query was run, let’s save the most recent query id and save the current timestamp. We can do this using Snowflake variables. These are really useful, and something I think should be covered earlier in people’s Snowflake journeys. You can save a variable with the SET command. Let’s first use SET to save the query id. We haven’t spoken about query ids, but in Snowflake, every query you run has a unique id, and with time travel that’s nice because we can return to the state of the data right before a specific query was run. So now let’s save a “good_data_query_id” variable that will record the last query we ran where we were happy with our data. SET good_data_query_id \= LAST_QUERY_ID(); LAST_QUERY_ID() does exactly what you’d expect, and pulls the id of the last query. You can use variables by putting a dollar sign in front of them – So if we run: SELECT $good_data_query_id; We see the query id we saved. In the case of saving the current timestamp, we can do the following: SET good_data_timestamp \= CURRENT_TIMESTAMP; CURRENT_TIMESTAMP does exactly what you think it should – it gives the current timestamp. Now we can display this variable using SELECT, and the dollar sign plus the variable name. SELECT $good_data_timestamp; We see the timestamp we saved. Awesome! If we want to check out all of our variables at once, we can run: SHOW VARIABLES; We see them all here. Okay, now it’s time to get back to our nefarious mission of corrupting data. To do this, we’ll calculate the truck age by dividing, not subtracting. Again, we’re doing this wrong on purpose: SELECT t.truck_id, t.year, t.make, t.model, (YEAR(CURRENT_DATE()) / t.year) AS truck_age FROM frostbyte_tasty_bytes.raw_pos.truck_dev t; And then let’s make ANOTHER error by accidentally overriding our YEAR column with our faulty truck age instead of adding a NEW column with the correct truck age. Wow, we’re so clumsy! Okay, here’s how we fill in our YEAR column with incorrect numbers: UPDATE frostbyte_tasty_bytes.raw_pos.truck_dev t SET t.year \= (YEAR(CURRENT_DATE()) / t.year); Awesome – It looks like that worked. To confirm, let’s query the data now: SELECT t.truck_id, t.year, t.make, t.model FROM frostbyte_tasty_bytes.raw_pos.truck_dev t; Great! We’ve messed up our data, just like we wanted. Every year says 1 because the numbers were all close to 1, and our YEAR column doesn’t allow for numbers to the right of the decimal place. Now comes the fun part. Now let’s try out the first time travel method for querying data from the past – Querying as of a particular timestamp. The way this works is we run our normal SELECT… FROM command, but then we add an “AT” at the end, and specify what “AT” we want. So in this case, we want the results we would have gotten had we queried this table back at a particular timestamp. If the time we’re aiming for is outside the retention period, this won’t work, but otherwise, the table should appear like it did at that moment. Let’s first do this by directly using the variable we saved earlier – the timestamp when our data was still uncorrupted. SELECT * FROM frostbyte_tasty_bytes.raw_pos.truck_dev AT(TIMESTAMP \=> $good_data_timestamp); Let’s take a look at the data… Great, the year column is back to normal! We’ve traveled back in time!!! To make it even clearer what this is doing, let’s show the timestamp string, copy-paste that in place of the variable, make sure it’s converted to a timestamp, and run this again: So we display our timestamp: SELECT $good_data_timestamp; And then we click on the timestamp, and go to the bottom right and copy that. Then we paste it into here: SELECT * FROM frostbyte_tasty_bytes.raw_pos.truck_dev AT(TIMESTAMP \=> '[insert timestamp]'::TIMESTAMP_LTZ); And we get: SELECT * FROM frostbyte_tasty_bytes.raw_pos.truck_dev AT(TIMESTAMP \=> '2024-03-07 16:03:18.353 -0800'::TIMESTAMP_LTZ); Snowflake doesn’t automatically know that '2024-03-07 16:03:18.353 -0800' is a timestamp – it just looks like a string – so we have to use the colon colon operator (‘::’) to convert this to the data type TIMESTAMP_LTZ (the “LTZ” stands for “local time zone”). Cool, so if we run this, we see that we get the same result as before – the YEAR column is back to normal. Look at us. We’re time travelers. Now let’s quickly try the second time travel method for querying data – Querying as of a certain number of seconds in the past. This is conceptually very similar to what we just did, but instead of specifying the timestamp outright, we specify an “offset” in seconds, and Snowflake automatically figures out relative to the current time the time we’re aiming for. Let’s go back a couple of minutes - you don’t have to worry about what I’m doing here, I’m just going to calculate how many seconds we need to go back to get to our table when it was in a good state. SELECT * FROM frostbyte_tasty_bytes.raw_pos.truck_dev AT(OFFSET \=> -60*5); Okay, cool! That worked too – our YEAR column is correct again. And finally, let’s try the third time travel method for querying data from the past – Querying a table before a certain past query was run, using the query id. Let’s take a look at our query id again: SELECT $good_data_query_id; Our command will be a lot like the ones we ran earlier, with the major difference being that instead of using “AT” we’re using “BEFORE.” This slight change in syntax makes it clear that when you do time travel based on query ids, you’re accessing the data as it was *before* that particular query you’ve specified was run. SELECT * FROM frostbyte_tasty_bytes.raw_pos.truck_dev BEFORE(STATEMENT \=> $good_data_query_id); Awesome – Once more, we have access to our correct data! In this video, we learned: One, how to check the retention time for a table Two, how to set a new retention time Three, how to set and use variables Four, how to save a query id Five, how to save a current timestamp, and Six, three ways to use time travel to query a table back in time: by us AT to query it as of a particular timestamp, by using AT to query it as of certain number of seconds in the past, and by using BEFORE to query it as it was before a specific query was run. That’s a lot of time travel knowledge! So the next time you accidentally drop a table, or you realize you messed something up big time, you can correct all of that by going back in time.

2. Let's practice!

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.