Stages and Basic Ingestion - Part II
1. Stages and Basic Ingestion - Part II
Welcome to Part II of “Stages and Basic Ingestion.” It’s funny how different a video this would be if we were about to talk about “Stages and Basic Indigestion.” Anyway, let’s get back to it and learn about the different kinds of internal stages. There are three flavors of internal stage -- user, table, and named stages. Every user has a user stage that only that user can access, but from which you can copy data into multiple tables. You can't drop the user stage. Every table has a table stage that can only be used with that table, and again, you can't drop it. And named stages can be used by multiple users, and be associated with multiple tables. Having said all that, my guess is you'll mostly use named stages because they’re so flexible. We’re not going to create an internal stage here, but a key thing to know is that once you’ve created your internal stage, you’d need to take another step to actually move data from your local computer to that stage – by using a PUT command, for example. Okay, let’s turn our attention back to our external stage and list all of the stage’s files, by running “ls” – which is short for “list” – followed by “@” and the name of the stage – in this case, frostbyte_tasty_bytes.public.s3load. ls @frostbyte_tasty_bytes.public.s3load; Those are the files we want to ingest! They’re beautiful. You don’t have to memorize this, but when [referencing](https://docs.snowflake.com/en/user-guide/data-load-local-file-system-stage#staging-the-data-files) named stages in our code, we use the AT (@) character. When referencing table stages, we use AT followed by the percent character. And when referencing user stages, we use AT followed by a tilde. I mention this just because seeing these different combinations of AT confused me for a while. Okay, so let’s keep going. We’ve now made our external stage, but we haven’t yet used the stage as a bridge to help us copy data from our S3 bucket. To do that, we first need to make some empty tables to hold the raw data. There’s a country table, and a franchise table, and many others – location, menu, truck, orders, customers. Then we’ll create two views that join together a bunch of these tables and store the results in the harmonized schema. And don’t worry about what a view is – we’ll talk about them in a future video. And then we create a view for the highly-cleaned, highly reliable analytics schema. So let’s run all this code, and stop right before the COPY INTO commands. Awesome. Now it’s finally time to spin up our super powerful, 3XL laser cannon of a data warehouse – “demo_build_wh” – and actually copy the data from our external storage to our tables using our stage. Assuming we’ve set up your table with the right number of fields and the right data types, this is very easy. All we need to do is use the command “COPY INTO” followed by the name of the destination table, and then “FROM” followed by “@” and the stage name. We’ll run this whole section at once, including the command to drop the demo_build_wh once we’re done with it. If this hadn’t worked, we’d have seen some errors, but if we’re willing to wait a while (up to an hour or two) and we wanted to double check, we could use one of Snowflake’s built-in observability features. We just query the COPY_HISTORY view, within the ACCOUNT_USAGE schema, within the SNOWFLAKE database, like this: SELECT file_name, error_count, status, last_load_time FROM snowflake.account_usage.copy_history ORDER BY last_load_time desc LIMIT 10; We can pick a few fields we’re interested in – here I’ve picked file_name, error_count, status, and last_load_time – and check the ten files that were loaded most recently. In this case, all we’re seeing is the tasty_bytes_sample_data menu table that we loaded in a previous video, because there’s some latency, but if we came back later and ran this again, we’d see the order_detail files we just loaded. To confirm more quickly that our tables loaded properly, we can just go to the left hand database dropdown, and find a table – say, COUNTRY – and query it. Looks like at least that table loaded successfully! So there you have it! We covered Parts I and II of “Stages and Basic Ingestion,” and as far as I can tell, there’s no indigestion in sight. Here’s a quick recap of the seven things we covered: One, we learned about what a Snowflake stage object is Two, we learned how to create an external stage and upload files to it using the CREATE STAGE command Three, we learned how to explain the difference between an external stage and an internal stage (remember, if there’s a reference to a cloud storage bucket, it’s an external stage) Four, we learned about the three types of internal stages (user, table, and named) Five, we learned how to view staged files using the list command, Six, we learned you can load files from a stage to an existing table using the COPY INTO command And seven, we learned that you can query Snowflake data loading history with the COPY_HISTORY view in the ACCOUNT_USAGE schema You’re successfully conquered this stage. Soon we’ll learn about databases, schemas, tables, and views!2. Let's practice!
Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.