Get startedGet started for free

Loading Data from a cloud provider

1. Loading Data from a cloud provider

Let's discuss how to load data from a cloud provider into Snowflake with a feature called "Staging".

2. High Level Workflow

It is common to have Snowflake interact with another cloud provider in an analytics workflow. In this illustration data files are stored in folder directories within the Cloud Provider. An external stage is created to reference the folder directory. Think of a stage as a folder that Snowflake can read directly from. A table can be created by pointing to the stage allowing the data to be queried.

3. Setting up stage permissions

To create a Snowflake Stage, some work is needed. Snowflake needs permissions to access the file directory. Permission roles and policies must be created in the cloud provider. This may require working with a Cloud Administrator in your organization. Once your Snowflake account has access to read from the file directory, a "Storage Integration" needs to be created in Snowflake. This references the credentials that were generated from the cloud provider. A "Stage" is then created by referencing the "Storage Integration" name and the URL of the file directory. In this example the Stage points to an AWS file directory containing data on new customer signups. Remember: a table can be created on top of the Stage. Another way to visualize this process is to imagine a "Stage" as a kitchen and the data files as the ingredients. You ask your friend if it's okay to use their kitchen and the ingredients to cook a dish of your liking! The goal of this course is to focus on the Snowflake UI experience, so we won't go into details on setting up access. Instead, we'll focus on creating the stage in Snowsight.

4. Creating an external stage

Let's go over how to create an external stage to reference AWS. This sql script contains several queries. The first query creates the storage integration pointing to the AWS s3 directory and the configured AWS role. The second query creates the stage and points to the S3 file path for new customer signups and the storage integration that was just created. The third query is used to look up the stage properties. To ensure we can use the stage, we want to copy the AWS_EXTERNAL_ID value and paste it into the AWS policy.

5. Refresh and use external stage

We'll open the external stage in Snowsight and click on the refresh icon. We can now see all of the files that are linked to the AWS directory. Under Stage Details, we can view things like the AWS region, AWS S3 filepath, storage integration, and the owner of the Snowflake stage. Next we'll load data into a new table. First we'll choose the database and schema. Then we'll choose the external stage that was created. Let's name the new table DIM_CUSTOMER_SIGNUP and then hit next. We'll load the data and then run a query to see the output of the new table.

6. Let's practice!

Let's review what we just discussed.