What is Snowflake?

1. What is Snowflake?

Welcome to the Introduction to Snowflake course!

2. Your instructor

I'm Danny, and I've been a data analytics engineer for over 10 years. I've had the privilege of working at and learning from places like Google, Amazon, and Meta. I've used Snowflake to build data models and analyses on big data that provide insights into home title insurance risks. I'm excited to walk you through Snowflake!

3. What is Snowflake? (cont)

Snowflake is a leading AI Data Cloud and data warehousing platform used by thousands of companies. It's commonly used by data engineers and data analysts. A data warehouse is a repository for storing large amounts of data, designed to support historical analysis and reporting using SQL. In an analytics workflow, source data is transformed with business-defined logic and loaded into tables in the data warehouse. Those tables are then queried for analysis, reporting, and dashboard insights.
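
To make that workflow concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the data warehouse, so it is not Snowflake itself. The `raw_orders` records, the `orders` table, and the "completed orders only" business rule are all made up for illustration.

```python
import sqlite3

# Hypothetical raw source records (illustrative data, not from the course).
raw_orders = [
    {"order_id": 1, "amount_cents": 1250, "status": "complete"},
    {"order_id": 2, "amount_cents": 800, "status": "cancelled"},
    {"order_id": 3, "amount_cents": 2300, "status": "complete"},
]


def transform(records):
    """Apply an example business rule: keep completed orders, convert cents to dollars."""
    return [
        (r["order_id"], r["amount_cents"] / 100)
        for r in records
        if r["status"] == "complete"
    ]


def load(conn, rows):
    """Load the transformed rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()


def total_revenue(conn):
    """Query the table with SQL, as an analyst or dashboard would."""
    return conn.execute("SELECT SUM(amount_usd) FROM orders").fetchone()[0]


# SQLite in memory stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
load(conn, transform(raw_orders))
print(total_revenue(conn))  # 35.5
```

The three functions mirror the three steps in the workflow: transform with business logic, load into a table, then query with SQL.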

4. Why is Snowflake popular?

Snowflake is popular for many reasons. It's a "Data Platform as a Self-Managed Service", meaning the service manages its own infrastructure. Typically, a database administrator has to provision hardware and software when setting up a data warehouse, but Snowflake handles this for you. You sign up for a new account and can immediately begin using Snowflake. To better understand this, imagine buying a new car that is already built and ready to drive, instead of having to buy the parts and assemble the car yourself.

5. A few Snowflake features to highlight

There are a few features worth highlighting. It is common for data engineers to store raw data in a cloud provider's file storage system, a so-called "data lake". For example, a video streaming app might store data on the movies its users have watched. Data pipelines are used to clean, organize, and apply business rules to this data before storing it in a data warehouse. This process typically happens in batches at scheduled times. Snowflake supports this with connections to major cloud providers like AWS, GCP, and Azure, allowing data from different sources to be unified. Data privacy is also an important requirement, and data engineers are responsible for enforcing "least-privilege access", meaning access should be limited to the people who need the data to perform their job duties. For example, access to employee salary data can be limited to the Human Resources department. Snowflake has data governance controls in place to manage sensitive data.
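
The idea behind least-privilege access can be sketched in a few lines of Python. This is only an illustration of the principle, not how Snowflake implements its governance controls. The role names, table names, and the `can_read` helper are hypothetical.

```python
# Hypothetical mapping from each role to the tables it may read.
# Anything not listed here is denied by default.
GRANTS = {
    "hr_analyst": {"employee_salary", "employee_directory"},
    "marketing_analyst": {"employee_directory"},
}


def can_read(role, table):
    """Return True only if the role was explicitly granted access to the table."""
    return table in GRANTS.get(role, set())


print(can_read("hr_analyst", "employee_salary"))         # True
print(can_read("marketing_analyst", "employee_salary"))  # False
```

The key design choice is deny-by-default: a role with no explicit grant, or a role that does not exist at all, gets no access.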

6. Snowsight UI

A key feature that we'll explore in this course is Snowsight. It is a UI tool used to interact with Snowflake and is commonly used for data analysis.

7. Data Marketplace in Snowsight

Snowsight includes the Data Marketplace, a catalog of free and paid curated datasets from Snowflake's data providers. A data analyst can use it to bring in external data to enrich their analysis.

8. Snowflake's data architecture

Snowflake's data architecture has three integrated layers. Cloud Services coordinates activities in Snowflake. This layer handles user login and access, tracks data usage, and optimizes SQL queries. In Query Processing, virtual warehouses use a Massively Parallel Processing (MPP) architecture to process SQL queries, distributing data and compute power across a cluster of nodes (or computers). Lastly, Database Storage compresses and stores data in a columnar format. This format is optimized for analytical queries that aggregate or filter on columns, such as computing an average or searching for a specific value.
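
As a rough illustration of why columnar storage and parallel processing suit analytical queries, here is a small Python sketch. It is a toy model, not Snowflake's actual storage or execution engine. The movie rows, the `to_columns`, `split`, and `distributed_average` helpers, and the choice of two "nodes" are all assumptions made for the example.

```python
# Hypothetical rows (illustrative data only).
rows = [
    {"title": "Movie A", "genre": "drama", "minutes": 120},
    {"title": "Movie B", "genre": "comedy", "minutes": 90},
    {"title": "Movie C", "genre": "drama", "minutes": 105},
    {"title": "Movie D", "genre": "comedy", "minutes": 85},
]


def to_columns(rows):
    """Pivot row-oriented records into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def split(values, parts):
    """Split a non-empty list into at most `parts` contiguous chunks."""
    size = max(1, -(-len(values) // parts))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def distributed_average(values, nodes=2):
    """Average a column by combining partial (sum, count) results.

    Each chunk stands in for the work one node would do; here the
    chunks are processed one after another rather than in parallel.
    """
    partials = [(sum(chunk), len(chunk)) for chunk in split(values, nodes)]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


columns = to_columns(rows)

# An average only needs the "minutes" column, not the whole row.
print(distributed_average(columns["minutes"]))  # 100.0

# A filter only needs the "genre" column to find matching positions.
print([i for i, g in enumerate(columns["genre"]) if g == "comedy"])  # [1, 3]
```

Because each column is stored together, the average reads only `minutes` and the filter reads only `genre`. Splitting the column into chunks shows how partial results from several nodes can be combined into one answer.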

9. Let's practice!

Let's review what we just learned!