Snowpipe and Snowpipe Streaming

1. Snowpipe and Snowpipe Streaming

In the last video we covered COPY INTO - Snowflake's batch loading command. But Harbr's logistics operations don't stop between batch windows.

2. Snowpipe

Delivery events arrive continuously from logistics partners throughout the day. Snowpipe solves that problem by loading data from files as soon as they're available in a stage. Let's dive deeper.

3. The Problem with Batch Loading

Harbr's logistics partners drop delivery event files into S3 every few minutes. With COPY INTO scheduled at midnight, the ops team is always working with data that's up to 24 hours old. A delayed shipment flagged at 9am won't appear in Snowflake until the following day. That gap is the operational problem — and Snowpipe is Snowflake's answer to it.

4. What is Snowpipe?

A Snowpipe wraps a COPY INTO statement and triggers it automatically as new files arrive in a stage. The loading logic is identical — same syntax, same file format references, same error handling. What changes is the trigger: instead of running at midnight, Snowpipe fires each time a new file lands. Loads happen in micro-batches within minutes. And because it's serverless, there's no warehouse to provision or manage.
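To make the "wraps a COPY INTO statement" idea concrete, here is a minimal Python sketch that assembles a CREATE PIPE statement as a string. The pipe, table, stage, and file format names are hypothetical placeholders for Harbr's setup, and the helper does no identifier escaping, so treat it as an illustration rather than production code.

```python
def build_create_pipe(pipe: str, table: str, stage: str, file_format: str,
                      auto_ingest: bool = True) -> str:
    """Return a CREATE PIPE statement that wraps a COPY INTO load.

    All object names are passed in by the caller; nothing is escaped or
    validated here, so only use trusted identifiers.
    """
    # The inner statement is an ordinary COPY INTO: same syntax as a batch load.
    copy_stmt = (
        f"COPY INTO {table}\n"
        f"  FROM @{stage}\n"
        f"  FILE_FORMAT = (FORMAT_NAME = '{file_format}')"
    )
    # The pipe adds only the trigger behaviour around that statement.
    flag = "TRUE" if auto_ingest else "FALSE"
    return f"CREATE PIPE {pipe}\n  AUTO_INGEST = {flag}\nAS\n{copy_stmt};"


if __name__ == "__main__":
    # Hypothetical names, chosen for illustration only.
    print(build_create_pipe(
        pipe="delivery_events_pipe",
        table="delivery_events",
        stage="delivery_stage",
        file_format="delivery_json",
    ))
```

Notice that everything after AS is the same COPY INTO you would schedule at midnight. Only the trigger around it is new.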

5. How Snowpipe Works

The most common pattern uses AUTO_INGEST - an event-driven trigger. When a file lands in Harbr's S3 bucket, S3 publishes a notification to an Amazon SQS queue. Snowpipe monitors that queue and triggers the COPY INTO load automatically. On Azure the equivalent is Event Grid; on GCP it's Pub/Sub. The pattern is the same across cloud providers.

The alternative is the REST API trigger. Instead of relying on cloud notifications, your orchestration layer calls Snowpipe's insertFiles endpoint directly, passing a list of file paths to load. This is useful when you already have an orchestrator controlling file arrivals and want programmatic control over when Snowpipe fires. The insertReport endpoint lets you check the load status of files you've submitted.
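For the REST API trigger, the sketch below shows roughly what an orchestrator would prepare before calling insertFiles. It only builds the URL and JSON body and sends nothing. The URL shape and the "files" payload reflect my understanding of the endpoint and should be checked against the Snowpipe REST API reference. The account name, pipe name, and file paths are hypothetical, and authentication (the real call needs an authorization header) is left out entirely.

```python
import json
import uuid
from urllib.parse import quote


def build_insert_files_request(account: str, pipe: str,
                               paths: list[str]) -> tuple[str, str]:
    """Build the URL and JSON body for an insertFiles call (nothing is sent).

    `pipe` is assumed to be the fully qualified pipe name, e.g. db.schema.pipe.
    """
    # A fresh request ID per call makes it easier to trace a submission later.
    request_id = str(uuid.uuid4())
    url = (
        f"https://{account}.snowflakecomputing.com/v1/data/pipes/"
        f"{quote(pipe, safe='')}/insertFiles?requestId={request_id}"
    )
    # Each file is listed by its path relative to the stage location.
    body = json.dumps({"files": [{"path": p} for p in paths]})
    return url, body


if __name__ == "__main__":
    # Hypothetical account, pipe, and file paths.
    url, body = build_insert_files_request(
        "harbr_account",
        "logistics.public.delivery_events_pipe",
        ["2024/06/01/events_0900.json", "2024/06/01/events_0905.json"],
    )
    print(url)
    print(body)
```

The orchestrator decides which files to submit and when, which is the programmatic control the REST trigger offers over AUTO_INGEST.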

6. Snowpipe Billing

Snowpipe billing is based on a fixed credit amount per GB of data loaded, which makes costs more predictable. For text files such as CSV, JSON, and XML, you are charged based on their uncompressed size. For binary files such as Parquet, Avro, and ORC, you are charged based on their observed size, regardless of compression.

7. Snowpipe Streaming

Snowpipe Streaming removes the file boundary entirely. Instead of waiting for files to land in a stage, the application writes rows directly using the Streaming Ingest SDK. Harbr's delivery vehicles emit GPS coordinates every few seconds. With file-based methods that would mean minutes of latency. With Snowpipe Streaming, each coordinate lands in Snowflake within seconds.

8. Choosing the Right Ingestion Method

COPY INTO is the right tool for scheduled batch loads - nightly supplier files, weekly exports, anything where a few hours of latency is acceptable. Snowpipe is for when files arrive continuously and you need loads within minutes. Snowpipe Streaming is for data generated directly by an application - GPS tracking, IoT sensors, financial markets - where latency needs to be measured in seconds. One important difference is what each method needs in order to run: COPY INTO requires a virtual warehouse, whereas Snowpipe and Snowpipe Streaming are serverless.
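One way to hold this comparison in your head is as a small decision function. The sketch below is a simplification with made-up names, and the one-hour cutoff between "hours are fine" and "within minutes" is an arbitrary assumption, not a Snowflake rule.

```python
# Compute model per method, as described above.
REQUIRES_WAREHOUSE = {
    "COPY INTO": True,
    "Snowpipe": False,
    "Snowpipe Streaming": False,
}


def choose_ingestion_method(source: str, max_latency_seconds: float) -> str:
    """Pick an ingestion method from the data source and latency tolerance.

    `source` is "files" (data lands in a stage) or "rows" (an application
    emits records directly). The 3600-second cutoff is an assumed threshold.
    """
    if source == "rows":
        # No file boundary: the application writes rows directly.
        return "Snowpipe Streaming"
    if source == "files":
        if max_latency_seconds >= 3600:
            # Hours of latency are acceptable, so a scheduled batch is enough.
            return "COPY INTO"
        # Files arrive continuously and must be loaded within minutes.
        return "Snowpipe"
    raise ValueError(f"unknown source: {source}")


if __name__ == "__main__":
    for source, latency in [("files", 24 * 3600), ("files", 300), ("rows", 5)]:
        method = choose_ingestion_method(source, latency)
        print(source, latency, method, REQUIRES_WAREHOUSE[method])
```

For Harbr, the nightly supplier files, the partner delivery events, and the vehicle GPS feed would land in the three branches respectively.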

9. Let's practice!

You've covered how Snowpipe eliminates the batch loading gap through event-driven ingestion, how its serverless billing model works, and how Snowpipe Streaming removes the file boundary entirely. Time to put your knowledge to the test.