What are data transformations?
1. What are data transformations?
It's likely that your raw data won't directly have the insights that you're looking for. Before being able to extract insights from your data, you will probably need to address things like missing values, incorrect formatting or column types, required calculations, aggregations, and much more. In short, there's a good chance that your raw data is not in the end state you need it to be in, and you'll need to perform some work against it to get it to the right end state. I like to think of that work as data transformations, or the set of changes or calculations that you'll perform against your raw data to get closer to your desired insights.

So what exactly do transformations look like? What kinds of changes are typical against raw data? Well, every use case will vary simply because the nature of your raw data is likely also going to vary. But in general, it's common for transformations to address missing or incorrect data, fix formatting, perform aggregations, derive new columns from existing columns, derive new views or tables from raw data, and much more.

And there are various ways of performing these data transformations in Snowflake. The two most common methods are to use SQL or Snowpark to write and perform the transformations. We'll cover both in this module. It's also common to write functions and logic that can be invoked to aid with transformations at scale, like user-defined functions and stored procedures, which we'll also cover in this module. And finally, we'll also cover how to use streams for efficient transformations, and how to use features like dynamic tables for automatic transformations.

But beyond the individual transformation features, let's keep in mind what we're trying to accomplish. We're learning to build data pipelines that can take raw data and deliver an insight. To get to those insights, the raw data needs to be transformed.
These transformations are crucial for delivering an insight and building a pipeline that can help meet a specific goal or objective. With that, let's start building a data pipeline.
2. Let's practice!
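To make the kinds of transformations described earlier a bit more concrete, here is a minimal sketch of a tiny pipeline. It is an illustration only: it uses Python's built-in `sqlite3` module as a stand-in so it runs anywhere, not Snowflake itself, and the table, view, and column names (`raw_orders`, `clean_orders`, `revenue_by_region`, and so on) and the sample rows are all made up for this example. The SQL shows the general ideas of handling a missing value, fixing formatting, correcting a column type, deriving a new column, and aggregating into a new view.

```python
import sqlite3

# Hypothetical raw data: (order_id, region, quantity, unit_price stored as text).
# Note the inconsistent region formatting and the missing quantity.
RAW_ORDERS = [
    (1, " east ", 2, "10.50"),
    (2, "WEST", None, "4.00"),
    (3, "East", 1, "7.25"),
    (4, "west ", 3, "4.00"),
]


def build_pipeline(conn):
    """Load the raw rows, then define the transformations as views."""
    conn.execute(
        "CREATE TABLE raw_orders ("
        "order_id INTEGER, region TEXT, quantity INTEGER, unit_price TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", RAW_ORDERS)

    # Row-level cleanup: fix formatting, fill the missing value,
    # correct the column type, and derive a new column.
    conn.execute(
        """
        CREATE VIEW clean_orders AS
        SELECT order_id,
               UPPER(TRIM(region))                              AS region,
               COALESCE(quantity, 0)                            AS quantity,
               CAST(unit_price AS REAL)                         AS unit_price,
               COALESCE(quantity, 0) * CAST(unit_price AS REAL) AS order_total
        FROM raw_orders
        """
    )

    # Aggregation: derive a new view from the cleaned data.
    conn.execute(
        """
        CREATE VIEW revenue_by_region AS
        SELECT region, SUM(order_total) AS revenue
        FROM clean_orders
        GROUP BY region
        """
    )


def revenue_by_region(conn):
    """Return the final insight: total revenue per region."""
    query = "SELECT region, revenue FROM revenue_by_region ORDER BY region"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
build_pipeline(conn)
result = revenue_by_region(conn)
print(result)  # [('EAST', 28.25), ('WEST', 12.0)]
```

Treating the missing quantity as zero is just one assumption made here; depending on the use case, dropping the row or flagging it for review might be the better choice. The same sequence of raw table, cleaned view, and aggregated view is the shape of pipeline this module builds toward.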