
ML - Snowflake ML Overview - Part I

1. ML - Snowflake ML Overview - Part I

I have an ambitious vision for this video. I want it to feel like flying on an airplane. You look out the window, and you see fields and forests. Maybe a mountain. You fly over a city and can see how all the parts fit together. That’s what I want this to feel like, but instead of flying over the countryside, we’re flying over the machine learning landscape at Snowflake. You gaze down and think: “Oh look, what a beautiful set of ML functions.” “Oh look, what a charming Model Registry.” And then as some combination of captain and flight attendant, I’ll say: “Snowflake’s platform provides a bunch of ways to do machine learning, all processed with CPUs or GPUs in Snowflake’s compute engine right next to your data. This is great because it means you can do the ML you want with the security and governance of Snowflake.” It’ll be great! Excellent view. Zero turbulence.

So here’s how we’re going to go about this: the structure of this video will be pretty much identical to the structure we used when we learned about GenAI at Snowflake. We’ll talk about ML features that can be used in seconds, ML features that can be used in minutes, and more customized ML features that can be used in hours. There’s a lot of good stuff in here. This will be great.

Okay, so here’s an overview of the ML landscape at Snowflake. We’re going to focus our attention on the “seconds” and “minutes” options here, because there’s not enough time in this course to do everything. We plan to cover bringing your own model into Snowflake, GPU-based training, and GPU-based inference in future coursework.

As I did when we were doing the GenAI overview, I want to quickly call out that all of this is built on the foundation of Snowflake-governed data. It’s easy to brush past that, but it makes life easier in a lot of ways, because so many governance, security, performance, and ease-of-use benefits come from it. And we talked about virtual warehouses earlier in the course, but it’s worth repeating that there are Snowpark-optimized warehouses with lots of memory to help with ML tasks. I won’t say more about that here.

Okay, so now let’s really get started by digging into Snowflake Cortex ML functions. Using Snowflake Cortex ML functions feels a lot like using the Snowflake Cortex LLM functions we’ve been talking about: translate, summarize, sentiment, extract, and of course complete, which you’re now pretty familiar with. And you can use them all in SQL. But there are a few differences. One is that under the hood, the models the ML functions rely on are machine learning models, not large language models. Another is that the LLMs underneath the LLM functions are all pre-trained, but for the ML functions, there’s typically a step where you submit training data to them. Like the LLM functions, the ML functions abstract away the complexity of the underlying models, and when you’re using them, you don’t have to worry about the underlying compute infrastructure because they leverage Snowflake’s multi-node elastic compute.

Okay, so here are four ML functions: forecast, anomaly_detection, top_insights (which you’ll see listed under “Contribution Explorer” in the Snowflake docs), and classification. I know that for at least forecast, anomaly_detection, and classification, the underlying model is a GBM, a gradient-boosted machine. This isn’t a course on ML, so we won’t talk about how that works, but it’s a very standard type of ML model. Forecast does what you’d guess: it makes time series forecasts. Anomaly detection identifies outliers. Top_insights identifies drivers of shifts in whatever variable you pick as the outcome variable of interest. I wouldn’t go so far as to give this a causal interpretation, but it can definitely get you started in a metrics investigation where leadership is like: “Why did metric X go down last month in country Y,” and you’re like: “Great question.” Classification sorts data into different groups. It can handle binary classification or multi-class classification.

Now I’ve provided some incomplete code snippets on the side, not because I think you’ll be able to look at this and know how to use any of these, but because I wanted to give you a sense of how these work in practice. Please don’t be overwhelmed by the tiny font. You don’t need to absorb or memorize any of this. Think of it as taking a bath in a stream of code. Just let the code wash over you.

The key things I want to call out are that in two cases, with forecast and anomaly detection, you can see that you have to run a CREATE command to create the model. You’ll see the function names, forecast and anomaly_detection, listed in each case. Then you actually use the models with a CALL command. Note that this is different from how we used the LLM functions! There we used them inside SELECT statements. The syntax for calling top_insights and classification is a bit more like the syntax for calling the LLM functions. You can see that they’re inside the SELECT statements: SELECT SNOWFLAKE.ML.TOP_INSIGHTS, and SELECT MODEL_BINARY!PREDICT, where MODEL_BINARY is a SNOWFLAKE.ML.CLASSIFICATION model.

So again, the goal here is not to bring you to the point where you can use these (you’ll need to take more Snowflake coursework or read the docs for that), but I wanted to give you a taste of how using them might feel in practice. You know this already, but I’m a big fan of the Snowflake Cortex functions. That said, there’s a lot more to cover about the Snowflake ML landscape. Next up, we’ll learn about Snowpark ML Modeling, the Snowflake Feature Store, the Snowflake Model Registry, and more.
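For readers following along without the slides, here is a rough sketch of what the two calling patterns described above might look like in Snowflake SQL. It is not the snippet shown in the video. Every model, view, and column name in it (sales_forecast_model, sales_anomaly_model, daily_sales_v, new_sales_v, purchases_v, new_purchases_v, sale_date, total_sales, purchased) is hypothetical, and the argument lists are trimmed to the basics, so treat it as an illustration and check the Snowflake docs before running anything.

```sql
-- Pattern 1: CREATE the model, then use it with CALL.
-- All object and column names below are hypothetical placeholders.

-- Train a forecasting model on a view of historical daily sales.
CREATE SNOWFLAKE.ML.FORECAST sales_forecast_model(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'daily_sales_v'),
  TIMESTAMP_COLNAME => 'sale_date',
  TARGET_COLNAME => 'total_sales'
);

-- Ask the trained model for the next 7 periods.
CALL sales_forecast_model!FORECAST(FORECASTING_PERIODS => 7);

-- Train an anomaly detection model on the same history.
-- An empty LABEL_COLNAME means no labeled anomalies are supplied.
CREATE SNOWFLAKE.ML.ANOMALY_DETECTION sales_anomaly_model(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'daily_sales_v'),
  TIMESTAMP_COLNAME => 'sale_date',
  TARGET_COLNAME => 'total_sales',
  LABEL_COLNAME => ''
);

-- Check newer data for outliers.
CALL sales_anomaly_model!DETECT_ANOMALIES(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'new_sales_v'),
  TIMESTAMP_COLNAME => 'sale_date',
  TARGET_COLNAME => 'total_sales'
);

-- Pattern 2: the model is still created first, but prediction
-- happens inside a SELECT statement, one row at a time.
CREATE SNOWFLAKE.ML.CLASSIFICATION model_binary(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'purchases_v'),
  TARGET_COLNAME => 'purchased'
);

SELECT model_binary!PREDICT(INPUT_DATA => {*}) AS prediction
FROM new_purchases_v;
```

The sketch leaves out top_insights, which is also used from a SELECT statement but takes a longer setup of dimension and metric columns.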

2. Let's practice!
