Virtual Warehouses Scaling - Part II

1. Virtual Warehouses Scaling - Part II

We’re going to get right back into it by talking about a different form of scaling – scaling horizontally. So “[By default](https://docs.snowflake.com/en/user-guide/warehouses-multicluster), a virtual warehouse consists of a single cluster of compute resources… [and when you submit queries to the warehouse], the warehouse allocates resources to each query and begins executing the queries. If [there aren’t enough resources] to execute all the queries… Snowflake [starts queueing the queries]” and you just have to wait for all of them to get handled sequentially. Imagine there are three data engineers working with Tasty Bytes data at the same time. One submits a query that occupies the resources of the data warehouse, then the second uses that same warehouse, and has to wait for the first to finish. Then the third runs a query, and is behind both of the others in line! In a case like this where users are running a bunch of concurrent queries, instead of doing what we did earlier and scaling our cluster vertically by selecting a larger warehouse, we probably want to scale horizontally, so make use of more clusters. So let’s go ahead and set up a multi-cluster warehouse to see what that’s all about. Let’s use the UI to make a new warehouse that’s multi-cluster from the get-go. We’ll go to “Admin” and click on “Warehouses.” Then we click “+ Warehouse.” Let’s name our warehouse “warehouse_vino” after another one of my colleagues. But this time, let’s toggle open the “Advanced options” drop-down. You’ll see a few different options. We’ll talk about “auto resume” and “auto suspend” in a moment, so let’s skip those, and instead toggle on “Multi-cluster Warehouse.” Let’s set the minimum number of clusters as 1, and the max as 3. This means that we’ll use one cluster until queries start queueing up – until there’s a backlog – and then we’ll temporarily move to two clusters. And then if that proves insufficient, we’ll temporarily move to three clusters. Okay, so we make that warehouse, and look! Under the “clusters” column, you can see that there are three bars. Pretty neat. You won’t be surprised to learn that we can also make a multi-cluster warehouse through code. Let’s do that quickly by hopping over to “Projects,” “Worksheets,” and clicking on our “Virtual Warehouses Scaling” Worksheet. Let’s first drop our “warehouse_vino” by running the “DROP WAREHOUSE” command, followed by the name of the warehouse we want to drop. And then we’ll check that the warehouse is really gone using: SHOW WAREHOUSES; Yep, no more “warehouse_vino.” But then we type “CREATE WAREHOUSE” followed by the warehouse name, “warehouse_vino,” and then we add in the property “MAX_CLUSTER_COUNT \= 3”. Let’s run that, and then show our warehouses. And sure enough, we have a “warehouse_vino” multi-cluster warehouse once more! And one last word on multi-cluster warehouses – You can do the same things with a multi-cluster warehouse that you can with a single-cluster warehouse, like resizing the warehouse, etc. Okay, now let’s explore two options we skipped a moment ago – “auto resume,” and “auto suspend.” Let’s hop back to “Admin,” and take a look at the warehouses tab. Let’s edit the “warehouse_gilberto” warehouse by clicking on the three dots on the right, and clicking “edit.” You’ll see that “auto resume” and “auto suspend” are both toggled on by default. “Auto resume” means that the warehouse will automatically kick into action when someone asks it to do something, and “auto suspend” means that after a specified number of minutes of inactivity, the warehouse will turn off. My sense is most people keep “auto resume” on, but you could imagine turning that off for greater control over costs. And my sense is most people keep “auto suspend” on as a cost-saving measure, but they might adjust the number of minutes after which a warehouse shuts down. When I first learned about “auto suspend,” it seemed to me like you’d probably want to set that number really low to save on credits, but then I learned that when you keep your warehouse running, you keep data in cache, so there are queries you might re-execute but not have to really re-run because the results are still there. If you shut off your warehouse too early, you risk having to redo computations because you cleared your cache. So there’s a balance here, and what’s best for you will depend on your workload. Okay, so let’s hop over to our SQL worksheet and adjust the “warehouse_dash” warehouse so that it auto_suspends after 3 minutes, and does not autoresume. All we have to do is run an ALTER WAREHOUSE command, so we type “ALTER WAREHOUSE” and then write the name of the warehouse, “warehouse_dash,” and then “SET AUTO_SUSPEND \= 180” (you specify the number of seconds) followed by “AUTO_RESUME \= FALSE”; Then after we execute that, we can run “SHOW WAREHOUSE” and scroll over to confirm that “warehouse_dash” does indeed auto_suspend after 180 seconds and has auto_resume set to false. Okay, last thing! Sometimes you just want to suspend your warehouse to save credits. You can do this in the UI by clicking on the three dots and clicking “SUSPEND.” Or you can type “ALTER WAREHOUSE” followed by the name of the warehouse, and then “SUSPEND”. Run that, and your warehouse is now suspended! Great job – To recap what we’ve learned about scaling virtual warehouses: We learned what it means to scale a warehouse horizontally and vertically, how to do that, and how that translates into credit consumption. We also learned about the autosuspend and autoresume properties, and how to manually suspend a warehouse. After some practice, we can now put our warehouse skills to use to ingest some pretty awesome data.

2. Let's practice!

Create Your Free Account

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.