Table showdown

1. Table showdown

Welcome back! In this video, we'll cover the two main types of tables: managed and unmanaged. We'll walk through creating, storing, and deleting each type in the Databricks UI to show how they differ.

To begin, we open a new Databricks notebook and make sure the notebook language is set to SQL. In the first cell, we type a `CREATE TABLE` statement that does not specify a storage location. Once the query is in the cell, we click Run, on the left side of the cell, to execute it. This creates a managed table named `managed_table_example` in Databricks. Managed tables are fully controlled by Databricks, meaning the data and its storage location are managed automatically. You don't need to specify a storage path, because Databricks places the data in its default storage location.

Let's see what happens when we delete this managed table. Before deleting, you can optionally run a `SELECT *` on the table in a new cell to see its contents. To delete the table, we run the `DROP TABLE` command followed by the table name, again by clicking Run. You'll notice a confirmation that the table has been dropped. Databricks removes the table and all underlying data associated with it. Managed tables are ideal for quick setups where you want Databricks to handle data storage and cleanup automatically, which keeps data management simple and centralized.

Now, let's create an unmanaged table to understand the differences. With an unmanaged table, you specify the data's storage location yourself, which gives you more control over where the data is stored. In a new notebook cell, we use the same `CREATE TABLE` command as before, but to create an unmanaged table we always need to provide a location. Once the command is ready, click Run to execute it.
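
The queries themselves appear on screen in the video rather than in this transcript, so here is a minimal sketch of what the cells so far might contain. The column definitions (`id`, `name`), the sample rows, and the storage path `/mnt/data/unmanaged_table_example` are assumptions made for illustration. Only the table names and the use of `CREATE TABLE`, `SELECT *`, `DROP TABLE`, and a location come from the walkthrough.

```sql
-- Managed table: no LOCATION clause, so Databricks chooses the storage path.
-- The columns and sample rows are hypothetical.
CREATE TABLE managed_table_example (
  id INT,
  name STRING
);

INSERT INTO managed_table_example VALUES (1, 'Alice'), (2, 'Bob');

-- Optional: inspect the contents before deleting the table.
SELECT * FROM managed_table_example;

-- Dropping a managed table removes the table and its underlying data.
DROP TABLE managed_table_example;

-- Unmanaged table: the LOCATION clause is what makes it unmanaged.
-- The path is a hypothetical placeholder; use a location your workspace can write to.
CREATE TABLE unmanaged_table_example (
  id INT,
  name STRING
)
USING DELTA
LOCATION '/mnt/data/unmanaged_table_example';
```

Each statement can go in its own notebook cell, as in the walkthrough. Running `DESCRIBE TABLE EXTENDED` on a table before dropping it should show whether Databricks treats it as `MANAGED` or `EXTERNAL`.
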
This `CREATE TABLE` command creates an unmanaged table called `unmanaged_table_example` and stores its data in the specified location. This flexibility is helpful if you need to control where data resides for compliance, cost management, or performance reasons.

Let's see what happens when we delete this unmanaged table. In a new cell, we type the same `DROP TABLE` command we used before, this time with the unmanaged table's name, and click Run. Databricks removes the table definition, but the data files remain in the location we specified.

In summary, Databricks managed tables automate data and storage cleanup upon deletion, which is ideal for centralized management. Unmanaged tables provide control over the storage location and retain their data files after the table is dropped, which is useful when the data must remain accessible independently of the table.
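
The difference on deletion can be checked with a sketch like the following. It assumes the unmanaged table was created in Delta format at the hypothetical path `/mnt/data/unmanaged_table_example`; neither the path nor the format is stated in the video.

```sql
-- Drop the unmanaged table: only the table definition is removed.
DROP TABLE unmanaged_table_example;

-- The data files are still at the location, so they can be read by path.
-- Assumes Delta format and the hypothetical path above.
SELECT * FROM delta.`/mnt/data/unmanaged_table_example`;

-- The same files can be registered again as a table without copying any data.
CREATE TABLE unmanaged_table_example
USING DELTA
LOCATION '/mnt/data/unmanaged_table_example';
```

Trying the same path-based query after dropping a managed table would not be expected to work, since Databricks cleans up the underlying data along with the table.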

2. Let's practice!
