1. Managing Data Catalogs
In this scenario, I will assume the role of a data analyst at Amazon. I have just received a new dataset from one of our other analyst teams, and they want to start querying their data from Databricks. Starting out in the Databricks UI, I will navigate to the Catalog Explorer to see what data is available to us. Looking through the catalogs, I can confirm that this dataset does not already exist in our Unity Catalog implementation. There are a number of ways to create tables from my data, both through the UI and programmatically. In this case, I have already uploaded the data to the Databricks File System, or DBFS, so I will use code to create a couple of tables. I can use this command to take a look at the folder where my files are hosted. I see that I have a folder of Parquet files that represents my data table. Reading this data into a DataFrame, I can see that it relates to product reviews on the online platform. In the next notebook cell, I can create tables in this catalog based on the Parquet files that I have. Databricks is then able to read those files, infer the data structure, and create new Delta tables in Unity Catalog. To verify that my tables are now in Databricks, I will query them with a simple SQL query. And just like that, I have my data in Delta tables and am ready to use Databricks to fuel my analytics. Let's practice creating some tables in the following exercises.

2. Let's practice!
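The create-and-verify steps described above can be sketched in Databricks SQL. This is a notebook sketch, not the exact code from the video: the DBFS path and the three-level table name (catalog.schema.table) are hypothetical stand-ins, and the earlier folder inspection would typically use `dbutils.fs.ls` in a Python cell.

```sql
-- Create a Delta table in Unity Catalog from the uploaded Parquet files.
-- Databricks reads the files and infers the schema automatically.
-- The path and the name main.reviews.product_reviews are illustrative.
CREATE TABLE IF NOT EXISTS main.reviews.product_reviews
AS SELECT * FROM parquet.`dbfs:/FileStore/product_reviews/`;

-- Verify the new table with a simple query.
SELECT * FROM main.reviews.product_reviews LIMIT 10;
```

In Unity Catalog, the three-level name pins the table to a specific catalog and schema, so other analyst teams can find and query it without knowing where the underlying files live.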