Databricks SQL key assets

1. Databricks SQL key assets

In this video, we will be going over some of the key concepts and assets of Databricks SQL.

2. Helpful analogy

Databricks SQL is a comprehensive data warehousing platform made up of several components. Think of it like a tree: while you can view the tree as a whole, it’s also made up of distinct parts such as roots, branches, and leaves. Similarly, Databricks SQL consists of various elements that together form the complete system.

3. Query

We’ll begin with queries, the core of Databricks SQL. A query supplies the logic that a compute cluster executes to process data. Databricks SQL follows the ANSI SQL standard, a widely adopted industry standard. With these queries, you can process data from various sources, including Unity Catalog tables, Delta tables, and even raw file formats.
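To make the idea of a query concrete, here is a minimal sketch of a standard SQL aggregation. It assumes an in-memory SQLite database as a local stand-in for a SQL Warehouse, and the `sales` table and its columns are hypothetical; on Databricks, the same `SELECT` would typically run against a Unity Catalog or Delta table instead.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for a SQL Warehouse so the
# example runs anywhere. The `sales` table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# A standard SQL aggregation: total sales per region, ordered by region.
query = """
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    GROUP BY region
    ORDER BY region
"""
totals = conn.execute(query).fetchall()
print(totals)  # [('east', 150.0), ('west', 250.0)]
```

The query text itself is the portable part: it uses only standard `SELECT`, `GROUP BY`, and `ORDER BY` clauses, so the logic stays the same regardless of which engine executes it.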

4. SQL Warehouse

To run your queries, you’ll need a SQL Warehouse, a compute cluster specifically designed for SQL processing. While SQL Warehouses share components with other Databricks clusters, they are optimized for SQL and BI workloads. They include features like Photon, an enhanced compute engine tailored for SQL tasks. These warehouses are easy to manage, offer advanced scaling capabilities, and integrate seamlessly with your existing BI tools.

5. Tables versus views

When working with data in Databricks SQL, there are two main ways to store and expose data: tables and views. The first, and most familiar, is the table, which is a physical representation of a dataset written in the Delta Lake format. Tables are accessible within the platform through Unity Catalog, as well as from outside the platform. They also offer optimization options such as partitioning and data sorting.

6. Tables versus views

Views are virtual tables that represent the results of a query, and they can be accessed through Unity Catalog. They simplify complex queries, like those with multiple joins or filters, and offer fast read performance for downstream users. Materialized views go a step further by physically storing the query results, providing even faster performance. They also support incremental data processing, updating only with new or changed data, making them ideal for real-time analytics and improving query efficiency.
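The difference between a view and a materialized view can be sketched in a few lines. This example again assumes in-memory SQLite as a stand-in, with a hypothetical `orders` table. SQLite has no materialized views, so the stored result is emulated with a snapshot table and a hypothetical `refresh_paid_totals` helper that recomputes everything; it does not reproduce the incremental refresh described above.

```python
import sqlite3

# Assumption: in-memory SQLite as a stand-in; `orders` is a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (status, amount) VALUES (?, ?)",
    [("paid", 20.0), ("open", 5.0), ("paid", 30.0)],
)

# A view stores only the query definition; it is re-evaluated on every read.
conn.execute(
    "CREATE VIEW paid_orders AS "
    "SELECT id, amount FROM orders WHERE status = 'paid'"
)

def refresh_paid_totals(conn):
    """Emulate a materialized view by storing the query result in a table.

    Hypothetical helper: this is a full recompute, not an incremental refresh.
    """
    conn.execute("DROP TABLE IF EXISTS paid_totals")
    conn.execute(
        "CREATE TABLE paid_totals AS "
        "SELECT COUNT(*) AS n, SUM(amount) AS total FROM paid_orders"
    )

refresh_paid_totals(conn)

# New data arrives in the base table.
conn.execute("INSERT INTO orders (status, amount) VALUES ('paid', 50.0)")

# The view reflects the new row immediately.
view_count = conn.execute("SELECT COUNT(*) FROM paid_orders").fetchone()[0]

# The stored result is stale until it is refreshed.
stale = conn.execute("SELECT n, total FROM paid_totals").fetchone()
refresh_paid_totals(conn)
fresh = conn.execute("SELECT n, total FROM paid_totals").fetchone()

print(view_count, stale, fresh)  # 3 (2, 50.0) (3, 100.0)
```

The trade-off this shows is the general one: a view is always current but pays the query cost on each read, while a stored result reads quickly but is only as fresh as its last refresh.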

7. Visualizations and dashboards

On top of all the queries and datasets, Databricks SQL users can create visualizations and dashboards. Visualizations are graphical representations of your query results, and you can create them directly from any single query. Databricks offers familiar chart types like bar, line, and scatter plots. Dashboards, on the other hand, combine multiple visualizations, allowing you to display insights from various queries or datasets in one cohesive view.

8. Let's practice!

Now, let's review and practice what we've learned about Databricks SQL.