Overview of Databricks SQL
1. Overview of Databricks SQL
Hello! In this video, we will discuss Databricks SQL and how this platform feature fits into your data architecture.
2. Lakehouse for all workloads
The lakehouse architecture is designed to satisfy the requirements of all data workloads. One of these types of workloads is data warehousing. Traditionally, these workloads consist of SQL queries or Business Intelligence reports that are critical for business operations, and they have been supported by data warehouses.
3. Databricks for SQL Users
For the Databricks Lakehouse platform, these use cases are satisfied by the Databricks SQL component. Databricks SQL brings the data warehouse into the lakehouse and provides a very familiar interface for SQL analysts who have used other data warehouses. To further enhance these warehousing capabilities, Databricks has optimized the compute performance for SQL workloads with an engine called Photon. Photon allows for the seconds-scale latency that SQL users have come to expect, all while providing Databricks' cost performance at scale. Users can use Databricks SQL directly in the platform or connect it to their favorite Business Intelligence tool, like Power BI or Tableau. The best part? This comes built into the platform directly! No additional configuration is needed.
4. Databricks SQL vs. other databases
You may be asking yourself, what is so different about Databricks SQL? What about the other kinds of data warehouses out there? These are valid questions, so let's dive into them. Let us talk about Databricks SQL versus Postgres databases. Note that these differences apply to most other data warehouses on the market. Databricks SQL, being part of the lakehouse architecture, is built to operate on top of an open file format, which means that other data processes can access the same data as your BI reports. Most other data warehouses use a proprietary data format, which means any data process has to go through that data warehouse's compute to access the data.
5. Databricks SQL vs. other databases
Databricks SQL can also separate your compute power and storage, which provides flexibility from an architectural perspective. While some newer data warehouses can do this, most traditional data warehouses have compute and storage closely linked, meaning that if you need more compute power, you need to pay for more storage, and vice versa.
6. Databricks SQL vs. other databases
Databricks SQL is based on ANSI SQL, an open standard for the SQL language that works across many different systems. Many data warehouses use their own specific flavor of SQL, such as PL/pgSQL for Postgres or T-SQL for SQL Server. These languages generally only work on their specific platform.
7. Databricks SQL vs. other databases
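To make the ANSI SQL point concrete, here is a minimal sketch of a query written with only standard constructs (SELECT, SUM, GROUP BY, HAVING, ORDER BY). It runs on SQLite from the Python standard library purely as a stand-in engine; it does not connect to Databricks SQL, and the `orders` table and its values are made up for illustration. The idea is that a query like this should need little or no rewriting when moved between engines that follow the standard.

```python
import sqlite3

# Stand-in engine: SQLite from the standard library, NOT Databricks SQL.
# The table name, columns, and rows are hypothetical example data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region VARCHAR(20), amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EMEA", 120), ("EMEA", 80), ("APAC", 200), ("AMER", 50)],
)

# Only standard SQL constructs, no vendor-specific functions or syntax.
PORTABLE_QUERY = """
    SELECT region, SUM(amount) AS total_amount
    FROM orders
    GROUP BY region
    HAVING SUM(amount) >= 100
    ORDER BY region
"""

totals = conn.execute(PORTABLE_QUERY).fetchall()
print(totals)
```

Vendor-specific features, such as procedural extensions, are where queries typically stop being portable.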
Finally, Databricks SQL can leverage the work of other workloads in the Databricks platform, such as data engineering or data science, allowing SQL analysts to have access to more advanced analytics, which other data warehouses generally lack.
8. SQL in the Lakehouse Architecture
To put this all in perspective, let's look at the lakehouse medallion architecture, which provides a high-level overview of how data goes from raw to an analytics-ready dataset. In the medallion architecture, data starts in a raw format in the Bronze tables, is cleaned and transformed into the Silver tables, and is then aggregated into the Gold tables. Typically, Databricks SQL will be used on the Gold-level data, which is aggregated for specific business questions or problems. By incorporating data warehousing use cases with the rest of the Databricks platform, customers can have fresh, reliable data that will scale efficiently.
9. Let's review!
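As a rough illustration of the Bronze, Silver, and Gold flow described in the medallion architecture, the sketch below builds three small tables with SQL. It again uses SQLite from the Python standard library as a stand-in, not Databricks SQL, and the table names (`bronze_sales`, `silver_sales`, `gold_sales_by_region`), the cleaning rules, and the sample rows are all assumptions chosen for the example.

```python
import sqlite3

# Stand-in engine: SQLite, NOT Databricks SQL. All names and rows are hypothetical.
conn = sqlite3.connect(":memory:")

# Bronze: raw data as it arrived, with a duplicate row, messy text, and a missing amount.
conn.execute("CREATE TABLE bronze_sales (order_id INTEGER, region TEXT, amount TEXT)")
conn.executemany(
    "INSERT INTO bronze_sales VALUES (?, ?, ?)",
    [
        (1, " emea", "100"),
        (1, " emea", "100"),  # exact duplicate
        (2, "APAC", "250"),
        (3, "emea ", None),   # missing amount
        (4, "apac", "50"),
    ],
)

# Silver: cleaned and transformed (deduplicated, trimmed, typed, incomplete rows dropped).
conn.execute("""
    CREATE TABLE silver_sales AS
    SELECT DISTINCT order_id,
           UPPER(TRIM(region)) AS region,
           CAST(amount AS INTEGER) AS amount
    FROM bronze_sales
    WHERE amount IS NOT NULL
""")

# Gold: aggregated for one specific business question (sales per region).
conn.execute("""
    CREATE TABLE gold_sales_by_region AS
    SELECT region, SUM(amount) AS total_amount, COUNT(*) AS order_count
    FROM silver_sales
    GROUP BY region
""")

silver_rows = conn.execute("SELECT COUNT(*) FROM silver_sales").fetchone()[0]
gold = conn.execute(
    "SELECT region, total_amount, order_count FROM gold_sales_by_region ORDER BY region"
).fetchall()
print(silver_rows, gold)
```

A BI report would then query only the small Gold table, which matches the statement above that Databricks SQL is typically used on Gold-level data.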
With that, let us review some of the main concepts surrounding Databricks SQL.