
Platform Architecture

1. Platform architecture

Where does all of this actually run? Let's peek under the hood - what Databricks manages for you and what lives in your cloud account.

2. The hotel analogy

Think of it like staying at a hotel. The control plane is the front desk - managed by Databricks, handling bookings, coordination, and requests. The data plane is your hotel room - it's in your cloud account, where your stuff lives and your work actually happens. Databricks manages the front desk so you can focus on what's in your room. Let's break down what each one does.

3. The control plane

The control plane is the Databricks-managed backend. It handles the workspace interface you log into every day, the notebook environment where you write code, job scheduling that runs your pipelines on time, and cluster management that spins compute up and down. You interact with all of this through the Databricks UI or API, but the infrastructure behind it is managed entirely by Databricks.
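To make "through the Databricks UI or API" a little more concrete, here is a minimal sketch of preparing a request to the control plane that lists clusters. It assumes the REST endpoint `/api/2.0/clusters/list` and bearer-token authentication. The workspace URL, the token, and the helper name `build_list_clusters_request` are hypothetical placeholders. The request is only built, not sent.

```python
import urllib.request


def build_list_clusters_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request asking the control plane for the cluster list."""
    # Assumption: the workspace exposes the clusters list endpoint at this path.
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Hypothetical workspace URL and token, for illustration only.
req = build_list_clusters_request(
    "https://example-workspace.cloud.databricks.com/", "dapi-EXAMPLE"
)
print(req.get_method(), req.full_url)
```

Sending it would be a call to `urllib.request.urlopen(req)` with a real workspace URL and token. The request goes to the Databricks-managed side, which then manages compute on your behalf.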

4. The data plane

The data plane is where the real action happens, and it lives in your cloud account - whether that's Azure, AWS, or GCP. Your compute clusters run here, so your data is processed inside your own environment. Your data stays in your own cloud storage, and you control the networking and security settings. This separation is critical for compliance - your sensitive data is stored and processed in your account rather than in Databricks' infrastructure.

5. What lives where?

Here's how it all comes together. When you run a query, the control plane tells a cluster what to do, but the cluster runs in your cloud and reads data from your storage. The results flow back through the control plane to your notebook. The data itself stays in your environment - the control plane coordinates the work and relays results, while storage and processing happen in your cloud account.
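The split described so far can be written down as a small lookup table. This is only an illustrative sketch. The component labels are taken from the descriptions above, and the names `Plane`, `COMPONENT_PLANE`, and `plane_of` are made up for this example, not part of any Databricks API.

```python
from enum import Enum


class Plane(Enum):
    CONTROL = "control plane (managed by Databricks)"
    DATA = "data plane (your cloud account)"


# Illustrative mapping, based only on the components named in this lesson.
COMPONENT_PLANE = {
    "workspace interface": Plane.CONTROL,
    "notebook environment": Plane.CONTROL,
    "job scheduling": Plane.CONTROL,
    "cluster management": Plane.CONTROL,
    "compute clusters": Plane.DATA,
    "cloud storage": Plane.DATA,
    "networking and security settings": Plane.DATA,
}


def plane_of(component: str) -> Plane:
    """Look up which plane a component belongs to (case-insensitive)."""
    key = component.strip().lower()
    if key not in COMPONENT_PLANE:
        raise ValueError(f"unknown component: {component!r}")
    return COMPONENT_PLANE[key]


for name in ("Job scheduling", "Compute clusters"):
    print(f"{name}: {plane_of(name).value}")
```

Note the difference between cluster management, which is coordination, and the compute clusters themselves, which run in your cloud account.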

6. Audit logs and system tables

Databricks logs activity in your workspace - who created a cluster, who ran a query, who accessed a table. These audit logs are exposed as system tables that you can query with standard SQL for compliance reporting, security investigations, or simply understanding how your workspace is being used. You can also have audit logs delivered to storage in your own cloud account, so the records you rely on for compliance stay under your control.
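As a rough sketch of what querying audit logs with SQL might look like, the helper below builds a query string. It assumes the audit system table is named `system.access.audit` and has the columns `event_time`, `event_date`, `user_identity.email`, `service_name`, and `action_name`. Check these against your own workspace before relying on them. The helper name `build_audit_query` and the example action name are hypothetical.

```python
def build_audit_query(action_name: str, days: int = 7) -> str:
    """Build a SQL query for recent audit events with the given action name."""
    if days <= 0:
        raise ValueError("days must be positive")
    # Escape single quotes so the value is safe inside a SQL string literal.
    safe_action = action_name.replace("'", "''")
    return (
        "SELECT event_time, user_identity.email AS user_email, "
        "service_name, action_name\n"
        "FROM system.access.audit\n"  # assumed table name
        f"WHERE action_name = '{safe_action}'\n"
        f"  AND event_date >= date_sub(current_date(), {int(days)})\n"
        "ORDER BY event_time DESC"
    )


# Hypothetical action name, for illustration only.
query = build_audit_query("create", days=30)
print(query)

# In a Databricks notebook, where a `spark` session is already defined,
# the query could then be run with: spark.sql(query)
```

Filtering on the date column keeps the scan limited to recent events, which matters because audit tables grow quickly.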

7. Summary

Let's recap. The Databricks platform splits into two planes. The control plane is managed by Databricks and handles the workspace, scheduling, and coordination. The data plane lives in your cloud account, where your clusters run and your data is stored. Audit logs are available as system tables you can query with SQL. This separation means your data stays secure and under your control. Let's practice identifying what belongs where.

8. Let's practice!

Time to test your knowledge. In the next exercises, you'll classify platform components into the control plane and data plane.
