1. Administering a Databricks workspace
Hello again! In this video, we will cover the basics of administering a Databricks workspace in your cloud environment.
2. Account Admin
The top level of administration in Databricks is the Account Admin. Initially, this is the individual who sets up the Databricks account, and they have full control over the Databricks deployment.
Typically, the Account Admin is focused on making sure all the Databricks environments are set up the right way. Their key responsibilities include creating workspaces, adding users to specific workspaces, governing access to those workspaces, and monitoring the account subscription for billing.
3. Account Console
If you are the Account Admin for your Databricks deployment, you will have access to the Account Console. This can be found at the URL on the screen and allows for various account-wide activities. This will be a key location during the initial setup, as we will discuss in later slides in this video.
In the Account Console, you can see things like the full list of workspaces in your account, where your datasets are hosted in the data lakehouse, and how many Databricks Units, or DBUs, your account consumes.
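To make the DBU idea a bit more concrete, here is a minimal sketch of how usage could be rolled up per workspace and turned into a rough cost estimate. The workspace names, usage numbers, and price per DBU are all hypothetical and only there for illustration; actual rates depend on your cloud, pricing tier, and compute type.

```python
from collections import defaultdict

# Hypothetical usage records: (workspace name, DBUs consumed). Not real data.
usage_records = [
    ("analytics-dev", 120.0),
    ("analytics-prod", 340.5),
    ("analytics-dev", 30.0),
]

# Assumed price per DBU in dollars, purely illustrative.
ASSUMED_PRICE_PER_DBU = 0.50


def dbus_by_workspace(records):
    """Sum the DBUs consumed by each workspace."""
    totals = defaultdict(float)
    for workspace, dbus in records:
        totals[workspace] += dbus
    return dict(totals)


def estimated_cost(total_dbus, price_per_dbu=ASSUMED_PRICE_PER_DBU):
    """Estimate cost as DBUs consumed multiplied by the price per DBU."""
    return total_dbus * price_per_dbu


totals = dbus_by_workspace(usage_records)
for workspace, dbus in totals.items():
    print(workspace, dbus, estimated_cost(dbus))
```

The only real relationship here is that cost scales with the number of DBUs consumed; everything else is a placeholder you would replace with figures from your own account.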
4. Account Console - Workspaces
The Workspaces section allows you to see information about all workspaces in your account, and even create new workspaces in any cloud as part of your account.
5. Account Console - Data
The Data section is a single location where you can view and manage your various data catalogs as part of your overall data governance strategy with Unity Catalog.
6. Account Console - Users & Groups
The Users and Groups section provides a single location to manage access for the people in your organization once they have been added to your account.
7. Account Console - Settings
Finally, the Settings section allows you to create account-wide configurations, such as integrating your identity provider or enabling features for your workspaces.
8. Workspace Admin
Next, we have the Workspace Admin. This individual focuses on a specific set of workspaces, usually as part of their business unit within the organization. Their key responsibilities revolve around ensuring their teams have access to the workspace and the needed capabilities to deliver their workloads.
9. Data Plane
Let's assume you have just signed up for a Databricks account and want to start working with it. One of your first activities is creating your first Databricks workspace. A Databricks workspace consists of two distinct pieces. The first component is the "Data Plane", which is made up of the workspace resources deployed directly into your cloud environment. This is where all data, code, and compute resources reside, ensuring that Databricks conforms to your existing cloud security practices.
10. Control Plane
The second component is the "Control Plane", which resides in the Databricks cloud environment. The Control Plane handles back-end processes like security and version updates, and gathers basic metadata about what is happening in your deployment. It also sends requests to the Data Plane in your environment to run jobs, create clusters, and carry out any other activities needed. When users log in to Databricks, they are logging in to the web application that is hosted in the Databricks Control Plane.
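If it helps to picture the split, here is a toy model of the two planes. It is not how Databricks is implemented; the class names, action names, and the example user and cluster are all made up for illustration. It only shows the flow described above: a user talks to the Control Plane, the Control Plane forwards the request to the Data Plane, and the Control Plane keeps only basic metadata about what happened.

```python
from dataclasses import dataclass, field


@dataclass
class DataPlane:
    """Toy stand-in for the resources living in your own cloud environment."""
    clusters: list = field(default_factory=list)
    job_runs: list = field(default_factory=list)

    def handle(self, action, name):
        # Compute is created and jobs are run here, next to the data.
        if action == "create_cluster":
            self.clusters.append(name)
        elif action == "run_job":
            self.job_runs.append(name)
        else:
            raise ValueError(f"unsupported action: {action}")


@dataclass
class ControlPlane:
    """Toy stand-in for the services hosted in the Databricks environment."""
    data_plane: DataPlane
    metadata_log: list = field(default_factory=list)

    def submit(self, user, action, name):
        # Forward the request to the Data Plane, then record basic metadata.
        self.data_plane.handle(action, name)
        self.metadata_log.append((user, action, name))


data_plane = DataPlane()
control_plane = ControlPlane(data_plane)
control_plane.submit("alice", "create_cluster", "etl-cluster")
control_plane.submit("alice", "run_job", "nightly-load")
```

Notice that the cluster and the job run end up in the `DataPlane` object, while the `ControlPlane` only holds a log of who asked for what.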
11. Databricks Platform Architecture
Putting all the pieces together, here is a complete diagram of what the Databricks platform architecture looks like.
There are various approaches to creating a workspace, depending on how you want to go about it. There are UI-based approaches that use the Cloud Service Provider marketplaces or the Databricks Account Console. In larger organizations, teams can also create workspaces programmatically with the Databricks Account API or with infrastructure-as-code tools such as Terraform. These approaches differ in how much they can be automated, but each one produces the same kind of workspace in the end.
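As a rough idea of what the programmatic route can look like, here is a minimal sketch that builds, but does not send, a workspace creation request. It assumes an AWS-hosted account, and the host, endpoint path, and field names are my assumptions about the Account API rather than something shown in this video, so check them against the current API reference before using them. The account ID, token, and configuration IDs are placeholders.

```python
import json
import urllib.request

# Assumed accounts host for an AWS deployment; other clouds use different hosts.
ACCOUNTS_HOST = "https://accounts.cloud.databricks.com"


def build_create_workspace_request(account_id, token, workspace_name,
                                   region, credentials_id,
                                   storage_configuration_id):
    """Build (without sending) a POST request to create a workspace.

    The endpoint path and payload keys are assumptions based on the
    AWS Account API; verify them in the official documentation.
    """
    url = f"{ACCOUNTS_HOST}/api/2.0/accounts/{account_id}/workspaces"
    payload = {
        "workspace_name": workspace_name,
        "aws_region": region,
        "credentials_id": credentials_id,
        "storage_configuration_id": storage_configuration_id,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; nothing is sent over the network here.
request = build_create_workspace_request(
    account_id="my-account-id",
    token="my-token",
    workspace_name="analytics-dev",
    region="us-west-2",
    credentials_id="my-credentials-id",
    storage_configuration_id="my-storage-config-id",
)
print(request.get_method(), request.full_url)
```

To actually submit it, you would pass the request to `urllib.request.urlopen` with real credentials. Tools like Terraform wrap the same kind of call in a declarative configuration.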
12. Let's review!
Let's review some of these key concepts to get our Databricks environment set up!