
Hydrating the lakehouse

1. Hydrating the lakehouse

In this scenario, I’ll continue in my role as a data analyst for a large coffee retail company. I’ve been handed several data files to analyze and want to ingest them into Databricks so I can use the platform’s tools on them.

I start by heading to the Data Ingestion section of Databricks, where I can upload one of my CSV files to get a quick look at the data. Using the GUI, I create a table from the *domestic_consumption* file, which contains columns showing total coffee consumption by origin and year. This is a straightforward, user-friendly way to get started, and I can already see that this data will be useful for future analysis.

To speed things up, I decide to take a more programmatic approach for the rest of the files. I switch to the SQL Editor pane and write a script using the `COPY INTO` command. This lets me create tables from the files I’ve already uploaded to a Databricks Volume. Finding the file paths is easy: just open the Catalog Explorer and copy them directly from the catalog pane on the left-hand side. With this script, I can create a table for each file in just a few minutes. Once the script runs, each table is populated with data from its respective file.

I jump back to the Catalog Explorer to check out my newly created tables. It’s great to see everything organized. I can view an overview of each table’s columns and even preview some sample data directly in the interface.

Now that I’ve ingested enough data, I’m ready to start building out a more comprehensive data model and expanding on the initial analysis I’ve done. This is where things get exciting! In the upcoming exercises, you’ll get hands-on experience ingesting data using a variety of techniques, laying the foundation for your own data model and analyses. Let’s dive in and explore all the possibilities together!
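The transcript does not show the `COPY INTO` script itself, so here is a minimal sketch of what it might look like. It uses Python to generate one `CREATE TABLE IF NOT EXISTS` plus `COPY INTO` pair per file, which you could then paste into the SQL Editor. The Volume path and the file names are hypothetical placeholders, and the format options assume each CSV has a header row and that letting Databricks infer the schema is acceptable.

```python
from pathlib import PurePosixPath

# Hypothetical Volume directory; replace it with the path copied from Catalog Explorer.
VOLUME_DIR = "/Volumes/my_catalog/my_schema/my_volume"

# Hypothetical file names standing in for the remaining uploaded CSV files.
FILES = ["coffee_exports.csv", "coffee_imports.csv"]


def copy_into_statements(volume_dir: str, file_name: str) -> list[str]:
    """Build the SQL to load one CSV file into a table named after the file."""
    table = PurePosixPath(file_name).stem  # "coffee_exports.csv" -> "coffee_exports"
    source = f"{volume_dir.rstrip('/')}/{file_name}"
    # An empty placeholder table; COPY INTO fills in the schema via mergeSchema.
    create = f"CREATE TABLE IF NOT EXISTS {table};"
    copy = (
        f"COPY INTO {table}\n"
        f"FROM '{source}'\n"
        "FILEFORMAT = CSV\n"
        # Assumption: the files have a header row and types can be inferred.
        "FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')\n"
        "COPY_OPTIONS ('mergeSchema' = 'true');"
    )
    return [create, copy]


if __name__ == "__main__":
    for name in FILES:
        print("\n".join(copy_into_statements(VOLUME_DIR, name)))
        print()
```

Deriving the table name from the file name keeps the sketch short; in practice you may prefer explicit table names, or a fully qualified `catalog.schema.table` target.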

2. Let's practice!
