1. Extracts
Welcome back! We're going to change gears and discuss the various file types in Tableau, specifically extracts. This is important because there are many ways to save your work and data in Tableau. Knowing your options will enable you to save and share your work quickly and effectively without losing any important components.
2. Tableau file types
There are several ways you can save your work in Tableau. Let's begin with packaged workbooks, which have the extension .twbx. These are the files you've been opening throughout the exercises. A packaged workbook is a single file that contains your workbook along with any supporting files, including datasets and images. Basically, the workbook and any local files it uses are compressed into one package.
Then there are plain workbooks, with the extension .twb, which contain only the workbook itself. Any supporting files are linked but not contained within the .twb file. If you're sharing work with others who don't have access to the original data, then a packaged workbook is the way to go. Otherwise, the plain workbook is a good option to keep your file size small.
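Because a packaged workbook is a compressed archive (in practice, a standard zip file), you can peek inside one with ordinary tools. The sketch below is only an illustration: it builds a tiny stand-in archive in memory, since no real workbook is available here, and lists what it holds. The file names `sales.twb` and `Data/orders.csv` and the helper `list_packaged_contents` are made up for this example.

```python
import io
import zipfile

def list_packaged_contents(archive):
    """Return the names of the files stored inside a zip-style archive."""
    with zipfile.ZipFile(archive) as zf:
        return zf.namelist()

# Build a stand-in for a .twbx in memory (hypothetical file names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("sales.twb", "<workbook/>")              # the plain workbook
    zf.writestr("Data/orders.csv", "id,amount\n1,10\n")  # a supporting data file

buffer.seek(0)
contents = list_packaged_contents(buffer)
print(contents)
```

With a real file, you would pass its path to the same helper instead of the in-memory buffer.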
Then there are extracts, which can be identified by either the .tde or .hyper file extension. The .hyper format is used in newer versions of Tableau (10.5 and later), and in those versions you'll find that .tde files are generally upgraded to .hyper automatically. An extract is a local copy of part or all of the data. This will be the focus of this video.
Finally, we have data source files, which include information on how to connect to the data source and any modifications made to the data, like calculated fields and groups. We will cover these later in the course.
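As a quick recap of the file types above, here is a small lookup sketch in Python. Note the assumptions: the .tds extension for data source files is not named in this video and is filled in here from Tableau's usual naming, and the helper `describe_tableau_file` is hypothetical.

```python
from pathlib import Path

# Extension -> short description, following the file types covered above.
# ".tds" is not named in the video; it is assumed from Tableau's usual naming.
TABLEAU_FILE_TYPES = {
    ".twbx": "packaged workbook (workbook plus supporting local files)",
    ".twb": "plain workbook (supporting files are linked, not included)",
    ".hyper": "extract (newer format)",
    ".tde": "extract (older format)",
    ".tds": "data source file (connection info and data modifications)",
}

def describe_tableau_file(filename):
    """Hypothetical helper: describe a file based on its extension."""
    suffix = Path(filename).suffix.lower()
    return TABLEAU_FILE_TYPES.get(suffix, "not a file type covered here")

for name in ["sales.twbx", "sales.twb", "orders.hyper", "notes.txt"]:
    print(f"{name}: {describe_tableau_file(name)}")
```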
3. Extracts
So why would we want to use an extract, especially if we can just keep everything in a packaged workbook? Well, there are many benefits to extracts.
Extracts improve performance when saving, loading, and interacting with data. They support very large datasets: you can quickly create extracts that contain billions of rows of data. You shouldn't try to save a packaged workbook with that much data, because it would be slow and massive in size, all just for a workbook! Additionally, extracts leverage Hyper, Tableau's database engine, which generally operates faster than working with the original data.
Extracts also allow you to retain the data prep work you've done to enrich and extend the original dataset. That includes how you've combined multiple datasets with joins, unions, and relationships, as well as any changes you've made to fields, such as converting a measure to a dimension.
Finally, if you need to work offline, which may cut off your access to the original data files, you can use an extract to have a snapshot of your data readily available.
4. Live connections
On the Data Source page in Tableau Desktop, you have the option to choose between an Extract and a Live connection.
With a live connection, the data source connects directly to the underlying data, whether that means pulling from the original Excel workbook or, in many enterprise situations, pulling from the company's data warehouse, like a Redshift database.
5. Comparison
It's useful to know when to use which. Because live connections go straight to the source, they are updated in real time. If there's a new customer order, it will show up in Tableau. With extracts, on the other hand, a refresh needs to be triggered before new data appears. So if real-time data is important, a live connection is the better choice.
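The difference in freshness can be illustrated with a loose analogy in Python. This is not how Tableau works internally: an in-memory SQLite table stands in for the source database, a "live" read queries it every time, and an "extract" is a snapshot that only changes when it is refreshed. All names here are made up for illustration.

```python
import sqlite3

# Stand-in for the original data source (hypothetical orders table).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.execute("INSERT INTO orders VALUES (1, 10.0)")

def live_count():
    """'Live': query the source every time, so results are always current."""
    return source.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

class ExtractSnapshot:
    """'Extract': a local copy that only changes when refresh() is called."""
    def __init__(self):
        self.rows = []
        self.refresh()

    def refresh(self):
        self.rows = source.execute("SELECT id, amount FROM orders").fetchall()

    def count(self):
        return len(self.rows)

extract = ExtractSnapshot()

# A new customer order arrives in the source.
source.execute("INSERT INTO orders VALUES (2, 25.0)")

live_now = live_count()      # the live read sees the new order
stale = extract.count()      # the snapshot still holds the old data
extract.refresh()
refreshed = extract.count()  # after a refresh, the snapshot has caught up
print(live_now, stale, refreshed)
```

The snapshot only picks up the new order after `refresh()` is called, which mirrors why an extract needs a refresh to show new data.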
Because a live connection requires querying an external database or external data files, performance tends to be slower compared to extracts. Extracts, because they bundle up the prepped data in one place and use Hyper, tend to be faster. This is especially true if your live connection depends on a network connection. It is also why extracts make offline work easier.
6. Let's practice!
Ok, let's practice!