1. What is Fabric?
In this video, we’ll learn what Microsoft Fabric is used for and the personas at a company that typically use it.
2. A typical data workflow
Imagine a data workflow at a video streaming company keen on analyzing customer viewing habits to make informed decisions about future content.
3. A typical data workflow
A traditional workflow might look like this:
Raw data about viewer habits, such as views, likes, and watch duration, is collected and stored in an unstructured format.
4. A typical data workflow
Data Engineers then process this raw data through pipelines to store it in tables for efficient querying and analysis.
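As a rough illustration, one step of such a pipeline might look something like the PySpark sketch below; the file path, column names, and table name are invented for this example.

```python
# Hypothetical sketch of one pipeline step: raw JSON viewing events are
# cleaned up and saved as a table that can be queried efficiently.
# The path, column names, and table name are invented for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("viewing-habits-pipeline").getOrCreate()

# Read the raw, loosely structured event dumps.
raw = spark.read.json("raw/viewing_events/*.json")

# Keep only complete events and derive a numeric watch-duration column.
clean = (
    raw.dropna(subset=["viewer_id", "title_id"])
       .withColumn("watch_minutes", F.col("watch_seconds") / 60)
)

# Store the result as a table for downstream querying and analysis.
clean.write.mode("overwrite").saveAsTable("viewing_events")
```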
5. A typical data workflow
Data Scientists then use these tables to analyze the data and build machine learning models with tools like TensorFlow or PySpark.
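To make that concrete, a minimal PySpark sketch of this step might look like the following; the table and column names, and the idea of predicting whether a viewer finishes a title, are all invented for illustration.

```python
# Purely illustrative sketch: training a simple model on the processed table.
# The table name and columns (watch_minutes, liked, finished) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.getOrCreate()
events = spark.read.table("viewing_events")

# Combine a couple of numeric columns into the feature vector Spark ML expects.
assembler = VectorAssembler(inputCols=["watch_minutes", "liked"], outputCol="features")
training = assembler.transform(events).withColumn("label", F.col("finished").cast("double"))

# Fit a basic classifier that predicts whether a viewer finishes a title.
model = LogisticRegression(maxIter=10).fit(training)
```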
6. A typical data workflow
The insights are communicated via dashboards and reports built in a tool like Power BI.
7. Issues with the typical data workflow
Data flows through various platforms and tools at every stage of this process, which could result in many issues. For example, some team members might not have access to the tools they need. Or the tool they do have access to might not be compatible with their preferred programming language.
In short, this process is messy and complicated.
8. Fabric's solution
This is the problem that Fabric tries to solve. Fabric is an end-to-end data analytics platform that integrates all parts of a data workflow into a single unified experience. A data engineer might set up pipelines to move data into Fabric. Database Administrators can secure and govern this data regardless of which team is using it. Data Scientists can use SQL or Python inside Fabric to create models. BI Developers can use Power BI inside Fabric to create dashboards or reports. As seen in this image, the experience is unified across all of these functions.
9. Fabric's Workloads
Fabric can be used by various personas, and it is divided into several workloads. Data engineers might work primarily in the Data Factory workload, while data scientists will find relevant tools in the Data Science workload.
10. Fabric's Workloads
We’ll explore these workloads in more detail in upcoming videos, but it’s essential to understand that each workload uses the same data source, OneLake.
Data stored in OneLake is in a format called Delta-Parquet.
11. All data stored in OneLake
All workloads use the same data source.
12. All data stored in OneLake
For example, a Data Engineer using the Data Warehouse workload might write SQL to interact with a warehouse. That SQL runs through the T-SQL engine, which reads and writes the Delta-Parquet data stored in OneLake.
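Purely as a hedged illustration, the kind of query involved might look like this when sent from Python to a warehouse's SQL connection endpoint; the server, database, and table names are made up, and in practice the SQL would typically be written directly in the Fabric warehouse editor.

```python
# Hypothetical sketch: sending plain T-SQL to a Fabric warehouse's SQL endpoint.
# The server, database, and table names below are invented for illustration.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=myworkspace.datawarehouse.fabric.microsoft.com;"  # hypothetical endpoint
    "Database=streaming_wh;"
    "Authentication=ActiveDirectoryInteractive;"
)

# The query itself is ordinary T-SQL; the warehouse engine reads the same
# Delta-Parquet data that lives in OneLake.
top_titles = conn.execute(
    """
    SELECT TOP 5 title_id, SUM(watch_minutes) AS total_minutes
    FROM viewing_events
    GROUP BY title_id
    ORDER BY total_minutes DESC;
    """
).fetchall()
```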
13. All data stored in OneLake
Meanwhile, a data scientist might write Python code in the Data Engineering workload. That code runs through a different engine, Spark, which lets Python read and write the same Delta-Parquet data in OneLake.
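A minimal sketch of what that might look like in a Fabric notebook, assuming the same hypothetical viewing_events table from earlier:

```python
# Minimal sketch: Spark reads the same Delta-Parquet data in OneLake that the
# warehouse query above used. The table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.read.table("viewing_events")

# The same "top titles by total minutes watched" question, answered in Python.
(events.groupBy("title_id")
       .agg(F.sum("watch_minutes").alias("total_minutes"))
       .orderBy(F.desc("total_minutes"))
       .show(5))
```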
14. All data stored in OneLake
The key takeaway is this: no matter what tool you use in Fabric, you’ll interact with the same data as everyone else. Different teams can choose the tool that best fits their use case or skill set. The data can be managed in a single unified experience.
This single unified experience is made possible through OneLake and the Delta-Parquet file format.
15. Let's practice!
But enough talk. Let's get some experience working with Fabric!