Merging and Joining Data
1. Merging and Joining Data
In this video, we’ll talk about mechanisms available for merging and joining data in Fabric.2. Tools for data merging
We'll look at how to perform merges using SQL scripts, Spark Notebooks and Dataflows.3. Basics of data merging
Joining tables is a key operation that allows you to combine rows from two or more tables based on a related column between them. It is a useful operation when building star schemas. For example, you would use a merge when converting a snowflake dimension to a star dimension. In relational databases this merging of tables is known as a JOIN operation between a left table and a right table.4. Basics of data merging
In the example, we use the common column "menu category ID" to join the tables "menu items" and "menu categories", creating a merged table.5. Types of Join
There are several types of join; we'll talk about the four most common types. The INNER JOIN is the simplest type of join. It returns records that have matching values in both tables. You use an inner join when you need only the matching records from both tables. If a record in one table doesn't have a match in the other table you don't want that record to show up in the result.6. Types of Join
The LEFT OUTER JOIN returns all records from the left table, and the matched records from the right table. You use this join when you want the result to be based on all records from the left table, complemented with matched records from the right table.7. Types of Join
The RIGHT OUTER JOIN returns all records from the right table, and the matched records from the left table.8. Types of Join
Finally, the FULL OUTER JOIN returns all records when there is a match in either left or right table. You use this join when you want the result to include all records from both tables, regardless of matches.9. Merging tables with SQL
To join tables using SQL, you use the JOIN statement. The basic syntax specifies the columns you want to be in the result, the type of join and the columns used for the comparison.10. Merging tables with PySpark
To join DataFrames using PySpark, you use the join function. You invoke the function on the left DataFrame, specifying the right Data Frame, the join condition and the type of join.11. Merging tables with Dataflows
In a Dataflow you can use the merge queries transformation to join two queries and produce a merged query. There are two transformations available. 'Merge queries' merges two queries and leaves the results in the first query. 'Merge queries as new merges two queries and leaves the results in a new query. This example shows the merge dialog where you configure the left and right tables, the join column and the type of join.12. Let's practice!
Now, let's do a couple of exercises to practice merging and joining data.Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.