Complex JSON Processing
1. Complex JSON processing
In this last video, we will explore some more complex JSON structures and how to handle them.
2. Complex JSON data
Real-world JSON often contains nested objects, arrays, and mixed data types that challenge the simple parsing approaches we saw in the last video. Now, we'll explore advanced JSON processing techniques using Tablesaw's JsonReader options and custom parsing strategies. These skills are essential for handling APIs, configuration files, and complex data sources.
3. Nested JSON objects
Nested JSON contains objects within objects, creating hierarchical data structures. This concept is referred to as "nesting", and you may have met it before in the form of "nested loops". Tablesaw can flatten simple nested structures automatically. Flattening means lifting the nested fields up to the top level, so that each record becomes one flat row with a column per field instead of objects inside objects. This customer example shows address information nested within the main record. Understanding how to access nested properties is crucial for extracting meaningful data from real-world JSON sources.
4. JSON flattening
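To make the idea of flattening concrete, here is a minimal plain-Java sketch. It is not Tablesaw's implementation, only an illustration of how nested keys can become dot-separated names. The class name FlattenSketch and the sample customer values (name, city, lat, long) are assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FlattenSketch {

    // Recursively copies the leaves of `node` into `out`, joining key names with dots.
    static void flattenInto(String prefix, Map<String, Object> node, Map<String, Object> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String key = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // A nested object: descend and keep extending the dotted name.
                @SuppressWarnings("unchecked")
                Map<String, Object> child = (Map<String, Object>) value;
                flattenInto(key, child, out);
            } else {
                // A plain value: it becomes one "column" in the flat record.
                out.put(key, value);
            }
        }
    }

    static Map<String, Object> flatten(Map<String, Object> record) {
        Map<String, Object> out = new LinkedHashMap<>();
        flattenInto("", record, out);
        return out;
    }

    // Hypothetical customer record with a nested address object.
    static Map<String, Object> sampleCustomer() {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Lisbon");
        address.put("lat", 38.72);
        address.put("long", -9.14);
        Map<String, Object> customer = new LinkedHashMap<>();
        customer.put("name", "Ana");
        customer.put("address", address);
        return customer;
    }

    public static void main(String[] args) {
        // prints [name, address.city, address.lat, address.long]
        System.out.println(flatten(sampleCustomer()).keySet());
    }
}
```

Each dotted name records the path from the top-level record down to the value, which is why the parent object stays visible in the column name.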
And this is our flattened output! The flattening has converted the nested JSON into a table by creating a separate column for each nested item. Nested objects, such as the latitude and longitude inside the customer address, become dot-separated column names, showing that lat and long belong to the parent address object.
5. JsonReader configuration
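Before walking through the options, here is a rough plain-Java sketch of the builder pattern and of what a missing-value indicator does. It is a stand-in written for illustration, not Tablesaw's JsonReadOptions class. The names ReadOptionsSketch and tableName, the file name customers.json, and the sample city values are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an options object; NOT the Tablesaw class.
class ReadOptionsSketch {
    final String source;
    final String tableName;
    final String missingValueIndicator;

    private ReadOptionsSketch(Builder b) {
        this.source = b.source;
        this.tableName = b.tableName;
        this.missingValueIndicator = b.missingValueIndicator;
    }

    // Entry point: the source is required, everything else is optional.
    static Builder builder(String source) {
        return new Builder(source);
    }

    static class Builder {
        private final String source;
        private String tableName = "";
        private String missingValueIndicator = "";

        Builder(String source) {
            this.source = source;
        }

        // Each setter returns the builder so that calls can be chained.
        Builder tableName(String name) {
            this.tableName = name;
            return this;
        }

        Builder missingValueIndicator(String indicator) {
            this.missingValueIndicator = indicator;
            return this;
        }

        ReadOptionsSketch build() {
            return new ReadOptionsSketch(this);
        }
    }

    // Returns one flag per cell: true where the raw text equals the indicator.
    List<Boolean> isMissing(List<String> rawColumn) {
        List<Boolean> flags = new ArrayList<>();
        for (String cell : rawColumn) {
            flags.add(missingValueIndicator.equals(cell));
        }
        return flags;
    }

    public static void main(String[] args) {
        ReadOptionsSketch options = ReadOptionsSketch.builder("customers.json")
                .tableName("customers")
                .missingValueIndicator("N/A")
                .build();
        // prints [false, true, false]
        System.out.println(options.isMissing(Arrays.asList("Lisbon", "N/A", "Porto")));
    }
}
```

The point of the pattern is that optional settings are chained in any order and nothing is fixed until build() is called.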
JsonReadOptions, just like CsvReadOptions from the previous chapter, gives us more control over how JSON data is converted into a table. This is especially helpful when working with more complex structures or when we need to fine-tune how specific columns are handled. We use the builder pattern to configure our options. First, we specify the source file. We can also set a custom table name and tell Tablesaw how to interpret missing values. Here, we use "N/A" as the missing value indicator, allowing the parser to recognize and handle gaps in the data appropriately. Finally, we call build() and pass the options to JsonReader's read method. Note that some table methods build on these options. For example, after we have marked "N/A" values as missing with .missingValueIndicator() in JsonReadOptions, we can use the isMissing() method to find the missing values in a column.
6. Joining tables
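As a rough picture of the inner join this section describes, here is a plain-Java sketch of the guest-list example. It is not Tablesaw's joinOn method, only the matching logic behind an inner join, and it assumes each name appears at most once per list. The class name InnerJoinSketch and all guest names, phone numbers, and diets are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InnerJoinSketch {

    // Keeps only the names present in BOTH maps and puts their values side by side.
    // Each result row is {name, phone, diet}.
    static List<String[]> innerJoin(Map<String, String> phones, Map<String, String> diets) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, String> entry : phones.entrySet()) {
            String name = entry.getKey();
            if (diets.containsKey(name)) {
                rows.add(new String[] {name, entry.getValue(), diets.get(name)});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical guest lists, keyed by name.
        Map<String, String> phones = new LinkedHashMap<>();
        phones.put("Ana", "555-0101");
        phones.put("Ben", "555-0102");
        phones.put("Cleo", "555-0103");

        Map<String, String> diets = new LinkedHashMap<>();
        diets.put("Ben", "vegetarian");
        diets.put("Cleo", "vegan");
        diets.put("Dev", "none");

        // Ana and Dev each appear on only one list, so they are dropped.
        for (String[] row : innerJoin(phones, diets)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Only Ben and Cleo survive the join, because they are the only guests who appear on both lists.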
One final key method to look at is the joinOn method, which allows us to join two tables together. Remember that once we read in our JSON file, it becomes a Table in Tablesaw, so any further operations are performed on that table. Imagine two guest lists for a party: one list has names and phone numbers, the other has names and dietary preferences, and each list is a table in our code. An inner join, one of the most common types of joins, is like creating a new list containing only the people who appear on both lists, with their information combined side by side. We do this here by calling the joinOn method with the key shared by both tables, which in this case is the name column, and then calling the inner method to perform an inner join between the phones and diets tables. This is just a brief look at joins; many other types exist, but the inner join is one of the most common in practice and is a very powerful tool.
7. Let's practice!
It's time for your final practice.