
Understanding JSON data in Postgres

1. Understanding JSON data in Postgres

Welcome back! Now that you've mastered the basics of column-oriented NoSQL databases, we'll explore working with document data using Postgres JSON.

2. JSON and JSONB in Postgres

Document data is semi-structured data, typically rendered in a JSON-like format. Unlike traditional relational databases, the document format can handle evolving data with a flexible schema. Postgres offers two data types for working with document data: JSON and JSONB. As the name suggests, the JSON type allows data to be stored in JSON format in a column of a Postgres table. A document in a column of type JSON can contain key-value pairs, arrays, and nested objects. JSONB is an extension of the JSON type that stores JSON data in binary, similar to MongoDB's BSON format. JSONB offers more efficient storage and retrieval of document data and lends itself to additional indexing. When creating a table, we'll use the JSON or JSONB keyword to declare a column with one of these types.
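As a minimal sketch, declaring these column types looks like this (the table and column names here are invented for illustration):

```sql
-- Hypothetical table mixing traditional and document column types
CREATE TABLE students (
    id INTEGER PRIMARY KEY,
    name TEXT,
    -- JSON stores the document as raw JSON text
    parent_meta JSON,
    -- JSONB stores it in a decomposed binary format,
    -- enabling faster retrieval and additional indexing (e.g. GIN)
    activity_log JSONB
);
```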

3. Why semi-structured data in Postgres?

Here, we get our first glimpse of semi-structured data in Postgres. While the school, age, and address columns are all of traditional data types, the parent-meta column takes type JSON. It contains key-value pairs, arrays, and nested objects. While there is a learning curve to working with document data, the advantages that it offers most certainly outweigh the initial investment. Storing data in a semi-structured format allows data to evolve without causing processes to break. Storing and querying this data is quite efficient and can reduce costs within an organization that houses a large amount of data. Data is often made available in JSON from a number of source systems, such as APIs and webhooks. Document databases make it easy to store this data in its raw form. With the functionality and tools that Postgres JSON provides, integrating document data into a data workflow is a breeze.
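To make this concrete, a single document stored in a JSON column might look like the following; the field names and values are invented, but they show all three shapes mentioned above: key-value pairs, an array, and a nested object.

```json
{
  "guardian": "Maria Gomez",
  "phones": ["555-0100", "555-0199"],
  "emergency_contact": {
    "name": "Luis Gomez",
    "relationship": "uncle"
  }
}
```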

4. Querying JSON data with Postgres

When working with a table containing a column of type JSON or JSONB, the structure of a query is quite similar to working with a traditional Postgres table. Columns are selected from a table and can be filtered, grouped, or sorted. However, to work with the data within these columns, we'll need to use a few new tools. By the end of this chapter, we'll be familiar with inserting data in JSON format into a Postgres table. We'll take a look at the row_to_json and json_to_record functions to convert tabular data to JSON and vice versa. We'll use a few special Postgres operators to extract individual values from JSON objects, and we'll wrap up by taking a peek at the json_extract_path and json_extract_path_text functions.
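As a preview of the kinds of queries covered in this chapter, here is a sketch against a hypothetical students table with a parent_meta column of type JSON (the table, column, and key names are assumptions for illustration):

```sql
SELECT
    -- -> returns the matched field as JSON
    parent_meta -> 'emergency_contact' AS contact_json,
    -- ->> returns the matched field as text
    parent_meta ->> 'guardian' AS guardian_name,
    -- json_extract_path walks a sequence of nested keys
    json_extract_path(parent_meta, 'emergency_contact', 'name') AS contact_name
FROM students;
```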

5. Executing queries with sqlalchemy and pandas

To build and execute queries against a Postgres database, we'll be using sqlalchemy and pandas. First, a connection object needs to be created. This can be done with sqlalchemy's create_engine function. This function takes a connection URI in the format dialect://username:password@host:port/database; for Postgres, the dialect is postgresql. After a connection object has been created, we'll need to write a query. This can be done using a single- or multi-line string. The connection object and the query can then be passed to pandas' read_sql function, which returns a DataFrame. Take a moment and pause the video now to review each block of code. We'll use these commands as building blocks when writing and executing queries against document data.
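The building blocks above can be sketched as follows. The table, columns, and data are invented, and an in-memory SQLite engine stands in for Postgres so the snippet runs standalone; against a real Postgres database you would only swap in a postgresql:// connection URI.

```python
import pandas as pd
from sqlalchemy import create_engine

# Postgres URI format (credentials here are placeholders):
#   postgresql://username:password@host:port/database
# For this self-contained sketch, use an in-memory SQLite engine instead.
engine = create_engine("sqlite://")

# Set up a small example table so the query has something to read.
with engine.begin() as conn:
    conn.exec_driver_sql("CREATE TABLE students (name TEXT, age INTEGER)")
    conn.exec_driver_sql("INSERT INTO students VALUES ('Ada', 12), ('Lin', 11)")

# A query written as a multi-line string.
query = """
SELECT name, age
FROM students
WHERE age > 11;
"""

# Pass the query and the connection object to pandas; get back a DataFrame.
df = pd.read_sql(query, con=engine)
print(df)
```

The same read_sql call works unchanged once the engine points at Postgres, which is what makes this pair of libraries convenient building blocks for the rest of the chapter.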

6. Let's practice!

That was a lot! Let's review the basics of working with document data in Postgres with a few exercises.
