Row-oriented storage and partitions
1. Row-oriented storage and partitions
All data is ultimately stored somewhere. Let's explore that storage structure.2. Database storage types
Databases store data in two main formats, row-oriented storage, and column-oriented storage. Row-oriented storage retains the relation between columns. Column-oriented storage retains the relation between rows. Consider this example table of zoo animals.3. Row-oriented
In row-oriented storage, the relationship between columns remains. One record is all the columns for that one row. This one record about Bob, the panda is stored together.4. Column-oriented
In column oriented storage, the relationship between rows remains. One record is all the rows for one column. All the zoo animal names are stored together. All the animal species are stored together. However, the relationship between the name Bob and species Panda is lost in column storage. Postgres inherently uses row-oriented storage. We will focus on this storage type for the remainder of this lesson.5. Row-oriented storage
Row-oriented storage stores one row in the same location. Because all columns for one row are stored together, it is fast to append or delete whole rows. Additionally, a query to return one column is as fast as a query to return all columns. The number of rows returned impacts query speed, so returning all rows is slow.6. Reducing the rows
We have covered multiple ways to limit the number of records which would then speed your query. You can use a WHERE filter, an INNER JOIN, a DISTINCT clause, or a LIMIT clause.7. Row-oriented database methods
Row-oriented databases lend themselves to some database optimization methods, including partitions and indexes. Partitions are a method of splitting one parent table into many smaller, children tables. Indexes are a method of creating sorted column keys to improve search speeds. Both of these methods require set up and maintenance by a database administrator (DBA). Any column or combination of columns can be an index or partition, and neither are inherently obvious. The DBA should know or document these items in a database diagram or the metadata. We will focus first on partitions.8. Partition structure
Partitions exist on a parent table that is visible in the database front end. It is the table you reference in queries. Here, it is the table containing all records of the zoo animals. The zoo animals table is partitioned on animal habitat. Each continent is its own child table. These continent tables are not visible on the front end but are visible to queries.9. Partition structure
Partitions are referenced the same as any filter condition using the WHERE clause. This query references the visible parent table, zoo_animals, in the FROM clause. It uses the partitioned column, habitat, in the WHERE clause. The partition allows the query to search the two rows in the Africa child table to find Zebra as the resulting species. Without the partition, the query would have searched all four rows in the parent table.10. Partition overview
Partitions split one table into many smaller tables. They provide storage flexibility. Each child table can be stored on a different server, with seldom-used partitions on cheaper, slower storage options. Because the main table is functionally split, referencing a partition allows the query to search a subset of rows, speeding the search. Partitions are often on common filter conditions such as date or location columns.11. Partition query assessment
Recall the query planner that uses your order (or SQL) to plan your meal execution (or query). Adding EXPLAIN before a query shows this execution plan. The plan includes a cost estimate, rows, width information, and the filter. The cost estimate is the metric impacted by using a partition.12. Let's practice!
Now it's your turn to test your knowledge about row-oriented storage and partitions.Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.