PostgreSQL Tutorial: Table Partitioning

June 19, 2024

Summary: In this tutorial, you will learn how to use table partitioning in PostgreSQL.

Table of Contents

Why partition a table?

Table partitioning allows to split one large table into smaller ones bringing the following benefits:

  • Smaller tables are faster both for reading and writing.
  • You can very efficiently drop the whole partition instead of deleting data row by row.
  • Because PostgreSQL knows how to prune unused partitions, you can use partitions as a crude index. For example, by paritioning a table by date, you may not need an index on the date field any more and use a sequential scan instead.
  • Rarely used partitions can be moved to a cheaper storage.

Partitioning methods

Let’s suppose we have a table:

CREATE TABLE measurements (
  id int8 NOT NULL,
  value float8 NOT NULL,
  date timestamptz NOT NULL
);

You can partition that table by providing columns to use as the partition key:

CREATE TABLE measurements (
  id int8 NOT NULL,
  value float8 NOT NULL,
  date timestamptz NOT NULL
) PARTITION BY RANGE (date);

PostgreSQL supports several partitioning methods which only differ in the way they specify row values for the partition key.

Partition by range

Partitioning by range allows to specify a range of values for the partition, for example, we can store data for each month in a separate partition:

CREATE TABLE measurements_y2021m01 PARTITION OF measurements
FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');

Partition by list

List partitioning allows to specify a list of values for the partition, for example, we can store small fraction of the frequently accessed data in the hot partition and move the rest to the cold partition:

CREATE TABLE measurements (
  id int8 PRIMARY KEY,
  value float8 NOT NULL,
  date timestamptz NOT NULL,
  hot boolean
) PARTITION BY LIST (hot);

CREATE TABLE measurements_hot PARTITION OF measurements
FOR VALUES IN (TRUE);

CREATE TABLE measurements_cold PARTITION OF measurements
FOR VALUES IN (NULL);

You can then move rows between partitions by updating the hot column:

-- Move rows to measurements_hot
UPDATE measurements SET hot = TRUE;

-- Move rows to measurements_cold
UPDATE measurements SET hot = NULL;

Partition by hash

Partitioning by hash allows to uniformly distribute rows into a set of tables, for example, we can create 3 partitions for our table and pick a partition for the row using a hash and a remainder of division:

CREATE TABLE measurements (
  id int8 PRIMARY KEY,
  value float8 NOT NULL,
  date timestamptz NOT NULL
) PARTITION BY HASH (id);

CREATE TABLE measurements_1 PARTITION OF measurements
FOR VALUES WITH (MODULUS 3, REMAINDER 0);

CREATE TABLE measurements_2 PARTITION OF measurements
FOR VALUES WITH (MODULUS 3, REMAINDER 1);

CREATE TABLE measurements_3 PARTITION OF measurements
FOR VALUES WITH (MODULUS 3, REMAINDER 2);

Thanks to using hashes, the partitions will receive approximately the same amount of rows.

Managing partitions

PostgreSQL allows to detach and attach partitions:

ALTER TABLE measurements DETACH PARTITION measurements_y2021m01;

ALTER TABLE measurements ATTACH PARTITION measurements_y2021m01
FOR VALUES FROM ('2021-01-01') TO ('2021-02-01');

You can use those commands to partition an existing table without moving any data:

-- Use the existing table as a partition for the existing data.
ALTER TABLE measurements RENAME TO measurements_y2021m01;

-- Create the partitioned table.
CREATE TABLE measurements (LIKE measurements_y2021m01 INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
PARTITION BY RANGE (date);

-- Attach the existing partition with open left constraint.
ALTER TABLE measurements ATTACH PARTITION measurements_y2021m01
FOR VALUES FROM ('0001-01-01') TO ('2021-02-01');

-- Use proper constraints for new partitions.
CREATE TABLE measurements_y2021m02 PARTITION OF measurements
FOR VALUES FROM ('2021-02-01') TO ('2021-03-01');

Summary

This tutorial explained how to use PostgreSQL to split large tables of data into smaller pieces for table partitioning.