PostgreSQL Tutorial: Managing Partitioned Tables

June 23, 2024

Summary: In this tutorial, you will learn how to manage partitioned tables in PostgreSQL.

Table of Contents

PostgreSQL Table Partition Commands

Prior to Version 17, workflow options for partition management are limited to creating, attaching, and detaching partitions. Once we’ve designed our partition structure, we couldn’t redesign it in place. This applies to all partition types, whether we’re using RANGE, LIST, or HASH.

To combine multiple partitions into a single one, or to “subdivide” a single partition into multiples, we’d need to design a new structure then migrate all data rows to it. That’s a lot of steps!

From version 17, we have more options. We can now perform a SPLIT PARTITION operation on an existing singular partition, into two or more new ones.

If we wish to do the reverse, we’ve got that option as well. Starting from two or more partitions, we can perform a MERGE PARTITIONS operation to combine them into one.

Merging Partitions

The table below has an account_id and no real data columns, since we’re just looking to demo the partition management aspect.

The id uses a generated sequence value, which means each row will have a unique value across partitions.

CREATE TABLE t (
  id INT GENERATED ALWAYS AS IDENTITY,
  account_id INT NOT NULL
) PARTITION BY LIST (account_id);

Imagine we have the following two partitions for account_id 1 and account_id 2.

CREATE TABLE t_account_1 PARTITION OF t FOR VALUES IN (1);
CREATE TABLE t_account_2 PARTITION OF t FOR VALUES IN (2);

Let’s insert 10 records for account_id 1, and 100 records for account_id 2. We have 110 records total, but they’re split across two partitions. We want to merge these together.

INSERT INTO t (account_id) SELECT 1 FROM GENERATE_SERIES(1,10);
INSERT INTO t (account_id) SELECT 2 FROM GENERATE_SERIES(1,100);

Now we want to merge them together using MERGE PARTITIONS:

ALTER TABLE t
MERGE PARTITIONS (t_account_1, t_account_2)
INTO t_account_1_2;

That combined t_account_1 and t_account_1 into a single partition called t_account_1_2 with 110 records.

What about splitting partitions? How does that work?

Splitting Partitions

We’ve seen how to merge partitions. We can also split partitions using the SPLIT PARTITIONS command.

For this example let’s use the RANGE partitioning type.

Imagine that we had decided to create partitions for one week’s worth of data for an “events” style table that receives a lot of records. We’ll call the table t_events below.

We’ve decided with a one week boundary, the tables are large and unwieldy. We’d like to move to daily partitions so that the table for a day’s worth of data is smaller and more manageable.

Let’s look at the SQL commands for how we might achieve that.

Split Partitions Events Table

Create the t_events table using the RANGE partitioning type, initially with weekly partitions to demonstrate the current configuration.

CREATE TABLE t_events (
  id INT GENERATED ALWAYS AS IDENTITY,
  event_at TIMESTAMP WITHOUT TIME ZONE NOT NULL
) PARTITION BY RANGE (event_at);

Here are partitions for “last week”, “this week”, and “next week”.

CREATE TABLE t_events_last_week PARTITION OF t_events
FOR VALUES FROM ('2024-04-08 00:00:00') TO ('2024-04-15 00:00:00');

CREATE TABLE t_events_this_week PARTITION OF t_events
FOR VALUES FROM ('2024-04-15 00:00:00') TO ('2024-04-22 00:00:00');

CREATE TABLE t_events_next_week PARTITION OF t_events
FOR VALUES FROM ('2024-04-22 00:00:00') TO ('2024-04-29 00:00:00');

Now we’d like to take “next week’s partition” called t_events_next_week, and divide it into 7 daily partitions, one for each day.

Since it’s an upcoming week, we’ll assume it has no data in it, but is a pre-created partition.

When designing your own change like this, keep in mind the resulting boundaries you come up with must have equivalent start and end boundaries to the current configuration.

If the boundaries are off, you’ll get an error like this:

ERROR:  partition bound for relation "t_events_next_week" is null

Here’s the SPLIT PARTITION DDL command to split the single week partition, into 7 daily partitions:

ALTER TABLE t_events SPLIT PARTITION t_events_next_week INTO (
  PARTITION t_events_day_1 FOR VALUES FROM ('2024-04-22 00:00:00') TO ('2024-04-23 00:00:00'),
  PARTITION t_events_day_2 FOR VALUES FROM ('2024-04-23 00:00:00') TO ('2024-04-24 00:00:00'),
  PARTITION t_events_day_3 FOR VALUES FROM ('2024-04-24 00:00:00') TO ('2024-04-25 00:00:00'),
  PARTITION t_events_day_4 FOR VALUES FROM ('2024-04-25 00:00:00') TO ('2024-04-26 00:00:00'),
  PARTITION t_events_day_5 FOR VALUES FROM ('2024-04-26 00:00:00') TO ('2024-04-27 00:00:00'),
  PARTITION t_events_day_6 FOR VALUES FROM ('2024-04-27 00:00:00') TO ('2024-04-28 00:00:00'),
  PARTITION t_events_day_7 FOR VALUES FROM ('2024-04-28 00:00:00') TO ('2024-04-29 00:00:00')
);

Now, if we run \d+ t_events to describe t_events, we’ll see the two remaining weekly partitions, and the new 7 daily partitions.

There’s a catch. Performing this operation requires a lock on the parent table, which could be a long lock.

Is there a workaround?

Detach, Split, Reattach

As long as the structure of the table stays the same, partitions can be detached and reattached.

Those operations can both be performed in a non-blocking way by using CONCURRENTLY.

Unfortunately we can’t perform a SPLIT PARTITION CONCURRENTLY, which would make this even more convenient because we wouldn’t be worried about blocking writes while the exclusive lock was in effect.

Let’s consider a workaround. We know that we can detach partitions, split them while detached, then re-attach them. Would that work?

This is a lot of operations, and requires a “new fake parent” (my own name below) to work, so these steps should be considered more a proof of concept, not a recommendation. The goal is to avoid a potentially long lock blocking writes, by allowing the lock to occur on a detached table hierarchy. Essentially “offline.”

Trying to run SPLIT PARTITION on a detached partition with no parent doesn’t work. However, we can add a “new fake parent” table to stand-in temporarily.

Here’s the detach operation:

ALTER TABLE t_events
DETACH PARTITION t_events_next_week CONCURRENTLY;

Here’s the “fake” stand-in parent table definition. Once we’ve created this, we need to attach our detached partitions to it in order to perform the split.

We’ll only use the “fake parent” table for the split operation. When that’s done, we’ll detach the partitions again, and then re-attach them to the original parent CONCURRENTLY. At that point we can drop the “fake” parent.

CREATE TABLE t_events_fake_new (
  id INT GENERATED ALWAYS AS IDENTITY,
  event_at TIMESTAMP WITHOUT TIME ZONE NOT NULL
) PARTITION BY RANGE (event_at);

Running the SPLIT PARTITION on a separate parent avoids a long lock on the original parent, since it’s a completely separate table.

ALTER TABLE t_events_fake_new SPLIT PARTITION t_events_next_week INTO (
  PARTITION t_events_day_1 FOR VALUES FROM ('2024-04-22 00:00:00') TO ('2024-04-23 00:00:00'),
  PARTITION t_events_day_2 FOR VALUES FROM ('2024-04-23 00:00:00') TO ('2024-04-24 00:00:00'),
  PARTITION t_events_day_3 FOR VALUES FROM ('2024-04-24 00:00:00') TO ('2024-04-25 00:00:00'),
  PARTITION t_events_day_4 FOR VALUES FROM ('2024-04-25 00:00:00') TO ('2024-04-26 00:00:00'),
  PARTITION t_events_day_5 FOR VALUES FROM ('2024-04-26 00:00:00') TO ('2024-04-27 00:00:00'),
  PARTITION t_events_day_6 FOR VALUES FROM ('2024-04-27 00:00:00') TO ('2024-04-28 00:00:00'),
  PARTITION t_events_day_7 FOR VALUES FROM ('2024-04-28 00:00:00') TO ('2024-04-29 00:00:00')
);

Since the table structures have not changed, and since we’re not introducing any overlapping partition constraints, we can reattach to the original parent.

Alternatives

What about simply creating new partitions to move data rows into?

While it might be less work to create new partitions and move data rows, we couldn’t introduce new partitions that overlap with the boundaries/constraints of any existing one. PostgreSQL enforces this and would prevent the partition creation.

To avoid the overlap limitation, SPLIT PARTITION seems necessary when our goal is to modify a structure in-place like this.

However, in a similar way to the workaround above, we could follow the same tactic and detach the overlapping partition to work around the conflict.

With that approach, we might achieve the same end result and not need the SPLIT PARTITION command.

PostgreSQL Table Partition Commands

Merging Partitions

Splitting Partitions

Split Partitions Events Table

Detach, Split, Reattach

Alternatives

Search

Categories

Tags