PostgreSQL Tutorial: Tuning autovacuum for dead tuple cleanup

May 27, 2024

Summary: In this tutorial, you will learn how to tune autovacuum for dead tuple cleanup in PostgreSQL.

Table of Contents

The best-known autovacuum task is cleaning up of dead tuples from UPDATE or DELETE operations. If autovacuum cannot keep up with cleaning up dead tuples, you should follow these three tuning steps:

Make sure that nothing keeps autovacuum from reclaiming dead tuples

Sometimes, autovacuum won’t remove the dead tuples. Most often, the culprit are long running transactions. Unless you can remove these obstacles, tuning autovacuum will be useless.

If you cannot fight the problem at its root, you can use the configuration parameter idle_in_transaction_session_timeout to have PostgreSQL terminate sessions that stay “idle in transaction” for too long. That causes errors on the client side, but may be justified if you have no other way to keep your database operational. Similarly, to fight long-running queries, you can use statement_timeout.

Tuning autovacuum to run faster

If autovacuum cannot keep up with cleaning up dead tuples, the solution is to make it work faster. This may seem obvious, but many people fall into the trap of thinking that making autovacuum start earlier or run more often will solve the problem.

VACUUM is a resource-intensive operation, so autovacuum by default operates deliberately slowly. The goal is to have it work in the background without being in the way of normal database operation. But if your workload creates lots of dead tuples, you will have to make it more aggressive:

Setting autovacuum_vacuum_cost_delay to 0 will make autovacuum as fast as a manual VACUUM – that is, as fast as possible.

Since not all tables grow dead tuples at the same pace, it is usually best not to change the global setting in postgresql.conf, but to change the setting individually for busy tables:

ALTER TABLE busy_table SET (autovacuum_vacuum_cost_delay = 1);

Partitioning a table can also help with getting the job done faster. The advantage here is you can have several autovacuum workers working on several partitions in parallel, so that the partitioned table as a whole is done faster than a single autovacuum worker could.

Change the workload so that fewer dead tuples are generated

If nothing else works, you have to see that fewer dead tuples are generated. Perhaps several UPDATEs to a single row could be combined to a single UPDATE?

Often you can significantly reduce the number of dead tuples by using “HOT updates”:

  • set the fillfactor for the table to a value less than 100, so that INSERTs leave some free space in each block.
  • make sure that no column that you modify in the UPDATE is indexed.

Then any SELECT or DML statement can clean up dead tuples, and there is less need for VACUUM.

See more

PostgreSQL Optimization