Prepare for the PgBouncer and IPv4 deprecations on 26th January 2024

Database

Managing Indexes in PostgreSQL

An index makes your Postgres queries faster. The index is like a "table of contents" for your data - a reference list which allows queries to quickly locate a row in a given table without needing to scan the entire table (which in large tables can take a long time).

Indexes can be structured in a few different ways. The type of index chosen depends on the values you are indexing. By far the most common index type, and the default in Postgres, is the B-Tree. A B-Tree is the generalized form of a binary search tree, where nodes can have more than two children.

Even though indexes improve query performance, the Postgres query planner may not always make use of a given index when choosing which optimizations to make. Additionally indexes come with some overhead - additional writes and increased storage - so it's useful to understand how and when to use indexes, if at all.

Create an index

Let's take an example table:


_10
create table persons (
_10
id bigint generated by default as identity primary key,
_10
age int,
_10
height int,
_10
weight int,
_10
name text,
_10
deceased boolean
_10
);

We might want to frequently query users based on their age:


_10
select name from persons where age = 32;

Without an index, Postgres will scan every row in the table to find equality matches on age.

You can verify this by doing an explain on the query:


_10
explain select name from persons where age = 32;

Outputs:


_10
Seq Scan on persons (cost=0.00..22.75 rows=x width=y)
_10
Filter: (age = 32)

To add a simple B-Tree index you can run:


_10
create index idx_persons_age on persons (age);

Here is a simplified diagram of the index we just created (note that in practice, nodes actually have more than two children).

B-Tree index example in Postgres

You can see that in any large data set, traversing the index to locate a given value can be done in much less operations (O(log n)) than compared to scanning the table one value at a time from top to bottom (O(n)).

Partial indexes

If you are frequently querying a subset of rows then it may be more efficient to build a partial index. In our example, perhaps we only want to match on age where deceased is false. We could build a partial index:


_10
create index idx_living_persons_age on persons (age)
_10
where deceased is false;

Ordering indexes

By default B-Tree indexes are sorted in ascending order, but sometimes you may want to provide a different ordering. Perhaps our application has a page featuring the top 10 oldest people. Here we would want to sort in descending order, and include NULL values last. For this we can use:


_10
create index idx_persons_age_desc on persons (age desc nulls last);

Reindexing

After a while indexes can become stale and may need rebuilding. Postgres provides a reindex command for this, but due to Postgres locks being placed on the index during this process, you may want to make use of the concurrent keyword.


_10
reindex index concurrently idx_persons_age;

Alternatively you can reindex all indexes on a particular table:


_10
reindex table concurrently persons;

Take note that reindex can be used inside a transaction, but reindex [index/table] concurrently cannot.