How Postgres chooses which index to use

Last edited: 1/17/2025

For the curious: here is a list of all built-in indexes in Postgres

Postgres internals

How an index is chosen

PostgreSQL, internally, contains a few components that manage query execution:

ModuleDescription
ParserConverts SQL into an easily traversable query tree
Planner/OptimizerTakes the query tree and uses rules and database statistics to find the optimal strategy for getting the data
ExecutorExecutes the plan created by the planner

The planner will consider using an index when an indexed column is present in a filter statement, such as:

  • WHERE
  • LIKE
  • ILIKE
  • DISTINCT
  • SIMILAR TO
  • JOIN
  • ORDER BY

Otherwise, it will likely perform a full table scan (sequential scan).

In the majority of cases, the indexed column must not only be present but also must be filtered by a comparison operator (=, >, <>) that is compatible with the index.

As an example, one can create the following table:

Column NameData Type
idINT
dataJSONB

On the data column, a GIN index can be applied, which is excellent for filtering JSONB datatypes:


_10
CREATE INDEX some_arbitary_index_name ON some_table USING gin (data);

Here's a link to the list operators supported by the GIN index; notably, it does not support greater than >:


_10
-- GIN index will never be used
_10
select *
_10
from some_table
_10
where data -> val > 5;

GIN does support the @> operator:


_10
--GIN will be considered
_10
SELECT id FROM some_table
_10
WHERE data @> '[ { "itemId": "p11" } ]';

In most cases, developers work with the default BTREE index. It is the most practical and performant in the majority of cases and is compatible with the following filter operators:

Comparison Operator
<
<=
=
>=
>

An operator's functional equivalents, such as IN, BETWEEN, and ANY, are also valid.

However, just because the base requirements (relevant column, filter, and operators) are present, doesn't mean that an index will be used.

Indexes have a startup cost, so for small tables, Postgres might use a sequential scan if it believes that it will take less time. The database keeps statistics about each table that it uses to inform these choices.

In very rare cases, these statistics can become stale, and Postgres may opt to use a slower index or sequential scan when a better option is available.

You can see the query plan with the EXPLAIN keyword:


_10
EXPLAIN <your query>

To understand how to interpret its output, you can check out this explainer.

To reset statistics within the database, you can use the following query:


_10
-- use judiciously
_10
select pg_stat_reset();

Complex or composite indexes

For a more complete rundown, check the Postgres Official Docs

Multi-column indexes

If you make independent indexes on multiple columns, Postgres will likely use each of them independently to find the relevant rows and then combine the results together.

It is possible to make multi-column indexes. If you are regularly filtering against multiple columns, there can be performance benefits using them instead of several independent indexes.


_10
-- multi-column index
_10
create index test2_mm_idx on test2 (major, minor);
_10
_10
-- multi-column comparison:
_10
select name
_10
from test2
_10
where major = constant and minor = constant;

Ordered indexes

If you're using an ORDER BY clause, indexes can also be pre-sorted by DESC/ASC for better performance.


_10
-- organizes the index in a DESC order, places NULL values at the end
_10
CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);

Functional indexes

Although not as common, indexes can also be leveraged against modified values, such as when using a LOWER function:


_10
-- Index on modified column through function
_10
create index test1_lower_col1_idx on test1 (lower(col1));
_10
_10
-- Index will be considered for the following query:
_10
select * from test1 where lower(col1) = 'value';

Covering indexes

Indexes contain pointers to a specific row, but you could instruct an index to actually hold a copy of a column's value for even faster retrieval. These are known as covering indexes. Because maintaining a copy is storage intensive, you should avoid using it for values with large data footprints. FULL VIDEO ON TOPIC


_10
CREATE INDEX a_b_idx ON x (a,b) INCLUDE (c);

Indexes on JSONB

Although a GIN/GIST index can be used to index entire JSONB bodies, you can also target just specific Key-values with standard BTREE indexes:


_10
-- Example table
_10
create table person (
_10
id serial primary key,
_10
data jsonb
_10
);
_10
_10
create index index_name on person ((data ->> 'name'));