Error: index row size exceeds btree version 4 maximum for index
Last edited: 1/18/2025
Error
index row size exceeds btree version 4 maximum 2704 for index "idx_name"
Summary
PG has a limit on the size of a BTree tuple (= index row): at least three btree tuples must fit on an 8 kB page. This limit cannot be changed.
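A minimal sketch of how the error shows up, assuming a hypothetical table kv_demo with a text column payload. The index is created while the table is empty; inserting a long, poorly compressible value then makes the index tuple exceed the limit:
-- Hypothetical table and index, for illustration only
CREATE TABLE kv_demo (payload text);
CREATE INDEX idx_name ON kv_demo (payload);
-- ~32 kB of low-redundancy text: even after inline compression the index
-- tuple stays above the 2704-byte BTree limit, so the INSERT fails with
-- the "index row size exceeds btree version 4 maximum ..." error
INSERT INTO kv_demo (payload)
SELECT string_agg(md5(i::text), '' ORDER BY i)
FROM generate_series(1, 1000) AS i;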
A BTree row can consist of a single attribute or of multiple attributes. These two cases are better addressed separately.
BTree is built with multiple attributes
A BTree with multiple attributes will outperform several single-attribute indexes only when the typical SELECT queries filter on several attributes that form a leading prefix of the index, i.e. on the 1st, 2nd and 3rd index attributes, but not on the 2nd, 3rd and 5th.
The other case where a multi-attribute BTree is a good fit is when the INSERT/UPDATE workload is comparable to the SELECT load (usually SELECTs are far more frequent). Then each INSERT/UPDATE maintains a single index instead of several, which speeds up writes at the cost of some SELECT performance.
Most likely, though, a multi-attribute BTree index exists because some automation tool created it, not by intention. Even without the error above it is usually better to build a separate single-attribute index for each of its attributes and then drop the multi-attribute BTree index, as in the sketch below. When the error does occur, this is the only solution and it is mandatory.
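A minimal sketch of the replacement, assuming a hypothetical composite index idx_orders_multi on columns customer_id, status and created_at of a table orders (CONCURRENTLY avoids blocking writes while the new indexes are built):
-- Build a separate single-attribute index for each attribute of the composite index
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);
CREATE INDEX CONCURRENTLY idx_orders_created_at ON orders (created_at);
-- Then drop the multi-attribute index
DROP INDEX CONCURRENTLY idx_orders_multi;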
BTree is built on a single attribute that is very long
This happens when the index is built on a text or JSON column, etc. Building a BTree on these datatypes is not prohibited, but it is ineffective. Why?
One measure of index efficiency is the ratio of the number of index entries to the size of the whole value space of the datatype. If the index contains, say, 100,000 distinct int32 values, the ratio is about 1/40,000 (100,000 out of roughly 4 billion possible values). For text values up to 2704 bytes long (the BTree maximum) it is hard to imagine a number of distinct values that gives even a comparable ratio. In other words, indexing such long values stores a lot of redundancy in the index.
The solution is simple: use some kind of hashing, e.g. MD5, to map the values into a much narrower space. You build a functional index (= index on an expression):
CREATE INDEX ON table_name(MD5(column_name));
instead of:
CREATE INDEX ON table_name(column_name);
Then you must modify your SELECTs to use the same expression (otherwise the functional index will not be used by those queries), e.g.
SELECT * FROM table_name WHERE MD5(column_name) = MD5('search_value');
instead of:
SELECT * FROM table_name WHERE column_name = 'search_value';
More on building an index on an expression
For datatypes other than plain text that allow queries by partial inclusion (e.g. checking whether a key-value pair is contained in a JSON document, or implementing tsvector phrase search), use GiST/GIN indexes instead: they inherently index a value space much narrower than the whole value being indexed.
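A minimal sketch of both cases, assuming a hypothetical table docs with a jsonb column data and a text column body:
-- GIN index on a jsonb column; supports containment queries with @>
CREATE INDEX idx_docs_data ON docs USING GIN (data);
SELECT * FROM docs WHERE data @> '{"status": "active"}';
-- GIN index on a tsvector expression; supports full-text and phrase search
CREATE INDEX idx_docs_body ON docs USING GIN (to_tsvector('english', body));
SELECT * FROM docs WHERE to_tsvector('english', body) @@ phraseto_tsquery('english', 'btree index');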