pgvector 0.6.0 was released today, with a significant improvement: parallel builds for HNSW indexes. Building an HNSW index is now up to 30x faster for unlogged tables.
This release is a huge step forward for pgvector, making it easier to tune HNSW build parameters and increase search accuracy and performance.
HNSW indexes in pgvector
We explored how HNSW works in an earlier post, so here's a quick recap: HNSW (Hierarchical Navigable Small World) is an algorithm for approximate nearest neighbor search built on proximity graphs. It operates over multiple layers of graphs with different densities, where upper layers hold longer-range connections between nodes and lower layers hold shorter-range ones. This structure allows HNSW to search, insert, and delete in logarithmic time.
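In pgvector, an HNSW index is created with the `hnsw` access method. A minimal sketch (the table and column names here are illustrative, not from the benchmark):

```sql
create extension if not exists vector;

-- a hypothetical table of OpenAI-style embeddings
create table documents (
  id bigserial primary key,
  embedding vector(1536)
);

-- build an HNSW index for cosine distance;
-- m and ef_construction are the standard HNSW build parameters
create index on documents using hnsw (embedding vector_cosine_ops)
  with (m = 16, ef_construction = 64);
```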
pgvector parallel index builds
Prior to 0.6.0, pgvector could only build indexes using a single thread - a big bottleneck for large datasets. For example, building an index for 1 million vectors of 1536 dimensions took around 1 hour and 27 minutes.
With parallel index builds, you can build an index for the same dataset in 9.5 minutes - 9 times faster:
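Enabling a parallel build is mostly a matter of giving Postgres enough parallel maintenance workers and memory before creating the index. A sketch, assuming the illustrative `documents` table above (the values are examples, not tuned recommendations):

```sql
-- allow up to 7 parallel workers (8 total, including the leader)
set max_parallel_maintenance_workers = 7;

-- give the build enough memory to keep the graph in RAM
set maintenance_work_mem = '8GB';

create index on documents using hnsw (embedding vector_cosine_ops);
```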
Performance comparison: pgvector 0.5 vs 0.6
We tested index build time with the dbpedia-entities-openai-1M dataset (1 million vectors, 1536 dimensions) to compare the performance of parallel and single-threaded index HNSW builds. At the same time, we verified that the resulting indexes are the same in terms of accuracy and queries per second (QPS).
We ran benchmarks on various database sizes to see the impact of parallel builds:
- 4XL instance (16 cores, 64 GB RAM)
- 16XL instance (64 cores, 256 GB RAM)
4XL instance (16 cores, 64 GB RAM)
This benchmark varied the `max_parallel_maintenance_workers` setting, which controls how many parallel workers are used to build an index. In the following sections, worker counts refer to the total number of workers, including the leader.
The index build time is 7-9 times faster for 0.6.0, while queries per second and accuracy stay the same for both versions:
v0.5.1: averaged 938 QPS and 0.963 accuracy across all benchmarks.
v0.6.0: averaged 950 QPS and 0.963 accuracy across all benchmarks.
16XL instance (64 cores, 256 GB RAM)
You can further improve index build performance using a more powerful instance (up to 13.5x for these parameters).
The index build time is not linearly proportional to the number of cores used. A sensible default for `max_parallel_maintenance_workers` is CPU count / 2, the default we set on the Supabase platform. Accuracy and QPS are not affected by the number of parallel workers.
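To persist that default across sessions, the setting can be applied at the database level (the database name here is illustrative):

```sql
-- on a 16-core instance: 16 / 2 = 8 parallel maintenance workers
alter database postgres set max_parallel_maintenance_workers = 8;
```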
Pro tip: optimizing your bills
The trick is to use a large instance while you build the index, then switch back to a cheaper instance once the index is built.
Embeddings with unlogged tables
Build time can be reduced even further using unlogged tables.
An unlogged table in Postgres is a table whose modifications are not written to the write-ahead log (trading durability for performance). Unlogged tables are a great option for embeddings because the raw data is often stored separately, so the embeddings can be regenerated from the source data at any time.
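Assuming the embeddings can be regenerated, the table can be created unlogged from the start, or toggled later with `ALTER TABLE` (the table name is illustrative):

```sql
-- create the table without WAL logging
create unlogged table documents (
  id bigserial primary key,
  embedding vector(1536)
);

-- or switch an existing table (this rewrites the table)
alter table documents set unlogged;

-- switch back to a durable table later if needed
alter table documents set logged;
```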
One of the steps of index creation is the final scan and WAL write. This phase is generally short but not parallelizable. Unlogged tables let you skip the WAL entirely, with an impressive impact:
| Build time: v0.5.1 | Build time: v0.6.0 (unlogged) |
| ------------------ | ----------------------------- |
| 1h 27m 45s         | 1h 06m 59s                    |