Vector Buckets

Store, index, and query vector embeddings at scale with similarity search.

This feature is in alpha

Expect rapid changes, limited features, and possible breaking updates. Share feedback as we refine the experience and expand access.

Vector buckets enable efficient storage and similarity search of vector embeddings. Built on S3-compatible storage, they provide high-performance semantic search capabilities for AI and machine learning applications.

What are Vector buckets?

Vector buckets are specialized storage containers optimized for vector data. Unlike traditional databases optimized for transactional queries, vector buckets use specialized indexing and distance metrics to perform fast similarity searches across millions of embeddings.

Each vector bucket contains:

Indexes - Organized collections of vectors with consistent dimensions and distance metrics
Vectors - Embeddings with associated metadata for filtering and enrichment
Metadata - Additional context about vectors (text, tags, IDs, etc.)
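The containment described above can be pictured with a small in-memory model. This is only a sketch to illustrate the hierarchy; the class and field names (`VectorBucket`, `VectorIndex`, `VectorRecord`, `distance_metric`) are hypothetical and are not the Vector buckets API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class VectorRecord:
    """One embedding plus optional metadata (hypothetical shape)."""
    key: str
    values: list[float]
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class VectorIndex:
    """A collection of vectors sharing one dimension and one distance metric."""
    name: str
    dimension: int
    distance_metric: str  # e.g. "cosine" or "euclidean"
    vectors: dict[str, VectorRecord] = field(default_factory=dict)

    def put(self, record: VectorRecord) -> None:
        # Every vector stored in an index must match the index's dimension.
        if len(record.values) != self.dimension:
            raise ValueError(
                f"expected {self.dimension} dimensions, got {len(record.values)}"
            )
        self.vectors[record.key] = record


@dataclass
class VectorBucket:
    """Top-level container holding named indexes."""
    name: str
    indexes: dict[str, VectorIndex] = field(default_factory=dict)

    def create_index(self, name: str, dimension: int, distance_metric: str) -> VectorIndex:
        index = VectorIndex(name, dimension, distance_metric)
        self.indexes[name] = index
        return index


# Illustrative usage: one bucket, one index, one vector with metadata.
bucket = VectorBucket("embeddings")
docs = bucket.create_index("documents", dimension=3, distance_metric="cosine")
docs.put(VectorRecord("doc-1", [0.1, 0.2, 0.3], {"title": "Intro", "tag": "guide"}))
```

The dimension check in `put` reflects the "consistent dimensions" property of an index: a vector of the wrong length is rejected rather than stored.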

Key features

Similarity Search - Find semantically similar vectors using cosine or Euclidean (L2) distance metrics
Metadata Filtering - Filter results by associated metadata before/after similarity search
Batch Operations - Insert, update, and query up to 500 vectors per request
Scalable Storage - Store millions of vectors in a single index
S3 Native - Built on proven S3 infrastructure for reliability and durability
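To make two of these features concrete, here is a minimal sketch of the cosine and Euclidean distance formulas (Euclidean distance is also known as L2 distance) and of splitting a large set of vectors into requests of at most 500. The function names and the `MAX_BATCH_SIZE` constant are assumptions for illustration; the service computes distances for you, and its client API is not shown here.

```python
import math
from typing import Iterator, Sequence

MAX_BATCH_SIZE = 500  # per-request limit listed under Batch Operations


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; 0.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batches(items: Sequence, size: int = MAX_BATCH_SIZE) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Illustrative usage: 1,200 vectors would need three requests.
batch_sizes = [len(chunk) for chunk in batches(list(range(1200)))]
```

Cosine distance ignores vector length and compares direction only, which suits many text embeddings; Euclidean distance is sensitive to magnitude as well.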

Ideal use cases

Vector buckets excel at:

Semantic Search - Find documents or images similar to a query
Recommendation Systems - Suggest products, content, or connections based on embeddings
Clustering & Anomaly Detection - Group similar items or identify outliers
Image Search - Retrieve visually similar images from large catalogs
RAG (Retrieval-Augmented Generation) - Find relevant context for LLM queries
Personalization - Recommend tailored content based on user embeddings

Comparison to pgvector

Vector buckets share similarities with pgvector and match its developer experience as closely as possible. However, Vector buckets and any Foreign Data Wrappers (FDW) they use support only one similarity search operator: the <===> distance operator.

This makes Vector buckets ideal for:

Large-scale data storage
Backend processing workflows
Applications where low query latency is less critical

By contrast, pgvector is ideal for:

Fast prototyping and small data volumes
Applications requiring quick response times
User-facing features closer to the front end

How Vector buckets work

Create a bucket to organize your vector data
Create indexes within the bucket with specified dimensions and distance metrics
Store vectors with embeddings and optional metadata
Query vectors using similarity search to find nearest neighbors

The system automatically handles indexing and optimization, making searches fast and reliable even with millions of vectors.
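The four steps above can be walked through with an in-memory stand-in. This is a sketch under stated assumptions, not the Vector buckets API: the helper names (`create_index`, `put_vector`, `query`) and the `where` filter are hypothetical, and the query is a brute-force scan using cosine distance rather than the automatic indexing the service provides.

```python
import math
from typing import Any, Optional

# Step 1: a "bucket" here is just a dict mapping index names to indexes.
bucket: dict[str, dict[str, Any]] = {}


def create_index(name: str, dimension: int) -> None:
    # Step 2: an index fixes the dimension of every vector it holds.
    bucket[name] = {"dimension": dimension, "vectors": {}}


def put_vector(index_name: str, key: str, values: list[float],
               metadata: Optional[dict[str, Any]] = None) -> None:
    # Step 3: store an embedding with optional metadata.
    index = bucket[index_name]
    if len(values) != index["dimension"]:
        raise ValueError("vector dimension does not match the index")
    index["vectors"][key] = (values, metadata or {})


def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def query(index_name: str, query_vector: list[float], top_k: int = 3,
          where: Optional[dict[str, Any]] = None) -> list[str]:
    # Step 4: filter on metadata first, then rank survivors by distance.
    scored = []
    for key, (values, metadata) in bucket[index_name]["vectors"].items():
        if where and any(metadata.get(k) != v for k, v in where.items()):
            continue
        scored.append((cosine_distance(query_vector, values), key))
    scored.sort()
    return [key for _, key in scored[:top_k]]


# Illustrative usage with tiny 2-dimensional vectors.
create_index("documents", dimension=2)
put_vector("documents", "a", [1.0, 0.0], {"lang": "en"})
put_vector("documents", "b", [0.9, 0.1], {"lang": "en"})
put_vector("documents", "c", [0.0, 1.0], {"lang": "de"})

nearest = query("documents", [1.0, 0.0], top_k=2)
german_only = query("documents", [1.0, 0.0], where={"lang": "de"})
```

A linear scan like this costs time proportional to the number of stored vectors, which is why a real index structure matters once a collection grows to millions of embeddings.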

Next steps

Get started by learning how to create vector buckets or dive into storing vectors.