We're seeing an emerging trend among AI customers: Postgres and ClickHouse are becoming the “default data stack”.
This makes sense - AI companies typically generate a lot of logs and analytical data, which is better suited for an OLAP database like ClickHouse.
The combination of Supabase and ClickHouse is perfect for Helicone.ai, providing the flexibility of Postgres with the analytical power of ClickHouse — an open-source stack we can trust and customize.
Justin Torre, CTO @ Helicone.ai
Supabase + ClickHouse Partnership
The partnership between Supabase and ClickHouse aims to create a seamless experience, building on the already solid Postgres + ClickHouse foundation. Today, we're releasing new features to enhance this integration.
ClickHouse is very excited to partner with Supabase to make it easy for customers to use both technologies together. Through this partnership, we aim to make it even simpler for Postgres developers to use ClickHouse in conjunction and build real-time, data-driven applications at scale.
Aaron Katz, CEO @ ClickHouse Inc.
Using Postgres and ClickHouse together
Before diving into those changes, some context on how most customers use Supabase and ClickHouse together. While both are databases, they serve different use-cases:
- Postgres: Ideal for storing and querying application data, powering critical transactional and web app use cases.
- ClickHouse: Optimized for analytics and reporting, supporting both customer-facing and internal applications.
Postgres is a row-oriented database, while ClickHouse is column-oriented. The ClickHouse team has a great write-up on the difference between the two formats.
To provide an interface between these, Supabase customers generally use:
- clickhouse_fdw to query their ClickHouse data from their Postgres database.
- PeerDB to replicate their data from Postgres to ClickHouse.
Improving the ClickHouse & Supabase experience
We're making a number of changes to our platform based on the feedback we've had from customers.
Updated ClickHouse Foreign Data Wrapper
Using the ClickHouse FDW, you can directly query your ClickHouse database from Postgres:
```sql
-- Connect Postgres to your ClickHouse database:
create foreign table user_analytics (
  id bigint,
  user_id bigint,
  event text
)
server clickhouse_server
options ( table 'UserAnalytics' );

-- Query your ClickHouse instance from Postgres:
select * from user_analytics where user_id = 1;
```
This means you can query your ClickHouse data using the Postgres tooling that you're familiar with.
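The `clickhouse_server` referenced above is a foreign server you define once per database. With the Supabase Wrappers extension, the setup looks roughly like this (handler and option names follow the Wrappers docs; the connection string is a placeholder, so substitute your own):

```sql
-- Enable the Wrappers extension (once per database):
create extension if not exists wrappers with schema extensions;

-- Register the ClickHouse wrapper:
create foreign data wrapper clickhouse_wrapper
  handler click_house_fdw_handler
  validator click_house_fdw_validator;

-- Point a foreign server at your ClickHouse instance
-- (placeholder credentials; use your own connection string):
create server clickhouse_server
  foreign data wrapper clickhouse_wrapper
  options (
    conn_string 'tcp://default:@localhost:9000/default'
  );
```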
The wrapper now has support for ClickHouse Parameterized Views. With this update, you can pass query parameters directly to ClickHouse, taking full advantage of its analytical engine:
```sql
create foreign table user_analytics (
  id bigint,
  user_id bigint,
  _event text
)
server clickhouse_server
options ( table '(select * from UserAnalytics(event=${_event}))' );

select * from user_analytics where _event = 'button_click';
```
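On the ClickHouse side, `UserAnalytics` would be a parameterized view. A minimal sketch, assuming a hypothetical `events` table:

```sql
-- ClickHouse: a parameterized view over a hypothetical events table.
-- {event:String} is the query parameter supplied at SELECT time.
CREATE VIEW UserAnalytics AS
SELECT id, user_id, event
FROM events
WHERE event = {event:String};

-- Queried directly in ClickHouse as:
SELECT * FROM UserAnalytics(event = 'button_click');
```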
More granular replication control
Many of our customers use PeerDB to replicate data from Postgres to ClickHouse. This has occasionally presented challenges, particularly with the default 1GB WAL size: with large data volumes, changes can be lost if the WAL exceeds this limit before they are replicated.
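To see how much WAL a replication slot is currently retaining, you can query the standard Postgres catalog (this is plain Postgres, not Supabase-specific):

```sql
-- How much WAL each replication slot is holding back:
select
  slot_name,
  active,
  pg_size_pretty(
    pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
  ) as retained_wal
from pg_replication_slots;
```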
To resolve this, we've added 13 configurable Postgres parameters, enabling you to adjust replication settings through the CLI. For example, you can increase the default WAL size to 2GB:
```bash
supabase --experimental --project-ref xxxx-yyy \
  postgres-config update --config max_slot_wal_keep_size=2GB
```
The new CLI config includes the following Postgres parameters:
- logical_decoding_work_mem: Controls memory used during logical decoding.
- max_connections: Limits total connections to the Postgres server.
- max_locks_per_transaction: Sets the maximum locks allowed in a single transaction.
- max_replication_slots: Defines the number of replication slots for data streaming.
- max_slot_wal_keep_size: Limits disk space for WAL in replication slots.
- max_standby_archive_delay: Sets the maximum delay before canceling queries that conflict with WAL replay from archived segments.
- max_standby_streaming_delay: Sets the maximum delay before canceling queries that conflict with WAL replay from streaming replication.
- max_wal_size: Specifies the maximum size of the Write Ahead Log.
- max_wal_senders: Sets the maximum number of processes sending WAL data.
- max_worker_processes: Defines the number of background worker processes.
- shared_buffers: Determines the amount of memory for shared buffers.
- wal_keep_size: Sets minimum WAL size to keep for standby servers.
- wal_sender_timeout: Specifies the timeout for inactive WAL sender processes.
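Once a parameter has been updated, you can confirm the running value from any SQL client using standard Postgres commands:

```sql
-- Confirm the new replication settings took effect:
show max_slot_wal_keep_size;

select name, setting, unit
from pg_settings
where name in ('max_slot_wal_keep_size', 'max_replication_slots');
```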
Improved Disk Management
Supabase now provides granular control over disk usage for your Postgres database. This is driven directly by customers using tools like PeerDB: with adjustable WAL configuration, it's important that developers can manage the disk as well. For example, on the Pro Plan's 8GB disk, you can split the space between database storage and the Write Ahead Log:
- Default: 7GB database space, 1GB Write Ahead Log
- Custom example: 6GB database space, 2GB Write Ahead Log
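To keep an eye on how much of that allocation the WAL is actually using, standard Postgres functions can report it (a sketch; `pg_ls_waldir` requires superuser or `pg_monitor` privileges):

```sql
-- Current on-disk size of the Write Ahead Log:
select pg_size_pretty(sum(size)) as wal_size
from pg_ls_waldir();
```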
Additionally, we're introducing High-performance Disks. We'll release more details about this later.
ClickHouse platform updates
The ClickHouse team has also been busy. They've released a number of updates to their platform, including:
- A native Supabase OAuth integration in PeerDB for Postgres CDC to ClickHouse.
- Support for IPv6 in PeerDB Cloud.
You can learn more about these features in the Supabase Partnership post they released today.
What's next?
Improving the experience between Postgres and ClickHouse is the first phase of this partnership. We're already working on native platform integrations. If you're using (or plan to use) Supabase and ClickHouse together, please reach out - we'd love more design partners to help shape the future of this integration.
If you simply want to try out the tools and updates we've described above, you can get started with all of them, free of charge:
- Supabase: database.new
- ClickHouse: clickhouse.com
- PeerDB: peerdb.io