Encrypt columns using Transparent Column Encryption.
Supabase provides a secure method for encrypting columns using Vault, our Postgres secrets manager. Vault is Postgres extension with an integrated UI intended to act as a secure global secrets management for you project.
Vault enables an advanced feature called Transparent Column Encryption (TCE) which provides a safe way to encrypt your data so that it doesn't leak into logs and backups. It can also provide row-level authenticated encryption.
When creating a new column in the Dashboard, you can choose to encrypt a
bytea column. You will choose which key you would like to encrypt it with by selecting an existing key ID or creating a new one.
Once you've created an encrypted column, you can insert data into the table like you would any other table. For example if you were to insert an email address into an encrypted column, you will see that the address is transparently converted into an encrypted value on the new row.
Decrypted data is accessed using a special view that is automatically created after adding an encrypted column to a table. This view decrypts the data row-by-row as you access it. By default, this view is called
decrypted_<your-table-name>. In the example below, the decryption view for the
profiles table is called
decrypted_profiles. Notice there is a new column in the view called
decrypted_emails that contains the decrypted email value.
Using an Encrypted Table#
Now that you have TCE setup for a table, it's easy to use by simply inserting data into the table, and querying that data by looking at its generated view. The view is named
decrypted_<table_name> and by default is in the same schema as your table:
_10insert into secrets_10(secret, account_id)_10values_10('1234-5678-8765-4321', 123);
Now that you have inserted data, look at the table and notice how the secret is encrypted. This is the data that is stored on disk, the encrypted card number, the key id, and the account id, but the key itself is not stored. This means if someone gets a backup or dump of your database, they cannot decrypt the secret, they do not have the key, only the key ID:
_10> select * from secrets where account_id = 123;_10-[ RECORD 1 ]------+---------------------------------------------------------------------_10id | 1_10secret | jf8KfImkKTr+j4gzyDZQtLDEFL9eSlFuKjNlNEJvDg+OIKUr2wjF/8NnYcLisb5F9xiN_10account_id | 123_10key_id | 7f753c4f-8c68-457a-8801-1798b2e9f44d_10nonce | \x300a14aa721184ff7cf0f6bf088da267
For you, the developer, you need the unencrypted secret for you application. No problem, you can access that data using the dynamically generated decryption view
_10> select * from decrypted_secrets where account_id = 123;_10-[ RECORD 1 ]----------------+---------------------------------------------------------------------_10id | 1_10secret | jf8KfImkKTr+j4gzyDZQtLDEFL9eSlFuKjNlNEJvDg+OIKUr2wjF/8NnYcLisb5F9xiN_10decrypted_secret | 1234-5678-8765-4321_10account_id | 123_10key_id | 7f753c4f-8c68-457a-8801-1798b2e9f44d_10nonce | \x300a14aa721184ff7cf0f6bf088da267
Notice how there is a new column called
decrypted_secret. This column is not stored in database or on disk at all, it is generated “on-the-fly” as you select from the view. Database dumps do not contain this information, only the view itself, and most importantly, raw decryption keys are never stored.
How Key Derivation Works#
The current state-of-the-art in encryption libraries is libsodium.
libsodium offers a range of APIs for authenticated secret and public key encryption, key derivation, encrypted streaming, AEAD, various forms of hashing, and much more.
This powerful API is available to PostgreSQL using the pgsodium extension. pgsodium provides all the functionality of the full libsodium API, but previously it required developers to set up database encryption themselves, which remained a challenge even for those familiar with database administration.
To solve this problem, pgsodium now has a full key management API, primarily via the table
pgsodium.key and the
pgsodium.create_key() function. This key table contains no raw keys, but instead uses libsodium Key IDs to derive keys that are used internally for encryption. A key derivation function is used with an internal root key that is unavailable to SQL and not stored in the database, but rather managed by you externally using flexible scripts, or by Supabase automatically as part of our service offering.
The simplest way to use pgsodium to encrypt and decrypt data is to first create a Key ID. Valid Key IDs are stored in pgsodium in a special extension table, and they can be created using the
pgsodium.create_key() function. This function takes a number of arguments depending on how it's used, but the simplest case is to create a new key with no arguments:
_11select * from pgsodium.create_key();_11-[ RECORD 1 ]---+-------------------------------------_11id | eaa20d8c-c77c-4985-9f73-2a5f5d1f1e6d_11name |_11status | valid_11key_type | aead-det_11key_id | 2_11key_context | \x7067736f6469756d_11created | 2022-11-13 21:19:35.765823+00_11expires |_11associated_data |
This key can now be used with pgsodium encryption functions by its UUID (
eaa20d8c-c77c-4985-9f73-2a5f5d1f1e6d). For example:
_10select * from pgsodium.crypto_aead_det_encrypt (_10'this is the message', -- a message to encrypt_10'this is associated data', -- some authenticated associated data_10'eaa20d8c-c77c-4985-9f73-2a5f5d1f1e6d'::uuid -- key ID_10);
This produces the following encrypted “ciphertext” using the
aead-det algorithm from the libsodium XChaCha20-SIV encryption function.
_10-[ RECORD 1 ]-----------+---------------------------------------------------------------------------------------------------------_10crypto_aead_det_encrypt | \\x099baa820250d7375ed141f8f1936af384bc229f3de1010a6eff6ffdaf3998baffbae75b5cd83d1c469407ff2d3764a428b742
Now to decrypt the ciphertext, pass it to the decryption function with the same Key ID:
_10select *_10from convert_from(pgsodium.crypto_aead_det_decrypt (_10'\\x099baa820250d7375ed141f8f1936af384bc229f3de1010a6eff6ffdaf3998baffbae75b5cd83d1c469407ff2d3764a428b742',_10'this is associated data',_10'eaa20d8c-c77c-4985-9f73-2a5f5d1f1e6d'::uuid_10), 'utf8');
Which recovers the original “plaintext” message:
_10-[ RECORD 1 ]+--------------------_10convert_from | this is the message
In the above example, there is no raw key, only a Key ID which is used to derive the key used to encrypt the message and authenticate it with the associated data. In fact, it is impossible for a SQL user to derive the key used above, and if the Key ID is stored, then no encryption keys or decrypted information will leak into backups, disk storage, or the database WAL stream.
Transparent Column Encryption#
As of pgsodium 3.0.0 and up, the extension offers a simple and declarative Transparent Column Encryption feature (TCE). This feature is now shipped with all Supabase projects. TCE allows you to specify encrypted columns within a table and generates a new view that “wraps” that table to decrypt the contents.
TCE works using two dynamically generated objects for tables that contain encrypted columns:
INSERT UPDATEtrigger that encrypts data when it is inserted or modified
- a view that is created to wrap the table to decrypt the data when it is accessed
To “transparently” decrypt the table, access the dynamically generated view instead of the table. For every encrypted column in the table, the view will have an additional decrypted column that shows the decrypted result.
It's worth noting at this point that sometimes there is some confusion about handling encrypted data with TCE. The
T stands for Transparent which means, you can always see decrypted data through the view, where the decrypted data can't be see is when stored on disk, or in pg_dumps, backups, WAL streams, etc. This is often called Encryption At Rest and is one layer in many that may be used to encrypt and protect your data.
Often Transparent encryption is understood to be “Transparent Disk Encryption” or “Full Disk Encryption”, this is where a drive is encrypted but reading and writing that drive is decrypted. TCE is similar to this, where data on disk is encrypted, but it is more fine grained, only particular columns are encrypted. The data is also encrypted in the sense that the table stored on disk contains encrypted data, without the view or the key, pg_dumps and backups still contain encrypted data, this is not possible with disk-only encryption.
For the moment TCE only works for columns of type
text (or types castable to
json). Soon we will also support
bytea and possibly more as use cases and tests get better.
TCE uses one of PostgreSQL's lesser-known features:
SECURITY LABEL. A security label can be thought of as a simple label which is attached to an object (a table, column, etc). Each label is scoped to an extension and that extension can provide security features depending on the label.
Let's see a simple example of using
SECURITY LABEL to encrypt a column using pgsodium.
One Key ID for the Entire Column#
For the simplest case, a column can be encrypted with one Key ID which must be of the type
aead-det (as created above):
_10create table secrets (_10id bigserial primary key,_10secret_number text_10);_10_10security label for pgsodium_10on column secrets.secret_number_10is 'ENCRYPT WITH KEY ID e348034b-3f07-4878-aad6-000511d12826';
The advantage of this approach is simplicity - the user creates one key and labels a column with it. The cryptographic algorithm for this approach uses a nonceless encryption algorithm called
crypto_aead_det_xchacha20(). This algorithm is written by the author of libsodium and can be found here.
Using one key for an entire column means that whoever can decrypt one row can decrypt them all from a database dump. Also changing (rotating) the key means rewriting the whole table.
One Key ID per Row#
A more fine grained approach would be storing one Key ID per row:
_10create table secrets (_10id bigserial primary key,_10secret text,_10key_id uuid not null default 'e348034b-3f07-4878-aad6-000511d12826'::uuid_10);_10_10security label for pgsodium_10on column secrets.secret_10is 'ENCRYPT WITH KEY COLUMN key_id';
This approach ensures that a key for one user doesn't necessarily decrypt any others. While rows can share key IDs, they don't necessarily have to (unlike the first example above where one key id was used for the entire column). It also acts as a natural partition that can work in conjunction with Row Level Security to share distinct keys between owners.
Notice also how there is a
DEFAULT value for the
key_id. In a way, this gives you the best of both approaches - encrypting the column when a per-row key ID is not provided. The downside to this approach is that you need to store one key ID per row, which takes up more disk space (but that's cheap!).
One Key ID per Row with Nonce Support#
The default cryptographic algorithm for the above approach uses a nonceless encryption algorithm called
crypto_aead_det_xchacha20(). This algorithm has the advantage that it does not require nonce values, the disadvantage is that duplicate plaintexts will produce duplicate ciphertexts.
Nonces are some extra cryptographic context that is used in many cryptographic algorithms to produce different ciphertexts, even if the plaintexts are the same. The nonce does not have to be secret, but it does have to be unique. pgsodium comes with a useful function
pgsodium.crypto_aead_det_noncegen() that will generate a cryptographically secure nonce for you, and in almost all cases it's best to use that function unless you know specifically what you are doing. In password hashing approaches, this is often similar to how a “salt” value is used to deduplicate password hashes.
Duplicate ciphertexts cannot be used to “attack the key”, it can only reveal the duplication. However, duplication is still information. In our examples so far, an attacker might be able to use this information to determine that two accounts share the same secret. While not technically breaking the encryption, this still leaks information to an attacker.
_10create table secrets (_10id bigserial primary key,_10secret text,_10key_id uuid not null default 'e348034b-3f07-4878-aad6-000511d12826'::uuid,_10nonce bytea default pgsodium.crypto_aead_det_noncegen()_10);_10_10security label for pgsodium_10on column secrets.secret_10is 'ENCRYPT WITH KEY COLUMN key_id NONCE nonce';
This is the most secure form of TCE - there is a unique key ID and a unique nonce per row.
One Key ID per Row with Associated Data#
The encryption that is used for TCE is one of a family of functions provided by libsodium to do Authenticated Encryption with Associated Data or AEAD Encryption. The “associated” data is plaintext (unencrypted) information that is mixed into the authentication signature of the encrypted data, such that when you authenticate the data, you also know that the associated data is authentic.
AEAD is helpful because often you have metadata associated with a secret, which isn't confidential but must not be forged.
In our secret example, we might associate a "
secret" with an "
account_id". But what if a malicious actor wanted to use someone else's secret on their own account? If someone could forge the
account_id data column, swapping an
account_id with their own
account_id, then you could be tricked into using the wrong secret. By “associating” the
account_id with the
secret, it cannot be forged without throwing an error.
Like above, this is done simply by extending the security label with the associated data column:
_11create table secrets (_11id bigserial primary key,_11secret text,_11account_id integer,_11key_id uuid not null default 'e348034b-3f07-4878-aad6-000511d12826'::uuid,_11nonce bytea default pgsodium.crypto_aead_det_noncegen()_11);_11_11security label for pgsodium_11on column secrets.secret_11is 'ENCRYPT WITH KEY COLUMN key_id ASSOCIATED (account_id) NONCE nonce';
The new label indicates which column is to be associated with the secret, and that's it! Your
account_id and secret are now protected under the same authentication signature as the secret itself.
- Supabase Vault
- Read more about Supabase Vault in the blog post
- Supabase Vault on GitHub