All assets uploaded to Supabase Storage are cached on a Content Delivery Network (CDN) to reduce latency for users around the world. A CDN is a geographically distributed set of servers, or nodes, that cache content from an origin server. For Supabase Storage, the origin is the storage server running in the same region as your project. Aside from performance, CDNs also help with security and availability by mitigating Distributed Denial of Service (DDoS) and other application attacks.
Our basic CDN caches objects based on the cache time set when uploading objects.
Let's walk through an example of how a CDN helps with performance.
A new bucket is created for a Supabase project launched in Singapore. All requests to the Supabase Storage API are routed to the CDN first.
A user from the United States requests an object and is routed to the U.S. CDN. At this point, that CDN node does not have the object in its cache and pings the origin server in Singapore.
Another user, also in the United States, requests the same object and is served directly from the CDN cache in the United States instead of routing the request back to Singapore.
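The walkthrough above can be sketched as a toy model. This is only an illustration of the miss-then-hit pattern, not how the CDN is implemented; the `EdgeNode` class and the in-memory `singapore_origin` mapping are hypothetical stand-ins.

```python
class EdgeNode:
    """Toy CDN node: serves from its local cache, falls back to the origin."""

    def __init__(self, origin):
        # `origin` is a plain dict standing in for the storage server (assumption).
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path in self.cache:
            # The object is already cached at this node: no trip to the origin.
            return "HIT", self.cache[path]
        # First request at this node: fetch from the origin and remember it.
        body = self.origin[path]
        self.cache[path] = body
        return "MISS", body


singapore_origin = {"profile-pictures/cat.jpg": b"cat-bytes"}
us_node = EdgeNode(singapore_origin)

first_status, _ = us_node.get("profile-pictures/cat.jpg")   # first U.S. user
second_status, _ = us_node.get("profile-pictures/cat.jpg")  # second U.S. user
print(first_status, second_status)  # prints "MISS HIT"
```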
By default, assets are cached both in the CDN and in the user's browser for 1 hour. After this, the CDN nodes ping the storage server to see if an object has been updated. You can modify this cache time when you are uploading or updating an object by modifying the
If you expect the object to not change at a given URL, setting a longer cache duration is preferable.
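As a rough sketch, cache durations can be computed in seconds rather than hard-coded. This assumes the cache time is expressed in seconds, as in the HTTP `Cache-Control: max-age` directive; the helper names below are hypothetical and not part of any Supabase library.

```python
from datetime import timedelta


def cache_seconds(duration: timedelta) -> str:
    """Return a duration as a whole number of seconds, formatted as a string."""
    return str(int(duration.total_seconds()))


def cache_control_header(duration: timedelta) -> str:
    """Build a standard Cache-Control max-age value for the given duration."""
    return f"max-age={cache_seconds(duration)}"


default_ttl = cache_seconds(timedelta(hours=1))    # "3600", the 1 hour default
one_year = cache_seconds(timedelta(days=365))      # "31536000"
print(cache_control_header(timedelta(days=365)))   # prints "max-age=31536000"
```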
If you need to update the version of the object stored in the CDN, there are various cache-busting techniques you can use. The most common way to do this is to add a version query parameter in the URL.
For example, you can use a URL like /storage/v1/object/sign/profile-pictures/cat.jpg?token=eyJh...&version=1 in your applications and set a long cache time of 1 year.
When you want to update the cat picture, you can increment the version query parameter in the URL. The CDN treats /storage/v1/object/sign/profile-pictures/cat.jpg?token=eyJh...&version=2 as a new object and pings the origin for the updated version.
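One way to apply this version-parameter technique in application code is a small helper that sets or replaces the `version` query parameter. This is a minimal sketch using only the Python standard library; the `with_version` name and the `abc` token are placeholders for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: int) -> str:
    """Return `url` with its `version` query parameter set to `version`."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous `version`.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "version"
    ]
    query.append(("version", str(version)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# "abc" is a placeholder token, not a real signed URL token.
old_url = "/storage/v1/object/sign/profile-pictures/cat.jpg?token=abc&version=1"
new_url = with_version(old_url, 2)
print(new_url)
```

Because the query string differs, the basic CDN caches the result under a separate entry, so the long cache time on the old URL no longer matters.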
If your asset is updated frequently, we recommend that you re-upload the new asset under a different key. This way, the latest changes are always available immediately.
Note that CDNs might still evict your object from their cache if it has not been requested for a while from a specific region. For example, if no user from the United States requests your object for some time, it may be removed from the U.S. CDN cache even if you set a very long cache control duration.
The cache status of a particular request is sent in the cf-cache-status header. A cache status of MISS indicates that the CDN node did not have the object in its cache and had to ping the origin to get it. A cache status of HIT indicates that the object was sent directly from the CDN.
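The header check described above can be sketched as a small function over a response's headers. It assumes you already have the headers as a name-to-value mapping from whatever HTTP client you use; `describe_cache_status` is a hypothetical helper name.

```python
def describe_cache_status(headers: dict) -> str:
    """Interpret the cf-cache-status header of a response."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    status = normalized.get("cf-cache-status", "").upper()
    if status == "HIT":
        return "served from the CDN cache"
    if status == "MISS":
        return "fetched from the origin"
    return f"other status: {status or 'header missing'}"


print(describe_cache_status({"CF-Cache-Status": "MISS"}))
print(describe_cache_status({"cf-cache-status": "HIT"}))
```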
Smart CDN Caching
Smart CDN caching is automatically enabled on the Pro plan and above.
With Smart CDN caching enabled, the asset metadata in your database is synchronized to the edge. This automatically revalidates the cache when the asset is changed or deleted.
Additionally, Smart CDN has a higher cache hit ratio because the origin server is shielded from requests for assets that haven't changed, even when those requests use different query strings in the URL.
When Smart CDN is enabled, the asset is cached on the CDN for as long as possible. You can still control how long assets are stored in the browser using the cacheControl option when uploading a file. Smart CDN caching works with all types of storage operations including signed URLs.
When a file is updated or deleted, the CDN cache is automatically invalidated to reflect the change (including transformed images). It can take up to 60 seconds for the CDN cache to be invalidated, as the asset metadata has to propagate across all the data centers around the globe.
When an asset is invalidated at the CDN level, browsers may not update their own caches.
If your asset is updated frequently, we recommend that you re-upload the new asset under a different key. This way, the latest changes are always available without waiting for the propagation delay. If you expect your asset to be deleted, we recommend setting a low browser TTL using the cacheControl option when using Smart CDN caching. The default is 1 hour, which is generally a good choice.
Public vs Private Buckets
Objects in public buckets can be accessed without any authorization. This leads to a better cache hit rate compared to private buckets. For private buckets, permissions for accessing each object are checked on a per-user level. For example, if two different users access the same object in a private bucket from the same region, it results in a cache miss for both users, since they might have different security policies attached to them. On the other hand, if two different users access the same object in a public bucket from the same region, it results in a cache hit for the second user.
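The difference can be pictured with a toy model in which the cache key for a private bucket also includes the requesting user. This is purely illustrative and is an assumption about how to model the behavior, not a description of the actual CDN internals; `cache_key` and `simulate` are hypothetical names.

```python
def cache_key(path, bucket_is_public, user_id=None):
    """Toy cache key: private buckets are keyed per user (illustrative assumption)."""
    return (path,) if bucket_is_public else (path, user_id)


def simulate(requests, bucket_is_public):
    """Return the cache status for each (user_id, path) request, in order."""
    seen = set()
    statuses = []
    for user_id, path in requests:
        key = cache_key(path, bucket_is_public, user_id)
        statuses.append("HIT" if key in seen else "MISS")
        seen.add(key)
    return statuses


# Two different users in the same region request the same object.
requests = [("alice", "cat.jpg"), ("bob", "cat.jpg")]
public_statuses = simulate(requests, bucket_is_public=True)
private_statuses = simulate(requests, bucket_is_public=False)
print(public_statuses)   # prints ['MISS', 'HIT']
print(private_statuses)  # prints ['MISS', 'MISS']
```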
Debugging Cache Hits
The storage report gives a breakdown of your project's cache hit rate.
The Logs Explorer can also be used to debug cache misses.
Cache hits can be determined via the metadata.response.headers.cf_cache_status key. Any value corresponding to either HIT, STALE, REVALIDATED, or UPDATING is considered a cache hit.
The following example query shows the top cache misses from edge_logs:
select
  r.path as path,
  r.search as search,
  count(id) as count
from
  edge_logs as f
  cross join unnest(f.metadata) as m
  cross join unnest(m.request) as r
  cross join unnest(m.response) as res
  cross join unnest(res.headers) as h
where
  starts_with(r.path, '/storage/v1/object')
  and r.method = 'GET'
  and h.cf_cache_status in ('MISS', 'NONE/UNKNOWN', 'EXPIRED', 'BYPASS', 'DYNAMIC')
group by path, search
order by count desc
limit 50;
Try out this query in the Logs Explorer here.
Your cache hit ratio over time can then be determined using the following query:
select
  timestamp_trunc(timestamp, hour) as timestamp,
  countif(h.cf_cache_status in ('HIT', 'STALE', 'REVALIDATED', 'UPDATING')) / count(f.id) as ratio
from
  edge_logs as f
  cross join unnest(f.metadata) as m
  cross join unnest(m.request) as r
  cross join unnest(m.response) as res
  cross join unnest(res.headers) as h
where starts_with(r.path, '/storage/v1/object') and r.method = 'GET'
group by timestamp
order by timestamp desc;
Try out this query in the Logs Explorer here.
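If you export cache statuses for offline analysis, the same hit-ratio calculation can be reproduced in a few lines. This sketch mirrors the `countif(...) / count(...)` expression in the query above; the `sample` list is made-up data for illustration, and returning 0.0 for an empty input is a choice made here, not something the query defines.

```python
# Statuses counted as cache hits, matching the list used in the query above.
HIT_STATUSES = {"HIT", "STALE", "REVALIDATED", "UPDATING"}


def cache_hit_ratio(statuses):
    """Fraction of requests whose cf_cache_status counts as a cache hit."""
    statuses = list(statuses)
    if not statuses:
        # Avoid dividing by zero when there are no requests (design choice).
        return 0.0
    hits = sum(1 for status in statuses if status in HIT_STATUSES)
    return hits / len(statuses)


sample = ["HIT", "MISS", "STALE", "EXPIRED"]  # made-up example data
ratio = cache_hit_ratio(sample)
print(ratio)  # prints 0.5
```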