The worst bugs in multi-tenant SaaS apps look like normal code. No crash, no error, no stack trace. Tenant A's customer just sees Tenant B's data, and you find out about it from a screenshot in a support ticket.
If you're on Supabase or any Postgres-with-row-level-security stack, the surface area is enormous and your AI coding agent doesn't know about most of it. Claude Code, Cursor, and Codex can read your TypeScript all day; they have no idea whether the table you just queried actually enforces a tenant scope. They'll happily ship the bug for you.
This post is about what a tenant-leak audit actually checks, why agents need explicit help to do it, and a concrete workflow for running one — including the bugs it finds that no test suite will catch.
The most common shape:
```ts
// app/api/projects/route.ts
export async function GET(req: Request) {
  const user = await verifySession(req);
  const { data } = await supabase
    .from("projects")
    .select("*");
  return Response.json(data);
}
```
Looks fine. Reads a table, returns the rows. Test passes — the user has projects, they get projects back.
What's missing: the query has no `.eq("org_id", user.orgId)` filter. Whether that's a bug depends entirely on the RLS policy on `projects`. If RLS is enforced and the policy correctly checks `auth.uid()` against an `org_membership` table, the database silently filters and the code is correct. If RLS is disabled, or the policy is `using (true)`, or the policy compares against the wrong column — every user gets every project.
This is the trap: the same code is either correct or catastrophically wrong, depending entirely on database state your AI agent never sees.
Real cross-tenant leaks I've seen (or shipped, then fixed):
**1. RLS never enabled.** Someone created the table for a quick prototype, forgot to run `ALTER TABLE ... ENABLE ROW LEVEL SECURITY`, and shipped. The migration history shows the table; it does not show RLS being enabled. By default, tables created via SQL in Supabase have RLS off until you turn it on.
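This class is cheap to detect with catalog introspection. Here is a sketch of a query that flags public tables carrying a tenant column but no RLS; the `org_id` column name is an assumption, so extend the list for your schema:

```sql
-- Tables that look tenant-keyed (have an org_id column) but have RLS disabled.
select c.relname as table_name
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
join information_schema.columns col
  on col.table_schema = n.nspname
 and col.table_name  = c.relname
where n.nspname = 'public'
  and c.relkind = 'r'            -- ordinary tables only
  and col.column_name = 'org_id'
  and not c.relrowsecurity;      -- RLS not enabled
```

Run it in any SQL console; an empty result set is the goal.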
**2. A `using (true)` policy.** Common during early development:

```sql
create policy "allow all" on projects
  for select using (true);
```

Effectively the same as RLS off, but harder to spot — the table shows as "RLS enabled" in the Supabase dashboard. Devs see the green shield and assume they're protected.
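These show up directly in `pg_policies`, where each policy's `USING` expression is stored as text. A sketch:

```sql
-- Policies whose USING clause is literally "true": RLS theater.
select schemaname, tablename, policyname, cmd
from pg_policies
where qual = 'true';
```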
**3. A policy that checks the wrong column.**

```sql
create policy "tenant select" on projects
  for select using (auth.uid() = user_id);
```

This filters by user, not org. If a user belongs to multiple orgs and the projects belong to orgs (not users), the comparison is nonsense — sometimes it permits, sometimes it doesn't, in ways that depend on row ordering and joins.
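A sketch of the org-keyed variant, assuming membership lives in an `org_membership` table (the table and column names are assumptions):

```sql
-- Fix sketch: key the policy on org membership, not direct user ownership.
create policy "tenant select" on projects
  for select using (
    exists (
      select 1 from org_membership m
      where m.user_id = auth.uid()
        and m.org_id = projects.org_id
    )
  );
```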
**4. A `SECURITY DEFINER` function with no filter.** A Postgres function declared `SECURITY DEFINER` runs with the permissions of its owner, not the caller. RLS is evaluated against the owner, and if the owner is the table owner or has `bypassrls` (the usual setup in Supabase), the policies don't fire at all. If the function doesn't manually filter, it returns everything.

```sql
create function get_dashboard_stats()
returns table (...) language sql
security definer
as $$
  -- no tenant filter — returns stats for ALL orgs
  select count(*) from projects;
$$;
```

App code calls `supabase.rpc("get_dashboard_stats")` and gets numbers from every tenant in the system, mixed together.
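You can enumerate the `SECURITY DEFINER` surface from `pg_proc` and then read each body for tenant filters. A sketch:

```sql
-- All SECURITY DEFINER functions in public; audit each body by hand.
select p.proname,
       pg_get_functiondef(p.oid) as definition
from pg_proc p
join pg_namespace n on n.oid = p.pronamespace
where n.nspname = 'public'
  and p.prosecdef;               -- the security-definer flag
```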
**5. The service-role key in client code.** Supabase ships two keys per project: `anon` (subject to RLS) and `service_role` (bypasses RLS entirely). Shipping the service-role key in the browser bundle defeats the whole security model: every query runs as god mode.

This used to be a common bug because the env var names look similar (`SUPABASE_ANON_KEY` vs `SUPABASE_SERVICE_ROLE_KEY`), and a single typo in `.env` moves you from "secure" to "every user is admin."
Why don't coding agents catch any of this? Three reasons, in increasing severity:
**1. The agent can't see the database.** Out of the box, Claude Code or Cursor knows your TypeScript. It does not know your schema. It cannot query `pg_policies`. It doesn't know which of your 47 tables have RLS enabled, which policies reference which columns, or which functions are `SECURITY DEFINER`.

You can paste your schema into CLAUDE.md, but that's a snapshot. It goes stale with the next migration. The agent will edit code that contradicts policies it doesn't know exist.
**2. The leak doesn't look wrong in the code.** Even if the agent could see the schema, the leaks don't look wrong in the code. A query without a tenant filter is correct when RLS is enforced. A query with a tenant filter is correct when RLS is off. The "right answer" depends on database state, not code shape. Static analysis of the TypeScript alone can't decide.
**3. Tests don't probe the right axis.** Most test suites use one tenant. They check that the data they inserted comes back. They don't check that another tenant's data doesn't come back. You'd need a multi-tenant integration test explicitly designed to catch leaks — and almost no one writes those until after their first incident.
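A multi-tenant probe doesn't require a full integration harness. Here is a sketch of a transactional check, assuming Supabase's `request.jwt.claims` convention; the UUID is a placeholder for a seeded test user:

```sql
-- Impersonate a specific user inside a rolled-back transaction and
-- eyeball (or assert on) what RLS actually lets them see.
begin;
set local role authenticated;
set local request.jwt.claims to '{"sub": "00000000-0000-0000-0000-0000000000aa"}';
-- Should return only rows from orgs this user belongs to.
select count(*) from projects;
rollback;
```

Repeat with a second user from a different org; if the counts overlap where they shouldn't, you have a leak.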
So: a tool the agent can't see, a bug pattern that looks like working code, and a test suite that doesn't probe the right axis. The leak ships.
A working tenant-leak audit asks four questions, each with a specific lookup:
1. **Which tables are tenant-keyed?** Look for an `org_id`, `tenant_id`, `workspace_id`, or similar column. Discoverable from the schema.
2. **Is RLS enabled, and what do the policies actually check?** Answerable from `pg_class` and `pg_policies`.
3. **Are there `SECURITY DEFINER` functions touching these tables without filtering?** Read the function bodies, look for the tenant column.
4. **Which app-code call sites reach these tables?** Trace each `.from(...)` call back to its route handler.

None of these questions are answerable from your TypeScript alone. Each is answerable from a typed query against live database introspection. That's the gap an MCP server with database awareness fills.
agentmako exposes a tenant_leak_audit MCP tool that runs the above against your live Postgres. Sample output:
```
tenant_leak_audit({})
→ {
    flagged: [
      {
        table: "projects",
        issue: "rls_disabled",
        severity: "high",
        detail: "Table has org_id column but RLS is disabled."
      },
      {
        table: "manager_district",
        issue: "policy_wrong_column",
        severity: "high",
        detail: "RLS policy references auth.uid() = user_id but table is keyed by district_id."
      },
      {
        function: "get_dashboard_stats",
        issue: "security_definer_no_filter",
        severity: "medium",
        detail: "SECURITY DEFINER function reads tenant-keyed tables without filter."
      },
      {
        route: "/api/projects",
        issue: "from_call_outside_session",
        severity: "low",
        detail: "Direct .from('projects') call without verifySession() in scope."
      }
    ],
    passed: 23,
    total_tenant_tables: 27
  }
```
The audit isn't proving anything. It's surfacing the things a human should look at. Each finding maps to a specific table, function, or route, with the exact reason it was flagged.
Take the first flag — `projects` has an `org_id` column but RLS is disabled. The agent's job: inspect the table, write the migration, verify.
```
db_table_schema({ schema: "public", table: "projects" })
→ columns: [id uuid, org_id uuid, name text, created_at timestamptz],
  rls: { enabled: false, policies: [] },
  indexes: [...]
```
Confirms the audit. Now the agent knows the column shape and can write the migration without guessing.
```
schema_usage({ schema: "public", object: "projects" })
→ reads: ["app/api/projects/route.ts:8 — .from('projects').select('*')",
          "app/dashboard/page.tsx:42 — .from('projects').select('id, name')"],
  writes: ["app/api/projects/route.ts:31 — .from('projects').insert(...)"],
  rpc_refs: []
```
Two reads, one write. The agent can now reason about which need a manual filter and which will be covered by RLS.
```sql
alter table public.projects enable row level security;

create policy "projects: org members can read"
  on public.projects
  for select
  using (
    exists (
      select 1 from public.org_membership m
      where m.user_id = auth.uid()
        and m.org_id = projects.org_id
    )
  );

create policy "projects: org admins can write"
  on public.projects
  for all
  using (
    exists (
      select 1 from public.org_membership m
      where m.user_id = auth.uid()
        and m.org_id = projects.org_id
        and m.role = 'admin'
    )
  );
```
The agent writes this migration with full knowledge of the schema shape (the org_membership join), not from a guess. The policies map cleanly to existing app-code expectations.
```
db_reef_refresh()
tenant_leak_audit({})
→ { flagged: [...3 remaining...], passed: 24, total_tenant_tables: 27 }
```
The projects table is gone from the flagged list. The remaining three are still there for the next round.
Important detail: the audit results live in Reef — agentmako's durable findings store. That means findings persist across runs, and acknowledgements stick: when you dismiss a finding with a note ("manager_district — intentional, this is a global reference table, not tenant-keyed"), the ack is bound to the finding's identity, not its line number. Future re-runs respect it.

This matters because tenant-leak audits aren't one-shot. Schemas change. Migrations land. Someone disables RLS for a hotfix and forgets to turn it back on. Persistent findings mean the audit catches the regression on the next run, not three months later.
The same audit shape works for any Postgres app — Drizzle, Prisma, Kysely, raw pg. The details that change:

- **Session anchor.** On Supabase, `auth.uid()` is the session anchor; policies usually reference an `org_membership` or `workspace_membership` join. On plain Postgres, the app typically sets a session variable (`SET app.user_id = '...'`) and policies reference that.
- **Call-site detection.** With an ORM, `.from(...)`-style call detection becomes ORM-method detection. agentmako's `schema_usage` covers all of these.

The pattern doesn't depend on Supabase specifically. It depends on: (a) RLS as the enforcement layer, (b) a structured way to query schema + policies, (c) a structured way to find app-code call sites. Any stack with those three has the same audit available.
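For the non-Supabase case, the session-variable flavor looks roughly like this; the `app.org_id` variable name is an assumption:

```sql
-- Policy keyed off a per-request session variable instead of auth.uid().
create policy "projects: tenant read" on projects
  for select
  using (org_id = current_setting('app.org_id', true)::uuid);

-- App side, inside each request's transaction:
-- set local app.org_id = '<tenant uuid>';
```

The `true` second argument to `current_setting` makes the read return null instead of erroring when the variable is unset, so unauthenticated sessions match nothing.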
Add the audit to your pre-commit hook, or run it weekly, or both.
None of this requires you to write a custom static analyzer. The audit is a single MCP tool call. The trick is having a context engine that can see the database and the code and the import graph in one place — which is exactly what agentmako ships.
If you're on Supabase or Postgres and have RLS-protected tables:
1. `npm install -g agentmako`
2. `agentmako connect .` — interactive, prompts for your DB URL (stored in your OS keychain).
3. Run `tenant_leak_audit` with no args.

It takes about 30 seconds. The first time you run it, you'll probably find at least one issue you didn't know was there. That's the point — these bugs are invisible until something explicitly looks for them.
Better to find your tenant leak from a tool than from a screenshot in a support ticket.
agentmako is local-first, Apache-2.0, and works with every MCP-compatible coding agent.
```
npm install -g agentmako
```