Running AI Models
How to run AI models in Edge Functions.
Supabase Edge Runtime has a built-in API for running AI models. You can use this API to generate embeddings, build conversational workflows, and perform other AI-related tasks in your Edge Functions.
Setup
There are no external dependencies or packages to install to enable the API.
You can create a new inference session like this:

const model = new Supabase.ai.Session('model-name')
To get type hints and checks for the API, you can import types from functions-js at the top of your file:

import 'jsr:@supabase/functions-js/edge-runtime.d.ts'
Running a model inference
Once the session is instantiated, you can call it with inputs to perform inferences. Depending on the model you run, you may need to provide different options (discussed below).
const output = await model.run(input, options)
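The exact options depend on the model. As a rough sketch based on the examples later in this guide, an embedding model such as gte-small takes pooling and normalization flags, while an LLM session accepts a stream flag to stream output as it is generated (llmSession here is just a placeholder name for a session created with an LLM model):

// Embedding model (e.g. gte-small): pool and normalize the output vector
const embedding = await model.run('hello world', { mean_pool: true, normalize: true })

// LLM session (e.g. backed by Ollama): stream tokens back as they are generated
const completion = await llmSession.run('write a haiku about Postgres', { stream: true })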
How to generate text embeddings
Now let's see how to write an Edge Function using the Supabase.ai API to generate text embeddings. Currently, the Supabase.ai API only supports the gte-small model.

The gte-small model caters exclusively to English text, and lengthy inputs will be truncated to a maximum of 512 tokens. While you can provide inputs longer than 512 tokens, the truncation may affect the accuracy of the embedding.
const model = new Supabase.ai.Session('gte-small')

Deno.serve(async (req: Request) => {
  const params = new URL(req.url).searchParams
  const input = params.get('input')
  const output = await model.run(input, { mean_pool: true, normalize: true })
  return new Response(JSON.stringify(output), {
    headers: {
      'Content-Type': 'application/json',
      Connection: 'keep-alive',
    },
  })
})
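You can then invoke the function with an input query parameter. Assuming the function is served locally under a hypothetical name like text-embedding (substitute your own function name), the request would look something like this:

curl --get "http://localhost:54321/functions/v1/text-embedding" \
  --data-urlencode "input=hello world" \
  -H "Authorization: $ANON_KEY"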
Using Large Language Models (LLM)
Inference with larger models is supported via Ollama and Mozilla Llamafile. In the first iteration, you can use it with a self-managed Ollama or Llamafile server. We are progressively rolling out support for the hosted solution. To sign up for early access, fill out this form.
Running locally
Install Ollama and pull the Mistral model
ollama pull mistral
Run the Ollama server locally
ollama serve
Set a function secret called AI_INFERENCE_API_HOST to point to the Ollama server
echo "AI_INFERENCE_API_HOST=http://host.docker.internal:11434" >> supabase/functions/.env
Create a new function with the following code
supabase functions new ollama-test
import 'jsr:@supabase/functions-js/edge-runtime.d.ts'

const session = new Supabase.ai.Session('mistral')

Deno.serve(async (req: Request) => {
  const params = new URL(req.url).searchParams
  const prompt = params.get('prompt') ?? ''

  // Get the output as a stream
  const output = await session.run(prompt, { stream: true })

  const headers = new Headers({
    'Content-Type': 'text/event-stream',
    Connection: 'keep-alive',
  })

  // Create a stream
  const stream = new ReadableStream({
    async start(controller) {
      const encoder = new TextEncoder()
      try {
        for await (const chunk of output) {
          controller.enqueue(encoder.encode(chunk.response ?? ''))
        }
      } catch (err) {
        console.error('Stream error:', err)
      } finally {
        controller.close()
      }
    },
  })

  // Return the stream to the user
  return new Response(stream, {
    headers,
  })
})
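If you don't need streaming, you can omit the stream option and return the whole completion at once. The shape of the non-streamed result isn't shown in this guide, so the sketch below assumes it carries the same response field as each streamed chunk does:

// Minimal sketch, assuming the non-streamed result mirrors the streamed chunks
const result = await session.run(prompt, { stream: false })
return new Response(JSON.stringify(result), {
  headers: { 'Content-Type': 'application/json' },
})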
Serve the function
supabase functions serve --env-file supabase/functions/.env
Execute the function
curl --get "http://localhost:54321/functions/v1/ollama-test" \
  --data-urlencode "prompt=write a short rap song about Supabase, the Postgres Developer platform, as sung by Nicki Minaj" \
  -H "Authorization: $ANON_KEY"
Deploying to production
Once the function is working locally, it's time to deploy to production.
Deploy an Ollama or Llamafile server and set a function secret called AI_INFERENCE_API_HOST to point to the deployed server
supabase secrets set AI_INFERENCE_API_HOST=https://path-to-your-llm-server/
Deploy the Supabase function
supabase functions deploy
Execute the function
curl --get "https://project-ref.supabase.co/functions/v1/ollama-test" \
  --data-urlencode "prompt=write a short rap song about Supabase, the Postgres Developer platform, as sung by Nicki Minaj" \
  -H "Authorization: $ANON_KEY"
Note that running Ollama locally is typically slower than running it on a server with dedicated GPUs. We are collaborating with the Ollama team to improve local performance.
In the future, a hosted LLM API will be provided as part of the Supabase platform. Supabase will scale and manage the API and GPUs for you. To sign up for early access, fill out this form.