Edge Functions

Running AI Models

How to run AI models in Edge Functions.

Supabase Edge Runtime has a built-in API for running AI models. You can use this API to generate embeddings, build conversational workflows, and do other AI related tasks in your Edge Functions.

Setup

There are no external dependencies or packages to install to enable the API.

You can create a new inference session by doing:


const model = new Supabase.ai.Session('model-name')

Running a model inference

Once the session is instantiated, you can call it with inputs to run inference. Depending on the model, you may need to provide different options (discussed below).


const output = await model.run(input, options)
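For instance, the embedding model below takes pooling options, while LLM sessions take a stream flag. A sketch combining both (the variable names are illustrative; both option shapes are taken from the examples that follow):


// Embedding model: pooling and normalization options (covered in the next section)
const embedModel = new Supabase.ai.Session('gte-small')
const embedding = await embedModel.run('hello world', { mean_pool: true, normalize: true })

// LLM session: stream the output incrementally (covered under Large Language Models)
const llmSession = new Supabase.ai.Session('mistral')
const reply = await llmSession.run('Say hello', { stream: true })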

How to generate text embeddings

Now let's see how to write an Edge Function using the Supabase.ai API to generate text embeddings. Currently, the Supabase.ai API only supports the gte-small model.


/// <reference types="https://esm.sh/@supabase/functions-js/src/edge-runtime.d.ts" />
const model = new Supabase.ai.Session('gte-small')

Deno.serve(async (req: Request) => {
  const params = new URL(req.url).searchParams
  const input = params.get('input') ?? ''

  // Generate the embedding with mean pooling and normalization
  const output = await model.run(input, { mean_pool: true, normalize: true })

  return new Response(JSON.stringify(output), {
    headers: {
      'Content-Type': 'application/json',
      Connection: 'keep-alive',
    },
  })
})
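To try the function out, serve it locally and call it with an input string. The function name embed and the local port below are assumptions based on a default local setup:


curl --get "http://localhost:54321/functions/v1/embed" \
  --data-urlencode "input=hello world" \
  -H "Authorization: Bearer $ANON_KEY"

The response body is the embedding serialized as a JSON array of numbers.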

Using Large Language Models

Inference via larger models is supported via Ollama. In the first iteration, you can use it with a self-managed Ollama server. We are progressively rolling out support for the hosted solution. To sign up for early access, fill out this form.

Running locally

  1. Install Ollama and pull the Mistral model


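    # Install Ollama first if needed (https://ollama.com); on Linux, for example:
    # curl -fsSL https://ollama.com/install.sh | sh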
    ollama pull mistral

  2. Run the Ollama server locally


    ollama serve

  3. Set a function secret called AI_INFERENCE_API_HOST to point to the Ollama server (host.docker.internal resolves to your host machine from inside the locally running Edge Runtime container)


    echo "AI_INFERENCE_API_HOST=http://host.docker.internal:11434" >> supabase/functions/.env

  4. Create a new function with the following code


    supabase functions new ollama-test


    /// <reference types="https://esm.sh/@supabase/functions-js/src/edge-runtime.d.ts" />
    const session = new Supabase.ai.Session('mistral')

    Deno.serve(async (req: Request) => {
      const params = new URL(req.url).searchParams
      const prompt = params.get('prompt') ?? ''

      // Get the output as a stream
      const output = await session.run(prompt, { stream: true })

      const headers = new Headers({
        'Content-Type': 'text/event-stream',
        Connection: 'keep-alive',
      })

      // Create a stream
      const stream = new ReadableStream({
        async start(controller) {
          const encoder = new TextEncoder()

          try {
            for await (const chunk of output) {
              controller.enqueue(encoder.encode(chunk.response ?? ''))
            }
          } catch (err) {
            console.error('Stream error:', err)
          } finally {
            controller.close()
          }
        },
      })

      // Return the stream to the user
      return new Response(stream, {
        headers,
      })
    })

  5. Serve the function


supabase functions serve --env-file supabase/functions/.env

  6. Execute the function


    curl --get "http://localhost:54321/functions/v1/ollama-test" \
    --data-urlencode "prompt=write a short rap song about Supabase, the Postgres Developer platform, as sung by Nicki Minaj" \
    -H "Authorization: Bearer $ANON_KEY"

Deploying to production

Once the function is working locally, it's time to deploy to production.

  1. Deploy an Ollama server and set a function secret called AI_INFERENCE_API_HOST to point to the deployed Ollama server


    supabase secrets set AI_INFERENCE_API_HOST=https://path-to-your-ollama-server/

  2. Deploy the Supabase function


    supabase functions deploy ollama-test

  3. Execute the function


    curl --get "https://project-ref.supabase.co/functions/v1/ollama-test" \
    --data-urlencode "prompt=write a short rap song about Supabase, the Postgres Developer platform, as sung by Nicki Minaj" \
    -H "Authorization: Bearer $ANON_KEY"

Note that running Ollama locally is typically slower than running it on a server with dedicated GPUs. We are collaborating with the Ollama team to improve local performance.

In the future, a hosted Ollama API will be provided as part of the Supabase platform. Supabase will scale and manage the API and GPUs for you. To sign up for early access, fill out this form.