I’m experimenting with a different approach to local coding assistants and wanted to get feedback from people who’ve tried similar setups.
Instead of relying on one general-purpose model, I’m thinking of building multiple small, specialized models, each focused on a specific domain:
The idea is:
I’m also considering sharing the results publicly (maybe on **Hugging Face / Transformers**) if this approach works.
Would really appreciate any insights, warnings, or even “this is a bad idea” takes 🙏
Thanks!
i've been down this road and honestly the routing layer is the hardest part.
ran something similar with a few 7b models for different tasks - the domain specialization did help, but the overhead of managing multiple models and figuring out which one to call for what basically erased most of the gains.
for your constraints (16gb ram, no gpu), i'd suggest starting with just RAG on a solid base model before investing in fine-tuning - you can get 80% of the benefit with way less setup pain.
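a minimal sketch of what i mean, assuming sentence-transformers is installed and you point it at a folder of your own code (SNIPPET_DIR and the query are made up for illustration):

```python
from pathlib import Path

import numpy as np
from sentence_transformers import SentenceTransformer

SNIPPET_DIR = Path("snippets")  # hypothetical: a folder of your own code/docs

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, runs fine on CPU

# index once: embed every snippet, normalized so dot product == cosine similarity
docs = [p.read_text() for p in SNIPPET_DIR.glob("**/*.py")]
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    """return the k snippets most similar to the query."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

# stuff the retrieved context into the prompt for whatever base model you run
question = "how does the API client handle pagination?"
context = "\n---\n".join(retrieve(question))
prompt = f"Project context:\n{context}\n\nQuestion: {question}"
```

swap the brute-force similarity loop for a real vector store once the corpus grows, but this is enough to test whether retrieval actually helps you before touching fine-tuning.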
if you do go multi-model, i'd recommend routing at the task level (file type + complexity) rather than trying to do it dynamically. much easier to debug when something breaks.
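and by static task-level routing i mean something as dumb as this (model names are invented, the complexity check is just a line count - the point is that it's deterministic and debuggable):

```python
from pathlib import Path

# hypothetical model names; the point is the deterministic lookup
MODELS_BY_EXT = {
    ".py": "py-coder-7b",
    ".sql": "sql-coder-3b",
    ".sh": "shell-coder-3b",
}
FALLBACK = "general-coder-7b"

def route(path: str, source: str) -> str:
    """pick a model from file extension, bumping big files to the generalist."""
    # crude complexity proxy: very long files tend to need broader context
    if len(source.splitlines()) > 500:
        return FALLBACK
    return MODELS_BY_EXT.get(Path(path).suffix, FALLBACK)

assert route("migrations/001.sql", "SELECT 1;") == "sql-coder-3b"
```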
on that hardware it makes more sense to run one decent model plus good RAG than to build a zoo of models and routing overhead that eats all your gains
With 16GB RAM and no GPU, it’s more or less a lost cause. 16GB is very little, especially when it’s system RAM rather than VRAM.
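For a sense of scale, here is the rough weight-only arithmetic (a back-of-envelope sketch; it ignores the KV cache and runtime overhead, which only make things worse):

```python
def weights_gb(params_billion: float, bits_per_param: int) -> float:
    """weight memory only: params * bits / 8, expressed in GB."""
    return params_billion * bits_per_param / 8

for params in (7, 13):
    for quant, bits in (("fp16", 16), ("q8", 8), ("q4", 4)):
        print(f"{params}B @ {quant}: ~{weights_gb(params, bits):.1f} GB")

# 7B @ q4 is ~3.5 GB of weights, so it loads in 16 GB of system RAM,
# but CPU-only inference is slow, and fp16 (~14 GB) leaves no headroom at all.
```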