pretraining Training our own models
We train compact language models from the ground up, sized for the jobs Turborg actually does. Smaller, task-focused models beat a giant general model on the work that matters here: reading IRC, summarizing channels, and acting on commands.
open-weight Built on open weights
We are not precious about training everything ourselves. We are big fans of open source, so we build on the best open-weight models out there, fine-tune them for our tasks, and contribute fixes and findings back to the projects we lean on. The right tool for the job, whether we trained it or the community did.
fine-tuning Fine-tuning for IRC tasks
We take strong base models, our own or open-weight, and fine-tune them on the exact tasks our users run. Supervised fine-tuning plus preference tuning teaches a model to write a clean summary or a polished message the way our users expect, not the way a generic chatbot would.
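To make "preference tuning" concrete, here is a minimal sketch of one common preference objective, the DPO (Direct Preference Optimization) loss for a single pair of responses. Using DPO here is an assumption for illustration; the text above does not say which preference method Turborg uses. The log-probabilities and the `beta` value are hypothetical inputs.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair (illustrative, not Turborg's actual recipe).

    Each argument is the summed log-probability a model assigns to a response:
    "chosen" is the response a human preferred, "rejected" is the other one.
    "policy" is the model being tuned, "ref" is the frozen starting model.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # -log(sigmoid(x)) computed stably as softplus(-x)
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

# Hypothetical log-probs: the policy has moved toward the preferred summary.
untrained = dpo_loss(-5.0, -5.0, -5.0, -5.0)
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)
```

The loss falls as the tuned model raises the preferred response relative to the rejected one, measured against the reference model.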
dataset Our own training data
Fifteen years of running IRC infrastructure means we understand chat at scale. We build our own curated datasets for the tasks we care about, labelled and filtered in-house, so our models learn from data that reflects real conversations instead of scraped noise.
quantize Quantization and distillation
A model is only useful if it runs cheaply. We quantize to 8-bit and 4-bit and distill large models into small ones, cutting memory and latency while keeping the quality our users notice. That is how an AI summary comes back in a second, not a minute.
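As a rough illustration of what 8-bit quantization does, the sketch below applies symmetric per-tensor quantization to a short list of weights. It is a toy in plain Python, not the pipeline described above; the function names and sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in quantized]

weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored in one byte instead of four, and the rounding error per weight stays within about half of `scale`. Real systems typically use per-channel or per-group scales to keep that error smaller.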
harness AI agent harnesses
Models are one piece. The harness around them is what turns a model into a working agent: tool calling, memory, retrieval over your backlog, and guardrails that keep it on task. We build the scaffolding that lets a Turborg agent read a channel, run a skill, and answer on command.
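A minimal sketch of the tool-calling part of a harness might look like the following. It assumes the model emits a structured call such as `{"tool": ..., "args": ...}`. The tool names, the registry, and the refusal message are hypothetical and do not describe Turborg's actual interface.

```python
from typing import Callable, Dict, List

TOOLS: Dict[str, Callable[..., str]] = {}

def tool(name: str):
    """Register a function as a tool the agent is allowed to call."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("count_messages")
def count_messages(backlog: List[str]) -> str:
    return str(len(backlog))

@tool("last_lines")
def last_lines(backlog: List[str], n: int = 2) -> str:
    return " / ".join(backlog[-n:])

def run_tool_call(call: dict, backlog: List[str]) -> str:
    """Dispatch a model-proposed call; refuse anything outside the registry."""
    name = call.get("tool")
    if name not in TOOLS:
        # Guardrail: the model may only use tools we registered.
        return f"refused: unknown tool {name!r}"
    return TOOLS[name](backlog, **call.get("args", {}))

backlog = ["alice: hi", "bob: build is red", "carol: fixing it"]
answer = run_tool_call({"tool": "last_lines", "args": {"n": 1}}, backlog)
```

Keeping the registry explicit means the harness, not the model, decides which actions are possible.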
inference Fast, cheap inference
We optimize the full inference path: batching, caching, speculative decoding, and serving our quantized models on hardware we already operate. The result is AI that stays responsive under load and stays affordable enough to include in every plan.
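Of the techniques listed above, caching is the simplest to show. The sketch below wraps a stand-in model call with Python's `functools.lru_cache`, so a repeated prompt is served from memory. `slow_model` is a placeholder, not a real inference call, and the example covers response caching only, not batching or speculative decoding.

```python
from functools import lru_cache

calls = {"model": 0}  # counts how often the (fake) model actually runs

def slow_model(prompt: str) -> str:
    """Stand-in for an expensive inference call."""
    calls["model"] += 1
    return prompt.upper()

@lru_cache(maxsize=1024)
def cached_generate(prompt: str) -> str:
    """Return a cached response when the exact prompt was seen before."""
    return slow_model(prompt)

first = cached_generate("summarize #dev")
second = cached_generate("summarize #dev")  # served from the cache
```

An exact-match cache like this only helps when prompts repeat verbatim. Production servers often also cache shared prompt prefixes at the attention level, which this sketch does not attempt.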
evals Evals we trust
We do not ship on vibes. Every model and prompt change runs against task-specific evals: does the summary capture the thread, does Polish keep your meaning, does the agent pick the right tool. We measure quality, latency, and cost before anything reaches a user.
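A task-specific eval can be sketched in a few lines. The example below scores a summary by whether it mentions required keywords and records the latency of each call. The keyword metric, the 0.8 threshold, and the test case are illustrative assumptions, not the evals described above.

```python
import time

def keyword_recall(summary: str, required: list) -> float:
    """Fraction of required keywords that appear in the summary."""
    if not required:
        return 1.0
    text = summary.lower()
    return sum(1 for kw in required if kw.lower() in text) / len(required)

def run_evals(model_fn, cases, min_recall=0.8):
    """Run every case, recording quality and latency; return the pass rate."""
    results = []
    for case in cases:
        start = time.perf_counter()
        output = model_fn(case["input"])
        latency = time.perf_counter() - start
        score = keyword_recall(output, case["required"])
        results.append({"score": score, "latency_s": latency,
                        "passed": score >= min_recall})
    pass_rate = sum(r["passed"] for r in results) / len(results) if results else 0.0
    return pass_rate, results

cases = [{"input": "bob: the deploy failed; carol: rolled back",
          "required": ["deploy", "rolled back"]}]
echo_rate, _ = run_evals(lambda text: text, cases)  # echoes input, so it passes
empty_rate, _ = run_evals(lambda text: "", cases)   # says nothing, so it fails
```

Running the same cases before and after a model or prompt change turns "does it still work" into a number that can gate a release.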
rag Retrieval and memory
Long channels and weeks of backlog do not fit in a prompt. We use retrieval to pull the right context and give agents persistent memory across sessions, so an agent can reason over history it was never shown all at once.
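The retrieval step can be illustrated with a toy ranker: score each backlog message against the query and pass only the top matches to the model. This sketch uses bag-of-words cosine similarity from the standard library as a stand-in. A production system would more likely use learned embeddings, and the sample messages are invented.

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, backlog: list, k: int = 2) -> list:
    """Return up to k backlog messages most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(m))), i, m) for i, m in enumerate(backlog)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, then oldest
    return [m for score, _, m in scored[:k] if score > 0]

backlog = [
    "alice: the deploy failed on staging last night",
    "bob: lunch anyone?",
    "carol: rolled back the deploy, staging is green",
]
context = retrieve("what happened with the staging deploy", backlog)
```

Only the retrieved messages go into the prompt, which is how an agent can answer questions about history far larger than its context window.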