Platform
Custom Providers
Last updated April 12, 2026
Route AI traffic through self-hosted models or any OpenAI-compatible endpoint. Full security, billing, and observability — same as built-in providers.
Custom providers let you connect your own model endpoints to Cencori. Self-hosted LLaMA, a fine-tuned model on your GPU server, a private vLLM instance — anything that speaks the OpenAI or Anthropic API format.
Once connected, custom providers go through the same gateway pipeline as OpenAI, Anthropic, and every other built-in provider. Security filtering, rate limiting, end-user billing, observability, caching — all of it works automatically.
How It Works
- You register a provider with a name, base URL, and optional API key.
- You add one or more model names to the provider.
- When a request comes through the gateway with a matching model name, Cencori resolves it to your provider and forwards the request.
- The response flows back through the same pipeline — logged, metered, billed.
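The registration flow above can be sketched as a simple record. The field names here are illustrative assumptions, not the exact Cencori API schema; see the Custom Providers API docs for the real contract.

```typescript
// Illustrative shape of a custom provider registration.
// Field names are assumptions, not the exact Cencori API schema.
type ProviderFormat = "openai" | "anthropic";

interface CustomProvider {
  name: string;        // label, also used for Tier 2/3 routing
  baseUrl: string;     // endpoint up to /v1, not /chat/completions
  apiKey?: string;     // optional; omit for unauthenticated local servers
  format: ProviderFormat;
  models: string[];    // model names matched by Tier 1 routing
}

const provider: CustomProvider = {
  name: "Marlo LLaMA",
  baseUrl: "http://localhost:11434/v1",
  format: "openai",
  models: ["legal-llama-3.1"],
};
```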
Supported Formats
| Format | Protocol | Use when |
|---|---|---|
| OpenAI Compatible | POST /chat/completions | Ollama, vLLM, LM Studio, Together, any OpenAI-compatible server |
| Anthropic Compatible | POST /messages | Self-hosted Anthropic-format endpoints |
Most self-hosted model servers default to the OpenAI format. Use Anthropic format only if your server specifically implements the Anthropic Messages API.
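For reference, minimal request bodies for the two formats look like this, per the public OpenAI Chat Completions and Anthropic Messages APIs. This is a sketch of the wire shapes your server must accept, not Cencori-specific code.

```typescript
// OpenAI Compatible → POST {baseUrl}/chat/completions
const openaiBody = {
  model: "llama3.1",
  messages: [{ role: "user", content: "Hello!" }],
};

// Anthropic Compatible → POST {baseUrl}/messages
const anthropicBody = {
  model: "my-model",
  max_tokens: 1024, // required by the Anthropic Messages API
  messages: [{ role: "user", content: "Hello!" }],
};
```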
Model Routing
When the gateway receives a request, it resolves the model field using a three-tier matching system. Custom providers are checked after built-in providers, so you can't shadow gpt-4o or claude-sonnet-4-20250514 with a custom provider.
Tier 1: Exact Model Name
If the requested model matches a model name registered to your custom provider, it routes there.
```typescript
// You registered model "legal-llama-3.1" on provider "Marlo LLaMA"
await client.chat.completions.create({
  model: "legal-llama-3.1", // exact match → routes to your provider
  messages: [{ role: "user", content: "..." }],
});
```

Tier 2: Provider Name
If the requested model matches the provider name itself, the gateway routes to that provider and uses the first registered model (or passes the provider name as the upstream model).
```typescript
// Provider name is "Marlo LLaMA"
await client.chat.completions.create({
  model: "Marlo LLaMA", // matches provider name → routes to your provider
  messages: [{ role: "user", content: "..." }],
});
```

Tier 3: Provider Prefix
If the requested model starts with providerName/, the gateway routes to that provider and passes the suffix as the upstream model name.
```typescript
await client.chat.completions.create({
  model: "Marlo LLaMA/llama3.1-70b", // prefix match → routes with "llama3.1-70b" as the model
  messages: [{ role: "user", content: "..." }],
});
```

This is useful when your server hosts multiple models and you want to select one dynamically without registering each one.
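Taken together, the three tiers can be sketched as a resolution function. This is an illustrative model of the matching rules described above, not Cencori's actual gateway code; it assumes case-insensitive name comparison, as noted under Troubleshooting.

```typescript
interface Provider {
  name: string;
  models: string[];
}

interface Route {
  provider: string;
  upstreamModel: string;
}

// Sketch of the three-tier matching (illustrative, not the real gateway).
// Built-in provider models are assumed to have been checked already.
function resolveModel(requested: string, providers: Provider[]): Route | null {
  const req = requested.toLowerCase();

  // Tier 1: exact registered model name
  for (const p of providers) {
    const hit = p.models.find((m) => m.toLowerCase() === req);
    if (hit) return { provider: p.name, upstreamModel: hit };
  }

  // Tier 2: provider name itself → first registered model, or the name as-is
  for (const p of providers) {
    if (p.name.toLowerCase() === req) {
      return { provider: p.name, upstreamModel: p.models[0] ?? p.name };
    }
  }

  // Tier 3: "providerName/" prefix → suffix becomes the upstream model
  for (const p of providers) {
    const prefix = p.name.toLowerCase() + "/";
    if (req.startsWith(prefix)) {
      return { provider: p.name, upstreamModel: requested.slice(prefix.length) };
    }
  }

  return null; // no match → gateway returns "model not found"
}

const providers: Provider[] = [{ name: "Marlo LLaMA", models: ["legal-llama-3.1"] }];
const route = resolveModel("Marlo LLaMA/llama3.1-70b", providers);
// route?.upstreamModel === "llama3.1-70b"
```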
Dashboard Setup
- Open your project in the Cencori dashboard.
- Go to Custom Providers in the sidebar.
- Click Add Provider.
- Fill in:
  - Name — a label for this provider (also used for Tier 2/3 routing)
  - Base URL — your model server's endpoint, up to /v1 (not including /chat/completions)
  - API Key — optional. Leave empty for local models without auth. Encrypted at rest with AES-256-GCM if provided.
  - Format — OpenAI Compatible or Anthropic Compatible
- Click Create.
- Click Test on the provider card to verify the connection. Cencori sends a minimal request and reports success/failure with latency.
API Reference
For the full API reference — create, list, update, delete, test connection, and the models endpoint — see the Custom Providers API docs.
Using Custom Providers
Once a provider is created and active, use the gateway exactly like you would with any built-in provider:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "csk_your_cencori_key",
  baseURL: "https://api.cencori.com/v1",
});

// Route to your self-hosted model
const response = await client.chat.completions.create({
  model: "llama3.1",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello!" },
  ],
  user: "user_123", // optional: enables per-user billing
});
```

Streaming works the same way:
```typescript
const stream = await client.chat.completions.create({
  model: "llama3.1",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```

What Custom Providers Get
Everything that built-in providers get:
| Feature | How it works |
|---|---|
| Security filtering | PII detection and prompt injection blocking run on every request before it reaches your model |
| Rate limiting | Project-level and per-key limits enforced at the gateway |
| End-user billing | Pass user: "..." to meter, limit, and charge per user |
| Observability | Every request logged with model, latency, tokens, cost, status, user |
| Semantic caching | Repeated queries served from cache if enabled |
| Environment scoping | Test-key and production-key traffic tracked separately |
| Audit logging | Provider create/update/delete actions logged |
API Key Encryption
API keys for custom providers are encrypted at rest using AES-256-GCM, scoped to the organization ID. The key is decrypted only at request time when the gateway forwards traffic to your provider. You cannot retrieve the original key from the dashboard — only replace it.
If your model server doesn't require authentication (e.g., Ollama running locally), leave the API key field empty.
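A minimal sketch of what an AES-256-GCM roundtrip looks like with Node's crypto module. The key derivation (hashing a master secret together with the organization ID) and the blob layout here are illustrative assumptions; Cencori's actual scheme is not published.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, createHash } from "node:crypto";

// Derive a 32-byte key per organization.
// Illustrative assumption: a real system would typically use HKDF or a KMS.
function orgKey(masterSecret: string, orgId: string): Buffer {
  return createHash("sha256").update(masterSecret + ":" + orgId).digest();
}

function encrypt(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, standard for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store nonce + auth tag + ciphertext together (layout is an assumption)
  return Buffer.concat([iv, tag, ct]).toString("base64");
}

function decrypt(blob: string, key: Buffer): string {
  const buf = Buffer.from(blob, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const ct = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}

const key = orgKey("master-secret", "org_123");
const stored = encrypt("sk-upstream-key", key);
// decrypt(stored, key) === "sk-upstream-key"
```

Because the key is derived per organization, a ciphertext produced under one org ID cannot be decrypted under another, which matches the decryption-error failure mode described under Troubleshooting.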
Common Server Setup
| Server | Install | Default endpoint |
|---|---|---|
| Ollama | ollama serve && ollama pull llama3.1 | http://localhost:11434/v1 |
| vLLM | vllm serve meta-llama/Llama-3.1-8B-Instruct | http://localhost:8000/v1 |
| LM Studio | Download from lmstudio.ai | http://localhost:1234/v1 |
For production, deploy on a cloud VM with a public IP or domain. For testing, use a tunnel like ngrok to expose your local server.
Troubleshooting
Test connection fails with timeout
- Is the server running and reachable from the internet?
- Is the base URL correct? It should end at
/v1, not include/chat/completions. - If using ngrok, is the tunnel still active?
Requests return "model not found"
- Is the provider active? Check the status badge in the dashboard.
- Does the model name in your request match a registered model name? Comparison is case-insensitive.
- If using Tier 3 routing (provider/model), does the provider name match?
Requests fail with decryption error
- This happens if the organization ID changed or the encryption key rotated. Delete the provider and re-create it with the API key.
Model works directly but fails through Cencori
- Check that the base URL doesn't include a trailing slash.
- Check that the API format matches what your server implements (OpenAI vs Anthropic).
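Both base URL checks can be folded into a small normalization helper for your own tooling; this is a convenience sketch, not part of Cencori.

```typescript
// Normalize a provider base URL per the rules above:
// strip trailing slashes and drop a mistakenly included route suffix.
// (Illustrative helper, not part of Cencori.)
function normalizeBaseUrl(input: string): string {
  let url = input.trim().replace(/\/+$/, "");   // no trailing slash
  url = url.replace(/\/chat\/completions$/, ""); // OpenAI-format route
  url = url.replace(/\/messages$/, "");          // Anthropic-format route
  return url;
}

// normalizeBaseUrl("http://localhost:11434/v1/") === "http://localhost:11434/v1"
// normalizeBaseUrl("http://localhost:8000/v1/chat/completions") === "http://localhost:8000/v1"
```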
Related
- Custom Providers API — full API reference
- Blog: Connect a Local LLM to Cencori — step-by-step walkthrough
- BYOK — use your own API keys with built-in providers
- End-User Billing — meter and charge your users
- Core Architecture — how the gateway pipeline works