End-user billing lets you charge your customers for AI usage. Set per-user rate plans with token/request limits, apply markup, and collect payments through Stripe Connect — all from the dashboard.

[!NOTE] New to end-user billing? Start with the Setup Guide — a step-by-step walkthrough for non-technical users. This page is the API reference for your engineering team.

How It Works

1. Your end-user makes a request to your app.
2. Your backend checks their quota with Cencori.
3. If the request is allowed, your backend processes it.
4. Your backend reports the usage back to Cencori.
5. Cencori tracks everything: limits, costs, and revenue.
6. You view it all in the End-User Billing dashboard.

There are two ways to connect:

Method	Best for	Integration effort
AI Gateway	Teams routing AI traffic through Cencori	1 line (change base URL)
Usage Events API	Teams with their own AI infrastructure	~10 lines per endpoint

Option 1: AI Gateway (Recommended)

If you route AI requests through the Cencori gateway, end-user billing works automatically. Pass your end-user's ID in the user field of the request body:

const response = await fetch("https://api.cencori.com/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer csk_your_secret_key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{ role: "user", content: "Hello!" }],
    user: "user_123",  // your end-user's ID
  }),
});

The gateway handles quota checks, usage recording, markup calculation, and limit enforcement. No additional API calls are needed.

Option 2: Usage Events API

If you call AI providers directly and don't use the Cencori gateway, use the Usage Events API to report usage and check quotas.

Check Quota (Before Processing)

Call this before serving an AI request to verify the user is within their limits:

curl "https://api.cencori.com/v1/billing/check-quota?end_user_id=user_123" \
  -H "Authorization: Bearer csk_your_secret_key"

Response:

{
  "allowed": true,
  "reason": "within_limits",
  "end_user_id": "user_123",
  "is_new_user": false,
  "rate_plan": "pro",
  "overage_action": "block",
  "retry_after_seconds": null,
  "usage": {
    "daily_tokens": { "used": 45200, "limit": 1000000 },
    "monthly_tokens": { "used": 1250000, "limit": 30000000 },
    "daily_requests": { "used": 23, "limit": 500 },
    "monthly_requests": { "used": 412, "limit": 10000 },
    "requests_per_minute": { "used": 12, "limit": 60 }
  },
  "billing": {
    "markup_percentage": 20,
    "flat_rate_per_request": null,
    "allowed_models": ["gpt-4o", "claude-sonnet-4.5"]
  }
}

When allowed is false, the reason field tells you which limit was hit (e.g. daily_token_limit_exceeded, monthly_request_limit_exceeded, requests_per_minute_exceeded, user_blocked). If the limit is time-based, retry_after_seconds tells your app how long to wait before retrying.

check-quota always returns 200 OK — inspect the allowed field to decide what to do. If you route traffic through the Cencori gateway instead, blocked requests return 429 Too Many Requests (see Gateway 429 Responses below).
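Because check-quota replies 200 either way, the branching has to happen in your own code. Here is a minimal sketch of one way to do that. The field names come from the example response above; the QuotaDecision type, the decideFromQuota function, and the choice of 429 versus 403 for your own API are illustrative assumptions, not part of Cencori.

```typescript
// Subset of the check-quota response shown above.
interface QuotaResponse {
  allowed: boolean;
  reason: string;
  retry_after_seconds: number | null;
}

// Hypothetical shape for what your own API returns to the end-user.
type QuotaDecision =
  | { proceed: true }
  | { proceed: false; status: 403 | 429; reason: string; retryAfterSeconds?: number };

// The decision is driven by `allowed`, not by the HTTP status of check-quota.
function decideFromQuota(quota: QuotaResponse): QuotaDecision {
  if (quota.allowed) {
    return { proceed: true };
  }
  // Time-based limits carry retry_after_seconds; surface those as 429.
  if (quota.retry_after_seconds !== null) {
    return {
      proceed: false,
      status: 429,
      reason: quota.reason,
      retryAfterSeconds: quota.retry_after_seconds,
    };
  }
  // No retry hint (for example user_blocked): waiting will not help.
  // Mapping this case to 403 is an assumption of this sketch.
  return { proceed: false, status: 403, reason: quota.reason };
}
```

Returning a small decision object keeps the HTTP layer of your app free of Cencori-specific field names.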

You can also POST the same request with a JSON body:

curl -X POST https://api.cencori.com/v1/billing/check-quota \
  -H "Authorization: Bearer csk_your_secret_key" \
  -H "Content-Type: application/json" \
  -d '{ "end_user_id": "user_123", "model": "gpt-4o" }'

Report Usage (After Processing)

After your AI request completes, report the usage:

curl -X POST https://api.cencori.com/v1/billing/usage-events \
  -H "Authorization: Bearer csk_your_secret_key" \
  -H "Content-Type: application/json" \
  -d '{
    "end_user_id": "user_123",
    "model": "gpt-4o",
    "provider": "openai",
    "prompt_tokens": 500,
    "completion_tokens": 1200,
    "cost_usd": 0.0085,
    "latency_ms": 1340,
    "metadata": {
      "conversation_id": "conv_abc123"
    }
  }'

Response:

{
  "recorded": 1,
  "failed": 0,
  "total": 1
}

Batch Reporting

Send up to 1,000 events in a single request:

curl -X POST https://api.cencori.com/v1/billing/usage-events \
  -H "Authorization: Bearer csk_your_secret_key" \
  -H "Content-Type: application/json" \
  -d '{
    "events": [
      {
        "end_user_id": "user_123",
        "model": "gpt-4o",
        "prompt_tokens": 500,
        "completion_tokens": 1200,
        "cost_usd": 0.0085
      },
      {
        "end_user_id": "user_456",
        "model": "claude-sonnet-4.5",
        "prompt_tokens": 800,
        "completion_tokens": 2000,
        "cost_usd": 0.012
      }
    ]
  }'
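If your backend queues more than 1,000 events, they need to be split across several requests. The sketch below shows one way to do the split; the UsageEvent interface and the toBatches helper are hypothetical names introduced for illustration, and each returned object is meant to be used as the JSON body of one POST to usage-events.

```typescript
// Minimal event shape; only end_user_id is required by the API.
interface UsageEvent {
  end_user_id: string;
  model?: string;
  prompt_tokens?: number;
  completion_tokens?: number;
  cost_usd?: number;
}

const MAX_BATCH_SIZE = 1000; // documented per-request limit

// Splits a queue of events into request bodies of at most `size` events each.
function toBatches(events: UsageEvent[], size: number = MAX_BATCH_SIZE): { events: UsageEvent[] }[] {
  if (Math.floor(size) !== size || size < 1 || size > MAX_BATCH_SIZE) {
    throw new RangeError(`size must be an integer between 1 and ${MAX_BATCH_SIZE}`);
  }
  const batches: { events: UsageEvent[] }[] = [];
  for (let i = 0; i < events.length; i += size) {
    batches.push({ events: events.slice(i, i + size) });
  }
  return batches;
}
```

After each request, compare the recorded and failed counts in the response with the batch length to decide whether anything needs to be resent.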

Full Integration Example

Here's how a typical backend integration looks:

import Anthropic from "@anthropic-ai/sdk";
 
const anthropic = new Anthropic();
const CENCORI_API = "https://api.cencori.com/v1/billing";
const CENCORI_KEY = process.env.CENCORI_API_KEY;
 
async function handleUserMessage(userId: string, message: string) {
  // 1. Check quota
  const quotaRes = await fetch(
    `${CENCORI_API}/check-quota?end_user_id=${encodeURIComponent(userId)}`,
    { headers: { Authorization: `Bearer ${CENCORI_KEY}` } }
  );
  const quota = await quotaRes.json();
 
  if (!quota.allowed) {
    throw new Error(`Rate limit: ${quota.reason}`);
  }
 
  // 2. Call your AI provider directly
  const response = await anthropic.messages.create({
    model: "claude-sonnet-4.5",
    max_tokens: 1024,
    messages: [{ role: "user", content: message }],
  });
 
  // 3. Report usage to Cencori
  await fetch(`${CENCORI_API}/usage-events`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${CENCORI_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      end_user_id: userId,
      model: "claude-sonnet-4.5",
      provider: "anthropic",
      prompt_tokens: response.usage.input_tokens,
      completion_tokens: response.usage.output_tokens,
      cost_usd: calculateCost(response.usage), // calculateCost is your own pricing helper (not shown)
    }),
  });
 
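  // content is a list of blocks; return the first block's text if it is a text block
  const first = response.content[0];
  return first && first.type === "text" ? first.text : "";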
}

Usage Event Fields

Field	Type	Required	Description
`end_user_id`	string	Yes	Your user's ID (any string, consistent per user)
`model`	string	No	Model name (e.g. `gpt-4o`). Defaults to `external`
`provider`	string	No	Provider name (e.g. `openai`). Defaults to `external`
`prompt_tokens`	integer	No	Input tokens
`completion_tokens`	integer	No	Output tokens
`total_tokens`	integer	No	Total tokens (auto-calculated if omitted)
`cost_usd`	number	No	Provider cost for this request
`latency_ms`	integer	No	Request duration in milliseconds
`status`	string	No	`success` or `error`. Defaults to `success`
`metadata`	object	No	Arbitrary JSON (conversation ID, session, etc.)
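The field table can be mirrored as a type with a light client-side check before sending. This is a sketch only: UsageEventInput and validateUsageEvent are hypothetical names, and the rule that numeric fields must be non-negative is an assumption of the sketch rather than documented API behavior.

```typescript
// Mirrors the field table above; only end_user_id is required.
interface UsageEventInput {
  end_user_id: string;
  model?: string;
  provider?: string;
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
  cost_usd?: number;
  latency_ms?: number;
  status?: "success" | "error";
  metadata?: Record<string, unknown>;
}

// Returns a list of problems; an empty list means the event looks well-formed.
function validateUsageEvent(event: UsageEventInput): string[] {
  const problems: string[] = [];
  if (event.end_user_id.trim() === "") {
    problems.push("end_user_id must be a non-empty string");
  }
  const integerFields = ["prompt_tokens", "completion_tokens", "total_tokens", "latency_ms"] as const;
  for (const field of integerFields) {
    const value = event[field];
    // Assumption: counts and durations are non-negative integers.
    if (value !== undefined && (Math.floor(value) !== value || value < 0)) {
      problems.push(`${field} must be a non-negative integer`);
    }
  }
  // Written as !(x >= 0) so that NaN is also rejected.
  if (event.cost_usd !== undefined && !(event.cost_usd >= 0)) {
    problems.push("cost_usd must be a non-negative number");
  }
  return problems;
}
```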

Rate Plans

Rate plans define per-user limits and pricing. Create them in the dashboard under End-User Billing > Rate Plans.

Each plan can set:

Setting	Description
Daily/monthly token limits	Max tokens a user can consume per day/month
Daily/monthly request limits	Max requests per day/month
Requests per minute	Burst rate limit
Cost limits (daily/monthly)	Max spend in USD
Markup percentage	Your margin on top of provider cost
Flat rate per request	Fixed fee added to each request
Allowed models	Restrict which models this plan can access
Overage action	`block` (hard stop — returns 429 via gateway, `allowed: false` via check-quota) or `alert_only` (lets requests through, flags the overage in your dashboard)

[!NOTE] If a user has no assigned plan, they inherit the project's default plan. If no default plan exists, they have no limits.

Rate Plan API

Rate plans are managed via the dashboard or the management API (session-cookie auth).

Create a plan:

POST /api/projects/{projectId}/rate-plans

{
  "name": "Pro",
  "slug": "pro",
  "is_default": false,
  "daily_token_limit": 2000000,
  "monthly_token_limit": 50000000,
  "daily_request_limit": 500,
  "monthly_request_limit": 10000,
  "requests_per_minute": 60,
  "daily_cost_limit_usd": 5.00,
  "monthly_cost_limit_usd": 50.00,
  "markup_percentage": 20,
  "flat_rate_per_request": null,
  "allowed_models": ["gpt-4o", "claude-sonnet-4.5"],
  "overage_action": "block",
  "priority": 0
}

Required fields: name, slug. All other fields are optional.

Update a plan:

PATCH /api/projects/{projectId}/rate-plans/{planId}

Send only the fields you want to change. Same field names as create.

Delete a plan:

DELETE /api/projects/{projectId}/rate-plans/{planId}

If end-users are assigned to the deleted plan, they are automatically reassigned to the project's default plan. If no default plan exists and users are still assigned, the delete returns 409 Conflict.

List all plans:

GET /api/projects/{projectId}/rate-plans

Returns all plans for the project, sorted by priority ascending.

Priority

When multiple plans could apply to a user, the plan explicitly assigned to the user takes precedence. If no plan is assigned, the project's default plan (is_default: true) is used. The priority field is for your own ordering in the dashboard — it does not affect which plan is selected.
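The selection rule above can be written as a small function. This is a sketch of the rule as described on this page, not Cencori's actual implementation; the RatePlan and EndUser shapes are trimmed to the fields the rule needs, and resolvePlan is a hypothetical name. A null result stands for "no limits".

```typescript
interface RatePlan {
  id: string;
  slug: string;
  is_default: boolean;
  priority: number;
}

interface EndUser {
  external_id: string;
  rate_plan_id: string | null;
}

// Returns the plan that governs a user, or null when no limits apply.
function resolvePlan(user: EndUser, plans: RatePlan[]): RatePlan | null {
  let fallback: RatePlan | null = null;
  for (const plan of plans) {
    // 1. A plan explicitly assigned to the user wins outright.
    if (user.rate_plan_id !== null && plan.id === user.rate_plan_id) {
      return plan;
    }
    // 2. Remember the project's default plan as the fallback.
    if (plan.is_default && fallback === null) {
      fallback = plan;
    }
  }
  // 3. No assigned plan found: use the default, or null for "no limits".
  //    priority is never consulted; it only orders plans for display.
  return fallback;
}
```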

End-User Management

Auto-Creation

Users are created automatically when usage is recorded for the first time, either by the AI Gateway after a successful request or by the Usage Events API. A quota check by itself does not create an end_users row.

You can also manage users explicitly through the dashboard.

[!NOTE] Project management routes such as /api/projects/{projectId}/end-users are currently dashboard management endpoints backed by the authenticated Supabase session cookie. They do not yet support bearer-token auth for external automation.

End-User API

Create or update a user (upsert by external_id):

POST /api/projects/{projectId}/end-users

{
  "external_id": "user_123",
  "display_name": "Jane Doe",
  "email": "jane@example.com",
  "rate_plan_id": "uuid-of-rate-plan",
  "is_blocked": false,
  "metadata": { "plan_tier": "enterprise", "signup_source": "api" }
}

Required: external_id. All other fields are optional. If a user with that external_id already exists in the project, the request updates that user instead of creating a new one.

The metadata field accepts arbitrary JSON. Use it to store customer-specific tags (plan tier, signup source, internal notes, etc.) that your team can reference in the dashboard.

Update a user:

PATCH /api/projects/{projectId}/end-users/{endUserId}

Accepts: display_name, email, rate_plan_id, is_blocked, metadata, status (alias — "blocked" sets is_blocked: true, "active" sets is_blocked: false).
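To make the status alias concrete, here is a sketch of the mapping it implies. EndUserPatch and normalizePatch are hypothetical names; the behavior when both status and is_blocked are sent is not documented here, so keeping the explicit is_blocked value is an assumption of the sketch.

```typescript
interface EndUserPatch {
  display_name?: string;
  email?: string;
  rate_plan_id?: string;
  is_blocked?: boolean;
  metadata?: Record<string, unknown>;
  status?: "active" | "blocked";
}

// Rewrites the `status` alias into the equivalent is_blocked flag.
// Assumption: when both are supplied, the explicit is_blocked value is kept.
function normalizePatch(patch: EndUserPatch): Omit<EndUserPatch, "status"> {
  const { status, ...rest } = patch;
  if (status !== undefined && rest.is_blocked === undefined) {
    return { ...rest, is_blocked: status === "blocked" };
  }
  return rest;
}
```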

Delete a user:

DELETE /api/projects/{projectId}/end-users/{endUserId}

Permanently removes the user and their usage history. This cannot be undone.

Get user details:

GET /api/projects/{projectId}/end-users/{endUserId}

Returns the user with their rate plan details and usage history (last 30 daily/monthly rows).

List users (paginated, filterable):

GET /api/projects/{projectId}/end-users?page=1&per_page=25&search=jane&plan_id=uuid&status=active
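A small helper can assemble that query string so unset filters are omitted and values such as search terms are URL-encoded. This is only a sketch: listEndUsersPath and ListEndUsersQuery are hypothetical names, and the parameter names are the ones shown in the example URL above.

```typescript
interface ListEndUsersQuery {
  page?: number;
  per_page?: number;
  search?: string;
  plan_id?: string;
  status?: string;
}

// Builds the list path, skipping unset filters and URL-encoding values.
function listEndUsersPath(projectId: string, query: ListEndUsersQuery = {}): string {
  const keys = ["page", "per_page", "search", "plan_id", "status"] as const;
  const pairs: string[] = [];
  for (const key of keys) {
    const value = query[key];
    if (value !== undefined && value !== "") {
      pairs.push(`${key}=${encodeURIComponent(String(value))}`);
    }
  }
  const base = `/api/projects/${encodeURIComponent(projectId)}/end-users`;
  return pairs.length > 0 ? `${base}?${pairs.join("&")}` : base;
}
```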

Invoices

Generate invoices for your end-users based on their usage over a billing period. Invoices can optionally be sent through Stripe Connect.

List Invoices

GET /api/projects/{projectId}/end-user-billing/invoices?page=1&per_page=25&status=draft&end_user_id=uuid

All query parameters are optional. Filter by status (draft, sent, paid, void, overdue) or end_user_id.

Generate Invoices

POST /api/projects/{projectId}/end-user-billing/invoices

{
  "period_start": "2026-03-01",
  "period_end": "2026-04-01",
  "end_user_ids": ["uuid-1", "uuid-2"],
  "send_via_stripe": true
}

Field	Type	Required	Description
`period_start`	string	Yes	Start date (YYYY-MM-DD)
`period_end`	string	Yes	End date (YYYY-MM-DD)
`end_user_ids`	string[]	No	Generate only for these users. Omit to generate for all users with usage in the period.
`send_via_stripe`	boolean	No	If `true` and Stripe Connect is active, creates and sends a Stripe invoice. Defaults to `false`.

Response:

{
  "generated": 12,
  "skipped": 3,
  "errors": 0,
  "invoices": [...],
  "skipped_details": [...]
}
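For monthly billing, the Generate Invoices request body can be derived from a year and month. The sketch below assumes period_end is the first day of the following month, mirroring the 2026-03-01 / 2026-04-01 example; whether the end date is inclusive or exclusive is not stated on this page. InvoiceRequest, pad2, and monthlyInvoiceRequest are hypothetical names.

```typescript
interface InvoiceRequest {
  period_start: string;
  period_end: string;
  end_user_ids?: string[];
  send_via_stripe?: boolean;
}

// Zero-pads a number to two digits, e.g. 3 -> "03".
function pad2(n: number): string {
  return n < 10 ? `0${n}` : String(n);
}

// Builds a request body covering one calendar month (month is 1-12).
// Assumption: period_end is the first day of the next month, as in the example.
function monthlyInvoiceRequest(year: number, month: number, endUserIds?: string[]): InvoiceRequest {
  if (Math.floor(month) !== month || month < 1 || month > 12) {
    throw new RangeError("month must be an integer from 1 to 12");
  }
  const nextYear = month === 12 ? year + 1 : year;
  const nextMonth = month === 12 ? 1 : month + 1;
  const body: InvoiceRequest = {
    period_start: `${year}-${pad2(month)}-01`,
    period_end: `${nextYear}-${pad2(nextMonth)}-01`,
  };
  // Omit end_user_ids entirely to invoice every user with usage in the period.
  if (endUserIds !== undefined && endUserIds.length > 0) {
    body.end_user_ids = endUserIds;
  }
  return body;
}
```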

Invoice Lifecycle

Status	Meaning
`draft`	Generated but not sent
`sent`	Sent to the end-user (via Stripe or manually)
`paid`	Payment received (updated automatically by the Stripe Connect webhook)
`overdue`	Payment failed (updated automatically by the Stripe Connect webhook)
`void`	Cancelled

Stripe Connect

Collect payments from your end-users using Stripe Connect. Cencori acts as the platform — your users pay you directly.

Setup

Stripe Connect is organization-scoped, not project-scoped. You connect one Stripe account per organization, and every project in that organization reuses it.

Open any project's End-User Billing > Configuration page.
Click Connect Stripe Account.
Complete the Stripe onboarding flow.
Once connected, every project in that organization can generate invoices and collect payments.

How Charges Work

Provider cost (e.g. OpenAI)     $0.01
+ Your markup (e.g. 20%)        $0.002
+ Flat fee (if set)              $0.00
= Customer charge                $0.012

The markup and flat fee come from the user's rate plan (or the project default).
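The same arithmetic can be expressed as a function. This is a sketch of the formula shown above, not Cencori's billing code; customerCharge is a hypothetical name, and rounding to six decimal places is an assumption made only to hide floating-point noise.

```typescript
// Computes the customer charge in USD from the provider cost, the plan's
// markup percentage, and an optional flat per-request fee.
function customerCharge(
  providerCostUsd: number,
  markupPercentage: number,
  flatRatePerRequest: number | null
): number {
  const markup = providerCostUsd * (markupPercentage / 100);
  const flat = flatRatePerRequest === null ? 0 : flatRatePerRequest;
  const total = providerCostUsd + markup + flat;
  // Round to 6 decimal places (assumed precision, not a documented rule).
  return Math.round(total * 1e6) / 1e6;
}

// The worked example above: $0.01 provider cost, 20% markup, no flat fee.
const exampleCharge = customerCharge(0.01, 20, null); // 0.012
```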

Stripe Connect Webhook

The webhook endpoint at /api/billing/stripe-connect-webhook processes the following Stripe events:

Event	Action
`account.updated`	Syncs the Connect account status (charges_enabled, payouts_enabled)
`invoice.paid`	Marks the corresponding `end_user_invoice` as `paid`
`invoice.payment_failed`	Marks the corresponding `end_user_invoice` as `overdue`
`account.application.deauthorized`	Marks the Connect account as disconnected

Set the webhook URL in your Stripe dashboard to https://your-domain.com/api/billing/stripe-connect-webhook and configure the STRIPE_CONNECT_WEBHOOK_SECRET environment variable with the signing secret.
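The event table can be read as a simple dispatch. The sketch below covers only the mapping from event type to action; it is not the actual handler, and it deliberately leaves out signature verification with STRIPE_CONNECT_WEBHOOK_SECRET, which a real endpoint must do before trusting the payload. WebhookAction and actionForEvent are hypothetical names, and ignoring unlisted event types is an assumption.

```typescript
type WebhookAction =
  | { kind: "sync_account" }
  | { kind: "set_invoice_status"; status: "paid" | "overdue" }
  | { kind: "disconnect_account" }
  | { kind: "ignore" };

// Maps a Stripe event type to the action listed in the table above.
function actionForEvent(eventType: string): WebhookAction {
  switch (eventType) {
    case "account.updated":
      return { kind: "sync_account" };
    case "invoice.paid":
      return { kind: "set_invoice_status", status: "paid" };
    case "invoice.payment_failed":
      return { kind: "set_invoice_status", status: "overdue" };
    case "account.application.deauthorized":
      return { kind: "disconnect_account" };
    default:
      // Assumption: event types not in the table are acknowledged and ignored.
      return { kind: "ignore" };
  }
}
```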

Dashboard

The End-User Billing page has four tabs:

Configuration: Enable/disable billing, set default markup, connect Stripe.
End Users: View all users, their plans, usage, and status. Block/unblock users.
Rate Plans: Create and manage rate plans with limits and pricing.
Revenue: Usage stats, cost breakdown, daily trends, and top users.

Authentication

Both billing endpoints, check-quota and usage-events (https://api.cencori.com/v1/billing/*), use the same API key as the gateway:

Header: Authorization: Bearer csk_your_secret_key
Or: CENCORI_API_KEY: csk_your_secret_key

[!CAUTION] Only secret keys (csk_*) can report usage and check quotas. Publishable keys (cpk_*) are rejected.

Dashboard management endpoints (/api/projects/*) use the authenticated Supabase session cookie and are not accessible via API key.
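A client can catch a wrong key type before making a request by looking at the documented prefixes. This is a convenience sketch only; the server remains the authority on key validity, and keyKind and billingAuthHeader are hypothetical helper names.

```typescript
type KeyKind = "secret" | "publishable" | "unknown";

// Classifies a key by its documented prefix (csk_ or cpk_).
function keyKind(key: string): KeyKind {
  if (key.indexOf("csk_") === 0) return "secret";
  if (key.indexOf("cpk_") === 0) return "publishable";
  return "unknown";
}

// Builds the Authorization header, refusing anything but a secret key.
function billingAuthHeader(key: string): { Authorization: string } {
  if (keyKind(key) !== "secret") {
    throw new Error("Billing endpoints require a secret key (csk_*)");
  }
  return { Authorization: `Bearer ${key}` };
}
```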

Environment Scoping

Quota checks and usage aggregation are scoped by API key environment. Test-key traffic (environment = test) does not consume production quota, and production-key traffic does not consume test quota.

Gateway 429 Responses

When a gateway request is blocked because the end-user exceeded a rate plan limit, the response looks like:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 3600
 
{
  "error": "Rate limit exceeded",
  "reason": "daily_token_limit_exceeded",
  "retry_after_seconds": 3600
}

The Retry-After header is included when the limit is time-based (daily/monthly resets, or requests-per-minute cooldown). Your client should respect this header for automatic retry logic.

Possible reason values:

Reason	Description
`user_blocked`	User is manually blocked
`daily_token_limit_exceeded`	Daily token cap reached
`monthly_token_limit_exceeded`	Monthly token cap reached
`daily_request_limit_exceeded`	Daily request cap reached
`monthly_request_limit_exceeded`	Monthly request cap reached
`requests_per_minute_exceeded`	Per-minute burst limit hit
`daily_cost_limit_exceeded`	Daily cost cap reached
`monthly_cost_limit_exceeded`	Monthly cost cap reached
`model_not_allowed`	Model not in the plan's allowed list
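Client retry logic for a gateway 429 might look like the sketch below. It prefers the Retry-After header, falls back to retry_after_seconds in the body, and gives up for reasons that are not time-based. Treating user_blocked and model_not_allowed as non-retryable is an inference from the descriptions above, and GatewayLimitBody, parseSeconds, and retryDelayMs are hypothetical names.

```typescript
// Body of a gateway 429 response, as shown above.
interface GatewayLimitBody {
  error: string;
  reason: string;
  retry_after_seconds?: number | null;
}

// Parses a delta-seconds Retry-After value; returns null for missing,
// empty, negative, or non-numeric values (including HTTP-date forms).
function parseSeconds(value: string | null): number | null {
  if (value === null || value.trim() === "") return null;
  const n = Number(value);
  return isFinite(n) && n >= 0 ? n : null;
}

// Returns how long to wait before retrying, in milliseconds,
// or null when a retry is not expected to succeed.
function retryDelayMs(retryAfterHeader: string | null, body: GatewayLimitBody): number | null {
  // Assumption: these reasons are not time-based, so waiting does not help.
  if (body.reason === "user_blocked" || body.reason === "model_not_allowed") {
    return null;
  }
  const fromHeader = parseSeconds(retryAfterHeader);
  if (fromHeader !== null) return fromHeader * 1000;
  const fromBody = body.retry_after_seconds;
  if (typeof fromBody === "number" && fromBody >= 0) return fromBody * 1000;
  return null;
}
```

For daily or monthly limits the delay can be hours long, so it is usually better to show the end-user a message than to hold the request open.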

Error Responses

Status	Meaning
400	Validation error (missing fields, billing not enabled)
401	Invalid or missing API key
403	Publishable key used (secret key required)
429	Gateway request blocked for the end-user (see above)
500	Server error

SDK Support

The TypeScript SDK (cencori on npm) currently covers the AI Gateway, which handles end-user billing automatically when you pass user: "user_123" in your requests. Dedicated cencori.billing.checkQuota() and cencori.billing.recordUsage() methods for the Usage Events API path are not yet available — use the REST endpoints directly for now.

Limits

Batch size: max 1,000 events per request.
CSV export: max 50,000 rows per export.
Rate plan limits are enforced per day and per month, plus a per-minute limit on requests.

Related

Billing & Usage: plan tiers and credits
Rate Limiting: project-level rate limits
API Keys: key types and management
Authentication: how to authenticate requests
