Python SDK

Last updated March 3, 2026

Official Python client for Cencori. Supports synchronous and asynchronous usage, strict typing with Pydantic, and streaming responses.

Installation

pip install cencori
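The client reads its key from the CENCORI_API_KEY environment variable by default, so the key can be set once in the shell rather than passed in code:

```shell
export CENCORI_API_KEY="csk_..."
```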

Basic Usage (Synchronous)

The standard client is blocking and best for scripts or simple applications.

from cencori import Cencori
 
client = Cencori(
    api_key="csk_...", # Defaults to os.environ.get("CENCORI_API_KEY")
)
 
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello world"}]
)
 
print(response.choices[0].message.content)

Async Usage (Async/Await)

For high-performance web applications (FastAPI, Django, etc.), use AsyncCencori.

import asyncio
from cencori import AsyncCencori
 
async def main():
    client = AsyncCencori()
    
    response = await client.chat.completions.create(
        model="claude-3-5-sonnet",
        messages=[{"role": "user", "content": "Tell me a joke"}]
    )
    
    print(response.choices[0].message.content)
 
if __name__ == "__main__":
    asyncio.run(main())
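The main payoff of the async client is concurrency: several completions can be in flight at once via asyncio.gather. A minimal sketch, with the API call stood in by a stub coroutine so it is self-contained (fake_completion is hypothetical; a real application would await client.chat.completions.create(...) there):

```python
import asyncio

async def fake_completion(prompt: str) -> str:
    # Stand-in for `await client.chat.completions.create(...)`
    await asyncio.sleep(0.01)
    return f"reply to: {prompt}"

async def main() -> list[str]:
    prompts = ["a", "b", "c"]
    # gather runs all three requests concurrently rather than one after another
    return await asyncio.gather(*(fake_completion(p) for p in prompts))

replies = asyncio.run(main())
print(replies)
```

gather preserves input order, so results line up with the prompts regardless of which request finishes first.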

Streaming Responses

Streaming allows you to process the response chunk-by-chunk, which is critical for real-time AI experiences.

stream = client.chat.completions.create(
    model="gemini-2.5-flash",
    messages=[{"role": "user", "content": "Write a long story..."}],
    stream=True,
)
 
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
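To keep the full message as well as printing it live, accumulate the deltas as they arrive. A sketch, with the stream stood in by a plain list of delta strings (a real stream yields chunk objects whose chunk.choices[0].delta.content may be None, e.g. on the final chunk):

```python
# Stand-in for streamed delta contents; with a real stream you would read
# chunk.choices[0].delta.content from each chunk instead.
deltas = ["Once", " upon", " a", " time.", None]

parts = []
for content in deltas:
    if content is not None:      # the final chunk's delta content is often None
        print(content, end="")   # render live
        parts.append(content)    # and keep for later use

full_text = "".join(parts)
```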

Async Streaming

# client here is an AsyncCencori instance (see above)
stream = await client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "..."}],
    stream=True,
)
 
async for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")

Typing & Pydantic Support

The SDK utilizes Pydantic V2 for all data models, ensuring runtime type safety.

from cencori.types import ChatCompletion
 
# Functions can explicitly accept Cencori types
def process_response(response: ChatCompletion):
    print(f"Usage: {response.usage.total_tokens}")
 
# Exceptions also use standard types
from cencori import APIError, RateLimitError
 
try:
    client.chat.completions.create(...)
except RateLimitError:
    print("Too many requests!")
except APIError as e:
    print(f"Cencori error: {e.message}")
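RateLimitError is a natural candidate for retrying with exponential backoff. A minimal sketch; the exception class below is a local stand-in so the snippet is self-contained, whereas a real handler would catch cencori's RateLimitError around the create(...) call:

```python
import random
import time

class RateLimitError(Exception):
    """Local stand-in for cencori's RateLimitError."""

def with_backoff(call, max_attempts: int = 5, base_delay: float = 1.0):
    """Invoke `call`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # 1s, 2s, 4s, ... plus a little jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

Usage would look like with_backoff(lambda: client.chat.completions.create(...)).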

Advanced Configuration

You can override the underlying HTTP client, set a global timeout, or configure automatic retries.

import httpx
 
client = Cencori(
    timeout=20.0, # default timeout in seconds for all requests
    http_client=httpx.Client(proxy="http://my-proxy"), # recent httpx takes `proxy`, not `proxies`
    max_retries=2,
)
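For finer-grained control, httpx.Timeout supports per-phase limits; this is standard httpx, not Cencori-specific, and a longer read timeout is a common choice when streaming:

```python
import httpx

# 5 seconds to establish a connection; 30 seconds for each read,
# write, and pool checkout.
timeout = httpx.Timeout(30.0, connect=5.0)
```

It is assumed here, as with timeout=20.0 above, that the client's timeout parameter accepts an httpx.Timeout as well as a float.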