Building Agents with the Responses API
Build production AI agents that search the web, query your knowledge base, run code, and call APIs — all through a single endpoint.
You have an AI model. You want it to do more than chat — you want it to search the web for real-time information, look up internal knowledge, run calculations, and call your APIs. You want an agent, not just a chatbot.
The Responses API gives you that in one endpoint. No wiring up separate search APIs, vector databases, or code sandboxes. Just tell the model what tools it has, and it uses them.
What Makes It an Agent
An AI stops being a chatbot and starts being an agent when it can act on the world — gather information, execute code, and call services — not just generate text.
The Responses API gives the model four built-in capabilities:
Web Search — The agent searches the web for real-time information, reads results, and synthesizes answers. No separate search API integration needed.
Knowledge Retrieval — The agent queries your stored documents and memory. Connect your knowledge base once, and every agent you build can reference it.
Code Execution — The agent writes and runs Python or JavaScript to solve problems numerically, process data, or generate charts. Results flow back into the conversation naturally.
Function Calling — The agent calls your existing APIs and services. Works exactly like standard tool calling — same format, same flow.
Each tool is self-contained. The agent decides when to use them based on the task.
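To make the function-calling flow concrete, here is a minimal sketch of the client side: a tool definition, a local handler, and a dispatcher that turns a model-issued call into an output item. The item shapes (`function_call`, `function_call_output`, `call_id`) follow the common OpenAI-style convention and are assumptions here, as are the hypothetical `get_order_status` tool and its handler; check the Cencori reference for the exact format.

```typescript
// ASSUMPTION: shapes modeled on the common OpenAI-style tool-calling format.
// They are not confirmed by this article.
interface FunctionTool {
  type: 'function';
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema describing the arguments
}

interface FunctionCallItem {
  type: 'function_call';
  call_id: string;
  name: string;
  arguments: string; // JSON-encoded arguments produced by the model
}

interface FunctionCallOutput {
  type: 'function_call_output';
  call_id: string;
  output: string; // JSON-encoded result sent back to the model
}

type Handler = (args: Record<string, unknown>) => unknown;

// Hypothetical tool definition you would pass in the `tools` array.
const getOrderStatusTool: FunctionTool = {
  type: 'function',
  name: 'get_order_status',
  description: 'Look up the shipping status of an order by ID.',
  parameters: {
    type: 'object',
    properties: { order_id: { type: 'string' } },
    required: ['order_id'],
  },
};

// Local implementations, keyed by tool name. The status is a stub value.
const handlers: Record<string, Handler> = {
  get_order_status: (args) => ({ order_id: args['order_id'], status: 'shipped' }),
};

// Run the handler the model asked for and wrap the result for the next request.
function runFunctionCall(
  call: FunctionCallItem,
  registry: Record<string, Handler>,
): FunctionCallOutput {
  const handler = registry[call.name];
  if (!handler) {
    const error = JSON.stringify({ error: `unknown tool: ${call.name}` });
    return { type: 'function_call_output', call_id: call.call_id, output: error };
  }
  const args = JSON.parse(call.arguments) as Record<string, unknown>;
  const result = JSON.stringify(handler(args));
  return { type: 'function_call_output', call_id: call.call_id, output: result };
}
```

Keeping handlers in a registry keyed by tool name means adding a tool is one definition plus one function, and an unrecognized name is reported back to the model instead of throwing.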
How It Works
A single request, one response:
You: "Analyze Q3 revenue trends and search for industry benchmarks."
Agent: 1. Searches the web for current benchmarks
2. Queries internal knowledge base for Q3 data
3. Writes and runs Python to compare the numbers
4. Returns a synthesized analysis
All three tool calls happen in one request-response cycle. There is no multi-step pipeline to build, no intermediate state to manage, and no stitching together outputs from different services.
const response = await cencori.ai.responses({
  model: 'gpt-4o',
  input: 'Pull our Q3 revenue from the knowledge base, search for competitor benchmarks, and compare them.',
  // Declare the tools the agent may use; it decides when to call each one.
  tools: [
    { type: 'file_search', max_num_results: 10 },
    { type: 'web_search_preview', search_context_size: 'high' },
    { type: 'code_interpreter' },
  ],
});

The response contains everything (the text answer, search results, retrieved documents, and code output) in a single structured response.
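As a rough sketch of how you might take that structured response apart, the helpers below group output items by their `type` and join the text of the message items. Only `output`, `type`, `content`, and `annotations` appear in this article; the `text` field on content parts is an assumption, and `groupByType` and `messageText` are hypothetical helper names.

```typescript
// ASSUMPTION: each content part carries its text in a `text` field.
interface ContentPart {
  text?: string;
  annotations?: unknown[];
}

interface OutputItem {
  type: string; // 'message' is confirmed by the article; other type names may vary
  content?: ContentPart[];
}

// Bucket output items by type so each kind can be rendered separately.
function groupByType(output: OutputItem[]): Map<string, OutputItem[]> {
  const groups = new Map<string, OutputItem[]>();
  for (const item of output) {
    const bucket = groups.get(item.type) ?? [];
    bucket.push(item);
    groups.set(item.type, bucket);
  }
  return groups;
}

// Concatenate the text of every message item, in order.
function messageText(output: OutputItem[]): string {
  let text = '';
  for (const item of output) {
    if (item.type !== 'message') continue;
    for (const part of item.content ?? []) {
      text += part.text ?? '';
    }
  }
  return text;
}
```

You would call these as `groupByType(response.output)` and `messageText(response.output)`, then render the tool-call items alongside the answer however your UI requires.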
Not Just a Wrapper
All four tools run on Cencori's infrastructure:
- Web search hits a search provider, fetches and ranks results, injects them as context
- File search queries your Cencori project's vector store
- Code interpreter executes in a sandboxed environment with timeouts and resource limits
- Function calls route through your existing provider chain with the same security, failover, and observability as any other gateway request
Every request passes through the same security pipeline — jailbreak detection, PII masking, audit logging, rate limiting. The same end-user billing applies. You get agentic capabilities without bypassing your existing governance.
Where to Start
The Responses API is available now at https://cencori.com/api/v1/responses or through the Cencori SDK starting with cencori@1.2.0.
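If you prefer to call the endpoint directly rather than go through the SDK, a request might be assembled along these lines. The URL comes from this article; the bearer-token `Authorization` header is an assumption about how authentication works, and `buildResponsesRequest` is a hypothetical helper, so treat this as a sketch and confirm the details against the API reference.

```typescript
const RESPONSES_URL = 'https://cencori.com/api/v1/responses';

interface ResponsesRequestBody {
  model: string;
  input: string;
  tools?: Array<Record<string, unknown>>;
}

interface HttpRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Build the pieces of an HTTP call without sending it, so the request can be
// inspected, logged, or handed to whichever HTTP client you already use.
function buildResponsesRequest(apiKey: string, body: ResponsesRequestBody): HttpRequest {
  return {
    url: RESPONSES_URL,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // ASSUMPTION: bearer-token authentication; the real scheme may differ.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}
```

With the standard `fetch` API, sending it would look like `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`.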
Beyond the First Request
Structured Output
Sometimes you need the agent to return structured data rather than free text. The response_format parameter lets you enforce JSON output:
const response = await cencori.ai.responses({
  model: 'gpt-4o',
  input: 'Extract the key dates from this document.',
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'dates',
      // JSON Schema describing the shape the output must follow.
      schema: {
        type: 'object',
        properties: {
          events: {
            type: 'array',
            items: {
              type: 'object',
              properties: {
                title: { type: 'string' },
                date: { type: 'string' },
              },
            },
          },
        },
      },
    },
  },
});

This is useful when piping agent output into another system: the schema acts as a contract between the agent and downstream consumers.
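On the consuming side, it is still worth checking the payload before handing it on, since the example schema does not mark any fields as required. The sketch below parses the agent's JSON text and validates it against the `events` shape; `parseDates` and `isDateEvent` are hypothetical helpers, and how you obtain the raw JSON string from the response is left out because the article does not specify it.

```typescript
interface DateEvent {
  title: string;
  date: string;
}

interface DatesPayload {
  events: DateEvent[];
}

// Type guard: true only when both fields are present and are strings.
function isDateEvent(value: unknown): value is DateEvent {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record['title'] === 'string' && typeof record['date'] === 'string';
}

// Parse the agent's JSON output and reject anything that breaks the contract.
function parseDates(raw: string): DatesPayload {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== 'object' || parsed === null) {
    throw new Error('expected a JSON object');
  }
  const events = (parsed as Record<string, unknown>)['events'];
  if (!Array.isArray(events) || !events.every(isDateEvent)) {
    throw new Error('expected "events" to be an array of { title, date } objects');
  }
  return { events };
}
```

Failing loudly at this boundary keeps a malformed response from propagating into whatever system consumes the dates.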
Source Annotations
When the agent uses web search, the response includes url_citation annotations on the output text. Each citation marker in the text, such as [1], maps to a source URL through its annotation:
// Find the message item, then read the first annotation on its first content part.
const msg = response.output.find(i => i.type === 'message');
const annotation = msg?.content?.[0]?.annotations?.[0];
// { type: 'url_citation', start_index: 0, end_index: 3, url: '...', title: '...' }

Use these annotations to render clickable citations in your UI.
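One way to turn annotations into a source list is sketched below. It assumes `start_index` and `end_index` are character offsets into the output text with an exclusive end, which is consistent with the `0` to `3` span covering `[1]` in the example above but is not stated outright; `collectCitations` is a hypothetical helper.

```typescript
interface Annotation {
  type: string;
  start_index: number;
  end_index: number; // ASSUMPTION: exclusive end offset
  url: string;
  title: string;
}

interface RenderedCitation {
  marker: string; // the cited span in the text, e.g. "[1]"
  url: string;
  title: string;
}

// Build a deduplicated source list, keeping the first marker seen for each URL.
function collectCitations(text: string, annotations: Annotation[]): RenderedCitation[] {
  const seenUrls = new Set<string>();
  const citations: RenderedCitation[] = [];
  for (const annotation of annotations) {
    if (annotation.type !== 'url_citation') continue; // skip other annotation kinds
    if (seenUrls.has(annotation.url)) continue;
    seenUrls.add(annotation.url);
    citations.push({
      marker: text.slice(annotation.start_index, annotation.end_index),
      url: annotation.url,
      title: annotation.title,
    });
  }
  return citations;
}
```

Each entry can then be rendered as a link labeled with its marker and title, for example in a "Sources" footer under the answer.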
Multi-turn Conversations
Chain agent interactions by passing the previous response ID:
const step1 = await cencori.ai.responses({
  model: 'gpt-4o',
  input: 'Search for Q3 AI funding rounds.',
  tools: [{ type: 'web_search_preview', search_context_size: 'high' }],
});

// The follow-up references step1 by ID instead of resending its output.
const step2 = await cencori.ai.responses({
  model: 'gpt-4o',
  input: 'Summarize those rounds in a table.',
  previous_response_id: step1.id,
});

The agent sees the previous response's output as conversation history, giving it context for follow-up questions.
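For longer conversations, the same chaining can be wrapped in a loop. This is a minimal sketch, not part of the SDK: `runConversation` is a hypothetical helper, and the `send` parameter stands in for a call such as `cencori.ai.responses` so the logic can be exercised without a network connection. It leaves out `tools`, which you would add per request as needed.

```typescript
interface AgentRequest {
  model: string;
  input: string;
  previous_response_id?: string;
}

interface AgentResponse {
  id: string;
}

// Stand-in for the real client call, injected so the loop is easy to test.
type Send = (request: AgentRequest) => Promise<AgentResponse>;

// Send each prompt in order, linking every request to the response before it.
async function runConversation(
  send: Send,
  model: string,
  prompts: string[],
): Promise<AgentResponse[]> {
  const responses: AgentResponse[] = [];
  let previousId: string | undefined;
  for (const input of prompts) {
    const request: AgentRequest =
      previousId === undefined
        ? { model, input } // first turn has nothing to reference
        : { model, input, previous_response_id: previousId };
    const response = await send(request);
    responses.push(response);
    previousId = response.id;
  }
  return responses;
}
```

The requests run sequentially on purpose: each one needs the ID returned by the turn before it.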
What's Next
The Responses API handles agentic work one request-response cycle at a time, with previous_response_id linking turns together. For long-running agents that need persistent memory, scheduled tasks, or human-in-the-loop approval, Cencori's Workflow and Orchestration layer is on the way; join the waitlist.

