AI Models
Browse 105+ models from 14 providers, all accessible through a single API with built-in security, observability, and failover.
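Because every model sits behind one API, switching providers is just a matter of changing the model string. Below is a minimal sketch of what a request looks like, assuming an OpenAI-compatible `/chat/completions` endpoint; the base URL shown is hypothetical, so substitute your gateway's actual values.

```python
import json

# Hypothetical gateway endpoint -- replace with your deployment's base URL.
BASE_URL = "https://gateway.example.com/v1"

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for any listed model ID."""
    return {
        "model": model,  # e.g. "claude-sonnet-4.5", "gpt-5", "gemini-3-pro"
        "messages": [{"role": "user", "content": prompt}],
    }

# The payload shape is identical across providers; only `model` changes.
payload = build_chat_request("claude-sonnet-4.5", "Hello!")
print(json.dumps(payload))
```

The same request body works for any chat model in the table below; image, search, and reasoning models may accept additional parameters specific to their type.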
| Model | Model ID | Provider | Type | Context | Description | Status |
|---|---|---|---|---|---|---|
| Claude 3.5 Haiku | `claude-3-5-haiku-20241022` | Anthropic | chat | 200K | Fast and efficient | Live |
| Claude 3.5 Sonnet | `claude-3-5-sonnet-20241022` | Anthropic | chat | 200K | Balance of speed and capability | Live |
| Claude 3.7 Sonnet | `claude-3-7-sonnet` | Anthropic | reasoning | 200K | Hybrid reasoning model | Live |
| Claude Haiku 4.5 | `claude-haiku-4.5` | Anthropic | chat | 200K | Fastest Claude model | Live |
| Claude Opus 4 | `claude-opus-4` | Anthropic | chat | 200K | Most capable Claude model | Live |
| Claude Opus 4.5 | `claude-opus-4.5` | Anthropic | chat | 200K | Latest, most intelligent | Live |
| Claude Opus 4.6 | `claude-opus-4.6` | Anthropic | chat | 1M | Latest flagship, agentic coding record-breaker | Live |
| Claude Opus 4.7 | `claude-opus-4.7` | Anthropic | chat | 1M | Latest flagship, improved reasoning & agentic coding | Live |
| Claude Sonnet 4 | `claude-sonnet-4` | Anthropic | chat | 200K | Balanced speed & capability | Live |
| Claude Sonnet 4.5 | `claude-sonnet-4.5` | Anthropic | chat | 200K | Enhanced coding & agents | Live |
| Claude Sonnet 4.6 | `claude-sonnet-4.6` | Anthropic | chat | 200K | Latest flagship, enhanced reasoning & coding | Live |
| Command A | `command-a-03-2025` | Cohere | chat | 256K | Most performant, agentic tasks | Live |
| Command Light | `command-light` | Cohere | chat | 4K | Fast and efficient | Live |
| Command R | `command-r` | Cohere | chat | 128K | Balanced performance | Live |
| Command R+ | `command-r-plus-08-2024` | Cohere | chat | 128K | Complex RAG and multi-step | Live |
| DeepSeek Coder V2 | `deepseek-coder-v2` | DeepSeek | code | 128K | 338 languages, GPT-4 level | Live |
| DeepSeek R1 | `deepseek-reasoner` | DeepSeek | reasoning | 64K | Reasoning model | Live |
| DeepSeek V3 | `deepseek-chat` | DeepSeek | chat | 128K | 128K context, MIT license | Live |
| DeepSeek V3.1 | `deepseek-v3.1` | DeepSeek | chat | 128K | Hybrid thinking modes | Live |
| DeepSeek V3.2 | `deepseek-v3.2` | DeepSeek | chat | 128K | GPT-5 level, daily driver | Live |
| DeepSeek V3.2 Speciale | `deepseek-v3.2-speciale` | DeepSeek | reasoning | 128K | Maxed reasoning, competition gold | Live |
| Gemini 2.0 Flash | `gemini-2.0-flash` | Google AI | chat | 1M | Fast model | Live |
| Gemini 2.0 Flash Thinking | `gemini-2.0-flash-thinking` | Google AI | reasoning | 1M | Reasoning variant | Live |
| Gemini 2.5 Flash | `gemini-2.5-flash` | Google AI | chat | 1M | Thinking capabilities | Live |
| Gemini 2.5 Flash Lite | `gemini-2.5-flash-lite` | Google AI | chat | 1M | Speed optimized | Live |
| Gemini 2.5 Pro | `gemini-2.5-pro` | Google AI | chat | 1M | Enhanced reasoning & coding | Live |
| Gemini 3 Deep Think | `gemini-3-deep-think` | Google AI | reasoning | 1M | Deep iterative reasoning | Live |
| Gemini 3 Flash | `gemini-3-flash` | Google AI | chat | 1M | Frontier speed & intelligence | Live |
| Gemini 3 Pro | `gemini-3-pro` | Google AI | chat | 2M | Powerful Gemini model | Live |
| Gemini 3 Pro Image | `gemini-3-pro-image` | Google AI | image | — | Fast photorealism | Live |
| Gemini 3.1 Flash Image (Nano Banana 2) | `gemini-3.1-flash-image` | Google AI | image | — | Reasoning-guided image synthesis, up to 4K | Live |
| Gemini 3.1 Pro (Custom Tools) | `gemini-3.1-pro-preview-customtools` | Google AI | chat | 1M | Optimized for custom tools and bash | Live |
| Gemini 3.1 Pro Preview | `gemini-3.1-pro-preview` | Google AI | chat | 1M | Latest flagship preview, 1M context, enhanced reasoning | Live |
| Imagen 3 | `imagen-3` | Google AI | image | — | High quality images | Live |
| Llama 3.1 8B Instant | `llama-3.1-8b-instant` | Groq | chat | 128K | Ultra-fast inference | Live |
| Llama 3.3 70B Versatile | `llama-3.3-70b-versatile` | Groq | chat | 128K | Groq-hosted versatile Llama 3.3 model | Live |
| Llama 4 Maverick | `llama-4-maverick` | Groq | chat | 256K | Latest multimodal Llama | Live |
| Llama 4 Scout | `llama-4-scout` | Groq | chat | 256K | Advanced Llama 4 model | Live |
| Mixtral 8x7B | `mixtral-8x7b-32768` | Groq | chat | 33K | MoE architecture | Live |
| Llama 3.3 70B | `meta-llama/Llama-3.3-70B-Instruct` | Hugging Face | chat | 128K | Via HF Inference | Live |
| Llama 4 Maverick | `meta-llama/Llama-4-Maverick` | Hugging Face | chat | 256K | Via HF Inference | Live |
| Mistral Large 3 | `mistralai/Mistral-Large-3` | Hugging Face | chat | 128K | Via HF Inference | Live |
| Qwen 2.5 72B | `Qwen/Qwen2.5-72B-Instruct` | Hugging Face | chat | 32K | Via HF Inference | Live |
| Llama 3.1 405B | `llama-3.1-405b` | Meta AI | chat | 128K | Largest open model | Live |
| Llama 3.1 70B | `llama-3.1-70b` | Meta AI | chat | 128K | Balanced performance | Live |
| Llama 3.2 90B Vision | `llama-3.2-90b-vision` | Meta AI | chat | 128K | Multimodal understanding | Live |
| Llama 3.3 70B | `llama-3.3-70b` | Meta AI | chat | 128K | Latest Llama 3 model | Live |
| Llama 4 Maverick | `llama-4-maverick` | Meta AI | chat | 256K | Latest multimodal flagship | Live |
| Llama 4 Scout | `llama-4-scout` | Meta AI | chat | 256K | Advanced reasoning | Live |
| Codestral 25.01 | `codestral-latest` | Mistral AI | code | 256K | 2.5x faster code generation | Live |
| Devstral 2 | `devstral-latest` | Mistral AI | code | 256K | Frontier code agents | Live |
| Magistral Medium | `magistral-medium` | Mistral AI | reasoning | 128K | Multimodal reasoning | Live |
| Ministral 3B | `ministral-3b` | Mistral AI | chat | 128K | Compact edge model | Live |
| Ministral 8B | `ministral-8b` | Mistral AI | chat | 128K | Small efficient model | Live |
| Mistral Large 3 | `mistral-large-latest` | Mistral AI | chat | 128K | 675B params, best open-weight multimodal | Live |
| Mistral Medium 3.1 | `mistral-medium-latest` | Mistral AI | chat | 128K | Frontier-class multimodal | Live |
| Mistral Small 3 | `mistral-small-latest` | Mistral AI | chat | 32K | 24B params, fast | Live |
| DALL-E 2 | `dall-e-2` | OpenAI | image | — | Fast image generation | Live |
| DALL-E 3 | `dall-e-3` | OpenAI | image | — | High quality images | Live |
| GPT Image 1 | `gpt-image-1` | OpenAI | image | — | ChatGPT image generation model | Live |
| GPT Image 1.5 | `gpt-image-1.5` | OpenAI | image | — | Best text rendering | Live |
| GPT-4 Turbo | `gpt-4-turbo` | OpenAI | chat | 128K | Legacy GPT-4 model | Live |
| GPT-4.1 | `gpt-4.1` | OpenAI | code | 1M | Long-context GPT-4.1 | Live |
| GPT-4.1 Mini | `gpt-4.1-mini` | OpenAI | chat | 1M | Balanced GPT-4.1 model | Live |
| GPT-4.1 Nano | `gpt-4.1-nano` | OpenAI | chat | 1M | Fast GPT-4.1 nano model | Live |
| GPT-4o | `gpt-4o` | OpenAI | chat | 128K | Omni-modal model | Live |
| GPT-4o Mini | `gpt-4o-mini` | OpenAI | chat | 128K | Fast and cost-effective | Live |
| GPT-5 | `gpt-5` | OpenAI | chat | 400K | Flagship model | Live |
| GPT-5 Mini | `gpt-5-mini` | OpenAI | chat | 400K | Fast and efficient | Live |
| GPT-5 Nano | `gpt-5-nano` | OpenAI | chat | 400K | Lowest-latency GPT-5 model | Live |
| GPT-5 Pro | `gpt-5-pro` | OpenAI | chat | 400K | High-quality GPT-5 variant | Live |
| GPT-5.1 | `gpt-5.1` | OpenAI | chat | 400K | Improved GPT-5 generation | Live |
| GPT-5.2 | `gpt-5.2` | OpenAI | chat | 400K | Latest GPT-5.2 flagship | Live |
| GPT-5.2 Pro | `gpt-5.2-pro` | OpenAI | chat | 400K | Most capable GPT-5.2 variant | Live |
| GPT-5.3 Instant | `gpt-5.3-chat-latest` | OpenAI | chat | 400K | Latest GPT-5.3 instant release | Live |
| GPT-5.4 Pro | `gpt-5.4-pro` | OpenAI | chat | 400K | Most capable GPT-5.4 variant | Live |
| GPT-5.4 Thinking | `gpt-5.4` | OpenAI | chat | 400K | Latest GPT-5.4 reasoning model | Live |
| o1 | `o1` | OpenAI | reasoning | 200K | Legacy reasoning model | Live |
| o3 | `o3` | OpenAI | reasoning | 200K | Advanced reasoning model | Live |
| o3 Mini | `o3-mini` | OpenAI | reasoning | 200K | Fast reasoning model | Live |
| o3 Pro | `o3-pro` | OpenAI | reasoning | 200K | Most advanced reasoning model | Live |
| o4 Mini | `o4-mini` | OpenAI | reasoning | 200K | Successor to o1-mini | Live |
| Claude Opus 4.5 (via OpenRouter) | `anthropic/claude-opus-4.5` | OpenRouter | chat | 200K | Unified billing | Live |
| Gemini 3 Pro (via OpenRouter) | `google/gemini-3-pro` | OpenRouter | chat | 2M | Meta-provider | Live |
| GPT-5 (via OpenRouter) | `openai/gpt-5` | OpenRouter | chat | 256K | Access any model | Live |
| Grok 4 (via OpenRouter) | `x-ai/grok-4` | OpenRouter | chat | 256K | Access xAI models | Live |
| Sonar | `sonar` | Perplexity | search | 128K | Default web-connected | Live |
| Sonar Large Online | `llama-3.1-sonar-large-128k-online` | Perplexity | search | 128K | Web-connected search | Live |
| Sonar Pro | `sonar-pro` | Perplexity | search | 128K | Enhanced search, richer context | Live |
| Sonar Reasoning Pro | `sonar-reasoning-pro` | Perplexity | reasoning | 128K | Deep inference & research | Live |
| Qwen 2.5 32B | `qwen2.5-32b-instruct` | Qwen | chat | 128K | Balanced performance | Live |
| Qwen 2.5 72B | `qwen2.5-72b-instruct` | Qwen | chat | 128K | Flagship model | Live |
| Qwen 2.5 Coder 32B | `qwen2.5-coder-32b` | Qwen | code | 128K | Code specialized | Live |
| QwQ 32B | `qwq-32b-preview` | Qwen | reasoning | 32K | Reasoning model | Live |
| DeepSeek V3.1 | `deepseek-ai/DeepSeek-V3.1` | Together AI | chat | 128K | Hybrid reasoning | Live |
| Llama 3.3 70B Turbo | `meta-llama/Llama-3.3-70B-Instruct-Turbo` | Together AI | chat | 128K | Fast Llama inference | Live |
| Llama 4 Maverick | `meta-llama/Llama-4-Maverick` | Together AI | chat | 256K | Latest Llama | Live |
| Qwen 2.5 72B | `Qwen/Qwen2.5-72B-Instruct-Turbo` | Together AI | chat | 32K | Alibaba flagship | Live |
| Grok 3 | `grok-3` | xAI | chat | 128K | DeepSearch, Big Brain Mode | Live |
| Grok 3 Mini | `grok-3-mini` | xAI | chat | 128K | Fast responses | Live |
| Grok 4 | `grok-4` | xAI | chat | 256K | Enhanced reasoning, real-time search | Live |
| Grok 4 Heavy | `grok-4-heavy` | xAI | chat | 256K | Maximum capability | Live |
| Grok 4.1 | `grok-4.1` | xAI | chat | 256K | Improved multimodal & reasoning | Live |
| Grok 4.1 Fast | `grok-4.1-fast` | xAI | chat | 2M | Best agentic tool calling | Live |
| Grok Code Fast | `grok-code-fast-1` | xAI | code | 128K | Fast agentic coding | Live |
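The catalog's failover feature routes around an unavailable model; conceptually it behaves like the client-side sketch below, which tries a chain of candidate model IDs in order and returns the first success. The `call_model` callable is a hypothetical stand-in for whatever client your gateway provides, not a documented API.

```python
from typing import Callable

def complete_with_failover(
    call_model: Callable[[str, str], str],
    candidates: list[str],
    prompt: str,
) -> str:
    """Try each candidate model ID in order; return the first successful reply."""
    last_error: Exception | None = None
    for model in candidates:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc  # remember the failure and fall through to the next model
    raise RuntimeError(f"all models failed: {candidates}") from last_error

# Example chain: fall back from a flagship to cheaper, faster alternatives.
# chain = ["claude-opus-4.5", "gpt-5", "gemini-3-flash"]
```

A failover chain usually orders models by capability and cost, so a transient outage degrades gracefully rather than failing the request outright.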