Add cost tracking, caching, failover, and 76 middleware modules to your Cohere requests. One URL change, no SDK swap.
Cohere offers strong models for enterprise search, RAG, and classification. Their Command models are competitive on reasoning tasks and their Embed models are among the best for retrieval.
Proxying through Stockyard normalizes Cohere into the same OpenAI-compatible endpoint as your other providers. Track costs, cache responses, and fail over to other providers without changing your application code.
```shell
# Install Stockyard
curl -fsSL stockyard.dev/install.sh | sh

# Set your Cohere API key
export COHERE_API_KEY=your-key-here

# Start the proxy
stockyard
# Provider: cohere (from COHERE_API_KEY)
# Proxy listening on :4200

# Send a request through the proxy
curl http://localhost:4200/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"command-r-plus","messages":[{"role":"user","content":"hello"}]}'
```
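The quickstart uses curl, but the "one URL change" applies to existing code too: the official OpenAI SDKs read `OPENAI_BASE_URL` from the environment, so repointing it at the local proxy reroutes every request without touching application code. A minimal sketch, assuming your SDK honors that variable and the proxy is on its default port:

```shell
# Point any OPENAI_BASE_URL-aware SDK at the local Stockyard proxy.
# No code changes: the SDK picks this up at client construction time.
export OPENAI_BASE_URL="http://localhost:4200/v1"
echo "$OPENAI_BASE_URL"
```

Provider API keys stay on the proxy side, so application processes only need to know the proxy address.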
Cohere exposes an OpenAI-compatible API at its compatibility endpoint (/compatibility/v1), which Stockyard connects to automatically. Embedding requests route to Cohere when the request uses an Embed model name.
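The model-name routing rule can be pictured with a small sketch. This is an illustration of the idea, not Stockyard's actual routing code; the Cohere base URL follows Cohere's published compatibility docs:

```shell
# Hypothetical sketch of model-name-based routing: Cohere model
# families (command-*, embed-*) map to Cohere's OpenAI-compatibility
# endpoint, gpt-* maps to OpenAI. Not Stockyard's real implementation.
route_for_model() {
  case "$1" in
    command-*|embed-*) echo "https://api.cohere.ai/compatibility/v1" ;;
    gpt-*)             echo "https://api.openai.com/v1" ;;
    *)                 echo "unknown" ;;
  esac
}

route_for_model "embed-english-v3.0"
route_for_model "gpt-4o"
```

Because routing keys off the model name, the client always talks to the same local endpoint regardless of which provider ultimately serves the request.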
Cohere's Embed models are among the best for RAG and semantic search. With Stockyard, you can route embedding requests to Cohere and chat requests to OpenAI or Anthropic through the same endpoint:
```shell
# Embeddings → Cohere
curl http://localhost:4200/v1/embeddings \
  -d '{"model":"embed-english-v3.0","input":"search query"}'

# Chat → OpenAI (same endpoint)
curl http://localhost:4200/v1/chat/completions \
  -d '{"model":"gpt-4o","messages":[...]}'
```
Both requests are traced, cost-tracked, and cached through the same proxy. One dashboard for everything.
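Response caches like this are typically keyed on a hash of the raw request body, so byte-identical requests hit the cache. A minimal sketch of that idea (hypothetical, not Stockyard's implementation):

```shell
# Hash the raw request body to form a cache key: identical bodies
# produce identical keys and can be served from cache without a
# provider round trip.
cache_key() {
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

body='{"model":"embed-english-v3.0","input":"search query"}'
cache_key "$body"
```

One consequence of body-hash keying: any difference in the JSON, even whitespace or key order, produces a different key and misses the cache.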
Route Cohere through Stockyard in under 60 seconds.
Install Guide · All 16 providers · Proxy-only mode · What is an LLM proxy? · vs LiteLLM · vs Langfuse