Add cost tracking, caching, failover, and 76 middleware modules to your Google Gemini requests. One URL change, no SDK swap.
Google Gemini uses its own API format. Stockyard translates Gemini requests to and from the OpenAI-compatible format, so you can use Gemini through the same endpoint and SDK as OpenAI and Anthropic.
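To make the translation concrete, here is a minimal sketch of the mapping in Python. The Gemini-side field names (`contents`, `parts`, `systemInstruction`, and the `model` role) come from Google's generateContent API; the function itself is illustrative, not Stockyard's actual implementation.

```python
# Illustrative sketch of the request-side translation an OpenAI-to-Gemini
# proxy performs. Not Stockyard's real code.

def openai_to_gemini(messages):
    """Translate an OpenAI-style message list into a Gemini request body."""
    body = {"contents": []}
    for msg in messages:
        if msg["role"] == "system":
            # Gemini takes the system prompt out-of-band, not as a chat turn
            body["systemInstruction"] = {"parts": [{"text": msg["content"]}]}
        else:
            # Gemini names the assistant role "model"
            role = "model" if msg["role"] == "assistant" else "user"
            body["contents"].append(
                {"role": role, "parts": [{"text": msg["content"]}]}
            )
    return body

req = openai_to_gemini([
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "hello"},
])
```

The response travels the other way: the proxy reads Gemini's candidates and re-emits them in the OpenAI chat-completion shape your SDK already parses.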
Gemini models are competitively priced, especially Flash variants. Proxying through Stockyard lets you route cost-sensitive requests to Gemini while keeping quality-critical requests on Claude or GPT-4o, all through one endpoint.
```sh
# Install Stockyard
curl -fsSL stockyard.dev/install.sh | sh

# Set your Google Gemini API key
export GEMINI_API_KEY=your-key-here

# Start the proxy
stockyard
# Provider: google-gemini (from GEMINI_API_KEY)
# Proxy listening on :4200

# Send a request through the proxy
curl http://localhost:4200/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"gemini-2.5-pro","messages":[{"role":"user","content":"hello"}]}'
```
Gemini model names are auto-detected. Use the full model name (e.g., gemini-2.5-flash) in your request body.
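From an OpenAI-compatible client, nothing changes except the base URL; the full Gemini model name passes through the proxy as-is. A standard-library sketch of the same request the curl example sends (the final `urlopen` call is commented out because it needs the proxy running):

```python
import json
import urllib.request

# Build the request the proxy expects. Only the base URL differs from
# talking to OpenAI directly; the Gemini model name is sent unchanged.
payload = {
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "hello"}],
}
req = urllib.request.Request(
    "http://localhost:4200/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# resp = urllib.request.urlopen(req)  # requires a running proxy on :4200
```

Official SDKs work the same way: point their base URL at `http://localhost:4200/v1` and keep everything else unchanged.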
Gemini 2.5 Flash is one of the cheapest capable models available. With model aliasing, you can route cost-sensitive tasks to Flash while keeping quality-critical requests on GPT-4o or Claude:
```sh
# Route cheap tasks to Gemini, expensive tasks to GPT-4o
curl -X PUT http://localhost:4200/api/proxy/aliases \
  -d '{"alias":"cheap","model":"gemini-2.5-flash"}'
curl -X PUT http://localhost:4200/api/proxy/aliases \
  -d '{"alias":"quality","model":"gpt-4o"}'
```
Your app sends requests to cheap or quality. Change the backing model anytime without redeploying.
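Client-side, routing then collapses to choosing an alias. A sketch under stated assumptions: the selection heuristic and both function names are invented for illustration; only the alias names cheap and quality come from the setup above.

```python
def pick_alias(prompt: str) -> str:
    # Toy heuristic, invented for illustration: long prompts or anything
    # that looks like code review goes to "quality", the rest to "cheap".
    if len(prompt) > 2000 or "def " in prompt:
        return "quality"
    return "cheap"

def build_request(prompt: str) -> dict:
    # The alias sits where a model name would; the proxy resolves it
    # server-side, so remapping aliases needs no client redeploy.
    return {
        "model": pick_alias(prompt),
        "messages": [{"role": "user", "content": prompt}],
    }
```

Because resolution happens in the proxy, swapping the cheap alias from gemini-2.5-flash to another model is one PUT, with no client change.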
Route Google Gemini through Stockyard in under 60 seconds.
Install Guide · All 16 providers · Proxy-only mode · What is an LLM proxy? · vs AWS Bedrock · vs LiteLLM