Provider

Route Groq through Stockyard

Add cost tracking, caching, failover, and 76 middleware modules to your Groq requests. One URL change, no SDK swap.

Environment variable: GROQ_API_KEY
Models: llama-3.3-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b
Failover to: OpenAI GPT-4o-mini, DeepSeek, or Anthropic Haiku
API format: OpenAI-compatible

Why proxy Groq?

Groq runs open-source models on custom LPU hardware with extremely low latency. Proxying through Stockyard adds cost tracking (Groq is cheap but not free), response caching (save even more), and failover to other providers when Groq hits rate limits.

Groq is already OpenAI-compatible, so the translation overhead is minimal. Stockyard adds the operational layer that Groq does not provide: per-request logging, audit trails, and middleware modules.
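The "one URL change" amounts to pointing your existing OpenAI-compatible client at the proxy's origin instead of Groq's. A minimal sketch of that rewrite (`route_through_proxy` is a hypothetical helper for illustration, not part of Stockyard; the `:4200` default port comes from the quick start below):

```python
from urllib.parse import urlsplit, urlunsplit

def route_through_proxy(url: str, proxy_base: str = "http://localhost:4200") -> str:
    """Swap the Groq API origin for the local Stockyard proxy.

    Groq serves its OpenAI-compatible routes under /openai/v1/...;
    the proxy serves the same paths under /v1/... (per the quick start).
    """
    parts = urlsplit(url)
    path = parts.path
    # Drop Groq's /openai prefix so the path matches the proxy's routes
    if path.startswith("/openai/"):
        path = path[len("/openai"):]
    proxy = urlsplit(proxy_base)
    return urlunsplit((proxy.scheme, proxy.netloc, path, parts.query, ""))

print(route_through_proxy("https://api.groq.com/openai/v1/chat/completions"))
# http://localhost:4200/v1/chat/completions
```

Everything else about the request (headers, JSON body, streaming) stays exactly as your Groq client already sends it.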

Quick start

# Install Stockyard
curl -fsSL stockyard.dev/install.sh | sh

# Set your Groq API key
export GROQ_API_KEY=your-key-here

# Start the proxy
stockyard
# Provider: groq (from GROQ_API_KEY)
# Proxy listening on :4200

# Send a request through the proxy
curl http://localhost:4200/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.3-70b-versatile","messages":[{"role":"user","content":"hello"}]}'

Good to know

Groq enforces aggressive rate limits on free tiers. Stockyard's rate-limiting module smooths out request bursts, and its cache eliminates redundant API calls.
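One common way to smooth bursts under a fixed requests-per-minute cap is a token bucket. The sketch below is illustrative, not Stockyard's actual implementation; the 30 req/min figure matches the free-tier limit discussed in the next section:

```python
import time

class TokenBucket:
    """Admit at most `rate` requests per `per` seconds, smoothing bursts."""

    def __init__(self, rate: int = 30, per: float = 60.0):
        self.capacity = rate
        self.tokens = float(rate)
        self.refill_per_sec = rate / per
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should queue or delay the request

bucket = TokenBucket(rate=30, per=60.0)
burst = [bucket.allow() for _ in range(40)]
print(sum(burst))  # the first 30 of a 40-request burst pass; the rest must wait
```

Requests that do not get a token are held and replayed once the bucket refills, instead of being bounced back to your app as 429s.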

Handle Groq rate limits gracefully

Groq's LPU hardware delivers sub-second responses, but free-tier rate limits can throttle you at 30 requests per minute. Stockyard helps in two ways:

CACHING

Identical prompts return cached responses instantly. For iterative development, this can cut your effective request count by 50-80%.
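A response cache for chat completions typically keys on the full request payload, so byte-identical prompts hit the cache. A minimal in-memory sketch of the idea (not Stockyard's internals):

```python
import hashlib
import json

cache: dict = {}

def cache_key(body: dict) -> str:
    # Canonical JSON (sorted keys) so logically identical requests collide
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def complete(body: dict, call_upstream) -> tuple:
    """Return (response, was_cached)."""
    key = cache_key(body)
    if key in cache:
        return cache[key], True  # served instantly, costs zero Groq requests
    response = call_upstream(body)
    cache[key] = response
    return response, False

req = {"model": "llama-3.3-70b-versatile",
       "messages": [{"role": "user", "content": "hello"}]}
fake_upstream = lambda body: {"choices": [{"message": {"content": "hi"}}]}

first = complete(req, fake_upstream)
second = complete(req, fake_upstream)
print(first[1], second[1])  # False True — the repeat never reaches Groq
```

During iterative development you tend to re-send the same prompt many times, which is why repeats saved can reach the 50-80% range.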

FAILOVER

When Groq returns a 429 rate limit error, Stockyard automatically retries on your fallback provider (OpenAI, DeepSeek, etc.) so your app never sees the error.
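The failover path can be pictured as: try Groq first, and on a 429 transparently retry the same request against the configured fallback. A simplified sketch with stubbed providers (your Stockyard configuration determines the real fallback order):

```python
class RateLimited(Exception):
    """Stand-in for an HTTP 429 from a provider."""

def with_failover(request: dict, providers: list) -> dict:
    """Try each provider in order; move to the next only on a rate limit."""
    last = None
    for call in providers:
        try:
            return call(request)
        except RateLimited as exc:
            last = exc  # 429: fall through to the next provider
    raise last  # every provider was rate-limited

def groq(req):
    # Simulate a free-tier 429 from Groq
    raise RateLimited("429 Too Many Requests")

def openai_fallback(req):
    # Fallback answers instead; the app never sees the 429
    return {"provider": "openai", "choices": [{"message": {"content": "hi"}}]}

resp = with_failover({"model": "llama-3.3-70b-versatile"}, [groq, openai_fallback])
print(resp["provider"])  # openai
```

Because every provider in the chain speaks the same OpenAI-compatible format, the retried request body needs no translation beyond swapping the model name.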

Route Groq through Stockyard in under 60 seconds.

Install Guide

All 16 providers · Proxy-only mode · What is an LLM proxy? · vs LiteLLM · vs Helicone

Explore: OpenAI · Anthropic · DeepSeek · Ollama