Getting Started

Your first 5 minutes

Install the binary, send a request, and watch 76 middleware modules light up. Five minutes to full observability.

[Screenshot: Terminal showing Stockyard install, startup, and first traced request with cost routing]

Install → start → first trace with automatic cost routing. That's it.

1. Install

One command. No runtime, no Docker, no dependencies.

# Downloads a single static binary to /usr/local/bin
curl -fsSL stockyard.dev/install.sh | sh
What just happened: A ~25MB Go binary is now on your PATH. It contains 16 core apps, 76 middleware modules, an embedded SQLite database, and a full web console. Nothing else was installed.

Tip: Run stockyard doctor to verify your environment before starting. It checks your config, port availability, provider API keys, and disk access.
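
A run looks something like this (illustrative only; these are the four checks named above, but the actual output format may differ):

stockyard doctor
#   config ............ ok
#   port 4200 ......... available
#   provider keys ..... OPENAI_API_KEY found
#   disk access ....... writable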

2. Set a provider key

Stockyard needs at least one LLM provider key to proxy requests. Set it as an environment variable — Stockyard auto-detects the provider from the key prefix.

# Set at least one (Stockyard auto-detects the provider)
export OPENAI_API_KEY=sk-...
# Or:
export ANTHROPIC_API_KEY=sk-ant-...
# Or:
export GROQ_API_KEY=gsk_...
# Or:
export GEMINI_API_KEY=...
No key set in your shell yet? Use the Playground to try Stockyard with your key in the browser — nothing gets stored.

3. Start the platform

One command. Everything boots on a single port.

stockyard

# Output:
# Stockyard v1.0 — Wrangle your Stack.
# ┌────────────────────────────────────────────┐
# │ Proxy      :4200/v1    ✓ 76 modules active │
# │ Console    :4200/ui    ✓ 150 tools         │
# │ API        :4200/api   ✓ 350+ endpoints    │
# │ Playground :4200/play  ✓ ready             │
# └────────────────────────────────────────────┘
What just happened: The proxy is listening. The console is live. The SQLite database was created at ~/.stockyard/stockyard.db. All 16 providers are auto-configured. Seventy-six modules are running in the middleware chain. Nothing to configure yet — sensible defaults are already active.

4. Send your first request

Point any OpenAI-compatible client at localhost:4200.

# Just change the base URL — your API key stays yours
curl http://localhost:4200/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"Hello"}]}'

Or in your app code — one line changes:

# Before — direct to OpenAI
client = OpenAI()

# After — through Stockyard
client = OpenAI(base_url="http://localhost:4200/v1")
What just happened: Your request hit the proxy, flowed through 76 middleware modules (rate limiter, cost tracker, safety filter, semantic cache, etc.), reached OpenAI, and the response came back — with a trace logged in Lookout and an audit event written to Brand. You didn't configure any of that.
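
Prefer to verify from the terminal? A sketch, assuming Lookout exposes recent traces under the :4200/api prefix shown at startup (the route name here is a guess; check the API reference for the real one):

# Hypothetical route; the real path is in the API reference
curl "http://localhost:4200/api/lookout/traces?limit=1"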

5. Open the console

See what just happened — traces, costs, live events.

Open http://localhost:4200/ui in your browser.

[Screenshot: Lookout dashboard showing request traces with model, provider, latency, tokens, cost, and time columns]
Lookout → Traces: every request logged with model, latency, tokens, and cost
What you're seeing: The Lookout dashboard shows every request that's flowed through the proxy. Each row is a trace — which model was called, through which provider, how long it took, how many tokens were used, and what it cost. The cost rollup at the top gives you today's spend at a glance.

6. Toggle a module

Turn features on and off at runtime — no restart, no redeploy.

Navigate to Chute in the sidebar. You'll see all 76 modules with toggle switches.

# Or via the API:
curl -X PUT http://localhost:4200/api/proxy/modules/cost-cap \
  -d '{"enabled": true}'
What just happened: The module was toggled in the live middleware chain — the next request will flow through the updated pipeline. No binary restart, no config file change, no deploy. The toggle is persisted to SQLite and survives restarts.
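
To confirm the change from the terminal, a GET on the same collection is a reasonable sketch (only the PUT route above is documented; this list route is an assumption):

# Assumption: GET lists each module and its enabled flag,
# mirroring the documented PUT toggle route
curl http://localhost:4200/api/proxy/modules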

Works with your stack

Point your existing code at http://localhost:4200/v1 instead of the provider. That's the whole integration.

OpenAI Python SDK
# Before
client = OpenAI()

# After
client = OpenAI(base_url="http://localhost:4200/v1")
OpenAI Node SDK
// Before
const openai = new OpenAI();

// After
const openai = new OpenAI({ baseURL: "http://localhost:4200/v1" });
LangChain
llm = ChatOpenAI(
    base_url="http://localhost:4200/v1",
    model="gpt-4o",
)
Vercel AI SDK
const openai = createOpenAI({ baseURL: "http://localhost:4200/v1" });
Cursor / Windsurf / Copilot
# Set in editor settings:
#   API Base URL: http://localhost:4200/v1
curl / any HTTP client
curl http://localhost:4200/v1/chat/completions \
  -d '{"model":"gpt-4o","messages":[...]}'

Full integration docs · Editor setup guides

Three developers, three problems

The same platform, configured differently.

Solo developer

"I don't want a surprise $400 bill."

You're building a side project with GPT-4o. You want a hard cost cap at $20/month, automatic fallback to a cheaper model when you're close, and a daily email if anything looks off.

Enable three modules and you're done.

cost-cap · model-fallback · spend-alerts
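
One way to flip all three at once is the module-toggle endpoint from step 6 (a sketch; only the endpoint pattern is documented above):

# Enable the three cost modules via the documented toggle route
for m in cost-cap model-fallback spend-alerts; do
  curl -X PUT "http://localhost:4200/api/proxy/modules/$m" -d '{"enabled": true}'
done
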
Startup going to prod

"Our safety review is next week."

You need PII filtering before responses hit users, injection attempt blocking, and an audit trail that proves you're doing it. Install the safety pack — five modules, pre-configured.

block-pii-output · block-injection · content-filter · audit-log · safety-score
Enterprise team

"Compliance wants a tamper-proof ledger."

Every LLM call needs to be hash-chained in an append-only audit log. Brand does this automatically — every request through the proxy is a signed ledger entry with chain verification.

trust-ledger · evidence-export · chain-verify · policy-engine
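
Hash chaining itself is a standard construction, and a short Python sketch shows why tampering is detectable. This illustrates the idea only; it is not Brand's actual entry format or signing scheme:

import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def append_entry(ledger, event):
    # Each entry's hash covers both the event and the previous entry's
    # hash, so editing any past record invalidates everything after it.
    prev = ledger[-1]["hash"] if ledger else GENESIS
    body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
    ledger.append({"event": event, "prev": prev,
                   "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify_chain(ledger):
    # Recompute every hash from the genesis value forward.
    prev = GENESIS
    for entry in ledger:
        body = json.dumps({"event": entry["event"], "prev": prev}, sort_keys=True)
        if entry["prev"] != prev or entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

ledger = []
append_entry(ledger, {"model": "gpt-4o", "tokens": 42})
append_entry(ledger, {"model": "gpt-4o-mini", "tokens": 7})
assert verify_chain(ledger)

ledger[0]["event"]["tokens"] = 9999  # tamper with a past entry
assert not verify_chain(ledger)      # the chain no longer verifies

Because each hash covers the one before it, verifying the final entry transitively verifies the entire log; that is what "chain verification" buys you.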

What you're actually looking at

Every app gets its own tab. One dashboard for the full platform.

Every request traced. Cost attribution by model and provider. Latency sparklines. Anomaly detection. Alerts when something looks wrong.

[Screenshot: Lookout dashboard with traces, cost, latency, and error rate stats]
Traces & Cost Dashboard — real-time request monitoring

One line in your code

If your code talks to the OpenAI API, it already works with Stockyard.

from openai import OpenAI

# Your API key goes straight to the provider — Stockyard never stores it
client = OpenAI(
    base_url="http://localhost:4200/v1",  # ← the only change
    api_key="sk-your-key",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Explain LLM proxies in one sentence."}],
)
print(response.choices[0].message.content)

Works with any OpenAI-compatible client. The Anthropic, Google, and Groq SDKs all support custom base URLs.
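
For example, with the Anthropic Python SDK (a sketch: it assumes Stockyard accepts Anthropic-format requests on the same port, so check the integration docs for the exact base URL):

# Assumption: Stockyard proxies Anthropic-format requests at this base URL
from anthropic import Anthropic

client = Anthropic(base_url="http://localhost:4200")
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # example model name
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
)
print(message.content[0].text)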

Ready to try it?

Three commands. Thirty seconds. Everything between your app and the model.

Not ready to install yet?

Get notified when we ship new features. No spam, unsubscribe anytime.

Ready for more?

Everything you just saw is free — including 100 cost-routing decisions per day and a 5-probe security quickscan. See your real savings and security grade before you pay anything. When you're ready for unlimited routing, the full red-team suite, prompt evolution, or any of the 150 tools — upgrade in one click.

See plans · Explore all 150 tools

Explore: Self-hosted proxy · OpenAI-compatible · Model aliasing
Stockyard also makes 150 focused self-hosted tools — browse the catalog or get everything for $29/mo.