LLM Observability

Know what every request actually costs.

Stockyard traces every LLM request with cost, latency, token count, model, and provider. Automatic, self-hosted, no third-party data sharing. Query through the API or browse in the dashboard.

One request, everything recorded

# Send a normal request through the proxy
$ curl localhost:4200/v1/chat/completions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Summarize this doc"}]}'

# Query what just happened
$ curl localhost:4200/api/observe/traces?limit=1
{
  "model": "gpt-4o",
  "provider": "openai",
  "input_tokens": 847,
  "output_tokens": 234,
  "cost_cents": 1.42,
  "latency_ms": 1847,
  "cached": false,
  "status": 200
}

No SDK changes. No extra library. Route traffic through Stockyard and tracing is automatic.

How much did we spend today? Aggregate cost by hour, day, or week. Break it down by model, provider, or API key.
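A daily rollup is a straightforward group-and-sum over trace records. A minimal sketch in Python, assuming trace dicts shaped like the proxy's JSON above plus a `timestamp` field (the field name is an assumption, not Stockyard's documented schema):

```python
from collections import defaultdict
from datetime import datetime

def cost_by_day_and_model(traces):
    """Sum cost_cents per (day, model) from a list of trace dicts."""
    totals = defaultdict(int)
    for t in traces:
        day = datetime.fromisoformat(t["timestamp"]).date().isoformat()
        totals[(day, t["model"])] += t["cost_cents"]
    return dict(totals)

traces = [
    {"timestamp": "2024-06-01T09:15:00", "model": "gpt-4o", "cost_cents": 142},
    {"timestamp": "2024-06-01T11:02:00", "model": "gpt-4o", "cost_cents": 98},
    {"timestamp": "2024-06-02T08:30:00", "model": "gpt-4o-mini", "cost_cents": 7},
]
print(cost_by_day_and_model(traces))
# {('2024-06-01', 'gpt-4o'): 240, ('2024-06-02', 'gpt-4o-mini'): 7}
```

In practice the `/api/observe/stats` endpoint does this aggregation server-side; the sketch just shows the shape of the computation.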

Which model is cheapest for this task? Compare cost-per-request across models doing the same work. Lasso lets you replay requests against cheaper models side by side.

Are we overspending on a specific endpoint? Filter traces by model, status code, latency range, or cost threshold. Find the expensive outliers.

Is caching working? See cache hit rate, cost saved from cached responses, and which requests are being served from cache.
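Hit rate and savings reduce to two sums over the trace list. A sketch under one stated assumption: each cached trace carries the cost the hit avoided in a hypothetical `saved_cents` field (Stockyard's actual field names may differ):

```python
def cache_summary(traces):
    """Return (hit_rate, cents_saved) over a list of trace dicts.
    Assumes cached traces record the avoided cost in `saved_cents`."""
    hits = [t for t in traces if t["cached"]]
    rate = len(hits) / len(traces) if traces else 0.0
    saved = sum(t["saved_cents"] for t in hits)
    return rate, saved

traces = [
    {"cached": True, "saved_cents": 142},
    {"cached": False},
    {"cached": True, "saved_cents": 98},
    {"cached": False},
]
rate, saved = cache_summary(traces)
print(rate, saved)  # 0.5 240
```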

Cost calculation is automatic

Stockyard knows the pricing for every model across all 40 supported providers. When a request completes, cost is calculated from input tokens, output tokens, and the model's per-token rate. You don't configure pricing tables or update them manually.
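The arithmetic itself is simple: tokens times per-token rate, summed for input and output. A minimal sketch in pure integer math, with rates expressed as cents per million tokens (the rate numbers below are illustrative, not Stockyard's actual pricing table):

```python
def cost_cents(input_tokens, output_tokens, in_rate, out_rate):
    """Cost in whole cents, rounded half-up, using only integer math.
    in_rate / out_rate are cents per million tokens (illustrative)."""
    total = input_tokens * in_rate + output_tokens * out_rate
    return (total + 500_000) // 1_000_000

# e.g. 100k input / 20k output tokens at 250 and 1000 cents per Mtok
print(cost_cents(100_000, 20_000, 250, 1000))  # 45
```

Keeping the intermediate total in token-scaled units and dividing once at the end avoids accumulating per-request rounding error.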

Costs are stored in integer cents in SQLite. No floating-point drift, no rounding surprises.
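Integer cents in SQLite means cost columns are plain `INTEGER` and aggregates stay exact. A sketch with an illustrative schema (not Stockyard's actual table layout):

```python
import sqlite3

# Illustrative two-column schema; cost is an exact INTEGER, never REAL.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE traces (model TEXT, cost_cents INTEGER)")
con.executemany(
    "INSERT INTO traces VALUES (?, ?)",
    [("gpt-4o", 142), ("gpt-4o", 98), ("gpt-4o-mini", 7)],
)
(total,) = con.execute("SELECT SUM(cost_cents) FROM traces").fetchone()
print(total)  # 247
```

Summing integers is exact at any scale, which is the point: the same query over millions of rows never drifts the way `REAL` columns can.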

SaaS observability tools require you to send every prompt and completion to a third party. Stockyard runs on your infrastructure. Traces live in a local SQLite file. Nothing leaves your network unless you send it somewhere.

Compliance teams care about this. So do customers.

Query API

GET /api/observe/traces            # list traces with filters
GET /api/observe/traces/{id}       # single trace detail
GET /api/observe/stats             # aggregate cost, tokens, latency
GET /api/observe/stats/models      # breakdown by model
GET /api/observe/stats/providers   # breakdown by provider

Filter by date range, model, provider, cost threshold, status code, latency range.
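Filters are ordinary query-string parameters, so building a filtered request is one `urlencode` call. A sketch where the parameter names (`model`, `min_cost_cents`, `status`) are illustrative guesses, not a documented contract:

```python
from urllib.parse import urlencode

BASE = "http://localhost:4200/api/observe/traces"

def trace_query(**filters):
    """Build a filtered trace-list URL; filter names are hypothetical."""
    params = {k: v for k, v in filters.items() if v is not None}
    return f"{BASE}?{urlencode(params)}"

url = trace_query(model="gpt-4o", min_cost_cents=100, status=200)
print(url)
# http://localhost:4200/api/observe/traces?model=gpt-4o&min_cost_cents=100&status=200
```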

See your LLM costs from request one.

Tracing is free on every tier. Install Stockyard, send traffic, query your costs. No credit card, no third-party data sharing.

Install Stockyard
Lookout Docs → Cut LLM Costs → Proxy-Only Setup →
Explore: Why SQLite · vs LiteLLM · Proxy-only mode
Stockyard also makes 150 focused self-hosted tools — browse the catalog or get everything for $29/mo.