How does Stockyard calculate LLM costs?

Stockyard counts the input and output tokens for each request and multiplies each count by the provider's per-token price for that model. Costs are calculated in real time and stored in a local SQLite database.
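As a rough illustration of that arithmetic, here is a minimal Python sketch. The model names and prices in `PRICES_PER_MILLION` are made up for the example, and `request_cost` is a hypothetical helper, not Stockyard's actual code.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens.
# Real provider prices differ by model and change over time.
PRICES_PER_MILLION = {
    "example-large": {"input": Decimal("3.00"), "output": Decimal("15.00")},
    "example-small": {"input": Decimal("0.25"), "output": Decimal("1.25")},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request.

    Input and output tokens are priced separately, then summed.
    """
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # 1,200 input tokens and 350 output tokens on the hypothetical large model.
    print(request_cost("example-large", 1200, 350))
```

Using `Decimal` rather than floats keeps small per-request amounts from drifting when many of them are summed.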

Can I set spend limits per user or team?

Yes. Stockyard supports daily and monthly spend caps per API key, per user, and per team. When a cap is reached, requests are blocked or routed to a cheaper model depending on configuration.
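The block-or-reroute decision can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `SpendCap` fields, the `decide` function, and the action names are not taken from Stockyard's configuration format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SpendCap:
    daily_cap: float                       # USD per day
    monthly_cap: float                     # USD per month
    fallback_model: Optional[str] = None   # hypothetical: None means block outright


def decide(cap: SpendCap, spent_today: float, spent_month: float,
           requested_model: str) -> Tuple[str, Optional[str]]:
    """Return (action, model) for an incoming request.

    action is "allow", "reroute", or "block".
    """
    over_cap = spent_today >= cap.daily_cap or spent_month >= cap.monthly_cap
    if not over_cap:
        return ("allow", requested_model)
    if cap.fallback_model is not None:
        # Cap reached, but a cheaper model is configured as a fallback.
        return ("reroute", cap.fallback_model)
    return ("block", None)
```

For example, with `SpendCap(5.00, 100.00)` a key that has already spent $5.00 today is blocked, while the same key with `fallback_model="example-small"` is rerouted instead.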

Does cost tracking require sending data to a third party?

No. All cost data is stored in embedded SQLite on your own infrastructure. No prompts, completions, or cost data leaves your network.

LLM Cost Tracking — Per-Request, Self-Hosted

The problem with LLM costs

LLM API bills are opaque. Provider dashboards show aggregate spend but not per-request breakdowns. You cannot tell which feature, which user, or which prompt pattern is driving costs. By the time you notice the bill, the damage is done.

Most cost tracking tools are SaaS products that require sending your prompts and completions to a third party. That creates a new data residency problem while solving the billing one.

How Stockyard tracks costs

Every request through the proxy is logged with input tokens, output tokens, model, provider, latency, and calculated cost in USD. The data stays in embedded SQLite on your server.
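To make the shape of such a log concrete, here is a small sketch using Python's built-in `sqlite3` module. The `requests` table, its column names, and the helper functions are assumptions for illustration; Stockyard's real schema may look different.

```python
import sqlite3

# Hypothetical schema covering the fields listed above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    id            INTEGER PRIMARY KEY,
    ts            TEXT    NOT NULL DEFAULT (datetime('now')),
    provider      TEXT    NOT NULL,
    model         TEXT    NOT NULL,
    input_tokens  INTEGER NOT NULL,
    output_tokens INTEGER NOT NULL,
    latency_ms    INTEGER NOT NULL,
    cost_usd      REAL    NOT NULL
)
"""


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the request log. Defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def log_request(conn: sqlite3.Connection, provider: str, model: str,
                input_tokens: int, output_tokens: int,
                latency_ms: int, cost_usd: float) -> int:
    """Insert one request record and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO requests "
            "(provider, model, input_tokens, output_tokens, latency_ms, cost_usd) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (provider, model, input_tokens, output_tokens, latency_ms, cost_usd),
        )
    return cur.lastrowid
```

Because the log is a single SQLite file, nothing in this flow needs a network call beyond the request to the provider itself.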

The Lookout dashboard shows cost breakdowns by model, provider, day, and user. You can set spend caps per team or per API key. The proxy tracks costs even in proxy-only mode with zero configuration.

# Check spend via API
curl http://localhost:4200/api/spend
{"projects":{"default":{"today":0.42,"month":12.87}}}

# Per-user spend caps
curl -X PUT http://localhost:4200/api/users/u123/spend/cap \
  -H 'Content-Type: application/json' \
  -d '{"daily_cap": 5.00, "monthly_cap": 100.00}'
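If you would rather read the spend endpoint from a script than from curl, a sketch along these lines may help. It assumes the response always has the shape shown in the sample above (a `projects` object with `today` and `month` per project); `spend_summary` and `fetch_spend` are hypothetical helper names.

```python
import json
import urllib.request


def spend_summary(body: str) -> dict:
    """Sum today's and this month's spend across all projects in a response body."""
    payload = json.loads(body)
    projects = payload.get("projects", {})
    return {
        "today": sum(p.get("today", 0.0) for p in projects.values()),
        "month": sum(p.get("month", 0.0) for p in projects.values()),
    }


def fetch_spend(base_url: str = "http://localhost:4200") -> dict:
    """Call the spend endpoint and summarize it. Requires a running proxy."""
    with urllib.request.urlopen(f"{base_url}/api/spend", timeout=5) as resp:
        return spend_summary(resp.read().decode("utf-8"))


# The sample response from the curl example above.
SAMPLE = '{"projects":{"default":{"today":0.42,"month":12.87}}}'

if __name__ == "__main__":
    print(spend_summary(SAMPLE))
```

`spend_summary` works on any response body, so it can be tested against the sample without a server running.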
  

What you can track

Per-request cost with token-level granularity. Daily and monthly spend rollups per project. Cost breakdowns by model, provider, and user. Cache hit rates and cost savings from semantic caching. Historical cost trends with 7/30/90-day views.
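Rollups like these are plain aggregate queries once the per-request rows exist. The sketch below runs one against an in-memory SQLite database; the table layout and sample rows are invented for the example and are not Stockyard's actual schema.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with a few made-up request rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE requests (ts TEXT, user_id TEXT, model TEXT, "
        "input_tokens INTEGER, output_tokens INTEGER, cost_usd REAL)"
    )
    conn.executemany(
        "INSERT INTO requests VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("2025-01-01 09:00:00", "u1", "example-large", 1000, 200, 0.006),
            ("2025-01-01 10:00:00", "u2", "example-large", 500, 100, 0.003),
            ("2025-01-02 09:00:00", "u1", "example-small", 2000, 400, 0.001),
        ],
    )
    return conn


def daily_rollup(conn: sqlite3.Connection) -> list:
    """Return (day, model, request_count, total_tokens, cost_usd) per day and model."""
    return conn.execute(
        "SELECT date(ts) AS day, model, COUNT(*), "
        "       SUM(input_tokens + output_tokens), ROUND(SUM(cost_usd), 6) "
        "FROM requests GROUP BY day, model ORDER BY day, model"
    ).fetchall()


if __name__ == "__main__":
    for row in daily_rollup(demo_connection()):
        print(row)
```

Swapping `model` for `user_id` in the query gives the per-user view, and filtering on `ts` gives the 7, 30, or 90-day windows.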

All of this runs locally. No data leaves your network. No SaaS dependency. No per-seat pricing for your observability tool.

Cost optimization built in

Stockyard does more than track costs. The Drover autopilot can automatically route requests to cheaper models when quality thresholds are met. Semantic caching answers repeated prompts from the cache instead of paying the provider for the same completion again. Spend caps stop runaway costs before they reach your provider bill.
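To show how cache hits turn into measurable savings, here is a deliberately simplified sketch. It uses exact-match caching keyed on a hash of the model and prompt, which is a stand-in and not the semantic (similarity-based) caching Stockyard describes. `ExactMatchCache` and the `call` callback are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, Tuple


class ExactMatchCache:
    """Exact-match prompt cache that records hit rate and avoided spend."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[str, float]] = {}
        self.hits = 0
        self.misses = 0
        self.saved_usd = 0.0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def get_or_call(self, model: str, prompt: str,
                    call: Callable[[str, str], Tuple[str, float]]) -> str:
        """Return a completion, calling the provider only on a cache miss.

        call(model, prompt) must return (completion, cost_usd).
        """
        key = self._key(model, prompt)
        if key in self._store:
            completion, cost = self._store[key]
            self.hits += 1
            self.saved_usd += cost  # the cost we would have paid again
            return completion
        completion, cost = call(model, prompt)
        self._store[key] = (completion, cost)
        self.misses += 1
        return completion

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A semantic cache would replace `_key` with an embedding lookup and a similarity threshold, but the hit and savings accounting would stay the same.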

Stop guessing what your LLM API costs. Start tracking per-request.


Per-request visibility changes everything

Aggregate cost dashboards tell you how much you spent last month. Per-request cost tracking tells you why. When you can see that a single customer's workflow generated $47 in API costs because of a retry loop, or that your embeddings pipeline is re-embedding documents it already processed, you can fix the problem instead of just observing the bill. Stockyard logs every request with its token count, model, provider, latency, and calculated cost in USD. The data lives in SQLite on your server, queryable through the API and visualized in the Lookout dashboard.
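With per-request records in hand, spotting a retry loop comes down to grouping and counting. The sketch below assumes each record carries a `customer`, a `prompt_hash`, and a `cost_usd` field; those field names and the `find_repeat_offenders` helper are illustrative, not part of Stockyard's API.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def find_repeat_offenders(records: Iterable[dict],
                          min_repeats: int = 3) -> List[Tuple[str, str, int, float]]:
    """Flag (customer, prompt_hash) pairs that repeat at least min_repeats times.

    Returns (customer, prompt_hash, count, total_cost_usd), most expensive first.
    """
    groups = defaultdict(lambda: {"count": 0, "cost_usd": 0.0})
    for record in records:
        group = groups[(record["customer"], record["prompt_hash"])]
        group["count"] += 1
        group["cost_usd"] += record["cost_usd"]

    flagged = [
        (customer, prompt_hash, group["count"], group["cost_usd"])
        for (customer, prompt_hash), group in groups.items()
        if group["count"] >= min_repeats
    ]
    return sorted(flagged, key=lambda row: row[3], reverse=True)
```

The same grouping applied to an embeddings pipeline, keyed on a hash of the document text, would surface documents that are being embedded more than once.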

Cost attribution by API key lets you allocate spend to teams, projects, or customers. Set up separate keys for your production app, your staging environment, and your internal tools. The spend endpoint breaks down costs by key, by model, and by day. Export to CSV for finance reporting. Set spend caps that return a structured error when a key exceeds its budget — this prevents the runaway loop scenario where a single bug generates a four-figure bill overnight.
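For a sense of what a finance export and a cap error might look like, here is a short sketch. The CSV columns and the error payload are assumed shapes chosen for the example; Stockyard's actual export format and error body are not specified here.

```python
import csv
import io
from typing import Iterable

# Hypothetical column layout for a per-key spend export.
FIELDS = ["api_key", "model", "day", "cost_usd"]


def spend_to_csv(rows: Iterable[dict]) -> str:
    """Render per-key spend rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buf.getvalue()


def cap_exceeded_error(api_key: str, cap_usd: float, spent_usd: float) -> dict:
    """Build a machine-readable error body for a key that is over budget.

    The field names are illustrative only.
    """
    return {
        "error": {
            "type": "spend_cap_exceeded",
            "api_key": api_key,
            "cap_usd": cap_usd,
            "spent_usd": spent_usd,
        }
    }
```

A structured error lets a client tell "over budget" apart from a provider outage and stop retrying, which is what breaks the runaway loop.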
