How a request flows through Stockyard


Entry point: POST /v1/chat/completions
Routing: fallbackrouter · modelswitch · regionroute · abrouter
Safety: promptguard · toxicfilter · guardrail · secretscan · agentguard
Cost: costcap · rateshield · usagepulse · tierdrop
Cache: cachelayer · semanticcache
Transform: promptslim · tokentrim · contextpack · langbridge
Validate: structuredshield · evalgate
Lookout: llmtap · tracelink · alertpulse · driftwatch
Shims: anthrofit · geminishim
Apps (auto-record): Lookout · Brand
Providers: OpenAI · Anthropic · Gemini · Groq · Ollama · +35 more

How the Proxy Works

Every request enters through the standard OpenAI-compatible /v1/chat/completions endpoint and flows through the middleware chain in order: routing decides which provider handles it, safety modules screen for harmful content, cost modules enforce spending limits, the cache checks for a hit, transforms optimize the prompt, validators check output quality, and observe modules record everything. After the response, hooks automatically write traces to Lookout and audit entries to Brand.

Every module is wrapped with toggle.Wrap. Disabled modules are bypassed with zero overhead. Toggle any module on or off at runtime via the API or the console — no restart required.

The Full Platform

The proxy is the foundation. On top of it, Stockyard runs integrated products:

Core Infrastructure

Proxy (76 modules) → Lookout (traces) → Brand (audit)

Creation Tools

Tack Room (prompt templates) → Forge (DAG workflows)

Marketplace

Trading Post (config packs, one-click install)

Operations

Billing · Team · Memory

The Stack

Language: Go 1.22
Database: SQLite (WAL, embedded)
Binary: ~25MB, self-hosted
Dependencies: Zero
Tables: 106+
API endpoints: 360+
Proxy overhead: ~200ms (full chain)
Encryption: AES-256-GCM at rest

The ~200ms proxy overhead covers the full middleware chain (rate limiting, cost tracking, logging, failover, filtering). A typical LLM provider response takes 1-30 seconds, so the proxy adds only a small fraction of total request time (about 2% of a 10-second response).

Explore: Why SQLite · vs LiteLLM · Proxy-only mode