Serve — Decision APIs · Online features · Cache
Serve is the high-pressure pump. Sub-second decisions, governed by policy, cached at the edge, traced end-to-end. Your agents stop guessing and start acting.
p99 latency
39ms
Availability
99.95%
Cost / 1M calls
− 41%
Deliverables
Everything that ships
- Decision API gateway: typed endpoints, JWT + policy, rate-limit + circuit breaker.
- Online feature serving: sub-10ms reads from Redis / DynamoDB, point-in-time correct.
- Model serving lane: Triton / vLLM / KServe with shadow + canary deploys.
- Edge cache: Cloudflare / CloudFront tiers for high-fan-out reads.
- Observability: OpenTelemetry traces from agent → API → feature → model.
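"Point-in-time correct" means a decision made at time t only ever sees feature values written at or before t, never a later write. A minimal sketch of that read rule, assuming an illustrative in-memory stand-in for the online store (the names, layout, and timestamps here are hypothetical, not the production schema):

```python
from bisect import bisect_right

# Illustrative stand-in for the online store (Redis / DynamoDB in the
# real deliverable). Each feature keeps (timestamp, value) rows sorted
# by timestamp.
store = {
    ("c_8821", "renewal_score"): [(100, 0.4), (200, 0.7), (300, 0.9)],
}

def read_point_in_time(customer_id, feature, as_of):
    """Return the latest value written at or before `as_of`.

    This is the point-in-time guarantee: a decision made at t=250
    must never see the value written at t=300.
    """
    rows = store.get((customer_id, feature), [])
    timestamps = [t for t, _ in rows]
    i = bisect_right(timestamps, as_of)
    if i == 0:
        return None  # no feature value existed yet at as_of
    return rows[i - 1][1]

print(read_point_in_time("c_8821", "renewal_score", 250))  # → 0.7
```

The same rule is what prevents train/serve skew: offline training joins features with exactly this "latest at-or-before" semantics, so the model sees the same values online that it saw in training.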
Pod composition
- Platform Engineer
- ML Engineer
- Reliability Engineer
Example output · Decision · /v1/next-best-action · JSON
POST /v1/next-best-action
{
"customer_id": "c_8821",
"context": { "channel": "app", "intent": "renew" }
}
→ 200 { "action": "offer_uplift_12mo", "score": 0.81, "lat_ms": 37 }
Timeline
Weeks 8–12 · production cut-over by day 84
- 1 · Weeks 8–9 · Gateway + features
Typed APIs, JWT + policy, online features under 10ms.
- 2 · Weeks 9–11 · Model lane
Triton/vLLM/KServe with shadow + canary; OTel traces end-to-end.
- 3 · Weeks 11–12 · Edge + cut-over
Cloudflare cache tier; production cut-over with rollback.
FAQs
Things prospects ask
Can we bring our own models?
Yes — anything that speaks gRPC/HTTP. We wrap with policy, observability and shadow deploys.
How do you handle bursty traffic?
Per-tenant rate limits, circuit breakers, edge cache for high-fan-out reads, autoscaling on request budget.
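A per-tenant rate limit is typically a token bucket: each tenant's bucket holds up to a burst capacity of tokens, refills at a fixed rate, and drains one token per request; when empty, the request is rejected immediately rather than queued. A minimal sketch (capacity and refill numbers are illustrative):

```python
import time

class TokenBucket:
    """Per-tenant token bucket: up to `capacity` burst requests,
    refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity=10, refill_rate=5.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = {}  # tenant -> remaining tokens
        self.last = {}    # tenant -> timestamp of last check

    def allow(self, tenant, now=None):
        now = time.monotonic() if now is None else now
        last = self.last.setdefault(tenant, now)
        tokens = self.tokens.setdefault(tenant, float(self.capacity))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.refill_rate)
        self.last[tenant] = now
        if tokens >= 1.0:
            self.tokens[tenant] = tokens - 1.0
            return True
        self.tokens[tenant] = tokens
        return False

bucket = TokenBucket(capacity=3, refill_rate=1.0)
# A burst of 4 requests at t=0: the first 3 pass, the 4th is rejected.
burst = [bucket.allow("tenant_a", now=0.0) for _ in range(4)]
# One second later, one token has refilled.
refilled = bucket.allow("tenant_a", now=1.0)
print(burst, refilled)  # → [True, True, True, False] True
```

Because each tenant has its own bucket, one bursty tenant exhausts only its own budget; the circuit breaker and edge cache then absorb what the limiter lets through.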
Commission · S6 Serve
Stand up Serve in Weeks 8–12.
We'll respond within one business day with a scoping note, a fixed-price outcome contract, and a named principal. Your details sync straight into our concierge queue.
- Outcome-priced — no T&M.
- Sovereign by default — your data, your region, your keys.
- Wired into the Fuel Pressure gauge from day one.