How we keep LLM costs predictable without a database

June 8, 2026 · Quravin

The model is the bill. Everything else (Lambda, S3, the CDN) rounds to zero at our scale. So cost control really means controlling model spend, and the accounting behind it has to be exact.

A hard counter on S3

We didn’t want to run a database just to count requests. Instead we use S3 conditional writes (If-Match on the ETag) as a compare-and-swap: read the counter along with its ETag, increment the value, and write it back only if the ETag still matches. If another writer got there first, S3 rejects the write with 412 Precondition Failed, and we re-read and retry. That gives each organization an atomic monthly request counter and a daily cost counter that are durable, serverless, and correct under concurrency.
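
The read, increment, conditional-write, retry loop can be sketched as follows. This is a minimal illustration, not our production code: FakeBucket is a hypothetical in-memory stand-in for S3 so the example runs anywhere, and the names increment, put_if_match, and the key layout are assumptions made for the example. Against real S3, the conditional write would be a PutObject request carrying an If-Match header.

```python
import json
from dataclasses import dataclass, field


class PreconditionFailed(Exception):
    """The ETag we sent no longer matches the stored object (S3 answers 412)."""


@dataclass
class FakeBucket:
    """Hypothetical in-memory stand-in for a bucket with conditional writes."""
    objects: dict = field(default_factory=dict)  # key -> (etag, body)
    version: int = 0

    def get(self, key):
        # A counter that does not exist yet reads as zero with no ETag.
        return self.objects.get(key, (None, b'{"count": 0}'))

    def put_if_match(self, key, body, etag):
        # Write only if the caller saw the latest version of the object.
        current_etag, _ = self.objects.get(key, (None, None))
        if current_etag != etag:
            raise PreconditionFailed(key)
        self.version += 1
        self.objects[key] = (f"v{self.version}", body)


def increment(bucket, key, amount=1, max_attempts=8):
    """Compare-and-swap increment: retry whenever another writer wins."""
    for _ in range(max_attempts):
        etag, body = bucket.get(key)
        count = json.loads(body)["count"] + amount
        new_body = json.dumps({"count": count}).encode()
        try:
            bucket.put_if_match(key, new_body, etag)
            return count
        except PreconditionFailed:
            continue  # someone else wrote first; re-read and try again
    raise RuntimeError(f"gave up on {key} after {max_attempts} attempts")


bucket = FakeBucket()
increment(bucket, "orgs/acme/requests/2026-06.json")
print(increment(bucket, "orgs/acme/requests/2026-06.json"))
```

The stand-in treats a missing object as "no ETag" to keep the sketch short. On real S3, creating the counter for the first time would use If-None-Match: * instead, since there is no ETag to match yet.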

Three layers of protection

Monthly request quota: set per plan, with per-app overrides.
Daily USD cost cap: a hard ceiling that refuses a run before it starts once the day's spend would exceed the cap.
Anonymous limits: a per-IP limit and a global daily ceiling on the public try-it endpoint, plus a Turnstile bot check, so the free tools can’t be turned into someone else’s free GPU.
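
As a rough sketch of how the three layers could combine into one admission decision: everything named here (Limits, Usage, admit, the reason strings, the check order, and the use of a worst-case cost estimate in integer cents) is an assumption for illustration, not a description of our actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limits:
    monthly_requests: int   # plan quota after any per-app override
    daily_cap_cents: int    # daily USD cap, in cents to avoid float drift
    anon_per_ip_daily: int
    anon_global_daily: int


@dataclass(frozen=True)
class Usage:
    monthly_requests: int
    daily_cents: int
    anon_ip_today: int = 0
    anon_global_today: int = 0


def resolve_monthly_quota(plan_quota: int, app_override: Optional[int] = None) -> int:
    """A per-app override, when present, replaces the plan's quota."""
    return plan_quota if app_override is None else app_override


def admit(usage: Usage, limits: Limits, estimated_cents: int,
          anonymous: bool = False, bot_check_passed: bool = True) -> Optional[str]:
    """Return None to allow the run, or the name of the layer that refused it."""
    if anonymous:
        if not bot_check_passed:
            return "bot_check"
        if usage.anon_ip_today >= limits.anon_per_ip_daily:
            return "anon_ip_limit"
        if usage.anon_global_today >= limits.anon_global_daily:
            return "anon_global_limit"
    if usage.monthly_requests >= limits.monthly_requests:
        return "monthly_quota"
    # Refuse before spending: compare the worst-case cost against the cap.
    if usage.daily_cents + estimated_cents > limits.daily_cap_cents:
        return "daily_cost_cap"
    return None


limits = Limits(monthly_requests=resolve_monthly_quota(1000),
                daily_cap_cents=500, anon_per_ip_daily=5, anon_global_daily=200)
print(admit(Usage(monthly_requests=10, daily_cents=480), limits, estimated_cents=50))
```

Checking the estimate before the run, rather than the actual cost after it, is what lets the cap act as a ceiling instead of an alert.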

The payoff

You get a bill you can predict and a kill switch you control, and nobody has to operate a database to provide it: the only state lives in S3 objects. Read more in the FAQ or start building.