Three layers — allocation, enforcement, reporting — that fit over the tools you already use. No rip-and-replace.
01 — Allocation Layer
One policy. Every team's compute.
Define compute budgets by job level, team, or individual — with planned HRIS sync starting with Rippling, then BambooHR, then enterprise HRIS in Q3 2026. Joiners, movers, and leavers update policy automatically once that sync ships.
Stipend — Allocation Policy · Acme Corp
Budget Policy — 6 role tiers configured
Rippling first
Role TierMonthly BudgetProvider Access
Engineering · Principal
Level 6+
$3,000 / mo
OpenAIAnthropicGoogleAll models
Engineering · Senior
Level 4–5
$2,000 / mo
OpenAIAnthropic
Engineering · Mid
Level 2–3
$1,200 / mo
OpenAIAnthropic
Product
Any level
$1,500 / mo
OpenAIAnthropic
Design
Any level
$800 / mo
OpenAIgpt-4o, gpt-4o-mini
Support
Any level
$200 / mo
gpt-4o-mini only
Budget by role tier
Allocations tied to job levels in your HRIS. A new hire gets the right budget from day one — no admin ticket, no manual step.
Provider model allowlists
Define which providers and models each group can call. Glob patterns like claude-sonnet-* are supported. Blocked at the gateway before the request leaves.
Passthrough or resale
Bring your own API contracts or route through Stipend's resale layer. Either way: one credential per employee, all providers.
02 — Real-Time Enforcement
Hard limits before the request lands.
The gateway isn't a monitor — it's a gatekeeper. Every AI call is checked against the employee's remaining balance synchronously, before it reaches the provider. Over budget means a clean rejection. No overages, ever.
Stipend Gateway — Request Lifecycle
Employee tool
Cursor, Claude, VS Code, API
HTTPS
Stipend Gateway
Auth · Allowlist Budget reserve
✓ Approved
OpenAI
api.openai.com
Response
Reconcile
Actual cost returned to wallet
01
Key auth
Employee's Stipend key resolved to user, account, and wallet. Revoked keys rejected instantly — no round-trip.
< 1ms
02
Model allowlist
Requested model checked against account's provider policy. Unapproved models blocked before a reservation is even attempted.
03
Atomic reservation
Worst-case cost estimated and atomically decremented from wallet in one SQL transaction. Concurrent callers cannot race past the limit.
~3ms avg
04
Forward to provider
Request proxied with the resolved API key. Works for both standard and streaming responses — transparent to the calling tool.
05
Reconcile cost
Actual tokens parsed from the provider response. Over-estimated reserve returned to wallet. Usage event written — immutable, one per request.
When budget is exhausted
POST /v1/chat/completions HTTP 402 Payment Required
The atomic SQL reserve means concurrent requests cannot exceed the wallet balance. The database is the source of truth — not a cache or a flag.
Stream-aware
Works with standard and streaming requests. Token counts parsed from SSE chunks as they arrive. Wallet reconciled on stream close — no polling needed.
Drop-in compatible
The gateway speaks the same request/response shape as OpenAI and Anthropic. Employees swap one base URL. No code changes required in their tools.
03 — Finance-Ready Reporting
Reports your CFO can actually use.
Every request carries full attribution — employee, team, cost center, provider, model. Finance gets a clean breakdown they can export directly to their ERP. No manual reconciliation. No spreadsheets.
Stipend — AI Spend Report · March 2026
March 2026 · AI Spend Report
Total Spent
$11,760
↑ 14% vs February
Remaining Budget
$1,940
of $13,700 total
Overages
$0
100% enforced
Active Employees
28
across 4 teams
TeamSpentBudget usageStatus
Engineering
CC-ENG-001
$6,840
85%
On track
Product
CC-PROD-001
$3,200
91%
Near limit
Design
CC-DES-001
$1,240
77%
On track
Support
CC-SUP-001
$480
80%
On track
OpenAI$6,703
Anthropic$4,117
Google$940
Exports to
NetSuiteQuickBooksWorkday
Cost center attribution
Every token billed to the right team and role automatically. Allocations match your existing org chart, not a custom taxonomy you have to maintain.
Immutable audit trail
Every request logged once and never updated: who, what model, how many tokens, what cost, at what time. Write-once by design — built for compliance reviews.
Monthly reports, automatic
Finance receives a structured summary on the first of each month. One-click export for AP. No dashboard to check, no manual pull required.
<1 day
From signup to first enforced request
100%
Of requests budget-checked before reaching the provider
Rippling
BambooHR next. Enterprise HRIS in Q3 2026.
1 click
Admin action to revoke access today
Full Capabilities
Everything you need. Nothing you don't.
Budget by Role
Starting with Rippling, then BambooHR, planned HRIS sync will provision new hires at the right tier, update allocations on role change, and expand to enterprise HRIS in Q3 2026.
Provider Policy
Define which providers and models each team can access. Glob patterns supported. Requests to unapproved endpoints blocked before they leave the gateway.
Real-Time Enforcement
Every request checked against remaining balance synchronously. Hard limits, not soft alerts. No overages, no bill surprises at month end.
Audit Trail
Complete, immutable logs of every request: who, which model, how many tokens, what cost, at what time. Write-once by design — built for compliance reviews.
HRIS Sync
Rippling first. BambooHR next. Enterprise HRIS in Q3 2026. Native sync will automate joiners, movers, and leavers as Stipend moves further upmarket.
Cost Center Reporting
Every dollar attributed to a team, role, or cost center. Export to NetSuite, QuickBooks, or your ERP. Finance-ready without manual reconciliation.
No Cost for Qualified Teams
Every employee will have a token budget. Are you ready?
Budgets, policy controls, and real-time enforcement — running in a single afternoon. No custom engineering. No long-term contract.