Usage & Billing
Usage & Billing
JoyToken billing is built around tier wallets, router model rates, billing-service calculation, usage records, and analytics aggregation. Successful model responses keep the OpenAI-compatible structure and add JoyToken cost fields in metadata.billing.
Request Lifecycle
Credit Tiers
The current system manages wallets and routing across three tiers:
Do not hand-calculate credits. The gateway asks router-service for rates and billing-service calculates cost from tokens and rates.
Freeze
When wallet quota is enabled, the gateway freezes credits before invoking the provider.
Estimation inputs:
Provider failure releases the freeze. If billing calculation or usage recording fails after provider success, the gateway also releases instead of charging incorrectly.
Cost Calculation
The gateway calls:
- router-service
GetBillingRates: fetch model rates. - billing-service
Calculate: computecredits_used,usd_cost, provider cost, and margin. - billing-service
RecordUsage: idempotently record usage byrequest_id. - wallet-service
Settle: settle the freeze with actual credits.
Common metadata.billing fields:
Usage Record Fields
The gateway sends these fields when recording usage:
Streaming Billing
Streaming responses forward provider chunks and parse usage as the stream progresses. Before the stream ends, the gateway appends a metadata event:
If the stream does not produce usable usage, the gateway releases the freeze and records the stream as not settleable from usage.
Analytics
Billing and analytics services provide these aggregate views:
Cost Control
- Use separate API keys per environment and workflow.
- Give IDE, agent, and RAG indexing workloads separate budgets.
- Prefer
model: "auto"plus tier policy for cost governance. - Pin model or tier for critical production paths to avoid cost drift.
- Use
X-Request-IDto connect application logs with JoyToken Usage / Billing.