Routing

JoyToken routing happens inside api-gateway. The gateway validates the API key and policy, converts the OpenAI-compatible request into router-service Route input, then uses the returned selected_model for wallet freeze, provider invocation, usage calculation, and billing records.

POST /openai/v1/chat/completions
-> ValidateApiKey
-> build policy
-> Route
-> wallet freeze
-> provider invoke
-> usage / billing

Routing Input

Chat requests are converted into router input:

| Input | Source |
| --- | --- |
| request_id | The current implementation uses an internal req-<timestamp> for router input; external X-Request-ID is mainly for gateway logs, billing, and provider correlation |
| session_id | Request user; otherwise a hash-like value from the latest user message |
| user_id | API key owner ID |
| system_prompt | The first system message |
| messages | OpenAI messages converted to role/content pairs |
| latest_prompt | Latest user message, or the last message when no user message exists |
| client_ip | X-Forwarded-For, X-Real-IP, or remote address |
| tools | Request tools, summarized for model selection |
| policy | Gateway-built policy constraint |
| options.execute | Currently false; router decides, gateway invokes provider |
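
The mapping above can be sketched in a few lines. This is a minimal illustration, not the gateway's code: build_router_input is a hypothetical helper, the SHA-256 digest stands in for the unspecified "hash-like value", the req-<timestamp> format is guessed, taking the first X-Forwarded-For hop is an assumption, and tool summarization is left out.

```python
import hashlib
import time
from typing import Any


def build_router_input(body: dict[str, Any], headers: dict[str, str],
                       remote_addr: str, key_owner_id: str,
                       policy: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: map an OpenAI-style chat body to router input."""
    messages = [{"role": m["role"], "content": m.get("content")}
                for m in body.get("messages", [])]
    user_msgs = [m for m in messages if m["role"] == "user"]
    system_msgs = [m for m in messages if m["role"] == "system"]

    # Latest user message, or the last message when no user message exists.
    latest = user_msgs[-1] if user_msgs else (messages[-1] if messages else None)
    latest_prompt = latest["content"] if latest else ""

    # Request `user` wins; otherwise derive a stable value from the prompt.
    # SHA-256 is an assumption; the source only says "hash-like".
    session_id = body.get("user") or hashlib.sha256(
        str(latest_prompt).encode("utf-8")).hexdigest()[:16]

    # Assumed precedence: first X-Forwarded-For hop, X-Real-IP, socket address.
    forwarded = headers.get("X-Forwarded-For", "")
    client_ip = (forwarded.split(",")[0].strip()
                 or headers.get("X-Real-IP", "").strip()
                 or remote_addr)

    return {
        "request_id": f"req-{time.time_ns()}",  # internal, not X-Request-ID
        "session_id": session_id,
        "user_id": key_owner_id,
        "system_prompt": system_msgs[0]["content"] if system_msgs else "",
        "messages": messages,
        "latest_prompt": latest_prompt,
        "client_ip": client_ip,
        "tools": body.get("tools", []),  # summarization omitted in this sketch
        "policy": policy,
        "options": {"execute": False},  # router decides, gateway invokes
    }
```

The sketch looks up headers by exact name; a real HTTP stack would normally treat header names case-insensitively.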

Model Selection

| Request model | Behavior |
| --- | --- |
| omitted | Uses gateway default model; without a default it behaves like auto |
| auto | router-service chooses a model within policy |
| concrete model ID / key | Gateway sets fixed_model in route policy and still asks router to resolve and validate |
| API key fixed model | Concrete requests for another model are rejected before routing |

Auto routing request:

    {
      "model": "auto",
      "tier": "standard",
      "messages": [
        { "role": "user", "content": "Summarize this support ticket." }
      ]
    }
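
The model selection rules can be read as a small decision function. The sketch below is one plausible reading, not the gateway's code: resolve_model_request and PolicyRejected are hypothetical names, and giving an API key fixed model precedence over the gateway default is an assumption the source does not confirm.

```python
from typing import Optional


class PolicyRejected(Exception):
    """Hypothetical error for requests rejected before routing."""


def resolve_model_request(requested: Optional[str],
                          gateway_default: Optional[str],
                          key_fixed_model: Optional[str]) -> dict:
    """Return the routing mode and any fixed_model to place in the policy."""
    # A concrete model in the request must match the key's fixed model, if any.
    if requested and requested != "auto":
        if key_fixed_model and requested != key_fixed_model:
            raise PolicyRejected("requested model is not allowed")
        return {"mode": "fixed", "fixed_model": requested}

    # Assumption: a key-level fixed model applies to auto and omitted requests.
    if key_fixed_model:
        return {"mode": "fixed", "fixed_model": key_fixed_model}

    # Omitted model: use the gateway default when one is configured.
    if not requested and gateway_default:
        return {"mode": "fixed", "fixed_model": gateway_default}

    # Explicit auto, or omitted with no default: the router chooses.
    return {"mode": "auto", "fixed_model": None}
```

Even in the fixed case the gateway still calls the router, which resolves and validates the model rather than choosing one.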

Tier Resolution

Final allowed tiers come from:

  1. API key policy snapshot.
  2. API key tier.
  3. Request body tier.
  4. Wallet balance filtering when wallet quota is enabled.

If the request body contains an invalid tier, the gateway returns 403 policy_rejected. If the requested tier is outside the policy, the request is also rejected.
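
As a rough sketch of the resolution steps above: the function below intersects the sources in order and rejects bad request tiers. The names are hypothetical, the tier list is taken from the tier names used on this page, and because the source does not say how an API key tier expands into permitted tiers, the sketch accepts an already-expanded set.

```python
from typing import Iterable, Optional

KNOWN_TIERS = ("economy", "standard", "premium")  # tier names used on this page


class PolicyRejected(Exception):
    """Hypothetical stand-in for a policy_rejected response."""


def resolve_allowed_tiers(policy_tiers: Iterable[str],
                          key_tiers: Optional[Iterable[str]],
                          request_tier: Optional[str],
                          affordable_tiers: Optional[Iterable[str]] = None
                          ) -> list[str]:
    """Intersect policy, API key, request, and wallet constraints."""
    allowed = set(policy_tiers)
    if key_tiers is not None:
        allowed &= set(key_tiers)

    if request_tier is not None:
        # An unknown tier name is rejected outright (403 policy_rejected).
        if request_tier not in KNOWN_TIERS:
            raise PolicyRejected("invalid tier")
        # A valid tier outside the policy is rejected as well.
        if request_tier not in allowed:
            raise PolicyRejected("requested tier is not allowed")
        allowed = {request_tier}

    # Wallet filtering applies only when wallet quota is enabled.
    if affordable_tiers is not None:
        allowed &= set(affordable_tiers)

    return [tier for tier in KNOWN_TIERS if tier in allowed]
```

An empty result means no tier survived the filters; how the gateway reports that case is not covered here.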

Policy Sent to Router

The gateway assembles policy from policy_snapshot_json and API key fields:

| Constraint | Source |
| --- | --- |
| routing_strategy | Policy snapshot; supports BALANCE, COST_FIRST, QUALITY_FIRST, SPEED_FIRST |
| allowed_tiers | Intersection of policy tiers, API key tier, and request tier |
| model_blacklist | Policy snapshot |
| fixed_model | API key fixed_model or concrete request model |
| tag | Policy scenario tag, lowercased |
| industry_packs | Policy industry scenario packs |
| required_feature_tags | Derived from request content; image input adds vision |
| quota_remaining | API key limit_daily, used as routing input |
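
A minimal sketch of how such a policy might be assembled follows. The output keys come from the table; everything else is assumed for illustration: the function names, the snapshot key names (routing_strategy, model_blacklist, scenario_tag, industry_packs), and detecting images through OpenAI-style image_url content parts.

```python
from typing import Any, Optional


def required_feature_tags(messages: list[dict[str, Any]]) -> list[str]:
    """Derive feature tags from request content; image input adds 'vision'."""
    tags: set[str] = set()
    for message in messages:
        content = message.get("content")
        # OpenAI-style multimodal content is a list of typed parts.
        if isinstance(content, list):
            for part in content:
                if isinstance(part, dict) and part.get("type") == "image_url":
                    tags.add("vision")
    return sorted(tags)


def build_policy(snapshot: dict[str, Any], api_key: dict[str, Any],
                 allowed_tiers: list[str], request_model: Optional[str],
                 messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical assembly of the policy sent to the router."""
    concrete = request_model if request_model not in (None, "auto") else None
    return {
        "routing_strategy": snapshot.get("routing_strategy"),
        "allowed_tiers": allowed_tiers,
        "model_blacklist": snapshot.get("model_blacklist", []),
        # API key fixed_model, or the concrete model named in the request.
        "fixed_model": api_key.get("fixed_model") or concrete,
        "tag": (snapshot.get("scenario_tag") or "").lower(),
        "industry_packs": snapshot.get("industry_packs", []),
        "required_feature_tags": required_feature_tags(messages),
        "quota_remaining": api_key.get("limit_daily"),
    }
```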

Wallet-Aware Fallback

If the request uses model: "auto" and the selected tier fails wallet freeze due to insufficient balance, the gateway can retry routing to another allowed tier.

| Current tier | Fallback order |
| --- | --- |
| premium | standard -> economy |
| standard | premium -> economy |
| economy | standard -> premium |

This fallback only happens for model: "auto". Fixed model requests fail with 402 insufficient_quota when freeze fails.
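
The fallback behaviour can be sketched as a loop over the order in the table. This is an illustration under stated assumptions: try_freeze is a hypothetical callback standing in for "re-route within this tier and attempt the wallet freeze", and raising the same 402-style error when every fallback tier also fails is assumed, since the source only states that outcome for fixed model requests.

```python
from typing import Callable, Collection

FALLBACK_ORDER = {
    "premium": ["standard", "economy"],
    "standard": ["premium", "economy"],
    "economy": ["standard", "premium"],
}


class InsufficientQuota(Exception):
    """Hypothetical stand-in for a 402 insufficient_quota response."""


def freeze_with_fallback(is_auto: bool, selected_tier: str,
                         allowed_tiers: Collection[str],
                         try_freeze: Callable[[str], bool]) -> str:
    """Return the tier whose wallet freeze succeeded, or raise."""
    if try_freeze(selected_tier):
        return selected_tier

    # Fixed model requests never fall back to another tier.
    if not is_auto:
        raise InsufficientQuota("wallet balance is insufficient")

    # Auto requests retry the remaining tiers in the documented order,
    # skipping any tier the policy does not allow.
    for tier in FALLBACK_ORDER[selected_tier]:
        if tier in allowed_tiers and try_freeze(tier):
            return tier

    raise InsufficientQuota("wallet balance is insufficient")
```

For example, an auto request routed to premium with only enough balance for economy would try premium, then standard, then settle on economy, provided all three tiers are allowed.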

Provider Request Body

Before calling provider-adapter, the gateway normalizes the provider body:

| Change | Reason |
| --- | --- |
| Removes top-level tier | Upstream providers do not understand JoyToken tier |
| Sets selected model | Provider receives the selected model |
| Merges metadata | Adds routing metadata for observability |
| For streaming, sets stream_options.include_usage = true | Allows usage extraction from SSE when the provider supports it |
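
The four changes above amount to a small transformation of the request body. A minimal sketch follows, assuming routing metadata is merged under a top-level metadata key and that routing keys win on conflict; neither detail is confirmed by the source, and normalize_provider_body is a hypothetical name.

```python
import copy
from typing import Any


def normalize_provider_body(body: dict[str, Any], selected_model: str,
                            routing_metadata: dict[str, Any]) -> dict[str, Any]:
    """Return a provider-ready copy of the client request body."""
    out = copy.deepcopy(body)  # leave the caller's body untouched

    # Upstream providers do not understand the JoyToken tier field.
    out.pop("tier", None)

    # The provider must receive the routed model, not "auto".
    out["model"] = selected_model

    # Assumed location and precedence for routing metadata.
    merged = dict(out.get("metadata") or {})
    merged.update(routing_metadata)
    out["metadata"] = merged

    # For streaming, ask the provider to report usage in the SSE stream.
    if out.get("stream"):
        options = dict(out.get("stream_options") or {})
        options["include_usage"] = True
        out["stream_options"] = options

    return out
```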

Response Metadata

Non-streaming responses merge routing metadata into the JSON body. Streaming responses append a metadata event before [DONE].

Common fields:

| Field | Meaning |
| --- | --- |
| model | Selected model |
| tier | Selected billing/routing tier |
| score | Router score |
| task_score | Router task scoring details when available |
| model_recommendation | Candidate models when router returns them |
| latency.routing_ms | Router latency |
| latency.first_token_ms | First-token latency for streaming |
| latency.stream_ms | Total stream transfer time |
| billing | Credits and token fields when calculated |

Stream metadata event:

    {
      "metadata": {
        "model": "GLM-5",
        "tier": "standard",
        "score": 7.57,
        "latency": {
          "routing_ms": 6,
          "first_token_ms": 875,
          "stream_ms": 9878
        },
        "billing": {
          "credits_used": "0.2288",
          "input_tokens": 54,
          "output_tokens": 545
        }
      }
    }
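
On the client side, the metadata event can be picked out of the stream before [DONE]. The sketch below assumes the event arrives as an ordinary SSE data: line whose JSON payload has a top-level metadata object, matching the sample above; extract_stream_metadata is a hypothetical helper, not part of any SDK.

```python
import json
from typing import Any, Iterable, Optional


def extract_stream_metadata(lines: Iterable[str]) -> Optional[dict[str, Any]]:
    """Scan SSE lines and return the routing metadata object, if present."""
    metadata = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # the metadata event is sent before this marker
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON event
        if isinstance(event, dict) and isinstance(event.get("metadata"), dict):
            metadata = event["metadata"]
    return metadata
```

Billing values such as credits_used arrive as strings in the sample, so a client that does arithmetic on them may want decimal.Decimal rather than float.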

Response Headers

| Header | Meaning |
| --- | --- |
| X-DAOE-Used-Model | Model used by provider |
| X-DAOE-Used-Provider | Provider returned by provider-adapter |
| X-DAOE-Failover | 1 when streaming provider failover happened |
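
A client can fold these headers into a small record for logging. This is a minimal sketch: ProviderInfo and read_daoe_headers are hypothetical names, and treating any value other than 1 as "no failover" is an assumption.

```python
from typing import Mapping, NamedTuple, Optional


class ProviderInfo(NamedTuple):
    model: Optional[str]
    provider: Optional[str]
    failover: bool


def read_daoe_headers(headers: Mapping[str, str]) -> ProviderInfo:
    """Collect the X-DAOE-* response headers into one record."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return ProviderInfo(
        model=lowered.get("x-daoe-used-model"),
        provider=lowered.get("x-daoe-used-provider"),
        failover=lowered.get("x-daoe-failover") == "1",
    )
```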

Troubleshooting

| Symptom | Check |
| --- | --- |
| routing_error | Router returned no selected_model or rejected candidates |
| requested tier is not allowed | API key policy and request tier |
| requested model is not allowed | API key fixed model |
| client ip is not allowed | Policy snapshot IP allowlist |
| wallet balance is insufficient | Wallet balance, budget, and freeze amount for the selected tier |