LLM Mart

Docs

OpenAI-compatible inference, paid in USDC on Solana. One signature, then it's just API calls.

1

Endpoint

POST /api/inference/v1/chat/completions

Drop-in replacement for the OpenAI Chat Completions endpoint. Send your inf_ key as a bearer token. Streaming and non-streaming both supported.

2

Routing

For each request the router tries, in order:

  1. 1. Your priority key (if set in Router Settings).
  2. 2. The cheapest healthy marketplace offer where you have credits with the seller.
  3. 3. Your fallback key (if set).
3

Billing

Buyer is charged the seller's quoted price per million tokens. Credits are pre-purchased from /markets in USDC. Platform takes a 1% fee at top-up time; the rest goes straight to the seller's wallet.

When credits with a seller run out, the API returns 402 insufficient_credits.

4

Settlement

Top-ups are a single Solana transaction with two SPL transfers — seller wallet and treasury — in one buyer-signed tx. USDC never sits in the platform's custody for the seller portion.

5

Compatibility

Works out of the box with:

  • curl / fetch
  • OpenAI Python / TS SDK (base_url)
  • Anthropic SDK (ANTHROPIC_BASE_URL)
  • Claude Code (same env override)