Create chat completion
OpenAI-compatible chat completions endpoint, multiplexed across every model Gumloop supports
(Anthropic, OpenAI, Google Gemini, OpenRouter routes). Set stream: true for Server-Sent Events,
or omit it for a unary JSON response. Image-generation models (gpt-image-*, gemini-*-image-preview,
dall-e-*) are dispatched automatically when modalities includes "image" and yield image
attachments on choices[0].message.images.
Streaming host
Chat completions live on the streaming host. Send all requests — unary or streaming — to:
POST https://ws.gumloop.com/api/v1/chat/completions
api.gumloop.com does not serve this endpoint; the Python SDK routes there automatically.
Billing
Each completion charges the caller’s credit balance based on token usage (with cache-token semantics per provider) plus a flat 30-credit fee for image-gen calls. Users who configure their own provider API key get a 50% discount.
Authorizations
Body
Model slug. Use the id from GET /models or one of Gumloop's preset routes.
"claude-sonnet-4-5"
Conversation history. Roles system, user, assistant. Multipart content (text + image) is supported on the user role.
[
{
"role": "user",
"content": "Capital of Canada?"
}
]When true, the response is text/event-stream carrying one chat.completion.chunk per delta and terminating with data: [DONE]. Otherwise a unary JSON chat.completion is returned.
Sampling temperature.
Cap on completion tokens. Replaces the deprecated max_tokens field.
Output modalities. Include "image" to route to an image-generation model.
text, image Image-generation parameters (size, quality, aspect_ratio, background, output_format, partial_images). Optional. Image-generation models accept either modalities: ["image"] or image_config (or both); chat models ignore this field.
Constrain the response. {type: "json_object"} returns a JSON object; {type: "json_schema", json_schema: {name, strict, schema}} returns JSON matching the supplied schema.
OpenAI-shape tool definitions ({type: "function", function: {name, description, parameters}}). Pass tool_choice to constrain selection.
Force a specific tool, leave unset for the model to choose, or pass "none" to disable tool calls.
OpenRouter provider routing config. Caller fields like sort and order are honored; ZDR/data_collection policy is server-enforced.
Response
Chat completion. When stream is false (or omitted), the response is the chat.completion envelope below. When stream: true, the response is text/event-stream; each event carries one chat.completion.chunk and the terminator is data: [DONE].
