Chat Completions

This section documents the endpoints implemented for this capability.

Create chat completion

POST /v1/chat/completions

Create a model response from the conversation history. Supports both streaming and non-streaming responses.

Compatible with the OpenAI Chat Completions API.

Authentication

  • Bearer Token (Authorization: Bearer <token>)
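A minimal sketch of assembling the required headers. The API_TOKEN environment-variable name is an assumption for illustration, not part of this spec:

```python
import os

def auth_headers(token=None):
    """Build request headers with a bearer token.

    Reads API_TOKEN from the environment when no token is passed;
    that variable name is a placeholder, not defined by this API.
    """
    token = token or os.environ.get("API_TOKEN", "")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```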

Request Body

  • Content-Type: application/json
  • Schema: ChatCompletionRequest
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | Model ID |
| messages | array[Message] | yes | Conversation message list |
| temperature | number | no | Sampling temperature |
| top_p | number | no | Nucleus sampling parameter |
| n | integer | no | Number of generations |
| stream | boolean | no | Whether to stream the response |
| stream_options | object | no | Streaming options (only used when stream is true) |
| stop | - | no | Stop sequences |
| max_tokens | integer | no | Maximum generated token count |
| max_completion_tokens | integer | no | Maximum completion token count |
| presence_penalty | number | no | Penalty on tokens already present in the output |
| frequency_penalty | number | no | Penalty proportional to a token's frequency so far |
| logit_bias | object | no | Per-token logit bias map |
| user | string | no | End-user identifier |
| tools | array[Tool] | no | Tool definitions the model may call |
| tool_choice | - | no | Controls which tool, if any, is called |
| response_format | object (ResponseFormat) | no | Output format constraint |
| seed | integer | no | Seed for best-effort deterministic sampling |
| reasoning_effort | string | no | Reasoning effort (for models that support reasoning) |
| modalities | array[string] | no | Requested output modalities |
| audio | object | no | Audio output parameters |
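A sketch of building a ChatCompletionRequest body from the required and a few optional fields above. The model ID "example-model" is a placeholder; optional fields are omitted from the payload when unset:

```python
import json

def build_chat_request(model, messages, stream=False,
                       temperature=None, max_tokens=None):
    """Assemble a request body matching the schema above.

    Only model and messages are required; optional parameters are
    added to the payload only when explicitly provided.
    """
    body = {"model": model, "messages": messages, "stream": stream}
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body

request = build_chat_request(
    model="example-model",  # hypothetical model ID
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
)
payload = json.dumps(request)  # JSON string for the POST body
```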

Responses

| Status | Description | Schema |
| --- | --- | --- |
| 200 | Response created successfully | ChatCompletionResponse |
| 400 | Invalid request parameters | ErrorResponse |
| 429 | Rate limit exceeded | ErrorResponse |
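When stream is true, OpenAI-compatible APIs typically deliver the 200 response as server-sent events, one data: line per chunk, terminated by data: [DONE]; that wire format is an assumption here, inferred from the OpenAI compatibility noted above. A sketch of reassembling the streamed text:

```python
import json

def parse_sse_stream(lines):
    """Yield parsed chunk objects from a streamed response.

    Assumes the OpenAI streaming convention: each event is a
    'data: <json>' line, and 'data: [DONE]' ends the stream.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Example chunks in the assumed delta format:
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in parse_sse_stream(sample)
)
# text == "Hello"
```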