Chat Completions

This section documents the endpoints implemented for this capability.

Create chat completion

POST /v1/chat/completions

Create a model response from the conversation history. Supports both streaming and non-streaming responses.

Compatible with the OpenAI Chat Completions API.

Authentication

  • Bearer Token (Authorization: Bearer <token>)
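A minimal sketch of assembling the required headers. The API_TOKEN environment-variable name is an assumption for illustration, not part of this spec:

```python
import os

def auth_headers(token=None):
    """Build request headers with a bearer token.

    Reads API_TOKEN from the environment when no token is passed;
    that variable name is a placeholder, not defined by this API.
    """
    token = token or os.environ.get("API_TOKEN", "")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```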

Request Body

  • Content-Type: application/json
  • Schema: ChatCompletionRequest
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | Model ID |
| messages | array[Message] | yes | Conversation message list |
| temperature | number | no | Sampling temperature |
| top_p | number | no | Nucleus sampling parameter |
| n | integer | no | Number of generations |
| stream | boolean | no | Whether to stream the response |
| stream_options | object | no | Streaming options (only used when stream is true) |
| stop | - | no | Stop sequences |
| max_tokens | integer | no | Maximum generated token count |
| max_completion_tokens | integer | no | Maximum completion token count |
| presence_penalty | number | no | Penalty on tokens already present in the output |
| frequency_penalty | number | no | Penalty proportional to a token's frequency so far |
| logit_bias | object | no | Per-token logit bias map |
| user | string | no | End-user identifier |
| tools | array[Tool] | no | Tool definitions the model may call |
| tool_choice | - | no | Controls which tool, if any, is called |
| response_format | object (ResponseFormat) | no | Output format constraint |
| seed | integer | no | Seed for best-effort deterministic sampling |
| reasoning_effort | string | no | Reasoning effort (for models that support reasoning) |
| modalities | array[string] | no | Requested output modalities |
| audio | object | no | Audio output parameters |
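A sketch of building a ChatCompletionRequest body from the required and a few optional fields above. The model ID "example-model" is a placeholder; optional fields are omitted from the payload when unset:

```python
import json

def build_chat_request(model, messages, stream=False,
                       temperature=None, max_tokens=None):
    """Assemble a request body matching the schema above.

    Only model and messages are required; optional parameters are
    added to the payload only when explicitly provided.
    """
    body = {"model": model, "messages": messages, "stream": stream}
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body

request = build_chat_request(
    model="example-model",  # hypothetical model ID
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,
)
payload = json.dumps(request)  # JSON string for the POST body
```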

Responses

| Status | Description | Schema |
| --- | --- | --- |
| 200 | Response created successfully | ChatCompletionResponse |
| 400 | Invalid request parameters | ErrorResponse |
| 429 | Rate limit exceeded | ErrorResponse |
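When stream is true, OpenAI-compatible APIs typically deliver the 200 response as server-sent events, one data: line per chunk, terminated by data: [DONE]; that wire format is an assumption here, inferred from the OpenAI compatibility noted above. A sketch of reassembling the streamed text:

```python
import json

def parse_sse_stream(lines):
    """Yield parsed chunk objects from a streamed response.

    Assumes the OpenAI streaming convention: each event is a
    'data: <json>' line, and 'data: [DONE]' ends the stream.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)

# Example chunks in the assumed delta format:
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
text = "".join(
    chunk["choices"][0]["delta"].get("content", "")
    for chunk in parse_sse_stream(sample)
)
# text == "Hello"
```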