POST
/
v1
/
chat
/
completions

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
messages
object[]
required

A list of messages comprising the conversation so far

model
enum<string>
required

ID of the model to use

Available options:
hermes-3-llama3.1-8b,
meta-llama/llama-3.3-70b-instruct,
mistralai/mistral-7b-instruct,
mistralai/mixtral-8x22b-instruct,
mistralai/mixtral-8x7b-instruct,
nvidia/llama-3.1-nemotron-70b-instruct,
qwen/qwen-2.5-coder-32b-instruct
max_tokens
integer
default:
1024

The maximum number of tokens to generate

stream
boolean
default:
false

Whether to stream back partial progress

temperature
number

Sampling temperature between 0 and 1

Required range: 0 < x < 1

Response

200 - application/json
choices
object[]
created
integer

Unix timestamp of when the completion was created

id
string

Unique identifier for the completion

model
string

Model ID used

object
enum<string>
Available options:
chat.completion