POST
/
v1
/
chat
/
completions

Authorizations

Authorization
string
headerrequired

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json
model
enum<string>
required

ID of the model to use

Available options:
dolphin-2.9-llama3-8b,
hermes-3-llama3.1-8b,
meta-llama/llama-3-70b-instruct,
meta-llama/llama-3.1-405b-instruct,
mistralai/mistral-7b-instruct,
mistralai/mixtral-8x22b-instruct,
mistralai/mixtral-8x7b-instruct,
nvidia/llama-3.1-nemotron-70b-instruct,
openhermes-mixtral-8x7b-gptq,
qwen/qwen-2.5-coder-32b-instruct,
theia-llama-3.1-8b
messages
object[]
required

A list of messages comprising the conversation so far

max_tokens
integer
default: 1024

The maximum number of tokens to generate

temperature
number

Sampling temperature between 0 and 1

Required range: 0 < x < 1
stream
boolean
default: false

Whether to stream back partial progress

Response

200 - application/json
id
string

Unique identifier for the completion

object
enum<string>
Available options:
chat.completion
created
integer

Unix timestamp of when the completion was created

model
string

Model ID used

choices
object[]