LLM Gateway
Chat Completion Endpoint
OpenAI-compatible chat completion endpoint for LLM interactions.
POST /v1/chat/completions
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
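As a rough illustration of the header format above, the following sketch builds (but does not send) a POST request to the endpoint path with a Bearer token. The host in BASE_URL and the helper name build_request are hypothetical placeholders, not part of this reference; substitute the gateway's real base URL.

```python
import urllib.request

# Hypothetical host: replace with the actual gateway base URL.
BASE_URL = "https://gateway.example.com"


def build_request(token: str, body: bytes) -> urllib.request.Request:
    """Build a POST request to /v1/chat/completions with Bearer auth."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            # Header form described above: "Bearer <token>"
            "Authorization": f"Bearer {token}",
            # The body is documented as application/json
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would perform the call; keeping construction separate makes the header easy to inspect before any network traffic happens.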
Body
application/json
ID of the model to use
Available options: dolphin-2.9-llama3-8b, hermes-3-llama3.1-8b, meta-llama/llama-3-70b-instruct, meta-llama/llama-3.1-405b-instruct, mistralai/mistral-7b-instruct, mistralai/mixtral-8x22b-instruct, mistralai/mixtral-8x7b-instruct, nvidia/llama-3.1-nemotron-70b-instruct, openhermes-mixtral-8x7b-gptq, qwen/qwen-2.5-coder-32b-instruct, theia-llama-3.1-8b
A list of messages comprising the conversation so far
The maximum number of tokens to generate
Sampling temperature between 0 and 1
Required range: 0 < x < 1
Whether to stream back partial progress
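The body fields described above could be assembled as in the sketch below. This page does not show the JSON field names, so model, messages, max_tokens, temperature, and stream are assumptions taken from the usual OpenAI convention, as is the role/content shape of each message; the function name build_chat_body is hypothetical. The temperature check follows the documented range 0 < x < 1.

```python
import json
from typing import Optional

# Model IDs listed under "Available options" in this reference.
ALLOWED_MODELS = {
    "dolphin-2.9-llama3-8b",
    "hermes-3-llama3.1-8b",
    "meta-llama/llama-3-70b-instruct",
    "meta-llama/llama-3.1-405b-instruct",
    "mistralai/mistral-7b-instruct",
    "mistralai/mixtral-8x22b-instruct",
    "mistralai/mixtral-8x7b-instruct",
    "nvidia/llama-3.1-nemotron-70b-instruct",
    "openhermes-mixtral-8x7b-gptq",
    "qwen/qwen-2.5-coder-32b-instruct",
    "theia-llama-3.1-8b",
}


def build_chat_body(
    model: str,
    messages: list,
    max_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    stream: bool = False,
) -> str:
    """Return a JSON request body; field names are assumed, not confirmed."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {model}")
    if not messages:
        raise ValueError("messages must contain at least one entry")
    # Documented required range is the open interval 0 < x < 1.
    if temperature is not None and not (0 < temperature < 1):
        raise ValueError("temperature must satisfy 0 < x < 1")

    body = {"model": model, "messages": messages, "stream": stream}
    # Optional fields are only sent when the caller sets them.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if temperature is not None:
        body["temperature"] = temperature
    return json.dumps(body)
```

Validating on the client side is optional; it simply surfaces an out-of-range temperature or a misspelled model ID before the request leaves the machine.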