POST /v4/beta/chat/completions

Authorizations

x-api-key
string
header
required

Body

application/json
messages
object[]
required

List of messages in the OpenAI standard message format.

model
string
required

Model specified as model_vendor/model, for example openai/gpt-4o.
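
A minimal request sketch in Python. The base URL is hypothetical (substitute the real host serving this API); only messages, model, and the x-api-key header are required:

import requests

# Hypothetical base URL; replace with the real host serving this API.
BASE_URL = "https://api.example.com"

response = requests.post(
    f"{BASE_URL}/v4/beta/chat/completions",
    headers={"x-api-key": "YOUR_API_KEY", "Content-Type": "application/json"},
    json={
        "model": "openai/gpt-4o",  # model_vendor/model
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
    },
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])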

audio
object

Parameters for audio output. Required when audio output is requested with modalities: ['audio'].
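
If this endpoint mirrors the OpenAI audio-output shape, a request for audio pairs this object with modalities (documented below). A hedged payload sketch; the model id, voice, and format values are assumptions:

payload = {
    "model": "openai/gpt-4o-audio-preview",  # assumed audio-capable model id
    "modalities": ["text", "audio"],         # request audio alongside text
    "audio": {"voice": "alloy", "format": "wav"},  # assumed OpenAI-style fields
    "messages": [{"role": "user", "content": "Say hello."}],
}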

frequency_penalty
number

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far.

Required range: -2 <= x <= 2

function_call
object

Deprecated in favor of tool_choice. Controls which function is called by the model.

functions
object[]

Deprecated in favor of tools. A list of functions the model may generate JSON inputs for.

logit_bias
object

Modify the likelihood of specified tokens appearing in the completion. Maps tokens to bias values from -100 to 100.
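
For example, to steer the model away from one token, map its tokenizer ID to a strongly negative bias. The token ID below is illustrative and depends on the model's tokenizer:

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Pick a number."}],
    # Keys are token IDs (as strings); values range from -100 to 100.
    # -100 effectively bans a token, 100 effectively forces it when reachable.
    "logit_bias": {"1234": -100},  # illustrative token ID
}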

logprobs
boolean

Whether to return log probabilities of the output tokens or not.

max_completion_tokens
integer

An upper bound for the number of tokens that can be generated, including visible output tokens and reasoning tokens.

max_tokens
integer

Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate.

metadata
object

Developer-defined tags and values used for filtering completions in the dashboard.

modalities
string[]

Output types that you would like the model to generate for this request.

n
integer

How many chat completion choices to generate for each input message.

parallel_tool_calls
boolean

Whether to enable parallel function calling during tool use.

prediction
object

Static predicted output content, such as the content of a text file being regenerated.
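
A sketch assuming the OpenAI predicted-outputs shape, where matching spans of the prediction are reused instead of regenerated to cut latency:

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Rename x to count in this file."}],
    # Assumed OpenAI-style shape: type "content" plus the predicted text.
    "prediction": {"type": "content", "content": "def add(a, b):\n    return a + b\n"},
}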

presence_penalty
number

Number between -2.0 and 2.0. Positive values penalize tokens based on whether they appear in the text so far.

Required range: -2 <= x <= 2

reasoning_effort
string

For o1 models only. Constrains effort on reasoning. Values: low, medium, high.

response_format
object

An object specifying the format that the model must output.
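
For example, forcing syntactically valid JSON output (a json_schema variant carrying a full schema is also common in OpenAI-compatible APIs, but its support here is an assumption):

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "List three colors as JSON."}],
    "response_format": {"type": "json_object"},  # model must emit valid JSON
}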

seed
integer

If specified, the system will attempt to sample deterministically, so that repeated requests with the same seed and parameters return the same result.

stop

Up to 4 sequences where the API will stop generating further tokens.
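
A sketch combining seed and stop: identical requests with the same seed should be reproducible on a best-effort basis, and generation halts before any stop sequence is emitted:

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Count upward from 1."}],
    "seed": 42,              # best-effort determinism across identical requests
    "stop": ["10", "\n\n"],  # up to 4 sequences; output ends before a match
}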

store
boolean

Whether to store the output for use in model distillation or evals products.
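
Stored completions can be tagged via metadata (above) for later filtering in the dashboard; the tag names here are illustrative:

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Summarize the launch plan."}],
    "store": True,                                    # persist for distillation/evals
    "metadata": {"project": "launch", "env": "dev"},  # illustrative tags
}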

stream
boolean

If true, partial message deltas will be sent as server-sent events.

stream_options
object

Options for streaming response. Only set this when stream is true.
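
A hedged streaming sketch: read the SSE stream line by line, parse each data: payload, and stop at the [DONE] sentinel (assumed here to follow the OpenAI convention, as is the include_usage option):

import json
import requests

response = requests.post(
    "https://api.example.com/v4/beta/chat/completions",  # hypothetical base URL
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Write a haiku."}],
        "stream": True,
        "stream_options": {"include_usage": True},  # assumed OpenAI-style option
    },
    stream=True,  # tell requests not to buffer the whole body
)
for line in response.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    data = line[len(b"data: "):]
    if data == b"[DONE]":  # assumed end-of-stream sentinel
        break
    chunk = json.loads(data)
    if chunk.get("choices"):  # a final usage chunk may carry no choices
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)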

temperature
number

What sampling temperature to use. Higher values make output more random, lower more focused.

Required range: 0 <= x <= 2

tool_choice

Controls which tool is called by the model. Values: none, auto, required, or specific tool.

tools
object[]

A list of tools the model may call. Currently, only functions are supported. Max 128 functions.
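
A function-tool sketch in the OpenAI tool format; the get_weather tool is illustrative. tool_choice "auto" lets the model decide, and parallel_tool_calls (above) allows several calls in one turn:

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "tool_choice": "auto",  # or "none", "required", or a specific tool object
    "parallel_tool_calls": True,
}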

top_k
integer

Only sample from the top K options for each subsequent token.

top_logprobs
integer

Number of most likely tokens to return at each position, with associated log probability.

Required range: 0 <= x <= 20

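To inspect token probabilities, enable logprobs and set top_logprobs; per-token alternatives are then expected in the response (the exact path below assumes the OpenAI shape):

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Yes or no?"}],
    "logprobs": True,
    "top_logprobs": 5,  # 0-20 alternatives per token position
}
# Expected location in the response (OpenAI shape, assumed):
# choices[0].logprobs.content[i].top_logprobs
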
top_p
number

Alternative to temperature. Only tokens comprising top_p probability mass are considered.

Required range: 0 <= x <= 1
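
The sampling knobs above combine in one payload; the usual advice is to tune temperature or top_p, not both (both appear below purely for illustration):

payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Brainstorm product names."}],
    "temperature": 1.2,        # more random output
    "top_p": 0.9,              # nucleus sampling: keep the top 90% probability mass
    "top_k": 40,               # consider only the 40 most likely tokens
    "frequency_penalty": 0.5,  # discourage verbatim repetition
    "presence_penalty": 0.5,   # encourage new topics
    "max_completion_tokens": 256,
}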

Response

200 - application/json
choices
object[]
required

created
integer
required

id
string
required

model
string
required

object
enum<string>
default: chat.completion

Available options: chat.completion

service_tier
enum<string>

Available options: scale, default

system_fingerprint
string

usage
object
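
A sketch of reading these fields; with n > 1 in the request, choices carries one entry per generated completion (base URL hypothetical, usage field names assumed to follow the OpenAI shape):

import requests

resp = requests.post(
    "https://api.example.com/v4/beta/chat/completions",  # hypothetical base URL
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Two taglines, please."}],
        "n": 2,  # ask for two completion choices
    },
).json()

print(resp["id"], resp["model"], resp["created"])  # identifiers and creation timestamp
for choice in resp["choices"]:
    print(choice["message"]["content"])  # one message per choice
usage = resp.get("usage", {})
print(usage.get("prompt_tokens"), usage.get("completion_tokens"))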