POST
/
v4
/
completions

Authorizations

x-api-key
string
headerrequired

Body

application/json
account_id
string

The account ID to use for usage tracking. This will be gradually enforced.

model
required

The ID of the model to use for completions.

Users have two options:

  • Option 1: Use one of the supported models from the dropdown.
  • Option 2: Enter the ID of a custom model.

Note: For custom models we currently only support models finetuned using using the Scale-hosted LLM-Engine API.

Available options:
gpt-4,
gpt-4-0613,
gpt-4-32k,
gpt-4-32k-0613,
gpt-4-vision-preview,
gpt-4o,
gpt-4o-mini,
gpt-4o-2024-08-06,
gpt-3.5-turbo,
gpt-3.5-turbo-0613,
gpt-3.5-turbo-16k,
gpt-3.5-turbo-16k-0613,
gemini-pro,
gemini-1.5-pro-001,
gemini-1.5-pro-preview-0409,
gemini-1.5-pro-preview-0514,
text-davinci-003,
text-davinci-002,
text-curie-001,
text-babbage-001,
text-ada-001,
claude-instant-1,
claude-instant-1.1,
claude-2,
claude-2.0,
llama-7b,
llama-2-7b,
llama-2-7b-chat,
llama-2-13b,
llama-2-13b-chat,
llama-2-70b,
llama-2-70b-chat,
llama-3-8b,
llama-3-8b-instruct,
llama-3-1-8b-instruct,
llama-3-1-70b-instruct,
llama-3-70b-instruct,
Meta-Llama-3-8B-Instruct-RMU,
Meta-Llama-3-8B-Instruct-RR,
Meta-Llama-3-8B-Instruct-DERTA,
Meta-Llama-3-8B-Instruct-LAT,
falcon-7b,
falcon-7b-instruct,
falcon-40b,
falcon-40b-instruct,
mpt-7b,
mpt-7b-instruct,
flan-t5-xxl,
mistral-7b,
mistral-7b-instruct,
mixtral-8x7b,
mixtral-8x7b-instruct,
mixtral-8x22b-instruct,
llm-jp-13b-instruct-full,
llm-jp-13b-instruct-full-dolly,
zephyr-7b-alpha,
zephyr-7b-beta,
zephyr-cat-merged,
codellama-7b,
codellama-7b-instruct,
codellama-13b,
codellama-13b-instruct,
codellama-34b,
codellama-34b-instruct,
phi-3-mini-4k-instruct,
phi-3-cat-merged,
dolphin-2.9-llama3-8b,
dolphin-2.9-llama3-70b
prompt
string
required

Prompt for which to generate the completion.

Good prompt engineering is crucial to getting performant results from the model. If you are having trouble getting the model to perform well, try writing a more specific prompt here before trying more expensive techniques such as swapping in other models or finetuning the underlying LLM.

images
object[]

List of image urls to be used for image based completions. Leave empty for text based completions.

model_parameters
object

Configuration parameters for the completion model, such as temperature, max_tokens, and stop_sequences.

If not specified, the default value are:

  • temperature: 0.2
  • max_tokens: None (limited by the model's max tokens)
  • stop_sequences: None
stream
boolean
default: false

Whether or not to stream the response.

Setting this to True will stream the completion in real-time.

Response

200 - application/json
completion
object
required

The actual completion text and the finish reason.

token_usage
object

Token usage numbers. If streaming, this field is null until the stream completes, at which point it will be populated (if supported).