Documentation Index
Fetch the complete documentation index at: https://docs.gp.scale.com/llms.txt
Use this file to discover all available pages before exploring further.
Model, specified as model_vendor/model, for example openai/gpt-4o
Which fields to include in the response
Instructions for the response generation
Maximum number of output tokens
Metadata for the response
Whether to enable parallel tool calls
ID of the previous response for chaining
Reasoning configuration for the response
Whether to store the response
Whether to stream the response
Sampling temperature for randomness control (0 <= x <= 2)
Text configuration parameters
Tool choice configuration
Tools available for the response
Top-p sampling parameter (0 <= x <= 1)
Truncation configuration (auto, disabled)
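The request parameters above can be sketched as a payload. This is a minimal illustration only: the field names follow the common Responses-style convention (model, instructions, max_output_tokens, temperature, top_p, truncation, and so on) and are assumptions, not confirmed by this page.

```python
# Sketch of a request body using the parameters described above.
# All field names are assumptions based on a typical Responses-style API.
request_body = {
    "model": "openai/gpt-4o",       # model_vendor/model format
    "instructions": "Answer concisely.",
    "max_output_tokens": 256,
    "metadata": {"session": "demo"},
    "parallel_tool_calls": True,
    "store": False,
    "stream": False,
    "temperature": 0.7,             # documented range: 0 <= x <= 2
    "top_p": 0.9,                   # documented range: 0 <= x <= 1
    "truncation": "auto",           # one of: auto, disabled
}

# Validate the documented constraints before sending.
assert 0 <= request_body["temperature"] <= 2
assert 0 <= request_body["top_p"] <= 1
assert request_body["truncation"] in ("auto", "disabled")
```

The range checks mirror the constraints stated above; a real client would send this body to the create-response endpoint.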

Successful Response
An output message from the model.
Tool choice options: none, auto, required
Defines a function in your own code the model can choose to call. Learn more about function calling.
The object type, which is always "response".
An error object returned when the model fails to generate a Response.
Details about why the response is incomplete.
The conversation that this response belongs to. Input items and output items from this response are automatically added to this conversation.
Reference to a prompt template and its variables. Learn more.
Prompt cache retention options: in-memory, 24h
gpt-5 and o-series models only. Configuration options for reasoning models.
Service tier options: auto, default, flex, scale, priority
Status values: completed, failed, in_progress, cancelled, queued, incomplete
Configuration options for a text response from the model.
Can be plain text or structured JSON data. Learn more.
Truncation options: auto, disabled
Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used.
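Putting the response fields above together, a client might inspect the status and the token-usage breakdown as in this sketch. The response object below is fabricated for illustration, and its field names are assumptions based on a typical Responses-style API.

```python
# Illustrative response object (values invented; field names assumed).
response = {
    "object": "response",
    "status": "completed",  # one of: completed, failed, in_progress, cancelled, queued, incomplete
    "error": None,
    "incomplete_details": None,
    "usage": {
        "input_tokens": 42,
        "output_tokens": 12,
        "output_tokens_details": {"reasoning_tokens": 0},
        "total_tokens": 54,
    },
}

# Statuses after which no further updates are expected.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}

def total_tokens(resp):
    """Read the documented usage breakdown: input + output = total."""
    usage = resp["usage"]
    assert usage["input_tokens"] + usage["output_tokens"] == usage["total_tokens"]
    return usage["total_tokens"]

assert response["status"] in TERMINAL_STATUSES
assert total_tokens(response) == 54
```

A polling client would keep fetching while the status is in_progress or queued and stop once it reaches one of the terminal statuses; the usage object then gives the final token accounting.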