POST /v5/responses

Responses

Example request:
curl --request POST \
  --url https://api.egp.scale.com/v5/responses \
  --header 'Content-Type: application/json' \
  --header 'x-api-key: <api-key>' \
  --data '
{
  "model": "<string>",
  "input": "<string>",
  "include": [
    "<string>"
  ],
  "instructions": "<string>",
  "max_output_tokens": 123,
  "metadata": {},
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {},
  "store": true,
  "stream": true,
  "temperature": 1,
  "text": {},
  "tool_choice": "<string>",
  "tools": [
    {}
  ],
  "top_p": 0.5,
  "truncation": "auto"
}
'
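The same request body can be assembled in Python before sending. A minimal sketch that validates the documented parameter ranges (temperature in [0, 2], top_p in [0, 1]) and the model_vendor/model naming convention; the URL and header names are taken from the curl example above, and the actual HTTP send is left to whichever client you use:

```python
import json

API_URL = "https://api.egp.scale.com/v5/responses"

def build_responses_request(api_key, model, input, *, temperature=None, top_p=None, **extra):
    """Assemble headers and a JSON body for POST /v5/responses.

    Checks the documented constraints before anything is sent:
    model must look like model_vendor/model, temperature must lie in
    [0, 2], and top_p must lie in [0, 1].
    """
    if "/" not in model:
        raise ValueError("model must be specified as model_vendor/model, e.g. openai/gpt-4o")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be in [0, 1]")

    body = {"model": model, "input": input, **extra}
    if temperature is not None:
        body["temperature"] = temperature
    if top_p is not None:
        body["top_p"] = top_p

    headers = {"Content-Type": "application/json", "x-api-key": api_key}
    return headers, json.dumps(body)

# Mirrors the curl call above; no network call is made here.
headers, payload = build_responses_request(
    "<api-key>", "openai/gpt-4o", "Hello", temperature=1, top_p=0.5, truncation="auto"
)
```

Anything not validated explicitly (instructions, tools, store, and so on) passes through untouched in `extra`, so the helper stays in step with the full field list below.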
Example response:

{
  "id": "<string>",
  "created_at": 123,
  "model": "<string>",
  "output": [
    {
      "id": "<string>",
      "content": [
        "<unknown>"
      ],
      "role": "<string>",
      "status": "in_progress",
      "type": "<string>",
      "phase": "commentary"
    }
  ],
  "parallel_tool_calls": true,
  "tool_choice": "none",
  "tools": [
    {
      "name": "<string>",
      "type": "<string>",
      "parameters": {},
      "strict": true,
      "defer_loading": true,
      "description": "<string>"
    }
  ],
  "object": "response",
  "error": {
    "code": "server_error",
    "message": "<string>"
  },
  "incomplete_details": {
    "reason": "max_output_tokens"
  },
  "instructions": "<string>",
  "metadata": {},
  "temperature": 123,
  "top_p": 123,
  "background": true,
  "completed_at": 123,
  "conversation": {
    "id": "<string>"
  },
  "max_output_tokens": 123,
  "max_tool_calls": 123,
  "previous_response_id": "<string>",
  "prompt": {
    "id": "<string>",
    "variables": {},
    "version": "<string>"
  },
  "prompt_cache_key": "<string>",
  "prompt_cache_retention": "in-memory",
  "reasoning": {
    "effort": "none",
    "generate_summary": "auto",
    "summary": "auto"
  },
  "safety_identifier": "<string>",
  "service_tier": "auto",
  "status": "completed",
  "text": {
    "format": {
      "type": "<string>"
    },
    "verbosity": "low"
  },
  "top_logprobs": 123,
  "truncation": "auto",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "reasoning_tokens": 123
    },
    "total_tokens": 123
  },
  "user": "<string>"
}
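A caller usually only needs a few of the fields above. A sketch of pulling them out of an already-parsed response dict, including a sanity check that total_tokens equals the sum of input and output token counts (the terminal-status set is taken from the status enum documented below; the helper itself is illustrative, not part of the API):

```python
# Statuses after which the response will not change further,
# per the status enum in this reference.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "incomplete"}

def summarize_response(resp):
    """Extract the commonly used fields from a /v5/responses result dict."""
    usage = resp.get("usage") or {}
    total = usage.get("total_tokens")
    # total_tokens should equal input_tokens + output_tokens.
    consistent = (
        total is None
        or total == usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    )
    return {
        "id": resp.get("id"),
        "status": resp.get("status"),
        "done": resp.get("status") in TERMINAL_STATUSES,
        "error": (resp.get("error") or {}).get("message"),
        "n_output_items": len(resp.get("output") or []),
        "usage_consistent": consistent,
    }

# Shaped like the example response above, with placeholder values filled in.
sample = {
    "id": "resp_1",
    "status": "completed",
    "output": [{"id": "msg_1", "type": "message", "role": "assistant", "content": ["hi"]}],
    "usage": {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
}
summary = summarize_response(sample)
```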


Authorizations

x-api-key
string
header
required

Headers

x-selected-account-id
string | null

Body

application/json
model
string
required

Model specified as model_vendor/model, for example openai/gpt-4o.

input
required
include
string[]

Which fields to include in the response

instructions
string

Instructions for the response generation

max_output_tokens
integer

Maximum number of output tokens

metadata
Metadata · object

Metadata for the response

parallel_tool_calls
boolean

Whether to enable parallel tool calls

previous_response_id
string

ID of the previous response for chaining

reasoning
Reasoning · object

Reasoning configuration for the response

store
boolean

Whether to store the response

stream
boolean

Whether to stream the response

temperature
number

Sampling temperature for randomness control

Required range: 0 <= x <= 2
text
Text · object

Text configuration parameters

tool_choice

Tool choice configuration

tools
Tools · object[]

Tools available for the response

top_p
number

Top-p sampling parameter

Required range: 0 <= x <= 1
truncation
enum<string>

Truncation configuration

Available options:
auto,
disabled
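Multi-turn use works by feeding one response's id back as previous_response_id on the next request. A sketch of how consecutive request bodies relate (field names come from this reference; the turn-loop shape and store=true pairing are assumptions about typical usage):

```python
def next_turn_body(model, user_input, previous_response=None):
    """Build a follow-up request body chained to an earlier response.

    previous_response_id points the server at the stored earlier
    response so the conversation can continue from it; store=True
    keeps this response available for the turn after.
    """
    body = {"model": model, "input": user_input, "store": True}
    if previous_response is not None:
        body["previous_response_id"] = previous_response["id"]
    return body

# First turn has no predecessor; the second chains to the first's id.
first = next_turn_body("openai/gpt-4o", "What is 2+2?")
second = next_turn_body("openai/gpt-4o", "And times 3?", {"id": "resp_abc"})
```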

Response

Successful Response

id
string
required
created_at
number
required
model
required
output
(ResponseOutputMessage · object | ResponseFileSearchToolCall · object | ResponseFunctionToolCall · object | ResponseFunctionToolCallOutputItem · object | ResponseFunctionWebSearch · object | ResponseComputerToolCall · object | ResponseComputerToolCallOutputItem · object | ResponseReasoningItem · object | ResponseToolSearchCall · object | ResponseToolSearchOutputItem · object | ResponseCompactionItem · object | ImageGenerationCall · object | ResponseCodeInterpreterToolCall · object | LocalShellCall · object | LocalShellCallOutput · object | ResponseFunctionShellToolCall · object | ResponseFunctionShellToolCallOutput · object | ResponseApplyPatchToolCall · object | ResponseApplyPatchToolCallOutput · object | McpCall · object | McpListTools · object | McpApprovalRequest · object | McpApprovalResponse · object | ResponseCustomToolCall · object | ResponseCustomToolCallOutputItem · object)[]
required

Output items generated by the model; each element is one of the item types listed above.

parallel_tool_calls
boolean
required
tool_choice
required
Available options:
none,
auto,
required
tools
(FunctionTool · object | FileSearchTool · object | ComputerTool · object | ComputerUsePreviewTool · object | WebSearchTool · object | Mcp · object | CodeInterpreter · object | ImageGeneration · object | LocalShell · object | FunctionShellTool · object | CustomTool · object | NamespaceTool · object | ToolSearchTool · object | WebSearchPreviewTool · object | ApplyPatchTool · object)[]
required

Tools available to the model. The FunctionTool variant defines a function in your own code that the model can choose to call.

object
string
default:response
Allowed value: "response"
error
ResponseError · object

An error object returned when the model fails to generate a Response.

incomplete_details
IncompleteDetails · object

Details about why the response is incomplete.

instructions
metadata
Metadata · object
temperature
number
top_p
number
background
boolean
completed_at
number
conversation
Conversation · object

The conversation that this response belonged to.

Input items and output items from this response were automatically added to this conversation.

max_output_tokens
integer
max_tool_calls
integer
previous_response_id
string
prompt
ResponsePrompt · object

Reference to a prompt template and its variables.

prompt_cache_key
string
prompt_cache_retention
enum<string>
Available options:
in-memory,
24h
reasoning
Reasoning · object

gpt-5 and o-series models only

Configuration options for reasoning models.

safety_identifier
string
service_tier
enum<string>
Available options:
auto,
default,
flex,
scale,
priority
status
enum<string>
Available options:
completed,
failed,
in_progress,
cancelled,
queued,
incomplete
text
ResponseTextConfig · object

Configuration options for a text response from the model.

Can be plain text or structured JSON data.

top_logprobs
integer
truncation
enum<string>
Available options:
auto,
disabled
usage
ResponseUsage · object

Represents token usage details including input tokens, output tokens, a breakdown of output tokens, and the total tokens used.

user
string