Question Sets
Dataset Items
Evaluations
Evaluation Items
Create Evaluation
curl --request POST \
--url https://api.egp.scale.com/v5/evaluations \
--header 'Content-Type: application/json' \
--header 'x-api-key: <api-key>' \
--data '{
"name": "<string>",
"description": "<string>",
"tags": [
"<string>"
],
"tasks": [
{
"task_type": "chat_completion",
"alias": "<string>",
"configuration": {
"top_k": 123,
"frequency_penalty": 123,
"function_call": {},
"functions": [
{}
],
"logit_bias": {},
"logprobs": true,
"max_completion_tokens": 123,
"max_tokens": 123,
"metadata": {},
"modalities": [
"<string>"
],
"n": 123,
"parallel_tool_calls": true,
"prediction": {},
"presence_penalty": 123,
"reasoning_effort": "<string>",
"response_format": {},
"seed": 123,
"stop": "<string>",
"store": true,
"temperature": 123,
"tool_choice": "<string>",
"tools": [
{}
],
"top_logprobs": 123,
"top_p": 123,
"audio": {},
"model": "<string>",
"messages": [
{}
]
}
}
],
"data": [
{}
]
}'
{
"id": "<string>",
"object": "evaluation",
"name": "<string>",
"description": "<string>",
"created_at": "2023-11-07T05:31:56Z",
"created_by_user_id": "<string>",
"archived_at": "2023-11-07T05:31:56Z",
"tags": [
"<string>"
],
"datasets": [
{
"id": "<string>",
"object": "dataset",
"name": "<string>",
"description": "<string>",
"created_at": "2023-11-07T05:31:56Z",
"created_by_user_id": "<string>",
"archived_at": "2023-11-07T05:31:56Z",
"tags": [
"<string>"
],
"current_version_num": 123
}
],
"status": "failed",
"tasks": [
{
"task_type": "chat_completion",
"alias": "<string>",
"configuration": {
"top_k": 123,
"frequency_penalty": 123,
"function_call": {},
"functions": [
{}
],
"logit_bias": {},
"logprobs": true,
"max_completion_tokens": 123,
"max_tokens": 123,
"metadata": {},
"modalities": [
"<string>"
],
"n": 123,
"parallel_tool_calls": true,
"prediction": {},
"presence_penalty": 123,
"reasoning_effort": "<string>",
"response_format": {},
"seed": 123,
"stop": "<string>",
"store": true,
"temperature": 123,
"tool_choice": "<string>",
"tools": [
{}
],
"top_logprobs": 123,
"top_p": 123,
"audio": {},
"model": "<string>",
"messages": [
{}
]
}
}
]
}
Authorizations
Body
Items to be evaluated
The tags associated with the entity
Tasks allow you to augment and evaluate your data
chat_completion
Alias to title the results column. Defaults to the task_type
Items to be evaluated
The tags associated with the entity
Tasks allow you to augment and evaluate your data
chat_completion
Alias to title the results column. Defaults to the task_type
The ID of the dataset containing the items referenced by the data
field
The tags associated with the entity
Tasks allow you to augment and evaluate your data
chat_completion
Alias to title the results column. Defaults to the task_type
Items to be evaluated
Create a reusable dataset from items in the data
field
The tags associated with the entity
Tasks allow you to augment and evaluate your data
chat_completion
Alias to title the results column. Defaults to the task_type
Response
The tags associated with the entity
The tags associated with the entity
dataset
failed
, completed
, running
evaluation
Tasks executed during evaluation. Populated with optional task
view.
chat_completion
Alias to title the results column. Defaults to the task_type
curl --request POST \
--url https://api.egp.scale.com/v5/evaluations \
--header 'Content-Type: application/json' \
--header 'x-api-key: <api-key>' \
--data '{
"name": "<string>",
"description": "<string>",
"tags": [
"<string>"
],
"tasks": [
{
"task_type": "chat_completion",
"alias": "<string>",
"configuration": {
"top_k": 123,
"frequency_penalty": 123,
"function_call": {},
"functions": [
{}
],
"logit_bias": {},
"logprobs": true,
"max_completion_tokens": 123,
"max_tokens": 123,
"metadata": {},
"modalities": [
"<string>"
],
"n": 123,
"parallel_tool_calls": true,
"prediction": {},
"presence_penalty": 123,
"reasoning_effort": "<string>",
"response_format": {},
"seed": 123,
"stop": "<string>",
"store": true,
"temperature": 123,
"tool_choice": "<string>",
"tools": [
{}
],
"top_logprobs": 123,
"top_p": 123,
"audio": {},
"model": "<string>",
"messages": [
{}
]
}
}
],
"data": [
{}
]
}'
{
"id": "<string>",
"object": "evaluation",
"name": "<string>",
"description": "<string>",
"created_at": "2023-11-07T05:31:56Z",
"created_by_user_id": "<string>",
"archived_at": "2023-11-07T05:31:56Z",
"tags": [
"<string>"
],
"datasets": [
{
"id": "<string>",
"object": "dataset",
"name": "<string>",
"description": "<string>",
"created_at": "2023-11-07T05:31:56Z",
"created_by_user_id": "<string>",
"archived_at": "2023-11-07T05:31:56Z",
"tags": [
"<string>"
],
"current_version_num": 123
}
],
"status": "failed",
"tasks": [
{
"task_type": "chat_completion",
"alias": "<string>",
"configuration": {
"top_k": 123,
"frequency_penalty": 123,
"function_call": {},
"functions": [
{}
],
"logit_bias": {},
"logprobs": true,
"max_completion_tokens": 123,
"max_tokens": 123,
"metadata": {},
"modalities": [
"<string>"
],
"n": 123,
"parallel_tool_calls": true,
"prediction": {},
"presence_penalty": 123,
"reasoning_effort": "<string>",
"response_format": {},
"seed": 123,
"stop": "<string>",
"store": true,
"temperature": 123,
"tool_choice": "<string>",
"tools": [
{}
],
"top_logprobs": 123,
"top_p": 123,
"audio": {},
"model": "<string>",
"messages": [
{}
]
}
}
]
}