List Test Case Results
Description
Lists all test case results accessible to the user.
Details
This API can be used to list test case results. If a user has access to multiple accounts, all test case results from all accounts the user is associated with will be returned.
Authorizations
Headers
Path Parameters
Query Parameters
AnnotationResults, Metrics, Task, TestCaseVersion, Trace
Page number to return. Pagination starts at page 1.
x >= 1
Maximum number of artifacts to be returned by the given endpoint. Defaults to 100 and cannot be greater than 10,000.
1 <= x <= 10000
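The query parameters described above can be validated and encoded on the client before a request is sent. The following is a minimal sketch, not the official client: the parameter names `page`, `limit`, and `view` are assumptions (this section does not name them), as is the idea that the listed options (AnnotationResults, Metrics, Task, TestCaseVersion, Trace) belong to a repeatable `view` parameter. The bounds follow the descriptions above: pages start at 1, and the limit defaults to 100 and may not exceed 10,000.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Upper bound taken from the parameter description ("cannot be greater than 10k").
MAX_LIMIT = 10_000

# Options listed under Query Parameters; assumed to belong to a `view` parameter.
ALLOWED_VIEWS = {"AnnotationResults", "Metrics", "Task", "TestCaseVersion", "Trace"}


def build_query(page: int = 1, limit: int = 100,
                view: Optional[Iterable[str]] = None) -> str:
    """Validate pagination arguments and return a URL-encoded query string.

    The names `page`, `limit`, and `view` are hypothetical; check them against
    the actual endpoint definition before use.
    """
    if page < 1:
        raise ValueError("page must be at least 1")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")

    params = [("page", page), ("limit", limit)]
    for option in view or []:
        if option not in ALLOWED_VIEWS:
            raise ValueError(f"unknown view option: {option}")
        # Repeat the key once per selected option.
        params.append(("view", option))
    return urlencode(params)
```

For example, `build_query(2, 50, ["Metrics", "Trace"])` yields a query string that requests the second page of 50 results with two extra views expanded.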
Response
The data returned for the current page.
GENERATION, FLEXIBLE
The unique identifier of the entity.
The date and time when the entity was created in ISO format.
The ID of the account that owns the given entity.
Whether this interaction has been uploaded to the S3 bucket yet. Defaults to false.
How long the step took, in milliseconds (ms).
The outcome of the operation
SUCCESS, ERROR
The date and time when the entity was created in ISO format.
The JSON representation of the metadata insights emitted through the execution. This can differ based on different types of operations
EXTERNAL_AI, EVALUATION, SGP_CHAT, AGENTS_SERVICE
Identifies the application step
The id of the application insight this step belongs to
The id of the node in the application_variant config that emitted this insight
Type of the operation, e.g. RERANKING
TEXT_INPUT, TEXT_OUTPUT, COMPLETION_INPUT, COMPLETION, KB_RETRIEVAL, KB_INPUT, RERANKING, EXTERNAL_ENDPOINT, PROMPT_ENGINEERING, DOCUMENT_INPUT, MAP_REDUCE, DOCUMENT_SEARCH, DOCUMENT_PROMPT, CUSTOM, INPUT_GUARDRAIL, OUTPUT_GUARDRAIL, CODE_EXECUTION, DATA_MANIPULATION, EVALUATION
The outcome of the operation
SUCCESS, ERROR
The JSON representation of the input that this step received.
The JSON representation of the output that this step emitted.
The JSON representation of the expected output for this step
The JSON representation of the metadata insights emitted through the execution. This can differ based on different types of operations
The start time of the step
How long the step took, in milliseconds (ms).
The ID of the associated evaluation dataset.
Boolean to track whether or not the test case is autogenerated
The unique identifier of the entity.
The date and time when the entity was created in ISO format.
The ID of the account that owns the given entity.
The user who originally created the entity.
GENERATION
Used for tracking previous chat interactions for multi-chat test cases
Metadata for the test case
The date and time when the entity was archived in ISO format.
EVALUATION_ANNOTATION
The ID of the account that owns the given entity.
PENDING, COMPLETED
The date and time when the task assignment expires in ISO format.
The entity that the task is associated with.
The status of the test case result. This should not be explicitly set when creating a test case result. When patching a test case result, this field can be set to 'PENDING' to prevent the test case result from being marked 'COMPLETED'.
PENDING, COMPLETED, FAILED
The unique identifier of the entity.
The date and time when the entity was created in ISO format.
The ID of the account that owns the given entity.
The user who originally created the entity.
UNAUDITED, FIXED, APPROVED
GENERATION
The result of the test case evaluation, in JSON form where the key is the question ID and the value is the result.
The time spent labeling in seconds.
The user who annotated the task.
The date and time when the entity was archived in ISO format.
The type of annotation result.
llm_auto, human
The selected choice(s) for the annotation result, in JSON form. For categorical questions, this is an object or a list of objects (depending on whether multiple selections are allowed). For free text questions, this is a string. For numeric or rating questions, this is a number.
The unique identifier of the entity.
The date and time when the entity was created in ISO format.
The time elapsed to generate this annotation in seconds.
The reasoning the LLM gave for the annotation it provided.
The ID of the associated annotation result.
The cost of the annotation in cents.
The status of the test case result. This should not be explicitly set when creating a test case result. When patching a test case result, this field can be set to 'PENDING' to prevent the test case result from being marked 'COMPLETED'.
PENDING, COMPLETED, FAILED
UNAUDITED, FIXED, APPROVED
CHUNKS
GENERATION
The result of the test case evaluation, in JSON form where the key is the question ID and the value is the result.
The time spent labeling in seconds.
The unique identifier of the entity.
The date and time when the entity was created in ISO format.
The ID of the account that owns the given entity.
The user who originally created the entity.
The user who annotated the task.
The date and time when the entity was archived in ISO format.
The total number of items matching the query.
The current page number.
The number of items per page.
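Because the response reports the data for the current page together with the total item count, the current page number, and the page size, a client can walk through every page until all items have been seen. The sketch below illustrates one way to do that under stated assumptions: the response field names `items` and `total_item_count` are hypothetical (this section does not name them), and `fetch_page` stands in for whatever function actually performs the HTTP request and returns the decoded JSON body.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical signature: fetch_page(page, limit) -> decoded JSON response body.
FetchPage = Callable[[int, int], Dict[str, Any]]


def iter_test_case_results(fetch_page: FetchPage,
                           limit: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield every test case result by requesting pages 1, 2, 3, ... in order.

    Stops when a page comes back empty or when the number of items seen
    reaches the total reported by the response.
    """
    page = 1  # pagination starts at page 1
    seen = 0
    while True:
        body = fetch_page(page, limit)
        items = body["items"]  # assumed name for the current page's data
        yield from items
        seen += len(items)
        # assumed name for the total number of items matching the query
        if not items or seen >= body["total_item_count"]:
            break
        page += 1
```

Passing the fetch function in as an argument keeps the paging logic independent of any particular HTTP library and makes it easy to exercise with canned responses.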