List All Model Deployments
Lists model deployments one page at a time, optionally filtered by account ID.
Authorizations
Query Parameters
model_creation_parameters:asc, model_creation_parameters:desc, model_endpoint_id:asc, model_endpoint_id:desc, model_instance_id:asc, model_instance_id:desc, vendor_configuration:asc, vendor_configuration:desc, deployment_metadata:asc, deployment_metadata:desc, status:asc, status:desc, id:asc, id:desc, created_at:asc, created_at:desc, account_id:asc, account_id:desc, created_by_user_id:asc, created_by_user_id:desc, created_by_user:asc, created_by_user:desc, name:asc, name:desc
Optional filter by account ID.
Page number to return. Pagination starts at page 1.
x >= 1
Maximum number of items to return per page. Defaults to 100 and cannot be greater than 10,000.
1 <= x <= 10000
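As a rough illustration of the constraints above, here is a minimal sketch that validates and assembles the query parameters before a request is sent. The parameter names `page`, `limit`, `account_id`, and `sort_by` are assumptions, since this page does not name them, and the bounds follow the descriptions (pages start at 1, the limit defaults to 100 and is capped at 10,000).

```python
from typing import Any, Dict, List, Optional

MAX_LIMIT = 10_000  # the description says the limit cannot be greater than 10k


def build_query_params(
    page: int = 1,
    limit: int = 100,
    account_id: Optional[str] = None,
    sort_by: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Validate and assemble query parameters.

    All parameter names here are hypothetical; check them against the
    actual API before use.
    """
    if page < 1:
        raise ValueError("page numbering starts at 1")
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")

    params: Dict[str, Any] = {"page": page, "limit": limit}
    if account_id is not None:
        params["account_id"] = account_id
    if sort_by:
        # Each sort entry is expected to look like "field:asc" or "field:desc".
        for entry in sort_by:
            field, _, direction = entry.partition(":")
            if not field or direction not in ("asc", "desc"):
                raise ValueError(f"invalid sort entry: {entry!r}")
        params["sort_by"] = list(sort_by)
    return params
```

For example, `build_query_params(page=2, limit=50, sort_by=["created_at:desc"])` returns a dictionary that could be passed as the query string of the request.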
Response
The data returned for the current page.
The maximum number of concurrent requests that an individual worker can service. Launch automatically scales the number of workers for the endpoint so that each worker is processing per_worker requests, subject to the limits defined by min_workers and max_workers.
- If the average number of concurrent requests per worker is lower than per_worker, then the number of workers will be reduced.
- Otherwise, if the average number of concurrent requests per worker is higher than per_worker, then the number of workers will be increased to meet the elevated traffic.
Here is our recommendation for computing per_worker:
1. Compute min_workers and max_workers per your minimum and maximum throughput requirements.
2. Determine a value for the maximum number of concurrent requests in the workload. Divide this number by max_workers. Doing this ensures that the number of workers will "climb" to max_workers.
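The recommendation above amounts to a single division, sketched below together with the scaling rule from the two bullets. The function names are hypothetical, and rounding up to a whole number of requests is an assumption of this sketch, not something the description specifies. It is not Launch's actual autoscaler.

```python
import math


def recommend_per_worker(max_concurrent_requests: int, max_workers: int) -> int:
    """Divide the peak concurrent requests by max_workers.

    Rounding up (and never returning less than 1) is an assumption made
    here so the result is a usable whole number.
    """
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    return max(1, math.ceil(max_concurrent_requests / max_workers))


def scaling_direction(avg_concurrent_per_worker: float, per_worker: int) -> str:
    """Mirror the two bullets: below per_worker scales down, above scales up."""
    if avg_concurrent_per_worker < per_worker:
        return "scale_down"
    if avg_concurrent_per_worker > per_worker:
        return "scale_up"
    return "hold"
```

With a peak of 100 concurrent requests and max_workers set to 8, `recommend_per_worker(100, 8)` gives 13, so peak traffic needs about 100 / 13 ≈ 7.7 workers, which rounds up to the full 8.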
LAUNCH
The unique identifier of the entity.
The date and time when the entity was created in ISO format.
The ID of the account that owns the given entity.
The user who originally created the entity.
Status of the model's deployment.
The total number of items matching the query.
The current page number.
The number of items per page.
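The pagination fields described above (page data, total item count, current page, items per page) are enough to walk every page. The sketch below assumes the response keys are named `items`, `total_item_count`, `current_page`, and `items_per_page`, and that the HTTP call is wrapped in a caller-supplied `fetch_page(page, limit)` function. All of these names are hypothetical, and `fake_fetch` stands in for a real request.

```python
import math
from typing import Any, Callable, Dict, Iterator


def iter_all_items(
    fetch_page: Callable[[int, int], Dict[str, Any]],
    limit: int = 100,
) -> Iterator[Any]:
    """Yield items from every page, starting at page 1.

    The response key names used here are assumptions for illustration.
    """
    page = 1
    while True:
        response = fetch_page(page, limit)
        yield from response["items"]
        total_pages = math.ceil(
            response["total_item_count"] / response["items_per_page"]
        )
        if response["current_page"] >= total_pages:
            break
        page += 1


def fake_fetch(page: int, limit: int) -> Dict[str, Any]:
    """Stand-in for the real endpoint: serves five made-up deployment names."""
    data = [f"deployment-{i}" for i in range(5)]
    start = (page - 1) * limit
    return {
        "items": data[start:start + limit],
        "total_item_count": len(data),
        "current_page": page,
        "items_per_page": limit,
    }
```

Calling `list(iter_all_items(fake_fetch, limit=2))` requests three pages and returns all five made-up names in order.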