Deploy Model

Description

Model Deployments are unique endpoints created for custom models in the Scale GenAI Platform. They let users call specific model instances through the API or SDK. Each deployment is associated with a model instance, which contains the necessary model template and model metadata. Model templates describe the creation parameters that are configured on the deployment. Deployments are how models are called for inference, how calls are logged, and how usage is monitored.

Built-in models also have deployments, so that all models share a consistent interface. These do not represent real deployments; they are only a way to interact with the built-in models. They are created automatically when the model is created, and they are immutable.

Endpoint details

This endpoint deploys a model instance. The request payload schema depends on the model_request_parameters_schema of the Model Template from which the model was created.
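Because the payload shape varies by template, it can help to check a payload before sending it. The sketch below is only an illustration: it assumes the template's schema is a JSON Schema style object with a `required` array, which this page does not confirm, and both the helper name `missingRequiredFields` and the example schema are hypothetical.

```javascript
// Hypothetical helper: lists the required keys that are absent from a payload.
// Assumes `schema` is a JSON Schema style object with an optional `required` array.
function missingRequiredFields(schema, payload) {
  const required = Array.isArray(schema.required) ? schema.required : [];
  return required.filter((key) => !(key in payload));
}

// Illustrative schema, not taken from a real Model Template.
const exampleSchema = { type: 'object', required: ['name'] };

// An empty result means every required key is present.
console.log(missingRequiredFields(exampleSchema, { name: 'my-deployment' }));
console.log(missingRequiredFields(exampleSchema, {}));
```

A check like this only catches missing keys; the server remains the authority on whether a payload is valid.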

POST /models/{model_instance_id}/deployments

JavaScript

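import SGPClient from 'sgp';

// Create a client authenticated with your API key.
const client = new SGPClient({
  apiKey: 'My API Key',
});

// Deploy the model instance with the given ID; `name` is the deployment name.
const modelDeployment = await client.models.deployments.create('model_instance_id', { name: 'name' });

// Print the ID of the newly created deployment.
console.log(modelDeployment.id);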

{
  "name": "<string>",
  "model_creation_parameters": {},
  "model_endpoint_id": "<string>",
  "model_instance_id": "<string>",
  "vendor_configuration": {
    "min_workers": 0,
    "max_workers": 1,
    "per_worker": 10,
    "vendor": "LAUNCH"
  },
  "deployment_metadata": {},
  "id": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "account_id": "<string>",
  "created_by_user_id": "<string>",
  "created_by_identity_type": "user",
  "status": "<string>"
}

Authorizations

x-api-key: string, header, required

Path Parameters

model_instance_id: string, required

Body

application/json
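For callers not using the SDK, the pieces above (the POST method, the path parameter, the x-api-key header, and a JSON body) can be assembled into a plain HTTP request. This is a minimal sketch: the helper name `buildDeployRequest` is hypothetical, and `https://api.example.com` is a placeholder, since this page does not state the API host or any version prefix.

```javascript
// Sketch: build the arguments for a fetch() call to the deploy endpoint.
// `baseUrl` is a placeholder; this page does not state the API host or version prefix.
function buildDeployRequest(baseUrl, apiKey, modelInstanceId, body) {
  // Encode the path parameter so unusual characters cannot alter the path.
  const url = `${baseUrl}/models/${encodeURIComponent(modelInstanceId)}/deployments`;
  const options = {
    method: 'POST',
    headers: {
      'x-api-key': apiKey, // required authorization header
      'Content-Type': 'application/json', // the body is sent as JSON
    },
    body: JSON.stringify(body),
  };
  return { url, options };
}

const request = buildDeployRequest(
  'https://api.example.com',
  'My API Key',
  'model_instance_id',
  { name: 'name' },
);
console.log(request.url);
// To send it: const response = await fetch(request.url, request.options);
```

Building the request separately from sending it keeps the example runnable offline and makes the URL and headers easy to inspect.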

Response

200 (application/json): Successful Response. The response is of type object.
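As a rough illustration of reading the response object, the sketch below condenses a deployment into a one-line summary. The field names come from the sample response on this page; the helper name `summarizeDeployment`, the sample values, and the fallbacks for missing fields are assumptions, since the page does not say which fields are optional.

```javascript
// Sketch: condense a deployment response into a one-line summary.
// Field names follow the sample response; the fallbacks for missing fields are an assumption.
function summarizeDeployment(deployment) {
  const vendorConfig = deployment.vendor_configuration || {};
  const workers = `${vendorConfig.min_workers ?? '?'}-${vendorConfig.max_workers ?? '?'}`;
  const vendor = vendorConfig.vendor ?? 'unknown';
  return `${deployment.name} (${deployment.id}): status=${deployment.status}, vendor=${vendor}, workers=${workers}`;
}

// Illustrative values only; real IDs and status strings come from the API.
const sample = {
  name: 'name',
  id: 'deployment_id',
  status: 'example_status',
  vendor_configuration: { min_workers: 0, max_workers: 1, per_worker: 10, vendor: 'LAUNCH' },
};
console.log(summarizeDeployment(sample));
```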
