POST /v4/model-templates
import SGPClient from 'sgp';

const client = new SGPClient({
  apiKey: process.env['SGP_API_KEY'], // This is the default and can be omitted
});

async function main() {
  const modelTemplate = await client.modelTemplates.create({
    account_id: 'account_id',
    endpoint_type: 'SYNC',
    model_type: 'COMPLETION',
    name: 'name',
    vendor_configuration: { bundle_config: { image: 'image', registry: 'registry', tag: 'tag' } },
  });

  console.log(modelTemplate.id);
}

main();
{
  "name": "<string>",
  "endpoint_type": "SYNC",
  "model_type": "COMPLETION",
  "vendor_configuration": {
    "vendor": "LAUNCH",
    "bundle_config": {
      "registry": "<string>",
      "image": "<string>",
      "tag": "<string>",
      "command": [
        "<string>"
      ],
      "env": {},
      "streaming_command": [
        "<string>"
      ],
      "readiness_initial_delay_seconds": 120,
      "healthcheck_route": "/readyz",
      "predict_route": "/predict",
      "streaming_predict_route": "/generate_streaming"
    },
    "endpoint_config": {
      "cpus": 3,
      "memory": "8Gi",
      "storage": "16Gi",
      "gpus": 0,
      "gpu_type": "nvidia-tesla-t4",
      "min_workers": 0,
      "max_workers": 1,
      "per_worker": 10,
      "endpoint_type": "ASYNC",
      "high_priority": false
    },
    "fine_tuning_job_bundle_config": {
      "registry": "<string>",
      "image": "<string>",
      "tag": "<string>",
      "command": [
        "<string>"
      ],
      "env": {},
      "mount_location": "/workspace/launch_specific/config.json",
      "training_dataset_schema_type": "GENERATION",
      "resources": {
        "cpus": 3,
        "memory": "8Gi",
        "storage": "16Gi",
        "gpus": 0,
        "gpu_type": "nvidia-tesla-t4"
      }
    }
  },
  "model_creation_parameters_schema": {
    "parameters": [
      {
        "name": "<string>",
        "type": "str",
        "description": "<string>",
        "required": true
      }
    ]
  },
  "model_request_parameters_schema": {
    "parameters": [
      {
        "name": "<string>",
        "type": "str",
        "description": "<string>",
        "required": true
      }
    ]
  },
  "endpoint_protocol": "SGP",
  "id": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "account_id": "<string>",
  "created_by_user_id": "<string>"
}

Authorizations

x-api-key
string
header
required

Headers

x-selected-account-id
string | null
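
As a rough sketch of how the two headers above might be attached to a raw HTTP call instead of going through the SDK, the following builds a request without sending it. The base URL `https://api.example.com` and the helper name `buildCreateRequest` are placeholders, not part of the documented API, and omitting `x-selected-account-id` when it is null is an assumption.

```typescript
// Hypothetical base URL; substitute the host of your own deployment.
const BASE_URL = 'https://api.example.com';

interface BuiltRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a POST /v4/model-templates request.
function buildCreateRequest(
  apiKey: string,
  body: unknown,
  selectedAccountId?: string | null,
): BuiltRequest {
  const headers: Record<string, string> = {
    'content-type': 'application/json',
    'x-api-key': apiKey, // required authorization header
  };
  // Assumption: the optional header is simply left out when no account is selected.
  if (selectedAccountId != null) {
    headers['x-selected-account-id'] = selectedAccountId;
  }
  return {
    url: `${BASE_URL}/v4/model-templates`,
    method: 'POST',
    headers,
    body: JSON.stringify(body),
  };
}
```

The returned object can be handed to any HTTP client, for example `fetch(built.url, built)`.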

Body

application/json
name
string
required
endpoint_type
enum<string>
required

An enum of the supported model endpoint types.

Attributes:
SYNC: Denotes that the model endpoint type is sync.
ASYNC: Denotes that the model endpoint type is async.
STREAMING: Denotes that the model endpoint type is streaming.
BATCH: Denotes that the model endpoint type is batch.

Available options:
SYNC,
ASYNC,
STREAMING,
BATCH
model_type
enum<string>
required

An enum of the supported model types.

Attributes:
COMPLETION: Denotes that the model type is completion.
CHAT_COMPLETION: Denotes that the model type is chat completion.
AGENT: Denotes that the model type is agent.
EMBEDDING: Denotes that the model type is embedding.
RERANKING: Denotes that the model type is reranking.
GENERIC: Denotes that the model type is generic.

Available options:
COMPLETION,
CHAT_COMPLETION,
AGENT,
EMBEDDING,
RERANKING,
GENERIC
vendor_configuration
object
required

Configuration for launching a model with the Launch service, an internal, self-hosted service developed by Scale that deploys models on Kubernetes.

Attributes:
vendor: The vendor of the model template.
bundle_config: The bundle configuration of the model template.
endpoint_config: The endpoint configuration of the model template.
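
For orientation, a `vendor_configuration` value might be typed roughly as follows. The field names are taken from the example payload on this page; which fields are optional is an assumption (the SDK example supplies only `registry`, `image`, and `tag`), and the interface names are illustrative, not SDK exports.

```typescript
// Illustrative types; these names are not exported by the SDK.
interface BundleConfig {
  registry: string;
  image: string;
  tag: string;
  // Assumed optional, since the SDK example omits them.
  command?: string[];
  healthcheck_route?: string;
  predict_route?: string;
}

interface EndpointConfig {
  cpus?: number;
  memory?: string; // e.g. "8Gi"
  storage?: string; // e.g. "16Gi"
  gpus?: number;
  min_workers?: number;
  max_workers?: number;
}

interface LaunchVendorConfiguration {
  vendor?: 'LAUNCH';
  bundle_config: BundleConfig;
  endpoint_config?: EndpointConfig;
}

// Values mirror the example payload; registry, image, and tag are placeholders.
const vendorConfiguration: LaunchVendorConfiguration = {
  vendor: 'LAUNCH',
  bundle_config: {
    registry: 'registry',
    image: 'image',
    tag: 'tag',
    healthcheck_route: '/readyz',
    predict_route: '/predict',
  },
  endpoint_config: {
    cpus: 3,
    memory: '8Gi',
    storage: '16Gi',
    gpus: 0,
    min_workers: 0,
    max_workers: 1,
  },
};
```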

account_id
string
required

The ID of the account that owns the given entity.

model_creation_parameters_schema
object

The field names and types of available parameter fields which may be specified during model creation.

model_request_parameters_schema
object

The field names and types of available parameter fields which may be specified in a model execution API's model_request_parameters field.
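
To illustrate how such a schema could be used, the sketch below checks a set of request parameters against a parameter schema of the shape shown in the example payload. The `findMissingRequired` helper and the parameter names `temperature` and `stop` are hypothetical. The check covers required names only; it does not enforce the `type` field, because `str` is the only type value that appears on this page.

```typescript
interface ParameterDefinition {
  name: string;
  type: string; // only "str" appears in the example payload
  description: string;
  required: boolean;
}

interface ParameterSchema {
  parameters: ParameterDefinition[];
}

// Hypothetical helper: returns the names of required parameters that are absent.
function findMissingRequired(
  schema: ParameterSchema,
  values: Record<string, unknown>,
): string[] {
  return schema.parameters
    .filter((p) => p.required && values[p.name] === undefined)
    .map((p) => p.name);
}

// Illustrative schema; the parameter names are made up for this example.
const requestSchema: ParameterSchema = {
  parameters: [
    { name: 'temperature', type: 'str', description: 'Sampling temperature', required: true },
    { name: 'stop', type: 'str', description: 'Stop sequence', required: false },
  ],
};

// Only the optional parameter is supplied, so "temperature" is reported as missing.
const missing = findMissingRequired(requestSchema, { stop: 'END' });
console.log(missing);
```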

endpoint_protocol
enum<string>

The name of the calling convention expected by the Launch model endpoint.

Available options:
SGP,
COHERE,
VLLM
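
Taken together, the body fields above could be modeled as in the following sketch. The type and function names are illustrative, not SDK exports; the enum values are copied from the option lists on this page, and the validation is a client-side convenience that may not match server-side rules exactly.

```typescript
const ENDPOINT_TYPES = ['SYNC', 'ASYNC', 'STREAMING', 'BATCH'] as const;
const MODEL_TYPES = [
  'COMPLETION', 'CHAT_COMPLETION', 'AGENT', 'EMBEDDING', 'RERANKING', 'GENERIC',
] as const;
const ENDPOINT_PROTOCOLS = ['SGP', 'COHERE', 'VLLM'] as const;

type EndpointType = (typeof ENDPOINT_TYPES)[number];
type ModelType = (typeof MODEL_TYPES)[number];
type EndpointProtocol = (typeof ENDPOINT_PROTOCOLS)[number];

// Illustrative shape of the request body; nested objects are left loosely typed.
interface CreateModelTemplateBody {
  name: string;
  endpoint_type: EndpointType;
  model_type: ModelType;
  vendor_configuration: Record<string, unknown>;
  account_id: string;
  model_creation_parameters_schema?: Record<string, unknown>;
  model_request_parameters_schema?: Record<string, unknown>;
  endpoint_protocol?: EndpointProtocol;
}

function isOneOf(options: readonly string[], value: unknown): boolean {
  return typeof value === 'string' && options.indexOf(value) !== -1;
}

// Returns a list of problems found in an untyped candidate body (empty if none).
function validateBody(candidate: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const field of ['name', 'account_id']) {
    if (typeof candidate[field] !== 'string') problems.push(`${field} must be a string`);
  }
  if (!isOneOf(ENDPOINT_TYPES, candidate['endpoint_type'])) {
    problems.push('endpoint_type must be one of ' + ENDPOINT_TYPES.join(', '));
  }
  if (!isOneOf(MODEL_TYPES, candidate['model_type'])) {
    problems.push('model_type must be one of ' + MODEL_TYPES.join(', '));
  }
  const vendor = candidate['vendor_configuration'];
  if (typeof vendor !== 'object' || vendor === null) {
    problems.push('vendor_configuration must be an object');
  }
  // endpoint_protocol is optional, so it is only checked when present.
  const protocol = candidate['endpoint_protocol'];
  if (protocol !== undefined && !isOneOf(ENDPOINT_PROTOCOLS, protocol)) {
    problems.push('endpoint_protocol must be one of ' + ENDPOINT_PROTOCOLS.join(', '));
  }
  return problems;
}
```

Calling `validateBody` before sending a request gives early feedback on misspelled enum values, which would otherwise surface only as an API error.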

Response

200
application/json
Successful Response
name
string
required
endpoint_type
enum<string>
required

An enum of the supported model endpoint types.

Attributes:
SYNC: Denotes that the model endpoint type is sync.
ASYNC: Denotes that the model endpoint type is async.
STREAMING: Denotes that the model endpoint type is streaming.
BATCH: Denotes that the model endpoint type is batch.

Available options:
SYNC,
ASYNC,
STREAMING,
BATCH
model_type
enum<string>
required

An enum of the supported model types.

Attributes:
COMPLETION: Denotes that the model type is completion.
CHAT_COMPLETION: Denotes that the model type is chat completion.
AGENT: Denotes that the model type is agent.
EMBEDDING: Denotes that the model type is embedding.
RERANKING: Denotes that the model type is reranking.
GENERIC: Denotes that the model type is generic.

Available options:
COMPLETION,
CHAT_COMPLETION,
AGENT,
EMBEDDING,
RERANKING,
GENERIC
vendor_configuration
object
required

Configuration for launching a model with the Launch service, an internal, self-hosted service developed by Scale that deploys models on Kubernetes.

Attributes:
vendor: The vendor of the model template.
bundle_config: The bundle configuration of the model template.
endpoint_config: The endpoint configuration of the model template.

id
string
required

The unique identifier of the entity.

created_at
string
required

The date and time when the entity was created, in ISO 8601 format (for example, 2023-11-07T05:31:56Z).

account_id
string
required

The ID of the account that owns the given entity.

created_by_user_id
string
required

The user who originally created the entity.

model_creation_parameters_schema
object

The field names and types of available parameter fields which may be specified during model creation.

model_request_parameters_schema
object

The field names and types of available parameter fields which may be specified in a model execution API's model_request_parameters field.

endpoint_protocol
enum<string>

The name of the calling convention expected by the Launch model endpoint.

Available options:
SGP,
COHERE,
VLLM
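
A caller might consume the response along these lines. This is a sketch that assumes `created_at` is a timestamp like the one in the example payload (`2023-11-07T05:31:56Z`); the `ModelTemplateResponseFields` type and `parseCreatedAt` helper are illustrative and cover only a subset of the response fields.

```typescript
// Illustrative subset of the response fields documented above.
interface ModelTemplateResponseFields {
  id: string;
  name: string;
  created_at: string;
  account_id: string;
  created_by_user_id: string;
}

// Converts created_at into a Date, failing loudly if it cannot be parsed.
function parseCreatedAt(response: ModelTemplateResponseFields): Date {
  const parsed = new Date(response.created_at);
  if (Number.isNaN(parsed.getTime())) {
    throw new Error(`Unparseable created_at: ${response.created_at}`);
  }
  return parsed;
}

// Placeholder values, except created_at, which is copied from the example payload.
const exampleResponse: ModelTemplateResponseFields = {
  id: 'id',
  name: 'name',
  created_at: '2023-11-07T05:31:56Z',
  account_id: 'account_id',
  created_by_user_id: 'created_by_user_id',
};

const createdAt = parseCreatedAt(exampleResponse);
console.log(createdAt.toISOString()); // 2023-11-07T05:31:56.000Z
```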