Model Templates V3 (Beta)

Create Model Template

Description

Creates a model template.

Details

Model templates serve two purposes. First, they provide common scaffolding that stays the same across multiple models. Second, they expose variables that can be injected at model creation time to customize the model.

For example, a model template can define a Docker image that contains code to run a HuggingFace or SentenceTransformers model. The code in that image can also read environment variables, which can be set to swap out the model weights or the model name.

Two of the most important fields to set when creating a model template are model_creation_parameters_schema and model_request_parameters_schema.

The model_creation_parameters_schema field defines the schema for parameters that can be injected at model creation time. For example, if the schema contains a model_weights_uri field that expects a string, a user creating a model from this template can supply the URI of a weights file, and the model will use those weights.

The model_request_parameters_schema field defines the schema for parameters that an end user can inject at model execution time. For example, if the schema contains a model_request_parameters field that expects a string, the user can supply a string when executing the model, and it is passed to the model for inference.
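As a minimal sketch of how the two schema fields might be filled in, the snippet below assembles a request body with one creation-time parameter and one request-time parameter. The entry shape (name, type, description, required) follows the example response on this page. The helper `param`, the parameter name `temperature`, and the type strings `'string'` and `'number'` are illustrative assumptions, not values confirmed by this reference.

```javascript
// Hypothetical helper: builds one entry of a ParameterSchema "parameters" array.
function param(name, type, description, required = true) {
  return { name, type, description, required };
}

const body = {
  account_id: 'account_id', // placeholder values, as in the SDK example
  name: 'name',
  endpoint_type: 'SYNC',
  model_type: 'COMPLETION',
  vendor_configuration: {
    bundle_config: { image: 'image', registry: 'registry', tag: 'tag' },
  },
  // Injected once, when a model is created from the template.
  model_creation_parameters_schema: {
    parameters: [param('model_weights_uri', 'string', 'URI of the weights file to load')],
  },
  // Supplied by the end user each time the model is executed.
  model_request_parameters_schema: {
    parameters: [param('temperature', 'number', 'Sampling temperature', false)],
  },
};

console.log(JSON.stringify(body.model_creation_parameters_schema, null, 2));
```

A body like this could then be passed to `client.modelTemplates.create(body)`, the SDK call shown in the example on this page.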

POST model-templates

JavaScript

import SGPClient from 'sgp';

const client = new SGPClient({
  apiKey: process.env['SGP_API_KEY'], // This is the default and can be omitted
});

const modelTemplate = await client.modelTemplates.create({
  account_id: 'account_id',
  endpoint_type: 'SYNC',
  model_type: 'COMPLETION',
  name: 'name',
  vendor_configuration: { bundle_config: { image: 'image', registry: 'registry', tag: 'tag' } },
});

console.log(modelTemplate.id);

{
  "name": "<string>",
  "endpoint_type": "SYNC",
  "model_type": "COMPLETION",
  "vendor_configuration": {
    "bundle_config": {
      "registry": "<string>",
      "image": "<string>",
      "tag": "<string>",
      "command": [
        "<string>"
      ],
      "env": {},
      "streaming_command": [
        "<string>"
      ],
      "readiness_initial_delay_seconds": 120,
      "healthcheck_route": "/readyz",
      "predict_route": "/predict",
      "streaming_predict_route": "/generate_streaming"
    },
    "vendor": "LAUNCH",
    "endpoint_config": {
      "cpus": 3,
      "memory": "8Gi",
      "storage": "16Gi",
      "gpus": 0,
      "gpu_type": "nvidia-tesla-t4",
      "min_workers": 0,
      "max_workers": 1,
      "per_worker": 10,
      "endpoint_type": "ASYNC",
      "high_priority": false
    },
    "fine_tuning_job_bundle_config": {
      "registry": "<string>",
      "image": "<string>",
      "tag": "<string>",
      "command": [
        "<string>"
      ],
      "env": {},
      "mount_location": "/workspace/launch_specific/config.json",
      "training_dataset_schema_type": "GENERATION",
      "resources": {
        "cpus": 3,
        "memory": "8Gi",
        "storage": "16Gi",
        "gpus": 0,
        "gpu_type": "nvidia-tesla-t4"
      }
    }
  },
  "id": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "account_id": "<string>",
  "created_by_user_id": "<string>",
  "created_by_identity_type": "user",
  "model_creation_parameters_schema": {
    "parameters": [
      {
        "name": "<string>",
        "type": "<string>",
        "description": "<string>",
        "required": true
      }
    ]
  },
  "model_request_parameters_schema": {
    "parameters": [
      {
        "name": "<string>",
        "type": "<string>",
        "description": "<string>",
        "required": true
      }
    ]
  },
  "endpoint_protocol": "SGP"
}

Authorizations

x-api-key

string

header

required

Headers

x-selected-account-id

string
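For callers not using the SDK, the sketch below shows how the two headers above might be attached to a raw HTTP request. It only builds the arguments for `fetch` and sends nothing. The base URL is a hypothetical placeholder (this page gives only the relative path `model-templates`), the function name is made up for illustration, and `x-selected-account-id` is treated as optional because it is not marked required here.

```javascript
// Builds the URL and init object for fetch(); does not perform the request.
function buildCreateTemplateRequest({ baseUrl, apiKey, accountId, body }) {
  const headers = {
    'Content-Type': 'application/json',
    'x-api-key': apiKey, // required authorization header
  };
  if (accountId) {
    headers['x-selected-account-id'] = accountId; // assumed optional
  }
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, '')}/model-templates`,
    init: { method: 'POST', headers, body: JSON.stringify(body) },
  };
}

const req = buildCreateTemplateRequest({
  baseUrl: 'https://api.example.invalid/', // hypothetical base URL
  apiKey: 'my-api-key',
  accountId: 'account_id',
  body: { name: 'name' },
});

console.log(req.url);
console.log(req.init.headers);
// To send it: const res = await fetch(req.url, req.init);
```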

Body

application/json

name

string

required

endpoint_type

enum<string>

required

An enum representing the supported model endpoint types.

Attributes:

SYNC: Denotes that the model endpoint type is sync.

ASYNC: Denotes that the model endpoint type is async.

STREAMING: Denotes that the model endpoint type is streaming.

BATCH: Denotes that the model endpoint type is batch.

Available options:

SYNC,

ASYNC,

STREAMING,

BATCH

model_type

enum<string>

required

An enum representing the different types of models supported.

Attributes:

COMPLETION: Denotes that the model type is completion.

CHAT_COMPLETION: Denotes that the model type is chat completion.

AGENT: Denotes that the model type is agent.

EMBEDDING: Denotes that the model type is embedding.

RERANKING: Denotes that the model type is reranking.

GENERIC: Denotes that the model type is generic.

BUNDLE: Not to be used directly. This type exists to surface a model bundle in the UI.

Available options:

COMPLETION,

CHAT_COMPLETION,

AGENT,

EMBEDDING,

RERANKING,

GENERIC,

BUNDLE

vendor_configuration

LaunchVendorConfiguration · object

required

Configuration for launching a model using the Launch service, an internal, self-hosted service developed by Scale that deploys models on Kubernetes.

Attributes:

vendor: The vendor of the model template.

bundle_config: The bundle configuration of the model template.

endpoint_config: The endpoint configuration of the model template.
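To make the three attributes concrete, here is a sketch of a fuller vendor_configuration object. Every value is copied from the example response on this page and should be read as a placeholder, not a recommended setting. The `workerBoundsLookValid` check is a hypothetical client-side convenience; the API may enforce different or additional rules.

```javascript
// Values mirror the example response on this page; adjust for a real deployment.
const vendorConfiguration = {
  vendor: 'LAUNCH',
  bundle_config: {
    registry: 'registry',
    image: 'image',
    tag: 'tag',
    readiness_initial_delay_seconds: 120,
    healthcheck_route: '/readyz',
    predict_route: '/predict',
  },
  endpoint_config: {
    cpus: 3,
    memory: '8Gi',
    storage: '16Gi',
    gpus: 0,
    min_workers: 0,
    max_workers: 1,
    per_worker: 10,
  },
};

// Hypothetical sanity check: worker bounds are integers and min does not exceed max.
function workerBoundsLookValid(config) {
  const { min_workers: min, max_workers: max } = config.endpoint_config ?? {};
  return Number.isInteger(min) && Number.isInteger(max) && min >= 0 && min <= max;
}

console.log(workerBoundsLookValid(vendorConfiguration));
```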

account_id

string

required

The ID of the account that owns the given entity.

model_creation_parameters_schema

ParameterSchema · object

The field names and types of the parameter fields that may be specified during model creation.

model_request_parameters_schema

ParameterSchema · object

The field names and types of available parameter fields which may be specified in a model execution API's model_request_parameters field.

endpoint_protocol

enum<string>

The name of the calling convention expected by the Launch model endpoint.

Available options:

SGP,

COHERE,

VLLM
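The body fields above lend themselves to a quick client-side check before sending a request. The sketch below is one possible way to do that: the required fields and enum values are taken from this page, while the function name `validateTemplateBody` and the error message wording are illustrative assumptions. The server remains the source of truth for validation.

```javascript
const ENDPOINT_TYPES = ['SYNC', 'ASYNC', 'STREAMING', 'BATCH'];
const MODEL_TYPES = [
  'COMPLETION', 'CHAT_COMPLETION', 'AGENT', 'EMBEDDING', 'RERANKING', 'GENERIC', 'BUNDLE',
];
const ENDPOINT_PROTOCOLS = ['SGP', 'COHERE', 'VLLM'];
const REQUIRED_FIELDS = ['name', 'endpoint_type', 'model_type', 'vendor_configuration', 'account_id'];

// Returns a list of human-readable problems; an empty list means the body looks valid.
function validateTemplateBody(body) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    if (body[field] == null) errors.push(`missing required field: ${field}`);
  }
  const checkEnum = (field, allowed) => {
    // Only check values that are present; absence is handled above.
    if (body[field] != null && !allowed.includes(body[field])) {
      errors.push(`${field} must be one of: ${allowed.join(', ')}`);
    }
  };
  checkEnum('endpoint_type', ENDPOINT_TYPES);
  checkEnum('model_type', MODEL_TYPES);
  checkEnum('endpoint_protocol', ENDPOINT_PROTOCOLS);
  return errors;
}

const problems = validateTemplateBody({
  name: 'name',
  endpoint_type: 'FAST', // not a listed option
  model_type: 'COMPLETION',
  vendor_configuration: { bundle_config: { image: 'image', registry: 'registry', tag: 'tag' } },
});
console.log(problems);
```

Here the sample body is flagged twice: once for the missing account_id and once for the unsupported endpoint_type.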

Response

Successful Response

name

string

required

endpoint_type

enum<string>

required

An enum representing the supported model endpoint types.

Available options:

SYNC,

ASYNC,

STREAMING,

BATCH

model_type

enum<string>

required

An enum representing the different types of models supported.

Available options:

COMPLETION,

CHAT_COMPLETION,

AGENT,

EMBEDDING,

RERANKING,

GENERIC,

BUNDLE

vendor_configuration

LaunchVendorConfiguration · object

required

Configuration for launching a model using the Launch service, an internal, self-hosted service developed by Scale that deploys models on Kubernetes.

Attributes:

vendor: The vendor of the model template.

bundle_config: The bundle configuration of the model template.

endpoint_config: The endpoint configuration of the model template.

id

string

required

The unique identifier of the entity.

created_at

string<date-time>

required

The date and time when the entity was created in ISO format.

account_id

string

required

The ID of the account that owns the given entity.

created_by_user_id

string

required

The user who originally created the entity.

created_by_identity_type

enum<string>

required

The type of identity that created the entity.

Available options:

user,

service_account

model_creation_parameters_schema

ParameterSchema · object

The field names and types of the parameter fields that may be specified during model creation.

model_request_parameters_schema

ParameterSchema · object

The field names and types of available parameter fields which may be specified in a model execution API's model_request_parameters field.

endpoint_protocol

enum<string>

The name of the calling convention expected by the Launch model endpoint.

Available options:

SGP,

COHERE,

VLLM
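Once a template has been created, the response fields above can be read directly. The following sketch pulls a few of them into a small summary object. The function name `summarizeTemplate` and the shape of its return value are hypothetical, and the sample object is a trimmed copy of the example response on this page with placeholder strings filled in.

```javascript
// Extracts a few commonly needed fields from a model template response.
function summarizeTemplate(template) {
  // The schema fields are not marked required, so guard against their absence.
  const creationParams = template.model_creation_parameters_schema?.parameters ?? [];
  return {
    id: template.id,
    createdAt: new Date(template.created_at), // parse the date-time string
    createdByServiceAccount: template.created_by_identity_type === 'service_account',
    requiredCreationParams: creationParams.filter((p) => p.required).map((p) => p.name),
  };
}

const summary = summarizeTemplate({
  id: 'template-id', // placeholder
  created_at: '2023-11-07T05:31:56Z',
  created_by_identity_type: 'user',
  model_creation_parameters_schema: {
    parameters: [
      { name: 'model_weights_uri', type: 'string', description: 'Weights URI', required: true },
      { name: 'revision', type: 'string', description: 'Optional revision', required: false },
    ],
  },
});

console.log(summary.id, summary.createdAt.toISOString(), summary.requiredCreationParams);
```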
