Create Model Template
Description
Creates a model template.
Details
Model templates serve two purposes. First, they provide common scaffolding that is static across multiple models. Second, they expose variables that can be injected at model creation time to customize the model.
For example, a model template can define a Docker image that contains code to run a HuggingFace or SentenceTransformers model. The code in this image can also accept environment variables that swap out the model weights or the model name.
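As a rough illustration of the environment-variable pattern described above, the entry point inside such an image might read its configuration as sketched below. The variable names `MODEL_NAME` and `MODEL_WEIGHTS_URI`, the placeholder default, and the `ModelSettings` type are all hypothetical; the real names depend on how the image is written.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ModelSettings:
    """Configuration resolved when the container starts."""
    model_name: str
    weights_uri: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> ModelSettings:
    """Read model configuration from environment variables.

    MODEL_NAME and MODEL_WEIGHTS_URI are hypothetical variable names,
    and the default model name is only a placeholder.
    """
    return ModelSettings(
        model_name=env.get("MODEL_NAME", "example-org/example-model"),
        weights_uri=env.get("MODEL_WEIGHTS_URI"),
    )
```

Passing the environment as a mapping keeps the function easy to test without mutating the real process environment.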
Two of the most important fields required to create a model template are the model_creation_parameters_schema and model_request_parameters_schema fields.
The model_creation_parameters_schema field defines the schema for parameters that can be injected at model creation time. For example, if the schema contains a model_weights_uri field that expects a string, a user creating a model from this template can provide the URI of a model weights file, which replaces the weights the model would otherwise use.
The model_request_parameters_schema field defines the schema for parameters that an end user can inject at model execution time. For example, if the schema contains a model_request_parameters field that expects a string, a user executing the model can provide a string that is passed to the model for inference.
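To make the two schemas concrete, the sketch below models each one as a plain mapping from field name to expected Python type and checks supplied parameters against it. This representation and the `validate_parameters` helper are assumptions for illustration only; the format the API actually expects for the schema fields is not shown in this section.

```python
from typing import Any, Dict

# Hypothetical, simplified schema representation: field name -> expected type.
creation_schema: Dict[str, type] = {"model_weights_uri": str}
request_schema: Dict[str, type] = {"model_request_parameters": str}


def validate_parameters(schema: Dict[str, type], params: Dict[str, Any]) -> None:
    """Raise ValueError if params has unknown fields or wrongly typed values."""
    for name, value in params.items():
        if name not in schema:
            raise ValueError(f"unknown parameter: {name}")
        if not isinstance(value, schema[name]):
            raise ValueError(f"{name} must be of type {schema[name].__name__}")


# Parameters supplied at model creation time.
validate_parameters(creation_schema, {"model_weights_uri": "s3://example-bucket/weights.bin"})
# Parameters supplied by an end user at model execution time.
validate_parameters(request_schema, {"model_request_parameters": "some input text"})
```

The same helper serves both stages because the only difference between them is which schema applies and who supplies the values.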
Authorizations
Headers
Body
The ID of the account that owns the given entity.
An enum representing the supported model endpoint types.
Attributes: SYNC: Denotes that the model endpoint type is sync. ASYNC: Denotes that the model endpoint type is async. STREAMING: Denotes that the model endpoint type is streaming. BATCH: Denotes that the model endpoint type is batch.
Allowed values: SYNC, ASYNC, STREAMING, BATCH
An enum representing the supported model types.
Attributes: COMPLETION: Denotes that the model type is completion. CHAT_COMPLETION: Denotes that the model type is chat completion. AGENT: Denotes that the model type is agent. EMBEDDING: Denotes that the model type is embedding. RERANKING: Denotes that the model type is reranking. GENERIC: Denotes that the model type is generic.
Allowed values: COMPLETION, CHAT_COMPLETION, AGENT, EMBEDDING, RERANKING, GENERIC
Configuration for launching a model using the Launch service, an internal, self-hosted service developed by Scale that deploys models on Kubernetes.
Attributes: vendor: The vendor of the model template. bundle_config: The bundle configuration of the model template. endpoint_config: The endpoint configuration of the model template.
The name of the calling convention expected by the Launch model endpoint.
Allowed values: SGP, COHERE, VLLM
The field names and types of the parameter fields that may be specified during model creation.
The field names and types of the parameter fields that may be specified in a model execution API's model_request_parameters field.
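As a sketch of how a client might assemble a request body from the fields described above, the example below checks the enum-valued fields before serializing. Only the enum values and the two schema field names come from this page; the keys `account_id`, `endpoint_type`, and `model_type`, the schema value format, and the `build_body` helper are assumptions, and the vendor configuration is omitted for brevity.

```python
import json
from enum import Enum
from typing import Any, Dict


class ModelEndpointType(str, Enum):
    SYNC = "SYNC"
    ASYNC = "ASYNC"
    STREAMING = "STREAMING"
    BATCH = "BATCH"


class ModelType(str, Enum):
    COMPLETION = "COMPLETION"
    CHAT_COMPLETION = "CHAT_COMPLETION"
    AGENT = "AGENT"
    EMBEDDING = "EMBEDDING"
    RERANKING = "RERANKING"
    GENERIC = "GENERIC"


def build_body(
    account_id: str,
    endpoint_type: str,
    model_type: str,
    creation_schema: Dict[str, Any],
    request_schema: Dict[str, Any],
) -> Dict[str, Any]:
    """Build a request body; raises ValueError for an unsupported enum value."""
    return {
        # The three keys below are hypothetical names for the described fields.
        "account_id": account_id,
        "endpoint_type": ModelEndpointType(endpoint_type).value,
        "model_type": ModelType(model_type).value,
        # These two field names appear in the Details section.
        "model_creation_parameters_schema": creation_schema,
        "model_request_parameters_schema": request_schema,
    }


body = build_body(
    "example-account-id",
    "SYNC",
    "EMBEDDING",
    {"model_weights_uri": "string"},  # hypothetical schema format
    {"model_request_parameters": "string"},  # hypothetical schema format
)
payload = json.dumps(body)
```

Validating the enum values locally surfaces a typo such as an unsupported endpoint type before any request is sent.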
Response
The ID of the account that owns the given entity.
The date and time when the entity was created in ISO format.
The user who originally created the entity.
An enum representing the supported model endpoint types.
Attributes: SYNC: Denotes that the model endpoint type is sync. ASYNC: Denotes that the model endpoint type is async. STREAMING: Denotes that the model endpoint type is streaming. BATCH: Denotes that the model endpoint type is batch.
Allowed values: SYNC, ASYNC, STREAMING, BATCH
The unique identifier of the entity.
An enum representing the supported model types.
Attributes: COMPLETION: Denotes that the model type is completion. CHAT_COMPLETION: Denotes that the model type is chat completion. AGENT: Denotes that the model type is agent. EMBEDDING: Denotes that the model type is embedding. RERANKING: Denotes that the model type is reranking. GENERIC: Denotes that the model type is generic.
Allowed values: COMPLETION, CHAT_COMPLETION, AGENT, EMBEDDING, RERANKING, GENERIC
Configuration for launching a model using the Launch service, an internal, self-hosted service developed by Scale that deploys models on Kubernetes.
Attributes: vendor: The vendor of the model template. bundle_config: The bundle configuration of the model template. endpoint_config: The endpoint configuration of the model template.
The name of the calling convention expected by the Launch model endpoint.
Allowed values: SGP, COHERE, VLLM
The field names and types of the parameter fields that may be specified during model creation.
The field names and types of the parameter fields that may be specified in a model execution API's model_request_parameters field.