# Create Model Template

### Description
Creates a model template.

### Details

Model templates serve two purposes. First, they provide common scaffolding that is static across multiple models. Second, they expose several variables that can be injected at model creation time to customize the model.

For example, a model template can define a Docker image that contains code to run a HuggingFace or SentenceTransformers model. The code in that image also accepts environment variables that can be set to swap out the model weights or model name.
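As a rough illustration of the paragraph above, the sketch below builds the `vendor_configuration` part of a request body, with a `bundle_config` whose `env` map carries values the container code could read. The field names (`vendor`, `registry`, `image`, `tag`, `env`) come from the `LaunchVendorConfiguration` and `ModelBundleConfiguration` schemas in the OpenAPI section. The helper `build_vendor_configuration`, the registry, image, and tag values, and the `MODEL_NAME` variable are hypothetical placeholders; which variables an image actually reads depends on that image.

```python
# Sketch only: field names follow ModelBundleConfiguration; all values are placeholders.
REQUIRED_BUNDLE_FIELDS = ("registry", "image", "tag")


def build_vendor_configuration(registry, image, tag, env=None):
    """Return a vendor_configuration dict containing a bundle_config."""
    bundle_config = {"registry": registry, "image": image, "tag": tag}
    if env:
        # The schema types env as a string-to-string map, so coerce to str.
        bundle_config["env"] = {str(k): str(v) for k, v in env.items()}
    missing = [f for f in REQUIRED_BUNDLE_FIELDS if not bundle_config.get(f)]
    if missing:
        raise ValueError(f"bundle_config is missing required fields: {missing}")
    return {"vendor": "LAUNCH", "bundle_config": bundle_config}


vendor_configuration = build_vendor_configuration(
    registry="registry.example.com",           # hypothetical registry
    image="sentence-transformers-runner",      # hypothetical image name
    tag="latest",
    env={"MODEL_NAME": "my-embedding-model"},  # hypothetical variable read by the image
)
print(vendor_configuration)
```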

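Two of the most important fields to set when creating a model template are `model_creation_parameters_schema` and `model_request_parameters_schema`. Neither is listed as required in the request schema, but together they define how models built from the template can be customized.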

The `model_creation_parameters_schema` field defines the schema for parameters that can be injected at model creation time. For example, suppose the schema contains a `model_weights_uri` field that expects a string. A user who creates a model from this template can then provide the URI of a model weights file, which is used to swap out the weights the model loads.
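A minimal sketch of what such a schema could look like inside a request body follows. The shape (`parameters` as a list of entries with `name`, `type`, `description`, and `required`, and the four type names) is taken from the `ParameterSchema` and `ParameterSchemaField` definitions in the OpenAPI section. The helper `parameter_field`, the description text, and every placeholder value are assumptions made for illustration.

```python
import json

# Type names permitted by ParameterSchemaField.type in the spec.
ALLOWED_TYPES = {"str", "int", "float", "bool"}


def parameter_field(name, type_, description, required):
    """Build one ParameterSchemaField entry; the schema requires all four keys."""
    if type_ not in ALLOWED_TYPES:
        raise ValueError(f"unsupported parameter type: {type_!r}")
    return {
        "name": name,
        "type": type_,
        "description": description,
        "required": required,
    }


model_creation_parameters_schema = {
    "parameters": [
        parameter_field(
            "model_weights_uri",
            "str",
            "URI of the model weights file to load",  # illustrative wording
            True,
        ),
    ]
}

# Placeholder values for the required request fields, as in the SDK sample.
request_body = {
    "name": "name",
    "account_id": "account_id",
    "endpoint_type": "SYNC",
    "model_type": "COMPLETION",
    "vendor_configuration": {
        "bundle_config": {"registry": "registry", "image": "image", "tag": "tag"}
    },
    "model_creation_parameters_schema": model_creation_parameters_schema,
}
print(json.dumps(request_body, indent=2))
```

The resulting dictionary mirrors the JSON body that `POST /v4/model-templates` accepts according to the `ModelTemplateRequest` schema.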

The `model_request_parameters_schema` field defines the schema for parameters that an end user can inject at model execution time. For example, if the schema contains a `model_request_parameters` field that expects a string, the user can provide a string when executing the model, and that string is passed to the model for inference.
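To make the role of this schema concrete, here is a hedged client-side sketch that checks user-supplied execution parameters against a `ParameterSchema`. It is not the platform's own validation logic. The function `validate_parameters`, the mapping from schema type names to Python types, and the example fields `prompt_prefix` and `temperature` are all assumptions made for illustration.

```python
# Assumed correspondence between the schema's type names and Python types.
# Integers are accepted where a float is expected.
PYTHON_TYPES = {"str": str, "int": int, "float": (int, float), "bool": bool}


def validate_parameters(schema, values):
    """Return a list of problems; an empty list means the values fit the schema."""
    problems = []
    fields = {field["name"]: field for field in schema["parameters"]}
    for name, field in fields.items():
        if name not in values:
            if field["required"]:
                problems.append(f"missing required parameter: {name}")
            continue
        expected = PYTHON_TYPES[field["type"]]
        value = values[name]
        # bool is a subclass of int in Python, so reject it for non-bool fields.
        if isinstance(value, bool) and expected is not bool:
            problems.append(f"{name} should be {field['type']}")
        elif not isinstance(value, expected):
            problems.append(f"{name} should be {field['type']}")
    for name in values:
        if name not in fields:
            problems.append(f"unknown parameter: {name}")
    return problems


# Hypothetical schema with one required and one optional field.
schema = {
    "parameters": [
        {"name": "prompt_prefix", "type": "str",
         "description": "Text prepended to the prompt", "required": True},
        {"name": "temperature", "type": "float",
         "description": "Sampling temperature", "required": False},
    ]
}

print(validate_parameters(schema, {"prompt_prefix": "Answer briefly:", "temperature": 0.2}))
print(validate_parameters(schema, {"temperature": "hot", "top_k": 5}))
```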



## OpenAPI

````yaml https://app.stainlessapi.com/api/spec/documented/sgp/openapi.yml post /v4/model-templates
openapi: 3.1.0
info:
  title: EGP API V4
  description: >-
    This is the parent API for all EGP APIs. If you are looking for the EGP API,
    please go to https://api.egp.scale.com/docs.
  contact:
    name: Scale Generative AI Platform
    url: https://scale.com/genai-platform
  version: 0.1.0
servers:
  - url: https://api.egp.scale.com
security: []
tags:
  - name: Models
    description: Model API.
paths:
  /v4/model-templates:
    post:
      tags:
        - Model Templates V3 (Beta)
      summary: Create Model Template
      description: >-
        ### Description

        Creates a model template.


        ### Details


        Model templates serve two purposes. First, they provide common
        scaffolding that is static across multiple models. Second, they expose
        several variables that can be injected at model creation time to
        customize the model.


        For example, a model template can define a Docker image that contains
        code to run a HuggingFace or SentenceTransformers model. The code in
        that image also accepts environment variables that can be set to swap
        out the model weights or model name.


        Two of the most important fields to set when creating a model template
        are `model_creation_parameters_schema` and
        `model_request_parameters_schema`. Neither is listed as required in the
        request schema, but together they define how models built from the
        template can be customized.


        The `model_creation_parameters_schema` field defines the schema for
        parameters that can be injected at model creation time. For example,
        suppose the schema contains a `model_weights_uri` field that expects a
        string. A user who creates a model from this template can then provide
        the URI of a model weights file, which is used to swap out the weights
        the model loads.


        The `model_request_parameters_schema` field defines the schema for
        parameters that an end user can inject at model execution time. For
        example, if the schema contains a `model_request_parameters` field that
        expects a string, the user can provide a string when executing the
        model, and that string is passed to the model for inference.
      operationId: POST-V4-/model-templates
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ModelTemplateRequest'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ModelTemplateResponse'
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      x-codeSamples:
        - lang: Python
          source: |-
            import os
            from scale_gp import SGPClient

            client = SGPClient(
                api_key=os.environ.get("SGP_API_KEY"),  # This is the default and can be omitted
            )
            model_template = client.model_templates.create(
                account_id="account_id",
                endpoint_type="SYNC",
                model_type="COMPLETION",
                name="name",
                vendor_configuration={
                    "bundle_config": {
                        "image": "image",
                        "registry": "registry",
                        "tag": "tag",
                    }
                },
            )
            print(model_template.id)
        - lang: Go
          source: "package main\n\nimport (\n\t\"context\"\n\t\"fmt\"\n\n\t\"github.com/stainless-sdks/sgp-go\"\n\t\"github.com/stainless-sdks/sgp-go/option\"\n)\n\nfunc main() {\n\tclient := sgp.NewClient(\n\t\toption.WithAPIKey(\"My API Key\"),\n\t)\n\tmodelTemplate, err := client.ModelTemplates.New(context.TODO(), sgp.ModelTemplateNewParams{\n\t\tAccountID:    sgp.F(\"account_id\"),\n\t\tEndpointType: sgp.F(sgp.ModelTemplateNewParamsEndpointTypeSync),\n\t\tModelType:    sgp.F(sgp.ModelTemplateNewParamsModelTypeCompletion),\n\t\tName:         sgp.F(\"name\"),\n\t\tVendorConfiguration: sgp.F(sgp.ModelTemplateNewParamsVendorConfiguration{\n\t\t\tBundleConfig: sgp.F(sgp.ModelTemplateNewParamsVendorConfigurationBundleConfig{\n\t\t\t\tImage:    sgp.F(\"image\"),\n\t\t\t\tRegistry: sgp.F(\"registry\"),\n\t\t\t\tTag:      sgp.F(\"tag\"),\n\t\t\t}),\n\t\t}),\n\t})\n\tif err != nil {\n\t\tpanic(err.Error())\n\t}\n\tfmt.Printf(\"%+v\\n\", modelTemplate.ID)\n}\n"
components:
  schemas:
    ModelTemplateRequest:
      properties:
        name:
          type: string
          title: Name
        endpoint_type:
          $ref: '#/components/schemas/ModelEndpointType'
        model_type:
          $ref: '#/components/schemas/ModelType'
        vendor_configuration:
          $ref: '#/components/schemas/LaunchVendorConfiguration'
        model_creation_parameters_schema:
          $ref: '#/components/schemas/ParameterSchema'
          description: >-
            The field names and types of available parameter fields which may be
            specified during model creation
        model_request_parameters_schema:
          $ref: '#/components/schemas/ParameterSchema'
          description: >-
            The field names and types of available parameter fields which may be
            specified in a model execution API's `model_request_parameters`
            field.
        endpoint_protocol:
          $ref: '#/components/schemas/LaunchEndpointProtocol'
          description: >-
            The name of the calling convention expected by the Launch model
            endpoint
        account_id:
          type: string
          title: Account Id
          description: The ID of the account that owns the given entity.
          can_patch: false
      type: object
      required:
        - name
        - endpoint_type
        - model_type
        - vendor_configuration
        - account_id
      title: ModelTemplateRequest
    ModelTemplateResponse:
      properties:
        name:
          type: string
          title: Name
        endpoint_type:
          $ref: '#/components/schemas/ModelEndpointType'
        model_type:
          $ref: '#/components/schemas/ModelType'
        vendor_configuration:
          $ref: '#/components/schemas/LaunchVendorConfiguration'
        model_creation_parameters_schema:
          $ref: '#/components/schemas/ParameterSchema'
          description: >-
            The field names and types of available parameter fields which may be
            specified during model creation
        model_request_parameters_schema:
          $ref: '#/components/schemas/ParameterSchema'
          description: >-
            The field names and types of available parameter fields which may be
            specified in a model execution API's `model_request_parameters`
            field.
        endpoint_protocol:
          $ref: '#/components/schemas/LaunchEndpointProtocol'
          description: >-
            The name of the calling convention expected by the Launch model
            endpoint
        id:
          type: string
          title: Id
          description: The unique identifier of the entity.
        created_at:
          type: string
          format: date-time
          title: Created At
          description: The date and time when the entity was created in ISO format.
        account_id:
          type: string
          title: Account Id
          description: The ID of the account that owns the given entity.
          can_patch: false
        created_by_user_id:
          type: string
          title: Created By User Id
          description: The user who originally created the entity.
        created_by_identity_type:
          $ref: '#/components/schemas/IdentifierTypeEnum'
          description: The type of identity that created the entity.
      type: object
      required:
        - name
        - endpoint_type
        - model_type
        - vendor_configuration
        - id
        - created_at
        - account_id
        - created_by_user_id
        - created_by_identity_type
      title: ModelTemplateResponse
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    ModelEndpointType:
      type: string
      enum:
        - SYNC
        - ASYNC
        - STREAMING
        - BATCH
      title: ModelEndpointType
      description: >-
        An enum representing the different types of model endpoint types
        supported.


        Attributes:
            SYNC: Denotes that the model endpoint type is sync.
            ASYNC: Denotes that the model endpoint type is async.
            STREAMING: Denotes that the model endpoint type is streaming.
            BATCH: Denotes that the model endpoint type is batch.
    ModelType:
      type: string
      enum:
        - COMPLETION
        - CHAT_COMPLETION
        - AGENT
        - EMBEDDING
        - RERANKING
        - GENERIC
        - BUNDLE
      title: ModelType
      description: |-
        An enum representing the different types of models supported.

        Attributes:
            COMPLETION: Denotes that the model type is completion.
            CHAT_COMPLETION: Denotes that the model type is chat completion.
            AGENT: Denotes that the model type is agent.
            EMBEDDING: Denotes that the model type is embedding.
            RERANKING: Denotes that the model type is reranking.
            GENERIC: Denotes that the model type is generic.
            BUNDLE: Not to be used directly - type to surface a model bundle in the UI - TODO: Explicitly type a bundle
    LaunchVendorConfiguration:
      properties:
        vendor:
          type: string
          const: LAUNCH
          title: Vendor
          default: LAUNCH
        bundle_config:
          $ref: '#/components/schemas/ModelBundleConfiguration'
        endpoint_config:
          $ref: '#/components/schemas/CreateModelEndpointConfig'
        fine_tuning_job_bundle_config:
          $ref: '#/components/schemas/FineTuningBundleConfiguration'
      type: object
      required:
        - bundle_config
      title: LaunchVendorConfiguration
      description: >-
        Configuration for launching a model using the Launch service, which is
        an internal and self-hosted service developed by Scale that deploys
        models on Kubernetes.


        Attributes:
            vendor: The vendor of the model template
            bundle_config: The bundle configuration of the model template
            endpoint_config: The endpoint configuration of the model template
    ParameterSchema:
      properties:
        parameters:
          items:
            $ref: '#/components/schemas/ParameterSchemaField'
          type: array
          title: Parameters
      type: object
      required:
        - parameters
      title: ParameterSchema
    LaunchEndpointProtocol:
      type: string
      enum:
        - SGP
        - COHERE
        - VLLM
      title: LaunchEndpointProtocol
    IdentifierTypeEnum:
      type: string
      enum:
        - user
        - service_account
      title: IdentifierTypeEnum
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
        input:
          title: Input
        ctx:
          type: object
          title: Context
          additionalProperties: true
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
    ModelBundleConfiguration:
      properties:
        registry:
          type: string
          title: Registry
        image:
          type: string
          title: Image
        tag:
          type: string
          title: Tag
        command:
          items:
            type: string
          type: array
          title: Command
        env:
          additionalProperties:
            type: string
          type: object
          title: Env
        streaming_command:
          title: Streaming Command
          items:
            type: string
          type: array
        readiness_initial_delay_seconds:
          type: integer
          title: Readiness Initial Delay Seconds
          default: 120
        healthcheck_route:
          type: string
          title: Healthcheck Route
          default: /readyz
        predict_route:
          type: string
          title: Predict Route
          default: /predict
        streaming_predict_route:
          title: Streaming Predict Route
          default: /generate_streaming
          type: string
      type: object
      required:
        - registry
        - image
        - tag
      title: ModelBundleConfiguration
    CreateModelEndpointConfig:
      properties:
        cpus:
          type: integer
          title: Cpus
          default: 3
        memory:
          type: string
          title: Memory
          default: 8Gi
        storage:
          type: string
          title: Storage
          default: 16Gi
        gpus:
          type: integer
          title: Gpus
          default: 0
        gpu_type:
          $ref: '#/components/schemas/GPUType'
        min_workers:
          type: integer
          title: Min Workers
          default: 0
        max_workers:
          type: integer
          title: Max Workers
          default: 1
        per_worker:
          type: integer
          title: Per Worker
          description: >-
            The maximum number of concurrent requests that an individual worker
            can service. Launch automatically scales the number of workers for
            the endpoint so that each worker is processing ``per_worker``
            requests, subject to the limits defined by ``min_workers`` and
            ``max_workers``.


            - If the average number of concurrent requests per worker is lower
            than ``per_worker``, then the number of workers will be reduced.

            - Otherwise, if the average number of concurrent requests per worker
            is higher than ``per_worker``, then the number of workers will be
            increased to meet the elevated traffic.


            Here is our recommendation for computing ``per_worker``:


            1. Compute ``min_workers`` and ``max_workers`` per your minimum and
            maximum throughput requirements.

            2. Determine a value for the maximum number of concurrent requests
            in the workload. Divide this number by ``max_workers``. Doing this
            ensures that the number of workers will "climb" to ``max_workers``.
          default: 10
        endpoint_type:
          $ref: '#/components/schemas/ModelEndpointType'
          default: ASYNC
        high_priority:
          title: High Priority
          default: false
          type: boolean
      type: object
      title: CreateModelEndpointConfig
    FineTuningBundleConfiguration:
      properties:
        registry:
          type: string
          title: Registry
        image:
          type: string
          title: Image
        tag:
          type: string
          title: Tag
        command:
          items:
            type: string
          type: array
          title: Command
        env:
          additionalProperties:
            type: string
          type: object
          title: Env
        mount_location:
          title: Mount Location
          description: >-
            The filesystem location where the fine tuning job's configuration
            will be available when it is started.
          default: /workspace/launch_specific/config.json
          type: string
        training_dataset_schema_type:
          $ref: '#/components/schemas/TrainingDatasetORMSchemaTypeEnum'
          description: Optionally set required training and validation dataset schema
        resources:
          $ref: '#/components/schemas/RequiredResources'
      type: object
      required:
        - registry
        - image
        - tag
      title: FineTuningBundleConfiguration
    ParameterSchemaField:
      properties:
        name:
          type: string
          title: Name
        type:
          anyOf:
            - type: string
              const: str
            - type: string
              const: int
            - type: string
              const: float
            - type: string
              const: bool
          title: Type
        description:
          title: Description
          type: string
        required:
          title: Required
          type: boolean
      type: object
      required:
        - name
        - type
        - description
        - required
      title: ParameterSchemaField
    GPUType:
      type: string
      enum:
        - nvidia-tesla-t4
        - nvidia-ampere-a10
        - nvidia-ampere-a100
        - nvidia-ampere-a100e
        - nvidia-hopper-h100
        - nvidia-hopper-h100-1g20gb
        - nvidia-hopper-h100-3g40gb
      title: GPUType
    TrainingDatasetORMSchemaTypeEnum:
      type: string
      enum:
        - GENERATION
        - RERANKING_QUESTIONS
      title: TrainingDatasetORMSchemaTypeEnum
    RequiredResources:
      properties:
        cpus:
          type: integer
          title: Cpus
          default: 3
        memory:
          type: string
          title: Memory
          default: 8Gi
        storage:
          type: string
          title: Storage
          default: 16Gi
        gpus:
          type: integer
          title: Gpus
          default: 0
        gpu_type:
          $ref: '#/components/schemas/GPUType'
      type: object
      title: RequiredResources

````
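
The `per_worker` recommendation in `CreateModelEndpointConfig` above amounts to a short calculation, sketched here under stated assumptions. The spec only says to divide the maximum number of concurrent requests by `max_workers`; rounding up and the floor of 1 are choices made for this example. The second function, `target_workers`, is an illustrative reading of the scaling description and not Launch's actual autoscaling algorithm. Both function names are hypothetical.

```python
import math


def recommended_per_worker(max_concurrent_requests, max_workers):
    """Divide peak concurrency by max_workers, rounding up (rounding is an assumption)."""
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    return max(1, math.ceil(max_concurrent_requests / max_workers))


def target_workers(concurrent_requests, per_worker, min_workers, max_workers):
    """Illustrative only: workers needed so each handles about per_worker requests,
    clamped to the [min_workers, max_workers] range."""
    wanted = math.ceil(concurrent_requests / per_worker)
    return max(min_workers, min(max_workers, wanted))


per_worker = recommended_per_worker(max_concurrent_requests=40, max_workers=4)
print(per_worker)                            # 10
print(target_workers(40, per_worker, 0, 4))  # 4: peak traffic reaches max_workers
print(target_workers(5, per_worker, 0, 4))   # 1: light traffic needs one worker
```

With a peak of 40 concurrent requests and `max_workers` of 4, the result matches the schema's default `per_worker` of 10.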