import SGPClient from 'sgp';

const client = new SGPClient({
  apiKey: process.env['SGP_API_KEY'], // This is the default and can be omitted
});

const modelDeployment = await client.models.deployments.create('model_instance_id', { name: 'name' });
console.log(modelDeployment.id);
{
  "name": "<string>",
  "model_creation_parameters": {},
  "model_endpoint_id": "<string>",
  "model_instance_id": "<string>",
  "vendor_configuration": {
    "min_workers": 0,
    "max_workers": 1,
    "per_worker": 10,
    "vendor": "LAUNCH"
  },
  "deployment_metadata": {},
  "id": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "account_id": "<string>",
  "created_by_user_id": "<string>",
  "status": "<string>"
}
Model Deployments are unique endpoints created for custom models in the Scale GenAI Platform. They let users interact with specific model instances through the API or SDK. Each deployment is associated with a model instance, which contains the model template and model metadata. The model template describes the creation parameters that are configured on the deployment. Deployments provide a way to call models for inference, log calls, and monitor usage.
Built-in models also have deployments, so that all models share a consistent interface. These do not represent real deployments; they are only a way to interact with the built-in models. They are created automatically when the model is created, and they are immutable.
This endpoint deploys a model instance. The request payload schema depends on the model_request_parameters_schema of the Model Template from which the model was created.
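Before calling the endpoint, it can help to check the two inputs used in the example call on this page. The following is a minimal sketch, assuming only the call shape shown in that example (a model instance ID plus a body containing name). CreateDeploymentArgs and buildCreateArgs are hypothetical names for illustration and are not part of the sgp SDK.

```typescript
// Hypothetical helper, not part of the sgp SDK.
// It only prepares the two arguments used in the example call:
//   client.models.deployments.create(modelInstanceId, { name })
interface CreateDeploymentArgs {
  modelInstanceId: string;
  body: { name: string };
}

function buildCreateArgs(modelInstanceId: string, name: string): CreateDeploymentArgs {
  // Trim surrounding whitespace so accidental padding is not sent to the API.
  const id = modelInstanceId.trim();
  const deploymentName = name.trim();

  // Fail early on empty inputs instead of sending a request that cannot succeed.
  if (id.length === 0) {
    throw new Error('modelInstanceId must not be empty');
  }
  if (deploymentName.length === 0) {
    throw new Error('name must not be empty');
  }

  return { modelInstanceId: id, body: { name: deploymentName } };
}
```

The result would then be passed to the SDK call from the example, for instance client.models.deployments.create(args.modelInstanceId, args.body). Any additional body fields required by a particular Model Template are not covered by this sketch.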
Successful Response
The response is of type object.
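When handling the response as untyped JSON, a runtime check can guard against unexpected shapes. The sketch below infers its field names from the sample response on this page; which fields are always present is an assumption, and ModelDeploymentResponse and isModelDeploymentResponse are hypothetical names, not SDK exports.

```typescript
// Field names are taken from the sample response on this page.
// Treating these five string fields as required is an assumption.
interface VendorConfiguration {
  min_workers: number;
  max_workers: number;
  per_worker: number;
  vendor: string;
}

interface ModelDeploymentResponse {
  id: string;
  name: string;
  model_instance_id: string;
  status: string;
  created_at: string;
  vendor_configuration?: VendorConfiguration;
}

const REQUIRED_STRING_FIELDS = ['id', 'name', 'model_instance_id', 'status', 'created_at'];

// Type guard: true only if every assumed-required field is present and is a string.
function isModelDeploymentResponse(value: unknown): value is ModelDeploymentResponse {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return REQUIRED_STRING_FIELDS.every((key) => typeof record[key] === 'string');
}
```

A caller could run this check on parsed JSON before reading fields such as id or status, and fall back to error handling when it returns false.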