import SGPClient from 'sgp';

const client = new SGPClient({
  apiKey: 'My API Key',
});

const modelDeployment = await client.models.deployments.retrieve('model_instance_id', 'deployment_id');
console.log(modelDeployment.id);

{
  "name": "<string>",
  "id": "<string>",
  "created_at": "2023-11-07T05:31:56Z",
  "account_id": "<string>",
  "created_by_user_id": "<string>",
  "created_by_identity_type": "user",
  "status": "<string>",
  "model_creation_parameters": {},
  "model_endpoint_id": "<string>",
  "model_instance_id": "<string>",
  "vendor_configuration": {
    "min_workers": 0,
    "max_workers": 1,
    "per_worker": 10,
    "vendor": "LAUNCH"
  },
  "deployment_metadata": {}
}

Gets the details of a deployment
Use this API to retrieve information about a single deployment by ID. Pass the id that was returned from your Create Deployment API call as a path parameter.
Review the response schema to see the fields that will be returned.
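As a reading aid, the response shape can be sketched as a TypeScript interface. This is inferred only from the example response on this page and is not an official type definition. The names ModelDeploymentShape, VendorConfiguration, and looksLikeModelDeployment are hypothetical, and whether any field is optional or nullable is an assumption.

```typescript
// Shape inferred from the example response; not the SDK's own type.
interface VendorConfiguration {
  min_workers: number;
  max_workers: number;
  per_worker: number;
  vendor: string; // "LAUNCH" in the example response
}

interface ModelDeploymentShape {
  name: string;
  id: string;
  created_at: string; // ISO-format timestamp, e.g. "2023-11-07T05:31:56Z"
  account_id: string;
  created_by_user_id: string;
  created_by_identity_type: "user" | "service_account";
  status: string;
  model_creation_parameters: Record<string, unknown>;
  model_endpoint_id: string;
  model_instance_id: string;
  vendor_configuration: VendorConfiguration;
  deployment_metadata: Record<string, unknown>;
}

// Hypothetical helper: a shallow check that only verifies a handful of
// string fields shown in the example response. It does not validate
// nested objects or the full schema.
function looksLikeModelDeployment(value: unknown): value is ModelDeploymentShape {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  const stringFields = ["name", "id", "created_at", "account_id", "status", "model_instance_id"];
  return stringFields.every((field) => typeof record[field] === "string");
}
```

A guard like this can be applied to a parsed response body before reading fields such as id or status, which is mainly useful when calling the endpoint without the SDK.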
Successful Response
id: The unique identifier of the entity.
created_at: The date and time when the entity was created, in ISO format.
account_id: The ID of the account that owns the given entity.
created_by_user_id: The user who originally created the entity.
created_by_identity_type: The type of identity that created the entity. Available options: user, service_account.
status: Status of the model's deployment.
vendor_configuration.per_worker: The maximum number of concurrent requests that an individual worker can service. Launch automatically scales the number of workers for the endpoint so that each worker is processing per_worker requests, subject to the limits defined by min_workers and max_workers.

- If the average number of concurrent requests per worker is lower than per_worker, then the number of workers will be reduced.
- Otherwise, if the average number of concurrent requests per worker is higher than per_worker, then the number of workers will be increased to meet the elevated traffic.

Here is our recommendation for computing per_worker:

1. Compute min_workers and max_workers per your minimum and maximum throughput requirements.
2. Determine a value for the maximum number of concurrent requests in the workload. Divide this number by max_workers. Doing this ensures that the number of workers will "climb" to max_workers.

vendor_configuration.vendor: "LAUNCH"
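The per_worker recommendation and the scaling rule described above can be sketched in a few lines. This is an illustrative model only, not Launch's actual autoscaler. The function names recommendPerWorker and scalingDirection are hypothetical, and rounding the quotient down (with a floor of 1) is an assumption made here so that peak traffic still pushes the worker count up to max_workers.

```typescript
interface ScalingConfig {
  min_workers: number;
  max_workers: number;
  per_worker: number;
}

type ScalingDirection = "scale_down" | "scale_up" | "hold";

// Recommended per_worker: peak concurrent requests divided by max_workers.
// Assumption: round down, but never below 1.
function recommendPerWorker(maxConcurrentRequests: number, maxWorkers: number): number {
  if (maxWorkers <= 0) {
    throw new RangeError("maxWorkers must be positive");
  }
  if (maxConcurrentRequests < 0) {
    throw new RangeError("maxConcurrentRequests must not be negative");
  }
  return Math.max(1, Math.floor(maxConcurrentRequests / maxWorkers));
}

// Models the rule in the text: compare the average number of concurrent
// requests per worker with per_worker, within [min_workers, max_workers].
function scalingDirection(
  concurrentRequests: number,
  currentWorkers: number,
  config: ScalingConfig,
): ScalingDirection {
  if (currentWorkers === 0) {
    // No workers running: scale up only if there is traffic and capacity is allowed.
    return concurrentRequests > 0 && config.max_workers > 0 ? "scale_up" : "hold";
  }
  const averagePerWorker = concurrentRequests / currentWorkers;
  if (averagePerWorker < config.per_worker && currentWorkers > config.min_workers) {
    return "scale_down";
  }
  if (averagePerWorker > config.per_worker && currentWorkers < config.max_workers) {
    return "scale_up";
  }
  return "hold";
}
```

For example, a workload that peaks at 100 concurrent requests with max_workers set to 10 gives a per_worker of 10 under this sketch.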