Inference
Deploy and Execute a Model
Use this recipe to deploy and execute a gemini-pro completion model.
Follow the instructions in the Quickstart Guide to set up the SGP Client.
from scale_gp import SGPClient
# api_key must already hold your SGP API key (see the Quickstart Guide).
client = SGPClient(api_key=api_key)
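The snippet above assumes `api_key` is already defined. One common way to supply it without hard-coding the secret is to read it from an environment variable. The following is a minimal sketch; the helper `load_api_key` and the variable name `SGP_API_KEY` are hypothetical names chosen for illustration, not part of the SGP client.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    var: str = "SGP_API_KEY",  # hypothetical variable name, chosen for this sketch
) -> str:
    """Return the API key from the environment, failing loudly if it is absent."""
    source = os.environ if env is None else env
    key = source.get(var, "").strip()
    if not key:
        raise RuntimeError(f"Set {var} before creating the client")
    return key
```

The returned value can then be passed as `SGPClient(api_key=load_api_key())`. Accepting an explicit `env` mapping keeps the helper easy to exercise without touching the real process environment.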
Before creating a model, you must first create a model template. A model template serves two purposes. First, it provides common scaffolding that stays the same across multiple models. Second, it exposes variables that can be injected at model creation time to customize each model.
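# Assumed import path for the vendor-configuration types; verify it against your installed scale_gp version.
from scale_gp.types.model_template_create_params import (
    VendorConfiguration,
    VendorConfigurationBundleConfig,
    VendorConfigurationEndpointConfig,
)

# account_id must already hold the ID of your SGP account.
bundle_config = VendorConfigurationBundleConfig(
    image="gemini-pro", registry="aws-registry", tag="latest"
)
endpoint_config = VendorConfigurationEndpointConfig(
    max_workers=3,
)
vendor_configuration = VendorConfiguration(
    bundle_config=bundle_config,
    endpoint_config=endpoint_config,
)
model_template = client.model_templates.create(
    account_id=account_id,
    endpoint_type="SYNC",
    model_type="COMPLETION",
    name="Gemini-Pro Template",
    vendor_configuration=vendor_configuration,
)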
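The two roles of a model template described above (static scaffolding plus variables injected at model creation time) can be pictured with a toy example. This is a conceptual sketch in plain Python and does not reflect how SGP stores or resolves templates; `ToyTemplate`, its fields, and the `$tag` placeholder are all hypothetical.

```python
from dataclasses import dataclass
from string import Template
from typing import Dict


@dataclass(frozen=True)
class ToyTemplate:
    """Hypothetical stand-in for a model template (not the SGP data model)."""

    static: Dict[str, str]      # shared scaffolding, identical for every model
    injectable: Dict[str, str]  # settings with $placeholders filled per model

    def instantiate(self, **values: str) -> Dict[str, str]:
        """Merge the static settings with placeholders resolved from `values`."""
        resolved = {
            key: Template(pattern).substitute(values)  # KeyError if a value is missing
            for key, pattern in self.injectable.items()
        }
        return {**self.static, **resolved}


template = ToyTemplate(
    static={"image": "gemini-pro", "registry": "aws-registry"},
    injectable={"tag": "$tag"},
)
model_a = template.instantiate(tag="latest")
model_b = template.instantiate(tag="v2")
```

Both `model_a` and `model_b` share the same image and registry and differ only in the injected `tag`, which is the reuse pattern a template is meant to enable.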
Use the created model template to create a model instance.
model_instance = client.models.create(
    account_id=account_id,
    model_type="COMPLETION",
    name="gemini-pro",
    model_vendor="GOOGLE",
    model_template_id=model_template.id,
)
Deploy the created model instance.
model_deployment = client.models.deployments.create(
    model_instance_id=model_instance.id,
    name="Gemini-Pro Deployment",
    account_id=account_id,
)
Execute the model deployment. In this case, we are executing a completion model with a list of prompts.
execute_result = client.models.deployments.execute(
    model_deployment_id=model_deployment.id,
    model_instance_id=model_instance.id,
    extra_body={"prompts": ["What is the capital of Canada?"]},
)
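The four steps in this recipe (template, model instance, deployment, execution) can be chained in a single helper. The sketch below uses only the client calls and argument values shown above; the function name `deploy_and_execute` and its signature are hypothetical, and the shape of the returned result is not specified here because it depends on the SGP API.

```python
from typing import Any, Mapping, Sequence


def deploy_and_execute(
    client: Any,
    account_id: str,
    vendor_configuration: Mapping[str, Any],
    prompts: Sequence[str],
) -> Any:
    """Chain the recipe steps: template -> model instance -> deployment -> execute."""
    template = client.model_templates.create(
        account_id=account_id,
        endpoint_type="SYNC",
        model_type="COMPLETION",
        name="Gemini-Pro Template",
        vendor_configuration=vendor_configuration,
    )
    instance = client.models.create(
        account_id=account_id,
        model_type="COMPLETION",
        name="gemini-pro",
        model_vendor="GOOGLE",
        model_template_id=template.id,
    )
    deployment = client.models.deployments.create(
        model_instance_id=instance.id,
        name="Gemini-Pro Deployment",
        account_id=account_id,
    )
    # The prompts are sent as a list in the request body, as in the recipe.
    return client.models.deployments.execute(
        model_deployment_id=deployment.id,
        model_instance_id=instance.id,
        extra_body={"prompts": list(prompts)},
    )
```

Taking the client as a parameter makes the ID hand-offs between steps explicit and lets the flow be exercised against a stub client before running it against a real account.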