Inference Overview
An overview of Scale Generative Platform’s Inference Capabilities
This guide uses SGP’s V5 APIs, which are still in development.
SGP supports an OpenAI-compatible interface that allows you to use the same API across model providers and self-hosted models.
For instance, we can use OpenAI’s client to do inference with Anthropic’s claude-3.5-sonnet.
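As a sketch of what such a request looks like on the wire, here is a minimal chat-completions call using only the Python standard library. The base URL, endpoint path, and auth header are placeholder assumptions, not SGP’s documented values — substitute the ones from your SGP account. The same request can be made with OpenAI’s client by pointing its `base_url` at SGP.

```python
import json
import urllib.request

# Assumptions: the base URL and auth scheme below are placeholders for
# illustration; check your SGP account settings for the real values.
SGP_BASE_URL = "https://api.example.com/v5"  # hypothetical endpoint
API_KEY = "YOUR_SGP_API_KEY"

# An OpenAI-style chat-completions payload. Note the "provider/model"
# naming convention for the model field.
payload = {
    "model": "anthropic/claude-3-5-sonnet-20240620",
    "messages": [{"role": "user", "content": "Hello!"}],
}

def send(payload: dict) -> dict:
    """POST the payload to the (assumed) chat-completions endpoint."""
    req = urllib.request.Request(
        f"{SGP_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # auth scheme is an assumption
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Requires a valid API key and network access to actually run.
    print(send(payload)["choices"][0]["message"]["content"])
```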
You can also use our SDK, scale-gp.
Specifying which Model to do Inference On
As you might have noticed in the example, model names must be specified in a particular format: instead of just claude-3.5-sonnet, it’s anthropic/claude-3-5-sonnet-20240620.
We need to do this because we have many providers, some of which provide the same model. To distinguish them, each name has two parts:
- The first part is the provider: anthropic, openai, llmengine, etc.
- The second part is the model name. You’ll sometimes see a different model name than you would if you used the provider directly. This is because the versions we use are pinned to a specific point in time (meaning the model stays the same over time).
The two parts are joined together by a slash to get the overall name: openai/gpt-4o or anthropic/claude-3-5-sonnet-20240620.
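The joining rule above can be expressed as a one-line helper; `sgp_model_name` is a hypothetical function name for illustration, not part of any SDK.

```python
def sgp_model_name(provider: str, model: str) -> str:
    """Join a provider prefix and a model name with a slash."""
    return f"{provider}/{model}"

# The two examples from the text:
print(sgp_model_name("openai", "gpt-4o"))  # openai/gpt-4o
print(sgp_model_name("anthropic", "claude-3-5-sonnet-20240620"))  # anthropic/claude-3-5-sonnet-20240620
```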
For a complete list of model names for models not hosted on SGP (e.g., OpenAI and Anthropic models), see the documentation for the open-source project we use to route requests to the correct provider: https://docs.litellm.ai/docs/providers
For models hosted on SGP, you can visit https://llm-engine.scale.com/model_zoo/ to get the model names. Note that for these models, the provider is llmengine.