When you create an application variant, you can select form one of 4 templates.

  • Empty Canvas
  • Simple Completion
  • Simple RAG
  • Public Endpoint

Empty Canvas

This is a blank canvas, or the equivalent of creating an application from scratch. If you already know what your specific application needs are and have experience with creating LLM applications, this might be the best place to start.

Simple Completion

This is a simple single term completion model. It consists of a text input which will serve as your user prompt. The input is connected to a completion model node and the output of the model will be returned as an output node.

Without any adjustments, this application will be a single turn evaluation where the user input is the prompt and fed to the selected model. The response of the model in return will be passed to the user through the output. If you want to adjust the tone and model behavior, you can utilize this template and adjust the system prompt option under the advanced configuration setting of the completion model node.

Simple RAG

Retrieval augmented generation (RAG) is an architecture for LLM applications that provides the most relevant and contextually important data to a Generative AI Applications’s LLM. In this technique, there is an external database where the model fetches the most contextually relevant results that match the user query during generation time. RAG applications will provide the most up-to-date information about the world and domain specific data to Generative AI applications.

The RAG template inside Scale Generative AI Platform provides the recommended standard application setup to create a RAG application. It includes a knowledge base, a reranker (this step is optional), completion model, and system prompt.

Public Endpoint

If you already have an existing application or model that has an endpoint, you can utilize this template to run the application. This template features an input that will serve as the prompt sent to call the endpoint, a node for the endpoint itself, and a text output. If the endpoint requires authorization, you will have to provide an API key. With this template, you can add functionality to the app that adjusts the tone and model behavior.