New nodes are constantly being created. This document was last updated 5/1/2024

BatchedNode

Type: batched
Operation Type: CUSTOM

Description

Group inputs into batches. This node uses more_itertools.chunked or chunked_even to split an iterable into batches. Here is an example config for a batched node:

- name: batched
  type: batched
  config:
    batch_size: 4
    split_approx_evenly: false
    inputs:
      iterable: input

In this example, when the input is “0123456789”, the output will be: [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]] This allows you to easily batch up the input for processing in parallel.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘batched’]):
- Default: batched
- Required: No
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
batch_size (int):
- Description: The size of the batches to split the input into.
- Required: Yes
split_approx_evenly (bool):
- Description: Whether to split the input into batches of approximately equal size.
- Default: True
- Required: No

Method

The node provides the following method:

Inputs

iterable (Iterable[~T]):
- Required: Yes

Output

output (list):
- Required: Yes

====================================================================================================

BranchDecisionNode

Type: branch_decision_node
Operation Type: CUSTOM

Description

Decides which branch to execute based on conditions provided by the user in a plan config. For example, a plan config might contain a branch config with 3 mutually exclusive branches like this:

- branch:  "add_animal_to_message_by_color"
  conditional_workflows:
  - condition: if
      condition_input_var: "message"
      operator: "contains"
      reference_var: "green"
      workflow_name: "add_frog_to_message"

  - condition: elif
      condition_input_var: "message"
      operator: "contains"
      reference_var: "yellow"
      workflow_name: "add_giraffe_to_message"

  - condition: else
      workflow_name: "add_cow_to_message"

and this node will run with input_kwargs = {"add_frog_to_message": MESSAGE, "add_giraffe_to_message": MESSAGE}, in order to decide which workflow to execute next.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘branch_decision_node’]):
- Default: branch_decision_node
- Required: No
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
ordered_conditional_workflow_cfgs (List[egp_services.plan.workflow_items.ConditionalWorkflowItem]):
- Description: The ordered list of conditional workflows to evaluate.
- Required: Yes
dependent_nodes (List[str]):
- Description: The names of the nodes that this node depends on.
- Required: Yes

References in Configuration

The configuration includes references to the following models:

ConditionalWorkflowItem

Description: Representation of a workflow that is part of a branch complex Attributes: condition: Either if, elif, or else condition_input_var: the variable that is inputted into the unary condition operator: the unary condition operator reference_var: the unary condition reference variable that is used to build a predicate function with the operator condition_tree: a compound logical expression expressed as a tree (see CompoundCondition), optional if the above condition vars are set workflow_nodes: to be set to a list of all nodes in the workflow at runtime, so they can be easily ignored if this branch is not selected to run
Fields:
- workflow_name (str):
  - Description: The name of the abstract workflow
  - Required: Yes
- workflow_alias (Optional[str]):
  - Description: The alias of the abstract workflow in the graph
  - Required: No
- workflow_inputs (Optional[Dict[str, Union[str, Dict[str, Union[str, Dict]]]]]):
  - Description: The inputs to the workflow
  - Required: No
- condition (Literal[‘if’, ‘elif’, ‘else’]):
  - Description: The condition type
  - Required: Yes
- condition_input_var (Optional[str]):
  - Description: The input variable to the unary condition
  - Required: No
- operator (Optional[str]):
  - Description: The operator to use in the unary condition
  - Required: No
- reference_var (Optional[str]):
  - Description: The reference variable to use in the unary condition
  - Required: No
- condition_tree (Optional[egp_services.helpers.CompoundCondition]):
  - Description: The compound condition tree
  - Required: No
- workflow_nodes (List[str]):
  - Description: The nodes that make up the workflow
  - Required: No

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (BranchDecisionNodeOutput):
- Required: Yes

References in Output

The output includes references to the following models:

BranchDecisionNodeOutput

Description: No description available
Fields:
- selected_workflow_name (Optional[str]):
  - Required: Yes
- new_nodes_to_ignore (Set[str]):
  - Required: Yes

====================================================================================================

ChatGenerationNode

Type: chat_generation
Operation Type: COMPLETION

Description

A node that generates chat completions from an LLM.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘chat_generation’]):
- Default: chat_generation
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default: {'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No
memory_strategy (Optional[Dict[str, Any]]):
- Description: The memory strategy to prevent exceeding the LLM’s context limit from being exceeded. Currently SGP only supports the Last K memory strategy (keep the last k messages), but will be adding new strategies soon. (e.g. {'type': 'last_k', 'params: {'k': 1000} })
- Required: No
instructions (Optional[str]):
- Description: Instructions for the system prompt. If not provided, there will be no system message. It can also be provided at runtime.
- Required: No

References in Configuration

The configuration includes references to the following models:

RetryConfig

Description: No description available
Fields:
- tries (int):
  - Description: The number of times to retry the operation before giving up.
  - Default: 5
  - Required: No
- delay (int):
  - Description: The number of seconds to wait before retrying the operation.
  - Default: 1
  - Required: No
- backoff (int):
  - Description: The factor by which to increase the delay between retries.
  - Default: 2
  - Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
  - Description: The exceptions to retry on.
  - Default: ['SGPClientError']
  - Required: No

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
dynamic_kwargs (typing.Any):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Output

output (str):
- Required: Yes

====================================================================================================

ChatGenerationWithCitationsNode

Type: chat_generation_with_citations
Operation Type: COMPLETION

Description

Extends ChatGenerationNode to enrich inline citations by modifying the streamed text to and wrapping the result in a CitationResponse object, including any citations referenced in that chunk. The streaming citation enrichment uses a regex pattern to match citations and replaces them with a format string that includes the citation index and the fragment UUID. For example, to replace “[1]” with “[[1]](#chunk_id=abcd-1234)”, the regex pattern would be r"\[(\d+)\]" and the replace string would be r"[[{}]](#chunk_id={})". This formulation means it is easy to handle citation formats where the citation itself is replaced by the enriched text, but not easy to deal with longer-range edits such as the {'<'}span{'>'} tag format of Donovan v1. It also does not allow us to handle grouped citations (e.g., [1, 2]) or range citations (e.g., [1-3]). Instead, we have to instruct the model to insert those separately (e.g., [1][2][3]).

Configuration

The configuration parameters for this node are as follows:

regex_pattern (str):
- Description: The regex pattern to match citations in the text. The first group must be the citation index!
- Required: Yes
regex_replace (str):
- Description: The format string to replace the citation with. The first argument is the citation index and the second is the fragment UUID.
- Required: Yes
use_original_indexing (bool):
- Description: Citations are numbered in the order they appear in the output. To use the original indexing, set this to True.
- Required: No
type (Literal[‘chat_generation_with_citations’]):
- Default: chat_generation_with_citations
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default: {'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No
memory_strategy (Optional[Dict[str, Any]]):
- Description: The memory strategy to prevent exceeding the LLM’s context limit from being exceeded. Currently SGP only supports the Last K memory strategy (keep the last k messages), but will be adding new strategies soon. (e.g. {'type': 'last_k', 'params: {'k': 1000} })
- Required: No
instructions (Optional[str]):
- Description: Instructions for the system prompt. If not provided, there will be no system message. It can also be provided at runtime.
- Required: No

References in Configuration

The configuration includes references to the following models:

RetryConfig

Description: No description available
Fields:
- tries (int):
  - Description: The number of times to retry the operation before giving up.
  - Default: 5
  - Required: No
- delay (int):
  - Description: The number of seconds to wait before retrying the operation.
  - Default: 1
  - Required: No
- backoff (int):
  - Description: The factor by which to increase the delay between retries.
  - Default: 2
  - Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
  - Description: The exceptions to retry on.
  - Default: ['SGPClientError']
  - Required: No

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
references (Optional[List[scale_gp.types.shared.chunk.Chunk]]):
- Required: No

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

Output

output (ModifiedCitationResponse):
- Required: Yes

References in Output

The output includes references to the following models:

ModifiedCitationResponse

Description: Basic class to represent the output of a citation node.
Fields:
- output (str):
  - Description: The output of the citation node
  - Required: Yes
- citations (List[egp_services.types.citation.Citation]):
  - Description: The citations created by the citation node
  - Required: Yes
- original (str):
  - Description: The original text output
  - Required: Yes

====================================================================================================

ChunkEvaluationNode

Type: chunk_eval
Operation Type: EVALUATION

Description

This is a node to evaluate how the retrieval of your chunks has gone. It uses fuzzy matching of chunks against source chunks.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘chunk_eval’]):
- Default: chunk_eval
- Required: No
operation_type (Literal[NodeOperationType.EVALUATION: ‘EVALUATION’]):
- Default: NodeOperationType.EVALUATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
top_k_thresholds (List[int]):
- Description: The top-k values you’d like to evaluate retrieval against.
- Required: Yes
fuzzy_match_threshold (float):
- Description: The threshold for fuzzy string matching.
- Default: 0.5
- Required: No
require_all (bool):
- Description: Whether all sources must be present in the top k chunks.
- Required: No

Method

The node provides the following method:

Inputs

chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
sources (List[egp_services.types.dataset.RetrievalSource]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

RetrievalSource

Description: Represents a single source from the corpus of documents that we expect should be retrieved when a particular question is asked.
Fields:
- source_type (RetrievalSourceType):
  - Description: The type of the source
  - Required: Yes
- document (str):
  - Description: The name of the document
  - Required: Yes
- page (Optional[int]):
  - Description: The page number of the document
  - Required: No
- content (Optional[str]):
  - Description: The text content of the chunk
  - Required: No

Output

output (Dict[str, bool]):
- Required: Yes

====================================================================================================

CitationNode

Type: citation
Operation Type: CITATION

Description

A node that generates citations for a given text using a citation generator using ROUGE similarity.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘citation’]):
- Default: citation
- Required: No
operation_type (Literal[NodeOperationType.CITATION: ‘CITATION’]):
- Default: NodeOperationType.CITATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
citation_type (Literal[‘rouge’, ‘model_defined’]):
- Description: The type of citation to use.
- Required: Yes
citation_context (CitationContext):
- Description: The citation context (e.g. configuration of how make the citations) to use.
- Default: {'min_similarity': 0.5, 'metric': 'rouge2', 'score': 'recall', 'generate_with_llm': False}
- Required: No
s3_path_override (Optional[str]):
- Description: If provided, the S3 path to use for the attachment URL in the citation to deal with auth challenges.
- Required: No

References in Configuration

The configuration includes references to the following models:

CitationContext

Description: No description available
Fields:
- min_similarity (Optional[Annotated[float, None, Interval(gt=None, ge=0, lt=None, le=1), None, None]]):
  - Description: The similarity threshold at which a candidate match is returned as a citation
  - Required: No
- metric (Optional[Annotated[str, StringConstraints(strip_whitespace=None, to_upper=None, to_lower=None, strict=None, min_length=None, max_length=None, pattern=‘^rouge(L|Lsum|\d+)$’)]]):
  - Description: The ROUGE metric to use (rouge2 by default)
  - Required: No
- score (Optional[Literal[‘precision’, ‘recall’, ‘fmeasure’]]):
  - Description: The value to use with ROUGE citation matching (recall is the default)
  - Required: No
- generate_with_llm (bool):
  - Description: When True, we prompt the LLM to include citations in it’s response and parse them during post-processing rather than identifying cited chunks entirely in post-processing like we do for other methods.
  - Required: No

Method

The node provides the following method:

Inputs

generated_text (str):
- Required: Yes
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
model_citations (Optional[List[egp_services.types.citation.ModelCitation]]):
- Required: No

References in Inputs

The inputs include references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

ModelCitation

Description: No description available
Fields:
- start_index (int):
  - Description: The start index of the citation
  - Required: Yes
- end_index (int):
  - Description: The end index of the citation
  - Required: Yes
- cited_text (str):
  - Description: The text of the citation
  - Required: Yes
- chunks (List[scale_gp.types.shared.chunk.Chunk]):
  - Description: The chunks of the citation
  - Required: Yes
- chunk_indices (List[int]):
  - Description: The indices of the chunks
  - Required: Yes

Output

output (CitationResponse):
- Required: Yes

References in Output

The output includes references to the following models:

CitationResponse

Description: Basic class to represent the output of a citation node.
Fields:
- output (str):
  - Description: The output of the citation node
  - Required: Yes
- citations (List[egp_services.types.citation.Citation]):
  - Description: The citations created by the citation node
  - Required: Yes

====================================================================================================

CodeExecutionNode

Type: code_execution
Operation Type: CODE_EXECUTION

Description

A node that executes Python code in an AWS Lambda sandbox.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘code_execution’]):
- Default: code_execution
- Required: No
operation_type (Literal[NodeOperationType.CODE_EXECUTION: ‘CODE_EXECUTION’]):
- Default: NodeOperationType.CODE_EXECUTION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
files (Dict[str, str]):
- Description: A mapping of sandboxed file paths to their source paths. Keys should be relative paths at which the files will be saved within the sandbox, and values should be source file identifiers compatible with smart_open.
- Required: Yes
output_artifacts_dir (Optional[str]):
- Description: If set, output files will be saved to this directory
- Required: No

Method

The node provides the following method:

Inputs

code (str):
- Required: Yes
files (Optional[Dict[str, str]]):
- Required: No

Output

output (Dict[str, str]):
- Required: Yes

====================================================================================================

ConditionNode

Type: condition
Operation Type: DATA_MANIPULATION

Description

A node that checks certain conditions and outputs the result.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘condition’]):
- Default: condition
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
condition (CompoundCondition):
- Description: A compound condition that will be evaluated to determine whether the node should run. The condition has the following structure: logical_operator: Literal[“ALL”, “ANY”, “NOT”] = “ALL” conditions: Optional[List[Union[UnaryCondition, “CompoundCondition”]]] = None input_names: List[str] = None # populated by model validator
Each condition in the conditions list can be either a UnaryCondition or a CompoundCondition. A Unary Condition looks like; Representation of a boolean function with a single input e.g. the condition specified by input_name: x operator: ‘contains’ ref_value: ‘c’ would evaluate to True if x == ‘cat’ Operators are defined in the constant function store dict_keys([‘greater_than’, ‘equals’, ‘less_than’, ‘is_in’, ‘contains’, ‘has_duplicates’, ‘is_null’, ‘bool’])
- Required: Yes

References in Configuration

The configuration includes references to the following models:

CompoundCondition

Description: Representation of a compound boolean statement, i.e. a negation, conjunction, or disjunction of UnaryConditions
Fields:
- logical_operator (Literal[‘ALL’, ‘ANY’, ‘NOT’]):
  - Description: The logical operator to apply to the conditions
  - Default: ALL
  - Required: No
- conditions (Optional[List[Union[egp_services.helpers.UnaryCondition, egp_services.helpers.CompoundCondition]]]):
  - Description: The list of conditions to apply the logical operator to
  - Required: No
- input_names (Optional[List[str]]):
  - Description: The list of input names used in the conditions, populated by model validator
  - Required: No

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (bool):
- Required: Yes

====================================================================================================

CreateMessagesNode

Type: create_messages
Operation Type: DATA_MANIPULATION

Description

This node will construct a list of valid messages

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘create_messages’]):
- Default: create_messages
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
message_configs (List[Union[egp_services.nodes.formatting.messages.AlternatingRoleMessages, egp_services.nodes.formatting.messages.SingleRoleMessages]]):
- Description: We take as input a list of configs that represent either a single user or two users we want to alternate between
- Required: Yes

References in Configuration

The configuration includes references to the following models:

AlternatingRoleMessages

Description: A class representing multiple roles that alternate between each other
Fields:
- role_value_pairs (List[Dict[str, str]]):
  - Description: A list of dictionaries with a role and a value (where value is a string or a string representing an input)
  - Required: Yes

SingleRoleMessages

Description: A class represent a single role. It is a dictionary with a role
Fields:
- role (str):
  - Description: The role of the message (e.g. user, assistant, system)
  - Required: Yes
- content (str):
  - Description: This can either be a string representing the content (e.g. system message) or a string representing an input (either a str or a list of strings)
  - Required: Yes

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (List[Dict[str, str]]):
- Required: Yes

====================================================================================================

DataTransformNode

Type: data_transform
Operation Type: DATA_MANIPULATION

Description

A node that performs data transformations on the input data.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘data_transform’]):
- Default: data_transform
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
action (Literal[‘load_json_chunk_vals’, ‘to_key_value_dict’, ‘get_json_val’, ‘to_list’, ‘strip’, ‘df_sample_str’, ‘till_stop_token’, ‘identity’, ‘select_keys_from_dict’, ‘int_adder’, ‘json_loads’, ‘clean_sql’, ‘len’, ‘index’, ‘slice’, ‘get_pydantic_attribute’, ‘get’, ‘read_text_file’, ‘split_to_chunks_nltk’]):
- Description: The action to perform on the input data. Must be one of the keys in dict_keys([‘load_json_chunk_vals’, ‘to_key_value_dict’, ‘get_json_val’, ‘to_list’, ‘strip’, ‘df_sample_str’, ‘till_stop_token’, ‘identity’, ‘select_keys_from_dict’, ‘int_adder’, ‘json_loads’, ‘clean_sql’, ‘len’, ‘index’, ‘slice’, ‘get_pydantic_attribute’, ‘get’, ‘read_text_file’, ‘split_to_chunks_nltk’]).
- Required: Yes
additional_inputs (Dict[str, Any]):
- Description: Additional inputs for the action.
- Required: No
apply_to_dictlist_leaves (bool):
- Description: Whether the action should be applied to the leaves of a nested dict/list.
- Required: No

Method

The node provides the following method:

Inputs

response (typing.Any):
- Required: Yes

Output

output (Dict[str, Any]):
- Required: Yes

====================================================================================================

FileReaderNode

Type: file_reader
Operation Type: FILE_RETRIEVAL

Description

A node that loads and reads a file to a string. Supports SGP file ids.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘file_reader’]):
- Default: file_reader
- Required: No
operation_type (Literal[NodeOperationType.FILE_RETRIEVAL: ‘FILE_RETRIEVAL’]):
- Default: NodeOperationType.FILE_RETRIEVAL
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
file (Union[str, List[str]]):
- Description: Unique identifier for the file held by SGP to be read. May also be a list of file ids to be read. If a list, the node will return a dictionary of file id to file contents.
- Required: No
decode_as (Optional[str]):
- Description: If provided, will decode the file contents as this encoding (i.e. ‘utf-8’)
- Required: No

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (Union[str, bytes]):
- Required: Yes

====================================================================================================

GenerationNode

Type: generation
Operation Type: COMPLETION

Description

A string in, string out generation node that uses an LLM model to generate completions. It will pull the model from the SGP API and use it to generate completions.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘generation’]):
- Default: generation
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default: {'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No

References in Configuration

The configuration includes references to the following models:

RetryConfig

Description: No description available
Fields:
- tries (int):
  - Description: The number of times to retry the operation before giving up.
  - Default: 5
  - Required: No
- delay (int):
  - Description: The number of seconds to wait before retrying the operation.
  - Default: 1
  - Required: No
- backoff (int):
  - Description: The factor by which to increase the delay between retries.
  - Default: 2
  - Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
  - Description: The exceptions to retry on.
  - Default: ['SGPClientError']
  - Required: No

Method

The node provides the following method:

Inputs

input_prompt (str):
- Required: Yes
input_image_dir (Optional[egp_services.types.core.root.InputImageDir]):
- Required: No
input_image_paths (Optional[egp_services.types.core.root.InputImagePaths]):
- Required: No
dynamic_kwargs (typing.Any):
- Required: Yes

References in Inputs

The inputs include references to the following models:

InputImageDir

Description: Represents a directory containing input images. The root attribute stores the directory path as a string.
Fields:
- root (str):
  - Description: The directory containing the input images
  - Required: Yes

InputImagePaths

Description: A list of paths to input images. The root attribute stores the directory path as a List of strings.
Fields:
- root (List[str]):
  - Description: A list of paths to input images
  - Required: Yes

Output

output (Union[str, egp_services.tracing.NodeOutput]):
- Required: Yes

References in Output

The output includes references to the following models:

NodeOutput

Description: Container for the output of a node run and any metadata associated with it.
Fields:

====================================================================================================

GenerationWithCitationsNode

Type: generation_with_citations
Operation Type: COMPLETION

Description

Extends GenerationNode to enrich inline citations by modifying the streamed text to and wrapping the result in a CitationResponse object, including any citations referenced in that chunk. The streaming citation enrichment uses a regex pattern to match citations and replaces them with a format string that includes the citation index and the fragment UUID. For example, to replace “[1]” with “[[1]](#chunk_id=abcd-1234)”, the regex pattern would be r"\[(\d+)\]" and the replace string would be r"[[{}]](#chunk_id={})". This formulation means it is easy to handle citation formats where the citation itself is replaced by the enriched text, but not easy to deal with longer-range edits such as the {'<'}span{'>'} tag format of Donovan v1. It also does not allow us to handle grouped citations (e.g., [1, 2]) or range citations (e.g., [1-3]). Instead, we have to instruct the model to insert those separately (e.g., [1][2][3]).

Configuration

The configuration parameters for this node are as follows:

regex_pattern (str):
- Description: The regex pattern to match citations in the text. The first group must be the citation index!
- Required: Yes
regex_replace (str):
- Description: The format string to replace the citation with. The first argument is the citation index and the second is the fragment UUID.
- Required: Yes
use_original_indexing (bool):
- Description: Citations are numbered in the order they appear in the output. To use the original indexing, set this to True.
- Required: No
type (Literal[‘generation_with_citations’]):
- Default: generation_with_citations
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default: {'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No

References in Configuration

The configuration includes references to the following models:

RetryConfig

Description: No description available
Fields:
- tries (int):
  - Description: The number of times to retry the operation before giving up.
  - Default: 5
  - Required: No
- delay (int):
  - Description: The number of seconds to wait before retrying the operation.
  - Default: 1
  - Required: No
- backoff (int):
  - Description: The factor by which to increase the delay between retries.
  - Default: 2
  - Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
  - Description: The exceptions to retry on.
  - Default: ['SGPClientError']
  - Required: No

Method

The node provides the following method:

Inputs

input_prompt (str):
- Required: Yes
references (Optional[List[egp_services.types.citation.Citation]]):
- Required: No

References in Inputs

The inputs include references to the following models:

Citation

Description: No description available
Fields:
- body (Optional[str]):
  - Description: The text of the citation
  - Required: No
- cited_text (Optional[str]):
  - Description: The chunk of cited text, this is essentially a duplicate field to body. Keeping this here to maintain backwards compatibility with plugins.
  - Required: No
- s3_attachment_url (str):
  - Description: The URL of the attachment
  - Required: Yes
- page_number (str):
  - Description: The page number of the citation
  - Required: Yes
- fragment_uuid (str):
  - Description: The UUID of the fragment
  - Required: Yes
- document_uuid (str):
  - Description: The UUID of the document
  - Required: Yes
- source_type (Optional[Literal[‘document’, ‘sql’]]):
  - Description: The type of the source
  - Required: No
- document_classification (Optional[str]):
  - Description: The classification of the document
  - Required: No
- metadata (Optional[Dict]):
  - Description: The metadata of the citation
  - Required: No

Output

output (ModifiedCitationResponse):
- Required: Yes

References in Output

The output includes references to the following models:

ModifiedCitationResponse

Description: Basic class to represent the output of a citation node.
Fields:
- output (str):
  - Description: The output of the citation node
  - Required: Yes
- citations (List[egp_services.types.citation.Citation]):
  - Description: The citations created by the citation node
  - Required: Yes
- original (str):
  - Description: The original text output
  - Required: Yes

====================================================================================================

GenerationWithDocumentModelNode

Type: generation_with_document_model
Operation Type: COMPLETION

Description

This is a node that generates completions using a document model (e.g. Cohere’s document model). This is different than a normal generation model because you can pass the “documents” (or chunks) directly to the LLM provider and they’ll automatically deal with the retrieval and citation for you.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘generation_with_document_model’]):
- Default: generation_with_document_model
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (str):
- Description: The document model to use.
- Required: Yes
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No

Method

The node provides the following method:

Inputs

user_prompt (str):
- Required: Yes
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

Output

output (Tuple[str, List[egp_services.types.citation.ModelCitation]]):
- Required: Yes

References in Output

The output includes references to the following models:

ModelCitation

Description: No description available
Fields:
- start_index (int):
  - Description: The start index of the citation
  - Required: Yes
- end_index (int):
  - Description: The end index of the citation
  - Required: Yes
- cited_text (str):
  - Description: The text of the citation
  - Required: Yes
- chunks (List[scale_gp.types.shared.chunk.Chunk]):
  - Description: The chunks of the citation
  - Required: Yes
- chunk_indices (List[int]):
  - Description: The indices of the chunks
  - Required: Yes

====================================================================================================

GetMessageNode

Type: get_message
Operation Type: DATA_MANIPULATION

Description

Node to get a message from a list of messages

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘get_message’]):
- Default: get_message
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
index (int):
- Description: The index of the message to get
- Required: Yes

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Output

output (str):
- Required: Yes

====================================================================================================

InsertMessagesNode

Type: insert_messages
Operation Type: DATA_MANIPULATION

Description

Node to insert a list of messages into another list of messages at a certain index

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘insert_messages’]):
- Default: insert_messages
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
index (int):
- Description: The index to insert the new messages at. If index = -1, then the messages will be appended to the end of the list
- Required: Yes

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
new_messages (List[egp_services.types.core.chat.Message]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Output

output (List[Dict[str, str]]):
- Required: Yes

====================================================================================================

JinjaNode

Type: jinja
Operation Type: PROMPT_ENGINEERING

Description

An extremely flexible node that takes as input a dictionary of data, where each key is a tag corresponding to the data transformation Jinja template tag (optional), and each value is the data to be templated. It will always take as input something extremely flexible and return a string. Here are some example configs:

1) name: database_content_jinja
    type: jinja
    config:
        verbose: false
        data_transformations:
          chunks:
            jinja_helper_functions:
            - CHUNK_TO_JSON
            - JSON_LOADS_CHUNK_CONTENT
            - FORMAT_SQL_SCHEMA
            jinja_template_str: '{% for chunk in value %}{{chunk["table"]}}\n\n{% endfor %}'
        output_template:
          jinja_template_str: '{{ chunks }}'

2) name: get_unmodified_generation
    type: jinja
    config:
        # The Jinja node refuses to include unknown types in the output template, but you can
        # process them with a data transformation first to get around this.
        data_transformations:
          response:
            jinja_template_str: "{{ value.original }}"
        output_template:
          jinja_template_str: "{{ response }}"

3) name: PROMPT_ENGINEERING_NODE-1711657246958
    type: jinja
    config:
      data_transformations:
        chunks:
          jinja_template_str: "{% for chunks_var in value %}\n\n{{ chunks_var.text }}\n\n{% endfor %}"
      output_template:
        jinja_template_str: "\n\nYou are a copilot that is meant to answer the below user query to the best of\nyour ability. Please only draw from the context below in your responses, not\nfrom your own internal knowledge if at all possible. If you cannot answer please\ndo not do so. Do not hallucinate.\n\n\nUser query:\n\n{{ query }}\n\n\nContext:\n\n{{ chunks }}\n\n"

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘jinja’]):
- Default: jinja
- Required: No
operation_type (Literal[NodeOperationType.PROMPT_ENGINEERING: ‘PROMPT_ENGINEERING’]):
- Default: NodeOperationType.PROMPT_ENGINEERING
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
data_transformations (Dict[str, egp_services.nodes.jinja.JinjaNodeTemplate]):
- Description: A dictionary of data transformations to apply to the input data.
- Required: No
output_template (JinjaNodeTemplate):
- Description: The output template to use.
- Default: {'jinja_template_str_unloaded': None, 'jinja_template_path': None, 'jinja_helper_functions': [], 'jinja_template_str_loaded': None}
- Required: No
log_output (bool):
- Description: Whether to log the output of the node to the logger.
- Required: No
log_prefix (str):
- Description: A prefix to add to the log output. Useful for distinguishing between multiple nodes.
- Required: No
verbose (bool):
- Description: Whether to log additional information about the node.
- Default: True
- Required: No
llm_model (Optional[Literal[‘llama-2-7b-chat’, ‘llama-2-13b-chat’, ‘llama-2-70b-chat’, ‘falcon-7b-insruct’, ‘falcon-40b-instruct’, ‘mistral-7b-instruct’, ‘mixtral-8x7b-instruct’, ‘mixtral-8x22b-instruct’, ‘codellama-7b-instruct’, ‘codellama-13b-instruct’, ‘codellama-34b-instruct’, ‘codellama-70b-instruct’, ‘zephyr-7b-alpha’, ‘zephyr-7b-beta’, ‘gemma-2b-instruct’, ‘gemma-7b-instruct’, ‘llama-3-8b-instruct’, ‘llama-3-70b-instruct’, ‘llama-3-1-8b-instruct’, ‘llama-3-1-70b-instruct’, ‘llama-3-1-405b-instruct’, ‘dolphin-2.2-70b’, ‘dolphin-2.9-llama3-70b’, ‘dolphin-2.9-llama3-8b’]]):
- Description: The name of the model to use for tokenization.
- Required: No
max_tokens (Optional[int]):
- Description: The maximum number of tokens to use for the output template. If set, the output template will be token-limited.
- Required: No
context_chunks_key (Optional[str]):
- Description: The key in the templated data that contains the context chunks to be token-limited.
- Default: context_chunks
- Required: No

References in Configuration

The configuration includes references to the following models:

JinjaNodeTemplate

Description: Base model for a Jinja template. Guaranteed to store a string that can be read in to Template().
Fields:
- jinja_template_str_unloaded (Optional[str]):
  - Description: Raw template to apply to the data. This should be a Jinja2 template string. Please note, the data will be mapped as ‘value’ in the template. Default None corresponds to {{value}}. Should access property jinja_template_str or field jinja_template_str_loaded for the loaded template data
  - Required: No
- jinja_template_path (Optional[str]):
  - Description: Path to a Jinja2 template file. Default None.
  - Required: No
- jinja_helper_functions (Optional[List[Union[str, Dict[str, Any]]]]):
  - Description: Potential helper functions to use in jinja, please refer to egp_services.enums
  - Required: No
- jinja_template_str_loaded (Optional[str]):
  - Description: The original jinja_template_str field from the config might not contain the needed template, and we may need to load S3 data specified with jinja_template_path. This field caches the loaded template content, it is also accessed through property jinja_template_str.
  - Required: No

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (str):
- Required: Yes

====================================================================================================

LLMEngineNode

Type: llm_engine
Operation Type: COMPLETION

Description

A node that generates completions from an LLM leveraging LLMEngine calls instead of SGP. Has increased functionality such as batch completion.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘llm_engine’]):
- Default: llm_engine
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (str):
- Description: The LLM model to use for generation. This should be a Scale-hosted LLM model.
- Required: Yes
max_tokens (Optional[int]):
- Description: The maximum number of tokens to generate in the completion.
- Default: 512
- Required: No
temperature (Optional[float]):
- Description: The temperature to use for generation. Higher values lead to more randomness.
- Default: 0.2
- Required: No
stop_sequences (Optional[List[str]]):
- Description: A list of strings that, if generated, will stop the completion.
- Required: No
presence_penalty (Optional[float]):
- Description: The presence penalty to use for generation.
- Required: Yes
frequency_penalty (Optional[float]):
- Description: The frequency penalty to use for generation.
- Required: Yes
top_k (Optional[int]):
- Description: The top-k value to use for generation.
- Required: Yes
top_p (Optional[float]):
- Description: The top-p value to use for generation.
- Required: Yes
timeout (int):
- Description: The maximum time in seconds to wait for the completion. Used for max_runtime_sec in batch completions call.
- Default: 900
- Required: No
include_stop_str_in_output (Optional[bool]):
- Description: Whether to include the stop string in the output.
- Required: No
guided_json (Optional[Dict[str, Any]]):
- Description: A JSON object to guide the completion.
- Required: No
guided_regex (Optional[str]):
- Description: A regex string to guide the completion.
- Required: No
guided_choice (Optional[List[str]]):
- Description: A list of strings to guide the completion.
- Required: No
batch_run_mode (Literal[‘sync’, ‘async’]):
- Description: Whether to run the batch completions synchronously or asynchronously.
- Default: async
- Required: No
batch_sys_kwargs (LLMEngineBatchSysKwargs):
- Description: System-level kwargs for batch completions.
- Required: No

References in Configuration

The configuration includes references to the following models:

LLMEngineBatchSysKwargs

Description: No description available
Fields:
- checkpoint_path (Optional[str]):
  - Description: The checkpoint path to use for the model.
  - Required: No
- seed (Optional[int]):
  - Description: The seed to use for the model’s generation.
  - Required: No
- num_shards (Optional[int]):
  - Description: The number of shards to split the model into
  - Default: 1
  - Required: No
- labels (Optional[Dict[str, str]]):
  - Description: The labels to use for the model (e.g. team/product)
  - Required: No

Method

The node provides the following method:

Inputs

input_prompt (str):
- Required: Yes

Output

output (str):
- Required: Yes

====================================================================================================

LLMGeneratedCitationNode

Type: llm_generated_citation
Operation Type: CITATION

Description

Converts LLM-generated citations to a format that can be used to cite chunks.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘llm_generated_citation’]):
- Default: llm_generated_citation
- Required: No
operation_type (Literal[NodeOperationType.CITATION: ‘CITATION’]):
- Default: NodeOperationType.CITATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No

Method

The node provides the following method:

Inputs

generated_text (str):
- Required: Yes
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

Output

output (CitationAndReferencesResponse):
- Required: Yes

References in Output

The output includes references to the following models:

CitationAndReferencesResponse

Description: Class to represent the output of a citation node. Can include a list of uncited references that were seen by the LLM but not directly cited.
Fields:
- output (str):
  - Description: The output of the citation node
  - Required: Yes
- citations (List[egp_services.types.citation.Citation]):
  - Description: The citations created by the citation node
  - Required: Yes
- uncited_references (Optional[List[egp_services.types.citation.Citation]]):
  - Description: The references that were not cited
  - Required: No

====================================================================================================

MergeNode

Type: merge_node
Operation Type: CUSTOM

Description

Either used to rename workflow outputs OR used after a branch in a plan, to allow users to reference the results from any branch under the same name. For example, extending the example branch config above with the following merge config:

  - branch:  "add_animal_to_message_by_color"
    ...
    merge_outputs:
        # define aliases for output nodes that can be referenced by a step that consumes the output of *either* branch
        # will create a special "merge" node under the hood for each alias here
        message_with_animal:
        - "add_frog_to_message.add_frog_to_message_jinja_node.output"
        - "add_giraffe_to_message.add_giraffe_to_message_jinja_node.output"
        - "add_cow_to_message.add_cow_to_message_jinja_node.output"

allows the user to reference the outputs of any of these 3 branches via the alias “add_animal_to_message_by_color.message_with_animal.output” in downstream workflows.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘merge_node’]):
- Default: merge_node
- Required: No
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
input_hierarchy (List[str]):
- Description: A list of node names defining the order in which to look for non-null inputs.
- Required: Yes

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (typing.Any):
- Required: Yes

====================================================================================================

ProcessorNode

Type: processor
Operation Type: CUSTOM

Description

A node that processes data by executing a series of functions. This allows for very custom logic that is specific to a certain project/customer. Here are some example configs:

1) name: update_responses
    type: processor
    inputs:
        list1: responses
        list2: executed_response_to_str.output
    config:
        return_key: concatenated
        function_specs:
          concatenated:
            path: concatenate
            kwargs:
              list1: list1
              list2: list2

2) name: "iterate_pages"
    type: "processor"
    inputs:
        x: x
    config:
        return_key: decrement_int
        function_specs:
          decrement_int:
            path: hammerhead.processing.functions.decrement_int
            kwargs:
              x: x

3) name: process_chunks
    type: processor
    inputs:
      sequence: sequence
    config:
      return_key: sequence_dropout
      function_specs:
        # Preloaded Functions
        load_chunks_to_dict:  # Giving this a name different than the function_path
          path: sequence_pydantic_to_dict  # Need to ensure we pass function_path
          kwargs:
            sequence: sequence
        extract_nested_json:  # Preloaded function -{'>'} Load json string from dict
          path: extract_nested_json
          kwargs:
            key: text.numbers  # [{'text': '{"numbers": '[1, 2, 3]}'}] -{'>'} [{'numbers': [1, 2, 3]}]
            jsons: load_chunks_to_dict  # Reference -{'>'} load_chunks_to_dict
        sequence_shuffle:
          path: sequence_shuffle
          kwargs:
            sequence: extract_nested_json  # Reference -{'>'} extract_nested_json
        sequence_dropout:
          path: sequence_dropout
          kwargs:
            sequence: sequence_shuffle  # Reference -{'>'}  sequence_shuffle
            dropout_rate: 0.5  # kwarg

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘processor’]):
- Default: processor
- Required: No
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
return_key (str):
- Description: The function to return results from.
- Required: Yes
function_specs (Dict[str, egp_services.nodes.processing.node.FunctionSpec]):
- Description: Specifications for executing a function.
- Required: Yes

References in Configuration

The configuration includes references to the following models:

FunctionSpec

Description: Define the specification of function execution Key: A a unique name for the execution of a function and the kwargs. Values: A dictionary mapping:
- path (str): path to the function
- kwargs (Dict[str, Any]): Keyword arguments for this execution of the function.

Function Key & Path Representation:

1. Basic Usage
    {
        "first_usage": {
            "path": "package.module.function",
            ... # your kwargs
        },
        "second_usage": {
            "path": "package.module.function",
            ... # your kwargs
        }
    }

2. Passing results between functions
    {
        "first_function": {
            ... # your kwargs
        },
        "second_function" " {
            input_from_first_function: "first_function"
        }
    }

Fields:
- path (str):
  - Description: Path to the function.
  - Required: Yes
- kwargs (Dict[str, Any]):
  - Description: Keyword arguments for the function.
  - Required: Yes

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (typing.Any):
- Required: Yes

====================================================================================================

RegexMatchNode

Type: regex_match
Operation Type: DATA_MANIPULATION

Description

A node that matches a string against a regex pattern and returns the match groups.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘regex_match’]):
- Default: regex_match
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
pattern (str):
- Description: The regex pattern to match against.
- Required: Yes

Method

The node provides the following method:

Inputs

string (str):
- Required: Yes

Output

output (Optional[Dict[str, str]]):
- Required: Yes

====================================================================================================

RemoveMessageNode

Type: remove_message
Operation Type: DATA_MANIPULATION

Description

Node to remove a message from a list of messages

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘remove_message’]):
- Default: remove_message
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
index (int):
- Description: The index of the message to remove. If index = -1, then the last message will be removed
- Required: Yes

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Output

output (List[Dict[str, str]]):
- Required: Yes

====================================================================================================

RerankerNode

Type: reranker
Operation Type: RERANKING

Description

A node that reranks a list of chunks based on a set of scorers. An example configs:

    1) name: "reranker"
        type: "reranker"
        config:
          num_to_return: 5
          scorers:
            - name: "cross-encoder"
              model: "cross-encoder/ms-marco-MiniLM-L-12-v2"

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘reranker’]):
- Default: reranker
- Required: No
operation_type (Literal[NodeOperationType.RERANKING: ‘RERANKING’]):
- Default: NodeOperationType.RERANKING
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
scorers (List[egp_services.nodes.chunks.scorers.BM25Scorer | egp_services.nodes.chunks.scorers.CrossEncoderScorer | egp_services.nodes.chunks.scorers.RecencyScorer | egp_services.nodes.chunks.scorers.CustomModelScorer | egp_services.nodes.chunks.scorers.LLMRerankerScorer]):
- Description: The list of scorers to use for reranking.
- Required: Yes
num_to_return (int):
- Description: The number of chunks to return after reranking.
- Required: Yes
score_threshold (Optional[float]):
- Description: The threshold of the score to filter the chunks.
- Required: No

References in Configuration

The configuration includes references to the following models:

LLMRerankerScorer

Description: LLM reranker using a Sliding window based approach.
Fields:
- name (Literal[‘llm-reranker’]):
  - Default: llm-reranker
  - Required: No
- weight (float):
  - Default: 1.0
  - Required: No
- client_kwargs (Dict[str, str]):
  - Required: No
- model (str):
  - Required: Yes
- max_tokens (int):
  - Default: 512
  - Required: No
- temperature (float):
  - Default: 0.2
  - Required: No
- stop_sequences (Optional[List[str]]):
  - Required: No
- jinja_template_path (str):
  - Required: Yes
- jinja_template_str (str):
  - Required: Yes

CustomModelScorer

Description: Finetuned reranker model uploaded to the Models API.
Fields:
- name (Literal[‘sgp-custom-model’]):
  - Default: sgp-custom-model
  - Required: No
- weight (float):
  - Default: 1.0
  - Required: No
- client_kwargs (Dict[str, str]):
  - Required: No
- model (str):
  - Required: Yes
- instance_name (Optional[str]):
  - Required: No
- deployment_name (Optional[str]):
  - Required: No

CrossEncoderScorer

Description: Cross-encoder reranker. Can be loaded from local path or from EGP/Cohere. User must provide the name of the cross-encoder model.
Fields:
- name (Literal[‘cross-encoder’]):
  - Default: cross-encoder
  - Required: No
- weight (float):
  - Default: 1.0
  - Required: No
- client_kwargs (Dict[str, str]):
  - Required: No
- model (str):
  - Required: Yes
- reranking_strategy (Union[scale_gp.types.chunk_rank_params.RankStrategyCrossEncoderRankStrategy, scale_gp.types.chunk_rank_params.RankStrategyModelRankStrategy, egp_services.nodes.chunks.scorers.LocalRerankStrategy, NoneType]):
  - Required: No

RecencyScorer

Description: Recency scorer for chunks. Score is computed based on delta days or seconds using date metadata and the provided query date.
Fields:
- name (Literal[‘recency’]):
  - Default: recency
  - Required: No
- weight (float):
  - Default: 1.0
  - Required: No
- client_kwargs (Dict[str, str]):
  - Required: No
- field (Optional[str]):
  - Default: published_date
  - Required: No
- units (Optional[Literal[‘days’, ‘seconds’]]):
  - Default: days
  - Required: No

BM25Scorer

Description: BM25 scorer for chunks. Users can specify which field to use for computing BM25 score with query.
Fields:
- name (Literal[‘bm25’]):
  - Default: bm25
  - Required: No
- weight (float):
  - Default: 1.0
  - Required: No
- client_kwargs (Dict[str, str]):
  - Required: No
- field (Optional[str]):
  - Default: text
  - Required: No

Method

The node provides the following method:

Inputs

query (str):
- Required: Yes
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

Output

output (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes

References in Output

The output includes references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

====================================================================================================

ResponseParserNode

Type: response_parser
Operation Type: DATA_MANIPULATION

Description

A node that parses a response based on a given action.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘response_parser’]):
- Default: response_parser
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
action (Literal[‘greater_than’, ‘equals’, ‘less_than’, ‘is_in’, ‘contains’, ‘has_duplicates’, ‘is_null’, ‘bool’, ‘to_integer’, ‘to_key_value_dict’]):
- Description: The action to perform on the response. Must be one of the keys: dict_keys([‘greater_than’, ‘equals’, ‘less_than’, ‘is_in’, ‘contains’, ‘has_duplicates’, ‘is_null’, ‘bool’, ‘to_integer’, ‘to_key_value_dict’])
- Required: Yes
reference_value (Optional[Any]):
- Description: A reference value to use in the action. The meaning of this value depends on the action.
- Required: No

Method

The node provides the following method:

Inputs

response (str):
- Required: Yes

Output

output (Dict[str, Any]):
- Required: Yes

====================================================================================================

RetrieverNode

Type: retriever
Operation Type: KB_RETRIEVAL

Description

Retrieves chunks from a knowledge base based on a query. The knowledge base can be specified by ID or name. We also have the option to filter the retrieved chunks based on metadata fields. This is useful when we want to retrieve chunks that have metadata fields that match the values of certain columns in the input data. Metadata: inputs format: Dictkey: str, val: str key: field that exists in metadata of the chunks from the knowledgebase to be queried val: field / column name that exists in input data kwargs / eval dataset csv file Example: The following code requires chunks to have metadata fields “chapter” and “section” and requires the data to have fields / columns called “chapter_column” and “section_column” It will retrieve chunks for a question such that only those chunks are retrieved whose metadata in the two fields is identical to the values provided for them in the question’s row.

workflow:
...
- name: "retrieve"
  type: "retriever"
  config:
    ...
    metadata:
        field_concatenation_mode = "and",
        map_nan_input_to_filter_value = "NA"
  inputs:
    query: "question"
    metadata_filters:
        chapter: "meta_columns:chapter_column"
        section: "meta_columns:section_column"
...

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘retriever’]):
- Default: retriever
- Required: No
operation_type (Literal[NodeOperationType.KB_RETRIEVAL: ‘KB_RETRIEVAL’]):
- Default: NodeOperationType.KB_RETRIEVAL
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
filter_fields (Optional[List[str]]):
- Description: The metadata fields to filter on.
- Required: No
knowledge_base_id (Optional[str]):
- Description: The ID of the knowledge base to query.
- Required: No
knowledge_base_name (Optional[str]):
- Description: The name of the knowledge base to query.
- Required: No
num_to_return (int):
- Description: The number of chunks to return after retrieval.
- Required: Yes
exact_knn_search (Optional[bool]):
- Description: Whether to use exact KNN search. If not provided, will be set to True if num_to_return 1000.
- Required: No
metadata (Optional[Dict[str, Optional[str]]]):
- Description: Metadata filters to apply to the chunks retrieved.
- Default: {'field_concatenation_mode': 'and'}
- Required: No
min_results_per_knowledge_base (int):
- Description: The minimum number of results to return per knowledge base.
- Required: No
num_retriever_workers (int):
- Description: The number of concurrent retriever workers to use.
- Default: 1
- Required: No

Method

The node provides the following method:

Inputs

query (str):
- Required: Yes
metadata_filters (Optional[Dict[str, str]]):
- Required: No
include_embeddings (bool):
- Required: No
knowledge_base_ids (Union[egp_services.types.core.root.KnowledgeBaseId, List[egp_services.types.core.root.KnowledgeBaseId], NoneType]):
- Required: No

References in Inputs

The inputs include references to the following models:

KnowledgeBaseId

Description: Represents a unique identifier for a knowledge base. The root attribute holds the ID as a string.
Fields:
- root (str):
  - Description: The id of the knowledge base
  - Required: Yes

Output

output (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes

References in Output

The output includes references to the following models:

Chunk

Description: No description available
Fields:
- chunk_id (str):
  - Required: Yes
- score (float):
  - Required: Yes
- text (str):
  - Required: Yes
- attachment_url (Optional[str]):
  - Required: No
- embedding (Optional[List[float]]):
  - Required: No
- metadata (Optional[object]):
  - Required: No
- title (Optional[str]):
  - Required: No
- user_supplied_metadata (Optional[object]):
  - Required: No

====================================================================================================

SearchCitationNode

Type: search_citation
Operation Type: CITATION

Description

This node is used to create citations objects found via regex search.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘search_citation’]):
- Default: search_citation
- Required: No
operation_type (Literal[NodeOperationType.CITATION: ‘CITATION’]):
- Default: NodeOperationType.CITATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
search_regex (str):
- Description: The regex to search for the search result in the response.
- Required: Yes
end_search_regex (str):
- Description: The regex to search for the end of the search result in the response.
- Required: Yes

Method

The node provides the following method:

Inputs

response (str):
- Required: Yes

Output

output (CitationResponse):
- Required: Yes

References in Output

The output includes references to the following models:

CitationResponse

Description: Basic class to represent the output of a citation node.
Fields:
- output (str):
  - Description: The output of the citation node
  - Required: Yes
- citations (List[egp_services.types.citation.Citation]):
  - Description: The citations created by the citation node
  - Required: Yes

====================================================================================================

SQLExecutorNode

Type: sql_executor
Operation Type: CODE_EXECUTION

Description

A node that executes a SQL query on a DB. Right now, only Snowflake is supported.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘sql_executor’]):
- Default: sql_executor
- Required: No
operation_type (Literal[NodeOperationType.CODE_EXECUTION: ‘CODE_EXECUTION’]):
- Default: NodeOperationType.CODE_EXECUTION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
connector_kwargs (Dict[str, str]):
- Description: A dictionary of keyword arguments to pass to the DB connector
- Required: Yes
connector_type (Literal[‘snowflake’]):
- Description: The type of connector to use
- Default: snowflake
- Required: No
secrets (List[str]):
- Description: A list of keys in connector_kwargs that should be fetched from secrets
- Required: No
return_type (Literal[‘df’, ‘dicts’, ‘markdown’, ‘json’, ‘str’]):
- Description: The type of return value to return
- Default: markdown
- Required: No
schema_remapping_file (Optional[str]):
- Description: A file containing a schema remapping for tables
- Required: No
log_queries (bool):
- Description: Whether to log the queries executed by the node
- Default: True
- Required: No

Method

The node provides the following method:

Inputs

query (str):
- Required: Yes

Output

output (SQLResponse):
- Required: Yes

References in Output

The output includes references to the following models:

SQLResponse

Description: No description available
Fields:
- response (Union[str, List[Dict], NoneType]):
  - Required: Yes
- error (Optional[str]):
  - Required: Yes

====================================================================================================

StaticNode

Type: static
Operation Type: CUSTOM

Description

A node to simply input static things. Used when we want to input things into a workflow which might be difficult to inline. e.g. static ICL examples into a jinja node.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘static’]):
- Default: static
- Required: No
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
value (Optional[Any]):
- Description: The value to return from the node. Exactly one of value and from_file must be provided.
- Required: No
from_file (Union[Dict, List, str, NoneType]):
- Description: A local/s3 path or a nested dict/list where each leaf is a local/s3 path. Use when you need to compile multiple data sources into one.
- Required: No

Method

The node provides the following method:

Inputs

(No explicit inputs listed for the primary method)

Output

output (typing.Any):
- Required: Yes

====================================================================================================

TokenizerChatTemplateNode

Type: tokenizer_chat_template
Operation Type: PROMPT_ENGINEERING

Description

A node to help with formatting chat templates for instruct-trained LLMs. Allows the user to pass in a list of messages and get the formatted string back using the official tokenizer template on huggingface.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘tokenizer_chat_template’]):
- Default: tokenizer_chat_template
- Required: No
operation_type (Literal[NodeOperationType.PROMPT_ENGINEERING: ‘PROMPT_ENGINEERING’]):
- Default: NodeOperationType.PROMPT_ENGINEERING
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (Literal[‘llama-2-7b-chat’, ‘llama-2-13b-chat’, ‘llama-2-70b-chat’, ‘falcon-7b-insruct’, ‘falcon-40b-instruct’, ‘mistral-7b-instruct’, ‘mixtral-8x7b-instruct’, ‘mixtral-8x22b-instruct’, ‘codellama-7b-instruct’, ‘codellama-13b-instruct’, ‘codellama-34b-instruct’, ‘codellama-70b-instruct’, ‘zephyr-7b-alpha’, ‘zephyr-7b-beta’, ‘gemma-2b-instruct’, ‘gemma-7b-instruct’, ‘llama-3-8b-instruct’, ‘llama-3-70b-instruct’, ‘llama-3-1-8b-instruct’, ‘llama-3-1-70b-instruct’, ‘llama-3-1-405b-instruct’, ‘llama-3-3-70b-instruct’, ‘dolphin-2.2-70b’, ‘dolphin-2.9-llama3-70b’, ‘dolphin-2.9-llama3-8b’]):
- Description: The name of the model to use for tokenization.
- Required: No
add_generation_prompt (bool):
- Description: Whether to end the prompt with the token(s) that indicate the start of an assistant message. This is useful when you want to generate a response from the model. Note that this argument will be passed to the chat template, and so it must be supported in the template for this argument to have any effect.
- Required: No
padding (bool):
- Description: Whether to pad sequences to the maximum length. Has no effect if tokenize is False.
- Required: No
truncation (bool):
- Description: Whether to truncate sequences at the maximum length. Has no effect if tokenize is False.
- Required: No
max_length (Optional[int]):
- Description: Maximum length (in tokens) to use for padding or truncation. Has no effect if tokenize is False.
- Required: No
kwargs (Dict[str, Any]):
- Description: Additional kwargs to pass to the template renderer. Will be accessible by the chat template.
- Required: No

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Output

output (typing.Any):
- Required: Yes

====================================================================================================

ToolGenerationNode

Type: tool_generation
Operation Type: COMPLETION

Description

A node that generates tool calls and executes tools in a loop.

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘tool_generation’]):
- Default: tool_generation
- Required: No
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
model (str):
- Description: The model to use for completion.
- Required: Yes
temperature (Optional[float]):
- Description: The temperature to use for completion.
- Required: No
max_tokens (Optional[int]):
- Description: The maximum number of tokens to generate.
- Required: No
presence_penalty (Optional[float]):
- Description: The presence penalty to use for completion.
- Required: No
stop (Union[str, List[str], NoneType]):
- Description: The stop token(s) to use for completion.
- Required: No
seed (Optional[int]):
- Description: The seed to use for completion.
- Required: No
tool_choice (Union[Literal[‘required’, ‘auto’], Dict[str, Any], NoneType]):
- Description: Whether a tool must be used in the completion.
- Required: No
tools (Optional[List[egp_services.nodes.generation.tool_generation.ToolConfig]]):
- Description: The tools to use in the completion.
- Required: No

References in Configuration

The configuration includes references to the following models:

ToolConfig

Description: No description available
Fields:
- name (Literal[‘internal.Google Search’, ‘internal.WikipediaSearch’, ‘experimental.PythonInterpreter’, ‘experimental.sympy_calculator’, ‘nodes.StaticNode’, ‘nodes.GenerationNode’, ‘nodes.SQLExecutorNode’, ‘nodes.CodeExecutionNode’, ‘nodes.RetrieverNode’, ‘nodes.RerankerNode’, ‘nodes.ChatGenerationNode’]):
  - Required: Yes
- init_kwargs (Optional[Dict[str, Any]]):
  - Required: No

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Output

output (str):
- Required: Yes

====================================================================================================

TruncateMessagesNode

Type: truncate_messages
Operation Type: DATA_MANIPULATION

Description

Node to truncate old messages from a list of messages based on a token limit

Configuration

The configuration parameters for this node are as follows:

type (Literal[‘truncate_messages’]):
- Default: truncate_messages
- Required: No
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
name (str):
- Description: The name of the node.
- Required: Yes
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
llm_model (Optional[Literal[‘llama-2-7b-chat’, ‘llama-2-13b-chat’, ‘llama-2-70b-chat’, ‘falcon-7b-insruct’, ‘falcon-40b-instruct’, ‘mistral-7b-instruct’, ‘mixtral-8x7b-instruct’, ‘mixtral-8x22b-instruct’, ‘codellama-7b-instruct’, ‘codellama-13b-instruct’, ‘codellama-34b-instruct’, ‘codellama-70b-instruct’, ‘zephyr-7b-alpha’, ‘zephyr-7b-beta’, ‘gemma-2b-instruct’, ‘gemma-7b-instruct’, ‘llama-3-8b-instruct’, ‘llama-3-70b-instruct’, ‘llama-3-1-8b-instruct’, ‘llama-3-1-70b-instruct’, ‘llama-3-1-405b-instruct’, ‘dolphin-2.2-70b’, ‘dolphin-2.9-llama3-70b’, ‘dolphin-2.9-llama3-8b’]]):
- Description: The name of the model to use for tokenization.
- Required: No
desired_token_count (int):
- Description: The desired token count for the truncated messages
- Required: Yes
preserve_system_message (bool):
- Description: Whether to preserve the system message in the truncated messages
- Default: True
- Required: No
prohibited_starting_roles (List[str]):
- Description: List of prohibiting roles to start after a system message
- Default: ['assistant']
- Required: No

Method

The node provides the following method:

Inputs

messages (List[egp_services.types.core.chat.Message]):
- Required: Yes

References in Inputs

The inputs include references to the following models:

Message

Description: Represents a single message with a role
Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
  - Description: The role of the message
  - Required: Yes
- content (str):
  - Description: The content of the message
  - Required: Yes

Getting Started

Agents

Evaluations

Uploading Data

Creating Applications

Creating Applications

Evaluation Datasets

Inference

Miscellaneous

Managing Annotations

Components

​BatchedNode

​Description

​Configuration

​Method

​Inputs

​Output

​BranchDecisionNode

​Description

​Configuration

​References in Configuration

​ConditionalWorkflowItem

​Method

​Inputs

​Output

​References in Output

​BranchDecisionNodeOutput

​ChatGenerationNode

​Description

​Configuration

​References in Configuration

​RetryConfig

​Method

​Inputs

​References in Inputs

​Message

​Output

​ChatGenerationWithCitationsNode

​Description

​Configuration

​References in Configuration

​RetryConfig

​Method

​Inputs

​References in Inputs

​Message

​Chunk

​Output

​References in Output

​ModifiedCitationResponse

​ChunkEvaluationNode

​Description

​Configuration

​Method

​Inputs

​References in Inputs

​Chunk

​RetrievalSource

​Output

​CitationNode

​Description

​Configuration

​References in Configuration

​CitationContext

​Method

​Inputs

​References in Inputs

​Chunk

​ModelCitation

​Output

​References in Output

​CitationResponse

​CodeExecutionNode

​Description

​Configuration

​Method

​Inputs

​Output

​ConditionNode

​Description

BatchedNode

Description

Configuration

Method

Inputs

Output

BranchDecisionNode

Description

Configuration

References in Configuration

ConditionalWorkflowItem

Method

Inputs

Output

References in Output

BranchDecisionNodeOutput

ChatGenerationNode

Description

Configuration

References in Configuration

RetryConfig

Method

Inputs

References in Inputs

Message

Output

ChatGenerationWithCitationsNode

Description

Configuration

References in Configuration

RetryConfig

Method

Inputs

References in Inputs

Message

Chunk

Output

References in Output

ModifiedCitationResponse

ChunkEvaluationNode

Description

Configuration

Method

Inputs

References in Inputs

Chunk

RetrievalSource

Output

CitationNode

Description

Configuration

References in Configuration

CitationContext

Method

Inputs

References in Inputs

Chunk

ModelCitation

Output

References in Output

CitationResponse

CodeExecutionNode

Description

Configuration

Method

Inputs

Output

ConditionNode

Description

Configuration

References in Configuration

CompoundCondition

Method

Inputs

Output

CreateMessagesNode

Description

Configuration

References in Configuration

AlternatingRoleMessages