Agent Service Node Library
A library of available nodes
New nodes are constantly being created. This document was last updated 5/1/2024
BatchedNode
Type: batched
Operation Type: CUSTOM
Description
Group inputs into batches.
This node uses more_itertools.chunked or chunked_even to split an iterable into batches.
Here is an example config for a batched node:
In this example, when the input is “0123456789”, the output will be: [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
This allows you to easily batch up the input for processing in parallel.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘batched’]):
- Default: batched
- Required: No
-
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
batch_size (int):
- Description: The size of the batches to split the input into.
- Required: Yes
-
split_approx_evenly (bool):
- Description: Whether to split the input into batches of approximately equal size.
- Default: True
- Required: No
Method
The node provides the following method:
Inputs
- iterable (Iterable[~T]):
- Required: Yes
Output
- output (list):
- Required: Yes
====================================================================================================
BranchDecisionNode
Type: branch_decision_node
Operation Type: CUSTOM
Description
Decides which branch to execute based on conditions provided by the user in a plan config. For example, a plan config might contain a branch config with 3 mutually exclusive branches like this:
and this node will run with input_kwargs = {"add_frog_to_message": MESSAGE, "add_giraffe_to_message": MESSAGE}
,
in order to decide which workflow to execute next.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘branch_decision_node’]):
- Default: branch_decision_node
- Required: No
-
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
ordered_conditional_workflow_cfgs (List[egp_services.plan.workflow_items.ConditionalWorkflowItem]):
- Description: The ordered list of conditional workflows to evaluate.
- Required: Yes
-
dependent_nodes (List[str]):
- Description: The names of the nodes that this node depends on.
- Required: Yes
References in Configuration
The configuration includes references to the following models:
ConditionalWorkflowItem
- Description: Representation of a workflow that is part of a branch complex Attributes: condition: Either if, elif, or else condition_input_var: the variable that is inputted into the unary condition operator: the unary condition operator reference_var: the unary condition reference variable that is used to build a predicate function with the operator condition_tree: a compound logical expression expressed as a tree (see CompoundCondition), optional if the above condition vars are set workflow_nodes: to be set to a list of all nodes in the workflow at runtime, so they can be easily ignored if this branch is not selected to run
- Fields:
- workflow_name (str):
- Description: The name of the abstract workflow
- Required: Yes
- workflow_alias (Optional[str]):
- Description: The alias of the abstract workflow in the graph
- Required: No
- workflow_inputs (Optional[Dict[str, Union[str, Dict[str, Union[str, Dict]]]]]):
- Description: The inputs to the workflow
- Required: No
- condition (Literal[‘if’, ‘elif’, ‘else’]):
- Description: The condition type
- Required: Yes
- condition_input_var (Optional[str]):
- Description: The input variable to the unary condition
- Required: No
- operator (Optional[str]):
- Description: The operator to use in the unary condition
- Required: No
- reference_var (Optional[str]):
- Description: The reference variable to use in the unary condition
- Required: No
- condition_tree (Optional[egp_services.helpers.CompoundCondition]):
- Description: The compound condition tree
- Required: No
- workflow_nodes (List[str]):
- Description: The nodes that make up the workflow
- Required: No
- workflow_name (str):
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (BranchDecisionNodeOutput):
- Required: Yes
References in Output
The output includes references to the following models:
BranchDecisionNodeOutput
- Description: No description available
- Fields:
- selected_workflow_name (Optional[str]):
- Required: Yes
- new_nodes_to_ignore (Set[str]):
- Required: Yes
- selected_workflow_name (Optional[str]):
====================================================================================================
ChatGenerationNode
Type: chat_generation
Operation Type: COMPLETION
Description
A node that generates chat completions from an LLM.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘chat_generation’]):
- Default: chat_generation
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
-
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
-
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
-
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
-
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
-
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
-
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
-
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default:
{'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No
-
memory_strategy (Optional[Dict[str, Any]]):
- Description: The memory strategy to prevent exceeding the LLM’s context limit from being exceeded. Currently SGP only supports the Last K memory strategy (keep the last k messages), but will be adding new strategies soon. (e.g.
{'type': 'last_k', 'params: {'k': 1000} }
) - Required: No
- Description: The memory strategy to prevent exceeding the LLM’s context limit from being exceeded. Currently SGP only supports the Last K memory strategy (keep the last k messages), but will be adding new strategies soon. (e.g.
-
instructions (Optional[str]):
- Description: Instructions for the system prompt. If not provided, there will be no system message. It can also be provided at runtime.
- Required: No
References in Configuration
The configuration includes references to the following models:
RetryConfig
- Description: No description available
- Fields:
- tries (int):
- Description: The number of times to retry the operation before giving up.
- Default: 5
- Required: No
- delay (int):
- Description: The number of seconds to wait before retrying the operation.
- Default: 1
- Required: No
- backoff (int):
- Description: The factor by which to increase the delay between retries.
- Default: 2
- Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
- Description: The exceptions to retry on.
- Default:
['SGPClientError']
- Required: No
- tries (int):
Method
The node provides the following method:
Inputs
-
messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
-
dynamic_kwargs (typing.Any):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Output
- output (str):
- Required: Yes
====================================================================================================
ChatGenerationWithCitationsNode
Type: chat_generation_with_citations
Operation Type: COMPLETION
Description
Extends ChatGenerationNode to enrich inline citations by modifying the streamed text to and wrapping the result in a CitationResponse object, including any citations referenced in that chunk.
The streaming citation enrichment uses a regex pattern to match citations and replaces them with a
format string that includes the citation index and the fragment UUID. For example, to replace
“[1]” with “[[1]](#chunk_id=abcd-1234)
”, the regex pattern would be r"\[(\d+)\]"
and the replace
string would be r"[[{}]](#chunk_id={})"
.
This formulation means it is easy to handle citation formats where the citation itself is replaced
by the enriched text, but not easy to deal with longer-range edits such as the {'<'}span{'>'}
tag format of
Donovan v1. It also does not allow us to handle grouped citations (e.g., [1, 2]) or range citations
(e.g., [1-3]). Instead, we have to instruct the model to insert those separately (e.g., [1][2][3]).
Configuration
The configuration parameters for this node are as follows:
-
regex_pattern (str):
- Description: The regex pattern to match citations in the text. The first group must be the citation index!
- Required: Yes
-
regex_replace (str):
- Description: The format string to replace the citation with. The first argument is the citation index and the second is the fragment UUID.
- Required: Yes
-
use_original_indexing (bool):
- Description: Citations are numbered in the order they appear in the output. To use the original indexing, set this to True.
- Required: No
-
type (Literal[‘chat_generation_with_citations’]):
- Default: chat_generation_with_citations
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
-
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
-
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
-
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
-
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
-
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
-
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
-
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default:
{'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No
-
memory_strategy (Optional[Dict[str, Any]]):
- Description: The memory strategy to prevent exceeding the LLM’s context limit from being exceeded. Currently SGP only supports the Last K memory strategy (keep the last k messages), but will be adding new strategies soon. (e.g.
{'type': 'last_k', 'params: {'k': 1000} }
) - Required: No
- Description: The memory strategy to prevent exceeding the LLM’s context limit from being exceeded. Currently SGP only supports the Last K memory strategy (keep the last k messages), but will be adding new strategies soon. (e.g.
-
instructions (Optional[str]):
- Description: Instructions for the system prompt. If not provided, there will be no system message. It can also be provided at runtime.
- Required: No
References in Configuration
The configuration includes references to the following models:
RetryConfig
- Description: No description available
- Fields:
- tries (int):
- Description: The number of times to retry the operation before giving up.
- Default: 5
- Required: No
- delay (int):
- Description: The number of seconds to wait before retrying the operation.
- Default: 1
- Required: No
- backoff (int):
- Description: The factor by which to increase the delay between retries.
- Default: 2
- Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
- Description: The exceptions to retry on.
- Default:
['SGPClientError']
- Required: No
- tries (int):
Method
The node provides the following method:
Inputs
-
messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
-
references (Optional[List[scale_gp.types.shared.chunk.Chunk]]):
- Required: No
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
Output
- output (ModifiedCitationResponse):
- Required: Yes
References in Output
The output includes references to the following models:
ModifiedCitationResponse
- Description: Basic class to represent the output of a citation node.
- Fields:
- output (str):
- Description: The output of the citation node
- Required: Yes
- citations (List[egp_services.types.citation.Citation]):
- Description: The citations created by the citation node
- Required: Yes
- original (str):
- Description: The original text output
- Required: Yes
- output (str):
====================================================================================================
ChunkEvaluationNode
Type: chunk_eval
Operation Type: EVALUATION
Description
This is a node to evaluate how the retrieval of your chunks has gone. It uses fuzzy matching of chunks against source chunks.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘chunk_eval’]):
- Default: chunk_eval
- Required: No
-
operation_type (Literal[NodeOperationType.EVALUATION: ‘EVALUATION’]):
- Default: NodeOperationType.EVALUATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
top_k_thresholds (List[int]):
- Description: The top-k values you’d like to evaluate retrieval against.
- Required: Yes
-
fuzzy_match_threshold (float):
- Description: The threshold for fuzzy string matching.
- Default: 0.5
- Required: No
-
require_all (bool):
- Description: Whether all sources must be present in the top k chunks.
- Required: No
Method
The node provides the following method:
Inputs
-
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
-
sources (List[egp_services.types.dataset.RetrievalSource]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
RetrievalSource
- Description: Represents a single source from the corpus of documents that we expect should be retrieved when a particular question is asked.
- Fields:
- source_type (RetrievalSourceType):
- Description: The type of the source
- Required: Yes
- document (str):
- Description: The name of the document
- Required: Yes
- page (Optional[int]):
- Description: The page number of the document
- Required: No
- content (Optional[str]):
- Description: The text content of the chunk
- Required: No
- source_type (RetrievalSourceType):
Output
- output (Dict[str, bool]):
- Required: Yes
====================================================================================================
CitationNode
Type: citation
Operation Type: CITATION
Description
A node that generates citations for a given text using a citation generator using ROUGE similarity.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘citation’]):
- Default: citation
- Required: No
-
operation_type (Literal[NodeOperationType.CITATION: ‘CITATION’]):
- Default: NodeOperationType.CITATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
citation_type (Literal[‘rouge’, ‘model_defined’]):
- Description: The type of citation to use.
- Required: Yes
-
citation_context (CitationContext):
- Description: The citation context (e.g. configuration of how make the citations) to use.
- Default:
{'min_similarity': 0.5, 'metric': 'rouge2', 'score': 'recall', 'generate_with_llm': False}
- Required: No
-
s3_path_override (Optional[str]):
- Description: If provided, the S3 path to use for the attachment URL in the citation to deal with auth challenges.
- Required: No
References in Configuration
The configuration includes references to the following models:
CitationContext
- Description: No description available
- Fields:
- min_similarity (Optional[Annotated[float, None, Interval(gt=None, ge=0, lt=None, le=1), None, None]]):
- Description: The similarity threshold at which a candidate match is returned as a citation
- Required: No
- metric (Optional[Annotated[str, StringConstraints(strip_whitespace=None, to_upper=None, to_lower=None, strict=None, min_length=None, max_length=None, pattern=‘^rouge(L|Lsum|\d+)$’)]]):
- Description: The ROUGE metric to use (rouge2 by default)
- Required: No
- score (Optional[Literal[‘precision’, ‘recall’, ‘fmeasure’]]):
- Description: The value to use with ROUGE citation matching (recall is the default)
- Required: No
- generate_with_llm (bool):
- Description: When True, we prompt the LLM to include citations in it’s response and parse them during post-processing rather than identifying cited chunks entirely in post-processing like we do for other methods.
- Required: No
- min_similarity (Optional[Annotated[float, None, Interval(gt=None, ge=0, lt=None, le=1), None, None]]):
Method
The node provides the following method:
Inputs
-
generated_text (str):
- Required: Yes
-
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
-
model_citations (Optional[List[egp_services.types.citation.ModelCitation]]):
- Required: No
References in Inputs
The inputs include references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
ModelCitation
- Description: No description available
- Fields:
- start_index (int):
- Description: The start index of the citation
- Required: Yes
- end_index (int):
- Description: The end index of the citation
- Required: Yes
- cited_text (str):
- Description: The text of the citation
- Required: Yes
- chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Description: The chunks of the citation
- Required: Yes
- chunk_indices (List[int]):
- Description: The indices of the chunks
- Required: Yes
- start_index (int):
Output
- output (CitationResponse):
- Required: Yes
References in Output
The output includes references to the following models:
CitationResponse
- Description: Basic class to represent the output of a citation node.
- Fields:
- output (str):
- Description: The output of the citation node
- Required: Yes
- citations (List[egp_services.types.citation.Citation]):
- Description: The citations created by the citation node
- Required: Yes
- output (str):
====================================================================================================
CodeExecutionNode
Type: code_execution
Operation Type: CODE_EXECUTION
Description
A node that executes Python code in an AWS Lambda sandbox.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘code_execution’]):
- Default: code_execution
- Required: No
-
operation_type (Literal[NodeOperationType.CODE_EXECUTION: ‘CODE_EXECUTION’]):
- Default: NodeOperationType.CODE_EXECUTION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
files (Dict[str, str]):
- Description: A mapping of sandboxed file paths to their source paths. Keys should be relative paths at which the files will be saved within the sandbox, and values should be source file identifiers compatible with smart_open.
- Required: Yes
-
output_artifacts_dir (Optional[str]):
- Description: If set, output files will be saved to this directory
- Required: No
Method
The node provides the following method:
Inputs
-
code (str):
- Required: Yes
-
files (Optional[Dict[str, str]]):
- Required: No
Output
- output (Dict[str, str]):
- Required: Yes
====================================================================================================
ConditionNode
Type: condition
Operation Type: DATA_MANIPULATION
Description
A node that checks certain conditions and outputs the result.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘condition’]):
- Default: condition
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
condition (CompoundCondition):
- Description: A compound condition that will be evaluated to determine whether the node should run. The condition has the following structure: logical_operator: Literal[“ALL”, “ANY”, “NOT”] = “ALL” conditions: Optional[List[Union[UnaryCondition, “CompoundCondition”]]] = None input_names: List[str] = None # populated by model validator
Each condition in the conditions list can be either a UnaryCondition or a CompoundCondition. A Unary Condition looks like; Representation of a boolean function with a single input e.g. the condition specified by input_name: x operator: ‘contains’ ref_value: ‘c’ would evaluate to True if x == ‘cat’ Operators are defined in the constant function store dict_keys([‘greater_than’, ‘equals’, ‘less_than’, ‘is_in’, ‘contains’, ‘has_duplicates’, ‘is_null’, ‘bool’])
- Required: Yes
References in Configuration
The configuration includes references to the following models:
CompoundCondition
- Description: Representation of a compound boolean statement, i.e. a negation, conjunction, or disjunction of UnaryConditions
- Fields:
- logical_operator (Literal[‘ALL’, ‘ANY’, ‘NOT’]):
- Description: The logical operator to apply to the conditions
- Default: ALL
- Required: No
- conditions (Optional[List[Union[egp_services.helpers.UnaryCondition, egp_services.helpers.CompoundCondition]]]):
- Description: The list of conditions to apply the logical operator to
- Required: No
- input_names (Optional[List[str]]):
- Description: The list of input names used in the conditions, populated by model validator
- Required: No
- logical_operator (Literal[‘ALL’, ‘ANY’, ‘NOT’]):
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (bool):
- Required: Yes
====================================================================================================
CreateMessagesNode
Type: create_messages
Operation Type: DATA_MANIPULATION
Description
This node will construct a list of valid messages
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘create_messages’]):
- Default: create_messages
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
message_configs (List[Union[egp_services.nodes.formatting.messages.AlternatingRoleMessages, egp_services.nodes.formatting.messages.SingleRoleMessages]]):
- Description: We take as input a list of configs that represent either a single user or two users we want to alternate between
- Required: Yes
References in Configuration
The configuration includes references to the following models:
AlternatingRoleMessages
- Description: A class representing multiple roles that alternate between each other
- Fields:
- role_value_pairs (List[Dict[str, str]]):
- Description: A list of dictionaries with a role and a value (where value is a string or a string representing an input)
- Required: Yes
- role_value_pairs (List[Dict[str, str]]):
SingleRoleMessages
- Description: A class represent a single role. It is a dictionary with a role
- Fields:
- role (str):
- Description: The role of the message (e.g. user, assistant, system)
- Required: Yes
- content (str):
- Description: This can either be a string representing the content (e.g. system message) or a string representing an input (either a str or a list of strings)
- Required: Yes
- role (str):
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (List[Dict[str, str]]):
- Required: Yes
====================================================================================================
DataTransformNode
Type: data_transform
Operation Type: DATA_MANIPULATION
Description
A node that performs data transformations on the input data.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘data_transform’]):
- Default: data_transform
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
action (Literal[‘load_json_chunk_vals’, ‘to_key_value_dict’, ‘get_json_val’, ‘to_list’, ‘strip’, ‘df_sample_str’, ‘till_stop_token’, ‘identity’, ‘select_keys_from_dict’, ‘int_adder’, ‘json_loads’, ‘clean_sql’, ‘len’, ‘index’, ‘slice’, ‘get_pydantic_attribute’, ‘get’, ‘read_text_file’, ‘split_to_chunks_nltk’]):
- Description: The action to perform on the input data. Must be one of the keys in dict_keys([‘load_json_chunk_vals’, ‘to_key_value_dict’, ‘get_json_val’, ‘to_list’, ‘strip’, ‘df_sample_str’, ‘till_stop_token’, ‘identity’, ‘select_keys_from_dict’, ‘int_adder’, ‘json_loads’, ‘clean_sql’, ‘len’, ‘index’, ‘slice’, ‘get_pydantic_attribute’, ‘get’, ‘read_text_file’, ‘split_to_chunks_nltk’]).
- Required: Yes
-
additional_inputs (Dict[str, Any]):
- Description: Additional inputs for the action.
- Required: No
-
apply_to_dictlist_leaves (bool):
- Description: Whether the action should be applied to the leaves of a nested dict/list.
- Required: No
Method
The node provides the following method:
Inputs
- response (typing.Any):
- Required: Yes
Output
- output (Dict[str, Any]):
- Required: Yes
====================================================================================================
FileReaderNode
Type: file_reader
Operation Type: FILE_RETRIEVAL
Description
A node that loads and reads a file to a string. Supports SGP file ids.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘file_reader’]):
- Default: file_reader
- Required: No
-
operation_type (Literal[NodeOperationType.FILE_RETRIEVAL: ‘FILE_RETRIEVAL’]):
- Default: NodeOperationType.FILE_RETRIEVAL
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
file (Union[str, List[str]]):
- Description: Unique identifier for the file held by SGP to be read. May also be a list of file ids to be read. If a list, the node will return a dictionary of file id to file contents.
- Required: No
-
decode_as (Optional[str]):
- Description: If provided, will decode the file contents as this encoding (i.e. ‘utf-8’)
- Required: No
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (Union[str, bytes]):
- Required: Yes
====================================================================================================
GenerationNode
Type: generation
Operation Type: COMPLETION
Description
A string in, string out generation node that uses an LLM model to generate completions.
It will pull the model from the SGP API and use it to generate completions.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘generation’]):
- Default: generation
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
-
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
-
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
-
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
-
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
-
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
-
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
-
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default:
{'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No
References in Configuration
The configuration includes references to the following models:
RetryConfig
- Description: No description available
- Fields:
- tries (int):
- Description: The number of times to retry the operation before giving up.
- Default: 5
- Required: No
- delay (int):
- Description: The number of seconds to wait before retrying the operation.
- Default: 1
- Required: No
- backoff (int):
- Description: The factor by which to increase the delay between retries.
- Default: 2
- Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
- Description: The exceptions to retry on.
- Default:
['SGPClientError']
- Required: No
- tries (int):
Method
The node provides the following method:
Inputs
-
input_prompt (str):
- Required: Yes
-
input_image_dir (Optional[egp_services.types.core.root.InputImageDir]):
- Required: No
-
input_image_paths (Optional[egp_services.types.core.root.InputImagePaths]):
- Required: No
-
dynamic_kwargs (typing.Any):
- Required: Yes
References in Inputs
The inputs include references to the following models:
InputImageDir
- Description: Represents a directory containing input images. The root attribute stores the directory path as a string.
- Fields:
- root (str):
- Description: The directory containing the input images
- Required: Yes
- root (str):
InputImagePaths
- Description: A list of paths to input images. The root attribute stores the directory path as a List of strings.
- Fields:
- root (List[str]):
- Description: A list of paths to input images
- Required: Yes
- root (List[str]):
Output
- output (Union[str, egp_services.tracing.NodeOutput]):
- Required: Yes
References in Output
The output includes references to the following models:
NodeOutput
- Description: Container for the output of a node run and any metadata associated with it.
- Fields:
====================================================================================================
GenerationWithCitationsNode
Type: generation_with_citations
Operation Type: COMPLETION
Description
Extends GenerationNode to enrich inline citations by modifying the streamed text to and wrapping the result in a CitationResponse object, including any citations referenced in that chunk.
The streaming citation enrichment uses a regex pattern to match citations and replaces them with a
format string that includes the citation index and the fragment UUID. For example, to replace
“[1]” with “[[1]](#chunk_id=abcd-1234)
”, the regex pattern would be r"\[(\d+)\]"
and the replace
string would be r"[[{}]](#chunk_id={})"
.
This formulation means it is easy to handle citation formats where the citation itself is replaced
by the enriched text, but not easy to deal with longer-range edits such as the {'<'}span{'>'}
tag format of
Donovan v1. It also does not allow us to handle grouped citations (e.g., [1, 2]) or range citations
(e.g., [1-3]). Instead, we have to instruct the model to insert those separately (e.g., [1][2][3]).
Configuration
The configuration parameters for this node are as follows:
-
regex_pattern (str):
- Description: The regex pattern to match citations in the text. The first group must be the citation index!
- Required: Yes
-
regex_replace (str):
- Description: The format string to replace the citation with. The first argument is the citation index and the second is the fragment UUID.
- Required: Yes
-
use_original_indexing (bool):
- Description: Citations are numbered in the order they appear in the output. To use the original indexing, set this to True.
- Required: No
-
type (Literal[‘generation_with_citations’]):
- Default: generation_with_citations
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (Optional[str]):
- Description: The name of the LLM model to use for generation. If not provided, llm_model_instance and llm_model_deployment must be set.
- Required: No
-
llm_model_instance (Optional[str]):
- Description: The name of the custom LLM model instance to use for generation. If not provided, llm_model must be set.
- Required: No
-
llm_model_deployment (Optional[str]):
- Description: The name of the custom LLM model deployment to use for generation. If not provided, llm_model must be set.
- Required: No
-
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
-
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
-
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
-
strip_whitespace (bool):
- Description: Whether to strip leading and trailing whitespace from the response.
- Required: No
-
retry_config (RetryConfig):
- Description: The retry configuration for the node. Can be used to specify the number of retries, delay, backoff, and exceptions to retry on.
- Default:
{'tries': 5, 'delay': 1, 'backoff': 2, 'exceptions': ['SGPClientError']}
- Required: No
References in Configuration
The configuration includes references to the following models:
RetryConfig
- Description: No description available
- Fields:
- tries (int):
- Description: The number of times to retry the operation before giving up.
- Default: 5
- Required: No
- delay (int):
- Description: The number of seconds to wait before retrying the operation.
- Default: 1
- Required: No
- backoff (int):
- Description: The factor by which to increase the delay between retries.
- Default: 2
- Required: No
- exceptions (List[Literal[‘SGPClientError’, ‘APITimeoutError’, ‘InternalServerError’, ‘RateLimitError’, ‘Exception’]]):
- Description: The exceptions to retry on.
- Default:
['SGPClientError']
- Required: No
- tries (int):
Method
The node provides the following method:
Inputs
-
input_prompt (str):
- Required: Yes
-
references (Optional[List[egp_services.types.citation.Citation]]):
- Required: No
References in Inputs
The inputs include references to the following models:
Citation
- Description: No description available
- Fields:
- body (Optional[str]):
- Description: The text of the citation
- Required: No
- cited_text (Optional[str]):
- Description: The chunk of cited text, this is essentially a duplicate field to body. Keeping this here to maintain backwards compatibility with plugins.
- Required: No
- s3_attachment_url (str):
- Description: The URL of the attachment
- Required: Yes
- page_number (str):
- Description: The page number of the citation
- Required: Yes
- fragment_uuid (str):
- Description: The UUID of the fragment
- Required: Yes
- document_uuid (str):
- Description: The UUID of the document
- Required: Yes
- source_type (Optional[Literal[‘document’, ‘sql’]]):
- Description: The type of the source
- Required: No
- document_classification (Optional[str]):
- Description: The classification of the document
- Required: No
- metadata (Optional[Dict]):
- Description: The metadata of the citation
- Required: No
- body (Optional[str]):
Output
- output (ModifiedCitationResponse):
- Required: Yes
References in Output
The output includes references to the following models:
ModifiedCitationResponse
- Description: Basic class to represent the output of a citation node.
- Fields:
- output (str):
- Description: The output of the citation node
- Required: Yes
- citations (List[egp_services.types.citation.Citation]):
- Description: The citations created by the citation node
- Required: Yes
- original (str):
- Description: The original text output
- Required: Yes
- output (str):
====================================================================================================
GenerationWithDocumentModelNode
Type: generation_with_document_model
Operation Type: COMPLETION
Description
This is a node that generates completions using a document model (e.g. Cohere’s document model).
This is different than a normal generation model because you can pass the “documents” (or chunks) directly to the LLM provider and they’ll automatically deal with the retrieval and citation for you.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘generation_with_document_model’]):
- Default: generation_with_document_model
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (str):
- Description: The document model to use.
- Required: Yes
-
max_tokens (int):
- Description: The maximum number of tokens to generate in the completion. The maximum value is 2048.
- Default: 512
- Required: No
-
temperature (float):
- Description: The sampling temperature. Lower values will result in more deterministic completions.
- Default: 0.2
- Required: No
-
stop_sequences (Optional[List[str]]):
- Description: A list of strings that will stop the completion when encountered.
- Required: No
Method
The node provides the following method:
Inputs
-
user_prompt (str):
- Required: Yes
-
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
Output
- output (Tuple[str, List[egp_services.types.citation.ModelCitation]]):
- Required: Yes
References in Output
The output includes references to the following models:
ModelCitation
- Description: No description available
- Fields:
- start_index (int):
- Description: The start index of the citation
- Required: Yes
- end_index (int):
- Description: The end index of the citation
- Required: Yes
- cited_text (str):
- Description: The text of the citation
- Required: Yes
- chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Description: The chunks of the citation
- Required: Yes
- chunk_indices (List[int]):
- Description: The indices of the chunks
- Required: Yes
- start_index (int):
====================================================================================================
GetMessageNode
Type: get_message
Operation Type: DATA_MANIPULATION
Description
Node to get a message from a list of messages
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘get_message’]):
- Default: get_message
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
index (int):
- Description: The index of the message to get
- Required: Yes
Method
The node provides the following method:
Inputs
- messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Output
- output (str):
- Required: Yes
====================================================================================================
InsertMessagesNode
Type: insert_messages
Operation Type: DATA_MANIPULATION
Description
Node to insert a list of messages into another list of messages at a certain index
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘insert_messages’]):
- Default: insert_messages
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
index (int):
- Description: The index to insert the new messages at. If index = -1, then the messages will be appended to the end of the list
- Required: Yes
Method
The node provides the following method:
Inputs
-
messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
-
new_messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Output
- output (List[Dict[str, str]]):
- Required: Yes
====================================================================================================
JinjaNode
Type: jinja
Operation Type: PROMPT_ENGINEERING
Description
An extremely flexible node that takes as input a dictionary of data, where each key is a tag corresponding to the data transformation Jinja template tag (optional), and each value is the data to be templated.
It will always take as input something extremely flexible and return a string.
Here are some example configs:
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘jinja’]):
- Default: jinja
- Required: No
-
operation_type (Literal[NodeOperationType.PROMPT_ENGINEERING: ‘PROMPT_ENGINEERING’]):
- Default: NodeOperationType.PROMPT_ENGINEERING
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
data_transformations (Dict[str, egp_services.nodes.jinja.JinjaNodeTemplate]):
- Description: A dictionary of data transformations to apply to the input data.
- Required: No
-
output_template (JinjaNodeTemplate):
- Description: The output template to use.
- Default:
{'jinja_template_str_unloaded': None, 'jinja_template_path': None, 'jinja_helper_functions': [], 'jinja_template_str_loaded': None}
- Required: No
-
log_output (bool):
- Description: Whether to log the output of the node to the logger.
- Required: No
-
log_prefix (str):
- Description: A prefix to add to the log output. Useful for distinguishing between multiple nodes.
- Required: No
-
verbose (bool):
- Description: Whether to log additional information about the node.
- Default: True
- Required: No
-
llm_model (Optional[Literal[‘llama-2-7b-chat’, ‘llama-2-13b-chat’, ‘llama-2-70b-chat’, ‘falcon-7b-insruct’, ‘falcon-40b-instruct’, ‘mistral-7b-instruct’, ‘mixtral-8x7b-instruct’, ‘mixtral-8x22b-instruct’, ‘codellama-7b-instruct’, ‘codellama-13b-instruct’, ‘codellama-34b-instruct’, ‘codellama-70b-instruct’, ‘zephyr-7b-alpha’, ‘zephyr-7b-beta’, ‘gemma-2b-instruct’, ‘gemma-7b-instruct’, ‘llama-3-8b-instruct’, ‘llama-3-70b-instruct’, ‘llama-3-1-8b-instruct’, ‘llama-3-1-70b-instruct’, ‘llama-3-1-405b-instruct’, ‘dolphin-2.2-70b’, ‘dolphin-2.9-llama3-70b’, ‘dolphin-2.9-llama3-8b’]]):
- Description: The name of the model to use for tokenization.
- Required: No
-
max_tokens (Optional[int]):
- Description: The maximum number of tokens to use for the output template. If set, the output template will be token-limited.
- Required: No
-
context_chunks_key (Optional[str]):
- Description: The key in the templated data that contains the context chunks to be token-limited.
- Default: context_chunks
- Required: No
References in Configuration
The configuration includes references to the following models:
JinjaNodeTemplate
- Description: Base model for a Jinja template. Guaranteed to store a string that can be read in to Template().
- Fields:
- jinja_template_str_unloaded (Optional[str]):
-
Description: Raw template to apply to the data. This should be a Jinja2 template string. Please note, the data will be mapped as ‘value’ in the template. Default None corresponds to
{{value}}
. Should access propertyjinja_template_str
or fieldjinja_template_str_loaded
for the loaded template data -
Required: No
-
- jinja_template_path (Optional[str]):
- Description: Path to a Jinja2 template file. Default None.
- Required: No
- jinja_helper_functions (Optional[List[Union[str, Dict[str, Any]]]]):
- Description: Potential helper functions to use in jinja, please refer to egp_services.enums
- Required: No
- jinja_template_str_loaded (Optional[str]):
-
Description: The original jinja_template_str field from the config might not contain the needed template, and we may need to load S3 data specified with
jinja_template_path
. This field caches the loaded template content, it is also accessed through propertyjinja_template_str
. -
Required: No
-
- jinja_template_str_unloaded (Optional[str]):
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (str):
- Required: Yes
====================================================================================================
LLMEngineNode
Type: llm_engine
Operation Type: COMPLETION
Description
A node that generates completions from an LLM leveraging LLMEngine calls instead of SGP.
Has increased functionality such as batch completion.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘llm_engine’]):
- Default: llm_engine
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (str):
- Description: The LLM model to use for generation. This should be a Scale-hosted LLM model.
- Required: Yes
-
max_tokens (Optional[int]):
- Description: The maximum number of tokens to generate in the completion.
- Default: 512
- Required: No
-
temperature (Optional[float]):
- Description: The temperature to use for generation. Higher values lead to more randomness.
- Default: 0.2
- Required: No
-
stop_sequences (Optional[List[str]]):
- Description: A list of strings that, if generated, will stop the completion.
- Required: No
-
presence_penalty (Optional[float]):
- Description: The presence penalty to use for generation.
- Required: Yes
-
frequency_penalty (Optional[float]):
- Description: The frequency penalty to use for generation.
- Required: Yes
-
top_k (Optional[int]):
- Description: The top-k value to use for generation.
- Required: Yes
-
top_p (Optional[float]):
- Description: The top-p value to use for generation.
- Required: Yes
-
timeout (int):
- Description: The maximum time in seconds to wait for the completion. Used for max_runtime_sec in batch completions call.
- Default: 900
- Required: No
-
include_stop_str_in_output (Optional[bool]):
- Description: Whether to include the stop string in the output.
- Required: No
-
guided_json (Optional[Dict[str, Any]]):
- Description: A JSON object to guide the completion.
- Required: No
-
guided_regex (Optional[str]):
- Description: A regex string to guide the completion.
- Required: No
-
guided_choice (Optional[List[str]]):
- Description: A list of strings to guide the completion.
- Required: No
-
batch_run_mode (Literal[‘sync’, ‘async’]):
- Description: Whether to run the batch completions synchronously or asynchronously.
- Default: async
- Required: No
-
batch_sys_kwargs (LLMEngineBatchSysKwargs):
- Description: System-level kwargs for batch completions.
- Required: No
References in Configuration
The configuration includes references to the following models:
LLMEngineBatchSysKwargs
- Description: No description available
- Fields:
- checkpoint_path (Optional[str]):
- Description: The checkpoint path to use for the model.
- Required: No
- seed (Optional[int]):
- Description: The seed to use for the model’s generation.
- Required: No
- num_shards (Optional[int]):
- Description: The number of shards to split the model into
- Default: 1
- Required: No
- labels (Optional[Dict[str, str]]):
- Description: The labels to use for the model (e.g. team/product)
- Required: No
- checkpoint_path (Optional[str]):
Method
The node provides the following method:
Inputs
- input_prompt (str):
- Required: Yes
Output
- output (str):
- Required: Yes
====================================================================================================
LLMGeneratedCitationNode
Type: llm_generated_citation
Operation Type: CITATION
Description
Converts LLM-generated citations to a format that can be used to cite chunks.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘llm_generated_citation’]):
- Default: llm_generated_citation
- Required: No
-
operation_type (Literal[NodeOperationType.CITATION: ‘CITATION’]):
- Default: NodeOperationType.CITATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
Method
The node provides the following method:
Inputs
-
generated_text (str):
- Required: Yes
-
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
Output
- output (CitationAndReferencesResponse):
- Required: Yes
References in Output
The output includes references to the following models:
CitationAndReferencesResponse
- Description: Class to represent the output of a citation node. Can include a list of uncited references that were seen by the LLM but not directly cited.
- Fields:
- output (str):
- Description: The output of the citation node
- Required: Yes
- citations (List[egp_services.types.citation.Citation]):
- Description: The citations created by the citation node
- Required: Yes
- uncited_references (Optional[List[egp_services.types.citation.Citation]]):
- Description: The references that were not cited
- Required: No
- output (str):
====================================================================================================
MergeNode
Type: merge_node
Operation Type: CUSTOM
Description
Either used to rename workflow outputs OR used after a branch in a plan, to allow users to reference the results from any branch under the same name. For example, extending the example branch config above with the following merge config:
allows the user to reference the outputs of any of these 3 branches via the alias “add_animal_to_message_by_color.message_with_animal.output
” in downstream workflows.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘merge_node’]):
- Default: merge_node
- Required: No
-
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
input_hierarchy (List[str]):
- Description: A list of node names defining the order in which to look for non-null inputs.
- Required: Yes
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (typing.Any):
- Required: Yes
====================================================================================================
ProcessorNode
Type: processor
Operation Type: CUSTOM
Description
A node that processes data by executing a series of functions. This allows for very custom logic that is specific to a certain project/customer.
Here are some example configs:
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘processor’]):
- Default: processor
- Required: No
-
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
return_key (str):
- Description: The function to return results from.
- Required: Yes
-
function_specs (Dict[str, egp_services.nodes.processing.node.FunctionSpec]):
- Description: Specifications for executing a function.
- Required: Yes
References in Configuration
The configuration includes references to the following models:
FunctionSpec
- Description: Define the specification of function execution
Key: A a unique name for the execution of a function and the kwargs.
Values: A dictionary mapping:
- path (str): path to the function
- kwargs (Dict[str, Any]): Keyword arguments for this execution of the function.
Function Key & Path Representation:
- Fields:
- path (str):
- Description: Path to the function.
- Required: Yes
- kwargs (Dict[str, Any]):
- Description: Keyword arguments for the function.
- Required: Yes
- path (str):
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (typing.Any):
- Required: Yes
====================================================================================================
RegexMatchNode
Type: regex_match
Operation Type: DATA_MANIPULATION
Description
A node that matches a string against a regex pattern and returns the match groups.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘regex_match’]):
- Default: regex_match
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
pattern (str):
- Description: The regex pattern to match against.
- Required: Yes
Method
The node provides the following method:
Inputs
- string (str):
- Required: Yes
Output
- output (Optional[Dict[str, str]]):
- Required: Yes
====================================================================================================
RemoveMessageNode
Type: remove_message
Operation Type: DATA_MANIPULATION
Description
Node to remove a message from a list of messages
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘remove_message’]):
- Default: remove_message
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
index (int):
- Description: The index of the message to remove. If index = -1, then the last message will be removed
- Required: Yes
Method
The node provides the following method:
Inputs
- messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Output
- output (List[Dict[str, str]]):
- Required: Yes
====================================================================================================
RerankerNode
Type: reranker
Operation Type: RERANKING
Description
A node that reranks a list of chunks based on a set of scorers.
An example configs:
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘reranker’]):
- Default: reranker
- Required: No
-
operation_type (Literal[NodeOperationType.RERANKING: ‘RERANKING’]):
- Default: NodeOperationType.RERANKING
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
scorers (List[egp_services.nodes.chunks.scorers.BM25Scorer | egp_services.nodes.chunks.scorers.CrossEncoderScorer | egp_services.nodes.chunks.scorers.RecencyScorer | egp_services.nodes.chunks.scorers.CustomModelScorer | egp_services.nodes.chunks.scorers.LLMRerankerScorer]):
- Description: The list of scorers to use for reranking.
- Required: Yes
-
num_to_return (int):
- Description: The number of chunks to return after reranking.
- Required: Yes
-
score_threshold (Optional[float]):
- Description: The threshold of the score to filter the chunks.
- Required: No
References in Configuration
The configuration includes references to the following models:
LLMRerankerScorer
- Description: LLM reranker using a Sliding window based approach.
- Fields:
- name (Literal[‘llm-reranker’]):
- Default: llm-reranker
- Required: No
- weight (float):
- Default: 1.0
- Required: No
- client_kwargs (Dict[str, str]):
- Required: No
- model (str):
- Required: Yes
- max_tokens (int):
- Default: 512
- Required: No
- temperature (float):
- Default: 0.2
- Required: No
- stop_sequences (Optional[List[str]]):
- Required: No
- jinja_template_path (str):
- Required: Yes
- jinja_template_str (str):
- Required: Yes
- name (Literal[‘llm-reranker’]):
CustomModelScorer
- Description: Finetuned reranker model uploaded to the Models API.
- Fields:
- name (Literal[‘sgp-custom-model’]):
- Default: sgp-custom-model
- Required: No
- weight (float):
- Default: 1.0
- Required: No
- client_kwargs (Dict[str, str]):
- Required: No
- model (str):
- Required: Yes
- instance_name (Optional[str]):
- Required: No
- deployment_name (Optional[str]):
- Required: No
- name (Literal[‘sgp-custom-model’]):
CrossEncoderScorer
- Description: Cross-encoder reranker. Can be loaded from local path or from EGP/Cohere. User must provide the name of the cross-encoder model.
- Fields:
- name (Literal[‘cross-encoder’]):
- Default: cross-encoder
- Required: No
- weight (float):
- Default: 1.0
- Required: No
- client_kwargs (Dict[str, str]):
- Required: No
- model (str):
- Required: Yes
- reranking_strategy (Union[scale_gp.types.chunk_rank_params.RankStrategyCrossEncoderRankStrategy, scale_gp.types.chunk_rank_params.RankStrategyModelRankStrategy, egp_services.nodes.chunks.scorers.LocalRerankStrategy, NoneType]):
- Required: No
- name (Literal[‘cross-encoder’]):
RecencyScorer
- Description: Recency scorer for chunks. Score is computed based on delta days or seconds using date metadata and the provided query date.
- Fields:
- name (Literal[‘recency’]):
- Default: recency
- Required: No
- weight (float):
- Default: 1.0
- Required: No
- client_kwargs (Dict[str, str]):
- Required: No
- field (Optional[str]):
- Default: published_date
- Required: No
- units (Optional[Literal[‘days’, ‘seconds’]]):
- Default: days
- Required: No
- name (Literal[‘recency’]):
BM25Scorer
- Description: BM25 scorer for chunks. Users can specify which field to use for computing BM25 score with query.
- Fields:
- name (Literal[‘bm25’]):
- Default: bm25
- Required: No
- weight (float):
- Default: 1.0
- Required: No
- client_kwargs (Dict[str, str]):
- Required: No
- field (Optional[str]):
- Default: text
- Required: No
- name (Literal[‘bm25’]):
Method
The node provides the following method:
Inputs
-
query (str):
- Required: Yes
-
chunks (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
Output
- output (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
References in Output
The output includes references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
====================================================================================================
ResponseParserNode
Type: response_parser
Operation Type: DATA_MANIPULATION
Description
A node that parses a response based on a given action.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘response_parser’]):
- Default: response_parser
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
action (Literal[‘greater_than’, ‘equals’, ‘less_than’, ‘is_in’, ‘contains’, ‘has_duplicates’, ‘is_null’, ‘bool’, ‘to_integer’, ‘to_key_value_dict’]):
- Description: The action to perform on the response. Must be one of the keys: dict_keys([‘greater_than’, ‘equals’, ‘less_than’, ‘is_in’, ‘contains’, ‘has_duplicates’, ‘is_null’, ‘bool’, ‘to_integer’, ‘to_key_value_dict’])
- Required: Yes
-
reference_value (Optional[Any]):
- Description: A reference value to use in the action. The meaning of this value depends on the action.
- Required: No
Method
The node provides the following method:
Inputs
- response (str):
- Required: Yes
Output
- output (Dict[str, Any]):
- Required: Yes
====================================================================================================
RetrieverNode
Type: retriever
Operation Type: KB_RETRIEVAL
Description
Retrieves chunks from a knowledge base based on a query. The knowledge base can be specified by ID or name.
We also have the option to filter the retrieved chunks based on metadata fields. This is useful when we want to retrieve chunks that have metadata fields that match the values of certain columns in the input data.
Metadata: inputs format: Dictkey: str, val: str key: field that exists in metadata of the chunks from the knowledgebase to be queried val: field / column name that exists in input data kwargs / eval dataset csv file Example: The following code requires chunks to have metadata fields “chapter” and “section” and requires the data to have fields / columns called “chapter_column” and “section_column” It will retrieve chunks for a question such that only those chunks are retrieved whose metadata in the two fields is identical to the values provided for them in the question’s row.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘retriever’]):
- Default: retriever
- Required: No
-
operation_type (Literal[NodeOperationType.KB_RETRIEVAL: ‘KB_RETRIEVAL’]):
- Default: NodeOperationType.KB_RETRIEVAL
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
filter_fields (Optional[List[str]]):
- Description: The metadata fields to filter on.
- Required: No
-
knowledge_base_id (Optional[str]):
- Description: The ID of the knowledge base to query.
- Required: No
-
knowledge_base_name (Optional[str]):
- Description: The name of the knowledge base to query.
- Required: No
-
num_to_return (int):
- Description: The number of chunks to return after retrieval.
- Required: Yes
-
exact_knn_search (Optional[bool]):
- Description: Whether to use exact KNN search. If not provided, will be set to True if num_to_return 1000.
- Required: No
-
metadata (Optional[Dict[str, Optional[str]]]):
- Description: Metadata filters to apply to the chunks retrieved.
- Default:
{'field_concatenation_mode': 'and'}
- Required: No
-
min_results_per_knowledge_base (int):
- Description: The minimum number of results to return per knowledge base.
- Required: No
-
num_retriever_workers (int):
- Description: The number of concurrent retriever workers to use.
- Default: 1
- Required: No
Method
The node provides the following method:
Inputs
-
query (str):
- Required: Yes
-
metadata_filters (Optional[Dict[str, str]]):
- Required: No
-
include_embeddings (bool):
- Required: No
-
knowledge_base_ids (Union[egp_services.types.core.root.KnowledgeBaseId, List[egp_services.types.core.root.KnowledgeBaseId], NoneType]):
- Required: No
References in Inputs
The inputs include references to the following models:
KnowledgeBaseId
- Description: Represents a unique identifier for a knowledge base. The root attribute holds the ID as a string.
- Fields:
- root (str):
- Description: The id of the knowledge base
- Required: Yes
- root (str):
Output
- output (List[scale_gp.types.shared.chunk.Chunk]):
- Required: Yes
References in Output
The output includes references to the following models:
Chunk
- Description: No description available
- Fields:
- chunk_id (str):
- Required: Yes
- score (float):
- Required: Yes
- text (str):
- Required: Yes
- attachment_url (Optional[str]):
- Required: No
- embedding (Optional[List[float]]):
- Required: No
- metadata (Optional[object]):
- Required: No
- title (Optional[str]):
- Required: No
- user_supplied_metadata (Optional[object]):
- Required: No
- chunk_id (str):
====================================================================================================
SearchCitationNode
Type: search_citation
Operation Type: CITATION
Description
This node is used to create citations objects found via regex search.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘search_citation’]):
- Default: search_citation
- Required: No
-
operation_type (Literal[NodeOperationType.CITATION: ‘CITATION’]):
- Default: NodeOperationType.CITATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
search_regex (str):
- Description: The regex to search for the search result in the response.
- Required: Yes
-
end_search_regex (str):
- Description: The regex to search for the end of the search result in the response.
- Required: Yes
Method
The node provides the following method:
Inputs
- response (str):
- Required: Yes
Output
- output (CitationResponse):
- Required: Yes
References in Output
The output includes references to the following models:
CitationResponse
- Description: Basic class to represent the output of a citation node.
- Fields:
- output (str):
- Description: The output of the citation node
- Required: Yes
- citations (List[egp_services.types.citation.Citation]):
- Description: The citations created by the citation node
- Required: Yes
- output (str):
====================================================================================================
SQLExecutorNode
Type: sql_executor
Operation Type: CODE_EXECUTION
Description
A node that executes a SQL query on a DB. Right now, only Snowflake is supported.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘sql_executor’]):
- Default: sql_executor
- Required: No
-
operation_type (Literal[NodeOperationType.CODE_EXECUTION: ‘CODE_EXECUTION’]):
- Default: NodeOperationType.CODE_EXECUTION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
connector_kwargs (Dict[str, str]):
- Description: A dictionary of keyword arguments to pass to the DB connector
- Required: Yes
-
connector_type (Literal[‘snowflake’]):
- Description: The type of connector to use
- Default: snowflake
- Required: No
-
secrets (List[str]):
- Description: A list of keys in connector_kwargs that should be fetched from secrets
- Required: No
-
return_type (Literal[‘df’, ‘dicts’, ‘markdown’, ‘json’, ‘str’]):
- Description: The type of return value to return
- Default: markdown
- Required: No
-
schema_remapping_file (Optional[str]):
- Description: A file containing a schema remapping for tables
- Required: No
-
log_queries (bool):
- Description: Whether to log the queries executed by the node
- Default: True
- Required: No
Method
The node provides the following method:
Inputs
- query (str):
- Required: Yes
Output
- output (SQLResponse):
- Required: Yes
References in Output
The output includes references to the following models:
SQLResponse
- Description: No description available
- Fields:
- response (Union[str, List[Dict], NoneType]):
- Required: Yes
- error (Optional[str]):
- Required: Yes
- response (Union[str, List[Dict], NoneType]):
====================================================================================================
StaticNode
Type: static
Operation Type: CUSTOM
Description
A node to simply input static things.
Used when we want to input things into a workflow which might be difficult to inline. e.g. static ICL examples into a jinja node.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘static’]):
- Default: static
- Required: No
-
operation_type (Literal[NodeOperationType.CUSTOM: ‘CUSTOM’]):
- Default: NodeOperationType.CUSTOM
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
value (Optional[Any]):
- Description: The value to return from the node. Exactly one of
value
andfrom_file
must be provided. - Required: No
- Description: The value to return from the node. Exactly one of
-
from_file (Union[Dict, List, str, NoneType]):
- Description: A local/s3 path or a nested dict/list where each leaf is a local/s3 path. Use when you need to compile multiple data sources into one.
- Required: No
Method
The node provides the following method:
Inputs
(No explicit inputs listed for the primary method)
Output
- output (typing.Any):
- Required: Yes
====================================================================================================
TokenizerChatTemplateNode
Type: tokenizer_chat_template
Operation Type: PROMPT_ENGINEERING
Description
A node to help with formatting chat templates for instruct-trained LLMs.
Allows the user to pass in a list of messages and get the formatted string back using the official tokenizer template on huggingface.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘tokenizer_chat_template’]):
- Default: tokenizer_chat_template
- Required: No
-
operation_type (Literal[NodeOperationType.PROMPT_ENGINEERING: ‘PROMPT_ENGINEERING’]):
- Default: NodeOperationType.PROMPT_ENGINEERING
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (Literal[‘llama-2-7b-chat’, ‘llama-2-13b-chat’, ‘llama-2-70b-chat’, ‘falcon-7b-insruct’, ‘falcon-40b-instruct’, ‘mistral-7b-instruct’, ‘mixtral-8x7b-instruct’, ‘mixtral-8x22b-instruct’, ‘codellama-7b-instruct’, ‘codellama-13b-instruct’, ‘codellama-34b-instruct’, ‘codellama-70b-instruct’, ‘zephyr-7b-alpha’, ‘zephyr-7b-beta’, ‘gemma-2b-instruct’, ‘gemma-7b-instruct’, ‘llama-3-8b-instruct’, ‘llama-3-70b-instruct’, ‘llama-3-1-8b-instruct’, ‘llama-3-1-70b-instruct’, ‘llama-3-1-405b-instruct’, ‘llama-3-3-70b-instruct’, ‘dolphin-2.2-70b’, ‘dolphin-2.9-llama3-70b’, ‘dolphin-2.9-llama3-8b’]):
- Description: The name of the model to use for tokenization.
- Required: No
-
add_generation_prompt (bool):
- Description: Whether to end the prompt with the token(s) that indicate the start of an assistant message. This is useful when you want to generate a response from the model. Note that this argument will be passed to the chat template, and so it must be supported in the template for this argument to have any effect.
- Required: No
-
padding (bool):
- Description: Whether to pad sequences to the maximum length. Has no effect if tokenize is
False
. - Required: No
- Description: Whether to pad sequences to the maximum length. Has no effect if tokenize is
-
truncation (bool):
- Description: Whether to truncate sequences at the maximum length. Has no effect if tokenize is
False
. - Required: No
- Description: Whether to truncate sequences at the maximum length. Has no effect if tokenize is
-
max_length (Optional[int]):
- Description: Maximum length (in tokens) to use for padding or truncation. Has no effect if tokenize is
False
. - Required: No
- Description: Maximum length (in tokens) to use for padding or truncation. Has no effect if tokenize is
-
kwargs (Dict[str, Any]):
- Description: Additional kwargs to pass to the template renderer. Will be accessible by the chat template.
- Required: No
Method
The node provides the following method:
Inputs
- messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Output
- output (typing.Any):
- Required: Yes
====================================================================================================
ToolGenerationNode
Type: tool_generation
Operation Type: COMPLETION
Description
A node that generates tool calls and executes tools in a loop.
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘tool_generation’]):
- Default: tool_generation
- Required: No
-
operation_type (Literal[NodeOperationType.COMPLETION: ‘COMPLETION’]):
- Default: NodeOperationType.COMPLETION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
model (str):
- Description: The model to use for completion.
- Required: Yes
-
temperature (Optional[float]):
- Description: The temperature to use for completion.
- Required: No
-
max_tokens (Optional[int]):
- Description: The maximum number of tokens to generate.
- Required: No
-
presence_penalty (Optional[float]):
- Description: The presence penalty to use for completion.
- Required: No
-
stop (Union[str, List[str], NoneType]):
- Description: The stop token(s) to use for completion.
- Required: No
-
seed (Optional[int]):
- Description: The seed to use for completion.
- Required: No
-
tool_choice (Union[Literal[‘required’, ‘auto’], Dict[str, Any], NoneType]):
- Description: Whether a tool must be used in the completion.
- Required: No
-
tools (Optional[List[egp_services.nodes.generation.tool_generation.ToolConfig]]):
- Description: The tools to use in the completion.
- Required: No
References in Configuration
The configuration includes references to the following models:
ToolConfig
- Description: No description available
- Fields:
- name (Literal[‘internal.Google Search’, ‘internal.WikipediaSearch’, ‘experimental.PythonInterpreter’, ‘experimental.sympy_calculator’, ‘nodes.StaticNode’, ‘nodes.GenerationNode’, ‘nodes.SQLExecutorNode’, ‘nodes.CodeExecutionNode’, ‘nodes.RetrieverNode’, ‘nodes.RerankerNode’, ‘nodes.ChatGenerationNode’]):
- Required: Yes
- init_kwargs (Optional[Dict[str, Any]]):
- Required: No
- name (Literal[‘internal.Google Search’, ‘internal.WikipediaSearch’, ‘experimental.PythonInterpreter’, ‘experimental.sympy_calculator’, ‘nodes.StaticNode’, ‘nodes.GenerationNode’, ‘nodes.SQLExecutorNode’, ‘nodes.CodeExecutionNode’, ‘nodes.RetrieverNode’, ‘nodes.RerankerNode’, ‘nodes.ChatGenerationNode’]):
Method
The node provides the following method:
Inputs
- messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
Output
- output (str):
- Required: Yes
====================================================================================================
TruncateMessagesNode
Type: truncate_messages
Operation Type: DATA_MANIPULATION
Description
Node to truncate old messages from a list of messages based on a token limit
Configuration
The configuration parameters for this node are as follows:
-
type (Literal[‘truncate_messages’]):
- Default: truncate_messages
- Required: No
-
operation_type (Literal[NodeOperationType.DATA_MANIPULATION: ‘DATA_MANIPULATION’]):
- Default: NodeOperationType.DATA_MANIPULATION
- Required: No
-
name (str):
- Description: The name of the node.
- Required: Yes
-
client_kwargs (Dict[str, Any]):
- Description: Keyword arguments for the SGPClient.
- Required: No
-
node_metadata (Optional[List[Literal[‘_output_token_usage’, ‘_time_taken’, ‘_input_token_usage’, ‘_start_timestamp’]]]):
- Description: Metadata fields to capture for the node.
- Required: No
-
num_workers (Optional[int]):
- Description: The number of workers to use for parallel processing.
- Default: 8
- Required: No
-
type_hints (Optional[Dict[str, Any]]):
- Description: Type hints for the input parameters of the node in JSON schema.
- Required: No
-
llm_model (Optional[Literal[‘llama-2-7b-chat’, ‘llama-2-13b-chat’, ‘llama-2-70b-chat’, ‘falcon-7b-insruct’, ‘falcon-40b-instruct’, ‘mistral-7b-instruct’, ‘mixtral-8x7b-instruct’, ‘mixtral-8x22b-instruct’, ‘codellama-7b-instruct’, ‘codellama-13b-instruct’, ‘codellama-34b-instruct’, ‘codellama-70b-instruct’, ‘zephyr-7b-alpha’, ‘zephyr-7b-beta’, ‘gemma-2b-instruct’, ‘gemma-7b-instruct’, ‘llama-3-8b-instruct’, ‘llama-3-70b-instruct’, ‘llama-3-1-8b-instruct’, ‘llama-3-1-70b-instruct’, ‘llama-3-1-405b-instruct’, ‘dolphin-2.2-70b’, ‘dolphin-2.9-llama3-70b’, ‘dolphin-2.9-llama3-8b’]]):
- Description: The name of the model to use for tokenization.
- Required: No
-
desired_token_count (int):
- Description: The desired token count for the truncated messages
- Required: Yes
-
preserve_system_message (bool):
- Description: Whether to preserve the system message in the truncated messages
- Default: True
- Required: No
-
prohibited_starting_roles (List[str]):
- Description: List of prohibiting roles to start after a system message
- Default:
['assistant']
- Required: No
Method
The node provides the following method:
Inputs
- messages (List[egp_services.types.core.chat.Message]):
- Required: Yes
References in Inputs
The inputs include references to the following models:
Message
- Description: Represents a single message with a role
- Fields:
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):
- Description: The role of the message
- Required: Yes
- content (str):
- Description: The content of the message
- Required: Yes
- role (Literal[‘system’, ‘user’, ‘assistant’, ‘tool’, ‘ipython’]):