First steps for building agents with YAML.
The workflow defines a single node, `query_llm`, of type `generation`. It is designed to take a user's question (provided as `user_question`) and generate an answer directly using an LLM. The node uses the `"gpt-4o-mini"` model with a temperature of `0.2` to control randomness in the response (lower values yield more deterministic outputs). Its input, `input_prompt`, is mapped directly to `user_question`, meaning that whatever question is supplied by the user will be forwarded as the prompt to the LLM.
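Based on the fields mentioned above, a minimal workflow definition might look like the following sketch. The surrounding structure (`user_input`, `nodes`, `inputs`) is an assumed schema for illustration; only the node name, type, model, temperature, and the `input_prompt` → `user_question` mapping come from the description.

```yaml
# Hypothetical schema: the user_input / nodes / inputs keys are assumptions.
user_input:
  user_question: str              # the question supplied by the user

nodes:
  query_llm:
    type: generation              # direct LLM generation node
    model: "gpt-4o-mini"
    temperature: 0.2              # lower values -> more deterministic output
    inputs:
      input_prompt: user_question # forward the user's question as the prompt
```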
The `custom_prompt` node injects a fixed directive into the user's provided text. This directive tells the model to "be Yoda" when formulating its response. This simple manipulation makes it easy to experiment with persona-based responses without complex multi-turn conversation logic. The modified prompt is then passed to the `query_llm` node, which generates a response using the `"gpt-4o-mini"` model. The response is limited in length and controlled by the specified generation parameters.
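A sketch of this two-node persona workflow, under the same assumed schema — the exact directive wording, the `prompt` node type, and the `max_tokens` value are illustrative guesses, not taken from the workflow itself:

```yaml
# Hypothetical schema; only custom_prompt, query_llm, generation,
# gpt-4o-mini, and input_prompt come from the description above.
user_input:
  user_question: str

nodes:
  custom_prompt:
    type: prompt                              # assumed type for a prompt-rewriting node
    template: "Be Yoda. {{ user_question }}"  # fixed persona directive (wording assumed)
  query_llm:
    type: generation
    model: "gpt-4o-mini"
    temperature: 0.2
    max_tokens: 256                           # assumed cap ("limited in length" above)
    inputs:
      input_prompt: custom_prompt.output
```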
The first node (`query_prompt`) transforms a series of conversation messages into a single, structured prompt using an inline Jinja template. The second node (`query_llm`) takes the formatted prompt and passes it to the `"gpt-4o-mini"` model to generate a text response. The response is constrained by the token limit, temperature setting, and stop sequences to ensure the output is clear and concise.
The workflow begins with a `user_input` section where the key `messages` is declared with a type of `Messages`.

### The `query_prompt` Node
The `query_prompt` node formats the conversation history into a single prompt string that is suitable for sending to the LLM. Its `jinja_template_str` is defined inline within the YAML. Here's what it does:
- The template begins with the `<|begin_of_text|>` token, which can be used by the LLM to recognize the start of the prompt.
- It checks whether the first message in the `messages` array has a role of `"system"`. If so, that message is extracted as the `system_message` and the remaining messages become `loop_messages`. The system message is wrapped in a header built from `<|start_header_id|>` and `<|end_header_id|>`, followed by the stripped content, and terminated with `<|eot_id|>`.
- If there is no system message, it sets `loop_messages` to include all messages (and `system_message` is an empty string).
- For each message in `loop_messages`:
  - On the first iteration (`loop.index0 == 0`), it outputs the `system_message` (if one was defined).
  - Each message is rendered with its role (`user` or `assistant`), formatted with `<|start_header_id|>` and `<|end_header_id|>`.
  - Each message ends with `<|eot_id|>` to signal the end of that message.
  - If the final message is from the user (`loop.last and message['role'] == 'user'`), the template appends an empty header for the `assistant`. This cues the LLM that it should generate a response.
- The template's input is `messages`, which comes from the overall `user_input`.
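Putting the pieces together, the `query_prompt` node with its inline template might be written as follows. The `jinja_template_str` key, the `user_input` section, and the `messages: Messages` declaration come from the description; the surrounding node structure and the template's exact whitespace handling are assumptions:

```yaml
# Sketch only: node structure and whitespace control are assumptions;
# the template logic follows the behavior described above.
user_input:
  messages: Messages                # the conversation history

nodes:
  query_prompt:
    jinja_template_str: |
      <|begin_of_text|>
      {%- if messages[0]['role'] == 'system' %}
        {#- Extract the system message and wrap it in headers #}
        {%- set system_message = '<|start_header_id|>system<|end_header_id|>\n\n'
              ~ messages[0]['content'] | trim ~ '<|eot_id|>' %}
        {%- set loop_messages = messages[1:] %}
      {%- else %}
        {%- set system_message = '' %}
        {%- set loop_messages = messages %}
      {%- endif %}
      {%- for message in loop_messages %}
        {#- Emit the system message once, before the first message #}
        {%- if loop.index0 == 0 %}{{ system_message }}{% endif %}
        {{- '<|start_header_id|>' ~ message['role'] ~ '<|end_header_id|>\n\n'
            ~ message['content'] | trim ~ '<|eot_id|>' }}
        {#- Open an empty assistant header to cue the model to respond #}
        {%- if loop.last and message['role'] == 'user' %}
          {{- '<|start_header_id|>assistant<|end_header_id|>\n\n' }}
        {%- endif %}
      {%- endfor %}
```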
### The `query_llm` Node

This node is of type `generation`. It:

- Uses the `"gpt-4o-mini"` model, a small, cost-efficient variant of GPT-4o.
- Sets the temperature to `0.2` for controlled, less random output.
- Specifies `["<|eot_id|>"]` as a stop sequence, ensuring that the model stops generating once the designated token is reached.
- Receives its `input_prompt` from the output of the `query_prompt` node (`query_prompt.output`).
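Under the same assumed schema, the generation node might look like this sketch (only the model, temperature, stop token, and input mapping come from the description; `max_tokens` and the key names are assumptions):

```yaml
# Hypothetical node definition; key names are illustrative.
query_llm:
  type: generation
  model: "gpt-4o-mini"
  temperature: 0.2
  max_tokens: 256                       # assumed token limit
  stop: ["<|eot_id|>"]                  # halt at the end-of-turn token
  inputs:
    input_prompt: query_prompt.output   # the formatted prompt from query_prompt
```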