Build a chatbot for retrieval-augmented generation.
| Step | Node Name | Type | Purpose |
|---|---|---|---|
| 1 | `get_last_message` | `get_message` | Extracts the most recent user message |
| 2 | `retrieve` | `retriever` | Searches knowledge bases for relevant information |
| 3 | `prompt` | `jinja` | Formats retrieved context into a structured LLM prompt |
| 4 | `llm` | `generation` | Generates a response based on the retrieved knowledge |
`get_last_message` (type `get_message`): setting `index: -1` ensures that the last message is extracted from `messages` (provided by `user_input`).
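A minimal sketch of how this node might be declared, assuming a YAML schema where each node lists a `name`, `type`, `config`, and `inputs` (these field names, and the `user_input.messages` reference, are assumptions rather than a confirmed schema):

```yaml
# Hypothetical node declaration; field names are assumptions.
- name: get_last_message
  type: get_message
  config:
    index: -1                       # -1 selects the most recent message
  inputs:
    messages: user_input.messages   # conversation history supplied by the caller
```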
`retrieve` (type `retriever`): `num_to_return: 10` specifies the number of results to retrieve. Its `query` input is the last user message (`get_last_message.output`), and `knowledge_base_ids` is the set of knowledge bases to search (`user_input.knowledge_base_ids`).
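Using the same assumed schema, the retrieval node could be sketched as:

```yaml
# Hypothetical node declaration; field names are assumptions.
- name: retrieve
  type: retriever
  config:
    num_to_return: 10                 # how many chunks to retrieve
  inputs:
    query: get_last_message.output    # the last user message
    knowledge_base_ids: user_input.knowledge_base_ids
```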
`prompt` (type `jinja`): fills a template with `context_chunks`, the retrieved information (`retrieve.output`), and `question`, the original user message (`get_last_message.output`).
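Using the same assumed schema, the prompt node could be sketched as follows; the Jinja template body is purely illustrative:

```yaml
# Hypothetical node declaration; the template text is illustrative only.
- name: prompt
  type: jinja
  config:
    template: |
      Answer the question using only the context below.
      Context:
      {% for chunk in context_chunks %}
      - {{ chunk }}
      {% endfor %}
      Question: {{ question }}
  inputs:
    context_chunks: retrieve.output      # retrieved information
    question: get_last_message.output    # original user message
```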
`llm` (type `generation`): uses `gpt-4o-mini` with a maximum of 512 tokens and a temperature of 0.2 (low variability for more deterministic responses). Its `input_prompt` comes from `prompt.output`.
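Using the same assumed schema, the generation node could be sketched as:

```yaml
# Hypothetical node declaration; field names are assumptions.
- name: llm
  type: generation
  config:
    model: gpt-4o-mini
    max_tokens: 512
    temperature: 0.2        # low variability for more deterministic responses
  inputs:
    input_prompt: prompt.output
```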
You can extend the pipeline with a `reranker` node, which calls a reranking model by its unique name or encoder model deployment ID.
With the reranker added, the pipeline from above looks roughly like the sketch below.
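The field names, the reranker's configuration keys, and its placement between the retrieval and prompt steps are assumptions based on the node descriptions above, not a confirmed schema:

```yaml
# Sketch of the assembled pipeline with a reranker step added.
# Field names and the reranker's config keys are assumptions.
- name: get_last_message
  type: get_message
  config:
    index: -1
  inputs:
    messages: user_input.messages

- name: retrieve
  type: retriever
  config:
    num_to_return: 10
  inputs:
    query: get_last_message.output
    knowledge_base_ids: user_input.knowledge_base_ids

- name: reranker
  type: reranker
  config:
    # identify the reranking model by unique name / encoder model deployment ID
    encoder_model_deployment_id: <reranker-deployment-id>
  inputs:
    query: get_last_message.output
    documents: retrieve.output

- name: prompt
  type: jinja
  config:
    template: |
      # ... same template as above, omitted for brevity
  inputs:
    context_chunks: reranker.output      # reranked chunks instead of raw retrieval
    question: get_last_message.output

- name: llm
  type: generation
  config:
    model: gpt-4o-mini
    max_tokens: 512
    temperature: 0.2
  inputs:
    input_prompt: prompt.output
```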
From here, you can tune the pipeline further:

- Adjust `num_to_return` to fine-tune the number of retrieved documents.
- Replace `gpt-4o-mini` with a larger model for improved performance.
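For example, under the same assumed schema (the choice of `gpt-4o` is just one illustrative larger model):

```yaml
# Illustrative tweaks; only the changed nodes are shown.
- name: retrieve
  type: retriever
  config:
    num_to_return: 20      # retrieve more candidate chunks
  inputs:
    query: get_last_message.output
    knowledge_base_ids: user_input.knowledge_base_ids

- name: llm
  type: generation
  config:
    model: gpt-4o          # swap in a larger model than gpt-4o-mini
    max_tokens: 512
    temperature: 0.2
  inputs:
    input_prompt: prompt.output
```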