An Agent Judge uses a systematic approach to evaluate your dataset. You can either select an existing agent or create a new one. The dropdown lists all existing judges, both LLM and Agent.
We offer two built-in agent judges: Default Agent and IF Agent.
Default Agent - Used for general-purpose evaluations.
IF Agent - Excels at systematically extracting instructions from a given prompt and assessing a response’s adherence to each instruction. The IF Agent can identify implicit and explicit instructions, structural constraints, content requests, and behavioral requirements (tone, audience, vocabulary).
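
To make the IF Agent's extract-then-assess idea concrete, here is a minimal Python sketch of how per-instruction verdicts could be recorded and summarized. It is an illustration only, not the IF Agent's implementation or API: the `InstructionCheck` type, its fields, the category labels, the sample instructions, and the `adherence_score` function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class InstructionCheck:
    """One instruction pulled from the prompt, plus a verdict (hypothetical shape)."""
    instruction: str  # the instruction as extracted from the prompt
    category: str     # assumed labels: "structure", "content", or "behavior"
    explicit: bool    # False when the instruction is only implied by the prompt
    followed: bool    # whether the response adhered to this instruction


def adherence_score(checks: list[InstructionCheck]) -> float:
    """Return the fraction of extracted instructions that the response followed."""
    if not checks:
        return 1.0  # no instructions were extracted, so none were violated
    return sum(c.followed for c in checks) / len(checks)


# Hand-written sample verdicts for a single prompt/response pair.
checks = [
    InstructionCheck("Answer in exactly three bullet points", "structure", True, True),
    InstructionCheck("Mention the refund policy", "content", True, False),
    InstructionCheck("Use a friendly tone", "behavior", True, True),
    InstructionCheck("Avoid jargon for a non-technical audience", "behavior", False, True),
]

score = adherence_score(checks)
missed = [c.instruction for c in checks if not c.followed]
print(f"adherence: {score:.2f}, missed: {missed}")
```

Keeping one record per instruction, rather than a single overall grade, is what lets an evaluation point to exactly which instruction a response missed.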