
# Agent as a Judge

> Leverage specialized agents to produce evaluation results.

## Create Evaluation

### Navigate to the Evaluate tab and click "Create evaluation"

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/create-evaluation.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=da1a819832df41bcf72222ae6e666e8b" alt="create-evaluation" width="1755" height="975" data-path="images/nextgen-evaluation/agent-as-a-judge/create-evaluation.png" />

### Add Evaluation Details

Enter an evaluation name, an optional description, and optional tags, then select a dataset.

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/select-eval-dataset.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=79979c7939878b1169927278e74f6b69" alt="select-eval-dataset" width="1710" height="875" data-path="images/nextgen-evaluation/agent-as-a-judge/select-eval-dataset.png" />

### Add an Agent Judge

An Agent Judge uses a systematic approach to evaluate your dataset. You can select an existing judge or create a new one. The dropdown lists all existing judges, both LLM and Agent.

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/judge-dropdown.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=9b474c70787da3bcfbd7a391a4878b34" alt="judge-dropdown" width="809" height="267" data-path="images/nextgen-evaluation/agent-as-a-judge/judge-dropdown.png" />

### Agent Judge Types

We offer two built-in agent judges: Default Agent and IF Agent.

<ul>
  <li> **Default Agent** - Used for general-purpose evaluations. </li>
  <li> **IF Agent** - Excels at systematically extracting instructions from a given prompt and assessing a response's adherence to each instruction. The IF Agent can identify implicit and explicit instructions, structural constraints, content requests, and behavioral requirements (tone, audience, vocabulary). </li>
</ul>

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/select-judge-type.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=e7671c065ed49bc14d18f589e33e0dfd" alt="select-judge-type" width="641" height="471" data-path="images/nextgen-evaluation/agent-as-a-judge/select-judge-type.png" />
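
To make the idea of per-instruction assessment concrete, here is a minimal Python sketch of what a checklist-style result could look like. It is an illustration only, not the IF Agent's actual output format or implementation: the `InstructionCheck` class, the category names, and the `adherence_rate` aggregate are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# Hypothetical categories, loosely based on the instruction kinds listed above.
CATEGORIES = {"structure", "content", "behavior"}


@dataclass
class InstructionCheck:
    text: str        # the instruction extracted from the prompt
    category: str    # one of CATEGORIES
    explicit: bool   # True if stated outright, False if only implied
    satisfied: bool  # whether the response follows the instruction

    def __post_init__(self):
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def adherence_rate(checks):
    """Fraction of instructions satisfied (an assumed aggregate, for illustration)."""
    if not checks:
        raise ValueError("no instructions to score")
    return sum(c.satisfied for c in checks) / len(checks)


checks = [
    InstructionCheck("Answer in exactly three bullet points", "structure", True, True),
    InstructionCheck("Mention the refund policy", "content", True, False),
    InstructionCheck("Use a friendly tone for a non-technical audience", "behavior", False, True),
    InstructionCheck("Avoid jargon", "behavior", False, True),
]

print(adherence_rate(checks))  # 0.75
```

Listing each instruction separately, rather than producing one overall verdict, makes it easy to see which specific requirement a response missed.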

### Configure Agent Judge

Each Agent Judge has its own configurable parameters:

<ul>
  <li> **Model** - The model that the Agent Judge uses to evaluate the dataset. </li>
  <li> **Output Column Name** - The name of the column in which this judge's results appear in your evaluation results. </li>
  <li> **Description** - A description of what the agent judge is evaluating. </li>
  <li> **Output Rules** - A space to provide additional instructions to the agent. </li>
  <li> **Output Type (optional)** - Text, integer, float, or boolean. If not provided, the output type is inferred from the output rules. </li>
</ul>

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/agent-config.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=f753fc2ddea15cb76f38dda5c2e18baf" alt="agent-config" width="643" height="774" data-path="images/nextgen-evaluation/agent-as-a-judge/agent-config.png" />
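
If you want to keep judge settings under version control before entering them in the form, the parameters above can be modeled as a small Python record. This is a sketch under stated assumptions: `AgentJudgeConfig` and its field names are hypothetical and are not part of any SDK, and the model name is a placeholder. Only the four output types come from the parameter list above.

```python
from dataclasses import dataclass
from typing import Optional

# The four output types listed in the configuration form.
ALLOWED_OUTPUT_TYPES = ("text", "integer", "float", "boolean")


@dataclass
class AgentJudgeConfig:
    """Hypothetical local record of the form fields; not an official API object."""
    model: str
    output_column_name: str
    description: str
    output_rules: str
    output_type: Optional[str] = None  # None: leave it to be inferred from output_rules

    def __post_init__(self):
        if self.output_type is not None and self.output_type not in ALLOWED_OUTPUT_TYPES:
            raise ValueError(
                f"output_type must be one of {ALLOWED_OUTPUT_TYPES}, got {self.output_type!r}"
            )


config = AgentJudgeConfig(
    model="example-model",  # placeholder, not a real model name
    output_column_name="tone_score",
    description="Rates how well the response matches the requested tone.",
    output_rules="Return an integer from 1 (poor) to 5 (excellent).",
    output_type="integer",
)
print(config.output_column_name, config.output_type)
```

Setting the output type explicitly, as in this example, avoids relying on inference when the output rules could be read more than one way.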

### Create Evaluation

Select the dataset rows you want to evaluate, then click **Create Evaluation**.

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/create-agent-eval.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=bd47e461623f220591d98d3f108eb7c8" alt="create-agent-eval" width="1857" height="997" data-path="images/nextgen-evaluation/agent-as-a-judge/create-agent-eval.png" />

## View Evaluation Results

Navigate back to the **Evaluation** tab to see the results of the evaluation.

### Data

The Data page includes a column with the Agent Judge's results.

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/agent-eval-results.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=36036feff3f2a697cd76683b925e29bc" alt="agent-eval-results" width="2146" height="988" data-path="images/nextgen-evaluation/agent-as-a-judge/agent-eval-results.png" />

### Overview

The Overview page shows a graph that visualizes the evaluation results.

<img src="https://mintcdn.com/scalegp/QzzT7_6hBafVZgHm/images/nextgen-evaluation/agent-as-a-judge/agent-eval-overview.png?fit=max&auto=format&n=QzzT7_6hBafVZgHm&q=85&s=e28090dc34955651c75389ebd6385d89" alt="agent-eval-overview" width="1499" height="984" data-path="images/nextgen-evaluation/agent-as-a-judge/agent-eval-overview.png" />
