# Annotation Configuration

> Customize annotation configurations.

By default, the annotation UI that annotators see in SGP shows the test case input, expected output, and output. For complex evaluations, however, you may want to:

* display data from the trace
* select which parts of test case inputs and test case outputs to display
* modify the layout of the annotation UI

The **Annotation Configuration** allows you to do all three.

Here's what an example annotation configuration looks like:

```python theme={null}
from scale_gp.lib.types.data_locator import data_locator # helper for producing data_locs

annotation_configuration = dict(
  annotation_config_type="flexible", # this is the default, so we could have omitted annotation_config_type entirely.
  direction="row", # or "col"
  components=[ # 2D array representing how things will be laid out in the UI
    [
      dict(data_loc=["test_case_output", "output", "string_output"], label="string output"),
      dict(data_loc=["test_case_data", "expected_output", "string_expected"]),
    ],
    [
      dict(data_loc=data_locator.test_case_output.output["messages_output"]), # The data_locator is an easier way of producing data_locs
    ],
    [
      dict(data_loc=data_locator.trace["tool_call"].input["string_input"]), # references the "tool_call" node from the trace
    ]
  ]
)

evaluation = sgp_client.evaluations.create(
    account_id=ACCOUNT_ID,
    name="example flexible evaluation",
    description="This is a test evaluation",
    type="builder",
    evaluation_config_id=evaluation_config.id, # You need to create an evaluation config, evaluation dataset, and application spec/variant first
    evaluation_dataset_id=flexible_evaluation_dataset.id,
    application_variant_id=application_variant.id,
    application_spec_id=application.id,
    annotation_config=annotation_configuration,
)
```

When a contributor annotates this evaluation in the UI, they will see an annotation UI that looks something like this:

<img src="https://mintcdn.com/scalegp/2sSrpizRElJqluR6/images/flexible-evaluation/flexible_annotation_ui_image.png?fit=max&auto=format&n=2sSrpizRElJqluR6&q=85&s=12f67409a9b7a0275e453f3aae1d0412" alt="Flexible Annotation UI using Annotation Config" width="3456" height="2234" data-path="images/flexible-evaluation/flexible_annotation_ui_image.png" />

Let's break down how a custom annotation config is set up:

* `annotation_config_type`: by default this is "flexible". The other types are "summarization" and "multiturn", which make it easier to work with specific use cases.
* `components`: a 2D list of annotation items. Each annotation item points to a location in the test case data, test case output, or trace. When the annotator grades the test case output, they see the data pulled from each location.
  * Each annotation item has a `data_loc` field and an optional `label` field. The `data_loc` is an array that points to where annotation data should be pulled from. The `label` is the name displayed to the annotator for that `data_loc`.
    > ⚠️  If a `data_loc` points somewhere that doesn't exist for one or more test cases, you will not be able to create the evaluation.
* `direction`: by default "row". Decides whether components are laid out as rows or as columns.
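
Because a `data_loc` that doesn't resolve blocks evaluation creation, it can help to check your paths locally first. The following is a minimal sketch, not part of the SGP SDK: `resolve_data_loc` and `missing_data_locs` are hypothetical helpers, and the sketch assumes that each `data_loc` is a plain list of keys and that you have a local copy of a test case as a nested dict shaped like the paths above.

```python
from typing import Any, Sequence


def resolve_data_loc(record: dict, data_loc: Sequence[str]) -> Any:
    """Walk a nested dict along data_loc; raise KeyError if any step is missing."""
    current: Any = record
    for depth, key in enumerate(data_loc):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"data_loc {list(data_loc)} is missing at step {depth}: {key!r}")
        current = current[key]
    return current


def missing_data_locs(record: dict, components: list) -> list:
    """Return every data_loc in a 2D components list that does not resolve."""
    missing = []
    for group in components:
        for item in group:
            try:
                resolve_data_loc(record, item["data_loc"])
            except KeyError:
                missing.append(list(item["data_loc"]))
    return missing


# Hypothetical local record (an assumption, not an SGP response object).
record = {
    "test_case_data": {
        "input": {"string_input": "hi"},
        "expected_output": {"string_expected": "hello"},
    },
    "test_case_output": {"output": {"string_output": "hey"}},
}
components = [
    [dict(data_loc=["test_case_output", "output", "string_output"], label="string output")],
    [dict(data_loc=["test_case_data", "expected_output", "missing_key"])],
]

print(missing_data_locs(record, components))
# [['test_case_data', 'expected_output', 'missing_key']]
```

Running a check like this over every test case in a dataset surfaces bad paths before you call `evaluations.create`.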

Here are some examples of how different arrangements of components produce different UIs:

<img src="https://mintcdn.com/scalegp/6g9o6SNfLus5dKei/images/flexible-evaluation/docs_annotation_config_graphic.svg?fit=max&auto=format&n=6g9o6SNfLus5dKei&q=85&s=2da9019cc320fec54b9229f9a59da964" alt="Diagram of How Annotation Configs Map to Layouts" width="1109" height="1148" data-path="images/flexible-evaluation/docs_annotation_config_graphic.svg" />

`data_loc`s can take any of these shapes:

| `data_locator` Helper                                                     | `data_loc` array                                                              | Meaning                                                            |
| :------------------------------------------------------------------------ | :---------------------------------------------------------------------------- | :----------------------------------------------------------------- |
| `data_locator.test_case_data.input`                                       | `["test_case_data", "input"]`                                                 | Display the entire input from the test case                        |
| `data_locator.test_case_data.input["\<input key>"]`                       | `["test_case_data", "input", "\<input key>"]`                                 | Display a single key from the input                                |
| `data_locator.test_case_data.expected`                                    | `["test_case_data", "expected_output"]`                                       | Display the entire expected output from the test case              |
| `data_locator.test_case_data.expected["\<expected output key>"]`          | `["test_case_data", "expected_output", "\<expected output key>"]`             | Display a single key from the expected output                      |
| `data_locator.test_case_output`                                           | `["test_case_output", "output"]`                                              | Display the entire output from the test case output                |
| `data_locator.test_case_output["\<output key>"]`                          | `["test_case_output", "output", "\<output key>"]`                             | Display a single key from the output                               |
| `data_locator.trace["\<node id from the trace>"].input`                   | `["trace", "\<node id from the trace>", "input"]`                             | Display the entire input from a single part of the trace           |
| `data_locator.trace["\<node id from the trace>"].input["\<input key>"]`   | `["trace", "\<node id from the trace>", "input", "\<input key>"]`             | Display a single key from the input of a part of the trace         |
| `data_locator.trace["\<node id from the trace>"].output`                  | `["trace", "\<node id from the trace>", "output"]`                            | Display the entire output from a single part of the trace          |
| `data_locator.trace["\<node id from the trace>"].output["\<output key>"]` | `["trace", "\<node id from the trace>", "output", "\<output key>"]`           | Display a single key from the output of a part of the trace        |
| `data_locator.trace["\<node id from the trace>"].expected`                | `["trace", "\<node id from the trace>", "expected"]`                          | Display the entire expected output from a single part of the trace |
| `data_locator.trace["\<node id from the trace>"].expected["\<expected key>"]` | `["trace", "\<node id from the trace>", "expected", "\<expected key>"]` | Display a single key from the expected output of a part of the trace |

It is highly recommended that you use the `data_locator` helper instead of manually creating the `data_loc` array.
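
If you do need to build arrays by hand, the trace rows in the table all follow one pattern: `"trace"`, the node ID, a section (`input`, `output`, or `expected`), and an optional key. The sketch below illustrates that pattern only. `trace_data_loc` is a hypothetical function written for this example and is not part of `scale_gp`.

```python
from typing import List, Optional

# The three sections a trace node exposes, per the table above.
TRACE_SECTIONS = ("input", "output", "expected")


def trace_data_loc(node_id: str, section: str, key: Optional[str] = None) -> List[str]:
    """Build a trace data_loc array: ["trace", node_id, section] plus an optional key."""
    if section not in TRACE_SECTIONS:
        raise ValueError(f"section must be one of {TRACE_SECTIONS}, got {section!r}")
    loc = ["trace", node_id, section]
    if key is not None:
        loc.append(key)
    return loc


print(trace_data_loc("tool_call", "input", "string_input"))
# ['trace', 'tool_call', 'input', 'string_input']
print(trace_data_loc("completions", "output"))
# ['trace', 'completions', 'output']
```

Validating the section name up front catches typos such as `"outputs"` before they reach the evaluation.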

### Customizing the Annotation UI per question

Sometimes, you have certain questions in an evaluation rubric that are relevant only to a specific part of the test case, test case output or trace. For instance, you might ask a question specifically about the "completion" or "reranking" step in the trace.

In that case you can create a `question_id_to_annotation_config` mapping that lets you override the annotation config for a specific question ID:

```python theme={null}
question_id_to_annotation_config = {
    questions[1].id: dict(
        components=[
            [
                dict(
                    data_loc=data_locator.trace["completions"].input,
                    label="string output",
                ),
                dict(
                    data_loc=data_locator.trace["completions"].output
                )
            ],
            [
                dict(
                    data_loc=data_locator.trace["completions"].expected
                ),
            ],
        ],
    )
}

evaluation = sgp_client.evaluations.create(
  ... # specify all the usual evaluation fields,
  annotation_config=annotation_configuration,
  question_id_to_annotation_configuration=question_id_to_annotation_config # where specified, overrides the annotation_config
)
```

In the annotation UI, the data shown now changes for each question according to the mapping above:

<img src="https://mintcdn.com/scalegp/E4QRE88hrDqzkijh/images/flexible-evaluation/docs_flexible_annotation.gif?s=140be193a94be2e056e7ada3c54a19dd" alt="Flexible Annotation with question_id_to_annotation_config Demonstration" width="960" height="621" data-path="images/flexible-evaluation/docs_flexible_annotation.gif" />
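
One way to picture the override behavior is as a lookup with a fallback: a question with an entry in the mapping uses that entry, and every other question uses the evaluation-level `annotation_config`. The sketch below models only that selection rule. `effective_annotation_config` is a hypothetical function, the question IDs are made up, and the assumption that an override replaces the default config wholesale (rather than being merged with it) is not confirmed here.

```python
from typing import Any, Dict


def effective_annotation_config(
    question_id: str,
    default_config: Dict[str, Any],
    overrides: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the per-question override if one exists, else the default config."""
    return overrides.get(question_id, default_config)


# Hypothetical configs using plain data_loc arrays.
default_config = dict(
    direction="row",
    components=[[dict(data_loc=["test_case_output", "output"])]],
)
overrides = {
    "question-2": dict(
        components=[[dict(data_loc=["trace", "completions", "output"])]],
    ),
}

for qid in ("question-1", "question-2"):
    config = effective_annotation_config(qid, default_config, overrides)
    print(qid, config["components"][0][0]["data_loc"])
# question-1 ['test_case_output', 'output']
# question-2 ['trace', 'completions', 'output']
```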

### Dev Mode

SGP also supports "Dev Mode", which lets an annotator view all the inputs, outputs, and the full trace at once. You can toggle Dev Mode with the "Dev Mode" switch in the top right corner of the annotation UI:

<img src="https://mintcdn.com/scalegp/E4QRE88hrDqzkijh/images/flexible-evaluation/dev_mode_toggle.png?fit=max&auto=format&n=E4QRE88hrDqzkijh&q=85&s=f71a41889019cbb9cae622ed4cde40ba" alt="Dev Mode turned on. An annotator can toggle dev mode using the &#x22;Dev Mode&#x22; switch in the top right corner." width="3456" height="2234" data-path="images/flexible-evaluation/dev_mode_toggle.png" />
