1. Perform the steps outlined in the Multi-Stage Evaluation Recipe
Following the previous recipe, create an evaluation dataset, question set, application variant, etc.
Stop after generating test case outputs for your application.
2. Create an annotation config
An annotation config controls what the annotation UI looks like. The annotation UI is displayed as a two-dimensional grid, and the config determines the annotation experience. First, define the “annotation_config_dict”, which controls how each test case is laid out on the annotation page.
This dictionary has three keys: “annotation_config_type”, “direction”, and “components”.
For all flexible evaluations, set the “annotation_config_type” to “flexible”.
The “direction” key controls the grid’s orientation. With “row”, each row holds at most two items and you can add as many rows as desired; with “col”, each column holds at most two items and you can add as many columns as desired. In either case, overflow is handled with horizontal or vertical scrolling.
The “components” section indicates which components of the application variant to display in the annotation UI.
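To make this concrete, here is a minimal sketch of an “annotation_config_dict”. The three key names come from the description above; the exact shape of the “components” entries and the component names shown are assumptions for illustration, so substitute the components defined by your own application variant.

```python
# A minimal sketch, not the definitive schema. The key names come from
# the text above; the structure of "components" and the component names
# ("input", "output", "retrieval_trace", ...) are illustrative assumptions.
annotation_config_dict = {
    # All flexible evaluations use the "flexible" config type.
    "annotation_config_type": "flexible",
    # "row": at most 2 items per row, as many rows as desired.
    # "col": at most 2 items per column, as many columns as desired.
    "direction": "row",
    # Which components of the application variant to display, one inner
    # list per row (assumed layout): here, two rows of two cells each.
    "components": [
        ["input", "output"],
        ["retrieval_trace", "generation_trace"],
    ],
}
```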
3. Create question-specific configs
If desired, you can configure a specific view per question. For each question that needs to be annotated, indicate which components of the test case are relevant to that question. For example, if a question evaluates the retrieval step of an application test case, you can indicate here that only the traces relevant to retrieval should be shown in the annotation UI, as in the sketch below.
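As a sketch, a per-question override might look like the following. The dictionary name “question_id_to_annotation_config_dict” is the one referenced in step 4; the question ID and the component names are hypothetical placeholders.

```python
# Hypothetical per-question override: for a question that evaluates
# retrieval, show only the retrieval-related components.
retrieval_annotation_config_dict = {
    "annotation_config_type": "flexible",
    "direction": "col",  # at most 2 items per column here
    "components": [
        ["input", "retrieval_trace"],  # only retrieval-relevant cells
    ],
}

# Map each question's ID to the config that replaces the default layout
# when that question is selected in the annotation view. The key below
# is a placeholder for a real question ID from your question set.
question_id_to_annotation_config_dict = {
    "retrieval_question_id": retrieval_annotation_config_dict,
}
```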
4. Create an evaluation with the annotation config
You can now create an evaluation with your specified annotation configs. If you open the annotation view, you will see that it follows the row and column format you specified, along with any special information such as traces. If you click on the second question in the annotation view, you will see the layout change according to the question-specific config defined in “question_id_to_annotation_config_dict”.
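A sketch of the final call, assuming the client and objects created in the previous recipe. The method and parameter names (“client.evaluations.create”, “annotation_config”, “question_id_to_annotation_config”) are assumptions rather than a confirmed API; consult your SDK reference for the exact signature.

```python
# Hypothetical call shape; substitute the actual method and parameter
# names from your SDK. The objects referenced here were created in the
# Multi-Stage Evaluation Recipe.
evaluation = client.evaluations.create(
    application_variant_id=application_variant.id,
    evaluation_dataset_id=evaluation_dataset.id,
    question_set_id=question_set.id,
    # Default layout for all questions:
    annotation_config=annotation_config_dict,
    # Per-question overrides from step 3:
    question_id_to_annotation_config=question_id_to_annotation_config_dict,
)
```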