How to evaluate an application with multimodal inputs
In SGP, inputs, outputs, and traces can each carry a file_id that points to an image, and these images will be rendered in the platform's UI. This guide walks through an example of an external application and a flexible evaluation with multimodal inputs and outputs.
This tutorial assumes that you already have a file hosted either on a publicly available server or on the SGP platform.
If you have an image that is hosted on a publicly available server, enter the path to that image under file_id.
If you need a place to host the images, follow these instructions to upload your file to the SGP platform and obtain the file_id.
Steps:
Create test cases, each with a file_id pointing to the multimedia input and optionally a query.
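A minimal sketch of the test-case data described above, in plain Python. The helper make_test_case, the dict layout, and both example values are hypothetical and only illustrate the shape of the data (a required file_id plus an optional query); the actual SGP schema may differ.

```python
from typing import Optional


def make_test_case(file_id: str, query: Optional[str] = None) -> dict:
    """Build one test-case input: a required file_id plus an optional query.

    The dict layout is an illustrative assumption, not the SGP schema.
    """
    if not file_id:
        raise ValueError("file_id must be a non-empty string")
    test_case = {"file_id": file_id}
    if query is not None:
        test_case["query"] = query
    return test_case


# Placeholder values: a public image URL and a stand-in for an uploaded file's ID.
test_cases = [
    make_test_case("https://example.com/images/cat.png", "What animal is shown?"),
    make_test_case("hypothetical-sgp-file-id"),
]
```

Keeping the query optional mirrors the step above: some test cases only need the image, while others pair it with a question.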
Build the dataset from these test cases with DatasetBuilder.
Your application can also return file_ids in the traces. These traces will show up as images in the annotation view as well. The generate_outputs function will run your application on the dataset and upload the responses to SGP.
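The sketch below illustrates the idea of an application that returns a file_id inside its trace and is then run over every test case. It is plain Python under stated assumptions: describe_image, run_on_dataset, and the output and trace layout are hypothetical stand-ins, not the SDK's generate_outputs function, and nothing is uploaded to SGP.

```python
from typing import Callable, Dict, List


def describe_image(test_case: Dict) -> Dict:
    """Hypothetical application: returns an answer plus a trace carrying the file_id."""
    file_id = test_case["file_id"]
    query = test_case.get("query", "Describe the image.")
    return {
        # A real application would call a multimodal model here.
        "output": f"(model answer to: {query})",
        # Assumed trace layout: one step that records which image was used.
        "trace": [
            {"step": "load_image", "file_id": file_id},
        ],
    }


def run_on_dataset(app: Callable[[Dict], Dict], dataset: List[Dict]) -> List[Dict]:
    """Run the application on each test case and collect the results locally."""
    return [app(test_case) for test_case in dataset]


dataset = [
    {"file_id": "https://example.com/images/cat.png", "query": "What animal is shown?"},
    {"file_id": "hypothetical-sgp-file-id"},
]
results = run_on_dataset(describe_image, dataset)
```

Because each trace step carries the same file_id as the input, an annotator could see the image next to the step that used it, which is the behavior the paragraph above describes.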
The application is already wrapped in an External Application, so you can run an evaluation on it directly.
Use your runner to run the test cases through the application. This step generates the outputs and uploads them as evaluation test-case results.