Evaluations
Generate an Application Report Card
Generate a report card for an application variant
Follow the instructions in the Quickstart Guide to set up the SGP Client:
from scale_gp import SGPClient
client = SGPClient(api_key=api_key)
Create an evaluation by following any previous evaluation recipe, such as Multistage Evaluation:
evaluation = client.evaluations.create(
    type="builder",
    account_id=account_id,
    application_spec_id=spec_id,
    application_variant_id=variant_id,
    description="description",
    evaluation_dataset_id=evaluation_dataset.id,
    name="Report card evaluation",
    evaluation_config_id=evaluation_config.id,
)
A report card summarizes the performance of the application variant against an evaluation dataset. It produces three main scores: Accuracy, Quality, and Trust & Safety. The performance in each category contributes to an overall Scale Confidence Score for the variant.
You can view generated report cards in the UI at: https://egp.dashboard.scale.com/applications/{application_spec_id}/{application_variant_id}/report-card/overview
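As a convenience, the URL pattern above can be filled in programmatically. The following is a minimal sketch; the helper name `report_card_url` and the constant `REPORT_CARD_URL_TEMPLATE` are hypothetical and not part of the SGP client.

```python
# Hypothetical helper (not part of scale_gp): fills in the dashboard URL
# pattern quoted above with concrete IDs.
REPORT_CARD_URL_TEMPLATE = (
    "https://egp.dashboard.scale.com/applications/"
    "{application_spec_id}/{application_variant_id}/report-card/overview"
)


def report_card_url(application_spec_id: str, application_variant_id: str) -> str:
    """Return the report card overview URL for the given spec and variant IDs."""
    return REPORT_CARD_URL_TEMPLATE.format(
        application_spec_id=application_spec_id,
        application_variant_id=application_variant_id,
    )
```

For example, `report_card_url(spec_id, variant_id)` would give the link for the variant used in this recipe.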
# Create the report card for the variant; generation runs as an async job
create_response = client.application_variant_reports.create(
    application_variant_id=variant_id,
    evaluation_dataset_ids=[manual_evaluation_dataset.id],
    account_id=account_id,
)
# Retrieve the application variant report; it will still be PENDING if retrieved immediately
application_variant_report = client.application_variant_reports.retrieve(
    application_variant_report_id=create_response.id,
    view=["AsyncJobs"],
)
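Because the report is still PENDING right after creation, you will probably want to poll until it finishes. The sketch below is one generic way to do that. `wait_for_report` is a hypothetical helper, not part of the SGP client, and it deliberately takes the "is this still pending?" check as a caller-supplied function, since this recipe does not show how the status is exposed on the retrieved object.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_for_report(
    fetch: Callable[[], T],
    is_pending: Callable[[T], bool],
    poll_interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> T:
    """Call fetch() until is_pending(result) is False, or raise TimeoutError.

    Hypothetical helper: fetch and is_pending are supplied by the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if not is_pending(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "report was still pending after %.0f seconds" % timeout_s
            )
        time.sleep(poll_interval_s)


# Usage sketch (assumption: you write still_pending yourself after inspecting
# the object returned by retrieve(); its attributes are not shown here):
#
# report = wait_for_report(
#     fetch=lambda: client.application_variant_reports.retrieve(
#         application_variant_report_id=create_response.id,
#         view=["AsyncJobs"],
#     ),
#     is_pending=still_pending,
# )
```

Keeping the status check as a parameter means the loop stays valid whichever field the retrieved report uses to signal completion. The default interval and timeout are arbitrary starting points.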