Datasets
Create a manual dataset
Create a dataset and manually upload test cases. Evaluation datasets contain a set of test cases used to evaluate the performance of your applications.
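As a rough illustration of what a test case looks like, the sketch below mirrors the fields used in the sample on this page: `input`, `expected_output`, and an optional `expected_extra_info` holding `info` and `schema_type`. The `validate_test_case` helper is hypothetical and not part of the SDK. Treating `input` and `expected_output` as required is an assumption drawn from the sample, not a statement about what the API enforces. The check runs locally on plain dictionaries and makes no network calls.

```python
from typing import Any, Dict, List

# Assumption: these two fields appear in every test case in this guide,
# so the sketch treats them as required.
REQUIRED_KEYS = ("input", "expected_output")


def validate_test_case(case: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one test case dict (empty if none)."""
    problems: List[str] = []
    for key in REQUIRED_KEYS:
        value = case.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"'{key}' must be a non-empty string")

    extra = case.get("expected_extra_info")
    if extra is not None:
        if not isinstance(extra, dict):
            problems.append("'expected_extra_info' must be a dict")
        else:
            for key in ("info", "schema_type"):
                if key not in extra:
                    problems.append(f"'expected_extra_info' is missing '{key}'")
    return problems


example_case = {
    "input": "What is the capital of France?",
    "expected_output": "The capital of France is Paris.",
    "expected_extra_info": {"info": "Paris is the capital.", "schema_type": "STRING"},
}
example_problems = validate_test_case(example_case)
```

Running a check like this before uploading can catch empty or misspelled fields early, while the data is still easy to fix.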
https://pypi.org/project/scale-gp/
# Prerequisite: pip install -U scale-gp
import os
from typing import List
from scale_gp import SGPClient
from scale_gp.types.evaluation_datasets.test_case import TestCase
from scale_gp.types.evaluation_datasets.test_case_batch_params import Item
from scale_gp.types.evaluation_datasets.test_case_create_params import TestCaseData
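# Fetch your API key from: https://gp.scale.com/admin/api-key
# The environment variable names used below are assumptions; substitute the names you store your credentials under.
api_key = os.environ["SGP_API_KEY"]
# Fetch your account ID from: https://gp.scale.com/admin/accounts
# All resources you interact with using this client will belong to this account.
account_id = os.environ["SGP_ACCOUNT_ID"]
# Note: If you are using your own VPC-deployed version of Scale GP, you will have a different endpoint_url. For users of our multi-tenant platform, use https://gp.scale.com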
client = SGPClient(api_key=api_key)
# Create a dataset into which test cases will be uploaded manually.
manual_evaluation_dataset = client.evaluation_datasets.create(
    account_id=account_id,
    name="manual_evaluation_dataset",
    schema_type="GENERATION",
    type="manual",
)
# Define the desired test cases and batch upload them into the dataset.
DATASET: List[TestCaseData] = [
    {
        "input": "What is the capital of France?",
        "expected_output": "The capital of France is Paris.",
        "expected_extra_info": {
            "info": "Paris, located in the northern part of France along the Seine River, is not only the country’s capital but also its largest city. Known for its art, fashion, gastronomy, and culture, Paris has a significant influence both in France and globally.",
            "schema_type": "STRING",
        },
    },
    {
        "input": "What color is an apple?",
        "expected_output": "An apple is typically red when ripe, though green and yellow varieties also exist depending on the species and maturity.",
        "expected_extra_info": {
            "info": "Apples come in various colors including red, green, and yellow. Red apples are one of the most common, with varieties like Red Delicious being very popular. Green apples, like Granny Smith, are tart and used often in baking. Yellow apples such as Golden Delicious are sweet and softer.",
            "schema_type": "STRING",
        },
    },
    {
        "input": "Who is the first president of the USA?",
        "expected_output": "The first president of the USA is George Washington.",
        "expected_extra_info": {
            "info": "George Washington served as the first president of the United States from 1789 to 1797. He is a pivotal figure in American history, recognized for his leadership during the Revolutionary War and setting many precedents for the national government.",
            "schema_type": "STRING",
        },
    },
]
items: List[Item] = [Item({"account_id": account_id, "test_case_data": item}) for item in DATASET]
test_cases: List[TestCase] = []
uploaded_test_cases = client.evaluation_datasets.test_cases.batch(
    evaluation_dataset_id=manual_evaluation_dataset.id,
    items=items,
)
test_cases.extend(uploaded_test_cases)
# Publishing the dataset makes it available for use in evaluations.
published_dataset_response = client.evaluation_datasets.publish(
    evaluation_dataset_id=manual_evaluation_dataset.id,
)
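
For larger datasets, it may be convenient to split the items into smaller groups before calling `test_cases.batch`. The sketch below shows only the chunking step. The `chunked` helper and the chunk size are illustrative assumptions; this guide does not state whether the API enforces a limit on batch size.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical usage with the objects defined in the script above
# (left commented out because it needs a live client and dataset):
#
# for chunk in chunked(items, 100):
#     test_cases.extend(
#         client.evaluation_datasets.test_cases.batch(
#             evaluation_dataset_id=manual_evaluation_dataset.id,
#             items=chunk,
#         )
#     )

demo_chunks = list(chunked(["a", "b", "c", "d", "e"], 2))
```

Uploading in chunks keeps each request small and makes it easier to retry only the portion that failed.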