Using Knowledge Base
Creating datasets by hand is tedious. If you have a Knowledge Base set up for your application in the platform, you can generate evaluation datasets from it automatically. The best evaluations tend to use diverse data, so we recommend evaluating your application against a mix of manually created and synthetically generated datasets.
Generate Dataset through the SDK
SDK instructions for uploading datasets can be found here.
Generate Dataset through the UI
To create a new safety dataset from the platform, navigate to Evaluation Datasets. From there, select Create Dataset and choose the Using Knowledge Base option.
Configure Options
To generate a dataset, provide the following:
- Knowledge Base: The source of data for your evaluation dataset. It includes any documents, CSV uploads, or other data sources that the information is pulled from.
- Evaluation Dataset Name: The name of the dataset, so that you can easily find it after creation when running evaluations.
- Advanced Settings (Optional): Controls how many test cases are generated and how they are grouped.
  - Number of Test Cases: The number of test cases to generate.
  - Group Test Cases by Artifact: If you select this option, each test case is created with knowledge pulled from a single artifact. For example, if you have several documents in your Knowledge Base, no test case will take data from more than one document.
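The effect of Group Test Cases by Artifact can be pictured with a small sketch. This is an illustration only, not the platform's implementation: the `Chunk` type, the `candidate_chunk_pairs` function, and the idea that a test case draws on exactly two chunks are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Chunk:
    """Hypothetical unit of Knowledge Base text; not a platform type."""
    artifact: str  # e.g. the name of the source document
    text: str


def candidate_chunk_pairs(chunks, group_by_artifact):
    """Return the pairs of chunks that one test case could draw from."""
    if not group_by_artifact:
        # Any two chunks may be combined, even across artifacts.
        return list(combinations(chunks, 2))
    # Bucket chunks by artifact, then pair only within each bucket.
    by_artifact = defaultdict(list)
    for chunk in chunks:
        by_artifact[chunk.artifact].append(chunk)
    pairs = []
    for group in by_artifact.values():
        pairs.extend(combinations(group, 2))
    return pairs


chunks = [
    Chunk("faq.pdf", "Refunds take 5 days."),
    Chunk("faq.pdf", "Refunds require a receipt."),
    Chunk("policy.csv", "Returns close after 30 days."),
]
grouped = candidate_chunk_pairs(chunks, group_by_artifact=True)
ungrouped = candidate_chunk_pairs(chunks, group_by_artifact=False)
```

With grouping enabled, only the two `faq.pdf` chunks can be combined; with it disabled, chunks from `faq.pdf` and `policy.csv` may also end up in the same test case.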
Dataset Generation
After you configure the options, generation may take some time. How long depends on the size of your Knowledge Base and the number of test cases you requested.
Approve Test Cases and Publish Datasets
Approve and Edit Test Cases
After the dataset is generated, you can select which test cases to approve. You can also edit the content of a test case directly in the UI. To see what data was used to generate the Input and Expected Output, click the View Chunks button.
Once you approve a test case, you can no longer edit it or undo the approval.
To add more test cases to this dataset, select the Generate More button.
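The approval rule above, where an approved test case is locked, amounts to a one-way state change. The sketch below models it in plain Python under stated assumptions: `GeneratedTestCase` and its methods are hypothetical names for illustration and do not correspond to any SDK or platform type.

```python
from dataclasses import dataclass


@dataclass
class GeneratedTestCase:
    """Hypothetical model of a generated test case; not a platform type."""
    input: str
    expected_output: str
    approved: bool = False

    def edit(self, input=None, expected_output=None):
        # Edits are only allowed while the test case is unapproved.
        if self.approved:
            raise PermissionError("approved test cases cannot be edited")
        if input is not None:
            self.input = input
        if expected_output is not None:
            self.expected_output = expected_output

    def approve(self):
        # One-way transition: there is deliberately no unapprove().
        self.approved = True


case = GeneratedTestCase("What is the refund window?", "30 days")
case.edit(expected_output="30 days from delivery")  # allowed before approval
case.approve()
try:
    case.edit(input="changed")
    edit_blocked = False
except PermissionError:
    edit_blocked = True
```

The practical consequence is to finish all edits to a test case before approving it.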
Publish Dataset
Once you are satisfied with the test cases in this dataset, click Publish to publish your dataset.