Datasets
Datasets in Root Signals contain static information that can be included as context for skill execution. They allow you to provide additional data to your skills, such as information about your organization, products, customers, or any other relevant domain knowledge.
Data sets give your skills relevant domain knowledge, and they supply test data for checking skill performance and accuracy.
Importing a Data Set via SDK
See SDK documentation.
Importing a Data Set via UI
To import a new data set:
1. Navigate to the Data Sets view.
2. Click the "Import Data Set" button in the top right corner of the screen.
3. Enter a name for your data set. If no name is provided, the file name will be used as the data set name.
4. Choose the data set type:
   - Reference Data: Used for skills that require additional context.
   - Test Data: Used for defining test cases and validating skill or evaluator performance.
5. Select a tag for the data set or create a new one.
6. Either upload a file or provide a URL from which the system can retrieve the data.
7. Preview the data set by clicking the "Preview" button in the bottom right corner.
8. Save the data set by clicking the "Submit" button.
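The rules in the import steps above can be summarized in a minimal sketch. This is not the Root Signals SDK: `ImportRequest`, `resolve_import`, and the type identifiers `"reference"` and `"test"` are hypothetical names chosen for illustration. The sketch also assumes that a file and a URL are mutually exclusive, and that a URL-based import would fall back to the last path segment as its name; the steps above only state the fallback for uploaded files.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Hypothetical identifiers for "Reference Data" and "Test Data".
DATASET_TYPES = {"reference", "test"}


@dataclass
class ImportRequest:
    """Hypothetical model of the import form; not a Root Signals API object."""
    dataset_type: str
    name: Optional[str] = None
    file_path: Optional[str] = None
    url: Optional[str] = None
    tag: Optional[str] = None


def resolve_import(req: ImportRequest) -> dict:
    """Validate the form fields and apply the file-name fallback for the name."""
    if req.dataset_type not in DATASET_TYPES:
        raise ValueError(f"unknown data set type: {req.dataset_type!r}")

    # Assumption: exactly one of file or URL must be given.
    if (req.file_path is None) == (req.url is None):
        raise ValueError("provide either a file or a URL")

    if req.file_path is not None:
        source = req.file_path
        fallback_name = PurePosixPath(req.file_path).name
    else:
        source = req.url
        # Assumption: for URLs, use the last path segment as the fallback name.
        fallback_name = PurePosixPath(urlparse(req.url).path).name

    return {
        "name": req.name or fallback_name,
        "type": req.dataset_type,
        "tag": req.tag,
        "source": source,
    }


# Example: no name given, so the file name becomes the data set name.
request = ImportRequest(dataset_type="reference", file_path="data/products.csv")
print(resolve_import(request)["name"])
```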
Using Data Sets in Skills
Reference Data Sets
Data sets can be linked to skills using reference variables. When defining a skill, you can choose a data set as a reference variable, and the skill will have access to that data set during execution. This allows you to provide additional context or information to the skill based on the selected data set.
Test Data Sets
When creating a new skill or an evaluator, you can select a test data set (for a skill) or a calibration data set (for an evaluator). The selected data set drives the skill or evaluator with multiple predefined inputs, run in sequence, so that its performance can be evaluated.
Root Signals allows you to test your skill against multiple models simultaneously. In the "Prompts" and "Models" sections of the skill creation form, you can add multiple prompt variants and select one or more models to test, respectively. When you click the "Test" / "Calibrate" button in the bottom right corner, the system runs your selected test data set against each of the chosen prompts and models. You can then compare the results and select the combination with the best trade-offs for your use case.
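Conceptually, a test run of this kind evaluates every test input for each prompt and model pairing. The sketch below assumes the pairings form a full cross product of prompts and models, which the text above suggests but does not state outright. `run_test_matrix` and `echo_skill` are hypothetical names; `echo_skill` is a stand-in for a real model call and is not part of Root Signals.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def run_test_matrix(
    prompts: Sequence[str],
    models: Sequence[str],
    test_inputs: Sequence[str],
    run_skill: Callable[[str, str, str], str],
) -> Dict[Tuple[str, str], List[str]]:
    """Run every test input, in order, for each (prompt, model) pair."""
    results: Dict[Tuple[str, str], List[str]] = {}
    for prompt, model in product(prompts, models):
        results[(prompt, model)] = [
            run_skill(prompt, model, test_input) for test_input in test_inputs
        ]
    return results


def echo_skill(prompt: str, model: str, test_input: str) -> str:
    """Stand-in for a real model call; it only labels the combination."""
    return f"{model}|{prompt}|{test_input}"


# Two prompt variants and three models give six combinations to compare.
matrix = run_test_matrix(
    prompts=["prompt-a", "prompt-b"],
    models=["model-1", "model-2", "model-3"],
    test_inputs=["first input", "second input"],
    run_skill=echo_skill,
)
for (prompt, model), outputs in matrix.items():
    print(prompt, model, outputs)
```

Keying the results by `(prompt, model)` keeps each combination's outputs together, which makes a side-by-side comparison straightforward.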
Permissions
Access to datasets is controlled through permissions. By default, when a user uploads a new dataset, it is set to 'unlisted' status. Unlisted datasets are only visible to the user who created them and to administrators in the organization. This allows users to work on datasets privately until they are ready to be shared with others.
To make a dataset available to other users in the organization, the dataset owner or an administrator needs to change the status to 'listed'. Listed datasets are visible to all users in the organization and can be used in skills by anyone.
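The visibility rules above can be expressed as a small model. This is a sketch of the described behavior only, not Root Signals code: `User`, `Dataset`, `Status`, `can_view`, and `set_status` are hypothetical names, and the model assumes that users outside the organization never see a dataset.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    UNLISTED = "unlisted"  # default for newly uploaded datasets
    LISTED = "listed"


@dataclass
class User:
    user_id: str
    org: str
    is_admin: bool = False


@dataclass
class Dataset:
    owner_id: str
    org: str
    status: Status = Status.UNLISTED


def can_view(user: User, dataset: Dataset) -> bool:
    """Listed: everyone in the organization. Unlisted: owner and admins only."""
    if user.org != dataset.org:
        return False  # assumption: no visibility across organizations
    if dataset.status is Status.LISTED:
        return True
    return user.is_admin or user.user_id == dataset.owner_id


def set_status(user: User, dataset: Dataset, status: Status) -> None:
    """Only the dataset owner or an organization administrator may change status."""
    allowed = user.org == dataset.org and (
        user.is_admin or user.user_id == dataset.owner_id
    )
    if not allowed:
        raise PermissionError("only the owner or an administrator can change status")
    dataset.status = status


owner = User("u1", "acme")
colleague = User("u2", "acme")
dataset = Dataset(owner_id="u1", org="acme")

print(can_view(colleague, dataset))  # unlisted: hidden from colleagues
set_status(owner, dataset, Status.LISTED)
print(can_view(colleague, dataset))  # listed: visible to the organization
```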
Dataset permissions do not control skill execution privilege
Note that dataset permissions control whether a dataset can be used as a reference variable or as a test data set when creating or editing a skill. Unless more specific permissions information is made available via enterprise integrations, dataset permissions do not control who can use the data set in skill execution. That is, once a dataset is attached to a skill as a reference variable, anyone who has privileges to execute the skill also has implicit access to the data set through skill execution.
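The implicit access described above can be illustrated with a minimal sketch. It is an illustration only, not how Root Signals assembles context internally: `Skill` and `build_execution_context` are hypothetical names, and the dataset row is made-up sample data. The point is that the only check at execution time is the skill-level one.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Skill:
    """Hypothetical skill with a dataset already attached as a reference variable."""
    name: str
    reference_rows: List[str]
    executors: Set[str] = field(default_factory=set)


def build_execution_context(skill: Skill, user_id: str, query: str) -> str:
    """Check skill execution rights only, then include the dataset content."""
    if user_id not in skill.executors:
        raise PermissionError(f"{user_id} may not execute {skill.name}")
    # No dataset permission check happens here: the attached data reaches
    # anyone who is allowed to execute the skill.
    context = "\n".join(skill.reference_rows)
    return f"{context}\n---\n{query}"


skill = Skill(
    name="support-assistant",
    reference_rows=["Refund window: 30 days"],  # made-up sample row
    executors={"alice"},
)
print(build_execution_context(skill, "alice", "Can I still return my order?"))
```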
It is important for dataset owners and administrators to carefully consider the sensitivity and relevance of datasets before making them widely available. Datasets may contain confidential or proprietary information that should only be accessible to authorized users.
Contact Root Signals for more fine-grained controls in enterprise, regulated or governmental contexts.
In Summary
The dataset permission system in Root Signals controls who can see a dataset and use it when creating or editing skills. The unlisted/listed status toggle and the special privileges granted to administrators provide flexibility in managing data assets across the organization. Proper management of dataset permissions is crucial for ensuring data security and relevance in skill development and execution.