Skill testing

When creating or editing a skill, you have the option to select a test data set to evaluate the performance of different language models for your specific use case. This feature allows you to make an informed decision about which model best suits your needs before finalizing your skill.

After selecting a test data set and specifying the models to test against, you can click the "Test" button to run the skill with the test data on each model. Root Signals will then generate a test report showing how well the skill performs with each model.

Selecting a Test Data Set

  1. During the skill creation process, click on the "Test Data" field.

  2. Choose a data set from the list of available test data sets. These data sets have been previously defined and uploaded to the Root Signals platform.

  3. Once you select a test data set, you can preview its contents to ensure it aligns with your skill's requirements.
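For illustration, a test data set can be thought of as a collection of rows, where each row supplies a value for every variable referenced by the skill's prompts. The variable names and contents below are hypothetical; your data set should use whatever variables your own prompts expect.

```python
# Hypothetical test data set for a customer-support skill. Each row supplies
# a value for every variable used by the skill's prompts ({question} and
# {context} here); the names and contents are illustrative only.
test_rows = [
    {"question": "Where is my invoice?",
     "context": "Order #1234 was placed two weeks ago."},
    {"question": "Can I return a damaged item?",
     "context": "The item arrived cracked and the customer sent a photo."},
]
```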

Prompt Variants

  1. Proceed to the "Prompts" section of the skill creation form.

  2. Write and add one or more prompts that you want to evaluate for your skill. Your skill will be tested against all of the added prompts. Note that the same variables must exist in all prompts (see the example after this list).

  3. Root Signals will run tests using your chosen test data set and provide a report on how well each prompt performs.
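As a sketch of the rule above, the two prompt variants below differ in wording and structure but reference exactly the same variables, so either one can be filled from the same test data set. The prompt text and variable names are illustrative, not required by Root Signals.

```python
import string

# Two hypothetical prompt variants for the same skill. Both reference the
# same variables ({question} and {context}), so either can be filled from
# the same test data set.
prompt_a = (
    "Answer the customer's question using only the context provided.\n"
    "Question: {question}\n"
    "Context: {context}"
)
prompt_b = (
    "Context: {context}\n"
    "Based on the context above, write a concise reply to: {question}"
)

def variables(prompt: str) -> set[str]:
    # Collect the variable names referenced by a str.format-style template.
    return {name for _, name, _, _ in string.Formatter().parse(prompt) if name}

# Sanity check: both variants must use an identical set of variables.
assert variables(prompt_a) == variables(prompt_b)
```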

Choosing Models to Test

  1. Proceed to the "Models" section of the skill creation form.

  2. Select one or more models that you want to evaluate for your skill. Your skill will be tested against all the selected models.

  3. Root Signals will run tests using your chosen test data set and provide a report on how well each model performs.

Running the Test

  1. After selecting your test data set and models, click the "Test" button located in the bottom right corner, next to the "Create Skill" button.

  2. The system will execute your skill using the test data set against each of the selected models.

  3. Once the tests are complete, Root Signals will generate a comprehensive test report.
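Conceptually, a test run covers every combination of prompt variant, model, and test data row. Root Signals performs this on its platform; the sketch below only illustrates the shape of that loop, and `run_model`, the model identifiers, and the sample data are hypothetical placeholders.

```python
# Illustrative only: a test run covers the cross product of prompt variants,
# models, and test data rows. run_model is a hypothetical stand-in for the
# model invocation that Root Signals performs on its platform.
prompts = {
    "variant_a": "Question: {question}\nContext: {context}",
    "variant_b": "Context: {context}\nReply concisely to: {question}",
}
models = ["model-x", "model-y"]  # illustrative model identifiers
test_rows = [
    {"question": "Where is my invoice?",
     "context": "Order #1234 was placed two weeks ago."},
]

def run_model(model: str, prompt: str) -> str:
    # Placeholder: the real call is made by the Root Signals platform.
    return f"[{model} output]"

results = [
    {"model": model, "prompt": name, "row": row,
     "output": run_model(model, template.format(**row))}
    for model in models
    for name, template in prompts.items()
    for row in test_rows
]
```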

Interpreting the Test Report

  1. The test report provides insights into how each model performed when executing your skill with the given test data set.

  2. Analyze the report to determine which model best meets the requirements of your skill in terms of accuracy, relevance, and other key metrics.

  3. Based on the test results, select the most appropriate model for your skill.
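As a rough illustration of how such a comparison might be summarized, the snippet below averages a per-row score for each model. The scores and model names are invented purely for the example; the actual metrics and values come from the Root Signals test report.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-row scores, e.g. taken from a test report; the metric and
# the numbers are invented purely for illustration.
rows = [
    {"model": "model-x", "score": 0.82},
    {"model": "model-x", "score": 0.74},
    {"model": "model-y", "score": 0.91},
    {"model": "model-y", "score": 0.88},
]

by_model = defaultdict(list)
for row in rows:
    by_model[row["model"]].append(row["score"])

# Average score per model, highest first: one possible signal when choosing.
for model, scores in sorted(by_model.items(), key=lambda kv: -mean(kv[1])):
    print(f"{model}: {mean(scores):.2f}")
```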

Finalizing Your Skill

  1. After reviewing the test report and selecting the optimal model, click the "Create Skill" button to finalize your skill.

  2. Your newly created skill will now use the chosen model when executed by users.

  3. If you wish to deploy the skill in an external context not supported by Root Signals, you can simply copy the prompt and model definitions to your target context. Note that any use of variables, as well as data set connections, will then need to be recreated in the new context, as sketched below.
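A minimal sketch of recreating variable substitution outside Root Signals, assuming a prompt with `{question}` and `{context}` variables and plain Python string formatting; the prompt text and values are illustrative, and the filled prompt would then be sent to your chosen model in your own serving stack.

```python
# Minimal sketch of recreating variable substitution outside Root Signals.
# The prompt text and variable names are illustrative.
prompt_template = (
    "Answer the customer's question using only the context provided.\n"
    "Question: {question}\n"
    "Context: {context}"
)

filled_prompt = prompt_template.format(
    question="Where is my invoice?",
    context="Order #1234 was placed two weeks ago.",
)
print(filled_prompt)
```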

By leveraging test data sets to compare the performance of different models, you can ensure that your skill is powered by the most suitable language model for your specific use case. This feature in Root Signals empowers you to make data-driven decisions and optimize your skill's performance.
