Test run comparison
Autoblocks test run comparison is a tool for analyzing and comparing the results of your tests. It provides a visual representation of your evaluator and run results, making it easy to identify patterns and trends.
Local tests overview
The local tests overview page is your private workspace for iterating on your AI-powered product. You can quickly analyze the results of your evaluators across test suites all in one place, and dive deeper into the results by clicking "Compare Runs".
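Runs appear on this page once you execute a test suite locally with the testing SDK. The sketch below is a minimal example, assuming the Autoblocks Python SDK's `run_test_suite` entry point; the suite id, test case shape, evaluator, and `generate_answer` function are illustrative.

```python
# A minimal sketch, assuming the Autoblocks Python SDK's testing entry points;
# the suite id, test case, evaluator, and generate_answer are illustrative.
import dataclasses

from autoblocks.testing.models import BaseTestCase, BaseTestEvaluator, Evaluation
from autoblocks.testing.run import run_test_suite


@dataclasses.dataclass
class MyTestCase(BaseTestCase):
    question: str

    def hash(self) -> str:
        # Stable identifier so the same test case can be compared across runs.
        return self.question


class HasAnswer(BaseTestEvaluator):
    id = "has-answer"

    def evaluate_test_case(self, test_case: MyTestCase, output: str) -> Evaluation:
        # Score 1 when the system produced any output, 0 otherwise.
        return Evaluation(score=1 if output else 0)


def generate_answer(test_case: MyTestCase) -> str:
    # Stand-in for a call into your AI system.
    return f"Answer to: {test_case.question}"


run_test_suite(
    id="my-test-suite",
    test_cases=[MyTestCase(question="What is test run comparison?")],
    evaluators=[HasAnswer()],
    fn=generate_answer,
)
```

Local runs are typically started through the Autoblocks CLI, which wraps a script like the one above and forwards its results to the overview page.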
CI tests overview
The CI tests overview page shows the results of your evaluators across test suites for any branch and build. Using the dropdowns, you can easily find branches and commits and compare the results of your tests over time.
In most cases a build is triggered by a push to a branch, but it can also be triggered by scheduled or on-demand workflow runs. Regardless of the trigger, a build is always associated with the head commit of the branch it ran on.
Test details
The test details page allows you to compare evaluator results and test case outputs for each run side by side. This enables you to quickly iterate on different parts of your AI system and compare the results.
When viewing a local test, you can share your filtered view with a teammate using the "Share" button.
By default, only the two most recent runs are selected; you can add more using the "Runs" dropdown.
Sorting the test results table
Use the "Choose Evaluator" dropdown to sort the test results table by the evaluator you are interested in. This makes it easy to identify the test cases that are performing well and those that need improvement.
Expanding the test results table
You can expand the test results table to a full-screen view using the "Expand" button. Additionally, you can resize each column in the table to make longer outputs easier to read.
Visualizing prompts used in a run
When using the Autoblocks prompt SDK and tracer SDK, you can visualize the prompts used in a run by clicking the "Prompts" icon next to the run. This allows you to quickly identify which prompts and templates were used across runs.
As long as you've implemented the prompt and tracer SDKs, no additional setup is required to visualize prompts used in test runs.
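For context, here is a hedged sketch of how prompt and trace data becomes visible on a run, assuming the Autoblocks Python SDK's `AutoblocksTracer`; the event name, property shape, and `call_llm` helper are illustrative stand-ins for your own code and for the prompt SDK's rendered output.

```python
# A minimal sketch, assuming the Autoblocks Python SDK's AutoblocksTracer;
# the event name, property shape, and call_llm helper are illustrative.
from autoblocks.tracer import AutoblocksTracer

# Reads the ingestion key from the AUTOBLOCKS_INGESTION_KEY environment variable.
tracer = AutoblocksTracer()


def call_llm(prompt: str) -> str:
    # Hypothetical helper standing in for your LLM call.
    return f"response to: {prompt}"


def answer(question: str) -> str:
    # In a real setup the prompt SDK's manager would render this template
    # and attach prompt tracking info to the event.
    prompt = f"Answer concisely: {question}"
    response = call_llm(prompt)
    # Events sent while a test is executing are associated with that run,
    # which is what powers the "Prompts" view in the UI.
    tracer.send_event(
        "ai.completion",
        properties={"prompt": prompt, "response": response},
    )
    return response
```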