Test run comparison

Autoblocks test run comparison lets you analyze and compare the results of your test runs. It presents your evaluator and run results visually, making it easy to spot patterns and trends.

Local tests overview

The local tests overview page is your private workspace for iterating on your AI-powered product. You can analyze evaluator results across test suites in one place and dive deeper by clicking "Compare Runs".
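Local runs come from executing a test suite with the Autoblocks testing SDK on your machine. The sketch below is a minimal Python example under that assumption; `my_ai_function`, the test case shape, and the evaluator are hypothetical stand-ins for your own logic.

```python
import dataclasses

from autoblocks.testing.models import BaseTestCase
from autoblocks.testing.models import BaseTestEvaluator
from autoblocks.testing.models import Evaluation
from autoblocks.testing.models import Threshold
from autoblocks.testing.run import run_test_suite
from autoblocks.testing.util import md5


@dataclasses.dataclass
class MyTestCase(BaseTestCase):
    input: str

    def hash(self) -> str:
        # A stable identifier lets Autoblocks track this case across runs
        return md5(self.input)


def my_ai_function(text: str) -> str:
    # Hypothetical stand-in for the AI logic under test
    return text.upper()


class IsNonEmpty(BaseTestEvaluator):
    id = "is-non-empty"

    def evaluate_test_case(self, test_case: MyTestCase, output: str) -> Evaluation:
        # Pass (score 1) when the output is non-empty, fail (score 0) otherwise
        score = 1 if output.strip() else 0
        return Evaluation(score=score, threshold=Threshold(gte=1))


def test_fn(test_case: MyTestCase) -> str:
    return my_ai_function(test_case.input)


run_test_suite(
    id="my-test-suite",
    test_cases=[MyTestCase(input="hello world")],
    evaluators=[IsNonEmpty()],
    fn=test_fn,
)
```

Running the script through the Autoblocks CLI (for example, `npx autoblocks testing exec -m "tuned retrieval" -- python run.py`) sends the results to the local tests overview, where the message you pass identifies the run.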

CI tests overview

The CI tests overview page shows evaluator results across test suites for any branch and build. Use the dropdowns to quickly find a branch or commit and compare test results over time.

Test details

The test details page lets you compare evaluator results and test case outputs for each run side by side, so you can quickly iterate on different parts of your AI system and compare the outcomes.

Diff against baseline or previous run

Use the "Diff Output Against" dropdown to diff the current run's output against a baseline or a previous run. This is useful for spotting how your test results change over time.

Sorting the test results table

Use the "Choose Evaluator" dropdown to sort the test results table by the evaluator you are interested in. This makes it easy to see which test cases are performing well and which need improvement.

Expanding the test results table

You can expand the test results table to a full-screen view using the "Expand" button. Additionally, you can resize each column in the table to make longer outputs easier to read.

Visualizing prompts used in a run

When using the Autoblocks prompt SDK and tracer SDK, you can visualize the prompts used in a run by clicking the "Prompts" icon next to the run. This allows you to quickly identify which prompts and templates were used across runs.
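As a rough sketch of how prompts get associated with a run, assume a prompt manager class generated by the Autoblocks prompt SDK's codegen step. The class name, the `summarize` template, and the `prompt_tracking` wiring below are illustrative assumptions and may differ in your SDK version.

```python
import os

from autoblocks.tracer import AutoblocksTracer

# Hypothetical manager produced by the Autoblocks prompt SDK's code generation
from my_project.autoblocks_prompts import TextSummarizerPromptManager

tracer = AutoblocksTracer(os.environ["AUTOBLOCKS_INGESTION_KEY"])
manager = TextSummarizerPromptManager(minor_version="0")

with manager.exec() as prompt:
    # Render a template defined in Autoblocks ("summarize" is illustrative)
    rendered = prompt.render.summarize(document="...")

    # ... call your LLM with `rendered` here ...

    tracer.send_event(
        "ai.completion",
        properties={"response": "..."},
        # Attaches the exact prompt and template versions to this event so
        # the "Prompts" icon can surface them alongside the run
        prompt_tracking=prompt.track(),
    )
```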