Overview
Human review mode is designed for humans to review, grade, and discuss test results.
Formatting your test cases and outputs
The schemas of your test case and output as they exist in your codebase often contain implementation details that are not relevant to a human reviewer. Each SDK provides methods that allow you to transform your test cases and outputs into human-readable formats.
There are four different content types you can use to control the rendering in the Autoblocks UI:
- TEXT
- HTML
- MARKDOWN
- LINK
Human review is often a good starting point when setting up a test suite for the first time. Developers can run the tests without any code-based evaluators and review the results manually to understand the responses the LLM is generating.
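To make the formatting idea concrete, here is a minimal, self-contained sketch of turning a test case and an output into reviewer-facing fields. It does not use the Autoblocks SDK: `ContentType`, `ReviewField`, `SupportTestCase`, `to_review_fields`, and `output_to_review_fields` are hypothetical names invented for illustration, so check the SDK reference for the actual methods and types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ContentType(Enum):
    """Hypothetical stand-in for the four content types listed above."""
    TEXT = "TEXT"
    HTML = "HTML"
    MARKDOWN = "MARKDOWN"
    LINK = "LINK"


@dataclass
class ReviewField:
    """Hypothetical: one labeled value shown to the human reviewer."""
    name: str
    value: str
    content_type: ContentType = ContentType.TEXT


@dataclass
class SupportTestCase:
    """Hypothetical test case that carries an internal detail reviewers do not need."""
    question: str
    source_url: str
    retrieval_cache_key: str  # implementation detail, deliberately left out below

    def to_review_fields(self) -> List[ReviewField]:
        # Expose only what helps a reviewer judge the result.
        return [
            ReviewField("Question", self.question),
            ReviewField("Source", self.source_url, ContentType.LINK),
        ]


def output_to_review_fields(output: dict) -> List[ReviewField]:
    # Assumes the output dict has an "answer_markdown" key; other keys are ignored.
    return [ReviewField("Answer", output["answer_markdown"], ContentType.MARKDOWN)]
```

The design choice is to keep the codebase schema unchanged and put the reviewer-facing view in a separate transformation, so internal fields such as cache keys never reach the review UI.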
Creating a human review job programmatically
Whether you are on the free plan or a paid plan, you can create human review jobs directly in code, either with the RunManager or in runTestSuite.
run_test_suite / runTestSuite
Run Manager
Using the results
You can use the results of a human review job for a variety of purposes, such as:
- Fine-tuning an evaluation model
- Few-shot examples for your LLM judges
- Improving your core product based on expert feedback
- and more!
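As one illustration of the few-shot use case, the sketch below formats reviewed results into an example block that could be pasted into an LLM judge prompt. The `GradedResult` shape and the `build_few_shot_block` helper are hypothetical. They do not reflect the format in which Autoblocks exports human review results, so adapt the field names to whatever your export contains.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GradedResult:
    """Hypothetical shape for one human-reviewed result."""
    test_input: str
    output: str
    grade: str    # for example "pass" or "fail", as assigned by the reviewer
    comment: str  # the reviewer's explanation


def build_few_shot_block(results: List[GradedResult], max_examples: int = 3) -> str:
    """Format up to max_examples reviewed results as few-shot examples."""
    lines: List[str] = []
    for r in results[:max_examples]:
        lines.append(f"Input: {r.test_input}")
        lines.append(f"Output: {r.output}")
        lines.append(f"Grade: {r.grade}")
        lines.append(f"Reason: {r.comment}")
        lines.append("")  # blank line between examples
    return "\n".join(lines).strip()
```

Including the reviewer's comment alongside the grade gives the judge model the reasoning behind each label, not just the label itself.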