TypeScript SDK Reference
Technical reference for the Autoblocks TypeScript SDK testing functionality.
runTestSuite
The main entrypoint into the testing framework.
name | required | type | description |
---|---|---|---|
id | true | string | A unique ID for the test suite. This will be displayed in the Autoblocks platform and should remain the same for the lifetime of a test suite. |
testCases | true | BaseTestCase[] | A list of instances that subclass BaseTestCase. These can follow any schema that facilitates testing your application. They are passed directly to fn and are also made available to your evaluators. BaseTestCase is an abstract base class that requires you to implement the hash method. See Test case hashing for more information. |
testCaseHash | false | (testCase: BaseTestCase) => string | An optional function that returns a string that uniquely identifies a test case for its lifetime. If not provided, the test case’s hash method will be used. |
evaluators | true | BaseTestEvaluator[] | A list of instances that subclass BaseTestEvaluator. |
fn | true | (testCase: BaseTestCase) => Promise<any> or any | The function you are testing. Its only argument is an instance of a test case. This function can be synchronous or asynchronous and can return any type. |
maxTestCaseConcurrency | false | number | The maximum number of test cases that can be running concurrently through fn. Useful to avoid rate limiting from external services, such as an LLM provider. |
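
To make the argument shapes concrete, here is a minimal sketch of a test case class and a set of arguments shaped like the table above. It does not import the SDK: `BaseTestCase` is redeclared locally as a stand-in so the snippet runs on its own, and `QuestionTestCase`, `answerQuestion`, and `suiteArgs` are hypothetical names chosen for illustration. Evaluators are left out to keep the sketch short.

```typescript
// Local stand-in for the SDK's abstract base class (an assumption made so
// this sketch is self-contained; the real class comes from the SDK).
abstract class BaseTestCase {
  abstract hash(): string;
}

// Hypothetical test case schema: a question and a substring we expect
// to find in the answer.
class QuestionTestCase extends BaseTestCase {
  question: string;
  expectedSubstring: string;

  constructor(question: string, expectedSubstring: string) {
    super();
    this.question = question;
    this.expectedSubstring = expectedSubstring;
  }

  // The question text identifies this test case for its lifetime.
  hash(): string {
    return this.question;
  }
}

// Hypothetical function under test; a real one would call your application.
async function answerQuestion(testCase: QuestionTestCase): Promise<string> {
  return `You asked: ${testCase.question}`;
}

// An object shaped like the runTestSuite arguments described above.
const suiteArgs = {
  id: "question-answering",
  testCases: [
    new QuestionTestCase("What is 2 + 2?", "4"),
    new QuestionTestCase("What is the capital of France?", "Paris"),
  ],
  // Optional: when provided, this is used instead of each test case's hash method.
  testCaseHash: (testCase: QuestionTestCase): string =>
    testCase.question.toLowerCase(),
  fn: answerQuestion,
  // At most two test cases run through fn at the same time.
  maxTestCaseConcurrency: 2,
};
```

Whichever hashing route is used, the returned string should stay stable across runs, since it is what identifies a test case over time.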
BaseTestEvaluator
An abstract base class that you can subclass to create your own evaluators.
name | required | type | description |
---|---|---|---|
id | true | string | A unique identifier for the evaluator. |
maxConcurrency | false | number | The maximum number of concurrent calls to evaluateTestCase allowed for the evaluator. Useful to avoid rate limiting from external services, such as an LLM provider. |
evaluateTestCase | true | (testCase: BaseTestCase, output: any) => Promise<Evaluation or undefined> or Evaluation or undefined | Creates an evaluation on a test case and its output. This method can be synchronous or asynchronous. |
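
As an illustration, a synchronous evaluator might look like the sketch below. The `BaseTestEvaluator`, `Evaluation`, and `Threshold` declarations here are local stand-ins that mirror the tables in this reference, not imports from the SDK, and `HasSubstring` and the `QuestionTestCase` shape are hypothetical.

```typescript
// Local stand-ins mirroring the interfaces documented in this reference
// (assumptions made so the sketch is self-contained).
interface Threshold {
  lt?: number;
  lte?: number;
  gt?: number;
  gte?: number;
}

interface Evaluation {
  score: number;
  threshold?: Threshold;
  metadata?: Record<string, any>;
}

// Hypothetical test case shape used by this evaluator.
interface QuestionTestCase {
  question: string;
  expectedSubstring: string;
}

abstract class BaseTestEvaluator {
  abstract id: string;
  maxConcurrency?: number;
  abstract evaluateTestCase(
    testCase: QuestionTestCase,
    output: string,
  ): Evaluation | undefined | Promise<Evaluation | undefined>;
}

// Scores 1 when the output contains the expected substring, otherwise 0.
class HasSubstring extends BaseTestEvaluator {
  id = "has-substring";
  maxConcurrency = 5;

  evaluateTestCase(testCase: QuestionTestCase, output: string): Evaluation {
    const found = output.includes(testCase.expectedSubstring);
    const evaluation: Evaluation = {
      score: found ? 1 : 0,
      // Passing requires a score of at least 1.
      threshold: { gte: 1 },
    };
    if (!found) {
      // Metadata explains why the evaluation failed.
      evaluation.metadata = {
        reason: `Output does not contain "${testCase.expectedSubstring}"`,
      };
    }
    return evaluation;
  }
}

const evaluator = new HasSubstring();
```

An evaluator that calls an external service, such as an LLM provider, would return a Promise from `evaluateTestCase` instead, and `maxConcurrency` would then bound the number of in-flight calls.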
Evaluation
An interface that represents the result of an evaluation.
name | required | type | description |
---|---|---|---|
score | true | number | A number between 0 and 1 that represents the score of the evaluation. |
threshold | false | Threshold | An optional Threshold that describes the range the score must fall within to be considered passing. If no threshold is attached, the score is reported and the pass / fail status is undefined. |
metadata | false | Record<string, any> | Key-value pairs that provide additional context about the evaluation. This is typically used to explain why an evaluation failed. Attached metadata is surfaced in the test run comparison UI. |
Threshold
An interface that defines the passing criteria for an evaluation.
name | required | type | description |
---|---|---|---|
lt | false | number | The score must be less than this number in order to be considered passing. |
lte | false | number | The score must be less than or equal to this number in order to be considered passing. |
gt | false | number | The score must be greater than this number in order to be considered passing. |
gte | false | number | The score must be greater than or equal to this number in order to be considered passing. |
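
The sketch below shows one way a score could be checked against a Threshold. It is not the SDK's implementation: `isPassing` is a hypothetical helper, the `Threshold` interface is redeclared locally, and the assumption that every bound that is set must hold at once is mine rather than something this reference states.

```typescript
// Local copy of the Threshold shape documented above.
interface Threshold {
  lt?: number;
  lte?: number;
  gt?: number;
  gte?: number;
}

// Hypothetical helper: returns undefined when there is no threshold
// (pass / fail is undefined), otherwise true only if every bound that
// is set holds. Combining bounds with AND is an assumption.
function isPassing(score: number, threshold?: Threshold): boolean | undefined {
  if (threshold === undefined) return undefined;
  if (threshold.lt !== undefined && !(score < threshold.lt)) return false;
  if (threshold.lte !== undefined && !(score <= threshold.lte)) return false;
  if (threshold.gt !== undefined && !(score > threshold.gt)) return false;
  if (threshold.gte !== undefined && !(score >= threshold.gte)) return false;
  return true;
}

// A score of 0.8 against "at least 0.5".
const atLeastHalf = isPassing(0.8, { gte: 0.5 });
// A score of 0.5 against "strictly greater than 0.5".
const strictlyAboveHalf = isPassing(0.5, { gt: 0.5 });
// A score with no threshold attached.
const noThreshold = isPassing(1);
```

Under these assumptions `atLeastHalf` is `true`, `strictlyAboveHalf` is `false`, and `noThreshold` is `undefined`.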