# TypeScript Quick Start

> Get started with the Autoblocks Testing SDK for TypeScript.

## Overview

Autoblocks Testing enables you to declaratively define tests for your LLM application and execute them either locally or in a CI/CD pipeline. Your tests can exist in a standalone script or be executed as part of a larger test framework.

```typescript
runTestSuite({
  id: 'my-test-suite',
  testCases: genTestCases(),
  testCaseHash: ['input'],
  evaluators: [new HasAllSubstrings(), new IsFriendly()],
  fn: testFn,
});
```

## Getting Started

### Install the SDK

```bash
npm install @autoblocks/client
# or
yarn add @autoblocks/client
# or
pnpm add @autoblocks/client
```

### Setup Tracing

Tracing is used to track the execution of your test suite. See the [tracing](/v2/guides/tracing) guide for more information.

### Define your test case schema

Your test case schema should contain all of the properties necessary to run your test function and to then make assertions on the output via your evaluators. This schema can be anything you want in order to facilitate testing your application.

```typescript
interface MyTestCase {
  input: string;
  expectedSubstrings: string[];
}
```

### Implement a function to test

This function should take an instance of a test case and return an output. The function can be synchronous or asynchronous and the output can be anything: a string, a number, a complex object, etc.

For this example, we are splitting the test case's `input` property on its hyphens and randomly discarding some of the substrings to simulate failures on the `has-all-substrings` evaluator.
```typescript
async function testFn({ testCase }: { testCase: MyTestCase }): Promise<string> {
  // Simulate doing work
  await new Promise((resolve) => setTimeout(resolve, Math.random() * 1000));

  const substrings = testCase.input.split('-');
  if (Math.random() < 0.2) {
    // Remove a substring randomly. This will cause about 20% of the test cases to fail
    // the "has-all-substrings" evaluator.
    substrings.pop();
  }

  return substrings.join('-');
}
```

### Create an evaluator

Evaluators allow you to attach an [`Evaluation`](#reference-evaluation) to a test case's output, where the output is the result of running the test case through the function you are testing. Your test suite can have multiple evaluators.

The evaluation method that you implement on the evaluator has access to both the test case instance and the output of the test function for that test case. Your evaluation method can be synchronous or asynchronous, but it must return an instance of `Evaluation`.

The evaluation must have a score between 0 and 1, and you can optionally attach a [`Threshold`](#reference-threshold) that describes the range the score must fall within to be considered passing. If no threshold is attached, the score is reported and the pass / fail status is undefined. Evaluations can also have metadata attached to them, which can be useful for providing additional context when an evaluation fails.

For this example we'll define two evaluators:

```typescript
import {
  BaseTestEvaluator,
  type Evaluation,
} from '@autoblocks/client/testing';

/**
 * An evaluator is a class that subclasses BaseTestEvaluator.
 *
 * It must specify an ID, which is a unique identifier for the evaluator.
 *
 * It has two required type parameters:
 * - TestCaseType: The type of your test cases.
 * - OutputType: The type of the output returned by the function you are testing.
 */
class HasAllSubstrings extends BaseTestEvaluator<MyTestCase, string> {
  id = 'has-all-substrings';

  /**
   * Evaluates the output of a test case.
   *
   * Required to be implemented by subclasses of BaseTestEvaluator.
   * This method can be synchronous or asynchronous.
   */
  evaluateTestCase(args: { testCase: MyTestCase; output: string }): Evaluation {
    const missingSubstrings = args.testCase.expectedSubstrings.filter(
      (s) => !args.output.includes(s),
    );
    const score = missingSubstrings.length ? 0 : 1;

    return {
      score,
      threshold: {
        // If the score is not greater than or equal to 1,
        // this evaluation will be marked as a failure.
        gte: 1,
      },
      metadata: {
        // Include the missing substrings as metadata
        // so that we can easily see which strings were
        // missing when viewing a failed evaluation
        // in the Autoblocks UI.
        missingSubstrings,
      },
    };
  }
}

class IsFriendly extends BaseTestEvaluator<MyTestCase, string> {
  id = 'is-friendly';

  // The maximum number of concurrent calls to `evaluateTestCase` allowed for this evaluator.
  // Useful to avoid rate limiting from external services, such as an LLM provider.
  maxConcurrency = 5;

  async getScore(output: string): Promise<number> {
    // Simulate doing work
    await new Promise((resolve) => setTimeout(resolve, Math.random() * 1000));

    // Simulate a friendliness score, e.g. as determined by an LLM.
    return Math.random();
  }

  /**
   * This can also be an async function. This is useful if you are interacting
   * with an external service that requires async calls, such as OpenAI, or if
   * the evaluation you are performing could benefit from concurrency.
   */
  async evaluateTestCase(args: {
    testCase: MyTestCase;
    output: string;
  }): Promise<Evaluation> {
    const score = await this.getScore(args.output);

    return {
      score,
      // Evaluations don't need thresholds attached to them.
      // In this case, the evaluation will just consist of the score.
    };
  }
}
```

An evaluator can be used across many test suites! The recommended approach for this is to create your own abstract class that subclasses `BaseTestEvaluator` and implements any shared logic, then subclass that abstract class for each test suite.
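
A minimal sketch of that pattern follows. So that it runs on its own, it declares local stand-ins for `BaseTestEvaluator` and `Evaluation`; in a real project these would be imported from `@autoblocks/client/testing`. The names `HasSubstrings`, `outputAsString`, `expectedSubstrings`, and `MySuiteHasSubstrings` are hypothetical and chosen only for illustration.

```typescript
// Local stand-ins so this sketch is self-contained. In a real project, import
// BaseTestEvaluator and Evaluation from '@autoblocks/client/testing' instead.
interface Evaluation {
  score: number;
  threshold?: { gte?: number };
  metadata?: Record<string, unknown>;
}

abstract class BaseTestEvaluator<TestCaseType, OutputType> {
  abstract id: string;
  abstract evaluateTestCase(args: {
    testCase: TestCaseType;
    output: OutputType;
  }): Evaluation | Promise<Evaluation>;
}

// Hypothetical shared base class: implements the substring check once and
// leaves it to each test suite to say where the strings come from.
abstract class HasSubstrings<TestCaseType, OutputType> extends BaseTestEvaluator<
  TestCaseType,
  OutputType
> {
  id = 'has-substrings';

  // Each suite maps its own output type to the text that is searched.
  abstract outputAsString(output: OutputType): string;

  // Each suite maps its own test case type to the substrings it expects.
  abstract expectedSubstrings(testCase: TestCaseType): string[];

  evaluateTestCase(args: { testCase: TestCaseType; output: OutputType }): Evaluation {
    const text = this.outputAsString(args.output);
    const missingSubstrings = this.expectedSubstrings(args.testCase).filter(
      (s) => !text.includes(s),
    );
    return {
      score: missingSubstrings.length ? 0 : 1,
      threshold: { gte: 1 },
      metadata: { missingSubstrings },
    };
  }
}

interface MyTestCase {
  input: string;
  expectedSubstrings: string[];
}

// Suite-specific subclass: only the two mapping methods need to be written.
class MySuiteHasSubstrings extends HasSubstrings<MyTestCase, string> {
  outputAsString(output: string): string {
    return output;
  }

  expectedSubstrings(testCase: MyTestCase): string[] {
    return testCase.expectedSubstrings;
  }
}
```

With this layout, a second test suite with a different test case or output type would add another small subclass rather than repeat the comparison logic.
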
See the [example](https://github.com/autoblocksai/autoblocks-examples/blob/main/JavaScript/testing-sdk/src/evaluators/has-substrings.ts) and the [TypeScript documentation](https://www.typescriptlang.org/docs/handbook/classes.html#abstract-classes) on abstract classes.

### Create a test suite

We now have all of the pieces necessary to run a test suite. Below we'll generate some toy test cases in the schema we defined above, where the input is a random UUID and its expected substrings are the substrings of the UUID when split by "-":

```typescript
import crypto from 'crypto';

import { runTestSuite } from '@autoblocks/client/testing/v2';

function genTestCases(n: number): MyTestCase[] {
  const testCases: MyTestCase[] = [];
  for (let i = 0; i < n; i++) {
    const randomId = crypto.randomUUID();
    testCases.push({
      input: randomId,
      expectedSubstrings: randomId.split('-'),
    });
  }
  return testCases;
}

(async () => {
  await runTestSuite({
    id: 'my-test-suite',
    fn: testFn,
    // Specify here either a list of properties that uniquely identify a test case
    // or a function that takes a test case and returns a hash. See the section on
    // hashing test cases for more information.
    testCaseHash: ['input'],
    testCases: genTestCases(400),
    evaluators: [
      new HasAllSubstrings(),
      new IsFriendly(),
    ],
    // The maximum number of test cases that can be running
    // concurrently through `fn`. Useful to avoid rate limiting
    // from external services, such as an LLM provider.
    maxTestCaseConcurrency: 10,
  });
})();
```

### Run the test suite locally

To execute this test suite, first get your **local testing API key** from the [settings page](https://app.autoblocks.ai/settings/api-keys) and set it as an environment variable:

```bash
export AUTOBLOCKS_V2_API_KEY=...
```

Make sure you've followed our [CLI setup instructions](/cli/setup) and then run the following:

```bash
# Assumes you've saved the above code in a file called run.ts
npx autoblocks testing exec -m "my first run" -- npx tsx run.ts
```

The [`autoblocks testing exec`](/cli/commands#testing-exec) command will show the progress of all test suites in your terminal and also send the results to Autoblocks.

You can view details of the results by clicking on the link displayed in the terminal or by visiting the [test suites page](https://app.autoblocks.ai/testing/local) in the Autoblocks platform.

## Examples

To see a more complete example, check out our [TypeScript example repository](https://github.com/autoblocksai/autoblocks-examples/tree/main/JavaScript/testing-sdk).
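
As a smaller example, the test suite in this guide sets `testCaseHash: ['input']`, and its comment notes that a function taking a test case and returning a hash is also accepted. The sketch below shows what such a function might look like. The name `hashTestCase`, the string return type, and the choice of SHA-256 are assumptions made for illustration, not something the SDK is known to require.

```typescript
import { createHash } from 'node:crypto';

interface MyTestCase {
  input: string;
  expectedSubstrings: string[];
}

// Hypothetical helper: derives a stable identifier from the property that
// makes a test case unique. Returning a hex string is an assumption.
function hashTestCase(testCase: MyTestCase): string {
  return createHash('sha256').update(testCase.input).digest('hex');
}
```

Under these assumptions it would be passed as `testCaseHash: hashTestCase` in place of the property list. Whatever form is used, the hash should depend only on fields that stay the same between runs, so that the same test case maps to the same identifier each time.
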