Empower Non-Technical Users to Test and Evaluate in AI Playground

Our Testing SDKs give engineers a rapid iterate → test → evaluate feedback loop as they make changes to a codebase. Non-technical stakeholders, however, often have valuable insights that are difficult to capture and integrate into the development process. Including these people (your designers, product managers, CEOs, and anyone else who isn't writing code) directly in the development and testing of your AI product is crucial to its success.

Prerequisites

There are a few requirements to get started with AI Playground:

  1. Set up test suites
  2. Set up test suites to run in CI
  3. Set up prompt management

With these in place, you can empower anyone on your team to make changes to a prompt and then run the same tests a developer would against those changes.

Update your GitHub Actions workflow

Update the workflow you created in the CI setup guide so that it also runs on the workflow_dispatch event. The event must declare a string input called autoblocks-overrides:

name: Autoblocks Tests

on:
  pull_request:
  workflow_dispatch:
    inputs:
      autoblocks-overrides:
        type: string
        description: Overrides for Autoblocks-managed entities
        required: false
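
To check that the workflow_dispatch trigger and its input are wired up before anyone uses the Playground, you can dispatch the workflow by hand through GitHub's REST API. The sketch below only builds the request. The repository, workflow file name, token, and the shape of the overrides payload are all hypothetical placeholders, since the format Autoblocks sends in autoblocks-overrides is not documented here.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_dispatch_request(owner, repo, workflow_file, ref, overrides, token):
    """Build (but do not send) a workflow_dispatch request for GitHub's REST API."""
    url = (
        f"{GITHUB_API}/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = {
        # Branch or tag whose copy of the workflow file should run.
        "ref": ref,
        # The input is declared as `type: string`, so the overrides are
        # serialized to a JSON string. The payload shape is an assumption.
        "inputs": {"autoblocks-overrides": json.dumps(overrides)},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
request = build_dispatch_request(
    owner="acme",
    repo="chatbot",
    workflow_file="autoblocks-tests.yml",
    ref="main",
    overrides={"example-key": "example-value"},
    token="YOUR_GITHUB_TOKEN",
)
print(request.full_url)
```

Passing the request to urllib.request.urlopen sends it. That requires a real token with permission to run workflows in the repository.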

Install the GitHub App

Install our GitHub App on the repository that contains the GitHub Actions workflow above.

Invite team members to Autoblocks

Make sure you've invited every team member who wants to participate in testing and evaluation to your Autoblocks organization.

Find the test suite you want to run

Navigate to the test suites page and click on the "Compare Runs" button for the test suite you want to run in AI Playground.

At the top of the page, click "Rerun in Playground." This opens the AI Playground, where you can modify the prompts and configs the test suite uses and then run the suite again.

Make changes to the prompts or configs

In the AI Playground, the prompts and configs available to modify are displayed as tabs across the top of the page. Make the modifications you want to test and then click the "Rerun" button at the top of the page.

View your test run

You can find your UI-triggered test runs on the test suites page.