Online Evaluations

Autoblocks allows you to evaluate your AI events in real time. You can attach evaluators to monitor your AI features and ensure they are working as expected.

Setting up evaluations

Online evaluations are supported through the Tracer SDK. Evaluators, classes that produce evaluations, are maintained in your codebase, giving you the flexibility to define logic for your specific use case. Common evaluator patterns include rule-based checks, LLM-based scoring, and comparisons against ground truth.

import { AutoblocksTracer } from '@autoblocks/client';
import {
  BaseEventEvaluator,
  type Evaluation,
  type TracerEvent,
} from '@autoblocks/client/testing';

class IsProfessionalTone extends BaseEventEvaluator {
  id = 'is-professional-tone';

  private async isProfessionalToneLlmScore(content: string): Promise<Evaluation> {
    // Your call to an LLM. Omitted for brevity
    return { score: 0.5, threshold: { gte: 0.5 } };
  }

  async evaluateEvent(args: { event: TracerEvent }): Promise<Evaluation> {
    return await this.isProfessionalToneLlmScore(
      args.event.properties['response']
    );
  }
}

const tracer = new AutoblocksTracer();
tracer.sendEvent(
  'ai.response',
  {
    properties: {
      response: 'Hello, how are you?',
    },
    evaluators: [
      new IsProfessionalTone(),
    ],
  },
);
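
The example above shows an LLM-based evaluator. A rule-based evaluator follows the same pattern but needs no model call. The sketch below is a hypothetical example that checks response length; the `max-response-length` id and the 2000-character limit are illustrative choices, not part of the SDK.

import {
  BaseEventEvaluator,
  type Evaluation,
  type TracerEvent,
} from '@autoblocks/client/testing';

// Hypothetical rule-based evaluator: passes when the response stays under a character limit.
class MaxResponseLength extends BaseEventEvaluator {
  id = 'max-response-length';

  // Illustrative limit; tune this for your use case.
  private readonly maxCharacters = 2000;

  async evaluateEvent(args: { event: TracerEvent }): Promise<Evaluation> {
    const response: string = args.event.properties['response'] ?? '';
    return {
      // Score 1 when the rule passes, 0 when it fails.
      score: response.length <= this.maxCharacters ? 1 : 0,
      threshold: { gte: 1 },
    };
  }
}

You can pass multiple evaluators in the same sendEvent call, for example a rule-based evaluator alongside an LLM-based one; each produces its own evaluation for the event.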

Monitoring

You can monitor your evaluations through Autoblocks dashboards. These configurable dashboards provide a real-time view of your online evaluation results to help you understand which prompts, models, and settings are performing best.

Learn more about dashboards