Grid Search

Grid search lets you run your tests against multiple combinations of parameters, so you can systematically explore different configurations and compare their performance. Common use cases include:

  1. Comparing different language models (e.g., GPT-3.5 vs GPT-4)
  2. Testing various prompts or prompt structures
  3. Evaluating different hyperparameters (e.g., temperature, top_p)
  4. Assessing the impact of different system messages or context settings

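Grid search automates running your tests across multiple parameter combinations, helping you identify the best configuration for your AI-powered application. In the example below, grid_search_params lists the values to try for each parameter, and the test function calls grid_search_ctx() to read the values for the combination currently being run.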

from dataclasses import dataclass

from autoblocks.testing.models import BaseTestCase
from autoblocks.testing.run import grid_search_ctx
from autoblocks.testing.run import run_test_suite
from autoblocks.testing.util import md5

@dataclass
class TestCase(BaseTestCase):
    input: str
    expected_substrings: list[str]

    def hash(self) -> str:
        return md5(self.input)  # Unique identifier for a test case


def test_fn(test_case: TestCase) -> str:
    # Values for the grid search combination currently being run
    ctx = grid_search_ctx()
    # Replace with your LLM call
    return f"{ctx.get('model')}: {test_case.input}"

run_test_suite(
    id="my-test-suite",
    test_cases=[
        TestCase(
            input="hello world",
            expected_substrings=["hello", "world"],
        )
    ], # Replace with your test cases
    fn=test_fn,
    evaluators=[], # Replace with your evaluators
    grid_search_params=dict(
        model=["gpt-4o-mini", "gpt-4o"]
    )
)
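
The example above varies a single parameter, so it produces one run per model. With several parameters, a grid search conventionally covers every pairing of their values (the Cartesian product). The sketch below is plain Python, not the SDK's implementation. It only illustrates how many combinations a grid implies, assuming every value is paired with every other. The temperature parameter and the expand_grid helper are hypothetical names introduced for illustration.

```python
from itertools import product

# Hypothetical grid: two parameters with two values each.
grid_search_params = {
    "model": ["gpt-4o-mini", "gpt-4o"],
    "temperature": [0.0, 0.7],
}


def expand_grid(params):
    """Return one dict per combination (Cartesian product) of the parameter values."""
    keys = list(params)
    value_lists = [params[key] for key in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


combinations = expand_grid(grid_search_params)
for combo in combinations:
    print(combo)
```

Because the number of combinations is the product of the list lengths, adding a parameter with three values triples the number of runs. It is worth keeping each list short when the test function makes paid LLM calls.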