Analyze Results
Learn how to analyze and understand your simulation results
Understanding Results Structure
Executions vs Test Runs
- Executions: Individual simulation runs with their specific results
- Test Runs: Groups of executions bundled together for comparison over time
Grouping executions into test runs lets you:
- Compare performance across different scenarios
- Track improvements between iterations
- Analyze patterns across multiple runs
Reviewing Executions
Execution Details
Each execution provides detailed information about:
- Timestamp of the run
- Duration of the conversation
- Tokens used
- Input/Output pairs
- Pass/Fail status
- Evaluation results
Transcript Review
Review conversations in detail with:
- Complete conversation transcript
- Audio playback of the interaction
- Turn-by-turn message analysis
Performance Metrics
Track important metrics including:
- Response times
- Token usage
- Success rates
- Evaluation results
- Overall pass rates
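As a minimal sketch of how these metrics relate to individual executions, the following computes a pass rate, mean duration, and mean token usage from a list of records. The `Execution` class and its field names are hypothetical and chosen for illustration; they are not the platform's data model.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Execution:
    """Hypothetical execution record; field names are illustrative only."""
    duration_ms: float
    tokens: int
    passed: bool


def summarize(executions):
    """Return the pass rate, mean duration, and mean token usage."""
    if not executions:
        # Avoid dividing by zero when there is nothing to summarize.
        return {"pass_rate": 0.0, "mean_duration_ms": 0.0, "mean_tokens": 0.0}
    return {
        "pass_rate": sum(e.passed for e in executions) / len(executions),
        "mean_duration_ms": mean(e.duration_ms for e in executions),
        "mean_tokens": mean(e.tokens for e in executions),
    }


# Made-up sample data for illustration.
runs = [
    Execution(120, 800, True),
    Execution(450, 1500, False),
    Execution(90, 600, True),
    Execution(300, 1100, True),
]
summary = summarize(runs)
print(summary)
```

Averages can hide outliers, so it is usually worth looking at the slowest and most token-heavy executions individually as well.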
Advanced Search
Search Syntax
Use search operators to find specific executions:
Content Search:
- output:text finds executions where the output contains “text”
- input:text finds executions where the input contains “text”
Metric Filters:
- duration>100 finds executions longer than 100 ms
- duration<500 finds executions shorter than 500 ms
- tokens>1000 finds executions using more than 1000 tokens
Combining Searches:
- Use AND to combine conditions: duration>100 AND output:pass
- Use OR for alternatives: output:hello OR output:hi
- Use parentheses for complex queries: (duration>100 AND output:pass) OR input:complete
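To illustrate how these operators compose, here is a small Python sketch that parses a query into a filter function. It is not the product's search implementation. It assumes execution records are plain dictionaries with hypothetical keys `input`, `output`, `duration` (in ms), and `tokens`. It also assumes that AND binds tighter than OR, that text matching is a case-sensitive substring test, and that search text contains no spaces; none of these are confirmed by the syntax description above.

```python
import re

TOKEN = re.compile(r"\(|\)|[^\s()]+")
CONDITION = re.compile(
    r"^(input|output):(.+)$|^(duration|tokens)([<>])(\d+(?:\.\d+)?)$"
)


def both(a, b):
    return lambda e: a(e) and b(e)


def either(a, b):
    return lambda e: a(e) or b(e)


def parse(query):
    """Parse a query string into a predicate over an execution dict."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        left = parse_and()
        while peek() == "OR":
            advance()
            left = either(left, parse_and())
        return left

    def parse_and():
        # Assumption: AND has higher precedence than OR.
        left = parse_atom()
        while peek() == "AND":
            advance()
            left = both(left, parse_atom())
        return left

    def parse_atom():
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of query")
        advance()
        if tok == "(":
            inner = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            advance()
            return inner
        m = CONDITION.match(tok)
        if not m:
            raise ValueError(f"unrecognized condition: {tok!r}")
        if m.group(1):  # content search, e.g. output:text
            field, text = m.group(1), m.group(2)
            return lambda e: text in e[field]
        field, op, number = m.group(3), m.group(4), float(m.group(5))
        if op == ">":
            return lambda e: e[field] > number
        return lambda e: e[field] < number

    predicate = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return predicate


# Made-up sample records for illustration.
executions = [
    {"input": "book a table", "output": "pass: booked", "duration": 150, "tokens": 900},
    {"input": "complete the order", "output": "fail", "duration": 50, "tokens": 1200},
    {"input": "say hello", "output": "hi there", "duration": 80, "tokens": 300},
]
match = parse("(duration>100 AND output:pass) OR input:complete")
matched = [e["input"] for e in executions if match(e)]
print(matched)
```

Because operator precedence is an assumption here, using parentheses in any query that mixes AND and OR keeps the intent unambiguous.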
Visualization Tools
Timeline View
- Visual representation of execution timing
- Identify patterns in response times
- Spot anomalies or performance issues
- Track conversation flow
Performance Graphs
- Success rate trends
- Duration distribution
- Token usage patterns
- Data capture accuracy over time
Comparison Tools
Compare executions across:
- Different personas
- Time periods
- Edge cases
- Data field variations
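A comparison along one of these dimensions amounts to grouping executions by an attribute and computing a metric per group. The sketch below does this for pass rate. The record shape (dictionaries with `persona` and `passed` keys) is a hypothetical stand-in, not the platform's schema.

```python
from collections import defaultdict


def pass_rate_by(executions, key):
    """Group executions by `key` and return the pass rate of each group."""
    counts = defaultdict(lambda: [0, 0])  # group value -> [passed, total]
    for execution in executions:
        bucket = counts[execution[key]]
        if execution["passed"]:
            bucket[0] += 1
        bucket[1] += 1
    return {group: passed / total for group, (passed, total) in counts.items()}


# Made-up sample records for illustration.
executions = [
    {"persona": "impatient", "passed": True},
    {"persona": "impatient", "passed": False},
    {"persona": "polite", "passed": True},
    {"persona": "polite", "passed": True},
]
rates = pass_rate_by(executions, "persona")
print(rates)
```

The same function works for any other grouping attribute present on the records, such as a time bucket or a scenario label.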
Best Practices
Analysis Workflow
- Review Overall Metrics
  - Check success rates
  - Analyze duration patterns
  - Review token usage
- Deep Dive into Failures
  - Examine failed executions
  - Review error patterns
  - Identify common issues
- Compare Across Runs
  - Track improvements
  - Identify regressions
  - Analyze pattern changes
- Document Findings
  - Note successful strategies
  - Document areas for improvement
  - Track action items
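The "Compare Across Runs" step can be sketched as a comparison of pass/fail status per scenario between two test runs. The representation used here (a dictionary mapping a scenario name to a boolean) and the function name are assumptions made for illustration.

```python
def compare_runs(baseline, candidate):
    """Classify shared scenarios by how their pass/fail status changed.

    Each argument maps a scenario name to True (pass) or False (fail).
    Scenarios present in only one run are ignored.
    """
    shared = sorted(baseline.keys() & candidate.keys())
    return {
        "improved": [s for s in shared if not baseline[s] and candidate[s]],
        "regressed": [s for s in shared if baseline[s] and not candidate[s]],
        "unchanged": [s for s in shared if baseline[s] == candidate[s]],
    }


# Made-up sample data for illustration.
previous_run = {"refund request": False, "order status": True, "cancel order": True}
latest_run = {"refund request": True, "order status": False, "cancel order": True}
changes = compare_runs(previous_run, latest_run)
print(changes)
```

Listing regressions separately from improvements keeps a net-positive pass rate from masking scenarios that newly fail.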
Tips for Effective Analysis
- Start with high-level metrics
- Use search to find specific patterns
- Compare similar scenarios
- Track improvements over time
- Document unusual cases
- Share insights with team