vai eval compare

Compare evaluation results from multiple vai eval runs side by side, highlighting improvements and regressions.

Synopsis

vai eval compare <files...> [options]

Description

vai eval compare loads multiple evaluation result files (created by vai eval --output) and presents them in a comparison table. Each metric shows the value for each configuration, making it easy to see which model/settings combination performs best.

Options

Flag         Description                                       Default
<files...>   Two or more evaluation result files (required)
--json       Machine-readable JSON output

Examples

Compare two configurations

vai eval compare results-lite.json results-large.json

Compare with and without reranking

vai eval compare no-rerank.json with-rerank.json

Tips

  • Label your result files clearly (e.g., voyage-4-lite-no-rerank.json) so comparisons are easy to read.
  • Use this in CI to detect quality regressions when changing models or chunk settings.
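One way to wire the CI tip into a pipeline is a small Python wrapper, sketched below. It relies only on the documented `<files...>` argument and `--json` flag. Everything else is an assumption for illustration: the helper names `build_compare_command`, `run_compare`, and `find_regressions` are hypothetical, the shape of the JSON output is not documented here, and the metric names and the "higher is better" rule are placeholders you would need to adapt to your own results.

```python
import json
import subprocess


def build_compare_command(files):
    """Return the argv for `vai eval compare <files...> --json`."""
    if len(files) < 2:
        raise ValueError("vai eval compare needs two or more result files")
    return ["vai", "eval", "compare", *files, "--json"]


def run_compare(files):
    """Run the comparison and parse its JSON output.

    The structure of the returned object is NOT documented on this page;
    inspect it before relying on any particular key.
    """
    completed = subprocess.run(
        build_compare_command(files),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return json.loads(completed.stdout)


def find_regressions(baseline, candidate, tolerance=0.01):
    """Return {metric: (baseline_value, candidate_value)} for dropped metrics.

    Assumption: both arguments are flat {metric_name: float} mappings that
    you extracted yourself, and a higher value is better for every metric.
    """
    regressions = {}
    for name, base_value in baseline.items():
        if name not in candidate:
            continue  # metric missing from the candidate run; skip it
        new_value = candidate[name]
        if base_value - new_value > tolerance:
            regressions[name] = (base_value, new_value)
    return regressions
```

In a CI job, you would call `run_compare` on the baseline and candidate files, pull the metric values out of the parsed output into two flat mappings, and fail the build when `find_regressions` returns a non-empty result. The `tolerance` default of 0.01 is arbitrary; choose a threshold that matches the noise level of your evaluation set.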