# Evaluation Guide
vai's evaluation tools help you measure retrieval quality, compare configurations, and track improvements over time.
## Quick Start
```shell
# Run evaluation against a test set
vai eval --test-set queries.json --db myapp --collection docs

# Save results for comparison
vai eval --test-set queries.json --rerank --output results-v1.json

# Compare configurations
vai eval compare results-v1.json results-v2.json
```
## Metrics
| Metric | What It Measures | Range |
|---|---|---|
| MRR | Mean reciprocal rank of the first relevant result | 0–1 |
| nDCG@K | How well relevant results are ranked within the top K | 0–1 |
| Recall@K | Fraction of all relevant docs found in the top K | 0–1 |
| MAP | Mean average precision across queries | 0–1 |
| Precision@K | Fraction of the top K that are relevant | 0–1 |
Higher is better for all metrics.
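The metrics above can be computed in a few lines. Below is a minimal illustration with binary relevance judgments, not vai's internal implementation; the document IDs are made up for the example:

```python
import math

def mrr(ranked, relevant):
    """Reciprocal rank of the first relevant result (0 if none appears)."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant docs that appear in the top k."""
    return sum(1 for d in ranked[:k] if d in relevant) / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Normalized discounted cumulative gain, binary relevance."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, d in enumerate(ranked[:k], start=1) if d in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

# Retrieval output (best first) and ground-truth relevant docs
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(mrr(ranked, relevant))                 # 0.5 (first hit at rank 2)
print(precision_at_k(ranked, relevant, 4))   # 0.5
print(recall_at_k(ranked, relevant, 4))      # 1.0
```

Note that per-query values like these are averaged over every query in the test set to produce the reported scores.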
## Workflow
1. Create a test set with queries and known relevant documents
2. Run a baseline evaluation and save the results
3. Change the configuration (model, chunk size, reranking, etc.)
4. Re-evaluate and compare against the baseline
5. Iterate until quality meets your requirements
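For the comparison step, you can also diff two saved result files yourself. The sketch below assumes each `--output` file is a flat JSON object mapping metric names to scores; that schema is an assumption for illustration, not vai's documented format:

```python
import json

def compare_results(baseline_path, candidate_path):
    """Return per-metric score deltas (candidate minus baseline).

    Assumes each file is a flat JSON object of metric name -> score,
    e.g. {"mrr": 0.62, "ndcg@10": 0.71} -- adjust to vai's real output.
    """
    with open(baseline_path) as f:
        base = json.load(f)
    with open(candidate_path) as f:
        cand = json.load(f)
    # Only compare metrics present in both runs
    return {metric: round(cand[metric] - base[metric], 4)
            for metric in sorted(base.keys() & cand.keys())}

# Positive deltas mean the candidate configuration improved that metric:
# compare_results("results-v1.json", "results-v2.json")
```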
## Next Steps
- Test Sets — Creating evaluation test sets
- Comparing Configs — Side-by-side comparison