vai similarity

Compute cosine similarity between texts by embedding them and comparing their vectors.

Synopsis

vai similarity [textA] [textB] [options]
vai similarity [textA] --against <text1> <text2> ... [options]

Description

vai similarity embeds two or more texts and computes cosine similarity between them. It supports two modes:

  • Two-text comparison: Compare exactly two texts and get a single similarity score.
  • One-vs-many: Compare one text against multiple texts using --against, with results sorted by similarity (descending).

All texts are embedded in a single API call for efficiency.
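As a rough illustration of the comparison step, the sketch below computes cosine similarity in plain Python and ranks several candidates against one query, highest score first. It is not vai's implementation: the function names `cosine_similarity` and `rank_against` are hypothetical, and the three-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_against(query_vec, candidates):
    """Score each (text, vector) pair against query_vec, highest score first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (invented for illustration).
query = [1.0, 0.0, 0.0]
candidates = [
    ("unrelated", [0.0, 1.0, 0.0]),
    ("close", [0.9, 0.1, 0.0]),
    ("identical direction", [2.0, 0.0, 0.0]),
]

ranked = rank_against(query, candidates)
for text, score in ranked:
    print(f"{score:.4f}  {text}")
```

Two-text mode corresponds to a single `cosine_similarity` call; one-vs-many mode corresponds to `rank_against`, which mirrors the descending sort described above.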

Options

Flag                    Description                                   Default
--against <texts...>    Compare first text against multiple texts
--file1 <path>          Read text A from a file
--file2 <path>          Read text B from a file
-m, --model <model>     Embedding model                               voyage-4-large
--dimensions <n>        Output dimensions                             Model default
--json                  Machine-readable JSON output
-q, --quiet             Suppress non-essential output (score only)

Examples

Compare two texts

vai similarity "king" "queen"

Compare one text against many

vai similarity "database" --against "MongoDB is a NoSQL database" "Python is a programming language" "Vector search finds similar documents"

Compare files

vai similarity --file1 document-a.txt --file2 document-b.txt

Get just the score

vai similarity "cat" "dog" --quiet
# Output: 0.847293
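
For scripting from Python, a wrapper along these lines may be enough. It is a minimal sketch that assumes `vai` is on the PATH and that `--quiet` prints only the numeric score to standard output, as in the example above. The helper names `similarity_score` and `is_similar`, the injectable `run` parameter, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import subprocess


def similarity_score(text_a, text_b, run=subprocess.run):
    """Run `vai similarity <a> <b> --quiet` and return the printed score as a float.

    `run` is injectable so the parsing can be exercised without calling the CLI.
    """
    result = run(
        ["vai", "similarity", text_a, text_b, "--quiet"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return float(result.stdout.strip())


def is_similar(score, threshold=0.8):
    """Return True when the score meets an (arbitrary, illustrative) threshold."""
    return score >= threshold
```

Calling `similarity_score("cat", "dog")` would run the same command as the example above and hand back the score as a number that `is_similar` can test.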

JSON output for scripting

vai similarity "hello" "world" --json

Output

In two-text mode, the command prints a single cosine similarity score, typically between 0.0 and 1.0. In one-vs-many mode, it lists each comparison text with its score, sorted by similarity in descending order.
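
Mathematically, cosine similarity is bounded by -1 and 1. The sketch below checks the three boundary cases with hand-picked two-dimensional vectors, which are illustrative only and not real embeddings; the helper name `cosine_similarity` is likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# Same direction, different length: score is approximately 1.
same = cosine_similarity([1.0, 2.0], [2.0, 4.0])

# Perpendicular vectors: score is 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])

# Opposite direction: score is approximately -1.
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])

print(same, orthogonal, opposite)
```

Because only the angle matters, scaling a vector does not change its score, which is why the first pair scores the same as two identical vectors would.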

Tips

  • Cosine similarity ranges from -1 to 1, but scores for text embeddings typically fall between 0 and 1. Higher means more similar.
  • Use --quiet to get just the numeric score for scripting.
  • No --input-type is set since you're comparing texts directly, not doing asymmetric retrieval.

Related commands

  • vai embed: Generate raw embeddings
  • vai search: Similarity search against a MongoDB collection