vai similarity

Compute cosine similarity between texts by embedding them and comparing their vectors.

Synopsis

vai similarity [textA] [textB] [options]
vai similarity [textA] --against <text1> <text2> ... [options]

Description

vai similarity embeds two or more texts and computes cosine similarity between them. It supports two modes:

Two-text comparison: Compare exactly two texts and get a single similarity score.
One-vs-many: Compare one text against multiple texts using --against, with results sorted by similarity (descending).

All texts are embedded in a single API call for efficiency.
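
As a rough illustration of what happens after embedding, the sketch below computes cosine similarity and ranks candidates in descending order, as in one-vs-many mode. It is not the tool's actual implementation: the function names and the three-dimensional toy vectors are made up for illustration, and real embeddings have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_against(query_vec, candidates):
    """Score each (label, vector) pair against query_vec, highest score first."""
    scored = [(label, cosine_similarity(query_vec, vec)) for label, vec in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0, 0.0]
candidates = [
    ("far", [0.0, 1.0, 0.0]),
    ("near", [0.9, 0.1, 0.0]),
    ("middle", [0.5, 0.5, 0.0]),
]

for label, score in rank_against(query, candidates):
    print(f"{score:.6f}  {label}")
```

The candidates print in the order near, middle, far, because the score depends only on the angle between vectors, not on their lengths.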

Options

Flag	Description	Default
`--against <texts...>`	Compare first text against multiple texts	—
`--file1 <path>`	Read text A from a file	—
`--file2 <path>`	Read text B from a file	—
`-m, --model <model>`	Embedding model	`voyage-4-large`
`--dimensions <n>`	Output dimensions	Model default
`--json`	Machine-readable JSON output	—
`-q, --quiet`	Suppress non-essential output (score only)	—

Examples

Compare two texts

vai similarity "king" "queen"

Compare one text against many

vai similarity "database" --against "MongoDB is a NoSQL database" "Python is a programming language" "Vector search finds similar documents"

Compare files

vai similarity --file1 document-a.txt --file2 document-b.txt

Get just the score

vai similarity "cat" "dog" --quiet
# Output: 0.847293
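
For scripting from Python, a minimal sketch along these lines may be useful. It assumes that --quiet prints only the numeric score to standard output, as the example above suggests; the helper names and the 0.9 threshold are hypothetical and not part of vai.

```python
import subprocess

def parse_score(stdout_text):
    """Convert score-only output such as '0.847293' to a float."""
    return float(stdout_text.strip())

def vai_similarity(text_a, text_b):
    """Run `vai similarity <a> <b> --quiet` and return the score.

    Assumes vai is on PATH and that --quiet prints only the score.
    """
    result = subprocess.run(
        ["vai", "similarity", text_a, text_b, "--quiet"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if vai exits with an error
    )
    return parse_score(result.stdout)

def is_near_duplicate(score, threshold=0.9):
    """Hypothetical rule: treat scores at or above threshold as duplicates."""
    return score >= threshold

# Example usage (requires vai to be installed and configured):
#   score = vai_similarity("cat", "dog")
#   print(score, is_near_duplicate(score))
```

A suitable threshold depends on the model and the data, so 0.9 is only a starting point to tune.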

JSON output for scripting

vai similarity "hello" "world" --json

Using the Similarity Tab

The Similarity tab in vai playground lets you visually compare texts and see how closely related they are in embedding space.

Getting Started

Run vai playground to start the web app
Select the Similarity tab from the navigation
Enter two texts in the input fields
Click Compare to compute the cosine similarity score

Features

Side-by-side input: Enter both texts in adjacent fields for easy comparison.

Similarity score: Displays the cosine similarity as a value between 0.0 and 1.0, with a visual indicator of how similar the texts are. Higher scores mean the texts are more semantically similar.

Model selection: Choose which embedding model to use for the comparison.

Dimensions: Adjust output dimensions to see how dimensionality affects similarity scores.

Use Cases

Testing how well a search query matches candidate documents
Checking if two pieces of content are semantically similar (deduplication)
Exploring how the embedding model understands different phrasings

Output

In two-text mode, the command outputs a single cosine similarity score, typically between 0.0 and 1.0. In one-vs-many mode, it lists each comparison text with its score, sorted by similarity in descending order.

Tips

Cosine similarity ranges from -1 to 1, but scores for text embeddings typically fall between 0 and 1. Higher means more similar.
Use --quiet (CLI) to get just the numeric score for scripting.
No --input-type is set, because the command compares texts directly rather than performing asymmetric (query-to-document) retrieval.
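
The full range of cosine similarity can be checked with a small sketch. The two-dimensional vectors are hand-picked for illustration and are not real embeddings, and the helper function is not part of vai.

```python
import math

def cosine_similarity(a, b):
    """dot(a, b) divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

same = cosine_similarity([3.0, 4.0], [3.0, 4.0])          # same direction
orthogonal = cosine_similarity([3.0, 4.0], [4.0, -3.0])   # perpendicular
opposite = cosine_similarity([3.0, 4.0], [-3.0, -4.0])    # opposite direction

print(same, orthogonal, opposite)
```

The three cases give the maximum (1), the midpoint (0), and the minimum (-1). Scaling a vector to unit length does not change these values, so negative scores remain mathematically possible even when they are rare for text embeddings.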

Related Commands

vai embed: Generate raw embeddings
vai search: Similarity search against a MongoDB collection
