
Voyage 4 Family

The Voyage 4 family is Voyage AI's latest generation of embedding models, released in January 2026. All models share the same embedding space and support flexible output dimensions. As of vai v1.31.0, the family also has a first-class local path through voyage-4-nano.

Model Comparison

| Model | Architecture | RTEB Score | Price/1M tokens | Availability | Best For |
| --- | --- | --- | --- | --- | --- |
| voyage-4-large | MoE | 71.41 | $0.12 | Voyage AI API | Best quality, production search |
| voyage-4 | Dense | 70.07 | $0.06 | Voyage AI API | Balanced quality and cost |
| voyage-4-lite | Dense | 68.10 | $0.02 | Voyage AI API | High-volume, cost-sensitive |
| voyage-4-nano | Dense | TBD | Free (open-weight) | Local inference in vai | Local development, zero-cost experimentation, offline-friendly workflows |
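
To make the price column concrete, here is a minimal sketch of the arithmetic in Python. The prices are copied from the table; the corpus size and the function name `embedding_cost_usd` are hypothetical, and the calculation assumes plain list pricing with no free tier or volume discount.

```python
# List prices in USD per 1M tokens, copied from the comparison table.
PRICE_PER_MILLION_USD = {
    "voyage-4-large": 0.12,
    "voyage-4": 0.06,
    "voyage-4-lite": 0.02,
    "voyage-4-nano": 0.00,  # open-weight, runs locally
}


def embedding_cost_usd(model: str, tokens: int) -> float:
    """Return the list-price cost of embedding `tokens` tokens with `model`."""
    return PRICE_PER_MILLION_USD[model] * tokens / 1_000_000


if __name__ == "__main__":
    corpus_tokens = 500_000_000  # hypothetical corpus size, for illustration only
    for model in PRICE_PER_MILLION_USD:
        cost = embedding_cost_usd(model, corpus_tokens)
        print(f"{model}: ${cost:.2f}")
```

For a 500M-token corpus this works out to roughly $60 with voyage-4-large, $30 with voyage-4, and $10 with voyage-4-lite.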

Key Features

Shared Embedding Space

All Voyage 4 models produce vectors in the same semantic space. You can:

  • Embed documents with voyage-4-lite and query with voyage-4-large
  • Index locally with voyage-4-nano, then query the same collection later with an API-backed Voyage 4 model
  • Mix models across different parts of your pipeline
  • Upgrade models without re-embedding your entire corpus (within the family)
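
The practical consequence is that a query vector from one model can be scored directly against document vectors from another. The sketch below illustrates this with cosine similarity in plain Python. The 4-dimensional vectors and document ids are made-up stand-ins for real embeddings, and `cosine` and `rank` are hypothetical helper names, not part of vai or the Voyage AI API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return document ids sorted by descending similarity to the query."""
    return sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )


# Toy vectors: imagine the documents were embedded with voyage-4-lite ...
doc_vecs = {
    "intro-to-vector-search": [0.9, 0.1, 0.0, 0.1],
    "billing-faq": [0.0, 0.2, 0.9, 0.1],
}
# ... and the query with voyage-4-large. Because the space is shared,
# the two sets of vectors can be compared without any conversion step.
query_vec = [0.8, 0.2, 0.1, 0.0]

print(rank(query_vec, doc_vecs))
```

The only requirement in this sketch is that both sides use the same output dimension.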

Local Inference with Nano

voyage-4-nano is the open-weight member of the Voyage 4 family. In vai, it runs locally through a lightweight Python bridge that manages the model environment and inference process while keeping the CLI experience intact.

This gives you a new onboarding path:

  • Install vai
  • Run vai nano setup
  • Embed locally with vai embed --local
  • Move to API-backed models later when you need hosted scale or production throughput

Unlike the API-backed models, nano does not require a Voyage API key for local embedding. It is the fastest way to try vai end to end.

Flexible Dimensions (Matryoshka)

The Voyage 4 family supports four output dimensions: 256, 512, 1024 (default), and 2048.

Matryoshka representation learning means embeddings are structured so that the first N dimensions carry the most information. You can truncate to fewer dimensions without retraining, trading some accuracy for smaller storage and faster search.
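
Client-side truncation can be sketched as follows: keep the first N components, then re-normalize so cosine and dot-product scores stay comparable. This is the generic Matryoshka technique, not a description of what vai or the API does internally; `truncate_embedding` and `storage_bytes` are hypothetical helpers, and the storage estimate assumes 4-byte float32 components.

```python
import math

SUPPORTED_DIMS = (256, 512, 1024, 2048)  # dimensions listed for the Voyage 4 family


def truncate_embedding(vec, dims):
    """Keep the first `dims` components and rescale to unit length."""
    if dims not in SUPPORTED_DIMS:
        raise ValueError(f"dims must be one of {SUPPORTED_DIMS}")
    if dims > len(vec):
        raise ValueError("cannot truncate to more dimensions than the vector has")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # all-zero vector: nothing to rescale
    return [x / norm for x in head]


def storage_bytes(num_vectors, dims, bytes_per_component=4):
    """Raw vector storage, assuming float32 components (an assumption)."""
    return num_vectors * dims * bytes_per_component


if __name__ == "__main__":
    full = [1.0 / (i + 1) for i in range(1024)]  # stand-in for a real embedding
    small = truncate_embedding(full, 256)
    print(len(small))
    print(storage_bytes(1_000_000, 1024), storage_bytes(1_000_000, 256))
```

Under the float32 assumption, dropping from 1024 to 256 dimensions cuts raw vector storage by a factor of four.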

32K Context Window

All Voyage 4 models accept up to 32K tokens of input, so most long documents can be embedded without truncation.

Architecture Details

voyage-4-large uses a Mixture of Experts (MoE) architecture with multiple specialized sub-networks. A gating mechanism routes inputs to the most relevant experts, giving the model broad expertise without proportionally increasing compute cost.
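
The routing idea can be illustrated with a toy top-k gate in plain Python. This is a generic textbook-style sketch, not voyage-4-large's actual design: the number of experts, the value of `top_k`, the gate weights, and the "experts" themselves (simple scaling functions) are all invented for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route `x` to the `top_k` highest-scoring experts and mix their outputs."""
    # One gate logit per expert: dot product of the input with that expert's gate row.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    # Pick the top_k experts; the rest are never evaluated, which is the compute saving.
    selected = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    weights = softmax([logits[i] for i in selected])
    out = [0.0] * len(x)
    for w, i in zip(weights, selected):
        y = experts[i](x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, selected


# Four toy experts that just scale the input by different factors.
experts = [lambda v, s=s: [s * vi for vi in v] for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[2.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]

output, selected = moe_forward([1.0, 0.0], gate_weights, experts, top_k=2)
print(selected, output)
```

Only two of the four experts run for this input, which is why total parameter count can grow faster than per-input compute.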

voyage-4, voyage-4-lite, and voyage-4-nano use traditional dense architectures where all parameters are activated for every input.

RTEB Benchmark Context

The RTEB (Retrieval Embedding Benchmark) scores represent NDCG@10 averaged across 29 retrieval datasets. For context:

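| Model | Score | vs. voyage-4-large |
| --- | --- | --- |
| voyage-4-large | 71.41 | (baseline) |
| voyage-4 | 70.07 | -1.34 |
| Gemini Embedding 001 | 68.66 | -2.75 |
| voyage-4-lite | 68.10 | -3.31 |
| Cohere Embed v4 | 65.75 | -5.66 |
| OpenAI v3 Large | 62.57 | -8.84 |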
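
For readers unfamiliar with the metric, here is a minimal sketch of NDCG@10 for a single query in Python. It uses the linear-gain form of DCG and assumes the input list holds relevance labels for every judged document, in the order the system ranked them; the source does not say which exact variant RTEB uses, and the helper names are hypothetical. The table's scores look like this value averaged over queries and datasets, then expressed on a 0 to 100 scale.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_relevances, k=10):
    """DCG of the actual ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents for this query
    return dcg_at_k(ranked_relevances, k) / ideal


def mean(values):
    """Plain average, e.g. over per-dataset NDCG@10 scores."""
    values = list(values)
    return sum(values) / len(values)


if __name__ == "__main__":
    # 1 = relevant, 0 = not relevant, in the order the system returned them.
    print(ndcg_at_k([1, 0, 0]))  # relevant document ranked first
    print(ndcg_at_k([0, 1, 0]))  # relevant document ranked second
```

A perfect ranking scores 1.0, and pushing the only relevant document from first to second place lowers the score to 1/log2(3), about 0.63.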

Try It

# Set up local inference
vai nano setup

# Embed locally with nano
vai embed "What is vector search?" --local

# Build a local-first pipeline
vai pipeline ./docs/ --local --db myapp --collection knowledge --create-index

# Compare API-backed costs across the family
vai estimate --doc-model voyage-4-lite --query-model voyage-4-large