vai store

Embed text and store it directly in a MongoDB Atlas collection. Supports single documents and batch mode via JSONL files.

Synopsis

vai store --db <database> --collection <name> [options]

Description

vai store takes text (from --text, --file, or stdin), generates an embedding via the Voyage AI API, and inserts the document into MongoDB Atlas. Each stored document includes the text, embedding vector, model name, dimensions, and a createdAt timestamp.

For batch storage, pass a .jsonl file in which each line is a JSON object with a "text" field and an optional "metadata" object.

Options

| Flag | Description | Default |
| --- | --- | --- |
| --db <database> | Database name (required) | |
| --collection <name> | Collection name (required) | |
| --field <name> | Embedding field name | embedding |
| --text <text> | Text to embed and store | |
| -f, --file <path> | File to embed (text file, or .jsonl for batch) | |
| -m, --model <model> | Embedding model | voyage-4-large |
| --input-type <type> | Input type: query or document | document |
| -d, --dimensions <n> | Output dimensions | Model default |
| --output-dtype <type> | Output data type: float, int8, uint8, binary, ubinary | float |
| --metadata <json> | Additional metadata as JSON string | |
| --json | Machine-readable JSON output | |
| -q, --quiet | Suppress non-essential output | |

Examples

Store a single document

vai store --text "MongoDB Atlas provides vector search" --db myapp --collection docs

Store from a file with metadata

vai store --file article.txt --db myapp --collection docs \
--metadata '{"source": "blog", "author": "Jane"}'
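When the metadata comes from a script rather than being typed by hand, quoting the JSON string correctly for the shell can be fiddly. The following is a minimal Python sketch that assembles the same command as an argument list. The helper name build_store_command is hypothetical and not part of vai; it uses only flags listed in the options table above.

```python
import json
import shlex


def build_store_command(db, collection, *, text=None, file=None, metadata=None):
    """Return an argv list for `vai store` (hypothetical helper, not part of vai)."""
    if text is not None and file is not None:
        raise ValueError("pass either text or file, not both")
    argv = ["vai", "store", "--db", db, "--collection", collection]
    if text is not None:
        argv += ["--text", text]
    elif file is not None:
        argv += ["--file", file]
    # With neither text nor file, vai store is documented to read from stdin.
    if metadata is not None:
        # json.dumps takes care of quoting and escaping inside the JSON string.
        argv += ["--metadata", json.dumps(metadata)]
    return argv


argv = build_store_command(
    "myapp",
    "docs",
    file="article.txt",
    metadata={"source": "blog", "author": "Jane"},
)
# shlex.join quotes each argument for a POSIX shell, ready to paste or log.
print(shlex.join(argv))
```

Passing the list to subprocess.run(argv, check=True) avoids shell quoting altogether, assuming vai is on the PATH.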

Batch store from JSONL

vai store --file documents.jsonl --db myapp --collection docs

The JSONL file format:

{"text": "First document content", "metadata": {"source": "file1.md"}}
{"text": "Second document content", "metadata": {"source": "file2.md"}}

Store with a lighter model

vai store --text "quick test" --db test --collection embeddings --model voyage-4-lite

Tips

  • For bulk imports with progress tracking and error handling, use vai ingest instead.
  • The --metadata JSON is merged into the top-level document. Use it for tags, sources, or any other fields you want to filter on later.
  • Batch mode (.jsonl) embeds all texts in a single API call, which is more efficient than storing one at a time.
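To make the metadata tip concrete, here is a rough Python sketch of what a stored document might look like once metadata is merged in. It is an illustration under assumptions, not the tool's implementation: the key names text, model, and dimensions are guesses (only the embedding field name and createdAt are named on this page), and the rule that core fields win on a key collision is an arbitrary choice, since the page does not say which side takes precedence.

```python
from datetime import datetime, timezone


def build_document(text, embedding, model, metadata=None, field="embedding"):
    """Sketch of a stored document (hypothetical helper; key names partly assumed)."""
    core = {
        "text": text,                  # assumed key name
        field: embedding,              # --field sets this key (default: embedding)
        "model": model,                # assumed key name
        "dimensions": len(embedding),  # assumed key name
        "createdAt": datetime.now(timezone.utc),
    }
    # Assumption: core fields win on a collision. To stay safe either way,
    # avoid metadata keys that reuse a core field name.
    return {**(metadata or {}), **core}


doc = build_document(
    "MongoDB Atlas provides vector search",
    [0.12, -0.03, 0.48],  # placeholder vector, not real model output
    "voyage-4-large",
    metadata={"source": "blog", "author": "Jane"},
)
print(sorted(doc))
```

Because source and author sit at the top level rather than under a nested metadata key, a filter such as {"source": "blog"} would match this document directly.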