5-Minute RAG Pipeline
This guide takes you from a folder of documents to a fully searchable vector database with two-stage retrieval. You'll use four vai commands.
Prerequisites
- vai installed (`npm install -g voyageai-cli`)
- MongoDB Atlas cluster (free tier)
- Python 3.9+ if you want to use local `voyage-4-nano` inference
- Voyage AI API key (get one free) if you want API-backed embedding and reranking
If you want a zero-API-key ingestion path, run `vai nano setup` first and add `--local` to your `vai pipeline` command.
Step 1: Set Credentials
export VOYAGE_API_KEY="your-voyage-ai-key"
export MONGODB_URI="mongodb+srv://user:pass@cluster.mongodb.net/"
Step 2: Initialize the Project
vai init --yes
This creates `.vai.json` with defaults: the `voyage-4-large` model and recursive chunking at 512 characters with 50-character overlap.
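Conceptually, chunking with overlap works like this sliding-window sketch. It is character-based and simplified; vai's recursive strategy additionally prefers natural boundaries such as paragraphs, and the function name here is illustrative, not part of vai:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    A simplified sketch of sliding-window chunking; a recursive
    strategy would also prefer paragraph and sentence boundaries.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # each window starts this far after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final window reached the end of the text
    return chunks
```

With the 512/50 defaults, the last 50 characters of each chunk reappear at the start of the next one, so no sentence is cut without context on at least one side.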
Step 3: Run the Pipeline
Point vai at a directory of documents:
vai pipeline ./docs/ --db myapp --collection knowledge --create-index
Or use local nano inference for the embedding step:
vai pipeline ./docs/ --local --db myapp --collection knowledge --create-index
This single command:
- Reads all supported files (`.txt`, `.md`, `.html`, `.json`, `.jsonl`, `.pdf`)
- Chunks each file using the recursive strategy
- Embeds chunks in batches with `voyage-4-large`, or locally with `voyage-4-nano` when `--local` is enabled
- Stores vectors in MongoDB Atlas with source metadata
- Creates a vector search index (with `--create-index`)
You'll see a progress bar as documents are processed.
Step 4: Search with Two-Stage Retrieval
vai query "How do I configure replica sets?" --db myapp --collection knowledge
This performs two-stage retrieval:
- Embed your query with Voyage AI
- Vector search against MongoDB Atlas to find the top candidates
- Rerank candidates with `rerank-2.5` for precision
Results come back ranked by relevance with scores and source metadata.
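The two stages can be sketched in Python: a toy dot-product search over stored vectors, then a rerank pass over the top candidates. `rerank` here stands in for the cross-encoder scorer; names and signatures are illustrative, not vai's API:

```python
def two_stage_search(query_vec, docs, rerank, top_k=3, candidates=10):
    """Stage 1: rank docs by dot-product similarity to the query
    vector (stand-in for the Atlas vector search). Stage 2: rescore
    only the top candidates with `rerank` (stand-in for rerank-2.5)
    and return the best top_k."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    stage1 = sorted(docs, key=lambda d: dot(query_vec, d["vector"]), reverse=True)
    pool = stage1[:candidates]  # only this small pool is reranked
    return sorted(pool, key=rerank, reverse=True)[:top_k]
```

The design point is that the expensive, precise scorer only ever sees the small candidate pool, so reranking adds accuracy without rescoring the whole collection.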
Customizing the Pipeline
Different Chunking Strategy
For markdown-heavy docs, use heading-aware chunking:
vai pipeline ./docs/ --strategy markdown --chunk-size 1024 --overlap 100
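Heading-aware chunking roughly means splitting at markdown headings first, so each chunk stays within one section, and only falling back to size-based splits inside oversized sections. A simplified sketch, not vai's exact algorithm:

```python
import re

def markdown_chunks(text: str, max_size: int = 1024) -> list[str]:
    """Split on markdown headings, then size-split any section that
    still exceeds max_size. A sketch of heading-aware chunking."""
    # Zero-width split: each section keeps its own heading line.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks = []
    for sec in sections:
        if not sec.strip():
            continue
        for i in range(0, len(sec), max_size):
            chunks.append(sec[i:i + max_size])
    return chunks
```

Keeping the heading inside its chunk matters for retrieval: the heading text often carries the keywords a query will match on.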
Preview Without API Calls
vai pipeline ./docs/ --dry-run
Shows how many files would be processed, chunk counts, and estimated cost without making any API calls or writing to the database.
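A rough version of that estimate is simple arithmetic: chunk count from corpus size, tokens approximated at ~4 characters each. The per-million-token price below is a placeholder assumption, not Voyage's actual rate, and the function is illustrative, not part of vai:

```python
def dry_run_estimate(total_chars: int, chunk_size: int = 512,
                     price_per_mtok: float = 0.12) -> dict:
    """Rough preview of a pipeline run: chunk count, approximate
    token count (~4 chars/token heuristic), and embedding cost at an
    assumed price per million tokens."""
    chunks = -(-total_chars // chunk_size)  # ceiling division
    tokens = total_chars / 4
    return {
        "chunks": chunks,
        "est_tokens": int(tokens),
        "est_cost_usd": round(tokens / 1_000_000 * price_per_mtok, 4),
    }
```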
Custom Model
vai pipeline ./docs/ --model voyage-4-lite
Use `voyage-4-lite` for budget-friendly embedding or `voyage-4` for balanced quality/cost.
Local model path
vai nano setup
vai pipeline ./docs/ --local --model voyage-4-nano --db myapp --collection knowledge
Use this when you want local embedding with no Voyage API key for ingestion.
Skip Reranking
vai query "my question" --no-rerank
Runs vector search only, skipping the reranking step. Faster but slightly less precise.
Add Filters
vai query "performance tuning" --filter '{"category": "guides"}' --top-k 10
Pre-filter documents by metadata before vector search.
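A metadata pre-filter narrows the candidate set before any similarity scoring happens. A toy exact-match sketch (the real filter expression is MongoDB query syntax, and this function is illustrative, not vai's implementation):

```python
def filtered_search(query_vec, docs, metadata_filter, top_k=10):
    """Keep only docs whose metadata matches every filter key, then
    rank the survivors by dot-product similarity to the query."""
    def matches(doc):
        return all(doc.get("metadata", {}).get(k) == v
                   for k, v in metadata_filter.items())
    def score(doc):
        return sum(q * x for q, x in zip(query_vec, doc["vector"]))
    pool = [d for d in docs if matches(d)]
    return sorted(pool, key=score, reverse=True)[:top_k]
```

Because filtering happens before scoring, an irrelevant-category document can never displace a relevant one in the top-k, no matter how similar its vector is.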
Cost Estimation
Before running the pipeline on a large corpus, estimate costs:
vai estimate --docs 10000 --queries 1000 --months 12
This shows cost breakdowns for every Voyage 4 model, including asymmetric strategies (embed with `voyage-4-large`, query with `voyage-4-lite`).
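The underlying arithmetic is one-time corpus embedding plus recurring query embedding. A back-of-the-envelope sketch with placeholder prices and token counts (none of these numbers are Voyage's real rates; the two price parameters illustrate an asymmetric embed/query split):

```python
def estimate_costs(docs: int, queries_per_month: int, months: int,
                   tokens_per_doc: int = 500, tokens_per_query: int = 20,
                   embed_price: float = 0.18, query_price: float = 0.02) -> dict:
    """One-time ingestion cost plus recurring query-embedding cost.
    Prices are per million tokens and purely illustrative."""
    ingest = docs * tokens_per_doc / 1e6 * embed_price
    querying = queries_per_month * months * tokens_per_query / 1e6 * query_price
    return {"ingest_usd": round(ingest, 2),
            "query_usd": round(querying, 4),
            "total_usd": round(ingest + querying, 2)}
```

The shape of the result explains why asymmetric strategies are attractive: ingestion is a large one-time term, while query embedding is a small recurring term that can run on a cheaper model.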
Next Steps
- Chat: Ask questions about your documents conversationally
- Workflows: Build multi-step RAG pipelines
- Evaluation: Measure retrieval quality
- Benchmarking: Compare models on your data