
vai chat

A conversational RAG interface that combines vector search retrieval with LLM-powered responses. Chat with your knowledge base using natural language.

Synopsis

vai chat [options]

Description

vai chat starts an interactive chat session that:

  1. Takes your question
  2. Retrieves relevant documents from MongoDB Atlas via vector search
  3. Optionally reranks results for better precision
  4. Sends the context + question to an LLM for a grounded response
  5. Streams the response back to your terminal

Supports three LLM providers: Anthropic (Claude), OpenAI (GPT), and Ollama (local models). Sessions can be persisted to MongoDB for resumption.

Two modes are available:

  • Pipeline mode (default): a fixed RAG flow (search → rerank → generate)
  • Agent mode: The LLM uses tool calls to decide when and how to search

Options

Flag                      Description                                     Default
--db <name>               MongoDB database name                           From .vai.json
--collection <name>       Collection with embedded documents              From .vai.json
--session <id>            Resume a previous chat session                  (none)
--llm-provider <name>     LLM provider: anthropic, openai, ollama         From config
--llm-model <name>        Specific LLM model                              From config
--llm-api-key <key>       LLM API key                                     From config
--llm-base-url <url>      LLM API base URL (for Ollama)                   From config
--mode <mode>             Chat mode: pipeline or agent                    pipeline
--max-context-docs <n>    Max retrieved documents for context             5
--max-turns <n>           Max conversation turns before truncation        20
--no-history              Disable MongoDB persistence (in-memory only)    (off)
--no-rerank               Skip reranking step                             (off)
--no-stream               Wait for complete response (don't stream)       (off)
--system-prompt <text>    Override the system prompt                      (none)
--text-field <name>       Document text field name                        text
--filter <json>           MongoDB pre-filter for vector search            (none)
--estimate                Show per-turn cost breakdown and exit           (off)
--json                    Output JSON per turn (for scripting)            (off)
-q, --quiet               Suppress decorative output                      (off)

Examples

Start a chat session

vai chat --db myapp --collection docs

Chat with Ollama (local)

vai chat --llm-provider ollama --llm-model llama3 --llm-base-url http://localhost:11434

Agent mode with Anthropic

vai chat --mode agent --llm-provider anthropic --llm-model claude-sonnet-4-20250514

Resume a previous session

vai chat --session abc123

Estimate per-turn costs

vai chat --estimate
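With --json, each turn is emitted as machine-readable JSON, which makes the chat scriptable. A minimal consumer might look like the following sketch; note that the per-turn schema shown here (an "answer" field and a "sources" count) is an assumption for illustration, so check the actual --json output for the real field names:

```python
import json

def parse_turns(lines):
    """Parse JSON-per-turn output, e.g. piped from `vai chat --json`.

    The field names used below ("answer", "sources") are hypothetical;
    inspect real --json output for the actual schema.
    """
    turns = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        turns.append(json.loads(line))
    return turns

# Mock output standing in for a real chat session:
mock_output = [
    '{"answer": "Vector search retrieves candidates.", "sources": 3}',
    '{"answer": "Reranking improves precision.", "sources": 5}',
]
for turn in parse_turns(mock_output):
    print(turn["answer"])
```

In a real script you would read the lines from the vai chat process's stdout rather than a hard-coded list.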

Setup

Before using chat, configure your LLM provider:

# Anthropic
vai config set llm-provider anthropic
vai config set llm-api-key sk-ant-...

# OpenAI
vai config set llm-provider openai
vai config set llm-api-key sk-...

# Ollama (no API key needed)
vai config set llm-provider ollama
vai config set llm-base-url http://localhost:11434
vai config set llm-model llama3

Or use vai init, which includes chat setup in the interactive wizard.
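After setup, the project's .vai.json might contain a chat section along these lines. The exact key names here are an assumption for illustration only; run vai init or inspect your generated .vai.json for the real schema:

```json
{
  "db": "myapp",
  "collection": "docs",
  "chat": {
    "llm-provider": "ollama",
    "llm-model": "llama3",
    "llm-base-url": "http://localhost:11434",
    "max-context-docs": 5,
    "max-turns": 20
  }
}
```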

How It Works
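In pipeline mode, every turn follows the same fixed flow described above: retrieve via vector search, optionally rerank, then generate a grounded answer. The following is an illustrative sketch of one turn; all function names and bodies are hypothetical stand-ins, not vai's actual implementation:

```python
# Illustrative sketch of a single pipeline-mode turn.
# Every function here is a hypothetical stand-in, not vai's real API.

def vector_search(question, max_docs=5):
    # Stand-in for MongoDB Atlas vector search retrieval.
    corpus = [
        "vai chat supports Anthropic, OpenAI, and Ollama.",
        "Sessions can be persisted to MongoDB for resumption.",
        "Pipeline mode runs search, rerank, then generate.",
    ]
    return corpus[:max_docs]

def rerank(question, docs):
    # Stand-in for the optional reranking step (skipped with --no-rerank):
    # order documents by naive keyword overlap with the question.
    words = question.lower().split()
    return sorted(docs, key=lambda d: -sum(w in d.lower() for w in words))

def generate(question, context):
    # Stand-in for the LLM call; the real response would stream token by token.
    return f"Answer grounded in {len(context)} retrieved documents."

def chat_turn(question, max_context_docs=5, use_rerank=True):
    docs = vector_search(question, max_docs=max_context_docs)
    if use_rerank:
        docs = rerank(question, docs)
    return generate(question, docs[:max_context_docs])

print(chat_turn("How does pipeline mode work?"))
```

Agent mode replaces this fixed sequence with tool calls: the LLM decides for itself when and how often to invoke the search step.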

Tips

  • Pipeline mode is simpler and more predictable. Agent mode gives the LLM more autonomy to search and reason.
  • Use --no-history for quick ad-hoc questions without persisting the conversation.
  • The --filter option lets you scope the chat to specific document categories.
  • Chat settings can also be configured in .vai.json under the chat key.
See Also

  • vai query — Single-shot retrieval (non-conversational)
  • vai config — Set LLM provider credentials
  • vai init — Interactive project setup including chat config