Workflow Schema
This page provides the complete schema for vai workflow JSON files. Workflow files define multi-step RAG pipelines as portable, declarative configurations.
TypeScript Interface
interface VaiWorkflow {
// Identity
$schema?: string;
name: string;
description?: string;
version?: string;
// Store appearance
branding?: WorkflowBranding;
// Parameterization
inputs?: Record<string, WorkflowInput>;
// Shared defaults
defaults?: WorkflowDefaults;
// The pipeline
steps: WorkflowStep[];
// What the workflow produces
output?: Record<string, any>;
}
interface WorkflowBranding {
/** Predefined Lucide icon name (e.g., "search", "brain", "trophy") */
icon?: string;
/** Hex color code for the icon accent (e.g., "#00D4AA") */
color?: string;
}
interface WorkflowInput {
type: "string" | "number" | "boolean";
description?: string;
required?: boolean;
default?: any;
}
interface WorkflowDefaults {
db?: string;
collection?: string;
model?: string;
}
interface WorkflowStep {
id: string;
name?: string;
description?: string;
tool: StepTool;
inputs: Record<string, any>;
condition?: string;
forEach?: string;
continueOnError?: boolean;
}
type StepTool =
// VAI tools
| "query"
| "search"
| "rerank"
| "embed"
| "similarity"
| "ingest"
| "collections"
| "models"
| "explain"
| "estimate"
// Control flow
| "merge"
| "filter"
| "transform"
// LLM
| "generate";
JSON Schema
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "vai Workflow",
"description": "A multi-step RAG pipeline definition for vai",
"type": "object",
"required": ["name", "steps"],
"properties": {
"$schema": {
"type": "string",
"description": "Schema URL for IDE validation"
},
"name": {
"type": "string",
"description": "Human-readable workflow name"
},
"description": {
"type": "string",
"description": "What this workflow does"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$",
"description": "Semver version"
},
"branding": {
"type": "object",
"description": "Icon and accent color for the Workflow Store",
"properties": {
"icon": {
"type": "string",
"description": "Predefined Lucide icon name"
},
"color": {
"type": "string",
"pattern": "^#[0-9A-Fa-f]{6}$",
"description": "Hex color code for the icon accent"
}
}
},
"inputs": {
"type": "object",
"additionalProperties": {
"type": "object",
"required": ["type"],
"properties": {
"type": {
"type": "string",
"enum": ["string", "number", "boolean"]
},
"description": { "type": "string" },
"required": { "type": "boolean" },
"default": {}
}
}
},
"defaults": {
"type": "object",
"properties": {
"db": { "type": "string" },
"collection": { "type": "string" },
"model": { "type": "string" }
}
},
"steps": {
"type": "array",
"minItems": 1,
"items": {
"type": "object",
"required": ["id", "tool", "inputs"],
"properties": {
"id": {
"type": "string",
"pattern": "^[a-zA-Z_][a-zA-Z0-9_]*$",
"description": "Unique step identifier"
},
"name": {
"type": "string",
"description": "Human-readable label"
},
"description": {
"type": "string"
},
"tool": {
"type": "string",
"enum": [
"query", "search", "rerank", "embed",
"similarity", "ingest", "collections",
"models", "explain", "estimate",
"merge", "filter", "transform",
"generate"
]
},
"inputs": {
"type": "object",
"description": "Tool-specific inputs with template expression support"
},
"condition": {
"type": "string",
"description": "Template expression; step runs only if truthy"
},
"forEach": {
"type": "string",
"description": "Template expression resolving to an array"
},
"continueOnError": {
"type": "boolean",
"default": false
}
}
}
},
"output": {
"type": "object",
"description": "Template expressions defining workflow result"
}
}
}
Tool Input Schemas
query
Full RAG query with optional reranking.
| Input | Type | Required | Description |
|---|---|---|---|
query | string | Yes | The search query |
db | string | No | Database name (uses defaults) |
collection | string | No | Collection name (uses defaults) |
limit | number | No | Max results (default: 5) |
filter | object | No | MongoDB pre-filter for vector search |
rerank | boolean | No | Whether to rerank results (default: true) |
model | string | No | Voyage AI embedding model |
search
Raw vector similarity search without reranking.
| Input | Type | Required | Description |
|---|---|---|---|
query | string | Yes | The search query |
db | string | No | Database name |
collection | string | No | Collection name |
limit | number | No | Max results (default: 10) |
filter | object | No | MongoDB pre-filter |
model | string | No | Embedding model |
rerank
Rerank documents against a query.
| Input | Type | Required | Description |
|---|---|---|---|
query | string | Yes | The query to rank against |
documents | array | Yes | Documents to rerank |
model | string | No | Reranking model (default: rerank-2.5) |
embed
Generate an embedding vector.
| Input | Type | Required | Description |
|---|---|---|---|
text | string | Yes | Text to embed |
model | string | No | Embedding model |
inputType | "document" | "query" | No | Whether text is a document or query |
dimensions | number | No | Output dimensions for Matryoshka models |
similarity
Compare two texts semantically.
| Input | Type | Required | Description |
|---|---|---|---|
text1 | string | Yes | First text |
text2 | string | Yes | Second text |
model | string | No | Embedding model |
ingest
Chunk, embed, and store a document.
| Input | Type | Required | Description |
|---|---|---|---|
text | string | Yes | Document text |
source | string | No | Source identifier for citations |
db | string | No | Database name |
collection | string | No | Collection name |
chunkSize | number | No | Target chunk size in characters (default: 512) |
chunkStrategy | string | No | Chunking strategy: fixed, sentence, paragraph, recursive, markdown |
model | string | No | Embedding model |
metadata | object | No | Additional metadata to store |
estimate
Estimate embedding costs.
| Input | Type | Required | Description |
|---|---|---|---|
docs | number | Yes | Number of documents to embed |
queries | number | No | Queries per month (default: 0) |
months | number | No | Time horizon in months (default: 12) |
merge
Concatenate arrays from multiple steps.
| Input | Type | Required | Description |
|---|---|---|---|
arrays | array | Yes | Template expressions resolving to arrays |
dedup | boolean | No | Remove duplicates (default: false) |
dedup_field | string | No | Field to use for dedup comparison |
filter
Filter array items by condition.
| Input | Type | Required | Description |
|---|---|---|---|
array | string | Yes | Template expression resolving to an array |
condition | string | Yes | Condition using item as current element |
generate
Call the configured LLM.
| Input | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | The user prompt |
context | any | No | Context data for the LLM |
systemPrompt | string | No | System message |
Validation
Validate any workflow file against this schema:
vai workflow validate my-workflow.json
The validator checks:
- JSON syntax
- Schema conformance
- Step ID uniqueness and naming conventions
- Template expression validity
- Circular dependency detection
- Step reference resolution
- Execution plan generation
Next Steps
- Schema Reference: Field-by-field documentation
- Template Expressions: Expression grammar
- Built-in Templates: See the schema in practice