Purging Stale Data
Over time, source files get renamed, moved, or deleted, but their embeddings remain in MongoDB. vai purge --stale finds and removes these orphaned embeddings.
How It Works
- vai queries all documents with a metadata.source field
- For each document, it checks if the source file still exists on disk
- Documents whose source files are missing are flagged for deletion
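The steps above can be sketched in a few lines of Python. This is an illustration of the idea only, not vai's actual implementation: the function name find_stale, the in-memory document list, and the choice to skip documents without a metadata.source field are all assumptions made for the example. It also assumes source paths are checked as stored, relative to the current working directory if they are not absolute.

```python
import os
import tempfile
from typing import Callable, Iterable, List


def find_stale(
    documents: Iterable[dict],
    exists: Callable[[str], bool] = os.path.exists,
) -> List[dict]:
    """Return the documents whose metadata.source path is missing on disk.

    Hypothetical helper for illustration; vai's internals may differ.
    """
    stale = []
    for doc in documents:
        source = doc.get("metadata", {}).get("source")
        if source is None:
            # Assumption: documents without metadata.source are not candidates.
            continue
        if not exists(source):
            stale.append(doc)
    return stale


# Small demonstration with one file that exists and one that does not.
with tempfile.TemporaryDirectory() as tmp:
    kept = os.path.join(tmp, "kept.md")
    open(kept, "w").close()  # create an empty source file

    docs = [
        {"_id": 1, "metadata": {"source": kept}},
        {"_id": 2, "metadata": {"source": os.path.join(tmp, "deleted.md")}},
        {"_id": 3, "text": "no metadata.source field"},
    ]
    stale_ids = [d["_id"] for d in find_stale(docs)]

print(stale_ids)
```

Passing the existence check in as a parameter keeps the sketch easy to test without touching the filesystem. A renamed or moved file looks the same as a deleted one to this check, which is why previewing the result before deleting is worthwhile.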
Usage
Preview first (always recommended)
vai purge --stale --dry-run
Delete stale documents
vai purge --stale --force
Other purge criteria
# By source pattern
vai purge --source "docs/deprecated/*"
# By model (clean up after migration)
vai purge --model voyage-3.5-lite
# By date
vai purge --before 2024-01-01
# Combine filters
vai purge --model voyage-3.5-lite --before 2024-06-01
Best Practices
- Run --dry-run first to review what will be deleted
- Schedule periodic stale purges after document updates
- After purging, re-run your pipeline to re-ingest updated content
Further Reading
- vai purge: Full command reference
- Refresh Embeddings: Re-embed instead of deleting