Vector Database
A vector database stores and queries high-dimensional vectors for semantic search and similarity retrieval. It relies on optimized index structures and approximate nearest neighbor (ANN) algorithms to enable fast, scalable embedding queries.
Classification
- Complexity: Medium
- Impact area: Technical
- Decision type: Architectural
- Organizational maturity: Intermediate
Compromises
Risks
- Lack of governance over embedding generation causes drift
- An ill-chosen distance metric produces irrelevant results
- Scaling without a sharding strategy drives up costs dramatically
Mitigations
- Version embeddings together with their document models
- Test multiple index configurations against production data
- Separate read and write paths for scalability
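Versioning embeddings can be as simple as tagging every stored vector with the model that produced it; a minimal sketch (field names and model labels are hypothetical):

```python
# Sketch: store the embedding-model version alongside each vector so that
# stale embeddings can be detected and scheduled for reindexing.
record = {
    "id": "doc-123",
    "vector": [0.12, -0.34, 0.56],       # truncated example embedding
    "embedding_model": "text-model-v2",  # assumed version label
}

def needs_reembedding(rec, current_model="text-model-v3"):
    """A record is stale when its model tag differs from the current model."""
    return rec["embedding_model"] != current_model

print(needs_reembedding(record))  # True: record is stale relative to v3
```

A nightly job can scan for stale records and rebuild only those, instead of reindexing the whole corpus on every model change.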
I/O & resources
Inputs
- Embeddings (vector arrays)
- Metadata (IDs, attributes)
- Index and metric configuration
Outputs
- k-NN hit lists with distances
- Query performance statistics
Resources
- Integration points for metadata lookups
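The input/output shape above can be illustrated with a brute-force k-NN sketch: embeddings and IDs go in, a hit list of (ID, distance) pairs comes out (toy data, stdlib only; real systems replace the linear scan with an ANN index):

```python
import math

# Stored embeddings keyed by their metadata IDs (toy 2-D example).
vectors = {
    "item-0": [0.0, 0.0],
    "item-1": [1.0, 0.0],
    "item-2": [0.0, 2.0],
    "item-3": [3.0, 4.0],
}

def knn(query, k=2):
    """Return the k nearest stored vectors as (id, distance) pairs."""
    def dist(v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(query, v)))
    hits = sorted(((i, dist(v)) for i, v in vectors.items()), key=lambda h: h[1])
    return hits[:k]

print(knn([0.0, 0.0]))  # [('item-0', 0.0), ('item-1', 1.0)]
```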
Description
Vector databases are specialized stores for dense vector representations (embeddings) that provide indexes, approximate nearest neighbor (ANN) algorithms, and distance metrics for fast semantic and neighborhood search. They form the infrastructure for retrieval, recommendation, and semantic search in embedding-driven applications.
✔ Benefits
- Efficient semantic search and similarity retrieval
- Scalable query performance for embeddings
- Integration with ML pipelines and retrieval workflows
✖ Limitations
- Optimized for dense vectors, not relational access
- Approximate algorithms may return inexact or non-deterministic results
- Maintaining embeddings and reindexing requires effort
Metrics
- Query latency (P95)
95th percentile of response time for search queries; important for UX.
- Recall at K
Share of relevant items returned within the top‑K hits.
- Index build duration
Time required to build an index initially or to rebuild it after changes.
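Recall at K is straightforward to compute from a retrieved ranking and a relevance set; a minimal sketch with made-up document IDs:

```python
def recall_at_k(retrieved, relevant, k):
    """Share of relevant items that appear within the top-k retrieved hits."""
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

# Example: 2 of the 3 relevant items appear in the top-5 hits.
retrieved = ["d1", "d7", "d3", "d9", "d4", "d2"]
relevant = {"d1", "d2", "d3"}
print(recall_at_k(retrieved, relevant, k=5))  # 0.666...
```

Tracking this metric against a ground-truth exact search is the usual way to quantify how much accuracy an ANN index trades for speed.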
Examples & implementations
Milvus in a recommendation pipeline
Milvus serves as a central vector store for product embeddings and provides fast k-NN hits for personalization.
FAISS for academic research
FAISS is used to evaluate ANN algorithms and prototype semantic search.
Vector search in conversational AI
Vector databases supply document passages for retrieval‑augmented generation in chatbots.
Implementation steps
1. Assess data shapes and embedding pipelines
2. Prototype with a small corpus and different index types
3. Plan infrastructure: sharding, replication, monitoring
4. Establish training and governance for embedding versioning
⚠️ Technical debt & bottlenecks
Technical debt
- Monolithic indices without sharding plan
- Ad hoc embedding conversions without tests
- No rollback for embedding model versions
Known bottlenecks
Misuse examples
- Using a default index for heterogeneous data without tuning
- No monitoring of embedding drift after model updates
- Storing sensitive personal data in unprotected embeddings
Typical traps
- Underestimating reindexing costs when models change
- Ignoring the choice of distance metric (cosine vs. Euclidean)
- Missing specification for batch update scenarios
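The metric-choice trap is easy to demonstrate: the same pair of candidates can rank differently under cosine similarity and Euclidean distance, so the metric must match how the embeddings were trained (a stdlib-only sketch with toy 2-D vectors):

```python
import math

def cosine(a, b):
    """Cosine similarity: direction only, magnitude ignored."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def euclidean(a, b):
    """Euclidean distance: sensitive to magnitude."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

q = [1.0, 0.0]
a = [10.0, 0.0]  # same direction as q, but far away in magnitude
b = [0.5, 0.5]   # closer in space, but pointing elsewhere

print(cosine(q, a), cosine(q, b))        # a ranks first under cosine
print(euclidean(q, a), euclidean(q, b))  # b ranks first under Euclidean
```

For normalized embeddings the two metrics produce the same ranking, which is why many pipelines L2-normalize vectors before indexing.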
Architectural drivers
Constraints
- Limited precision of ANN methods
- Incompatible embedding formats between models
- Regulatory requirements for personal data