Concept · Data · Platform · Software Engineering

Vector Database

A vector database stores and queries high‑dimensional vectors for semantic search and similarity retrieval. It combines optimized index structures with approximate nearest neighbor (ANN) algorithms to enable fast, scalable queries over embeddings.

Established
Medium

Classification

  • Medium
  • Technical
  • Architectural
  • Intermediate

Technical context

  • Feature store as embedding source
  • Relational DB for metadata
  • Search frontend / API gateway

Principles & goals

  • Separate vector storage from metadata layers (see the sketch below)
  • Choose index and metric based on data distribution
  • Evaluate trade‑offs between accuracy and latency
Build
Enterprise, Domain, Team
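
A rough sketch of the first principle above (separating vector storage from the metadata layer): only IDs and vectors live in the vector index, while descriptive attributes are resolved from a relational table. The table name and schema here are illustrative assumptions, not part of any specific product.

```python
import sqlite3

# Hypothetical metadata layer: attributes live in a relational table keyed
# by the same IDs that the vector index returns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i, f"product-{i}") for i in range(1_000)],
)

def enrich(hit_ids: list[int]) -> list[tuple[int, str]]:
    """Resolve IDs from a k-NN hit list against the metadata store."""
    placeholders = ",".join("?" * len(hit_ids))
    return conn.execute(
        f"SELECT id, title FROM items WHERE id IN ({placeholders})",
        hit_ids,
    ).fetchall()

# IDs as they might come back from a vector query.
print(enrich([3, 42, 317]))
```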

Compromises

  • Lack of governance for embedding generation causes drift
  • An unsuitable distance metric produces irrelevant results
  • Scaling without a sharding strategy increases costs dramatically

Recommendations

  • Version embeddings and document models
  • Test multiple index configurations against production data
  • Separate read and write paths for scalability

I/O & resources

Inputs

  • Embeddings (vector arrays)
  • Metadata (IDs, attributes)
  • Configuration for index and metric

Outputs & resources

  • k-NN hit lists with distances
  • Query performance statistics
  • Integration points for metadata lookups

Description

Vector databases are specialized stores for dense vector representations (embeddings) that provide indices, approximate nearest neighbor algorithms and distance metrics for fast semantic and neighborhood search. They form the infrastructure for retrieval, recommendation and semantic search in embedding‑driven applications.
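
As a minimal illustration of the operation a vector database accelerates (not any particular product's API), the sketch below runs an exact cosine-similarity search with NumPy; a vector database replaces this linear scan with index structures and ANN algorithms so queries stay fast at much larger scale.

```python
import numpy as np

# Toy corpus: 10,000 embeddings of dimension 64 (real corpora are far larger).
rng = np.random.default_rng(0)
corpus = rng.random((10_000, 64), dtype=np.float32)
query = rng.random(64, dtype=np.float32)

# Normalize so the dot product equals cosine similarity.
corpus_n = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)

# Exact (brute-force) top-5 neighbors; cost is O(n) per query.
scores = corpus_n @ query_n
top5 = np.argsort(-scores)[:5]
print(list(zip(top5.tolist(), scores[top5].round(3).tolist())))
```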

Benefits

  • Efficient semantic search and similarity retrieval
  • Scalable query performance for embeddings
  • Integration with ML pipelines and retrieval workflows

Limitations

  • Optimized for dense vectors, not relational access
  • Approximate algorithms can miss relevant neighbors (imperfect recall)
  • Maintaining embeddings and reindexing requires ongoing effort

  • Query latency (P95)

    95th percentile of response time for search queries; important for UX.

  • Recall at K

    Share of relevant items returned within the top‑K hits.

  • Index build duration

    Time required to create or reindex an index.
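
A sketch of how the latency and recall figures above might be computed from raw measurements; the function and argument names are illustrative, not from any specific library.

```python
import numpy as np

def recall_at_k(approx_ids: np.ndarray, exact_ids: np.ndarray, k: int) -> float:
    """Mean fraction of the exact top-k neighbors also returned by the ANN index.

    Both arrays have shape (n_queries, k) and contain item IDs.
    """
    overlaps = [
        len(set(a[:k]) & set(e[:k])) / k
        for a, e in zip(approx_ids, exact_ids)
    ]
    return float(np.mean(overlaps))

def p95_latency_ms(latencies_ms: list[float]) -> float:
    """95th-percentile query latency in milliseconds."""
    return float(np.percentile(latencies_ms, 95))
```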

Use cases & scenarios

Milvus in a recommendation pipeline

Milvus serves as a central vector store for product embeddings and provides fast k-NN hits for personalization.
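
A minimal sketch of this setup using pymilvus's MilvusClient. It assumes a reachable Milvus instance and a recent pymilvus version; the collection name, dimension, and IDs are illustrative, so check the Milvus documentation for the exact parameters of your version.

```python
import numpy as np
from pymilvus import MilvusClient  # pip install pymilvus

# Assumption: a Milvus instance is reachable at this URI; "products" and the
# vector dimension are illustrative choices, not prescribed by Milvus.
client = MilvusClient(uri="http://localhost:19530")
client.create_collection(collection_name="products", dimension=128)

# Insert product embeddings produced upstream (e.g. from a feature store).
vectors = np.random.rand(1_000, 128).astype("float32")
client.insert(
    collection_name="products",
    data=[{"id": i, "vector": v.tolist()} for i, v in enumerate(vectors)],
)

# k-NN query for one user or session embedding; returns IDs plus distances.
hits = client.search(collection_name="products", data=[vectors[0].tolist()], limit=5)
print(hits[0])
```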

FAISS for academic research

FAISS is used to evaluate ANN algorithms and prototype semantic search.
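
A sketch of the kind of evaluation this enables: comparing an exact FAISS baseline against an approximate IVF index on synthetic data. The index parameters (256 lists, nprobe = 8) are arbitrary starting points, not recommendations.

```python
import numpy as np
import faiss  # pip install faiss-cpu

d, n = 128, 100_000
rng = np.random.default_rng(0)
xb = rng.random((n, d), dtype=np.float32)    # corpus vectors
xq = rng.random((100, d), dtype=np.float32)  # query vectors

# Exact baseline gives the ground-truth neighbors.
flat = faiss.IndexFlatL2(d)
flat.add(xb)
_, gt = flat.search(xq, 10)

# Approximate IVF index: cluster the corpus, probe a few cells per query.
quantizer = faiss.IndexFlatL2(d)
ivf = faiss.IndexIVFFlat(quantizer, d, 256)
ivf.train(xb)
ivf.add(xb)
ivf.nprobe = 8  # accuracy/latency knob
_, approx = ivf.search(xq, 10)

# Recall@10 of the approximate index against the exact results.
recall = np.mean([len(set(a) & set(g)) / 10 for a, g in zip(approx, gt)])
print(f"recall@10: {recall:.3f}")
```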

Vector search in conversational AI

Vector databases supply document passages for retrieval‑augmented generation in chatbots.
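
A schematic sketch of that retrieval step. Here embed() and index stand for any embedding model and any vector store with a FAISS-style k-NN search call; both are assumptions, and the prompt format is purely illustrative.

```python
import numpy as np

def retrieve_context(question, embed, index, passages, k=4):
    """Fetch the k passages whose embeddings are closest to the question.

    `embed` is any function mapping text to a vector; `index` is any vector
    store exposing a search(queries, k) -> (distances, ids) call.
    """
    q = np.asarray([embed(question)], dtype=np.float32)
    _, ids = index.search(q, k)
    return [passages[i] for i in ids[0]]

def build_prompt(question, context_passages):
    # Retrieved passages are prepended so the chat model answers grounded in
    # the document corpus instead of relying only on parametric memory.
    context = "\n\n".join(context_passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```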

Adoption steps

  1. Assess data shapes and embedding pipelines
  2. Prototype with a small corpus and different index types
  3. Plan infrastructure: sharding, replication, monitoring
  4. Establish training and governance for embedding versioning (see the sketch below)
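
For step 4, one possible convention (an assumption, not a standard) is to persist the embedding model's version next to every vector, so stale vectors can be found and re-embedded or rolled back when the model changes.

```python
from dataclasses import dataclass

@dataclass
class EmbeddingRecord:
    item_id: int
    vector: list[float]
    model_version: str    # e.g. "text-embedder-v3" (illustrative name)
    source_checksum: str  # hash of the raw content the vector was derived from

def needs_reembedding(record: EmbeddingRecord, current_version: str) -> bool:
    """Flag vectors produced by an older embedding model for regeneration."""
    return record.model_version != current_version
```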

⚠️ Technical debt & bottlenecks

  • Monolithic indices without a sharding plan
  • Ad hoc embedding conversions without tests
  • No rollback for embedding model versions

Bottlenecks

  • Index build time
  • Network bandwidth for large vector transfers
  • Memory for dense vector storage

Common pitfalls

  • Using a default index for heterogeneous data without tuning
  • No monitoring of embedding drift after model updates
  • Storing sensitive personal data in unprotected embeddings
  • Underestimating reindexing costs when models change
  • Ignoring the choice of metric (cosine vs. Euclidean; see the sketch below)
  • Missing specification for batch update scenarios

Required skills

  • Understanding of ANN algorithms and indices
  • Knowledge of embedding generation and modeling
  • Operational and scaling experience with distributed systems

Key requirements

  • Response latency under user requirements
  • Data scale and number of vectors
  • Consistency requirements for metadata

Constraints

  • Limited precision of ANN methods
  • Incompatible embedding formats between models
  • Regulatory requirements for personal data
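
To make the metric pitfall concrete, the tiny example below shows that the "nearest" neighbor of a vector can differ depending on whether cosine similarity or Euclidean distance is used; the vectors are purely illustrative.

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([10.0, 1.0])  # same general direction as a, but much longer
c = np.array([0.0, 1.0])   # short and orthogonal to a

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Cosine similarity ranks b closest to a (same direction), while Euclidean
# distance ranks c closest (smaller absolute gap): the index metric must
# match how the embeddings were trained and compared.
print(cosine(a, b), cosine(a, c))                    # ~0.995 vs 0.0
print(np.linalg.norm(a - b), np.linalg.norm(a - c))  # ~9.06 vs ~1.41
```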