Catalog
concept · Data · Platform · Architecture · Integration

Data Serving

Architectural concept for reliably delivering curated data to diverse consumers with defined latency, format, and access requirements.

Established
High

Classification

  • High
  • Technical
  • Architectural
  • Intermediate

Technical context

  • Message broker (e.g., Kafka)
  • Feature store or specialized serving stores
  • API gateways and authentication services
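
How these components interact can be sketched in a few lines: a consumer reads change events from the broker and keeps a serving view up to date. This is a minimal, illustrative sketch that assumes the kafka-python client and a hypothetical `product_updates` topic; the broker address and payload shape are assumptions, not part of the concept.

```python
# Minimal sketch: keep a serving view in sync from a message broker.
# Assumes the kafka-python client, a running broker, and a hypothetical
# `product_updates` topic; payload shape is illustrative.
import json

from kafka import KafkaConsumer

serving_view: dict[str, dict] = {}  # key -> latest curated record

consumer = KafkaConsumer(
    "product_updates",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    record = message.value
    # Upsert the latest state so readers always see one record per key.
    serving_view[record["product_id"]] = record
```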

Principles & goals

  • Specify clear latency and consistency SLAs for consumers.
  • Separate data processing (compute) from serving paths for optimization.
  • Version served views and features to ensure reproducibility (a small sketch follows below).
Run
Domain, Team
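
The versioning principle listed above can be made concrete with a small sketch: served views and feature sets are registered under explicit versions, and consumers pin a version so results stay reproducible. The registry, names, and columns are illustrative assumptions, not a prescribed API.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureRegistry:
    # (name, version) -> column list of the served view / feature set
    views: dict[tuple[str, str], list[str]] = field(default_factory=dict)

    def register(self, name: str, version: str, columns: list[str]) -> None:
        key = (name, version)
        if key in self.views:
            # Published versions are immutable; breaking changes get a new version.
            raise ValueError(f"{name}@{version} is already published")
        self.views[key] = columns

    def resolve(self, name: str, version: str) -> list[str]:
        return self.views[(name, version)]


registry = FeatureRegistry()
registry.register("user_features", "v1", ["age", "country"])
registry.register("user_features", "v2", ["age", "country", "last_purchase_days"])

# Training and online inference both pin "v2", keeping results reproducible.
print(registry.resolve("user_features", "v2"))
```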

Use cases & scenarios

Compromises

Risks

  • Inconsistent views of data leading to incorrect decisions.
  • Performance bottlenecks in the serving layer under peak load.
  • High operational costs from uncontrolled materialization and caching.

Mitigations

  • Define clearly measurable SLAs and SLOs for serving paths.
  • Use appropriate storage engines according to access patterns.
  • Version views and features, document breaking changes.

I/O & resources

Inputs

  • source data streams or batch exports
  • schema, index and metadata definitions
  • requirements for latency, consistency and access control

Outputs

  • served APIs, materialized views and caches
  • metrics for latency, freshness and error rates
  • documented SLAs and access paths for consumers
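
A hedged sketch of how these inputs and outputs might be captured declaratively: one spec per serving path, recording the latency, freshness, consistency, and access-control requirements that become the documented SLA. Field names and values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingPathSpec:
    """One serving path and the guarantees documented for its consumers."""
    name: str
    p95_latency_ms: int              # read-latency target (SLA)
    max_staleness_s: int             # acceptable age of served data vs. the source
    consistency: str                 # e.g. "eventual" or "read-your-writes"
    allowed_roles: tuple[str, ...]   # access-control requirement


checkout_prices = ServingPathSpec(
    name="checkout_prices",
    p95_latency_ms=50,
    max_staleness_s=60,
    consistency="read-your-writes",
    allowed_roles=("checkout-service",),
)
print(checkout_prices)
```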

Description

Data Serving is the architectural concept of reliably delivering curated data to consumers with appropriate latency, format, and access guarantees. It encompasses APIs, materialized views, caches, and specialized serving engines to support analytics, operational workloads, and feature retrieval. The design balances freshness, scalability, cost, and consistency across consumers.
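
As a minimal illustration of one such serving path, the sketch below places a TTL-based read-through cache in front of a materialized view. The view contents, TTL, and key names are assumptions; a production setup would typically use a dedicated cache and store.

```python
import time

# Stand-in for a precomputed, materialized view (would live in a serving store).
materialized_view = {"sku-1": {"price": 19.99}, "sku-2": {"price": 5.49}}

CACHE_TTL_S = 30
_cache: dict[str, tuple[float, dict]] = {}  # key -> (cached_at, record)


def serve(key: str) -> dict:
    """Read-through: answer from the cache if fresh, else from the materialized view."""
    now = time.monotonic()
    hit = _cache.get(key)
    if hit is not None and now - hit[0] < CACHE_TTL_S:
        return hit[1]                     # fresh cache hit: lowest latency
    record = materialized_view[key]       # fall back to the precomputed view
    _cache[key] = (now, record)
    return record


print(serve("sku-1"))  # miss, filled from the view
print(serve("sku-1"))  # hit within the TTL
```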

Benefits

  • Lower read latency and improved user experience.
  • Decoupling backend processing and consumer workloads.
  • Enables specialized optimizations for different access patterns.

Drawbacks

  • Increased operational overhead from additional components like caches or stores.
  • Complexity in consistency and cache invalidation across distributed systems.
  • Cost due to redundant storage or specialized serving engines.

Key metrics

  • P95 latency

    95th percentile of read response times; important for SLA measurement.

  • staleness (data age)

    Time difference between source and served view; measures freshness.

  • cache hit rate

    Ratio of cache hits to requests; indicator of efficiency.
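
A hedged sketch of how these three metrics could be derived from raw observations; the sample latencies, counters, and timestamps are made up for illustration.

```python
import statistics
import time

# Illustrative raw observations collected from the serving path.
read_latencies_ms = [12, 15, 9, 44, 120, 18, 22, 31, 17, 25]
cache_hits, cache_misses = 870, 130
source_last_write = time.time() - 5    # newest event in the source system
view_last_refresh = time.time() - 42   # when the served view was last rebuilt

# P95 latency: 95th percentile of read response times.
p95_latency_ms = statistics.quantiles(read_latencies_ms, n=100, method="inclusive")[94]

# Staleness: how far the served view lags behind the source.
staleness_s = source_last_write - view_last_refresh

# Cache hit rate: share of requests answered from the cache.
cache_hit_rate = cache_hits / (cache_hits + cache_misses)

print(f"P95 latency: {p95_latency_ms:.1f} ms")
print(f"Staleness:   {staleness_s:.1f} s")
print(f"Hit rate:    {cache_hit_rate:.1%}")
```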

E‑commerce product catalog

Materialized views and cache layers deliver product and price data with guaranteed latency for checkout and search.

Monitoring dashboards

Aggregated metrics and time series data are provided in optimized serving engines to enable real-time alerts.

Personalized recommendations

Feature retrieval paths provide user attributes quickly to recommendation engines and synchronize with batch training data.
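
A minimal sketch of that synchronization, assuming a hypothetical online store for low-latency retrieval and a timestamped offline log for point-in-time training joins; names and data are illustrative.

```python
# Online store: latest value per user, for low-latency retrieval at request time.
online_store = {"user-42": {"clicks_7d": 13}}

# Offline log: timestamped history, used to build consistent training data.
offline_log = [
    ("user-42", "2024-05-01", {"clicks_7d": 9}),
    ("user-42", "2024-06-01", {"clicks_7d": 13}),
]


def serve_features(user_id: str) -> dict:
    # Path used by the recommendation engine while serving a request.
    return online_store[user_id]


def training_features(user_id: str, as_of: str) -> dict:
    # Reconstruct the values as they were at `as_of`, so training matches
    # what the model would have seen online at that time.
    rows = [f for uid, ts, f in offline_log if uid == user_id and ts <= as_of]
    return rows[-1]


print(serve_features("user-42"))                   # {'clicks_7d': 13}
print(training_features("user-42", "2024-05-15"))  # {'clicks_7d': 9}
```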

1. Analyze consumer requirements (latency, volume, consistency).
2. Design serving paths: APIs, materialized views, caches.
3. Implement infrastructure, monitoring and SLOs.
4. Introduce incrementally and measure against defined metrics.
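
For step 4, measuring against the defined metrics can be as simple as comparing observed values with the SLO targets from step 1. The targets and measurements below are illustrative assumptions.

```python
# SLO targets defined up front (step 1) and values measured in operation.
slo_targets = {"p95_latency_ms": 50, "staleness_s": 60, "cache_hit_rate": 0.8}
measured = {"p95_latency_ms": 63.4, "staleness_s": 37.0, "cache_hit_rate": 0.87}

for metric, target in slo_targets.items():
    value = measured[metric]
    # Hit rate must stay above its target; latency and staleness below theirs.
    ok = value >= target if metric == "cache_hit_rate" else value <= target
    print(f"{metric}: {value} (target {target}) -> {'OK' if ok else 'SLO breach'}")
```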

⚠️ Technical debt & bottlenecks

  • Ad-hoc caching without clear TTL and invalidation.
  • Monolithic serving components without scaling path.
  • Missing versioning of served views and APIs.
  • Network latency
  • I/O and storage bottlenecks
  • Cache invalidation
  • Caching critical transactional data without an invalidation strategy (a sketch of explicit invalidation follows this list).
  • Feature serving without versioning leading to inconsistent ML inference results.
  • Materialized views for rare queries causing high storage usage.
  • Underestimating costs for replication and storage.
  • Missing end-to-end validation of data quality in serving paths.
  • Complex invalidation logic leads to hard-to-find bugs.
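
As referenced above, a minimal sketch of explicit invalidation on write, in contrast to ad-hoc TTL-only caching; the dict-based store and cache are stand-ins for real systems and are assumptions for illustration.

```python
# Stand-ins for a transactional store and a shared cache.
store: dict[str, float] = {"sku-1": 19.99}
cache: dict[str, float] = {}


def read_price(sku: str) -> float:
    if sku not in cache:
        cache[sku] = store[sku]       # read-through on a miss
    return cache[sku]


def write_price(sku: str, price: float) -> None:
    store[sku] = price
    cache.pop(sku, None)              # invalidate immediately on write


print(read_price("sku-1"))    # 19.99, now cached
write_price("sku-1", 17.99)   # cache entry dropped with the write
print(read_price("sku-1"))    # 17.99, re-read from the store
```
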
Required skills

  • Distributed system architecture and consistency models
  • Data modeling and query optimization
  • Operation and observability of distributed services
Key considerations

  • Consumers' latency and availability requirements
  • Access patterns (reads vs. writes, batch vs. real-time)
  • Consistency and reproducibility requirements (e.g., for ML)
  • Budget and cost constraints for redundant storage
  • Existing data models and compliance requirements
  • Limitations from third-party services or cloud quotas