Catalog
concept#Data#Analytics#Architecture#Integration#Platform

Data Warehousing

Central architectural concept for collecting, integrating and historizing enterprise data to support analytics and reporting.

Data warehousing is an architectural concept for centralizing, integrating, and historizing large enterprise data from diverse sources.
Established
High

Classification

  • High
  • Business
  • Architectural
  • Intermediate

Technical context

ERP and CRM systems (e.g., SAP, Salesforce)Streaming and event platformsBI tools and analytics platforms (e.g., Power BI, Tableau)

Principles & goals

Single source of truth: central consolidated data basisSeparation of storage and analytics: optimized formats for queriesGovernance and metadata: ensure traceability and quality
Build
Enterprise, Domain, Team

Use cases & scenarios

Compromises

  • Data inconsistencies with insufficient governance
  • Excessive centralization can restrict agility
  • Cost explosion from uncontrolled storage and compute growth
  • Iterative rollout: core domains first, gradual expansion
  • Introduce automated data quality monitoring
  • Define clear responsibilities for data products

I/O & resources

  • Source transactions from operational systems
  • Master data and reference tables
  • Metadata, schema definitions and business glossary
  • Cleansed fact and dimension tables
  • Data marts and query-optimized views
  • Reports, dashboards and analytical datasets

Description

Data warehousing is an architectural concept for centralizing, integrating, and historizing large enterprise data from diverse sources. It enables analytics, reporting, and data-driven decisions through structured schemas, ETL/ELT processes, and semantic models. Typical implementations use data marts, star or snowflake schemas, and specialized warehouse systems.

  • Consistent, cleansed data foundation for analytics
  • Improved performance for reporting workloads
  • Support for historical analyses and compliance

  • High initial effort for modeling and ETL/ELT pipelines
  • Possible latency between source and warehouse
  • Scaling large, heterogeneous datasets requires planning

  • Query latency (P95)

    Time in which 95% of queries are answered; important for user experience.

  • ETL run time

    Total duration of data load and transformation processes; affects data freshness.

  • Cost per TB and per query

    Monetary metrics to evaluate storage and query costs.

Retail data warehouse for sales analytics

Central warehouse integrating POS, online store and inventory data to analyze sales trends and assortment efficiency.

Financial reporting platform

Consolidated financial dataset enables monthly, quarterly and annual closings on a unified data basis.

Marketing attribution warehouse

Combining campaign data and web events for attribution and campaign optimization.

1

Stakeholder workshops to define goals and KPIs

2

Create source inventory and perform data profiling

3

Design logical and physical data model

4

Implement and automate ETL/ELT pipelines

5

Complete monitoring, testing and rollout planning

⚠️ Technical debt & bottlenecks

  • Legacy ETL scripts without tests and documentation
  • Ad-hoc partitioning and missing indexing strategy
  • Monolithic schemas without modularization into data marts
ETL runtimesQuery performance on complex joinsMetadata and governance processes
  • Using the warehouse as a replacement for operational OLTP systems
  • Directly loading raw, uncleaned data without profiling
  • Missing archival strategy leads to unaffordable costs
  • Unclear ownership of data products
  • Underestimating operational effort for ETL pipelines
  • Neglecting security and access controls
Data modeling (star/snowflake schema)ETL/ELT development and pipeline designData governance and quality management
Data consistency and referential integrityQuery performance and latency requirementsScalability and cost optimization
  • Availability and quality of source system data
  • Budget for storage and compute capacity
  • Compliance and data protection requirements