concept#Data#Analytics#Architecture#Platform

Dimensional Modeling

A pragmatic approach to modeling data warehouses that structures data into fact and dimension tables for efficient analysis and query performance.

Dimensional modeling is a pragmatic modeling paradigm for analytical databases and data warehouses.
Established
Medium

Classification

  • Medium
  • Technical
  • Architectural
  • Intermediate

Technical context

  • ETL/ELT tools (e.g., dbt, Informatica, Azure Data Factory)
  • BI and visualization tools (e.g., Power BI, Tableau)
  • Data catalog and MDM systems

Principles & goals

  • Define grain first: each fact table must have a clearly defined granularity (see the sketch below).
  • Conformed dimensions promote reuse and consistent analytics.
  • Simplicity over normalization: prioritize simplicity for analytic performance.
Build
Domain, Team
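
As a minimal sketch of the "define grain first" principle (pandas, with illustrative table and column names), the snippet below declares the grain of a sales fact and checks at load time that the data actually honours it:

```python
import pandas as pd

# Hypothetical sales fact from a staging area; all names are illustrative.
sales_fact = pd.DataFrame({
    "date_key":     [20240101, 20240101, 20240102],
    "store_key":    [1, 2, 1],
    "product_key":  [10, 10, 20],
    "customer_key": [500, 501, 500],
    "quantity":     [3, 1, 2],
    "revenue":      [29.97, 9.99, 19.98],
})

# Declared grain: one row per date, store, product and customer.
GRAIN = ["date_key", "store_key", "product_key", "customer_key"]

# Load-time check that the data matches the declared grain; duplicates here
# signal a grain violation that has to be resolved upstream.
duplicates = sales_fact.duplicated(subset=GRAIN, keep=False)
assert not duplicates.any(), f"grain violated in {int(duplicates.sum())} rows"
```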

Use cases & scenarios

Compromises

  • Incorrect grain determination leads to inconsistent or irreparable data models.
  • Uncoordinated conformed dimensions cause inconsistencies across datamarts.
  • Excessive denormalization can increase ETL complexity and error proneness.
  • Define grain early and document it clearly.
  • Use conformed dimensions for cross-domain consistency.
  • Handle slowly changing dimensions explicitly and consistently.

I/O & resources

  • Source system tables and CSV/log feeds
  • Business reporting requirements and metric definitions
  • Data catalog and domain glossary
  • Fact tables with defined grain
  • Dimension tables (conformed or domain-specific)
  • Documentation of grain, keys and SCD strategies

Description

Dimensional modeling is a pragmatic modeling paradigm for analytical databases and data warehouses. It organizes data into fact and dimension tables (star or snowflake schema) to optimize query performance, analytical usability, and understandability. Key concerns include grain, conformed dimensions, and slowly changing dimensions. It supports fast aggregations and clear reporting paths.
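
A minimal sketch of a star schema in pandas (all table and column names are illustrative): a sales fact table at a clearly defined grain joins to product and date dimensions via surrogate keys, and typical reports are simple join-then-aggregate queries.

```python
import pandas as pd

# Dimension tables with surrogate keys and descriptive attributes.
dim_product = pd.DataFrame({
    "product_key": [10, 20],
    "product_name": ["Widget", "Gadget"],
    "category": ["Hardware", "Hardware"],
})
dim_date = pd.DataFrame({
    "date_key": [20240101, 20240102],
    "month": ["2024-01", "2024-01"],
})

# Fact table at the grain of one row per date and product.
fact_sales = pd.DataFrame({
    "date_key": [20240101, 20240101, 20240102],
    "product_key": [10, 20, 10],
    "revenue": [100.0, 250.0, 80.0],
})

# Typical analytical query path: simple joins from the fact to its
# dimensions, then an aggregation (here: revenue per month and category).
report = (
    fact_sales
    .merge(dim_date, on="date_key")
    .merge(dim_product, on="product_key")
    .groupby(["month", "category"], as_index=False)["revenue"]
    .sum()
)
print(report)
```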

  • Improved query performance via simple joins and optimized aggregations.
  • Better understandability for business users and BI developers.
  • Facilitates incremental loads and aggregation strategies.

  • Can lead to redundancy and increased storage consumption.
  • Not ideal for highly transactional OLTP scenarios.
  • Complex slowly changing dimensions require additional implementation effort.

  • Query latency (p95)

    95th percentile of query response time for analytical queries.

  • ETL duration

    Total duration of daily/periodic load processes.

  • Storage per fact table

    Storage consumption of individual fact tables for cost estimation.
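
As a small illustration of the query latency metric above (values and variable names are made up), the 95th percentile can be computed directly from collected response times:

```python
import numpy as np

# Hypothetical response times (seconds) collected for analytical queries.
query_times_s = [0.8, 1.2, 0.9, 3.4, 1.1, 0.7, 2.0, 1.5]

# 95th percentile of query latency, the p95 tracked by the metric above.
p95 = np.percentile(query_times_s, 95)
print(f"p95 query latency: {p95:.2f} s")
```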

Classic retail star schema

Sales fact table with product, time, store and customer dimensions for fast revenue aggregations.

Snowflake schema to reduce redundancy

Dimensions normalized when attribute hierarchies are large and storage optimization is required.
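
A hedged sketch of the snowflake variant (pandas, invented names): large attribute hierarchies are split out of the product dimension into their own table, trading an extra join at query time for reduced redundancy.

```python
import pandas as pd

# Snowflaked dimensions: the product dimension carries only a foreign key to
# the category hierarchy instead of repeating category attributes per product.
dim_category = pd.DataFrame({
    "category_key": [1, 2],
    "category_name": ["Hardware", "Accessories"],
    "department": ["Electronics", "Electronics"],
})
dim_product = pd.DataFrame({
    "product_key": [10, 20, 30],
    "product_name": ["Widget", "Gadget", "Cable"],
    "category_key": [1, 1, 2],
})

# Reassembling the flat (star-style) view costs one extra join at query time,
# which is the storage-versus-simplicity trade-off of the snowflake schema.
flat_product = dim_product.merge(dim_category, on="category_key")
print(flat_product)
```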

Conformed dimensions for cross-area reports

Shared customer dimension reused across sales, marketing and finance datamarts.
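
A minimal sketch of a conformed dimension (illustrative names): sales and marketing facts reference the same customer dimension and surrogate keys, so figures from both datamarts can be compared on the same attributes.

```python
import pandas as pd

# One shared, conformed customer dimension used by several datamarts.
dim_customer = pd.DataFrame({
    "customer_key": [500, 501],
    "customer_name": ["Acme Corp", "Globex"],
    "segment": ["Enterprise", "SMB"],
})

# Facts from two business areas, both keyed on the same customer_key.
fact_sales = pd.DataFrame({"customer_key": [500, 501, 500], "revenue": [100.0, 40.0, 60.0]})
fact_campaign = pd.DataFrame({"customer_key": [500, 501], "campaign_cost": [15.0, 5.0]})

# Because both facts conform to the same dimension, cross-area reporting
# (e.g., revenue versus campaign cost per segment) is a straightforward join.
revenue = fact_sales.merge(dim_customer, on="customer_key").groupby("segment")["revenue"].sum()
cost = fact_campaign.merge(dim_customer, on="customer_key").groupby("segment")["campaign_cost"].sum()
print(pd.concat([revenue, cost], axis=1))
```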

  1. Stakeholder workshops to capture metrics and analysis use cases.
  2. Define the grain for each planned fact table.
  3. Design dimensions and decide on conformance.
  4. Implement ETL/ELT processes including SCD strategies (see the sketch after this list).
  5. Test, validate with business teams and optimize performance.
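
For step 4, the following sketch shows one common SCD Type 2 pattern in pandas (all names, columns and the change-detection logic are illustrative assumptions): a changed customer attribute closes the currently valid dimension row and appends a new versioned row.

```python
import pandas as pd

# Current state of a customer dimension with SCD Type 2 housekeeping columns.
dim_customer = pd.DataFrame({
    "customer_key": [1],                         # surrogate key
    "customer_id": ["C-100"],                    # natural/business key
    "segment": ["SMB"],
    "valid_from": [pd.Timestamp("2023-01-01")],
    "valid_to": [pd.NaT],                        # open-ended for the current row
    "is_current": [True],
})

# Incoming change from the source system: the customer moved to a new segment.
change = {"customer_id": "C-100", "segment": "Enterprise"}
load_date = pd.Timestamp("2024-06-01")

# Close the currently valid row for this business key ...
mask = (dim_customer["customer_id"] == change["customer_id"]) & dim_customer["is_current"]
dim_customer.loc[mask, "valid_to"] = load_date
dim_customer.loc[mask, "is_current"] = False

# ... and append a new row with the changed attribute and a new surrogate key.
new_row = {
    "customer_key": int(dim_customer["customer_key"].max()) + 1,
    "customer_id": change["customer_id"],
    "segment": change["segment"],
    "valid_from": load_date,
    "valid_to": pd.NaT,
    "is_current": True,
}
dim_customer = pd.concat([dim_customer, pd.DataFrame([new_row])], ignore_index=True)
print(dim_customer)
```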

⚠️ Technical debt & bottlenecks

  • Temporary denormalizations without later refactoring.
  • Missing documentation of grain and key relationships.
  • Ad-hoc local dimensions created without conformance review.
  • ETL performance
  • High cardinality in dimensions
  • Grain mismatch between fact tables
  • Using dimensional models for highly transactional workloads without adjustments.
  • Creating many narrow facts without clear grain definition.
  • Lack of coordination on shared dimensions between teams.
  • Under- or over-specifying the grain, which makes later corrections difficult.
  • Underestimating the effort for SCD implementation.
  • Not establishing governance for conformed dimensions.
  • Data modeling and schema design
  • ETL/ELT development and performance tuning
  • Knowledge of BI/reporting requirements
  • Query performance and scalability
  • Data consistency and reuse (conformed dimensions)
  • ETL complexity and load windows
  • Existing source systems with limited history
  • Limited compute or storage resources
  • Regulatory requirements for data retention