Catalog
Concept · Data · Analytics · Architecture · Software Engineering

Data Modeling

Concept for formal modeling of data structures, relationships, and business rules.

Data modeling is the structured translation of information needs into formal schemas, entities, attributes, and relationships.
Established
Medium

Classification

  • Medium
  • Technical
  • Architectural
  • Intermediate

Technical context

  • Relational databases (PostgreSQL, MySQL)
  • Data warehouse systems (Snowflake, Redshift)
  • API gateways and schema registries

Principles & goals

  • Domain semantics before technical optimization
  • Explicit normalization/denormalization decisions
  • Ensure evolvability and backward compatibility
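The normalization/denormalization decision can be sketched with an in-memory SQLite example; all table and column names here are illustrative assumptions, not from the source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: each fact stored once, joined at query time.
cur.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    amount REAL NOT NULL
);
""")
cur.execute("INSERT INTO customer VALUES (1, 'Acme')")
cur.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1, 100.0), (2, 1, 50.0)])

# Reporting query needs a join -- consistent, but pays join cost per read.
normalized = cur.execute("""
    SELECT c.name, SUM(o.amount)
    FROM orders o JOIN customer c ON c.id = o.customer_id
    GROUP BY c.name
""").fetchall()

# Denormalized reporting copy: customer name duplicated into each row,
# an explicit trade of redundancy for join-free reads.
cur.execute("""
CREATE TABLE order_report AS
SELECT o.id, c.name AS customer_name, o.amount
FROM orders o JOIN customer c ON c.id = o.customer_id
""")
denormalized = cur.execute(
    "SELECT customer_name, SUM(amount) FROM order_report GROUP BY customer_name"
).fetchall()

print(normalized)    # [('Acme', 150.0)]
print(denormalized)  # same result, no join at read time
```

The point is that the duplication is a documented decision, not an accident: the normalized tables stay the source of truth, and the report table is derived from them.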
Discovery
Enterprise, Domain, Team

Use cases & scenarios

Compromises

  • Static models block rapid product changes
  • Inconsistent domain interpretations across teams
  • Incorrect mappings cause data corruption

Mitigations

  • Start with a lean core model and extend iteratively
  • Document semantics and naming conventions clearly
  • Automate validation and tests against schemas
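Automating validation against schemas can be sketched with a minimal hand-rolled record check; the schema and field names are invented for illustration (a real project would typically use a schema language such as JSON Schema).

```python
# Minimal schema: field name -> (expected type, required). Illustrative only.
CUSTOMER_SCHEMA = {
    "id": (int, True),
    "name": (str, True),
    "email": (str, False),
}

def violations(record: dict, schema: dict) -> list[str]:
    """Return human-readable schema violations for one record."""
    errors = []
    for field, (ftype, required) in schema.items():
        if field not in record:
            if required:
                errors.append(f"missing required field: {field}")
        elif not isinstance(record[field], ftype):
            errors.append(f"{field}: expected {ftype.__name__}")
    for field in record:
        if field not in schema:
            errors.append(f"unknown field: {field}")  # catches ad-hoc fields
    return errors

assert violations({"id": 1, "name": "Acme"}, CUSTOMER_SCHEMA) == []
assert violations({"id": "1"}, CUSTOMER_SCHEMA) == [
    "id: expected int",
    "missing required field: name",
]
```

Checks like this can run in CI against data samples, so schema drift and undocumented ad-hoc fields surface before they reach production.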

I/O & resources

Inputs

  • Domain requirements and glossary
  • Existing schemas and data samples
  • Performance and scaling requirements

Outputs

  • Formal schemas (ER/UML/JSON Schema/OpenAPI)
  • Mapping and migration plans
  • Validation and governance rules

Description

Data modeling is the structured translation of information needs into formal schemas, entities, attributes, and relationships. It ensures data consistency, integrity, and analytical usability, and underpins databases, data warehouses, and APIs. Effective models balance domain semantics, performance, and evolvability, and guide integration and governance decisions.
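The entities, attributes, and relationships mentioned above can be sketched as a tiny metamodel; the class and field names are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    type: str
    required: bool = True

@dataclass
class Entity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)

@dataclass
class Relationship:
    source: str       # entity on the "one" side
    target: str       # entity on the "many" side
    cardinality: str  # e.g. "1:N"

# A two-entity fragment of a model: a customer places many orders.
customer = Entity("Customer", [Attribute("id", "int"),
                               Attribute("name", "str")])
order = Entity("Order", [Attribute("id", "int"),
                         Attribute("amount", "float")])
places = Relationship("Customer", "Order", "1:N")

print(f"{places.source} {places.cardinality} {places.target}")  # Customer 1:N Order
```

A machine-readable model like this can later be rendered as an ER diagram or translated into DDL, keeping the domain description and the physical schema in sync.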

Benefits

  • Improved data quality and consistency across systems
  • Better foundation for analytics and reporting
  • Clearer API and integration contracts

Drawbacks

  • Initial effort for analysis and modeling
  • Over-modeling leads to complexity and maintenance burden
  • Not all requirements can be fully captured upfront

  • Data consistency rate

    Share of records that conform to validation rules and references.

  • Schema change effort

    Time and effort to plan and deploy schema changes.

  • Average query latency

    Average response time for typical data-related queries.
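The first metric, data consistency rate, can be sketched as the share of records passing a validation check; the records and the rule here are invented for illustration.

```python
# Illustrative records; the rule (positive integer id, non-empty name)
# is an assumption for the sketch.
records = [
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": ""},          # violates the non-empty-name rule
    {"id": 3, "name": "Globex"},
    {"id": -1, "name": "Initech"},  # violates the positive-id rule
]

def is_consistent(rec: dict) -> bool:
    return (isinstance(rec.get("id"), int)
            and rec["id"] > 0
            and bool(rec.get("name")))

rate = sum(is_consistent(r) for r in records) / len(records)
print(f"data consistency rate: {rate:.0%}")  # data consistency rate: 50%
```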

Product catalog for e‑commerce

Modeling product variants, attributes, categories, and pricing to support search and personalization.

Customer master data in banking

Consolidated customer model to satisfy regulatory requirements and avoid duplicates.

Analytics event schema for usage metrics

Event-based definitions for consistent collection of usage data across platforms.
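An event-based definition like the one above can be sketched as a versioned schema envelope plus a conformance check; the event name and fields are assumptions for illustration.

```python
from datetime import datetime, timezone

# Illustrative versioned event definition for usage metrics.
EVENT_SCHEMA = {
    "name": "page_view",
    "version": 1,
    "required": ["event", "version", "user_id", "timestamp"],
}

def make_event(user_id: str, page: str) -> dict:
    return {
        "event": EVENT_SCHEMA["name"],
        "version": EVENT_SCHEMA["version"],
        "user_id": user_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "page": page,
    }

def conforms(event: dict) -> bool:
    # Every platform must emit the same required envelope fields.
    return all(key in event for key in EVENT_SCHEMA["required"])

evt = make_event("u-42", "/pricing")
assert conforms(evt)
```

Versioning the envelope lets consumers distinguish old and new event shapes during rollout instead of guessing from field presence.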

1. Stakeholder workshops to gather requirements
2. Reverse-engineer existing data sources
3. Design a core model and validate with domain teams
4. Define migration and governance processes
5. Iterative implementation, testing, and monitoring

⚠️ Technical debt & bottlenecks

  • Ad-hoc fields without documentation in production schema
  • Outdated mappings to legacy systems
  • Missing migration paths for critical attributes
  • Joins/query performance
  • Data quality and duplicates
  • Cross-system mappings
  • Fully normalizing a reporting data warehouse leads to slow reports
  • Schema changes that ignore API consumers break integrations
  • Abandoning the domain model in favor of technical details creates semantic inconsistencies
  • Modeling in detail before domain knowledge has stabilized
  • Insufficient tests for edge cases and inconsistencies
  • Missing governance for schema evolution
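The governance and migration pitfalls above suggest preferring additive, backward-compatible schema changes; a minimal SQLite sketch (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute("INSERT INTO customer VALUES (1, 'Acme')")

# Additive change: new optional column with a default, so existing
# readers and writers (old API consumers) keep working unchanged.
cur.execute("ALTER TABLE customer ADD COLUMN segment TEXT DEFAULT 'unknown'")

# Pre-existing rows report the default; old-style queries still succeed.
row = cur.execute("SELECT id, name, segment FROM customer").fetchone()
print(row)  # (1, 'Acme', 'unknown')
```

Destructive changes (dropping or retyping a column) would instead need a planned migration path and consumer coordination, which is exactly what the governance process above is for.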
  • Data modeling and ER design
  • SQL and database architecture
  • Domain analysis and domain modeling

  • Cross-system schema consistency
  • Query performance requirements
  • Ability for schema evolution and migration
  • Compatibility with legacy system schemas
  • Regulatory requirements for data retention
  • Technical limits of storage systems