Catalog
method#Data#Architecture#Governance#Software Engineering

Data Modeling

A systematic method for defining and structuring data across conceptual, logical, and physical layers.

Data modeling is a structured method to define, organize, and document data structures across conceptual, logical, and physical layers.
Established
Medium

Classification

  • Medium
  • Technical
  • Architectural
  • Intermediate

Technical context

Data catalogs and metadata systemsETL/ELT pipelines and orchestration toolsCI/CD for data model and schema changes

Principles & goals

Separate conceptual from physical modelDefine clear entities and ownershipDocument semantics and business rules alongside structure
Discovery
Enterprise, Domain, Team

Use cases & scenarios

Compromises

  • Incorrect semantics lead to inconsistent integrations
  • Unclear ownership prevents effective maintenance
  • Excessive normalization can harm performance
  • Work iteratively: start small and verify
  • Maintain model, metadata and rules together
  • Introduce automated tests and validations

I/O & resources

  • Business requirements, processes, glossary
  • Existing schemas and sample data
  • Performance and compliance requirements
  • Conceptual, logical and physical data model
  • Mapping and migration documentation
  • Metadata and glossary for governance

Description

Data modeling is a structured method to define, organize, and document data structures across conceptual, logical, and physical layers. It promotes consistency, interoperability, and performance optimization through clear entities, attributes, relationships, and decisions about normalization and decomposition. Use cases include analytics schemas, integration mappings, and transactional database design.

  • Improved data quality and consistency
  • Easier integration and reuse
  • Better performance through appropriate physical design

  • Can incur initial effort and delays
  • Not all requirements can be fully modeled upfront
  • Over-modeling leads to complexity and maintenance overhead

  • Data redundancy score

    Measurement of duplicated information across entities.

  • Query latency

    Average response time of typical database queries.

  • Schema convergence

    Degree of alignment between source and target model after harmonization.

Enterprise reference data model

Building a central data model to harmonize customer and product master data across multiple systems.

Migration to microservices

Splitting a monolithic schema into service-specific models with clear interfaces and ownership.

Dimensional reporting model

Designing a data warehouse model to accelerate BI queries and simplify dashboards.

1

Stakeholder workshops for term definition

2

Create conceptual models and review cycles

3

Derive logical/physical schemas and test migration

⚠️ Technical debt & bottlenecks

  • Ad-hoc schemas without documentation
  • Temporary denormalization that was never cleaned up
  • Missing migration paths for schema changes
Lack of domain expertiseIncompatible source schemasPerformance issues with complex joins
  • Excessive generalization yields hard-to-understand models
  • Ignoring query profiles in physical design
  • Lack of synchronization between model and implementation
  • Too much detail in the conceptual model
  • Unclear naming conventions cause confusion
  • Unaddressed privacy requirements
Data modeling (ER, dimensional)SQL and database performanceDomain knowledge and stakeholder interviews
Scalability of data accessInteroperability between systemsData quality and governance requirements
  • Technical limits of target database
  • Regulatory requirements for metadata
  • Existing legacy systems and conversion effort