Type: Concept
Tags: Data, Architecture, Platform, Security

Data Storage

Fundamental concept for persistent retention of digital data, covering storage types, consistency and redundancy strategies, and operational considerations.

Established
Medium

Classification

  • Medium
  • Technical
  • Architectural
  • Intermediate

Technical context

  • Backup and archiving solutions
  • Databases and container orchestration
  • Monitoring and observability toolchain

Principles & goals

  • Classify data by access pattern and criticality
  • Immutability and versioning for auditability
  • Separation of control planes: performance vs. cost
Build
Enterprise, Domain, Team

Use cases & scenarios

Compromises

  • Data loss from insufficient replication or backups
  • Compliance and privacy breaches due to misconfiguration
  • Cost overruns from missing lifecycle management

  • Tier data by access frequency and criticality
  • Introduce automated lifecycle policies and cost monitoring
  • Perform regular recovery tests and maintain documentation
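
Tiering by access frequency and criticality can be sketched as a small classification rule. The following is a minimal illustration only: the `Dataset` fields, the tier names (`hot`, `warm`, `cold`) and the thresholds are assumptions chosen for the example, not values prescribed by this entry.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    reads_per_day: float  # observed access frequency (hypothetical metric)
    critical: bool        # business criticality flag (hypothetical field)


def choose_tier(ds: Dataset, hot_threshold: float = 100.0,
                warm_threshold: float = 1.0) -> str:
    """Map a dataset to a tier by access frequency and criticality.

    The thresholds are illustrative assumptions and should be derived
    from measured access profiles.
    """
    if ds.reads_per_day >= hot_threshold:
        return "hot"
    if ds.reads_per_day >= warm_threshold or ds.critical:
        # Critical data is kept at least warm so that it stays quickly reachable.
        return "warm"
    return "cold"


datasets = [
    Dataset("orders-db", reads_per_day=5000, critical=True),
    Dataset("audit-logs", reads_per_day=0.2, critical=True),
    Dataset("old-exports", reads_per_day=0.01, critical=False),
]
tiers = {ds.name: choose_tier(ds) for ds in datasets}
print(tiers)
```

Keeping the rule in one function makes it easy to review the thresholds alongside cost monitoring data and to adjust them as usage profiles change.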

I/O & resources

  • Data volume and growth forecasts
  • Access profiles and latency requirements
  • Compliance and security requirements

  • Specification of a storage architecture
  • Implemented storage tiers and lifecycle rules
  • Monitoring dashboards and recovery plans

Description

Data storage covers concepts, technologies and practices for persistent retention of digital information. It includes storage types (block, file, object), consistency and redundancy strategies, access patterns, backup, replication, scalability and cost considerations for on-premises, distributed or cloud-based environments. Robust storage architectures balance performance, availability and cost and support data integrity.

  • Reliable persistence and recoverability of data
  • Cost optimization via appropriate storage tiers
  • Scalability and performance tuning for workloads

  • Complexity with heterogeneous storage architectures
  • Costs for high availability and high performance
  • Latency constraints in remote or shared systems

  • Throughput (MB/s)

    Volume of data transferred per second; critical for batch and streaming workloads.

  • Latency (P95/P99)

    Time taken by read/write operations at the 95th and 99th percentiles, used to evaluate service performance.

  • Cost per TB/month

    Direct storage costs used for budgeting and architecture decisions.
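
The three metrics above can be computed from raw measurements roughly as follows. This is a sketch under stated assumptions: it uses the nearest-rank percentile definition, decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), and made-up sample values; the function names are hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of
    all samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def throughput_mb_s(bytes_transferred, seconds):
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return bytes_transferred / 1_000_000 / seconds


def cost_per_tb_month(monthly_cost, stored_bytes):
    """Monthly cost divided by stored volume, assuming 1 TB = 10**12 bytes."""
    return monthly_cost / (stored_bytes / 1e12)


# Illustrative latency samples in milliseconds (not real measurements).
latencies_ms = [4, 5, 5, 6, 6, 7, 8, 9, 12, 40]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)

print("P50:", p50, "ms, P95:", p95, "ms")
print("Throughput:", throughput_mb_s(1_500_000_000, 10), "MB/s")
print("Cost per TB/month:", cost_per_tb_month(2300, 100e12))
```

In this sample a single slow request dominates P95 while the median stays low, which is why tail percentiles rather than averages are used to evaluate service performance.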

Enterprise archive on S3-compatible object storage

An organization uses an S3-compatible object storage service for cost-efficient long-term archival, governed by lifecycle policies.
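
A lifecycle policy for such an archive might look like the sketch below, which builds a configuration in the shape used by the S3 lifecycle API (`Rules`, `Filter`, `Transitions`, `Expiration`). The prefix, day counts and the `GLACIER` storage class are assumptions for illustration; S3-compatible services do not necessarily support the same storage classes, so the values need to be checked against the service in use. The helper name `archive_lifecycle` is hypothetical.

```python
import json


def archive_lifecycle(prefix, archive_after_days, expire_after_days,
                      storage_class="GLACIER"):
    """Build an S3-style lifecycle configuration for one key prefix.

    storage_class is an assumption: available classes differ between
    S3-compatible services.
    """
    if expire_after_days <= archive_after_days:
        raise ValueError("expiration must come after the archive transition")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Move objects to the cheaper class after the given age.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": storage_class}
                ],
                # Delete objects once the retention period has passed.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = archive_lifecycle("invoices/", archive_after_days=90,
                           expire_after_days=3650)
print(json.dumps(config, indent=2))
```

With boto3, a dictionary of this shape is typically passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; that call is left out here so the example runs without credentials or network access.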

Distributed block storage for relational clusters

Critical databases rely on distributed block storage with replication and consistent snapshots.

Data lake on object-based storage for analytics

An analytics platform uses object storage as a data lake, with versioning and metadata indexing.

  1. Capture requirements and perform data classification.
  2. Design architecture with appropriate storage tiers and access paths.
  3. Set up a proof-of-concept and run performance and recovery tests.
  4. Go-live with monitoring, alerting and lifecycle policies.
  5. Regular reviews, cost optimization and adjustment to usage profiles.
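
The recovery tests mentioned in the steps above can be automated with a checksum comparison between the source data and the restored copy. The sketch below is a minimal illustration: the file copies stand in for a real backup and restore job, and the function names are hypothetical. In practice the source checksum would usually be recorded at backup time, since the original may no longer exist when a restore is needed.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original, restored):
    """A restore passes if both files have identical checksums."""
    return sha256_of(original) == sha256_of(restored)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data.bin"
    source.write_bytes(b"example payload" * 1000)

    backup = Path(tmp) / "data.bin.bak"
    shutil.copyfile(source, backup)      # stand-in for a real backup job

    restored = Path(tmp) / "restored.bin"
    shutil.copyfile(backup, restored)    # stand-in for a real restore
    restore_ok = verify_restore(source, restored)

    # Simulate a damaged restore to show that the check catches it.
    restored.write_bytes(b"corrupted")
    corruption_detected = not verify_restore(source, restored)

print("restore ok:", restore_ok)
print("corruption detected:", corruption_detected)
```

Running a check like this on a schedule, and alerting on failures, turns the restore test into a monitored routine rather than a manual task.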

⚠️ Technical debt & bottlenecks

  • Legacy systems without lifecycle management incur growing costs
  • Incompatible storage APIs complicate migrations
  • Missing automation for tiering and replication

  • I/O bottlenecks
  • Network latency
  • Metadata scaling

  • Using expensive NVMe storage for infrequently accessed archives
  • Missing encryption for sensitive data in object storage
  • Scaling by adding volumes only instead of changing architecture
  • Underestimating metadata and management overhead
  • Ignoring network latencies for remote locations
  • Forgetting to regularly test restore procedures

  • Storage architecture and system design
  • Operation and monitoring of distributed systems
  • Knowledge of backup, replication and recovery strategies

  • Latency and throughput requirements of workloads
  • Security and compliance requirements
  • Cost and capacity planning

  • Budget constraints for storage hardware or cloud spend
  • Legal requirements for data locality
  • Existing dependencies on legacy systems