concept#Security#Data#Architecture

Hashing Algorithms

Deterministic functions producing fixed-size digests from arbitrary input, used for integrity checks, indexing and cryptographic primitives.

Established
Medium

Classification

  • Medium
  • Technical
  • Architectural
  • Intermediate

Technical context

  • TLS/SSL stacks, certificate infrastructure
  • Content-addressed storage systems (e.g., object/blob stores)
  • Authentication and secrets-management systems

Principles & goals

  • Choose algorithms with proven cryptographic strength and active maintenance.
  • Do not rely on hashes alone for authentication; combine with salts/KDFs for passwords.
  • Consider performance, compatibility and migration requirements when selecting.
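
The principle of not relying on a bare hash for authentication can be illustrated with a keyed construction. The following is a minimal sketch, assuming a shared secret key and HMAC-SHA256 from Python's standard library; the function names `sign` and `verify` and the sample message are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for `message` under `key`."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


key = secrets.token_bytes(32)  # hypothetical shared secret
tag = sign(key, b"order=42;amount=10")
```

Unlike a plain digest, the tag cannot be recomputed by someone who can modify the message but does not hold the key.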
Build
Enterprise, Domain, Team

Use cases & scenarios

Compromises

  • Using broken algorithms (e.g., MD5, SHA-1) exposes systems to collision attacks that undermine integrity guarantees.
  • Incorrect implementations (e.g., digest comparisons that are not constant-time) can enable timing attacks or other side-channel leaks.
  • Ignoring compatibility requirements complicates later migration to stronger algorithms.

  • Use modern, recommended algorithms (e.g., SHA-2, SHA-3 or BLAKE2 for general hashing; Argon2 as a password KDF).
  • Use vetted libraries instead of implementing cryptography yourself.
  • Store metadata (algorithm, parameters, salt) alongside the hash for future verification.
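
The practices above (vetted library, salt, stored metadata) can be sketched for password storage as follows. Argon2 is not part of Python's standard library, so this sketch swaps in PBKDF2-HMAC-SHA256 from `hashlib`. The record format, the `pbkdf2_sha256` label, the function names and the default iteration count are assumptions for illustration, not a prescribed scheme.

```python
import hashlib
import hmac
import secrets

ALGORITHM = "pbkdf2_sha256"   # hypothetical label stored with every record
DEFAULT_ITERATIONS = 600_000  # assumption: tune to your own latency budget


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> str:
    """Derive a key from the password and return it with its metadata."""
    salt = secrets.token_bytes(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Algorithm, parameters and salt travel with the hash so that it can
    # still be verified (or migrated) after the defaults change.
    return f"{ALGORITHM}${iterations}${salt.hex()}${derived.hex()}"


def verify_password(password: str, record: str) -> bool:
    """Re-derive the key using the stored parameters and compare in constant time."""
    algorithm, iterations, salt_hex, hash_hex = record.split("$")
    if algorithm != ALGORITHM:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(derived, bytes.fromhex(hash_hex))
```

Because the parameters are read back from the record, raising `DEFAULT_ITERATIONS` later does not invalidate existing entries.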

I/O & resources

  • Input data (bytes) or streams to hash
  • Security requirements (e.g., collision resistance)
  • Performance and compatibility requirements

  • Fixed digest/hash value
  • Metadata (algorithm version, salt, KDF parameters)
  • Integrity or consistency indicators for downstream processes

Description

Hashing algorithms are deterministic functions that map arbitrary input to fixed-size digests, used for integrity checks, authentication primitives, and indexing. Depending on the use case, they prioritize collision resistance, preimage resistance or speed. Choosing one involves trade-offs between security, performance and compatibility; deprecated algorithms (MD5, SHA-1) must be avoided for any security-relevant use.
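
The two defining properties, determinism and fixed digest size, can be observed directly. A minimal sketch using SHA-256 from Python's standard library; the variable names are illustrative.

```python
import hashlib

digest_a = hashlib.sha256(b"abc").hexdigest()
digest_b = hashlib.sha256(b"abc").hexdigest()                   # same input again
digest_long = hashlib.sha256(b"abc" * 1_000_000).hexdigest()    # much larger input
digest_changed = hashlib.sha256(b"abd").hexdigest()             # one byte differs

print(digest_a == digest_b)             # True: deterministic
print(len(digest_a), len(digest_long))  # 64 64: fixed size (256 bits as hex)
print(digest_a == digest_changed)       # False: a small change alters the digest
```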

  • Efficient integrity checks and comparison of large data sets.
  • Enables content-addressing and simple indexing.
  • Fundamental building block for many cryptographic protocols and signature schemes.

  • No confidentiality: digests cannot be reversed, but they are not secret either, and low-entropy inputs can be recovered by guessing.
  • Deprecated algorithms are vulnerable to collision and other attacks.
  • Insufficient on their own for password storage; a per-entry salt and a dedicated KDF are required.

  • Collision probability

    Probability that two distinct inputs produce the same digest; important to assess security.

  • Throughput (MB/s)

    Amount of data processed per second for a given implementation/hardware.

  • Latency per hash

    Time to compute a single hash; relevant for real-time applications.
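
The three metrics above can be estimated with a short script. This is a rough sketch, not a benchmark harness: the collision estimate uses the standard birthday-bound approximation p ≈ 1 − exp(−n(n−1) / 2^(b+1)) for n items and a b-bit digest, and the timing loop measures one Python process on whatever hardware runs it. The function names and default sizes are assumptions.

```python
import hashlib
import math
import time


def collision_probability(num_items: int, digest_bits: int) -> float:
    """Birthday-bound approximation of at least one collision among num_items digests."""
    exponent = -num_items * (num_items - 1) / 2 ** (digest_bits + 1)
    return -math.expm1(exponent)  # 1 - exp(exponent), accurate for tiny values


def measure(algorithm: str, payload_size: int = 1 << 20, rounds: int = 20):
    """Return (throughput in MB/s, latency in seconds per hash) for one algorithm."""
    payload = bytes(payload_size)  # zero-filled test payload
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(algorithm, payload).digest()
    elapsed = time.perf_counter() - start
    throughput_mb_s = payload_size * rounds / elapsed / 1e6
    latency_s = elapsed / rounds
    return throughput_mb_s, latency_s


if __name__ == "__main__":
    print(collision_probability(77_163, 32))   # close to 0.5 for a 32-bit digest
    print(collision_probability(10**9, 256))   # negligible for a 256-bit digest
    for name in ("sha256", "sha3_256", "blake2b"):
        mb_s, latency = measure(name)
        print(f"{name}: {mb_s:.0f} MB/s, {latency * 1e3:.2f} ms per 1 MiB hash")
```

Results vary widely with payload size, CPU features and library build, so figures from one machine should not be generalized.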

Git object hashes (historically SHA-1)

Git uses hashes for content-addressed identification of commits and other objects; a transition from SHA-1 to SHA-256 is underway.
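
Content addressing in Git can be reproduced in a few lines: a blob's ID is the hash of a short header (`blob`, the content length, a NUL byte) followed by the content. The sketch below assumes that format; the function name `git_blob_id` and its `algorithm` parameter are illustrative and not part of any Git API.

```python
import hashlib


def git_blob_id(content: bytes, algorithm: str = "sha1") -> str:
    """Compute a blob object ID the way Git does for file contents."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.new(algorithm, header + content).hexdigest()


# Git's well-known ID for the empty blob in SHA-1 repositories:
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Identical content always yields the same ID, which is what lets Git deduplicate objects and detect corruption.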

SHA-256 in TLS and certificates

SHA-256 is widely used as the digest in TLS certificate signatures and for integrity checks along certificate chains.

BLAKE2 for high-performance integrity checks

BLAKE2 offers high speed and good cryptographic properties; popular in performance-critical systems.
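
For large inputs, an integrity check is usually computed incrementally rather than by loading everything into memory. A minimal sketch using `hashlib.blake2b` from Python's standard library; the function name, chunk size and 32-byte digest size are assumptions chosen for illustration.

```python
import hashlib
import io
from typing import BinaryIO


def blake2b_stream(stream: BinaryIO, chunk_size: int = 64 * 1024, digest_size: int = 32) -> str:
    """Hash a binary stream chunk by chunk and return the hex digest."""
    hasher = hashlib.blake2b(digest_size=digest_size)
    while chunk := stream.read(chunk_size):
        hasher.update(chunk)
    return hasher.hexdigest()


data = b"example payload " * 10_000
checksum = blake2b_stream(io.BytesIO(data))  # a file opened with "rb" works the same way
```

The streamed digest is identical to hashing the whole input at once, so the chunk size only affects memory use and speed.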

1. Assess security and performance requirements and existing dependencies.

2. Select an appropriate, up-to-date algorithm and vetted libraries.

3. Implement with correct handling of salt/KDF parameters, test and plan migration strategy.

⚠️ Technical debt & bottlenecks

  • Legacy databases with MD5/SHA-1 hashes require migration effort.
  • Missing documentation of hash parameters used in systems.
  • Monolithic components that hardcode algorithms and block migration.
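
One way to avoid hardcoded algorithms is to tag every stored digest with the algorithm that produced it, so that legacy values stay verifiable while new values use the preferred algorithm. The following is a sketch of that idea only; the `algorithm:hexdigest` format, the set names and the function names are hypothetical.

```python
import hashlib
import hmac

SUPPORTED = {"sha256", "sha3_256", "blake2b"}
LEGACY = {"md5", "sha1"}  # accepted for verification only, never for new digests
PREFERRED = "sha256"      # assumption: change here to migrate new writes


def compute(data: bytes, algorithm: str = PREFERRED) -> str:
    """Return a self-describing digest of the form 'algorithm:hexdigest'."""
    if algorithm not in SUPPORTED | LEGACY:
        raise ValueError(f"unknown algorithm: {algorithm}")
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def verify(data: bytes, tagged: str) -> bool:
    """Verify data against a tagged digest using the algorithm named in the tag."""
    algorithm = tagged.partition(":")[0]
    return hmac.compare_digest(compute(data, algorithm), tagged)


def needs_migration(tagged: str) -> bool:
    """True if the digest was not produced with the currently preferred algorithm."""
    return tagged.partition(":")[0] != PREFERRED
```

A migration job can then walk stored records, verify each with its recorded algorithm, and rewrite those flagged by `needs_migration` while the original data is available.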

  • Compute cost for large datasets
  • Legacy compatibility with deprecated algorithms
  • I/O and throughput limits for parallel hashing

  • Using MD5 for password storage in a web application.
  • Using bare hashes without salt in multi-tenant systems.
  • Relying on hashes alone as an access control measure.

  • Omitting necessary metadata (algorithm version, salt) prevents later verification.
  • Assuming a long hash is automatically secure.
  • Insufficient vetting of libraries for side-channel behavior.

  • Fundamentals of cryptography and threat models
  • Experience with secure implementations and libraries
  • Knowledge of protocols and migration strategies

  • Integrity protection and data auditability
  • Performance and scalability requirements
  • Regulatory and compliance requirements for data security

  • Existing protocols may mandate specific hash algorithms.
  • Regulatory rules may prescribe minimum strengths.
  • Resource limits on embedded systems restrict choices.