concept | Architecture, Software Engineering, Software Quality, System Monitoring

Health Checks

Health Checks are systematic assessments of the health of software applications and systems.

Established
Medium

Classification

  • Medium
  • Technical
  • Technical
  • Intermediate

Technical context

  • AWS CloudWatch
  • Grafana
  • Prometheus

Principles & goals

  • Regularity of checks
  • Integration into existing systems
  • Consider user feedback
Run
Team, Domain

Use cases & scenarios

Compromises

  • Overlooking critical errors
  • Incorrect configuration of health checks
  • Lack of resources for monitoring

  • Conduct regular tests
  • Automate monitoring
  • Integrate security audits

I/O & resources

  • Credentials to systems
  • Monitoring tools
  • Technical documentation

  • Diagnostic reports
  • Monitoring data
  • System status reports

Description

Health Checks are procedures for verifying that software is functioning as intended. They help identify potential issues early and support system stability.

  • Early detection of issues
  • Improved system performance
  • Increased reliability

  • Limited testing scenarios
  • Dependency on monitoring tools
  • Possible false alarms
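
One common way to reduce false alarms is to alert only after several consecutive failed checks rather than on the first one. The following is a minimal sketch of that idea; the class name `FailureThreshold` and the default threshold of 3 are illustrative assumptions, not part of any particular monitoring tool.

```python
class FailureThreshold:
    """Fires an alert only after N consecutive failed checks (hypothetical helper)."""

    def __init__(self, threshold: int = 3):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, ok: bool) -> bool:
        """Record one check result; return True if an alert should fire."""
        if ok:
            # Any success resets the streak, so isolated blips are ignored.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


tracker = FailureThreshold(threshold=3)
# One isolated failure, a recovery, then three failures in a row.
alerts = [tracker.record(ok) for ok in [False, True, False, False, False, True]]
print(alerts)
```

The trade-off is detection delay: a higher threshold suppresses more transient failures but takes longer to report a real outage.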

  • Response Time

    The time taken to respond to a request.

  • Availability Rate

    The proportion of time the system is available.

  • Error Rate

    The percentage of requests or operations that result in an error.
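
The three metrics above could be derived from a window of recorded probe results, roughly as sketched below. The `ProbeResult` record and the `summarize` function are hypothetical names introduced for illustration. The sketch also assumes that availability is approximated by the share of successful probes, and that the error rate is measured on probes rather than on real user traffic.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProbeResult:
    """One health-check probe (hypothetical record shape)."""
    ok: bool            # True if the check succeeded
    latency_ms: float   # time taken to respond, in milliseconds


def summarize(results: list[ProbeResult]) -> dict[str, float]:
    """Compute response time, availability rate and error rate for a window."""
    if not results:
        raise ValueError("no probe results to summarize")
    total = len(results)
    failures = sum(1 for r in results if not r.ok)
    return {
        "avg_response_time_ms": mean(r.latency_ms for r in results),
        # Assumption: share of successful probes approximates time available.
        "availability_rate": (total - failures) / total,
        "error_rate_percent": 100.0 * failures / total,
    }


sample = [
    ProbeResult(ok=True, latency_ms=100.0),
    ProbeResult(ok=True, latency_ms=200.0),
    ProbeResult(ok=False, latency_ms=300.0),
    ProbeResult(ok=True, latency_ms=200.0),
]
summary = summarize(sample)
print(summary)
```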

Monitoring a microservice

A company implements health checks for its microservices to ensure availability.
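
As an illustration of this scenario, a microservice might expose a small HTTP endpoint that a monitor polls. The sketch below uses only the Python standard library; the `/health` path, the `{"status": "ok"}` payload, and the `probe` helper are assumptions chosen for the example, not a prescribed interface.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Serves a hypothetical /health endpoint."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers 200 with status 'ok'."""
    # Empty ProxyHandler: bypass any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
            return resp.status == 200 and payload.get("status") == "ok"
    except (OSError, ValueError):
        # Connection errors, HTTP errors and malformed JSON all count as unhealthy.
        return False


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), HealthHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

healthy = probe(f"http://127.0.0.1:{port}/health")
missing = probe(f"http://127.0.0.1:{port}/nope")
server.shutdown()
server.server_close()
print(healthy, missing)
```

In a real service the handler would typically also verify dependencies such as databases or queues before reporting a healthy status.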

Automated tests

Automated tests help to conduct health checks efficiently.

Continuous Integration

Health checks run as part of the CI process so that errors are identified early.
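
In a CI pipeline, a health check often acts as a gate: the job waits for a freshly started service to report healthy and fails the build if it never does. A minimal sketch of such a gate follows; `wait_until_healthy` and its parameters are hypothetical, and the stand-in check simply simulates a service that needs two attempts to come up.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 1.0,
) -> bool:
    """Call `check` up to `attempts` times; return True as soon as it passes."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            time.sleep(delay_s)  # give the service time to start
    return False


# Stand-in for a real probe: fails twice, then succeeds.
outcomes = iter([False, False, True])
passed = wait_until_healthy(lambda: next(outcomes), attempts=5, delay_s=0.01)
print(passed)

# A CI job could turn the result into its exit status, for example:
#   sys.exit(0 if passed else 1)
```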

1. Planning the implementation
2. Integrating health checks into existing systems
3. Reviewing and testing the implementation

⚠️ Technical debt & bottlenecks

  • Outdated monitoring systems
  • Lack of automation
  • Non-optimized infrastructure

  • Latency issues
  • Resource bottleneck
  • Technological dependencies

  • Not checking the configuration
  • Ignoring warnings
  • Setting health checks without testing

  • Implementing too many checks
  • Choosing the wrong metrics
  • Insufficient team knowledge

  • Knowledge of networking technology
  • Familiarity with monitoring tools
  • Programming skills

  • Usability
  • System integration
  • Performance assurance

  • Restricted software versions
  • Operating system dependencies
  • Limited infrastructure resources