Incident Response
Structured process for detecting, analysing and containing security incidents and restoring normal operations.
Classification
- Complexity: Medium
- Impact area: Organizational
- Decision type: Organizational
- Organizational maturity: Intermediate
Technical context
Principles & goals
Use cases & scenarios
Compromises
- Misjudging the scope or severity of an incident can lead to incorrect containment and follow-up issues.
- Sensitive information may be exposed during response.
- Over-automation can result in missing contextual analysis.
Best practices
- Run regular tabletop exercises with cross-functional teams.
- Version and review playbooks after every incident.
- Keep evidence preservation separate from recovery activities.
I/O & resources
Inputs
- Telemetry from SIEM, EDR, network logs
- Contact data and escalation matrix
- Playbooks, runbooks and verification procedures
Outputs
- Containment and recovery actions
- Forensic artifacts and analysis reports
- Improvement actions and updated playbooks
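The escalation matrix listed among the resources above can be kept as plain data so that it is easy to review and version. The following is a minimal sketch only: the severity labels, the role names and the `contacts_for` helper are hypothetical, and falling back to the highest tier for unknown severities is an assumed design choice.

```python
from typing import Dict, List

# Hypothetical escalation matrix: severity -> roles to notify, in order.
ESCALATION_MATRIX: Dict[str, List[str]] = {
    "low": ["on-call analyst"],
    "medium": ["on-call analyst", "incident lead"],
    "high": ["on-call analyst", "incident lead", "CISO"],
}


def contacts_for(severity: str) -> List[str]:
    """Return the roles to notify; unknown severities use the highest tier."""
    key = severity.strip().lower()
    # Assumption: when the severity is not recognised, over-escalate
    # rather than silently notify nobody.
    return list(ESCALATION_MATRIX.get(key, ESCALATION_MATRIX["high"]))
```

For example, `contacts_for("medium")` yields the on-call analyst followed by the incident lead.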
Description
Incident response is a structured process for detecting, assessing and containing security incidents and restoring normal operations. It covers preparation, detection, analysis, containment, eradication, recovery and lessons learned. The goal is to minimise damage, enable rapid recovery and continuously strengthen organisational resilience.
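The phases named in the description can be modelled as an ordered lifecycle that each incident moves through. This is a minimal sketch under the assumption that phases progress strictly in sequence (real incidents often loop back, for instance from eradication to analysis). The `Phase` and `Incident` names are hypothetical, and `RECOVERY` is an assumed label for restoring normal operations.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Phase(Enum):
    PREPARATION = 1
    DETECTION = 2
    ANALYSIS = 3
    CONTAINMENT = 4
    ERADICATION = 5
    RECOVERY = 6  # "restoring normal operations"; the label is an assumption
    LESSONS_LEARNED = 7


@dataclass
class Incident:
    title: str
    phase: Phase = Phase.DETECTION  # an incident record starts once detected
    history: List[Phase] = field(default_factory=list)

    def advance(self) -> Phase:
        """Move to the next phase and record the one just completed."""
        if self.phase is Phase.LESSONS_LEARNED:
            raise ValueError("incident is already in its final phase")
        self.history.append(self.phase)
        self.phase = Phase(self.phase.value + 1)
        return self.phase
```

Recording the phase history gives the lessons-learned review a simple trail of what was completed and in which order.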
✔ Benefits
- Faster restoration of services after security incidents.
- Reduction of damage scope and downtime.
- Improved transparency and accountability within the organisation.
✖ Limitations
- Requires continuous maintenance of playbooks and tools.
- Depends on quality of underlying telemetry.
- Response slows down when responsibilities are unclear.
Trade-offs
Metrics
- Mean Time to Detect (MTTD): average time from incident occurrence to detection.
- Mean Time to Respond (MTTR): average time from detection to initial response or containment.
- Number of recurring incidents: count of incidents that recur after closure.
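The three metrics above can be computed from basic incident timestamps. The sketch below is illustrative: the `IncidentRecord` fields are hypothetical, and MTTR is measured here from detection to containment, which is one possible reading of the definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import List


@dataclass
class IncidentRecord:
    occurred: datetime
    detected: datetime
    contained: datetime
    reopened: bool = False  # True if the incident recurred after closure


def mttd(records: List[IncidentRecord]) -> timedelta:
    """Mean time from occurrence to detection."""
    return timedelta(seconds=mean(
        (r.detected - r.occurred).total_seconds() for r in records))


def mttr(records: List[IncidentRecord]) -> timedelta:
    """Mean time from detection to containment (assumed starting point)."""
    return timedelta(seconds=mean(
        (r.contained - r.detected).total_seconds() for r in records))


def recurring_count(records: List[IncidentRecord]) -> int:
    """Number of incidents that recurred after closure."""
    return sum(1 for r in records if r.reopened)
```

Averaging over seconds and converting back to `timedelta` keeps the results readable when they are reported.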
Examples & implementations
Organisation with dedicated CSIRT
A company operates a dedicated Computer Security Incident Response Team with clear escalation and communication processes.
Cloud service provider with playbooks
A cloud provider uses standardised playbooks for common incidents and automated runbooks to speed up recovery.
Small team with external incident support
A startup relies on external specialists for forensic analysis while focusing internal resources on coordination and communication.
Implementation steps
- Establish an incident response team and role allocation.
- Create and test playbooks for common incidents.
- Integrate telemetry sources and establish alerting.
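One way to make the "create and test playbooks" step concrete is to express a playbook as an ordered list of steps that can be run against sample data. This is a sketch under stated assumptions: the `Step` type, `run_playbook` and the phishing playbook are hypothetical, and the lambdas stand in for real integrations such as a mail gateway or a paging tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Step:
    name: str
    action: Callable[[Dict[str, str]], bool]  # returns True on success


def run_playbook(steps: List[Step], context: Dict[str, str]) -> List[Tuple[str, bool]]:
    """Run steps in order and stop at the first failure."""
    results: List[Tuple[str, bool]] = []
    for step in steps:
        ok = step.action(context)
        results.append((step.name, ok))
        if not ok:
            break  # later steps depend on earlier ones succeeding
    return results


# Hypothetical phishing playbook; the actions only check the context.
phishing_playbook = [
    Step("confirm reporter", lambda ctx: "reporter" in ctx),
    Step("quarantine message", lambda ctx: "message_id" in ctx),
    Step("notify incident lead", lambda ctx: True),
]
```

Running the playbook with an incomplete context, for example one without a `message_id`, shows where it would stall before a real incident does.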
⚠️ Technical debt & bottlenecks
Technical debt
- Outdated playbooks and missing automation scripts.
- Fragmented log storage complicates correlation analysis.
- Insufficient documentation of recovery processes.
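To illustrate why fragmented log storage hampers correlation, the sketch below merges events from separate sources into a single per-host timeline. It assumes each source is already sorted by timestamp and that timestamps are comparable across sources (same clock and time zone). The `Event` fields and the `timeline` helper are hypothetical.

```python
import heapq
from datetime import datetime
from typing import Iterable, List, NamedTuple


class Event(NamedTuple):
    timestamp: datetime
    source: str   # e.g. "edr" or "network"
    host: str
    message: str


def timeline(host: str, *streams: Iterable[Event]) -> List[Event]:
    """Merge time-sorted streams and keep only events for one host."""
    # heapq.merge requires every input stream to be sorted by the same key.
    merged = heapq.merge(*streams, key=lambda e: e.timestamp)
    return [e for e in merged if e.host == host]
```

If the sources live in different stores with different retention or clock settings, these assumptions break, which is the correlation problem the list above describes.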
Known bottlenecks
Misuse examples
- Immediately restoring production systems without forensics.
- Publicly communicating sensitive details during an ongoing investigation.
- Automatically blocking accounts without an escalation path for legitimate exceptions.
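The last misuse can be avoided by routing known exceptions to a human instead of blocking them automatically. The following is a minimal sketch: the protected account names, the confidence score and its threshold are all hypothetical values chosen for illustration.

```python
from typing import Set

# Hypothetical exception list: accounts that must never be blocked automatically.
PROTECTED_ACCOUNTS: Set[str] = {"svc-backup", "breakglass-admin"}


def decide_block(account: str, confidence: float, threshold: float = 0.9) -> str:
    """Return "block", "escalate" or "monitor" for a suspicious account."""
    if account in PROTECTED_ACCOUNTS:
        return "escalate"  # a human decides on legitimate exceptions
    if confidence >= threshold:
        return "block"     # automated containment for clear-cut cases
    return "monitor"       # not enough evidence to act yet
```

Checking the exception list first means a high detection score alone can never lock out an account that the organisation has marked as critical.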
Typical traps
- Over-optimising for speed at the expense of contextual analysis.
- Unclear severity criteria lead to misprioritisation.
- Untested playbooks fail in real incidents.
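Unclear severity criteria become easier to spot once they are written down as explicit rules. The sketch below is one hypothetical rule set; the inputs and the thresholds (3 and 10 affected systems) are assumptions and would need to be agreed within the organisation.

```python
def classify_severity(systems_affected: int, sensitive_data: bool, service_down: bool) -> str:
    """Map a few observable facts to a severity label using fixed rules."""
    # Assumed rule: exposure of sensitive data is always high severity.
    if sensitive_data or (service_down and systems_affected >= 10):
        return "high"
    if service_down or systems_affected >= 3:
        return "medium"
    return "low"
```

Because the rules are a plain function, they can be covered by test cases and reviewed alongside the playbooks.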
Required skills
Architectural drivers
Constraints
- Limited forensic capacity for parallel incidents.
- Regulatory requirements for data retention and reporting.
- Restricted access to historical telemetry data.