gh-greyhaven-ai-claude-code…/skills/incident-response/templates/incident-timeline-template.md
2025-11-29 18:29:18 +08:00

Incident Timeline: [INCIDENT TITLE]

Incident ID: INC-YYYY-MM-DD-XXX
Severity: [SEV1 / SEV2 / SEV3]
Status: [Investigating / Mitigating / Resolved / Monitoring]
Started: [YYYY-MM-DD HH:MM UTC]
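The `INC-YYYY-MM-DD-XXX` pattern can be generated and checked mechanically. The following is a minimal sketch, assuming that `XXX` is a zero-padded three-digit sequence number for the day (the template does not define it) and that the date part is taken from the UTC start time. The helper names `make_incident_id` and `is_valid_incident_id` are hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumption: XXX is a three-digit, zero-padded sequence number.
INCIDENT_ID_RE = re.compile(r"INC-(\d{4})-(\d{2})-(\d{2})-(\d{3})")


def make_incident_id(started: datetime, sequence: int) -> str:
    """Build an ID from a timezone-aware start time and a per-day sequence number."""
    started_utc = started.astimezone(timezone.utc)
    return f"INC-{started_utc:%Y-%m-%d}-{sequence:03d}"


def is_valid_incident_id(incident_id: str) -> bool:
    """Check the overall shape and that the date part is a real calendar date."""
    match = INCIDENT_ID_RE.fullmatch(incident_id)
    if match is None:
        return False
    year, month, day, _sequence = match.groups()
    try:
        datetime(int(year), int(month), int(day))
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    started = datetime(2025, 11, 29, 18, 29, tzinfo=timezone.utc)
    print(make_incident_id(started, 7))
    print(is_valid_incident_id("INC-2025-13-01-001"))
```

Deriving the date from the UTC start time keeps the ID consistent with the `Started` field, which the template also records in UTC.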


Incident Overview

Impact:

  • Customer Impact: [All users / X% of users / Specific feature]
  • Services Affected: [List affected services]
  • Error Rate: [X%]
  • Revenue Impact: [$X estimated]

Symptoms:

  • [User-facing symptom 1]
  • [User-facing symptom 2]
  • [Metric: baseline → current]

Team

Incident Commander: @[name]
Technical Lead: @[name]
Communications Lead: @[name]
Scribe: @[name]
SMEs: @[name1], @[name2]

Channels:

  • Slack: #incident-XXX
  • Zoom: [link]
  • Status Page: [link]

Timeline

| Time (UTC) | Event | Action Taken | Owner | Status |
|------------|-------|--------------|-------|--------|
| [HH:MM] | [Alert fired / Issue detected] | [What was done] | @[name] | 🔴 Started |
| [HH:MM] | [IC joined] | [Declared severity, assigned roles] | @[IC] | 🔴 Investigating |
| [HH:MM] | [Discovery] | [What was found] | @[name] | 🔴 Investigating |
| [HH:MM] | [Root cause identified] | [What the root cause is] | @[name] | 🟡 Identified |
| [HH:MM] | [Mitigation started] | [What fix is being applied] | @[name] | 🟡 Mitigating |
| [HH:MM] | [Mitigation complete] | [Verification of fix] | @[name] | 🟢 Mitigated |
| [HH:MM] | [Incident resolved] | [All checks passing] | @[IC] | 🟢 Resolved |

Total Duration: [X] minutes/hours
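Total Duration can be derived from the first and last timeline rows instead of being worked out by hand. This is a rough sketch, assuming the rows are listed in chronological order, that the only date available is the one in the `Started` field, and that a time earlier than the previous row means the incident crossed midnight UTC. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def total_duration(start_date: str, times: list[str]) -> timedelta:
    """Return the span between the first and last HH:MM entries of the timeline."""
    if not times:
        raise ValueError("timeline has no entries")
    day = datetime.strptime(start_date, "%Y-%m-%d")
    previous = None
    stamps = []
    for hhmm in times:
        clock = datetime.strptime(hhmm, "%H:%M")
        stamp = day.replace(hour=clock.hour, minute=clock.minute)
        # Assumption: a time that goes backwards means we rolled past midnight.
        if previous is not None and stamp < previous:
            day += timedelta(days=1)
            stamp += timedelta(days=1)
        stamps.append(stamp)
        previous = stamp
    return stamps[-1] - stamps[0]


def format_duration(duration: timedelta) -> str:
    """Render a duration the way the template's Total Duration field expects."""
    minutes = int(duration.total_seconds() // 60)
    if minutes < 60:
        return f"{minutes} minutes"
    hours, remainder = divmod(minutes, 60)
    return f"{hours} h {remainder} min"


if __name__ == "__main__":
    span = total_duration("2025-11-29", ["23:50", "23:58", "00:35"])
    print(format_duration(span))
```

The rollover heuristic only handles incidents whose consecutive rows are less than 24 hours apart; longer incidents would need full dates in the Time column.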


Status Updates

Update #1 ([HH:MM UTC] - T+[X] min)

Status: [Investigating / Mitigating]
Root Cause: [Known / Unknown - investigating X]
Current Actions: [What team is doing]
Impact: [Current impact status]
ETA: [Estimated resolution time OR "Unknown"]
Next Update: [Time]

Update #2 ([HH:MM UTC] - T+[X] min)

[Same format as Update #1]

Final Update ([HH:MM UTC] - T+[X] min)

Status: Resolved
Root Cause: [Brief summary]
Fix Applied: [What was done]
Impact: Resolved
Monitoring: [Ongoing monitoring period]
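The `T+[X] min` value in each update heading is the number of minutes elapsed since the incident's `Started` time. A small sketch of how a heading could be produced, assuming both timestamps are in UTC and that the heading follows the `Update #N (HH:MM UTC - T+X min)` shape used above; `update_heading` is a hypothetical helper, not part of any tool.

```python
from datetime import datetime, timezone


def update_heading(number: int, started: datetime, posted: datetime) -> str:
    """Format a status-update heading with the elapsed minutes since the start."""
    if posted < started:
        raise ValueError("update cannot be posted before the incident started")
    elapsed_minutes = int((posted - started).total_seconds() // 60)
    return f"Update #{number} ({posted:%H:%M} UTC - T+{elapsed_minutes} min)"


if __name__ == "__main__":
    started = datetime(2025, 11, 29, 14, 5, tzinfo=timezone.utc)
    posted = datetime(2025, 11, 29, 14, 35, tzinfo=timezone.utc)
    print(update_heading(1, started, posted))
```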


Root Cause (Brief)

Immediate Cause: [What directly caused the issue]

Contributing Factors:

  1. [Factor 1]
  2. [Factor 2]
  3. [Factor 3]

Resolution Summary

Temporary Fix (if applicable):

  • [What was done to quickly mitigate]
  • [When it was applied]

Permanent Fix:

  • [What was done for long-term solution]
  • [When it was applied]

Verification:

  • [How we confirmed the fix worked]
  • [Metrics that returned to normal]

Communications

Internal

  • [HH:MM] - SEV1 declared in #incidents
  • [HH:MM] - Update #1 posted
  • [HH:MM] - Update #2 posted
  • [HH:MM] - Resolution announced

External

  • [HH:MM] - Status page: "Investigating"
  • [HH:MM] - Status page: "Identified"
  • [HH:MM] - Status page: "Monitoring"
  • [HH:MM] - Status page: "Resolved"
  • [HH:MM] - Customer email sent (if applicable)

Executive

  • [HH:MM] - Initial notification to CTO/CEO (SEV1 only)
  • [HH:MM] - Resolution summary sent

Next Steps

  • Full postmortem scheduled: [Date/Time]
  • Action items created in Linear
  • Runbook updated with new learnings
  • Monitoring improvements identified

Notes

[Any additional context, observations, or learnings captured during the incident]


Return to templates index