Files
gh-anton-abyzov-specweave-p…/agents/sre/playbooks/05-ddos-attack.md
2025-11-29 17:56:41 +08:00

7.0 KiB

Playbook: DDoS Attack

Symptoms

  • Sudden traffic spike (10x-100x normal)
  • Legitimate users can't access service
  • High bandwidth usage (saturated)
  • Server overload (CPU, memory, network)
  • Monitoring alert: "Traffic spike", "Bandwidth >90%"

Severity

  • SEV1 - Production service unavailable due to attack

Diagnosis

Step 1: Confirm Traffic Spike

# Check current connections
netstat -ntu | wc -l

# Compare to baseline (normal: 100-500, attack: 10,000+)

# Check requests per second (nginx)
tail -f /var/log/nginx/access.log | awk '{print $4}' | uniq -c

Step 2: Identify Attack Pattern

Check connections by IP:

# Top 20 IPs by connection count
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -20

# Example output:
# 5000 192.168.1.100  ← Attacker IP
# 3000 192.168.1.101  ← Attacker IP
# 2 192.168.1.200    ← Legitimate user

Check HTTP requests by IP (nginx):

awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -20

Check request patterns:

# Check requested URLs
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -nr | head -20

# Check user agents (bots often have telltale user agents)
awk -F'"' '{print $6}' /var/log/nginx/access.log | sort | uniq -c | sort -nr

Step 3: Classify Attack Type

HTTP Flood (application layer):

  • Many HTTP requests from distributed IPs
  • Valid HTTP requests, just too many
  • Example: 10,000 requests/second to homepage

SYN Flood (network layer):

  • Many TCP SYN packets
  • Connection requests never complete
  • Exhausts server connection table

Amplification (DNS, NTP):

  • Small request → Large response
  • Attacker spoofs your IP
  • Servers send large responses to you

Mitigation

Immediate (Now - 5 min)

Option A: Block Attacker IPs (if few IPs)

# Block single IP (iptables)
iptables -A INPUT -s <ATTACKER_IP> -j DROP

# Block IP range
iptables -A INPUT -s 192.168.1.0/24 -j DROP

# Block specific country (using ipset + GeoIP)
# Advanced, see infrastructure team

# Impact: Blocks attacker, restores service
# Risk: Low (if attacker IPs identified correctly)

Option B: Enable Rate Limiting (nginx)

# Add to nginx.conf
http {
  # Define rate limit zone (10 req/s per IP)
  limit_req_zone $binary_remote_addr zone=one:10m rate=10r/s;

  server {
    location / {
      # Apply rate limit
      limit_req zone=one burst=20 nodelay;
      limit_req_status 429;
    }
  }
}

# Reload nginx
nginx -t && systemctl reload nginx

# Impact: Limits requests per IP
# Risk: Low (legitimate users rarely exceed 10 req/s)

Option C: Enable CloudFlare "Under Attack" Mode

# If using CloudFlare:
# 1. Log in to CloudFlare dashboard
# 2. Select domain
# 3. Click "Under Attack Mode"
# 4. Adds JavaScript challenge before serving content

# Impact: Blocks bots, allows legitimate browsers
# Risk: Low (slight user friction)

Option D: Enable AWS Shield (AWS)

# AWS Shield Standard: Free, automatic DDoS protection
# AWS Shield Advanced: $3000/month, enhanced protection

# CloudFormation:
aws cloudformation deploy \
  --template-file shield.yaml \
  --stack-name ddos-protection

# Impact: Absorbs DDoS at AWS edge
# Risk: None (AWS handles)

Short-term (5 min - 1 hour)

Option A: Add Connection Limits

# Limit concurrent connections per IP
limit_conn_zone $binary_remote_addr zone=addr:10m;

server {
  location / {
    limit_conn addr 10;  # Max 10 concurrent connections per IP
  }
}

Option B: Add CAPTCHA (reCAPTCHA)

<!-- Add reCAPTCHA to sensitive endpoints -->
<form action="/login" method="POST">
  <input type="email" name="email">
  <input type="password" name="password">
  <div class="g-recaptcha" data-sitekey="YOUR_SITE_KEY"></div>
  <button type="submit">Login</button>
</form>

Option C: Scale Up (cloud auto-scaling)

# AWS: Increase Auto Scaling Group desired capacity
aws autoscaling set-desired-capacity \
  --auto-scaling-group-name my-asg \
  --desired-capacity 20  # Was 5

# Impact: More capacity to handle attack
# Risk: Medium (costs money, may not fully mitigate)
# NOTE: Only do this if legitimate traffic also spiked

Long-term (1 hour+)

  • Enable CloudFlare or AWS Shield (DDoS protection service)
  • Implement rate limiting on all endpoints
  • Add CAPTCHA to login, signup, checkout
  • Configure auto-scaling (handle legitimate traffic spikes)
  • Add monitoring alert for traffic anomalies
  • Create DDoS response plan
  • Contact ISP for upstream filtering (if very large attack)
  • Review and update firewall rules
  • Add geographic blocking (if applicable)

Important Notes

DO NOT:

  • Scale up indefinitely (attack can grow, costs explode)
  • Fight DDoS at application layer alone (use CDN, cloud protection)

DO:

  • Use CDN/DDoS protection service (CloudFlare, AWS Shield, Akamai)
  • Enable rate limiting
  • Block attacker IPs/ranges
  • Monitor costs (auto-scaling can be expensive)

Escalation

Escalate to infrastructure team if:

  • Attack very large (>10 Gbps)
  • Need upstream filtering at ISP level

Escalate to security team:

  • All DDoS attacks (for post-mortem, legal action)

Contact ISP if:

  • Attack saturating internet connection
  • Need transit provider to filter

Contact CloudFlare/AWS if:

  • Using their DDoS protection
  • Need assistance enabling features

Prevention Checklist

  • Use CDN (CloudFlare, CloudFront, Akamai)
  • Enable DDoS protection (AWS Shield, CloudFlare)
  • Implement rate limiting (per IP, per user)
  • Add CAPTCHA to sensitive endpoints
  • Configure auto-scaling (within cost limits)
  • Monitor traffic patterns (detect spikes early)
  • Have DDoS response plan ready
  • Test response plan (tabletop exercise)


Post-Incident

After resolving:

  • Create post-mortem (mandatory for DDoS)
  • Identify attack vectors
  • Document attacker IPs, patterns
  • Report to ISP, CloudFlare (they may block attacker)
  • Review and improve DDoS defenses
  • Consider legal action (if attacker identified)
  • Update this runbook if needed

Useful Commands Reference

# Check connection count
netstat -ntu | wc -l

# Top IPs by connection count
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr | head -20

# Block IP (iptables)
iptables -A INPUT -s <IP> -j DROP

# Check nginx requests per second
tail -f /var/log/nginx/access.log | awk '{print $4}' | uniq -c

# List iptables rules
iptables -L -n -v

# Clear all iptables rules (CAREFUL!)
iptables -F

# Save iptables rules (persist after reboot)
iptables-save > /etc/iptables/rules.v4