---
name: load-testing-patterns
description: Use when designing load tests, choosing tools (k6, JMeter, Gatling), calculating concurrent users from DAU, interpreting latency degradation, identifying bottlenecks, or running spike/soak/stress tests - provides test patterns, anti-patterns, and load calculation frameworks
---

# Load Testing Patterns

## Overview

**Core principle:** Test realistic load patterns, not constant artificial load. Find limits before users do.

**Rule:** Load testing reveals system behavior under stress. Without it, production is your load test.

## Tool Selection Decision Tree

| Your Need | Protocol | Team Skills | Use | Why |
|-----------|----------|-------------|-----|-----|
| Modern API testing | HTTP/REST/GraphQL | JavaScript | **k6** | Best dev experience, CI/CD friendly |
| Enterprise/complex protocols | HTTP/SOAP/JMS/JDBC | Java/GUI comfort | **JMeter** | Mature, comprehensive protocols |
| Python team | HTTP/WebSocket | Python | **Locust** | Pythonic, easy scripting |
| High performance/complex scenarios | HTTP/gRPC | Scala/Java | **Gatling** | Best reports, high throughput |
| Cloud-native at scale | HTTP/WebSocket | Any (SaaS) | **Artillery, Flood.io** | Managed, distributed |

**First choice:** k6 (modern, scriptable, excellent CI/CD integration)

**Why not ApacheBench/wrk:** Too simple for realistic scenarios, no complex user flows

## Test Pattern Library

| Pattern | Purpose | Duration | When to Use |
|---------|---------|----------|-------------|
| **Smoke Test** | Verify test works | 1-2 min | Before every test run |
| **Load Test** | Normal/peak capacity | 10-30 min | Regular capacity validation |
| **Stress Test** | Find breaking point | 20-60 min | Understand limits |
| **Spike Test** | Sudden traffic surge | 5-15 min | Black Friday, launch events |
| **Soak Test** | Memory leaks, stability | 1-8 hours | Pre-release validation |
| **Capacity Test** | Max sustainable load | Variable | Capacity planning |

### Smoke Test

**Goal:** Verify test script works with minimal load

```javascript
// k6 smoke test
export let options = {
  vus: 1,
  duration: '1m',
  thresholds: {
    http_req_duration: ['p(95)<500'],  // 95% < 500ms
    http_req_failed: ['rate<0.01'],     // <1% errors
  }
}
```

**Purpose:** Catch test script bugs before running expensive full tests

### Load Test (Ramp-Up Pattern)

**Goal:** Test normal and peak expected load

```javascript
// k6 load test with ramp-up
export let options = {
  stages: [
    { duration: '5m', target: 100 },   // Ramp to normal load
    { duration: '10m', target: 100 },  // Hold at normal
    { duration: '5m', target: 200 },   // Ramp to peak
    { duration: '10m', target: 200 },  // Hold at peak
    { duration: '5m', target: 0 },     // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.05'],
  }
}
```

**Pattern:** Gradual ramp-up → sustain → ramp down. Never start at peak.

### Stress Test (Breaking Point)

**Goal:** Find system limits

```javascript
// k6 stress test
export let options = {
  stages: [
    { duration: '5m', target: 100 },   // Normal
    { duration: '5m', target: 300 },   // 3x normal
    { duration: '5m', target: 600 },   // 6x normal
    { duration: '5m', target: 900 },   // 9x normal (expect failure)
    { duration: '10m', target: 0 },    // Recovery
  ]
}
```

**Success:** Identify the load at which the system starts to degrade; it does not have to break completely

### Spike Test (Sudden Surge)

**Goal:** Test sudden traffic bursts (viral post, email campaign)

```javascript
// k6 spike test
export let options = {
  stages: [
    { duration: '1m', target: 100 },   // Normal
    { duration: '30s', target: 1000 }, // SPIKE to 10x
    { duration: '5m', target: 1000 },  // Hold spike
    { duration: '2m', target: 100 },   // Back to normal
    { duration: '5m', target: 100 },   // Recovery check
  ]
}
```

**Tests:** Auto-scaling, circuit breakers, rate limiting

### Soak Test (Endurance)

**Goal:** Find memory leaks, resource exhaustion over time

```javascript
// k6 soak test
export let options = {
  stages: [
    { duration: '5m', target: 100 },   // Ramp
    { duration: '4h', target: 100 },   // Soak (sustained load)
    { duration: '5m', target: 0 },     // Ramp down
  ]
}
```

**Monitor:** Memory growth, connection leaks, disk space, file descriptors

**Duration:** Minimum 1 hour, ideally 4-8 hours
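
Memory growth during a soak is easier to judge as a trend than by eyeballing a graph. The sketch below fits a least-squares line through memory samples and reports growth per hour. The helper name `memoryGrowthPerHour`, the sample shape, and the numbers are all illustrative assumptions; the samples would come from your own monitoring system, not from k6.

```javascript
// Hypothetical helper: least-squares slope of memory samples, in MB per hour.
// samples: [{ minutes, memoryMB }, ...] exported from your monitoring system.
function memoryGrowthPerHour(samples) {
  const n = samples.length
  const meanX = samples.reduce((sum, p) => sum + p.minutes, 0) / n
  const meanY = samples.reduce((sum, p) => sum + p.memoryMB, 0) / n
  let num = 0
  let den = 0
  for (const p of samples) {
    num += (p.minutes - meanX) * (p.memoryMB - meanY)
    den += (p.minutes - meanX) ** 2
  }
  return (num * 60) / den  // slope per minute, scaled to per hour
}

// Made-up samples: memory climbs steadily at constant load.
const soakSamples = [
  { minutes: 0, memoryMB: 500 },
  { minutes: 60, memoryMB: 520 },
  { minutes: 120, memoryMB: 540 },
  { minutes: 180, memoryMB: 560 },
]
console.log(memoryGrowthPerHour(soakSamples))  // 20
```

A slope that stays clearly positive at constant load is a reason to look for leaks; a slope near zero after warm-up suggests memory has stabilized.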

## Load Calculation Framework

**Problem:** Convert "10,000 daily active users" to concurrent load

### Step 1: DAU to Concurrent Users

```
Concurrent Users = DAU × Concurrency Ratio × Peak Multiplier

Concurrency Ratios by App Type:
- Web apps: 5-10%
- Social media: 10-20%
- Business apps: 20-30% (work hours)
- Gaming: 15-25%

Peak Multiplier: 1.5-2x for safety margin
```

**Example:**
```
DAU = 10,000
Concurrency = 10% (web app)
Peak Multiplier = 1.5

Concurrent Users = 10,000 × 0.10 × 1.5 = 1,500 concurrent users
```

### Step 2: Concurrent Users to Requests/Second

```
RPS = (Concurrent Users × Requests per Session) / (Session Duration × Think Time Ratio)

Think Time Ratio:
- Active browsing: 0.3-0.5 (30-50% time clicking/typing)
- Reading-heavy: 0.1-0.2 (10-20% active)
- API clients: 0.8-1.0 (80-100% active)
```

**Example:**
```
Concurrent Users = 1,500
Requests per Session = 20
Session Duration = 10 minutes = 600 seconds
Think Time Ratio = 0.3 (web browsing)

RPS = (1,500 × 20) / (600 × 0.3) = 30,000 / 180 ≈ 167 RPS
```
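
The two steps above can be combined into one small helper. This is a minimal sketch, not a definitive model: `estimateLoad` and its parameter names are hypothetical, and the inputs are the example values from this section.

```javascript
// Hypothetical helper: applies the Step 1 and Step 2 formulas from this section.
function estimateLoad({ dau, concurrencyRatio, peakMultiplier,
                        requestsPerSession, sessionSeconds, thinkTimeRatio }) {
  // Step 1: DAU -> concurrent users
  const concurrentUsers = Math.round(dau * concurrencyRatio * peakMultiplier)
  // Step 2: concurrent users -> requests per second
  const rps = (concurrentUsers * requestsPerSession) / (sessionSeconds * thinkTimeRatio)
  return { concurrentUsers, rps: Math.round(rps) }
}

const estimate = estimateLoad({
  dau: 10000,
  concurrencyRatio: 0.10,
  peakMultiplier: 1.5,
  requestsPerSession: 20,
  sessionSeconds: 600,
  thinkTimeRatio: 0.3,
})
console.log(estimate)  // { concurrentUsers: 1500, rps: 167 }
```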

### Step 3: Model Realistic Patterns

Don't use constant load. Use realistic traffic patterns:

```javascript
// Realistic daily pattern
export let options = {
  stages: [
    // Morning ramp
    { duration: '2h', target: 500 },    // 08:00-10:00
    { duration: '2h', target: 1000 },   // 10:00-12:00 (peak)
    // Lunch dip
    { duration: '1h', target: 600 },    // 12:00-13:00
    // Afternoon peak
    { duration: '2h', target: 1200 },   // 13:00-15:00 (peak)
    { duration: '2h', target: 800 },    // 15:00-17:00
    // Evening drop
    { duration: '2h', target: 300 },    // 17:00-19:00
  ]
}
```

## Anti-Patterns Catalog

### ❌ Coordinated Omission
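**Symptom:** Closed-loop load generation (each VU waits for a response before sending its next request) sends fewer requests when the system slows down, so slow periods are under-sampled and latency is underestimated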

**Why bad:** Hides real latency impact when system slows down

**Fix:** Use arrival rate (requests/sec) not iteration rate

```javascript
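import http from 'k6/http'
import { sleep } from 'k6'

// ❌ Bad - coordinated omission (request rate drops whenever responses slow down)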
export default function() {
  http.get('https://api.example.com')
  sleep(1)  // Wait 1s between requests
}

// ✅ Good - arrival rate pacing
export let options = {
  scenarios: {
    constant_arrival_rate: {
      executor: 'constant-arrival-rate',
      rate: 100,  // 100 RPS regardless of response time
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 50,
      maxVUs: 200,
    }
  }
}
```
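
Arrival-rate executors need enough VUs to start new iterations while earlier ones are still in flight. By Little's law, the number of busy VUs is roughly the arrival rate multiplied by the iteration duration. The sketch below applies that rule; `vusForArrivalRate` is a hypothetical helper, not a k6 API, and the iteration durations are assumed values.

```javascript
// Hypothetical sizing helper (not part of k6).
// Little's law: busy VUs ≈ arrival rate × iteration duration.
function vusForArrivalRate(ratePerSecond, iterationSeconds, headroom = 1) {
  return Math.ceil(ratePerSecond * iterationSeconds * headroom)
}

// Assumed iteration durations: 0.5s when healthy, 2s when degraded.
const preAllocatedVUs = vusForArrivalRate(100, 0.5)
const maxVUs = vusForArrivalRate(100, 2)
console.log({ preAllocatedVUs, maxVUs })  // { preAllocatedVUs: 50, maxVUs: 200 }
```

If k6 reports a non-zero `dropped_iterations` metric, the VU pool was too small for the latency actually observed, and the target rate was not sustained.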

---

### ❌ Cold Start Testing
**Symptom:** Running load test immediately after deployment without warm-up

**Why bad:** JIT compilation, cache warming, connection pooling haven't stabilized

**Fix:** Warm-up phase before measurement

```javascript
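// ✅ Good - warm-up phase
// Note: k6 records metrics for every stage, so exclude the warm-up window when analyzing results
export let options = {
  stages: [
    { duration: '2m', target: 50 },    // Warm-up (exclude from analysis)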
    { duration: '10m', target: 100 },  // Actual test
  ]
}
```
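
One way to keep warm-up traffic out of pass/fail criteria is to run it as its own scenario and attach thresholds only to the measured scenario, since k6 tags each metric sample with its scenario name. The sketch below builds such an options object. `warmupThenMeasure` and the scenario names `warmup` and `measured` are hypothetical; in a k6 script the result would be exported as `options`.

```javascript
// Hypothetical helper: returns k6 options with a separate warm-up scenario.
// In a k6 script: export const options = warmupThenMeasure()
function warmupThenMeasure(warmupSeconds = 120, testMinutes = 10) {
  return {
    scenarios: {
      warmup: {
        executor: 'constant-vus',
        vus: 50,
        duration: `${warmupSeconds}s`,
        gracefulStop: '0s',  // Do not let warm-up iterations overlap the measured phase
      },
      measured: {
        executor: 'constant-vus',
        vus: 100,
        startTime: `${warmupSeconds}s`,  // Begins when the warm-up ends
        duration: `${testMinutes}m`,
      },
    },
    thresholds: {
      // Only samples tagged scenario:measured count toward this threshold
      'http_req_duration{scenario:measured}': ['p(95)<500'],
    },
  }
}

const warmupOptions = warmupThenMeasure()
console.log(warmupOptions.scenarios.measured.startTime)  // 120s
```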

---

### ❌ Unrealistic Test Data
**Symptom:** Using same user ID, same query parameters for all virtual users

**Why bad:** Caches give unrealistic performance, doesn't test real database load

**Fix:** Parameterized, realistic data

```javascript
// ❌ Bad - every VU requests the same record:
//   http.get('https://api.example.com/users/123')

// ✅ Good - parameterized data
import http from 'k6/http'
import { SharedArray } from 'k6/data'
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js'

const csvData = new SharedArray('users', function () {
  return papaparse.parse(open('./users.csv'), { header: true }).data
})

export default function() {
  const user = csvData[__VU % csvData.length]
  http.get(`https://api.example.com/users/${user.id}`)
}
```

---

### ❌ Constant Load Pattern
**Symptom:** Running with constant VUs instead of realistic traffic pattern

**Why bad:** Real traffic has peaks, valleys, not flat line

**Fix:** Use realistic daily/hourly patterns
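
A traffic shape can be written once as fractions of peak and then scaled to whatever VU target applies. The sketch below is one way to do that, assuming a hypothetical `buildStages` helper; the profile fractions are illustrative, not measured data.

```javascript
// Hypothetical helper: scales a traffic shape (fractions of peak) into k6 stages.
function buildStages(profile, peakVUs) {
  return profile.map(({ duration, fraction }) => ({
    duration,
    target: Math.round(peakVUs * fraction),
  }))
}

// Illustrative shape only: morning ramp, lunch dip, afternoon peak, evening drop.
const dailyProfile = [
  { duration: '2h', fraction: 0.4 },
  { duration: '2h', fraction: 0.8 },
  { duration: '1h', fraction: 0.5 },
  { duration: '2h', fraction: 1.0 },
  { duration: '2h', fraction: 0.25 },
]

const dailyStages = buildStages(dailyProfile, 1200)
console.log(dailyStages.map((s) => s.target))  // [ 480, 960, 600, 1200, 300 ]
```

In a k6 script the returned array would be assigned to `options.stages`.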

---

### ❌ Ignoring Think Time
**Symptom:** No delays between requests, hammering API as fast as possible

**Why bad:** Unrealistic user behavior, overestimates load

**Fix:** Add realistic think time based on user behavior

```javascript
// ✅ Good - realistic think time
import http from 'k6/http'
import { sleep } from 'k6'

export default function() {
  http.get('https://api.example.com/products')
  sleep(Math.random() * 3 + 2)  // 2-5 seconds browsing

  http.post('https://api.example.com/cart', { /* cart payload elided */ })
  sleep(Math.random() * 5 + 5)  // 5-10 seconds deciding

  http.post('https://api.example.com/checkout', { /* checkout payload elided */ })
}
```

## Result Interpretation Guide

### Latency Degradation Patterns

| Pattern | Cause | What to Check |
|---------|-------|---------------|
| **Linear growth** (2x users → 2x latency) | CPU-bound | Thread pool, CPU usage |
| **Exponential growth** (2x users → 10x latency) | Resource saturation | Connection pools, locks, queues |
| **Sudden cliff** (works until X, then fails) | Hard limit hit | Max connections, memory, file descriptors |
| **Gradual degradation** (slow increase over time) | Memory leak, cache pollution | Memory trends, GC activity |

### Bottleneck Classification

**Symptom: p95 latency 10x at 2x load**
→ **Resource saturation** (database connection pool, thread pool, queue)

**Symptom: Errors increase with load**
→ **Hard limit** (connection limit, rate limiting, timeout)

**Symptom: Latency grows over time at constant load**
→ **Memory leak** or **cache pollution**

**Symptom: High variance (p50 good, p99 terrible)**
→ **GC pauses**, **lock contention**, or **slow queries**
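
As a rough first-pass triage, the ratio between p95 latency at a baseline load and at double that load can be mapped onto the patterns above. The sketch below does this with a hypothetical `classifyScaling` function. The cutoffs (1.5 and 3) are assumptions chosen for illustration, not established thresholds, so tune them to your system.

```javascript
// Hypothetical classifier: compares p95 latency at a baseline load and at 2x that load.
// The cutoffs below are illustrative assumptions, not established thresholds.
function classifyScaling(p95AtBaseMs, p95AtDoubleMs) {
  const ratio = p95AtDoubleMs / p95AtBaseMs
  if (ratio < 1.5) return 'no significant degradation'
  if (ratio <= 3) return 'roughly linear: check CPU usage and thread pools'
  return 'superlinear: check connection pools, locks, and queues'
}

console.log(classifyScaling(100, 110))   // no significant degradation
console.log(classifyScaling(100, 200))   // roughly linear: check CPU usage and thread pools
console.log(classifyScaling(100, 1000))  // superlinear: check connection pools, locks, and queues
```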

### What to Monitor

| Layer | Metrics to Track |
|-------|------------------|
| **Application** | Request rate, error rate, p50/p95/p99 latency, active requests |
| **Runtime** | GC pauses (JVM, .NET), thread pool usage, heap/memory |
| **Database** | Connection pool usage, query latency, lock waits, slow queries |
| **Infrastructure** | CPU %, memory %, disk I/O, network throughput |
| **External** | Third-party API latency, rate limit hits |

### Capacity Planning Formula

```
Safe Capacity = (Breaking Point × Degradation Factor) × Safety Margin

Breaking Point = VUs where p95 latency > threshold
Degradation Factor = 0.7 (start degradation before break)
Safety Margin = 0.5-0.7 (handle traffic spikes)

Example:
- System breaks at 1000 VUs (p95 > 1s)
- Start seeing degradation at 700 VUs (70%)
- Safe capacity: 700 × 0.7 = 490 VUs
```
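
The formula above is simple enough to keep next to the test scripts. A minimal sketch, assuming a hypothetical `safeCapacity` helper whose defaults are the example factors from this section:

```javascript
// Hypothetical helper implementing the capacity planning formula above.
// Math.round absorbs floating-point error (700 * 0.7 is not exactly 490 in binary floating point).
function safeCapacity(breakingPointVUs, degradationFactor = 0.7, safetyMargin = 0.7) {
  return Math.round(breakingPointVUs * degradationFactor * safetyMargin)
}

console.log(safeCapacity(1000))            // 490
console.log(safeCapacity(1000, 0.7, 0.5))  // 350
```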

## Authentication and Session Management

**Problem:** Real APIs require authentication. Can't use same token for all virtual users.

### Token Strategy Decision Framework

| Scenario | Strategy | Why |
|----------|----------|-----|
| **Short test (<10 min)** | Pre-generate tokens | Fast, simple, no login load |
| **Long test (soak)** | Login during test + refresh | Realistic, tests auth system |
| **Testing auth system** | Simulate login flow | Auth is part of load |
| **Read-only testing** | Shared token (single user) | Simplest, adequate for API-only tests |

**Default:** Pre-generate tokens for load tests, simulate login for auth system tests

### Pre-Generated Tokens Pattern

**Best for:** API testing where auth system isn't being tested

```javascript
// k6 with pre-generated JWT tokens
import http from 'k6/http'
import { SharedArray } from 'k6/data'

// Load tokens from file (generated externally)
const tokens = new SharedArray('auth tokens', function () {
  return JSON.parse(open('./tokens.json'))
})

export default function() {
  const token = tokens[__VU % tokens.length]

  const headers = {
    'Authorization': `Bearer ${token}`
  }

  http.get('https://api.example.com/protected', { headers })
}
```

**Generate tokens externally:**

```bash
# Script to generate 1000 tokens as a JSON array (the k6 script reads it with JSON.parse)
for i in {1..1000}; do
  curl -s -X POST https://api.example.com/login \
    -d "username=loadtest_user_$i&password=test" \
    | jq '.token'
done | jq -s '.' > tokens.json
```

**Pros:** No login load, fast test setup
**Cons:** Tokens may expire during long tests, not testing auth flow

---

### Login Flow Simulation Pattern

**Best for:** Testing auth system, soak tests where tokens expire

```javascript
// k6 with login simulation
import http from 'k6/http'
import { SharedArray } from 'k6/data'

const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json'))  // [{username, password}, ...]
})

export default function() {
  const user = users[__VU % users.length]

  // Login to get token
  const loginRes = http.post('https://api.example.com/login', {
    username: user.username,
    password: user.password
  })

  const token = loginRes.json('token')

  // Use token for subsequent requests
  const headers = { 'Authorization': `Bearer ${token}` }

  http.get('https://api.example.com/protected', { headers })
  http.post('https://api.example.com/data', {}, { headers })
}
```

**Token refresh for long tests:**

```javascript
// k6 with token refresh
import http from 'k6/http'
import { sleep } from 'k6'

// Module-level state is per VU and persists across iterations
let token = null
let tokenExpiry = 0  // Unix time in seconds

export default function() {
  const now = Date.now() / 1000

  // Refresh token if expired or about to expire
  if (!token || now > tokenExpiry - 300) {  // Refresh 5 min before expiry
    const loginRes = http.post('https://api.example.com/login', { /* credentials elided */ })
    token = loginRes.json('token')
    tokenExpiry = loginRes.json('expires_at')  // Assumed to be a Unix timestamp in seconds
  }

  http.get('https://api.example.com/protected', {
    headers: { 'Authorization': `Bearer ${token}` }
  })

  sleep(1)
}
```

---

### Session Cookie Management

**For cookie-based auth:**

```javascript
// k6 with session cookies
import http from 'k6/http'

export default function() {
  // k6 keeps a cookie jar per VU and sends stored cookies automatically
  const jar = http.cookieJar()  // Only needed to inspect or set cookies manually

  // Login (sets session cookie)
  http.post('https://example.com/login', {
    username: 'user',
    password: 'pass'
  })

  // Subsequent requests use session cookie automatically
  http.get('https://example.com/dashboard')
  http.get('https://example.com/profile')
}
```

---

### Rate Limiting Detection

**Pattern:** Detect when hitting rate limits during load test

```javascript
// k6 rate limit detection
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  const res = http.get('https://api.example.com/data')

  check(res, {
    'not rate limited': (r) => r.status !== 429
  })

  if (res.status === 429) {
    console.warn(`Rate limited at VU ${__VU}, iteration ${__ITER}`)
    const retryAfter = res.headers['Retry-After']
    console.warn(`Retry-After: ${retryAfter} seconds`)
  }
}
```

**Thresholds for rate limiting:**

```javascript
export let options = {
  thresholds: {
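    // The {status:429} sub-metric contains only 429 responses, which all count as failed,
    // so this threshold fails as soon as any request is rate limited
    'http_req_failed{status:429}': ['rate<0.01']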
  }
}
```
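
To track the share of all requests that were rate limited, a custom `Rate` metric from `k6/metrics` can be fed one boolean per response. The sketch below keeps the response handling in plain functions so it can be exercised outside k6, and shows the k6 wiring in comments. `isRateLimited`, `retryAfterSeconds`, and the metric name `rate_limited` are hypothetical names.

```javascript
// k6 wiring (shown as comments because these imports only resolve inside k6):
//   import http from 'k6/http'
//   import { Rate } from 'k6/metrics'
//   const rateLimited = new Rate('rate_limited')
//   export const options = { thresholds: { rate_limited: ['rate<0.01'] } }
//   export default function () {
//     const res = http.get('https://api.example.com/data')
//     rateLimited.add(isRateLimited(res))
//   }

// Hypothetical helper: true when the response is an HTTP 429.
function isRateLimited(res) {
  return res.status === 429
}

// Hypothetical helper: Retry-After as a number of seconds,
// or null when the header is missing or not numeric (it may also be an HTTP date).
function retryAfterSeconds(res) {
  const raw = res.headers ? res.headers['Retry-After'] : undefined
  if (raw === undefined || raw === null || raw === '') return null
  const seconds = Number(raw)
  return Number.isFinite(seconds) ? seconds : null
}

const limitedResponse = { status: 429, headers: { 'Retry-After': '30' } }
console.log(isRateLimited(limitedResponse), retryAfterSeconds(limitedResponse))  // true 30
```

Because the custom metric receives a sample for every response, `rate<0.01` here means fewer than 1% of all requests were rate limited.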

## Third-Party Dependency Handling

**Problem:** APIs call external services (payment, email, third-party APIs). Should you mock them?

### Mock vs Real Decision Framework

| External Service | Mock or Real? | Why |
|------------------|---------------|-----|
| **Payment gateway** | Real (sandbox) | Need to test integration, has sandbox mode |
| **Email provider** | Mock | Cost scales with test volume (at $0.001/email, every simulated send is billed), no value in testing delivery |
| **Third-party API (has staging)** | Real (staging) | Test integration, realistic latency |
| **Third-party API (no staging)** | Mock | Can't load test production, rate limits |
| **Internal microservices** | Real | Testing real integration points |
| **Analytics/tracking** | Mock | High volume, no functional impact |

**Rule:** Use real services if they have sandbox/staging. Mock if expensive, rate-limited, or no test environment.

---

### Service Virtualization with WireMock

**Best for:** Mocking HTTP APIs with realistic responses

```javascript
// k6 test pointing to WireMock
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  // WireMock running on localhost:8080 mocks external API (the stub matches POST requests)
  const res = http.post('http://localhost:8080/api/payment/process', {})

  check(res, {
    'payment mock responds': (r) => r.status === 200
  })
}
```

**WireMock stub setup:**

```json
{
  "request": {
    "method": "POST",
    "url": "/api/payment/process"
  },
  "response": {
    "status": 200,
    "jsonBody": {
      "transaction_id": "{{randomValue type='UUID'}}",
      "status": "approved"
    },
    "headers": {
      "Content-Type": "application/json"
    },
    "fixedDelayMilliseconds": 200
  }
}
```

**Why WireMock:** Realistic latency simulation, dynamic responses, stateful mocking

---

### Partial Mocking Pattern

**Pattern:** Mock some services, use real for others

```javascript
// k6 with partial mocking
import http from 'k6/http'

export default function() {
  // Real API (points to staging)
  const productRes = http.get('https://staging-api.example.com/products')

  // Mock email service (points to WireMock)
  http.post('http://localhost:8080/mock/email/send', {
    to: 'user@example.com',
    subject: 'Order confirmation'
  })

  // Real payment sandbox
  http.post('https://sandbox-payment.stripe.com/charge', {
    amount: 1000,
    currency: 'usd',
    source: 'tok_visa'
  })
}
```

**Decision criteria:**
- Real: Services with sandbox, need integration validation, low cost
- Mock: No sandbox, expensive, rate-limited, testing failure scenarios

---

### Testing External Service Failures

**Use mocks to simulate failures** (this WireMock stub returns a 503 after a 5-second delay, i.e. a slow failure):

```json
{
  "request": {
    "method": "POST",
    "url": "/api/payment/process"
  },
  "response": {
    "status": 503,
    "jsonBody": {
      "error": "Service temporarily unavailable"
    },
    "fixedDelayMilliseconds": 5000
  }
}
```

**k6 test for resilience:**

```javascript
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  const res = http.post('http://localhost:8080/api/payment/process', {})

  // Verify app handles payment failures gracefully
  check(res, {
    'handles payment failure': (r) => r.status === 503,
    'returns within timeout': (r) => r.timings.duration < 6000
  })
}
```

---

### Cost and Compliance Guardrails

**Before testing with real external services:**

| Check | Why |
|-------|-----|
| **Sandbox mode exists?** | Avoid production costs/rate limits |
| **Cost per request?** | 1000 VUs × 10 req/s × 600s = 6M requests |
| **Rate limits?** | Will you hit external service limits? |
| **Terms of service?** | Does load testing violate TOS? |
| **Data privacy?** | Using real user emails/PII? |

**Example cost calculation:**

```
Email service: $0.001/email
Load test: 100 VUs × 5 emails/s per VU × 600s = 300,000 emails
Cost: 300,000 × $0.001 = $300

Decision: Mock email service, use real payment sandbox (free)
```
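
The same arithmetic can be run for each external service before a test is scheduled. A minimal sketch, assuming a hypothetical `estimateCost` helper; the per-VU send rate of 5 per second is an assumption that reproduces the numbers in the example above.

```javascript
// Hypothetical helper: estimated third-party cost for one test run.
function estimateCost({ vus, requestsPerSecondPerVU, durationSeconds, costPerRequest }) {
  const totalRequests = vus * requestsPerSecondPerVU * durationSeconds
  return { totalRequests, cost: totalRequests * costPerRequest }
}

const emailCost = estimateCost({
  vus: 100,
  requestsPerSecondPerVU: 5,  // Assumed send rate per VU
  durationSeconds: 600,
  costPerRequest: 0.001,
})
console.log(emailCost.totalRequests, emailCost.cost.toFixed(2))  // 300000 300.00
```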

**Compliance:**
- Don't use real user data in load tests (GDPR, privacy)
- Check third-party TOS (some prohibit load testing)
- Use synthetic test data only

## Your First Load Test

**Goal:** Basic load test in one day

**Hour 1-2: Install tool and write smoke test**

```bash
# Install k6
brew install k6  # macOS
# or snap install k6  # Linux

# Create test.js
cat > test.js <<'EOF'
import http from 'k6/http'
import { check, sleep } from 'k6'

export let options = {
  vus: 1,
  duration: '30s'
}

export default function() {
  let res = http.get('https://your-api.com/health')
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response < 500ms': (r) => r.timings.duration < 500
  })
  sleep(1)
}
EOF

# Run smoke test
k6 run test.js
```

**Hour 3-4: Calculate target load**

```
Your DAU: 10,000
Concurrency: 10%
Peak multiplier: 1.5
Target: 10,000 × 0.10 × 1.5 = 1,500 VUs
```

**Hour 5-6: Write load test with ramp-up**

```javascript
export let options = {
  stages: [
    { duration: '5m', target: 750 },   // Ramp to normal (50%)
    { duration: '10m', target: 750 },  // Hold normal
    { duration: '5m', target: 1500 },  // Ramp to peak
    { duration: '10m', target: 1500 }, // Hold peak
    { duration: '5m', target: 0 },     // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.05']  // < 5% errors
  }
}
```

**Hour 7-8: Run test and analyze**

```bash
# Run load test
k6 run --out json=results.json test.js

# Check summary output for:
# - p95/p99 latency trends
# - Error rates
# - When degradation started
```

**If test fails:** Check thresholds, adjust targets, investigate bottlenecks

## Common Mistakes

### ❌ Testing Production Without Safeguards
**Fix:** Use feature flags, test environment, or controlled percentage

---

### ❌ No Baseline Performance Metrics
**Fix:** Run smoke test first to establish baseline before load testing

---

### ❌ Using Iteration Duration Instead of Arrival Rate
**Fix:** Use `constant-arrival-rate` executor in k6

---

### ❌ Not Warming Up Caches/JIT
**Fix:** 2-5 minute warm-up phase before measurement

## Quick Reference

**Tool Selection:**
- Modern API: k6
- Enterprise: JMeter
- Python team: Locust

**Test Patterns:**
- Smoke: 1 VU, 1 min
- Load: Ramp-up → peak → ramp-down
- Stress: Increase until break
- Spike: Sudden 10x surge
- Soak: 4-8 hours constant

**Load Calculation:**
```
Concurrent = DAU × 0.10 × 1.5
RPS = (Concurrent × Requests/Session) / (Duration × Think Time)
```

**Anti-Patterns:**
- Coordinated omission (use arrival rate)
- Cold start (warm-up first)
- Unrealistic data (parameterize)
- Constant load (use realistic patterns)

**Result Interpretation:**
- Linear growth → CPU-bound
- Exponential growth → Resource saturation
- Sudden cliff → Hard limit
- Gradual degradation → Memory leak

**Authentication:**
- Short tests: Pre-generate tokens
- Long tests: Login + refresh
- Testing auth: Simulate login flow

**Third-Party Dependencies:**
- Has sandbox: Use real (staging/sandbox)
- Expensive/rate-limited: Mock (WireMock)
- No sandbox: Mock

## Bottom Line

**Start with smoke test (1 VU). Calculate realistic load from DAU. Use ramp-up pattern (never start at peak). Monitor p95/p99 latency. Find breaking point before users do.**

Test realistic scenarios with think time, not hammer tests.