Initial commit

Zhongwei Li
2025-11-30 08:59:43 +08:00
commit 966ef521f7
25 changed files with 9763 additions and 0 deletions

@@ -0,0 +1,843 @@
---
name: load-testing-patterns
description: Use when designing load tests, choosing tools (k6, JMeter, Gatling), calculating concurrent users from DAU, interpreting latency degradation, identifying bottlenecks, or running spike/soak/stress tests - provides test patterns, anti-patterns, and load calculation frameworks
---
# Load Testing Patterns
## Overview
**Core principle:** Test realistic load patterns, not constant artificial load. Find limits before users do.
**Rule:** Load testing reveals system behavior under stress. Without it, production is your load test.
## Tool Selection Decision Tree
| Your Need | Protocol | Team Skills | Use | Why |
|-----------|----------|-------------|-----|-----|
| Modern API testing | HTTP/REST/GraphQL | JavaScript | **k6** | Best dev experience, CI/CD friendly |
| Enterprise/complex protocols | HTTP/SOAP/JMS/JDBC | Java/GUI comfort | **JMeter** | Mature, comprehensive protocols |
| Python team | HTTP/WebSocket | Python | **Locust** | Pythonic, easy scripting |
| High performance/complex scenarios | HTTP/gRPC | Scala/Java | **Gatling** | Best reports, high throughput |
| Cloud-native at scale | HTTP/WebSocket | Any (SaaS) | **Artillery, Flood.io** | Managed, distributed |
**First choice:** k6 (modern, scriptable, excellent CI/CD integration)
**Why not ApacheBench/wrk:** Too simple for realistic scenarios, no complex user flows
## Test Pattern Library
| Pattern | Purpose | Duration | When to Use |
|---------|---------|----------|-------------|
| **Smoke Test** | Verify test works | 1-2 min | Before every test run |
| **Load Test** | Normal/peak capacity | 10-30 min | Regular capacity validation |
| **Stress Test** | Find breaking point | 20-60 min | Understand limits |
| **Spike Test** | Sudden traffic surge | 5-15 min | Black Friday, launch events |
| **Soak Test** | Memory leaks, stability | 1-8 hours | Pre-release validation |
| **Capacity Test** | Max sustainable load | Variable | Capacity planning |
### Smoke Test
**Goal:** Verify test script works with minimal load
```javascript
// k6 smoke test
export let options = {
vus: 1,
duration: '1m',
thresholds: {
http_req_duration: ['p(95)<500'], // 95% < 500ms
http_req_failed: ['rate<0.01'], // <1% errors
}
}
```
**Purpose:** Catch test script bugs before running expensive full tests
### Load Test (Ramp-Up Pattern)
**Goal:** Test normal and peak expected load
```javascript
// k6 load test with ramp-up
export let options = {
stages: [
{ duration: '5m', target: 100 }, // Ramp to normal load
{ duration: '10m', target: 100 }, // Hold at normal
{ duration: '5m', target: 200 }, // Ramp to peak
{ duration: '10m', target: 200 }, // Hold at peak
{ duration: '5m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<500', 'p(99)<1000'],
http_req_failed: ['rate<0.05'],
}
}
```
**Pattern:** Gradual ramp-up → sustain → ramp down. Never start at peak.
### Stress Test (Breaking Point)
**Goal:** Find system limits
```javascript
// k6 stress test
export let options = {
stages: [
{ duration: '5m', target: 100 }, // Normal load
{ duration: '5m', target: 300 }, // 1.5x peak (peak = 200 VUs in the load test above)
{ duration: '5m', target: 600 }, // 3x peak
{ duration: '5m', target: 900 }, // 4.5x peak (expect failure)
{ duration: '10m', target: 0 }, // Recovery
]
}
```
**Success:** Identify the load at which the system starts to degrade; it does not need to fail completely.
### Spike Test (Sudden Surge)
**Goal:** Test sudden traffic bursts (viral post, email campaign)
```javascript
// k6 spike test
export let options = {
stages: [
{ duration: '1m', target: 100 }, // Normal
{ duration: '30s', target: 1000 }, // SPIKE to 10x
{ duration: '5m', target: 1000 }, // Hold spike
{ duration: '2m', target: 100 }, // Back to normal
{ duration: '5m', target: 100 }, // Recovery check
]
}
```
**Tests:** Auto-scaling, circuit breakers, rate limiting
### Soak Test (Endurance)
**Goal:** Find memory leaks, resource exhaustion over time
```javascript
// k6 soak test
export let options = {
stages: [
{ duration: '5m', target: 100 }, // Ramp
{ duration: '4h', target: 100 }, // Soak (sustained load)
{ duration: '5m', target: 0 }, // Ramp down
]
}
```
**Monitor:** Memory growth, connection leaks, disk space, file descriptors
**Duration:** Minimum 1 hour, ideally 4-8 hours
## Load Calculation Framework
**Problem:** Convert "10,000 daily active users" to concurrent load
### Step 1: DAU to Concurrent Users
```
Concurrent Users = DAU × Concurrency Ratio × Peak Multiplier
Concurrency Ratios by App Type:
- Web apps: 5-10%
- Social media: 10-20%
- Business apps: 20-30% (work hours)
- Gaming: 15-25%
Peak Multiplier: 1.5-2x for safety margin
```
**Example:**
```
DAU = 10,000
Concurrency = 10% (web app)
Peak Multiplier = 1.5
Concurrent Users = 10,000 × 0.10 × 1.5 = 1,500 concurrent users
```
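The Step 1 arithmetic can be wrapped in a small helper so the target is recomputed whenever the inputs change. This is a minimal sketch: the function name `concurrentUsers` and its parameter names are hypothetical, and the ratios are the rule-of-thumb values listed above rather than measured figures.

```javascript
// Hypothetical helper: estimate peak concurrent users from DAU.
// concurrencyRatio and peakMultiplier are the rule-of-thumb inputs from Step 1.
function concurrentUsers(dau, concurrencyRatio, peakMultiplier) {
  // Round to a whole number of users (this also absorbs floating-point noise)
  return Math.round(dau * concurrencyRatio * peakMultiplier)
}

console.log(concurrentUsers(10000, 0.10, 1.5)) // 1500
```

Plain functions like this can live at the top of a k6 script, so the result can feed a stage `target` directly.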
### Step 2: Concurrent Users to Requests/Second
```
RPS = (Concurrent Users × Requests per Session) / (Session Duration × Think Time Ratio)
Think Time Ratio:
- Active browsing: 0.3-0.5 (30-50% time clicking/typing)
- Reading-heavy: 0.1-0.2 (10-20% active)
- API clients: 0.8-1.0 (80-100% active)
```
**Example:**
```
Concurrent Users = 1,500
Requests per Session = 20
Session Duration = 10 minutes = 600 seconds
Think Time Ratio = 0.3 (web browsing)
RPS = (1,500 × 20) / (600 × 0.3) = 30,000 / 180 ≈ 167 RPS
```
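The same calculation as code, as a sketch that implements the Step 2 formula exactly as written above. The name `requestsPerSecond` and its parameters are hypothetical. Treat the output as a planning estimate and calibrate it against production traffic where that data exists.

```javascript
// Hypothetical helper implementing the Step 2 formula from this section.
function requestsPerSecond(users, requestsPerSession, sessionSeconds, thinkTimeRatio) {
  return (users * requestsPerSession) / (sessionSeconds * thinkTimeRatio)
}

const rps = requestsPerSecond(1500, 20, 600, 0.3)
console.log(Math.round(rps)) // 167
```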
### Step 3: Model Realistic Patterns
Don't use constant load. Use realistic traffic patterns:
```javascript
// Realistic daily pattern
export let options = {
stages: [
// Morning ramp
{ duration: '2h', target: 500 }, // 08:00-10:00
{ duration: '2h', target: 1000 }, // 10:00-12:00 (peak)
// Lunch dip
{ duration: '1h', target: 600 }, // 12:00-13:00
// Afternoon peak
{ duration: '2h', target: 1200 }, // 13:00-15:00 (peak)
{ duration: '2h', target: 800 }, // 15:00-17:00
// Evening drop
{ duration: '2h', target: 300 }, // 17:00-19:00
]
}
```
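An 11-hour profile is expensive to run often. One option is to compress the same shape into a shorter rehearsal. The sketch below assumes a hypothetical `compressProfile` helper and a profile expressed in minutes; the compression factor of 12 is an arbitrary illustration, not a recommendation from the original text.

```javascript
// Hypothetical helper: shrink a day-long traffic profile into a shorter rehearsal.
// profile: [{ minutes, target }], factor: how many times faster than real time.
function compressProfile(profile, factor) {
  return profile.map(({ minutes, target }) => ({
    // k6 stage durations are strings such as '10m'; never go below one minute
    duration: `${Math.max(1, Math.round(minutes / factor))}m`,
    target,
  }))
}

// Same shape as the daily pattern above, expressed in minutes
const dayProfile = [
  { minutes: 120, target: 500 },
  { minutes: 120, target: 1000 },
  { minutes: 60, target: 600 },
  { minutes: 120, target: 1200 },
  { minutes: 120, target: 800 },
  { minutes: 120, target: 300 },
]

const stages = compressProfile(dayProfile, 12) // 11 hours -> 55 minutes
console.log(stages[0]) // { duration: '10m', target: 500 }
```

In a k6 script the result would be passed as `stages` in `options`. A compressed run keeps the peaks and dips but will not surface time-dependent problems such as slow leaks; the soak test covers those.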
## Anti-Patterns Catalog
### ❌ Coordinated Omission
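**Symptom:** Closed-loop load generation (each VU waits for a response before sending its next request) sends fewer requests when the system slows down, so slow periods are under-sampled and latency is underestimated
**Why bad:** Hides real latency impact when system slows down
**Fix:** Use arrival rate (requests/sec) not iteration rate
```javascript
// ❌ Bad - coordinated omission: the VU waits for each response, then sleeps,
// so request rate drops exactly when the system is slow
import http from 'k6/http'
import { sleep } from 'k6'
export default function() {
http.get('https://api.example.com')
sleep(1) // Wait 1s between requests
}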
// ✅ Good - arrival rate pacing
export let options = {
scenarios: {
constant_arrival_rate: {
executor: 'constant-arrival-rate',
rate: 100, // 100 RPS regardless of response time
timeUnit: '1s',
duration: '10m',
preAllocatedVUs: 50,
maxVUs: 200,
}
}
}
```
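A common way to size `preAllocatedVUs` for an arrival-rate executor is Little's Law: requests in flight ≈ arrival rate × time per iteration. The sketch below is a hedged illustration; `vusForArrivalRate` is a hypothetical helper, and the 0.5 s iteration time and 4x headroom are assumed figures chosen to show how values like 50 and 200 could be derived.

```javascript
// Hypothetical helper: VUs needed to sustain an arrival rate (Little's Law).
// iterationSeconds is the expected time one iteration takes, including the response.
function vusForArrivalRate(ratePerSecond, iterationSeconds, headroom = 1) {
  return Math.ceil(ratePerSecond * iterationSeconds * headroom)
}

console.log(vusForArrivalRate(100, 0.5))    // 50  -> candidate preAllocatedVUs
console.log(vusForArrivalRate(100, 0.5, 4)) // 200 -> candidate maxVUs with headroom
```

If responses slow down beyond what `maxVUs` can absorb, k6 cannot start iterations on schedule, so the headroom value is worth revisiting after the first run.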
---
### ❌ Cold Start Testing
**Symptom:** Running load test immediately after deployment without warm-up
**Why bad:** JIT compilation, cache warming, connection pooling haven't stabilized
**Fix:** Warm-up phase before measurement
```javascript
// ✅ Good - warm-up phase
export let options = {
stages: [
{ duration: '2m', target: 50 }, // Warm-up (k6 still records these samples; exclude this window when analyzing)
{ duration: '10m', target: 100 }, // Actual test
]
}
```
---
### ❌ Unrealistic Test Data
**Symptom:** Using same user ID, same query parameters for all virtual users
**Why bad:** Caches give unrealistic performance, doesn't test real database load
**Fix:** Parameterized, realistic data
```javascript
import http from 'k6/http'
import { SharedArray } from 'k6/data'
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js'
// ❌ Bad - every request hits the same cached record:
// http.get('https://api.example.com/users/123')
// ✅ Good - parameterized data, loaded once and shared across VUs
const csvData = new SharedArray('users', function () {
return papaparse.parse(open('./users.csv'), { header: true }).data
})
export default function() {
// Each VU picks its own row, so requests spread across many users
const user = csvData[__VU % csvData.length]
http.get(`https://api.example.com/users/${user.id}`)
}
```
---
### ❌ Constant Load Pattern
**Symptom:** Running with constant VUs instead of realistic traffic pattern
**Why bad:** Real traffic has peaks, valleys, not flat line
**Fix:** Use realistic daily/hourly patterns
---
### ❌ Ignoring Think Time
**Symptom:** No delays between requests, hammering API as fast as possible
**Why bad:** Unrealistic user behavior, overestimates load
**Fix:** Add realistic think time based on user behavior
```javascript
// ✅ Good - realistic think time
import http from 'k6/http'
import { sleep } from 'k6'
export default function() {
http.get('https://api.example.com/products')
sleep(Math.random() * 3 + 2) // 2-5 seconds browsing
http.post('https://api.example.com/cart', {...})
sleep(Math.random() * 5 + 5) // 5-10 seconds deciding
http.post('https://api.example.com/checkout', {...})
}
```
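The `Math.random() * 3 + 2` expressions above can be factored into a helper so the intended range is readable at the call site. A minimal sketch, assuming a uniform distribution is good enough; `thinkTime` is a hypothetical name, not a k6 API.

```javascript
// Hypothetical helper: uniform random think time between minSeconds and maxSeconds
function thinkTime(minSeconds, maxSeconds) {
  return minSeconds + Math.random() * (maxSeconds - minSeconds)
}

const pause = thinkTime(2, 5)
console.log(pause >= 2 && pause <= 5) // true
```

In a k6 script this would be used as `sleep(thinkTime(2, 5))`. Real think times are often skewed rather than uniform, so the distribution itself is a simplifying assumption.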
## Result Interpretation Guide
### Latency Degradation Patterns
| Pattern | Cause | What to Check |
|---------|-------|---------------|
| **Linear growth** (2x users → 2x latency) | CPU-bound | Thread pool, CPU usage |
| **Exponential growth** (2x users → 10x latency) | Resource saturation | Connection pools, locks, queues |
| **Sudden cliff** (works until X, then fails) | Hard limit hit | Max connections, memory, file descriptors |
| **Gradual degradation** (slow increase over time) | Memory leak, cache pollution | Memory trends, GC activity |
### Bottleneck Classification
**Symptom: p95 latency 10x at 2x load**
→ **Resource saturation** (database connection pool, thread pool, queue)
**Symptom: Errors increase with load**
→ **Hard limit** (connection limit, rate limiting, timeout)
**Symptom: Latency grows over time at constant load**
→ **Memory leak** or **cache pollution**
**Symptom: High variance (p50 good, p99 terrible)**
→ **GC pauses**, **lock contention**, or **slow queries**
### What to Monitor
| Layer | Metrics to Track |
|-------|------------------|
| **Application** | Request rate, error rate, p50/p95/p99 latency, active requests |
| **Runtime** | GC pauses (JVM, .NET), thread pool usage, heap/memory |
| **Database** | Connection pool usage, query latency, lock waits, slow queries |
| **Infrastructure** | CPU %, memory %, disk I/O, network throughput |
| **External** | Third-party API latency, rate limit hits |
### Capacity Planning Formula
```
Safe Capacity = (Breaking Point × Degradation Factor) × Safety Margin
Breaking Point = VUs where p95 latency > threshold
Degradation Factor = 0.7 (start degradation before break)
Safety Margin = 0.5-0.7 (handle traffic spikes)
Example:
- System breaks at 1000 VUs (p95 > 1s)
- Start seeing degradation at 700 VUs (70%)
- Safe capacity: 700 × 0.7 = 490 VUs
```
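The capacity formula as a small function, in case it helps to rerun it with different margins. This is a sketch; `safeCapacity` is a hypothetical name and the factors are the ones given in the formula above.

```javascript
// Hypothetical helper: Safe Capacity = Breaking Point x Degradation Factor x Safety Margin
function safeCapacity(breakingPointVUs, degradationFactor, safetyMargin) {
  // Round to whole VUs (this also absorbs floating-point noise)
  return Math.round(breakingPointVUs * degradationFactor * safetyMargin)
}

console.log(safeCapacity(1000, 0.7, 0.7)) // 490
console.log(safeCapacity(1000, 0.7, 0.5)) // 350 (more conservative margin)
```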
## Authentication and Session Management
**Problem:** Real APIs require authentication, and in most tests you can't reuse one token for every virtual user.
### Token Strategy Decision Framework
| Scenario | Strategy | Why |
|----------|----------|-----|
| **Short test (<10 min)** | Pre-generate tokens | Fast, simple, no login load |
| **Long test (soak)** | Login during test + refresh | Realistic, tests auth system |
| **Testing auth system** | Simulate login flow | Auth is part of load |
| **Read-only testing** | Shared token (single user) | Simplest, adequate for API-only tests |
**Default:** Pre-generate tokens for load tests, simulate login for auth system tests
### Pre-Generated Tokens Pattern
**Best for:** API testing where auth system isn't being tested
```javascript
// k6 with pre-generated JWT tokens
import http from 'k6/http'
import { SharedArray } from 'k6/data'
// Load tokens from file (generated externally)
const tokens = new SharedArray('auth tokens', function () {
return JSON.parse(open('./tokens.json'))
})
export default function() {
const token = tokens[__VU % tokens.length]
const headers = {
'Authorization': `Bearer ${token}`
}
http.get('https://api.example.com/protected', { headers })
}
```
**Generate tokens externally:**
```bash
# Script to generate 1000 tokens as a JSON array (the k6 script reads the file with JSON.parse)
for i in {1..1000}; do
curl -s -X POST https://api.example.com/login \
-d "username=loadtest_user_$i&password=test" \
| jq -r '.token'
done | jq -R . | jq -s . > tokens.json
```
**Pros:** No login load, fast test setup
**Cons:** Tokens may expire during long tests, not testing auth flow
---
### Login Flow Simulation Pattern
**Best for:** Testing auth system, soak tests where tokens expire
```javascript
// k6 with login simulation
import http from 'k6/http'
import { SharedArray } from 'k6/data'
const users = new SharedArray('users', function () {
return JSON.parse(open('./users.json')) // [{username, password}, ...]
})
export default function() {
const user = users[__VU % users.length]
// Login to get token
const loginRes = http.post('https://api.example.com/login', {
username: user.username,
password: user.password
})
const token = loginRes.json('token')
// Use token for subsequent requests
const headers = { 'Authorization': `Bearer ${token}` }
http.get('https://api.example.com/protected', { headers })
http.post('https://api.example.com/data', {}, { headers })
}
```
**Token refresh for long tests:**
```javascript
// k6 with token refresh
import http from 'k6/http'
import { sleep } from 'k6'
let token = null
let tokenExpiry = 0
export default function() {
const now = Date.now() / 1000
// Refresh token if expired or about to expire
if (!token || now > tokenExpiry - 300) { // Refresh 5 min before expiry
const loginRes = http.post('https://api.example.com/login', {...})
token = loginRes.json('token')
tokenExpiry = loginRes.json('expires_at')
}
http.get('https://api.example.com/protected', {
headers: { 'Authorization': `Bearer ${token}` }
})
sleep(1)
}
```
---
### Session Cookie Management
**For cookie-based auth:**
```javascript
// k6 with session cookies
import http from 'k6/http'
export default function() {
// k6 keeps a cookie jar per VU and sends stored cookies automatically;
// the jar handle is only needed to inspect or set cookies manually
const jar = http.cookieJar()
// Login (sets session cookie)
http.post('https://example.com/login', {
username: 'user',
password: 'pass'
})
// Subsequent requests use session cookie automatically
http.get('https://example.com/dashboard')
http.get('https://example.com/profile')
}
```
---
### Rate Limiting Detection
**Pattern:** Detect when hitting rate limits during load test
```javascript
// k6 rate limit detection
import http from 'k6/http'
import { check } from 'k6'
export default function() {
const res = http.get('https://api.example.com/data')
check(res, {
'not rate limited': (r) => r.status !== 429
})
if (res.status === 429) {
console.warn(`Rate limited at VU ${__VU}, iteration ${__ITER}`)
const retryAfter = res.headers['Retry-After']
console.warn(`Retry-After: ${retryAfter} seconds`)
}
}
```
**Thresholds for rate limiting:**
```javascript
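import { Rate } from 'k6/metrics'
// Custom metric: call rateLimited.add(res.status === 429) after each request
const rateLimited = new Rate('rate_limited')
export let options = {
thresholds: {
rate_limited: ['rate<0.01'] // <1% of all requests rate limited
}
}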
```
## Third-Party Dependency Handling
**Problem:** APIs call external services (payment, email, third-party APIs). Should you mock them?
### Mock vs Real Decision Framework
| External Service | Mock or Real? | Why |
|------------------|---------------|-----|
| **Payment gateway** | Real (sandbox) | Need to test integration, has sandbox mode |
| **Email provider** | Mock | Per-email cost ($0.001/email) adds up at load-test volume; little value in testing delivery |
| **Third-party API (has staging)** | Real (staging) | Test integration, realistic latency |
| **Third-party API (no staging)** | Mock | Can't load test production, rate limits |
| **Internal microservices** | Real | Testing real integration points |
| **Analytics/tracking** | Mock | High volume, no functional impact |
**Rule:** Use real services if they have sandbox/staging. Mock if expensive, rate-limited, or no test environment.
---
### Service Virtualization with WireMock
**Best for:** Mocking HTTP APIs with realistic responses
```javascript
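// k6 test pointing to WireMock
import http from 'k6/http'
import { check } from 'k6'
export default function() {
// WireMock running on localhost:8080 mocks external API
// POST matches the stub's "method"; a GET would not match it
const res = http.post('http://localhost:8080/api/payment/process', {})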
check(res, {
'payment mock responds': (r) => r.status === 200
})
}
```
**WireMock stub setup:**
```json
{
"request": {
"method": "POST",
"url": "/api/payment/process"
},
"response": {
"status": 200,
"transformers": ["response-template"],
"jsonBody": {
"transaction_id": "{{randomValue type='UUID'}}",
"status": "approved"
},
"headers": {
"Content-Type": "application/json"
},
"fixedDelayMilliseconds": 200
}
}
```
**Why WireMock:** Realistic latency simulation, dynamic responses, stateful mocking
---
### Partial Mocking Pattern
**Pattern:** Mock some services, use real for others
```javascript
// k6 with partial mocking
import http from 'k6/http'
export default function() {
// Real API (points to staging)
const productRes = http.get('https://staging-api.example.com/products')
// Mock email service (points to WireMock)
http.post('http://localhost:8080/mock/email/send', {
to: 'user@example.com',
subject: 'Order confirmation'
})
// Real payment sandbox (illustrative URL; use your provider's sandbox endpoint)
http.post('https://sandbox-payment.stripe.com/charge', {
amount: 1000,
currency: 'usd',
source: 'tok_visa'
})
}
```
**Decision criteria:**
- Real: Services with sandbox, need integration validation, low cost
- Mock: No sandbox, expensive, rate-limited, testing failure scenarios
---
### Testing External Service Failures
**Use mocks to simulate failures** (this WireMock stub returns a 503 after a 5-second delay, i.e. a slow failure):
```json
{
"request": {
"method": "POST",
"url": "/api/payment/process"
},
"response": {
"status": 503,
"jsonBody": {
"error": "Service temporarily unavailable"
},
"fixedDelayMilliseconds": 5000
}
}
```
**k6 test for resilience:**
```javascript
import http from 'k6/http'
import { check } from 'k6'
export default function() {
const res = http.post('http://localhost:8080/api/payment/process', {})
// Verify app handles payment failures gracefully
check(res, {
'handles payment failure': (r) => r.status === 503,
'returns within timeout': (r) => r.timings.duration < 6000
})
}
```
---
### Cost and Compliance Guardrails
**Before testing with real external services:**
| Check | Why |
|-------|-----|
| **Sandbox mode exists?** | Avoid production costs/rate limits |
| **Cost per request?** | 1000 VUs × 10 req/s × 600s = 6M requests |
| **Rate limits?** | Will you hit external service limits? |
| **Terms of service?** | Does load testing violate TOS? |
| **Data privacy?** | Using real user emails/PII? |
**Example cost calculation:**
```
Email service: $0.001/email
Load test: 100 VUs × 5 emails/session × 600 sessions per VU (one per second for 600s) = 300,000 emails
Cost: 300,000 × $0.001 = $300
Decision: Mock email service, use real payment sandbox (free)
```
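The cost check above can be scripted so it runs before any test that touches a paid service. A minimal sketch: `externalCalls` and `externalCallCost` are hypothetical helpers, and the 600 factor from the example is treated here as sessions per VU.

```javascript
// Hypothetical helpers: estimate how many billable calls a test will make and what they cost.
function externalCalls(vus, callsPerSession, sessionsPerVU) {
  return vus * callsPerSession * sessionsPerVU
}

function externalCallCost(totalCalls, costPerCall) {
  return totalCalls * costPerCall
}

const emails = externalCalls(100, 5, 600)
console.log(emails) // 300000
console.log(externalCallCost(emails, 0.001).toFixed(2)) // 300.00
```

A result in the hundreds of dollars for a single run is usually a strong signal to mock that dependency.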
**Compliance:**
- Don't use real user data in load tests (GDPR, privacy)
- Check third-party TOS (some prohibit load testing)
- Use synthetic test data only
## Your First Load Test
**Goal:** Basic load test in one day
**Hour 1-2: Install tool and write smoke test**
```bash
# Install k6
brew install k6 # macOS
# or snap install k6 # Linux
# Create test.js
cat > test.js <<'EOF'
import http from 'k6/http'
import { check, sleep } from 'k6'
export let options = {
vus: 1,
duration: '30s'
}
export default function() {
let res = http.get('https://your-api.com/health')
check(res, {
'status is 200': (r) => r.status === 200,
'response < 500ms': (r) => r.timings.duration < 500
})
sleep(1)
}
EOF
# Run smoke test
k6 run test.js
```
**Hour 3-4: Calculate target load**
```
Your DAU: 10,000
Concurrency: 10%
Peak multiplier: 1.5
Target: 10,000 × 0.10 × 1.5 = 1,500 VUs
```
**Hour 5-6: Write load test with ramp-up**
```javascript
export let options = {
stages: [
{ duration: '5m', target: 750 }, // Ramp to normal (50%)
{ duration: '10m', target: 750 }, // Hold normal
{ duration: '5m', target: 1500 }, // Ramp to peak
{ duration: '10m', target: 1500 }, // Hold peak
{ duration: '5m', target: 0 }, // Ramp down
],
thresholds: {
http_req_duration: ['p(95)<500', 'p(99)<1000'],
http_req_failed: ['rate<0.05'] // < 5% errors
}
}
```
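The stage list above follows a fixed shape (ramp to normal, hold, ramp to peak, hold, ramp down), so it can be generated from the calculated peak. A sketch under stated assumptions: `rampUpStages` is a hypothetical helper, and "normal" is taken as 50% of peak, as in the example.

```javascript
// Hypothetical helper: build the ramp-up pattern from a peak VU target.
// normalShare is the fraction of peak treated as normal load (0.5 in the example above).
function rampUpStages(peakVUs, normalShare = 0.5) {
  const normalVUs = Math.round(peakVUs * normalShare)
  return [
    { duration: '5m', target: normalVUs },  // Ramp to normal
    { duration: '10m', target: normalVUs }, // Hold normal
    { duration: '5m', target: peakVUs },    // Ramp to peak
    { duration: '10m', target: peakVUs },   // Hold peak
    { duration: '5m', target: 0 },          // Ramp down
  ]
}

const stages = rampUpStages(1500)
console.log(stages.map((s) => s.target)) // [ 750, 750, 1500, 1500, 0 ]
```

In a k6 script the returned array would be assigned to `stages` inside `options`, alongside the thresholds shown above.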
**Hour 7-8: Run test and analyze**
```bash
# Run load test
k6 run --out json=results.json test.js
# Check summary output for:
# - p95/p99 latency trends
# - Error rates
# - When degradation started
```
**If test fails:** Check thresholds, adjust targets, investigate bottlenecks
## Common Mistakes
### ❌ Testing Production Without Safeguards
**Fix:** Prefer a dedicated test environment; if you must test in production, gate the traffic behind feature flags or limit it to a controlled percentage
---
### ❌ No Baseline Performance Metrics
**Fix:** Run smoke test first to establish baseline before load testing
---
### ❌ Using Iteration Duration Instead of Arrival Rate
**Fix:** Use `constant-arrival-rate` executor in k6
---
### ❌ Not Warming Up Caches/JIT
**Fix:** 2-5 minute warm-up phase before measurement
## Quick Reference
**Tool Selection:**
- Modern API: k6
- Enterprise: JMeter
- Python team: Locust
**Test Patterns:**
- Smoke: 1 VU, 1 min
- Load: Ramp-up → peak → ramp-down
- Stress: Increase until break
- Spike: Sudden 10x surge
- Soak: 4-8 hours constant
**Load Calculation:**
```
Concurrent = DAU × 0.10 × 1.5
RPS = (Concurrent × Requests/Session) / (Duration × Think Time)
```
**Anti-Patterns:**
- Coordinated omission (use arrival rate)
- Cold start (warm-up first)
- Unrealistic data (parameterize)
- Constant load (use realistic patterns)
**Result Interpretation:**
- Linear growth → CPU-bound
- Exponential growth → Resource saturation
- Sudden cliff → Hard limit
- Gradual degradation → Memory leak
**Authentication:**
- Short tests: Pre-generate tokens
- Long tests: Login + refresh
- Testing auth: Simulate login flow
**Third-Party Dependencies:**
- Has sandbox: Use real (staging/sandbox)
- Expensive/rate-limited: Mock (WireMock)
- No sandbox: Mock
## Bottom Line
**Start with smoke test (1 VU). Calculate realistic load from DAU. Use ramp-up pattern (never start at peak). Monitor p95/p99 latency. Find breaking point before users do.**
Test realistic scenarios with think time, not hammer tests.