---
name: load-testing-patterns
description: Use when designing load tests, choosing tools (k6, JMeter, Gatling), calculating concurrent users from DAU, interpreting latency degradation, identifying bottlenecks, or running spike/soak/stress tests - provides test patterns, anti-patterns, and load calculation frameworks
---

# Load Testing Patterns

## Overview

**Core principle:** Test realistic load patterns, not constant artificial load. Find limits before users do.

**Rule:** Load testing reveals system behavior under stress. Without it, production is your load test.

## Tool Selection Decision Tree

| Your Need | Protocol | Team Skills | Use | Why |
|-----------|----------|-------------|-----|-----|
| Modern API testing | HTTP/REST/GraphQL | JavaScript | **k6** | Best dev experience, CI/CD friendly |
| Enterprise/complex protocols | HTTP/SOAP/JMS/JDBC | Java/GUI comfort | **JMeter** | Mature, comprehensive protocols |
| Python team | HTTP/WebSocket | Python | **Locust** | Pythonic, easy scripting |
| High performance/complex scenarios | HTTP/gRPC | Scala/Java | **Gatling** | Best reports, high throughput |
| Cloud-native at scale | HTTP/WebSocket | Any (SaaS) | **Artillery, Flood.io** | Managed, distributed |

**First choice:** k6 (modern, scriptable, excellent CI/CD integration)

**Why not ApacheBench/wrk:** Too simple for realistic scenarios, no complex user flows

## Test Pattern Library

| Pattern | Purpose | Duration | When to Use |
|---------|---------|----------|-------------|
| **Smoke Test** | Verify test works | 1-2 min | Before every test run |
| **Load Test** | Normal/peak capacity | 10-30 min | Regular capacity validation |
| **Stress Test** | Find breaking point | 20-60 min | Understand limits |
| **Spike Test** | Sudden traffic surge | 5-15 min | Black Friday, launch events |
| **Soak Test** | Memory leaks, stability | 1-8 hours | Pre-release validation |
| **Capacity Test** | Max sustainable load | Variable | Capacity planning |

### Smoke Test

**Goal:**
Verify test script works with minimal load

```javascript
// k6 smoke test
export let options = {
  vus: 1,
  duration: '1m',
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95% < 500ms
    http_req_failed: ['rate<0.01'],   // <1% errors
  }
}
```

**Purpose:** Catch test script bugs before running expensive full tests

### Load Test (Ramp-Up Pattern)

**Goal:** Test normal and peak expected load

```javascript
// k6 load test with ramp-up
export let options = {
  stages: [
    { duration: '5m', target: 100 },  // Ramp to normal load
    { duration: '10m', target: 100 }, // Hold at normal
    { duration: '5m', target: 200 },  // Ramp to peak
    { duration: '10m', target: 200 }, // Hold at peak
    { duration: '5m', target: 0 },    // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.05'],
  }
}
```

**Pattern:** Gradual ramp-up → sustain → ramp down. Never start at peak.

### Stress Test (Breaking Point)

**Goal:** Find system limits

```javascript
// k6 stress test
export let options = {
  stages: [
    { duration: '5m', target: 100 },  // Normal
    { duration: '5m', target: 300 },  // Above peak
    { duration: '5m', target: 600 },  // 2x peak
    { duration: '5m', target: 900 },  // 3x peak (expect failure)
    { duration: '10m', target: 0 },   // Recovery
  ]
}
```

**Success:** Identify at what load the system degrades (not necessarily breaking completely)

### Spike Test (Sudden Surge)

**Goal:** Test sudden traffic bursts (viral post, email campaign)

```javascript
// k6 spike test
export let options = {
  stages: [
    { duration: '1m', target: 100 },   // Normal
    { duration: '30s', target: 1000 }, // SPIKE to 10x
    { duration: '5m', target: 1000 },  // Hold spike
    { duration: '2m', target: 100 },   // Back to normal
    { duration: '5m', target: 100 },   // Recovery check
  ]
}
```

**Tests:** Auto-scaling, circuit breakers, rate limiting

### Soak Test (Endurance)

**Goal:** Find memory leaks, resource exhaustion over time

```javascript
// k6 soak test
export let options = {
  stages: [
    { duration: '5m', target: 100 }, // Ramp
    { duration: '4h', target: 100 }, // Soak (sustained load)
    { duration: '5m', target: 0 },   // Ramp down
  ]
}
```

**Monitor:** Memory growth, connection leaks, disk space, file descriptors

**Duration:** Minimum 1 hour, ideally 4-8 hours

## Load Calculation Framework

**Problem:** Convert "10,000 daily active users" to concurrent load

### Step 1: DAU to Concurrent Users

```
Concurrent Users = DAU × Concurrency Ratio × Peak Multiplier

Concurrency Ratios by App Type:
- Web apps: 5-10%
- Social media: 10-20%
- Business apps: 20-30% (work hours)
- Gaming: 15-25%

Peak Multiplier: 1.5-2x for safety margin
```

**Example:**

```
DAU = 10,000
Concurrency = 10% (web app)
Peak Multiplier = 1.5

Concurrent Users = 10,000 × 0.10 × 1.5 = 1,500 concurrent users
```

### Step 2: Concurrent Users to Requests/Second

```
RPS = (Concurrent Users × Requests per Session) / (Session Duration × Think Time Ratio)

Think Time Ratio:
- Active browsing: 0.3-0.5 (30-50% time clicking/typing)
- Reading-heavy: 0.1-0.2 (10-20% active)
- API clients: 0.8-1.0 (80-100% active)
```

**Example:**

```
Concurrent Users = 1,500
Requests per Session = 20
Session Duration = 10 minutes = 600 seconds
Think Time Ratio = 0.3 (web browsing)

RPS = (1,500 × 20) / (600 × 0.3) = 30,000 / 180 = 167 RPS
```

### Step 3: Model Realistic Patterns

Don't use constant load.
Use realistic traffic patterns:

```javascript
// Realistic daily pattern
export let options = {
  stages: [
    // Morning ramp
    { duration: '2h', target: 500 },  // 08:00-10:00
    { duration: '2h', target: 1000 }, // 10:00-12:00 (peak)
    // Lunch dip
    { duration: '1h', target: 600 },  // 12:00-13:00
    // Afternoon peak
    { duration: '2h', target: 1200 }, // 13:00-15:00 (peak)
    { duration: '2h', target: 800 },  // 15:00-17:00
    // Evening drop
    { duration: '2h', target: 300 },  // 17:00-19:00
  ]
}
```

## Anti-Patterns Catalog

### ❌ Coordinated Omission

**Symptom:** Fixed-rate load generation ignores slow responses, underestimating latency

**Why bad:** Hides real latency impact when the system slows down

**Fix:** Use arrival rate (requests/sec), not iteration rate

```javascript
import http from 'k6/http'
import { sleep } from 'k6'

// ❌ Bad - coordinated omission
export default function() {
  http.get('https://api.example.com')
  sleep(1) // Wait 1s between requests
}

// ✅ Good - arrival rate pacing
export let options = {
  scenarios: {
    constant_arrival_rate: {
      executor: 'constant-arrival-rate',
      rate: 100, // 100 RPS regardless of response time
      timeUnit: '1s',
      duration: '10m',
      preAllocatedVUs: 50,
      maxVUs: 200,
    }
  }
}
```

---

### ❌ Cold Start Testing

**Symptom:** Running load test immediately after deployment without warm-up

**Why bad:** JIT compilation, cache warming, connection pooling haven't stabilized

**Fix:** Warm-up phase before measurement

```javascript
// ✅ Good - warm-up phase
export let options = {
  stages: [
    { duration: '2m', target: 50 },   // Warm-up (not measured)
    { duration: '10m', target: 100 }, // Actual test
  ]
}
```

---

### ❌ Unrealistic Test Data

**Symptom:** Using the same user ID and same query parameters for all virtual users

**Why bad:** Caches give unrealistic performance; doesn't test real database load

**Fix:** Parameterized, realistic data

```javascript
import http from 'k6/http'
import { SharedArray } from 'k6/data'
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js'

// ❌ Bad - same data
// http.get('https://api.example.com/users/123')

// ✅ Good - parameterized data
const csvData = new SharedArray('users', function () {
  return papaparse.parse(open('./users.csv'), { header: true }).data
})

export default function() {
  const user = csvData[__VU % csvData.length]
  http.get(`https://api.example.com/users/${user.id}`)
}
```

---

### ❌ Constant Load Pattern

**Symptom:** Running with constant VUs instead of a realistic traffic pattern

**Why bad:** Real traffic has peaks and valleys, not a flat line

**Fix:** Use realistic daily/hourly patterns

---

### ❌ Ignoring Think Time

**Symptom:** No delays between requests, hammering the API as fast as possible

**Why bad:** Unrealistic user behavior, overestimates load

**Fix:** Add realistic think time based on user behavior

```javascript
// ✅ Good - realistic think time
import http from 'k6/http'
import { sleep } from 'k6'

export default function() {
  http.get('https://api.example.com/products')
  sleep(Math.random() * 3 + 2) // 2-5 seconds browsing

  http.post('https://api.example.com/cart', {...})
  sleep(Math.random() * 5 + 5) // 5-10 seconds deciding

  http.post('https://api.example.com/checkout', {...})
}
```

## Result Interpretation Guide

### Latency Degradation Patterns

| Pattern | Cause | What to Check |
|---------|-------|---------------|
| **Linear growth** (2x users → 2x latency) | CPU-bound | Thread pool, CPU usage |
| **Exponential growth** (2x users → 10x latency) | Resource saturation | Connection pools, locks, queues |
| **Sudden cliff** (works until X, then fails) | Hard limit hit | Max connections, memory, file descriptors |
| **Gradual degradation** (slow increase over time) | Memory leak, cache pollution | Memory trends, GC activity |

### Bottleneck Classification

**Symptom: p95 latency 10x at 2x load** → **Resource saturation** (database connection pool, thread pool, queue)

**Symptom: Errors increase with load** → **Hard limit** (connection limit, rate limiting, timeout)

**Symptom: Latency grows over time at constant load** → **Memory leak** or **cache pollution**

**Symptom: High variance (p50 good, p99 terrible)** → **GC pauses**, **lock contention**, or **slow queries**

### What to Monitor

| Layer | Metrics to Track |
|-------|------------------|
| **Application** | Request rate, error rate, p50/p95/p99 latency, active requests |
| **Runtime** | GC pauses (JVM, .NET), thread pool usage, heap/memory |
| **Database** | Connection pool usage, query latency, lock waits, slow queries |
| **Infrastructure** | CPU %, memory %, disk I/O, network throughput |
| **External** | Third-party API latency, rate limit hits |

### Capacity Planning Formula

```
Safe Capacity = (Breaking Point × Degradation Factor) × Safety Margin

Breaking Point = VUs where p95 latency > threshold
Degradation Factor = 0.7 (degradation starts before the break)
Safety Margin = 0.5-0.7 (handle traffic spikes)

Example:
- System breaks at 1000 VUs (p95 > 1s)
- Start seeing degradation at 700 VUs (70%)
- Safe capacity: 700 × 0.7 = 490 VUs
```

## Authentication and Session Management

**Problem:** Real APIs require authentication. Can't use the same token for all virtual users.
### Token Strategy Decision Framework

| Scenario | Strategy | Why |
|----------|----------|-----|
| **Short test (<10 min)** | Pre-generate tokens | Fast, simple, no login load |
| **Long test (soak)** | Login during test + refresh | Realistic, tests auth system |
| **Testing auth system** | Simulate login flow | Auth is part of load |
| **Read-only testing** | Shared token (single user) | Simplest, adequate for API-only tests |

**Default:** Pre-generate tokens for load tests, simulate login for auth system tests

### Pre-Generated Tokens Pattern

**Best for:** API testing where the auth system isn't being tested

```javascript
// k6 with pre-generated JWT tokens
import http from 'k6/http'
import { SharedArray } from 'k6/data'

// Load tokens from file (generated externally)
const tokens = new SharedArray('auth tokens', function () {
  return JSON.parse(open('./tokens.json'))
})

export default function() {
  const token = tokens[__VU % tokens.length]
  const headers = { 'Authorization': `Bearer ${token}` }
  http.get('https://api.example.com/protected', { headers })
}
```

**Generate tokens externally:**

```bash
# Generate 1000 tokens as a JSON array (so k6 can JSON.parse the file)
for i in {1..1000}; do
  curl -s -X POST https://api.example.com/login \
    -d "username=loadtest_user_$i&password=test" \
    | jq -r '.token'
done | jq -R . | jq -s . > tokens.json
```

**Pros:** No login load, fast test setup
**Cons:** Tokens may expire during long tests; doesn't exercise the auth flow

---

### Login Flow Simulation Pattern

**Best for:** Testing the auth system, soak tests where tokens expire

```javascript
// k6 with login simulation
import http from 'k6/http'
import { SharedArray } from 'k6/data'

const users = new SharedArray('users', function () {
  return JSON.parse(open('./users.json')) // [{username, password}, ...]
})

export default function() {
  const user = users[__VU % users.length]

  // Login to get token
  const loginRes = http.post('https://api.example.com/login', {
    username: user.username,
    password: user.password
  })
  const token = loginRes.json('token')

  // Use token for subsequent requests
  const headers = { 'Authorization': `Bearer ${token}` }
  http.get('https://api.example.com/protected', { headers })
  http.post('https://api.example.com/data', {}, { headers })
}
```

**Token refresh for long tests:**

```javascript
// k6 with token refresh
import http from 'k6/http'
import { sleep } from 'k6'

let token = null
let tokenExpiry = 0

export default function() {
  const now = Date.now() / 1000

  // Refresh token if expired or about to expire
  if (!token || now > tokenExpiry - 300) { // Refresh 5 min before expiry
    const loginRes = http.post('https://api.example.com/login', {...})
    token = loginRes.json('token')
    tokenExpiry = loginRes.json('expires_at')
  }

  http.get('https://api.example.com/protected', {
    headers: { 'Authorization': `Bearer ${token}` }
  })
  sleep(1)
}
```

---

### Session Cookie Management

**For cookie-based auth:**

```javascript
// k6 with session cookies
import http from 'k6/http'

export default function() {
  // k6 automatically handles cookies with its cookie jar
  const jar = http.cookieJar()

  // Login (sets session cookie)
  http.post('https://example.com/login', {
    username: 'user',
    password: 'pass'
  })

  // Subsequent requests use the session cookie automatically
  http.get('https://example.com/dashboard')
  http.get('https://example.com/profile')
}
```

---

### Rate Limiting Detection

**Pattern:** Detect when hitting rate limits during a load test

```javascript
// k6 rate limit detection
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  const res = http.get('https://api.example.com/data')

  check(res, { 'not rate limited': (r) => r.status !== 429 })

  if (res.status === 429) {
    console.warn(`Rate limited at VU ${__VU}, iteration ${__ITER}`)
    const retryAfter = res.headers['Retry-After']
    console.warn(`Retry-After: ${retryAfter} seconds`)
  }
}
```

**Thresholds for rate limiting:**

```javascript
export let options = {
  thresholds: {
    'http_req_failed{status:429}': ['rate<0.01'] // <1% rate limited
  }
}
```

## Third-Party Dependency Handling

**Problem:** APIs call external services (payment, email, third-party APIs). Should you mock them?

### Mock vs Real Decision Framework

| External Service | Mock or Real? | Why |
|------------------|---------------|-----|
| **Payment gateway** | Real (sandbox) | Need to test integration, has sandbox mode |
| **Email provider** | Mock | Cost ($0.001/email × 1000 VUs = expensive), no value in testing it |
| **Third-party API (has staging)** | Real (staging) | Test integration, realistic latency |
| **Third-party API (no staging)** | Mock | Can't load test production, rate limits |
| **Internal microservices** | Real | Testing real integration points |
| **Analytics/tracking** | Mock | High volume, no functional impact |

**Rule:** Use real services if they have sandbox/staging. Mock if expensive, rate-limited, or no test environment.
---

### Service Virtualization with WireMock

**Best for:** Mocking HTTP APIs with realistic responses

```javascript
// k6 test pointing to WireMock
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  // WireMock running on localhost:8080 mocks the external API
  const res = http.post('http://localhost:8080/api/payment/process', {})

  check(res, { 'payment mock responds': (r) => r.status === 200 })
}
```

**WireMock stub setup:**

```json
{
  "request": {
    "method": "POST",
    "url": "/api/payment/process"
  },
  "response": {
    "status": 200,
    "jsonBody": {
      "transaction_id": "{{randomValue type='UUID'}}",
      "status": "approved"
    },
    "headers": { "Content-Type": "application/json" },
    "fixedDelayMilliseconds": 200
  }
}
```

**Why WireMock:** Realistic latency simulation, dynamic responses, stateful mocking

---

### Partial Mocking Pattern

**Pattern:** Mock some services, use real endpoints for others

```javascript
// k6 with partial mocking
import http from 'k6/http'

export default function() {
  // Real API (points to staging)
  const productRes = http.get('https://staging-api.example.com/products')

  // Mock email service (points to WireMock)
  http.post('http://localhost:8080/mock/email/send', {
    to: 'user@example.com',
    subject: 'Order confirmation'
  })

  // Real payment sandbox
  http.post('https://sandbox-payment.stripe.com/charge', {
    amount: 1000,
    currency: 'usd',
    source: 'tok_visa'
  })
}
```

**Decision criteria:**
- Real: Services with sandbox, need integration validation, low cost
- Mock: No sandbox, expensive, rate-limited, testing failure scenarios

---

### Testing External Service Failures

**Use mocks to simulate failures** (the 5-second delay simulates a slow failure):

```json
{
  "request": {
    "method": "POST",
    "url": "/api/payment/process"
  },
  "response": {
    "status": 503,
    "jsonBody": { "error": "Service temporarily unavailable" },
    "fixedDelayMilliseconds": 5000
  }
}
```

**k6 test for resilience:**

```javascript
import http from 'k6/http'
import { check } from 'k6'

export default function() {
  const res = http.post('http://localhost:8080/api/payment/process', {})

  // Verify app handles payment failures gracefully
  check(res, {
    'handles payment failure': (r) => r.status === 503,
    'returns within timeout': (r) => r.timings.duration < 6000
  })
}
```

---

### Cost and Compliance Guardrails

**Before testing with real external services:**

| Check | Why |
|-------|-----|
| **Sandbox mode exists?** | Avoid production costs/rate limits |
| **Cost per request?** | 1000 VUs × 10 req/s × 600s = 6M requests |
| **Rate limits?** | Will you hit external service limits? |
| **Terms of service?** | Does load testing violate the TOS? |
| **Data privacy?** | Are you using real user emails/PII? |

**Example cost calculation:**

```
Email service: $0.001/email
Load test: 100 VUs × 5 emails/session × 600s = 300,000 emails
Cost: 300,000 × $0.001 = $300

Decision: Mock email service, use real payment sandbox (free)
```

**Compliance:**
- Don't use real user data in load tests (GDPR, privacy)
- Check third-party TOS (some prohibit load testing)
- Use synthetic test data only

## Your First Load Test

**Goal:** Basic load test in one day

**Hour 1-2: Install tool and write smoke test**

```bash
# Install k6
brew install k6        # macOS
# or: snap install k6  # Linux

# Create test.js
cat > test.js <<'EOF'
import http from 'k6/http'
import { check, sleep } from 'k6'

export let options = { vus: 1, duration: '30s' }

export default function() {
  let res = http.get('https://your-api.com/health')
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response < 500ms': (r) => r.timings.duration < 500
  })
  sleep(1)
}
EOF

# Run smoke test
k6 run test.js
```

**Hour 3-4: Calculate target load**

```
Your DAU: 10,000
Concurrency: 10%
Peak multiplier: 1.5
Target: 10,000 × 0.10 × 1.5 = 1,500 VUs
```

**Hour 5-6: Write load test with ramp-up**

```javascript
export let options = {
  stages: [
    { duration: '5m', target: 750 },   // Ramp to normal (50%)
    { duration: '10m', target: 750 },  // Hold normal
    { duration: '5m', target: 1500 },  // Ramp to peak
    { duration: '10m', target: 1500 }, // Hold peak
    { duration: '5m', target: 0 },     // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1000'],
    http_req_failed: ['rate<0.05'] // <5% errors
  }
}
```

**Hour 7-8: Run test and analyze**

```bash
# Run load test
k6 run --out json=results.json test.js

# Check summary output for:
# - p95/p99 latency trends
# - Error rates
# - When degradation started
```

**If test fails:** Check thresholds, adjust targets, investigate bottlenecks

## Common Mistakes

### ❌ Testing Production Without Safeguards

**Fix:** Use feature flags, a test environment, or a controlled percentage of traffic

---

### ❌ No Baseline Performance Metrics

**Fix:** Run a smoke test first to establish a baseline before load testing

---

### ❌ Using Iteration Duration Instead of Arrival Rate

**Fix:** Use the `constant-arrival-rate` executor in k6

---

### ❌ Not Warming Up Caches/JIT

**Fix:** 2-5 minute warm-up phase before measurement

## Quick Reference

**Tool Selection:**
- Modern API: k6
- Enterprise: JMeter
- Python team: Locust

**Test Patterns:**
- Smoke: 1 VU, 1 min
- Load: Ramp-up → peak → ramp-down
- Stress: Increase until break
- Spike: Sudden 10x surge
- Soak: 4-8 hours constant

**Load Calculation:**

```
Concurrent = DAU × 0.10 × 1.5
RPS = (Concurrent × Requests/Session) / (Duration × Think Time)
```

**Anti-Patterns:**
- Coordinated omission (use arrival rate)
- Cold start (warm-up first)
- Unrealistic data (parameterize)
- Constant load (use realistic patterns)

**Result Interpretation:**
- Linear growth → CPU-bound
- Exponential growth → Resource saturation
- Sudden cliff → Hard limit
- Gradual degradation → Memory leak

**Authentication:**
- Short tests: Pre-generate tokens
- Long tests: Login + refresh
- Testing auth: Simulate login flow

**Third-Party Dependencies:**
- Has sandbox: Use real (staging/sandbox)
- Expensive/rate-limited: Mock (WireMock)
- No sandbox: Mock

## Bottom Line

**Start with a smoke test (1 VU). Calculate realistic load from DAU. Use a ramp-up pattern (never start at peak). Monitor p95/p99 latency. Find the breaking point before users do.**

Test realistic scenarios with think time, not hammer tests.