---
description: Run API load tests with k6, Artillery, or Gatling to measure performance under load
shortcut: loadtest
---

# Run API Load Test

Execute comprehensive load tests to measure API performance, identify bottlenecks, and validate scalability under realistic traffic patterns.

## Design Decisions

This command supports multiple load testing tools to accommodate different testing scenarios and team preferences:

- **k6**: Chosen for its developer-friendly JavaScript API, excellent CLI output, and built-in metrics
- **Artillery**: Selected for YAML configuration simplicity and scenario-based testing
- **Gatling**: Included for enterprise-grade reporting and Scala DSL power users

Alternative approaches considered:

- **JMeter**: Excluded due to its GUI-heavy approach and XML configuration complexity
- **Locust**: Considered but not included, to limit Python dependencies
- **Custom solutions**: Avoided in favor of battle-tested tools with proven metrics accuracy

## When to Use This Command

**USE WHEN:**
- Validating API performance before production deployment
- Establishing baseline performance metrics for SLAs
- Testing autoscaling behavior under load
- Identifying memory leaks or resource exhaustion issues
- Comparing performance across API versions
- Simulating Black Friday or other high-traffic events

**DON'T USE WHEN:**
- Testing production APIs without permission (use staging environments)
- You need functional correctness testing (use integration tests instead)
- Testing third-party APIs you don't control
- During active development (use unit/integration tests first)

## Prerequisites

**Required:**
- Node.js 18+ (for Artillery; k6 ships as a standalone binary)
- Java 11+ (for Gatling)
- Target API endpoint accessible from your machine
- API authentication credentials (if required)

**Recommended:**
- Monitoring tools configured (Prometheus, Grafana, DataDog)
- Baseline metrics from previous test runs
- Staging environment that mirrors production capacity

**Install Tools:**

```bash
# k6 (recommended
# for most use cases)
brew install k6                 # macOS
sudo apt-get install k6         # Ubuntu (after adding the k6 apt repository)

# Artillery
npm install -g artillery

# Gatling
wget https://repo1.maven.org/maven2/io/gatling/highcharts/gatling-charts-highcharts-bundle/3.9.5/gatling-charts-highcharts-bundle-3.9.5.zip
unzip gatling-charts-highcharts-bundle-3.9.5.zip
```

## Detailed Process

### Step 1: Define Test Objectives

Establish clear performance targets before running tests:

- **Response time**: p95 < 200ms, p99 < 500ms
- **Throughput**: 1000 requests/second sustained
- **Error rate**: < 0.1% under normal load
- **Concurrent users**: Support 500 simultaneous users

Document expected behavior under different load levels:

- Normal load: 100-500 RPS
- Peak load: 1000-2000 RPS
- Stress test: 3000+ RPS until failure

### Step 2: Configure Test Scenario

Create test scripts matching realistic user behavior patterns:

**k6 test script** (`load-test.js`):

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp-up
    { duration: '5m', target: 100 }, // Sustained load
    { duration: '2m', target: 200 }, // Scale up
    { duration: '5m', target: 200 }, // Sustained peak
    { duration: '2m', target: 0 },   // Ramp-down
  ],
  thresholds: {
    http_req_duration: ['p(95)<200', 'p(99)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/v1/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
  sleep(1);
}
```

**Artillery config** (`artillery.yml`):

```yaml
config:
  target: 'https://api.example.com'
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Warm up"
    - duration: 300
      arrivalRate: 50
      name: "Sustained load"
    - duration: 120
      arrivalRate: 100
      name: "Peak load"
  processor: "./flows.js"

scenarios:
  - name: "Product browsing flow"
    flow:
      - get:
          url: "/v1/products"
          capture:
            - json: "$.products[0].id"
              as: "productId"
      - get:
          url: "/v1/products/{{ productId }}"
"/v1/products/{{ productId }}" - think: 3 ``` ### Step 3: Execute Load Test Run tests with appropriate parameters and monitor system resources: ```bash # k6 test execution with custom parameters k6 run load-test.js \ --vus 100 \ --duration 10m \ --out json=results.json \ --summary-export=summary.json # Artillery with real-time reporting artillery run artillery.yml \ --output report.json # Gatling test execution ./gatling.sh -s com.example.LoadTest \ -rf results/ ``` Monitor system metrics during execution: - CPU utilization (should stay below 80%) - Memory consumption (watch for leaks) - Network I/O (bandwidth saturation) - Database connections (connection pool exhaustion) ### Step 4: Analyze Results Review metrics to identify performance bottlenecks: **Response Time Analysis:** ```bash # k6 summary shows percentile distribution http_req_duration..............: avg=156ms p(95)=289ms p(99)=456ms http_req_failed................: 0.12% (12 failures / 10000 requests) http_reqs......................: 10000 166.67/s vus............................: 100 min=0 max=100 ``` Key metrics to examine: - **p50 (median)**: Typical user experience - **p95**: Worst case for 95% of users - **p99**: Tail latency affecting 1% of requests - **Error rate**: Percentage of failed requests - **Throughput**: Successful requests per second ### Step 5: Generate Reports and Recommendations Create actionable reports with findings and optimization suggestions: **Performance Report Structure:** ```markdown # Load Test Results - 2025-10-11 ## Test Configuration - Duration: 10 minutes - Virtual Users: 100 - Target: https://api.example.com/v1/products ## Results Summary - Total Requests: 10,000 - Success Rate: 99.88% - Avg Response Time: 156ms - p95 Response Time: 289ms - Throughput: 166.67 RPS ## Findings 1. Database query optimization needed (p99 spikes to 456ms) 2. Connection pool exhausted at 150 concurrent users 3. Memory leak detected after 8 minutes ## Recommendations 1. 
2. Increase connection pool from 20 to 50
3. Fix memory leak in image processing service
```

## Output Format

The command generates structured performance reports:

**Console Output:**

```
Running load test with k6...

  execution: local
     script: load-test.js
     output: json (results.json)
  scenarios: (100.00%) 1 scenario, 200 max VUs, 17m0s max duration

  data_received..................: 48 MB  80 kB/s
  data_sent......................: 2.4 MB 4.0 kB/s
  http_req_blocked...............: avg=1.23ms   p(95)=3.45ms   p(99)=8.91ms
  http_req_connecting............: avg=856µs    p(95)=2.34ms   p(99)=5.67ms
  http_req_duration..............: avg=156.78ms p(95)=289.45ms p(99)=456.12ms
  http_req_failed................: 0.12%
  http_req_receiving.............: avg=234µs    p(95)=567µs    p(99)=1.23ms
  http_req_sending...............: avg=123µs    p(95)=345µs    p(99)=789µs
  http_req_tls_handshaking.......: avg=0s       p(95)=0s       p(99)=0s
  http_req_waiting...............: avg=156.42ms p(95)=288.89ms p(99)=455.34ms
  http_reqs......................: 10000  166.67/s
  iteration_duration.............: avg=1.16s    p(95)=1.29s    p(99)=1.46s
  iterations.....................: 10000  166.67/s
  vus............................: 100    min=0   max=200
  vus_max........................: 200    min=200 max=200
```

**JSON Report:**

```json
{
  "metrics": {
    "http_req_duration": {
      "avg": 156.78,
      "p95": 289.45,
      "p99": 456.12
    },
    "http_req_failed": 0.0012,
    "http_reqs": {
      "count": 10000,
      "rate": 166.67
    }
  },
  "root_group": {
    "checks": {
      "status is 200": {
        "passes": 9988,
        "fails": 12
      }
    }
  }
}
```

## Code Examples

### Example 1: Basic Load Test with k6

Test a REST API endpoint with gradual ramp-up and threshold validation:

```javascript
// basic-load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('errors');

export const options = {
  // Ramp-up pattern: 0 -> 50 -> 100 -> 50 -> 0
  stages: [
    { duration: '1m', target: 50 },  // Ramp-up to 50 users
    { duration: '3m', target: 50 },  // Stay at 50 users
    { duration: '1m', target: 100 }, // Spike to 100 users
    { duration: '3m', target: 100 }, // Stay at 100 users
    { duration: '1m', target: 50 },  // Scale down to 50
    { duration: '1m', target: 0 },   // Ramp-down to 0
  ],
  // Performance thresholds (test fails if exceeded)
  thresholds: {
    'http_req_duration': ['p(95)<300', 'p(99)<500'],
    'http_req_failed': ['rate<0.01'], // Less than 1% errors
    'errors': ['rate<0.1'],
  },
};

export default function () {
  // Test parameters
  const baseUrl = 'https://api.example.com';
  const params = {
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${__ENV.API_TOKEN}`,
    },
  };

  // API request
  const res = http.get(`${baseUrl}/v1/products?limit=20`, params);

  // Validation checks
  const checkRes = check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 300ms': (r) => r.timings.duration < 300,
    'has products': (r) => r.json('products').length > 0,
    'valid JSON': (r) => {
      try {
        JSON.parse(r.body);
        return true;
      } catch (e) {
        return false;
      }
    },
  });

  // Track custom error metric
  errorRate.add(!checkRes);

  // Simulate user think time
  sleep(Math.random() * 3 + 1); // 1-4 seconds
}

// Teardown function (runs once at end)
export function teardown(data) {
  console.log('Load test completed');
}
```

**Run command:**

```bash
# Set API token and execute
export API_TOKEN="your-token-here"
k6 run basic-load-test.js \
  --out json=results.json \
  --summary-export=summary.json

# Optionally convert the JSON results to an HTML report with a
# community tool such as k6-html-reporter
```

### Example 2: Stress Testing with Artillery

Test the API's breaking point with a gradual load increase until failure:

```yaml
# stress-test.yml
config:
  target: 'https://api.example.com'
  phases:
    # Gradual ramp-up to find the breaking point
    - duration: 60
      arrivalRate: 10
      name: "Phase 1: Baseline (10 RPS)"
    - duration: 60
      arrivalRate: 50
      name: "Phase 2: Moderate (50 RPS)"
    - duration: 60
      arrivalRate: 100
      name: "Phase 3: High (100 RPS)"
    - duration: 60
      arrivalRate: 200
      name: "Phase 4: Stress (200 RPS)"
    - duration: 60
      arrivalRate: 400
      name: "Phase 5: Breaking point (400 RPS)"

  # Environment variables
  variables:
    api_token: "{{ $processEnvironment.API_TOKEN }}"

  # Custom processor with hooks (see flows.js below)
  processor: "./flows.js"

  # HTTP settings
  http:
    timeout: 10
    pool: 50

  # Custom plugins
  plugins:
    expect: {}
    metrics-by-endpoint: {}

  # Success criteria
  ensure:
    p95: 500
    p99: 1000
    maxErrorRate: 1

# Test scenarios
scenarios:
  - name: "Product CRUD operations"
    weight: 70
    flow:
      # List products
      - get:
          url: "/v1/products"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200
            - contentType: json
            - hasProperty: products
          capture:
            - json: "$.products[0].id"
              as: "productId"

      # Get product details
      - get:
          url: "/v1/products/{{ productId }}"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200
            - hasProperty: id

      # Think time (user reading)
      - think: 2

      # Search products
      - get:
          url: "/v1/products/search?q=laptop"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200

  - name: "User authentication flow"
    weight: 20
    flow:
      - post:
          url: "/v1/auth/login"
          json:
            email: "test@example.com"
            password: "password123"
          expect:
            - statusCode: 200
            - hasProperty: token
          capture:
            - json: "$.token"
              as: "userToken"
      - get:
          url: "/v1/users/me"
          headers:
            Authorization: "Bearer {{ userToken }}"
          expect:
            - statusCode: 200

  - name: "Shopping cart operations"
    weight: 10
    flow:
      - post:
          url: "/v1/cart/items"
          headers:
            Authorization: "Bearer {{ api_token }}"
          json:
            productId: "{{ productId }}"
            quantity: 1
          expect:
            - statusCode: 201
      - get:
          url: "/v1/cart"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200
            - hasProperty: items
```

**Run with custom processor:**

```javascript
// flows.js - Custom logic for Artillery

module.exports = {
  // Before-request hook: attach auth and a unique request ID
  setAuthToken: function (requestParams, context, ee, next) {
    requestParams.headers = requestParams.headers || {};
    requestParams.headers['Authorization'] = `Bearer ${context.vars.api_token}`;
    requestParams.headers['X-Request-ID'] = `req-${Date.now()}-${Math.random()}`;
    return next();
  },

  // After-response hook
  logResponse: function (requestParams, response, context, ee, next) {
    if (response.statusCode >= 400) {
      console.log(`Error: ${response.statusCode} - ${requestParams.url}`);
    }
    return next();
  },

  // Custom function to generate dynamic data
  generateTestData: function (context, events, done) {
    context.vars.userId = `user-${Math.floor(Math.random() * 10000)}`;
    context.vars.timestamp = new Date().toISOString();
    return done();
  }
};
```

**Execute stress test:**

```bash
# Run with environment variable
API_TOKEN="your-token" artillery run stress-test.yml \
  --output stress-results.json

# Generate HTML report
artillery report stress-results.json \
  --output stress-report.html

# Run with config overrides (inline JSON merged into the config section)
artillery run stress-test.yml \
  --overrides '{"config": {"phases": [{"duration": 30, "arrivalRate": 20}]}}'
```

### Example 3: Performance Testing with Gatling (Scala DSL)

Enterprise-grade load test with complex scenarios and detailed reporting:

```scala
// LoadSimulation.scala
package com.example.loadtest

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class ApiLoadSimulation extends Simulation {

  // HTTP protocol configuration
  val httpProtocol = http
    .baseUrl("https://api.example.com")
    .acceptHeader("application/json")
    .authorizationHeader("Bearer ${accessToken}")
    .userAgentHeader("Gatling Load Test")
    .shareConnections

  // Feeders for test data
  val userFeeder = csv("users.csv").circular
  val productFeeder = csv("products.csv").random

  // Custom headers
  val sentHeaders = Map(
    "X-Request-ID" -> "${requestId}",
    "X-Client-Version" -> "1.0.0"
  )

  // Scenario 1: Browse products
  val browseProducts = scenario("Browse Products")
    .feed(userFeeder)
    .exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
    .exec(
      http("List Products")
        .get("/v1/products")
        .headers(sentHeaders)
        .check(status.is(200))
        .check(jsonPath("$.products[*].id").findAll.saveAs("productIds"))
    )
    .pause(2, 5)
    .exec(
      http("Get Product Details")
        .get("/v1/products/${productIds.random()}")
        .check(status.is(200))
        .check(jsonPath("$.id").exists)
        .check(jsonPath("$.price").ofType[Double].saveAs("price"))
    )
    .pause(1, 3)

  // Scenario 2: Search and filter
  val searchProducts = scenario("Search Products")
    .exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
    .exec(
      http("Search Products")
        .get("/v1/products/search")
        .queryParam("q", "laptop")
        .queryParam("minPrice", "500")
        .queryParam("maxPrice", "2000")
        .headers(sentHeaders)
        .check(status.is(200))
        .check(jsonPath("$.total").ofType[Int].gt(0))
    )
    .pause(2, 4)
    .exec(
      http("Apply Filters")
        .get("/v1/products/search")
        .queryParam("q", "laptop")
        .queryParam("brand", "Dell")
        .queryParam("sort", "price")
        .check(status.is(200))
    )

  // Scenario 3: Checkout flow
  val checkout = scenario("Checkout Flow")
    .feed(userFeeder)
    .feed(productFeeder)
    .exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
    .exec(
      http("Add to Cart")
        .post("/v1/cart/items")
        .headers(sentHeaders)
        .body(StringBody("""{"productId": "${productId}", "quantity": 1}""")).asJson
        .check(status.is(201))
        .check(jsonPath("$.cartId").saveAs("cartId"))
    )
    .pause(1, 2)
    .exec(
      http("Get Cart")
        .get("/v1/cart/${cartId}")
        .check(status.is(200))
        .check(jsonPath("$.total").ofType[Double].saveAs("total"))
    )
    .pause(2, 4)
    .exec(
      http("Create Order")
        .post("/v1/orders")
        .body(StringBody("""{"cartId": "${cartId}", "paymentMethod": "credit_card"}""")).asJson
        .check(status.in(200, 201))
        .check(jsonPath("$.orderId").saveAs("orderId"))
    )
    .exec(
      http("Get Order Status")
        .get("/v1/orders/${orderId}")
        .check(status.is(200))
        .check(jsonPath("$.status").is("pending"))
    )

  // Load profile: Realistic production traffic pattern
  setUp(
    // 70% of users browse products
    browseProducts.inject(
      rampUsersPerSec(1) to 50 during (2 minutes),
      constantUsersPerSec(50) during (5 minutes),
      rampUsersPerSec(50) to 100 during (3 minutes),
      constantUsersPerSec(100) during (5 minutes),
      rampUsersPerSec(100) to 0 during (2 minutes)
    ).protocols(httpProtocol),

    // 20% of users search
    searchProducts.inject(
      rampUsersPerSec(1) to 15 during (2 minutes),
      constantUsersPerSec(15) during (10 minutes),
      rampUsersPerSec(15) to 0 during (2 minutes)
    ).protocols(httpProtocol),

    // 10% of users complete checkout
    checkout.inject(
      rampUsersPerSec(1) to 10 during (3 minutes),
      constantUsersPerSec(10) during (10 minutes),
      rampUsersPerSec(10) to 0 during (2 minutes)
    ).protocols(httpProtocol)
  ).protocols(httpProtocol)
   .assertions(
     global.responseTime.max.lt(2000),
     global.responseTime.percentile3.lt(500),
     global.successfulRequests.percent.gt(99)
   )
}
```

**Supporting data files:**

`users.csv`:

```csv
userId,accessToken
user-001,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
user-002,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
user-003,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```

`products.csv`:

```csv
productId,category
prod-001,electronics
prod-002,clothing
prod-003,books
```

**Run Gatling simulation:**

```bash
# Using the Gatling Maven plugin
mvn gatling:test -Dgatling.simulationClass=com.example.loadtest.ApiLoadSimulation

# Using standalone Gatling
./gatling.sh -s com.example.loadtest.ApiLoadSimulation \
  -rf results/

# Generate report only (from a previous run)
./gatling.sh -ro results/apisimulation-20251011143022456
```

**Gatling configuration** (`gatling.conf`):

```hocon
gatling {
  core {
    outputDirectoryBaseName = "api-load-test"
    runDescription = "Production load simulation"
    encoding = "utf-8"
    simulationClass = ""
  }
  charting {
    indicators {
      lowerBound = 100    # Lower bound for response time (ms)
      higherBound = 500   # Higher bound for response time (ms)
      percentile1 = 50    # First percentile
      percentile2 = 75    # Second percentile
      percentile3 = 95    # Third percentile
      percentile4 = 99    # Fourth percentile
    }
  }
  http {
    ahc {
      pooledConnectionIdleTimeout = 60000
      readTimeout = 60000
      requestTimeout = 60000
      connectionTimeout = 30000
      maxConnections = 200
      maxConnectionsPerHost = 50
    }
  }
  data {
    writers = [console, file]
  }
}
```

## Error Handling

Common errors and solutions:

**Connection Refused:**
```
Error: connect ECONNREFUSED 127.0.0.1:8080
```
Solution: Verify the API is running and accessible. Check network connectivity and firewall rules.

**Timeout Errors:**
```
http_req_failed: 45.2% (4520 failures / 10000 requests)
```
Solution: Increase timeout values or reduce concurrent users. The API may be overwhelmed.

**SSL/TLS Errors:**
```
Error: x509: certificate signed by unknown authority
```
Solution: Add `insecureSkipTLSVerify: true` (test environments only) or configure proper CA certificates.

**Rate Limiting:**
```
HTTP 429 Too Many Requests
```
Solution: Reduce the request rate or increase rate limits on the API server. Add backoff logic.

**Memory Exhaustion:**
```
JavaScript heap out of memory
```
Solution: Increase the Node.js memory limit for Artillery: `NODE_OPTIONS=--max-old-space-size=4096 artillery run test.yml` (this does not apply to k6, which is a native binary)

**Authentication Failures:**
```
HTTP 401 Unauthorized
```
Solution: Verify API tokens are valid and not expired. Check authorization headers.

## Configuration Options

### k6 Options

```bash
--vus N                  # Number of virtual users (default: 1)
--duration Xm            # Test duration (e.g., 10m, 30s)
--iterations N           # Total iterations across all VUs
--stage "Xm:N"           # Add load stage (duration:target)
--rps N                  # Max requests per second
--max-redirects N        # Max HTTP redirects (default: 10)
--batch N                # Max parallel batch requests
--batch-per-host N       # Max parallel requests per host
--http-debug             # Enable HTTP debug logging
--no-connection-reuse    # Disable HTTP keep-alive
--throw                  # Throw errors on failed HTTP requests
--summary-trend-stats    # Custom summary stats (e.g., "avg,p(95),p(99)")
--out json=file.json     # Export results to JSON
--out influxdb=http://...  # Export to InfluxDB
--out statsd             # Export to StatsD
```

### Artillery Options

```bash
--target URL             # Override target URL
--output FILE            # Save results to JSON file
--overrides JSON         # Override config with an inline JSON string
--variables FILE         # Load variables from JSON
--config FILE            # Load config from a separate file
--environment ENV        # Select environment from config
--solo                   # Run test without publishing
--quiet                  # Suppress output
--plugins                # List installed plugins
--dotenv FILE            # Load environment from .env file
```

### Gatling Options

```bash
-s CLASS                 # Simulation class to run
-rf FOLDER               # Results folder
-rd DESC                 # Run description
-nr                      # No reports generation
-ro FOLDER               # Generate reports only
```

## Best Practices

### DO:

- Start with a baseline test (low load) to verify test scripts work correctly
- Ramp up load gradually to identify inflection points
- Monitor backend resources (CPU, memory, database) during tests
- Use realistic think times (1-5 seconds) to simulate user behavior
- Test in a staging environment that mirrors production capacity
- Run tests multiple times to establish consistency
- Document test configuration and results for historical comparison
- Use connection pooling and HTTP keep-alive for realistic scenarios
- Set appropriate timeouts (30-60 seconds for most APIs)
- Clean up test data after runs (especially for write-heavy tests)

### DON'T:

- Don't load test production without explicit permission and monitoring
- Don't ignore the warmup period (JIT compilation, cache warming)
- Don't test from the same datacenter as the API (unrealistic latency)
- Don't use default test data (create realistic, varied datasets)
- Don't skip the cool-down period (observe resource cleanup)
- Don't test only happy paths (include error scenarios)
- Don't ignore database connection limits
- Don't run tests during production deployments
- Don't compare results across different network conditions
- Don't test third-party APIs without permission
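Much of the guidance in this document leans on percentile metrics rather than averages. As a quick illustration of what p50/p95/p99 actually summarize, here is a minimal Node.js sketch (a hypothetical helper using the nearest-rank method, not part of k6 or Artillery) applied to a handful of latency samples:

```javascript
// percentiles.js - nearest-rank percentile over latency samples (ms).
// Illustrative only; real tools compute these over full result sets.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

const latenciesMs = [95, 99, 120, 130, 160, 175, 180, 210, 300, 450];
console.log(percentile(latenciesMs, 50)); // 160 - typical user experience
console.log(percentile(latenciesMs, 95)); // 450 - worst case for 95% of users
```

Note that the average of these samples is roughly 192 ms, which hides the 450 ms tail that p95 exposes; this is why SLA targets are phrased in percentiles.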
### TIPS:

- Use distributed load generation for tests > 1000 VUs
- Export metrics to monitoring systems (Prometheus, DataDog) for correlation
- Create custom dashboards showing load test progress in real time
- Use percentiles (p95, p99) instead of averages for SLA targets
- Test warm-cache and cold-cache scenarios separately
- Include authentication overhead in realistic flows
- Validate response bodies, not just status codes
- Use unique IDs per virtual user to avoid data conflicts
- Schedule tests during low-traffic periods
- Keep test scripts in version control with the API code

## Related Commands

- `/api-mock-server` - Create a mock API for testing without a backend
- `/api-monitoring-dashboard` - Set up real-time monitoring during load tests
- `/api-cache-manager` - Configure caching to improve performance under load
- `/api-rate-limiter` - Implement rate limiting to protect APIs
- `/deployment-pipeline-orchestrator` - Integrate load tests into the CI/CD pipeline
- `/kubernetes-deployment-creator` - Configure autoscaling based on load test findings

## Performance Considerations

### Test Environment Sizing

- **Client machine**: 1 VU ≈ 1-10 MB RAM, 0.01-0.1 CPU cores
- **Network bandwidth**: 1000 VUs ≈ 10-100 Mbps depending on payload size
- **k6 limits**: A single instance handles 30,000-40,000 VUs (depends on script complexity)
- **Artillery limits**: A single instance handles 5,000-10,000 RPS
- **Gatling limits**: A single instance handles 50,000+ VUs (JVM-based)

### Backend Resource Planning

- **Database connections**: Plan for peak concurrent users plus connection pool overhead
- **CPU utilization**: Keep below 80% under sustained load (leave headroom for spikes)
- **Memory**: Monitor for leaks (heap should stabilize after warmup)
- **Network I/O**: Ensure network bandwidth exceeds expected throughput by 50%

### Optimization Strategies

- **HTTP keep-alive**: Reduces connection overhead by 50-80%
- **Response compression**: Reduces bandwidth by 60-80% for text responses
- **CDN caching**: Offloads 70-90% of static asset requests
- **Database indexing**: Can improve query performance by 10-100x
- **Connection pooling**: Reduces latency by 20-50ms per request
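The sizing rules of thumb above can be turned into rough numbers before a test. Here is a small Node.js sketch (a hypothetical helper; the per-VU RAM range and payload-driven bandwidth formula are just the figures quoted above) for estimating load-generator requirements:

```javascript
// sizing.js - back-of-envelope load-generator sizing.
// Assumes ~1-10 MB RAM per VU, per the rule-of-thumb ranges above;
// run a small pilot test to get real numbers for your script.
function estimateClient({ vus, targetRps, avgResponseKB }) {
  return {
    ramMB: { low: vus * 1, high: vus * 10 },
    bandwidthMbps: (targetRps * avgResponseKB * 8) / 1000, // KB/s -> Mbps
  };
}

const plan = estimateClient({ vus: 1000, targetRps: 2000, avgResponseKB: 12 });
console.log(plan.ramMB.high);    // 10000 (MB) - worst case, plan headroom
console.log(plan.bandwidthMbps); // 192
```

At 2000 RPS and 12 KB responses the generator needs roughly 192 Mbps of sustained bandwidth, which is why distributed generation is recommended at larger scales.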
## Security Notes

### Testing Permissions

- Obtain written approval before load testing any environment
- Verify testing is allowed by the API terms of service
- Use dedicated test accounts with limited privileges
- Test in isolated environments to prevent data corruption

### Credential Management

- Never hardcode API keys or passwords in test scripts
- Use environment variables: `export API_TOKEN=$(vault read -field=token secret/api)`
- Rotate test credentials regularly
- Use short-lived tokens (JWT with 1-hour expiry)
- Store sensitive data in secrets managers (Vault, AWS Secrets Manager)

### Data Privacy

- Use synthetic test data (never real customer PII)
- Anonymize logs and results before sharing
- Clean up test data immediately after test completion
- Encrypt results files containing sensitive information

### Network Security

- Run tests from trusted networks (avoid public WiFi)
- Use a VPN when testing internal APIs
- Implement IP whitelisting for test traffic
- Monitor for anomalous traffic patterns during tests

## Troubleshooting Guide

### Issue: Inconsistent Results Between Runs

**Symptoms:** Response times vary by > 50% between identical test runs

**Diagnosis:**
- Check for background jobs or cron tasks running during the test
- Verify the database wasn't backed up during the test
- Ensure no other load tests were running concurrently

**Solution:**
- Schedule tests during known quiet periods
- Disable background tasks during the test window
- Run multiple iterations and take median results

### Issue: Low Throughput Despite Low CPU/Memory

**Symptoms:** API handling only 100 RPS despite 20% CPU usage

**Diagnosis:**
- Check network bandwidth utilization
- Examine database connection pool exhaustion
- Look for synchronous I/O blocking (file system, external API calls)

**Solution:**
- Increase connection pool size
- Implement async I/O for external calls
- Add a caching layer (Redis) for frequently accessed data
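To make the caching suggestion concrete, here is a minimal Node.js sketch of a TTL cache (an illustrative in-memory stand-in for Redis; all names are hypothetical) showing how repeated reads of a hot key stop hitting the backend:

```javascript
// cache-sketch.js - illustrative TTL cache, a stand-in for Redis.
// Shows why caching lifts throughput: repeated reads skip the backend.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.store = new Map();
  }
  get(key) {
    const hit = this.store.get(key);
    if (!hit || Date.now() > hit.expires) return undefined;
    return hit.value;
  }
  set(key, value) {
    this.store.set(key, { value, expires: Date.now() + this.ttlMs });
  }
}

let backendCalls = 0;
function fetchProduct(id, cache) {
  const cached = cache.get(id);
  if (cached !== undefined) return cached; // cache hit: no backend work
  backendCalls++;                          // cache miss: slow backend call
  const product = { id, name: `product-${id}` };
  cache.set(id, product);
  return product;
}

const cache = new TtlCache(60_000); // 60s TTL
for (let i = 0; i < 100; i++) fetchProduct('prod-001', cache);
console.log(backendCalls); // 1 - the other 99 reads were served from cache
```

The TTL bounds staleness; in a real deployment the same pattern applies with Redis `GET`/`SET` plus an expiry, and cache hit rate becomes a key metric to watch during the load test.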
### Issue: Error Rate Increases Under Load

**Symptoms:** 0.1% errors at 100 RPS, 5% errors at 500 RPS

**Diagnosis:**
- Database deadlocks or lock contention
- Race conditions in concurrent code paths
- Resource exhaustion (file descriptors, sockets)

**Solution:**
- Add database query logging to identify slow queries
- Implement optimistic locking or queue-based processing
- Increase file descriptor limits: `ulimit -n 65536`

### Issue: Memory Leak Detected

**Symptoms:** Memory usage grows continuously without stabilizing

**Diagnosis:**
- Heap dump analysis shows a growing object count
- GC frequency increases over time
- The API becomes unresponsive after extended load

**Solution:**
- Profile the application with a heap analyzer (Chrome DevTools, VisualVM)
- Check for unclosed database connections or file handles
- Review event listener registration (a common memory leak source)

### Issue: Test Client Crashes

**Symptoms:** The k6/Artillery process is terminated with an OOM error

**Diagnosis:**
- Too many VUs for the available client memory
- Large response bodies consuming memory
- Results export causing memory pressure

**Solution:**
- Reduce the VU count or distribute the test across multiple machines
- For Artillery, increase Node.js memory: `NODE_OPTIONS=--max-old-space-size=8192` (k6 is a native binary; reduce VUs or shard the test instead)
- Disable detailed logging: use `--quiet`, or export only the summary

## Version History

- **1.0.0** (2025-10-11) - Initial release with k6, Artillery, and Gatling support
- **1.1.0** (2025-10-15) - Added custom metrics and Prometheus integration
- **1.2.0** (2025-10-20) - Distributed load testing support for high-scale scenarios