---
description: Run API load tests with k6, Artillery, or Gatling to measure performance under load
shortcut: loadtest
---
# Run API Load Test
Execute comprehensive load tests to measure API performance, identify bottlenecks, and validate scalability under realistic traffic patterns.
## Design Decisions
This command supports multiple load testing tools to accommodate different testing scenarios and team preferences:
- **k6**: Chosen for developer-friendly JavaScript API, excellent CLI output, and built-in metrics
- **Artillery**: Selected for YAML configuration simplicity and scenario-based testing
- **Gatling**: Included for enterprise-grade reporting and Scala DSL power users
Alternative approaches considered:
- **JMeter**: Excluded due to GUI-heavy approach and XML configuration complexity
- **Locust**: Considered but not included to limit Python dependencies
- **Custom solutions**: Avoided to leverage battle-tested tools with proven metrics accuracy
## When to Use This Command
**USE WHEN:**
- Validating API performance before production deployment
- Establishing baseline performance metrics for SLAs
- Testing autoscaling behavior under load
- Identifying memory leaks or resource exhaustion issues
- Comparing performance across API versions
- Simulating Black Friday or high-traffic events
**DON'T USE WHEN:**
- Testing production APIs without permission (use staging environments)
- You need functional correctness testing (use integration tests instead)
- Testing third-party APIs you don't control
- During active development (use unit/integration tests first)
## Prerequisites
**Required:**
- Node.js 18+ (for Artillery; k6 ships as a standalone binary and does not require Node.js)
- Java 11+ (for Gatling)
- Target API endpoint accessible from your machine
- API authentication credentials (if required)
**Recommended:**
- Monitoring tools configured (Prometheus, Grafana, DataDog)
- Baseline metrics from previous test runs
- Staging environment that mirrors production capacity
**Install Tools:**
```bash
# k6 (recommended for most use cases)
brew install k6            # macOS
sudo apt-get install k6    # Ubuntu (requires adding the k6 apt repository first; see the k6 install docs)

# Artillery
npm install -g artillery

# Gatling
wget https://repo1.maven.org/maven2/io/gatling/highcharts/gatling-charts-highcharts-bundle/3.9.5/gatling-charts-highcharts-bundle-3.9.5.zip
unzip gatling-charts-highcharts-bundle-3.9.5.zip
```
## Detailed Process
### Step 1: Define Test Objectives
Establish clear performance targets before running tests:
- **Response time**: p95 < 200ms, p99 < 500ms
- **Throughput**: 1000 requests/second sustained
- **Error rate**: < 0.1% under normal load
- **Concurrent users**: Support 500 simultaneous users
Document expected behavior under different load levels:
- Normal load: 100-500 RPS
- Peak load: 1000-2000 RPS
- Stress test: 3000+ RPS until failure
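To translate these targets into a virtual-user count before scripting, Little's Law gives a quick estimate: concurrency ≈ arrival rate × time each request "occupies" a user (response time plus think time). A minimal sketch (the function name and numbers are illustrative, not part of any tool's API):

```javascript
// Estimate concurrent VUs needed to sustain a target request rate.
// Little's Law: concurrency = arrival rate x time per user iteration.
function estimateVUs(targetRps, avgResponseSec, thinkTimeSec) {
  const iterationSec = avgResponseSec + thinkTimeSec; // one full user loop
  return Math.ceil(targetRps * iterationSec);
}

// Example: 1000 RPS with 200ms responses and 1s think time
console.log(estimateVUs(1000, 0.2, 1)); // 1200 VUs
```

This is why think time matters: the same 1000 RPS needs roughly 200 VUs with no think time but 1200 VUs with a 1-second pause per iteration.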
### Step 2: Configure Test Scenario
Create test scripts matching realistic user behavior patterns:
**k6 test script** (`load-test.js`):
```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp-up
    { duration: '5m', target: 100 }, // Sustained load
    { duration: '2m', target: 200 }, // Scale up
    { duration: '5m', target: 200 }, // Sustained peak
    { duration: '2m', target: 0 },   // Ramp-down
  ],
  thresholds: {
    http_req_duration: ['p(95)<200', 'p(99)<500'],
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/v1/products');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 200ms': (r) => r.timings.duration < 200,
  });
  sleep(1);
}
```
**Artillery config** (`artillery.yml`):
```yaml
config:
  target: 'https://api.example.com'
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Warm up"
    - duration: 300
      arrivalRate: 50
      name: "Sustained load"
    - duration: 120
      arrivalRate: 100
      name: "Peak load"
  processor: "./flows.js"

scenarios:
  - name: "Product browsing flow"
    flow:
      - get:
          url: "/v1/products"
          capture:
            - json: "$.products[0].id"
              as: "productId"
      - get:
          url: "/v1/products/{{ productId }}"
      - think: 3
```
### Step 3: Execute Load Test
Run tests with appropriate parameters and monitor system resources:
```bash
# k6 test execution with custom parameters
k6 run load-test.js \
  --vus 100 \
  --duration 10m \
  --out json=results.json \
  --summary-export=summary.json

# Artillery with real-time reporting
artillery run artillery.yml \
  --output report.json

# Gatling test execution
./gatling.sh -s com.example.LoadTest \
  -rf results/
```
Monitor system metrics during execution:
- CPU utilization (should stay below 80%)
- Memory consumption (watch for leaks)
- Network I/O (bandwidth saturation)
- Database connections (connection pool exhaustion)
### Step 4: Analyze Results
Review metrics to identify performance bottlenecks:
**Response Time Analysis:**
```
# k6 summary shows percentile distribution
http_req_duration..............: avg=156ms p(95)=289ms p(99)=456ms
http_req_failed................: 0.12% (12 failures / 10000 requests)
http_reqs......................: 10000 166.67/s
vus............................: 100 min=0 max=100
```
Key metrics to examine:
- **p50 (median)**: Typical user experience
- **p95**: Worst case for 95% of users
- **p99**: Tail latency affecting 1% of requests
- **Error rate**: Percentage of failed requests
- **Throughput**: Successful requests per second
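These percentiles can also be recomputed from raw latency samples (for example, durations extracted from `results.json`) using the nearest-rank method. A minimal sketch (sample values are illustrative):

```javascript
// Nearest-rank percentile over raw latency samples (milliseconds).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  // 1-based rank of the p-th percentile, rounded up (nearest-rank method)
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

const latencies = [120, 95, 210, 150, 480, 130, 160, 175, 140, 300];
console.log(percentile(latencies, 50)); // 150 (median)
console.log(percentile(latencies, 95)); // 480 (tail latency)
```

Note how one slow outlier (480ms) dominates p95 while barely moving the median; this is why SLAs target percentiles rather than averages.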
### Step 5: Generate Reports and Recommendations
Create actionable reports with findings and optimization suggestions:
**Performance Report Structure:**
```markdown
# Load Test Results - 2025-10-11
## Test Configuration
- Duration: 10 minutes
- Virtual Users: 100
- Target: https://api.example.com/v1/products
## Results Summary
- Total Requests: 10,000
- Success Rate: 99.88%
- Avg Response Time: 156ms
- p95 Response Time: 289ms
- Throughput: 166.67 RPS
## Findings
1. Database query optimization needed (p99 spikes to 456ms)
2. Connection pool exhausted at 150 concurrent users
3. Memory leak detected after 8 minutes
## Recommendations
1. Add database indexes on product_id and category
2. Increase connection pool from 20 to 50
3. Fix memory leak in image processing service
```
## Output Format
The command generates structured performance reports:
**Console Output:**
```
Running load test with k6...
execution: local
script: load-test.js
output: json (results.json)
scenarios: (100.00%) 1 scenario, 200 max VUs, 17m0s max duration
data_received..................: 48 MB 80 kB/s
data_sent......................: 2.4 MB 4.0 kB/s
http_req_blocked...............: avg=1.23ms p(95)=3.45ms p(99)=8.91ms
http_req_connecting............: avg=856µs p(95)=2.34ms p(99)=5.67ms
http_req_duration..............: avg=156.78ms p(95)=289.45ms p(99)=456.12ms
http_req_failed................: 0.12%
http_req_receiving.............: avg=234µs p(95)=567µs p(99)=1.23ms
http_req_sending...............: avg=123µs p(95)=345µs p(99)=789µs
http_req_tls_handshaking.......: avg=0s p(95)=0s p(99)=0s
http_req_waiting...............: avg=156.42ms p(95)=288.89ms p(99)=455.34ms
http_reqs......................: 10000 166.67/s
iteration_duration.............: avg=1.16s p(95)=1.29s p(99)=1.46s
iterations.....................: 10000 166.67/s
vus............................: 100 min=0 max=200
vus_max........................: 200 min=200 max=200
```
**JSON Report:**
```json
{
  "metrics": {
    "http_req_duration": {
      "avg": 156.78,
      "p95": 289.45,
      "p99": 456.12
    },
    "http_req_failed": 0.0012,
    "http_reqs": {
      "count": 10000,
      "rate": 166.67
    }
  },
  "root_group": {
    "checks": {
      "status is 200": {
        "passes": 9988,
        "fails": 12
      }
    }
  }
}
```
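A report in this shape can be validated programmatically, for example to fail a CI job when thresholds are exceeded. A minimal sketch (field names follow the JSON above; the function name and limit values are illustrative):

```javascript
// Check a k6-style JSON summary against performance limits.
// Returns an array of human-readable failures; empty means the run passed.
function validateReport(report, limits) {
  const failures = [];
  const d = report.metrics.http_req_duration;
  if (d.p95 > limits.p95) failures.push(`p95 ${d.p95}ms > ${limits.p95}ms`);
  if (d.p99 > limits.p99) failures.push(`p99 ${d.p99}ms > ${limits.p99}ms`);
  if (report.metrics.http_req_failed > limits.errorRate) {
    failures.push(`error rate ${report.metrics.http_req_failed} > ${limits.errorRate}`);
  }
  return failures;
}

const report = {
  metrics: {
    http_req_duration: { avg: 156.78, p95: 289.45, p99: 456.12 },
    http_req_failed: 0.0012,
    http_reqs: { count: 10000, rate: 166.67 },
  },
};
console.log(validateReport(report, { p95: 300, p99: 500, errorRate: 0.01 })); // []
```

In a pipeline, a non-empty result would translate to a non-zero exit code so the deployment gate fails.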
## Code Examples
### Example 1: Basic Load Test with k6
Test a REST API endpoint with gradual ramp-up and threshold validation:
```javascript
// basic-load-test.js
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

// Custom metrics
const errorRate = new Rate('errors');

export const options = {
  // Ramp-up pattern: 0 -> 50 -> 100 -> 50 -> 0
  stages: [
    { duration: '1m', target: 50 },  // Ramp-up to 50 users
    { duration: '3m', target: 50 },  // Stay at 50 users
    { duration: '1m', target: 100 }, // Spike to 100 users
    { duration: '3m', target: 100 }, // Stay at 100 users
    { duration: '1m', target: 50 },  // Scale down to 50
    { duration: '1m', target: 0 },   // Ramp-down to 0
  ],
  // Performance thresholds (test fails if exceeded)
  thresholds: {
    'http_req_duration': ['p(95)<300', 'p(99)<500'],
    'http_req_failed': ['rate<0.01'], // Less than 1% errors
    'errors': ['rate<0.1'],
  },
};

export default function () {
  // Test parameters
  const baseUrl = 'https://api.example.com';
  const params = {
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${__ENV.API_TOKEN}`,
    },
  };

  // API request
  const res = http.get(`${baseUrl}/v1/products?limit=20`, params);

  // Validation checks
  const checkRes = check(res, {
    'status is 200': (r) => r.status === 200,
    'response time < 300ms': (r) => r.timings.duration < 300,
    'has products': (r) => r.json('products').length > 0,
    'valid JSON': (r) => {
      try {
        JSON.parse(r.body);
        return true;
      } catch (e) {
        return false;
      }
    },
  });

  // Track custom error metric
  errorRate.add(!checkRes);

  // Simulate user think time
  sleep(Math.random() * 3 + 1); // 1-4 seconds
}

// Teardown function (runs once at end)
export function teardown(data) {
  console.log('Load test completed');
}
```
**Run command:**
```bash
# Set API token and execute
export API_TOKEN="your-token-here"
k6 run basic-load-test.js \
--out json=results.json \
--summary-export=summary.json
# k6 has no built-in HTML report; a community tool such as
# benc-uk/k6-reporter (imported in the script's handleSummary hook)
# can convert the summary into HTML
```
### Example 2: Stress Testing with Artillery
Test API breaking point with gradual load increase until failure:
```yaml
# stress-test.yml
config:
  target: 'https://api.example.com'
  phases:
    # Gradual ramp-up to find breaking point
    - duration: 60
      arrivalRate: 10
      name: "Phase 1: Baseline (10 RPS)"
    - duration: 60
      arrivalRate: 50
      name: "Phase 2: Moderate (50 RPS)"
    - duration: 60
      arrivalRate: 100
      name: "Phase 3: High (100 RPS)"
    - duration: 60
      arrivalRate: 200
      name: "Phase 4: Stress (200 RPS)"
    - duration: 60
      arrivalRate: 400
      name: "Phase 5: Breaking point (400 RPS)"

  # Environment variables
  variables:
    api_token: "{{ $processEnvironment.API_TOKEN }}"

  # HTTP settings
  http:
    timeout: 10
    pool: 50

  # Custom plugins
  plugins:
    expect: {}
    metrics-by-endpoint: {}

  # Success criteria
  ensure:
    p95: 500
    p99: 1000
    maxErrorRate: 1

# Test scenarios
scenarios:
  - name: "Product CRUD operations"
    weight: 70
    flow:
      # List products
      - get:
          url: "/v1/products"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200
            - contentType: json
            - hasProperty: products
          capture:
            - json: "$.products[0].id"
              as: "productId"
      # Get product details
      - get:
          url: "/v1/products/{{ productId }}"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200
            - hasProperty: id
      # Think time (user reading)
      - think: 2
      # Search products
      - get:
          url: "/v1/products/search?q=laptop"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200

  - name: "User authentication flow"
    weight: 20
    flow:
      - post:
          url: "/v1/auth/login"
          json:
            email: "test@example.com"
            password: "password123"
          expect:
            - statusCode: 200
            - hasProperty: token
          capture:
            - json: "$.token"
              as: "userToken"
      - get:
          url: "/v1/users/me"
          headers:
            Authorization: "Bearer {{ userToken }}"
          expect:
            - statusCode: 200

  - name: "Shopping cart operations"
    weight: 10
    flow:
      - post:
          url: "/v1/cart/items"
          headers:
            Authorization: "Bearer {{ api_token }}"
          json:
            productId: "{{ productId }}"
            quantity: 1
          expect:
            - statusCode: 201
      - get:
          url: "/v1/cart"
          headers:
            Authorization: "Bearer {{ api_token }}"
          expect:
            - statusCode: 200
            - hasProperty: items
```
**Run with custom processor:**
```javascript
// flows.js - Custom logic for Artillery
module.exports = {
  // Before request hook
  setAuthToken: function (requestParams, context, ee, next) {
    requestParams.headers = requestParams.headers || {};
    requestParams.headers['X-Request-ID'] = `req-${Date.now()}-${Math.random()}`;
    return next();
  },

  // After response hook
  logResponse: function (requestParams, response, context, ee, next) {
    if (response.statusCode >= 400) {
      console.log(`Error: ${response.statusCode} - ${requestParams.url}`);
    }
    return next();
  },

  // Custom function to generate dynamic data
  generateTestData: function (context, events, done) {
    context.vars.userId = `user-${Math.floor(Math.random() * 10000)}`;
    context.vars.timestamp = new Date().toISOString();
    return done();
  },
};
```
**Execute stress test:**
```bash
# Run with environment variable
API_TOKEN="your-token" artillery run stress-test.yml \
  --output stress-results.json

# Generate HTML report
artillery report stress-results.json \
  --output stress-report.html

# Override parts of the config with an inline JSON document
artillery run stress-test.yml \
  --overrides '{"config": {"phases": [{"duration": 30, "arrivalRate": 20}]}}'
```
### Example 3: Performance Testing with Gatling (Scala DSL)
Enterprise-grade load test with complex scenarios and detailed reporting:
```scala
// LoadSimulation.scala
package com.example.loadtest

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class ApiLoadSimulation extends Simulation {

  // HTTP protocol configuration
  val httpProtocol = http
    .baseUrl("https://api.example.com")
    .acceptHeader("application/json")
    .authorizationHeader("Bearer ${accessToken}")
    .userAgentHeader("Gatling Load Test")
    .shareConnections

  // Feeders for test data
  val userFeeder = csv("users.csv").circular
  val productFeeder = csv("products.csv").random

  // Custom headers
  val sentHeaders = Map(
    "X-Request-ID" -> "${requestId}",
    "X-Client-Version" -> "1.0.0"
  )

  // Scenario 1: Browse products
  val browseProducts = scenario("Browse Products")
    .feed(userFeeder)
    .exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
    .exec(
      http("List Products")
        .get("/v1/products")
        .headers(sentHeaders)
        .check(status.is(200))
        .check(jsonPath("$.products[*].id").findAll.saveAs("productIds"))
    )
    .pause(2, 5)
    .exec(
      http("Get Product Details")
        .get("/v1/products/${productIds.random()}")
        .check(status.is(200))
        .check(jsonPath("$.id").exists)
        .check(jsonPath("$.price").ofType[Double].saveAs("price"))
    )
    .pause(1, 3)

  // Scenario 2: Search and filter
  val searchProducts = scenario("Search Products")
    .exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
    .exec(
      http("Search Products")
        .get("/v1/products/search")
        .queryParam("q", "laptop")
        .queryParam("minPrice", "500")
        .queryParam("maxPrice", "2000")
        .headers(sentHeaders)
        .check(status.is(200))
        .check(jsonPath("$.total").ofType[Int].gt(0))
    )
    .pause(2, 4)
    .exec(
      http("Apply Filters")
        .get("/v1/products/search")
        .queryParam("q", "laptop")
        .queryParam("brand", "Dell")
        .queryParam("sort", "price")
        .check(status.is(200))
    )

  // Scenario 3: Checkout flow
  val checkout = scenario("Checkout Flow")
    .feed(userFeeder)
    .feed(productFeeder)
    .exec(session => session.set("requestId", java.util.UUID.randomUUID.toString))
    .exec(
      http("Add to Cart")
        .post("/v1/cart/items")
        .headers(sentHeaders)
        .body(StringBody("""{"productId": "${productId}", "quantity": 1}"""))
        .asJson
        .check(status.is(201))
        .check(jsonPath("$.cartId").saveAs("cartId"))
    )
    .pause(1, 2)
    .exec(
      http("Get Cart")
        .get("/v1/cart/${cartId}")
        .check(status.is(200))
        .check(jsonPath("$.total").ofType[Double].saveAs("total"))
    )
    .pause(2, 4)
    .exec(
      http("Create Order")
        .post("/v1/orders")
        .body(StringBody("""{"cartId": "${cartId}", "paymentMethod": "credit_card"}"""))
        .asJson
        .check(status.in(200, 201))
        .check(jsonPath("$.orderId").saveAs("orderId"))
    )
    .exec(
      http("Get Order Status")
        .get("/v1/orders/${orderId}")
        .check(status.is(200))
        .check(jsonPath("$.status").is("pending"))
    )

  // Load profile: Realistic production traffic pattern
  setUp(
    // 70% users browse products
    browseProducts.inject(
      rampUsersPerSec(1) to 50 during (2 minutes),
      constantUsersPerSec(50) during (5 minutes),
      rampUsersPerSec(50) to 100 during (3 minutes),
      constantUsersPerSec(100) during (5 minutes),
      rampUsersPerSec(100) to 0 during (2 minutes)
    ).protocols(httpProtocol),
    // 20% users search
    searchProducts.inject(
      rampUsersPerSec(1) to 15 during (2 minutes),
      constantUsersPerSec(15) during (10 minutes),
      rampUsersPerSec(15) to 0 during (2 minutes)
    ).protocols(httpProtocol),
    // 10% users complete checkout
    checkout.inject(
      rampUsersPerSec(1) to 10 during (3 minutes),
      constantUsersPerSec(10) during (10 minutes),
      rampUsersPerSec(10) to 0 during (2 minutes)
    ).protocols(httpProtocol)
  ).assertions(
    global.responseTime.max.lt(2000),
    global.responseTime.percentile3.lt(500),
    global.successfulRequests.percent.gt(99)
  )
}
```
**Supporting data files:**
`users.csv`:
```csv
userId,accessToken
user-001,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
user-002,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
user-003,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
```
`products.csv`:
```csv
productId,category
prod-001,electronics
prod-002,clothing
prod-003,books
```
**Run Gatling simulation:**
```bash
# Using Gatling Maven plugin
mvn gatling:test -Dgatling.simulationClass=com.example.loadtest.ApiLoadSimulation
# Using standalone Gatling
./gatling.sh -s com.example.loadtest.ApiLoadSimulation \
-rf results/
# Generate report only (from previous run)
./gatling.sh -ro results/apisimulation-20251011143022456
```
**Gatling configuration** (`gatling.conf`):
```hocon
gatling {
  core {
    outputDirectoryBaseName = "api-load-test"
    runDescription = "Production load simulation"
    encoding = "utf-8"
    simulationClass = ""
  }
  charting {
    indicators {
      lowerBound = 100    # Lower bound for response time (ms)
      higherBound = 500   # Higher bound for response time (ms)
      percentile1 = 50    # First percentile
      percentile2 = 75    # Second percentile
      percentile3 = 95    # Third percentile
      percentile4 = 99    # Fourth percentile
    }
  }
  http {
    ahc {
      pooledConnectionIdleTimeout = 60000
      readTimeout = 60000
      requestTimeout = 60000
      connectionTimeout = 30000
      maxConnections = 200
      maxConnectionsPerHost = 50
    }
  }
  data {
    writers = [console, file]
  }
}
```
## Error Handling
Common errors and solutions:
**Connection Refused:**
```
Error: connect ECONNREFUSED 127.0.0.1:8080
```
Solution: Verify API is running and accessible. Check network connectivity and firewall rules.
**Timeout Errors:**
```
http_req_failed: 45.2% (4520 failures / 10000 requests)
```
Solution: Increase timeout values or reduce concurrent users. API may be overwhelmed.
**SSL/TLS Errors:**
```
Error: x509: certificate signed by unknown authority
```
Solution: Add `insecureSkipTLSVerify: true` or configure proper CA certificates.
**Rate Limiting:**
```
HTTP 429 Too Many Requests
```
Solution: Reduce request rate or increase rate limits on API server. Add backoff logic.
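Capped exponential backoff with jitter is the usual retry pattern; a minimal sketch (a generic helper, not a k6 or Artillery API; the base and cap values are illustrative):

```javascript
// Capped exponential backoff: 1s, 2s, 4s, ... up to maxDelayMs.
// Full jitter (a random fraction of the delay) avoids synchronized retries.
function backoffDelay(attempt, baseMs = 1000, maxDelayMs = 30000, jitter = Math.random) {
  const exp = Math.min(baseMs * 2 ** attempt, maxDelayMs);
  return Math.floor(exp * jitter());
}

// Deterministic check with jitter disabled (jitter = () => 1)
console.log(backoffDelay(0, 1000, 30000, () => 1)); // 1000
console.log(backoffDelay(5, 1000, 30000, () => 1)); // 30000 (capped)
```

Inside a k6 iteration, a check for status 429 followed by a `sleep()` of this duration (in seconds) is enough for a simple retry loop.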
**Memory Exhaustion:**
```
JavaScript heap out of memory
```
Solution: For Artillery, raise the Node.js heap limit: `NODE_OPTIONS=--max-old-space-size=4096 artillery run test.yml`. k6 is a Go binary and does not use Node.js; instead reduce VU count or set `discardResponseBodies: true` in its options.
**Authentication Failures:**
```
HTTP 401 Unauthorized
```
Solution: Verify API tokens are valid and not expired. Check authorization headers.
## Configuration Options
### k6 Options
```bash
--vus N # Number of virtual users (default: 1)
--duration Xm # Test duration (e.g., 10m, 30s)
--iterations N # Total iterations across all VUs
--stage "Xm:N" # Add load stage (duration:target)
--rps N # Max requests per second
--max-redirects N # Max HTTP redirects (default: 10)
--batch N # Max parallel batch requests
--batch-per-host N # Max parallel requests per host
--http-debug # Enable HTTP debug logging
--no-connection-reuse # Disable HTTP keep-alive
--throw # Throw errors on failed HTTP requests
--summary-trend-stats # Custom summary stats (e.g., "avg,p(95),p(99)")
--out json=file.json # Export results to JSON
--out influxdb=http://... # Export to InfluxDB
--out statsd # Export to StatsD
```
### Artillery Options
```bash
--target URL # Override target URL
--output FILE # Save results to JSON file
--overrides FILE # Override config with JSON file
--variables FILE # Load variables from JSON
--config FILE # Load config section from external file (use --overrides for inline changes)
--environment ENV # Select environment from config
--solo # Run test without publishing
--quiet # Suppress output
--plugins # List installed plugins
--dotenv FILE # Load environment from .env file
```
### Gatling Options
```bash
-s CLASS # Simulation class to run
-rf FOLDER # Results folder
-rd DESC # Run description
-nr # No reports generation
-ro FOLDER # Generate reports only
```
## Best Practices
### DO:
- Start with baseline test (low load) to verify test scripts work correctly
- Ramp up load gradually to identify inflection points
- Monitor backend resources (CPU, memory, database) during tests
- Use realistic think times (1-5 seconds) to simulate user behavior
- Test in staging environment that mirrors production capacity
- Run tests multiple times to establish consistency
- Document test configuration and results for historical comparison
- Use connection pooling and HTTP keep-alive for realistic scenarios
- Set appropriate timeouts (30-60 seconds for most APIs)
- Clean up test data after runs (especially for write-heavy tests)
### DON'T:
- Don't load test production without explicit permission and monitoring
- Don't ignore warmup period (JIT compilation, cache warming)
- Don't test from same datacenter as API (unrealistic latency)
- Don't use default test data (create realistic, varied datasets)
- Don't skip cool-down period (observe resource cleanup)
- Don't test only happy paths (include error scenarios)
- Don't ignore database connection limits
- Don't run tests during production deployments
- Don't compare results across different network conditions
- Don't test third-party APIs without permission
### TIPS:
- Use distributed load generation for tests > 1000 VUs
- Export metrics to monitoring systems (Prometheus, DataDog) for correlation
- Create custom dashboards showing load test progress in real-time
- Use percentiles (p95, p99) instead of averages for SLA targets
- Test cache warm vs cold scenarios separately
- Include authentication overhead in realistic flows
- Validate response bodies, not just status codes
- Use unique IDs per virtual user to avoid data conflicts
- Schedule tests during low-traffic periods
- Keep test scripts in version control with API code
## Related Commands
- `/api-mock-server` - Create mock API for testing without backend
- `/api-monitoring-dashboard` - Set up real-time monitoring during load tests
- `/api-cache-manager` - Configure caching to improve performance under load
- `/api-rate-limiter` - Implement rate limiting to protect APIs
- `/deployment-pipeline-orchestrator` - Integrate load tests into CI/CD pipeline
- `/kubernetes-deployment-creator` - Configure autoscaling based on load test findings
## Performance Considerations
### Test Environment Sizing
- **Client machine**: 1 VU ≈ 1-10 MB RAM, 0.01-0.1 CPU cores
- **Network bandwidth**: 1000 VUs ≈ 10-100 Mbps depending on payload size
- **k6 limits**: Single instance handles 30,000-40,000 VUs (depends on script complexity)
- **Artillery limits**: Single instance handles 5,000-10,000 RPS
- **Gatling limits**: Single instance handles 50,000+ VUs (JVM-based)
### Backend Resource Planning
- **Database connections**: Plan for peak concurrent users + connection pool overhead
- **CPU utilization**: Keep below 80% under sustained load (leave headroom for spikes)
- **Memory**: Monitor for leaks (heap should stabilize after warmup)
- **Network I/O**: Ensure network bandwidth exceeds expected throughput by 50%
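The connection-pool line follows from Little's Law: in-flight queries ≈ query throughput × average query time, padded with headroom for spikes and checkout overhead. A rough sizing sketch (function name and headroom factor are illustrative):

```javascript
// Rough DB connection pool sizing: concurrent queries = QPS x avg query time,
// multiplied by a headroom factor for spikes and pool checkout overhead.
function poolSize(queriesPerSec, avgQuerySec, headroom = 1.5) {
  return Math.ceil(queriesPerSec * avgQuerySec * headroom);
}

// 1000 QPS at 20ms per query -> 20 in flight -> pool of 30 with 1.5x headroom
console.log(poolSize(1000, 0.02)); // 30
```

Note that one slow query type dominates the result: at 1000 QPS, cutting average query time from 20ms to 10ms halves the required pool.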
### Optimization Strategies
- **HTTP keep-alive**: Reduces connection overhead by 50-80%
- **Response compression**: Reduces bandwidth by 60-80% for text responses
- **CDN caching**: Offloads 70-90% of static asset requests
- **Database indexing**: Can improve query performance by 10-100x
- **Connection pooling**: Reduces latency by 20-50ms per request
## Security Notes
### Testing Permissions
- Obtain written approval before load testing any environment
- Verify testing is allowed by API terms of service
- Use dedicated test accounts with limited privileges
- Test in isolated environments to prevent data corruption
### Credential Management
- Never hardcode API keys or passwords in test scripts
- Use environment variables: `export API_TOKEN=$(vault read -field=token secret/api)`
- Rotate test credentials regularly
- Use short-lived tokens (JWT with 1-hour expiry)
- Store sensitive data in secrets managers (Vault, AWS Secrets Manager)
### Data Privacy
- Use synthetic test data (never real customer PII)
- Anonymize logs and results before sharing
- Clean up test data immediately after test completion
- Encrypt results files containing sensitive information
### Network Security
- Run tests from trusted networks (avoid public WiFi)
- Use VPN when testing internal APIs
- Implement IP whitelisting for test traffic
- Monitor for anomalous traffic patterns during tests
## Troubleshooting Guide
### Issue: Inconsistent Results Between Runs
**Symptoms:** Response times vary by > 50% between identical test runs
**Diagnosis:**
- Check for background jobs or cron tasks running during test
- Verify database wasn't backed up during test
- Ensure no other load tests running concurrently
**Solution:**
- Schedule tests during known quiet periods
- Disable background tasks during test window
- Run multiple iterations and take median results
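Taking the median across runs can be scripted directly; a small sketch (the run values are illustrative):

```javascript
// Median of per-run p95 values; more robust to one noisy run than the mean.
function median(values) {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// p95 latencies (ms) from five identical runs, one of them noisy
console.log(median([289, 312, 640, 295, 301])); // 301
```

The mean of the same five runs would be 367ms, dragged up by the single 640ms outlier, which is exactly the distortion the median avoids.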
### Issue: Low Throughput Despite Low CPU/Memory
**Symptoms:** API handling only 100 RPS despite 20% CPU usage
**Diagnosis:**
- Check network bandwidth utilization
- Examine database connection pool exhaustion
- Look for synchronous I/O blocking (file system, external API calls)
**Solution:**
- Increase connection pool size
- Implement async I/O for external calls
- Add caching layer (Redis) for frequently accessed data
### Issue: Error Rate Increases Under Load
**Symptoms:** 0.1% errors at 100 RPS, 5% errors at 500 RPS
**Diagnosis:**
- Database deadlocks or lock contention
- Race conditions in concurrent code paths
- Resource exhaustion (file descriptors, sockets)
**Solution:**
- Add database query logging to identify slow queries
- Implement optimistic locking or queue-based processing
- Increase file descriptor limits: `ulimit -n 65536`
### Issue: Memory Leak Detected
**Symptoms:** Memory usage grows continuously without stabilizing
**Diagnosis:**
- Heap dump analysis shows growing object count
- GC frequency increases over time
- API becomes unresponsive after extended load
**Solution:**
- Profile application with heap analyzer (Chrome DevTools, VisualVM)
- Check for unclosed database connections or file handles
- Review event listener registration (potential memory leak source)
### Issue: Test Client Crashes
**Symptoms:** k6/Artillery process terminated with OOM error
**Diagnosis:**
- Too many VUs for available client memory
- Large response bodies consuming memory
- Results export causing memory pressure
**Solution:**
- Reduce VU count or distribute across multiple machines
- For Artillery, increase the Node.js heap: `NODE_OPTIONS=--max-old-space-size=8192`
- Disable detailed logging: `--quiet` (Artillery) or export only the summary (`--summary-export` in k6)
## Version History
- **1.0.0** (2025-10-11) - Initial release with k6, Artillery, and Gatling support
- **1.1.0** (2025-10-15) - Added custom metrics and Prometheus integration
- **1.2.0** (2025-10-20) - Distributed load testing support for high-scale scenarios