Test Quality Standards

Version: 2.0
Source: Bootstrap-002 Test Strategy Development
Last Updated: 2025-10-18

This document defines quality criteria, coverage targets, and best practices for test development.


Test Quality Checklist

For every test, ensure compliance with these quality standards:

Structure

  • Test name clearly describes scenario
  • Setup is minimal and focused
  • Single concept tested per test
  • Clear error messages with context

Execution

  • Cleanup handled (defer, t.Cleanup)
  • No hard-coded paths or values
  • Deterministic (no randomness; see the clock-injection sketch after this list)
  • Fast execution (<100ms for unit tests)
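
For determinism, inject sources of nondeterminism (clocks, random seeds) instead of reading them inline. A minimal sketch, assuming a hypothetical Scheduler that takes its clock as a dependency (uses only the standard library time and testing packages):

// Scheduler is a hypothetical type that takes its clock as a
// dependency so tests can substitute a fixed time.
type Scheduler struct {
    now func() time.Time
}

// NextRun returns the next scheduled run, one hour from "now".
func (s *Scheduler) NextRun() time.Time {
    return s.now().Add(time.Hour)
}

func TestScheduler_NextRun(t *testing.T) {
    // A fixed clock makes the result reproducible on every run.
    fixed := time.Date(2025, 10, 18, 12, 0, 0, 0, time.UTC)
    s := &Scheduler{now: func() time.Time { return fixed }}

    got := s.NextRun()
    if want := fixed.Add(time.Hour); !got.Equal(want) {
        t.Errorf("NextRun() = %v, want %v", got, want)
    }
}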

Coverage

  • Tests both happy and error paths
  • Uses test helpers where appropriate
  • Follows documented patterns
  • Includes edge cases

CLI Test Additional Checklist

When testing CLI commands, also ensure the following; a sketch applying these points appears after the list:

  • Command flags reset between tests
  • Output captured properly (stdout/stderr)
  • Environment variables reset (if used)
  • Working directory restored (if changed)
  • Temporary files cleaned up
  • No dependency on external binaries (unless integration test)
  • Tests both happy path and error cases
  • Help text validated (if command has help)
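
A minimal sketch applying these points. It assumes the command's entry point is a hypothetical run(args, stdout, stderr) function that takes injected arguments and writers rather than reading package-level globals (imports: bytes, fmt, io, os, testing):

// run is a hypothetical CLI entry point with injected arguments
// and output writers instead of os.Args/os.Stdout.
func run(args []string, stdout, stderr io.Writer) error {
    fmt.Fprintln(stdout, "hello,", args[0])
    return nil
}

func TestRun_HappyPath(t *testing.T) {
    // Work in a fresh temp dir; t.TempDir is removed automatically.
    origDir, err := os.Getwd()
    if err != nil {
        t.Fatal(err)
    }
    t.Cleanup(func() { os.Chdir(origDir) }) // restore working directory
    if err := os.Chdir(t.TempDir()); err != nil {
        t.Fatal(err)
    }

    // t.Setenv sets the variable and restores the original on cleanup.
    t.Setenv("APP_VERBOSE", "1")

    // Capture stdout/stderr instead of writing to the real streams.
    var stdout, stderr bytes.Buffer
    if err := run([]string{"world"}, &stdout, &stderr); err != nil {
        t.Fatalf("run() error = %v, stderr: %s", err, stderr.String())
    }
    if got, want := stdout.String(), "hello, world\n"; got != want {
        t.Errorf("stdout = %q, want %q", got, want)
    }
}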

Coverage Targets

By Category

Different code categories require different coverage levels based on criticality:

Category         Target Coverage   Priority   Rationale
Error Handling   80-90%            P1         Critical for reliability
Business Logic   75-85%            P2         Core functionality
CLI Handlers     70-80%            P2         User-facing behavior
Integration      70-80%            P3         End-to-end validation
Utilities        60-70%            P3         Supporting functions
Infrastructure   40-60%            P4         Best effort

Overall Project Target: 75-80%
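
To spot-check a single package against its category target (the package path below is illustrative):

# Per-package coverage spot-check
go test -cover ./internal/validation/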

Priority Decision Tree

Is function critical to core functionality?
├─ YES: Is it error handling or validation?
│  ├─ YES: Priority 1 (80%+ coverage target)
│  └─ NO: Is it business logic?
│     ├─ YES: Priority 2 (75%+ coverage)
│     └─ NO: Priority 3 (60%+ coverage)
└─ NO: Is it infrastructure/initialization?
   ├─ YES: Priority 4 (test if easy, skip if hard)
   └─ NO: Priority 5 (skip)

Test Naming Conventions

Unit Tests

// Format: TestFunctionName_Scenario
TestValidateInput_NilInput
TestValidateInput_EmptyInput
TestProcessData_ValidFormat

Table-Driven Tests

// Format: TestFunctionName (scenarios in table)
TestValidateInput  // Table contains: "nil input", "empty input", etc.
TestProcessData    // Table contains: "valid format", "invalid format", etc.
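
A minimal test in this style, assuming a hypothetical ValidateInput(string) error as the function under test; each table entry becomes a named subtest via t.Run:

func TestValidateInput(t *testing.T) {
    tests := []struct {
        name    string
        input   string
        wantErr bool
    }{
        {"empty input", "", true},
        {"valid input", "abc", false},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            err := ValidateInput(tt.input) // hypothetical function under test
            if (err != nil) != tt.wantErr {
                t.Errorf("ValidateInput(%q) error = %v, wantErr %v",
                    tt.input, err, tt.wantErr)
            }
        })
    }
}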

Integration Tests

// Format: TestHandler_Scenario or TestIntegration_Feature
TestQueryTools_SuccessfulQuery
TestGetSessionStats_ErrorHandling
TestIntegration_CompleteWorkflow

Test Structure Best Practices

Setup-Execute-Assert Pattern

func TestFunction(t *testing.T) {
    // Setup: Create test data, dependencies, and the expected result
    input := createTestInput()
    mock := createMockDependency()
    expected := createExpectedResult() // hypothetical helper, like the setup helpers above

    // Execute: Call the function under test
    result, err := Function(input, mock)

    // Assert: Verify expected behavior
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }
    if result != expected {
        t.Errorf("expected %v, got %v", expected, result)
    }
}

Cleanup Handling

func TestFunction(t *testing.T) {
    originalValue := globalVar

    // Option 1: defer restores state when the test function returns
    // defer func() { globalVar = originalValue }()

    // Option 2: t.Cleanup (preferred; also runs after subtests and
    // can be registered from helper functions)
    t.Cleanup(func() {
        globalVar = originalValue
    })

    // Test logic...
}

Helper Functions

// Mark as helper for better error reporting
func createTestInput(t *testing.T) *Input {
    t.Helper()  // Errors will point to caller, not this line

    return &Input{
        Field1: "test",
        Field2: 42,
    }
}

Error Message Guidelines

Good Error Messages

// Include context and actual values
if result != expected {
    t.Errorf("Function() = %v, expected %v", result, expected)
}

// Include relevant state
if len(results) != expectedCount {
    t.Errorf("got %d results, expected %d: %+v",
        len(results), expectedCount, results)
}

Poor Error Messages

// Avoid: No context
if err != nil {
    t.Fatal("error occurred")
}

// Avoid: Missing actual values
if !valid {
    t.Error("validation failed")
}

Test Performance Standards

Unit Tests

  • Target: <100ms per test
  • Maximum: <500ms per test
  • If slower: Consider mocking or refactoring

Integration Tests

  • Target: <1s per test
  • Maximum: <5s per test
  • If slower: Use testing.Short() to skip in short mode

func TestIntegration_SlowOperation(t *testing.T) {
    if testing.Short() {
        t.Skip("skipping slow integration test in short mode")
    }
    // Test logic...
}

Running Tests

# Fast tests only
go test -short ./...

# All tests with timeout
go test -timeout 5m ./...

Test Data Management

Inline Test Data

For small, simple data:

tests := []struct {
    name  string
    input string
}{
    {"empty", ""},
    {"single", "a"},
    {"multiple", "abc"},
}

Fixture Files

For complex data structures:

func loadTestFixture(t *testing.T, name string) []byte {
    t.Helper()
    data, err := os.ReadFile(filepath.Join("testdata", name))
    if err != nil {
        t.Fatalf("failed to load fixture %s: %v", name, err)
    }
    return data
}

Golden Files

For output validation:

// Defined once per package with the standard flag package
var update = flag.Bool("update", false, "rewrite golden files with current output")

func TestFormatOutput(t *testing.T) {
    output := formatOutput(testData)

    goldenPath := filepath.Join("testdata", "expected_output.golden")

    if *update {
        if err := os.WriteFile(goldenPath, []byte(output), 0644); err != nil {
            t.Fatalf("failed to update golden file: %v", err)
        }
    }

    expected, err := os.ReadFile(goldenPath)
    if err != nil {
        t.Fatalf("failed to read golden file: %v", err)
    }
    if string(expected) != output {
        t.Errorf("output mismatch\ngot:\n%s\nwant:\n%s", output, expected)
    }
}
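
With the flag defined as above (the -update name is this snippet's convention, not a built-in go test flag), golden files are regenerated with a one-off run and then verified normally:

# Regenerate the golden file, then verify
go test -run TestFormatOutput -update
go test -run TestFormatOutput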

Common Anti-Patterns to Avoid

1. Testing Implementation Instead of Behavior

// Bad: Tests internal implementation
func TestFunction(t *testing.T) {
    obj := New()
    if obj.internalField != "expected" {  // Don't test internals
        t.Error("internal field wrong")
    }
}

// Good: Tests observable behavior
func TestFunction(t *testing.T) {
    obj := New()
    result := obj.PublicMethod()  // Test public interface
    if result != expected {
        t.Error("unexpected result")
    }
}

2. Overly Complex Test Setup

// Bad: Complex setup obscures test intent
func TestFunction(t *testing.T) {
    // 50 lines of setup...
    result := Function(complex, setup, params)
    // Assert...
}

// Good: Use helper functions
func TestFunction(t *testing.T) {
    setup := createTestSetup(t)  // Helper abstracts complexity
    result := Function(setup)
    // Assert...
}

3. Testing Multiple Concepts in One Test

// Bad: Tests multiple unrelated things
func TestValidation(t *testing.T) {
    // Tests format validation
    // Tests length validation
    // Tests encoding validation
    // Tests error handling
}

// Good: Separate tests for each concept
func TestValidation_Format(t *testing.T) { /*...*/ }
func TestValidation_Length(t *testing.T) { /*...*/ }
func TestValidation_Encoding(t *testing.T) { /*...*/ }
func TestValidation_ErrorHandling(t *testing.T) { /*...*/ }

4. Shared State Between Tests

// Bad: Tests depend on execution order
var sharedState string

func TestFirst(t *testing.T) {
    sharedState = "initialized"
}

func TestSecond(t *testing.T) {
    // Breaks if TestFirst doesn't run first
    if sharedState != "initialized" { /*...*/ }
}

// Good: Each test is independent
func TestFirst(t *testing.T) {
    state := "initialized"  // Local state
    // Test...
}

func TestSecond(t *testing.T) {
    state := setupState()  // Creates own state
    // Test...
}

Code Review Checklist for Tests

When reviewing test code, verify:

  • Tests are independent (can run in any order)
  • Test names are descriptive
  • Happy path and error paths both covered
  • Edge cases included
  • No magic numbers or strings (use constants)
  • Cleanup handled properly
  • Error messages provide context
  • Tests are reasonably fast
  • No commented-out test code
  • Follows established patterns in codebase

Continuous Improvement

Track Test Metrics

Record for each test batch:

Date: 2025-10-18
Batch: Validation error paths (4 tests)
Pattern: Error Path + Table-Driven
Time: 50 min (estimated 60 min) → 17% faster
Coverage: internal/validation 57.9% → 75.2% (+17.3%)
Total coverage: 72.3% → 73.5% (+1.2%)
Efficiency: 0.3% per test
Issues: None
Lessons: Table-driven error tests very efficient

Regular Coverage Analysis

# Weekly coverage review
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out | tail -20

# Save a snapshot for week-over-week comparison
go tool cover -func=coverage.out > coverage-this-week.txt

# Identify degradation
diff coverage-last-week.txt coverage-this-week.txt

Test Suite Health

Monitor:

  • Total test count (growing)
  • Test execution time (stable or decreasing)
  • Coverage percentage (stable or increasing)
  • Flaky test rate (near zero; see the rerun command below)
  • Test maintenance time (decreasing)
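
One way to keep the flaky rate near zero is to rerun the suite repeatedly with the race detector and treat any intermittent failure as a defect:

# Run each test 10 times with the race detector to surface flakiness
go test -race -count=10 ./...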

Source: Bootstrap-002 Test Strategy Development
Framework: BAIME (Bootstrapped AI Methodology Engineering)
Status: Production-ready, validated through 4 iterations