Test Quality Standards

Version: 2.0
Source: Bootstrap-002 Test Strategy Development
Last Updated: 2025-10-18

This document defines quality criteria, coverage targets, and best practices for test development.


Test Quality Checklist

For every test, ensure compliance with these quality standards:

Structure

  • Test name clearly describes scenario
  • Setup is minimal and focused
  • Single concept tested per test
  • Clear error messages with context

Execution

  • Cleanup handled (defer, t.Cleanup)
  • No hard-coded paths or values
  • Deterministic (no randomness; see the clock-injection sketch after this list)
  • Fast execution (<100ms for unit tests)
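
For determinism, inject sources of nondeterminism (clocks, random seeds) instead of reading them inline. A minimal sketch, assuming a hypothetical Scheduler that takes its clock as a dependency (uses only the standard library time and testing packages):

// Scheduler is a hypothetical type that takes its clock as a
// dependency so tests can substitute a fixed time.
type Scheduler struct {
    now func() time.Time
}

// NextRun returns the next scheduled run, one hour from "now".
func (s *Scheduler) NextRun() time.Time {
    return s.now().Add(time.Hour)
}

func TestScheduler_NextRun(t *testing.T) {
    // A fixed clock makes the result reproducible on every run.
    fixed := time.Date(2025, 10, 18, 12, 0, 0, 0, time.UTC)
    s := &Scheduler{now: func() time.Time { return fixed }}

    got := s.NextRun()
    if want := fixed.Add(time.Hour); !got.Equal(want) {
        t.Errorf("NextRun() = %v, want %v", got, want)
    }
}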

Coverage

  • Tests both happy and error paths
  • Uses test helpers where appropriate
  • Follows documented patterns
  • Includes edge cases

CLI Test Additional Checklist

When testing CLI commands, also ensure the following; a sketch applying these points appears after the list:

  • Command flags reset between tests
  • Output captured properly (stdout/stderr)
  • Environment variables reset (if used)
  • Working directory restored (if changed)
  • Temporary files cleaned up
  • No dependency on external binaries (unless integration test)
  • Tests both happy path and error cases
  • Help text validated (if command has help)
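
A minimal sketch applying these points. It assumes the command's entry point is a hypothetical run(args, stdout, stderr) function that takes injected arguments and writers rather than reading package-level globals (imports: bytes, fmt, io, os, testing):

// run is a hypothetical CLI entry point with injected arguments
// and output writers instead of os.Args/os.Stdout.
func run(args []string, stdout, stderr io.Writer) error {
    fmt.Fprintln(stdout, "hello,", args[0])
    return nil
}

func TestRun_HappyPath(t *testing.T) {
    // Work in a fresh temp dir; t.TempDir is removed automatically.
    origDir, err := os.Getwd()
    if err != nil {
        t.Fatal(err)
    }
    t.Cleanup(func() { os.Chdir(origDir) }) // restore working directory
    if err := os.Chdir(t.TempDir()); err != nil {
        t.Fatal(err)
    }

    // t.Setenv sets the variable and restores the original on cleanup.
    t.Setenv("APP_VERBOSE", "1")

    // Capture stdout/stderr instead of writing to the real streams.
    var stdout, stderr bytes.Buffer
    if err := run([]string{"world"}, &stdout, &stderr); err != nil {
        t.Fatalf("run() error = %v, stderr: %s", err, stderr.String())
    }
    if got, want := stdout.String(), "hello, world\n"; got != want {
        t.Errorf("stdout = %q, want %q", got, want)
    }
}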

Coverage Targets

By Category

Different code categories require different coverage levels based on criticality:

Category         Target Coverage   Priority   Rationale
Error Handling   80-90%            P1         Critical for reliability
Business Logic   75-85%            P2         Core functionality
CLI Handlers     70-80%            P2         User-facing behavior
Integration      70-80%            P3         End-to-end validation
Utilities        60-70%            P3         Supporting functions
Infrastructure   40-60%            P4         Best effort

Overall Project Target: 75-80%
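
To spot-check a single package against its category target (the package path below is illustrative):

# Per-package coverage spot-check
go test -cover ./internal/validation/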

Priority Decision Tree

Is function critical to core functionality?
├─ YES: Is it error handling or validation?
│  ├─ YES: Priority 1 (80%+ coverage target)
│  └─ NO: Is it business logic?
│     ├─ YES: Priority 2 (75%+ coverage)
│     └─ NO: Priority 3 (60%+ coverage)
└─ NO: Is it infrastructure/initialization?
   ├─ YES: Priority 4 (test if easy, skip if hard)
   └─ NO: Priority 5 (skip)

Test Naming Conventions

Unit Tests

// Format: TestFunctionName_Scenario
TestValidateInput_NilInput
TestValidateInput_EmptyInput
TestProcessData_ValidFormat

Table-Driven Tests

// Format: TestFunctionName (scenarios in table)
TestValidateInput  // Table contains: "nil input", "empty input", etc.
TestProcessData    // Table contains: "valid format", "invalid format", etc.
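
A minimal test in this style, assuming a hypothetical ValidateInput(string) error as the function under test; each table entry becomes a named subtest via t.Run:

func TestValidateInput(t *testing.T) {
    tests := []struct {
        name    string
        input   string
        wantErr bool
    }{
        {"empty input", "", true},
        {"valid input", "abc", false},
    }
    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            err := ValidateInput(tt.input) // hypothetical function under test
            if (err != nil) != tt.wantErr {
                t.Errorf("ValidateInput(%q) error = %v, wantErr %v",
                    tt.input, err, tt.wantErr)
            }
        })
    }
}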

Integration Tests

// Format: TestHandler_Scenario or TestIntegration_Feature
TestQueryTools_SuccessfulQuery
TestGetSessionStats_ErrorHandling
TestIntegration_CompleteWorkflow

Test Structure Best Practices

Setup-Execute-Assert Pattern

func TestFunction(t *testing.T) {
    // Setup: Create test data, dependencies, and the expected result
    input := createTestInput()
    mock := createMockDependency()
    expected := createExpectedResult() // hypothetical helper, like the setup helpers above

    // Execute: Call the function under test
    result, err := Function(input, mock)

    // Assert: Verify expected behavior
    if err != nil {
        t.Fatalf("unexpected error: %v", err)
    }
    if result != expected {
        t.Errorf("expected %v, got %v", expected, result)
    }
}

Cleanup Handling

func TestFunction(t *testing.T) {
    originalValue := globalVar

    // Option 1: defer restores state when the test function returns
    // defer func() { globalVar = originalValue }()

    // Option 2: t.Cleanup (preferred; also runs after subtests and
    // can be registered from helper functions)
    t.Cleanup(func() {
        globalVar = originalValue
    })

    // Test logic...
}

Helper Functions

// Mark as helper for better error reporting
func createTestInput(t *testing.T) *Input {
    t.Helper()  // Errors will point to caller, not this line

    return &Input{
        Field1: "test",
        Field2: 42,
    }
}

Error Message Guidelines

Good Error Messages

// Include context and actual values
if result != expected {
    t.Errorf("Function() = %v, expected %v", result, expected)
}

// Include relevant state
if len(results) != expectedCount {
    t.Errorf("got %d results, expected %d: %+v",
        len(results), expectedCount, results)
}

Poor Error Messages

// Avoid: No context
if err != nil {
    t.Fatal("error occurred")
}

// Avoid: Missing actual values
if !valid {
    t.Error("validation failed")
}

Test Performance Standards

Unit Tests

  • Target: <100ms per test
  • Maximum: <500ms per test
  • If slower: Consider mocking or refactoring

Integration Tests

  • Target: <1s per test
  • Maximum: <5s per test
  • If slower: Use testing.Short() to skip in short mode

func TestIntegration_SlowOperation(t *testing.T) {
    if testing.Short() {
        t.Skip("skipping slow integration test in short mode")
    }
    // Test logic...
}

Running Tests

# Fast tests only
go test -short ./...

# All tests with timeout
go test -timeout 5m ./...

Test Data Management

Inline Test Data

For small, simple data:

tests := []struct {
    name  string
    input string
}{
    {"empty", ""},
    {"single", "a"},
    {"multiple", "abc"},
}

Fixture Files

For complex data structures:

func loadTestFixture(t *testing.T, name string) []byte {
    t.Helper()
    data, err := os.ReadFile(filepath.Join("testdata", name))
    if err != nil {
        t.Fatalf("failed to load fixture %s: %v", name, err)
    }
    return data
}

Golden Files

For output validation:

// Defined once per package with the standard flag package
var update = flag.Bool("update", false, "rewrite golden files with current output")

func TestFormatOutput(t *testing.T) {
    output := formatOutput(testData)

    goldenPath := filepath.Join("testdata", "expected_output.golden")

    if *update {
        if err := os.WriteFile(goldenPath, []byte(output), 0644); err != nil {
            t.Fatalf("failed to update golden file: %v", err)
        }
    }

    expected, err := os.ReadFile(goldenPath)
    if err != nil {
        t.Fatalf("failed to read golden file: %v", err)
    }
    if string(expected) != output {
        t.Errorf("output mismatch\ngot:\n%s\nwant:\n%s", output, expected)
    }
}
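
With the flag defined as above (the -update name is this snippet's convention, not a built-in go test flag), golden files are regenerated with a one-off run and then verified normally:

# Regenerate the golden file, then verify
go test -run TestFormatOutput -update
go test -run TestFormatOutput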

Common Anti-Patterns to Avoid

1. Testing Implementation Instead of Behavior

// Bad: Tests internal implementation
func TestFunction(t *testing.T) {
    obj := New()
    if obj.internalField != "expected" {  // Don't test internals
        t.Error("internal field wrong")
    }
}

// Good: Tests observable behavior
func TestFunction(t *testing.T) {
    obj := New()
    result := obj.PublicMethod()  // Test public interface
    if result != expected {
        t.Error("unexpected result")
    }
}

2. Overly Complex Test Setup

// Bad: Complex setup obscures test intent
func TestFunction(t *testing.T) {
    // 50 lines of setup...
    result := Function(complex, setup, params)
    // Assert...
}

// Good: Use helper functions
func TestFunction(t *testing.T) {
    setup := createTestSetup(t)  // Helper abstracts complexity
    result := Function(setup)
    // Assert...
}

3. Testing Multiple Concepts in One Test

// Bad: Tests multiple unrelated things
func TestValidation(t *testing.T) {
    // Tests format validation
    // Tests length validation
    // Tests encoding validation
    // Tests error handling
}

// Good: Separate tests for each concept
func TestValidation_Format(t *testing.T) { /*...*/ }
func TestValidation_Length(t *testing.T) { /*...*/ }
func TestValidation_Encoding(t *testing.T) { /*...*/ }
func TestValidation_ErrorHandling(t *testing.T) { /*...*/ }

4. Shared State Between Tests

// Bad: Tests depend on execution order
var sharedState string

func TestFirst(t *testing.T) {
    sharedState = "initialized"
}

func TestSecond(t *testing.T) {
    // Breaks if TestFirst doesn't run first
    if sharedState != "initialized" { /*...*/ }
}

// Good: Each test is independent
func TestFirst(t *testing.T) {
    state := "initialized"  // Local state
    // Test...
}

func TestSecond(t *testing.T) {
    state := setupState()  // Creates own state
    // Test...
}

Code Review Checklist for Tests

When reviewing test code, verify:

  • Tests are independent (can run in any order)
  • Test names are descriptive
  • Happy path and error paths both covered
  • Edge cases included
  • No magic numbers or strings (use constants)
  • Cleanup handled properly
  • Error messages provide context
  • Tests are reasonably fast
  • No commented-out test code
  • Follows established patterns in codebase

Continuous Improvement

Track Test Metrics

Record for each test batch:

Date: 2025-10-18
Batch: Validation error paths (4 tests)
Pattern: Error Path + Table-Driven
Time: 50 min (estimated 60 min) → 17% faster
Coverage: internal/validation 57.9% → 75.2% (+17.3%)
Total coverage: 72.3% → 73.5% (+1.2%)
Efficiency: 0.3% per test
Issues: None
Lessons: Table-driven error tests very efficient

Regular Coverage Analysis

# Weekly coverage review
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out | tail -20

# Save a snapshot for week-over-week comparison
go tool cover -func=coverage.out > coverage-this-week.txt

# Identify degradation
diff coverage-last-week.txt coverage-this-week.txt

Test Suite Health

Monitor:

  • Total test count (growing)
  • Test execution time (stable or decreasing)
  • Coverage percentage (stable or increasing)
  • Flaky test rate (near zero; see the rerun command below)
  • Test maintenance time (decreasing)
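
One way to keep the flaky rate near zero is to rerun the suite repeatedly with the race detector and treat any intermittent failure as a defect:

# Run each test 10 times with the race detector to surface flakiness
go test -race -count=10 ./...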

Source: Bootstrap-002 Test Strategy Development
Framework: BAIME (Bootstrapped AI Methodology Engineering)
Status: Production-ready, validated through 4 iterations