# Test Quality Standards

**Version**: 2.0
**Source**: Bootstrap-002 Test Strategy Development
**Last Updated**: 2025-10-18

This document defines quality criteria, coverage targets, and best practices for test development.

---

## Test Quality Checklist

For every test, ensure compliance with these quality standards:

### Structure

- [ ] Test name clearly describes the scenario
- [ ] Setup is minimal and focused
- [ ] Single concept tested per test
- [ ] Clear error messages with context

### Execution

- [ ] Cleanup handled (`defer`, `t.Cleanup`)
- [ ] No hard-coded paths or values
- [ ] Deterministic (no randomness)
- [ ] Fast execution (<100ms for unit tests)

### Coverage

- [ ] Tests both happy and error paths
- [ ] Uses test helpers where appropriate
- [ ] Follows documented patterns
- [ ] Includes edge cases

---

## CLI Test Additional Checklist

When testing CLI commands, also ensure:

- [ ] Command flags reset between tests
- [ ] Output captured properly (stdout/stderr)
- [ ] Environment variables reset (if used)
- [ ] Working directory restored (if changed)
- [ ] Temporary files cleaned up
- [ ] No dependency on external binaries (unless integration test)
- [ ] Tests both happy path and error cases
- [ ] Help text validated (if command has help)

---

## Coverage Target Goals

### By Category

Different code categories require different coverage levels based on criticality:

| Category | Target Coverage | Priority | Rationale |
|----------|----------------|----------|-----------|
| Error Handling | 80-90% | P1 | Critical for reliability |
| Business Logic | 75-85% | P2 | Core functionality |
| CLI Handlers | 70-80% | P2 | User-facing behavior |
| Integration | 70-80% | P3 | End-to-end validation |
| Utilities | 60-70% | P3 | Supporting functions |
| Infrastructure | 40-60% | P4 | Best effort |

**Overall Project Target**: 75-80%

### Priority Decision Tree

```
Is function critical to core functionality?
├─ YES: Is it error handling or validation?
│   ├─ YES: Priority 1 (80%+ coverage target)
│   └─ NO: Is it business logic?
│       ├─ YES: Priority 2 (75%+ coverage)
│       └─ NO: Priority 3 (60%+ coverage)
└─ NO: Is it infrastructure/initialization?
    ├─ YES: Priority 4 (test if easy, skip if hard)
    └─ NO: Priority 5 (skip)
```

---

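These targets can be checked mechanically against `go tool cover -func` output. A sketch that flags entries below the 75% overall target (the file paths and percentages below are a hypothetical excerpt, written to a temp file so the filter itself is runnable):

```shell
# Hypothetical excerpt of `go tool cover -func=coverage.out` output.
cat > /tmp/cover-func.txt <<'EOF'
example.com/app/internal/validation/validate.go:10:	ValidateInput	82.4%
example.com/app/internal/util/strings.go:22:	Trim	55.0%
total:	(statements)	73.5%
EOF

# Print every entry whose coverage falls below the 75% target.
awk '{ pct = $NF; sub(/%/, "", pct); if (pct + 0 < 75) print "below target:", $0 }' /tmp/cover-func.txt
```

In a real project, pipe `go tool cover -func=coverage.out` directly into the `awk` filter.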
## Test Naming Conventions

### Unit Tests

```go
// Format: TestFunctionName_Scenario
TestValidateInput_NilInput
TestValidateInput_EmptyInput
TestProcessData_ValidFormat
```

### Table-Driven Tests

```go
// Format: TestFunctionName (scenarios in table)
TestValidateInput // Table contains: "nil input", "empty input", etc.
TestProcessData   // Table contains: "valid format", "invalid format", etc.
```

### Integration Tests

```go
// Format: TestHandler_Scenario or TestIntegration_Feature
TestQueryTools_SuccessfulQuery
TestGetSessionStats_ErrorHandling
TestIntegration_CompleteWorkflow
```

---

## Test Structure Best Practices

### Setup-Execute-Assert Pattern

```go
func TestFunction(t *testing.T) {
	// Setup: Create test data and dependencies
	input := createTestInput()
	mock := createMockDependency()

	// Execute: Call the function under test
	result, err := Function(input, mock)

	// Assert: Verify expected behavior
	if err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	if result != expected {
		t.Errorf("expected %v, got %v", expected, result)
	}
}
```

### Cleanup Handling

```go
func TestFunction(t *testing.T) {
	originalValue := globalVar

	// Using defer for cleanup:
	defer func() { globalVar = originalValue }()

	// Or using t.Cleanup (preferred; pick one, not both):
	t.Cleanup(func() {
		globalVar = originalValue
	})

	// Test logic...
}
```

### Helper Functions

```go
// Mark as helper for better error reporting
func createTestInput(t *testing.T) *Input {
	t.Helper() // Errors will point to caller, not this line

	return &Input{
		Field1: "test",
		Field2: 42,
	}
}
```

---

## Error Message Guidelines

### Good Error Messages

```go
// Include context and actual values
if result != expected {
	t.Errorf("Function() = %v, expected %v", result, expected)
}

// Include relevant state
if len(results) != expectedCount {
	t.Errorf("got %d results, expected %d: %+v",
		len(results), expectedCount, results)
}
```

### Poor Error Messages

```go
// Avoid: No context
if err != nil {
	t.Fatal("error occurred")
}

// Avoid: Missing actual values
if !valid {
	t.Error("validation failed")
}
```

---

## Test Performance Standards

### Unit Tests

- **Target**: <100ms per test
- **Maximum**: <500ms per test
- **If slower**: Consider mocking or refactoring

### Integration Tests

- **Target**: <1s per test
- **Maximum**: <5s per test
- **If slower**: Use `testing.Short()` to skip in short mode

```go
func TestIntegration_SlowOperation(t *testing.T) {
	if testing.Short() {
		t.Skip("skipping slow integration test in short mode")
	}
	// Test logic...
}
```

### Running Tests

```bash
# Fast tests only
go test -short ./...

# All tests with timeout
go test -timeout 5m ./...
```

---

## Test Data Management

### Inline Test Data

For small, simple data:

```go
tests := []struct {
	name  string
	input string
}{
	{"empty", ""},
	{"single", "a"},
	{"multiple", "abc"},
}
```

### Fixture Files

For complex data structures:

```go
func loadTestFixture(t *testing.T, name string) []byte {
	t.Helper()
	data, err := os.ReadFile(filepath.Join("testdata", name))
	if err != nil {
		t.Fatalf("failed to load fixture %s: %v", name, err)
	}
	return data
}
```

### Golden Files

For output validation:

```go
// -update regenerates golden files: go test -run TestFormatOutput -update
var update = flag.Bool("update", false, "update golden files")

func TestFormatOutput(t *testing.T) {
	output := formatOutput(testData)

	goldenPath := filepath.Join("testdata", "expected_output.golden")

	if *update {
		if err := os.WriteFile(goldenPath, []byte(output), 0644); err != nil {
			t.Fatalf("failed to update golden file: %v", err)
		}
	}

	expected, err := os.ReadFile(goldenPath)
	if err != nil {
		t.Fatalf("failed to read golden file: %v", err)
	}
	if string(expected) != output {
		t.Errorf("output mismatch\ngot:\n%s\nwant:\n%s", output, expected)
	}
}
```

---

## Common Anti-Patterns to Avoid

### 1. Testing Implementation Instead of Behavior

```go
// Bad: Tests internal implementation
func TestFunction(t *testing.T) {
	obj := New()
	if obj.internalField != "expected" { // Don't test internals
		t.Error("internal field wrong")
	}
}

// Good: Tests observable behavior
func TestFunction(t *testing.T) {
	obj := New()
	result := obj.PublicMethod() // Test public interface
	if result != expected {
		t.Errorf("PublicMethod() = %v, expected %v", result, expected)
	}
}
```

### 2. Overly Complex Test Setup

```go
// Bad: Complex setup obscures test intent
func TestFunction(t *testing.T) {
	// 50 lines of setup...
	result := Function(complex, setup, params)
	// Assert...
}

// Good: Use helper functions
func TestFunction(t *testing.T) {
	setup := createTestSetup(t) // Helper abstracts complexity
	result := Function(setup)
	// Assert...
}
```

### 3. Testing Multiple Concepts in One Test

```go
// Bad: Tests multiple unrelated things
func TestValidation(t *testing.T) {
	// Tests format validation
	// Tests length validation
	// Tests encoding validation
	// Tests error handling
}

// Good: Separate tests for each concept
func TestValidation_Format(t *testing.T) { /*...*/ }
func TestValidation_Length(t *testing.T) { /*...*/ }
func TestValidation_Encoding(t *testing.T) { /*...*/ }
func TestValidation_ErrorHandling(t *testing.T) { /*...*/ }
```

### 4. Shared State Between Tests

```go
// Bad: Tests depend on execution order
var sharedState string

func TestFirst(t *testing.T) {
	sharedState = "initialized"
}

func TestSecond(t *testing.T) {
	// Breaks if TestFirst doesn't run first
	if sharedState != "initialized" { /*...*/ }
}

// Good: Each test is independent
func TestFirst(t *testing.T) {
	state := "initialized" // Local state
	// Test...
}

func TestSecond(t *testing.T) {
	state := setupState() // Creates own state
	// Test...
}
```

---

## Code Review Checklist for Tests

When reviewing test code, verify:

- [ ] Tests are independent (can run in any order)
- [ ] Test names are descriptive
- [ ] Happy path and error paths both covered
- [ ] Edge cases included
- [ ] No magic numbers or strings (use constants)
- [ ] Cleanup handled properly
- [ ] Error messages provide context
- [ ] Tests are reasonably fast
- [ ] No commented-out test code
- [ ] Follows established patterns in codebase

---

## Continuous Improvement

### Track Test Metrics

Record for each test batch:

```
Date: 2025-10-18
Batch: Validation error paths (4 tests)
Pattern: Error Path + Table-Driven
Time: 50 min (estimated 60 min) → 17% faster
Coverage: internal/validation 57.9% → 75.2% (+17.3%)
Total coverage: 72.3% → 73.5% (+1.2%)
Efficiency: 0.3% per test
Issues: None
Lessons: Table-driven error tests very efficient
```

### Regular Coverage Analysis

```bash
# Weekly coverage review
go test -coverprofile=coverage.out ./...
go tool cover -func=coverage.out | tail -20

# Identify degradation
diff coverage-last-week.txt coverage-this-week.txt
```

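The degradation check can be automated by comparing the `total:` lines of the two snapshots. A sketch (the file names and percentages are hypothetical sample data):

```shell
# Hypothetical `total:` lines saved from `go tool cover -func` each week.
echo "total:	(statements)	73.5%" > /tmp/coverage-last-week.txt
echo "total:	(statements)	72.8%" > /tmp/coverage-this-week.txt

last=$(awk '{ sub(/%/, ""); print $NF }' /tmp/coverage-last-week.txt)
this=$(awk '{ sub(/%/, ""); print $NF }' /tmp/coverage-this-week.txt)

# Shell arithmetic is integer-only, so compare the percentages with awk.
awk -v a="$last" -v b="$this" 'BEGIN {
	if (b + 0 < a + 0) print "coverage dropped: " a "% -> " b "%"
}'
```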
### Test Suite Health

Monitor:

- Total test count (growing)
- Test execution time (stable or decreasing)
- Coverage percentage (stable or increasing)
- Flaky test rate (near zero)
- Test maintenance time (decreasing)

---

**Source**: Bootstrap-002 Test Strategy Development
**Framework**: BAIME (Bootstrapped AI Methodology Engineering)
**Status**: Production-ready, validated through 4 iterations