# Phase 6: Test Suite Generation (NEW v2.0!)

## Objective

**GENERATE** a comprehensive test suite that validates ALL functions of the created skill.

**LEARNING:** us-crop-monitor v1.0 had ZERO tests. When expanding to v2.0, it was difficult to ensure nothing broke. v2.0 has 25 tests (100% passing) that ensure reliability.

---

## Why Are Tests Critical?

### Benefits for the Developer:
- ✅ Ensures code works before distribution
- ✅ Detects bugs early (not after the client installs!)
- ✅ Allows confident changes (regression testing)
- ✅ Documents expected behavior

### Benefits for the Client:
- ✅ Confidence in the skill ("100% tested")
- ✅ Fewer bugs in production
- ✅ More professional (commercially viable)

### Benefits for the Agent-Creator:
- ✅ Validates that the generated skill actually works
- ✅ Catches errors before considering the skill "done"
- ✅ Automatic quality gate

---

## Test Structure

### tests/ Directory

```
{skill-name}/
└── tests/
    ├── test_fetch.py          # Tests API client
    ├── test_parse.py          # Tests parsers
    ├── test_analyze.py        # Tests analyses
    ├── test_integration.py    # Tests end-to-end
    ├── test_validation.py     # Tests validators
    ├── test_helpers.py        # Tests helpers (year detection, etc.)
    └── README.md              # How to run tests
```
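
If pytest is available, a `tests/conftest.py` can centralize the `sys.path` setup that each template below repeats (a minimal sketch; the templates also work standalone without it):

```python
# Hypothetical tests/conftest.py: make scripts/ importable for every test file.
import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent / 'scripts'))
```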

---

## Template 1: test_fetch.py

**Objective:** Validate that the API client works
```python
#!/usr/bin/env python3
"""
Test suite for the {API} client.

Tests all fetch methods with real API data.
"""

import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent / 'scripts'))

from fetch_{api} import {ApiClient}, DataNotFoundError


def test_get_{metric1}():
    """Test fetching {metric1} data."""
    print("\nTesting get_{metric1}()...")

    try:
        client = {ApiClient}()

        # Test with valid parameters
        result = client.get_{metric1}(
            {entity}='{valid_entity}',
            year=2024
        )

        # Validations
        assert 'data' in result, "Missing 'data' in result"
        assert 'metadata' in result, "Missing 'metadata'"
        assert len(result['data']) > 0, "No data returned"
        assert result['metadata']['from_cache'] in [True, False]

        print(f"  ✓ Fetched {len(result['data'])} records")
        print("  ✓ Metadata present")
        print(f"  ✓ From cache: {result['metadata']['from_cache']}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        return False


def test_get_{metric2}():
    """Test fetching {metric2} data."""
    # Same structure as test_get_{metric1}(). Implement it before running:
    # a body that only passes returns None, which main() counts as a failure.
    pass


def test_error_handling():
    """Test that errors are handled correctly."""
    print("\nTesting error handling...")

    try:
        client = {ApiClient}()

        # Test invalid entity (should raise)
        try:
            client.get_{metric1}({entity}='INVALID_ENTITY', year=2024)
            print("  ✗ Should have raised DataNotFoundError")
            return False
        except DataNotFoundError:
            print("  ✓ Correctly raises DataNotFoundError for invalid entity")

        # Test invalid year (should raise)
        try:
            client.get_{metric1}({entity}='{valid_entity}', year=2099)
            print("  ✗ Should have raised ValidationError")
            return False
        except Exception:
            print("  ✓ Correctly raises error for future year")

        return True

    except Exception as e:
        print(f"  ✗ Unexpected error: {e}")
        return False


def main():
    """Run all fetch tests."""
    print("=" * 70)
    print("FETCH TESTS - {API} Client")
    print("=" * 70)

    results = []

    # Test each get_* method
    results.append(("get_{metric1}", test_get_{metric1}()))
    results.append(("get_{metric2}", test_get_{metric2}()))
    # ... add a test for ALL get_* methods

    results.append(("error_handling", test_error_handling()))

    # Summary
    print("\n" + "=" * 70)
    print("SUMMARY")
    print("=" * 70)

    passed = sum(1 for _, r in results if r)
    total = len(results)

    for name, result in results:
        status = "✓ PASS" if result else "✗ FAIL"
        print(f"{status}: {name}()")

    print(f"\nResults: {passed}/{total} tests passed")

    return passed == total


if __name__ == "__main__":
    success = main()
    sys.exit(0 if success else 1)
```

**Rule:** ONE test function for EACH `get_*()` method implemented! A guard test like the sketch below can enforce this.
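
The guard is optional but cheap: it introspects the client for `get_*` methods and checks that this module defines a matching `test_get_*` function (a sketch, reusing the template's `{ApiClient}` placeholder and the `test_get_*` naming convention above):

```python
import inspect


def test_every_get_method_has_a_test():
    """Fail if any get_*() client method lacks a test_get_*() function here."""
    get_methods = [
        name for name, member in inspect.getmembers({ApiClient}, inspect.isfunction)
        if name.startswith('get_')
    ]
    untested = [name for name in get_methods if f'test_{name}' not in globals()]

    if untested:
        print(f"  ✗ Methods without tests: {untested}")
        return False
    print(f"  ✓ All {len(get_methods)} get_*() methods have tests")
    return True
```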

---

## Template 2: test_parse.py

**Objective:** Validate the parsers
```python
#!/usr/bin/env python3
"""
Test suite for data parsers.

Tests all parse_* modules.
"""

import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent / 'scripts'))

from parse_{type1} import parse_{type1}_response
from parse_{type2} import parse_{type2}_response


def test_parse_{type1}():
    """Test the {type1} parser."""
    print("\nTesting parse_{type1}_response()...")

    # Sample data (real structure from the API)
    sample_data = [
        {
            'field1': 'value1',
            'field2': 'value2',
            'Value': '123',
            # ... real API fields
        }
    ]

    try:
        df = parse_{type1}_response(sample_data)

        # Validations
        assert not df.empty, "DataFrame is empty"
        assert 'Value' in df.columns or '{metric}_value' in df.columns
        assert len(df) == len(sample_data)

        print(f"  ✓ Parsed {len(df)} records")
        print(f"  ✓ Columns: {list(df.columns)}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        import traceback
        traceback.print_exc()
        return False


def test_parse_{type2}():
    """Test the {type2} parser."""
    # Same structure as test_parse_{type1}(). Define it before running,
    # otherwise main() below raises a NameError.
    pass


def test_parse_empty_data():
    """Test that the parser handles empty data gracefully."""
    print("\nTesting empty data handling...")

    try:
        from parse_{type1} import ParseError

        try:
            parse_{type1}_response([])
            print("  ✗ Should have raised ParseError")
            return False
        except ParseError as e:
            print(f"  ✓ Correctly raises ParseError: {e}")
            return True

    except Exception as e:
        print(f"  ✗ Unexpected error: {e}")
        return False


def main():
    results = []

    # Test each parser
    results.append(("parse_{type1}", test_parse_{type1}()))
    results.append(("parse_{type2}", test_parse_{type2}()))
    # ... for ALL parsers

    results.append(("empty_data", test_parse_empty_data()))

    # Summary
    passed = sum(1 for _, r in results if r)
    print(f"\nResults: {passed}/{len(results)} passed")
    return passed == len(results)


if __name__ == "__main__":
    sys.exit(0 if main() else 1)
```

---

## Template 3: test_integration.py

**Objective:** End-to-end tests (MOST IMPORTANT!)
```python
#!/usr/bin/env python3
"""
Integration tests for {skill-name}.

Tests all analysis functions with REAL API data.
"""

import sys
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent / 'scripts'))

from analyze_{domain} import (
    {function1},
    {function2},
    {function3},
    # ... import ALL functions
)


def test_{function1}():
    """Test {function1} with auto-year detection."""
    print("\n1. Testing {function1}()...")

    try:
        # Test WITHOUT year (auto-detection)
        result = {function1}({entity}='{valid_entity}')

        # Validations
        assert 'year' in result, "Missing year"
        assert 'year_requested' in result, "Missing year_requested"
        assert 'year_info' in result, "Missing year_info"
        assert result['year'] >= 2024, "Year too old"  # adjust the floor to your domain
        assert result['year_requested'] is None, "Should auto-detect"

        print(f"  ✓ Auto-year detection: {result['year']}")
        print(f"  ✓ Year info: {result['year_info']}")
        print(f"  ✓ Data present: {list(result.keys())}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        import traceback
        traceback.print_exc()
        return False


def test_{function1}_with_explicit_year():
    """Test {function1} with an explicit year."""
    print("\n2. Testing {function1}() with explicit year...")

    try:
        # Test WITH the year specified
        result = {function1}({entity}='{valid_entity}', year=2024)

        assert result['year'] == 2024, f"Expected 2024, got {result['year']}"
        assert result['year_requested'] == 2024

        print(f"  ✓ Uses specified year: {result['year']}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        return False


def test_all_functions_exist():
    """Verify all expected functions are implemented."""
    print("\nVerifying all functions exist...")

    expected_functions = [
        '{function1}',
        '{function2}',
        '{function3}',
        # ... ALL functions
    ]

    missing = []
    for func_name in expected_functions:
        if func_name not in globals():
            missing.append(func_name)

    if missing:
        print(f"  ✗ Missing functions: {missing}")
        return False
    else:
        print(f"  ✓ All {len(expected_functions)} functions present")
        return True


def main():
    """Run all integration tests."""
    print("\n" + "=" * 70)
    print("{SKILL NAME} - INTEGRATION TEST SUITE")
    print("=" * 70)

    results = []

    # Test each function
    results.append(("{function1} auto-year", test_{function1}()))
    results.append(("{function1} explicit-year", test_{function1}_with_explicit_year()))
    # ... repeat for ALL functions

    results.append(("all_functions_exist", test_all_functions_exist()))

    # Summary
    print("\n" + "=" * 70)
    print("FINAL SUMMARY")
    print("=" * 70)

    passed = sum(1 for _, r in results if r)
    total = len(results)

    print(f"\n✓ Passed: {passed}/{total}")
    print(f"✗ Failed: {total - passed}/{total}")

    if passed == total:
        print("\n🎉 ALL TESTS PASSED! SKILL IS PRODUCTION READY!")
    else:
        print(f"\n⚠ {total - passed} test(s) failed - FIX BEFORE RELEASE")

    print("=" * 70)

    return passed == total


if __name__ == "__main__":
    success = main()
    sys.exit(0 if success else 1)
```

**Rule:** Minimum 2 tests per analysis function (auto-year + explicit-year). A shared helper like the sketch below keeps the pair from being copy-pasted N times.
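
A minimal sketch of such a helper (assumptions: each analysis function takes the entity as its first argument, accepts `year=None`, and returns the `year`/`year_requested` keys used above):

```python
def check_year_pair(func, entity, explicit_year=2024):
    """Run one analysis function in auto-year and explicit-year mode."""
    auto = func(entity)                           # auto-detection path
    explicit = func(entity, year=explicit_year)   # explicit path

    auto_ok = auto['year_requested'] is None
    explicit_ok = explicit['year'] == explicit_year
    return auto_ok and explicit_ok
```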

---

## Template 4: test_helpers.py

**Objective:** Test year detection helpers
```python
#!/usr/bin/env python3
"""
Test suite for utility helpers.

Tests temporal context detection.
"""

import sys
from pathlib import Path
from datetime import datetime

sys.path.insert(0, str(Path(__file__).parent.parent / 'scripts'))

from utils.helpers import (
    get_current_{domain}_year,
    should_try_previous_year,
    format_year_message
)


def test_get_current_year():
    """Test current year detection."""
    print("\nTesting get_current_{domain}_year()...")

    try:
        year = get_current_{domain}_year()
        current_year = datetime.now().year

        assert year == current_year, f"Expected {current_year}, got {year}"

        print(f"  ✓ Correctly returns: {year}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        return False


def test_should_try_previous_year():
    """Test seasonal fallback logic."""
    print("\nTesting should_try_previous_year()...")

    try:
        # Test with None (current year)
        result = should_try_previous_year()
        print(f"  ✓ Current year fallback: {result}")

        # Test with a specific year
        result_past = should_try_previous_year(2023)
        print(f"  ✓ Past year fallback: {result_past}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        return False


def test_format_year_message():
    """Test year message formatting."""
    print("\nTesting format_year_message()...")

    try:
        # Test auto-detected
        msg1 = format_year_message(2025, None)
        assert "auto-detected" in msg1.lower() or "2025" in msg1
        print(f"  ✓ Auto-detected: {msg1}")

        # Test requested
        msg2 = format_year_message(2024, 2024)
        assert "2024" in msg2
        print(f"  ✓ Requested: {msg2}")

        # Test fallback
        msg3 = format_year_message(2024, 2025)
        assert "not" in msg3.lower() or "fallback" in msg3.lower()
        print(f"  ✓ Fallback: {msg3}")

        return True

    except Exception as e:
        print(f"  ✗ FAILED: {e}")
        return False


def main():
    results = []

    results.append(("get_current_year", test_get_current_year()))
    results.append(("should_try_previous_year", test_should_try_previous_year()))
    results.append(("format_year_message", test_format_year_message()))

    passed = sum(1 for _, r in results if r)
    print(f"\nResults: {passed}/{len(results)} passed")

    return passed == len(results)


if __name__ == "__main__":
    sys.exit(0 if main() else 1)
```
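
For context, a minimal sketch of what `utils/helpers.py` might implement to satisfy these tests (hypothetical; real fallback logic depends on the domain's season calendar):

```python
# Hypothetical utils/helpers.py matching the assertions in test_helpers.py.
from datetime import datetime


def get_current_{domain}_year():
    """Return the current {domain} year (here simply the calendar year)."""
    return datetime.now().year


def should_try_previous_year(year=None):
    """Heuristic: early in the season, current-year data may not exist yet."""
    if year is not None and year < datetime.now().year:
        return False  # explicit past years never need a fallback
    return datetime.now().month <= 4  # assumption: season data starts in spring


def format_year_message(year_used, year_requested):
    """Explain which year was used and why."""
    if year_requested is None:
        return f"Year {year_used} (auto-detected)"
    if year_used != year_requested:
        return f"Year {year_requested} not available; fell back to {year_used}"
    return f"Year {year_used} (as requested)"
```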

---

## Quality Rules for Tests

### 1. ALL tests must use REAL DATA

❌ **FORBIDDEN:**
```python
def test_function():
    # Mock data
    mock_data = {'fake': 'data'}
    result = function(mock_data)
    assert result == 'expected'
```

✅ **MANDATORY:**
```python
def test_function():
    # Real API call
    client = ApiClient()
    result = client.get_real_data(entity='REAL', year=2024)

    # Validate the real response
    assert len(result['data']) > 0
    assert 'metadata' in result
```

**Why?**
- Tests with mocks don't guarantee the API is working
- Real tests detect API changes
- The client needs to know the skill works with REAL data

---

### 2. Tests must be FAST

**Goal:** the complete suite runs in < 60 seconds

**Techniques:**
- Use the cache: the first test populates it, the rest read from it
- Limit requests: don't test 100 entities, test 2-3
- Run in parallel where possible (see the sketch after this block)

```python
# Example: populate the cache once (unittest style)
import unittest


class TestWithWarmCache(unittest.TestCase):

    @classmethod
    def setUpClass(cls):
        """Populate the cache before all tests in this class."""
        client = ApiClient()
        client.get_data('ENTITY1', 2024)  # cached for the other tests

    # Tests in this class then use the cached data (fast)
```
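
For the parallel technique, a minimal sketch that runs the test files concurrently (an assumption here: the test files are independent and share no mutable state beyond the read-only cache):

```python
# Run test files in parallel; each script exits non-zero on failure.
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_test_file(test_file):
    """Run one test script and report its exit code."""
    return test_file.name, subprocess.run(['python3', str(test_file)]).returncode


with ThreadPoolExecutor(max_workers=4) as pool:
    outcomes = list(pool.map(run_test_file, sorted(Path('tests').glob('test_*.py'))))

for name, code in outcomes:
    print(f"{'✓ PASS' if code == 0 else '✗ FAIL'}: {name}")
```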

---

### 3. Tests must PASS 100%

**Quality Gate:** the skill is only "done" when ALL tests pass.

```python
# At the bottom of each test script (sys is imported at the top):
if __name__ == "__main__":
    success = main()
    if not success:
        print("\n❌ SKILL NOT READY - FIX FAILING TESTS")
        sys.exit(1)
    else:
        print("\n✅ SKILL READY FOR DISTRIBUTION")
        sys.exit(0)
```

---

## Test Coverage Requirements

### Minimum Mandatory:

**Per module:**
- `fetch_{api}.py`: 1 test per `get_*()` method + 1 error handling test
- Each `parse_{type}.py`: 1 test per main function
- `analyze_{domain}.py`: 2 tests per analysis (auto-year + explicit-year)
- `utils/helpers.py`: 3 tests (get_year, should_fallback, format_message)

**Expected total:** 15-30 tests depending on skill size

**Example (us-crop-monitor v2.0):**
- test_fetch.py: 6 tests (5 get_* + 1 error)
- test_parse.py: 4 tests (4 parsers)
- test_analyze.py: 11 tests (11 functions)
- test_helpers.py: 3 tests
- test_integration.py: 1 end-to-end test
- **Total:** 25 tests
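
A quick way to check these counts automatically (a sketch; it simply greps each file for top-level `def test_` definitions):

```python
# Count test functions per file to compare against the minimums above.
import re
from pathlib import Path

for test_file in sorted(Path('tests').glob('test_*.py')):
    source = test_file.read_text()
    tests = re.findall(r'^def (test_\w+)\(', source, flags=re.MULTILINE)
    print(f"{test_file.name}: {len(tests)} tests")
```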
---

## How to Run Tests

### Individual tests:
```bash
python3 tests/test_fetch.py
python3 tests/test_integration.py
```

### Complete suite:
```bash
# Run all test files, stopping at the first failure
for test in tests/test_*.py; do
    python3 "$test" || exit 1
done

# Or with pytest (if available)
pytest tests/
```

### In CI/CD:
```yaml
# .github/workflows/test.yml
name: Test Suite
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - run: pip install -r requirements.txt
      - run: python3 tests/test_integration.py
```

---

## Output Example

**When tests pass:**
```
======================================================================
US CROP MONITOR - INTEGRATION TEST SUITE
======================================================================

1. current_condition_report()...
   ✓ Year: 2025 | Week: 39
   ✓ Good+Excellent: 66.0%

2. week_over_week_comparison()...
   ✓ Year: 2025 | Weeks: 39 vs 38
   ✓ Delta: -2.2 pts

...

======================================================================
FINAL SUMMARY
======================================================================

✓ Passed: 25/25 tests
✗ Failed: 0/25 tests

🎉 ALL TESTS PASSED! SKILL IS PRODUCTION READY!
======================================================================
```

**When tests fail:**
```
8. yield_analysis()...
   ✗ FAILED: 'yield_bu_per_acre' not in result

...

FINAL SUMMARY:
✓ Passed: 24/25
✗ Failed: 1/25

❌ SKILL NOT READY - FIX FAILING TESTS
```

---

## Integration with Agent-Creator

### When to generate tests:

**In Phase 5 (Implementation):**

Updated order:
```
...
8.  Implement analyze (analyses)
9.  CREATE TESTS (← here!)
    - Generate test_fetch.py
    - Generate test_parse.py
    - Generate test_analyze.py
    - Generate test_helpers.py
    - Generate test_integration.py
10. RUN TESTS
    - Run the test suite
    - If it fails → FIX and re-run
    - Only continue when 100% passing
11. Create examples/
...
```

### Quality Gate:

```python
# The agent-creator should do:
import subprocess
import sys

print("Running test suite...")
exit_code = subprocess.run(['python3', 'tests/test_integration.py']).returncode

if exit_code != 0:
    print("❌ Tests failed - aborting skill generation")
    print("Fix the errors above and try again")
    sys.exit(1)

print("✅ All tests passed - continuing...")
```
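
The same gate can cover every test file, not just the integration suite (a sketch; it assumes each script exits non-zero on failure, as all the templates above do):

```python
# Run every tests/test_*.py and abort on the first failure.
import subprocess
import sys
from pathlib import Path

for test_file in sorted(Path('tests').glob('test_*.py')):
    print(f"Running {test_file}...")
    if subprocess.run(['python3', str(test_file)]).returncode != 0:
        print(f"❌ {test_file} failed - aborting skill generation")
        sys.exit(1)

print("✅ All test files passed - continuing...")
```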

---

## Testing Checklist

Before considering the skill "done":

- [ ] tests/ directory created
- [ ] test_fetch.py with 1 test per get_*() method
- [ ] test_parse.py with 1 test per parser
- [ ] test_analyze.py with 2 tests per function (auto-year + explicit)
- [ ] test_helpers.py with year detection tests
- [ ] test_integration.py with end-to-end test
- [ ] ALL tests passing (100%)
- [ ] Test suite executes in < 60 seconds
- [ ] README in tests/ explaining how to run them

---

## Real Example: us-crop-monitor v2.0

**Tests created:**
- `test_new_metrics.py` - 5 tests (fetch methods)
- `test_year_detection.py` - 2 tests (auto-detection)
- `test_all_year_detection.py` - 4 tests (all functions)
- `test_new_analyses.py` - 3 tests (new analyses)
- `test_integrated_validation.py` - 11 tests (comprehensive)

**Total:** 25 tests, 100% passing

**Result:**
```
✓ Passed: 25/25 tests
🎉 ALL TESTS PASSED! SKILL IS PRODUCTION READY!
```

**Benefit:** Full confidence that v2.0 works before distribution!

---

## Conclusion

**ALWAYS generate a test suite!**

Skills without tests = prototypes.
Skills with tests = professional products. ✅

**ROI:** Tests cost an extra ~2 hours to create, but save 10-20 hours of debugging later!