# Comprehensive Testing Guide

**Complete testing strategy for MXCP servers: MXCP tests, Python unit tests, mocking, test databases, and concurrency safety.**

## Two Types of Tests

### 1. MXCP Tests (Integration Tests)

**Purpose**: Test the full tool/resource/prompt as it will be called by LLMs.

**Located**: In tool YAML files, under the `tests:` section

**Run with**: `mxcp test`

**Tests**:

- Tool can be invoked with parameters
- Return type matches the specification
- Result structure is correct
- Parameter validation works

**Example**:

```yaml
# tools/get_customers.yml
mxcp: 1
tool:
  name: get_customers
  tests:
    - name: "basic_query"
      arguments:
        - key: city
          value: "Chicago"
      result:
        - customer_id: 3
          name: "Bob Johnson"
```

### 2. Python Unit Tests (Isolation Tests)

**Purpose**: Test Python functions in isolation, covering mocking, edge cases, and concurrency.

**Located**: In the `tests/` directory (pytest format)

**Run with**: `pytest` or `python -m pytest`

**Tests**:

- Function logic correctness
- Edge cases and error handling
- Mocked external dependencies
- Concurrency safety
- Result correctness verification

**Example**:

```python
# tests/test_api_wrapper.py
import pytest

from python.api_wrapper import fetch_users

@pytest.mark.asyncio
async def test_fetch_users_correctness():
    """Test that fetch_users returns the correct structure."""
    result = await fetch_users(limit=5)
    assert "users" in result
    assert "count" in result
    assert result["count"] == 5
    assert len(result["users"]) == 5
    assert all("id" in user for user in result["users"])
```

## When to Use Which Tests

| Scenario | MXCP Tests | Python Unit Tests |
|----------|------------|-------------------|
| SQL-only tool | ✅ Required | ❌ Not applicable |
| Python tool (no external calls) | ✅ Required | ✅ Recommended |
| Python tool (with API calls) | ✅ Required | ✅ **Required** (with mocking) |
| Python tool (with DB writes) | ✅ Required | ✅ **Required** (test DB) |
| Python tool (async/concurrent) | ✅ Required | ✅ **Required** (concurrency tests) |

## Complete Testing Workflow

### Phase 1: MXCP Tests (Always First)

**For every tool, add test cases to the YAML:**

```yaml
tool:
  name: my_tool
  # ... definition ...
  tests:
    - name: "happy_path"
      arguments:
        - key: param1
          value: "test_value"
      result:
        expected_field: "expected_value"

    - name: "edge_case_empty"
      arguments:
        - key: param1
          value: "nonexistent"
      result: []

    - name: "missing_optional_param"
      arguments: []  # Should work with defaults
```

**Run**:

```bash
mxcp test tool my_tool
```

### Phase 2: Python Unit Tests (For Python Tools)

**Create the test file structure**:

```bash
mkdir -p tests
touch tests/__init__.py
touch tests/test_my_module.py
```

**Write unit tests with pytest**:

```python
# tests/test_my_module.py
import pytest

from python.my_module import my_function

def test_my_function_correctness():
    """Verify correct results."""
    result = my_function("input")
    assert result["key"] == "expected_value"
    assert len(result["items"]) == 5

def test_my_function_edge_cases():
    """Test edge cases."""
    assert my_function("") == {"error": "Empty input"}
    assert my_function(None) == {"error": "Invalid input"}
```

**Run**:

```bash
pytest tests/
# Or with coverage
pytest --cov=python tests/
```
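When several test modules share setup, pytest fixtures in `tests/conftest.py` (referenced again in the project structure below) keep individual tests short. A minimal sketch; the fixture name and sample payload are hypothetical:

```python
# tests/conftest.py
import pytest

@pytest.fixture
def sample_customers() -> list[dict]:
    """Reusable, predictable test data shared across test modules."""
    return [
        {"customer_id": 1, "name": "Jane Doe", "city": "Chicago"},
        {"customer_id": 2, "name": "John Roe", "city": "Boston"},
    ]
```

Any test can then declare `sample_customers` as a parameter and pytest injects it automatically.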
## Testing SQL Tools with Test Database

**CRITICAL**: SQL tools must be tested with real data to verify result correctness.

### Pattern 1: Use dbt Seeds for Test Data

```bash
# 1. Create the test data seed
cat > seeds/test_data.csv <<'EOF'
id,name,value
1,test1,100
2,test2,200
3,test3,300
EOF

# 2. Create the schema
cat > seeds/schema.yml <<'EOF'
version: 2
seeds:
  - name: test_data
    columns:
      - name: id
        tests: [unique, not_null]
EOF

# 3. Load the test data
dbt seed --select test_data

# 4. Create the tool with tests
cat > tools/query_test_data.yml <<'EOF'
mxcp: 1
tool:
  name: query_test_data
  parameters:
    - name: min_value
      type: integer
  return:
    type: array
  source:
    code: |
      SELECT * FROM test_data WHERE value >= $min_value
  tests:
    - name: "filter_200"
      arguments:
        - key: min_value
          value: 200
      result:
        - id: 2
          value: 200
        - id: 3
          value: 300
EOF

# 5. Test
mxcp test tool query_test_data
```

### Pattern 2: Create Test Fixtures in SQL

```sql
-- models/test_fixtures.sql
{{ config(materialized='table') }}

-- Create predictable test data
SELECT 1 as id, 'Alice' as name, 100 as score
UNION ALL
SELECT 2 as id, 'Bob' as name, 200 as score
UNION ALL
SELECT 3 as id, 'Charlie' as name, 150 as score
```

```yaml
# tools/top_scores.yml
tool:
  name: top_scores
  source:
    code: |
      SELECT * FROM test_fixtures
      ORDER BY score DESC
      LIMIT $limit
  tests:
    - name: "top_2"
      arguments:
        - key: limit
          value: 2
      result:
        - id: 2
          name: "Bob"
          score: 200
        - id: 3
          name: "Charlie"
          score: 150
```

### Pattern 3: Verify Aggregation Correctness

```yaml
# tools/calculate_stats.yml
tool:
  name: calculate_stats
  source:
    code: |
      SELECT
        COUNT(*) as total_count,
        SUM(score) as total_score,
        AVG(score) as avg_score,
        MAX(score) as max_score
      FROM test_fixtures
  tests:
    - name: "verify_aggregations"
      arguments: []
      result:
        - total_count: 3
          total_score: 450
          avg_score: 150.0
          max_score: 200
```

**If the aggregations don't match the expected values, the SQL logic is WRONG.**

## Testing Python Tools with Mocking

**CRITICAL**: Python tools with external API calls MUST use mocking in tests.

### Pattern 1: Mock HTTP Calls with pytest-httpx

```bash
# Install
pip install pytest-httpx
```

```python
# python/api_client.py
import httpx

async def fetch_external_data(api_key: str, user_id: int) -> dict:
    """Fetch data from an external API."""
    async with httpx.AsyncClient() as client:
        response = await client.get(
            f"https://api.example.com/users/{user_id}",
            headers={"Authorization": f"Bearer {api_key}"}
        )
        response.raise_for_status()
        return response.json()
```

```python
# tests/test_api_client.py
import httpx
import pytest

from python.api_client import fetch_external_data

@pytest.mark.asyncio
async def test_fetch_external_data_success(httpx_mock):
    """Test a successful API call with a mocked response."""
    # Mock the HTTP call
    httpx_mock.add_response(
        url="https://api.example.com/users/123",
        json={"id": 123, "name": "Test User", "email": "test@example.com"}
    )

    # Call the function
    result = await fetch_external_data("fake_api_key", 123)

    # Verify correctness
    assert result["id"] == 123
    assert result["name"] == "Test User"
    assert result["email"] == "test@example.com"

@pytest.mark.asyncio
async def test_fetch_external_data_error(httpx_mock):
    """Test API error handling."""
    httpx_mock.add_response(
        url="https://api.example.com/users/999",
        status_code=404,
        json={"error": "User not found"}
    )

    # Should surface the HTTP error
    with pytest.raises(httpx.HTTPStatusError):
        await fetch_external_data("fake_api_key", 999)
```
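The checklists below also call for timeout coverage. pytest-httpx can simulate that with `add_exception`; a sketch against the same `fetch_external_data`:

```python
# tests/test_api_client.py (continued)
import httpx
import pytest

from python.api_client import fetch_external_data

@pytest.mark.asyncio
async def test_fetch_external_data_timeout(httpx_mock):
    """Simulate a network timeout and verify it propagates."""
    httpx_mock.add_exception(
        httpx.ReadTimeout("request timed out"),
        url="https://api.example.com/users/123",
    )

    with pytest.raises(httpx.ReadTimeout):
        await fetch_external_data("fake_api_key", 123)
```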
### Pattern 2: Mock Database Calls

```python
# python/db_operations.py
from mxcp.runtime import db

def get_user_orders(user_id: int) -> list[dict]:
    """Get orders for a user."""
    result = db.execute(
        "SELECT * FROM orders WHERE user_id = $user_id",
        {"user_id": user_id}
    )
    return result.fetchall()
```

```python
# tests/test_db_operations.py
from unittest.mock import Mock, MagicMock

from python.db_operations import get_user_orders

def test_get_user_orders(monkeypatch):
    """Test with a mocked database."""
    # Create the mock result
    mock_result = MagicMock()
    mock_result.fetchall.return_value = [
        {"order_id": 1, "user_id": 123, "amount": 50.0},
        {"order_id": 2, "user_id": 123, "amount": 75.0}
    ]

    # Mock db.execute
    mock_db = Mock()
    mock_db.execute.return_value = mock_result

    # Inject the mock
    import python.db_operations
    monkeypatch.setattr(python.db_operations, "db", mock_db)

    # Test
    orders = get_user_orders(123)

    # Verify
    assert len(orders) == 2
    assert orders[0]["order_id"] == 1
    assert sum(o["amount"] for o in orders) == 125.0
```

### Pattern 3: Mock Third-Party Libraries

```python
# python/stripe_wrapper.py
import stripe

def create_customer(email: str, name: str) -> dict:
    """Create a Stripe customer."""
    customer = stripe.Customer.create(email=email, name=name)
    return {"id": customer.id, "email": customer.email}
```

```python
# tests/test_stripe_wrapper.py
from unittest.mock import Mock, patch

from python.stripe_wrapper import create_customer

@patch('stripe.Customer.create')
def test_create_customer(mock_create):
    """Test Stripe customer creation with a mock."""
    # Mock the Stripe response
    mock_customer = Mock()
    mock_customer.id = "cus_test123"
    mock_customer.email = "test@example.com"
    mock_create.return_value = mock_customer

    # Call the function
    result = create_customer("test@example.com", "Test User")

    # Verify correctness
    assert result["id"] == "cus_test123"
    assert result["email"] == "test@example.com"

    # Verify Stripe was called correctly
    mock_create.assert_called_once_with(
        email="test@example.com",
        name="Test User"
    )
```

## Result Correctness Verification

**CRITICAL**: Tests must verify that results are CORRECT, not just that the code doesn't crash.

### Bad Test (Only checks structure):

```python
def test_calculate_total_bad():
    result = calculate_total([10, 20, 30])
    assert "total" in result  # ❌ Doesn't verify correctness
```

### Good Test (Verifies correct values):

```python
def test_calculate_total_good():
    result = calculate_total([10, 20, 30])
    assert result["total"] == 60      # ✅ Verifies correct calculation
    assert result["count"] == 3       # ✅ Verifies correct count
    assert result["average"] == 20.0  # ✅ Verifies correct average
```

### Pattern: Test Edge Cases for Correctness

```python
def test_aggregation_correctness():
    """Test various aggregations for correctness."""
    data = [
        {"id": 1, "value": 100},
        {"id": 2, "value": 200},
        {"id": 3, "value": 150}
    ]

    result = aggregate_data(data)

    # Verify each aggregation
    assert result["sum"] == 450    # 100 + 200 + 150
    assert result["avg"] == 150.0  # 450 / 3
    assert result["min"] == 100
    assert result["max"] == 200
    assert result["count"] == 3

    # Verify derived values
    assert result["range"] == 100  # 200 - 100
    assert result["median"] == 150

def test_empty_data_correctness():
    """Test edge case: empty data."""
    result = aggregate_data([])
    assert result["sum"] == 0
    assert result["avg"] == 0.0
    assert result["count"] == 0
    # No crashes; correct behavior for empty data
```

## Concurrency Safety for Python Tools

**CRITICAL**: MXCP tools run as a server: multiple requests can happen simultaneously.

### Common Concurrency Issues

#### ❌ WRONG: Global State with Race Conditions

```python
# python/unsafe_counter.py
counter = 0  # ❌ DANGER: Race condition!

def increment_counter() -> dict:
    global counter
    counter += 1  # ❌ Not thread-safe!
    return {"count": counter}

# Two simultaneous requests could both read counter=5,
# both increment to 6, and both write 6 -> one increment lost!
```
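To see that lost update in practice, a short stress script (illustrative only, assuming the unsafe module above lives at `python/unsafe_counter.py`) can hammer the counter from several threads; the final count frequently comes out below the expected total:

```python
# scripts/show_race.py - demonstration, not a real test
import threading

import python.unsafe_counter as uc

def hammer(n: int) -> None:
    """Call the unsafe increment n times from this thread."""
    for _ in range(n):
        uc.increment_counter()

threads = [threading.Thread(target=hammer, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# With a correct implementation this always prints 400000; with the
# racy version it can print less, and the result differs run to run.
print(f"expected 400000, got {uc.counter}")
```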
#### ✅ CORRECT: Use Thread-Safe Approaches

**Option 1: Avoid shared state (stateless)**

```python
# python/safe_stateless.py
def process_request(data: dict) -> dict:
    """Completely stateless - safe for concurrent calls."""
    result = compute_something(data)
    return {"result": result}

# No global state, no problem!
```

**Option 2: Use thread-safe structures**

```python
# python/safe_with_lock.py
import threading

counter_lock = threading.Lock()
counter = 0

def increment_counter() -> dict:
    global counter
    with counter_lock:  # ✅ Thread-safe
        counter += 1
        current = counter
    return {"count": current}
```

**Option 3: Encapsulate state in a thread-safe class**

```python
# python/safe_atomic.py
from threading import Lock

class SafeCounter:
    """Counter whose increments are made atomic by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = Lock()

    def increment(self):
        with self._lock:
            self._value += 1
            return self._value

counter = SafeCounter()

def increment_counter() -> dict:
    return {"count": counter.increment()}
```

### Concurrency-Safe Patterns

#### Pattern 1: Database as State (DuckDB is thread-safe)

```python
# python/db_counter.py
from mxcp.runtime import db

def increment_counter() -> dict:
    """Use the database for state - thread-safe."""
    db.execute("""
        CREATE TABLE IF NOT EXISTS counter (
            id INTEGER PRIMARY KEY,
            value INTEGER
        )
    """)
    db.execute("""
        INSERT INTO counter (id, value) VALUES (1, 1)
        ON CONFLICT(id) DO UPDATE SET value = value + 1
    """)
    result = db.execute("SELECT value FROM counter WHERE id = 1")
    return {"count": result.fetchone()["value"]}
```

#### Pattern 2: Local Variables Only (Immutable)

```python
# python/safe_processing.py
async def process_data(input_data: list[dict]) -> dict:
    """Local variables only - safe for concurrent calls."""
    # All state is local to this function call
    results = []
    total = 0

    for item in input_data:
        processed = transform(item)  # Pure function
        results.append(processed)
        total += processed["value"]

    return {
        "results": results,
        "total": total,
        "count": len(results)
    }

# When the function returns, all state is discarded
```

#### Pattern 3: Async/Await (Concurrent, Not Parallel)

```python
# python/safe_async.py
import asyncio
import httpx

async def fetch_multiple_users(user_ids: list[int]) -> list[dict]:
    """Concurrent API calls - safe with async."""
    async def fetch_one(user_id: int) -> dict:
        # Each call has its own context - no shared state
        async with httpx.AsyncClient() as client:
            response = await client.get(f"https://api.example.com/users/{user_id}")
            return response.json()

    # Run concurrently, but each fetch_one is independent
    results = await asyncio.gather(*[fetch_one(uid) for uid in user_ids])
    return results
```

### Testing Concurrency Safety

```python
# tests/test_concurrency.py
import asyncio
import threading

import pytest

from python.my_module import concurrent_function, my_function

@pytest.mark.asyncio
async def test_concurrent_calls_no_race_condition():
    """Test that concurrent calls don't have race conditions."""
    # Run the function 100 times concurrently
    tasks = [concurrent_function(i) for i in range(100)]
    results = await asyncio.gather(*tasks)

    # Verify all calls succeeded
    assert len(results) == 100

    # Verify no data corruption
    assert all(isinstance(r, dict) for r in results)

    # If the function keeps a counter, verify correctness
    # (e.g., if each call increments, the final count should be 100)

def test_parallel_execution_thread_safe():
    """Test with actual threading."""
    results = []
    errors = []

    def worker(n):
        try:
            result = my_function(n)
            results.append(result)
        except Exception as e:
            errors.append(e)

    # Create 50 threads
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(50)]

    # Start all threads
    for t in threads:
        t.start()

    # Wait for completion
    for t in threads:
        t.join()

    # Verify
    assert len(errors) == 0, f"Errors occurred: {errors}"
    assert len(results) == 50
```

## Complete Testing Checklist

### For SQL Tools:

- [ ] MXCP test cases in YAML
- [ ] Test with real seed data
- [ ] Verify result correctness (exact values)
- [ ] Test edge cases (empty results, NULL values)
- [ ] Test that filters work correctly
- [ ] Test that aggregations are mathematically correct
- [ ] Test with `dbt test` for data quality

### For Python Tools (No External Calls):

- [ ] MXCP test cases in YAML
- [ ] Python unit tests (pytest)
- [ ] Verify result correctness
- [ ] Test edge cases (empty input, NULL, invalid)
- [ ] Test error handling
- [ ] Test concurrency safety (if using shared state)

### For Python Tools (With External API Calls):

- [ ] MXCP test cases in YAML
- [ ] Python unit tests with mocking (pytest + httpx_mock)
- [ ] Mock all external API calls
- [ ] Test the success path with mocked responses
- [ ] Test error cases (404, 500, timeout)
- [ ] Verify correct API parameters
- [ ] Test result correctness
- [ ] Test concurrency (multiple simultaneous calls)

### For Python Tools (With Database Operations):

- [ ] MXCP test cases in YAML
- [ ] Python unit tests
- [ ] Use test fixtures/seed data
- [ ] Verify query result correctness
- [ ] Test transactions (if applicable)
- [ ] Test concurrency (DuckDB is thread-safe)
- [ ] Clean up test data after tests (see the fixture sketch below)
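For that last item, a pytest fixture with teardown keeps cleanup automatic even when a test fails. A minimal sketch: the `db` import mirrors the earlier examples, but the table name and cleanup SQL are assumptions for illustration:

```python
# tests/conftest.py (sketch)
import pytest

from mxcp.runtime import db

@pytest.fixture
def seeded_order():
    """Insert a known row before the test and always remove it after."""
    db.execute(
        "INSERT INTO orders (order_id, user_id, amount) VALUES (9001, 123, 50.0)"
    )
    try:
        yield 9001  # hand the test the known order_id
    finally:
        # Teardown runs even if the test body raised
        db.execute("DELETE FROM orders WHERE order_id = 9001")
```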
## Project Structure for Testing

```
project/
├── mxcp-site.yml
├── tools/
│   └── my_tool.yml          # Contains MXCP tests
├── python/
│   └── my_module.py         # Python code
├── tests/
│   ├── __init__.py
│   ├── test_my_module.py    # Python unit tests
│   ├── conftest.py          # pytest fixtures
│   └── fixtures/
│       └── test_data.json   # Test data
├── seeds/
│   ├── test_data.csv        # Test database seeds
│   └── schema.yml
└── requirements.txt         # Include: pytest, pytest-asyncio, pytest-httpx, pytest-cov
```

## Running Tests

```bash
# 1. MXCP tests (always run first)
mxcp validate  # Structure validation
mxcp test      # Integration tests

# 2. dbt tests (if using dbt)
dbt test

# 3. Python unit tests
pytest tests/ -v

# 4. With a coverage report
pytest tests/ --cov=python --cov-report=html

# 5. Concurrency stress test (custom; --count requires the pytest-repeat plugin)
pytest tests/test_concurrency.py -v --count=100

# All together
mxcp validate && mxcp test && dbt test && pytest tests/ -v
```

## Summary

**Both types of tests are required**:

1. **MXCP tests** - verify tools work end-to-end
2. **Python unit tests** - verify logic, mocking, correctness, and concurrency

**Key principles**:

- ✅ **Mock all external calls** - use pytest-httpx and unittest.mock
- ✅ **Verify result correctness** - don't just check structure
- ✅ **Use test databases** - SQL tools need real data
- ✅ **Test concurrency** - tools run as servers
- ✅ **Avoid global mutable state** - use stateless patterns or locks
- ✅ **Test edge cases** - empty data, NULL, invalid input

**Before declaring a project done, BOTH test types must pass completely.**