gh-slamb2k-agent-smith-agen…/skills/agent-smith/references/LESSONS_LEARNED.md

# Lessons Learned from PocketSmith Migration

**Date:** 2025-11-23
**Source:** build/ directory reference materials (now archived)

This document captures key insights from a previous PocketSmith category migration project that informed Agent Smith's design.

---

## Table of Contents

1. [API Quirks and Workarounds](#api-quirks-and-workarounds)
2. [Category Hierarchy Best Practices](#category-hierarchy-best-practices)
3. [Transaction Categorization Patterns](#transaction-categorization-patterns)
4. [Merchant Name Normalization](#merchant-name-normalization)
5. [User Experience Lessons](#user-experience-lessons)

---

## API Quirks and Workarounds

### Category Rules API Limitations

**Issue:** PocketSmith API does not support updating or deleting category rules.
- GET `/categories/{id}/category_rules` works
- POST works (create only)
- PUT/PATCH/DELETE return 404 errors

**Impact:** Rules created via API cannot be modified programmatically.

**Agent Smith Solution:** Hybrid rule engine with local rules for complex logic, platform rules for simple keywords only.

### Transaction Migration 500 Errors

**Issue:** Bulk transaction updates sometimes fail with 500 Internal Server Errors.

**Root Cause:** Likely API rate limiting or server-side stability issues.

**Agent Smith Solution:**
- Implement rate limiting (0.1-0.5s delay between requests)
- Batch processing with progress tracking
- Retry logic with exponential backoff
- Always backup before bulk operations

### Special Characters in Category Names

**Issue:** Using "&" in category names causes 422 Unprocessable Entity errors.

**Workaround:** Replace "&" with "and" in all category names.

**Example:**
- ❌ "Takeaway & Food Delivery" → 422 error
- ✅ "Takeaway and Food Delivery" → Success

### Use PUT instead of PATCH

**Issue:** PATCH for transaction updates is unreliable in PocketSmith API.

**Solution:** Always use PUT for transaction updates.

```python
# ✅ Correct
response = requests.put(
    f'https://api.pocketsmith.com/v2/transactions/{txn_id}',
    headers=headers,
    json={'category_id': category_id}
)

# ❌ Avoid (unreliable)
response = requests.patch(...)
```

---

## Category Hierarchy Best Practices

### Parent-Child Structure

**Recommendation:** Use 2-level hierarchy maximum.
- 12-15 parent categories for broad grouping
- 2-5 children per parent for specific tracking
- Avoid 3+ levels (PocketSmith UI gets cluttered)

**Example Structure:**
```
Food & Dining (parent)
├── Groceries
├── Restaurants
├── Takeaway and Food Delivery
└── Coffee Shops
```

### Duplicate Category Detection

**Problem:** Duplicate categories accumulate over time, causing confusion.

**Solution:** Before creating categories, check for existing matches:
1. Flatten nested category structure
2. Check both exact matches and case-insensitive matches
3. Check for variations (e.g., "Takeaway" vs "Takeaways")

**Agent Smith Implementation:** Category validation in health check system.

### Consolidation Strategy

**Insight:** Merging duplicate categories is risky:
- Requires migrating all associated transactions
- Transaction updates can fail (500 errors)
- Better to prevent duplicates than merge later

**Agent Smith Approach:** Template-based setup with validation prevents duplicates upfront.

---

## Transaction Categorization Patterns

### Pattern Matching Complexity

**Observation:** Transaction categorization evolved through multiple rounds:
- Round 1: Simple keyword matching (60% coverage)
- Round 2: Pattern matching with normalization (80% coverage)
- Round 3: User clarifications + edge cases (90% coverage)
- Round 4: Manual review of exceptions (95% coverage)

**Lesson:** Need both automated rules AND user override capability.

**Agent Smith Solution:** Tiered intelligence modes (Conservative/Smart/Aggressive) with confidence scoring.

### Confidence-Based Auto-Apply

**Insight:** Not all matches are equal:
- High confidence (95%+): Auto-apply safe (e.g., "WOOLWORTHS" → Groceries)
- Medium confidence (70-94%): Ask user (e.g., "LS DOLLI PL" → Coffee?)
- Low confidence (<70%): Always ask (e.g., "Purchase At Kac" → ???)

**Agent Smith Implementation:**
```python
if confidence >= 90:  # Smart mode threshold
    apply_automatically()
elif confidence >= 70:
    ask_user_for_approval()
else:
    skip_or_manual_review()
```

### Dry-Run Mode is Critical

**Lesson:** Always preview before bulk operations.

**Pattern from migration:**
```python
class BulkCategorizer:
    def __init__(self, dry_run=True):  # Default to dry-run!
        self.dry_run = dry_run

    def categorize_transactions(self):
        if self.dry_run:
            # Show what WOULD happen
            return preview
        else:
            # Actually execute
            return results
```

**Agent Smith Implementation:** All bulk operations support `--mode=dry_run` flag.

---

## Merchant Name Normalization

### Common Payee Patterns

**Observations from transaction data:**

1. **Location codes:** "WOOLWORTHS 1234" → "WOOLWORTHS"
2. **Legal suffixes:** "COLES PTY LTD" → "COLES"
3. **Country codes:** "UBER AU" → "UBER"
4. **Transaction codes:** "PURCHASE NSWxxx123" → "PURCHASE"
5. **Direct debit patterns:** "DIRECT DEBIT 12345" → "DIRECT DEBIT"

**Agent Smith Patterns:**
```python
LOCATION_CODE_PATTERN = r"\s+\d{4,}$"
SUFFIX_PATTERNS = [
    r"\s+PTY\s+LTD$",
    r"\s+LIMITED$",
    r"\s+LTD$",
    r"\s+AU$",
]
```

### Merchant Variation Grouping

**Problem:** Same merchant appears with multiple names:
- "woolworths"
- "WOOLWORTHS PTY LTD"
- "Woolworths 1234"
- "WOOLWORTHS SUPERMARKETS"

**Solution:** Learn canonical names from transaction history.

**Agent Smith Implementation:** `MerchantNormalizer.learn_from_transactions()` in scripts/utils/merchant_normalizer.py:101-130

---

## User Experience Lessons

### Backups are Non-Negotiable

**Critical Lesson:** ALWAYS backup before mutations.

**Migration practice:**
```python
def categorize_transactions(self):
    # Step 1: Always backup first
    self.backup_transactions()

    # Step 2: Then execute
    self.apply_changes()
```

**Agent Smith Policy:** Automatic backups before all mutation operations, tracked in backups/ directory.

### Progress Visibility Matters

**Problem:** Long-running operations feel broken without progress indicators.

**Solution:** Show progress every N iterations:
```python
for i, txn in enumerate(transactions, 1):
    # Process transaction

    if i % 100 == 0:
        print(f"Progress: {i}/{total} ({i/total*100:.1f}%)")
```

**Agent Smith Implementation:** All batch operations show real-time progress.

### Manual Cleanup is Inevitable

**Reality Check:** Even after 5+ rounds of automated categorization, ~5% of transactions needed manual review.

**Reasons:**
- Genuinely ambiguous merchants ("Purchase At Kac" = gambling)
- One-off transactions (unique payees)
- Data quality issues (missing/incorrect payee names)

**Agent Smith Approach:** Make manual review easy with health check reports showing uncategorized transactions.

### Weekly Review Habit

**Post-migration recommendation:** Review recent transactions weekly for first month.

**Why:** Helps catch:
- Miscategorized transactions
- New merchants needing rules
- Changes in spending patterns

**Agent Smith Feature:** Smart alerts with weekly budget reviews (Phase 7).

---

## Implementation Timelines

### Migration Timeline (Reality vs Plan)

**Planned:** 35 minutes total
**Actual:** 3+ hours over multiple days

**Breakdown:**
- Category structure migration: 10 minutes (as planned)
- Rule recreation: 20 minutes (10 minutes planned - API limitations doubled time)
- Transaction categorization Round 1: 30 minutes
- Transaction categorization Round 2: 45 minutes
- Transaction categorization Round 3: 60 minutes
- Manual cleanup and verification: 90 minutes

**Lesson:** Budget 3-5x estimated time for data migration projects.

**Agent Smith Design:** Incremental onboarding (30-60 minutes initial setup, ongoing refinement).

---

## Key Takeaways for Agent Smith

### What We Built Better

1. **Hybrid Rule Engine:** Local + Platform rules overcome API limitations
2. **Confidence Scoring:** Tiered auto-apply based on pattern strength
3. **Merchant Intelligence:** Learned normalization from transaction history
4. **Health Checks:** Proactive detection of category/rule issues
5. **Template System:** Pre-built rule sets prevent common mistakes

### What We Avoided

1. **Manual rule migration** - Templates and import/export instead
2. **Duplicate categories** - Validation and health checks
3. **Bulk update failures** - Rate limiting, retry logic, batching
4. **Lost context** - Comprehensive backups with metadata
5. **User fatigue** - Incremental categorization, not all-at-once

### Core Principles

✅ **Backup before mutations**
✅ **Dry-run before execute**
✅ **Progress visibility**
✅ **Confidence-based automation**
✅ **User choice over forced automation**
✅ **Learn from transaction history**
✅ **Graceful degradation** (LLM fallback when rules don't match)

---

## Reference

**Original Materials:** Archived from `build/` directory (removed 2025-11-23)

**Full backup available at:** `../budget-smith-backup-20251120_093733/`

**See Also:**
- [Agent Smith Design](2025-11-20-agent-smith-design.md) - Complete system design
- [Unified Rules Guide](../guides/unified-rules-guide.md) - Rule engine documentation
- [Health Check Guide](../guides/health-check-guide.md) - Health scoring system

---

**Last Updated:** 2025-11-23