Files
gh-slamb2k-agent-smith-agen…/skills/agent-smith/references/LESSONS_LEARNED.md
2025-11-30 08:57:54 +08:00

9.4 KiB

Lessons Learned from PocketSmith Migration

Date: 2025-11-23 Source: build/ directory reference materials (now archived)

This document captures key insights from a previous PocketSmith category migration project that informed Agent Smith's design.


Table of Contents

  1. API Quirks and Workarounds
  2. Category Hierarchy Best Practices
  3. Transaction Categorization Patterns
  4. Merchant Name Normalization
  5. User Experience Lessons

API Quirks and Workarounds

Category Rules API Limitations

Issue: PocketSmith API does not support updating or deleting category rules.

  • GET /categories/{id}/category_rules works
  • POST works (create only)
  • PUT/PATCH/DELETE return 404 errors

Impact: Rules created via API cannot be modified programmatically.

Agent Smith Solution: Hybrid rule engine with local rules for complex logic, platform rules for simple keywords only.

Transaction Migration 500 Errors

Issue: Bulk transaction updates sometimes fail with 500 Internal Server Errors.

Root Cause: Likely API rate limiting or server-side stability issues.

Agent Smith Solution:

  • Implement rate limiting (0.1-0.5s delay between requests)
  • Batch processing with progress tracking
  • Retry logic with exponential backoff
  • Always backup before bulk operations

Special Characters in Category Names

Issue: Using "&" in category names causes 422 Unprocessable Entity errors.

Workaround: Replace "&" with "and" in all category names.

Example:

  • "Takeaway & Food Delivery" → 422 error
  • "Takeaway and Food Delivery" → Success

Use PUT instead of PATCH

Issue: PATCH for transaction updates is unreliable in PocketSmith API.

Solution: Always use PUT for transaction updates.

# ✅ Correct
response = requests.put(
    f'https://api.pocketsmith.com/v2/transactions/{txn_id}',
    headers=headers,
    json={'category_id': category_id}
)

# ❌ Avoid (unreliable)
response = requests.patch(...)

Category Hierarchy Best Practices

Parent-Child Structure

Recommendation: Use 2-level hierarchy maximum.

  • 12-15 parent categories for broad grouping
  • 2-5 children per parent for specific tracking
  • Avoid 3+ levels (PocketSmith UI gets cluttered)

Example Structure:

Food & Dining (parent)
├── Groceries
├── Restaurants
├── Takeaway and Food Delivery
└── Coffee Shops

Duplicate Category Detection

Problem: Duplicate categories accumulate over time, causing confusion.

Solution: Before creating categories, check for existing matches:

  1. Flatten nested category structure
  2. Check both exact matches and case-insensitive matches
  3. Check for variations (e.g., "Takeaway" vs "Takeaways")

Agent Smith Implementation: Category validation in health check system.

Consolidation Strategy

Insight: Merging duplicate categories is risky:

  • Requires migrating all associated transactions
  • Transaction updates can fail (500 errors)
  • Better to prevent duplicates than merge later

Agent Smith Approach: Template-based setup with validation prevents duplicates upfront.


Transaction Categorization Patterns

Pattern Matching Complexity

Observation: Transaction categorization evolved through multiple rounds:

  • Round 1: Simple keyword matching (60% coverage)
  • Round 2: Pattern matching with normalization (80% coverage)
  • Round 3: User clarifications + edge cases (90% coverage)
  • Round 4: Manual review of exceptions (95% coverage)

Lesson: Need both automated rules AND user override capability.

Agent Smith Solution: Tiered intelligence modes (Conservative/Smart/Aggressive) with confidence scoring.

Confidence-Based Auto-Apply

Insight: Not all matches are equal:

  • High confidence (95%+): Auto-apply safe (e.g., "WOOLWORTHS" → Groceries)
  • Medium confidence (70-94%): Ask user (e.g., "LS DOLLI PL" → Coffee?)
  • Low confidence (<70%): Always ask (e.g., "Purchase At Kac" → ???)

Agent Smith Implementation:

if confidence >= 90:  # Smart mode threshold
    apply_automatically()
elif confidence >= 70:
    ask_user_for_approval()
else:
    skip_or_manual_review()

Dry-Run Mode is Critical

Lesson: Always preview before bulk operations.

Pattern from migration:

class BulkCategorizer:
    def __init__(self, dry_run=True):  # Default to dry-run!
        self.dry_run = dry_run

    def categorize_transactions(self):
        if self.dry_run:
            # Show what WOULD happen
            return preview
        else:
            # Actually execute
            return results

Agent Smith Implementation: All bulk operations support --mode=dry_run flag.


Merchant Name Normalization

Common Payee Patterns

Observations from transaction data:

  1. Location codes: "WOOLWORTHS 1234" → "WOOLWORTHS"
  2. Legal suffixes: "COLES PTY LTD" → "COLES"
  3. Country codes: "UBER AU" → "UBER"
  4. Transaction codes: "PURCHASE NSWxxx123" → "PURCHASE"
  5. Direct debit patterns: "DIRECT DEBIT 12345" → "DIRECT DEBIT"

Agent Smith Patterns:

LOCATION_CODE_PATTERN = r"\s+\d{4,}$"
SUFFIX_PATTERNS = [
    r"\s+PTY\s+LTD$",
    r"\s+LIMITED$",
    r"\s+LTD$",
    r"\s+AU$",
]

Merchant Variation Grouping

Problem: Same merchant appears with multiple names:

  • "woolworths"
  • "WOOLWORTHS PTY LTD"
  • "Woolworths 1234"
  • "WOOLWORTHS SUPERMARKETS"

Solution: Learn canonical names from transaction history.

Agent Smith Implementation: MerchantNormalizer.learn_from_transactions() in scripts/utils/merchant_normalizer.py:101-130


User Experience Lessons

Backups are Non-Negotiable

Critical Lesson: ALWAYS backup before mutations.

Migration practice:

def categorize_transactions(self):
    # Step 1: Always backup first
    self.backup_transactions()

    # Step 2: Then execute
    self.apply_changes()

Agent Smith Policy: Automatic backups before all mutation operations, tracked in backups/ directory.

Progress Visibility Matters

Problem: Long-running operations feel broken without progress indicators.

Solution: Show progress every N iterations:

for i, txn in enumerate(transactions, 1):
    # Process transaction

    if i % 100 == 0:
        print(f"Progress: {i}/{total} ({i/total*100:.1f}%)")

Agent Smith Implementation: All batch operations show real-time progress.

Manual Cleanup is Inevitable

Reality Check: Even after 5+ rounds of automated categorization, ~5% of transactions needed manual review.

Reasons:

  • Genuinely ambiguous merchants ("Purchase At Kac" = gambling)
  • One-off transactions (unique payees)
  • Data quality issues (missing/incorrect payee names)

Agent Smith Approach: Make manual review easy with health check reports showing uncategorized transactions.

Weekly Review Habit

Post-migration recommendation: Review recent transactions weekly for first month.

Why: Helps catch:

  • Miscategorized transactions
  • New merchants needing rules
  • Changes in spending patterns

Agent Smith Feature: Smart alerts with weekly budget reviews (Phase 7).


Implementation Timelines

Migration Timeline (Reality vs Plan)

Planned: 35 minutes total Actual: 3+ hours over multiple days

Breakdown:

  • Category structure migration: 10 minutes (as planned)
  • Rule recreation: 20 minutes (10 minutes planned - API limitations doubled time)
  • Transaction categorization Round 1: 30 minutes
  • Transaction categorization Round 2: 45 minutes
  • Transaction categorization Round 3: 60 minutes
  • Manual cleanup and verification: 90 minutes

Lesson: Budget 3-5x estimated time for data migration projects.

Agent Smith Design: Incremental onboarding (30-60 minutes initial setup, ongoing refinement).


Key Takeaways for Agent Smith

What We Built Better

  1. Hybrid Rule Engine: Local + Platform rules overcome API limitations
  2. Confidence Scoring: Tiered auto-apply based on pattern strength
  3. Merchant Intelligence: Learned normalization from transaction history
  4. Health Checks: Proactive detection of category/rule issues
  5. Template System: Pre-built rule sets prevent common mistakes

What We Avoided

  1. Manual rule migration - Templates and import/export instead
  2. Duplicate categories - Validation and health checks
  3. Bulk update failures - Rate limiting, retry logic, batching
  4. Lost context - Comprehensive backups with metadata
  5. User fatigue - Incremental categorization, not all-at-once

Core Principles

Backup before mutations Dry-run before execute Progress visibility Confidence-based automation User choice over forced automation Learn from transaction history Graceful degradation (LLM fallback when rules don't match)


Reference

Original Materials: Archived from build/ directory (removed 2025-11-23)

Full backup available at: ../budget-smith-backup-20251120_093733/

See Also:


Last Updated: 2025-11-23