Lessons Learned from PocketSmith Migration

Date: 2025-11-23
Source: build/ directory reference materials (now archived)

This document captures key insights from a previous PocketSmith category migration project that informed Agent Smith's design.

API Quirks and Workarounds
Category Hierarchy Best Practices
Transaction Categorization Patterns
Merchant Name Normalization
User Experience Lessons

API Quirks and Workarounds

Category Rules API Limitations

Issue: The PocketSmith API does not support updating or deleting category rules.

GET /categories/{id}/category_rules works
POST works (create only)
PUT/PATCH/DELETE return 404 errors

Impact: Rules created via API cannot be modified programmatically.

Agent Smith Solution: Hybrid rule engine with local rules for complex logic, platform rules for simple keywords only.
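One way to picture the split is a small router that sends keyword-only rules to the platform and keeps everything else local. This is a minimal sketch, not Agent Smith's actual engine: the `Rule` shape, its `conditions` field, and `route_rule` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Hypothetical rule shape; not taken from Agent Smith's code."""
    keyword: str
    category: str
    # Extra conditions (amount ranges, account filters, ...) that a plain
    # keyword rule cannot express.
    conditions: dict = field(default_factory=dict)


def route_rule(rule: Rule) -> str:
    """Return "platform" for keyword-only rules, "local" otherwise."""
    # Platform rules cannot be updated or deleted via the API, so this
    # sketch sends only simple keyword rules there.
    return "local" if rule.conditions else "platform"
```

For example, `route_rule(Rule("WOOLWORTHS", "Groceries"))` yields `"platform"`, while a rule carrying any extra condition stays `"local"`.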

Transaction Migration 500 Errors

Issue: Bulk transaction updates sometimes fail with 500 Internal Server Errors.

Root Cause: Likely API rate limiting or server-side stability issues.

Agent Smith Solution:

Implement rate limiting (0.1-0.5s delay between requests)
Batch processing with progress tracking
Retry logic with exponential backoff
Always backup before bulk operations
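The rate limiting and retry steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not Agent Smith's implementation: `ServerError`, `call_with_retry`, and `process_in_order` are hypothetical names, and the `sleep` parameter is injectable only so the logic can be exercised without real delays.

```python
import time


class ServerError(Exception):
    """Stands in for an HTTP 5xx response in this sketch."""


def call_with_retry(func, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on ServerError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except ServerError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller
            # Waits base_delay, then 2x, 4x, ... between attempts
            sleep(base_delay * (2 ** attempt))


def process_in_order(items, update, delay=0.2, sleep=time.sleep):
    """Apply update(item) to each item with a fixed pause between requests."""
    results = []
    for item in items:
        results.append(call_with_retry(lambda: update(item), sleep=sleep))
        sleep(delay)  # Rate limit: within the 0.1-0.5s range noted above
    return results
```

In practice `update` would wrap the API call and raise `ServerError` when the response status is 500 or above.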

Special Characters in Category Names

Issue: Using "&" in category names causes 422 Unprocessable Entity errors.

Workaround: Replace "&" with "and" in all category names.

Example:

❌ "Takeaway & Food Delivery" → 422 error
✅ "Takeaway and Food Delivery" → Success
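The workaround can be applied automatically before any category is created. A minimal sketch, assuming a helper named `sanitize_category_name` (a hypothetical name, not from the original project):

```python
import re


def sanitize_category_name(name: str) -> str:
    """Replace "&" with "and" and collapse any leftover repeated spaces."""
    replaced = re.sub(r"\s*&\s*", " and ", name)
    return re.sub(r"\s+", " ", replaced).strip()
```

Names without "&" pass through unchanged, so the helper is safe to run on every category title.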

Use PUT instead of PATCH

Issue: PATCH for transaction updates is unreliable in the PocketSmith API.

Solution: Always use PUT for transaction updates.

import requests

# ✅ Correct: send the changed fields with PUT
response = requests.put(
    f'https://api.pocketsmith.com/v2/transactions/{txn_id}',
    headers=headers,
    json={'category_id': category_id}
)

# ❌ Avoid (unreliable)
response = requests.patch(...)

Category Hierarchy Best Practices

Parent-Child Structure

Recommendation: Use a two-level hierarchy at most.

12-15 parent categories for broad grouping
2-5 children per parent for specific tracking
Avoid 3+ levels (PocketSmith UI gets cluttered)

Example Structure:

Food and Dining (parent)
├── Groceries
├── Restaurants
├── Takeaway and Food Delivery
└── Coffee Shops

Duplicate Category Detection

Problem: Duplicate categories accumulate over time, causing confusion.

Solution: Before creating categories, check for existing matches:

Flatten nested category structure
Check both exact matches and case-insensitive matches
Check for variations (e.g., "Takeaway" vs "Takeaways")

Agent Smith Implementation: Category validation in health check system.
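The three checks above can be sketched as one lookup. This is an assumption-laden illustration rather than the health check's real code: it assumes each category is a dict with a `title` and an optional `children` list, and the plural handling is a deliberately crude trailing-"s" strip. `flatten_categories` and `find_existing` are hypothetical names.

```python
def flatten_categories(categories):
    """Yield every category in a nested list, parents before children."""
    for category in categories:
        yield category
        yield from flatten_categories(category.get("children", []))


def _match_key(title):
    """Lowercase and drop a trailing "s" so "Takeaway" matches "Takeaways"."""
    key = title.strip().lower()
    return key[:-1] if key.endswith("s") else key


def find_existing(categories, new_title):
    """Return the first existing category that matches new_title, or None."""
    wanted = _match_key(new_title)
    for category in flatten_categories(categories):
        if _match_key(category["title"]) == wanted:
            return category
    return None
```

Calling `find_existing(tree, "takeaway")` before creating a category returns the existing "Takeaways" entry if one is present anywhere in the tree.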

Consolidation Strategy

Insight: Merging duplicate categories is risky:

Requires migrating all associated transactions
Transaction updates can fail (500 errors)
Better to prevent duplicates than merge later

Agent Smith Approach: Template-based setup with validation prevents duplicates upfront.

Transaction Categorization Patterns

Pattern Matching Complexity

Observation: Transaction categorization evolved through multiple rounds:

Round 1: Simple keyword matching (60% coverage)
Round 2: Pattern matching with normalization (80% coverage)
Round 3: User clarifications + edge cases (90% coverage)
Round 4: Manual review of exceptions (95% coverage)

Lesson: Need both automated rules AND user override capability.

Agent Smith Solution: Tiered intelligence modes (Conservative/Smart/Aggressive) with confidence scoring.

Confidence-Based Auto-Apply

Insight: Not all matches are equal:

High confidence (95%+): Auto-apply safe (e.g., "WOOLWORTHS" → Groceries)
Medium confidence (70-94%): Ask user (e.g., "LS DOLLI PL" → Coffee?)
Low confidence (<70%): Always ask (e.g., "Purchase At Kac" → ???)

Agent Smith Implementation:

if confidence >= 90:  # Smart mode threshold
    apply_automatically()
elif confidence >= 70:
    ask_user_for_approval()
else:
    skip_or_manual_review()

Dry-Run Mode is Critical

Lesson: Always preview before bulk operations.

Pattern from migration:

class BulkCategorizer:
    def __init__(self, dry_run=True):  # Default to dry-run!
        self.dry_run = dry_run

    def categorize_transactions(self):
        if self.dry_run:
            # Show what WOULD happen
            return preview
        else:
            # Actually execute
            return results

Agent Smith Implementation: All bulk operations support --mode=dry_run flag.
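A command-line flag like this could be wired up roughly as follows. Only the `dry_run` value comes from the notes above; the `apply` value, `build_parser`, and `run` are assumed names for illustration, and the real CLI may differ.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Bulk categorization (sketch)")
    # "dry_run" comes from the notes above; "apply" is an assumed name for
    # the mode that actually writes changes.
    parser.add_argument("--mode", choices=["dry_run", "apply"], default="dry_run")
    return parser


def run(changes, mode):
    """Return the planned changes in dry_run mode, or apply them otherwise."""
    if mode == "dry_run":
        return [f"WOULD set {txn_id} -> {category}" for txn_id, category in changes]
    applied = []
    for txn_id, category in changes:
        # A real implementation would call the API here
        applied.append((txn_id, category))
    return applied
```

Defaulting `--mode` to `dry_run` mirrors the `dry_run=True` default in the class above: forgetting the flag previews instead of mutating.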

Merchant Name Normalization

Common Payee Patterns

Observations from transaction data:

Location codes: "WOOLWORTHS 1234" → "WOOLWORTHS"
Legal suffixes: "COLES PTY LTD" → "COLES"
Country codes: "UBER AU" → "UBER"
Transaction codes: "PURCHASE NSWxxx123" → "PURCHASE"
Direct debit patterns: "DIRECT DEBIT 12345" → "DIRECT DEBIT"

Agent Smith Patterns:

LOCATION_CODE_PATTERN = r"\s+\d{4,}$"
SUFFIX_PATTERNS = [
    r"\s+PTY\s+LTD$",
    r"\s+LIMITED$",
    r"\s+LTD$",
    r"\s+AU$",
]
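Applied in sequence, these patterns reduce the observed payees to their base names. The sketch below reuses the patterns as listed; `normalize_payee` is a hypothetical function name, and uppercasing before matching is an assumption made so that "Woolworths 1234" and "WOOLWORTHS 1234" normalize identically.

```python
import re

LOCATION_CODE_PATTERN = r"\s+\d{4,}$"
SUFFIX_PATTERNS = [
    r"\s+PTY\s+LTD$",
    r"\s+LIMITED$",
    r"\s+LTD$",
    r"\s+AU$",
]


def normalize_payee(payee: str) -> str:
    """Uppercase the payee, then strip a trailing location code and suffixes."""
    name = payee.strip().upper()
    name = re.sub(LOCATION_CODE_PATTERN, "", name)
    # Repeat until stable so stacked suffixes ("X PTY LTD AU") all come off
    changed = True
    while changed:
        changed = False
        for pattern in SUFFIX_PATTERNS:
            stripped = re.sub(pattern, "", name)
            if stripped != name:
                name, changed = stripped, True
    return name
```

The patterns as listed do not cover the transaction-code case ("PURCHASE NSWxxx123"), so that form would need an additional pattern.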

Merchant Variation Grouping

Problem: Same merchant appears with multiple names:

"woolworths"
"WOOLWORTHS PTY LTD"
"Woolworths 1234"
"WOOLWORTHS SUPERMARKETS"

Solution: Learn canonical names from transaction history.

Agent Smith Implementation: MerchantNormalizer.learn_from_transactions() in scripts/utils/merchant_normalizer.py:101-130
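As a rough illustration of the idea (not the actual `MerchantNormalizer.learn_from_transactions()` logic, which is not reproduced here), one simple heuristic treats the shortest name seen as the canonical label for every longer name that begins with it. `learn_canonical_names` is a hypothetical function.

```python
def learn_canonical_names(payees):
    """Map each raw payee to the shortest seen name that prefixes it."""
    # Compare case-insensitively and visit shorter names first
    upper = sorted({p.strip().upper() for p in payees}, key=len)
    canonical = []  # Labels discovered so far
    mapping = {}
    for name in upper:
        # Reuse a label when it is a whole-word prefix of this name
        label = next(
            (c for c in canonical if name == c or name.startswith(c + " ")),
            None,
        )
        if label is None:
            canonical.append(name)
            label = name
        mapping[name] = label
    return {p: mapping[p.strip().upper()] for p in payees}
```

With the four variations listed above, every entry maps to "WOOLWORTHS". The heuristic depends on the bare name appearing somewhere in the history; without it, the longer variants would each become their own label.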

User Experience Lessons

Backups are Non-Negotiable

Critical Lesson: ALWAYS backup before mutations.

Migration practice:

def categorize_transactions(self):
    # Step 1: Always backup first
    self.backup_transactions()

    # Step 2: Then execute
    self.apply_changes()

Agent Smith Policy: Automatic backups before all mutation operations, tracked in backups/ directory.
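A backup step of this kind might look like the following. It is a minimal sketch under assumptions: the file naming, the metadata fields, and the function names `backup_transactions` and `apply_with_backup` are illustrative, and only the `backups/` directory comes from the policy above.

```python
import json
from datetime import datetime
from pathlib import Path


def backup_transactions(transactions, backup_dir="backups"):
    """Write transactions plus basic metadata to a timestamped JSON file."""
    directory = Path(backup_dir)
    directory.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    path = directory / f"transactions_{stamp}.json"
    payload = {
        "created_at": stamp,
        "count": len(transactions),
        "transactions": transactions,
    }
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def apply_with_backup(transactions, apply_changes, backup_dir="backups"):
    """Back up first; apply_changes runs only if the backup did not raise."""
    backup_path = backup_transactions(transactions, backup_dir)
    return backup_path, apply_changes(transactions)
```

Because the backup runs first and any exception propagates, a failed write stops the mutation before it starts.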

Progress Visibility Matters

Problem: Long-running operations feel broken without progress indicators.

Solution: Show progress every N iterations:

total = len(transactions)

for i, txn in enumerate(transactions, 1):
    # Process transaction

    if i % 100 == 0:
        print(f"Progress: {i}/{total} ({i/total*100:.1f}%)")

Agent Smith Implementation: All batch operations show real-time progress.

Manual Cleanup is Inevitable

Reality Check: Even after 5+ rounds of automated categorization, ~5% of transactions needed manual review.

Reasons:

Genuinely ambiguous merchants ("Purchase At Kac" = gambling)
One-off transactions (unique payees)
Data quality issues (missing/incorrect payee names)

Agent Smith Approach: Make manual review easy with health check reports showing uncategorized transactions.

Weekly Review Habit

Post-migration recommendation: Review recent transactions weekly for first month.

Why: Helps catch:

Miscategorized transactions
New merchants needing rules
Changes in spending patterns

Agent Smith Feature: Smart alerts with weekly budget reviews (Phase 7).

Implementation Timelines

Migration Timeline (Reality vs Plan)

Planned: 35 minutes total
Actual: 3+ hours over multiple days

Breakdown:

Category structure migration: 10 minutes (as planned)
Rule recreation: 20 minutes (10 minutes planned - API limitations doubled time)
Transaction categorization Round 1: 30 minutes
Transaction categorization Round 2: 45 minutes
Transaction categorization Round 3: 60 minutes
Manual cleanup and verification: 90 minutes

Lesson: Budget 3-5x estimated time for data migration projects.
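Summing the itemized breakdown makes the overrun concrete. The figures below are copied from the list above; only the variable names are introduced here.

```python
planned_total = 35  # minutes, from the plan
actual_minutes = {
    "Category structure migration": 10,
    "Rule recreation": 20,
    "Transaction categorization Round 1": 30,
    "Transaction categorization Round 2": 45,
    "Transaction categorization Round 3": 60,
    "Manual cleanup and verification": 90,
}

actual_total = sum(actual_minutes.values())
overrun = actual_total / planned_total

print(f"Actual: {actual_total} min ({actual_total / 60:.2f} h)")
print(f"Overrun: {overrun:.1f}x the plan")
```

The itemized steps add up to 255 minutes, a little over 7x the plan, so the 3-5x multiplier is best treated as a minimum for this kind of work.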

Agent Smith Design: Incremental onboarding (30-60 minutes initial setup, ongoing refinement).

Key Takeaways for Agent Smith

What We Built Better

Hybrid Rule Engine: Local + Platform rules overcome API limitations
Confidence Scoring: Tiered auto-apply based on pattern strength
Merchant Intelligence: Learned normalization from transaction history
Health Checks: Proactive detection of category/rule issues
Template System: Pre-built rule sets prevent common mistakes

What We Avoided

Manual rule migration - Templates and import/export instead
Duplicate categories - Validation and health checks
Bulk update failures - Rate limiting, retry logic, batching
Lost context - Comprehensive backups with metadata
User fatigue - Incremental categorization, not all-at-once

Core Principles

✅ Backup before mutations
✅ Dry-run before execute
✅ Progress visibility
✅ Confidence-based automation
✅ User choice over forced automation
✅ Learn from transaction history
✅ Graceful degradation (LLM fallback when rules don't match)

Reference

Original Materials: Archived from build/ directory (removed 2025-11-23)

Full backup available at: ../budget-smith-backup-20251120_093733/

See Also:

Agent Smith Design - Complete system design
Unified Rules Guide - Rule engine documentation
Health Check Guide - Health scoring system

Last Updated: 2025-11-23
