gh-slamb2k-agent-smith-agen…/skills/agent-smith/references/unified-rules-guide.md
2025-11-30 08:57:54 +08:00

Unified Rules Guide - Categories & Labels

Overview

Agent Smith uses a unified YAML rule system that handles both transaction categorization and labeling in a single, easy-to-read file.

Key Features:

  • YAML format - Easy to read and edit
  • Two-phase execution - Categories first, then labels
  • Pattern matching - Regex patterns with exclusions
  • Confidence scoring - 0-100% confidence for auto-apply logic
  • Smart labeling - Context-aware labels (account, category, amount)
  • LLM fallback - AI categorization when rules don't match
  • Template system - Pre-built rule sets for common household types

Table of Contents

  1. Quick Start
  2. Rule Types
  3. Execution Flow
  4. Intelligence Modes
  5. LLM Integration
  6. Advanced Patterns
  7. Best Practices
  8. Operational Modes
  9. Update Strategies
  10. Template System
  11. Migration Guide
  12. Troubleshooting

Quick Start

1. Choose a Template

Start with a pre-built template that matches your household type:

uv run python scripts/setup/template_selector.py

Available templates:

  • Simple - Single person, no shared expenses
  • Separated Families - Divorced/separated parents with shared custody
  • Shared Household - Couples, roommates, or families
  • Advanced - Business owners, investors, complex finances

2. Customize Your Rules

Edit data/rules.yaml to match your specific needs:

rules:
  # Add your first category rule
  - type: category
    name: Coffee → Dining Out
    patterns: [STARBUCKS, COSTA, CAFE]
    category: Food & Dining > Dining Out
    confidence: 95

  # Add your first label rule
  - type: label
    name: Personal Coffee
    when:
      categories: [Dining Out]
      accounts: [Personal]
    labels: [Discretionary, Personal]

3. Test Your Rules

Always test before applying to real transactions:

# Dry run - preview what would happen
uv run python scripts/operations/batch_categorize.py --mode=dry_run --period=2025-11

# Validate - see what would change on existing categorizations
uv run python scripts/operations/batch_categorize.py --mode=validate --period=2025-11

# Apply - actually categorize transactions
uv run python scripts/operations/batch_categorize.py --mode=apply --period=2025-11

4. Review and Refine

Check the results and refine your rules:

# See categorization summary
/agent-smith-analyze spending --period=2025-11

# Check uncategorized transactions
/agent-smith-categorize --mode=smart --show-uncategorized

Rule Types

Category Rules

Categorize transactions based on payee patterns, amounts, and accounts.

Full Syntax:

- type: category
  name: Rule Name (for logging/display)
  patterns: [PATTERN1, PATTERN2, PATTERN3]  # OR logic
  exclude_patterns: [EXCLUDE1, EXCLUDE2]    # Optional
  category: Category > Subcategory
  confidence: 95                             # 0-100%
  accounts: [Account1, Account2]             # Optional filter
  amount_operator: ">"                       # Optional: >, <, >=, <=, ==, !=
  amount_value: 100.00                       # Required if amount_operator set

Field Descriptions:

| Field | Required | Type | Description |
|---|---|---|---|
| type | Yes | String | Must be "category" |
| name | Yes | String | Descriptive name for logs (e.g., "WOOLWORTHS → Groceries") |
| patterns | Yes | List[String] | Payee keywords to match (case-insensitive, OR logic) |
| category | Yes | String | Category to assign (can include parent: "Parent > Child") |
| confidence | No | Integer | Confidence score 0-100% (default: 90) |
| exclude_patterns | No | List[String] | Patterns to exclude from match |
| accounts | No | List[String] | Only match transactions in these accounts |
| amount_operator | No | String | Amount comparison: >, <, >=, <=, ==, != |
| amount_value | No | Number | Amount threshold (required if operator set) |
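The field table above can be turned into a small pre-flight check. The following is a minimal sketch, not Agent Smith's own validator: `validate_category_rule` is a hypothetical helper that works on a rule already loaded into a Python dict, and it only checks the constraints stated in the table.

```python
VALID_OPERATORS = {">", "<", ">=", "<=", "==", "!="}


def validate_category_rule(rule: dict) -> list:
    """Return a list of problems found in one category rule (empty if valid)."""
    errors = []
    if rule.get("type") != "category":
        errors.append('type must be "category"')
    # Required fields from the table above
    for field in ("name", "patterns", "category"):
        if not rule.get(field):
            errors.append(f"missing required field: {field}")
    # confidence is optional and defaults to 90
    confidence = rule.get("confidence", 90)
    if not isinstance(confidence, int) or not 0 <= confidence <= 100:
        errors.append("confidence must be an integer from 0 to 100")
    # amount_value is required whenever amount_operator is set
    amount_operator = rule.get("amount_operator")
    if amount_operator is not None:
        if amount_operator not in VALID_OPERATORS:
            errors.append(f"unknown amount_operator: {amount_operator}")
        if "amount_value" not in rule:
            errors.append("amount_value is required when amount_operator is set")
    return errors
```

Running a check like this before a dry run catches missing fields early, before any transaction is evaluated.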

Examples:

# Basic pattern matching
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS, COLES, ALDI]
  category: Food & Dining > Groceries
  confidence: 95

# With exclusions (exclude UBER EATS from UBER)
- type: category
  name: UBER → Transport
  patterns: [UBER]
  exclude_patterns: [UBER EATS]
  category: Transport
  confidence: 90

# Account-specific rule
- type: category
  name: Work Laptop Purchase
  patterns: [APPLE STORE, MICROSOFT STORE]
  accounts: [Work Credit Card]
  category: Work > Equipment
  confidence: 90

# Amount-based rule (large purchases)
- type: category
  name: Large Electronics
  patterns: [JB HI-FI, HARVEY NORMAN]
  category: Shopping > Electronics
  confidence: 85
  amount_operator: ">"
  amount_value: 500

Label Rules

Apply labels to transactions based on their category, account, amount, or categorization status.

Full Syntax:

- type: label
  name: Label Rule Name
  when:
    categories: [Category1, Category2]       # Optional (OR logic)
    accounts: [Account1, Account2]           # Optional (OR logic)
    amount_operator: ">"                     # Optional
    amount_value: 100.00                     # Required if operator set
    uncategorized: true                      # Optional (true to match uncategorized)
  labels: [Label1, Label2, Label3]

Important: All when conditions must match (AND logic), but values within each list use OR logic.

Field Descriptions:

| Field | Required | Type | Description |
|---|---|---|---|
| type | Yes | String | Must be "label" |
| name | Yes | String | Descriptive name for logs |
| when | Yes | Object | Conditions that must ALL match |
| when.categories | No | List[String] | Match if category contains any of these (OR) |
| when.accounts | No | List[String] | Match if account name contains any of these (OR) |
| when.amount_operator | No | String | Amount comparison: >, <, >=, <=, ==, != |
| when.amount_value | No | Number | Amount threshold |
| when.uncategorized | No | Boolean | Match uncategorized transactions (true/false) |
| labels | Yes | List[String] | Labels to apply when conditions match |

Examples:

# Category-based labeling
- type: label
  name: Tax Deductible Work Expenses
  when:
    categories: [Work, Professional Development, Home Office]
  labels: [Tax Deductible, "ATO: D1"]  # Quoted: an unquoted "key: value" inside [...] parses as a mapping

# Account-based labeling
- type: label
  name: Shared Household Expense
  when:
    accounts: [Shared Bills, Joint Account]
  labels: [Shared Expense, Needs Reconciliation]

# Combined conditions (category AND account)
- type: label
  name: Personal Coffee Spending
  when:
    categories: [Dining Out]
    accounts: [Personal]
  labels: [Discretionary, Personal]

# Amount-based labeling
- type: label
  name: Large Purchase Flag
  when:
    amount_operator: ">"
    amount_value: 500
  labels: [Large Purchase, Review Required]

# Flag uncategorized transactions
- type: label
  name: Needs Categorization
  when:
    uncategorized: true
  labels: [Uncategorized, Needs Review]

# Multi-condition (category AND account AND amount)
- type: label
  name: Large Shared Grocery Trip
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
    amount_operator: ">"
    amount_value: 200
  labels: [Shared Expense, Large Purchase, Needs Approval]
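The AND-across-conditions, OR-within-lists behaviour can be sketched as follows. This is a hedged illustration rather than the engine's implementation: `label_rule_matches` is a hypothetical helper, "contains" is read as a case-insensitive substring test, and amounts are compared by absolute value (an assumption).

```python
import operator
from typing import Optional

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def label_rule_matches(when: dict, category: Optional[str],
                       account: str, amount: float) -> bool:
    """Every condition present in `when` must hold (AND); lists use OR."""
    if "categories" in when:
        if category is None or not any(
                c.lower() in category.lower() for c in when["categories"]):
            return False
    if "accounts" in when:
        if not any(a.lower() in account.lower() for a in when["accounts"]):
            return False
    if "amount_operator" in when:
        compare = OPS[when["amount_operator"]]
        if not compare(abs(amount), when["amount_value"]):
            return False
    if "uncategorized" in when:
        # uncategorized: true matches only transactions without a category
        if when["uncategorized"] != (category is None):
            return False
    return True
```

For the "Large Shared Grocery Trip" rule, a $250 grocery purchase from Shared Bills matches, but the same purchase from a personal account does not, because the account condition fails.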

Execution Flow

The unified rule engine uses two-phase execution to ensure labels can depend on categories assigned in the same run.

Phase 1: Categorization

  1. Iterate through all category rules in order
  2. For each transaction, find the FIRST matching rule
  3. Apply the category and confidence score
  4. Short-circuit: Stop at first match (no further category rules evaluated)
  5. Update transaction with matched category for Phase 2

Rule Order Matters! Specific rules should come before general rules.

Phase 2: Labeling

  1. Take the transaction, now carrying the category assigned in Phase 1
  2. Check ALL label rules
  3. Apply labels from EVERY matching rule (additive)
  4. Deduplicate and sort labels

All Matches Applied! Unlike categories, ALL matching label rules are applied.

Example Execution

Transaction: WOOLWORTHS in Shared Bills account, amount -$127.50

Phase 1 - Category Rules:

# Rule 1 matches!
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS]
  category: Food & Dining > Groceries
  confidence: 95

# Rule 2 would also match but is NOT evaluated (short-circuit)
- type: category
  name: All Food Purchases
  patterns: [WOOLWORTHS, COLES, RESTAURANT]
  category: Food
  confidence: 80

Result after Phase 1: Category = "Food & Dining > Groceries", Confidence = 95

Phase 2 - Label Rules:

# Rule 1 matches (category: Groceries, account: Shared Bills)
- type: label
  name: Shared Grocery Expense
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
  labels: [Shared Expense, Essential]

# Rule 2 matches (amount > 100)
- type: label
  name: Large Purchase
  when:
    amount_operator: ">"
    amount_value: 100
  labels: [Large Purchase, Review]

# Rule 3 does NOT match (category doesn't contain "Dining Out")
- type: label
  name: Discretionary Dining
  when:
    categories: [Dining Out]
  labels: [Discretionary]

Final Result:

  • Category: Food & Dining > Groceries
  • Confidence: 95
  • Labels: Essential, Large Purchase, Review, Shared Expense (sorted, deduplicated)
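The two phases above can be sketched end to end in a few lines. This is a simplified illustration, not the real rule engine: the function names are hypothetical, matching is plain substring matching, and amounts are compared by absolute value (the worked example only matches the "amount > 100" rule under that reading).

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def categorize(txn: dict, rules: list):
    """Phase 1: the first matching category rule wins (short-circuit)."""
    for rule in rules:
        if rule["type"] != "category":
            continue
        if any(p.upper() in txn["payee"].upper() for p in rule["patterns"]):
            return rule["category"], rule.get("confidence", 90)
    return None, None


def collect_labels(txn: dict, category, rules: list) -> list:
    """Phase 2: every matching label rule contributes its labels."""
    labels = set()
    for rule in rules:
        if rule["type"] != "label":
            continue
        when = rule["when"]
        cats = when.get("categories")
        if cats and not (category and any(c in category for c in cats)):
            continue
        accounts = when.get("accounts")
        if accounts and not any(a in txn["account"] for a in accounts):
            continue
        op = when.get("amount_operator")
        if op and not OPS[op](abs(txn["amount"]), when["amount_value"]):
            continue
        labels.update(rule["labels"])
    return sorted(labels)  # a set deduplicates; sorted() gives a stable order


def run_rules(txn: dict, rules: list) -> dict:
    category, confidence = categorize(txn, rules)
    return {"category": category, "confidence": confidence,
            "labels": collect_labels(txn, category, rules)}
```

Because labeling runs after categorization, a label rule can rely on a category that was assigned moments earlier in the same run.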

Intelligence Modes

Agent Smith has three intelligence modes that control auto-apply behavior based on confidence scores.

Conservative Mode

Never auto-applies - always asks user for confirmation.

Confidence Level: ANY
Action: Ask user for approval
Use when: Learning the system, want full control

Example:

Transaction: STARBUCKS -$6.50
Rule match: "Dining Out" (95% confidence)
→ [Ask User] Apply category "Dining Out"?
  [Yes] [No] [Edit]

Smart Mode (Default)

Balanced approach - auto-applies high confidence, asks for medium, skips low.

Confidence ≥ 90%:  Auto-apply without asking
Confidence 70-89%: Ask user for approval (LLM validates first)
Confidence < 70%:  Skip (don't categorize)

Example:

Transaction: UBER -$25.00
Rule match: "Transport" (95% confidence)
→ [Auto-applied] Category: Transport

Transaction: UBER MEDICAL CENTRE -$80
Rule match: "UBER → Transport" (75% confidence)
→ [LLM Validates] This looks like medical, not transport
→ [Suggests] Medical & Healthcare (90% confidence)
→ [Auto-applied] Category: Medical & Healthcare

Aggressive Mode

More permissive - auto-applies medium-high confidence, asks for medium-low.

Confidence ≥ 80%: Auto-apply without asking
Confidence 50-79%: Ask user for approval
Confidence < 50%: Skip (don't categorize)

Example:

Transaction: ACME WIDGETS -$245.00
Rule match: "Business Supplies" (82% confidence)
→ [Auto-applied] Category: Business Supplies
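The three modes reduce to a pair of thresholds each. The sketch below encodes the numbers given above; the function name `decide` and the returned action strings are illustrative choices, not part of Agent Smith's API.

```python
# (auto-apply threshold, ask threshold) per mode, taken from the tables above
THRESHOLDS = {
    "smart": (90, 70),
    "aggressive": (80, 50),
}


def decide(mode: str, confidence: int) -> str:
    """Map a confidence score to "auto_apply", "ask", or "skip"."""
    if mode == "conservative":
        return "ask"  # conservative never auto-applies, whatever the score
    auto_threshold, ask_threshold = THRESHOLDS[mode]
    if confidence >= auto_threshold:
        return "auto_apply"
    if confidence >= ask_threshold:
        return "ask"
    return "skip"
```

For example, an 82% match is auto-applied in aggressive mode but only prompts the user in smart mode.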

Setting the Mode

In command:

/agent-smith-categorize --mode=smart

In environment (.env):

DEFAULT_INTELLIGENCE_MODE=smart

In code:

from scripts.workflows.categorization import CategorizationWorkflow

workflow = CategorizationWorkflow(
    client=client,
    mode="smart"  # conservative, smart, or aggressive
)

LLM Integration

When rule-based categorization fails, Agent Smith falls back to AI-powered categorization using Claude.

Fallback Categorization

When no rule matches, Agent Smith asks the LLM to suggest a category.

Flow:

  1. No category rule matches transaction
  2. Build LLM prompt with:
    • Full category hierarchy
    • Transaction details (payee, amount, date)
    • Intelligence mode guidance
  3. LLM suggests category with confidence and reasoning
  4. Apply intelligence mode thresholds:
    • High confidence → Auto-apply (or ask in conservative mode)
    • Medium confidence → Ask user
    • Low confidence → Skip

Example:

Transaction: ACME WIDGETS LTD -$245.00
No rule match
→ [LLM] Analyzing transaction...
→ [LLM] Suggests: Business Supplies (85% confidence)
   Reasoning: "ACME WIDGETS appears to be a business supplier based on
   naming convention and typical transaction amount."
→ [Smart Mode] 85% is above ask threshold (70%) but below auto (90%)
→ [Ask User] Apply category "Business Supplies"?
   [Yes] [No] [Create Rule]

Validation

Medium-confidence rule matches (70-89% in smart mode) are validated by the LLM to catch edge cases.

Flow:

  1. Rule matches with medium confidence
  2. Build validation prompt with:
    • Transaction details
    • Suggested category
    • Rule confidence
  3. LLM responds: CONFIRM or REJECT
    • CONFIRM: Can upgrade confidence → auto-apply
    • REJECT: Suggests alternative category
  4. Apply validated result

Example:

Transaction: UBER MEDICAL CENTRE -$80
Rule match: "UBER → Transport" (75% confidence)
→ [LLM] Validating categorization...
→ [LLM] REJECT - This appears to be a medical facility, not transport
   Suggests: Medical & Healthcare (90% confidence)
→ [Smart Mode] 90% ≥ auto-apply threshold
→ [Auto-applied] Category: Medical & Healthcare
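The validation flow above might be wired up roughly like this. Everything here is an assumption made for illustration: `validate_match`, the shape of the verdict dict, and the stand-in `fake_llm` are hypothetical, and the real prompt and response format are not shown in this guide. The sketch also assumes that a CONFIRM lifts the confidence to the auto-apply threshold.

```python
from typing import Callable, Tuple


def validate_match(txn: dict, category: str, confidence: int,
                   ask_llm: Callable[[dict, str, int], dict],
                   auto_threshold: int = 90) -> Tuple[str, int]:
    """Return the (category, confidence) to use after LLM validation."""
    verdict = ask_llm(txn, category, confidence)
    if verdict["verdict"] == "CONFIRM":
        # Assumption: a confirmed match is raised to the auto-apply threshold
        return category, max(confidence, auto_threshold)
    # REJECT: use the alternative category suggested by the LLM
    return verdict["category"], verdict["confidence"]


def fake_llm(txn: dict, category: str, confidence: int) -> dict:
    """Stand-in for the real LLM call, for demonstration only."""
    if "MEDICAL" in txn["payee"].upper() and category == "Transport":
        return {"verdict": "REJECT",
                "category": "Medical & Healthcare", "confidence": 90}
    return {"verdict": "CONFIRM"}
```

Passing the LLM call in as a function keeps the decision logic testable without network access.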

Learning from LLM Results

After the LLM categorizes transactions, Agent Smith offers to create rules for future use.

Flow:

  1. LLM categorizes N transactions with same merchant
  2. Detect pattern: Same payee → Same category
  3. Suggest rule creation
  4. User approves, edits, or declines
  5. If approved: Add rule to data/rules.yaml

Example:

LLM categorized 12 "ACME WIDGETS" transactions as "Business Supplies"

Suggested rule:
  - type: category
    name: ACME WIDGETS → Business Supplies
    patterns: [ACME WIDGETS]
    category: Business Supplies
    confidence: 90

[Create Rule] [Edit Rule] [Decline]

→ User selects [Create Rule]
→ Rule added to data/rules.yaml
→ Future ACME WIDGETS transactions auto-categorized (90% confidence)
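Pattern detection for rule suggestions can be approximated by counting categories per payee. This is a minimal sketch under assumptions: `suggest_rules` is a hypothetical helper, the input is a list of (payee, category) pairs from past LLM results, and the minimum of three consistent results is an arbitrary illustrative value for the "N transactions" mentioned above.

```python
from collections import Counter, defaultdict


def suggest_rules(llm_results, min_count=3, confidence=90):
    """Suggest a category rule for each payee the LLM categorized consistently."""
    categories_by_payee = defaultdict(Counter)
    for payee, category in llm_results:
        categories_by_payee[payee.upper()][category] += 1

    suggestions = []
    for payee, counts in sorted(categories_by_payee.items()):
        if len(counts) != 1:
            continue  # mixed categories for one payee: no safe rule to suggest
        category, seen = next(iter(counts.items()))
        if seen >= min_count:
            suggestions.append({
                "type": "category",
                "name": f"{payee} → {category}",
                "patterns": [payee],
                "category": category,
                "confidence": confidence,
            })
    return suggestions
```

Payees with mixed results (for example a marketplace categorized as both Shopping and Groceries) are deliberately skipped, since a single pattern rule would miscategorize some of them.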

Advanced Patterns

Cross-Category Labels

Apply the same label to multiple categories:

# Tax deductible categories
- type: label
  name: ATO Tax Deductible
  when:
    categories: [Work, Professional Development, Home Office, Software]
  labels: [Tax Deductible, "ATO: D1"]  # Quoted: an unquoted "key: value" inside [...] parses as a mapping

# Large purchases across all categories
- type: label
  name: Large Purchase Alert
  when:
    amount_operator: ">"
    amount_value: 500
  labels: [Large Purchase, Review Required]

Account-Based Workflows

Different labels for same category in different accounts:

# Same category rule for all accounts
- type: category
  name: Transport
  patterns: [UBER, LYFT, TAXI]
  category: Transport
  confidence: 90

# Personal transport
- type: label
  name: Personal Transport
  when:
    categories: [Transport]
    accounts: [Personal]
  labels: [Personal, Discretionary]

# Work transport (reimbursable)
- type: label
  name: Work Transport
  when:
    categories: [Transport]
    accounts: [Work, Personal]  # Can be from either account
    amount_operator: ">"
    amount_value: 20            # But large amounts suggest work trips
  labels: [Work Related, Reimbursable]

Shared Household Expense Tracking

Track who paid for shared expenses:

# Shared groceries
- type: category
  name: Shared Groceries
  patterns: [WOOLWORTHS, COLES]
  accounts: [Shared Bills, Joint Account]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  name: Shared Essential
  when:
    categories: [Groceries]
    accounts: [Shared Bills, Joint Account]
  labels: [Shared Expense, Essential, Needs Reconciliation]

# Large shared purchases need approval
- type: label
  name: Needs Approval
  when:
    accounts: [Shared Bills, Joint Account]
    amount_operator: ">"
    amount_value: 150
  labels: [Needs Approval, Review Required]

Tax Deductible Tracking

Flag potential tax deductions with ATO codes:

# Work-related expenses
- type: label
  name: Work Deduction - D1
  when:
    categories: [Work, Office Supplies, Professional Development]
  labels: [Tax Deductible, "ATO: D1", Work-related other expenses]

# Home office expenses
- type: label
  name: Home Office Deduction - D2
  when:
    categories: [Home Office, Internet, Phone]
  labels: [Tax Deductible, "ATO: D2", Home office expenses]

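# Large deductions requiring substantiation
# ✗ WON'T WORK as written: "labels" is not a supported "when" condition
- type: label
  name: Requires Receipt (>$300)
  when:
    labels: [Tax Deductible]  # Note: This won't work! Labels can't check labels
    amount_operator: ">"
    amount_value: 300
  labels: [Substantiation Required, Keep Receipt]

Important: Label rules cannot check for other labels. To flag large deductions, replace the labels condition with the underlying categories (for example, categories: [Work, Home Office]) or with accounts.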

Uncategorized Transaction Management

Flag and prioritize uncategorized transactions:

# Flag all uncategorized
- type: label
  name: Needs Categorization
  when:
    uncategorized: true
  labels: [Uncategorized, Needs Review]

# High-priority uncategorized (large amounts)
- type: label
  name: High Priority Uncategorized
  when:
    uncategorized: true
    amount_operator: ">"
    amount_value: 100
  labels: [Uncategorized, High Priority, Urgent Review]

# Uncategorized in shared account
- type: label
  name: Uncategorized Shared Expense
  when:
    uncategorized: true
    accounts: [Shared Bills, Joint Account]
  labels: [Uncategorized, Shared Account, Needs Approval]

Best Practices

1. Order Rules Specific → General

Rules are evaluated in order. Put specific rules first:

# ✓ GOOD: Specific first
- type: category
  name: UBER EATS → Dining Out
  patterns: [UBER EATS]
  category: Food & Dining > Dining Out
  confidence: 95

- type: category
  name: UBER → Transport
  patterns: [UBER]
  category: Transport
  confidence: 90

# ✗ BAD: General first (UBER catches UBER EATS)
- type: category
  name: UBER → Transport
  patterns: [UBER]
  category: Transport
  confidence: 90

- type: category
  name: UBER EATS → Dining Out  # Never reached!
  patterns: [UBER EATS]
  category: Food & Dining > Dining Out
  confidence: 95

Fix with exclusions:

- type: category
  name: UBER → Transport
  patterns: [UBER]
  exclude_patterns: [UBER EATS]
  category: Transport
  confidence: 90
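Ordering mistakes like the one above can be caught with a simple lint pass. The sketch below is a heuristic written for this guide, not a feature of Agent Smith: `find_shadowed_rules` is a hypothetical helper that assumes substring matching and only considers earlier rules without account or amount filters.

```python
def _caught_by(rule: dict, pattern: str) -> bool:
    """Heuristic: would `rule` match payees containing `pattern`?"""
    pattern = pattern.upper()
    hit = any(p.upper() in pattern for p in rule["patterns"])
    excluded = any(e.upper() in pattern
                   for e in rule.get("exclude_patterns", []))
    return hit and not excluded


def find_shadowed_rules(rules: list) -> list:
    """Return (earlier_name, later_name) pairs where the later rule looks unreachable."""
    shadowed = []
    category_rules = [r for r in rules if r.get("type") == "category"]
    for i, earlier in enumerate(category_rules):
        # Simplification: filtered rules are not treated as shadowing anything
        if earlier.get("accounts") or earlier.get("amount_operator"):
            continue
        for later in category_rules[i + 1:]:
            if all(_caught_by(earlier, p) for p in later["patterns"]):
                shadowed.append((earlier["name"], later["name"]))
    return shadowed
```

A flagged pair is a prompt to reorder the rules or add an exclusion; the check is conservative and may still flag a later rule that stays partly reachable through a narrower exclusion.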

2. Use Visual Grouping

Group related rules with comments for easy scanning:

# ═══════════════════════════════════════════════════════════
# GROCERIES WORKFLOW
# ═══════════════════════════════════════════════════════════

- type: category
  name: Groceries
  patterns: [WOOLWORTHS, COLES, ALDI]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  name: Essential Spending
  when:
    categories: [Groceries]
  labels: [Essential, Needs]

- type: label
  name: Shared Groceries
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
  labels: [Shared Expense, Reconciliation]

# ═══════════════════════════════════════════════════════════
# TRANSPORT WORKFLOW
# ═══════════════════════════════════════════════════════════

- type: category
  name: Rideshare
  patterns: [UBER, LYFT]
  exclude_patterns: [UBER EATS]
  category: Transport
  confidence: 90

3. Start with High Confidence

Begin with rules you're certain about (95%+):

# High confidence - very specific merchants
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS]
  category: Food & Dining > Groceries
  confidence: 95

- type: category
  name: AGL → Utilities
  patterns: [AGL]
  category: Bills > Utilities
  confidence: 95

Add medium-confidence rules (80-90%) later as you verify:

# Medium confidence - could be ambiguous
- type: category
  name: Amazon Purchases
  patterns: [AMAZON]
  category: Shopping
  confidence: 80  # Could be books, electronics, groceries, etc.

4. Test with Dry Run

Always test before applying to real transactions:

# Preview what would happen without making changes
uv run python scripts/operations/batch_categorize.py \
  --mode=dry_run \
  --period=2025-11 \
  --limit=50

# See what would change on existing categorizations
uv run python scripts/operations/batch_categorize.py \
  --mode=validate \
  --period=2025-11

Review the output carefully before running with --mode=apply.

5. Version Control Your Rules

Commit data/rules.yaml to git to track evolution:

# After adding/modifying rules
git add data/rules.yaml
git commit -m "rules: add coffee shop categorization with personal label"

# View history
git log --oneline data/rules.yaml

# Compare versions
git diff HEAD~1 data/rules.yaml

6. Review Rule Performance Regularly

Check rule accuracy monthly:

# Analyze categorization coverage
/agent-smith-analyze rules --period=last-month

# See which rules are matching most often
/agent-smith-analyze rules --sort=matches

# Find low-accuracy rules
/agent-smith-analyze rules --min-accuracy=80

Refine rules that have low accuracy or aren't matching as expected.

7. Use Templates as Starting Points

Don't start from scratch - use a template:

uv run python scripts/setup/template_selector.py

Then customize by:

  1. Updating merchant names for your region (e.g., WOOLWORTHS → KROGER)
  2. Adjusting account names to match your PocketSmith setup
  3. Adding your specific categories
  4. Fine-tuning confidence scores based on your data

8. Document Complex Rules

Add comments explaining non-obvious rules:

# Complex rule: UBER is transport UNLESS the payee is UBER EATS (food delivery)
# The rule syntax has no time-of-day condition, so work trips are not separated here
- type: category
  name: UBER Transport (Excluding Food Delivery)
  patterns: [UBER]
  exclude_patterns: [UBER EATS, UBER EATS MARKETPLACE]
  category: Transport
  confidence: 90

# Note: Work-related UBER trips need manual review for reimbursement
# They'll get the "Reimbursable" label from an account-based label rule

Operational Modes

The batch processor supports three operational modes for safe rule testing and application.

DRY_RUN Mode

Purpose: Preview what would happen without making any changes.

Use when:

  • Testing new rules
  • Checking rule coverage
  • Seeing potential categorizations before committing

Example:

uv run python scripts/operations/batch_categorize.py \
  --mode=dry_run \
  --period=2025-11

Output:

DRY RUN MODE - No changes will be made

Transaction #12345: WOOLWORTHS -$127.50
  → Would categorize as: Food & Dining > Groceries (95% confidence)
  → Would apply labels: [Essential, Shared Expense]

Transaction #12346: STARBUCKS -$6.50
  → Would categorize as: Food & Dining > Dining Out (90% confidence)
  → Would apply labels: [Discretionary, Personal]

Transaction #12347: ACME WIDGETS -$245.00
  → No rule match
  → Would request LLM categorization

Summary:
  Would categorize: 2/3 transactions (66.7%)
  LLM fallback needed: 1 transaction
  No changes made (DRY RUN)

VALIDATE Mode

Purpose: Show what would CHANGE on transactions that already have categories.

Use when:

  • Checking if new rules conflict with existing categorizations
  • Planning to update categories with better rules
  • Auditing categorization accuracy

Example:

uv run python scripts/operations/batch_categorize.py \
  --mode=validate \
  --period=2025-11

Output:

VALIDATE MODE - Showing potential changes

Transaction #12345: WOOLWORTHS -$127.50
  Current: Food (80% confidence)
  New: Food & Dining > Groceries (95% confidence)
  Change: Category would be updated ✓

Transaction #12346: STARBUCKS -$6.50
  Current: Food & Dining > Dining Out (90% confidence)
  New: Food & Dining > Dining Out (90% confidence)
  Change: No change (same category)

Transaction #12347: UBER -$25.00
  Current: Dining Out (user-assigned)
  New: Transport (90% confidence from rule)
  Change: Category would be REPLACED (was user-assigned!)

Summary:
  Would update: 2 transactions
  Already correct: 1 transaction
  Would replace user assignments: 1 transaction ⚠️
  No changes made (VALIDATE)

APPLY Mode

Purpose: Actually apply categorizations and labels to transactions.

Use when:

  • Ready to commit changes after testing with DRY_RUN/VALIDATE
  • Processing new uncategorized transactions
  • Updating categorizations with improved rules

Example:

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --period=2025-11 \
  --update-strategy=skip_existing

Output:

APPLY MODE - Making changes to PocketSmith

Transaction #12345: WOOLWORTHS -$127.50
  ✓ Categorized as: Food & Dining > Groceries (95%)
  ✓ Labels applied: [Essential, Shared Expense]

Transaction #12346: STARBUCKS -$6.50
  ⊘ Skipped (already categorized)

Transaction #12347: ACME WIDGETS -$245.00
  → Requesting LLM categorization...
  ? Suggested: Business Supplies (85% confidence)
    [A]ccept  [E]dit  [S]kip  [C]reate Rule

Update Strategies

Control how the batch processor handles transactions that already have categories.

SKIP_EXISTING (Default)

Only process uncategorized transactions. Leave existing categorizations unchanged.

Use when:

  • Processing new transactions
  • Don't want to override user-assigned categories
  • Preserving manual categorization work

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=skip_existing

Behavior:

  • Uncategorized → Apply rules
  • Already categorized → Skip
  • User-assigned → Skip

REPLACE_ALL

Replace ALL categorizations, even if they were user-assigned.

Use when:

  • Rebuilding all categorizations from scratch
  • Confident new rules are better than old assignments
  • Fixing systemic categorization errors

⚠️ Warning: This will override user-assigned categories!

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=replace_all

Behavior:

  • Uncategorized → Apply rules
  • Already categorized → Replace with rule result
  • User-assigned → Replace with rule result (loses user intent!)

UPGRADE_CONFIDENCE

Replace categorization ONLY if new rule has higher confidence.

Use when:

  • Improving categorizations with better rules
  • Keeping high-confidence assignments
  • Upgrading low-confidence auto-categorizations

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=upgrade_confidence

Behavior:

  • Uncategorized → Apply rules
  • Lower confidence → Replace with higher confidence rule
  • Higher confidence → Keep existing
  • User-assigned (confidence: 100%) → Never replaced

Example:

Transaction: WOOLWORTHS -$50
Current: Food (80% confidence from old rule)
New: Food & Dining > Groceries (95% confidence from new rule)
→ REPLACED (95% > 80%)

Transaction: STARBUCKS -$6
Current: Dining Out (95% confidence)
New: Dining Out (90% confidence from new rule)
→ KEPT (95% > 90%)

REPLACE_IF_DIFFERENT

Replace categorization if the category NAME differs.

Use when:

  • Fixing miscategorized transactions
  • Migrating to a new category structure
  • Correcting category hierarchies

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=replace_if_different

Behavior:

  • Uncategorized → Apply rules
  • Same category → Keep existing
  • Different category → Replace with rule result
  • User-assigned → Still replaced if different!

Example:

Transaction: WOOLWORTHS -$50
Current: Food
New: Food & Dining > Groceries
→ REPLACED (different category name)

Transaction: STARBUCKS -$6
Current: Dining Out
New: Dining Out
→ KEPT (same category)
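The four strategies can be summarized as one decision function. This is a sketch of the behaviour tables above, with hypothetical names: `should_replace` takes the existing categorization (or None if uncategorized) and the rule's proposed result, both as plain dicts, and the `user_assigned` flag is an assumed field.

```python
def should_replace(strategy: str, current, new: dict) -> bool:
    """Decide whether the rule result `new` should overwrite `current`."""
    if current is None:
        return True  # uncategorized transactions are always processed
    if strategy == "skip_existing":
        return False
    if strategy == "replace_all":
        return True
    if strategy == "upgrade_confidence":
        # User-assigned categories count as 100% and are never replaced
        current_conf = 100 if current.get("user_assigned") else current["confidence"]
        return new["confidence"] > current_conf
    if strategy == "replace_if_different":
        return new["category"] != current["category"]
    raise ValueError(f"unknown update strategy: {strategy}")
```

Note that only skip_existing and upgrade_confidence protect user-assigned categories; replace_all and replace_if_different can both overwrite them.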

Template System

Agent Smith provides pre-built rule templates for common household types. Templates are stored in data/templates/ and can be applied to create your data/rules.yaml.

Available Templates

1. Simple - Single Person

File: data/templates/simple.yaml

Best for:

  • Single person households
  • No shared expenses
  • Basic income and expense tracking

Includes:

  • Income categorization (salary, wages)
  • Essential expenses (groceries, utilities, rent)
  • Discretionary spending (dining out, entertainment)
  • Transport categories
  • Basic labels (Essential, Discretionary, Large Purchase)
  • Uncategorized flagging

Example rules:

# Income
- type: category
  name: Salary → Income
  patterns: [SALARY, WAGES, EMPLOYER]
  category: Income > Salary
  confidence: 95

# Essential groceries
- type: category
  name: Supermarkets → Groceries
  patterns: [WOOLWORTHS, COLES, ALDI]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  name: Essential Spending
  when:
    categories: [Groceries, Utilities, Rent]
  labels: [Essential]

2. Separated Families

File: data/templates/separated-families.yaml

Best for:

  • Divorced or separated parents
  • Shared custody arrangements
  • Child support tracking
  • Kids' expense management

Includes:

  • Kids' expense categories (school, activities, clothing, medical)
  • Child support tracking
  • Contributor labels (Parent A, Parent B)
  • Reimbursement workflows
  • School term and vacation labels
  • Medical and education receipts flagging

Example rules:

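# Child expenses
- type: category
  name: School → Kids Education
  patterns: [SCHOOL, UNIFORM, SCHOOL FEES]
  category: Kids > Education
  confidence: 90

- type: label
  name: Child Expense
  when:
    categories: [Kids]
  labels: [Child Expense, Needs Documentation]

# Child support tracking
# Label rules cannot match payee patterns, so categorize the payment first
# (the category name here is illustrative), then label by category
- type: category
  name: Child Support Payment
  patterns: [CHILD SUPPORT]
  category: Income > Child Support
  confidence: 90

- type: label
  name: Child Support Received
  when:
    categories: [Child Support]
  labels: [Child Support, Parent B Contribution]

# Shared kid expenses requiring reimbursement
- type: label
  name: Kids Reimbursement
  when:
    categories: [Kids]
    amount_operator: ">"
    amount_value: 50
  labels: [Needs Reimbursement, Split 50/50]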

3. Shared Household

File: data/templates/shared-household.yaml

Best for:

  • Couples living together
  • Roommates sharing expenses
  • Families with joint accounts

Includes:

  • Shared vs personal expense separation
  • Contributor tracking (Person A, Person B)
  • Approval workflows (large purchases, discretionary spending)
  • Reconciliation labels
  • Essential vs discretionary labeling
  • Account-based routing (Shared Bills, Personal accounts)

Example rules:

# Shared essential expenses
- type: category
  name: Shared Groceries
  patterns: [WOOLWORTHS, COLES]
  accounts: [Shared Bills, Joint Account]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  name: Shared Essential
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
  labels: [Shared Expense, Essential, Monthly Reconciliation]

# Approval workflow for large shared purchases
- type: label
  name: Needs Approval
  when:
    accounts: [Shared Bills]
    amount_operator: ">"
    amount_value: 150
  labels: [Needs Approval, Review Required]

# Personal vs shared distinction
- type: label
  name: Personal Spending
  when:
    accounts: [Personal, PersonA Account, PersonB Account]
  labels: [Personal, Individual]

4. Advanced

File: data/templates/advanced.yaml

Best for:

  • Business owners
  • Investors and traders
  • Complex financial situations
  • Tax optimization focus

Includes:

  • Business expense categories (with ATO codes)
  • Investment tracking (shares, crypto, property)
  • Tax deductible flagging (work, home office, professional development)
  • Capital gains tracking
  • Substantiation requirements ($300 threshold)
  • Instant asset write-off flagging
  • GST tracking
  • Business vs personal separation

Example rules:

# Business expenses
- type: category
  name: Office Supplies (Business)
  patterns: [OFFICE, STATIONERY, SUPPLIES]
  accounts: [Business, Work]
  category: Work > Office Supplies
  confidence: 90

- type: label
  name: Tax Deductible Business Expense
  when:
    categories: [Work, Home Office, Professional Development]
  labels: [Tax Deductible, "ATO: D1", Business Expense]

# Investment purchases
- type: category
  name: Share Purchases
  patterns: [COMMSEC, SELFWEALTH, STAKE]
  category: Investments > Share Purchase
  confidence: 90

- type: label
  name: Investment Tracking
  when:
    categories: [Investments]
  labels: [CGT Event, Track Cost Base]

# Substantiation requirements
# Label rules cannot match on other labels, so list the deductible categories directly
- type: label
  name: Receipt Required (>$300)
  when:
    categories: [Work, Home Office, Professional Development]
    amount_operator: ">"
    amount_value: 300
  labels: [Receipt Required, ATO Substantiation]

Applying a Template

Interactive selection:

uv run python scripts/setup/template_selector.py

Output:

══════════════════════════════════════════════════════════════════
Agent Smith - Rule Template Setup
══════════════════════════════════════════════════════════════════

Available templates:

1. Simple - Single Person
   Basic categories for individual financial tracking
   Best for: Single person, no shared expenses

2. Separated Families
   Kids expenses, child support, contributor tracking
   Best for: Divorced/separated parents with shared custody

3. Shared Household
   Shared expense tracking with approval workflows
   Best for: Couples, roommates, or families

4. Advanced
   Tax optimization and investment management
   Best for: Business owners, investors, complex finances

Select template (1-4): 3

Applying template: Shared Household
Backed up existing rules to data/rules.yaml.backup
✓ Template applied successfully!

Next steps:
1. Review data/rules.yaml and customize for your needs
2. Update merchant patterns for your region
3. Adjust account names to match your PocketSmith setup
4. Run: /agent-smith-categorize --mode=dry_run to test

Programmatic usage:

from scripts.setup.template_selector import TemplateSelector

selector = TemplateSelector()

# List templates
templates = selector.list_templates()
for key, info in templates.items():
    print(f"{info['name']}: {info['description']}")

# Apply template
selector.apply_template("shared-household", backup=True)

Customizing Templates

After applying a template:

  1. Update merchant patterns for your region:

    # Template (Australian)
    patterns: [WOOLWORTHS, COLES, ALDI]
    
    # Customize (US)
    patterns: [KROGER, SAFEWAY, WHOLE FOODS]
    
  2. Adjust account names to match your PocketSmith:

    # Template
    accounts: [Shared Bills, Joint Account]
    
    # Your setup
    accounts: [Joint Checking, Household Card]
    
  3. Add your specific categories:

    # Add new rules
    - type: category
      name: Pet Expenses
      patterns: [VET, PET STORE, PETBARN]
      category: Pets > Veterinary
      confidence: 90
    
  4. Fine-tune confidence scores based on your data:

    # Start conservative
    confidence: 70
    
    # After validation, increase
    confidence: 90
    

Migration Guide

From Platform Rules to Unified YAML

If you have existing platform rules created via the PocketSmith API, you can migrate them to the unified YAML format.

See: Platform to Local Rules Migration Guide

Quick summary:

  1. Export platform rules to JSON
  2. Convert to unified YAML format
  3. Test with dry run
  4. Disable platform rules (keep for backup)
  5. Enable unified rules

Migration script:

uv run python scripts/migrations/migrate_platform_to_local.py \
  --output=data/rules.yaml \
  --backup
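
The conversion step can be pictured with a small sketch. This is not the migration script itself: the exported JSON shape (`payee_matches` plus a nested `category.title`), the `convert_platform_rules` helper, and the default confidence of 95 are all assumptions made for illustration. Check your actual export before relying on any of them.

```python
def convert_platform_rules(platform_rules, confidence=95):
    """Turn exported platform rules into unified category-rule dicts.

    Assumes each exported rule looks like
    {"payee_matches": "...", "category": {"title": "..."}} (hypothetical shape).
    """
    unified = []
    for rule in platform_rules:
        keyword = rule["payee_matches"].strip().upper()
        category = rule["category"]["title"]
        unified.append(
            {
                "type": "category",
                "name": f"{keyword} → {category}",
                # The keyword is copied as-is; escape regex metacharacters by hand if present.
                "patterns": [keyword],
                "category": category,
                "confidence": confidence,
            }
        )
    return unified


exported = [{"payee_matches": "woolworths ", "category": {"title": "Groceries"}}]
converted = convert_platform_rules(exported)
print(converted[0]["name"])
```

The resulting dicts mirror the fields used by the category rules in this guide, so they can be written under `rules:` in data/rules.yaml and then checked with a dry run.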

Adding Labels to Existing Rules

If you have category rules and want to add labels:

  1. Keep all existing category rules as-is
  2. Add label rules at the bottom
  3. Test with dry run to see labels applied
  4. Apply with --update-strategy=skip_existing to avoid re-categorizing

Example:

# Existing category rules (don't change)
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS]
  category: Groceries
  confidence: 95

# NEW: Add label rules
- type: label
  name: Essential Spending
  when:
    categories: [Groceries]
  labels: [Essential, Needs]

- type: label
  name: Large Grocery Trip
  when:
    categories: [Groceries]
    amount_operator: ">"
    amount_value: 150
  labels: [Large Purchase]

Run with:

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=skip_existing \
  --period=2025-11

This will:

  • Skip already categorized transactions (no re-categorization)
  • Apply new labels to all transactions (even already categorized ones)
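
As a rough illustration of those two bullets, the sketch below keeps an existing category untouched while still merging in new labels. The `apply_skip_existing` function and the transaction dict shape are hypothetical and only model the behaviour described here; they are not Agent Smith's implementation.

```python
def apply_skip_existing(txn, suggested_category, new_labels):
    """Return an updated copy of txn under a skip_existing-style policy."""
    updated = dict(txn)
    # Only fill in a category when the transaction has none yet.
    if not updated.get("category"):
        updated["category"] = suggested_category
    # Labels are additive: keep existing ones and append any that are new.
    merged = list(updated.get("labels", []))
    for label in new_labels:
        if label not in merged:
            merged.append(label)
    updated["labels"] = merged
    return updated


already_done = {"payee": "WOOLWORTHS", "category": "Household", "labels": ["Reviewed"]}
result = apply_skip_existing(already_done, "Groceries", ["Essential", "Needs"])
print(result["category"], result["labels"])
```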

Troubleshooting

Rule Not Matching

Symptom: Rule should match but doesn't.

Check:

  1. Pattern spacing - Matching is case-insensitive, but spacing matters:

    # Won't match "UBEREATS" (no space)
    patterns: [UBER EATS]
    
    # Better: account for variations
    patterns: [UBER EATS, UBEREATS]
    
  2. Exclusion patterns blocking - Check if an exclusion is preventing the match:

    - type: category
      patterns: [UBER]
      exclude_patterns: [UBER EATS, MEDICAL]  # Blocks "UBER MEDICAL"
      category: Transport
    
  3. Account filter too restrictive - Transaction might be in a different account:

    # Only matches transactions in "Personal" account
    accounts: [Personal]
    
    # Check transaction's actual account name
    
  4. Amount condition incorrect - Verify the amount operator and value:

    amount_operator: ">"
    amount_value: 100
    # Won't match transactions ≤ $100
    
  5. Rule order - A previous rule might have matched first (short-circuit):

    # General rule matches first!
    - patterns: [UBER]
      category: Transport
    
    # Specific rule never reached
    - patterns: [UBER EATS]
      category: Dining Out  # Dead code!
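
The rule-order problem in check 5 can be caught with a simple lint before running anything. The sketch below is a heuristic under stated assumptions: `find_shadowed_patterns` is a hypothetical helper, rules are plain dicts, and each later pattern is treated as literal sample text, so it only makes sense for plain-keyword patterns and ignores account and amount filters.

```python
import re


def _hits(patterns, text):
    """True if any pattern matches text, ignoring case."""
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def find_shadowed_patterns(rules):
    """Return (earlier_index, later_index, pattern) for patterns an earlier rule already catches."""
    shadowed = []
    for later_idx, later in enumerate(rules):
        for pattern in later["patterns"]:
            for earlier_idx in range(later_idx):
                earlier = rules[earlier_idx]
                excluded = _hits(earlier.get("exclude_patterns", []), pattern)
                if _hits(earlier["patterns"], pattern) and not excluded:
                    shadowed.append((earlier_idx, later_idx, pattern))
                    break
    return shadowed


bad_order = [
    {"patterns": ["UBER"], "category": "Transport"},
    {"patterns": ["UBER EATS"], "category": "Dining Out"},
]
print(find_shadowed_patterns(bad_order))
```

Swapping the two rules, or adding `exclude_patterns: [UBER EATS]` to the general rule, makes the list come back empty.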
    

Debug with test script:

# Test specific payee
uv run python scripts/operations/test_rules.py \
  --payee="EXACT PAYEE NAME" \
  --account="Account Name" \
  --amount=127.50 \
  --debug

Output:

Testing transaction:
  Payee: EXACT PAYEE NAME
  Account: Account Name
  Amount: $127.50

Checking category rules...
  ✗ Rule 1 "WOOLWORTHS → Groceries": Pattern mismatch
  ✗ Rule 2 "UBER → Transport": Pattern mismatch
  ✗ Rule 3 "CAFE → Dining Out": Pattern mismatch

No category match found.

Checking label rules...
  (skipped - no category assigned)

Result: No categorization

Multiple Rules Matching

Symptom: Several rules match the same transaction.

This is expected. Category rules short-circuit (the first match wins), while label rules accumulate every match.

For category rules:

# Only the FIRST matching rule applies
- patterns: [UBER EATS]
  category: Dining Out
  # ✓ This matches UBER EATS

- patterns: [UBER]
  category: Transport
  # ✗ Never reached for UBER EATS (already matched above)

For label rules:

# ALL matching rules apply (additive)
- type: label
  when:
    categories: [Groceries]
  labels: [Essential]
  # ✓ Matches

- type: label
  when:
    amount_operator: ">"
    amount_value: 100
  labels: [Large Purchase]
  # ✓ Also matches

# Result: [Essential, Large Purchase]
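
The two behaviours can be sketched side by side. This is a minimal model, not the real engine: the `CategoryRule` and `LabelRule` classes, the `amount_over` field (standing in for `amount_operator: ">"` with `amount_value`), and both functions are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CategoryRule:
    patterns: list[str]
    category: str
    exclude_patterns: list[str] = field(default_factory=list)


@dataclass
class LabelRule:
    labels: list[str]
    categories: list[str] = field(default_factory=list)
    amount_over: Optional[float] = None


def _hits(patterns, text):
    return any(re.search(p, text, re.IGNORECASE) for p in patterns)


def pick_category(payee, rules):
    """Phase 1: the first matching rule wins; later rules are never consulted."""
    for rule in rules:
        if _hits(rule.patterns, payee) and not _hits(rule.exclude_patterns, payee):
            return rule.category
    return None


def collect_labels(category, amount, rules):
    """Phase 2: every matching rule contributes its labels, in rule order."""
    labels = []
    for rule in rules:
        if rule.categories and category not in rule.categories:
            continue
        if rule.amount_over is not None and not amount > rule.amount_over:
            continue
        labels.extend(label for label in rule.labels if label not in labels)
    return labels


category_rules = [
    CategoryRule(patterns=["UBER EATS"], category="Dining Out"),
    CategoryRule(patterns=["UBER"], category="Transport"),
    CategoryRule(patterns=["WOOLWORTHS"], category="Groceries"),
]
label_rules = [
    LabelRule(labels=["Essential"], categories=["Groceries"]),
    LabelRule(labels=["Large Purchase"], amount_over=100),
]
category = pick_category("WOOLWORTHS 1234 SYDNEY", category_rules)
labels = collect_labels(category, 127.50, label_rules)
print(category, labels)  # Groceries ['Essential', 'Large Purchase']
```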

Fix unwanted category matches by adjusting rule order or using exclusions.

Control label accumulation by making conditions more specific:

# Too broad - applies to ALL transactions
- type: label
  when:
    amount_operator: ">"
    amount_value: 0
  labels: [Has Amount]  # Not useful!

# Better - specific categories only
- type: label
  when:
    categories: [Groceries, Dining Out]
    amount_operator: ">"
    amount_value: 100
  labels: [Large Food Purchase]

Labels Not Applying

Symptom: Label rule should match but labels aren't applied.

Check:

  1. Category must be assigned first - Labels depend on Phase 1 categorization:

    # Label requires category "Groceries"
    - type: label
      when:
        categories: [Groceries]
      labels: [Essential]
    
    # But transaction wasn't categorized in Phase 1
    # → Label rule won't match
    

    Fix: Ensure a category rule matches the transaction first.

  2. When conditions too restrictive - All conditions must match (AND logic):

    - type: label
      when:
        categories: [Groceries]  # Must match
        accounts: [Shared Bills]  # AND must match
        amount_operator: ">"       # AND must match
        amount_value: 100
      labels: [Large Shared Grocery]
    
    # Won't match if ANY condition fails
    
  3. Uncategorized flag combined with categories - uncategorized: true can't be used together with a categories condition:

    # This won't work as expected
    - type: label
      when:
        uncategorized: true
        categories: [Groceries]  # Contradiction! Can't be both uncategorized and have a category
      labels: [Invalid]
    

    Fix: Use uncategorized: true alone or with accounts/amount only.

  4. Labels can't check labels - You can't reference other labels in conditions:

    # This WON'T work - no way to check existing labels
    - type: label
      when:
        labels: [Tax Deductible]  # Not supported!
        amount_operator: ">"
        amount_value: 300
      labels: [Substantiation Required]
    

    Fix: Use categories or accounts as conditions instead.
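
The AND logic from checks 2 and 3 can be modelled in a few lines. This is a sketch under assumptions: `when_matches` and the transaction dict are hypothetical, and only the `">"` operator appears in this guide, so the other entries in `OPERATORS` are illustrative guesses rather than documented options.

```python
import operator

# Only ">" is shown in this guide; the remaining operators are assumed.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def when_matches(when, txn):
    """Return True only if every condition present in `when` holds."""
    if when.get("uncategorized") and txn.get("category") is not None:
        return False
    if "categories" in when and txn.get("category") not in when["categories"]:
        return False
    if "accounts" in when and txn.get("account") not in when["accounts"]:
        return False
    if "amount_operator" in when:
        compare = OPERATORS[when["amount_operator"]]
        if not compare(txn["amount"], when["amount_value"]):
            return False
    return True


when = {
    "categories": ["Groceries"],
    "accounts": ["Shared Bills"],
    "amount_operator": ">",
    "amount_value": 100,
}
txn = {"category": "Groceries", "account": "Shared Bills", "amount": 127.50}
print(when_matches(when, txn))                             # True
print(when_matches(when, {**txn, "account": "Personal"}))  # False
```

Under this model, a rule that sets both `uncategorized: true` and `categories` can never match, because no transaction satisfies both conditions at once.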

Debug:

uv run python scripts/operations/test_rules.py \
  --payee="WOOLWORTHS" \
  --account="Shared Bills" \
  --amount=127.50 \
  --category="Groceries" \
  --debug

Confidence Scores Unclear

Symptom: Not sure what confidence score to use.

Guidelines:

| Confidence | When to Use | Example |
|------------|-------------|---------|
| 95-100% | Exact merchant match, no ambiguity | WOOLWORTHS → Groceries |
| 85-94% | Very likely but minor ambiguity | AMAZON → Shopping (could be many subcategories) |
| 75-84% | Likely but context-dependent | UBER → Transport (unless UBER EATS) |
| 70-74% | Moderate confidence, needs validation | Generic patterns like "MARKET" |
| < 70% | Low confidence, probably shouldn't auto-apply | Broad patterns |

Smart mode thresholds:

  • ≥ 90%: Auto-apply
  • 70-89%: Ask user (with LLM validation)
  • < 70%: Skip
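
Expressed as code, the Smart mode thresholds amount to a three-way decision. The function name and the returned action strings below are hypothetical; only the cut-offs come from the list above.

```python
def smart_mode_action(confidence):
    """Map a 0-100 confidence score to the Smart mode action."""
    if not 0 <= confidence <= 100:
        raise ValueError(f"confidence must be between 0 and 100, got {confidence}")
    if confidence >= 90:
        return "auto_apply"
    if confidence >= 70:
        return "ask_user"  # shown to the user, with LLM validation
    return "skip"


for score in (95, 89, 70, 69):
    print(score, smart_mode_action(score))
```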

For exact merchant matches, start high (95%) and reduce the score if:

  • LLM frequently suggests different category
  • User frequently overrides
  • Pattern matches too broadly

LLM Not Being Used

Symptom: Expected LLM fallback but it's not happening.

Possible causes:

  1. Rule matched - The LLM is used only when no rule matches:

    Transaction: ACME WIDGETS
    Rule match: "Generic Business" pattern [WIDGETS] (75%)
    → Rule applied, LLM not needed
    

    Fix: Remove overly broad rules if you want LLM to handle edge cases.

  2. Categories not provided - LLM needs category list:

    workflow.categorize_transaction(
        transaction=txn,
        available_categories=None  # ← LLM can't suggest without categories!
    )
    

    Fix: Pass available_categories from PocketSmith API.

  3. Conservative mode - Conservative never auto-applies, whatever the confidence:

    Mode: Conservative
    LLM suggests: Business Supplies (85%)
    → Asks user (doesn't auto-apply)
    

    This is expected! Conservative always asks.

Performance Issues

Symptom: Batch categorization is slow with many rules.

Optimizations:

  1. Reduce rule count - Consolidate similar patterns:

    # Before: 3 rules
    - patterns: [WOOLWORTHS]
      category: Groceries
    - patterns: [COLES]
      category: Groceries
    - patterns: [ALDI]
      category: Groceries
    
    # After: 1 rule
    - patterns: [WOOLWORTHS, COLES, ALDI]
      category: Groceries
    
  2. Use account filters - Skip irrelevant transactions early:

    # Check account BEFORE pattern matching
    - patterns: [WORK PATTERN]
      accounts: [Work Credit Card]  # Skips transactions from every other account
      category: Work Expenses
    
  3. Order by frequency - Put most common rules first:

    # Most frequent transaction (groceries) - check first
    - patterns: [WOOLWORTHS, COLES]
      category: Groceries
    
    # Less frequent - check later
    - patterns: [RARE MERCHANT]
      category: Rare Category
    
  4. Limit batch size - Process in smaller chunks:

    # Instead of processing all at once
    uv run python scripts/operations/batch_categorize.py --period=2025
    
    # Process month by month
    uv run python scripts/operations/batch_categorize.py --period=2025-01
    uv run python scripts/operations/batch_categorize.py --period=2025-02
    # etc.
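
Typing twelve commands by hand is tedious, so the month-by-month split can be generated. The sketch below only builds and prints the command lines using the `--mode` and `--period` flags shown in this guide; the helper names are hypothetical, and it leaves actually running the commands to you.

```python
import shlex


def month_periods(year):
    """Return the twelve YYYY-MM period strings for a year."""
    return [f"{year}-{month:02d}" for month in range(1, 13)]


def build_commands(year, mode="dry_run"):
    """Build one batch_categorize.py command per month."""
    return [
        [
            "uv", "run", "python",
            "scripts/operations/batch_categorize.py",
            f"--mode={mode}",
            f"--period={period}",
        ]
        for period in month_periods(year)
    ]


for command in build_commands(2025):
    print(shlex.join(command))
```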
    

Examples

See docs/examples/ for complete example YAML files:

  • basic-rules.yaml - Simple category and label rules
  • advanced-patterns.yaml - Complex rules with exclusions, amounts, accounts
  • household-workflow.yaml - Complete shared household setup
  • tax-deductible.yaml - Tax optimization rules with ATO codes
  • migration-example.yaml - Migrated from platform rules

Support

For questions or issues:

  1. Check this guide's troubleshooting section
  2. Review example files in docs/examples/
  3. Check template files in data/templates/
  4. Refer to design documentation
  5. Create an issue in the repository

Last Updated: 2025-11-22
Version: 1.0.0