Zhongwei Li 8f952ee727 Initial commit

2025-11-30 08:57:54 +08:00

Unified Rules Guide - Categories & Labels

Overview

Agent Smith uses a unified YAML rule system that handles both transaction categorization and labeling in a single, easy-to-read file.

Key Features:

YAML format - Easy to read and edit
Two-phase execution - Categories first, then labels
Pattern matching - Regex patterns with exclusions
Confidence scoring - 0-100% confidence for auto-apply logic
Smart labeling - Context-aware labels (account, category, amount)
LLM fallback - AI categorization when rules don't match
Template system - Pre-built rule sets for common household types

Table of Contents

Quick Start
Rule Types
Execution Flow
Intelligence Modes
LLM Integration
Advanced Patterns
Best Practices
Operational Modes
Update Strategies
Template System
Migration Guide
Troubleshooting

Quick Start

1. Choose a Template

Start with a pre-built template that matches your household type:

uv run python scripts/setup/template_selector.py

Available templates:

Simple - Single person, no shared expenses
Separated Families - Divorced/separated parents with shared custody
Shared Household - Couples, roommates, or families
Advanced - Business owners, investors, complex finances

2. Customize Your Rules

Edit data/rules.yaml to match your specific needs:

rules:
  # Add your first category rule
  - type: category
    name: Coffee → Dining Out
    patterns: [STARBUCKS, COSTA, CAFE]
    category: Food & Dining > Dining Out
    confidence: 95

  # Add your first label rule
  - type: label
    name: Personal Coffee
    when:
      categories: [Dining Out]
      accounts: [Personal]
    labels: [Discretionary, Personal]

3. Test Your Rules

Always test before applying to real transactions:

# Dry run - preview what would happen
uv run python scripts/operations/batch_categorize.py --mode=dry_run --period=2025-11

# Validate - see what would change on existing categorizations
uv run python scripts/operations/batch_categorize.py --mode=validate --period=2025-11

# Apply - actually categorize transactions
uv run python scripts/operations/batch_categorize.py --mode=apply --period=2025-11

4. Review and Refine

Check the results and refine your rules:

# See categorization summary
/agent-smith-analyze spending --period=2025-11

# Check uncategorized transactions
/agent-smith-categorize --mode=smart --show-uncategorized

Rule Types

Category Rules

Categorize transactions based on payee patterns, amounts, and accounts.

Full Syntax:

- type: category
  name: Rule Name (for logging/display)
  patterns: [PATTERN1, PATTERN2, PATTERN3]  # OR logic
  exclude_patterns: [EXCLUDE1, EXCLUDE2]    # Optional
  category: Category > Subcategory
  confidence: 95                             # 0-100%
  accounts: [Account1, Account2]             # Optional filter
  amount_operator: ">"                       # Optional: >, <, >=, <=, ==, !=
  amount_value: 100.00                       # Required if amount_operator set

Field Descriptions:

Field	Required	Type	Description
`type`	Yes	String	Must be "category"
`name`	Yes	String	Descriptive name for logs (e.g., "WOOLWORTHS → Groceries")
`patterns`	Yes	List[String]	Payee keywords to match (case-insensitive, OR logic)
`category`	Yes	String	Category to assign (can include parent: "Parent > Child")
`confidence`	No	Integer	Confidence score 0-100% (default: 90)
`exclude_patterns`	No	List[String]	Patterns to exclude from match
`accounts`	No	List[String]	Only match transactions in these accounts
`amount_operator`	No	String	Amount comparison: >, <, >=, <=, ==, !=
`amount_value`	No	Number	Amount threshold (required if operator set)

Examples:

# Basic pattern matching
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS, COLES, ALDI]
  category: Food & Dining > Groceries
  confidence: 95

# With exclusions (exclude UBER EATS from UBER)
- type: category
  name: UBER → Transport
  patterns: [UBER]
  exclude_patterns: [UBER EATS]
  category: Transport
  confidence: 90

# Account-specific rule
- type: category
  name: Work Laptop Purchase
  patterns: [APPLE STORE, MICROSOFT STORE]
  accounts: [Work Credit Card]
  category: Work > Equipment
  confidence: 90

# Amount-based rule (large purchases)
- type: category
  name: Large Electronics
  patterns: [JB HI-FI, HARVEY NORMAN]
  category: Shopping > Electronics
  confidence: 85
  amount_operator: ">"
  amount_value: 500
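
The matching semantics in the field table can be sketched in Python. This is a minimal illustration, not Agent Smith's actual engine: the function name `category_rule_matches`, the transaction keys (`payee`, `account`, `amount`), plain substring matching, and comparing the absolute amount are all assumptions made for the example.

```python
import operator

# Comparison operators listed in the field table.
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def category_rule_matches(rule, txn):
    """Return True if a category rule applies to a transaction dict."""
    payee = txn["payee"].upper()
    # Any one pattern is enough (OR logic), compared case-insensitively.
    if not any(p.upper() in payee for p in rule["patterns"]):
        return False
    # A single exclusion hit vetoes the match.
    if any(p.upper() in payee for p in rule.get("exclude_patterns", [])):
        return False
    # Optional account filter.
    accounts = rule.get("accounts")
    if accounts and not any(a.upper() in txn["account"].upper() for a in accounts):
        return False
    # Optional amount filter; abs() is an assumption (expenses stored as negatives).
    op = rule.get("amount_operator")
    if op and not OPS[op](abs(txn["amount"]), rule["amount_value"]):
        return False
    return True


uber = {"patterns": ["UBER"], "exclude_patterns": ["UBER EATS"]}
electronics = {"patterns": ["JB HI-FI", "HARVEY NORMAN"],
               "amount_operator": ">", "amount_value": 500}

ride = {"payee": "Uber *Trip Sydney", "account": "Personal", "amount": -25.00}
meal = {"payee": "UBER EATS SYDNEY", "account": "Personal", "amount": -32.00}
tv = {"payee": "JB HI-FI CITY", "account": "Personal", "amount": -799.00}
cable = {"payee": "JB HI-FI CITY", "account": "Personal", "amount": -19.95}

print(category_rule_matches(uber, ride))         # True
print(category_rule_matches(uber, meal))         # False (excluded)
print(category_rule_matches(electronics, tv))    # True
print(category_rule_matches(electronics, cable)) # False (amount too small)
```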

Label Rules

Apply labels to transactions based on their category, account, amount, or categorization status.

Full Syntax:

- type: label
  name: Label Rule Name
  when:
    categories: [Category1, Category2]       # Optional (OR logic)
    accounts: [Account1, Account2]           # Optional (OR logic)
    amount_operator: ">"                     # Optional
    amount_value: 100.00                     # Required if operator set
    uncategorized: true                      # Optional (true to match uncategorized)
  labels: [Label1, Label2, Label3]

Important: All `when` conditions must match (AND logic), but values within each list use OR logic.

Field Descriptions:

Field	Required	Type	Description
`type`	Yes	String	Must be "label"
`name`	Yes	String	Descriptive name for logs
`when`	Yes	Object	Conditions that must ALL match
`when.categories`	No	List[String]	Match if category contains any of these (OR)
`when.accounts`	No	List[String]	Match if account name contains any of these (OR)
`when.amount_operator`	No	String	Amount comparison: >, <, >=, <=, ==, !=
`when.amount_value`	No	Number	Amount threshold
`when.uncategorized`	No	Boolean	Match uncategorized transactions (true/false)
`labels`	Yes	List[String]	Labels to apply when conditions match

Examples:

# Category-based labeling
- type: label
  name: Tax Deductible Work Expenses
  when:
    categories: [Work, Professional Development, Home Office]
  labels: [Tax Deductible, ATO: D1]

# Account-based labeling
- type: label
  name: Shared Household Expense
  when:
    accounts: [Shared Bills, Joint Account]
  labels: [Shared Expense, Needs Reconciliation]

# Combined conditions (category AND account)
- type: label
  name: Personal Coffee Spending
  when:
    categories: [Dining Out]
    accounts: [Personal]
  labels: [Discretionary, Personal]

# Amount-based labeling
- type: label
  name: Large Purchase Flag
  when:
    amount_operator: ">"
    amount_value: 500
  labels: [Large Purchase, Review Required]

# Flag uncategorized transactions
- type: label
  name: Needs Categorization
  when:
    uncategorized: true
  labels: [Uncategorized, Needs Review]

# Multi-condition (category AND account AND amount)
- type: label
  name: Large Shared Grocery Trip
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
    amount_operator: ">"
    amount_value: 200
  labels: [Shared Expense, Large Purchase, Needs Approval]
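
The AND-across-conditions, OR-within-lists behaviour can be sketched as follows. The function name `label_rule_matches`, the transaction keys, case-insensitive containment, and the use of the absolute amount are assumptions for illustration, not the engine's confirmed implementation.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def label_rule_matches(when, txn):
    """Every condition present in `when` must hold (AND); lists are OR."""
    category = (txn.get("category") or "").lower()
    account = txn.get("account", "").lower()

    if "categories" in when and not any(c.lower() in category for c in when["categories"]):
        return False
    if "accounts" in when and not any(a.lower() in account for a in when["accounts"]):
        return False
    if "amount_operator" in when:
        compare = OPS[when["amount_operator"]]
        # abs() is an assumption: expenses are taken to be negative amounts.
        if not compare(abs(txn["amount"]), when["amount_value"]):
            return False
    # uncategorized: true matches transactions without a category, false the opposite.
    if "uncategorized" in when and when["uncategorized"] != (category == ""):
        return False
    return True


large_shared_grocery = {"categories": ["Groceries"], "accounts": ["Shared Bills"],
                        "amount_operator": ">", "amount_value": 200}
needs_categorization = {"uncategorized": True}

big_shop = {"category": "Food & Dining > Groceries", "account": "Shared Bills", "amount": -250.00}
small_shop = {"category": "Food & Dining > Groceries", "account": "Shared Bills", "amount": -127.50}
unknown = {"category": None, "account": "Personal", "amount": -40.00}

print(label_rule_matches(large_shared_grocery, big_shop))    # True
print(label_rule_matches(large_shared_grocery, small_shop))  # False (amount condition fails)
print(label_rule_matches(needs_categorization, unknown))     # True
```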

Execution Flow

The unified rule engine uses two-phase execution to ensure labels can depend on categories assigned in the same run.

Phase 1: Categorization

Iterate through all category rules in order
For each transaction, find the FIRST matching rule
Apply the category and confidence score
Short-circuit: Stop at first match (no further category rules evaluated)
Update transaction with matched category for Phase 2

Rule Order Matters! Specific rules should come before general rules.

Phase 2: Labeling

Take the transaction, now carrying the category from Phase 1
Check ALL label rules
Apply labels from EVERY matching rule (additive)
Deduplicate and sort labels

All Matches Applied! Unlike categories, ALL matching label rules are applied.

Example Execution

Transaction: WOOLWORTHS in Shared Bills account, amount -$127.50

Phase 1 - Category Rules:

# Rule 1 matches!
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS]
  category: Food & Dining > Groceries
  confidence: 95

# Rule 2 would also match but is NOT evaluated (short-circuit)
- type: category
  name: All Food Purchases
  patterns: [WOOLWORTHS, COLES, RESTAURANT]
  category: Food
  confidence: 80

Result after Phase 1: Category = "Food & Dining > Groceries", Confidence = 95

Phase 2 - Label Rules:

# Rule 1 matches (category: Groceries, account: Shared Bills)
- type: label
  name: Shared Grocery Expense
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
  labels: [Shared Expense, Essential]

# Rule 2 matches (127.50 > 100, comparing the amount's magnitude)
- type: label
  name: Large Purchase
  when:
    amount_operator: ">"
    amount_value: 100
  labels: [Large Purchase, Review]

# Rule 3 does NOT match (category doesn't contain "Dining Out")
- type: label
  name: Discretionary Dining
  when:
    categories: [Dining Out]
  labels: [Discretionary]

Final Result:

Category: Food & Dining > Groceries
Confidence: 95
Labels: Essential, Large Purchase, Review, Shared Expense (sorted, deduplicated)
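
The walk-through above can be reproduced with a small two-phase sketch. It is a simplified stand-in for the real engine: `run_rules` is a hypothetical name, matching is plain substring containment, only the `>` operator is handled, and amounts are compared by magnitude.

```python
def run_rules(rules, txn):
    """Phase 1: first matching category rule wins. Phase 2: all label rules apply."""
    txn = dict(txn, category=None, confidence=None)
    payee = txn["payee"].upper()

    # Phase 1: stop at the first category rule whose pattern matches.
    for rule in rules:
        if rule["type"] != "category":
            continue
        if any(p.upper() in payee for p in rule["patterns"]):
            txn["category"] = rule["category"]
            txn["confidence"] = rule["confidence"]
            break

    # Phase 2: every matching label rule contributes its labels.
    labels = set()
    for rule in rules:
        if rule["type"] != "label":
            continue
        when = rule["when"]
        category_ok = "categories" not in when or any(
            c in (txn["category"] or "") for c in when["categories"])
        account_ok = "accounts" not in when or any(
            a in txn["account"] for a in when["accounts"])
        # Only ">" is handled in this sketch; abs() assumes negative expenses.
        amount_ok = "amount_value" not in when or abs(txn["amount"]) > when["amount_value"]
        if category_ok and account_ok and amount_ok:
            labels.update(rule["labels"])

    txn["labels"] = sorted(labels)  # the set removes duplicates, sorted() orders them
    return txn


rules = [
    {"type": "category", "patterns": ["WOOLWORTHS"],
     "category": "Food & Dining > Groceries", "confidence": 95},
    {"type": "category", "patterns": ["WOOLWORTHS", "COLES", "RESTAURANT"],
     "category": "Food", "confidence": 80},
    {"type": "label", "when": {"categories": ["Groceries"], "accounts": ["Shared Bills"]},
     "labels": ["Shared Expense", "Essential"]},
    {"type": "label", "when": {"amount_operator": ">", "amount_value": 100},
     "labels": ["Large Purchase", "Review"]},
    {"type": "label", "when": {"categories": ["Dining Out"]},
     "labels": ["Discretionary"]},
]

result = run_rules(rules, {"payee": "WOOLWORTHS", "account": "Shared Bills", "amount": -127.50})
print(result["category"], result["confidence"])  # Food & Dining > Groceries 95
print(result["labels"])  # ['Essential', 'Large Purchase', 'Review', 'Shared Expense']
```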

Intelligence Modes

Agent Smith has three intelligence modes that control auto-apply behavior based on confidence scores.

Conservative Mode

Never auto-applies - always asks user for confirmation.

Confidence Level: ANY
Action: Ask user for approval
Use when: Learning the system, want full control

Example:

Transaction: STARBUCKS -$6.50
Rule match: "Dining Out" (95% confidence)
→ [Ask User] Apply category "Dining Out"?
  [Yes] [No] [Edit]

Smart Mode (Default)

Balanced approach - auto-applies high confidence, asks for medium, skips low.

Confidence ≥ 90%:  Auto-apply without asking
Confidence 70-89%: Ask user for approval (LLM validates first)
Confidence < 70%:  Skip (don't categorize)

Example:

Transaction: UBER -$25.00
Rule match: "Transport" (95% confidence)
→ [Auto-applied] Category: Transport

Transaction: UBER MEDICAL CENTRE -$80
Rule match: "UBER → Transport" (75% confidence)
→ [LLM Validates] This looks like medical, not transport
→ [Suggests] Medical & Healthcare (90% confidence)
→ [Auto-applied] Category: Medical & Healthcare

Aggressive Mode

More permissive - auto-applies medium-high confidence, asks for medium-low, skips low.

Confidence ≥ 80%: Auto-apply without asking
Confidence 50-79%: Ask user for approval
Confidence < 50%: Skip (don't categorize)

Example:

Transaction: ACME WIDGETS -$245.00
Rule match: "Business Supplies" (82% confidence)
→ [Auto-applied] Category: Business Supplies
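
The thresholds for the three modes can be summarised as a small lookup. The function name `decide` and the action strings are hypothetical; only the percentage cut-offs come from this guide.

```python
# (auto-apply threshold, ask threshold) per mode, as documented above.
THRESHOLDS = {
    "smart": (90, 70),
    "aggressive": (80, 50),
}


def decide(mode, confidence):
    """Map a confidence score to 'auto_apply', 'ask', or 'skip'."""
    if mode == "conservative":
        return "ask"  # conservative mode asks at any confidence level
    auto, ask = THRESHOLDS[mode]
    if confidence >= auto:
        return "auto_apply"
    if confidence >= ask:
        return "ask"
    return "skip"


print(decide("conservative", 95))  # ask
print(decide("smart", 95))         # auto_apply
print(decide("smart", 75))         # ask
print(decide("smart", 60))         # skip
print(decide("aggressive", 82))    # auto_apply
```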

Setting the Mode

In command:

/agent-smith-categorize --mode=smart

In environment (.env):

DEFAULT_INTELLIGENCE_MODE=smart

In code:

from scripts.workflows.categorization import CategorizationWorkflow

workflow = CategorizationWorkflow(
    client=client,
    mode="smart"  # conservative, smart, or aggressive
)

LLM Integration

When rule-based categorization fails, Agent Smith falls back to AI-powered categorization using Claude.

Fallback Categorization

When no rule matches, Agent Smith asks the LLM to suggest a category.

Flow:

No category rule matches transaction
Build LLM prompt with:
- Full category hierarchy
- Transaction details (payee, amount, date)
- Intelligence mode guidance
LLM suggests category with confidence and reasoning
Apply intelligence mode thresholds:
- High confidence → Auto-apply (or ask in conservative mode)
- Medium confidence → Ask user
- Low confidence → Skip

Example:

Transaction: ACME WIDGETS LTD -$245.00
No rule match
→ [LLM] Analyzing transaction...
→ [LLM] Suggests: Business Supplies (85% confidence)
   Reasoning: "ACME WIDGETS appears to be a business supplier based on
   naming convention and typical transaction amount."
→ [Smart Mode] 85% is above ask threshold (70%) but below auto (90%)
→ [Ask User] Apply category "Business Supplies"?
   [Yes] [No] [Create Rule]
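
A rough sketch of the fallback flow, with the LLM replaced by a stub. The names `categorize`, `match_rule`, and `llm_suggest`, and the shape of the returned dicts, are assumptions for illustration; the real workflow's interfaces are not shown in this guide.

```python
# (auto-apply, ask) thresholds from the Intelligence Modes section.
THRESHOLDS = {"smart": (90, 70), "aggressive": (80, 50)}


def action_for(mode, confidence):
    if mode == "conservative":
        return "ask"
    auto, ask = THRESHOLDS[mode]
    return "auto_apply" if confidence >= auto else "ask" if confidence >= ask else "skip"


def categorize(txn, match_rule, llm_suggest, mode="smart"):
    """Try rules first; call the LLM only when no rule matches."""
    match, source = match_rule(txn), "rule"
    if match is None:
        match, source = llm_suggest(txn), "llm"  # hypothetical LLM wrapper
    return {
        "source": source,
        "category": match["category"],
        "confidence": match["confidence"],
        "action": action_for(mode, match["confidence"]),
    }


def no_rule(txn):
    return None  # stands in for a rule engine that found nothing


def fake_llm(txn):
    # Canned response mirroring the example above; a real call would go to the model.
    return {"category": "Business Supplies", "confidence": 85,
            "reasoning": "Looks like a business supplier."}


txn = {"payee": "ACME WIDGETS LTD", "amount": -245.00}
outcome = categorize(txn, no_rule, fake_llm, mode="smart")
print(outcome["source"], outcome["category"], outcome["action"])  # llm Business Supplies ask
```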

Validation

Medium-confidence rule matches (70-89% in smart mode) are validated by the LLM to catch edge cases.

Flow:

Rule matches with medium confidence
Build validation prompt with:
- Transaction details
- Suggested category
- Rule confidence
LLM responds: CONFIRM or REJECT
- CONFIRM: Can upgrade confidence → auto-apply
- REJECT: Suggests alternative category
Apply validated result

Example:

Transaction: UBER MEDICAL CENTRE -$80
Rule match: "UBER → Transport" (75% confidence)
→ [LLM] Validating categorization...
→ [LLM] REJECT - This appears to be a medical facility, not transport
   Suggests: Medical & Healthcare (90% confidence)
→ [Smart Mode] 90% ≥ auto-apply threshold
→ [Auto-applied] Category: Medical & Healthcare
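
The CONFIRM/REJECT step might look like the sketch below. The verdict dictionary (`decision`, `category`, `confidence`) and the function names are hypothetical; the guide only specifies that the LLM confirms or rejects and may suggest an alternative.

```python
def validate_match(txn, category, confidence, llm_validate):
    """Ask the LLM to confirm or reject a medium-confidence rule match."""
    verdict = llm_validate(txn, category, confidence)
    if verdict["decision"] == "CONFIRM":
        # A confirmation may raise the confidence but never lowers it.
        return category, max(confidence, verdict.get("confidence", confidence))
    # REJECT: adopt the alternative category the LLM proposes.
    return verdict["category"], verdict["confidence"]


def fake_validator(txn, category, confidence):
    # Stub standing in for the model: flags medical payees tagged as transport.
    if "MEDICAL" in txn["payee"].upper() and category == "Transport":
        return {"decision": "REJECT", "category": "Medical & Healthcare", "confidence": 90}
    return {"decision": "CONFIRM", "confidence": 92}


txn = {"payee": "UBER MEDICAL CENTRE", "amount": -80.00}
category, confidence = validate_match(txn, "Transport", 75, fake_validator)
print(category, confidence)   # Medical & Healthcare 90
print(confidence >= 90)       # True: meets the smart-mode auto-apply threshold
```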

Learning from LLM Results

After the LLM categorizes transactions, Agent Smith offers to create rules for future use.

Flow:

LLM categorizes N transactions from the same merchant
Detect pattern: Same payee → Same category
Suggest rule creation
User approves, edits, or declines
If approved: Add rule to data/rules.yaml

Example:

LLM categorized 12 "ACME WIDGETS" transactions as "Business Supplies"

Suggested rule:
  - type: category
    name: ACME WIDGETS → Business Supplies
    patterns: [ACME WIDGETS]
    category: Business Supplies
    confidence: 90

[Create Rule] [Edit Rule] [Decline]

→ User selects [Create Rule]
→ Rule added to data/rules.yaml
→ Future ACME WIDGETS transactions auto-categorized (90% confidence)
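
Pattern detection of this kind can be sketched by grouping LLM results per payee. The `suggest_rules` function, the `min_count` cut-off, and the input format (a list of payee/category pairs) are assumptions; the guide does not say how many consistent results trigger a suggestion.

```python
from collections import Counter, defaultdict


def suggest_rules(llm_results, min_count=3, confidence=90):
    """Propose a category rule for each payee the LLM categorized consistently."""
    categories_by_payee = defaultdict(Counter)
    for payee, category in llm_results:
        categories_by_payee[payee][category] += 1

    suggestions = []
    for payee, counts in categories_by_payee.items():
        if len(counts) != 1:
            continue  # mixed categories: no consistent pattern to learn
        category, seen = next(iter(counts.items()))
        if seen >= min_count:
            suggestions.append({
                "type": "category",
                "name": f"{payee} → {category}",
                "patterns": [payee],
                "category": category,
                "confidence": confidence,
            })
    return suggestions


results = [("ACME WIDGETS", "Business Supplies")] * 12
results += [("CORNER SHOP", "Groceries"), ("CORNER SHOP", "Dining Out")]

for rule in suggest_rules(results):
    print(rule["name"], rule["confidence"])  # ACME WIDGETS → Business Supplies 90
```

Each suggestion is a plain dictionary in the same shape as a category rule, so it can be shown to the user for approval before anything is written to data/rules.yaml.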

Advanced Patterns

Cross-Category Labels

Apply the same label to multiple categories:

# Tax deductible categories
- type: label
  name: ATO Tax Deductible
  when:
    categories: [Work, Professional Development, Home Office, Software]
  labels: [Tax Deductible, ATO: D1]

# Large purchases across all categories
- type: label
  name: Large Purchase Alert
  when:
    amount_operator: ">"
    amount_value: 500
  labels: [Large Purchase, Review Required]

Account-Based Workflows

Apply different labels to the same category depending on the account:

# Same category rule for all accounts
- type: category
  name: Transport
  patterns: [UBER, LYFT, TAXI]
  category: Transport
  confidence: 90

# Personal transport
- type: label
  name: Personal Transport
  when:
    categories: [Transport]
    accounts: [Personal]
  labels: [Personal, Discretionary]

# Work transport (reimbursable)
- type: label
  name: Work Transport
  when:
    categories: [Transport]
    accounts: [Work, Personal]  # Can be from either account
    amount_operator: ">"
    amount_value: 20            # But large amounts suggest work trips
  labels: [Work Related, Reimbursable]

Shared Household Expense Tracking

Track who paid for shared expenses:

# Shared groceries
- type: category
  name: Shared Groceries
  patterns: [WOOLWORTHS, COLES]
  accounts: [Shared Bills, Joint Account]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  name: Shared Essential
  when:
    categories: [Groceries]
    accounts: [Shared Bills, Joint Account]
  labels: [Shared Expense, Essential, Needs Reconciliation]

# Large shared purchases need approval
- type: label
  name: Needs Approval
  when:
    accounts: [Shared Bills, Joint Account]
    amount_operator: ">"
    amount_value: 150
  labels: [Needs Approval, Review Required]

Tax Deductible Tracking

Flag potential tax deductions with ATO codes:

# Work-related expenses
- type: label
  name: Work Deduction - D1
  when:
    categories: [Work, Office Supplies, Professional Development]
  labels: [Tax Deductible, ATO: D1, Work-related other expenses]

# Home office expenses
- type: label
  name: Home Office Deduction - D2
  when:
    categories: [Home Office, Internet, Phone]
  labels: [Tax Deductible, ATO: D2, Home office expenses]

# Large deductions requiring substantiation
- type: label
  name: Requires Receipt (>$300)
  when:
    labels: [Tax Deductible]  # Note: This won't work! Labels can't check labels
    amount_operator: ">"
    amount_value: 300
  labels: [Substantiation Required, Keep Receipt]

Important: Label rules cannot check for other labels. Use categories or accounts instead.
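
A small pre-flight check can catch this mistake before a run. The sketch below assumes the only valid `when` keys are the ones in the Label Rules field table; `check_label_rule` is a hypothetical helper, not part of Agent Smith.

```python
# Conditions documented for label rules in this guide.
ALLOWED_WHEN_KEYS = {"categories", "accounts", "amount_operator",
                     "amount_value", "uncategorized"}


def check_label_rule(rule):
    """Return a list of human-readable problems found in a label rule."""
    problems = []
    when = rule.get("when", {})
    for key in sorted(set(when) - ALLOWED_WHEN_KEYS):
        problems.append(f"unsupported when-condition: {key}")
    if "amount_operator" in when and "amount_value" not in when:
        problems.append("amount_operator requires amount_value")
    if not rule.get("labels"):
        problems.append("labels must be a non-empty list")
    return problems


broken = {"type": "label", "name": "Requires Receipt (>$300)",
          "when": {"labels": ["Tax Deductible"], "amount_operator": ">", "amount_value": 300},
          "labels": ["Substantiation Required", "Keep Receipt"]}

# Same intent, expressed through categories instead of labels.
fixed = {"type": "label", "name": "Requires Receipt (>$300)",
         "when": {"categories": ["Work", "Home Office"], "amount_operator": ">", "amount_value": 300},
         "labels": ["Substantiation Required", "Keep Receipt"]}

print(check_label_rule(broken))  # ['unsupported when-condition: labels']
print(check_label_rule(fixed))   # []
```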

Uncategorized Transaction Management

Flag and prioritize uncategorized transactions:

# Flag all uncategorized
- type: label
  name: Needs Categorization
  when:
    uncategorized: true
  labels: [Uncategorized, Needs Review]

# High-priority uncategorized (large amounts)
- type: label
  name: High Priority Uncategorized
  when:
    uncategorized: true
    amount_operator: ">"
    amount_value: 100
  labels: [Uncategorized, High Priority, Urgent Review]

# Uncategorized in shared account
- type: label
  name: Uncategorized Shared Expense
  when:
    uncategorized: true
    accounts: [Shared Bills, Joint Account]
  labels: [Uncategorized, Shared Account, Needs Approval]

Best Practices

1. Order Rules Specific → General

Rules are evaluated in order. Put specific rules first:

# ✓ GOOD: Specific first
- type: category
  name: UBER EATS → Dining Out
  patterns: [UBER EATS]
  category: Food & Dining > Dining Out
  confidence: 95

- type: category
  name: UBER → Transport
  patterns: [UBER]
  category: Transport
  confidence: 90

# ✗ BAD: General first (UBER catches UBER EATS)
- type: category
  name: UBER → Transport
  patterns: [UBER]
  category: Transport
  confidence: 90

- type: category
  name: UBER EATS → Dining Out  # Never reached!
  patterns: [UBER EATS]
  category: Food & Dining > Dining Out
  confidence: 95

Fix with exclusions:

- type: category
  name: UBER → Transport
  patterns: [UBER]
  exclude_patterns: [UBER EATS]
  category: Transport
  confidence: 90
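
Ordering mistakes like the one above can be detected mechanically. The following heuristic is a sketch under the assumption that patterns are matched as case-insensitive substrings; `find_shadowed_rules` is a hypothetical helper, and earlier rules with account or amount filters are skipped because they do not catch every transaction.

```python
def find_shadowed_rules(rules):
    """Return (earlier_name, later_name) pairs where the later rule can never win."""
    shadowed = []
    category_rules = [r for r in rules if r["type"] == "category"]
    for i, later in enumerate(category_rules):
        for earlier in category_rules[:i]:
            if earlier.get("accounts") or earlier.get("amount_operator"):
                continue  # filtered rules only catch some transactions
            excluded = [x.upper() for x in earlier.get("exclude_patterns", [])]

            def caught(pattern):
                pattern = pattern.upper()
                hit = any(e.upper() in pattern for e in earlier["patterns"])
                vetoed = any(x in pattern for x in excluded)
                return hit and not vetoed

            # Shadowed when every later pattern is already caught by the earlier rule.
            if all(caught(p) for p in later["patterns"]):
                shadowed.append((earlier["name"], later["name"]))
                break
    return shadowed


uber = {"type": "category", "name": "UBER → Transport", "patterns": ["UBER"]}
uber_eats = {"type": "category", "name": "UBER EATS → Dining Out", "patterns": ["UBER EATS"]}
uber_fixed = dict(uber, exclude_patterns=["UBER EATS"])

print(find_shadowed_rules([uber, uber_eats]))        # [('UBER → Transport', 'UBER EATS → Dining Out')]
print(find_shadowed_rules([uber_eats, uber]))        # []
print(find_shadowed_rules([uber_fixed, uber_eats]))  # []
```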

2. Use Visual Grouping

Group related rules with comments for easy scanning:

# ═══════════════════════════════════════════════════════════
# GROCERIES WORKFLOW
# ═══════════════════════════════════════════════════════════

- type: category
  name: Groceries
  patterns: [WOOLWORTHS, COLES, ALDI]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  name: Essential Spending
  when:
    categories: [Groceries]
  labels: [Essential, Needs]

- type: label
  name: Shared Groceries
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
  labels: [Shared Expense, Reconciliation]

# ═══════════════════════════════════════════════════════════
# TRANSPORT WORKFLOW
# ═══════════════════════════════════════════════════════════

- type: category
  name: Rideshare
  patterns: [UBER, LYFT]
  exclude_patterns: [UBER EATS]
  category: Transport
  confidence: 90

3. Start with High Confidence

Begin with rules you're certain about (95%+):

# High confidence - very specific merchants
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS]
  category: Food & Dining > Groceries
  confidence: 95

- type: category
  name: AGL → Utilities
  patterns: [AGL]
  category: Bills > Utilities
  confidence: 95

Add medium-confidence rules (80-90%) later as you verify:

# Medium confidence - could be ambiguous
- type: category
  name: Amazon Purchases
  patterns: [AMAZON]
  category: Shopping
  confidence: 80  # Could be books, electronics, groceries, etc.

4. Test with Dry Run

Always test before applying to real transactions:

# Preview what would happen without making changes
uv run python scripts/operations/batch_categorize.py \
  --mode=dry_run \
  --period=2025-11 \
  --limit=50

# See what would change on existing categorizations
uv run python scripts/operations/batch_categorize.py \
  --mode=validate \
  --period=2025-11

Review the output carefully before running with --mode=apply.

5. Version Control Your Rules

Commit data/rules.yaml to git to track evolution:

# After adding/modifying rules
git add data/rules.yaml
git commit -m "rules: add coffee shop categorization with personal label"

# View history
git log --oneline data/rules.yaml

# Compare versions
git diff HEAD~1 data/rules.yaml

6. Review Rule Performance Regularly

Check rule accuracy monthly:

# Analyze categorization coverage
/agent-smith-analyze rules --period=last-month

# See which rules are matching most often
/agent-smith-analyze rules --sort=matches

# Find low-accuracy rules
/agent-smith-analyze rules --min-accuracy=80

Refine rules that have low accuracy or aren't matching as expected.

7. Use Templates as Starting Points

Don't start from scratch - use a template:

uv run python scripts/setup/template_selector.py

Then customize by:

Updating merchant names for your region (e.g., WOOLWORTHS → KROGER)
Adjusting account names to match your PocketSmith setup
Adding your specific categories
Fine-tuning confidence scores based on your data

8. Document Complex Rules

Add comments explaining non-obvious rules:

# Complex rule: UBER is transport UNLESS it's UBER EATS or during work hours
# Work hours trips from Personal account are likely work-related (reimbursable)
- type: category
  name: UBER Transport (Excluding Food Delivery)
  patterns: [UBER]
  exclude_patterns: [UBER EATS, UBER EATS MARKETPLACE]
  category: Transport
  confidence: 90

# Note: Work-related UBER trips need manual review for reimbursement
# They'll get the "Reimbursable" label from the account-based rule below

Operational Modes

The batch processor supports three operational modes for safe rule testing and application.

DRY_RUN Mode

Purpose: Preview what would happen without making any changes.

Use when:

Testing new rules
Checking rule coverage
Seeing potential categorizations before committing

Example:

uv run python scripts/operations/batch_categorize.py \
  --mode=dry_run \
  --period=2025-11

Output:

DRY RUN MODE - No changes will be made

Transaction #12345: WOOLWORTHS -$127.50
  → Would categorize as: Food & Dining > Groceries (95% confidence)
  → Would apply labels: [Essential, Shared Expense]

Transaction #12346: STARBUCKS -$6.50
  → Would categorize as: Food & Dining > Dining Out (90% confidence)
  → Would apply labels: [Discretionary, Personal]

Transaction #12347: ACME WIDGETS -$245.00
  → No rule match
  → Would request LLM categorization

Summary:
  Would categorize: 2/3 transactions (66.7%)
  LLM fallback needed: 1 transaction
  No changes made (DRY RUN)

VALIDATE Mode

Purpose: Show what would CHANGE on transactions that already have categories.

Use when:

Checking if new rules conflict with existing categorizations
Planning to update categories with better rules
Auditing categorization accuracy

Example:

uv run python scripts/operations/batch_categorize.py \
  --mode=validate \
  --period=2025-11

Output:

VALIDATE MODE - Showing potential changes

Transaction #12345: WOOLWORTHS -$127.50
  Current: Food (80% confidence)
  New: Food & Dining > Groceries (95% confidence)
  Change: Category would be updated ✓

Transaction #12346: STARBUCKS -$6.50
  Current: Food & Dining > Dining Out (90% confidence)
  New: Food & Dining > Dining Out (90% confidence)
  Change: No change (same category)

Transaction #12347: UBER -$25.00
  Current: Dining Out (user-assigned)
  New: Transport (90% confidence from rule)
  Change: Category would be REPLACED (was user-assigned!)

Summary:
  Would update: 2 transactions
  Already correct: 1 transaction
  Would replace user assignments: 1 transaction ⚠️
  No changes made (VALIDATE)

APPLY Mode

Purpose: Actually apply categorizations and labels to transactions.

Use when:

Ready to commit changes after testing with DRY_RUN/VALIDATE
Processing new uncategorized transactions
Updating categorizations with improved rules

Example:

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --period=2025-11 \
  --update-strategy=skip_existing

Output:

APPLY MODE - Making changes to PocketSmith

Transaction #12345: WOOLWORTHS -$127.50
  ✓ Categorized as: Food & Dining > Groceries (95%)
  ✓ Labels applied: [Essential, Shared Expense]

Transaction #12346: STARBUCKS -$6.50
  ⊘ Skipped (already categorized)

Transaction #12347: ACME WIDGETS -$245.00
  → Requesting LLM categorization...
  ? Suggested: Business Supplies (85% confidence)
    [A]ccept  [E]dit  [S]kip  [C]reate Rule

Update Strategies

Control how the batch processor handles transactions that already have categories.

SKIP_EXISTING (Default)

Only process uncategorized transactions. Leave existing categorizations unchanged.

Use when:

Processing new transactions
Avoiding overrides of user-assigned categories
Preserving manual categorization work

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=skip_existing

Behavior:

Uncategorized → Apply rules
Already categorized → Skip
User-assigned → Skip

REPLACE_ALL

Replace ALL categorizations, even if they were user-assigned.

Use when:

Rebuilding all categorizations from scratch
Confident new rules are better than old assignments
Fixing systemic categorization errors

⚠️ Warning: This will override user-assigned categories!

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=replace_all

Behavior:

Uncategorized → Apply rules
Already categorized → Replace with rule result
User-assigned → Replace with rule result (loses user intent!)

UPGRADE_CONFIDENCE

Replace categorization ONLY if new rule has higher confidence.

Use when:

Improving categorizations with better rules
Keeping high-confidence assignments
Upgrading low-confidence auto-categorizations

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=upgrade_confidence

Behavior:

Uncategorized → Apply rules
Lower confidence → Replace with higher confidence rule
Higher confidence → Keep existing
User-assigned (confidence: 100%) → Never replaced

Example:

Transaction: WOOLWORTHS -$50
Current: Food (80% confidence from old rule)
New: Food & Dining > Groceries (95% confidence from new rule)
→ REPLACED (95% > 80%)

Transaction: STARBUCKS -$6
Current: Dining Out (95% confidence)
New: Dining Out (90% confidence from new rule)
→ KEPT (95% > 90%)

REPLACE_IF_DIFFERENT

Replace categorization if the category NAME differs.

Use when:

Fixing miscategorized transactions
Migrating to a new category structure
Correcting category hierarchies

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=replace_if_different

Behavior:

Uncategorized → Apply rules
Same category → Keep existing
Different category → Replace with rule result
User-assigned → Still replaced if different!

Example:

Transaction: WOOLWORTHS -$50
Current: Food
New: Food & Dining > Groceries
→ REPLACED (different category name)

Transaction: STARBUCKS -$6
Current: Dining Out
New: Dining Out
→ KEPT (same category)
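
The four strategies reduce to one decision per transaction. This sketch uses a hypothetical `should_replace` function and dictionary shapes; the behaviour follows the tables above, including the treatment of user-assigned categories as 100% confidence under upgrade_confidence.

```python
def should_replace(strategy, existing, new):
    """Decide whether a rule result should overwrite the current categorization.

    existing: None (uncategorized) or {"category", "confidence", "user_assigned"}
    new:      {"category", "confidence"} produced by the rules
    """
    if existing is None:
        return True  # uncategorized transactions are always processed
    if strategy == "skip_existing":
        return False
    if strategy == "replace_all":
        return True
    if strategy == "upgrade_confidence":
        # User-assigned categories count as 100%, so they are never beaten.
        current = 100 if existing.get("user_assigned") else existing["confidence"]
        return new["confidence"] > current
    if strategy == "replace_if_different":
        return new["category"] != existing["category"]
    raise ValueError(f"unknown update strategy: {strategy}")


old_food = {"category": "Food", "confidence": 80, "user_assigned": False}
groceries = {"category": "Food & Dining > Groceries", "confidence": 95}
manual = {"category": "Dining Out", "confidence": 100, "user_assigned": True}
transport = {"category": "Transport", "confidence": 90}

print(should_replace("upgrade_confidence", old_food, groceries))  # True  (95 > 80)
print(should_replace("upgrade_confidence", manual, transport))    # False (user-assigned)
print(should_replace("replace_if_different", manual, transport))  # True  (category differs)
print(should_replace("skip_existing", old_food, groceries))       # False
```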

Template System

Agent Smith provides pre-built rule templates for common household types. Templates are stored in data/templates/ and can be applied to create your data/rules.yaml.

Available Templates

1. Simple - Single Person

File: data/templates/simple.yaml

Best for:

Single person households
No shared expenses
Basic income and expense tracking

Includes:

Income categorization (salary, wages)
Essential expenses (groceries, utilities, rent)
Discretionary spending (dining out, entertainment)
Transport categories
Basic labels (Essential, Discretionary, Large Purchase)
Uncategorized flagging

Example rules:

# Income
- type: category
  patterns: [SALARY, WAGES, EMPLOYER]
  category: Income > Salary
  confidence: 95

# Essential groceries
- type: category
  patterns: [WOOLWORTHS, COLES, ALDI]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  when:
    categories: [Groceries, Utilities, Rent]
  labels: [Essential]

2. Separated Families

File: data/templates/separated-families.yaml

Best for:

Divorced or separated parents
Shared custody arrangements
Child support tracking
Kids' expense management

Includes:

Kids' expense categories (school, activities, clothing, medical)
Child support tracking
Contributor labels (Parent A, Parent B)
Reimbursement workflows
School term and vacation labels
Medical and education receipts flagging

Example rules:

# Child expenses
- type: category
  patterns: [SCHOOL, UNIFORM, SCHOOL FEES]
  category: Kids > Education
  confidence: 90

- type: label
  when:
    categories: [Kids]
  labels: [Child Expense, Needs Documentation]

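# Child support tracking: `when` has no payee-pattern condition, so categorize
# by payee first, then label by category (category name is illustrative)
- type: category
  patterns: [CHILD SUPPORT]
  category: Income > Child Support
  confidence: 95

- type: label
  when:
    categories: [Child Support]
  labels: [Child Support, Parent B Contribution]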

# Shared kid expenses requiring reimbursement
- type: label
  when:
    categories: [Kids]
    amount_operator: ">"
    amount_value: 50
  labels: [Needs Reimbursement, Split 50/50]

3. Shared Household

File: data/templates/shared-household.yaml

Best for:

Couples living together
Roommates sharing expenses
Families with joint accounts

Includes:

Shared vs personal expense separation
Contributor tracking (Person A, Person B)
Approval workflows (large purchases, discretionary spending)
Reconciliation labels
Essential vs discretionary labeling
Account-based routing (Shared Bills, Personal accounts)

Example rules:

# Shared essential expenses
- type: category
  patterns: [WOOLWORTHS, COLES]
  accounts: [Shared Bills, Joint Account]
  category: Food & Dining > Groceries
  confidence: 95

- type: label
  when:
    categories: [Groceries]
    accounts: [Shared Bills]
  labels: [Shared Expense, Essential, Monthly Reconciliation]

# Approval workflow for large shared purchases
- type: label
  when:
    accounts: [Shared Bills]
    amount_operator: ">"
    amount_value: 150
  labels: [Needs Approval, Review Required]

# Personal vs shared distinction
- type: label
  when:
    accounts: [Personal, PersonA Account, PersonB Account]
  labels: [Personal, Individual]

4. Advanced

File: data/templates/advanced.yaml

Best for:

Business owners
Investors and traders
Complex financial situations
Tax optimization focus

Includes:

Business expense categories (with ATO codes)
Investment tracking (shares, crypto, property)
Tax deductible flagging (work, home office, professional development)
Capital gains tracking
Substantiation requirements ($300 threshold)
Instant asset write-off flagging
GST tracking
Business vs personal separation

Example rules:

# Business expenses
- type: category
  patterns: [OFFICE, STATIONERY, SUPPLIES]
  accounts: [Business, Work]
  category: Work > Office Supplies
  confidence: 90

- type: label
  when:
    categories: [Work, Home Office, Professional Development]
  labels: [Tax Deductible, ATO: D1, Business Expense]

# Investment purchases
- type: category
  patterns: [COMMSEC, SELFWEALTH, STAKE]
  category: Investments > Share Purchase
  confidence: 90

- type: label
  when:
    categories: [Investments]
  labels: [CGT Event, Track Cost Base]

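# Substantiation requirements
# (label rules cannot test other labels, so match on the deductible categories)
- type: label
  when:
    categories: [Work, Home Office, Professional Development]
    amount_operator: ">"
    amount_value: 300
  labels: [Receipt Required, ATO Substantiation]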

Applying a Template

Interactive selection:

uv run python scripts/setup/template_selector.py

Output:

══════════════════════════════════════════════════════════════════
Agent Smith - Rule Template Setup
══════════════════════════════════════════════════════════════════

Available templates:

1. Simple - Single Person
   Basic categories for individual financial tracking
   Best for: Single person, no shared expenses

2. Separated Families
   Kids expenses, child support, contributor tracking
   Best for: Divorced/separated parents with shared custody

3. Shared Household
   Shared expense tracking with approval workflows
   Best for: Couples, roommates, or families

4. Advanced
   Tax optimization and investment management
   Best for: Business owners, investors, complex finances

Select template (1-4): 3

Applying template: Shared Household
Backed up existing rules to data/rules.yaml.backup
✓ Template applied successfully!

Next steps:
1. Review data/rules.yaml and customize for your needs
2. Update merchant patterns for your region
3. Adjust account names to match your PocketSmith setup
4. Run: /agent-smith-categorize --mode=dry-run to test

Programmatic usage:

from scripts.setup.template_selector import TemplateSelector

selector = TemplateSelector()

# List templates
templates = selector.list_templates()
for key, info in templates.items():
    print(f"{info['name']}: {info['description']}")

# Apply template
selector.apply_template("shared-household", backup=True)

Customizing Templates

After applying a template:

Update merchant patterns for your region:

# Template (Australian)
patterns: [WOOLWORTHS, COLES, ALDI]

# Customize (US)
patterns: [KROGER, SAFEWAY, WHOLE FOODS]

Adjust account names to match your PocketSmith:

# Template
accounts: [Shared Bills, Joint Account]

# Your setup
accounts: [Joint Checking, Household Card]

Add your specific categories:

# Add new rules
- type: category
  name: Pet Expenses
  patterns: [VET, PET STORE, PETBARN]
  category: Pets > Veterinary
  confidence: 90

Fine-tune confidence scores based on your data:

# Start conservative
confidence: 70

# After validation, increase
confidence: 90

Migration Guide

From Platform Rules to Unified YAML

If you have existing platform rules created via the PocketSmith API, you can migrate them to the unified YAML format.

See: Platform to Local Rules Migration Guide

Quick summary:

Export platform rules to JSON
Convert to unified YAML format
Test with dry run
Disable platform rules (keep for backup)
Enable unified rules

Migration script:

uv run python scripts/migrations/migrate_platform_to_local.py \
  --output=data/rules.yaml \
  --backup

Adding Labels to Existing Rules

If you have category rules and want to add labels:

Keep all existing category rules as-is
Add label rules at the bottom
Test with dry run to see labels applied
Apply with --update-strategy=skip_existing to avoid re-categorizing

Example:

# Existing category rules (don't change)
- type: category
  name: WOOLWORTHS → Groceries
  patterns: [WOOLWORTHS]
  category: Groceries
  confidence: 95

# NEW: Add label rules
- type: label
  name: Essential Spending
  when:
    categories: [Groceries]
  labels: [Essential, Needs]

- type: label
  name: Large Grocery Trip
  when:
    categories: [Groceries]
    amount_operator: ">"
    amount_value: 150
  labels: [Large Purchase]

Run with:

uv run python scripts/operations/batch_categorize.py \
  --mode=apply \
  --update-strategy=skip_existing \
  --period=2025-11

This will:

Skip already categorized transactions (no re-categorization)
Apply new labels to all transactions (even already categorized ones)
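
The split behaviour described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Agent Smith's actual implementation: the `Transaction` dataclass, `apply_skip_existing`, and the two stand-in rule functions are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Transaction:
    # Hypothetical, simplified transaction shape for illustration only.
    payee: str
    category: Optional[str] = None
    labels: list[str] = field(default_factory=list)


def apply_skip_existing(
    txn: Transaction,
    categorize: Callable[[Transaction], Optional[str]],
    label: Callable[[Transaction], list[str]],
) -> Transaction:
    """Apply rules the way this guide describes skip_existing."""
    # Phase 1: only transactions without a category are categorized.
    if txn.category is None:
        txn.category = categorize(txn)
    # Phase 2: label rules run for every transaction, categorized or not.
    for name in label(txn):
        if name not in txn.labels:
            txn.labels.append(name)
    return txn


# Stand-in rule functions (assumptions for the demo, not real rule loading).
def categorize(txn: Transaction) -> Optional[str]:
    return "Groceries" if "WOOLWORTHS" in txn.payee.upper() else None


def label(txn: Transaction) -> list[str]:
    return ["Essential", "Needs"] if txn.category == "Groceries" else []


already_done = Transaction("COLES 0042", category="Groceries")
fresh = Transaction("Woolworths Metro")
apply_skip_existing(already_done, categorize, label)
apply_skip_existing(fresh, categorize, label)
print(already_done)
print(fresh)
```

In this sketch the existing category on `already_done` is left untouched, yet both transactions pick up the new labels.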

Troubleshooting

Rule Not Matching

Symptom: Rule should match but doesn't.

Check:

Pattern case sensitivity - Patterns are case-insensitive, but spacing matters:

# Won't match "UBEREATS" (no space)
patterns: [UBER EATS]

# Better: account for variations
patterns: [UBER EATS, UBEREATS]

Exclusion patterns blocking - Check if an exclusion is preventing the match:

- type: category
  patterns: [UBER]
  exclude_patterns: [UBER EATS, MEDICAL]  # Also blocks "UBER MEDICAL", not just "UBER EATS"
  category: Transport

Account filter too restrictive - Transaction might be in a different account:

# Only matches transactions in "Personal" account
accounts: [Personal]

# Check transaction's actual account name

Amount condition incorrect - Verify the amount operator and value:

amount_operator: ">"
amount_value: 100
# Won't match transactions ≤ $100

Rule order - A previous rule might have matched first (short-circuit):

# General rule matches first!
- patterns: [UBER]
  category: Transport

# Specific rule never reached
- patterns: [UBER EATS]
  category: Dining Out  # Dead code!
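
Rule-order problems like the one above can be caught before a dry run with a small check. The following is a heuristic sketch, assuming patterns are applied as case-insensitive regex searches against the payee and that exclusions veto a match; `rule_matches` and `find_shadowed_patterns` are hypothetical helpers, not part of Agent Smith. It ignores account and amount filters, so treat its output as a hint rather than proof.

```python
import re


def rule_matches(rule: dict, payee: str) -> bool:
    """Assumed semantics: case-insensitive regex search, minus exclusions."""
    hit = any(re.search(p, payee, re.IGNORECASE) for p in rule.get("patterns", []))
    excluded = any(
        re.search(p, payee, re.IGNORECASE) for p in rule.get("exclude_patterns", [])
    )
    return hit and not excluded


def find_shadowed_patterns(rules: list[dict]) -> list[tuple[str, str]]:
    """Return (pattern, earlier_category) pairs for patterns that look unreachable.

    Heuristic: treat each pattern's own text as a sample payee and check
    whether an earlier rule already matches it.
    """
    shadowed = []
    for index, rule in enumerate(rules):
        for pattern in rule.get("patterns", []):
            for earlier in rules[:index]:
                if rule_matches(earlier, pattern):
                    shadowed.append((pattern, earlier["category"]))
                    break
    return shadowed


rules = [
    {"patterns": ["UBER"], "category": "Transport"},
    {"patterns": ["UBER EATS"], "category": "Dining Out"},
]
print(find_shadowed_patterns(rules))  # [('UBER EATS', 'Transport')]
```

Reversing the two rules, or adding `exclude_patterns: [UBER EATS]` to the general rule, makes the check return an empty list.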

Debug with test script:

# Test specific payee
uv run python scripts/operations/test_rules.py \
  --payee="EXACT PAYEE NAME" \
  --account="Account Name" \
  --amount=127.50 \
  --debug

Output:

Testing transaction:
  Payee: EXACT PAYEE NAME
  Account: Account Name
  Amount: $127.50

Checking category rules...
  ✗ Rule 1 "WOOLWORTHS → Groceries": Pattern mismatch
  ✗ Rule 2 "UBER → Transport": Pattern mismatch
  ✗ Rule 3 "CAFE → Dining Out": Pattern mismatch

No category match found.

Checking label rules...
  (skipped - no category assigned)

Result: No categorization

Multiple Rules Matching

Symptom: Worried about multiple rules matching the same transaction.

This is expected. Category rules short-circuit (the first match wins), while label rules accumulate every match.

For category rules:

# Only the FIRST matching rule applies
- patterns: [UBER EATS]
  category: Dining Out
  # ✓ This matches UBER EATS

- patterns: [UBER]
  category: Transport
  # ✗ Never reached for UBER EATS (already matched above)

For label rules:

# ALL matching rules apply (additive)
- type: label
  when:
    categories: [Groceries]
  labels: [Essential]
  # ✓ Matches

- type: label
  when:
    amount_operator: ">"
    amount_value: 100
  labels: [Large Purchase]
  # ✓ Also matches

# Result: [Essential, Large Purchase]
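
The difference between the two phases can be sketched as follows. This is an illustrative model only: it assumes case-insensitive regex search for patterns and exact category-name equality in `when.categories`, and the helper names `categorize` and `collect_labels` are hypothetical. Only the `>` operator appears in this guide; the other entries in `OPERATORS` are assumptions.

```python
import operator
import re

# Only ">" appears in this guide; the other operators are assumed for illustration.
OPERATORS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def categorize(payee, category_rules):
    """Phase 1: return the category of the first matching rule, else None."""
    for rule in category_rules:
        if any(re.search(p, payee, re.IGNORECASE) for p in rule["patterns"]):
            return rule["category"]  # short-circuit: later rules are not checked
    return None


def collect_labels(category, amount, label_rules):
    """Phase 2: gather labels from every rule whose conditions all hold."""
    labels = []
    for rule in label_rules:
        when = rule["when"]
        if "categories" in when and category not in when["categories"]:
            continue
        if "amount_operator" in when:
            compare = OPERATORS[when["amount_operator"]]
            if not compare(amount, when["amount_value"]):
                continue
        for name in rule["labels"]:
            if name not in labels:
                labels.append(name)
    return labels


category_rules = [
    {"patterns": ["UBER EATS"], "category": "Dining Out"},
    {"patterns": ["UBER"], "category": "Transport"},
    {"patterns": ["WOOLWORTHS"], "category": "Groceries"},
]
label_rules = [
    {"when": {"categories": ["Groceries"]}, "labels": ["Essential"]},
    {"when": {"amount_operator": ">", "amount_value": 100}, "labels": ["Large Purchase"]},
]

category = categorize("Woolworths 1234", category_rules)
print(category, collect_labels(category, 127.50, label_rules))
```

Phase 1 returns as soon as one rule matches, whereas Phase 2 always walks the full list, which is why label order matters far less than category order.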

Fix unwanted category matches by adjusting rule order or using exclusions.

Control label accumulation by making conditions more specific:

# Too broad - applies to ALL transactions
- type: label
  when:
    amount_operator: ">"
    amount_value: 0
  labels: [Has Amount]  # Not useful!

# Better - specific categories only
- type: label
  when:
    categories: [Groceries, Dining Out]
    amount_operator: ">"
    amount_value: 100
  labels: [Large Food Purchase]

Labels Not Applying

Symptom: Label rule should match but labels aren't applied.

Check:

Category must be assigned first - Labels depend on Phase 1 categorization:

# Label requires category "Groceries"
- type: label
  when:
    categories: [Groceries]
  labels: [Essential]

# But transaction wasn't categorized in Phase 1
# → Label rule won't match

Fix: Ensure a category rule matches the transaction first.

When conditions too restrictive - All conditions must match (AND logic):

- type: label
  when:
    categories: [Groceries]  # Must match
    accounts: [Shared Bills]  # AND must match
    amount_operator: ">"       # AND must match
    amount_value: 100
  labels: [Large Shared Grocery]

# Won't match if ANY condition fails

Uncategorized flag misused - uncategorized: true can't be combined with a categories condition:

# This won't work as expected
- type: label
  when:
    uncategorized: true
    categories: [Groceries]  # Contradiction! Can't be both uncategorized and have a category
  labels: [Invalid]

Fix: Use uncategorized: true alone or with accounts/amount only.

Labels can't check labels - You can't reference other labels in conditions:

# This WON'T work - no way to check existing labels
- type: label
  when:
    labels: [Tax Deductible]  # Not supported!
    amount_operator: ">"
    amount_value: 300
  labels: [Substantiation Required]

Fix: Use categories or accounts as conditions instead.
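
The last two checks are mechanical enough to automate. The sketch below is a hypothetical linter (the function name `lint_label_rule` and the message wording are assumptions) that inspects a label rule already loaded into a Python dict and reports the two unsupported combinations described above.

```python
def lint_label_rule(rule: dict) -> list[str]:
    """Return human-readable problems found in one label rule's `when` block."""
    problems = []
    when = rule.get("when", {})
    # An uncategorized transaction has no category, so both can never hold.
    if when.get("uncategorized") and "categories" in when:
        problems.append("`uncategorized: true` cannot be combined with `categories`")
    # Per the troubleshooting notes, conditions cannot reference other labels.
    if "labels" in when:
        problems.append("`when.labels` is not supported; use categories or accounts")
    return problems


bad_rule = {
    "type": "label",
    "when": {"uncategorized": True, "categories": ["Groceries"]},
    "labels": ["Invalid"],
}
good_rule = {
    "type": "label",
    "when": {"uncategorized": True, "accounts": ["Personal"]},
    "labels": ["Needs Review"],
}
print(lint_label_rule(bad_rule))
print(lint_label_rule(good_rule))
```

Running a check like this over every `type: label` entry before a dry run surfaces rules that would otherwise fail silently.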

Debug:

uv run python scripts/operations/test_rules.py \
  --payee="WOOLWORTHS" \
  --account="Shared Bills" \
  --amount=127.50 \
  --category="Groceries" \
  --debug

Confidence Scores Unclear

Symptom: Not sure what confidence score to use.

Guidelines:

| Confidence | When to Use | Example |
|------------|-------------|---------|
| 95-100% | Exact merchant match, no ambiguity | WOOLWORTHS → Groceries |
| 85-94% | Very likely but minor ambiguity | AMAZON → Shopping (could be many subcategories) |
| 75-84% | Likely but context-dependent | UBER → Transport (unless UBER EATS) |
| 70-74% | Moderate confidence, needs validation | Generic patterns like "MARKET" |
| < 70% | Low confidence, probably shouldn't auto-apply | Broad patterns |

Smart mode thresholds:

≥ 90%: Auto-apply
70-89%: Ask user (with LLM validation)
< 70%: Skip
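
These thresholds map directly onto a small decision function. The sketch below is an illustration of the Smart mode bands only; the function name `smart_mode_action` and the returned strings are hypothetical and are not taken from Agent Smith's code.

```python
def smart_mode_action(confidence: int) -> str:
    """Map a rule confidence (0-100) to the Smart mode action listed above."""
    if not 0 <= confidence <= 100:
        raise ValueError(f"confidence must be between 0 and 100, got {confidence}")
    if confidence >= 90:
        return "auto_apply"
    if confidence >= 70:
        return "ask_user"  # per the guide, asked with LLM validation
    return "skip"


for score in (95, 89, 70, 69):
    print(score, smart_mode_action(score))
```

Note the practical consequence: a rule at 89% still prompts on every match, so raising it to 90% changes behaviour far more than raising it from 90% to 95%.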

Start high (95%), reduce if:

LLM frequently suggests different category
User frequently overrides
Pattern matches too broadly

LLM Not Being Used

Symptom: Expected LLM fallback but it's not happening.

Possible causes:

Rule matched - The LLM is only used when NO rule matches:
```
Transaction: ACME WIDGETS
Rule match: "Generic Business" pattern [WIDGETS] (75%)
→ Rule applied, LLM not needed
```
Fix: Remove overly broad rules if you want LLM to handle edge cases.

Categories not provided - LLM needs category list:

workflow.categorize_transaction(
    transaction=txn,
    available_categories=None  # ← LLM can't suggest without categories!
)

Fix: Pass available_categories from PocketSmith API.

Conservative mode - Conservative never auto-applies, whatever the confidence, so an LLM suggestion shows up as a prompt rather than an applied category:
```
Mode: Conservative
LLM suggests: Business Supplies (85%)
→ Asks user (doesn't auto-apply)
```
This is expected! Conservative always asks.

Performance Issues

Symptom: Batch categorization is slow with many rules.

Optimizations:

Reduce rule count - Consolidate similar patterns:

# Before: 3 rules
- patterns: [WOOLWORTHS]
  category: Groceries
- patterns: [COLES]
  category: Groceries
- patterns: [ALDI]
  category: Groceries

# After: 1 rule
- patterns: [WOOLWORTHS, COLES, ALDI]
  category: Groceries

Use account filters - Skip irrelevant transactions early:

# Check account BEFORE pattern matching
- patterns: [WORK PATTERN]
  accounts: [Work Credit Card]  # Skips 90% of transactions
  category: Work Expenses

Order by frequency - Put most common rules first:

# Most frequent transaction (groceries) - check first
- patterns: [WOOLWORTHS, COLES]
  category: Groceries

# Less frequent - check later
- patterns: [RARE MERCHANT]
  category: Rare Category

Limit batch size - Process in smaller chunks:

# Instead of processing all at once
uv run python scripts/operations/batch_categorize.py --period=2025

# Process month by month
uv run python scripts/operations/batch_categorize.py --period=2025-01
uv run python scripts/operations/batch_categorize.py --period=2025-02
# etc.
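
The month-by-month loop can be generated rather than typed out. This sketch only builds the command lines using the `--period=YYYY-MM` form shown above; the helper name `monthly_commands` is hypothetical, and execution is left behind an explicit flag so nothing runs by accident.

```python
import subprocess


def monthly_commands(year: int) -> list[list[str]]:
    """Build one batch_categorize.py invocation per calendar month."""
    base = ["uv", "run", "python", "scripts/operations/batch_categorize.py"]
    return [base + [f"--period={year}-{month:02d}"] for month in range(1, 13)]


def run_by_month(year: int, execute: bool = False) -> None:
    """Print each monthly command; run it only when execute=True."""
    for command in monthly_commands(year):
        print(" ".join(command))
        if execute:
            # check=True stops at the first failing month instead of continuing.
            subprocess.run(command, check=True)


run_by_month(2025)
```

Calling `run_by_month(2025, execute=True)` from the repository root would process the year in twelve chunks.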

Examples

See docs/examples/ for complete example YAML files:

basic-rules.yaml - Simple category and label rules
advanced-patterns.yaml - Complex rules with exclusions, amounts, accounts
household-workflow.yaml - Complete shared household setup
tax-deductible.yaml - Tax optimization rules with ATO codes
migration-example.yaml - Migrated from platform rules

Support

For questions or issues:

Check this guide's troubleshooting section
Review example files in docs/examples/
Check template files in data/templates/
Refer to design documentation
Create an issue in the repository

Last Updated: 2025-11-22
Version: 1.0.0
