Initial commit

2025-11-30 08:47:43 +08:00
commit 2e8d89fca3
41 changed files with 14051 additions and 0 deletions
--- a/skills/technical-patterns/policyengine-code-style-skill/SKILL.md
+++ b/skills/technical-patterns/policyengine-code-style-skill/SKILL.md
@@ -0,0 +1,382 @@
+---
+name: policyengine-code-style
+description: PolicyEngine code writing style guide - formula optimization, direct returns, eliminating unnecessary variables
+---
+
+# PolicyEngine Code Writing Style Guide
+
+Essential patterns for writing clean, efficient PolicyEngine formulas.
+
+## Core Principles
+
+1. **Eliminate unnecessary intermediate variables**
+2. **Use direct parameter/variable access**
+3. **Return directly when possible**
+4. **Combine boolean logic**
+5. **Use correct period access** (period vs period.this_year)
+6. **NO hardcoded values** - use parameters or constants
+
+---
+
+## Pattern 1: Direct Parameter Access
+
+### ❌ Bad - Unnecessary intermediate variable
+
+```python
+def formula(spm_unit, period, parameters):
+    countable = spm_unit("tn_tanf_countable_resources", period)
+    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
+    resource_limit = p.amount  # ❌ Unnecessary
+    return countable <= resource_limit
+```
+
+### ✅ Good - Direct access
+
+```python
+def formula(spm_unit, period, parameters):
+    countable = spm_unit("tn_tanf_countable_resources", period)
+    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
+    return countable <= p.amount
+```
+
+---
+
+## Pattern 2: Direct Return
+
+### ❌ Bad - Unnecessary result variable
+
+```python
+def formula(spm_unit, period, parameters):
+    assets = spm_unit("spm_unit_assets", period.this_year)
+    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
+    vehicle_exemption = p.vehicle_exemption  # ❌ Unnecessary
+    countable = max_(assets - vehicle_exemption, 0)  # ❌ Unnecessary
+    return countable
+```
+
+### ✅ Good - Direct return
+
+```python
+def formula(spm_unit, period, parameters):
+    assets = spm_unit("spm_unit_assets", period.this_year)
+    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
+    return max_(assets - p.vehicle_exemption, 0)
+```
+
+---
+
+## Pattern 3: Combined Boolean Logic
+
+### ❌ Bad - Too many intermediate booleans
+
+```python
+def formula(spm_unit, period, parameters):
+    person = spm_unit.members
+    age = person("age", period.this_year)
+    is_disabled = person("is_disabled", period.this_year)
+
+    caretaker_is_60_or_older = spm_unit.any(age >= 60)  # ❌ Unnecessary
+    caretaker_is_disabled = spm_unit.any(is_disabled)   # ❌ Unnecessary
+    eligible = caretaker_is_60_or_older | caretaker_is_disabled  # ❌ Unnecessary
+
+    return eligible
+```
+
+### ✅ Good - Combined logic
+
+```python
+def formula(spm_unit, period, parameters):
+    person = spm_unit.members
+    age = person("age", period.this_year)
+    is_disabled = person("is_disabled", period.this_year)
+
+    return spm_unit.any((age >= 60) | is_disabled)
+```
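The merge above is safe because `any` distributes over `|`: a unit contains a member who is 60+ or disabled exactly when it contains a 60+ member or a disabled member. Below is a minimal plain-Python check of that equivalence. `unit_any` is a hypothetical stand-in for `spm_unit.any`, and the member data is made up. The same merge would not be valid for `&`, which the last two lines illustrate.

```python
def unit_any(flags):
    """Hypothetical stand-in for spm_unit.any over one unit's members."""
    return any(flags)


# Made-up members of one unit: (age, is_disabled)
members = [(34, False), (61, False), (8, True)]
ages = [age for age, _ in members]
disabled = [flag for _, flag in members]

# OR: aggregating separately and aggregating the combined condition agree.
separate = unit_any(a >= 60 for a in ages) or unit_any(disabled)
combined = unit_any(a >= 60 or d for a, d in zip(ages, disabled))

# AND: the two forms can differ. One member is 60+, another is disabled,
# but no single member is both.
separate_and = unit_any(a >= 60 for a in ages) and unit_any(disabled)
combined_and = unit_any(a >= 60 and d for a, d in zip(ages, disabled))
```

So combine conditions inside a single `any` only when they are joined by `|`.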
+
+---
+
+## Pattern 4: Period Access - period vs period.this_year
+
+### ❌ Bad - Wrong period access
+
+```python
+def formula(person, period, parameters):
+    # MONTH formula accessing YEAR variables
+    age = person("age", period)  # ❌ Gives age/12 (age 30 becomes a "monthly age" of 2.5)
+    assets = person("assets", period)  # ❌ Gives assets/12
+    monthly_income = person("employment_income", period.this_year) / MONTHS_IN_YEAR  # ❌ Redundant
+
+    return (age >= 18) & (assets < 10000) & (monthly_income < 2000)
+```
+
+### ✅ Good - Correct period access
+
+```python
+def formula(person, period, parameters):
+    # MONTH formula accessing YEAR variables
+    age = person("age", period.this_year)  # ✅ Gets actual age (30)
+    assets = person("assets", period.this_year)  # ✅ Gets actual assets ($10,000)
+    monthly_income = person("employment_income", period)  # ✅ Auto-converts to monthly
+
+    p = parameters(period).gov.program.eligibility
+    return (
+        (age >= p.age_min) & (age <= p.age_max)
+        & (assets < p.asset_limit) & (monthly_income < p.income_threshold)
+    )
+```
+
+**Rule (for a MONTH formula reading YEAR variables):**
+- Income/flows → Use `period` (the annual amount is auto-converted to monthly)
+- Age/assets/counts/booleans → Use `period.this_year` (the value must not be divided by 12)
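To make the rule concrete, here is a toy model of the auto-conversion described above. It is a sketch of the behavior only, not PolicyEngine's implementation: `get_value`, `ANNUAL_VALUES`, and the `"month"`/`"year"` strings are hypothetical, and the numbers are made up.

```python
MONTHS_IN_YEAR = 12

# Hypothetical YEAR-defined values for one person (made-up numbers).
ANNUAL_VALUES = {"age": 30, "assets": 10_000, "employment_income": 24_000}


def get_value(name, period):
    """Toy lookup: a YEAR value requested for a month is divided by 12;
    requested for the year, it is returned as stored."""
    value = ANNUAL_VALUES[name]
    return value / MONTHS_IN_YEAR if period == "month" else value


monthly_age = get_value("age", "month")  # meaningless "monthly age"
actual_age = get_value("age", "year")  # what an age test needs
monthly_income = get_value("employment_income", "month")  # sensible for a flow
```

Dividing by 12 is meaningful for a flow such as income, but it silently corrupts a stock or a count such as age or assets.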
+
+---
+
+## Pattern 5: No Hardcoded Values
+
+### ❌ Bad - Hardcoded numbers
+
+```python
+def formula(spm_unit, period, parameters):
+    size = spm_unit.nb_persons()
+    capped_size = min_(size, 10)  # ❌ Hardcoded
+
+    person = spm_unit.members
+    age = person("age", period.this_year)
+    income = person("income", period) / 12  # ❌ Hardcoded 12, and `period` already converts to monthly
+
+    # ❌ Hardcoded thresholds
+    if age >= 18 and age <= 65 and income < 2000:
+        return True
+```
+
+### ✅ Good - Parameterized
+
+```python
+def formula(spm_unit, period, parameters):
+    p = parameters(period).gov.program
+    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)  # ✅
+
+    person = spm_unit.members
+    age = person("age", period.this_year)
+    monthly_income = person("income", period)  # ✅ Auto-converts (no manual /12)
+
+    age_eligible = (age >= p.age_min) & (age <= p.age_max)  # ✅
+    income_eligible = monthly_income < p.income_threshold  # ✅
+
+    return age_eligible & income_eligible
+```
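One way to see the benefit of parameterizing: when a threshold changes, only the parameter value changes and the formula stays untouched. The sketch below uses scalars instead of vectors. `PARAMETERS_BY_YEAR` and `is_eligible` are hypothetical names, and all values are made up.

```python
from types import SimpleNamespace

# Hypothetical parameter values by year; field names mirror the example above.
PARAMETERS_BY_YEAR = {
    2024: SimpleNamespace(age_min=18, age_max=65, income_threshold=2_000),
    2025: SimpleNamespace(age_min=18, age_max=65, income_threshold=2_200),
}


def is_eligible(age, monthly_income, year):
    """Scalar eligibility test that reads every threshold from parameters."""
    p = PARAMETERS_BY_YEAR[year]
    age_eligible = p.age_min <= age <= p.age_max
    return age_eligible and monthly_income < p.income_threshold


eligible_2024 = is_eligible(30, 2_100, 2024)
eligible_2025 = is_eligible(30, 2_100, 2025)
```

The same person flips from ineligible to eligible purely through the parameter update, with no edit to `is_eligible`.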
+
+---
+
+## Pattern 6: Streamline Variable Access
+
+### ❌ Bad - Redundant steps
+
+```python
+def formula(spm_unit, period, parameters):
+    unit_size = spm_unit.nb_persons()  # ❌ Unnecessary
+    max_size = 10  # ❌ Hardcoded
+    capped_size = min_(unit_size, max_size)
+
+    p = parameters(period).gov.states.tn.dhs.tanf.benefit
+    spa = p.standard_payment_amount[capped_size]  # ❌ Unnecessary
+    dgpa = p.differential_grant_payment_amount[capped_size]  # ❌ Unnecessary
+
+    eligible = spm_unit("eligible_for_dgpa", period)
+    return where(eligible, dgpa, spa)
+```
+
+### ✅ Good - Streamlined
+
+```python
+def formula(spm_unit, period, parameters):
+    p = parameters(period).gov.states.tn.dhs.tanf.benefit
+    capped_size = min_(spm_unit.nb_persons(), p.max_unit_size)
+    eligible = spm_unit("eligible_for_dgpa", period)
+
+    return where(
+        eligible,
+        p.differential_grant_payment_amount[capped_size],
+        p.standard_payment_amount[capped_size]
+    )
+```
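The cap matters because the payment tables only have entries up to the maximum unit size, so a larger unit must reuse the last entry. Here is a scalar sketch of that lookup, where built-in `min` and a conditional stand in for `min_` and `where`. The table contents, `MAX_UNIT_SIZE`, and `benefit` are hypothetical.

```python
# Hypothetical payment tables keyed by unit size (made-up amounts).
MAX_UNIT_SIZE = 4
STANDARD_PAYMENT = {1: 100, 2: 150, 3: 190, 4: 220}
DIFFERENTIAL_PAYMENT = {1: 140, 2: 200, 3: 250, 4: 290}


def benefit(unit_size, eligible_for_dgpa):
    """Pick the payment table, then index it by the capped unit size."""
    capped_size = min(unit_size, MAX_UNIT_SIZE)
    table = DIFFERENTIAL_PAYMENT if eligible_for_dgpa else STANDARD_PAYMENT
    return table[capped_size]


large_unit = benefit(7, False)  # size 7 is capped to 4
small_unit = benefit(2, True)
```

Without the cap, `benefit(7, False)` would fail with a `KeyError` in this sketch.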
+
+---
+
+## When to Keep Intermediate Variables
+
+### ✅ Keep when value is used multiple times
+
+```python
+def formula(tax_unit, period, parameters):
+    p = parameters(period).gov.irs.credits
+    filing_status = tax_unit("filing_status", period)
+
+    # ✅ Used multiple times - keep as variable
+    threshold = p.phase_out.start[filing_status]
+
+    income = tax_unit("adjusted_gross_income", period)
+    excess = max_(0, income - threshold)
+    reduction = (excess / p.phase_out.width) * threshold
+
+    return max_(0, threshold - reduction)
+```
+
+### ✅ Keep when calculation is complex
+
+```python
+def formula(spm_unit, period, parameters):
+    p = parameters(period).gov.program
+    gross_earned = spm_unit("gross_earned_income", period)
+
+    # ✅ Complex multi-step calculation - break it down
+    work_expense_deduction = min_(gross_earned * p.work_expense_rate, p.work_expense_max)
+    after_work_expense = gross_earned - work_expense_deduction
+
+    earned_disregard = after_work_expense * p.earned_disregard_rate
+    countable_earned = after_work_expense - earned_disregard
+
+    dependent_care = spm_unit("dependent_care_expenses", period)
+
+    return max_(0, countable_earned - dependent_care)
+```
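Named steps also make a calculation like the one above easy to check by hand. The following worked example runs the same steps on scalars, with built-in `min` and `max` in place of `min_` and `max_`. All parameter values and inputs are made up.

```python
# Made-up parameter values and inputs, for illustration only.
work_expense_rate = 0.2
work_expense_max = 90
earned_disregard_rate = 0.5
gross_earned = 1_000
dependent_care = 100

# Work expense deduction: 20% of earnings (200), capped at 90.
work_expense_deduction = min(gross_earned * work_expense_rate, work_expense_max)
after_work_expense = gross_earned - work_expense_deduction  # 1000 - 90

# Disregard half of what remains.
earned_disregard = after_work_expense * earned_disregard_rate
countable_earned = after_work_expense - earned_disregard

result = max(0, countable_earned - dependent_care)
```

Each intermediate name corresponds to one step that a reviewer can verify against the rule it implements.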
+
+---
+
+## Complete Example: Before vs After
+
+### ❌ Before - Multiple Issues
+
+```python
+def formula(person, period, parameters):
+    # Wrong period access
+    age = person("age", period)  # ❌ age/12
+    assets = person("assets", period)  # ❌ assets/12
+    annual_income = person("employment_income", period.this_year)
+    monthly_income = annual_income / 12  # ❌ Use MONTHS_IN_YEAR
+
+    # Hardcoded values
+    min_age = 18  # ❌
+    max_age = 64  # ❌
+    asset_limit = 10000  # ❌
+    income_limit = 2000  # ❌
+
+    # Unnecessary intermediate variables
+    age_check = (age >= min_age) & (age <= max_age)
+    asset_check = assets <= asset_limit
+    income_check = monthly_income <= income_limit
+    eligible = age_check & asset_check & income_check
+
+    return eligible
+```
+
+### ✅ After - Clean and Correct
+
+```python
+def formula(person, period, parameters):
+    p = parameters(period).gov.program.eligibility
+
+    # Correct period access
+    age = person("age", period.this_year)
+    assets = person("assets", period.this_year)
+    monthly_income = person("employment_income", period)
+
+    # Direct return with combined logic
+    return (
+        (age >= p.age_min) & (age <= p.age_max) &
+        (assets <= p.asset_limit) &
+        (monthly_income <= p.income_threshold)
+    )
+```
+
+---
+
+## Pattern 7: Minimal Comments
+
+### Code Should Be Self-Documenting
+
+**Variable names and structure should explain the code - not comments.**
+
+### ❌ Bad - Verbose explanatory comments
+
+```python
+def formula(spm_unit, period, parameters):
+    # Wisconsin disregards all earned income of dependent children (< 18)
+    # Calculate earned income for adults only
+    is_adult = spm_unit.members("age", period.this_year) >= 18  # Hard-coded!
+    adult_earned = spm_unit.sum(
+        spm_unit.members("tanf_gross_earned_income", period) * is_adult
+    )
+
+    # All unearned income is counted (including children's)
+    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
+
+    # NOTE: Wisconsin disregards many additional income sources that
+    # are not separately tracked in PolicyEngine (educational aid, etc.)
+    return max_(adult_earned + gross_unearned, 0)
+```
+
+### ✅ Good - Clean self-documenting code
+
+```python
+def formula(spm_unit, period, parameters):
+    p = parameters(period).gov.states.wi.dcf.tanf.income
+
+    is_adult = spm_unit.members("age", period.this_year) >= p.adult_age_threshold
+    adult_earned = spm_unit.sum(
+        spm_unit.members("tanf_gross_earned_income", period) * is_adult
+    )
+    gross_unearned = add(spm_unit, period, ["tanf_gross_unearned_income"])
+    child_support = add(spm_unit, period, ["child_support_received"])
+
+    return max_(adult_earned + gross_unearned - child_support, 0)
+```
+
+### Comment Rules
+
+1. **NO comments explaining what code does** - variable names should be clear
+2. **OK: Brief NOTE about PolicyEngine limitations** (one line):
+   ```python
+   # NOTE: Time limit cannot be tracked in PolicyEngine
+   ```
+3. **NO multi-line explanations** of what the code calculates
+
+---
+
+## Quick Checklist
+
+Before finalizing code:
+- [ ] No hardcoded numbers (use parameters or constants like MONTHS_IN_YEAR)
+- [ ] Correct period access:
+  - Income/flows use `period`
+  - Age/assets/counts/booleans use `period.this_year`
+- [ ] No single-use intermediate variables
+- [ ] Direct parameter access (`p.amount` not `amount = p.amount`)
+- [ ] Direct returns when possible
+- [ ] Combined boolean logic when possible
+- [ ] Minimal comments (code should be self-documenting)
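The first checklist item can be partly automated. Below is a minimal sketch using Python's standard `ast` module to list numeric literals in a formula. `find_hardcoded_numbers` and `ALLOWED_LITERALS` are hypothetical names. Allowing `0` is an assumption based on the `max_(..., 0)` calls in the good examples above; a real project may want a different allow-list.

```python
import ast

ALLOWED_LITERALS = {0}  # Assumption: 0 is fine, as in max_(..., 0).


def find_hardcoded_numbers(source):
    """Return sorted (line, value) pairs for numeric literals not allowed."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)  # bool is a subclass of int
            and node.value not in ALLOWED_LITERALS
        ):
            hits.append((node.lineno, node.value))
    return sorted(hits)


SAMPLE = '''
def formula(spm_unit, period, parameters):
    size = spm_unit.nb_persons()
    capped_size = min_(size, 10)
    return max_(capped_size - 2, 0)
'''
hits = find_hardcoded_numbers(SAMPLE)
```

The result is a list of places to review, not a verdict: a flagged literal may still be legitimate.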
+
+---
+
+## Key Takeaways
+
+1. **Less is more** - Eliminate unnecessary variables
+2. **Direct is better** - Access parameters and return directly
+3. **Combine when logical** - Group related boolean conditions
+4. **Keep when needed** - Complex calculations and reused values deserve variables
+5. **Period matters** - Use correct period access to avoid auto-conversion bugs
+
+---
+
+## Related Skills
+
+- **policyengine-period-patterns-skill** - Deep dive on period handling
+- **policyengine-implementation-patterns-skill** - Variable structure and patterns
+- **policyengine-vectorization-skill** - NumPy operations and vectorization
+
+---
+
+## For Agents
+
+When writing or reviewing formulas:
+1. **Scan for single-use variables** - eliminate them
+2. **Check period access** - ensure correct for variable type
+3. **Look for hardcoded values** - parameterize them
+4. **Identify redundant steps** - streamline them
+5. **Consider readability** - keep complex calculations clear
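Step 1 can be approximated mechanically. The sketch below uses the standard `ast` module to list names that are assigned once and read once. `single_use_variables` is a hypothetical helper, and its output is only a list of candidates: run on the "bad" Pattern 1 formula, it also flags `countable` and `p`, which the guide's own good version keeps.

```python
import ast
from collections import Counter


def single_use_variables(source):
    """Names assigned exactly once and read exactly once in the source."""
    stores, loads = Counter(), Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stores[node.id] += 1
            elif isinstance(node.ctx, ast.Load):
                loads[node.id] += 1
    return sorted(n for n in stores if stores[n] == 1 and loads[n] == 1)


SAMPLE = '''
def formula(spm_unit, period, parameters):
    countable = spm_unit("tn_tanf_countable_resources", period)
    p = parameters(period).gov.states.tn.dhs.tanf.resource_limit
    resource_limit = p.amount
    return countable <= resource_limit
'''
candidates = single_use_variables(SAMPLE)
```

Deciding which candidates to inline is still a judgment call, per step 5.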