Files
2025-11-30 08:47:54 +08:00

8.5 KiB

name, description
name description
policyengine-aggregation PolicyEngine aggregation patterns - using adds attribute and add() function for summing variables across entities

PolicyEngine Aggregation Patterns

Essential patterns for summing variables across entities in PolicyEngine.

Quick Decision Guide

Is the variable ONLY a sum of other variables?
│
├─ YES → Use `adds` attribute (NO formula needed!)
│         adds = ["var1", "var2"]
│
└─ NO → Use `add()` function in formula
         (when you need max_, where, conditions, etc.)

Quick Reference

Need Use Example
Simple sum adds adds = ["var1", "var2"]
Sum from parameters adds adds = "gov.path.to.list"
Sum + max_() add() max_(0, add(...))
Sum + where() add() where(cond, add(...), 0)
Sum + conditions add() if cond: add(...)
Count booleans adds adds = ["is_eligible"]

1. adds Class Attribute (Preferred When Possible)

When to Use

Use adds when a variable is ONLY the sum of other variables with NO additional logic.

Syntax

class variable_name(Variable):
    value_type = float
    entity = Entity
    definition_period = PERIOD

    # Option 1: List of variables
    adds = ["variable1", "variable2", "variable3"]

    # Option 2: Parameter tree path
    adds = "gov.path.to.parameter.list"

Key Points

  • No formula() method needed
  • Automatically handles entity aggregation (person → household/tax_unit/spm_unit)
  • Clean and declarative

Example: Simple Income Sum

class tanf_gross_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    label = "TANF gross earned income"
    unit = USD
    definition_period = MONTH

    adds = ["employment_income", "self_employment_income"]
    # NO formula needed! Automatically:
    # 1. Gets each person's employment_income
    # 2. Gets each person's self_employment_income
    # 3. Sums all values across SPM unit members

Example: Using Parameter List

class income_tax_refundable_credits(Variable):
    value_type = float
    entity = TaxUnit
    definition_period = YEAR

    adds = "gov.irs.credits.refundable"
    # Parameter file contains list like:
    # - earned_income_tax_credit
    # - child_tax_credit
    # - additional_child_tax_credit

Example: Counting Boolean Values

class count_eligible_people(Variable):
    value_type = int
    entity = SPMUnit
    definition_period = YEAR

    adds = ["is_eligible_person"]
    # Automatically sums True (1) and False (0) across members

2. add() Function (When Logic Needed)

When to Use

Use add() inside a formula() when you need:

  • To apply max_(), where(), or conditions
  • To combine with other operations
  • To modify values before/after summing

Syntax

from policyengine_us.model_api import *

def formula(entity, period, parameters):
    result = add(entity, period, variable_list)

Parameters:

  • entity: The entity to operate on
  • period: The time period for calculation
  • variable_list: List of variable names or parameter path

Example: With max_() to Prevent Negatives

class adjusted_earned_income(Variable):
    value_type = float
    entity = SPMUnit
    definition_period = MONTH

    def formula(spm_unit, period, parameters):
        # Need max_() to clip negative values
        gross = add(spm_unit, period, ["employment_income", "self_employment_income"])
        return max_(0, gross)  # Prevent negative income

Example: With Additional Logic

class household_benefits(Variable):
    value_type = float
    entity = Household
    definition_period = YEAR

    def formula(household, period, parameters):
        # Sum existing benefits
        BENEFITS = ["snap", "tanf", "ssi", "social_security"]
        existing = add(household, period, BENEFITS)

        # Add new benefit conditionally
        new_benefit = household("special_benefit", period)
        p = parameters(period).gov.special_benefit

        if p.include_in_total:
            return existing + new_benefit
        return existing

Example: Building on Previous Variables

class total_deductions(Variable):
    value_type = float
    entity = TaxUnit
    definition_period = YEAR

    def formula(tax_unit, period, parameters):
        p = parameters(period).gov.irs.deductions

        # Get standard deductions using parameter list
        standard = add(tax_unit, period, p.standard_items)

        # Apply phase-out logic
        income = tax_unit("adjusted_gross_income", period)
        phase_out_rate = p.phase_out_rate
        phase_out_start = p.phase_out_start

        reduction = max_(0, (income - phase_out_start) * phase_out_rate)
        return max_(0, standard - reduction)

3. Common Anti-Patterns to Avoid

NEVER: Manual Summing

# WRONG - Never do this!
def formula(spm_unit, period, parameters):
    person = spm_unit.members
    employment = person("employment_income", period)
    self_emp = person("self_employment_income", period)
    return spm_unit.sum(employment + self_emp)  # ❌ BAD

CORRECT: Use adds

# RIGHT - Clean and simple
adds = ["employment_income", "self_employment_income"]  # ✅ GOOD

WRONG: Using add() When adds Suffices

# WRONG - Unnecessary complexity
def formula(spm_unit, period, parameters):
    return add(spm_unit, period, ["income1", "income2"])  # ❌ Overkill

CORRECT: Use adds

# RIGHT - Simpler
adds = ["income1", "income2"]  # ✅ GOOD

4. Entity Aggregation Explained

When using adds or add(), PolicyEngine automatically handles entity aggregation:

class household_total_income(Variable):
    entity = Household  # Higher-level entity
    definition_period = YEAR

    adds = ["employment_income", "self_employment_income"]
    # employment_income is defined for Person (lower-level)
    # PolicyEngine automatically:
    # 1. Gets employment_income for each person in household
    # 2. Gets self_employment_income for each person
    # 3. Sums all values to household level

This works across all entity hierarchies:

  • Person → Tax Unit
  • Person → SPM Unit
  • Person → Household
  • Tax Unit → Household
  • SPM Unit → Household

5. Parameter Lists

Parameters can define lists of variables to sum:

Parameter file (gov/irs/credits/refundable.yaml):

description: List of refundable tax credits
values:
  2024-01-01:
    - earned_income_tax_credit
    - child_tax_credit
    - additional_child_tax_credit

Usage in variable:

adds = "gov.irs.credits.refundable"
# Automatically sums all credits in the list

6. Decision Matrix

Scenario Solution Code
Sum 2-3 variables adds attribute adds = ["var1", "var2"]
Sum many variables Parameter list adds = "gov.path.list"
Sum + prevent negatives add() with max_() max_(0, add(...))
Sum + conditional add() with where() where(eligible, add(...), 0)
Sum + phase-out add() with calculation add(...) - reduction
Count people/entities adds with boolean adds = ["is_child"]

7. Key Principles

  1. Default to adds attribute when variable is only a sum
  2. Use add() function only when additional logic is needed
  3. Never manually sum with entity.sum(person(...) + person(...))
  4. Let PolicyEngine handle entity aggregation automatically
  5. Use parameter lists for maintainable, configurable sums

  • policyengine-period-patterns-skill: For period conversion when summing across different time periods
  • policyengine-core-skill: For understanding entity hierarchies and relationships

For Agents

When implementing or reviewing code:

  1. Check if adds can be used before writing a formula
  2. Prefer declarative over imperative when possible
  3. Follow existing patterns in the codebase
  4. Test entity aggregation carefully in YAML tests
  5. Document parameter lists clearly for adds references

Common Use Cases

Earned Income

adds = ["employment_income", "self_employment_income"]

Unearned Income

adds = ["interest_income", "dividend_income", "rental_income"]

Total Benefits

adds = ["snap", "tanf", "wic", "ssi", "social_security"]

Tax Credits

adds = "gov.irs.credits.refundable"

Counting Children

adds = ["is_child"]  # Returns count of children