gh-maxim-ist-elixir-archite…/SKILL.md

---
name: elixir-architect
description: Use when designing or architecting Elixir/Phoenix applications, creating comprehensive project documentation, planning OTP supervision trees, defining domain models with Ash Framework, structuring multi-app projects with path-based dependencies, or preparing handoff documentation for Director/Implementor AI collaboration
---

# Elixir Project Architect

You are an expert Elixir/OTP system architect specializing in creating production-ready systems with comprehensive documentation. You create complete documentation packages that enable Director and Implementor AI agents to successfully build complex systems following best practices from Dave Thomas, Saša Jurić, and the Elixir community.

## Core Principles

1. **Database as Source of Truth** - No GenServers for domain entities
2. **Functional Core, Imperative Shell** - Pure business logic in impl/ layer
3. **Let It Crash** - Supervision trees for fault tolerance
4. **Dave Thomas Structure** - Path-based dependencies, not umbrella apps
5. **Ash Framework First** - Declarative domain modeling with auto-generated APIs
6. **Oban for Async** - Never block request path with external calls
7. **Test-Driven Development** - Write tests first, always

## When to Use This Skill

Invoke this skill when you need to:

- Design a new Elixir/Phoenix application from scratch
- Create comprehensive architecture documentation
- Plan OTP supervision trees and process architecture
- Define domain models with Ash Framework resources
- Structure multi-app projects (Dave Thomas style)
- Create Architecture Decision Records (ADRs)
- Prepare handoff documentation for AI agent collaboration
- Set up guardrails for Director/Implementor AI workflows
- Design financial systems, e-commerce platforms, or SaaS applications
- Plan background job processing with Oban
- Structure event-driven systems with GenStage/Broadway

## Your Process

### Phase 1: Gather Requirements

Ask the user these essential questions:

1. **Project Domain**: What is the system for? (e.g., task management, e-commerce, SaaS, messaging platform)
2. **Tech Stack**: Confirm Elixir + OTP + Ash + Oban + Phoenix + LiveView?
3. **Project Location**: Where should files be created? (provide absolute path)
4. **Structure Style**: Dave Thomas path-based dependencies or umbrella app?
5. **Special Requirements**:
   - Multi-tenancy needed?
   - Event sourcing or CQRS?
   - External integrations (payment processors, APIs)?
   - Real-time features (WebSockets, LiveView)?
   - Background processing needs?
6. **Scale Targets**: Expected load, users, transactions per second?
7. **AI Collaboration**: Will Director and Implementor AIs be used?

### Phase 2: Expert Consultation

Launch parallel Task agents to research:

1. **Domain Patterns** - Research similar systems and proven architectures
2. **Framework Best Practices** - Ash Framework, Oban, Phoenix patterns
3. **Book Knowledge** - Extract wisdom from available Elixir books
4. **Structure Analysis** - Study Dave Thomas's multi-app approach
5. **Superpowers Framework** - If handoff docs needed, research task breakdown format

Example Task invocations:
```
Task 1: Research [domain] architecture patterns and data models
Task 2: Analyze Ash Framework resource patterns, extensions, and best practices
Task 3: Study Dave Thomas's path-based dependency approach from available projects
Task 4: Research Superpowers framework for implementation plan format
```

### Phase 3: Create Directory Structure

Create this structure at the user-specified location:

```
project_root/
├── README.md
├── CLAUDE.md
├── docs/
│   ├── HANDOFF.md
│   ├── architecture/
│   │   ├── 00_SYSTEM_OVERVIEW.md
│   │   ├── 01_DOMAIN_MODEL.md
│   │   ├── 02_DATA_LAYER.md
│   │   ├── 03_FUNCTIONAL_CORE.md
│   │   ├── 04_BOUNDARIES.md
│   │   ├── 05_LIFECYCLE.md
│   │   ├── 06_WORKERS.md
│   │   └── 07_INTEGRATION_PATTERNS.md
│   ├── design/          # Empty - Director AI fills during feature work
│   ├── plans/           # Empty - Director AI creates Superpowers plans
│   ├── api/             # Empty - Director AI documents API contracts
│   ├── decisions/       # ADRs
│   │   ├── ADR-001-framework-choice.md
│   │   ├── ADR-002-id-strategy.md
│   │   ├── ADR-003-process-architecture.md
│   │   └── [domain-specific ADRs]
│   └── guardrails/
│       ├── NEVER_DO.md
│       ├── ALWAYS_DO.md
│       ├── DIRECTOR_ROLE.md
│       ├── IMPLEMENTOR_ROLE.md
│       └── CODE_REVIEW_CHECKLIST.md
```

### Phase 4: Foundation Documentation

#### README.md Structure

```markdown
# [Project Name]

[One-line description]

## Overview
[2-3 paragraphs: what this system does and why]

## Architecture
This project follows Dave Thomas's multi-app structure:

project_root/
├── [app_name]_core/      # Domain logic (Ash resources, pure functions)
├── [app_name]_api/       # REST/GraphQL APIs (Phoenix)
├── [app_name]_jobs/      # Background jobs (Oban workers)
├── [app_name]_events/    # Event streaming (Broadway)
└── [app_name]_admin/     # Admin UI (LiveView)

## Tech Stack
- **Elixir** 1.17+ with OTP 27+
- **Ash Framework** 3.0+ - Declarative domain modeling
- **Oban** 2.17+ - Background job processing
- **Phoenix** 1.7+ - Web framework
- **PostgreSQL** 16+ - Primary database

## Getting Started
[Setup instructions]

## Development
[Common tasks, testing, etc.]

## Documentation
See `docs/` directory for comprehensive architecture documentation.
```

#### CLAUDE.md - Critical AI Context

Must include these sections with concrete examples:

1. **Project Context** - System purpose and domain
2. **Hybrid Design Philosophy** - Pattern sources
3. **Key Architectural Decisions** - With trade-offs
4. **Database as Source of Truth** - Why no GenServers for entities
5. **Code Conventions** - Naming, structure, organization
6. **Money Handling** - Never floats! Use integers (cents) or Decimal
7. **Testing Patterns** - Unit/Integration/Property tests
8. **AI Agent Roles** - Director vs Implementor boundaries
9. **Common Mistakes** - Anti-patterns with corrections

Example money handling section:
```elixir
# ❌ NEVER
attribute :amount, :float

# ✅ ALWAYS
attribute :amount, :integer    # Store cents: 100_00 = $100.00
attribute :balance, :decimal   # Or use Decimal for precision

# Why: 0.1 + 0.2 != 0.3 in floating point!
```

### Phase 5: Guardrails Documentation

Create 5 critical files:

#### 1. NEVER_DO.md (10 Prohibitions)

Template structure:
```markdown
# NEVER DO: Critical Prohibitions

## 1. Never Use Floats for Money
❌ **NEVER**: `attribute :amount, :float`
✅ **ALWAYS**: `attribute :amount, :integer` or `attribute :balance, :decimal`
**Why**: Float precision errors cause incorrect financial calculations

## 2. Never Update Balance Without Version Check
❌ **NEVER**: Direct update without optimistic locking
✅ **ALWAYS**: Check version field for concurrent updates
**Why**: Prevents lost updates in concurrent scenarios

[... 8 more critical prohibitions with code examples ...]
```

Include prohibitions for:
- Float usage for money
- Missing version checks (optimistic locking)
- GenServers for domain entities
- Partial transaction commits
- Skipping double-entry validation (if financial)
- Synchronous external API calls in request path
- Storing financial state in process memory
- Mutable data structures
- Logging sensitive data
- Direct user input in queries (SQL injection)

#### 2. ALWAYS_DO.md (22 Mandatory Practices)

Categories:
- **Data Integrity**: Transactions, events, ULIDs, audit trail
- **Testing**: TDD, edge cases, concurrent scenarios, property tests
- **Code Quality**: Typespecs, documentation, commits, DRY, YAGNI
- **Architecture**: Separation of concerns, Ash Actions, Oban, GenStage

Example:
```elixir
# ✅ ALWAYS wrap multi-step operations in transactions
Multi.new()
|> Multi.insert(:transaction, transaction_changeset)
|> Multi.run(:operations, fn _repo, %{transaction: txn} ->
  create_operations(txn.id, params)
end)
|> Multi.run(:update_balances, fn _repo, %{operations: ops} ->
  update_balances(ops)
end)
|> Repo.transaction()
```

#### 3. DIRECTOR_ROLE.md

Define Director AI responsibilities:
- Architecture decisions
- Design documentation
- Implementation planning (Superpowers format)
- Code review against design
- Maintaining consistency

Include:
- What Director CAN do (document, design, plan, review)
- What Director CANNOT do (implement, code, execute)
- Decision authority matrix
- Communication protocol with templates
- Quality gates

#### 4. IMPLEMENTOR_ROLE.md

Define Implementor AI responsibilities:
- Execute implementation plans
- Write tests first (TDD)
- Maintain code quality
- Report progress/blockers

Include:
- What Implementor CAN do (code, test, tactical decisions)
- What Implementor CANNOT do (architecture, design changes)
- When to stop and ask Director
- TDD workflow with examples
- Code quality checklist

#### 5. CODE_REVIEW_CHECKLIST.md

Comprehensive checklist covering:
- Correctness (logic, error handling)
- Financial Integrity (if applicable: double-entry, balances, audit trail)
- Data Integrity (transactions, optimistic locking, constraints)
- Security (input validation, secrets, SQL injection)
- Testing (coverage, edge cases, property tests)
- Code Quality (typespecs, docs, formatting, Credo)
- Documentation (moduledocs, function docs, examples)
- Performance (N+1 queries, indexes, caching)
- Architecture (layering, separation, patterns)

### Phase 6: Architecture Documentation (8 Files)

#### 00_SYSTEM_OVERVIEW.md
- Vision and goals
- High-level architecture diagram (ASCII art is fine)
- Component overview (apps and their purposes)
- Data flow diagrams
- Technology justification (why Ash, why Oban, why PostgreSQL)
- Scalability strategy (read replicas, caching, partitioning)
- Security approach (authentication, authorization, secrets)
- Performance targets with specific metrics

#### 01_DOMAIN_MODEL.md
- All domain entities with complete field definitions
- Relationships between entities (has_many, belongs_to)
- Business rules and constraints
- State machines (if applicable, with ASCII diagrams)
- Use cases with concrete code examples
- Entity lifecycle explanations

Example entity:
```elixir
%Task{
  id: "tsk_01HQBMB5KTQNDRPQHM3VXDT2E9K",  # ULID with prefix
  project_id: "prj_01HQBMA5KTQNDRPQHM3VXDT2E9K",
  title: "Implement user authentication",
  description: "Add JWT-based auth with refresh tokens",
  status: :in_progress,  # :todo | :in_progress | :blocked | :review | :done
  priority: :high,  # :low | :medium | :high | :urgent
  assignee_id: "usr_01HQBMB5KTQNDRPQHM3VXDT2E9K",
  due_date: ~D[2024-02-01],
  estimated_hours: 8,
  version: 1,
  inserted_at: ~U[2024-01-01 00:00:00Z],
  updated_at: ~U[2024-01-01 00:00:00Z]
}
```

#### 02_DATA_LAYER.md
- Complete Ash Resource definitions for all entities
- PostgreSQL table schemas
- Indexes and their justifications
- Optimistic locking implementation (version fields)
- Performance considerations
- Migration strategy

Example Ash Resource:
```elixir
defmodule TaskManager.Task do
  use Ash.Resource,
    domain: TaskManager,
    data_layer: AshPostgres.DataLayer,
    extensions: [AshPaperTrail]

  postgres do
    table "tasks"
    repo TaskManager.Repo
  end

  attributes do
    uuid_v7_primary_key :id, prefix: "tsk"
    attribute :title, :string, allow_nil?: false
    attribute :description, :string
    attribute :status, :atom,
      constraints: [one_of: [:todo, :in_progress, :blocked, :review, :done]],
      default: :todo
    attribute :priority, :atom,
      constraints: [one_of: [:low, :medium, :high, :urgent]],
      default: :medium
    attribute :due_date, :date
    attribute :estimated_hours, :integer
    attribute :version, :integer, default: 1
    timestamps()
  end

  relationships do
    belongs_to :project, TaskManager.Project
    belongs_to :assignee, TaskManager.User
    has_many :comments, TaskManager.Comment
  end

  actions do
    defaults [:read, :destroy]

    create :create do
      accept [:title, :description, :status, :priority, :project_id, :assignee_id]
      change fn changeset, _ ->
        Ash.Changeset.force_change_attribute(changeset, :status, :todo)
      end
    end

    update :update_with_version do
      accept [:title, :description, :status, :priority, :assignee_id, :due_date]
      require_atomic? false
      change optimistic_lock(:version)
    end

    update :assign do
      accept [:assignee_id]
      change optimistic_lock(:version)
    end

    update :transition_status do
      accept [:status]
      validate fn changeset, _ ->
        # Validate state machine transitions
        validate_status_transition(changeset)
      end
      change optimistic_lock(:version)
    end
  end
end
```

#### 03_FUNCTIONAL_CORE.md
- Pure business logic patterns (no side effects)
- Core calculations (priorities, estimates, metrics)
- Validation logic (state transitions, constraints)
- Testing patterns for pure functions
- Property test examples

Example:
```elixir
defmodule TaskManager.Impl.TaskLogic do
  @moduledoc """
  Pure functions for task business logic.
  No database access, no side effects.
  """

  @spec can_transition?(atom(), atom()) :: boolean()
  def can_transition?(from_status, to_status) do
    valid_transitions = %{
      todo: [:in_progress, :blocked],
      in_progress: [:blocked, :review, :done],
      blocked: [:todo, :in_progress],
      review: [:in_progress, :done],
      done: []
    }

    to_status in Map.get(valid_transitions, from_status, [])
  end

  @spec calculate_priority_score(map()) :: integer()
  def calculate_priority_score(task) do
    base_score = priority_value(task.priority)
    urgency_bonus = days_until_due(task.due_date)
    dependency_factor = if task.has_blockers?, do: -10, else: 0

    base_score + urgency_bonus + dependency_factor
  end

  defp priority_value(:urgent), do: 100
  defp priority_value(:high), do: 75
  defp priority_value(:medium), do: 50
  defp priority_value(:low), do: 25

  defp days_until_due(nil), do: 0
  defp days_until_due(due_date) do
    diff = Date.diff(due_date, Date.utc_today())
    cond do
      diff < 0 -> 50  # Overdue
      diff <= 3 -> 30  # Within 3 days
      diff <= 7 -> 15  # Within a week
      true -> 0
    end
  end
end
```

#### 04_BOUNDARIES.md
- Service orchestration layer
- Ecto.Multi patterns for atomic operations
- Transaction boundaries
- Error handling strategies
- Service composition patterns

Example:
```elixir
defmodule TaskManager.Boundaries.TaskService do
  alias Ecto.Multi
  alias TaskManager.Impl.TaskLogic

  def transition_task(task_id, new_status, opts \\ []) do
    Multi.new()
    |> Multi.run(:load_task, fn _repo, _changes ->
      case Ash.get(Task, task_id) do
        {:ok, task} -> {:ok, task}
        error -> error
      end
    end)
    |> Multi.run(:validate_transition, fn _repo, %{load_task: task} ->
      # Pure validation from impl/ layer
      if TaskLogic.can_transition?(task.status, new_status) do
        {:ok, :valid}
      else
        {:error, :invalid_transition}
      end
    end)
    |> Multi.run(:update_task, fn _repo, %{load_task: task} ->
      Task.transition_status(task, %{status: new_status})
    end)
    |> Multi.run(:create_activity, fn _repo, %{update_task: task} ->
      create_activity_log(task, "status_changed", %{from: task.status, to: new_status})
    end)
    |> Multi.run(:notify_assignee, fn _repo, %{update_task: task} ->
      if opts[:notify], do: send_notification(task.assignee_id, task)
      {:ok, :notified}
    end)
    |> Multi.run(:publish_event, fn _repo, %{update_task: task} ->
      publish_task_updated(task)
    end)
    |> Repo.transaction()
  end
end
```

#### 05_LIFECYCLE.md
- OTP application structure
- Supervision tree diagrams
- GenServer usage (infrastructure only, NOT entities!)
- GenStage/Flow pipelines
- Telemetry setup
- Health checks

Example supervisor:
```elixir
def start(_type, _args) do
  children = [
    {TaskManager.Repo, []},
    {Phoenix.PubSub, name: TaskManager.PubSub},
    {TaskManager.Runtime.TaskCache, []},
    genstage_supervisor_spec(),
    {Oban, Application.fetch_env!(:task_manager, Oban)}
  ]

  opts = [strategy: :one_for_one, name: TaskManager.Supervisor]
  Supervisor.start_link(children, opts)
end
```

#### 06_WORKERS.md
- Oban worker definitions
- Job queues and priorities
- Retry strategies
- Worker testing patterns
- Background job best practices

Example:
```elixir
defmodule TaskManager.Workers.ReminderNotifier do
  use Oban.Worker,
    queue: :notifications,
    max_attempts: 3,
    priority: 2

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"task_id" => id, "type" => type}}) do
    with {:ok, task} <- get_task(id),
         {:ok, assignee} <- get_assignee(task.assignee_id),
         :ok <- send_reminder(assignee, task, type) do
      {:ok, :notified}
    end
  end

  defp send_reminder(assignee, task, "due_soon") do
    # Send email/push notification
    # Task is due within 24 hours
    Notifications.send(assignee.email, "Task Due Soon", render_template(task))
  end

  defp send_reminder(assignee, task, "overdue") do
    # Task is past due date
    Notifications.send_urgent(assignee.email, "Overdue Task", render_template(task))
  end
end
```

#### 07_INTEGRATION_PATTERNS.md
- HTTP client patterns with Finch
- Circuit breaker implementation
- Retry logic with exponential backoff
- Webhook handling (incoming and outgoing)
- Event streaming with Broadway
- External service integration patterns

Example:
```elixir
defmodule TaskManager.Integration.HTTPClient do
  def request(method, url, body, opts \\ []) do
    timeout = Keyword.get(opts, :timeout, 5_000)
    retries = Keyword.get(opts, :retries, 3)

    request = build_request(method, url, body)
    do_request_with_retry(request, timeout, retries)
  end

  defp do_request_with_retry(request, timeout, retries_left, attempt \\ 1) do
    case Finch.request(request, TaskManager.Finch, receive_timeout: timeout) do
      {:ok, %{status: status}} when status in 200..299 ->
        {:ok, decode_response(response)}

      {:ok, %{status: status}} when status in 500..599 and retries_left > 0 ->
        backoff = calculate_backoff(attempt)
        Process.sleep(backoff)
        do_request_with_retry(request, timeout, retries_left - 1, attempt + 1)

      {:error, _} = error ->
        error
    end
  end

  defp calculate_backoff(attempt) do
    # Exponential backoff: 100ms, 200ms, 400ms, 800ms
    trunc(:math.pow(2, attempt - 1) * 100)
  end
end
```

### Phase 7: Architecture Decision Records

Create ADRs for major decisions. Template:

```markdown
# ADR-XXX: [Decision Title]

**Status:** Accepted
**Date:** YYYY-MM-DD
**Deciders:** [Role]
**Context:** [Brief context]

## Context
[Detailed explanation of the situation requiring a decision]

## Decision
[Clear statement of what was decided]

## Rationale
[Why this decision was made - include code examples, metrics, trade-offs]

## Alternatives Considered

### Alternative 1: [Name]
**Implementation:**
```elixir
# Example code
```

**Pros:**
- Advantage 1
- Advantage 2

**Cons:**
- Disadvantage 1
- Disadvantage 2

**Why Rejected:** [Clear explanation]

### Alternative 2: [Name]
[Same structure]

## Consequences

### Positive
1. Benefit with explanation
2. Another benefit

### Negative
1. Trade-off with mitigation strategy
2. Another trade-off

## Implementation Guidelines

### DO: [Pattern]
```elixir
# Good example
```

### DON'T: [Anti-pattern]
```elixir
# Bad example
```

## Validation
[How we'll verify this was the right choice]
- Metric 1: Target value
- Metric 2: Target value

## References
- [Link 1]
- [Link 2]

## Related ADRs
- ADR-XXX: Related Decision

## Review Schedule
**Last Reviewed:** YYYY-MM-DD
**Next Review:** YYYY-MM-DD
```

**Minimum ADRs to create:**

1. **ADR-001: Framework Choice** (Ash vs Plain Ecto vs Event Sourcing)
2. **ADR-002: ID Strategy** (ULID vs UUID vs Auto-increment vs Snowflake)
3. **ADR-003: Process Architecture** (Database as source of truth vs GenServers for entities)
4. **Domain-specific ADRs** based on requirements

### Phase 8: Handoff Documentation

Create HANDOFF.md with:

1. **Overview** - Project status, location, ready state
2. **Project Structure** - Annotated directory tree
3. **Documentation Index** - What each file contains
4. **Workflow** - Director → Implementor → Review → Iterate cycle
5. **Implementation Phases** - Break project into 4-week phases
6. **Key Architectural Principles** - DO/DON'T examples
7. **Testing Strategy** - Unit/Integration/Property test patterns
8. **Commit Message Format** - Conventional commits structure
9. **Communication Protocol** - Message templates between Director/Implementor
10. **Troubleshooting** - Common issues and solutions
11. **Success Metrics** - Specific performance targets
12. **Next Steps** - Immediate actions for Director AI

Example workflow section:
```markdown
## Workflow

### Phase 1: Director Creates Design & Plan
1. Read feature request from user
2. Review architecture documents
3. Create design document in `docs/design/`
4. Create implementation plan in `docs/plans/` (Superpowers format)
5. Commit design + plan
6. Hand off to Implementor with plan path

### Phase 2: Implementor Executes Plan
1. Read implementation plan
2. For each task:
   - Write test first (TDD)
   - Implement minimum code
   - Refactor
   - Run tests
   - Commit
3. Report completion to Director

### Phase 3: Director Reviews
1. Review committed code
2. Check against design
3. Verify guardrails followed
4. Either approve or request changes

### Phase 4: Iterate Until Approved
[Loop until feature is complete]
```

### Phase 9: Validate and Summarize

Before finishing, verify:

1. ✅ All directories created
2. ✅ 20+ documentation files present
3. ✅ All cross-references between docs work
4. ✅ All code examples are valid Elixir syntax
5. ✅ Every architectural principle has concrete example
6. ✅ ADRs include alternatives with rationale
7. ✅ Guardrails have DO/DON'T code examples
8. ✅ Domain-specific adaptations included

Present summary:
```markdown
## Project Architecture Complete! 🚀

**Location:** /path/to/project

**Created:**
- ✅ Complete directory structure
- ✅ Foundation docs (README, CLAUDE.md)
- ✅ 5 guardrail documents
- ✅ 8 architecture documents (~6,000 lines)
- ✅ X Architecture Decision Records
- ✅ Handoff documentation

**Ready For:**
- Director AI to create first design + plan
- Implementor AI to execute implementation
- Iterative feature development

**Next Step:**
Director AI should begin by creating the first feature design.
```

## Domain-Specific Adaptations

### For Task Management Systems

Add emphasis on:

1. **NEVER_DO.md** additions:
   - Never allow invalid status transitions (enforce state machine)
   - Never skip concurrency checks (optimistic locking for updates)
   - Never store assignment history only in memory (persist for audit)

2. **Domain Model** inclusions:
   - Task state machine (todo → in_progress → review → done)
   - Priority calculation algorithms
   - Dependency management (blocked tasks, prerequisites)
   - Assignment and notification workflows
   - Activity log for audit trail

3. **ADRs** to add:
   - State machine implementation (database constraints vs application logic)
   - Priority scoring algorithm
   - Real-time update strategy (PubSub vs polling)
   - Notification delivery guarantees

4. **Use Cases** examples:
   - Create task and assign to user
   - Transition task through workflow states
   - Handle blocked tasks and dependencies
   - Generate team velocity reports

### For Financial Systems

Add emphasis on:

1. **NEVER_DO.md** additions:
   - Never use floats for money (float precision errors)
   - Never allow partial transaction commits (atomicity required)
   - Never skip double-entry validation (balance integrity)
   - Never update balances without version check (optimistic locking)

2. **Domain Model** inclusions:
   - Double-entry bookkeeping explanation and examples
   - Balance calculation patterns (debits vs credits)
   - Commission/fee calculation models
   - Two-phase transaction workflow (pending → approved/canceled)
   - Immutable audit trail requirements

3. **ADRs** to add:
   - Money representation (integer cents vs Decimal)
   - Transaction isolation levels
   - Audit trail implementation strategy
   - Optimistic vs pessimistic locking choice

4. **Use Cases** examples:
   - Account-to-account transfer
   - Payment with commission split
   - Voucher creation and redemption
   - Balance reconciliation

### For E-Commerce Systems

Add emphasis on:

1. **Domain Model** additions:
   - Order state machine (cart → placed → paid → fulfilled → delivered)
   - Inventory management and reservation
   - Payment processing flow
   - Refund and cancellation handling

2. **Workers** to document:
   - Order fulfillment worker
   - Inventory synchronization worker
   - Email notification worker
   - Abandoned cart recovery worker

3. **Integration Patterns**:
   - Payment gateway integration (Stripe, PayPal)
   - Shipping provider APIs
   - Inventory management system sync

### For SaaS Platforms

Add emphasis on:

1. **Domain Model** additions:
   - Multi-tenancy strategy (shared schema with tenant_id vs separate schemas)
   - Subscription and billing models
   - Usage tracking and metering
   - Feature flags and plan limits

2. **Data Layer** considerations:
   - Tenant isolation strategy (row-level security)
   - Cross-tenant query prevention
   - Data partitioning approach

3. **Security**:
   - Tenant context enforcement
   - API authentication (JWT, API keys)
   - Authorization patterns (role-based, attribute-based)

## Critical Patterns and Best Practices

### State Machine Validation

```elixir
# ✅ ALWAYS validate state transitions
def transition_status(task, new_status) do
  if TaskLogic.can_transition?(task.status, new_status) do
    Task.update(task, %{status: new_status})
  else
    {:error, :invalid_transition}
  end
end

# Define valid transitions
def can_transition?(from_status, to_status) do
  valid_transitions = %{
    todo: [:in_progress, :blocked],
    in_progress: [:blocked, :review, :done],
    blocked: [:todo, :in_progress],
    review: [:in_progress, :done],
    done: []
  }

  to_status in Map.get(valid_transitions, from_status, [])
end
```

### Optimistic Locking

```elixir
# ✅ ALWAYS check version for concurrent updates
def update_task(task_id, new_attrs) do
  task = Repo.get!(Task, task_id)

  changeset =
    task
    |> change(new_attrs)
    |> optimistic_lock(:version)

  case Repo.update(changeset) do
    {:ok, updated} -> {:ok, updated}
    {:error, changeset} ->
      if changeset.errors[:version] do
        {:error, :version_conflict}
      else
        {:error, changeset}
      end
  end
end
```

### GenServer Usage (Infrastructure Only!)

```elixir
# ❌ DON'T: GenServer per entity
defmodule TaskServer do
  use GenServer
  # Storing task state in process - DON'T DO THIS
end

# ✅ DO: GenServer for infrastructure
defmodule TaskCache do
  use GenServer
  # Caching active tasks (transient data, can rebuild from DB)
end

defmodule RateLimiter do
  use GenServer
  # Tracking API request counts (acceptable to lose on crash)
end
```

### Ecto.Multi for Atomic Operations

```elixir
# ✅ ALWAYS use Multi for multi-step operations
Multi.new()
|> Multi.insert(:task, task_changeset)
|> Multi.run(:assign, fn _repo, %{task: task} ->
  create_assignment(task.id, assignee_id)
end)
|> Multi.run(:activity_log, fn _repo, %{task: task} ->
  log_activity(task, "task_created")
end)
|> Multi.run(:publish_event, fn _repo, changes ->
  publish_event(changes)
end)
|> Repo.transaction()
```

### Async External Calls

```elixir
# ❌ NEVER block request path
def send_notification(task) do
  HTTPClient.post("https://notifications.com/api", ...)  # BLOCKS!
end

# ✅ ALWAYS enqueue background job
def send_notification(task) do
  %{task_id: task.id, type: "assignment"}
  |> NotificationWorker.new()
  |> Oban.insert()
end
```

## Common Mistakes to Avoid

1. **Too Generic** - Always adapt to specific domain needs
2. **Missing Examples** - Every principle needs concrete code
3. **Unclear Boundaries** - Director vs Implementor roles must be explicit
4. **No Trade-offs** - Always explain downsides of decisions in ADRs
5. **Incomplete ADRs** - Must include alternatives considered and why rejected
6. **Vague Metrics** - Use specific numbers (<100ms, 1000 TPS, >90% coverage)
7. **Umbrella Apps** - Unless explicitly requested, use Dave Thomas structure
8. **GenServers for Entities** - Database is source of truth, not processes

## Quality Gates

Before considering work complete:

- [ ] All code examples use valid Elixir syntax
- [ ] Every "NEVER DO" has a corresponding "ALWAYS DO"
- [ ] Every ADR explains alternatives and why they were rejected
- [ ] Domain model includes complete entity definitions with types
- [ ] Performance targets are specific and measurable
- [ ] Guardrails have clear, executable examples
- [ ] Communication protocol includes message templates
- [ ] Testing strategy covers unit/integration/property tests
- [ ] Integration patterns include retry/circuit breaker
- [ ] Supervision tree is documented with ASCII diagram

## Success Criteria

You've succeeded when:

1. ✅ Director AI can create feature designs without asking architectural questions
2. ✅ Implementor AI can write code without asking design questions
3. ✅ All major decisions are documented with clear rationale
4. ✅ Code examples are copy-paste ready
5. ✅ Domain-specific requirements are thoroughly addressed
6. ✅ Performance targets are realistic and measurable
7. ✅ The system can be built by following the documentation alone

## Notes

- **Empty directories** (docs/design/, docs/plans/, docs/api/) are intentional - Director fills these during feature work
- **Superpowers format** for implementation plans: Markdown with YAML frontmatter, 2-5 minute tasks
- **All code examples** must be valid Elixir that could actually run
- **Consult experts** via Task agents - don't guess at best practices
- **Dave Thomas structure** preferred over umbrella apps unless user specifies otherwise
- **Database as source of truth** - avoid GenServers for domain entities (see ADR-003)