Initial commit

2025-11-30 08:25:58 +08:00
commit e13b6ff259
31 changed files with 3185 additions and 0 deletions
--- a/skills/filling-pdf-forms/references/AskUserQuestion-Rules.md
+++ b/skills/filling-pdf-forms/references/AskUserQuestion-Rules.md
@@ -0,0 +1,181 @@
+# AskUserQuestion Strategy for chatfield.cli
+
+**CRITICAL: Strict adherence required. No deviations permitted.**
+
+This document defines MANDATORY patterns for using `AskUserQuestion` with `chatfield.cli` interviews. It assumes you already know the AskUserQuestion tool signature.
+
+---
+
+## MANDATORY Pattern for EVERY Question
+
+**REQUIRED - EXACT structure:**
+
+```python
+AskUserQuestion(
+    questions=[{
+        "question": "<chatfield.cli's exact question>",  # No paraphrasing
+        "header": "<12 chars max>",
+        "multiSelect": <True/False>,  # Based on data model
+        "options": [
+            # POSITION 1: REQUIRED
+            {"label": "Skip", "description": "Skip (N/A, blank, negative, etc)"},
+            # POSITION 2: REQUIRED
+            {"label": "Delegate", "description": "Ask Claude to look up the needed information using all available resources"},
+            # POSITION 3: First option from chatfield.cli (if present)
+            {"label": "<First from chatfield.cli>", "description": "..."},
+            # POSITION 4: Second option from chatfield.cli (if present)
+            {"label": "<Second from chatfield.cli>", "description": "..."}
+        ]
+    }]
+)
+# POSITION 5 (implicit): "Other" - auto-added for free text
+```
+
+---
+
+## Determine multiSelect
+
+**Check `interview.py` Form Data Model (Chatfield builder API):**
+
+| Data Model | multiSelect |
+|------------|-------------|
+| `.as_multi()`, `.as_nullable_multi()`, or `.one_or_more()` | `True` |
+| `.as_one()` or `.as_nullable_one()` | `False` |
+| Plain `.field()` (no cardinality) | `False` |
+
+---
+
+## Parse chatfield.cli Options
+
+**If chatfield.cli output contains options, extract and prioritize:**
+
+**Recognize patterns:**
+- `"Status? (Single, Married, Divorced)"`
+- `"Choose: A, B, C, D"`
+- `"Preference: Red | Blue | Green"`
+
+Add the **first TWO** extracted options as positions 3-4; the remaining options stay reachable via "Other".
+
+**Example:**
+```
+chatfield.cli: "Status? (Single, Married, Divorced, Widowed)"
+Options:
+1. Skip
+2. Delegate
+3. Single    ← First from chatfield.cli
+4. Married   ← Second from chatfield.cli
+"Other": User can type "Divorced" or "Widowed"
+```
+
+---
+
+## Handle Responses
+
+| Selection | Action |
+|-----------|--------|
+| Types via "Other" | If starts with `'`: strip prefix and pass verbatim to chatfield.cli. Otherwise: judge if it's a direct answer or instruction to Claude. Direct answer → pass to chatfield.cli; Request for Claude → research/process, then respond to chatfield.cli |
+| "Skip" | Context-aware response: Yes/No questions → "No"; Optional/nullable fields → "N/A"; Other fields → "Skip" |
+| "Delegate" | Research & provide answer |
+| Option 3-4 | Pass the selection to chatfield.cli |
+| Multi-select | Join the selections with commas (e.g., "Email, Phone") and pass the result to chatfield.cli on the next iteration |
+
+## Distinguishing Direct Answers from Claude Requests
+
+**When user types via "Other", judge intent:**
+
+**Direct answers** (pass to chatfield.cli):
+- "Find new customers in new markets" ← answer to "What is your business strategy?"
+- "123 Main St, Boston MA" ← answer to "What is your address?"
+- "Python and TypeScript" ← answer to "What programming languages?"
+
+**Requests for Claude** (research first):
+- "look up my SSN" ← asking Claude to find something
+- "research the population" ← asking Claude to look something up
+- "what's today's date" ← asking Claude a question
+
+**Edge case:** `'` prefix forces verbatim pass-through regardless of content
+
+---
+
+## Delegation Pattern
+
+**When user selects "Delegate":**
+1. Parse question to understand needed info
+2. Treat this as if the user directly asked, "Help me find out ..."
+3. Use ALL tools available to you
+4. Pass the result to chatfield.cli as if user typed it
+5. If not found, ask user
+
+---
+
+## Quick Examples (RULES 1-7)
+
+**Note:** Skip handling is context-aware per "Handle Responses" table above.
+
+### RULE 1: Free Text
+```
+# chatfield.cli: "What is your name?"
+# multiSelect: False
+# Options: Skip, Delegate
+```
+
+### RULE 2: Yes/No
+```
+# chatfield.cli: "Are you employed?"
+# multiSelect: False
+# Options: Skip, Delegate, Yes, No
+```
+
+### RULE 3: Single-Select Choice
+```
+# chatfield.cli: "Status? (Single, Married, Divorced, Widowed)"
+# multiSelect: False
+# Extract: ["Single", "Married", "Divorced", "Widowed"]
+# Options: Skip, Delegate, Single, Married
+# Via Other: "Divorced", "Widowed"
+```
+
+### RULE 4: Multi-Select Choice
+```
+# chatfield.cli: "Contact? (Email, Phone, Text, Mail)"
+# Data model: .as_multi(...)
+# multiSelect: True
+# Extract: ["Email", "Phone", "Text", "Mail"]
+# Options: Skip, Delegate, Email, Phone
+# Via Other: "Text", "Mail"
+```
+
+### RULE 5: Numeric
+```
+# chatfield.cli: "How many dependents?"
+# multiSelect: False
+# Options: Skip, Delegate (optionally: "0", "1-2")
+# Via Other: Exact number
+```
+
+### RULE 6: Complex/Address
+```
+# chatfield.cli: "Mailing address?"
+# multiSelect: False
+# Options: Skip, Delegate
+# Via Other: Full address
+```
+
+### RULE 7: Date
+```
+# chatfield.cli: "Appointment date?"
+# multiSelect: False
+# Options: Skip, Delegate (optionally: "Today", "Tomorrow")
+# Via Other: Specific date
+```
+
+---
+
+## MANDATORY Checklist
+
+**EVERY question MUST:**
+- [ ] Be based on chatfield.cli's stdout message
+- [ ] Include "Skip" as option 1
+- [ ] Include "Delegate" as option 2
+- [ ] Check Form Data Model for multiSelect
+- [ ] Add first TWO chatfield.cli options as 3-4 (if present)
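The rules in AskUserQuestion-Rules.md above can be sketched as a small payload builder. This is a minimal sketch, not part of the skill: `parse_options` and `build_question` are hypothetical helper names, the regexes cover only the three message shapes listed under "Parse chatfield.cli Options", and each extracted option reuses its label as a placeholder description. Yes/No questions without an explicit option list (RULE 2) are not handled here.

```python
import re

# Positions 1 and 2 are fixed by the mandatory pattern.
SKIP = {"label": "Skip", "description": "Skip (N/A, blank, negative, etc)"}
DELEGATE = {
    "label": "Delegate",
    "description": "Ask Claude to look up the needed information using all available resources",
}


def parse_options(message: str) -> list[str]:
    """Extract choices from a chatfield.cli message (hypothetical helper)."""
    paren = re.search(r"\(([^()]*,[^()]*)\)\s*$", message)
    if paren:  # e.g. "Status? (Single, Married, Divorced)"
        raw = paren.group(1)
    elif ":" in message:  # e.g. "Choose: A, B, C, D" or "Preference: Red | Blue | Green"
        raw = message.split(":", 1)[1]
    else:
        return []
    parts = [p for p in re.split(r"\s*[|,]\s*", raw.strip()) if p]
    # A single fragment is ordinary text, not a list of choices.
    return parts if len(parts) >= 2 else []


def build_question(message: str, header: str, multi_select: bool) -> dict:
    """Build the AskUserQuestion arguments: Skip, Delegate, then the first two options."""
    if len(header) > 12:
        raise ValueError("header must be 12 characters or fewer")
    extra = [{"label": opt, "description": opt} for opt in parse_options(message)[:2]]
    return {
        "questions": [{
            "question": message,  # exact chatfield.cli text, no paraphrasing
            "header": header,
            "multiSelect": multi_select,  # decided from the Form Data Model
            "options": [SKIP, DELEGATE, *extra],
        }]
    }
```

For example, `build_question("Status? (Single, Married, Divorced, Widowed)", "Status", False)` yields the four options Skip, Delegate, Single, Married, leaving "Divorced" and "Widowed" to the implicit "Other" entry.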
--- a/skills/filling-pdf-forms/references/CLI-Interview-Loop.md
+++ b/skills/filling-pdf-forms/references/CLI-Interview-Loop.md
@@ -0,0 +1,74 @@
+# CLI Interview Loop
+
+**CRITICAL: Strict adherence required. No deviations permitted.**
+
+Run `chatfield.cli` iteratively, presenting its output messages via AskUserQuestion(), passing responses back, repeating until complete.
+
+**Files:**
+- State: `<basename>.chatfield/interview.db`
+- Interview: `<basename>.chatfield/interview.py` (or `interview_<lang>.py` if translated)
+
+## Workflow Overview
+
+```plantuml
+@startuml CLI-Interview-Loop
+title CLI Interview Loop
+start
+:Initialize chatfield.cli (no message);
+:chatfield.cli outputs first question;
+repeat
+  :Understand the chatfield.cli message;
+  :Consider the Form Data Model for multiSelect;
+  :Build AskUserQuestion;
+  :Present to user via AskUserQuestion();
+  :Call chatfield.cli with the result as a message;
+  :chatfield.cli outputs next question/response;
+repeat while (Complete?) is (no)
+->yes;
+:Run chatfield.cli --inspect;
+:Parse collected data;
+stop
+@enduml
+```
+
+## CLI Command Reference
+
+```bash
+# Initialize (NO user message)
+python -m chatfield.cli --state=<state> --interview=<interview>
+
+# Continue (WITH message)
+python -m chatfield.cli --state=<state> --interview=<interview> "user response"
+
+# Inspect (when complete, or any time to troubleshoot)
+python -m chatfield.cli --state=<state> --interview=<interview> --inspect
+```
+
+In all cases, chatfield.cli prints a message for the user to stdout.
+
+## Interview Loop Process
+
+**CRITICAL**: When building AskUserQuestion from chatfield.cli's message, you MUST strictly follow ./AskUserQuestion-Rules.md
+
+1. Initialize: `python -m chatfield.cli --state=<state> --interview=<interview>` (NO message)
+2. Read chatfield.cli's stdout message
+3. Recall or look up Form Data Model for multiSelect (`.as_multi()`, `.one_or_more()` → True)
+4. Build AskUserQuestion per mandatory rules: ./AskUserQuestion-Rules.md
+5. Present AskUserQuestion to user
+6. Handle response:
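+   - "Other" text → if it is a direct answer, pass it to chatfield.cli; if it is a request for Claude, research first, then pass the result (see ./AskUserQuestion-Rules.md)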
+   - "Skip" → Context-aware response: Yes/No questions → "No"; Optional/nullable fields → "N/A"; Other fields → "Skip"
+   - "Delegate" → research answer, pass to chatfield.cli
+   - Options 3-4 → pass selection to chatfield.cli
+   - Multi-select → join with commas, pass to chatfield.cli
+7. Call: `python -m chatfield.cli --state=<state> --interview=<interview> "user response"`
+8. Repeat steps 2-7 until completion signal
+9. Run: `python -m chatfield.cli --state=<state> --interview=<interview> --inspect`
+
+## Completion Signals
+
+Watch for:
+- "Thank you! I have all the information I need."
+- "complete" / "done"
+
+When chatfield.cli indicates that the conversation is complete, stop the loop and proceed to `--inspect`.
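The loop in CLI-Interview-Loop.md above can be sketched in Python as follows. This is a sketch under stated assumptions: `build_command`, `run_cli`, `is_complete`, and `interview_loop` are hypothetical names, `ask_user` stands in for building and presenting an AskUserQuestion, and the completion check is a crude substring match on the signals listed under "Completion Signals". Only the three command shapes from the CLI Command Reference are used.

```python
import subprocess
import sys

# Lower-cased completion signals from the "Completion Signals" section.
COMPLETION_MARKERS = ("thank you! i have all the information i need", "complete", "done")


def build_command(state, interview, message=None, inspect=False):
    """Return the argv list for one chatfield.cli invocation."""
    cmd = [sys.executable, "-m", "chatfield.cli", f"--state={state}", f"--interview={interview}"]
    if inspect:
        cmd.append("--inspect")
    elif message is not None:
        cmd.append(message)  # the user's response, passed as a single argument
    return cmd


def run_cli(cmd):
    """Run the command and return its stdout message."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout.strip()


def is_complete(output):
    """Crude heuristic: a real check should read the message in context."""
    text = output.lower()
    return any(marker in text for marker in COMPLETION_MARKERS)


def interview_loop(state, interview, ask_user, run=run_cli):
    """Initialize, relay answers until completion, then return the --inspect output."""
    output = run(build_command(state, interview))  # step 1: no message
    while not is_complete(output):
        answer = ask_user(output)  # steps 3-6: present the question, handle the response
        output = run(build_command(state, interview, message=answer))  # step 7
    return run(build_command(state, interview, inspect=True))  # step 9
```

Passing `run` as a parameter keeps the loop testable without chatfield installed; in real use the default `run_cli` shells out to the CLI.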
--- a/skills/filling-pdf-forms/references/Converting-PDF-To-Chatfield.md
+++ b/skills/filling-pdf-forms/references/Converting-PDF-To-Chatfield.md
@@ -0,0 +1,332 @@
+# Converting PDF Forms to Chatfield Interviews
+
+<purpose>
+This guide covers how to build a Chatfield interview definition from PDF form data. This is the core transformation step that converts a static PDF form into a conversational interview.
+</purpose>
+
+<important>
+**Read complete API reference**: See ./Data-Model-API.md for all builder methods, transformations, and validation rules.
+</important>
+
+## Process Overview
+
+```plantuml
+@startuml Converting-PDF-To-Chatfield
+title Converting PDF Forms to Chatfield Interviews
+start
+:Prerequisites: Form extraction complete;
+partition "Read Input Files" {
+  :Read <basename>.form.md;
+  :Read <basename>.form.json;
+}
+:Build Interview Definition;
+repeat
+  :Validate Form Data Model
+  (see validation checklist);
+  if (All checks pass?) then (yes)
+  else (no)
+    :Fix issues identified in validation;
+  endif
+repeat while (All checks pass?) is (no)
+->yes;
+:**✓ FORM DATA MODEL COMPLETE**;
+:interview.py ready for next step;
+stop
+@enduml
+```
+
+## The Form Data Model
+
+<definition>
+The **Form Data Model** is the `interview.py` file in the `.chatfield/` working directory. This file contains the chatfield builder definition that faithfully represents the PDF form.
+</definition>
+
+## Critical Principle: Faithfulness to Original PDF
+
+<critical_principle>
+**The Form Data Model must be as accurate and faithful as possible to the source PDF.**
+
+**Why?** Downstream code will NOT see the PDF anymore. The interview must create the "illusion" that the AI agent has full access to the form while it speaks to the user and records information, all from the Form Data Model alone.
+
+This means every field, every instruction, every validation rule from the PDF must be captured in the interview definition.
+</critical_principle>
+
+## Language Matching Rule
+
+**CRITICAL: Strings passed to the chatfield builder API must be in the PDF's primary language (for example, only English-language strings for English-language forms).**
+
+The chatfield object strings should virtually always match the PDF's primary language:
+- `.type()` - Use short identifier (e.g., "DHFS_FoodBusinessLicense"), not full official name. **HARD LIMIT: 64 characters maximum**
+- `.desc()` - Use form's language
+- `.trait()` - Use form's language for Background content
+- `.hint()` - Use form's language
+
+**Translation happens LATER** (see ./Translating.md), not during initial definition.
+
+## Key Rules
+
+These fundamental rules apply to all Form Data Models:
+
+1. **Faithfulness to PDF**: The interview definition must accurately represent the source PDF form
+2. **Short type identifiers**: Top-level `.type()` should be a short "class name" identifier (e.g., "W9_TIN", "DHFS_FoodBusinessLicense"), not the full official form name. **HARD LIMIT: 64 characters maximum**
+3. **Direct mapping default**: Use PDF field_ids directly from `.form.json` unless using fan-out patterns
+4. **Fan-out patterns**: Use `.as_*()` casts to populate multiple PDF fields from single collected value
+5. **Exact field_ids**: Keep field IDs from `.form.json` unchanged (use as cast names or direct field names)
+6. **Extract knowledge**: ALL form instructions go into Alice traits/hints
+7. **Format flexibility**: Never specify format in `.desc()` - Alice accepts variations
+8. **Validation vs transformation**: `.must()` for content constraints (use SPARINGLY), `.as_*()` for formatting (use LIBERALLY). Alice NEVER mentions format requirements to Bob
+9. **Language matching**: All strings (`.desc()`, `.trait()`, `.hint()`) must match the PDF's language
+
+## Reading Input Files
+
+Your inputs from form-extract:
+- **`<basename>.chatfield/<basename>.form.md`** - PDF content as Markdown (use this for form knowledge)
+- **`<basename>.chatfield/<basename>.form.json`** - Field IDs, types, and metadata
+
+## Extracting Form Knowledge
+
+From `.form.md`, extract ONLY actionable knowledge:
+- Form purpose (1-2 sentences)
+- Key term definitions
+- Field completion instructions
+- Valid options/codes
+- Decision logic ("If X then Y")
+
+**Do NOT extract:**
+- Decorative text
+- Repeated boilerplate
+- Page numbers, footers
+
+Place extracted knowledge in interview:
+- **Form-level** → Alice traits: `.trait("Background: [context]...")`
+- **Field-level** → Field hints: `.hint("Background: [guidance]")`
+
+## Builder API Patterns
+
+### Direct Mapping (Default)
+
+One PDF field_id → one question
+
+```python
+.field("topmostSubform[0].Page1[0].f1_01[0]")
+    .desc("What is your full legal name?")  # English .desc() for English form
+    .hint("Background: Should match official records")
+```
+
+### Fan-out Pattern
+
+Collect once, populate multiple PDF fields via `.as_*()` casts
+
+```python
+.field("age")
+    .desc("What is your age in years?")
+    .as_int("age_years", "Age as integer")
+    .as_bool("over_18", "True if 18 or older")
+    .as_str("age_display", "Age formatted for display")
+```
+
+**CRITICAL**: For fan-out, cast names MUST be exact PDF field_ids from `.form.json`
+
+#### Re-representation Sub-pattern
+
+When PDF has multiple fields for the same value in different formats (numeric vs words, date vs formatted date, etc.), collect ONCE and use casts:
+
+```python
+.field("amount")
+    .desc("What is the payment amount?")
+    .as_int("amount_numeric", "Amount as number")
+    .as_str("amount_in_words", "Amount spelled out in words (e.g., 'One hundred')")
+
+.field("event_date")
+    .desc("When did the event occur?")
+    .as_str("date_iso", "Date in ISO format (YYYY-MM-DD)")
+    .as_str("date_display", "Date formatted as 'January 15, 2025'")
+```
+
+**Key principle**: Eliminate duplicate questions about the same underlying information.
+
+### Discriminate + Split Pattern
+
+Mutually-exclusive fields
+
+```python
+.field("tin")
+    .desc("Is your taxpayer ID an EIN or SSN, and what is the number?")
+    .must("be exactly 9 digits")
+    .must("indicate SSN or EIN type")
+    .as_str("ssn_part1", "First 3 of SSN, or empty if N/A")
+    .as_str("ssn_part2", "Middle 2 of SSN, or empty if N/A")
+    .as_str("ssn_part3", "Last 4 of SSN, or empty if N/A")
+    .as_str("ein_full", "Full 9-digit EIN, or empty if N/A")
+```
+
+### Expand Pattern
+
+Multiple checkboxes from single field
+
+```python
+.field("preferences")
+    .desc("What are your communication preferences?")
+    .as_bool("email_ok", "True if wants email")
+    .as_bool("phone_ok", "True if wants phone calls")
+    .as_bool("mail_ok", "True if wants postal mail")
+```
+
+## `.must()` vs `.as_*()` Usage
+
+**`.must()`** - CONTENT constraints (use SPARINGLY):
+- Only when field MUST contain specific information
+- Creates hard blocking constraint
+- Example: `.must("match tax return exactly")`
+
+**`.as_*()`** - TYPE/FORMAT transformations (use LIBERALLY):
+- For any type casting, formatting, derived values
+- Alice accepts variations, computes transformation
+- Example: `.as_int()`, `.as_bool()`, `.as_str("name", "desc")`
+
+**Rule of thumb**: Expect MORE `.as_*()` calls than `.must()` calls.
+
+## Field Types
+
+- **Text** → `.field("id").desc("question")`
+- **Checkbox** → `.field("id").desc("question").as_bool()`
+- **Radio/choice (required)** → `.field("id").desc("question").as_one("opt1", "opt2")`
+- **Radio/choice (optional)** → `.field("id").desc("question").as_nullable_one("opt1", "opt2")`
+
+## Optional Fields
+
+```python
+.field("middle_name")
+    .desc("Middle name")
+    .hint("Background: Optional per form instructions")
+```
+
+## Hint Conventions
+
+All hints must have a prefix:
+
+- **"Background:"** - Internal notes for Alice only
+  - Alice uses these for formatting, conversions, context without mentioning to Bob
+  - Example: `.hint("Background: Convert to Buddhist calendar by adding 543 years")`
+- **"Tooltip:"** - May be shared with Bob if helpful
+  - Example: `.hint("Tooltip: Your employer provides this number")`
+
+**See ./Data-Model-API.md** for complete list of transformations (`.as_int()`, `.as_bool()`, etc.) and cardinality options (`.as_one()`, `.as_multi()`, etc.).
+
+## When to Use `.conclude()`
+
+Use `.conclude()` only when a derived field depends on multiple previous fields OR on complex logic that can't be expressed in a single field's casts.
+
+## Additional Guidance from PDF Forms
+
+**Extract Knowledge Wisely:**
+- Extract actionable knowledge ONLY from PDF
+- Form purpose (1-2 sentences max)
+- Key term definitions
+- Field completion instructions
+- Valid options/codes
+- Decision logic ("If X then Y")
+- **Do NOT extract**: Decorative text, repeated boilerplate, page numbers, footers
+
+**Alice Traits for Format Flexibility:**
+```python
+.alice()
+    .type("Form Assistant")
+    .trait("Collects information content naturally, handling all formatting invisibly")
+    .trait("Accepts format variations (SSN with/without hyphens)")
+    .trait("Background: [extracted form knowledge goes here]")
+```
+
+**Default to Direct Mapping:**
+PDF field_ids are internal - users only see `.desc()`. Use field IDs directly unless using fan-out patterns.
+
+**Format Flexibility:**
+Never specify format in `.desc()` - Alice accepts variations. Use `.as_*()` for formatting requirements.
+
+## Complete Example
+
+```python
+from chatfield import chatfield
+
+interview = (chatfield()
+    .type("W9_TIN")
+    .desc("Form to provide TIN to entities paying income")
+
+    .alice()
+        .type("Tax Form Assistant")
+        .trait("Collects information content naturally, handling all formatting invisibly")
+        .trait("Accepts format variations (SSN with/without hyphens)")
+        .trait("Background: W-9 used to provide TIN to entities paying income")
+        .trait("Background: EIN for business entities, SSN for individuals")
+
+    .bob()
+        .type("Taxpayer completing W-9 form")
+        .trait("Speaks naturally and freely")
+
+    .field("name")
+        .desc("What is your full legal name as shown on your tax return?")
+        .hint("Background: Must match IRS records exactly")
+
+    .field("business_name")
+        .desc("Business name or disregarded entity name, if different from above")
+        .hint("Background: Optional - only if applicable")
+
+    .field("tin")
+        .desc("What is your taxpayer identification number (SSN or EIN)?")
+        .must("be exactly 9 digits")
+        .must("indicate whether SSN or EIN")
+        .as_str("ssn_part1", "First 3 digits of SSN, or empty if using EIN")
+        .as_str("ssn_part2", "Middle 2 digits of SSN, or empty if using EIN")
+        .as_str("ssn_part3", "Last 4 digits of SSN, or empty if using EIN")
+        .as_str("ein_part1", "First 2 digits of EIN, or empty if using SSN")
+        .as_str("ein_part2", "Last 7 digits of EIN, or empty if using SSN")
+
+    .field("address")
+        .desc("What is your address (number, street, apt/suite)?")
+
+    .field("city_state_zip")
+        .desc("What is your city, state, and ZIP code?")
+        .as_str("city", "City name")
+        .as_str("state", "State abbreviation (2 letters)")
+        .as_str("zip", "ZIP code")
+
+    .build()
+)
+```
+
+## Validation Checklist
+
+Before proceeding, validate the interview definition:
+
+<validation_checklist>
+```
+Interview Validation Checklist:
+- [ ] All field_ids from .form.json are mapped
+- [ ] No field_ids duplicated or missing
+- [ ] Re-representations (amount/amount_in_words, date/date_formatted, etc.) use single field with casts, not duplicate questions
+- [ ] .desc() describes WHAT information is needed (content), never HOW it should be formatted
+- [ ] .hint() provides context about content (e.g., "Optional", "Must match passport"), never formatting instructions
+- [ ] All formatting requirements (dates, codes, number formats, etc.) use .as_*() transformations exclusively
+- [ ] Fan-out patterns use .as_*() with PDF field_ids as cast names
+- [ ] Split patterns use .as_*() with "or empty/0 if N/A" descriptions
+- [ ] Discriminate + split uses .as_*() for mutually-exclusive fields
+- [ ] Expand pattern uses .as_*() casts on single field
+- [ ] .conclude() used only when necessary (multi-field dependencies)
+- [ ] Alice traits include extracted form knowledge
+- [ ] Field hints provide context from PDF instructions
+- [ ] Optional fields explicitly marked with hint("Background: Optional...")
+- [ ] .must() used sparingly (only true content requirements)
+- [ ] Field .desc() questions are natural and user-friendly (no technical field_ids)
+- [ ] ALL STRINGS match the PDF's primary language
+```
+</validation_checklist>
+
+If any items fail:
+1. Review the specific issue
+2. Fix the interview definition
+3. Re-run validation checklist
+4. Proceed only when all items pass
+
+## The Result: Form Data Model
+
+When validation passes, you have successfully created the **Form Data Model** in `<basename>.chatfield/interview.py`.
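A few of the mechanical items from the validation checklist in Converting-PDF-To-Chatfield.md above can be checked in code. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the caller is assumed to have already collected the field_ids from `.form.json` and the field or cast names used in `interview.py` as plain lists of strings. The judgment-based items (faithfulness, natural phrasing, language matching) still need a manual review.

```python
from collections import Counter

HINT_PREFIXES = ("Background:", "Tooltip:")  # required hint prefixes
TYPE_MAX_LEN = 64  # hard limit on the top-level .type() identifier


def check_field_coverage(form_field_ids, mapped_ids):
    """Compare PDF field_ids with the names mapped in the interview definition."""
    counts = Counter(mapped_ids)
    expected = set(form_field_ids)
    mapped = set(counts)
    return {
        "missing": sorted(expected - mapped),  # in .form.json but never mapped
        "duplicated": sorted(fid for fid, n in counts.items() if n > 1),
        "unknown": sorted(mapped - expected),  # mapped but absent from .form.json
    }


def check_hint(hint):
    """True when the hint starts with one of the required prefixes."""
    return hint.startswith(HINT_PREFIXES)


def check_type(type_id):
    """True when the type identifier is non-empty and within the 64-character limit."""
    return 0 < len(type_id) <= TYPE_MAX_LEN
```

An empty `missing`, `duplicated`, and `unknown` report corresponds to the first two checklist items passing; `unknown` entries are expected for collected-once fields such as `tin` whose casts carry the real PDF field_ids.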
--- a/skills/filling-pdf-forms/references/Data-Model-API.md
+++ b/skills/filling-pdf-forms/references/Data-Model-API.md
@@ -0,0 +1,216 @@
+# Conversational Form API Reference
+
+**Library:** `chatfield` Python package
+
+API reference for building conversational form interviews. Powered by the Chatfield library.
+
+## Contents
+- Quick Start
+- Builder API
+  - Interview Configuration
+  - Roles
+  - Fields
+  - Validation
+  - Special Field Types
+  - Transformations
+  - Choice Cardinality
+- Field Access
+- Optional Fields
+
+---
+
+## Quick Start
+
+```python
+from chatfield import chatfield, Interviewer
+
+# Define
+interview = (chatfield()
+    .field("name")
+        .desc("What is your full name?")
+        .must("include first and last")
+    .field("age")
+        .desc("Your age?")
+        .as_int()
+        .must("be between 18 and 120")
+    .build())
+
+# Run
+interviewer = Interviewer(interview)
+user_input = None
+while not interview._done:
+    message = interviewer.go(user_input)
+    print(f"Assistant: {message}")
+    if not interview._done:
+        user_input = input("You: ").strip()
+
+# Access
+print(interview.name, interview.age.as_int)
+```
+
+---
+
+## Builder API
+
+### Interview Configuration
+
+```python
+interview = (chatfield()
+    .type("Job Application")            # Interview type
+    .desc("Collect applicant info")     # Description
+    .build())
+```
+
+### Roles
+
+```python
+.alice()                                # Configure AI assistant
+    .type("Tax Assistant")
+    .trait("Professional and accurate")
+    .trait("Never provides tax advice")
+
+.bob()                                  # Configure user
+    .type("Taxpayer")
+    .trait("Speaks colloquially")
+```
+
+### Fields
+
+```python
+.field("email")                         # Define field (becomes interview.email)
+    .desc("What is your email?")        # User-facing question
+```
+
+**All fields are mandatory to populate**: each must be non-`None` before `._done` becomes true, although the content may be an empty string `""`.
+Exception: `.as_one()`, `.as_multi()`, and fields with strict validation require non-empty values.
+
+### Validation
+
+```python
+.field("email")
+    .must("be valid email format")      # Requirement (AND logic)
+    .must("not be disposable")
+    .reject("profanity")                # Block pattern
+    .hint("Background: Company email preferred")    # Advisory (not enforced)
+```
+
+### Hints
+
+Hints provide context and guidance to Alice. **All hints must start with "Background:" or "Tooltip:"**
+
+```python
+# Background hints: Internal notes for Alice only (not mentioned to Bob)
+.hint("Background: Convert Gregorian to Buddhist calendar (+543 years)")
+.hint("Background: Optional per form instructions")
+
+# Tooltip hints: May be shared with Bob if helpful
+.hint("Tooltip: Your employer should provide this number")
+.hint("Tooltip: Ask your supervisor if unsure")
+```
+
+**Background hints** are for Alice's internal use - she handles formatting/conversions transparently without mentioning them to Bob.
+**Tooltip hints** may be shared with Bob to help clarify what information is needed.
+
+### Special Field Types
+
+```python
+.field("sentiment_score")
+    .confidential()                     # Track silently, never ask Bob
+
+.field("summary")
+    .conclude()                         # Compute after regular fields (auto-confidential)
+```
+
+### Transformations
+
+The LLM computes transformations during collection. Access them via `interview.<field>.as_*`.
+
+```python
+.field("age").as_int()                  # → interview.age.as_int = 25
+.field("price").as_float()              # → interview.price.as_float = 99.99
+.field("citizen").as_bool()             # → interview.citizen.as_bool = True
+.field("hobbies").as_list()             # → interview.hobbies.as_list = ["reading", "coding"]
+.field("config").as_json()              # → interview.config.as_json = {"theme": "dark"}
+.field("progress").as_percent()         # → interview.progress.as_percent = 0.75
+.field("greeting").as_lang("fr")        # → interview.greeting.as_lang_fr = "Bonjour"
+
+# Optional descriptions guide edge cases
+.field("has_partners")
+    .as_bool("true if you have partners; false if not or N/A")
+
+.field("quantity")
+    .as_int("parse as integer, ignore units")
+
+# Named string casts for formatting
+.field("ssn")
+    .must("be exactly 9 digits")
+    .as_str("formatted", "Format as ###-##-####")
+# Access: interview.ssn.as_str_formatted → "123-45-6789"
+```
+
+**Validation vs. Casts:**
+- **Validation** (`.must()`): Check content ("9 digits", "valid email")
+- **Casts** (`.as_*()`): Provide format (hyphens, capitalization)
+
+### Choice Cardinality
+
+Select from predefined options:
+
+```python
+.field("tax_class")
+    .as_one("Individual", "C Corp", "S Corp")       # Exactly one choice required
+
+.field("dietary")
+    .as_nullable_one("Vegetarian", "Vegan")         # Zero or one
+
+.field("languages")
+    .as_multi("Python", "JavaScript", "Go")         # One or more choices required
+
+.field("interests")
+    .as_nullable_multi("ML", "Web Dev", "DevOps")   # Zero or more
+```
+
+### Build
+
+```python
+.build()                                # Return Interview instance
+```
+
+---
+
+## Field Access
+
+**Dot notation** (regular fields):
+```python
+interview.name
+interview.age.as_int
+```
+
+**Bracket notation** (special characters):
+```python
+interview["topmostSubform[0].Page1[0].f1_01[0]"]    # PDF form fields
+interview["user.name"]                               # Dots
+interview["full name"]                               # Spaces
+interview["class"]                                   # Reserved words
+```
+
+---
+
+## Optional Fields
+
+Fields known to be optional (from PDF tooltip, nearby context, or instructions):
+
+```python
+.alice()
+    .trait("Records optional fields as empty string when user says blank/none/skip")
+
+.field("middle_name")
+    .desc("Middle name")
+    .hint("Background: Optional per form instructions")
+
+.field("extension")
+    .desc("Phone extension")
+    .hint("Background: Leave blank if none")
+```
+
+For optional **choices**, use `.as_nullable_one()` or `.as_nullable_multi()` (see examples above).
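The dot-versus-bracket distinction in the Field Access section of Data-Model-API.md above follows ordinary Python attribute rules, which can be sketched with the standard library. `access_expr` is a hypothetical helper that only produces the expression text; it does not import chatfield, and it does not account for names that might collide with attributes of the interview object itself.

```python
import json
import keyword


def access_expr(field_name: str) -> str:
    """Return the access expression for a field, choosing dot or bracket notation."""
    # Dot notation needs a valid Python identifier that is not a reserved word.
    if field_name.isidentifier() and not keyword.iskeyword(field_name):
        return f"interview.{field_name}"
    # Everything else (dots, spaces, brackets, reserved words) uses bracket notation.
    return f"interview[{json.dumps(field_name, ensure_ascii=False)}]"
```

Raw PDF field_ids such as `topmostSubform[0].Page1[0].f1_01[0]` always land in the bracket branch, which is why direct-mapped fields are read with `interview["..."]`.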
--- a/skills/filling-pdf-forms/references/Populating-Fillable.md
+++ b/skills/filling-pdf-forms/references/Populating-Fillable.md
@@ -0,0 +1,100 @@
+# Populating Fillable PDF Forms
+
+<purpose>
+After collecting data via Chatfield interview, populate fillable PDF forms with the results.
+</purpose>
+
+## Process Overview
+
+```plantuml
+@startuml Populating-Fillable
+title Populating Fillable PDF Forms
+start
+:Parse Chatfield output;
+:Read <basename>.form.json for metadata;
+:Create <basename>.values.json;
+repeat
+  :Validate .values.json
+  (see validation checklist);
+  if (All checks pass?) then (yes)
+  else (no)
+    :Fix .values.json;
+  endif
+repeat while (All checks pass?) is (no)
+->yes;
+:Execute fill_fillable_fields.py;
+:**✓ PDF POPULATION COMPLETE**;
+stop
+@enduml
+```
+
+## Process
+
+### 1. Parse Chatfield Output
+
+Run Chatfield with `--inspect` for a final summary of all collected data:
+```bash
+python -m chatfield.cli --state='<basename>.chatfield/interview.db' --interview='<basename>.chatfield/interview.py' --inspect
+```
+
+Extract `field_id` and the proper value for each field.
+
+### 2. Create `.values.json`
+
+Create `<basename>.values.json` in the `<basename>.chatfield/` directory with the collected field values:
+
+```json
+[
+  {"field_id": "name", "page": 1, "value": "John Doe"},
+  {"field_id": "age_years", "page": 1, "value": 25},
+  {"field_id": "age_display", "page": 1, "value": "25"},
+  {"field_id": "checkbox_over_18", "page": 1, "value": "/1"}
+]
+```
+
+**Value selection priority:**
+- **CRITICAL**: If a language cast exists for a field (e.g., `.as_lang_es`, `.as_lang_fr`), **always prefer it** over the raw value
+- This ensures forms are populated in the form's language, not the conversation language
+- The language cast name matches the form's language code (e.g., `as_lang_es` for Spanish forms)
+- Only use the raw value if no language cast exists
+
+**Boolean conversion for checkboxes:**
+- Read `.form.json` for `checked_value` and `unchecked_value`
+- Typically: `"/1"` or `"/On"` for checked, `"/Off"` for unchecked
+- Convert Python `True`/`False` → PDF checkbox values
+
+### 3. Validate `.values.json`
+
+**Before running the population script**, validate the `.values.json` file against the validation checklist below:
+- Verify all field_ids from `.form.json` are present
+- Check checkbox values match `checked_value`/`unchecked_value` from `.form.json`
+- Ensure numeric fields use numbers, not strings
+- Confirm language cast values are used when available
+
+If validation fails, fix the `.values.json` file and re-validate until all checks pass.
+
+### 4. Populate PDF
+
+Once validation passes, run the population script (note: the `scripts` directory is relative to the base directory of this skill):
+
+```bash
+python scripts/fill_fillable_fields.py <basename>.pdf <basename>.chatfield/<basename>.values.json <basename>.done.pdf
+```
+## Validation Checklist
+
+<validation_checklist>
+**Missing fields:**
+- Check that all field_ids from `.form.json` are in `.values.json`
+- Verify field_id spelling matches exactly
+
+**Wrong checkbox values:**
+- Check `checked_value`/`unchecked_value` in `.form.json`
+- Common values: `/1`, `/On`, `/Yes` for checked; `/Off`, `/No` for unchecked
+
+**Type errors:**
+- Ensure numeric fields use numbers, not strings: `25` not `"25"`
+- Ensure boolean checkboxes use proper values from `.form.json`
+
+**Language translation (for translated forms):**
+- Ensure language cast value is used when it exists (e.g., `as_lang_es` for Spanish forms)
+</validation_checklist>
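The checkbox conversion and the first two checklist groups in Populating-Fillable.md above can be sketched as follows. The layout of `.form.json` is an assumption here: a list of objects carrying `field_id` plus, for checkboxes, `checked_value` and `unchecked_value`. `checkbox_value` and `validate_values` are hypothetical helper names, and the entries of `.values.json` follow the list format shown in step 2.

```python
def checkbox_value(flag, field_meta):
    """Convert a Python bool to the PDF checkbox value declared in .form.json."""
    return field_meta["checked_value"] if flag else field_meta["unchecked_value"]


def validate_values(form_fields, values):
    """Return a list of problems found in the .values.json entries."""
    problems = []
    by_id = {entry["field_id"]: entry for entry in values}
    for meta in form_fields:
        fid = meta["field_id"]
        if fid not in by_id:
            problems.append(f"missing field_id: {fid}")
            continue
        if "checked_value" in meta:  # treated as a checkbox field
            allowed = (meta["checked_value"], meta.get("unchecked_value"))
            value = by_id[fid]["value"]
            if value not in allowed:
                problems.append(f"bad checkbox value for {fid}: {value!r}")
    return problems
```

An empty list means every field_id is present and every checkbox uses one of its declared values; a raw `True` left unconverted is reported so it can be fixed with `checkbox_value` before running the population script.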
--- a/skills/filling-pdf-forms/references/Populating-Nonfillable.md
+++ b/skills/filling-pdf-forms/references/Populating-Nonfillable.md
@@ -0,0 +1,121 @@
+# Populating Non-fillable PDF Forms
+
+<purpose>
+After collecting data via Chatfield interview, populate the non-fillable PDF with text annotations.
+</purpose>
+
+## Process Overview
+
+```plantuml
+@startuml Populating-Nonfillable
+title Populating Non-fillable PDF Forms
+start
+:Parse Chatfield output;
+:Create .values.json with field values;
+:Add annotations to PDF;
+:**✓ PDF POPULATION COMPLETE**;
+stop
+@enduml
+```
+
+## Process
+
+### 1. Parse Chatfield Output
+
+Run Chatfield with `--inspect` for a final summary of all collected data:
+```bash
+python -m chatfield.cli --state='<basename>.chatfield/interview.db' --interview='<basename>.chatfield/interview.py' --inspect
+```
+
+Extract `field_id` and value for each field from the interview results.
+
+### 2. Create `.values.json`
+
+Create `<basename>.chatfield/<basename>.values.json` with the collected field values in the format expected by the annotation script:
+
+```json
+{
+  "fields": [
+    {
+      "field_id": "full_name",
+      "page": 1,
+      "value": "John Doe"
+    },
+    {
+      "field_id": "is_over_18",
+      "page": 2,
+      "value": "X"
+    }
+  ]
+}
+```
+
+**Value selection priority:**
+- **CRITICAL**: If a language cast exists for a field (e.g., `.as_lang_es`, `.as_lang_fr`), **always prefer it** over the raw value
+- This ensures forms are populated in the form's language, not the conversation language
+- The language cast name matches the form's language code (e.g., `as_lang_es` for Spanish forms)
+- Only use the raw value if no language cast exists
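+
+The priority rule above can be sketched as a small helper. This is an illustration only: the record shape (a `value` key plus an optional `casts` mapping) and the function name `select_value` are assumptions, since the exact structure of the `--inspect` output is not shown here.
+
+```python
+def select_value(field, form_lang):
+    """Prefer the language cast for the form's language; fall back to the raw value."""
+    cast_name = f"as_lang_{form_lang}"
+    casts = field.get("casts", {})
+    if cast_name in casts:
+        return casts[cast_name]
+    return field["value"]
+
+
+# Hypothetical records collected in an English conversation for a Spanish form.
+occupation = {"value": "teacher", "casts": {"as_lang_es": "maestra"}}
+birth_year = {"value": "1990"}
+
+print(select_value(occupation, "es"))  # maestra
+print(select_value(birth_year, "es"))  # 1990
+```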
+
+**Boolean conversion for checkboxes:**
+- Read `.form.json` for `checked_value` and `unchecked_value`
+- Typically: `"X"` or `"✓"` for checked, `""` (empty string) for unchecked
+- Convert Python `True`/`False` → checkbox display values
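+
+A minimal sketch of that conversion, assuming each checkbox entry in `.form.json` can be read as a dictionary carrying `checked_value` and `unchecked_value`. The function name `checkbox_display` is hypothetical.
+
+```python
+def checkbox_display(answer, field_spec):
+    """Map a Python bool to the checkbox value declared for the field."""
+    if not isinstance(answer, bool):
+        raise TypeError(f"expected bool, got {type(answer).__name__}")
+    key = "checked_value" if answer else "unchecked_value"
+    return field_spec[key]
+
+
+# Hypothetical .form.json entry for one checkbox.
+spec = {"field_id": "is_over_18", "checked_value": "X", "unchecked_value": ""}
+
+print(repr(checkbox_display(True, spec)))   # 'X'
+print(repr(checkbox_display(False, spec)))  # ''
+```
+
+Reading the display values from the field's own entry, rather than hard-coding `"X"`, keeps the output consistent with whatever the extraction step recorded.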
+
+### 3. Add Annotations to PDF
+
+Run the annotation script to create the filled PDF:
+
+```bash
+python scripts/fill_nonfillable_fields.py <basename>.pdf <basename>.chatfield/<basename>.values.json <basename>.done.pdf
+```
+
+This script:
+- Reads the `.values.json` file with field values
+- Reads the `.form.json` file (from extraction) with bounding box information
+- Adds text annotations at the specified bounding boxes
+- Creates the output PDF with all annotations
+
+**Verification:**
+- Verify `<basename>.done.pdf` exists
+- Spot-check a few fields to ensure values are correctly placed
+
+**Result**: `<basename>.done.pdf`
+
+## Validation Checklist
+
+<validation_checklist>
+```
+Non-fillable Population Validation:
+- [ ] All field values extracted from CLI output
+- [ ] Language casts used when available (not raw values)
+- [ ] Boolean values converted to checkbox display values
+- [ ] .values.json created with correct format
+- [ ] fill_nonfillable_fields.py executed successfully
+- [ ] Output PDF exists at expected location
+- [ ] Spot-checked fields contain correct values
+- [ ] Text is visible and properly positioned
+```
+</validation_checklist>
+
+## Troubleshooting
+
+**Text not visible:**
+- Check font color in `.form.json` (should be dark, e.g., "000000" for black)
+- Verify bounding boxes are correct size
+- Ensure font size is appropriate for the bounding box
+
+**Text cut off:**
+- Bounding boxes may be too small
+- Review validation images from extraction phase
+- Consider adjusting bounding boxes and re-running extraction validation
+
+**Wrong language:**
+- Verify you're using language cast values (e.g., `as_lang_es`) not raw values
+- Check that language casts were properly requested in the Form Data Model
+
+---
+
+**See Also:**
+- ./Populating-Fillable.md - Population workflow for fillable PDFs
+- ../extracting-form-fields/references/Nonfillable-Forms.md - How bounding boxes were created
+- ./Converting-PDF-To-Chatfield.md - How the Form Data Model was built
--- a/skills/filling-pdf-forms/references/Translating.md
+++ b/skills/filling-pdf-forms/references/Translating.md
@@ -0,0 +1,218 @@
+# Translating Forms for Users
+
+<purpose>
+Use this guide when the PDF form is in a language different from the user's language. This enables cross-language form completion where the user speaks one language and the form is in another.
+</purpose>
+
+## Process Overview
+
+```plantuml
+@startuml Translating
+title Translating Forms for Users
+start
+:Prerequisites: Form Data Model created\n(form language already determined);
+partition "1. Copy Form Data Model" {
+  :Create language-specific .py file;
+}
+partition "2. Edit Language-Specific Version" {
+  :Edit interview_<lang>.py;
+  partition "3. Alice Translation Traits" {
+    :Add translation traits to Alice;
+  }
+  partition "4. Bob Language Traits" {
+    :Add language trait to Bob;
+  }
+  partition "5. Field Language Casts" {
+    :Add .as_lang("<lang>") to all text fields;
+  }
+}
+repeat
+  :Validate translation setup
+  (see validation checklist);
+  if (All checks pass?) then (yes)
+  else (no)
+    :Fix issues;
+  endif
+repeat while (All checks pass?) is (no)
+->yes;
+:**✓ TRANSLATION COMPLETE**;
+:Re-define Form Data Model as interview_<lang>.py;
+stop
+@enduml
+```
+
+## Critical Principle
+
+<critical_principle>
+The **Form Data Model** (`interview.py`) was already created with the form's language.
+
+**DO NOT recreate it.** Instead, ADAPT it for translation.
+
+The form definition stays in the form's language. Only Alice's behavior, Bob's profile, and the language casts on text fields are modified to enable translation.
+</critical_principle>
+
+## Process
+
+### 1. Copy Form Data Model
+
+Create a language-specific .py file named for the user's language. Use ISO 639-1 language codes: `en`, `es`, `fr`, `de`, `zh`, `ja`, etc.
+
+```bash
+# If user speaks Spanish
+cp input.chatfield/interview.py input.chatfield/interview_es.py
+```
+
+### 2. Edit Language-Specific Version
+
+Edit `interview_<lang>.py` to add translation traits.
+
+**What to change:**
+- ✅ Alice traits - Add translation instructions
+- ✅ Bob traits - Add language preference
+- ✅ Text fields - Add `.as_lang("<form-lang-code>")` for translation (e.g., "es" for Spanish)
+
+**What NOT to change:**
+- ❌ Form `.type()` or `.desc()` - Keep form's language
+- ❌ Field definitions - Keep all field IDs unchanged
+- ❌ Field `.desc()` - Keep form's language
+- ❌ Background hints - Keep form's language
+- ❌ Existing `.as_*()` cast names or descriptions
+
+### 3. Alice Translation Traits
+
+Add these traits to Alice:
+
+```python
+.alice()
+    # Keep existing .type()
+    .trait("Conducts this conversation in [USER_LANGUAGE]")
+    .trait("Translates [USER_LANGUAGE] responses into [FORM_LANGUAGE] for the form")
+    .trait("Explains [FORM_LANGUAGE] terms in [USER_LANGUAGE]")
+    # Keep all existing .trait() calls
+```
+
+### 4. Bob Language Traits
+
+Add these traits to Bob:
+
+```python
+.bob()
+    # Keep existing .type()
+    .trait("Speaks [USER_LANGUAGE] only")
+    # Keep all existing .trait() calls
+```
+
+### 5. Field Language Casts
+
+Add `.as_lang("<form-lang-code>")` to **all text fields** so that values are translated into the form's language. Use the form's ISO 639-1 language code (`es`, `fr`, `th`, `de`, etc.):
+
+```python
+.field("field_name")
+    .desc("...")
+    .as_lang("es")  # "es" for a Spanish form; use "fr" for French, "th" for Thai, etc.
+    # Keep all existing casts
+```
+
+## Complete Example
+
+**Original Form Data Model** (`interview.py`):
+
+```python
+from chatfield import chatfield
+
+interview = (chatfield()
+    .type("Solicitud de Visa")
+    .desc("Formulario de solicitud de visa de turista")
+
+    .alice()
+        .type("Asistente de Formularios")
+        .trait("Usa lenguaje claro y natural")
+        .trait("Acepta variaciones de formato")
+
+    .bob()
+        .type("Solicitante de visa")
+        .trait("Habla de forma natural y libre")
+
+    .field("nombre_completo")
+        .desc("¿Cuál es su nombre completo?")
+        .hint("Background: Debe coincidir con el pasaporte")
+
+    .field("fecha_nacimiento")
+        .desc("¿Cuál es su fecha de nacimiento?")
+        .as_str("dia", "Día (DD)")
+        .as_str("mes", "Mes (MM)")
+        .as_str("anio", "Año (YYYY)")
+
+    .build()
+)
+```
+
+**Translated Version** (`interview_en.py` for English-speaking user):
+
+```python
+from chatfield import chatfield
+
+interview = (chatfield()
+    .type("Solicitud de Visa")  # Unchanged - form's language
+    .desc("Formulario de solicitud de visa de turista")  # Unchanged
+
+    .alice()
+        .type("Asistente de Formularios")  # Unchanged
+        .trait("Conducts this conversation in English")  # ADDED
+        .trait("Translates English responses into Spanish for the form")  # ADDED
+        .trait("Explains Spanish terms in English")  # ADDED
+        .trait("Usa lenguaje claro y natural")  # Keep existing
+        .trait("Acepta variaciones de formato")  # Keep existing
+
+    .bob()
+        .type("Solicitante de visa")  # Unchanged
+        .trait("Speaks English only")  # ADDED
+        .trait("Habla de forma natural y libre")  # Keep existing
+
+    .field("nombre_completo")  # Unchanged
+        .desc("¿Cuál es su nombre completo?")  # Unchanged - form's language
+        .hint("Background: Debe coincidir con el pasaporte")  # Unchanged
+        .as_lang("es")  # ADDED - translate to Spanish
+
+    .field("fecha_nacimiento")  # Unchanged
+        .desc("¿Cuál es su fecha de nacimiento?")  # Unchanged
+        .as_str("dia", "Día (DD)")  # Unchanged
+        .as_str("mes", "Mes (MM)")  # Unchanged
+        .as_str("anio", "Año (YYYY)")  # Unchanged
+
+    .build()
+)
+```
+
+## Validation Checklist
+
+Before proceeding, verify ALL items:
+
+<validation_checklist>
+```
+Translation Validation Checklist:
+- [ ] Created interview_<lang>.py (copied from interview.py)
+- [ ] No changes to form .type() or .desc()
+- [ ] No changes to field definitions (field IDs)
+- [ ] No changes to field .desc() (keep form's language)
+- [ ] No changes to .as_*() cast names or descriptions
+- [ ] No changes to Background hints (keep form's language)
+- [ ] Added Alice trait: "Conducts this conversation in [USER_LANGUAGE]"
+- [ ] Added Alice trait: "Translates [USER_LANGUAGE] responses into [FORM_LANGUAGE]"
+- [ ] Added Alice trait: "Explains [FORM_LANGUAGE] terms in [USER_LANGUAGE]"
+- [ ] Added Bob trait: "Speaks [USER_LANGUAGE] only"
+- [ ] Added .as_lang("<form-lang-code>") to all text fields (e.g., "es" for Spanish)
+```
+</validation_checklist>
+
+If any items fail:
+1. Review the specific issue
+2. Fix the interview definition
+3. Re-run validation checklist
+4. Proceed only when all items pass
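+
+The "no changes to field IDs" items can be checked mechanically. The sketch below is a text-based comparison only: it does not import or execute either interview file, and the names `field_ids` and `fields_unchanged` are hypothetical helpers, not part of Chatfield.
+
+```python
+import re
+
+# Matches .field("some_id") or .field('some_id') and captures the ID.
+FIELD_RE = re.compile(r'\.field\(\s*["\']([^"\']+)["\']')
+
+
+def field_ids(source):
+    """Return field IDs in the order they are declared in the source text."""
+    return FIELD_RE.findall(source)
+
+
+def fields_unchanged(original_src, translated_src):
+    """True when both files declare the same field IDs in the same order."""
+    return field_ids(original_src) == field_ids(translated_src)
+
+
+original = '.field("nombre_completo")\n.field("fecha_nacimiento")'
+translated = '.field("nombre_completo")\n    .as_lang("es")\n.field("fecha_nacimiento")'
+
+print(fields_unchanged(original, translated))  # True
+```
+
+In practice the two strings would be the contents of `interview.py` and `interview_<lang>.py`. The remaining checklist items, such as trait wording, still need a manual review.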
+
+## Re-define Form Data Model
+
+**CRITICAL**: When translation setup is complete, the **Form Data Model** is now the language-specific version (`interview_<lang>.py`), NOT the base `interview.py`.
+
+Use this file for all subsequent steps (CLI execution, etc.).