Populating Non-fillable PDF Forms
After collecting data via a Chatfield interview, populate the non-fillable PDF with text annotations.
Process Overview
@startuml Populating-Nonfillable
title Populating Non-fillable PDF Forms
start
:Parse Chatfield output;
:Create .values.json with field values;
:Add annotations to PDF;
:**✓ PDF POPULATION COMPLETE**;
stop
@enduml
Process
1. Parse Chatfield Output
Run Chatfield with --inspect for a final summary of all collected data:
python -m chatfield.cli --state='<basename>.chatfield/interview.db' --interview='<basename>.chatfield/interview.py' --inspect
Extract field_id and value for each field from the interview results.
2. Create .values.json
Create <basename>.chatfield/<basename>.values.json with the collected field values in the format expected by the annotation script:
{
  "fields": [
    {
      "field_id": "full_name",
      "page": 1,
      "value": "John Doe"
    },
    {
      "field_id": "is_over_18",
      "page": 2,
      "value": "X"
    }
  ]
}
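Before running the annotation script, it can help to sanity-check the file's structure. The following is a minimal sketch, not part of the Chatfield tooling: the function name `check_values_doc` is hypothetical, the required keys are taken from the example above, and the rule that `page` is a positive integer is an assumption based on the example using pages 1 and 2.

```python
import json

# Keys that appear in every entry of the example above.
REQUIRED_KEYS = {"field_id", "page", "value"}


def check_values_doc(doc):
    """Return a list of problems found in a parsed .values.json document."""
    fields = doc.get("fields")
    if not isinstance(fields, list):
        return ["top-level 'fields' must be a list"]
    problems = []
    for i, entry in enumerate(fields):
        if not isinstance(entry, dict):
            problems.append(f"entry {i}: must be an object")
            continue
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {i}: missing {sorted(missing)}")
        # Assumption: page numbers are 1-based integers, as in the example.
        elif not isinstance(entry["page"], int) or entry["page"] < 1:
            problems.append(f"entry {i}: page must be a positive integer")
    return problems


example = json.loads(
    '{"fields": [{"field_id": "full_name", "page": 1, "value": "John Doe"}]}'
)
print(check_values_doc(example))  # an empty list means no problems were found
```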
Value selection priority:
- CRITICAL: If a language cast exists for a field (e.g., .as_lang_es, .as_lang_fr), always prefer it over the raw value
- This ensures forms are populated in the form's language, not the conversation language
- The language cast name matches the form's language code (e.g., as_lang_es for Spanish forms)
- Only use the raw value if no language cast exists
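The priority rule above can be sketched in a few lines. This sketch assumes each collected field has already been parsed into a plain dict holding the raw answer under `"value"` and any language casts under keys such as `"as_lang_es"`. That dict layout and the helper name `pick_value` are illustrative assumptions, not the actual Chatfield output format.

```python
def pick_value(field, lang_code):
    """Prefer the language cast for the form's language; else use the raw value.

    `field` is a hypothetical dict such as
    {"value": "Engineer", "as_lang_es": "Ingeniero"}.
    """
    cast_key = f"as_lang_{lang_code}"
    cast = field.get(cast_key)
    # Fall back to the raw value only when no cast exists for this language.
    return cast if cast is not None else field["value"]


occupation = {"value": "Engineer", "as_lang_es": "Ingeniero"}
print(pick_value(occupation, "es"))  # uses the Spanish cast
print(pick_value({"value": "Engineer"}, "es"))  # no cast, so the raw value
```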
Boolean conversion for checkboxes:
- Read .form.json for checked_value and unchecked_value
- Typically: "X" or "✓" for checked, "" (empty string) for unchecked
- Convert Python True/False → checkbox display values
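The conversion might look like the sketch below. It assumes the definition of a single checkbox field has been read from .form.json into a dict that carries `checked_value` and `unchecked_value`. The surrounding structure of .form.json is not shown here, so the dict layout and the helper name `checkbox_display` are assumptions. The defaults follow the typical values listed above.

```python
def checkbox_display(answer, field_def):
    """Map a Python bool to the display string for a checkbox field.

    `field_def` is a hypothetical dict for one field from .form.json.
    """
    checked = field_def.get("checked_value", "X")
    unchecked = field_def.get("unchecked_value", "")
    return checked if answer else unchecked


is_over_18_def = {"checked_value": "✓", "unchecked_value": ""}
print(repr(checkbox_display(True, is_over_18_def)))
print(repr(checkbox_display(False, is_over_18_def)))
```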
3. Add Annotations to PDF
Run the annotation script to create the filled PDF:
python scripts/fill_nonfillable_fields.py <basename>.pdf <basename>.chatfield/<basename>.values.json <basename>.done.pdf
This script:
- Reads the .values.json file with field values
- Reads the .form.json file (from extraction) with bounding box information
- Adds text annotations at the specified bounding boxes
- Creates the output PDF with all annotations
Verification:
- Verify <basename>.done.pdf exists
- Spot-check a few fields to ensure values are correctly placed
Result: <basename>.done.pdf
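The existence check under Verification can be automated with a small helper. This is a minimal sketch with a hypothetical function name. It only confirms that the file exists, is non-empty, and starts with the standard `%PDF-` header. It does not replace spot-checking field placement by eye.

```python
from pathlib import Path


def output_looks_ok(pdf_path):
    """Return True if the path is a non-empty file that starts with '%PDF-'."""
    path = Path(pdf_path)
    if not path.is_file() or path.stat().st_size == 0:
        return False
    with path.open("rb") as fh:
        # PDF files begin with the header bytes "%PDF-".
        return fh.read(5) == b"%PDF-"


# Usage: pass the real output path, for example the .done.pdf written in step 3.
# output_looks_ok("<basename>.done.pdf")
```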
Validation Checklist
<validation_checklist>
Non-fillable Population Validation:
- [ ] All field values extracted from CLI output
- [ ] Language casts used when available (not raw values)
- [ ] Boolean values converted to checkbox display values
- [ ] .values.json created with correct format
- [ ] fill_nonfillable_fields.py executed successfully
- [ ] Output PDF exists at expected location
- [ ] Spot-checked fields contain correct values
- [ ] Text is visible and properly positioned
</validation_checklist>
Troubleshooting
Text not visible:
- Check font color in .form.json (should be dark, e.g., "000000" for black)
- Verify bounding boxes are the correct size
- Ensure font size is appropriate for the bounding box
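To check quickly whether a configured font color is dark enough to read, a rough luma test on the hex string can help. This is only a sketch: the helper name `is_dark` and the 0.5 threshold are assumptions, and the luma formula is applied to gamma-encoded values as an approximation.

```python
def is_dark(hex_color, threshold=0.5):
    """Return True if a hex color such as "000000" is darker than the threshold."""
    hex_color = hex_color.lstrip("#")
    # Split "RRGGBB" into three channels scaled to the range 0..1.
    r, g, b = (int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4))
    # Rec. 709 luma weights, used here as a rough brightness estimate.
    luma = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return luma < threshold


print(is_dark("000000"))  # black
print(is_dark("FFFFFF"))  # white
```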
Text cut off:
- Bounding boxes may be too small
- Review validation images from extraction phase
- Consider adjusting bounding boxes and re-running extraction validation
Wrong language:
- Verify you're using language cast values (e.g., as_lang_es) not raw values
- Check that language casts were properly requested in the Form Data Model
See Also:
- ./Populating-Fillable.md - Population workflow for fillable PDFs
- ../extracting-form-fields/references/Nonfillable-Forms.md - How bounding boxes were created
- ./Converting-PDF-To-Chatfield.md - How the Form Data Model was built