| name | description |
|---|---|
| ingest-validate-wf | Validate Digdag workflow and configuration files against production quality gates |
Validate Ingestion Workflow
⚠️ CRITICAL: This validates against strict production quality gates
I'll validate your ingestion workflow for compliance with production standards and best practices.
What I'll Validate
Quality Gates (ALL MUST PASS)
1. Template Compliance
- ✅ Code matches documented templates 100%
- ✅ No unauthorized deviations from patterns
- ✅ All template sections present
- ✅ Exact formatting and structure
2. Logging Requirements
- ✅ Start logging before data processing
- ✅ Success logging after td_load
- ✅ Error logging in `_error` blocks
- ✅ Minimum 3 logging blocks per data source
- ✅ Correct SQL template usage
3. Error Handling
- ✅ `_error:` blocks present in all workflows
- ✅ Error logging with SQL present
- ✅ Proper error message capture
- ✅ Job ID and URL captured in errors
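The first error-handling check can be approximated with a plain text scan. The sketch below is only an illustration, not the validator's actual implementation: it treats the `.dig` file as text rather than parsing it, and the function name `find_error_blocks` and the file names in the sample workflow are hypothetical.

```python
import re

# Matches a line whose first non-blank token is the `_error:` key.
ERROR_KEY = re.compile(r"^\s*_error\s*:")

def find_error_blocks(dig_text: str) -> list[int]:
    """Return the 1-based line numbers where an `_error:` key appears."""
    return [
        number
        for number, line in enumerate(dig_text.splitlines(), start=1)
        if ERROR_KEY.match(line)
    ]

# Hypothetical workflow used only to exercise the check.
sample = """\
timezone: UTC

+load:
  td_load>: config/example_load.yml

_error:
  +log_error:
    td>: sql/log_error.sql
"""

print(find_error_blocks(sample))
```

An empty result means the gate fails for that file. A text scan like this cannot tell whether the block actually logs the job ID and URL, so those checks would still need a closer read of the block body.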
4. Timestamp Format
- ✅ Correct format for connector type:
  - Google BigQuery: SQL Server format (`CONVERT(varchar, ..., 121)`)
  - Klaviyo: `.000000` (6 decimals, NO Z)
  - OneTrust: `.000Z` (3 decimals, WITH Z)
  - Shopify v2: ISO 8601
- ✅ Matches `docs/patterns/timestamp-formats.md`
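The two suffix rules in the list above (Klaviyo and OneTrust) are simple enough to express as regular expressions. This is a rough sketch under stated assumptions: it checks only the fractional-second suffix, not the date and time portion, and the names `SUFFIX_RULES` and `has_expected_suffix` are hypothetical. The authoritative rules remain in `docs/patterns/timestamp-formats.md`.

```python
import re

# Suffix rules copied from the list above; other connectors are not covered.
SUFFIX_RULES = {
    "klaviyo": re.compile(r"\.\d{6}$"),    # 6 decimals, no trailing Z
    "onetrust": re.compile(r"\.\d{3}Z$"),  # 3 decimals, trailing Z
}

def has_expected_suffix(connector: str, timestamp: str) -> bool:
    """Check that a timestamp ends with the suffix expected for the connector."""
    rule = SUFFIX_RULES.get(connector)
    if rule is None:
        raise KeyError(f"no suffix rule defined for connector {connector!r}")
    return rule.search(timestamp) is not None

# Illustrative values only.
print(has_expected_suffix("klaviyo", "2024-01-01T00:00:00.000000"))
print(has_expected_suffix("onetrust", "2024-01-01T00:00:00.000000"))
```

Raising on an unknown connector is a deliberate choice here: silently passing a connector that has no rule would hide exactly the "default instead of connector-specific format" mistake this gate is meant to catch.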
5. Incremental Field Handling
- ✅ Correct field names (table vs. API)
- ✅ Dual field handling where needed (Klaviyo campaigns)
- ✅ Proper COALESCE fallback logic
- ✅ Matches `docs/patterns/incremental-patterns.md`
6. Workflow Structure
- ✅ Must match `docs/patterns/workflow-patterns.md`
- ✅ Proper timezone declaration (`timezone: UTC`)
- ✅ Correct `_export` includes
- ✅ Proper task naming conventions
- ✅ Correct file organization
- ✅ Parallel processing limits appropriate for source
7. Configuration Files
- ✅ YAML syntax validity
- ✅ Secret references (`${secret:name}`) used correctly
- ✅ No hardcoded credentials
- ✅ Required parameters present
- ✅ Database references correct
- ✅ Mode set appropriately (`append`, `replace`)
8. File Organization
- ✅ `.dig` files in `ingestion/` directory
- ✅ YAML configs in `ingestion/config/` subdirectory
- ✅ SQL files in `ingestion/sql/` subdirectory
- ✅ Proper file naming conventions
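The directory rules for this gate map naturally onto a lookup from file extension to expected directory. The sketch below assumes paths are given relative to the project root and that files sit directly in those directories with no further nesting; `EXPECTED_DIRS` and `misplaced_files` are hypothetical names, and naming conventions are not checked.

```python
from pathlib import PurePosixPath

# Expected parent directory for each file type, per the list above.
EXPECTED_DIRS = {
    ".dig": "ingestion",
    ".yml": "ingestion/config",
    ".yaml": "ingestion/config",
    ".sql": "ingestion/sql",
}

def misplaced_files(paths: list[str]) -> list[tuple[str, str]]:
    """Return (path, expected_directory) for every file in the wrong place."""
    problems = []
    for raw in paths:
        path = PurePosixPath(raw)
        expected = EXPECTED_DIRS.get(path.suffix)
        # Files with other extensions are outside the scope of this gate.
        if expected is not None and path.parent.as_posix() != expected:
            problems.append((raw, expected))
    return problems

print(misplaced_files(["config/klaviyo_profiles_load.yml"]))
```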
9. Security
- ✅ No hardcoded credentials in any file
- ✅ Proper `${secret:name}` syntax usage
- ✅ `credentials_ingestion.json` NOT in version control
- ✅ `.gitignore` includes credentials file
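A hardcoded-credential scan can be sketched as a line-by-line check that any sensitive-looking key carries a `${secret:name}` reference as its value. This is a heuristic illustration, not the validator's real logic: the list of sensitive key names is an assumption, the config is read as text rather than parsed as YAML, and `find_hardcoded_credentials` is a hypothetical name.

```python
import re

# Key names treated as sensitive; this list is an assumption for illustration.
SENSITIVE_KEY = re.compile(r"(api_key|password|secret|token)", re.IGNORECASE)
# A value counts as safe only if it is exactly one ${secret:name} reference.
SECRET_REF = re.compile(r"^\$\{secret:[^}]+\}$")

def find_hardcoded_credentials(config_text: str) -> list[tuple[int, str]]:
    """Return (line_number, key) for sensitive keys with literal values."""
    findings = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        key, separator, value = line.partition(":")
        key = key.strip()
        if not separator or key.startswith("#"):
            continue  # not a key/value line, or a comment
        value = value.strip().strip("'\"")
        if SENSITIVE_KEY.search(key) and value and not SECRET_REF.match(value):
            findings.append((number, key))
    return findings

# Hypothetical config fragment used only to exercise the check.
sample = """\
in:
  api_key: abc123
  password: ${secret:example_password}
"""

print(find_hardcoded_credentials(sample))
```

Reporting the line number alongside the key matches the `file:line` style used in the fail report, which makes each finding directly actionable.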
Validation Options
Option 1: Validate Specific Workflow
Provide:
- Workflow file path: e.g., `ingestion/klaviyo_ingest_inc.dig`
- Related config files: (or I'll find them automatically)
I will:
- Read the workflow file
- Find all related config files
- Check against ALL quality gates
- Report detailed findings with line numbers
Option 2: Validate Entire Source
Provide:
- Source name: e.g., `klaviyo`, `shopify_v2`, `google_bigquery`
I will:
- Find all workflows for the source
- Find all config files for the source
- Validate against source-specific documentation
- Check all quality gates
- Report comprehensive findings
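Finding the files for one source could look roughly like the sketch below. It assumes that workflow and config file names begin with the source name, as in the examples `ingestion/klaviyo_ingest_inc.dig` and `klaviyo_profiles_load.yml`; that naming rule and the function name `files_for_source` are assumptions, not documented behavior.

```python
from fnmatch import fnmatchcase

def files_for_source(source: str, paths: list[str]) -> tuple[list[str], list[str]]:
    """Split project paths into the workflows and configs for one source."""
    # Assumes file names are prefixed with "<source>_".
    workflows = [p for p in paths if fnmatchcase(p, f"ingestion/{source}_*.dig")]
    configs = [p for p in paths if fnmatchcase(p, f"ingestion/config/{source}_*.yml")]
    return workflows, configs

# Hypothetical project listing.
paths = [
    "ingestion/klaviyo_ingest_inc.dig",
    "ingestion/shopify_v2_ingest_inc.dig",
    "ingestion/config/klaviyo_profiles_load.yml",
    "ingestion/config/shopify_v2_orders_load.yml",
]

print(files_for_source("klaviyo", paths))
```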
Option 3: Validate All
Say: "validate all"
I will:
- Find all workflows in `ingestion/`
- Find all configs in `ingestion/config/`
- Validate each against its source documentation
- Check all quality gates
- Report full project compliance status
Validation Process
Step 1: Read Documentation
I will read relevant documentation to verify compliance:
- Source-specific docs: `docs/sources/{source-name}.md`
- Pattern docs: `docs/patterns/*.md`
Step 2: Load Files
I will read all specified workflow and config files.
Step 3: Check Quality Gates
I will verify each file against ALL quality gates listed above.
Step 4: Report Findings
Pass Report (if all gates pass)
✅ VALIDATION PASSED
Workflow: ingestion/{source}_ingest_inc.dig
Source: {source}
Quality Gates: 9/9 PASSED
✅ Template Compliance
✅ Logging Requirements
✅ Error Handling
✅ Timestamp Format
✅ Incremental Fields
✅ Workflow Structure
✅ Configuration Files
✅ File Organization
✅ Security
No issues found. Workflow is production-ready.
Fail Report (if any gate fails)
❌ VALIDATION FAILED
Workflow: ingestion/{source}_ingest_inc.dig
Source: {source}
Quality Gates: 6/9 PASSED
✅ Template Compliance
✅ Logging Requirements
❌ Error Handling - FAILED
- Missing _error block in main workflow
- Error logging SQL not found
✅ Timestamp Format
❌ Incremental Fields - FAILED
- Using wrong field name: 'updated_at' should be 'updated' for API
- Line 45: incremental_field parameter incorrect
✅ Workflow Structure
✅ Configuration Files
✅ File Organization
❌ Security - FAILED
- Hardcoded API key found in config/klaviyo_profiles_load.yml:12
- Should use ${secret:klaviyo_api_key}
RECOMMENDATIONS:
1. Add _error block to main workflow (see docs/patterns/workflow-patterns.md)
2. Fix incremental field name (see docs/sources/klaviyo.md)
3. Replace hardcoded credential with secret reference
Re-validate after fixing issues.
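Both report shapes follow the same layout, so the summary header and per-gate lines can be produced from a mapping of failed gates to their issues. The sketch below is one possible way to do that, with the hypothetical names `GATES` and `summarize`; it omits the workflow, source, and recommendation lines for brevity.

```python
# The nine gates, in the order used by the reports above.
GATES = [
    "Template Compliance",
    "Logging Requirements",
    "Error Handling",
    "Timestamp Format",
    "Incremental Fields",
    "Workflow Structure",
    "Configuration Files",
    "File Organization",
    "Security",
]

def summarize(failures: dict[str, list[str]]) -> str:
    """Render the status header and one line per gate, with issues indented below failures."""
    failed = [gate for gate in GATES if gate in failures]
    passed = len(GATES) - len(failed)
    lines = [
        "❌ VALIDATION FAILED" if failed else "✅ VALIDATION PASSED",
        f"Quality Gates: {passed}/{len(GATES)} PASSED",
    ]
    for gate in GATES:
        if gate in failures:
            lines.append(f"❌ {gate} - FAILED")
            lines.extend(f"- {issue}" for issue in failures[gate])
        else:
            lines.append(f"✅ {gate}")
    return "\n".join(lines)

print(summarize({"Security": ["Hardcoded API key found"]}))
```

Deriving the pass count from the failure mapping keeps the `N/9` header consistent with the per-gate lines, so the two cannot drift apart.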
Common Issues Detected
Template Violations
- Simplified or "optimized" templates
- Removed "redundant" sections
- Modified variable names
- Changed structure
Logging Violations
- Missing start/success/error logging
- Incorrect SQL template usage
- Missing job ID or URL capture
Timestamp Format Errors
- Wrong decimal count
- Missing or incorrect timezone marker
- Using default instead of connector-specific format
Incremental Field Errors
- Using table field name in API parameter
- Using API field name in SQL queries
- Missing COALESCE fallback
Security Issues
- Hardcoded credentials
- Incorrect secret syntax
- Credentials file in version control
Next Steps After Validation
If Validation Passes
✅ Workflow is production-ready
- Deploy with confidence
- Monitor ingestion_log for ongoing health
If Validation Fails
❌ Fix reported issues:
- Re-read relevant documentation
- Apply exact templates
- Fix specific line numbers mentioned
- Re-validate until all gates pass
DO NOT deploy failing workflows to production
Production Quality Assurance
This validation ensures:
- ✅ Code works the first time
- ✅ Consistent patterns across sources
- ✅ Complete error handling and logging
- ✅ Maintainable and documented code
- ✅ No security vulnerabilities
- ✅ Compliance with team standards
What would you like to validate?
Options:
- Validate specific workflow: Provide workflow file path
- Validate entire source: Provide source name
- Validate all: Say "validate all"