Initial commit
This commit is contained in:
35
commands/clean-database.md
Normal file
35
commands/clean-database.md
Normal file
@@ -0,0 +1,35 @@
|
||||
---
|
||||
name: clean-database
|
||||
description: Normalize, deduplicate, and validate enriched datasets to maintain accuracy and compliance.
|
||||
usage: /data-enrichment:clean-database --input enriched.csv --rules rules.yaml
|
||||
---
|
||||
|
||||
# Clean Database Command
|
||||
|
||||
## Purpose
|
||||
Run data quality workflows (formatting, deduplication, validation, suppression) before syncing enriched records into downstream systems.
|
||||
|
||||
## Syntax
|
||||
```bash
|
||||
/data-enrichment:clean-database \
|
||||
--input enriched.csv \
|
||||
--rules rules.yaml \
|
||||
--output clean.csv \
|
||||
--gdpr true
|
||||
```
|
||||
|
||||
### Parameters
|
||||
- `--input`: Source CSV/JSON/Parquet file.
|
||||
- `--rules`: YAML/JSON config defining normalization rules, required fields, dedupe logic.
|
||||
- `--output`: File path or system destination (Salesforce, HubSpot, Snowflake).
|
||||
- `--gdpr`: Apply regional compliance filters (default true).
|
||||
- `--suppress-list`: Path to opt-out or customer suppression list.
|
||||
- `--format`: Output format (csv, json, parquet, api-sync).
|
||||
|
||||
## Features
|
||||
- Email/phone format correction, country normalization, timezone calculation.
|
||||
- Deduping via fuzzy matching and configurable keys.
|
||||
- Confidence scoring and rejection report for records failing validation.
|
||||
- Audit log of transformations for compliance.
|
||||
|
||||
---
|
||||
Reference in New Issue
Block a user