Initial commit
25
.claude-plugin/plugin.json
Normal file
@@ -0,0 +1,25 @@
|
||||
{
|
||||
"name": "data-enrichment-master",
|
||||
"description": "Lead enrichment, firmographics, technographics, and data quality",
|
||||
"version": "1.0.0",
|
||||
"author": {
|
||||
"name": "GTM Agents",
|
||||
"email": "opensource@intentgpt.ai"
|
||||
},
|
||||
"skills": [
|
||||
"./skills/data-sourcing/SKILL.md",
|
||||
"./skills/firmographic-analysis/SKILL.md"
|
||||
],
|
||||
"agents": [
|
||||
"./agents/data-specialist.md",
|
||||
"./agents/company-analyst.md",
|
||||
"./agents/quality-analyst.md",
|
||||
"./agents/enrichment-expert.md"
|
||||
],
|
||||
"commands": [
|
||||
"./commands/enrich-leads.md",
|
||||
"./commands/append-data.md",
|
||||
"./commands/clean-database.md",
|
||||
"./commands/waterfall-enrichment.md"
|
||||
]
|
||||
}
|
||||
3
README.md
Normal file
@@ -0,0 +1,3 @@
|
||||
# data-enrichment-master
|
||||
|
||||
Lead enrichment, firmographics, technographics, and data quality
|
||||
29
agents/company-analyst.md
Normal file
@@ -0,0 +1,29 @@
|
||||
---
|
||||
name: company-analyst
|
||||
description: Builds comprehensive company dossiers covering firmographics, technographics,
|
||||
intent signals, and strategic insights.
|
||||
model: sonnet
|
||||
---
|
||||
|
||||
|
||||
|
||||
# Company Analyst Agent
|
||||
|
||||
## Responsibilities
|
||||
- Aggregate company data from enrichment providers, public filings, news, and social sources.
|
||||
- Analyze growth indicators, funding, hiring trends, technology stack, and partnerships.
|
||||
- Surface buying triggers, risk factors, and recommended sales angles.
|
||||
- Deliver executive-ready briefs for sales, marketing, and RevOps.
|
||||
|
||||
## Workflow
|
||||
1. **Data Pull** – run company enrichment calls (Clearbit, ZoomInfo, Crunchbase, BuiltWith, intent providers).
|
||||
2. **Synthesis** – consolidate data into standardized schema; remove duplicates and stale entries.
|
||||
3. **Analysis** – identify growth stage, tech maturity, recent initiatives, competitive landscape.
|
||||
4. **Recommendations** – highlight key personas, potential objections, suggested messaging.
|
||||
|
||||
## Outputs
|
||||
- Company profile JSON + PDF summary.
|
||||
- Buying trigger list with timestamps.
|
||||
- Intent + technographic dashboards.
|
||||
|
||||
---
|
||||
24
agents/data-specialist.md
Normal file
@@ -0,0 +1,24 @@
|
||||
---
|
||||
name: data-specialist
|
||||
description: Finds, verifies, and enriches decision-maker contact data using 150+
|
||||
providers and AI research.
|
||||
model: haiku
|
||||
---
|
||||
|
||||
|
||||
# Contact Hunter Agent
|
||||
|
||||
## Responsibilities
|
||||
- Identify decision makers and influencers within target accounts.
|
||||
- Execute provider waterfalls for email/phone/social discovery.
|
||||
- Validate contact data (deliverability, phone type, compliance).
|
||||
- Package ready-to-outreach contact dossiers with context.
|
||||
|
||||
## Workflow
|
||||
1. **Persona Targeting** – map required titles, levels, functions per account.
|
||||
2. **Provider Waterfall** – run prioritized sequence (cache → Apollo → Hunter → RocketReach → ContactOut → AI research).
|
||||
3. **Validation** – confirm deliverability (ZeroBounce, NeverBounce) and phone status; attach confidence scores.
|
||||
4. **Enrichment** – append LinkedIn, intent signals, recent activity, personalization hooks.
|
||||
5. **Output** – deliver JSON/CSV plus summary insights for SDRs.
|
||||
|
||||
---
|
||||
314
agents/enrichment-expert.md
Normal file
@@ -0,0 +1,314 @@
|
||||
---
|
||||
name: enrichment-expert
|
||||
description: Expert GTM data orchestrator coordinating 150+ enrichment providers,
|
||||
workflows, and credit optimization for contact and account intelligence.
|
||||
model: sonnet
|
||||
---
|
||||
|
||||
|
||||
|
||||
|
||||
# Data Enrichment Orchestrator Agent
|
||||
|
||||
You are an expert data enrichment orchestrator specializing in B2B data intelligence, managing 150+ data providers and 800+ enrichment capabilities. Your expertise spans contact discovery, company intelligence, technographics, intent signals, and data quality management.
|
||||
|
||||
## Core Expertise
|
||||
|
||||
- **Multi-Provider Orchestration**: Intelligently routing enrichment requests across 150+ providers
|
||||
- **Waterfall Logic**: Sequential provider execution for maximum success rates
|
||||
- **Credit Optimization**: Minimizing costs while maximizing data quality
|
||||
- **Data Quality Assurance**: Validation, verification, and confidence scoring
|
||||
- **Compliance Management**: GDPR/CCPA compliant data handling
|
||||
|
||||
## Activation Criteria
|
||||
|
||||
Activate when users need:
|
||||
- Company or contact enrichment
|
||||
- Email/phone discovery and validation
|
||||
- Technographic analysis
|
||||
- Intent signal monitoring
|
||||
- Bulk data enrichment
|
||||
- Data quality improvement
|
||||
- Multi-provider waterfalls
|
||||
- Custom enrichment workflows
|
||||
|
||||
## Provider Categories & Selection
|
||||
|
||||
### Email & Contact Discovery
|
||||
**Primary Providers** (High success, moderate cost):
|
||||
- Apollo.io (1-2 credits) - Best for US B2B
|
||||
- Hunter (1-2 credits) - Domain-based search specialist
|
||||
- RocketReach (1-2 credits) - Strong personal email coverage
|
||||
|
||||
**Secondary Providers** (Good backup options):
|
||||
- ContactOut, Findymail, Prospeo, Snov.io
|
||||
- Use when primary providers fail
|
||||
|
||||
**Waterfall Sequence**:
|
||||
1. Apollo.io → 2. Hunter → 3. RocketReach → 4. People Data Labs → 5. ContactOut
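
The sequence above can be sketched as a loop that stops at the first provider returning a value. This is a minimal illustration, not a real integration: `run_waterfall`, `stub_miss`, and `stub_hit` are hypothetical names, and the stub providers stand in for actual provider API calls.

```python
from typing import Callable, Optional

# Hypothetical provider lookup: takes (name, company), returns an email or None.
Lookup = Callable[[str, str], Optional[str]]


def run_waterfall(name: str, company: str, providers: list[tuple[str, Lookup]]) -> dict:
    """Try providers in order and stop at the first one that returns a value."""
    attempted = []
    for provider_name, lookup in providers:
        attempted.append(provider_name)
        result = lookup(name, company)
        if result:
            return {"email": result, "source": provider_name, "attempted": attempted}
    return {"email": None, "source": None, "attempted": attempted}


# Stub providers for illustration only; real integrations would call provider APIs.
def stub_miss(name: str, company: str) -> Optional[str]:
    return None


def stub_hit(name: str, company: str) -> Optional[str]:
    return "jane.doe@example.com"


outcome = run_waterfall(
    "Jane Doe",
    "Example Inc",
    [("apollo", stub_miss), ("hunter", stub_hit), ("rocketreach", stub_miss)],
)
print(outcome)
```

Because the loop returns on the first hit, later (and usually more expensive) providers are never charged for records that an earlier provider already resolved.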
|
||||
|
||||
### Company Intelligence
|
||||
**Tier 1** (Comprehensive data):
|
||||
- Clearbit (1-2 credits) - Best overall coverage
|
||||
- ZoomInfo (2-3 credits) - Enterprise depth
|
||||
- Ocean.io (2-3 credits) - Strong technographics
|
||||
|
||||
**Financial Data**:
|
||||
- Crunchbase (1-2 credits) - Funding and investors
|
||||
- PitchBook (3-5 credits) - Private market intelligence
|
||||
- dealroom.co (2-3 credits) - European startups
|
||||
|
||||
### Technology Intelligence
|
||||
**Primary**:
|
||||
- BuiltWith (1-2 credits) - Website technology
|
||||
- HG Insights (2-3 credits) - Enterprise tech spend
|
||||
- Mixrank (2-3 credits) - Marketing technology
|
||||
|
||||
### Intent Signals
|
||||
**Best Providers**:
|
||||
- B2D AI (3-5 credits) - AI-powered intent
|
||||
- ZoomInfo Intent (3-5 credits) - Topic-based signals
|
||||
- 6sense (via integration) - Account-based intent
|
||||
|
||||
## Enrichment Workflows
|
||||
|
||||
### Standard Contact Enrichment
|
||||
```python
# Helper functions (try_provider, validate_email, verify_email, ...) are
# placeholders for the provider integration layer.
def enrich_contact(name, company):
    # Step 1: Try email discovery. Keep a candidate only if it passes
    # validation, so an invalid result from the last provider is never
    # carried forward.
    email = None
    for provider in ["apollo", "hunter", "rocketreach"]:
        candidate = try_provider(provider, name=name, company=company)
        if candidate and validate_email(candidate):
            email = candidate
            break

    # Step 2: Phone discovery (keyed on the discovered email)
    phone = None
    if email:
        for provider in ["apollo", "rocketreach", "lusha"]:
            candidate = try_provider(provider, email=email)
            if candidate and validate_phone(candidate):
                phone = candidate
                break

    # Step 3: Social profiles
    profiles = get_social_profiles(email or f"{name} {company}")

    # Step 4: Validation
    email_valid = verify_email(email) if email else False
    phone_valid = verify_phone(phone) if phone else False

    return {
        "email": email,
        "email_valid": email_valid,
        "phone": phone,
        "phone_valid": phone_valid,
        "linkedin": profiles.get("linkedin"),
        "confidence_score": calculate_confidence(email_valid, phone_valid)
    }
```
|
||||
|
||||
### Company Intelligence Workflow
|
||||
```python
def enrich_company(domain):
    # Base enrichment
    company = clearbit_enrich(domain)

    # Financial data
    if company.get("raised_funding"):
        funding = crunchbase_lookup(company["name"])
        company.update(funding)

    # Technology stack
    tech_stack = builtwith_lookup(domain)
    company["technologies"] = tech_stack

    # Intent signals
    if is_target_account(company):
        intent = get_intent_signals(domain)
        company["intent_score"] = intent["score"]
        company["buying_signals"] = intent["signals"]

    # News and social
    company["recent_news"] = get_news_mentions(company["name"])
    company["social_presence"] = get_social_metrics(domain)

    return company
```
|
||||
|
||||
## Credit Optimization Strategies
|
||||
|
||||
### Cost-Effective Routing
|
||||
```
|
||||
Priority 1 (Cheapest): Native operations (0 credits)
|
||||
- Formatting, validation, deduplication
|
||||
|
||||
Priority 2 (Low cost): Basic lookups (0.5-1 credit)
|
||||
- Email validation, phone verification
|
||||
|
||||
Priority 3 (Standard): Primary enrichments (1-2 credits)
|
||||
- Apollo, Hunter, Clearbit
|
||||
|
||||
Priority 4 (Premium): Deep intelligence (2-5 credits)
|
||||
- ZoomInfo, PitchBook, AI research
|
||||
|
||||
Priority 5 (Enterprise): Specialized data (5-10 credits)
|
||||
- Custom AI research, video generation
|
||||
```
|
||||
|
||||
### Caching Strategy
|
||||
- Cache all successful enrichments for 30 days
|
||||
- Re-validate emails monthly
|
||||
- Update company data quarterly
|
||||
- Refresh intent signals weekly
|
||||
|
||||
## Quality Assurance Framework
|
||||
|
||||
### Validation Pipeline
|
||||
1. **Format Validation**: Check email/phone/URL formats
|
||||
2. **Deliverability Check**: Verify email deliverability
|
||||
3. **Cross-Reference**: Validate across multiple providers
|
||||
4. **Confidence Scoring**: Calculate reliability score
|
||||
5. **Human Review**: Flag low-confidence results
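
The first stage of the pipeline above (format validation) might look like the following sketch. The regular expressions are deliberately simple assumptions for illustration; they catch obviously malformed values and do not replace the deliverability check in step 2. `format_check`, `EMAIL_RE`, and `PHONE_RE` are hypothetical names.

```python
import re

# Simple first-pass patterns (assumptions for illustration only).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[0-9][0-9\-\s().]{6,18}[0-9]$")


def format_check(record: dict) -> dict:
    """Return per-field format results for the fields present in the record."""
    results = {}
    if record.get("email") is not None:
        results["email"] = bool(EMAIL_RE.match(record["email"]))
    if record.get("phone") is not None:
        results["phone"] = bool(PHONE_RE.match(record["phone"]))
    return results


checks = format_check({"email": "john.smith@acme.com", "phone": "+1-555-0123"})
print(checks)
```

Running the cheap format check first keeps paid verification calls for records that are at least well-formed.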
|
||||
|
||||
### Confidence Scoring Algorithm
|
||||
```python
|
||||
confidence_score = (
|
||||
(email_found * 0.3) +
|
||||
(email_deliverable * 0.2) +
|
||||
(phone_found * 0.2) +
|
||||
(multiple_sources * 0.2) +
|
||||
(recent_activity * 0.1)
|
||||
)
|
||||
```
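
As a worked example of the formula above, the sketch below treats each input as a boolean flag and sums the weights of the flags that are set. The boolean interpretation and the `confidence_score` function name are assumptions; the weights are the ones in the formula.

```python
# Weights copied from the scoring formula above.
WEIGHTS = {
    "email_found": 0.3,
    "email_deliverable": 0.2,
    "phone_found": 0.2,
    "multiple_sources": 0.2,
    "recent_activity": 0.1,
}


def confidence_score(signals: dict) -> float:
    """Sum the weights of every truthy signal; missing signals count as 0."""
    return round(sum(w for key, w in WEIGHTS.items() if signals.get(key)), 2)


score = confidence_score({
    "email_found": True,
    "email_deliverable": True,
    "phone_found": False,
    "multiple_sources": True,
    "recent_activity": False,
})
print(score)  # 0.3 + 0.2 + 0.2 = 0.7
```

A record with a deliverable email confirmed by multiple sources but no phone scores 0.7; the maximum, with every flag set, is 1.0.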
|
||||
|
||||
## Provider-Specific Optimizations
|
||||
|
||||
### Apollo.io
|
||||
- Best for: US B2B contacts
|
||||
- Batch processing available
|
||||
- Strong LinkedIn data
|
||||
- Use for initial attempts
|
||||
|
||||
### ZoomInfo
|
||||
- Best for: Enterprise accounts
|
||||
- Comprehensive org charts
|
||||
- Premium but accurate
|
||||
- Reserve for high-value targets
|
||||
|
||||
### Hunter
|
||||
- Best for: Domain searches
|
||||
- Email pattern detection
|
||||
- Author finding
|
||||
- Use for content creators
|
||||
|
||||
### BuiltWith
|
||||
- Best for: Technology detection
|
||||
- Historical tech data
|
||||
- E-commerce identification
|
||||
- Use for technographic segmentation
|
||||
|
||||
## Advanced Capabilities
|
||||
|
||||
### AI-Powered Research
|
||||
When standard providers fail:
|
||||
```python
def ai_research(company):
    # Use GPT-4 for web research
    prompt = f"Research {company} and find key contacts, technology stack, recent news"
    results = gpt4_research(prompt)

    # Validate with traditional providers
    validated = cross_validate(results)

    return validated
```
|
||||
|
||||
### Intent Signal Aggregation
|
||||
```python
def aggregate_intent_signals(company):
    signals = {
        "web_activity": get_web_visits(company),
        "content_engagement": get_content_downloads(company),
        "search_intent": get_search_queries(company),
        "social_signals": get_social_mentions(company),
        "hiring_signals": get_job_postings(company),
        "tech_changes": get_tech_adoptions(company)
    }

    intent_score = calculate_composite_score(signals)
    return {
        "score": intent_score,
        "signals": signals,
        "recommendation": get_outreach_recommendation(intent_score)
    }
```
|
||||
|
||||
## Integration Patterns
|
||||
|
||||
### CRM Sync
|
||||
```python
# Salesforce integration
def sync_to_salesforce(enriched_data):
    # Map fields
    sf_record = map_to_salesforce_fields(enriched_data)

    # Check for duplicates
    existing = check_duplicates(sf_record["email"])

    # Update or create
    if existing:
        update_record(existing["id"], sf_record)
    else:
        create_record(sf_record)
```
|
||||
|
||||
### Marketing Automation
|
||||
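```python
# HubSpot workflow
# add_to_workflow is a placeholder helper; it receives the contact so the
# workflow knows which record to enroll.
def trigger_hubspot_workflow(contact):
    # Missing scores default to 0, so partially enriched contacts fall
    # through to the standard nurture track instead of raising KeyError.
    if contact.get("intent_score", 0) > 80:
        add_to_workflow(contact, "high_intent_nurture")
    elif contact.get("job_title_score", 0) > 70:
        add_to_workflow(contact, "decision_maker_sequence")
    else:
        add_to_workflow(contact, "standard_nurture")
```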
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Provider Failures
|
||||
- Automatic failover to next provider
|
||||
- Exponential backoff for rate limits
|
||||
- Circuit breaker for repeated failures
|
||||
- Notification for persistent issues
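
Two of the mechanisms listed above, exponential backoff and a circuit breaker, can be sketched as follows. This is a simplified illustration under stated assumptions: `backoff_delay` and `CircuitBreaker` are hypothetical names, the delay uses full jitter, and the breaker only counts consecutive failures (a production breaker would usually also have a cool-down before retrying).

```python
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff with full jitter: random delay up to min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


class CircuitBreaker:
    """Skip a provider after `threshold` consecutive failures."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    def record(self, success: bool) -> None:
        # Any success resets the streak; failures accumulate.
        self.failures = 0 if success else self.failures + 1

    @property
    def is_open(self) -> bool:
        return self.failures >= self.threshold


breaker = CircuitBreaker(threshold=3)
for _ in range(3):
    breaker.record(success=False)
print(breaker.is_open, backoff_delay(attempt=2))
```

When `is_open` is true, the waterfall would skip that provider and fail over to the next one rather than spending time on further retries.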
|
||||
|
||||
### Data Quality Issues
|
||||
- Flag incomplete records
|
||||
- Queue for manual review
|
||||
- Attempt alternative providers
|
||||
- Log quality metrics
|
||||
|
||||
## Compliance & Security
|
||||
|
||||
### GDPR/CCPA Compliance
|
||||
- Only process with lawful basis
|
||||
- Respect opt-outs and deletions
|
||||
- Maintain audit logs
|
||||
- Encrypt sensitive data
|
||||
|
||||
### Data Governance
|
||||
- Regular data audits
|
||||
- Provider compliance verification
|
||||
- Access control enforcement
|
||||
- Data retention policies
|
||||
|
||||
## Performance Metrics
|
||||
|
||||
Track and optimize:
|
||||
- **Success Rate**: % of successful enrichments
|
||||
- **Cost Per Lead**: Average credits used
|
||||
- **Data Quality**: Validation pass rate
|
||||
- **Provider Performance**: Success by provider
|
||||
- **Time to Enrich**: Processing speed
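
A minimal sketch of how the first few metrics above could be computed from an enrichment log. The log structure (`provider`, `success`, `credits` per attempt) and the `summarize` function are assumptions for illustration, and the sample entries are made up.

```python
from collections import defaultdict


def summarize(log: list[dict]) -> dict:
    """Compute overall success rate, average credits per attempt, and per-provider success."""
    total = len(log)
    successes = sum(1 for entry in log if entry["success"])
    credits = sum(entry["credits"] for entry in log)
    by_provider = defaultdict(lambda: {"attempts": 0, "successes": 0})
    for entry in log:
        stats = by_provider[entry["provider"]]
        stats["attempts"] += 1
        stats["successes"] += int(entry["success"])
    return {
        "success_rate": successes / total if total else 0.0,
        "avg_credits": credits / total if total else 0.0,
        "provider_success": {
            name: s["successes"] / s["attempts"] for name, s in by_provider.items()
        },
    }


# Made-up log entries for illustration; the field names are assumptions.
sample_log = [
    {"provider": "apollo", "success": True, "credits": 1},
    {"provider": "apollo", "success": False, "credits": 1},
    {"provider": "hunter", "success": True, "credits": 2},
    {"provider": "hunter", "success": True, "credits": 2},
]
report = summarize(sample_log)
print(report)
```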
|
||||
|
||||
---
|
||||
22
agents/quality-analyst.md
Normal file
@@ -0,0 +1,22 @@
|
||||
---
|
||||
name: quality-analyst
|
||||
description: Ensures enriched data meets accuracy, compliance, and freshness standards across all providers.
|
||||
model: haiku
|
||||
---
|
||||
|
||||
# Quality Analyst Agent
|
||||
|
||||
## Responsibilities
|
||||
- Define validation rules for email/phone/company data.
|
||||
- Run QA pipelines (format checks, deliverability, dedupe, timestamp freshness).
|
||||
- Score provider outputs and recommend optimizations.
|
||||
- Manage GDPR/CCPA compliance logs and data retention policies.
|
||||
|
||||
## Workflow
|
||||
1. **Schema Validation** – confirm required fields, formats, country codes.
|
||||
2. **Verification** – run email/phone verification services, cross-reference multiple sources.
|
||||
3. **Confidence Scoring** – compute composite accuracy score per record.
|
||||
4. **Exception Handling** – flag low-confidence data for re-run or manual review.
|
||||
5. **Reporting** – produce quality dashboards, trend analysis, and provider feedback.
|
||||
|
||||
---
|
||||
37
commands/append-data.md
Normal file
@@ -0,0 +1,37 @@
|
||||
---
|
||||
name: append-data
|
||||
description: Append missing attributes to bulk lead lists using configurable provider waterfalls and mapping rules.
|
||||
usage: /data-enrichment-master:append-data --input leads.csv --fields "title,phone,linkedin"
|
||||
---
|
||||
|
||||
# Append Data Command
|
||||
|
||||
## Purpose
|
||||
Bulk-enrich a CSV/JSON dataset by filling specified fields (titles, phones, LinkedIn URLs, firmographics) while respecting credit budgets and compliance rules.
|
||||
|
||||
## Syntax
|
||||
```bash
|
||||
/data-enrichment-master:append-data \
|
||||
--input leads.csv \
|
||||
--fields "title,phone,linkedin" \
|
||||
--priority "apollo,hunter,rocketreach" \
|
||||
--max-credits 5 \
|
||||
--output enriched.csv
|
||||
```
|
||||
|
||||
### Parameters
|
||||
- `--input`: Path to CSV/JSON file with seed data.
|
||||
- `--fields`: Comma-separated field names to append.
|
||||
- `--priority`: Ordered provider sequence (defaults to recommended waterfall per field).
|
||||
- `--max-credits`: Credit ceiling per record.
|
||||
- `--parallel`: Number of concurrent requests.
|
||||
- `--output`: Destination file.
|
||||
- `--cache-ttl`: Override default caching window.
|
||||
|
||||
## Features
|
||||
- Automatic batching for provider rate limits.
|
||||
- Field-level confidence scoring and attribution to provider.
|
||||
- Retry + fallback strategy when providers fail.
|
||||
- Progress reporting (records completed, credits consumed, ETA).
|
||||
|
||||
---
|
||||
35
commands/clean-database.md
Normal file
@@ -0,0 +1,35 @@
|
||||
---
|
||||
name: clean-database
|
||||
description: Normalize, deduplicate, and validate enriched datasets to maintain accuracy and compliance.
|
||||
usage: /data-enrichment-master:clean-database --input enriched.csv --rules rules.yaml
|
||||
---
|
||||
|
||||
# Clean Database Command
|
||||
|
||||
## Purpose
|
||||
Run data quality workflows (formatting, deduplication, validation, suppression) before syncing enriched records into downstream systems.
|
||||
|
||||
## Syntax
|
||||
```bash
|
||||
/data-enrichment-master:clean-database \
|
||||
--input enriched.csv \
|
||||
--rules rules.yaml \
|
||||
--output clean.csv \
|
||||
--gdpr true
|
||||
```
|
||||
|
||||
### Parameters
|
||||
- `--input`: Source CSV/JSON/Parquet file.
|
||||
- `--rules`: YAML/JSON config defining normalization rules, required fields, dedupe logic.
|
||||
- `--output`: File path or system destination (Salesforce, HubSpot, Snowflake).
|
||||
- `--gdpr`: Apply regional compliance filters (default true).
|
||||
- `--suppress-list`: Path to opt-out or customer suppression list.
|
||||
- `--format`: Output format (csv, json, parquet, api-sync).
|
||||
|
||||
## Features
|
||||
- Email/phone format correction, country normalization, timezone calculation.
|
||||
- Deduping via fuzzy matching and configurable keys.
|
||||
- Confidence scoring and rejection report for records failing validation.
|
||||
- Audit log of transformations for compliance.
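
The fuzzy-matching deduplication feature above could be approximated with the standard library's `difflib`. This is a sketch under assumptions: the `dedupe` and `normalize` names, the `company` key, and the 0.9 similarity threshold are illustrative choices, not the command's actual configuration.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(value.lower().split())


def dedupe(records: list[dict], key: str = "company", threshold: float = 0.9) -> list[dict]:
    """Keep the first record of each group whose `key` values are near-identical."""
    kept: list[dict] = []
    for record in records:
        candidate = normalize(record[key])
        is_duplicate = any(
            SequenceMatcher(None, candidate, normalize(existing[key])).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(record)
    return kept


rows = [
    {"company": "Acme Corp"},
    {"company": "acme  corp"},
    {"company": "Globex"},
]
unique = dedupe(rows)
print([row["company"] for row in unique])
```

The pairwise comparison is quadratic in the number of kept records, so a large dataset would likely need blocking on a cheaper key (for example, email domain) before fuzzy comparison.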
|
||||
|
||||
---
|
||||
35
commands/enrich-leads.md
Normal file
@@ -0,0 +1,35 @@
|
||||
---
|
||||
name: enrich-leads
|
||||
description: Enrich a single company or person record with firmographics, technographics,
|
||||
and contact intelligence.
|
||||
usage: /data-enrichment-master:enrich-leads --type company --domain "acme.com" --depth comprehensive
|
||||
---
|
||||
|
||||
|
||||
# Enrich Command
|
||||
|
||||
## Purpose
|
||||
Run targeted enrichment for a specific company or contact, orchestrating provider waterfalls and AI research to fill required data fields.
|
||||
|
||||
## Syntax
|
||||
```bash
|
||||
/data-enrichment-master:enrich-leads \
|
||||
--type <company|person> \
|
||||
--domain "acme.com" \
|
||||
--email "ceo@acme.com" \
|
||||
--depth <basic|standard|comprehensive>
|
||||
```
|
||||
|
||||
### Parameters
|
||||
- `--type`: company or person.
|
||||
- `--domain`: company domain.
|
||||
- `--email` / `--name` / `--company`: person identifiers.
|
||||
- `--depth`: determines provider sequence and credit budget.
|
||||
- `--providers`: optional custom order (comma-delimited).
|
||||
- `--include-intent`: attach intent data (default true).
|
||||
|
||||
## Output
|
||||
- JSON record with firmographics, technographics, contacts, intent signals, and confidence scores.
|
||||
- Provider log + credit usage summary.
|
||||
|
||||
---
|
||||
335
commands/waterfall-enrichment.md
Normal file
@@ -0,0 +1,335 @@
|
||||
---
|
||||
name: waterfall-enrichment
|
||||
description: Execute multi-provider enrichment waterfalls with credit-aware routing, validation, and export options.
|
||||
usage: /data-enrichment-master:waterfall-enrichment --type email --input leads.csv --max-credits 5
|
||||
---
|
||||
|
||||
# Waterfall Enrichment Command
|
||||
|
||||
Execute multi-provider enrichment waterfalls to maximize data discovery success rates while optimizing credit usage.
|
||||
|
||||
## Command Syntax
|
||||
|
||||
```bash
|
||||
/data-enrichment-master:waterfall-enrichment --type <email|phone|company|full> --input <data> --max-credits <limit>
|
||||
```
|
||||
|
||||
## Parameters
|
||||
|
||||
- `--type`: Type of waterfall (email, phone, company, full)
|
||||
- `--input`: Input data (name+company, email, domain, CSV file)
|
||||
- `--max-credits`: Maximum credits to spend per record (default: 10)
|
||||
- `--providers`: Specific provider sequence (optional, uses optimized defaults)
|
||||
- `--validate`: Validate discovered data (default: true)
|
||||
- `--cache`: Use cached results (default: true, 30-day TTL)
|
||||
- `--parallel`: Process multiple records in parallel (default: true)
|
||||
- `--output`: Output format (json|csv|salesforce|hubspot)
|
||||
|
||||
## Waterfall Sequences
|
||||
|
||||
### Email Discovery Waterfall
|
||||
```yaml
|
||||
Default Sequence:
|
||||
1. Cache Check (0 credits)
|
||||
2. Apollo.io (1-2 credits)
|
||||
3. Hunter (1-2 credits)
|
||||
4. RocketReach (1-2 credits)
|
||||
5. People Data Labs (1-2 credits)
|
||||
6. ContactOut (1-2 credits)
|
||||
7. Findymail (1-2 credits)
|
||||
8. BetterContact (2-5 credits)
|
||||
9. AI Web Research (2-5 credits)
|
||||
|
||||
Validation:
|
||||
- ZeroBounce (0.5 credits)
|
||||
- NeverBounce backup (0.5 credits)
|
||||
```
|
||||
|
||||
### Phone Discovery Waterfall
|
||||
```yaml
|
||||
Default Sequence:
|
||||
1. Cache Check (0 credits)
|
||||
2. Apollo.io (1-2 credits)
|
||||
3. RocketReach (1-2 credits)
|
||||
4. LeadMagic (1-2 credits)
|
||||
5. SignalHire (1-2 credits)
|
||||
6. BetterContact Phone (2-5 credits)
|
||||
7. People Data Labs (1-2 credits)
|
||||
|
||||
Validation:
|
||||
- ClearoutPhone (0.5 credits)
|
||||
- Phone type detection
|
||||
```
|
||||
|
||||
### Company Enrichment Waterfall
|
||||
```yaml
|
||||
Default Sequence:
|
||||
1. Clearbit (1-2 credits)
|
||||
2. Ocean.io (2-3 credits)
|
||||
3. ZoomInfo (2-3 credits) [if enterprise]
|
||||
4. Crunchbase (1-2 credits) [if funded]
|
||||
5. BuiltWith (1-2 credits) [technographics]
|
||||
6. HG Insights (2-3 credits) [tech spend]
|
||||
7. Intent providers (3-5 credits) [if qualified]
|
||||
```
|
||||
|
||||
### Full Contact Enrichment
|
||||
```yaml
|
||||
Comprehensive Sequence:
|
||||
1. Email discovery waterfall
|
||||
2. Phone discovery waterfall
|
||||
3. Social profile discovery
|
||||
4. Company enrichment
|
||||
5. Technographics
|
||||
6. Intent signals
|
||||
7. Validation & scoring
|
||||
```
|
||||
|
||||
## Examples
|
||||
|
||||
### Basic Email Discovery
|
||||
```bash
|
||||
/data-enrichment-master:waterfall-enrichment \
|
||||
--type email \
|
||||
--input "John Smith, Acme Corp"
|
||||
```
|
||||
|
||||
### Bulk Email Enrichment with Validation
|
||||
```bash
|
||||
/data-enrichment-master:waterfall-enrichment \
|
||||
--type email \
|
||||
--input "prospects.csv" \
|
||||
--validate true \
|
||||
--max-credits 5
|
||||
```
|
||||
|
||||
### Custom Provider Sequence
|
||||
```bash
|
||||
/data-enrichment-master:waterfall-enrichment \
|
||||
--type email \
|
||||
--input "jane.doe@example.com" \
|
||||
--providers "clearbit,apollo,hunter" \
|
||||
--validate true
|
||||
```
|
||||
|
||||
### Enterprise Full Enrichment
|
||||
```bash
|
||||
/data-enrichment-master:waterfall-enrichment \
|
||||
--type full \
|
||||
--input "target_accounts.csv" \
|
||||
--max-credits 20 \
|
||||
--output salesforce
|
||||
```
|
||||
|
||||
## Provider Selection Logic
|
||||
|
||||
```python
def select_providers(input_type, data_available, target_quality):
    providers = []

    # Email discovery logic
    if input_type == "email":
        if has_linkedin_url(data_available):
            providers = ["contactout", "rocketreach", "apollo"]
        elif has_full_name_and_company(data_available):
            providers = ["apollo", "hunter", "rocketreach"]
        elif has_domain_only(data_available):
            providers = ["hunter", "apollo", "clearbit"]
        else:
            providers = ["people_data_labs", "bettercontact", "ai_research"]

    # Phone discovery logic
    elif input_type == "phone":
        if has_email(data_available):
            providers = ["apollo", "rocketreach", "leadmagic"]
        else:
            providers = ["bettercontact_phone", "signalhire", "lusha"]

    # Quality-based filtering
    if target_quality == "high":
        providers = filter_high_accuracy_providers(providers)

    return providers
```
|
||||
|
||||
## Credit Optimization
|
||||
|
||||
### Smart Routing Algorithm
|
||||
```python
def optimize_provider_sequence(providers, max_credits, historical_success):
    # Sort by success rate and cost efficiency
    scored_providers = []

    for provider in providers:
        score = calculate_efficiency_score(
            success_rate=historical_success[provider],
            credit_cost=PROVIDER_COSTS[provider],
            data_quality=PROVIDER_QUALITY[provider]
        )
        scored_providers.append((provider, score))

    # Sort by efficiency score
    scored_providers.sort(key=lambda x: x[1], reverse=True)

    # Build sequence within credit limit
    sequence = []
    remaining_credits = max_credits

    for provider, score in scored_providers:
        if PROVIDER_COSTS[provider] <= remaining_credits:
            sequence.append(provider)
            remaining_credits -= PROVIDER_COSTS[provider]

    return sequence
```
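
For a self-contained view of the routing idea above, the sketch below plugs in concrete numbers. The cost and success figures are borrowed from the sample `provider_performance` metrics in this command file and are illustrative only; the scoring rule (success rate divided by credit cost) is an assumption standing in for `calculate_efficiency_score`, and `build_sequence` is a hypothetical name.

```python
# Illustrative figures only; real values would come from contracts and logs.
PROVIDER_COSTS = {"apollo": 1.5, "hunter": 1.2, "zoominfo": 2.5}
HISTORICAL_SUCCESS = {"apollo": 0.75, "hunter": 0.70, "zoominfo": 0.90}


def efficiency(provider: str) -> float:
    """Assumed scoring rule: expected successes per credit spent."""
    return HISTORICAL_SUCCESS[provider] / PROVIDER_COSTS[provider]


def build_sequence(providers: list[str], max_credits: float) -> list[str]:
    """Greedy selection: take providers in descending efficiency while the budget allows."""
    sequence = []
    remaining = max_credits
    for provider in sorted(providers, key=efficiency, reverse=True):
        cost = PROVIDER_COSTS[provider]
        if cost <= remaining:
            sequence.append(provider)
            remaining -= cost
    return sequence


plan = build_sequence(["apollo", "hunter", "zoominfo"], max_credits=3.0)
print(plan)
```

With a 3-credit ceiling, the two most efficient providers fit and the most expensive one is dropped. Note that the budget here is a worst case: it assumes every provider in the sequence is actually called, whereas a waterfall stops spending at the first hit.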
|
||||
|
||||
## Success Metrics
|
||||
|
||||
### Tracking Performance
|
||||
```yaml
Metrics:
  success_rate:
    email_found: 85%
    phone_found: 65%
    company_enriched: 95%

  average_credits:
    email: 2.3 credits
    phone: 3.1 credits
    company: 4.5 credits
    full_contact: 8.2 credits

  validation_accuracy:
    email_deliverable: 97%
    phone_valid: 94%

  provider_performance:
    apollo:
      success_rate: 75%
      avg_credits: 1.5
    hunter:
      success_rate: 70%
      avg_credits: 1.2
    zoominfo:
      success_rate: 90%
      avg_credits: 2.5
```
|
||||
|
||||
## Error Handling
|
||||
|
||||
### Provider Failures
|
||||
```python
def handle_provider_failure(provider, error, context):
    # Returns the provider to try next: the same provider when a retry is
    # pending, otherwise the next provider in the waterfall.

    # Log failure
    log_provider_error(provider, error)

    # Determine action
    if is_rate_limit(error):
        # Exponential backoff, then retry the same provider
        wait_time = calculate_backoff(provider)
        schedule_retry(provider, context, wait_time)
        return provider

    elif is_auth_error(error):
        # Alert and skip provider
        alert_admin(f"Auth failed for {provider}")
        return next_provider()

    elif is_data_not_found(error):
        # Continue to next provider
        return next_provider()

    else:
        # Generic error - retry once then skip
        if not has_retried(provider, context):
            retry_provider(provider, context)
            return provider
        else:
            return next_provider()
```
|
||||
|
||||
## Output Formats
|
||||
|
||||
### JSON Output
|
||||
```json
|
||||
{
|
||||
"input": {
|
||||
"name": "John Smith",
|
||||
"company": "Acme Corp"
|
||||
},
|
||||
"results": {
|
||||
"email": "john.smith@acme.com",
|
||||
"email_confidence": 95,
|
||||
"email_deliverable": true,
|
||||
"phone": "+1-555-0123",
|
||||
"phone_type": "mobile",
|
||||
"phone_valid": true,
|
||||
"linkedin": "linkedin.com/in/johnsmith",
|
||||
"providers_used": ["apollo", "zerobounce"],
|
||||
"credits_used": 2.5
|
||||
},
|
||||
"metadata": {
|
||||
"enriched_at": "2024-01-20T10:30:00Z",
|
||||
"cache_hit": false,
|
||||
"processing_time": 1.2
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### CSV Output
|
||||
```csv
|
||||
name,company,email,email_confidence,phone,phone_type,linkedin,credits_used
|
||||
John Smith,Acme Corp,john.smith@acme.com,95,+1-555-0123,mobile,linkedin.com/in/johnsmith,2.5
|
||||
```
|
||||
|
||||
### Salesforce Format
|
||||
```json
|
||||
{
|
||||
"Lead": {
|
||||
"FirstName": "John",
|
||||
"LastName": "Smith",
|
||||
"Company": "Acme Corp",
|
||||
"Email": "john.smith@acme.com",
|
||||
"Phone": "+1-555-0123",
|
||||
"LinkedIn__c": "linkedin.com/in/johnsmith",
|
||||
"Enrichment_Score__c": 95,
|
||||
"Last_Enriched__c": "2024-01-20T10:30:00Z"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Caching Strategy
|
||||
|
||||
### Cache Management
|
||||
```python
|
||||
CACHE_CONFIG = {
|
||||
"email": {
|
||||
"ttl_days": 30,
|
||||
"refresh_if_bounced": True
|
||||
},
|
||||
"phone": {
|
||||
"ttl_days": 60,
|
||||
"refresh_if_invalid": True
|
||||
},
|
||||
"company": {
|
||||
"ttl_days": 90,
|
||||
"refresh_on_trigger": ["funding", "acquisition", "ipo"]
|
||||
},
|
||||
"intent": {
|
||||
"ttl_days": 7,
|
||||
"always_refresh": True
|
||||
}
|
||||
}
|
||||
```
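
A freshness check against the TTLs above might be sketched like this. The `is_fresh` function and the `TTL_DAYS` table are hypothetical; the day counts mirror the `ttl_days` values in `CACHE_CONFIG`, and the refresh flags (`refresh_if_bounced`, `always_refresh`, and so on) are left out for brevity.

```python
from datetime import datetime, timedelta, timezone

# TTLs mirror the ttl_days values in CACHE_CONFIG above.
TTL_DAYS = {"email": 30, "phone": 60, "company": 90, "intent": 7}


def is_fresh(data_type: str, cached_at: datetime, now: datetime) -> bool:
    """A cached value is fresh while its age is below the TTL for its data type."""
    return now - cached_at < timedelta(days=TTL_DAYS[data_type])


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
cached_at = datetime(2024, 1, 20, 10, 30, tzinfo=timezone.utc)  # about 40 days before `now`

for data_type in TTL_DAYS:
    print(data_type, is_fresh(data_type, cached_at, now))
```

For an entry cached about 40 days earlier, the email and intent values are stale while the phone and company values can still be served from cache.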
|
||||
|
||||
## Best Practices
|
||||
|
||||
1. **Start with cached data** - Always check cache first
|
||||
2. **Set appropriate credit limits** - Balance cost vs. data quality
|
||||
3. **Use parallel processing** - For bulk enrichments
|
||||
4. **Validate critical data** - Especially emails before outreach
|
||||
5. **Monitor provider performance** - Adjust sequences based on success rates
|
||||
6. **Handle failures gracefully** - Automatic fallback to next provider
|
||||
7. **Track ROI** - Measure enrichment value vs. credit cost
|
||||
|
||||
---
|
||||
|
||||
*Execution model: claude-haiku-4-5 for provider routing, parallel processing for bulk operations*
|
||||
81
plugin.lock.json
Normal file
@@ -0,0 +1,81 @@
|
||||
{
|
||||
"$schema": "internal://schemas/plugin.lock.v1.json",
"pluginId": "gh:gtmagents/gtm-agents:plugins/data-enrichment-master",
"normalized": {
  "repo": null,
  "ref": "refs/tags/v20251128.0",
  "commit": "46106e64a2b3a4f2a8a2926477f830886523471f",
  "treeHash": "e2c4b96adfb0e9b253ed6f1b16cd707a03b49e293324feaf38c73e69cd2f517c",
  "generatedAt": "2025-11-28T10:17:08.087484Z",
  "toolVersion": "publish_plugins.py@0.2.0"
},
"origin": {
  "remote": "git@github.com:zhongweili/42plugin-data.git",
  "branch": "master",
  "commit": "aa1497ed0949fd50e99e70d6324a29c5b34f9390",
  "repoRoot": "/Users/zhongweili/projects/openmind/42plugin-data"
},
"manifest": {
  "name": "data-enrichment-master",
  "description": "Lead enrichment, firmographics, technographics, and data quality",
  "version": "1.0.0"
},
"content": {
  "files": [
    {
      "path": "README.md",
      "sha256": "b1d8da1e1513410572e5f37c6946694f01cdb77142506934315448dcf81394b5"
    },
    {
      "path": "agents/enrichment-expert.md",
      "sha256": "4bbe5d32b4642cd6ea437d8120a4ebb138b52cfbaa985911ff745661082640fc"
    },
    {
      "path": "agents/data-specialist.md",
      "sha256": "5c8a7b3d649d8712934c8916529d4754ce66f93a9b0434f8eee57de61ec974ef"
    },
    {
      "path": "agents/quality-analyst.md",
      "sha256": "f9f8b4019709902d995162ed7607cb69953986fb05092c72072e6655d958a837"
    },
    {
      "path": "agents/company-analyst.md",
      "sha256": "aa92fdca8ac3c9be598cf1c2b9cfb0f882bcb48c7ec188f55f702aeb4c7209a5"
    },
    {
      "path": ".claude-plugin/plugin.json",
      "sha256": "4c30b6f5549e90d864a8695745873ee5a075aba3e9e1c016c8d3294317dbb415"
    },
    {
      "path": "commands/clean-database.md",
      "sha256": "b1a3140ed4e198d5fd9ef3175a7181a22eb12e608190f798ec8f10f77792071a"
    },
    {
      "path": "commands/enrich-leads.md",
      "sha256": "f071c3d89f550e69bd7a8a594ba3034081d3d33e732d4a6e7a98895aed5d3b57"
    },
    {
      "path": "commands/waterfall-enrichment.md",
      "sha256": "d87f8eba1eeab3b886f687a323f4be4ccf4d8ce1335c34b2464df07fd5069cc8"
    },
    {
      "path": "commands/append-data.md",
      "sha256": "64e75d5d78081f1a0bf0a967fe131481673566a536bccd0a788edb9189885ca9"
    },
    {
      "path": "skills/data-sourcing/SKILL.md",
      "sha256": "684a475b37c8e0c4b74874c900b56bd20c5605948f5395555b4821901ea1a12e"
    },
    {
      "path": "skills/firmographic-analysis/SKILL.md",
      "sha256": "e0c352e72eb5e15ecfb681d332aee542c7452016c82f4dc246b17406bf070d07"
    }
  ],
  "dirSha256": "e2c4b96adfb0e9b253ed6f1b16cd707a03b49e293324feaf38c73e69cd2f517c"
},
"security": {
  "scannedAt": null,
  "scannerVersion": null,
  "flags": []
}
}
316
skills/data-sourcing/SKILL.md
Normal file
@@ -0,0 +1,316 @@
---
name: data-sourcing
description: Optimize provider selection, routing, and credit usage across 150+ enrichment sources for company/contact intelligence.
---

# Data Sourcing & Provider Optimization Skill

## When to Use

- Selecting provider stacks for email, phone, company, or intent enrichment
- Building or tuning waterfall sequences to improve success rates
- Auditing credit consumption or provider performance
- Designing enrichment logic for GTM ops, RevOps, or data engineering teams

## Framework

You are an expert at selecting and optimizing data providers from 150+ available options to maximize data quality while minimizing credit costs. Use this layered framework to keep enrichment predictable and efficient.

### Core Principles

1. **Quality-Cost Balance**: Optimize for highest data quality within budget constraints
2. **Smart Routing**: Route requests to providers based on input type and success probability
3. **Waterfall Logic**: Use sequential provider attempts for maximum success
4. **Caching Strategy**: Leverage cached data to reduce redundant API calls
5. **Bulk Optimization**: Process similar requests together for volume discounts

### Provider Selection Matrix

#### For Email Discovery

**Best Input Scenarios:**
- **Have LinkedIn URL**: ContactOut → RocketReach → Apollo
- **Have Name + Company**: Apollo → Hunter → RocketReach → FindyMail
- **Have Domain Only**: Hunter → Apollo → Clearbit
- **Have Email (need validation)**: ZeroBounce → NeverBounce → Debounce

**Quality Tiers:**
- **Premium** (90%+ success): ZoomInfo, BetterContact waterfall
- **Standard** (75%+ success): Apollo, Hunter, RocketReach
- **Budget** (60%+ success): Snov.io, Prospeo, ContactOut

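
The input-based routing for email discovery can be sketched as a lookup table plus a small selector. This is a minimal illustration, not part of the plugin: the record field names (`email`, `linkedin_url`, `name`, `company`, `domain`) and the order in which inputs are checked are assumptions; only the provider orderings come from the "Best Input Scenarios" list.

```python
# Provider order per input type, copied from the "Best Input Scenarios" list.
EMAIL_ROUTES = {
    "linkedin_url": ["ContactOut", "RocketReach", "Apollo"],
    "name_company": ["Apollo", "Hunter", "RocketReach", "FindyMail"],
    "domain": ["Hunter", "Apollo", "Clearbit"],
    "email": ["ZeroBounce", "NeverBounce", "Debounce"],
}


def pick_email_route(record):
    """Return (input_type, providers) for the richest input present.

    The check order below is an assumption: an existing email only needs
    validation, then a LinkedIn URL, then name + company, then bare domain.
    """
    if record.get("email"):
        return "email", EMAIL_ROUTES["email"]
    if record.get("linkedin_url"):
        return "linkedin_url", EMAIL_ROUTES["linkedin_url"]
    if record.get("name") and record.get("company"):
        return "name_company", EMAIL_ROUTES["name_company"]
    if record.get("domain"):
        return "domain", EMAIL_ROUTES["domain"]
    return None, []
```

A record with no usable input returns `(None, [])`, so the caller can skip it without spending credits.
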
#### For Company Intelligence

**Data Type Priority:**
- **Basic Firmographics**: Clearbit (fastest) → Ocean.io → Apollo
- **Financial Data**: Crunchbase → PitchBook → Dealroom
- **Technology Stack**: BuiltWith → HG Insights → Clearbit
- **Intent Signals**: B2D AI → ZoomInfo Intent → 6sense
- **News & Social**: Google News → Social platforms → Owler

**Industry Specialization:**
- **Startups**: Crunchbase, Dealroom, AngelList
- **Enterprise**: ZoomInfo, D&B, HG Insights
- **E-commerce**: Store Leads, BuiltWith, Shopify data
- **Healthcare**: Definitive Healthcare + compliance providers
- **Financial Services**: PitchBook, S&P Capital IQ

### Credit Optimization Strategies

#### Cost Tiers
```
Tier 0 (Free): Native operations, cached data, manual inputs
Tier 1 (0.5 credits): Validation, verification, basic lookups
Tier 2 (1-2 credits): Standard enrichments (Apollo, Hunter, Clearbit)
Tier 3 (2-3 credits): Premium data (ZoomInfo, technographics, intent)
Tier 4 (3-5 credits): Enterprise intelligence (PitchBook, custom AI)
Tier 5 (5-10 credits): Specialized services (video generation, deep AI research)
```

#### Optimization Tactics

**1. Cache Everything**
- Email: 30-day cache
- Company: 90-day cache
- Intent: 7-day cache
- Static data: Indefinite cache

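
The cache windows above could be enforced with a small in-memory store keyed by data type. This is a sketch under stated assumptions: the `EnrichmentCache` class, its method names, and the injectable `clock` are hypothetical and not part of the plugin; only the TTL values come from the list. A production setup would more likely sit on a shared store, but the expiry check would look the same.

```python
import time

# TTLs in seconds, taken from the cache windows listed above.
# None means the entry never expires (static data).
DAY = 24 * 60 * 60
CACHE_TTL = {"email": 30 * DAY, "company": 90 * DAY, "intent": 7 * DAY, "static": None}


class EnrichmentCache:
    """Hypothetical in-memory cache that applies a per-data-type TTL."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}     # (data_type, key) -> (stored_at, value)

    def put(self, data_type, key, value):
        self._store[(data_type, key)] = (self._clock(), value)

    def get(self, data_type, key):
        entry = self._store.get((data_type, key))
        if entry is None:
            return None
        stored_at, value = entry
        ttl = CACHE_TTL[data_type]
        if ttl is not None and self._clock() - stored_at > ttl:
            del self._store[(data_type, key)]  # expired: evict and report a miss
            return None
        return value
```
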
**2. Batch Processing**
```python
# Process in batches for volume discounts
if record_count > 1000:
    use_provider("apollo_bulk")  # 10-30% discount
elif record_count > 100:
    use_parallel_processing()
else:
    use_standard_processing()
```

**3. Smart Waterfalls**
```python
waterfall_sequence = [
    {"provider": "cache", "credits": 0},
    {"provider": "apollo", "credits": 1.5, "stop_if_success": True},
    {"provider": "hunter", "credits": 1.2, "stop_if_success": True},
    {"provider": "bettercontact", "credits": 3, "stop_if_success": True},
    {"provider": "ai_research", "credits": 5, "last_resort": True}
]
```

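
A sequence shaped like `waterfall_sequence` could be executed by a small runner such as the one below. It is a sketch, not the plugin's implementation: `run_waterfall` and the `lookups` mapping (provider name to a callable) are hypothetical, credits are assumed to be charged on every attempt (some providers bill only on a hit), and the `stop_if_success` / `last_resort` flags are not interpreted because every step here stops on the first non-empty result.

```python
def run_waterfall(sequence, record, lookups):
    """Try each step in order and stop at the first step that returns data.

    `lookups` maps provider name -> callable(record) returning a dict or None.
    Returns (result, credits_spent, provider_used).
    """
    spent = 0.0
    for step in sequence:
        lookup = lookups.get(step["provider"])
        if lookup is None:
            continue  # provider not configured: skip without charging
        spent += step["credits"]  # assumption: charged per attempt
        result = lookup(record)
        if result:
            return result, spent, step["provider"]
    return None, spent, None


# Usage with stub lookups standing in for real provider calls.
sequence = [
    {"provider": "cache", "credits": 0},
    {"provider": "apollo", "credits": 1.5},
    {"provider": "hunter", "credits": 1.2},
]
lookups = {
    "cache": lambda record: None,
    "apollo": lambda record: None,
    "hunter": lambda record: {"email": "jane@acme.example"},
}
result, spent, used = run_waterfall(sequence, {"name": "Jane Doe"}, lookups)
```

In this stubbed run the cache and Apollo miss, Hunter hits, and the spend is the sum of the Apollo and Hunter step costs.
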
### Provider-Specific Optimizations

#### Apollo.io
- **Strengths**: US B2B, LinkedIn data, phone numbers
- **Weaknesses**: International coverage, personal emails
- **Tips**: Use bulk API for 10%+ discount, batch similar companies

#### ZoomInfo
- **Strengths**: Enterprise data, org charts, intent signals
- **Weaknesses**: Expensive, SMB coverage
- **Tips**: Reserve for high-value accounts, negotiate enterprise deals

#### Hunter
- **Strengths**: Domain searches, email patterns, API reliability
- **Weaknesses**: Phone numbers, detailed contact info
- **Tips**: Best for initial domain exploration, use pattern detection

#### Clearbit
- **Strengths**: Real-time API, company data, speed
- **Weaknesses**: Email discovery rates, phone numbers
- **Tips**: Great for instant enrichment, combine with others for contacts

#### BuiltWith
- **Strengths**: Technology detection, historical data, e-commerce
- **Weaknesses**: Contact information, company financials
- **Tips**: Filter accounts by technology before enrichment

### Waterfall Strategies

#### Maximum Success Waterfall
```yaml
Priority: Success rate over cost
Sequence:
  1. BetterContact (aggregates 10+ sources)
  2. ZoomInfo (if enterprise)
  3. Apollo + Hunter + RocketReach
  4. AI web research
Expected Success: 95%+
Average Cost: 8-12 credits
```

#### Balanced Waterfall
```yaml
Priority: Good success with reasonable cost
Sequence:
  1. Apollo.io
  2. Hunter (if domain match)
  3. RocketReach (if name match)
  4. Stop or continue based on confidence
Expected Success: 80%
Average Cost: 3-5 credits
```

#### Budget Waterfall
```yaml
Priority: Minimize cost
Sequence:
  1. Cache check
  2. Hunter (domain only)
  3. Free sources (Google, LinkedIn public)
  4. Stop at first result
Expected Success: 60%
Average Cost: 1-2 credits
```

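
Figures like "Expected Success" and "Average Cost" can be estimated from per-step hit rates. The sketch below is one way to do that, assuming each step is charged whenever it is attempted and that hit rates are independent across steps; neither assumption is stated in this skill, and the function name `waterfall_expectations` is hypothetical. With per-step cost c_i and hit rate p_i, the expected cost is the sum of c_i times the probability that all earlier steps missed.

```python
def waterfall_expectations(steps):
    """Estimate overall success and expected credits for a waterfall.

    `steps` is a list of (credits, hit_rate) pairs tried in order until
    the first hit. Returns (success_probability, expected_credits).
    """
    p_reach = 1.0        # probability that every earlier step missed
    expected_cost = 0.0
    for credits, hit_rate in steps:
        expected_cost += p_reach * credits  # pay only if this step is reached
        p_reach *= 1.0 - hit_rate
    return 1.0 - p_reach, expected_cost


# Illustrative numbers only: two steps, each assumed to hit half the time.
success, cost = waterfall_expectations([(1.5, 0.5), (1.2, 0.5)])
```

With these illustrative inputs the second step runs for half of the records, so the estimate is a 75% success rate at about 2.1 credits per record. Real hit rates should come from measured provider performance.
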
### Quality Scoring Framework

```python
def calculate_data_quality_score(data, sources):
    """Score a record from 0 to 100 across four weighted checks."""
    score = 0

    # Multi-source validation (30 points)
    if len(sources) > 1:
        score += min(len(sources) * 10, 30)

    # Data completeness (30 points): 10 per populated field, capped at 30
    # so that four populated fields cannot push the total past 100
    required_fields = ["email", "phone", "title", "company"]
    score += min(sum(10 for field in required_fields if data.get(field)), 30)

    # Verification status (20 points)
    if data.get("email_verified"):
        score += 10
    if data.get("phone_verified"):
        score += 10

    # Recency (20 points)
    # get_data_age() is not defined in this skill; it is assumed to
    # return the record's age in days
    days_old = get_data_age(data)
    if days_old < 30:
        score += 20
    elif days_old < 90:
        score += 10

    return score
```

### Industry-Specific Provider Selection

#### SaaS/Technology
- Primary: Apollo, Clearbit, BuiltWith
- Secondary: ZoomInfo, HG Insights
- Intent: G2, TrustRadius, 6sense

#### Financial Services
- Primary: PitchBook, ZoomInfo
- Compliance: LexisNexis, D&B
- News: Bloomberg, Reuters

#### Healthcare
- Primary: Definitive Healthcare
- Compliance: NPPES, state boards
- Standard: ZoomInfo with healthcare filters

#### E-commerce
- Primary: Store Leads, BuiltWith
- Platform-specific: Shopify, Amazon seller data
- Standard: Clearbit with e-commerce signals

### Troubleshooting Common Issues

#### Low Email Discovery Rate
- Check email patterns with Hunter
- Try personal email providers
- Use AI research for executives
- Consider LinkedIn outreach instead

#### High Credit Usage
- Audit waterfall sequences
- Increase cache TTL
- Negotiate volume deals
- Use native operations first

#### Poor Data Quality
- Add verification steps
- Cross-reference multiple sources
- Set minimum confidence thresholds
- Implement human review for critical data

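
Two of the data-quality remedies, cross-referencing sources and applying minimum thresholds, can be sketched in a few lines. Everything here is illustrative: the function names, the agreement count, the threshold values, and the action labels are assumptions rather than settings defined by this skill.

```python
from collections import Counter


def cross_reference(values_by_source, min_agreement=2):
    """Return the value reported by at least `min_agreement` sources, else None."""
    counts = Counter(
        value.strip().lower() for value in values_by_source.values() if value
    )
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes >= min_agreement else None


def route_by_quality(score, accept_at=80, review_at=50):
    """Map a 0-100 quality score to a next action (thresholds are illustrative)."""
    if score >= accept_at:
        return "sync_to_crm"
    if score >= review_at:
        return "human_review"
    return "re_enrich"
```

Values are compared case-insensitively, which suits email addresses; other fields such as job titles would likely need their own normalization before voting.
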
### Advanced Techniques

#### Hybrid Enrichment
```python
# Combine AI and traditional providers
def hybrid_enrichment(company):
    # Fast, cheap base data
    base = clearbit_lookup(company)

    # AI for missing pieces
    if not base.get("description"):
        base["description"] = ai_generate_description(company)

    # Premium for high-value
    if is_enterprise_account(base):
        base.update(zoominfo_enrich(company))

    return base
```

#### Progressive Enrichment
```python
# Enrich in stages based on engagement
def progressive_enrichment(lead):
    # Stage 1: Basic (on import)
    if lead.stage == "new":
        return basic_enrichment(lead)  # 1-2 credits

    # Stage 2: Engaged (opened email)
    elif lead.stage == "engaged":
        return standard_enrichment(lead)  # 3-5 credits

    # Stage 3: Qualified (booked meeting)
    elif lead.stage == "qualified":
        return comprehensive_enrichment(lead)  # 10+ credits
```

## Templates
- **Provider Cheat Sheet**: See `references/provider_cheat_sheet.md` for provider selection.
- **Cost Calculator**: See `scripts/cost_calculator.py` for estimating credit usage.
- **Integration Code Templates**:
```javascript
// JavaScript/Node.js template
const enrichContact = async (name, company) => {
  // Check cache first
  const cached = await checkCache(name, company);
  if (cached) return cached;

  // Try providers in sequence
  const providers = ['apollo', 'hunter', 'rocketreach'];

  for (const provider of providers) {
    try {
      const result = await callProvider(provider, {name, company});
      if (result.email) {
        await saveToCache(result);
        return result;
      }
    } catch (error) {
      console.log(`${provider} failed, trying next...`);
    }
  }

  // Fallback to AI research
  return await aiResearch(name, company);
};
```

---

## Tips

- **Pre-build waterfalls per motion** so GTM teams can call a single orchestration command rather than juggling providers.
- **Instrument cache hit rates**; alert RevOps when the cache hit rate drops below target to avoid a spike in credit usage.
- **Rotate premium providers** each quarter to negotiate better volume discounts and cover gaps in provider coverage.
- **Pair enrichment with QA hooks** (e.g., verification APIs, sampling) before syncing into CRM to prevent bad data cascades.

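
Cache hit-rate instrumentation can be as simple as a pair of counters and a threshold check. The sketch below is hypothetical: the `CacheHitMonitor` class, the 60% target, and the 100-sample minimum are assumptions, and how the alert reaches RevOps is left to the caller. Call `record(True)` on each cache hit and `record(False)` on each miss, then poll `needs_alert()` from whatever job sends notifications.

```python
class CacheHitMonitor:
    """Track cache hits and misses and flag a hit rate below target."""

    def __init__(self, target_rate=0.6, min_samples=100):
        self.target_rate = target_rate
        self.min_samples = min_samples  # avoid alerting on tiny samples
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def needs_alert(self):
        total = self.hits + self.misses
        return total >= self.min_samples and self.hit_rate < self.target_rate
```
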
---

*Progressive disclosure: Load full provider details and code examples only when actively optimizing enrichment workflows*
30
skills/firmographic-analysis/SKILL.md
Normal file
@@ -0,0 +1,30 @@
---
name: firmographic-analysis
description: Use when interpreting company-level enrichment data to segment accounts, spot buying triggers, and tailor outreach.
---

# Firmographic Analysis Skill

## When to Use
- Prioritizing enriched accounts for GTM plays.
- Building segments for ABM, territory planning, or personalized campaigns.
- Validating enriched firmographic data quality.

## Framework
1. **Normalize Fields** – ensure industry, size, revenue, region, and funding fields use consistent taxonomies.
2. **Scoring Matrix** – apply ICP scoring (industry fit, employee band, revenue, growth rate).
3. **Trigger Detection** – highlight events like funding, IPO prep, hiring spikes, geographic expansion.
4. **Segment Mapping** – assign each company to journey stages or playbooks (e.g., "High-growth SaaS 200-500").
5. **Recommendation Output** – produce persona targets, value props, and urgency level per segment.

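
The scoring-matrix step can be illustrated with a small weighted check over the four ICP dimensions named above. This is a sketch only: the weights (40/25/20/15), the account field names, and the `icp` dictionary layout are assumptions chosen for the example, not values defined by this skill.

```python
def icp_score(account, icp):
    """Score an account from 0 to 100 against an ICP definition.

    Weights are illustrative: industry 40, employee band 25,
    revenue 20, growth rate 15.
    """
    score = 0
    if account.get("industry") in icp["industries"]:
        score += 40
    low, high = icp["employee_band"]
    if low <= account.get("employees", 0) <= high:
        score += 25
    if account.get("revenue_usd", 0) >= icp["min_revenue_usd"]:
        score += 20
    if account.get("growth_rate", 0.0) >= icp["min_growth_rate"]:
        score += 15
    return score


# Hypothetical ICP matching the "High-growth SaaS 200-500" example segment.
icp = {
    "industries": {"SaaS"},
    "employee_band": (200, 500),
    "min_revenue_usd": 20_000_000,
    "min_growth_rate": 0.30,
}
account = {
    "industry": "SaaS",
    "employees": 320,
    "revenue_usd": 45_000_000,
    "growth_rate": 0.10,
}
score = icp_score(account, icp)
```

The example account matches on industry, employee band, and revenue but misses the growth threshold, so it scores 85 of 100. Normalizing fields first (step 1) matters here, since the industry check is an exact match.
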
## Templates
- Segment summary table (columns: segment, criteria, TAM, coverage owner, next action).
- Trigger event log with timestamps/source, impact rating, and follow-up play.
- Messaging workbook mapping persona × segment × proof points for instant enablement pulls.

## Tips
- Keep taxonomy dictionaries centrally managed so enrichment jobs and analytics share the same lookups.
- Re-score accounts quarterly or after major firmographic events (funding, layoffs) to keep priorities fresh.
- Pair quant scores with qualitative notes from AEs/CSMs to avoid over-rotating on enrichment data alone.

---