Initial commit

Zhongwei Li
2025-11-29 18:17:17 +08:00
commit 6062d3994e
39 changed files with 21748 additions and 0 deletions

@@ -0,0 +1,250 @@
# n8n Workflow Patterns
Proven architectural patterns for building n8n workflows.
---
## Purpose
Teaches architectural patterns for building n8n workflows. Provides structure, best practices, and proven approaches for common use cases.
## Activates On
- build workflow
- workflow pattern
- workflow architecture
- workflow structure
- webhook processing
- http api
- api integration
- database sync
- ai agent
- chatbot
- scheduled task
- automation pattern
## File Count
7 files, ~4,400 lines total
## Priority
**HIGH** - Addresses 813 webhook searches (most common use case)
## Dependencies
**n8n-mcp tools**:
- search_nodes (find nodes for patterns)
- get_node_essentials (understand node operations)
- search_templates (find example workflows)
**Related skills**:
- n8n MCP Tools Expert (find and configure nodes)
- n8n Expression Syntax (write expressions in patterns)
- n8n Node Configuration (configure pattern nodes)
- n8n Validation Expert (validate pattern implementations)
## Coverage
### The 5 Core Patterns
1. **Webhook Processing** (Most Common - 813 searches)
- Receive HTTP requests → Process → Respond
- Critical gotcha: Data under $json.body
- Authentication, validation, error handling
2. **HTTP API Integration** (892 templates)
- Fetch from REST APIs → Transform → Store/Use
- Authentication methods, pagination, rate limiting
- Error handling and retries
3. **Database Operations** (456 templates)
- Read/Write/Sync database data
- Batch processing, transactions, performance
- Security: parameterized queries, read-only access
4. **AI Agent Workflow** (234 templates, 270 AI nodes)
- AI agents with tool access and memory
- 8 AI connection types
- ANY node can be an AI tool
5. **Scheduled Tasks** (28% of all workflows)
- Recurring automation workflows
- Cron schedules, timezone handling
- Monitoring and error handling
### Cross-Cutting Concerns
- Data flow patterns (linear, branching, parallel, loops)
- Error handling strategies
- Performance optimization
- Security best practices
- Testing approaches
- Monitoring and logging
## Evaluations
5 scenarios (100% coverage expected):
1. **eval-001**: Webhook workflow structure
2. **eval-002**: HTTP API integration pattern
3. **eval-003**: Database sync pattern
4. **eval-004**: AI agent workflow with tools
5. **eval-005**: Scheduled report generation
## Key Features
**5 Proven Patterns**: Webhook, HTTP API, Database, AI Agent, Scheduled tasks
**Complete Examples**: Working workflow configurations for each pattern
**Best Practices**: Proven approaches from real-world n8n usage
**Common Gotchas**: Documented mistakes and their fixes
**Integration Guide**: How patterns work with other skills
**Template Examples**: Real examples from 2,653+ n8n templates
## Files
- **SKILL.md** (486 lines) - Pattern overview, selection guide, checklist
- **webhook_processing.md** (554 lines) - Webhook patterns, data structure, auth
- **http_api_integration.md** (763 lines) - REST APIs, pagination, rate limiting
- **database_operations.md** (854 lines) - DB operations, batch processing, security
- **ai_agent_workflow.md** (918 lines) - AI agents, tools, memory, 8 connection types
- **scheduled_tasks.md** (845 lines) - Cron schedules, timezone, monitoring
- **README.md** (this file) - Skill metadata
## Success Metrics
**Expected outcomes**:
- Users select appropriate pattern for their use case
- Workflows follow proven structural patterns
- Common gotchas avoided (webhook $json.body, SQL injection, etc.)
- Proper error handling implemented
- Security best practices followed
## Pattern Selection Stats
Common workflow composition:
**Trigger Distribution**:
- Webhook: 35% (most common)
- Schedule: 28%
- Manual: 22%
- Service triggers: 15%
**Transformation Nodes**:
- Set: 68%
- Code: 42%
- IF: 38%
- Switch: 18%
**Output Channels**:
- HTTP Request: 45%
- Slack: 32%
- Database: 28%
- Email: 24%
**Complexity**:
- Simple (3-5 nodes): 42%
- Medium (6-10 nodes): 38%
- Complex (11+ nodes): 20%
## Critical Insights
**Webhook Processing**:
- 813 searches (most common use case!)
- #1 gotcha: Data under $json.body (not $json directly)
- Must choose response mode: onReceived vs lastNode
**API Integration**:
- Authentication via credentials (never hardcode!)
- Pagination essential for large datasets
- Rate limiting prevents API bans
- continueOnFail: true for error handling
**Database Operations**:
- Always use parameterized queries (SQL injection prevention)
- Batch processing for large datasets
- Read-only access for AI tools
- Transaction handling for multi-step operations
**AI Agents**:
- 8 AI connection types (ai_languageModel, ai_tool, ai_memory, etc.)
- ANY node can be an AI tool (connect to ai_tool port)
- Memory essential for conversations (Window Buffer recommended)
- Tool descriptions critical (AI uses them to decide when to call)
**Scheduled Tasks**:
- Set workflow timezone explicitly (DST handling)
- Prevent overlapping executions (use locks)
- Error Trigger workflow for alerts
- Batch processing for large data
## Workflow Creation Checklist
Every pattern follows this checklist:
### Planning Phase
- [ ] Identify the pattern (webhook, API, database, AI, scheduled)
- [ ] List required nodes (use search_nodes)
- [ ] Understand data flow (input → transform → output)
- [ ] Plan error handling strategy
### Implementation Phase
- [ ] Create workflow with appropriate trigger
- [ ] Add data source nodes
- [ ] Configure authentication/credentials
- [ ] Add transformation nodes (Set, Code, IF)
- [ ] Add output/action nodes
- [ ] Configure error handling
### Validation Phase
- [ ] Validate each node configuration
- [ ] Validate complete workflow
- [ ] Test with sample data
- [ ] Handle edge cases
### Deployment Phase
- [ ] Review workflow settings
- [ ] Activate workflow
- [ ] Monitor first executions
- [ ] Document workflow
## Real Template Examples
**Weather to Slack** (Template #2947):
```
Schedule (daily 8 AM) → HTTP Request (weather) → Set → Slack
```
**Webhook Processing**: 1,085 templates
**HTTP API Integration**: 892 templates
**Database Operations**: 456 templates
**AI Workflows**: 234 templates
Use `search_templates` to find examples for your use case!
## Integration with Other Skills
**Pattern Selection** (this skill):
1. Identify use case
2. Select appropriate pattern
3. Follow pattern structure
**Node Discovery** (n8n MCP Tools Expert):
4. Find nodes for pattern (search_nodes)
5. Understand node operations (get_node_essentials)
**Implementation** (n8n Expression Syntax + Node Configuration):
6. Write expressions ({{$json.body.field}})
7. Configure nodes properly
**Validation** (n8n Validation Expert):
8. Validate workflow structure
9. Fix validation errors
## Last Updated
2025-10-20
---
**Part of**: n8n-skills repository
**Conceived by**: Romuald Członkowski - [www.aiadvisors.pl/en](https://www.aiadvisors.pl/en)

@@ -0,0 +1,409 @@
---
name: n8n-workflow-patterns
description: Proven workflow architectural patterns from real n8n workflows. Use when building new workflows, designing workflow structure, choosing workflow patterns, planning workflow architecture, or asking about webhook processing, HTTP API integration, database operations, AI agent workflows, or scheduled tasks.
---
# n8n Workflow Patterns
Proven architectural patterns for building n8n workflows.
---
## The 5 Core Patterns
Based on analysis of real workflow usage:
1. **[Webhook Processing](webhook_processing.md)** (Most Common)
- Receive HTTP requests → Process → Output
- Pattern: Webhook → Validate → Transform → Respond/Notify
2. **[HTTP API Integration](http_api_integration.md)**
- Fetch from REST APIs → Transform → Store/Use
- Pattern: Trigger → HTTP Request → Transform → Action → Error Handler
3. **[Database Operations](database_operations.md)**
- Read/Write/Sync database data
- Pattern: Schedule → Query → Transform → Write → Verify
4. **[AI Agent Workflow](ai_agent_workflow.md)**
- AI agents with tools and memory
- Pattern: Trigger → AI Agent (Model + Tools + Memory) → Output
5. **[Scheduled Tasks](scheduled_tasks.md)**
- Recurring automation workflows
- Pattern: Schedule → Fetch → Process → Deliver → Log
---
## Pattern Selection Guide
### When to use each pattern:
**Webhook Processing** - Use when:
- Receiving data from external systems
- Building integrations (Slack commands, form submissions, GitHub webhooks)
- Need instant response to events
- Example: "Receive Stripe payment webhook → Update database → Send confirmation"
**HTTP API Integration** - Use when:
- Fetching data from external APIs
- Synchronizing with third-party services
- Building data pipelines
- Example: "Fetch GitHub issues → Transform → Create Jira tickets"
**Database Operations** - Use when:
- Syncing between databases
- Running database queries on schedule
- ETL workflows
- Example: "Read Postgres records → Transform → Write to MySQL"
**AI Agent Workflow** - Use when:
- Building conversational AI
- Need AI with tool access
- Multi-step reasoning tasks
- Example: "Chat with AI that can search docs, query database, send emails"
**Scheduled Tasks** - Use when:
- Recurring reports or summaries
- Periodic data fetching
- Maintenance tasks
- Example: "Daily: Fetch analytics → Generate report → Email team"
---
## Common Workflow Components
All patterns share these building blocks:
### 1. Triggers
- **Webhook** - HTTP endpoint (instant)
- **Schedule** - Cron-based timing (periodic)
- **Manual** - Click to execute (testing)
- **Polling** - Check for changes (intervals)
### 2. Data Sources
- **HTTP Request** - REST APIs
- **Database nodes** - Postgres, MySQL, MongoDB
- **Service nodes** - Slack, Google Sheets, etc.
- **Code** - Custom JavaScript/Python
### 3. Transformation
- **Set** - Map/transform fields
- **Code** - Complex logic
- **IF/Switch** - Conditional routing
- **Merge** - Combine data streams
### 4. Outputs
- **HTTP Request** - Call APIs
- **Database** - Write data
- **Communication** - Email, Slack, Discord
- **Storage** - Files, cloud storage
### 5. Error Handling
- **Error Trigger** - Catch workflow errors
- **IF** - Check for error conditions
- **Stop and Error** - Explicit failure
- **Continue On Fail** - Per-node setting
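These per-node settings live in each node's JSON. A minimal sketch of the commonly used error fields (names follow the n8n node schema; exact availability varies by version):
```javascript
{
  name: "HTTP Request",
  type: "n8n-nodes-base.httpRequest",
  // Per-node error handling settings
  continueOnFail: true,    // pass the error downstream instead of stopping
  retryOnFail: true,       // retry automatically before failing
  maxTries: 3,
  waitBetweenTries: 5000   // milliseconds between retries
}
```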
---
## Workflow Creation Checklist
When building ANY workflow, follow this checklist:
### Planning Phase
- [ ] Identify the pattern (webhook, API, database, AI, scheduled)
- [ ] List required nodes (use search_nodes)
- [ ] Understand data flow (input → transform → output)
- [ ] Plan error handling strategy
### Implementation Phase
- [ ] Create workflow with appropriate trigger
- [ ] Add data source nodes
- [ ] Configure authentication/credentials
- [ ] Add transformation nodes (Set, Code, IF)
- [ ] Add output/action nodes
- [ ] Configure error handling
### Validation Phase
- [ ] Validate each node configuration (validate_node_operation)
- [ ] Validate complete workflow (validate_workflow)
- [ ] Test with sample data
- [ ] Handle edge cases (empty data, errors)
### Deployment Phase
- [ ] Review workflow settings (execution order, timeout, error handling)
- [ ] Activate workflow ⚠️ **Manual activation required in n8n UI** (API/MCP cannot activate)
- [ ] Monitor first executions
- [ ] Document workflow purpose and data flow
---
## Data Flow Patterns
### Linear Flow
```
Trigger → Transform → Action → End
```
**Use when**: Simple workflows with single path
### Branching Flow
```
Trigger → IF → [True Path]
└→ [False Path]
```
**Use when**: Different actions based on conditions
### Parallel Processing
```
Trigger → [Branch 1] → Merge
└→ [Branch 2] ↗
```
**Use when**: Independent operations that can run simultaneously
### Loop Pattern
```
Trigger → Split in Batches → Process → Loop (until done)
```
**Use when**: Processing large datasets in chunks
### Error Handler Pattern
```
Main Flow → [Success Path]
└→ [Error Trigger → Error Handler]
```
**Use when**: Need separate error handling workflow
---
## Common Gotchas
### 1. Webhook Data Structure
**Problem**: Can't access webhook payload data
**Solution**: Data is nested under `$json.body`
```javascript
{{$json.email}}       // ❌ Wrong - returns undefined
{{$json.body.email}}  // ✅ Correct - payload is nested under body
```
See: n8n Expression Syntax skill
### 2. Multiple Input Items
**Problem**: Node processes all input items, but I only want one
**Solution**: Use "Execute Once" mode or process first item only
```javascript
{{$input.first().json.field}} // First item only
```
### 3. Authentication Issues
**Problem**: API calls failing with 401/403
**Solution**:
- Configure credentials properly
- Use the "Credentials" section, not parameters
- Test credentials before workflow activation
### 4. Node Execution Order
**Problem**: Nodes executing in unexpected order
**Solution**: Check workflow settings → Execution Order
- v0: Top-to-bottom (legacy)
- v1: Connection-based (recommended)
### 5. Expression Errors
**Problem**: Expressions showing as literal text
**Solution**: Use {{}} around expressions
- See n8n Expression Syntax skill for details
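For example:
```javascript
$json.body.email        // ❌ Rendered as literal text in a parameter field
{{$json.body.email}}    // ✅ Evaluated as an expression
```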
---
## Integration with Other Skills
These skills work together with Workflow Patterns:
**n8n MCP Tools Expert** - Use to:
- Find nodes for your pattern (search_nodes)
- Understand node operations (get_node_essentials)
- Create workflows (n8n_create_workflow)
**n8n Expression Syntax** - Use to:
- Write expressions in transformation nodes
- Access webhook data correctly ({{$json.body.field}})
- Reference previous nodes ({{$node["Node Name"].json.field}})
**n8n Node Configuration** - Use to:
- Configure specific operations for pattern nodes
- Understand node-specific requirements
**n8n Validation Expert** - Use to:
- Validate workflow structure
- Fix validation errors
- Ensure workflow correctness before deployment
---
## Pattern Statistics
Common workflow patterns:
**Most Common Triggers**:
1. Webhook - 35%
2. Schedule (periodic tasks) - 28%
3. Manual (testing/admin) - 22%
4. Service triggers (Slack, email, etc.) - 15%
**Most Common Transformations**:
1. Set (field mapping) - 68%
2. Code (custom logic) - 42%
3. IF (conditional routing) - 38%
4. Switch (multi-condition) - 18%
**Most Common Outputs**:
1. HTTP Request (APIs) - 45%
2. Slack - 32%
3. Database writes - 28%
4. Email - 24%
**Average Workflow Complexity**:
- Simple (3-5 nodes): 42%
- Medium (6-10 nodes): 38%
- Complex (11+ nodes): 20%
---
## Quick Start Examples
### Example 1: Simple Webhook → Slack
```
1. Webhook (path: "form-submit", POST)
2. Set (map form fields)
3. Slack (post message to #notifications)
```
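The same three-node flow as workflow JSON (a rough sketch - parameters abbreviated, and `typeVersion` values may differ across n8n releases):
```javascript
{
  "name": "Form to Slack",
  "nodes": [
    { "name": "Webhook", "type": "n8n-nodes-base.webhook", "typeVersion": 1,
      "position": [250, 300],
      "parameters": { "path": "form-submit", "httpMethod": "POST" } },
    { "name": "Set", "type": "n8n-nodes-base.set", "typeVersion": 2,
      "position": [450, 300],
      "parameters": {} },  // map form fields here
    { "name": "Slack", "type": "n8n-nodes-base.slack", "typeVersion": 1,
      "position": [650, 300],
      "parameters": {} }   // post message to #notifications
  ],
  "connections": {
    "Webhook": { "main": [[{ "node": "Set", "type": "main", "index": 0 }]] },
    "Set": { "main": [[{ "node": "Slack", "type": "main", "index": 0 }]] }
  }
}
```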
### Example 2: Scheduled Report
```
1. Schedule (daily at 9 AM)
2. HTTP Request (fetch analytics)
3. Code (aggregate data)
4. Email (send formatted report)
5. Error Trigger → Slack (notify on failure)
```
### Example 3: Database Sync
```
1. Schedule (every 15 minutes)
2. Postgres (query new records)
3. IF (check if records exist)
4. MySQL (insert records)
5. Postgres (update sync timestamp)
```
### Example 4: AI Assistant
```
1. Webhook (receive chat message)
2. AI Agent
├─ OpenAI Chat Model (ai_languageModel)
├─ HTTP Request Tool (ai_tool)
├─ Database Tool (ai_tool)
└─ Window Buffer Memory (ai_memory)
3. Webhook Response (send AI reply)
```
### Example 5: API Integration
```
1. Manual Trigger (for testing)
2. HTTP Request (GET /api/users)
3. Split In Batches (process 100 at a time)
4. Set (transform user data)
5. Postgres (upsert users)
6. Loop (back to step 3 until done)
```
---
## Detailed Pattern Files
For comprehensive guidance on each pattern:
- **[webhook_processing.md](webhook_processing.md)** - Webhook patterns, data structure, response handling
- **[http_api_integration.md](http_api_integration.md)** - REST APIs, authentication, pagination, retries
- **[database_operations.md](database_operations.md)** - Queries, sync, transactions, batch processing
- **[ai_agent_workflow.md](ai_agent_workflow.md)** - AI agents, tools, memory, langchain nodes
- **[scheduled_tasks.md](scheduled_tasks.md)** - Cron schedules, reports, maintenance tasks
---
## Real Template Examples
From n8n template library:
**Template #2947**: Weather to Slack
- Pattern: Scheduled Task
- Nodes: Schedule → HTTP Request (weather API) → Set → Slack
- Complexity: Simple (4 nodes)
**Webhook Processing**: Most common pattern
- Most common: Form submissions, payment webhooks, chat integrations
**HTTP API**: Common pattern
- Most common: Data fetching, third-party integrations
**Database Operations**: Common pattern
- Most common: ETL, data sync, backup workflows
**AI Agents**: Growing in usage
- Most common: Chatbots, content generation, data analysis
Use `search_templates` and `get_template` from n8n-mcp tools to find examples!
---
## Best Practices
### ✅ Do
- Start with the simplest pattern that solves your problem
- Plan your workflow structure before building
- Use error handling on all workflows
- Test with sample data before activation
- Follow the workflow creation checklist
- Use descriptive node names
- Document complex workflows (notes field)
- Monitor workflow executions after deployment
### ❌ Don't
- Build workflows in one shot (iterate! avg 56s between edits)
- Skip validation before activation
- Ignore error scenarios
- Use complex patterns when simple ones suffice
- Hardcode credentials in parameters
- Forget to handle empty data cases
- Mix multiple patterns without clear boundaries
- Deploy without testing
---
## Summary
**Key Points**:
1. **5 core patterns** cover 90%+ of workflow use cases
2. **Webhook processing** is the most common pattern
3. Use the **workflow creation checklist** for every workflow
4. **Plan pattern** → **Select nodes** → **Build** → **Validate** → **Deploy**
5. Integrate with other skills for complete workflow development
**Next Steps**:
1. Identify your use case pattern
2. Read the detailed pattern file
3. Use n8n MCP Tools Expert to find nodes
4. Follow the workflow creation checklist
5. Use n8n Validation Expert to validate
**Related Skills**:
- n8n MCP Tools Expert - Find and configure nodes
- n8n Expression Syntax - Write expressions correctly
- n8n Validation Expert - Validate and fix errors
- n8n Node Configuration - Configure specific operations

@@ -0,0 +1,784 @@
# AI Agent Workflow Pattern
**Use Case**: Build AI agents with tool access, memory, and reasoning capabilities.
---
## Pattern Structure
```
Trigger → AI Agent (Model + Tools + Memory) → [Process Response] → Output
```
**Key Characteristic**: AI-powered decision making with tool use
---
## Core AI Connection Types
n8n supports **8 AI connection types** for building agent workflows:
1. **ai_languageModel** - The LLM (OpenAI, Anthropic, etc.)
2. **ai_tool** - Functions the agent can call
3. **ai_memory** - Conversation context
4. **ai_outputParser** - Parse structured outputs
5. **ai_embedding** - Vector embeddings
6. **ai_vectorStore** - Vector database
7. **ai_document** - Document loaders
8. **ai_textSplitter** - Text chunking
---
## Core Components
### 1. Trigger
**Options**:
- **Webhook** - Chat interfaces, API calls (most common)
- **Manual** - Testing and development
- **Schedule** - Periodic AI tasks
### 2. AI Agent Node
**Purpose**: Orchestrate LLM with tools and memory
**Configuration**:
```javascript
{
agent: "conversationalAgent", // or "openAIFunctionsAgent"
promptType: "define",
text: "You are a helpful assistant that can search docs, query databases, and send emails."
}
```
**Connections**:
- **ai_languageModel input** - Connected to LLM node
- **ai_tool inputs** - Connected to tool nodes
- **ai_memory input** - Connected to memory node (optional)
### 3. Language Model
**Available providers**:
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude)
- Google (Gemini)
- Local models (Ollama, LM Studio)
**Example** (OpenAI Chat Model):
```javascript
{
model: "gpt-4",
temperature: 0.7,
maxTokens: 1000
}
```
### 4. Tools (ANY Node Can Be a Tool!)
**Critical insight**: Connect ANY n8n node to agent via `ai_tool` port
**Common tool types**:
- HTTP Request - Call APIs
- Database nodes - Query data
- Code - Custom functions
- Search nodes - Web/document search
- Pre-built tool nodes (Calculator, Wikipedia, etc.)
### 5. Memory (Optional but Recommended)
**Purpose**: Maintain conversation context
**Types**:
- **Buffer Memory** - Store recent messages
- **Window Buffer Memory** - Store last N messages
- **Summary Memory** - Summarize conversation
### 6. Output Processing
**Purpose**: Format AI response for delivery
**Common patterns**:
- Return directly (chat response)
- Store in database (conversation history)
- Send to communication channel (Slack, email)
---
## Common Use Cases
### 1. Conversational Chatbot
**Flow**: Webhook (chat message) → AI Agent → Webhook Response
**Example** (Customer support bot):
```
1. Webhook (path: "chat", POST)
- Receives: {user_id, message, session_id}
2. Window Buffer Memory (load context by session_id)
3. AI Agent
├─ OpenAI Chat Model (gpt-4)
├─ HTTP Request Tool (search knowledge base)
├─ Database Tool (query customer orders)
└─ Window Buffer Memory (conversation context)
4. Code (format response)
5. Webhook Response (send reply)
```
**AI Agent prompt**:
```
You are a customer support assistant.
You can:
1. Search the knowledge base for answers
2. Look up customer orders
3. Provide shipping information
Be helpful and professional.
```
### 2. Document Q&A
**Flow**: Upload docs → Embed → Store → Query with AI
**Example** (Internal documentation assistant):
```
Setup Phase (run once):
1. Read Files (load documentation)
2. Text Splitter (chunk into paragraphs)
3. Embeddings (OpenAI Embeddings)
4. Vector Store (Pinecone/Qdrant) (store vectors)
Query Phase (recurring):
1. Webhook (receive question)
2. AI Agent
├─ OpenAI Chat Model (gpt-4)
├─ Vector Store Tool (search similar docs)
└─ Buffer Memory (context)
3. Webhook Response (answer with citations)
```
### 3. Data Analysis Assistant
**Flow**: Request → AI Agent (with data tools) → Analysis → Visualization
**Example** (SQL analyst agent):
```
1. Webhook (data question: "What were sales last month?")
2. AI Agent
├─ OpenAI Chat Model (gpt-4)
├─ Postgres Tool (execute queries)
└─ Code Tool (data analysis)
3. Code (generate visualization data)
4. Webhook Response (answer + chart data)
```
**Postgres Tool Configuration**:
```javascript
{
name: "query_database",
description: "Execute SQL queries to analyze sales data. Use SELECT queries only.",
// Node executes AI-generated SQL
}
```
### 4. Workflow Automation Agent
**Flow**: Command → AI Agent → Execute actions → Report
**Example** (DevOps assistant):
```
1. Slack (slash command: /deploy production)
2. AI Agent
├─ OpenAI Chat Model (gpt-4)
├─ HTTP Request Tool (GitHub API)
├─ HTTP Request Tool (Deploy API)
└─ Postgres Tool (deployment logs)
3. Agent actions:
- Check if tests passed
- Create deployment
- Log deployment
- Notify team
4. Slack (deployment status)
```
### 5. Email Processing Agent
**Flow**: Email received → AI Agent → Categorize → Route → Respond
**Example** (Support ticket router):
```
1. Email Trigger (new support email)
2. AI Agent
├─ OpenAI Chat Model (gpt-4)
├─ Vector Store Tool (search similar tickets)
└─ HTTP Request Tool (create Jira ticket)
3. Agent actions:
- Categorize urgency (low/medium/high)
- Find similar past tickets
- Create ticket in appropriate project
- Draft response
4. Email (send auto-response)
5. Slack (notify assigned team)
```
---
## Tool Configuration
### Making ANY Node an AI Tool
**Critical concept**: Any n8n node can become an AI tool!
**Requirements**:
1. Connect node to AI Agent via `ai_tool` port (NOT main port)
2. Configure tool name and description
3. Define input schema (optional)
**Example** (HTTP Request as tool):
```javascript
{
// Tool metadata (for AI)
name: "search_github_issues",
description: "Search GitHub issues by keyword. Returns issue titles and URLs.",
// HTTP Request configuration
method: "GET",
url: "https://api.github.com/search/issues",
sendQuery: true,
queryParameters: {
"q": "={{$json.query}} repo:{{$json.repo}}",
"per_page": "5"
}
}
```
**How it works**:
1. AI Agent sees tool: `search_github_issues(query, repo)`
2. AI decides to use it: `search_github_issues("bug", "n8n-io/n8n")`
3. n8n executes HTTP Request with parameters
4. Result returned to AI Agent
5. AI Agent processes result and responds
### Pre-built Tool Nodes
**Available in @n8n/n8n-nodes-langchain**:
- **Calculator Tool** - Math operations
- **Wikipedia Tool** - Wikipedia search
- **Serper Tool** - Google search
- **Wolfram Alpha Tool** - Computational knowledge
- **Custom Tool** - Define with Code node
**Example** (Calculator Tool):
```
AI Agent
├─ OpenAI Chat Model
└─ Calculator Tool (ai_tool connection)
User: "What's 15% of 2,847?"
AI: *uses calculator tool* → "427.05"
```
### Database as Tool
**Pattern**: Postgres/MySQL node connected as ai_tool
**Configuration**:
```javascript
{
// Tool metadata
name: "query_customers",
description: "Query customer database. Use SELECT queries to find customer information by email, name, or ID.",
// Postgres config
operation: "executeQuery",
query: "={{$json.sql}}", // AI provides SQL
// Security: Use read-only database user!
}
```
**Safety**: Create read-only DB user for AI tools!
```sql
CREATE USER ai_readonly WITH PASSWORD 'secure_password';
GRANT SELECT ON customers, orders TO ai_readonly;
-- NO INSERT, UPDATE, DELETE access
```
### Code Node as Tool
**Pattern**: Custom Python/JavaScript function
**Example** (Data processor):
```javascript
// Tool metadata
{
name: "process_csv",
description: "Process CSV data and return statistics. Input: csv_string"
}
// Code node
const csv = $input.first().json.csv_string;
const lines = csv.split('\n');
const data = lines.slice(1).map(line => line.split(','));
return [{
json: {
row_count: data.length,
columns: lines[0].split(','),
summary: {
// Calculate statistics
}
}
}];
```
---
## Memory Configuration
### Buffer Memory
**Stores all messages** (until cleared)
```javascript
{
memoryType: "bufferMemory",
sessionKey: "={{$json.body.user_id}}" // Per-user memory
}
```
### Window Buffer Memory
**Stores last N messages** (recommended)
```javascript
{
memoryType: "windowBufferMemory",
sessionKey: "={{$json.body.session_id}}",
contextWindowLength: 10 // Last 10 messages
}
```
### Summary Memory
**Summarizes old messages** (for long conversations)
```javascript
{
memoryType: "summaryMemory",
sessionKey: "={{$json.body.session_id}}",
maxTokenLimit: 2000
}
```
**How it works**:
1. Conversation grows beyond limit
2. AI summarizes old messages
3. Summary stored, old messages discarded
4. Saves tokens while maintaining context
---
## Agent Types
### 1. Conversational Agent
**Best for**: General chat, customer support
**Features**:
- Natural conversation flow
- Memory integration
- Tool use with reasoning
**When to use**: Most common use case
### 2. OpenAI Functions Agent
**Best for**: Tool-heavy workflows, structured outputs
**Features**:
- Optimized for function calling
- Better tool selection
- Structured responses
**When to use**: Multiple tools, need reliable tool calling
### 3. ReAct Agent
**Best for**: Step-by-step reasoning
**Features**:
- Think → Act → Observe loop
- Visible reasoning process
- Good for debugging
**When to use**: Complex multi-step tasks
---
## Prompt Engineering for Agents
### System Prompt Structure
```
You are a [ROLE].
You can:
- [CAPABILITY 1]
- [CAPABILITY 2]
- [CAPABILITY 3]
Guidelines:
- [GUIDELINE 1]
- [GUIDELINE 2]
Format:
- [OUTPUT FORMAT]
```
### Example (Customer Support)
```
You are a customer support assistant for Acme Corp.
You can:
- Search the knowledge base for answers
- Look up customer orders and shipping status
- Create support tickets for complex issues
Guidelines:
- Be friendly and professional
- If you don't know something, say so and offer to create a ticket
- Always verify customer identity before sharing order details
Format:
- Keep responses concise
- Use bullet points for multiple items
- Include relevant links when available
```
### Example (Data Analyst)
```
You are a data analyst assistant with access to the company database.
You can:
- Query sales, customer, and product data
- Perform data analysis and calculations
- Generate summary statistics
Guidelines:
- Write efficient SQL queries (always use LIMIT)
- Explain your analysis methodology
- Highlight important trends or anomalies
- Use read-only queries (SELECT only)
Format:
- Provide numerical answers with context
- Include query used (for transparency)
- Suggest follow-up analyses when relevant
```
---
## Error Handling
### Pattern 1: Tool Execution Errors
```
AI Agent (continueOnFail on tool nodes)
→ IF (tool error occurred)
└─ Code (log error)
└─ Webhook Response (user-friendly error)
```
### Pattern 2: LLM API Errors
```
Main Workflow:
AI Agent → Process Response
Error Workflow:
Error Trigger
→ IF (rate limit error)
└─ Wait → Retry
→ ELSE
└─ Notify Admin
```
### Pattern 3: Invalid Tool Outputs
```javascript
// Code node - validate tool output
const result = $input.first().json;
if (!result || !result.data) {
throw new Error('Tool returned invalid data');
}
return [{ json: result }];
```
---
## Performance Optimization
### 1. Choose Right Model
```
Fast & cheap: GPT-3.5-turbo, Claude 3 Haiku
Balanced: GPT-4, Claude 3 Sonnet
Powerful: GPT-4-turbo, Claude 3 Opus
```
### 2. Limit Context Window
```javascript
{
memoryType: "windowBufferMemory",
contextWindowLength: 5 // Only last 5 messages
}
```
### 3. Optimize Tool Descriptions
```javascript
// ❌ Vague
description: "Search for things"
// ✅ Clear and concise
description: "Search GitHub issues by keyword and repository. Returns top 5 matching issues with titles and URLs."
```
### 4. Cache Embeddings
For document Q&A, embed documents once:
```
Setup (run once):
Documents → Embed → Store in Vector DB
Query (fast):
Question → Search Vector DB → AI Agent
```
### 5. Async Tools for Slow Operations
```
AI Agent → [Queue slow tool request]
→ Return immediate response
→ [Background: Execute tool + notify when done]
```
---
## Security Considerations
### 1. Read-Only Database Tools
```sql
-- Create limited user for AI tools
CREATE USER ai_agent_ro WITH PASSWORD 'secure';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_agent_ro;
-- NO write access!
```
### 2. Validate Tool Inputs
```javascript
// Code node - validate before execution
const query = $json.query;
if (query.toLowerCase().includes('drop ') ||
query.toLowerCase().includes('delete ') ||
query.toLowerCase().includes('update ')) {
throw new Error('Invalid query - write operations not allowed');
}
```
### 3. Rate Limiting
```
Webhook → IF (check user rate limit)
├─ [Within limit] → AI Agent
└─ [Exceeded] → Error (429 Too Many Requests)
```
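One way to implement the check is a Code node backed by workflow static data (a minimal sketch; assumes the webhook body carries a `user_id`, and note that static data is per-workflow and not shared across n8n instances):
```javascript
// Code node - naive fixed-window rate limiter
const staticData = $getWorkflowStaticData('global');
const userId = $json.body.user_id;   // assumption: client sends a user_id
const now = Date.now();
const windowMs = 60 * 1000;          // 1-minute window
const maxRequests = 20;              // allowed requests per window

const entry = staticData[userId] || { count: 0, windowStart: now };
if (now - entry.windowStart > windowMs) {
  entry.count = 0;                   // start a fresh window
  entry.windowStart = now;
}
entry.count += 1;
staticData[userId] = entry;

return [{ json: { ...$json, rateLimited: entry.count > maxRequests } }];
```
The IF node then routes on `{{$json.rateLimited}}`.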
### 4. Sanitize User Input
```javascript
// Code node
const userInput = $json.body.message
.trim()
.substring(0, 1000); // Max 1000 chars
return [{ json: { sanitized: userInput } }];
```
### 5. Monitor Tool Usage
```
AI Agent → Log Tool Calls
→ IF (suspicious pattern)
└─ Alert Admin + Pause Agent
```
---
## Testing AI Agents
### 1. Start with Manual Trigger
Replace webhook with manual trigger:
```
Manual Trigger
→ Set (mock user input)
→ AI Agent
→ Code (log output)
```
### 2. Test Tools Independently
Before connecting to agent:
```
Manual Trigger → Tool Node → Verify output format
```
### 3. Test with Standard Questions
Create test suite:
```
1. "Hello" - Test basic response
2. "Search for bug reports" - Test tool calling
3. "What did I ask before?" - Test memory
4. Invalid input - Test error handling
```
### 4. Monitor Token Usage
```javascript
// Code node - log token usage
console.log('Input tokens:', $node['AI Agent'].json.usage.input_tokens);
console.log('Output tokens:', $node['AI Agent'].json.usage.output_tokens);
```
### 5. Test Edge Cases
- Empty input
- Very long input
- Tool returns no results
- Tool returns error
- Multiple tool calls in sequence
---
## Common Gotchas
### 1. ❌ Wrong: Connecting tools to main port
```
HTTP Request → AI Agent // Won't work as tool!
```
### ✅ Correct: Use ai_tool connection type
```
HTTP Request --[ai_tool]--> AI Agent
```
### 2. ❌ Wrong: Vague tool descriptions
```
description: "Get data" // AI won't know when to use this
```
### ✅ Correct: Specific descriptions
```
description: "Query customer orders by email address. Returns order ID, status, and shipping info."
```
### 3. ❌ Wrong: No memory for conversations
```
Every message is standalone - no context!
```
### ✅ Correct: Add memory
```
Window Buffer Memory --[ai_memory]--> AI Agent
```
### 4. ❌ Wrong: Giving AI write access
```
Postgres (full access) as tool // AI could DELETE data!
```
### ✅ Correct: Read-only access
```
Postgres (read-only user) as tool // Safe
```
### 5. ❌ Wrong: Unbounded tool responses
```
Tool returns 10MB of data → exceeds token limit
```
### ✅ Correct: Limit tool output
```javascript
{
query: "SELECT * FROM table LIMIT 10" // Only 10 rows
}
```
---
## Real Template Examples
From n8n template library (234 AI templates):
**Simple Chatbot**:
```
Webhook → AI Agent (GPT-4 + Memory) → Webhook Response
```
**Document Q&A**:
```
Setup: Files → Embed → Vector Store
Query: Webhook → AI Agent (GPT-4 + Vector Store Tool) → Response
```
**SQL Analyst**:
```
Webhook → AI Agent (GPT-4 + Postgres Tool) → Format → Response
```
Use `search_templates({query: "ai agent"})` to find more!
---
## Checklist for AI Agent Workflows
### Planning
- [ ] Define agent purpose and capabilities
- [ ] List required tools (APIs, databases, etc.)
- [ ] Design conversation flow
- [ ] Plan memory strategy (per-user, per-session)
- [ ] Consider token costs
### Implementation
- [ ] Choose appropriate LLM model
- [ ] Write clear system prompt
- [ ] Connect tools via ai_tool ports (NOT main)
- [ ] Add tool descriptions
- [ ] Configure memory (Window Buffer recommended)
- [ ] Test each tool independently
### Security
- [ ] Use read-only database access for tools
- [ ] Validate tool inputs
- [ ] Sanitize user inputs
- [ ] Add rate limiting
- [ ] Monitor for abuse
### Testing
- [ ] Test with diverse inputs
- [ ] Verify tool calling works
- [ ] Check memory persistence
- [ ] Test error scenarios
- [ ] Monitor token usage and costs
### Deployment
- [ ] Add error handling
- [ ] Set up logging
- [ ] Monitor performance
- [ ] Set cost alerts
- [ ] Document agent capabilities
---
## Summary
**Key Points**:
1. **8 AI connection types** - Use ai_tool for tools, ai_memory for context
2. **ANY node can be a tool** - Connect to ai_tool port
3. **Memory is essential** for conversations (Window Buffer recommended)
4. **Tool descriptions matter** - AI uses them to decide when to call tools
5. **Security first** - Read-only database access, validate inputs
**Pattern**: Trigger → AI Agent (Model + Tools + Memory) → Output
**Related**:
- [webhook_processing.md](webhook_processing.md) - Receiving chat messages
- [http_api_integration.md](http_api_integration.md) - Tools that call APIs
- [database_operations.md](database_operations.md) - Database tools for agents

@@ -0,0 +1,785 @@
# Database Operations Pattern
**Use Case**: Read, write, sync, and manage database data in workflows.
---
## Pattern Structure
```
Trigger → [Query/Read] → [Transform] → [Write/Update] → [Verify/Log]
```
**Key Characteristic**: Data persistence and synchronization
---
## Core Components
### 1. Trigger
**Options**:
- **Schedule** - Periodic sync/maintenance (most common)
- **Webhook** - Event-driven writes
- **Manual** - One-time operations
### 2. Database Read Nodes
**Supported databases**:
- Postgres
- MySQL
- MongoDB
- Microsoft SQL
- SQLite
- Redis
- And more via community nodes
### 3. Transform
**Purpose**: Map between different database schemas or formats
**Typical nodes**:
- **Set** - Field mapping
- **Code** - Complex transformations
- **Merge** - Combine data from multiple sources
### 4. Database Write Nodes
**Operations**:
- INSERT - Create new records
- UPDATE - Modify existing records
- UPSERT - Insert or update
- DELETE - Remove records
### 5. Verification
**Purpose**: Confirm operations succeeded
**Methods**:
- Query to verify records
- Count rows affected
- Log results
---
## Common Use Cases
### 1. Data Synchronization
**Flow**: Schedule → Read Source DB → Transform → Write Target DB → Log
**Example** (Postgres to MySQL sync):
```
1. Schedule (every 15 minutes)
2. Postgres (SELECT * FROM users WHERE updated_at > {{$json.last_sync}})
3. IF (check if records exist)
4. Set (map Postgres schema to MySQL schema)
5. MySQL (INSERT or UPDATE users)
6. Postgres (UPDATE sync_log SET last_sync = NOW())
7. Slack (notify: "Synced X users")
```
**Incremental sync query**:
```sql
SELECT *
FROM users
WHERE updated_at > $1
ORDER BY updated_at ASC
LIMIT 1000
```
**Parameters**:
```javascript
{
"parameters": [
"={{$node['Get Last Sync'].json.last_sync}}"
]
}
```
### 2. ETL (Extract, Transform, Load)
**Flow**: Extract from multiple sources → Transform → Load into warehouse
**Example** (Consolidate data):
```
1. Schedule (daily at 2 AM)
2. [Parallel branches]
├─ Postgres (SELECT orders)
├─ MySQL (SELECT customers)
└─ MongoDB (SELECT products)
3. Merge (combine all data)
4. Code (transform to warehouse schema)
5. Postgres (warehouse - INSERT into fact_sales)
6. Email (send summary report)
```
### 3. Data Validation & Cleanup
**Flow**: Schedule → Query → Validate → Update/Delete invalid records
**Example** (Clean orphaned records):
```
1. Schedule (weekly)
2. Postgres (SELECT users WHERE email IS NULL OR email = '')
3. IF (invalid records exist)
4. Postgres (UPDATE users SET status='inactive' WHERE email IS NULL)
5. Postgres (DELETE FROM users WHERE created_at < NOW() - INTERVAL '1 year' AND status='inactive')
6. Slack (alert: "Cleaned X invalid records")
```
### 4. Backup & Archive
**Flow**: Schedule → Query → Export → Store
**Example** (Archive old records):
```
1. Schedule (monthly)
2. Postgres (SELECT * FROM orders WHERE created_at < NOW() - INTERVAL '2 years')
3. Code (convert to JSON)
4. Write File (save to archive.json)
5. Google Drive (upload archive)
6. Postgres (DELETE FROM orders WHERE created_at < NOW() - INTERVAL '2 years')
```
### 5. Real-time Data Updates
**Flow**: Webhook → Parse → Update Database
**Example** (Update user status):
```
1. Webhook (receive status update)
2. Postgres (UPDATE users SET status = $1 WHERE id = $2; parameters: {{$json.body.status}}, {{$json.body.user_id}})
3. IF (rows affected > 0)
4. Redis (SET user:{{$json.body.user_id}}:status {{$json.body.status}})
5. Webhook Response ({"success": true})
```
---
## Database Node Configuration
### Postgres
#### SELECT Query
```javascript
{
operation: "executeQuery",
query: "SELECT id, name, email FROM users WHERE created_at > $1 LIMIT $2",
parameters: [
"={{$json.since_date}}",
"100"
]
}
```
#### INSERT
```javascript
{
operation: "insert",
table: "users",
columns: "id, name, email, created_at",
values: [
{
id: "={{$json.id}}",
name: "={{$json.name}}",
email: "={{$json.email}}",
created_at: "={{$now}}"
}
]
}
```
#### UPDATE
```javascript
{
operation: "update",
table: "users",
updateKey: "id",
columns: "name, email, updated_at",
values: {
id: "={{$json.id}}",
name: "={{$json.name}}",
email: "={{$json.email}}",
updated_at: "={{$now}}"
}
}
```
#### UPSERT (INSERT ... ON CONFLICT)
```javascript
{
operation: "executeQuery",
query: `
INSERT INTO users (id, name, email)
VALUES ($1, $2, $3)
ON CONFLICT (id)
DO UPDATE SET name = $2, email = $3, updated_at = NOW()
`,
parameters: [
"={{$json.id}}",
"={{$json.name}}",
"={{$json.email}}"
]
}
```
### MySQL
#### SELECT with JOIN
```javascript
{
operation: "executeQuery",
query: `
SELECT u.id, u.name, o.order_id, o.total
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > ?
`,
parameters: [
"={{$json.since_date}}"
]
}
```
#### Bulk INSERT
```javascript
{
operation: "insert",
table: "orders",
columns: "user_id, total, status",
values: "={{$json.orders}}" // Array of order objects
}
```
### MongoDB
#### Find Documents
```javascript
{
operation: "find",
collection: "users",
query: JSON.stringify({
created_at: { $gt: new Date($json.since_date) },
status: "active"
}),
limit: 100
}
```
#### Insert Document
```javascript
{
operation: "insert",
collection: "users",
document: JSON.stringify({
name: $json.name,
email: $json.email,
created_at: new Date()
})
}
```
#### Update Document
```javascript
{
operation: "update",
collection: "users",
query: JSON.stringify({ _id: $json.user_id }),
update: JSON.stringify({
$set: {
status: $json.status,
updated_at: new Date()
}
})
}
```
---
## Batch Processing
### Pattern 1: Split In Batches
**Use when**: Processing large datasets to avoid memory issues
```
Postgres (SELECT 10000 records)
→ Split In Batches (100 items per batch)
→ Transform
→ MySQL (write batch)
→ Loop (until all processed)
```
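The loop node itself needs little configuration (sketch; in recent n8n versions this node is labeled "Loop Over Items"):
```javascript
// Split In Batches node
{
  batchSize: 100,
  options: {}
}
```
The loop is formed by wiring the end of the processing branch back into this node's input; `{{$node["Split In Batches"].context["noItemsLeft"]}}` can be checked to detect when every batch has been processed.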
### Pattern 2: Paginated Queries
**Use when**: Database has millions of records
```
Set (initialize: offset=0, limit=1000)
→ Loop Start
→ Postgres (SELECT * FROM large_table LIMIT {{$json.limit}} OFFSET {{$json.offset}})
→ IF (records returned)
├─ Process records
├─ Set (increment offset by 1000)
└─ Loop back
└─ [No records] → End
```
**Query**:
```sql
SELECT * FROM large_table
ORDER BY id
LIMIT $1 OFFSET $2
```
### Pattern 3: Cursor-Based Pagination
**Better performance for large datasets**:
```
Set (initialize: last_id=0)
→ Loop Start
→ Postgres (SELECT * FROM table WHERE id > {{$json.last_id}} ORDER BY id LIMIT 1000)
→ IF (records returned)
├─ Process records
├─ Code (get max id from batch)
└─ Loop back
└─ [No records] → End
```
**Query**:
```sql
SELECT * FROM table
WHERE id > $1
ORDER BY id ASC
LIMIT 1000
```
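The "get max id from batch" step can be a small Code node:
```javascript
// Code node - carry the cursor forward
// Assumes the previous node returned the current batch of rows,
// each with a numeric id
const rows = $input.all();
const lastId = rows.reduce((max, item) => Math.max(max, item.json.id), 0);
return [{ json: { last_id: lastId, batch_size: rows.length } }];
```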
---
## Transaction Handling
### Pattern 1: BEGIN/COMMIT/ROLLBACK
**For databases that support transactions**:
```javascript
// Node 1: Begin Transaction
{
operation: "executeQuery",
query: "BEGIN"
}
// Node 2-N: Your operations
{
operation: "executeQuery",
query: "INSERT INTO ...",
continueOnFail: true
}
// Node N+1: Commit or Rollback
{
operation: "executeQuery",
query: "={{$node['Operation'].json.error ? 'ROLLBACK' : 'COMMIT'}}"
}
```
### Pattern 2: Atomic Operations
**Use database features for atomicity**:
```sql
-- Upsert example (atomic)
INSERT INTO inventory (product_id, quantity)
VALUES ($1, $2)
ON CONFLICT (product_id)
DO UPDATE SET quantity = inventory.quantity + $2
```
### Pattern 3: Error Rollback
**Manual rollback on error**:
```
Try Operations:
Postgres (INSERT orders)
MySQL (INSERT order_items)
Error Trigger:
Postgres (DELETE FROM orders WHERE id = {{$json.order_id}})
MySQL (DELETE FROM order_items WHERE order_id = {{$json.order_id}})
```
---
## Data Transformation
### Schema Mapping
```javascript
// Code node - map schemas
const sourceData = $input.all();
return sourceData.map(item => ({
json: {
// Source → Target mapping
user_id: item.json.id,
full_name: `${item.json.first_name} ${item.json.last_name}`,
email_address: item.json.email,
registration_date: new Date(item.json.created_at).toISOString(),
// Computed fields
is_premium: item.json.plan_type === 'pro',
// Default values
status: item.json.status || 'active'
}
}));
```
### Data Type Conversions
```javascript
// Code node - convert data types
return $input.all().map(item => ({
json: {
// String to number
user_id: parseInt(item.json.user_id),
// String to date
created_at: new Date(item.json.created_at),
// Number to boolean
is_active: item.json.active === 1,
// JSON string to object
metadata: JSON.parse(item.json.metadata || '{}'),
// Null handling
email: item.json.email || null
}
}));
```
### Aggregation
```javascript
// Code node - aggregate data
const items = $input.all();
const summary = items.reduce((acc, item) => {
const date = item.json.created_at.split('T')[0];
if (!acc[date]) {
acc[date] = { count: 0, total: 0 };
}
acc[date].count++;
acc[date].total += item.json.amount;
return acc;
}, {});
return Object.entries(summary).map(([date, data]) => ({
json: {
date,
count: data.count,
total: data.total,
average: data.total / data.count
}
}));
```
---
## Performance Optimization
### 1. Use Indexes
Ensure database has proper indexes:
```sql
-- Add index for sync queries
CREATE INDEX idx_users_updated_at ON users(updated_at);
-- Add index for lookups
CREATE INDEX idx_orders_user_id ON orders(user_id);
```
### 2. Limit Result Sets
Always use LIMIT:
```sql
-- ✅ Good
SELECT * FROM large_table
WHERE created_at > $1
LIMIT 1000
-- ❌ Bad (unbounded)
SELECT * FROM large_table
WHERE created_at > $1
```
### 3. Use Prepared Statements
Parameterized queries are faster:
```javascript
// ✅ Good - prepared statement
{
query: "SELECT * FROM users WHERE id = $1",
parameters: ["={{$json.id}}"]
}
// ❌ Bad - string concatenation
{
query: "SELECT * FROM users WHERE id = '={{$json.id}}'"
}
```
### 4. Batch Writes
Write multiple records at once:
```javascript
// ✅ Good - batch insert
{
operation: "insert",
table: "orders",
values: $json.items // Array of 100 items
}
// ❌ Bad - individual inserts in loop
// 100 separate INSERT statements
```
### 5. Connection Pooling
Configure in credentials:
```javascript
{
host: "db.example.com",
database: "mydb",
user: "user",
password: "pass",
// Connection pool settings
min: 2,
max: 10,
idleTimeoutMillis: 30000
}
```
---
## Error Handling
### Pattern 1: Check Rows Affected
```
Database Operation (UPDATE users...)
→ IF ({{$json.rowsAffected === 0}})
└─ Alert: "No rows updated - record not found"
```
### Pattern 2: Constraint Violations
```javascript
// Database operation with continueOnFail: true
{
operation: "insert",
continueOnFail: true
}
// Next node: Check for errors
IF ({{$json.error !== undefined}})
IF ({{$json.error.includes('duplicate key')}})
Log: "Record already exists - skipping"
ELSE
Alert: "Database error: {{$json.error}}"
```
### Pattern 3: Rollback on Error
```
Try Operations:
→ Database Write 1
→ Database Write 2
→ Database Write 3
Error Trigger:
→ Rollback Operations
→ Alert Admin
```
---
## Security Best Practices
### 1. Use Parameterized Queries (Prevent SQL Injection)
```javascript
// ✅ SAFE - parameterized
{
query: "SELECT * FROM users WHERE email = $1",
parameters: ["={{$json.email}}"]
}
// ❌ DANGEROUS - SQL injection risk
{
query: "SELECT * FROM users WHERE email = '={{$json.email}}'"
}
```
### 2. Least Privilege Access
**Create dedicated workflow user**:
```sql
-- ✅ Good - limited permissions
CREATE USER n8n_workflow WITH PASSWORD 'secure_password';
GRANT SELECT, INSERT, UPDATE ON orders TO n8n_workflow;
GRANT SELECT ON users TO n8n_workflow;
-- ❌ Bad - too much access
GRANT ALL PRIVILEGES TO n8n_workflow;
```
### 3. Validate Input Data
```javascript
// Code node - validate before write
const email = $json.email;
const name = $json.name;
// Validation
if (!email || !email.includes('@')) {
throw new Error('Invalid email address');
}
if (!name || name.length < 2) {
throw new Error('Invalid name');
}
// Sanitization
return [{
json: {
email: email.toLowerCase().trim(),
name: name.trim()
}
}];
```
### 4. Encrypt Sensitive Data
```javascript
// Code node - encrypt before storage
const crypto = require('crypto');
const algorithm = 'aes-256-cbc';
const key = Buffer.from($credentials.encryptionKey, 'hex');
const iv = crypto.randomBytes(16);
const cipher = crypto.createCipheriv(algorithm, key, iv);
let encrypted = cipher.update($json.sensitive_data, 'utf8', 'hex');
encrypted += cipher.final('hex');
return [{
json: {
encrypted_data: encrypted,
iv: iv.toString('hex')
}
}];
```
---
## Common Gotchas
### 1. ❌ Wrong: Unbounded queries
```sql
SELECT * FROM large_table -- Could return millions
```
### ✅ Correct: Use LIMIT
```sql
SELECT * FROM large_table
ORDER BY created_at DESC
LIMIT 1000
```
### 2. ❌ Wrong: String concatenation in queries
```javascript
query: "SELECT * FROM users WHERE id = '{{$json.id}}'"
```
### ✅ Correct: Parameterized queries
```javascript
query: "SELECT * FROM users WHERE id = $1",
parameters: ["={{$json.id}}"]
```
### 3. ❌ Wrong: No transaction for multi-step operations
```
INSERT into orders
INSERT into order_items // Fails → orphaned order record
```
### ✅ Correct: Use transaction
```
BEGIN
INSERT into orders
INSERT into order_items
COMMIT (or ROLLBACK on error)
```
### 4. ❌ Wrong: Processing all items at once
```
SELECT 1000000 records → Process all → OOM error
```
### ✅ Correct: Batch processing
```
SELECT records → Split In Batches (1000) → Process → Loop
```
---
## Real Template Examples
From n8n template library (456 database templates):
**Data Sync**:
```
Schedule → Postgres (SELECT new records) → Transform → MySQL (INSERT)
```
**ETL Pipeline**:
```
Schedule → [Multiple DB reads] → Merge → Transform → Warehouse (INSERT)
```
**Backup**:
```
Schedule → Postgres (SELECT all) → JSON → Google Drive (upload)
```
Use `search_templates({query: "database"})` to find more!
---
## Checklist for Database Workflows
### Planning
- [ ] Identify source and target databases
- [ ] Understand schema differences
- [ ] Plan transformation logic
- [ ] Consider batch size for large datasets
- [ ] Design error handling strategy
### Implementation
- [ ] Use parameterized queries (never concatenate)
- [ ] Add LIMIT to all SELECT queries
- [ ] Use appropriate operation (INSERT/UPDATE/UPSERT)
- [ ] Configure credentials properly
- [ ] Test with small dataset first
### Performance
- [ ] Add database indexes for queries
- [ ] Use batch operations
- [ ] Implement pagination for large datasets
- [ ] Configure connection pooling
- [ ] Monitor query execution times
### Security
- [ ] Use parameterized queries (SQL injection prevention)
- [ ] Least privilege database user
- [ ] Validate and sanitize input
- [ ] Encrypt sensitive data
- [ ] Never log sensitive data
### Reliability
- [ ] Add transaction handling if needed
- [ ] Check rows affected
- [ ] Handle constraint violations
- [ ] Implement retry logic
- [ ] Add Error Trigger workflow
---
## Summary
**Key Points**:
1. **Always use parameterized queries** (prevent SQL injection)
2. **Batch processing** for large datasets
3. **Transaction handling** for multi-step operations
4. **Limit result sets** to avoid memory issues
5. **Validate input data** before writes
**Pattern**: Trigger → Query → Transform → Write → Verify
**Related**:
- [http_api_integration.md](http_api_integration.md) - Fetching data to store in DB
- [scheduled_tasks.md](scheduled_tasks.md) - Periodic database maintenance

@@ -0,0 +1,734 @@
# HTTP API Integration Pattern
**Use Case**: Fetch data from REST APIs, transform it, and use it in workflows.
---
## Pattern Structure
```
Trigger → HTTP Request → [Transform] → [Action] → [Error Handler]
```
**Key Characteristic**: External data fetching with error handling
---
## Core Components
### 1. Trigger
**Options**:
- **Schedule** - Periodic fetching (most common)
- **Webhook** - Triggered by external event
- **Manual** - On-demand execution
### 2. HTTP Request Node
**Purpose**: Call external REST APIs
**Configuration**:
```javascript
{
method: "GET", // GET, POST, PUT, DELETE, PATCH
url: "https://api.example.com/users",
authentication: "predefinedCredentialType",
sendQuery: true,
queryParameters: {
"page": "={{$json.page}}",
"limit": "100"
},
sendHeaders: true,
headerParameters: {
"Accept": "application/json",
"X-API-Version": "v1"
}
}
```
### 3. Response Processing
**Purpose**: Extract and transform API response data
**Typical flow**:
```
HTTP Request → Code (parse) → Set (map fields) → Action
```
### 4. Action
**Common actions**:
- Store in database
- Send to another API
- Create notifications
- Update spreadsheet
### 5. Error Handler
**Purpose**: Handle API failures gracefully
**Error Trigger Workflow**:
```
Error Trigger → Log Error → Notify Admin → Retry Logic (optional)
```
---
## Common Use Cases
### 1. Data Fetching & Storage
**Flow**: Schedule → HTTP Request → Transform → Database
**Example** (Fetch GitHub issues):
```
1. Schedule (every hour)
2. HTTP Request
- Method: GET
- URL: https://api.github.com/repos/owner/repo/issues
- Auth: Bearer Token
- Query: state=open
3. Code (filter by labels)
4. Set (map to database schema)
5. Postgres (upsert issues)
```
**Response Handling**:
```javascript
// Code node - filter issues
const issues = $input.all();
return issues
.filter(item => item.json.labels.some(l => l.name === 'bug'))
.map(item => ({
json: {
id: item.json.id,
title: item.json.title,
created_at: item.json.created_at
}
}));
```
### 2. API to API Integration
**Flow**: Trigger → Fetch from API A → Transform → Send to API B
**Example** (Jira to Slack):
```
1. Schedule (every 15 minutes)
2. HTTP Request (GET Jira tickets updated today)
3. IF (check if tickets exist)
4. Set (format for Slack)
5. HTTP Request (POST to Slack webhook)
```
### 3. Data Enrichment
**Flow**: Trigger → Fetch base data → Call enrichment API → Combine → Store
**Example** (Enrich contacts with company data):
```
1. Postgres (SELECT new contacts)
2. Code (extract company domains)
3. HTTP Request (call Clearbit API for each domain)
4. Set (combine contact + company data)
5. Postgres (UPDATE contacts with enrichment)
```
### 4. Monitoring & Alerting
**Flow**: Schedule → Check API health → IF unhealthy → Alert
**Example** (API health check):
```
1. Schedule (every 5 minutes)
2. HTTP Request (GET /health endpoint)
3. IF (status !== 200 OR response time > 2000ms)
4. Slack (alert #ops-team)
5. PagerDuty (create incident)
```
### 5. Batch Processing
**Flow**: Trigger → Fetch large dataset → Split in Batches → Process → Loop
**Example** (Process all users):
```
1. Manual Trigger
2. HTTP Request (GET /api/users?limit=1000)
3. Split In Batches (100 items per batch)
4. HTTP Request (POST /api/process for each batch)
5. Wait (2 seconds between batches - rate limiting)
6. Loop (back to step 4 until all processed)
```
---
## Authentication Methods
### 1. None (Public APIs)
```javascript
{
authentication: "none"
}
```
### 2. Bearer Token (Most Common)
**Setup**: Create credential
```javascript
{
authentication: "predefinedCredentialType",
nodeCredentialType: "httpHeaderAuth",
headerAuth: {
name: "Authorization",
value: "Bearer YOUR_TOKEN"
}
}
```
**Access in workflow**:
```javascript
{
authentication: "predefinedCredentialType",
nodeCredentialType: "httpHeaderAuth"
}
```
### 3. API Key (Header or Query)
**Header auth**:
```javascript
{
sendHeaders: true,
headerParameters: {
"X-API-Key": "={{$credentials.apiKey}}"
}
}
```
**Query auth**:
```javascript
{
sendQuery: true,
queryParameters: {
"api_key": "={{$credentials.apiKey}}"
}
}
```
### 4. Basic Auth
**Setup**: Create "Basic Auth" credential
```javascript
{
authentication: "predefinedCredentialType",
nodeCredentialType: "httpBasicAuth"
}
```
### 5. OAuth2
**Setup**: Create OAuth2 credential with:
- Authorization URL
- Token URL
- Client ID
- Client Secret
- Scopes
```javascript
{
authentication: "predefinedCredentialType",
nodeCredentialType: "oAuth2Api"
}
```
---
## Handling API Responses
### Success Response (200-299)
**Default**: Data flows to next node
**Access response**:
```javascript
// Entire response
{{$json}}
// Specific fields
{{$json.data.id}}
{{$json.results[0].name}}
```
### Pagination
#### Pattern 1: Offset-based
```
1. Set (initialize: page=1, has_more=true)
2. HTTP Request (GET /api/items?page={{$json.page}})
3. Code (check if more pages)
4. IF (has_more === true)
└→ Set (increment page) → Loop to step 2
```
**Code node** (check pagination):
```javascript
const response = $input.first().json;
const currentPage = response.page || 1;
return [{
  json: {
    items: response.results,
    page: currentPage + 1,
    has_more: response.next !== null
  }
}];
```
#### Pattern 2: Cursor-based
```
1. HTTP Request (GET /api/items)
2. Code (extract next_cursor)
3. IF (next_cursor exists)
└→ Set (cursor={{$json.next_cursor}}) → Loop to step 1
```
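The extraction step might look like this (sketch; assumes the API responds with `{ items: [...], next_cursor: "..." | null }`):
```javascript
// Code node - pull the cursor out of the response
const response = $input.first().json;
return [{
  json: {
    items: response.items,
    next_cursor: response.next_cursor || null,
    has_more: Boolean(response.next_cursor)
  }
}];
```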
#### Pattern 3: Link Header
```javascript
// Code node - parse Link header
// Assumes the HTTP Request node is set to return the full response,
// so headers and body are both available
const response = $input.first().json;
const linkHeader = response.headers['link'];
const hasNext = Boolean(linkHeader && linkHeader.includes('rel="next"'));

// Extract the URL tagged rel="next" from the Link header
function parseNextUrl(header) {
  const match = header.match(/<([^>]+)>;\s*rel="next"/);
  return match ? match[1] : null;
}

return [{
  json: {
    items: response.body,
    has_next: hasNext,
    next_url: hasNext ? parseNextUrl(linkHeader) : null
  }
}];
```
### Error Responses (400-599)
**Configure HTTP Request**:
```javascript
{
continueOnFail: true, // Don't stop workflow on error
ignoreResponseCode: true // Get response even on error
}
```
**Handle errors**:
```
HTTP Request (continueOnFail: true)
→ IF (check error)
├─ [Success Path]
└─ [Error Path] → Log → Retry or Alert
```
**IF condition**:
```javascript
{{$json.error}} is empty
// OR
{{$json.statusCode}} < 400
```
---
## Rate Limiting
### Pattern 1: Wait Between Requests
```
Split In Batches (1 item per batch)
→ HTTP Request
→ Wait (1 second)
→ Loop
```
### Pattern 2: Exponential Backoff
```javascript
// Code node
const maxRetries = 3;
const retryCount = $json.retryCount || 0;
if ($json.error && retryCount < maxRetries) {
  const delay = Math.pow(2, retryCount) * 1000; // 1s, 2s, 4s
  return [{
    json: {
      ...$json,
      retryCount: retryCount + 1,
      waitTime: delay
    }
  }];
}
// No error, or retries exhausted - pass through so the workflow can branch
return [{ json: { ...$json, retriesExhausted: Boolean($json.error) } }];
```
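Feed `waitTime` into a Wait node (e.g. an amount of `={{$json.waitTime / 1000}}` seconds) before looping back to the HTTP Request node.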
### Pattern 3: Respect Rate Limit Headers
```javascript
// Code node - check rate limit
const headers = $input.first().json.headers;
const remaining = parseInt(headers['x-ratelimit-remaining'] || '999');
const resetTime = parseInt(headers['x-ratelimit-reset'] || '0');
if (remaining < 10) {
const now = Math.floor(Date.now() / 1000);
const waitSeconds = resetTime - now;
return [{
json: {
shouldWait: true,
waitSeconds: Math.max(waitSeconds, 0)
}
}];
}
return [{ json: { shouldWait: false } }];
```
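Branch on `shouldWait` with an IF node and send the true path through a Wait node set to `{{$json.waitSeconds}}` seconds before issuing the next request.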
---
## Request Configuration
### GET Request
```javascript
{
method: "GET",
url: "https://api.example.com/users",
sendQuery: true,
queryParameters: {
"page": "1",
"limit": "100",
"filter": "active"
}
}
```
### POST Request (JSON Body)
```javascript
{
method: "POST",
url: "https://api.example.com/users",
sendBody: true,
bodyParametersJson: JSON.stringify({
name: "={{$json.name}}",
email: "={{$json.email}}",
role: "user"
})
}
```
### POST Request (Form Data)
```javascript
{
method: "POST",
url: "https://api.example.com/upload",
sendBody: true,
bodyParametersUi: {
parameter: [
{ name: "file", value: "={{$json.fileData}}" },
{ name: "filename", value: "={{$json.filename}}" }
]
},
  // No manual Content-Type header needed: the node's Form-Data body type
  // sets multipart/form-data with the correct boundary automatically
}
```
### PUT/PATCH Request (Update)
```javascript
{
method: "PATCH",
url: "https://api.example.com/users/={{$json.userId}}",
sendBody: true,
bodyParametersJson: JSON.stringify({
status: "active",
last_updated: "={{$now}}"
})
}
```
### DELETE Request
```javascript
{
method: "DELETE",
url: "https://api.example.com/users/={{$json.userId}}"
}
```
---
## Error Handling Patterns
### Pattern 1: Retry on Failure
```
HTTP Request (continueOnFail: true)
→ IF (error occurred)
└→ Wait (5 seconds)
└→ HTTP Request (retry)
```
### Pattern 2: Fallback API
```
HTTP Request (Primary API, continueOnFail: true)
→ IF (failed)
└→ HTTP Request (Fallback API)
```
### Pattern 3: Error Trigger Workflow
**Main Workflow**:
```
HTTP Request → Process Data
```
**Error Workflow**:
```
Error Trigger
→ Set (extract error details)
→ Slack (alert team)
→ Database (log error for analysis)
```
### Pattern 4: Circuit Breaker
```javascript
// Code node - circuit breaker logic
const failures = $json.recentFailures || 0;
const threshold = 5;
if (failures >= threshold) {
throw new Error('Circuit breaker open - too many failures');
}
return [{ json: { canProceed: true } }];
```
---
## Response Transformation
### Extract Nested Data
```javascript
// Code node
const response = $input.first().json;
return response.data.items.map(item => ({
json: {
id: item.id,
name: item.attributes.name,
email: item.attributes.contact.email
}
}));
```
### Flatten Arrays
```javascript
// Code node - flatten nested array
const items = $input.all();
const flattened = items.flatMap(item =>
item.json.results.map(result => ({
json: {
parent_id: item.json.id,
...result
}
}))
);
return flattened;
```
### Combine Multiple API Responses
```
HTTP Request 1 (users)
→ Set (store users)
→ HTTP Request 2 (orders for each user)
→ Merge (combine users + orders)
```
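Where a Merge node isn't flexible enough, a Code node can do the join explicitly. A minimal sketch, assuming the upstream nodes are named `Get Users` and `Get Orders` (hypothetical names) and each order carries a `user_id`:
```javascript
// Code node - join users with their orders by user_id
const users = $('Get Users').all();
const orders = $('Get Orders').all();

return users.map(user => ({
  json: {
    ...user.json,
    // attach every order belonging to this user
    orders: orders
      .filter(order => order.json.user_id === user.json.id)
      .map(order => order.json)
  }
}));
```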
---
## Testing & Debugging
### 1. Test with Manual Trigger
Replace Schedule with Manual Trigger for testing
### 2. Use Postman/Insomnia First
- Test API outside n8n
- Understand response structure
- Verify authentication
### 3. Log Responses
```javascript
// Code node - log for debugging
console.log('API Response:', JSON.stringify($input.first().json, null, 2));
return $input.all();
```
### 4. Check Execution Data
- View node output in n8n UI
- Check headers, body, status code
- Verify data structure
### 5. Use Binary Data Properly
For file downloads:
```javascript
{
method: "GET",
url: "https://api.example.com/download/file.pdf",
responseFormat: "file", // Important for binary data
outputPropertyName: "data"
}
```
---
## Performance Optimization
### 1. Parallel Requests
Use **Split In Batches** with multiple items:
```
Set (create array of IDs)
→ Split In Batches (10 items per batch)
→ HTTP Request (processes all 10 in parallel)
→ Loop
```
### 2. Caching
```
IF (check cache exists)
├─ [Cache Hit] → Use cached data
└─ [Cache Miss] → HTTP Request → Store in cache
```
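n8n has no built-in cache store, but workflow static data can serve as one. A minimal sketch of the cache check, assuming a downstream IF node branches on `fromCache` (note that static data persists across production executions only, not manual test runs):
```javascript
// Code node - time-based cache check using workflow static data
const cache = $getWorkflowStaticData('global');
const maxAgeMs = 10 * 60 * 1000; // treat cached data as fresh for 10 minutes

if (cache.payload && Date.now() - cache.fetchedAt < maxAgeMs) {
  return [{ json: { ...cache.payload, fromCache: true } }];
}
// Cache miss: downstream IF node routes this to the HTTP Request
return [{ json: { fromCache: false } }];
```
After the fresh fetch, a second Code node stores the result back: `cache.payload = $json; cache.fetchedAt = Date.now();`.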
### 3. Conditional Fetching
Only fetch if data changed:
```
HTTP Request (GET with If-Modified-Since header)
→ IF (status === 304)
└─ Use existing data
→ IF (status === 200)
└─ Process new data
```
### 4. Batch API Calls
If API supports batch operations:
```javascript
{
method: "POST",
url: "https://api.example.com/batch",
bodyParametersJson: JSON.stringify({
requests: $json.items.map(item => ({
method: "GET",
url: `/users/${item.id}`
}))
})
}
```
---
## Common Gotchas
### 1. ❌ Wrong: Hardcoded URLs
```javascript
url: "https://api.example.com/prod/users"
```
### ✅ Correct: Use environment variables
```javascript
url: "={{$env.API_BASE_URL}}/users"
```
### 2. ❌ Wrong: Credentials in parameters
```javascript
headerParameters: {
"Authorization": "Bearer sk-abc123xyz" // ❌ Exposed!
}
```
### ✅ Correct: Use credentials system
```javascript
authentication: "predefinedCredentialType",
nodeCredentialType: "httpHeaderAuth"
```
### 3. ❌ Wrong: No error handling
```
HTTP Request → Process  (fails if API is down)
```
### ✅ Correct: Handle errors
```
HTTP Request (continueOnFail: true) → IF (error) → Handle
```
### 4. ❌ Wrong: Blocking on large responses
Processing 10,000 items synchronously
### ✅ Correct: Use batching
```
Split In Batches (100 items) → Process → Loop
```
---
## Real Template Examples
From n8n template library (892 API integration templates):
**GitHub to Notion**:
```
Schedule → HTTP Request (GitHub API) → Transform → HTTP Request (Notion API)
```
**Weather to Slack**:
```
Schedule → HTTP Request (Weather API) → Set (format) → Slack
```
**CRM Sync**:
```
Schedule → HTTP Request (CRM A) → Transform → HTTP Request (CRM B)
```
Use `search_templates({query: "http api"})` to find more!
---
## Checklist for API Integration
### Planning
- [ ] Test API with Postman/curl first
- [ ] Understand response structure
- [ ] Check rate limits
- [ ] Review authentication method
- [ ] Plan error handling
### Implementation
- [ ] Use credentials (never hardcode)
- [ ] Configure proper HTTP method
- [ ] Set correct headers (Content-Type, Accept)
- [ ] Handle pagination if needed
- [ ] Add query parameters properly
### Error Handling
- [ ] Set continueOnFail: true if needed
- [ ] Check response status codes
- [ ] Implement retry logic
- [ ] Add Error Trigger workflow
- [ ] Alert on failures
### Performance
- [ ] Use batching for large datasets
- [ ] Add rate limiting if needed
- [ ] Consider caching
- [ ] Test with production load
### Security
- [ ] Use HTTPS only
- [ ] Store secrets in credentials
- [ ] Validate API responses
- [ ] Use environment variables
---
## Summary
**Key Points**:
1. **Authentication** via credentials system (never hardcode)
2. **Error handling** is critical (continueOnFail + IF checks)
3. **Pagination** for large datasets
4. **Rate limiting** to respect API limits
5. **Transform responses** to match your needs
**Pattern**: Trigger → HTTP Request → Transform → Action → Error Handler
**Related**:
- [webhook_processing.md](webhook_processing.md) - Receiving HTTP requests
- [database_operations.md](database_operations.md) - Storing API data


@@ -0,0 +1,773 @@
# Scheduled Tasks Pattern
**Use Case**: Recurring automation workflows that run automatically on a schedule.
---
## Pattern Structure
```
Schedule Trigger → [Fetch Data] → [Process] → [Deliver] → [Log/Notify]
```
**Key Characteristic**: Time-based automated execution
---
## Core Components
### 1. Schedule Trigger
**Purpose**: Execute workflow at specified times
**Modes**:
- **Interval** - Every X minutes/hours/days
- **Cron** - Specific times (advanced)
- **Days & Hours** - Simple recurring schedule
### 2. Data Source
**Common sources**:
- HTTP Request (APIs)
- Database queries
- File reads
- Service-specific nodes
### 3. Processing
**Typical operations**:
- Filter/transform data
- Aggregate statistics
- Generate reports
- Check conditions
### 4. Delivery
**Output channels**:
- Email
- Slack/Discord/Teams
- File storage
- Database writes
### 5. Logging
**Purpose**: Track execution history
**Methods**:
- Database log entries
- File append
- Monitoring service
---
## Schedule Configuration
### Interval Mode
**Best for**: Simple recurring tasks
**Examples**:
```javascript
// Every 15 minutes
{
mode: "interval",
interval: 15,
unit: "minutes"
}
// Every 2 hours
{
mode: "interval",
interval: 2,
unit: "hours"
}
// Once per day (set the time via the node's "Trigger at Hour" option)
{
mode: "interval",
interval: 1,
unit: "days"
}
```
### Days & Hours Mode
**Best for**: Specific days and times
**Examples**:
```javascript
// Weekdays at 9 AM
{
mode: "daysAndHours",
days: ["monday", "tuesday", "wednesday", "thursday", "friday"],
hour: 9,
minute: 0
}
// Every Monday at 6 PM
{
mode: "daysAndHours",
days: ["monday"],
hour: 18,
minute: 0
}
```
### Cron Mode (Advanced)
**Best for**: Complex schedules
**Examples**:
```javascript
// Every weekday at 9 AM
{
mode: "cron",
expression: "0 9 * * 1-5"
}
// First day of every month at midnight
{
mode: "cron",
expression: "0 0 1 * *"
}
// Every 15 minutes during business hours (9 AM - 5 PM) on weekdays
{
mode: "cron",
expression: "*/15 9-17 * * 1-5"
}
```
**Cron format**: `minute hour day month weekday`
- `*` = any value
- `*/15` = every 15 units
- `1-5` = range (Monday-Friday)
- `1,15` = specific values
**Cron examples**:
```
0 */6 * * * Every 6 hours
0 9,17 * * * At 9 AM and 5 PM daily
0 0 * * 0 Every Sunday at midnight
*/30 * * * * Every 30 minutes
0 0 1,15 * * 1st and 15th of each month
```
---
## Common Use Cases
### 1. Daily Reports
**Flow**: Schedule → Fetch data → Aggregate → Format → Email
**Example** (Sales report):
```
1. Schedule (daily at 9 AM)
2. Postgres (query yesterday's sales)
SELECT date, SUM(amount) as total, COUNT(*) as orders
FROM orders
WHERE date = CURRENT_DATE - INTERVAL '1 day'
GROUP BY date
3. Code (calculate metrics)
- Total revenue
- Order count
- Average order value
- Comparison to previous day
4. Set (format email body)
Subject: Daily Sales Report - {{$json.date}}
Body: Formatted HTML with metrics
5. Email (send to team@company.com)
6. Slack (post summary to #sales)
```
### 2. Data Synchronization
**Flow**: Schedule → Fetch from source → Transform → Write to target
**Example** (CRM to data warehouse sync):
```
1. Schedule (every hour)
2. Postgres (look up last sync time)
   SELECT MAX(synced_at) AS last_sync FROM sync_log
3. HTTP Request (fetch new CRM contacts since last sync)
GET /api/contacts?updated_since={{$json.last_sync}}
4. IF (check if new records exist)
5. Set (transform CRM schema to warehouse schema)
6. Postgres (warehouse - INSERT new contacts)
7. Postgres (UPDATE sync_log SET synced_at = NOW())
8. IF (error occurred)
└─ Slack (alert #data-team)
```
### 3. Monitoring & Health Checks
**Flow**: Schedule → Check endpoints → Alert if down
**Example** (Website uptime monitor):
```
1. Schedule (every 5 minutes)
2. HTTP Request (GET https://example.com/health)
- timeout: 10 seconds
- continueOnFail: true
3. IF (status !== 200 OR response_time > 2000ms)
4. Redis (check alert cooldown - don't spam)
- Key: alert:website_down
- TTL: 30 minutes
5. IF (no recent alert sent)
6. [Alert Actions]
├─ Slack (notify #ops-team)
├─ PagerDuty (create incident)
├─ Email (alert@company.com)
└─ Redis (set alert cooldown)
7. Postgres (log uptime check result)
```
### 4. Cleanup & Maintenance
**Flow**: Schedule → Find old data → Archive/Delete → Report
**Example** (Database cleanup):
```
1. Schedule (weekly on Sunday at 2 AM)
2. Postgres (find old records)
SELECT * FROM logs
WHERE created_at < NOW() - INTERVAL '90 days'
LIMIT 10000
3. IF (records exist)
4. Code (export to JSON for archive)
5. Google Drive (upload archive file)
- Filename: logs_archive_{{$now.format('YYYY-MM-DD')}}.json
6. Postgres (DELETE archived records)
DELETE FROM logs
WHERE id IN ({{$json.archived_ids}})
7. Slack (report: "Archived X records, deleted Y records")
```
### 5. Data Enrichment
**Flow**: Schedule → Find incomplete records → Enrich → Update
**Example** (Enrich contacts with company data):
```
1. Schedule (nightly at 3 AM)
2. Postgres (find contacts without company data)
SELECT id, email, domain FROM contacts
WHERE company_name IS NULL
AND created_at > NOW() - INTERVAL '7 days'
LIMIT 100
3. Split In Batches (10 contacts per batch)
4. HTTP Request (call Clearbit enrichment API)
- For each contact domain
- Rate limit: wait 1 second between batches
5. Set (map API response to database schema)
6. Postgres (UPDATE contacts with company data)
7. Wait (1 second - rate limiting)
8. Loop (back to step 4 until all batches processed)
9. Email (summary: "Enriched X contacts")
```
### 6. Backup Automation
**Flow**: Schedule → Export data → Compress → Store → Verify
**Example** (Database backup):
```
1. Schedule (daily at 2 AM)
2. Execute Command (run pg_dump — Execute Command is self-hosted only)
   pg_dump -h db.example.com mydb > backup.sql
3. Execute Command (compress)
   gzip backup.sql   # produces backup.sql.gz
4. AWS S3 (upload compressed backup)
- Bucket: backups
- Key: db/backup-{{$now.format('YYYY-MM-DD')}}.sql.gz
5. AWS S3 (list old backups)
- Keep last 30 days only
6. AWS S3 (delete old backups)
7. IF (error occurred)
├─ PagerDuty (critical alert)
└─ Email (backup failed!)
ELSE
└─ Slack (#devops: "✅ Backup completed")
```
### 7. Content Publishing
**Flow**: Schedule → Fetch content → Format → Publish
**Example** (Automated social media posts):
```
1. Schedule (every 3 hours during business hours)
- Cron: 0 9,12,15,18 * * 1-5
2. Google Sheets (read content queue)
- Sheet: "Scheduled Posts"
- Filter: status=pending AND publish_time <= NOW()
3. IF (posts available)
4. HTTP Request (shorten URLs in post)
5. HTTP Request (POST to Twitter API)
6. HTTP Request (POST to LinkedIn API)
7. Google Sheets (update status=published)
8. Slack (notify #marketing: "Posted: {{$json.title}}")
```
---
## Timezone Considerations
### Set Workflow Timezone
```javascript
// In workflow settings
{
timezone: "America/New_York" // EST/EDT
}
```
### Common Timezones
```
America/New_York - Eastern (US)
America/Chicago - Central (US)
America/Denver - Mountain (US)
America/Los_Angeles - Pacific (US)
Europe/London - GMT/BST
Europe/Paris - CET/CEST
Asia/Tokyo - JST
Australia/Sydney - AEDT
UTC - Universal Time
```
### Handle Daylight Saving
**Best practice**: Use timezone-aware scheduling
```javascript
// ❌ Bad: UTC schedule for "9 AM local"
// Will be off by 1 hour during DST transitions
// ✅ Good: Set workflow timezone
{
timezone: "America/New_York",
schedule: {
mode: "daysAndHours",
hour: 9 // Always 9 AM Eastern, regardless of DST
}
}
```
---
## Error Handling
### Pattern 1: Error Trigger Workflow
**Main workflow**: Normal execution
**Error workflow**: Alerts and recovery
**Main**:
```
Schedule → Fetch → Process → Deliver
```
**Error**:
```
Error Trigger (for main workflow)
→ Set (extract error details)
→ Slack (#ops-team: "❌ Scheduled job failed")
→ Email (admin alert)
→ Postgres (log error for analysis)
```
### Pattern 2: Retry with Backoff
```
Schedule → HTTP Request (continueOnFail: true)
→ IF (error)
├─ Wait (5 minutes)
├─ HTTP Request (retry 1)
└─ IF (still error)
├─ Wait (15 minutes)
├─ HTTP Request (retry 2)
└─ IF (still error)
└─ Alert admin
```
### Pattern 3: Partial Failure Handling
```
Schedule → Split In Batches
→ Process (continueOnFail: true)
→ Code (track successes and failures)
→ Report:
"✅ Processed: 95/100"
"❌ Failed: 5/100"
```
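A minimal sketch of the tracking Code node, assuming the processing nodes run with continueOnFail so failed items carry an `error` field:
```javascript
// Code node - tally successes and failures across all batch items
const items = $input.all();
const failed = items.filter(item => item.json.error);
const succeeded = items.length - failed.length;

return [{
  json: {
    total: items.length,
    succeeded,
    failed: failed.length,
    report: `✅ Processed: ${succeeded}/${items.length}\n` +
            `❌ Failed: ${failed.length}/${items.length}`
  }
}];
```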
---
## Performance Optimization
### 1. Batch Processing
For large datasets:
```
Schedule → Query (LIMIT 10000)
→ Split In Batches (100 items)
→ Process batch
→ Loop
```
### 2. Parallel Processing
When operations are independent:
```
Schedule
├─ [Branch 1: Update DB]
├─ [Branch 2: Send emails]
└─ [Branch 3: Generate report]
→ Merge (wait for all) → Final notification
```
### 3. Skip if Already Running
Prevent overlapping executions:
```
Schedule → Redis (check lock)
→ IF (lock exists)
└─ End (skip this execution)
→ ELSE
├─ Redis (set lock, TTL 30 min)
├─ [Execute workflow]
└─ Redis (delete lock)
```
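Where Redis isn't available, workflow static data can approximate the lock. A minimal sketch (static data persists only across production executions):
```javascript
// Code node - lightweight execution lock via workflow static data
const state = $getWorkflowStaticData('global');
const lockTtlMs = 30 * 60 * 1000; // auto-expire stale locks after 30 minutes

if (state.lockedAt && Date.now() - state.lockedAt < lockTtlMs) {
  return [{ json: { skip: true } }]; // a previous run is still in progress
}
state.lockedAt = Date.now(); // acquire the lock
return [{ json: { skip: false } }];
```
A final Code node clears the lock with `delete state.lockedAt` once the run completes.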
### 4. Early Exit on No Data
Don't waste time if nothing to process:
```
Schedule → Query (check if work exists)
→ IF (no results)
└─ End workflow (exit early)
→ ELSE
└─ Process data
```
---
## Monitoring & Logging
### Pattern 1: Execution Log Table
```sql
CREATE TABLE workflow_executions (
id SERIAL PRIMARY KEY,
workflow_name VARCHAR(255),
started_at TIMESTAMP,
completed_at TIMESTAMP,
status VARCHAR(50),
records_processed INT,
error_message TEXT
);
```
**Log execution**:
```
Schedule
→ Set (record start)
→ [Workflow logic]
→ Postgres (INSERT execution log)
```
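A sketch of the INSERT behind the final Postgres node, using the columns from the table above; the `{{...}}` placeholders are illustrative, and the node's query-parameter option is preferable to string interpolation in production:
```sql
-- Log one execution row at the end of the run
INSERT INTO workflow_executions
  (workflow_name, started_at, completed_at, status, records_processed, error_message)
VALUES
  ('daily_sync', '{{$json.started_at}}', NOW(), 'success', {{$json.records_processed}}, NULL);
```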
### Pattern 2: Metrics Collection
```
Schedule → [Execute]
→ Code (calculate metrics)
- Duration
- Records processed
- Success rate
→ HTTP Request (send to monitoring system)
- Datadog, Prometheus, etc.
```
### Pattern 3: Summary Notifications
Daily/weekly execution summaries:
```
Schedule (daily at 6 PM) → Query execution logs
→ Code (aggregate today's executions)
→ Email (summary report)
"Today's Workflow Executions:
- 24/24 successful
- 0 failures
- Avg duration: 2.3 min"
```
---
## Testing Scheduled Workflows
### 1. Use Manual Trigger for Testing
**Development pattern**:
```
Manual Trigger (for testing)
→ [Same workflow logic]
→ [Outputs]
// Once tested, replace with Schedule Trigger
```
### 2. Test with Different Times
```javascript
// Code node - simulate different times
const testTime = new Date('2024-01-15T09:00:00Z');
return [{ json: { currentTime: testTime } }];
```
### 3. Dry Run Mode
```
Schedule → Set (dryRun: true)
→ IF (dryRun)
└─ Log what would happen (don't execute)
→ ELSE
└─ Execute normally
```
### 4. Shorter Interval for Testing
```javascript
// Testing: every 1 minute
{
mode: "interval",
interval: 1,
unit: "minutes"
}
// Production: every 1 hour
{
mode: "interval",
interval: 1,
unit: "hours"
}
```
---
## Common Gotchas
### 1. ❌ Wrong: Ignoring timezone
```
Schedule (9 AM)  // 9 AM in which timezone?
```
### ✅ Correct: Set workflow timezone
```javascript
// Workflow settings
{
timezone: "America/New_York"
}
```
### 2. ❌ Wrong: Overlapping executions
```
Schedule (every 5 min) → Long-running task (10 min)
// Two executions running simultaneously!
```
### ✅ Correct: Add execution lock
```
Schedule → Redis (check lock)
→ IF (locked) → Skip
→ ELSE → Execute
```
### 3. ❌ Wrong: No error handling
```
Schedule → API call → Process (fails silently)
```
### ✅ Correct: Add error workflow
```
Main: Schedule → Execute
Error: Error Trigger → Alert
```
### 4. ❌ Wrong: Processing all data at once
```
Schedule → SELECT 1000000 records → Process (OOM)
```
### ✅ Correct: Batch processing
```
Schedule → SELECT with pagination → Split In Batches → Process
```
### 5. ❌ Wrong: Hardcoded dates
```javascript
query: "SELECT * FROM orders WHERE date = '2024-01-15'"
```
### ✅ Correct: Dynamic dates
```javascript
query: "SELECT * FROM orders WHERE date = CURRENT_DATE - INTERVAL '1 day'"
```
---
## Real Template Examples
From n8n template library:
**Template #2947** (Weather to Slack):
```
Schedule (daily 8 AM)
→ HTTP Request (weather API)
→ Set (format message)
→ Slack (post to #general)
```
**Daily backup**:
```
Schedule (nightly 2 AM)
→ Postgres (export data)
→ Google Drive (upload)
→ Email (confirmation)
```
**Monitoring**:
```
Schedule (every 5 min)
→ HTTP Request (health check)
→ IF (down) → PagerDuty alert
```
Use `search_templates({query: "schedule"})` to find more!
---
## Checklist for Scheduled Workflows
### Planning
- [ ] Define schedule frequency (interval, cron, days & hours)
- [ ] Set workflow timezone
- [ ] Estimate execution duration
- [ ] Plan for failures and retries
- [ ] Consider timezone and DST
### Implementation
- [ ] Configure Schedule Trigger
- [ ] Set workflow timezone in settings
- [ ] Add early exit for no-op cases
- [ ] Implement batch processing for large data
- [ ] Add execution logging
### Error Handling
- [ ] Create Error Trigger workflow
- [ ] Implement retry logic
- [ ] Add alert notifications
- [ ] Log errors for analysis
- [ ] Handle partial failures gracefully
### Monitoring
- [ ] Log each execution (start, end, status)
- [ ] Track metrics (duration, records, success rate)
- [ ] Set up daily/weekly summaries
- [ ] Alert on consecutive failures
- [ ] Monitor resource usage
### Testing
- [ ] Test with Manual Trigger first
- [ ] Verify timezone behavior
- [ ] Test error scenarios
- [ ] Check for overlapping executions
- [ ] Validate output quality
### Deployment
- [ ] Document workflow purpose
- [ ] Set up monitoring
- [ ] Configure alerts
- [ ] Activate workflow in n8n UI ⚠️ **Manual activation required** (API/MCP cannot activate)
- [ ] Test in production (short interval first)
- [ ] Monitor first few executions
---
## Advanced Patterns
### Dynamic Scheduling
**Change schedule based on conditions**:
```
Schedule (check every hour) → Code (check if it's time to run)
→ IF (business hours AND weekday)
└─ Execute workflow
→ ELSE
└─ Skip
```
### Dependent Schedules
**Chain workflows**:
```
Workflow A (daily 2 AM): Data sync
→ On completion → Trigger Workflow B
Workflow B: Generate report (depends on fresh data)
```
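The "trigger" step is typically an Execute Workflow node at the end of Workflow A (or Workflow A calling Workflow B's webhook), so the report only runs against freshly synced data.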
### Conditional Execution
**Skip based on external factors**:
```
Schedule → HTTP Request (check feature flag)
→ IF (feature enabled)
└─ Execute
→ ELSE
└─ Skip
```
---
## Summary
**Key Points**:
1. **Set workflow timezone** explicitly
2. **Batch processing** for large datasets
3. **Error handling** is critical (Error Trigger + retries)
4. **Prevent overlaps** with execution locks
5. **Monitor and log** all executions
**Pattern**: Schedule → Fetch → Process → Deliver → Log
**Schedule Modes**:
- **Interval**: Simple recurring (every X minutes/hours)
- **Days & Hours**: Specific days and times
- **Cron**: Advanced complex schedules
**Related**:
- [http_api_integration.md](http_api_integration.md) - Fetching data on schedule
- [database_operations.md](database_operations.md) - Scheduled database tasks
- [webhook_processing.md](webhook_processing.md) - Alternative to scheduling


@@ -0,0 +1,545 @@
# Webhook Processing Pattern
**Use Case**: Receive HTTP requests from external systems and process them instantly.
---
## Pattern Structure
```
Webhook → [Validate] → [Transform] → [Action] → [Response/Notify]
```
**Key Characteristic**: Instant event-driven processing
---
## Core Components
### 1. Webhook Node (Trigger)
**Purpose**: Create HTTP endpoint to receive data
**Configuration**:
```javascript
{
path: "form-submit", // URL path: https://n8n.example.com/webhook/form-submit
httpMethod: "POST", // GET, POST, PUT, DELETE
responseMode: "onReceived", // or "lastNode" for custom response
responseData: "allEntries" // or "firstEntryJson"
}
```
**Critical Gotcha**: Data is nested under `$json.body`
```javascript
{{$json.email}}       // ❌ undefined — the top level only has headers/params/query/body
{{$json.body.email}}  // ✅ webhook payload fields live under body
```
### 2. Validation (Optional but Recommended)
**Purpose**: Verify incoming data before processing
**Options**:
- **IF node** - Check required fields exist
- **Code node** - Custom validation logic
- **Stop and Error** - Fail gracefully with message
**Example**:
```javascript
// IF node condition
{{$json.body.email}} is not empty AND
{{$json.body.name}} is not empty
```
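For anything beyond presence checks, a Code node can validate and fail fast. A minimal sketch assuming `email` and `name` are the required fields:
```javascript
// Code node - validate required webhook fields and fail fast
const body = $input.first().json.body || {};
const required = ['email', 'name']; // adjust to your payload
const missing = required.filter(field => !body[field]);

if (missing.length > 0) {
  // Throwing routes the execution to the Error Trigger workflow
  throw new Error(`Invalid payload, missing: ${missing.join(', ')}`);
}
if (!/^\S+@\S+\.\S+$/.test(body.email)) {
  throw new Error('Invalid email format');
}
return [{ json: body }];
```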
### 3. Transformation
**Purpose**: Map webhook data to desired format
**Typical nodes**:
- **Set** - Field mapping
- **Code** - Complex transformations
**Example** (Set node):
```javascript
{
"user_email": "={{$json.body.email}}",
"user_name": "={{$json.body.name}}",
"timestamp": "={{$now}}"
}
```
### 4. Action
**Purpose**: Do something with the data
**Common actions**:
- Store in database (Postgres, MySQL, MongoDB)
- Send notification (Slack, Email, Discord)
- Call another API (HTTP Request)
- Update external system (CRM, support ticket)
### 5. Response (If responseMode: "lastNode")
**Purpose**: Send custom HTTP response
**Webhook Response Node**:
```javascript
{
statusCode: 200,
headers: {
"Content-Type": "application/json"
},
body: {
"status": "success",
"message": "Form received"
}
}
```
---
## Common Use Cases
### 1. Form Submissions
**Flow**: Form → Webhook → Validate → Database → Email Confirmation
**Example**:
```
1. Webhook (path: "contact-form", POST)
2. IF (check email & message not empty)
3. Postgres (insert into contacts table)
4. Email (send confirmation to user)
5. Slack (notify team in #leads)
6. Webhook Response ({"status": "success"})
```
**Real Data Access**:
```javascript
Name: {{$json.body.name}}
Email: {{$json.body.email}}
Message: {{$json.body.message}}
```
### 2. Payment Webhooks (Stripe, PayPal)
**Flow**: Payment Provider → Webhook → Verify → Update Database → Send Receipt
**Security**: Verify webhook signatures
```javascript
// Code node - verify a Stripe-style HMAC signature (simplified sketch)
// Stripe's real Stripe-Signature header is `t=<ts>,v1=<sig>` and signs
// `${ts}.${rawBody}` — prefer stripe.webhooks.constructEvent() in production
const crypto = require('crypto');
const signature = $input.item.json.headers['stripe-signature'];
const secret = $env.STRIPE_WEBHOOK_SECRET; // Code nodes can't read credentials; use env vars
const expectedSig = crypto
  .createHmac('sha256', secret)
  .update(JSON.stringify($input.item.json.body))
  .digest('hex');
if (signature !== expectedSig) {
  throw new Error('Invalid webhook signature');
}
return [{ json: $input.item.json.body }]; // Return validated body
```
### 3. Chat Platform Integrations (Slack, Discord, Teams)
**Flow**: Chat Command → Webhook → Process → Respond
**Example** (Slack slash command):
```
1. Webhook (path: "slack-command", POST)
2. Code (parse Slack payload: $json.body.text, $json.body.user_id)
3. HTTP Request (fetch data from API)
4. Set (format Slack message)
5. Webhook Response (immediate Slack response)
```
**Slack Data Access**:
```javascript
Command: {{$json.body.command}}
Text: {{$json.body.text}}
User ID: {{$json.body.user_id}}
Channel ID: {{$json.body.channel_id}}
```
### 4. GitHub/GitLab Webhooks
**Flow**: Git Event → Webhook → Parse → Notify/Deploy
**Example** (new PR notification):
```
1. Webhook (path: "github", POST)
2. IF (check $json.body.action equals "opened")
3. Set (extract PR details: title, author, url)
4. Slack (notify #dev-team)
5. Webhook Response (200 OK)
```
**GitHub Data Access**:
```javascript
Event Type: {{$json.headers['x-github-event']}}
Action: {{$json.body.action}}
PR Title: {{$json.body.pull_request.title}}
Author: {{$json.body.pull_request.user.login}}
URL: {{$json.body.pull_request.html_url}}
```
### 5. IoT Device Data
**Flow**: Device → Webhook → Validate → Store → Alert (if threshold)
**Example** (temperature sensor):
```
1. Webhook (path: "sensor-data", POST)
2. Set (extract sensor readings)
3. Postgres (insert into sensor_readings)
4. IF (temperature > 80)
5. Email (alert admin)
```
---
## Webhook Data Structure
### Standard Structure
```json
{
"headers": {
"content-type": "application/json",
"user-agent": "...",
"x-custom-header": "..."
},
"params": {
"id": "123" // From URL: /webhook/form/:id
},
"query": {
"token": "abc" // From URL: /webhook/form?token=abc
},
"body": {
// ⚠️ YOUR DATA IS HERE!
"name": "John",
"email": "john@example.com"
}
}
```
### Accessing Different Parts
```javascript
// Headers
{{$json.headers['content-type']}}
{{$json.headers['x-api-key']}}
// URL Parameters
{{$json.params.id}}
// Query Parameters
{{$json.query.token}}
{{$json.query.page}}
// Body (MOST COMMON)
{{$json.body.email}}
{{$json.body.user.name}}
{{$json.body.items[0].price}}
```
---
## Authentication & Security
### 1. Query Parameter Token
**Simple but less secure**
```javascript
// IF node - validate token
{{$json.query.token}} equals "your-secret-token"
```
### 2. Header-Based Auth
**Better security**
```javascript
// IF node - check header
{{$json.headers['x-api-key']}} equals "your-api-key"
```
### 3. Signature Verification
**Best security** (for webhooks from services like Stripe, GitHub)
```javascript
// Code node - HMAC signature check
const crypto = require('crypto');
const signature = $input.item.json.headers['x-signature'];
const secret = $env.WEBHOOK_SECRET; // Code nodes can't read credentials; use env vars
const calculatedSig = crypto
  .createHmac('sha256', secret)
  .update(JSON.stringify($input.item.json.body))
  .digest('hex');
if (signature !== `sha256=${calculatedSig}`) {
  throw new Error('Invalid signature');
}
return [{ json: $input.item.json.body }];
```
### 4. IP Whitelist
**Restrict access by IP** (n8n workflow settings)
- Configure in workflow settings
- Only allow specific IP ranges
- Use for internal systems
---
## Response Modes
### onReceived (Default)
**Behavior**: Immediate 200 OK response, workflow continues in background
**Use when**:
- Long-running workflows
- Response doesn't depend on workflow result
- Fire-and-forget processing
**Configuration**:
```javascript
{
responseMode: "onReceived",
responseCode: 200
}
```
### lastNode (Custom Response)
**Behavior**: Wait for workflow completion, send custom response
**Use when**:
- Need to return data to caller
- Synchronous processing required
- Form submissions with confirmation
**Configuration**:
```javascript
{
responseMode: "lastNode"
}
```
**Then add Webhook Response node**:
```javascript
{
statusCode: 200,
headers: {
"Content-Type": "application/json"
},
body: {
"id": "={{$json.record_id}}",
"status": "success"
}
}
```
---
## Error Handling
### Pattern 1: Try-Catch with Error Trigger
```
Main Flow:
Webhook → [nodes...] → Success Response
Error Flow:
Error Trigger → Log Error → Slack Alert → Error Response
```
**Error Trigger setup**: the Error Trigger node itself takes no parameters — instead, open the main workflow's Settings and select the error workflow under "Error Workflow".
**Error Response** (if responseMode: "lastNode"):
```javascript
{
statusCode: 500,
body: {
"status": "error",
"message": "Processing failed"
}
}
```
### Pattern 2: Validation Early Exit
```
Webhook → IF (validate) → [True: Process]
└→ [False: Error Response]
```
**False Branch Response**:
```javascript
{
statusCode: 400,
body: {
"status": "error",
"message": "Invalid data: missing email"
}
}
```
### Pattern 3: Continue On Fail
**Per-node setting**: Continue even if node fails
**Use case**: Non-critical notifications
```
Webhook → Database (critical) → Slack (continueOnFail: true)
```
---
## Testing Webhooks
### 1. Use Manual Trigger
Replace Webhook with Manual Trigger for testing:
```
Manual Trigger → [set test data] → rest of workflow
```
### 2. Use curl
```bash
curl -X POST https://n8n.example.com/webhook/form-submit \
-H "Content-Type: application/json" \
-d '{"email": "test@example.com", "name": "Test User"}'
```
### 3. Use Postman/Insomnia
- Create request collection
- Test different payloads
- Verify responses
### 4. Webhook.site
- Use webhook.site for testing
- Copy webhook.site URL to your service
- View requests and debug
---
## Performance Considerations
### Large Payloads
- Webhook timeout: 120 seconds (default)
- For large data, consider async processing:
```
Webhook → Queue (Redis/DB) → Response (immediate)
Separate Workflow:
Schedule → Check Queue → Process
```
### High Volume
- Use "Execute Once" mode if processing all items together
- Consider rate limiting
- Monitor execution times
- Scale n8n instance if needed
### Retries
- Webhook calls typically don't retry automatically
- Implement retry logic on caller side
- Or use queue pattern for guaranteed processing
---
## Common Gotchas
### 1. ❌ Wrong: Accessing webhook data
```javascript
{{$json.email}} // Empty or undefined
```
### ✅ Correct
```javascript
{{$json.body.email}} // Data is under .body
```
### 2. ❌ Wrong: Response mode confusion
Using Webhook Response node with responseMode: "onReceived" (ignored)
### ✅ Correct
Set responseMode: "lastNode" to use Webhook Response node
### 3. ❌ Wrong: No validation
Assuming data is always present and valid
### ✅ Correct
Validate data early with IF node or Code node
### 4. ❌ Wrong: Hardcoded paths
Using same path for dev/prod
### ✅ Correct
Use environment variables: `{{$env.WEBHOOK_PATH_PREFIX}}/form-submit`
---
## Real Template Examples
From n8n template library (1,085 webhook templates):
**Simple Form to Slack**:
```
Webhook → Set → Slack
```
**Payment Processing**:
```
Webhook → Verify Signature → Update Database → Send Receipt → Notify Admin
```
**Chat Bot**:
```
Webhook → Parse Command → AI Agent → Format Response → Webhook Response
```
Use `search_templates({query: "webhook"})` to find more!
---
## Checklist for Webhook Workflows
### Setup
- [ ] Choose descriptive webhook path
- [ ] Configure HTTP method (POST most common)
- [ ] Choose response mode (onReceived vs lastNode)
- [ ] Test webhook URL before connecting services
### Security
- [ ] Add authentication (token, signature, IP whitelist)
- [ ] Validate incoming data
- [ ] Sanitize user input (if storing/displaying)
- [ ] Use HTTPS (always)
### Data Handling
- [ ] Remember data is under $json.body
- [ ] Handle missing fields gracefully
- [ ] Transform data to desired format
- [ ] Log important data (for debugging)
### Error Handling
- [ ] Add Error Trigger workflow
- [ ] Validate required fields
- [ ] Return appropriate error responses
- [ ] Alert team on failures
### Testing
- [ ] Test with curl/Postman
- [ ] Test error scenarios
- [ ] Verify response format
- [ ] Monitor first executions
---
## Summary
**Key Points**:
1. **Data under $json.body** (most common mistake!)
2. **Validate early** to catch bad data
3. **Choose response mode** based on use case
4. **Secure webhooks** with auth
5. **Handle errors** gracefully
**Pattern**: Webhook → Validate → Transform → Action → Response
**Related**:
- [n8n Expression Syntax](../../n8n-expression-syntax/SKILL.md) - Accessing webhook data correctly
- [http_api_integration.md](http_api_integration.md) - Making HTTP requests in response