Initial commit

Zhongwei Li, 2025-11-29 18:17:17 +08:00, commit 6062d3994e
39 changed files with 21748 additions and 0 deletions
# Common Patterns - Python Code Node
Production-tested Python patterns for n8n Code nodes.

---
## ⚠️ Important: JavaScript First
**Use JavaScript for 95% of use cases.**
Python in n8n has **NO external libraries** (no requests, pandas, numpy).
Only use Python when:
- You have complex Python-specific logic
- You need Python's standard library features
- You're more comfortable with Python than JavaScript
For most workflows, **JavaScript is the better choice**.

---
## Pattern Overview
These 10 patterns cover common n8n Code node scenarios using Python:
1. **Multi-Source Data Aggregation** - Combine data from multiple nodes
2. **Regex-Based Filtering** - Filter items using pattern matching
3. **Markdown to Structured Data** - Parse markdown into structured format
4. **JSON Object Comparison** - Compare two JSON objects for changes
5. **CRM Data Transformation** - Transform CRM data to standard format
6. **Release Notes Processing** - Parse and categorize release notes
7. **Array Transformation** - Reshape arrays and extract fields
8. **Dictionary Lookup** - Create and use lookup dictionaries
9. **Top N Filtering** - Get top items by score/value
10. **String Aggregation** - Aggregate strings with formatting
---
## Pattern 1: Multi-Source Data Aggregation
**Use case**: Combine data from multiple sources (APIs, webhooks, databases).
**Scenario**: Aggregate news articles from multiple sources.
### Implementation
```python
from datetime import datetime

all_items = _input.all()
processed_articles = []

for item in all_items:
    source_name = item["json"].get("name", "Unknown")
    source_data = item["json"]

    # Process Hacker News source
    if source_name == "Hacker News" and source_data.get("hits"):
        for hit in source_data["hits"]:
            processed_articles.append({
                "title": hit.get("title", "No title"),
                "url": hit.get("url", ""),
                "summary": hit.get("story_text") or "No summary",
                "source": "Hacker News",
                "score": hit.get("points", 0),
                "fetched_at": datetime.now().isoformat()
            })

    # Process Reddit source
    elif source_name == "Reddit" and source_data.get("data"):
        for post in source_data["data"].get("children", []):
            post_data = post.get("data", {})
            processed_articles.append({
                "title": post_data.get("title", "No title"),
                "url": post_data.get("url", ""),
                "summary": post_data.get("selftext", "")[:200],
                "source": "Reddit",
                "score": post_data.get("score", 0),
                "fetched_at": datetime.now().isoformat()
            })

# Sort by score descending
processed_articles.sort(key=lambda x: x["score"], reverse=True)

# Return as n8n items
return [{"json": article} for article in processed_articles]
```
### Key Techniques
- Process multiple data sources in one loop
- Normalize different data structures
- Use datetime for timestamps
- Sort by criteria
- Return properly formatted items
---
## Pattern 2: Regex-Based Filtering
**Use case**: Filter items based on pattern matching in text fields.
**Scenario**: Filter support tickets by priority keywords.
### Implementation
```python
import re

all_items = _input.all()
priority_tickets = []

# High priority keywords pattern
high_priority_pattern = re.compile(
    r'\b(urgent|critical|emergency|asap|down|outage|broken)\b',
    re.IGNORECASE
)

for item in all_items:
    ticket = item["json"]

    # Check subject and description
    subject = ticket.get("subject", "")
    description = ticket.get("description", "")
    combined_text = f"{subject} {description}"

    # Find matches
    matches = high_priority_pattern.findall(combined_text)

    if matches:
        priority_tickets.append({
            "json": {
                **ticket,
                "priority": "high",
                "matched_keywords": list(set(matches)),
                "keyword_count": len(matches)
            }
        })
    else:
        priority_tickets.append({
            "json": {
                **ticket,
                "priority": "normal",
                "matched_keywords": [],
                "keyword_count": 0
            }
        })

# Sort by keyword count (most urgent first)
priority_tickets.sort(key=lambda x: x["json"]["keyword_count"], reverse=True)

return priority_tickets
```
### Key Techniques
- Use re.compile() for reusable patterns
- re.IGNORECASE for case-insensitive matching
- Combine multiple text fields for searching
- Extract and deduplicate matches
- Sort by priority indicators
---
## Pattern 3: Markdown to Structured Data
**Use case**: Parse markdown text into structured data.
**Scenario**: Extract tasks from markdown checklist.
### Implementation
```python
import re

markdown_text = _input.first()["json"]["body"].get("markdown", "")

# Parse markdown checklist
tasks = []
lines = markdown_text.split("\n")

for line in lines:
    # Match: - [ ] Task or - [x] Task
    match = re.match(r'^\s*-\s*\[([ x])\]\s*(.+)$', line, re.IGNORECASE)
    if match:
        checked = match.group(1).lower() == 'x'
        task_text = match.group(2).strip()

        # Extract priority if present (e.g., [P1], [HIGH])
        priority_match = re.search(r'\[(P\d|HIGH|MEDIUM|LOW)\]', task_text, re.IGNORECASE)
        priority = priority_match.group(1).upper() if priority_match else "NORMAL"

        # Remove priority tag from text
        clean_text = re.sub(r'\[(P\d|HIGH|MEDIUM|LOW)\]', '', task_text, flags=re.IGNORECASE).strip()

        tasks.append({
            "text": clean_text,
            "completed": checked,
            "priority": priority,
            "original_line": line.strip()
        })

return [{
    "json": {
        "tasks": tasks,
        "total": len(tasks),
        "completed": sum(1 for t in tasks if t["completed"]),
        "pending": sum(1 for t in tasks if not t["completed"])
    }
}]
```
### Key Techniques
- Line-by-line parsing
- Multiple regex patterns for extraction
- Extract metadata from text
- Calculate summary statistics
- Return structured data
---
## Pattern 4: JSON Object Comparison
**Use case**: Compare two JSON objects to find differences.
**Scenario**: Compare old and new user profile data.
### Implementation
```python
all_items = _input.all()

# Assume first item is old data, second is new data
old_data = all_items[0]["json"] if len(all_items) > 0 else {}
new_data = all_items[1]["json"] if len(all_items) > 1 else {}

changes = {
    "added": {},
    "removed": {},
    "modified": {},
    "unchanged": {}
}

# Find all unique keys
all_keys = set(old_data.keys()) | set(new_data.keys())

for key in all_keys:
    old_value = old_data.get(key)
    new_value = new_data.get(key)

    if key not in old_data:
        # Added field
        changes["added"][key] = new_value
    elif key not in new_data:
        # Removed field
        changes["removed"][key] = old_value
    elif old_value != new_value:
        # Modified field
        changes["modified"][key] = {
            "old": old_value,
            "new": new_value
        }
    else:
        # Unchanged field
        changes["unchanged"][key] = old_value

return [{
    "json": {
        "changes": changes,
        "summary": {
            "added_count": len(changes["added"]),
            "removed_count": len(changes["removed"]),
            "modified_count": len(changes["modified"]),
            "unchanged_count": len(changes["unchanged"]),
            "has_changes": bool(changes["added"] or changes["removed"] or changes["modified"])
        }
    }
}]
```
### Key Techniques
- Set operations for key comparison
- Dictionary .get() for safe access
- Categorize changes by type
- Create summary statistics
- Return detailed comparison
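The loop above compares only top-level keys, so two nested dictionaries that differ anywhere inside are reported as a single modified field. Where deep comparison matters, a recursive variant can report dot-separated paths instead. A minimal sketch (the helper name `diff_nested` and the sample data are illustrative, not from the pattern above):

```python
def diff_nested(old, new, path=""):
    """Recursively collect dot-paths whose values differ between two dicts."""
    diffs = {}
    for key in set(old) | set(new):
        key_path = f"{path}.{key}" if path else str(key)
        old_value, new_value = old.get(key), new.get(key)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            # Both sides are dicts: descend and merge the nested differences
            diffs.update(diff_nested(old_value, new_value, key_path))
        elif old_value != new_value:
            diffs[key_path] = {"old": old_value, "new": new_value}
    return diffs

old = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
new = {"name": "Ada", "address": {"city": "Leeds", "zip": "N1"}}
print(diff_nested(old, new))  # {'address.city': {'old': 'London', 'new': 'Leeds'}}
```

Missing keys surface as `None` on one side; keep that in mind if your data legitimately contains `None` values.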
---
## Pattern 5: CRM Data Transformation
**Use case**: Transform CRM data to standard format.
**Scenario**: Normalize data from different CRM systems.
### Implementation
```python
from datetime import datetime
import re

all_items = _input.all()
normalized_contacts = []

for item in all_items:
    raw_contact = item["json"]
    source = raw_contact.get("source", "unknown")

    # Normalize email
    email = raw_contact.get("email", "").lower().strip()

    # Normalize phone (remove non-digits)
    phone_raw = raw_contact.get("phone", "")
    phone = re.sub(r'\D', '', phone_raw)

    # Parse name
    if "full_name" in raw_contact:
        name_parts = raw_contact["full_name"].split(" ", 1)
        first_name = name_parts[0] if len(name_parts) > 0 else ""
        last_name = name_parts[1] if len(name_parts) > 1 else ""
    else:
        first_name = raw_contact.get("first_name", "")
        last_name = raw_contact.get("last_name", "")

    # Normalize status
    status_raw = raw_contact.get("status", "").lower()
    status = "active" if status_raw in ["active", "enabled", "true", "1"] else "inactive"

    # Create normalized contact
    normalized_contacts.append({
        "json": {
            "id": raw_contact.get("id", ""),
            "first_name": first_name.strip(),
            "last_name": last_name.strip(),
            "full_name": f"{first_name} {last_name}".strip(),
            "email": email,
            "phone": phone,
            "status": status,
            "source": source,
            "normalized_at": datetime.now().isoformat(),
            "original_data": raw_contact
        }
    })

return normalized_contacts
```
### Key Techniques
- Multiple field name variations handling
- String cleaning and normalization
- Regex for phone number cleaning
- Name parsing logic
- Status normalization
- Preserve original data
---
## Pattern 6: Release Notes Processing
**Use case**: Parse release notes and categorize changes.
**Scenario**: Extract features, fixes, and breaking changes from release notes.
### Implementation
```python
import re

release_notes = _input.first()["json"]["body"].get("notes", "")

categories = {
    "features": [],
    "fixes": [],
    "breaking": [],
    "other": []
}

# Split into lines
lines = release_notes.split("\n")

for line in lines:
    line = line.strip()

    # Skip empty lines and headers
    if not line or line.startswith("#"):
        continue

    # Remove bullet points
    clean_line = re.sub(r'^[\*\-\+]\s*', '', line)

    # Categorize: first match wins, so check breaking changes first
    # so a line like "Remove deprecated feature X" isn't misfiled as a feature
    if re.search(r'\b(breaking|deprecated|remove)\b', clean_line, re.IGNORECASE):
        categories["breaking"].append(clean_line)
    elif re.search(r'\b(feature|add|new)\b', clean_line, re.IGNORECASE):
        categories["features"].append(clean_line)
    elif re.search(r'\b(fix|bug|patch|resolve)\b', clean_line, re.IGNORECASE):
        categories["fixes"].append(clean_line)
    else:
        categories["other"].append(clean_line)

return [{
    "json": {
        "categories": categories,
        "summary": {
            "features": len(categories["features"]),
            "fixes": len(categories["fixes"]),
            "breaking": len(categories["breaking"]),
            "other": len(categories["other"]),
            "total": sum(len(v) for v in categories.values())
        }
    }
}]
```
### Key Techniques
- Line-by-line parsing
- Pattern-based categorization
- Bullet point removal
- Skip headers and empty lines
- Summary statistics
---
## Pattern 7: Array Transformation
**Use case**: Reshape arrays and extract specific fields.
**Scenario**: Transform user data array to extract specific fields.
### Implementation
```python
all_items = _input.all()

# Extract and transform
transformed = []

for item in all_items:
    user = item["json"]

    # Extract nested fields
    profile = user.get("profile", {})
    settings = user.get("settings", {})

    transformed.append({
        "json": {
            "user_id": user.get("id"),
            "email": user.get("email"),
            "name": profile.get("name", "Unknown"),
            "avatar": profile.get("avatar_url"),
            "bio": profile.get("bio", "")[:100],  # Truncate to 100 chars
            "notifications_enabled": settings.get("notifications", True),
            "theme": settings.get("theme", "light"),
            "created_at": user.get("created_at"),
            "last_login": user.get("last_login_at")
        }
    })

return transformed
```
### Key Techniques
- Field extraction from nested objects
- Default values with .get()
- String truncation
- Flattening nested structures
---
## Pattern 8: Dictionary Lookup
**Use case**: Create lookup dictionary for fast data access.
**Scenario**: Look up user details by ID.
### Implementation
```python
all_items = _input.all()

# Build lookup dictionary
users_by_id = {}
for item in all_items:
    user = item["json"]
    user_id = user.get("id")
    if user_id:
        users_by_id[user_id] = {
            "name": user.get("name"),
            "email": user.get("email"),
            "status": user.get("status")
        }

# Example: Look up specific users
lookup_ids = [1, 3, 5]
looked_up = []

for user_id in lookup_ids:
    if user_id in users_by_id:
        looked_up.append({
            "json": {
                "id": user_id,
                **users_by_id[user_id],
                "found": True
            }
        })
    else:
        looked_up.append({
            "json": {
                "id": user_id,
                "found": False
            }
        })

return looked_up
```
### Key Techniques
- Dictionary comprehension alternative
- O(1) lookup time
- Handle missing keys gracefully
- Preserve lookup order
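The build loop above can be collapsed into the dict comprehension mentioned in the techniques list. A sketch with inline sample data standing in for `_input.all()`:

```python
# Sample items standing in for _input.all() in this sketch
all_items = [
    {"json": {"id": 1, "name": "Ada", "email": "ada@example.com", "status": "active"}},
    {"json": {"id": 3, "name": "Bob", "email": "bob@example.com", "status": "inactive"}},
    {"json": {"name": "no-id", "email": "x@example.com"}},  # filtered out: no id
]

# One expression replaces the build loop; the `if` clause skips items without an id
users_by_id = {
    item["json"]["id"]: {
        "name": item["json"].get("name"),
        "email": item["json"].get("email"),
        "status": item["json"].get("status"),
    }
    for item in all_items
    if item["json"].get("id")
}

print(users_by_id[1]["name"])  # Ada
```

Both forms produce the same dictionary; the comprehension is shorter, while the loop is easier to extend with extra logic per item.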
---
## Pattern 9: Top N Filtering
**Use case**: Get top items by score or value.
**Scenario**: Get top 10 products by sales.
### Implementation
```python
all_items = _input.all()

# Extract products with sales
products = []
for item in all_items:
    product = item["json"]
    products.append({
        "id": product.get("id"),
        "name": product.get("name"),
        "sales": product.get("sales", 0),
        "revenue": product.get("revenue", 0.0),
        "category": product.get("category")
    })

# Sort by sales descending
products.sort(key=lambda p: p["sales"], reverse=True)

# Get top 10
top_10 = products[:10]

return [
    {
        "json": {
            **product,
            "rank": index + 1
        }
    }
    for index, product in enumerate(top_10)
]
```
### Key Techniques
- List sorting with custom key
- Slicing for top N
- Add ranking information
- Enumerate for index
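For large inputs, the standard library's `heapq.nlargest` returns the top N without sorting the whole list (O(n log k) instead of O(n log n)). A sketch with illustrative sample data:

```python
import heapq

products = [
    {"name": "A", "sales": 40},
    {"name": "B", "sales": 95},
    {"name": "C", "sales": 60},
    {"name": "D", "sales": 15},
]

# Top 2 by sales, highest first, without a full sort
top_2 = heapq.nlargest(2, products, key=lambda p: p["sales"])
print([p["name"] for p in top_2])  # ['B', 'C']
```

Since `heapq` is part of the standard library, it is available in n8n Code nodes; for small lists the sort-and-slice approach above is just as good and arguably clearer.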
---
## Pattern 10: String Aggregation
**Use case**: Aggregate strings with formatting.
**Scenario**: Create summary text from multiple items.
### Implementation
```python
all_items = _input.all()

# Collect messages
messages = []
for item in all_items:
    data = item["json"]
    user = data.get("user", "Unknown")
    message = data.get("message", "")
    timestamp = data.get("timestamp", "")

    # Format each message
    formatted = f"[{timestamp}] {user}: {message}"
    messages.append(formatted)

# Join with newlines
summary = "\n".join(messages)

# Create statistics
total_length = sum(len(msg) for msg in messages)
average_length = total_length / len(messages) if messages else 0

return [{
    "json": {
        "summary": summary,
        "message_count": len(messages),
        "total_characters": total_length,
        "average_length": round(average_length, 2)
    }
}]
```
### Key Techniques
- String formatting with f-strings
- Join lists with separator
- Calculate string statistics
- Handle empty lists
---
## Pattern Comparison: Python vs JavaScript
### Data Access
```python
# Python
all_items = _input.all()
first_item = _input.first()
current = _input.item
webhook_data = _json["body"]
# JavaScript
const allItems = $input.all();
const firstItem = $input.first();
const current = $input.item;
const webhookData = $json.body;
```
### Dictionary/Object Access
```python
# Python - Dictionary key access
name = user["name"]           # May raise KeyError
name = user.get("name", "?")  # Safe with default

# JavaScript - Object property access
const name = user.name;            // May be undefined
const safeName = user.name || "?"; // Safe with default (const can't be redeclared)
```
### Array Operations
```python
# Python - List comprehension
filtered = [item for item in items if item["active"]]
# JavaScript - Array methods
const filtered = items.filter(item => item.active);
```
### Sorting
```python
# Python
items.sort(key=lambda x: x["score"], reverse=True)
# JavaScript
items.sort((a, b) => b.score - a.score);
```
---
## Best Practices
### 1. Use .get() for Safe Access
```python
# ✅ SAFE: Use .get() with defaults
name = user.get("name", "Unknown")
email = user.get("email", "no-email@example.com")
# ❌ RISKY: Direct key access
name = user["name"] # KeyError if missing!
```
### 2. Handle Empty Lists
```python
# ✅ SAFE: Check before processing
items = _input.all()
if items:
    first = items[0]
else:
    return [{"json": {"error": "No items"}}]

# ❌ RISKY: Assume items exist
first = items[0]  # IndexError if empty!
```
### 3. Use List Comprehensions
```python
# ✅ PYTHONIC: List comprehension
active = [item for item in items if item["json"].get("active")]
# ❌ VERBOSE: Traditional loop
active = []
for item in items:
    if item["json"].get("active"):
        active.append(item)
```
### 4. Return Proper Format
```python
# ✅ CORRECT: Array of objects with "json" key
return [{"json": {"field": "value"}}]
# ❌ WRONG: Just the data
return {"field": "value"}
# ❌ WRONG: Array without "json" wrapper
return [{"field": "value"}]
```
### 5. Use Standard Library
```python
# ✅ GOOD: Use standard library
import statistics
average = statistics.mean(numbers)
# ✅ ALSO GOOD: Built-in functions
average = sum(numbers) / len(numbers) if numbers else 0
# ❌ CAN'T DO: External libraries
import numpy as np # ModuleNotFoundError!
```
---
## When to Use Each Pattern
| Pattern | When to Use |
|---------|-------------|
| Multi-Source Aggregation | Combining data from different nodes/sources |
| Regex Filtering | Text pattern matching, validation, extraction |
| Markdown Parsing | Processing formatted text into structured data |
| JSON Comparison | Detecting changes between objects |
| CRM Transformation | Normalizing data from different systems |
| Release Notes | Categorizing text by keywords |
| Array Transformation | Reshaping data, extracting fields |
| Dictionary Lookup | Fast ID-based lookups |
| Top N Filtering | Getting best/worst items by criteria |
| String Aggregation | Creating formatted text summaries |
---
## Summary
**Key Takeaways**:
- Use `.get()` for safe dictionary access
- List comprehensions are pythonic and efficient
- Handle empty lists/None values
- Use standard library (json, datetime, re)
- Return proper n8n format: `[{"json": {...}}]`
**Remember**:
- JavaScript is recommended for 95% of use cases
- Python has NO external libraries
- Use n8n nodes for complex operations
- Code node is for data transformation, not API calls
**See Also**:
- [SKILL.md](SKILL.md) - Python Code overview
- [DATA_ACCESS.md](DATA_ACCESS.md) - Data access patterns
- [STANDARD_LIBRARY.md](STANDARD_LIBRARY.md) - Available modules
- [ERROR_PATTERNS.md](ERROR_PATTERNS.md) - Avoid common mistakes