Common Patterns - Python Code Node

Production-tested Python patterns for n8n Code nodes.


⚠️ Important: JavaScript First

Use JavaScript for 95% of use cases.

Python in n8n has NO external libraries (no requests, pandas, numpy).

Only use Python when:

  • You have complex Python-specific logic
  • You need Python's standard library features
  • You're more comfortable with Python than JavaScript

For most workflows, JavaScript is the better choice.
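
If you do land in that remaining 5%, the standard library is fully usable. A quick sketch of stdlib-only tools that cover most transformation work (module availability can vary with the n8n/Pyodide version, so verify before relying on it):

import json                                    # parse/serialize JSON strings
import re                                      # pattern matching
import statistics                              # mean, median, stdev
from datetime import datetime, timezone        # timestamps
from collections import Counter, defaultdict   # counting, grouping

word_counts = Counter(["bug", "feature", "bug"])   # Counter({'bug': 2, 'feature': 1})
now = datetime.now(timezone.utc).isoformat()

return [{"json": {"counts": dict(word_counts), "generated_at": now}}]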


Pattern Overview

These 10 patterns cover common n8n Code node scenarios using Python:

  1. Multi-Source Data Aggregation - Combine data from multiple nodes
  2. Regex-Based Filtering - Filter items using pattern matching
  3. Markdown to Structured Data - Parse markdown into structured format
  4. JSON Object Comparison - Compare two JSON objects for changes
  5. CRM Data Transformation - Transform CRM data to standard format
  6. Release Notes Processing - Parse and categorize release notes
  7. Array Transformation - Reshape arrays and extract fields
  8. Dictionary Lookup - Create and use lookup dictionaries
  9. Top N Filtering - Get top items by score/value
  10. String Aggregation - Aggregate strings with formatting

Pattern 1: Multi-Source Data Aggregation

Use case: Combine data from multiple sources (APIs, webhooks, databases).

Scenario: Aggregate news articles from multiple sources.

Implementation

from datetime import datetime

all_items = _input.all()
processed_articles = []

for item in all_items:
    source_name = item["json"].get("name", "Unknown")
    source_data = item["json"]

    # Process Hacker News source
    if source_name == "Hacker News" and source_data.get("hits"):
        for hit in source_data["hits"]:
            processed_articles.append({
                "title": hit.get("title", "No title"),
                "url": hit.get("url", ""),
                "summary": hit.get("story_text") or "No summary",
                "source": "Hacker News",
                "score": hit.get("points", 0),
                "fetched_at": datetime.now().isoformat()
            })

    # Process Reddit source
    elif source_name == "Reddit" and source_data.get("data"):
        for post in source_data["data"].get("children", []):
            post_data = post.get("data", {})
            processed_articles.append({
                "title": post_data.get("title", "No title"),
                "url": post_data.get("url", ""),
                "summary": post_data.get("selftext", "")[:200],
                "source": "Reddit",
                "score": post_data.get("score", 0),
                "fetched_at": datetime.now().isoformat()
            })

# Sort by score descending
processed_articles.sort(key=lambda x: x["score"], reverse=True)

# Return as n8n items
return [{"json": article} for article in processed_articles]

Key Techniques

  • Process multiple data sources in one loop
  • Normalize different data structures
  • Use datetime for timestamps (see the note below)
  • Sort by criteria
  • Return properly formatted items
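
datetime.now() above produces a naive local timestamp. If downstream nodes compare fetched_at values across sources or servers, a timezone-aware UTC timestamp is safer (a small sketch):

from datetime import datetime, timezone

# e.g. "2025-01-01T12:00:00+00:00" -- unambiguous across servers
fetched_at = datetime.now(timezone.utc).isoformat()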

Pattern 2: Regex-Based Filtering

Use case: Filter items based on pattern matching in text fields.

Scenario: Filter support tickets by priority keywords.

Implementation

import re

all_items = _input.all()
priority_tickets = []

# High priority keywords pattern
high_priority_pattern = re.compile(
    r'\b(urgent|critical|emergency|asap|down|outage|broken)\b',
    re.IGNORECASE
)

for item in all_items:
    ticket = item["json"]

    # Check subject and description
    subject = ticket.get("subject", "")
    description = ticket.get("description", "")
    combined_text = f"{subject} {description}"

    # Find matches
    matches = high_priority_pattern.findall(combined_text)

    if matches:
        priority_tickets.append({
            "json": {
                **ticket,
                "priority": "high",
                "matched_keywords": list(set(matches)),
                "keyword_count": len(matches)
            }
        })
    else:
        priority_tickets.append({
            "json": {
                **ticket,
                "priority": "normal",
                "matched_keywords": [],
                "keyword_count": 0
            }
        })

# Sort by keyword count (most urgent first)
priority_tickets.sort(key=lambda x: x["json"]["keyword_count"], reverse=True)

return priority_tickets

Key Techniques

  • Use re.compile() for reusable patterns
  • re.IGNORECASE for case-insensitive matching
  • Combine multiple text fields for searching
  • Extract and deduplicate matches
  • Sort by priority indicators
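
To sanity-check the compiled pattern before wiring it into a workflow, run it against made-up ticket text (the sample string is hypothetical):

import re

high_priority_pattern = re.compile(
    r'\b(urgent|critical|emergency|asap|down|outage|broken)\b',
    re.IGNORECASE
)

sample = "URGENT: checkout is down again"
matches = high_priority_pattern.findall(sample)
# findall returns the captured group with original casing: ['URGENT', 'down']
print(sorted(set(m.lower() for m in matches)))  # ['down', 'urgent']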

Pattern 3: Markdown to Structured Data

Use case: Parse markdown text into structured data.

Scenario: Extract tasks from markdown checklist.

Implementation

import re

markdown_text = _input.first()["json"].get("body", {}).get("markdown", "")

# Parse markdown checklist
tasks = []
lines = markdown_text.split("\n")

for line in lines:
    # Match: - [ ] Task or - [x] Task
    match = re.match(r'^\s*-\s*\[([ x])\]\s*(.+)$', line, re.IGNORECASE)

    if match:
        checked = match.group(1).lower() == 'x'
        task_text = match.group(2).strip()

        # Extract priority if present (e.g., [P1], [HIGH])
        priority_match = re.search(r'\[(P\d|HIGH|MEDIUM|LOW)\]', task_text, re.IGNORECASE)
        priority = priority_match.group(1).upper() if priority_match else "NORMAL"

        # Remove priority tag from text
        clean_text = re.sub(r'\[(P\d|HIGH|MEDIUM|LOW)\]', '', task_text, flags=re.IGNORECASE).strip()

        tasks.append({
            "text": clean_text,
            "completed": checked,
            "priority": priority,
            "original_line": line.strip()
        })

return [{
    "json": {
        "tasks": tasks,
        "total": len(tasks),
        "completed": sum(1 for t in tasks if t["completed"]),
        "pending": sum(1 for t in tasks if not t["completed"])
    }
}]

Key Techniques

  • Line-by-line parsing
  • Multiple regex patterns for extraction
  • Extract metadata from text
  • Calculate summary statistics
  • Return structured data
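
A hypothetical checklist line and what the two regexes extract from it:

line = "- [x] [P1] Fix login redirect"
# re.match(r'^\s*-\s*\[([ x])\]\s*(.+)$', line) captures:
#   group(1) == "x"                       -> completed = True
#   group(2) == "[P1] Fix login redirect"
# re.search then finds the "[P1]" tag     -> priority = "P1"
# re.sub strips the tag                   -> clean_text = "Fix login redirect"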

Pattern 4: JSON Object Comparison

Use case: Compare two JSON objects to find differences.

Scenario: Compare old and new user profile data.

Implementation

import json

all_items = _input.all()

# Assume first item is old data, second is new data
old_data = all_items[0]["json"] if len(all_items) > 0 else {}
new_data = all_items[1]["json"] if len(all_items) > 1 else {}

changes = {
    "added": {},
    "removed": {},
    "modified": {},
    "unchanged": {}
}

# Find all unique keys
all_keys = set(old_data.keys()) | set(new_data.keys())

for key in all_keys:
    old_value = old_data.get(key)
    new_value = new_data.get(key)

    if key not in old_data:
        # Added field
        changes["added"][key] = new_value
    elif key not in new_data:
        # Removed field
        changes["removed"][key] = old_value
    elif old_value != new_value:
        # Modified field
        changes["modified"][key] = {
            "old": old_value,
            "new": new_value
        }
    else:
        # Unchanged field
        changes["unchanged"][key] = old_value

return [{
    "json": {
        "changes": changes,
        "summary": {
            "added_count": len(changes["added"]),
            "removed_count": len(changes["removed"]),
            "modified_count": len(changes["modified"]),
            "unchanged_count": len(changes["unchanged"]),
            "has_changes": len(changes["added"]) > 0 or len(changes["removed"]) > 0 or len(changes["modified"]) > 0
        }
    }
}]

Key Techniques

  • Set operations for key comparison
  • Dictionary .get() for safe access
  • Categorize changes by type
  • Create summary statistics
  • Return detailed comparison
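
Note that nested objects are compared by whole-value equality: a change anywhere inside, say, old_data["address"] reports the entire dict as modified. If per-field detail matters, a minimal recursive sketch (the dotted-path output format is an assumption):

def diff_nested(old, new, path=""):
    # Recurse into dicts so nested changes report a dotted path like "address.city"
    out = {}
    for key in set(old) | set(new):
        dotted = f"{path}.{key}" if path else key
        old_value, new_value = old.get(key), new.get(key)
        if isinstance(old_value, dict) and isinstance(new_value, dict):
            out.update(diff_nested(old_value, new_value, dotted))
        elif old_value != new_value:
            out[dotted] = {"old": old_value, "new": new_value}
    return out

nested_changes = diff_nested(old_data, new_data)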

Pattern 5: CRM Data Transformation

Use case: Transform CRM data to standard format.

Scenario: Normalize data from different CRM systems.

Implementation

from datetime import datetime
import re

all_items = _input.all()
normalized_contacts = []

for item in all_items:
    raw_contact = item["json"]
    source = raw_contact.get("source", "unknown")

    # Normalize email (the `or` guards against an explicit None value)
    email = (raw_contact.get("email") or "").lower().strip()

    # Normalize phone: keep digits only (str() also handles numeric values)
    phone = re.sub(r'\D', '', str(raw_contact.get("phone") or ""))

    # Parse name
    if "full_name" in raw_contact:
        name_parts = raw_contact["full_name"].split(" ", 1)
        first_name = name_parts[0] if len(name_parts) > 0 else ""
        last_name = name_parts[1] if len(name_parts) > 1 else ""
    else:
        first_name = raw_contact.get("first_name", "")
        last_name = raw_contact.get("last_name", "")

    # Normalize status (str() handles booleans and numbers, e.g. True -> "true")
    status_raw = str(raw_contact.get("status") or "").lower()
    status = "active" if status_raw in ["active", "enabled", "true", "1"] else "inactive"

    # Create normalized contact
    normalized_contacts.append({
        "json": {
            "id": raw_contact.get("id", ""),
            "first_name": first_name.strip(),
            "last_name": last_name.strip(),
            "full_name": f"{first_name} {last_name}".strip(),
            "email": email,
            "phone": phone,
            "status": status,
            "source": source,
            "normalized_at": datetime.now().isoformat(),
            "original_data": raw_contact
        }
    })

return normalized_contacts

Key Techniques

  • Handling multiple field-name variations (see the sketch below)
  • String cleaning and normalization
  • Regex for phone number cleaning
  • Name parsing logic
  • Status normalization
  • Preserve original data
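
Different CRMs often use different key names for the same field. A small helper that takes the first non-empty candidate (the candidate names below are assumptions; adjust them to your actual sources):

def first_value(record, *candidates, default=""):
    # Return the first truthy value among several possible key names
    for key in candidates:
        value = record.get(key)
        if value:
            return value
    return default

email = first_value(raw_contact, "email", "email_address", "Email")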

Pattern 6: Release Notes Processing

Use case: Parse release notes and categorize changes.

Scenario: Extract features, fixes, and breaking changes from release notes.

Implementation

import re

release_notes = _input.first()["json"].get("body", {}).get("notes", "")

categories = {
    "features": [],
    "fixes": [],
    "breaking": [],
    "other": []
}

# Split into lines
lines = release_notes.split("\n")

for line in lines:
    line = line.strip()

    # Skip empty lines and headers
    if not line or line.startswith("#"):
        continue

    # Remove bullet points
    clean_line = re.sub(r'^[\*\-\+]\s*', '', line)

    # Categorize
    if re.search(r'\b(feature|add|new)\b', clean_line, re.IGNORECASE):
        categories["features"].append(clean_line)
    elif re.search(r'\b(fix|bug|patch|resolve)\b', clean_line, re.IGNORECASE):
        categories["fixes"].append(clean_line)
    elif re.search(r'\b(breaking|deprecated|remove)\b', clean_line, re.IGNORECASE):
        categories["breaking"].append(clean_line)
    else:
        categories["other"].append(clean_line)

return [{
    "json": {
        "categories": categories,
        "summary": {
            "features": len(categories["features"]),
            "fixes": len(categories["fixes"]),
            "breaking": len(categories["breaking"]),
            "other": len(categories["other"]),
            "total": sum(len(v) for v in categories.values())
        }
    }
}]

Key Techniques

  • Line-by-line parsing
  • Pattern-based categorization
  • Bullet point removal
  • Skip headers and empty lines
  • Summary statistics
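
One caveat: the elif chain is order-dependent. A line like "Add option to remove the legacy API" lands in features because the features check runs first, even though it contains "remove". If breaking changes should always win, test them first (same patterns, reordered):

if re.search(r'\b(breaking|deprecated|remove)\b', clean_line, re.IGNORECASE):
    categories["breaking"].append(clean_line)
elif re.search(r'\b(feature|add|new)\b', clean_line, re.IGNORECASE):
    categories["features"].append(clean_line)
elif re.search(r'\b(fix|bug|patch|resolve)\b', clean_line, re.IGNORECASE):
    categories["fixes"].append(clean_line)
else:
    categories["other"].append(clean_line)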

Pattern 7: Array Transformation

Use case: Reshape arrays and extract specific fields.

Scenario: Transform user data array to extract specific fields.

Implementation

all_items = _input.all()

# Extract and transform
transformed = []

for item in all_items:
    user = item["json"]

    # Extract nested fields
    profile = user.get("profile", {})
    settings = user.get("settings", {})

    transformed.append({
        "json": {
            "user_id": user.get("id"),
            "email": user.get("email"),
            "name": profile.get("name", "Unknown"),
            "avatar": profile.get("avatar_url"),
            "bio": profile.get("bio", "")[:100],  # Truncate to 100 chars
            "notifications_enabled": settings.get("notifications", True),
            "theme": settings.get("theme", "light"),
            "created_at": user.get("created_at"),
            "last_login": user.get("last_login_at")
        }
    })

return transformed

Key Techniques

  • Field extraction from nested objects
  • Default values with .get()
  • String truncation
  • Flattening nested structures (see the sketch below)
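
Here the nesting is only one level deep, so chained .get() calls are enough. For arbitrarily nested input, a small recursive flatten helper (a sketch; the underscore separator is an arbitrary choice):

def flatten(obj, parent="", sep="_"):
    # Turn {"profile": {"name": "Ada"}} into {"profile_name": "Ada"}
    flat = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat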

Pattern 8: Dictionary Lookup

Use case: Create lookup dictionary for fast data access.

Scenario: Look up user details by ID.

Implementation

all_items = _input.all()

# Build lookup dictionary
users_by_id = {}

for item in all_items:
    user = item["json"]
    user_id = user.get("id")

    if user_id:
        users_by_id[user_id] = {
            "name": user.get("name"),
            "email": user.get("email"),
            "status": user.get("status")
        }

# Example: Look up specific users
lookup_ids = [1, 3, 5]
looked_up = []

for user_id in lookup_ids:
    if user_id in users_by_id:
        looked_up.append({
            "json": {
                "id": user_id,
                **users_by_id[user_id],
                "found": True
            }
        })
    else:
        looked_up.append({
            "json": {
                "id": user_id,
                "found": False
            }
        })

return looked_up

Key Techniques

  • Build the lookup with a loop or a dict comprehension (see below)
  • O(1) lookup time
  • Handle missing keys gracefully
  • Preserve lookup order
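
The same lookup table as a dictionary comprehension, equivalent to the loop above:

users_by_id = {
    item["json"]["id"]: {
        "name": item["json"].get("name"),
        "email": item["json"].get("email"),
        "status": item["json"].get("status"),
    }
    for item in all_items
    if item["json"].get("id")
}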

Pattern 9: Top N Filtering

Use case: Get top items by score or value.

Scenario: Get top 10 products by sales.

Implementation

all_items = _input.all()

# Extract products with sales
products = []

for item in all_items:
    product = item["json"]
    products.append({
        "id": product.get("id"),
        "name": product.get("name"),
        "sales": product.get("sales", 0),
        "revenue": product.get("revenue", 0.0),
        "category": product.get("category")
    })

# Sort by sales descending
products.sort(key=lambda p: p["sales"], reverse=True)

# Get top 10
top_10 = products[:10]

return [
    {
        "json": {
            **product,
            "rank": index + 1
        }
    }
    for index, product in enumerate(top_10)
]

Key Techniques

  • List sorting with custom key
  • Slicing for top N
  • Add ranking information
  • Enumerate for index
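
When the input is large and you only need a few items, heapq.nlargest avoids sorting the whole list (standard library, so it works in the Code node):

import heapq

# Builds a small heap instead of sorting all of products
top_10 = heapq.nlargest(10, products, key=lambda p: p["sales"])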

Pattern 10: String Aggregation

Use case: Aggregate strings with formatting.

Scenario: Create summary text from multiple items.

Implementation

all_items = _input.all()

# Collect messages
messages = []

for item in all_items:
    data = item["json"]

    user = data.get("user", "Unknown")
    message = data.get("message", "")
    timestamp = data.get("timestamp", "")

    # Format each message
    formatted = f"[{timestamp}] {user}: {message}"
    messages.append(formatted)

# Join with newlines
summary = "\n".join(messages)

# Create statistics
total_length = sum(len(msg) for msg in messages)
average_length = total_length / len(messages) if messages else 0

return [{
    "json": {
        "summary": summary,
        "message_count": len(messages),
        "total_characters": total_length,
        "average_length": round(average_length, 2)
    }
}]

Key Techniques

  • String formatting with f-strings
  • Join lists with separator
  • Calculate string statistics
  • Handle empty lists

Pattern Comparison: Python vs JavaScript

Data Access

# Python
all_items = _input.all()
first_item = _input.first()
current = _input.item
webhook_data = _json["body"]

# JavaScript
const allItems = $input.all();
const firstItem = $input.first();
const current = $input.item;
const webhookData = $json.body;

Dictionary/Object Access

# Python - Dictionary key access
name = user["name"]           # May raise KeyError
name = user.get("name", "?")  # Safe with default

# JavaScript - Object property access
const name = user.name;              // May be undefined
const name = user.name || "?";       // Safe with default

Array Operations

# Python - List comprehension
filtered = [item for item in items if item["active"]]

# JavaScript - Array methods
const filtered = items.filter(item => item.active);

Sorting

# Python
items.sort(key=lambda x: x["score"], reverse=True)

# JavaScript
items.sort((a, b) => b.score - a.score);
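
Python can also break ties by sorting on a tuple key (a sketch): negate a numeric field for descending order, then fall back to a second field ascending.

# Python - multi-key sort: descending score, then ascending name for ties
items.sort(key=lambda x: (-x["score"], x["name"]))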

Best Practices

1. Use .get() for Safe Access

# ✅ SAFE: Use .get() with defaults
name = user.get("name", "Unknown")
email = user.get("email", "no-email@example.com")

# ❌ RISKY: Direct key access
name = user["name"]  # KeyError if missing!

2. Handle Empty Lists

# ✅ SAFE: Check before processing
items = _input.all()
if items:
    first = items[0]
else:
    return [{"json": {"error": "No items"}}]

# ❌ RISKY: Assume items exist
first = items[0]  # IndexError if empty!

3. Use List Comprehensions

# ✅ PYTHONIC: List comprehension
active = [item for item in items if item["json"].get("active")]

# ❌ VERBOSE: Traditional loop
active = []
for item in items:
    if item["json"].get("active"):
        active.append(item)

4. Return Proper Format

# ✅ CORRECT: Array of objects with "json" key
return [{"json": {"field": "value"}}]

# ❌ WRONG: Just the data
return {"field": "value"}

# ❌ WRONG: Array without "json" wrapper
return [{"field": "value"}]

5. Use Standard Library

# ✅ GOOD: Use standard library
import statistics
average = statistics.mean(numbers)

# ✅ ALSO GOOD: Built-in functions
average = sum(numbers) / len(numbers) if numbers else 0

# ❌ CAN'T DO: External libraries
import numpy as np  # ModuleNotFoundError!

When to Use Each Pattern

  • Multi-Source Aggregation: combining data from different nodes/sources
  • Regex Filtering: text pattern matching, validation, extraction
  • Markdown Parsing: processing formatted text into structured data
  • JSON Comparison: detecting changes between objects
  • CRM Transformation: normalizing data from different systems
  • Release Notes: categorizing text by keywords
  • Array Transformation: reshaping data, extracting fields
  • Dictionary Lookup: fast ID-based lookups
  • Top N Filtering: getting best/worst items by criteria
  • String Aggregation: creating formatted text summaries

Summary

Key Takeaways:

  • Use .get() for safe dictionary access
  • List comprehensions are pythonic and efficient
  • Handle empty lists/None values
  • Use standard library (json, datetime, re)
  • Return proper n8n format: [{"json": {...}}]

Remember:

  • JavaScript is recommended for 95% of use cases
  • Python has NO external libraries
  • Use n8n nodes for complex operations
  • Code node is for data transformation, not API calls

See Also: