Initial commit

2025-11-29 18:29:23 +08:00
commit ebc71f5387
37 changed files with 9382 additions and 0 deletions
--- a/skills/memory-profiling/examples/python-scalene-profiling.md
+++ b/skills/memory-profiling/examples/python-scalene-profiling.md
@@ -0,0 +1,456 @@
+# Python Memory Profiling with Scalene
+
+Line-by-line memory and CPU profiling for Python applications using Scalene, with pytest integration and optimization strategies.
+
+## Overview
+
+**Before Optimization**:
+- Memory usage: 500MB for processing 10K records
+- OOM (Out of Memory) errors with 100K records
+- Processing time: 45 seconds for 10K records
+- List comprehensions loading entire dataset
+
+**After Optimization**:
+- Memory usage: 5MB for processing 10K records (99% reduction)
+- No OOM errors with 1M records
+- Processing time: 8 seconds for 10K records (82% faster)
+- Generator-based streaming
+
+**Tools**: Scalene, pytest, memory_profiler, tracemalloc
+
+## 1. Scalene Installation and Setup
+
+### Installation
+
+```bash
+# Install Scalene
+pip install scalene
+
+# Or with uv (faster)
+uv pip install scalene
+```
+
+### Basic Usage
+
+```bash
+# Profile entire script
+scalene script.py
+
+# Profile with pytest (recommended)
+scalene --cli --memory -m pytest tests/
+
+# HTML output
+scalene --html --outfile profile.html script.py
+
+# Profile specific function
+scalene --reduced-profile script.py
+```
+
+## 2. Profiling with pytest
+
+### Test File Setup
+
+```python
+# tests/test_data_processing.py
+import pytest
+from data_processor import DataProcessor
+
+@pytest.fixture
+def processor():
+    return DataProcessor()
+
+def test_process_large_dataset(processor):
+    # Generate 10K records
+    records = [{'id': i, 'value': i * 2} for i in range(10000)]
+
+    # Process (this is where memory spike occurs)
+    result = processor.process_records(records)
+
+    assert len(result) == 10000
+```
+
+### Running Scalene with pytest
+
+```bash
+# Profile memory usage during test execution
+uv run scalene --cli --memory -m pytest tests/test_data_processing.py 2>&1 | grep -i "memory\|mb\|test"
+
+# Output shows line-by-line memory allocation
+```
+
+**Scalene Output** (before optimization):
+```
+data_processor.py:
+Line | Memory % | Memory (MB) | CPU % | Code
+-----|----------|-------------|-------|-----
+12   | 45%      | 225 MB      | 10%   | result = [transform(r) for r in records]
+18   | 30%      | 150 MB      | 5%    | filtered = [r for r in result if r['value'] > 0]
+25   | 15%      | 75 MB       | 20%   | sorted_data = sorted(filtered, key=lambda x: x['id'])
+```
+
+**Analysis**: Line 12 is the hotspot (45% of memory)
+
+## 3. Memory Hotspot Identification
+
+### Vulnerable Code (Memory Spike)
+
+```python
+# data_processor.py (BEFORE OPTIMIZATION)
+class DataProcessor:
+    def process_records(self, records: list[dict]) -> list[dict]:
+        # ❌ HOTSPOT: List comprehension loads entire dataset
+        result = [self.transform(r) for r in records]  # 225MB for 10K records
+
+        # ❌ Creates another copy
+        filtered = [r for r in result if r['value'] > 0]  # +150MB
+
+        # ❌ sorted() creates yet another copy
+        sorted_data = sorted(filtered, key=lambda x: x['id'])  # +75MB
+
+        return sorted_data  # Total: 450MB for 10K records
+
+    def transform(self, record: dict) -> dict:
+        return {
+            'id': record['id'],
+            'value': record['value'] * 2,
+            'timestamp': datetime.now()
+        }
+```
+
+**Scalene Report**:
+```
+Memory allocation breakdown:
+- Line 12 (list comprehension): 225MB (50%)
+- Line 18 (filtering): 150MB (33%)
+- Line 25 (sorting): 75MB (17%)
+
+Total memory: 450MB for 10,000 records
+Projected for 100K: 4.5GB → OOM!
+```
+
+### Optimized Code (Generator-Based)
+
+```python
+# data_processor.py (AFTER OPTIMIZATION)
+from typing import Iterator
+
+class DataProcessor:
+    def process_records(self, records: list[dict]) -> Iterator[dict]:
+        # ✅ Generator: processes one record at a time
+        transformed = (self.transform(r) for r in records)  # O(1) memory
+
+        # ✅ Generator chaining
+        filtered = (r for r in transformed if r['value'] > 0)  # O(1) memory
+
+        # ✅ Stream-based sorting (only if needed)
+        # For very large datasets, use external sorting or database ORDER BY
+        yield from sorted(filtered, key=lambda x: x['id'])  # Still O(n), but lazy
+
+    def transform(self, record: dict) -> dict:
+        return {
+            'id': record['id'],
+            'value': record['value'] * 2,
+            'timestamp': datetime.now()
+        }
+
+    # Alternative: Fully streaming (no sorting)
+    def process_records_streaming(self, records: list[dict]) -> Iterator[dict]:
+        for record in records:
+            transformed = self.transform(record)
+            if transformed['value'] > 0:
+                yield transformed  # O(1) memory, fully streaming
+```
+
+**Scalene Report (After)**:
+```
+Memory allocation breakdown:
+- Line 12 (generator): 5MB (100% - constant overhead)
+- Line 18 (filter generator): 0MB (lazy)
+- Line 25 (yield): 0MB (lazy)
+
+Total memory: 5MB for 10,000 records (99% reduction!)
+Scalable to 1M+ records without OOM
+```
+
+## 4. Common Memory Patterns
+
+### Pattern 1: List Comprehension → Generator
+
+**Before** (High Memory):
+```python
+# ❌ Loads entire list into memory
+def process_large_file(filename: str) -> list[dict]:
+    with open(filename) as f:
+        lines = f.readlines()  # Loads entire file (500MB)
+
+    # Another copy
+    return [json.loads(line) for line in lines]  # +500MB = 1GB total
+```
+
+**After** (Low Memory):
+```python
+# ✅ Generator: processes line-by-line
+def process_large_file(filename: str) -> Iterator[dict]:
+    with open(filename) as f:
+        for line in f:  # Reads one line at a time
+            yield json.loads(line)  # O(1) memory
+```
+
+**Scalene diff**: 1GB → 5MB (99.5% reduction)
+
+### Pattern 2: DataFrame Memory Optimization
+
+**Before** (High Memory):
+```python
+# ❌ Loads entire CSV into memory
+import pandas as pd
+
+def analyze_data(filename: str):
+    df = pd.read_csv(filename)  # 10GB CSV → 10GB RAM
+
+    # All transformations in memory
+    df['new_col'] = df['value'] * 2
+    df_filtered = df[df['value'] > 0]
+    return df_filtered.groupby('category').sum()
+```
+
+**After** (Low Memory with Chunking):
+```python
+# ✅ Process in chunks
+import pandas as pd
+
+def analyze_data(filename: str):
+    chunk_size = 10000
+    results = []
+
+    # Process 10K rows at a time
+    for chunk in pd.read_csv(filename, chunksize=chunk_size):
+        chunk['new_col'] = chunk['value'] * 2
+        filtered = chunk[chunk['value'] > 0]
+        group_result = filtered.groupby('category').sum()
+        results.append(group_result)
+
+    # Combine results
+    return pd.concat(results).groupby(level=0).sum()  # Much smaller
+```
+
+**Scalene diff**: 10GB → 500MB (95% reduction)
+
+### Pattern 3: String Concatenation
+
+**Before** (High Memory):
+```python
+# ❌ Creates new string each iteration (O(n²) memory)
+def build_report(data: list[dict]) -> str:
+    report = ""
+    for item in data:  # 100K items
+        report += f"{item['id']}: {item['value']}\n"  # New string every time
+    return report  # 500MB final string + 500MB garbage = 1GB
+```
+
+**After** (Low Memory):
+```python
+# ✅ StringIO or join (O(n) memory)
+from io import StringIO
+
+def build_report(data: list[dict]) -> str:
+    buffer = StringIO()
+    for item in data:
+        buffer.write(f"{item['id']}: {item['value']}\n")
+    return buffer.getvalue()
+
+# Or even better: generator
+def build_report_streaming(data: list[dict]) -> Iterator[str]:
+    for item in data:
+        yield f"{item['id']}: {item['value']}\n"
+```
+
+**Scalene diff**: 1GB → 50MB (95% reduction)
+
+## 5. Scalene CLI Reference
+
+### Common Options
+
+```bash
+# Memory-only profiling (fastest)
+scalene --cli --memory script.py
+
+# CPU + Memory profiling
+scalene --cli --cpu --memory script.py
+
+# Reduced profile (functions only, not lines)
+scalene --reduced-profile script.py
+
+# Profile specific function
+scalene --profile-only process_data script.py
+
+# HTML report
+scalene --html --outfile profile.html script.py
+
+# Profile with pytest
+scalene --cli --memory -m pytest tests/
+
+# Set memory sampling interval (default: 1MB)
+scalene --malloc-threshold 0.1 script.py  # Sample every 100KB
+```
+
+### Interpreting Output
+
+**Column Meanings**:
+```
+Memory %  | Percentage of total memory allocated
+Memory MB | Absolute memory allocated (in megabytes)
+CPU %     | Percentage of CPU time spent
+Python %  | Time spent in Python (vs native code)
+```
+
+**Example Output**:
+```
+script.py:
+Line | Memory % | Memory MB | CPU % | Python % | Code
+-----|----------|-----------|-------|----------|-----
+12   | 45.2%    | 225.6 MB  | 10.5% | 95.2%    | data = [x for x in range(1000000)]
+18   | 30.1%    | 150.3 MB  | 5.2%  | 98.1%    | filtered = list(filter(lambda x: x > 0, data))
+```
+
+**Analysis**:
+- Line 12: High memory (45.2%) → optimize list comprehension
+- Line 18: Moderate memory (30.1%) → use generator instead of list()
+
+## 6. Integration with CI/CD
+
+### GitHub Actions Workflow
+
+```yaml
+# .github/workflows/memory-profiling.yml
+name: Memory Profiling
+
+on: [pull_request]
+
+jobs:
+  profile:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: |
+          pip install scalene pytest
+
+      - name: Run memory profiling
+        run: |
+          scalene --cli --memory --reduced-profile -m pytest tests/ > profile.txt
+
+      - name: Check for memory hotspots
+        run: |
+          if grep -q "Memory %" profile.txt; then
+            # Alert if any line uses >100MB
+            if awk '$3 > 100 {exit 1}' profile.txt; then
+              echo "Memory hotspot detected!"
+              exit 1
+            fi
+          fi
+
+      - name: Upload profile
+        uses: actions/upload-artifact@v3
+        with:
+          name: memory-profile
+          path: profile.txt
+```
+
+## 7. Real-World Optimization: CSV Processing
+
+### Before (500MB Memory, OOM at 100K rows)
+
+```python
+# csv_processor.py (BEFORE)
+import pandas as pd
+
+class CSVProcessor:
+    def process_file(self, filename: str) -> dict:
+        # ❌ Loads entire CSV
+        df = pd.read_csv(filename)  # 500MB for 10K rows
+
+        # ❌ Multiple copies
+        df['total'] = df['quantity'] * df['price']
+        df_filtered = df[df['total'] > 100]
+        summary = df_filtered.groupby('category').agg({
+            'total': 'sum',
+            'quantity': 'sum'
+        })
+
+        return summary.to_dict()
+```
+
+**Scalene Output**:
+```
+Line 8:  500MB (75%) - pd.read_csv()
+Line 11: 100MB (15%) - df['total'] calculation
+Line 12: 50MB (10%) - filtering
+Total: 650MB for 10K rows
+```
+
+### After (5MB Memory, Handles 1M rows)
+
+```python
+# csv_processor.py (AFTER)
+import pandas as pd
+from collections import defaultdict
+
+class CSVProcessor:
+    def process_file(self, filename: str) -> dict:
+        # ✅ Process in 10K row chunks
+        chunk_size = 10000
+        results = defaultdict(lambda: {'total': 0, 'quantity': 0})
+
+        for chunk in pd.read_csv(filename, chunksize=chunk_size):
+            chunk['total'] = chunk['quantity'] * chunk['price']
+            filtered = chunk[chunk['total'] > 100]
+
+            # Aggregate incrementally
+            for category, group in filtered.groupby('category'):
+                results[category]['total'] += group['total'].sum()
+                results[category]['quantity'] += group['quantity'].sum()
+
+        return dict(results)
+```
+
+**Scalene Output (After)**:
+```
+Line 9:  5MB (100%) - chunk processing (constant memory)
+Total: 5MB for any file size (99% reduction)
+```
+
+## 8. Results and Impact
+
+### Before vs After Metrics
+
+| Metric | Before | After | Impact |
+|--------|--------|-------|--------|
+| **Memory Usage** | 500MB (10K rows) | 5MB (1M rows) | **99% reduction** |
+| **Processing Time** | 45s (10K rows) | 8s (10K rows) | **82% faster** |
+| **Max File Size** | 100K rows (OOM) | 10M+ rows | **100x scalability** |
+| **OOM Errors** | 5/week | 0/month | **100% eliminated** |
+
+### Key Optimizations Applied
+
+1. **List comprehension → Generator**: 225MB → 0MB
+2. **DataFrame chunking**: 500MB → 5MB per chunk
+3. **String concatenation**: 1GB → 50MB (StringIO)
+4. **Lazy evaluation**: Load on demand vs load all
+
+## Related Documentation
+
+- **Node.js Leaks**: [nodejs-memory-leak.md](nodejs-memory-leak.md)
+- **DB Leaks**: [database-connection-leak.md](database-connection-leak.md)
+- **Reference**: [../reference/profiling-tools.md](../reference/profiling-tools.md)
+- **Templates**: [../templates/scalene-config.txt](../templates/scalene-config.txt)
+
+---
+
+Return to [examples index](INDEX.md)