---
name: performance-budget-checker
description: Detects performance anti-patterns like N+1 queries, nested loops, large file operations, and inefficient algorithms. Suggests fast fixes before issues reach production.
---

# Performance Budget Checker Skill

**Purpose**: Catch performance killers before they slow production.

**Trigger Words**: query, database, loop, for, map, filter, file, read, load, fetch, API, cache

---

## Quick Decision: Check Performance?

```python
def needs_perf_check(code_context: dict) -> bool:
    """Fast performance risk evaluation."""

    # Performance-critical patterns
    patterns = [
        "for ", "while ", "map(", "filter(",  # Loops
        "db.", "query", "select", "fetch",    # Database
        ".all()", ".filter(", ".find(",       # ORM queries
        "open(", "read", "readlines",         # File I/O
        "json.loads", "pickle.load",          # Deserialization
        "sorted(", "sort(",                   # Sorting
        "in list", "in array",                # Linear search
    ]

    code = code_context.get("code", "").lower()
    return any(p in code for p in patterns)
```

---

## Performance Anti-Patterns (Quick Fixes)

### 1. **N+1 Query Problem** (Most Common) ⚠️
```python
# ❌ BAD - 1 + N queries (slow!)
def get_users_with_posts():
    users = User.query.all()  # 1 query
    for user in users:
        user.posts = Post.query.filter_by(user_id=user.id).all()  # N queries!
    return users
# Performance: 101 queries for 100 users


# ✅ GOOD - 1 query with JOIN
def get_users_with_posts():
    users = User.query.options(joinedload(User.posts)).all()  # 1 query
    return users
# Performance: 1 query for 100 users


# Or batch fetch: 2 queries, grouped in Python
def get_users_with_posts():
    users = User.query.all()
    user_ids = [u.id for u in users]
    posts = Post.query.filter(Post.user_id.in_(user_ids)).all()
    posts_by_user = {}
    for post in posts:
        posts_by_user.setdefault(post.user_id, []).append(post)
    for user in users:
        user.posts = posts_by_user.get(user.id, [])
    return users
```

**Quick Fix**: Use `joinedload()`, `selectinload()`, or batch fetch.
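
For collections, `selectinload()` is often the better default because it avoids the row duplication a JOIN produces. A minimal sketch, assuming the same SQLAlchemy `User`/`Post` models as above:

```python
from sqlalchemy.orm import selectinload

def get_users_with_posts():
    # 2 queries total: one for users, one SELECT ... WHERE post.user_id IN (...)
    return User.query.options(selectinload(User.posts)).all()
```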

---

### 2. **Nested Loops** ⚠️
```python
# ❌ BAD - O(n²) complexity
def find_common_items(list1, list2):
    common = []
    for item1 in list1:      # O(n)
        for item2 in list2:  # O(n)
            if item1 == item2:
                common.append(item1)
    return common
# Performance: 1,000,000 comparisons for 1000 items each


# ✅ GOOD - O(n) with set
def find_common_items(list1, list2):
    return list(set(list1) & set(list2))
# Performance: ~2000 operations for 1000 items each
```

**Quick Fix**: Use set intersection, dict lookup, or hash map.
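
Note that `set()` discards duplicates and ordering. If those matter, building a lookup set keeps the O(n) behavior while preserving `list1`'s order; a minimal sketch:

```python
def find_common_items(list1, list2):
    lookup = set(list2)  # O(n) to build once
    # O(1) average membership test per item; keeps list1's order and duplicates
    return [item for item in list1 if item in lookup]
```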

---

### 3. **Inefficient Filtering** ⚠️
```python
# ❌ BAD - Fetch all, then filter in Python
def get_active_users():
    all_users = User.query.all()  # Fetch 10,000 users
    active = [u for u in all_users if u.is_active]  # Filter in memory
    return active
# Performance: 10,000 rows transferred, filtered in Python


# ✅ GOOD - Filter in database
def get_active_users():
    return User.query.filter_by(is_active=True).all()
# Performance: Only active users transferred
```

**Quick Fix**: Push filtering to database with WHERE clause.
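
The same principle applies to columns: fetch only what you need. A minimal SQLAlchemy sketch, assuming a hypothetical `email` column on the `User` model above:

```python
def get_active_user_emails():
    # Both the WHERE clause and the column projection run in the database
    return (User.query
            .filter_by(is_active=True)
            .with_entities(User.id, User.email)
            .all())
```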

---

### 4. **Large File Loading** ⚠️
```python
# ❌ BAD - Load entire file into memory
def process_large_file(filepath):
    with open(filepath) as f:
        data = f.read()  # 1GB file → 1GB memory!
        for line in data.split('\n'):
            process_line(line)


# ✅ GOOD - Stream line by line
def process_large_file(filepath):
    with open(filepath) as f:
        for line in f:  # Streaming, buffered a few KB at a time
            process_line(line.strip())
```

**Quick Fix**: Stream files instead of loading fully.
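
For binary or non-line-oriented files the same idea applies with fixed-size chunks; a minimal sketch (`process_chunk` and the 64 KB size are illustrative assumptions):

```python
def process_binary_file(filepath, chunk_size=64 * 1024):
    with open(filepath, "rb") as f:
        # iter() with a sentinel keeps yielding chunks until read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            process_chunk(chunk)
```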

---

### 5. **Missing Pagination** ⚠️
```python
# ❌ BAD - Return all 100,000 records
@app.route("/api/users")
def get_users():
    return User.query.all()  # 100,000 rows!


# ✅ GOOD - Paginate
@app.route("/api/users")
def get_users():
    page = request.args.get('page', 1, type=int)
    per_page = request.args.get('per_page', 50, type=int)
    return User.query.paginate(page=page, per_page=per_page)
```

**Quick Fix**: Add pagination to list endpoints.
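
Offset pagination degrades on deep pages because the database still walks past all the skipped rows. Keyset (cursor) pagination stays fast regardless of depth; a minimal sketch, assuming `User.id` is an indexed, monotonically increasing key:

```python
@app.route("/api/users/cursor")
def get_users_after():
    after_id = request.args.get('after_id', 0, type=int)
    return (User.query
            .filter(User.id > after_id)  # index seek past the cursor
            .order_by(User.id)
            .limit(50)
            .all())
```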

---

### 6. **No Caching** ⚠️
```python
# ❌ BAD - Recompute every time
def get_top_products():
    # Expensive computation every request
    products = Product.query.all()
    sorted_products = sorted(products, key=lambda p: p.sales, reverse=True)
    return sorted_products[:10]


# ✅ GOOD - Cache in 5-minute buckets
from functools import lru_cache
import time

def get_top_products_cached():
    # The bucket value changes every 300s, forcing a cache miss and recompute
    return _compute_top_products(int(time.time() // 300))

@lru_cache(maxsize=1)
def _compute_top_products(cache_bucket):
    products = Product.query.all()
    sorted_products = sorted(products, key=lambda p: p.sales, reverse=True)
    return sorted_products[:10]
```

**Quick Fix**: Add caching for expensive computations.
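
If a small dependency is acceptable, a TTL cache is less fragile than hand-rolled time buckets; a minimal sketch using the third-party `cachetools` package (assumed installed):

```python
from cachetools import TTLCache, cached

@cached(cache=TTLCache(maxsize=1, ttl=300))  # entry expires after 300 seconds
def get_top_products():
    products = Product.query.all()
    return sorted(products, key=lambda p: p.sales, reverse=True)[:10]
```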

---

### 7. **Linear Search in List** ⚠️
```python
# ❌ BAD - O(n) lookup
user_ids = list(range(1, 10001))  # list of 10,000 ids
if 9999 in user_ids:  # scans the list element by element
    pass


# ✅ GOOD - O(1) lookup
user_ids = set(range(1, 10001))  # set of the same ids
if 9999 in user_ids:  # single hash lookup on average
    pass
```

**Quick Fix**: Use set/dict for lookups instead of list.
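
When you need the matching object rather than a yes/no answer, build a dict index once and reuse it; a minimal sketch:

```python
# O(n) to build once; O(1) average per lookup afterwards
users_by_id = {user.id: user for user in users}

user = users_by_id.get(9999)  # returns None instead of raising if absent
```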

---

### 8. **Synchronous I/O in Loop** ⚠️
```python
# ❌ BAD - Sequential API calls (slow)
def fetch_user_data(user_ids):
    results = []
    for user_id in user_ids:  # 100 users
        data = requests.get(f"/api/users/{user_id}").json()  # 200ms each
        results.append(data)
    return results
# Performance: 100 × 200ms = 20 seconds!


# ✅ GOOD - Parallel requests
import asyncio
import aiohttp

async def fetch_user_data(user_ids):
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_one(session, uid) for uid in user_ids]
        results = await asyncio.gather(*tasks)
        return results

async def fetch_one(session, user_id):
    async with session.get(f"/api/users/{user_id}") as resp:
        return await resp.json()
# Performance: ~200ms total (parallel)
```

**Quick Fix**: Use async/await or threading for I/O-bound operations.
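
If the surrounding code is synchronous, a thread pool captures most of the same win without rewriting callers to async; a minimal sketch using the standard library (`max_workers=20` is an arbitrary illustration):

```python
from concurrent.futures import ThreadPoolExecutor

import requests

def fetch_user_data(user_ids):
    def fetch_one(user_id):
        return requests.get(f"/api/users/{user_id}").json()

    # Threads overlap while blocked on the network, so calls run concurrently
    with ThreadPoolExecutor(max_workers=20) as pool:
        return list(pool.map(fetch_one, user_ids))
```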

---

## Performance Budget Guidelines

| Operation | Acceptable | Warning | Critical |
|-----------|------------|---------|----------|
| API response time | <200ms | 200-500ms | >500ms |
| Database query | <50ms | 50-200ms | >200ms |
| List endpoint | <100 items | 100-1000 items | >1000 items |
| File operation | <1MB | 1-10MB | >10MB |
| Loop iterations | <1000 | 1000-10000 | >10000 |
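
One way to keep checks consistent is to encode the table as data; a minimal sketch (this dict layout is an illustration, not part of the skill's actual configuration):

```python
# (warning, critical) thresholds per metric, mirroring the table above
BUDGETS = {
    "api_response_ms": (200, 500),
    "db_query_ms": (50, 200),
    "list_items": (100, 1000),
    "file_mb": (1, 10),
    "loop_iterations": (1000, 10000),
}

def budget_status(metric: str, value: float) -> str:
    warning, critical = BUDGETS[metric]
    if value > critical:
        return "critical"
    return "warning" if value > warning else "acceptable"
```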

---

## Output Format

````markdown
## Performance Report

**Status**: [✅ WITHIN BUDGET | ⚠️ ISSUES FOUND]

---

### Performance Issues: 2

1. **[HIGH] N+1 Query in get_user_posts() (api.py:34)**
   - **Issue**: 1 + 100 queries (101 total)
   - **Impact**: ~500ms for 100 users
   - **Fix**:
     ```python
     # Change this:
     users = User.query.all()
     for user in users:
         user.posts = Post.query.filter_by(user_id=user.id).all()

     # To this:
     users = User.query.options(joinedload(User.posts)).all()
     ```
   - **Expected**: 500ms → 50ms (10x faster)

2. **[MEDIUM] No pagination on /api/products (routes.py:45)**
   - **Issue**: Returns all 5,000 products
   - **Impact**: 2MB response, slow load
   - **Fix**:
     ```python
     @app.route("/api/products")
     def get_products():
         page = request.args.get('page', 1, type=int)
         return Product.query.paginate(page=page, per_page=50)
     ```

---

### Optimizations Applied: 1

- ✅ Used set() for user_id lookup (utils.py:23) - O(1) instead of O(n)

---

**Next Steps**:
1. Fix N+1 query with joinedload (5 min fix)
2. Add pagination to /api/products (10 min)
3. Consider adding Redis cache for top products
````

---

## When to Skip Performance Checks

✅ Skip for:
- Prototypes/POCs
- Admin-only endpoints (low traffic)
- One-time scripts
- Small datasets (<100 items)

⚠️ Always check for:
- Public APIs
- User-facing endpoints
- High-traffic pages
- Data processing pipelines

---

## What This Skill Does NOT Do

❌ Run actual benchmarks (use profiling tools)
❌ Optimize algorithms (focus on anti-patterns)
❌ Check infrastructure (servers, CDN, etc.)
❌ Replace load testing

✅ **DOES**: Detect common performance anti-patterns with quick fixes.

---

## Configuration

```bash
# Strict mode: check all loops and queries
export LAZYDEV_PERF_STRICT=1

# Disable performance checks
export LAZYDEV_DISABLE_PERF_CHECKS=1

# Set custom thresholds
export LAZYDEV_PERF_MAX_QUERY_TIME=100  # ms
export LAZYDEV_PERF_MAX_LOOP_SIZE=5000
```
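
How a checker might consume these variables; a minimal sketch (the fallback defaults are assumptions, not documented values):

```python
import os

def perf_config() -> dict:
    return {
        "strict": os.environ.get("LAZYDEV_PERF_STRICT") == "1",
        "disabled": os.environ.get("LAZYDEV_DISABLE_PERF_CHECKS") == "1",
        "max_query_ms": int(os.environ.get("LAZYDEV_PERF_MAX_QUERY_TIME", "200")),
        "max_loop_size": int(os.environ.get("LAZYDEV_PERF_MAX_LOOP_SIZE", "10000")),
    }
```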

---

## Quick Reference: Common Fixes

| Anti-Pattern | Fix | Complexity |
|--------------|-----|------------|
| N+1 queries | `joinedload()` | O(n) queries → O(1) |
| Nested loops | Use set/dict | O(n²) → O(n) |
| Load full file | Stream lines | O(n) memory → O(1) |
| No pagination | `.paginate()` | O(n) → O(page_size) |
| Linear search | Use set | O(n) → O(1) |
| Sync I/O loop | async/await | O(n×t) → O(t) |

---

**Version**: 1.0.0
**Focus**: Database, loops, I/O, caching
**Speed**: <3 seconds per file