---
name: performance-profiling
description: Automatically applies when profiling Python performance. Ensures proper use of profiling tools, async profiling, benchmarking, memory analysis, and optimization strategies.
category: observability
---

# Performance Profiling Patterns

When profiling Python applications, follow these patterns for effective performance analysis.

**Trigger Keywords**: profiling, performance, benchmark, optimization, cProfile, line_profiler, memory_profiler, async profiling, bottleneck, CPU profiling

**Agent Integration**: Used by `performance-engineer`, `backend-architect`, `optimization-engineer`

## ✅ Correct Pattern: CPU Profiling

```python
import cProfile
import pstats
from pstats import SortKey
from typing import Callable, Dict, List, Optional
from functools import wraps


def profile_function(output_file: Optional[str] = None):
    """
    Decorator to profile function execution.

    Args:
        output_file: Optional file to save profile data
    """
    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(*args, **kwargs):
            profiler = cProfile.Profile()
            profiler.enable()

            try:
                result = func(*args, **kwargs)
                return result
            finally:
                profiler.disable()

                # Print stats
                stats = pstats.Stats(profiler)
                stats.sort_stats(SortKey.CUMULATIVE)
                stats.print_stats(20)  # Top 20 functions

                # Save to file
                if output_file:
                    stats.dump_stats(output_file)

        return wrapper
    return decorator


# Usage
@profile_function(output_file="profile.stats")
def process_data(data: List[Dict]):
    """Process data with profiling."""
    result = []
    for item in data:
        processed = expensive_operation(item)  # placeholder for real work
        result.append(processed)
    return result


# Analyze saved profile
def analyze_profile(stats_file: str):
    """Analyze saved profile data."""
    stats = pstats.Stats(stats_file)

    print("\n=== Top 20 by cumulative time ===")
    stats.sort_stats(SortKey.CUMULATIVE)
    stats.print_stats(20)

    print("\n=== Top 20 by internal time ===")
    stats.sort_stats(SortKey.TIME)
    stats.print_stats(20)

    print("\n=== Top 20 by call count ===")
    stats.sort_stats(SortKey.CALLS)
    stats.print_stats(20)
```
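
For one-off investigations you can skip the decorator and profile a single statement with `cProfile.run()`; a minimal sketch (the sample data and the `process_data` call are illustrative):

```python
import cProfile
import pstats
from pstats import SortKey

sample = [{"value": i} for i in range(100)]

# Profile one statement and dump the stats to disk
cProfile.run("process_data(sample)", "quick.stats")

# Inspect the saved stats
pstats.Stats("quick.stats").sort_stats(SortKey.CUMULATIVE).print_stats(10)
```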

## Line-by-Line Profiling

```python
# Requires: pip install line_profiler
from functools import wraps
from typing import Callable, List

from line_profiler import LineProfiler


def profile_lines(func: Callable) -> Callable:
    """
    Decorator for line-by-line profiling.

    Usage:
        @profile_lines
        def my_function():
            ...

        my_function()
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        profiler = LineProfiler()
        profiler.add_function(func)
        profiler.enable()

        try:
            result = func(*args, **kwargs)
            return result
        finally:
            profiler.disable()
            profiler.print_stats()

    return wrapper


# Usage
@profile_lines
def process_items(items: List[str]) -> List[str]:
    """Process items with line profiling."""
    result = []

    for item in items:
        # Each line's execution time is measured
        cleaned = item.strip().lower()
        if len(cleaned) > 5:
            processed = expensive_transform(cleaned)  # placeholder for real work
            result.append(processed)

    return result


# Alternative: Profile specific functions
def detailed_profile():
    """Profile multiple functions."""
    lp = LineProfiler()

    # Add functions to profile (placeholders for your own functions)
    lp.add_function(function1)
    lp.add_function(function2)
    lp.add_function(function3)

    # Run with profiling
    lp.enable()
    main_function()
    lp.disable()

    lp.print_stats()
```
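
As an alternative to the wrapper above, `line_profiler` also ships the `kernprof` runner (`kernprof -l -v script.py`), which makes a `profile` decorator available at run time; a sketch of that workflow, with an illustrative function:

```python
# script.py — run with: kernprof -l -v script.py
# When invoked with -l, kernprof injects `profile` as a builtin,
# so the decorator below needs no import.

@profile  # noqa: F821 — provided by kernprof at run time
def hot_loop(n: int) -> int:
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    hot_loop(1_000_000)
```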

## Memory Profiling

```python
from memory_profiler import profile as memory_profile
import tracemalloc


@memory_profile
def memory_intensive_function():
    """
    Profile memory usage.

    Requires: pip install memory-profiler
    """
    data = []

    for i in range(1000000):
        data.append(i * 2)

    return data


# Tracemalloc for detailed memory tracking
def track_memory_detailed():
    """Track memory allocations in detail."""
    tracemalloc.start()

    # Code to profile
    data = memory_intensive_function()

    # Get current memory usage
    current, peak = tracemalloc.get_traced_memory()
    print(f"Current memory: {current / 1024 / 1024:.2f} MB")
    print(f"Peak memory: {peak / 1024 / 1024:.2f} MB")

    # Get top memory allocations
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics('lineno')

    print("\n=== Top 10 memory allocations ===")
    for stat in top_stats[:10]:
        print(stat)

    tracemalloc.stop()


# Context manager for memory tracking
class MemoryTracker:
    """Context manager to track memory usage."""

    def __enter__(self):
        tracemalloc.start()
        self.start_memory = tracemalloc.get_traced_memory()[0]
        return self

    def __exit__(self, *args):
        current, peak = tracemalloc.get_traced_memory()
        self.memory_used = current - self.start_memory
        self.peak_memory = peak
        tracemalloc.stop()

        print(f"Memory used: {self.memory_used / 1024 / 1024:.2f} MB")
        print(f"Peak memory: {self.peak_memory / 1024 / 1024:.2f} MB")


# Usage
with MemoryTracker() as tracker:
    result = memory_intensive_function()

print(f"Total memory: {tracker.memory_used / 1024 / 1024:.2f} MB")
```
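
`tracemalloc` can also diff two snapshots, which is useful for pinning down where memory grows between two points in a run; a minimal sketch with a stand-in allocation:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

blocks = [bytes(1024) for _ in range(10_000)]  # stand-in allocation

after = tracemalloc.take_snapshot()

# Lines whose allocations grew the most between the two snapshots
for stat in after.compare_to(before, "lineno")[:5]:
    print(stat)

tracemalloc.stop()
```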

## Async Profiling

```python
import asyncio
import time
from functools import wraps
from typing import Coroutine, Optional


class AsyncProfiler:
    """Profile async function execution."""

    def __init__(self):
        self.timings = {}

    async def profile(self, name: str, coro: Coroutine):
        """
        Profile async coroutine.

        Args:
            name: Profile label
            coro: Coroutine to profile

        Returns:
            Coroutine result
        """
        start = time.perf_counter()

        try:
            result = await coro
            duration = time.perf_counter() - start

            self.timings[name] = {
                "duration": duration,
                "status": "success"
            }

            return result

        except Exception as e:
            duration = time.perf_counter() - start

            self.timings[name] = {
                "duration": duration,
                "status": "error",
                "error": str(e)
            }

            raise

    def print_stats(self):
        """Print profiling statistics."""
        print("\n=== Async Profile Stats ===")
        print(f"{'Function':<30} {'Duration':>10} {'Status':>10}")
        print("-" * 52)

        for name, stats in sorted(
            self.timings.items(),
            key=lambda x: x[1]["duration"],
            reverse=True
        ):
            duration = stats["duration"]
            status = stats["status"]
            print(f"{name:<30} {duration:>10.4f}s {status:>10}")


# Usage
async def main():
    profiler = AsyncProfiler()

    # Profile async operations (the fetch/process helpers are placeholders)
    users = await profiler.profile(
        "fetch_users",
        fetch_users_from_db()
    )

    orders = await profiler.profile(
        "fetch_orders",
        fetch_orders_from_api()
    )

    result = await profiler.profile(
        "process_data",
        process_data(users, orders)
    )

    profiler.print_stats()

    return result


# Decorator version
def profile_async(name: Optional[str] = None):
    """Decorator to profile async functions."""
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            func_name = name or func.__name__
            start = time.perf_counter()

            try:
                result = await func(*args, **kwargs)
                duration = time.perf_counter() - start

                print(f"{func_name}: {duration:.4f}s")

                return result

            except Exception:
                duration = time.perf_counter() - start
                print(f"{func_name}: {duration:.4f}s (error)")
                raise

        return wrapper
    return decorator


@profile_async("fetch_users")
async def fetch_users():
    """Fetch users with profiling."""
    await asyncio.sleep(0.5)
    return ["user1", "user2"]
```
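
The same `AsyncProfiler` works when coroutines run concurrently: wrap each one before handing it to `asyncio.gather`. The sleep-based tasks below are stand-ins for real I/O:

```python
async def fake_io(delay: float) -> float:
    await asyncio.sleep(delay)
    return delay


async def run_concurrently():
    profiler = AsyncProfiler()

    # Timings overlap, so each entry reflects that coroutine's own
    # wall-clock duration, not the sum of all tasks.
    results = await asyncio.gather(
        profiler.profile("fast_io", fake_io(0.1)),
        profiler.profile("slow_io", fake_io(0.3)),
    )

    profiler.print_stats()
    return results


asyncio.run(run_concurrently())
```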

## Benchmarking

```python
import time
import timeit
from typing import Callable, Dict, List
from statistics import mean, stdev


class Benchmark:
    """Benchmark function performance."""

    def __init__(self, iterations: int = 1000):
        self.iterations = iterations
        self.results: Dict[str, List[float]] = {}

    def run(self, name: str, func: Callable, *args, **kwargs):
        """
        Run benchmark for function.

        Args:
            name: Benchmark name
            func: Function to benchmark
            *args, **kwargs: Function arguments
        """
        times = []

        for _ in range(self.iterations):
            start = time.perf_counter()
            func(*args, **kwargs)
            duration = time.perf_counter() - start
            times.append(duration)

        self.results[name] = times

    def compare(self, name1: str, name2: str):
        """Compare two benchmarks."""
        times1 = self.results[name1]
        times2 = self.results[name2]

        mean1 = mean(times1)
        mean2 = mean(times2)

        improvement = ((mean1 - mean2) / mean1) * 100

        print(f"\n=== Comparison: {name1} vs {name2} ===")
        print(f"{name1}: {mean1*1000:.4f}ms ± {stdev(times1)*1000:.4f}ms")
        print(f"{name2}: {mean2*1000:.4f}ms ± {stdev(times2)*1000:.4f}ms")
        print(f"Improvement: {improvement:+.2f}%")

    def print_stats(self):
        """Print all benchmark statistics."""
        print("\n=== Benchmark Results ===")
        print(f"{'Name':<30} {'Mean':>12} {'Std Dev':>12} {'Min':>12} {'Max':>12}")
        print("-" * 80)

        for name, times in self.results.items():
            mean_time = mean(times) * 1000
            std_time = stdev(times) * 1000
            min_time = min(times) * 1000
            max_time = max(times) * 1000

            print(
                f"{name:<30} "
                f"{mean_time:>10.4f}ms "
                f"{std_time:>10.4f}ms "
                f"{min_time:>10.4f}ms "
                f"{max_time:>10.4f}ms"
            )


# Usage
def for_loop():
    """Explicit for loop, for comparison with the comprehension."""
    result = []
    for i in range(1000):
        result.append(i * 2)
    return result


bench = Benchmark(iterations=1000)

# Benchmark different implementations
bench.run("list_comprehension", lambda: [i * 2 for i in range(1000)])
bench.run("map_function", lambda: list(map(lambda i: i * 2, range(1000))))
bench.run("for_loop", for_loop)

bench.print_stats()
bench.compare("list_comprehension", "map_function")


# Using timeit module
def benchmark_with_timeit():
    """Benchmark with timeit module."""

    # Setup code (mymodule and its functions are placeholders)
    setup = """
from mymodule import function1, function2
data = list(range(10000))
"""

    # Benchmark function1
    time1 = timeit.timeit(
        "function1(data)",
        setup=setup,
        number=1000
    )

    # Benchmark function2
    time2 = timeit.timeit(
        "function2(data)",
        setup=setup,
        number=1000
    )

    print(f"function1: {time1:.4f}s")
    print(f"function2: {time2:.4f}s")
    print(f"Speedup: {time1/time2:.2f}x")
```
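
For steadier numbers than a single `timeit.timeit` call, `timeit.repeat` runs several independent rounds; taking the minimum is the conventional way to discount scheduler noise:

```python
import timeit

rounds = timeit.repeat(
    "sorted(data)",
    setup="import random; data = [random.random() for _ in range(1000)]",
    repeat=5,     # independent timing rounds
    number=1000,  # calls per round
)

# The minimum round is the least-disturbed measurement
print(f"Best of 5: {min(rounds) / 1000 * 1e6:.2f} µs per call")
```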

## Performance Testing

```python
import time
from statistics import mean
from typing import Callable

import pytest
import requests


def assert_performance(
    func: Callable,
    max_duration_ms: float,
    iterations: int = 100
):
    """
    Assert that a function stays within a time budget.

    Named without a `test_` prefix so pytest does not collect it directly.

    Args:
        func: Function to test
        max_duration_ms: Maximum allowed duration in ms
        iterations: Number of iterations
    """
    durations = []

    for _ in range(iterations):
        start = time.perf_counter()
        func()
        duration = (time.perf_counter() - start) * 1000
        durations.append(duration)

    avg_duration = mean(durations)

    assert avg_duration < max_duration_ms, (
        f"Performance regression: {avg_duration:.2f}ms > {max_duration_ms}ms"
    )


# Pytest performance tests
@pytest.mark.performance
def test_api_response_time():
    """Test API response time is under 100ms."""

    def api_call():
        response = requests.get("http://localhost:8000/api/users")
        return response.json()

    assert_performance(api_call, max_duration_ms=100.0, iterations=50)


@pytest.mark.performance
def test_database_query_time():
    """Test database query time is under 50ms."""

    def db_query():
        # `db` and `User` stand in for your session and model
        return db.query(User).filter(User.status == "active").all()

    assert_performance(db_query, max_duration_ms=50.0, iterations=100)
```
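
If the `pytest-benchmark` plugin is installed, its `benchmark` fixture handles warm-up, calibration, and statistics for you; a sketch assuming the plugin is available:

```python
# Requires: pip install pytest-benchmark


def build_list() -> list:
    return [i * 2 for i in range(1000)]


def test_build_list_speed(benchmark):
    # The fixture calls build_list repeatedly and records statistics
    result = benchmark(build_list)
    assert len(result) == 1000
```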

## ❌ Anti-Patterns

```python
# ❌ Using print() with time.time() for timing
start = time.time()
result = expensive_function()
print(f"Took {time.time() - start}s")  # Low resolution, single sample

# ✅ Better: Use proper profiling
@profile_function()
def expensive_function():
    ...


# ❌ Single measurement
duration = timeit.timeit(func, number=1)  # Not reliable!

# ✅ Better: Multiple iterations
times = [timeit.timeit(func, number=1) for _ in range(100)]
avg_time = mean(times)


# ❌ Not profiling before optimizing
# Just guessing where the bottleneck is

# ✅ Better: Profile first, then optimize
@profile_function()
def identify_bottleneck():
    ...


# ❌ Profiling in production without limits
profiler.enable()  # Runs forever!

# ✅ Better: Time-limited profiling
import signal

def stop_profiling(signum, frame):
    profiler.disable()

signal.signal(signal.SIGALRM, stop_profiling)
profiler.enable()
signal.alarm(60)  # Disable the profiler after 60 seconds
```

## Best Practices Checklist

- ✅ Profile before optimizing (measure, don't guess)
- ✅ Use appropriate profiling tool (cProfile, line_profiler)
- ✅ Profile memory usage for memory-intensive code
- ✅ Use async profiling for async code
- ✅ Benchmark different implementations
- ✅ Run multiple iterations for reliability
- ✅ Track performance over time
- ✅ Set performance budgets in tests
- ✅ Profile in realistic conditions
- ✅ Focus on hot paths first
- ✅ Document performance characteristics
- ✅ Use sampling for production profiling (see the sketch below)
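
For the last item, statistical (sampling) profilers keep overhead low enough for production use; a sketch using `pyinstrument` as one such tool (assumes `pip install pyinstrument`):

```python
from pyinstrument import Profiler

profiler = Profiler(interval=0.001)  # sample roughly every millisecond
profiler.start()

total = sum(i * i for i in range(1_000_000))  # workload stand-in

profiler.stop()
print(profiler.output_text(unicode=True, color=False))
```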

## Auto-Apply

When profiling performance:

1. Use cProfile for CPU profiling
2. Use line_profiler for line-by-line analysis
3. Use memory_profiler for memory tracking
4. Profile async code with AsyncProfiler
5. Benchmark alternatives with Benchmark class
6. Add performance tests with time limits
7. Focus on hot paths identified by profiling

## Related Skills

- `monitoring-alerting` - For production metrics
- `query-optimization` - For database performance
- `async-await-checker` - For async patterns
- `pytest-patterns` - For performance testing