Initial commit
183
skills/python-code-review/SKILL.md
Normal file
@@ -0,0 +1,183 @@
---
name: python-code-review
description: Deep Python code review of changed files using git diff analysis. Focuses on production quality, security vulnerabilities, performance bottlenecks, architectural issues, and subtle bugs in code changes. Analyzes correctness, efficiency, scalability, and production readiness of modifications. Use for pull request reviews, commit reviews, security audits of changes, and pre-deployment validation. Supports Django, Flask, FastAPI, pandas, and ML frameworks.
---

# Python Code Review Expert

## ⚠️ MANDATORY COMPLIANCE ⚠️

**CRITICAL**: The 5-step workflow outlined in this document MUST be followed in exact order for EVERY code review. Skipping steps or deviating from the procedure will result in incomplete and unreliable reviews. This is non-negotiable.

## File Structure

- **SKILL.md** (this file): Main instructions and MANDATORY workflow
- **examples.md**: Review scenarios with before/after examples
- **../../context/python/**: Framework patterns and detection logic
  - `context_detection.md`, `common_issues.md`, `{framework}_patterns.md`
- **../../context/security/**: Security guidelines and OWASP references
  - `security_guidelines.md`, `owasp_python.md`
- **../../memory/skills/python-code-review/**: Project-specific memory storage
  - `{project-name}/`: Per-project learned patterns and context
- **templates/**: `report_template.md`, `inline_comment_template.md`

## Review Focus Areas

Deep reviews evaluate 8 critical dimensions **in the changed code**:

1. **Production Quality**: Correctness, edge cases, error recovery, resilience
2. **Deep Bugs**: Race conditions, memory leaks, resource exhaustion, subtle logic errors
3. **Security**: Injection flaws, auth bypasses, insecure deserialization, data exposure
4. **Performance**: Algorithmic complexity, N+1 queries, memory inefficiency, I/O blocking
5. **Architecture**: Tight coupling, missing abstractions, SOLID violations, circular deps
6. **Reliability**: Transaction safety, error handling, resource leaks, idempotency
7. **Scalability**: Concurrency issues, connection pooling, pagination, unbounded consumption
8. **Testing**: Missing critical tests, inadequate edge case coverage

**Note**: Focus on substantive issues requiring human judgment, not style/formatting details. Reviews are performed on changed code only, using the `get-git-diff` skill to identify modifications.

---

## MANDATORY WORKFLOW (MUST FOLLOW EXACTLY)

### ⚠️ STEP 1: Identify Changed Files via Git Diff (REQUIRED)

**YOU MUST:**

1. **Invoke the `get-git-diff` skill** to identify changed Python files
2. Ask clarifying questions to determine comparison scope:
   - Which commits/branches to compare? (e.g., `HEAD^ vs HEAD`, `main vs feature-branch`)
   - If not specified, default to comparing current changes against the default branch
   - Use the diff output to extract the list of modified Python files (`.py` extension)
3. If no Python files were changed, inform the user and exit gracefully
4. Focus subsequent review ONLY on the files identified in the diff

**DO NOT PROCEED WITHOUT GIT DIFF ANALYSIS**
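
The filtering in Step 1 can be sketched as below. The output format of the `get-git-diff` skill is not specified in this document, so this sketch assumes one path per line, as produced by `git diff --name-only`; the function name `changed_python_files` and the sample paths are hypothetical.

```python
from pathlib import PurePosixPath


def changed_python_files(name_only_diff: str) -> list[str]:
    """Return the .py paths from `git diff --name-only` style output."""
    paths = []
    for line in name_only_diff.splitlines():
        path = line.strip()
        # Skip blank lines and keep only paths whose extension is .py
        if path and PurePosixPath(path).suffix == ".py":
            paths.append(path)
    return paths


# Hypothetical diff output (assumed format: one changed path per line)
sample = "app/views.py\nREADME.md\ntests/test_views.py\nsetup.cfg\n"

files = changed_python_files(sample)
if not files:
    print("No Python files changed; nothing to review.")
else:
    print(files)
```

An empty result corresponds to item 3 above: inform the user and exit without reviewing anything.
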

### ⚠️ STEP 2: Load Project Memory & Context Detection (REQUIRED)

**YOU MUST:**

1. **CHECK PROJECT MEMORY FIRST**:
   - Identify the project name from the repository root or ask the user
   - Check `../../memory/skills/python-code-review/{project-name}/` for existing project memory
   - If memory exists, read all files to understand previously learned patterns, frameworks, and project-specific context
   - If no memory exists, you will create it later in this process
2. Analyze changed files' structure and imports
3. **READ** `../../context/python/context_detection.md` to identify framework
4. Determine which framework-specific patterns file(s) to load
5. Ask clarifying questions in Socratic format:
   - What is the purpose of these changes?
   - Specific concerns to focus on?
   - Deployment environment?
   - Any project-specific conventions or patterns to be aware of?

**DO NOT PROCEED WITHOUT COMPLETING THIS STEP**
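
The memory check in item 1 might look like the following sketch. It assumes memory files are Markdown files stored directly inside the `{project-name}/` directory; the function name `load_project_memory` is hypothetical and not part of this skill.

```python
from pathlib import Path


def load_project_memory(memory_root: Path, project_name: str) -> dict[str, str]:
    """Read every Markdown file in the project's memory directory.

    Returns an empty dict when no memory exists yet, which signals that
    the memory must be created at the end of the review.
    """
    project_dir = memory_root / project_name
    if not project_dir.is_dir():
        return {}
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(project_dir.glob("*.md"))
    }
```

A caller would pass the resolved `../../memory/skills/python-code-review/` directory as `memory_root` and treat an empty result as "no memory yet".
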

### ⚠️ STEP 3: Read Pattern Files (REQUIRED)

**YOU MUST read these files based on context**:

1. **ALWAYS**: `../../context/python/common_issues.md` (universal anti-patterns and deep bugs)
2. **If Django detected**: `../../context/python/django_patterns.md`
3. **If Flask detected**: `../../context/python/flask_patterns.md`
4. **If FastAPI detected**: `../../context/python/fastapi_patterns.md`
5. **If data science detected**: `../../context/python/datascience_patterns.md`
6. **If ML detected**: `../../context/python/ml_patterns.md`
7. **For security reviews**: `../../context/security/security_guidelines.md` AND `../../context/security/owasp_python.md`

**Progressive loading**: Only read framework files when detected. Don't load all upfront.

**DO NOT SKIP PATTERN FILE READING**
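
The progressive-loading rule above can be expressed as a small lookup. The file paths come from the list above; the detection labels (`"django"`, `"datascience"`, and so on) and the function name `pattern_files` are assumptions made for this sketch, since the labels produced by `context_detection.md` are not shown here.

```python
COMMON_ISSUES = "../../context/python/common_issues.md"

# Assumed detection labels mapped to the pattern files listed above
FRAMEWORK_PATTERNS = {
    "django": "../../context/python/django_patterns.md",
    "flask": "../../context/python/flask_patterns.md",
    "fastapi": "../../context/python/fastapi_patterns.md",
    "datascience": "../../context/python/datascience_patterns.md",
    "ml": "../../context/python/ml_patterns.md",
}

SECURITY_FILES = [
    "../../context/security/security_guidelines.md",
    "../../context/security/owasp_python.md",
]


def pattern_files(detected: set[str], security_review: bool = False) -> list[str]:
    """Return the files to read: always common issues, then only what was detected."""
    files = [COMMON_ISSUES]
    for label, path in FRAMEWORK_PATTERNS.items():
        if label in detected:
            files.append(path)
    if security_review:
        files.extend(SECURITY_FILES)
    return files


print(pattern_files({"django"}))
```
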

### ⚠️ STEP 4: Deep Manual Review of Changed Code (REQUIRED)

**YOU MUST examine ONLY the changed code for ALL categories below**:

**Important**: While reviewing changed lines, consider the surrounding context to understand:
- How changes interact with existing code
- Whether changes introduce regressions
- Impact on callers and dependent code
- Whether the change addresses the root cause or masks symptoms

**Review Categories**:

- **Production Readiness**: Edge cases, input validation, error recovery, resource cleanup, timeouts
- **Deep Bugs**: Race conditions, memory leaks, off-by-one errors, unhandled exceptions, state corruption, infinite loops, integer overflow, timezone issues
- **Architecture**: Tight coupling, missing abstractions, SOLID violations, global state, circular dependencies
- **Security**: SQL/NoSQL/Command injection, auth bypasses, insecure deserialization, SSRF, XXE, crypto weaknesses, data exposure, missing rate limiting
- **Performance**: O(n²) complexity, N+1 queries, memory leaks, blocking I/O in async, missing indexes, inefficient data structures, cache stampede
- **Scalability**: Connection pool exhaustion, lock contention, deadlocks, missing pagination, unbounded consumption
- **Reliability**: Transaction boundaries, data races, resource leaks, missing idempotency

**DO NOT SKIP ANY CATEGORY**

### ⚠️ STEP 5: Generate Output & Update Project Memory (REQUIRED)

**YOU MUST ask user for preferred output format**:

- **Option A**: Structured report (`templates/report_template.md`) → executive summary, categorized findings, action items → output to `claudedocs/`
- **Option B**: Inline comments (`templates/inline_comment_template.md`) → file:line feedback, PR-style
- **Option C (Default)**: Both formats

**DO NOT CHOOSE FORMAT WITHOUT USER INPUT**

**For EVERY issue in the output, YOU MUST provide**:

1. **Severity**: Critical / Important / Minor
2. **Category**: Security / Performance / Code Quality / Architecture / Reliability
3. **Description**: What is wrong and why it matters
4. **Fix**: Concrete code example with improvement
5. **Reference**: Link to PEP, OWASP, or framework docs
6. **File:line**: Exact location (e.g., `auth.py:142`)

**Format guidelines**:
- Explain WHY (not just what)
- Show HOW to fix with examples
- Be specific with file:line references
- Be balanced (acknowledge good patterns)
- Educate, don't criticize

**DO NOT PROVIDE INCOMPLETE RECOMMENDATIONS**
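
One way to enforce the six required fields is to model each issue as a validated record. This is a minimal sketch, not part of the templates: the class name `Finding` and its field names are hypothetical, while the allowed severity and category values are taken from the list above.

```python
from dataclasses import dataclass, fields

SEVERITIES = ("Critical", "Important", "Minor")
CATEGORIES = ("Security", "Performance", "Code Quality", "Architecture", "Reliability")


@dataclass(frozen=True)
class Finding:
    severity: str
    category: str
    description: str  # what is wrong and why it matters
    fix: str          # concrete code example with the improvement
    reference: str    # PEP, OWASP, or framework docs
    location: str     # "file:line", e.g. "auth.py:142"

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        # Reject incomplete recommendations: every field must be filled in
        for field in fields(self):
            if not getattr(self, field.name).strip():
                raise ValueError(f"{field.name} must not be empty")


finding = Finding(
    severity="Critical",
    category="Security",
    description="User input is interpolated into a SQL string.",
    fix='cursor.execute("SELECT * FROM users WHERE email = %s", (email,))',
    reference="OWASP A03:2021 - Injection",
    location="auth.py:142",
)
print(finding.location)
```
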

**After completing the review, UPDATE PROJECT MEMORY**:

Create or update files in `../../memory/skills/python-code-review/{project-name}/`:

1. **project_overview.md**: Framework, architecture patterns, deployment info
2. **common_patterns.md**: Project-specific coding patterns and conventions discovered
3. **known_issues.md**: Recurring issues or anti-patterns found in this project
4. **review_history.md**: Summary of reviews performed with dates and key findings

This memory will be consulted in future reviews to provide context-aware analysis.
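
Appending to `review_history.md` could be sketched as follows. The one-line entry format (`- YYYY-MM-DD: summary`) and the function name `record_review` are assumptions for illustration; this document does not prescribe a layout for the memory files.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def record_review(memory_root: Path, project_name: str, summary: str,
                  when: Optional[date] = None) -> Path:
    """Append a dated one-line entry to the project's review_history.md."""
    project_dir = memory_root / project_name
    # Create the project memory directory on first use
    project_dir.mkdir(parents=True, exist_ok=True)
    history = project_dir / "review_history.md"
    when = when or date.today()
    with history.open("a", encoding="utf-8") as fh:
        fh.write(f"- {when.isoformat()}: {summary}\n")
    return history
```

Opening the file in append mode keeps earlier entries intact, so the history accumulates across reviews.
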

---

## Compliance Checklist

Before completing ANY review, verify:
- [ ] Step 1: Git diff analyzed using `get-git-diff` skill and changed Python files identified
- [ ] Step 2: Project memory checked in `../../memory/skills/python-code-review/{project-name}/` and context detected
- [ ] Step 3: All relevant pattern files read from `../../context/python/` and `../../context/security/`
- [ ] Step 4: Manual review completed for ALL categories on changed code only
- [ ] Step 5: Output generated with all required fields AND project memory updated

**FAILURE TO COMPLETE ALL STEPS INVALIDATES THE REVIEW**

## Further Reading

Refer to the official documentation:
- **Python Standards**:
  - Python PEPs: https://peps.python.org/
  - OWASP Python Security: https://owasp.org/www-project-python-security/
- **Frameworks**:
  - Django, Flask, FastAPI official documentation
- **Best Practices**:
  - Real Python: https://realpython.com/

## Version History

- v2.1.0 (2025-11-14): Refactored to use centralized context and project-specific memory system
  - Context files moved to `forge-plugin/context/python/` and `forge-plugin/context/security/`
  - Project memory stored in `forge-plugin/memory/skills/python-code-review/{project-name}/`
  - Added project memory loading and persistence in workflow
- v2.0.0 (2025-11-13): Changed to diff-based review using `get-git-diff` skill - reviews only changed code
- v1.1.0 (2025-11-13): Removed automated analysis and linting/formatting tools
- v1.0.0 (2025-11-13): Initial release

503
skills/python-code-review/examples.md
Normal file
@@ -0,0 +1,503 @@
# Python Code Review Examples

This file contains example code review scenarios demonstrating common issues and recommended fixes.

## Example 1: Security Vulnerability - SQL Injection

### Before (Vulnerable Code)

```python
# user_service.py:15
def get_user_by_email(email):
    query = f"SELECT * FROM users WHERE email = '{email}'"
    cursor.execute(query)
    return cursor.fetchone()
```

### Review Comment

**Severity**: Critical
**Category**: Security
**File**: user_service.py:16

SQL injection vulnerability detected. User input is interpolated directly into the SQL query, allowing attackers to execute arbitrary SQL commands.

**Attack example**:
```python
email = "'; DROP TABLE users; --"
# Results in: SELECT * FROM users WHERE email = ''; DROP TABLE users; --'
```

### After (Fixed Code)

```python
# user_service.py:15
def get_user_by_email(email):
    # %s is the placeholder style of drivers such as psycopg2; sqlite3 uses ?
    query = "SELECT * FROM users WHERE email = %s"
    cursor.execute(query, (email,))
    return cursor.fetchone()
```

**Reference**: OWASP A03:2021 - Injection
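
The difference between the two versions can be checked with a self-contained sketch. It uses the standard-library `sqlite3` module with an in-memory database instead of the driver in the example (so the placeholder is `?` rather than `%s`), and a hypothetical `' OR '1'='1` payload, because `sqlite3` runs only one statement per `execute()` call.

```python
import sqlite3

# In-memory database with a single user (illustrative data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))


def get_user_by_email(conn, email):
    # Parameterized query: the driver passes the value separately from the SQL text
    cursor = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cursor.fetchone()


payload = "' OR '1'='1"

# Vulnerable: the payload becomes part of the SQL and matches every row
unsafe_rows = conn.execute(
    f"SELECT id, email FROM users WHERE email = '{payload}'"
).fetchall()

# Safe: the payload is compared as a literal string and matches nothing
safe_row = get_user_by_email(conn, payload)

print(unsafe_rows)  # [(1, 'alice@example.com')]
print(safe_row)     # None
```
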

---

## Example 2: Performance Issue - N+1 Query Problem (Django)

### Before (Inefficient Code)

```python
# views.py:45
def get_posts_with_authors(request):
    posts = Post.objects.all()  # 1 query
    result = []
    for post in posts:
        result.append({
            'title': post.title,
            'author': post.author.name  # N additional queries!
        })
    return JsonResponse(result, safe=False)
```

### Review Comment

**Severity**: Important
**Category**: Performance
**File**: views.py:48

N+1 query problem detected. For 100 posts, this executes 101 database queries (1 for posts + 100 for authors). This causes severe performance degradation under load.

### After (Optimized Code)

```python
# views.py:45
def get_posts_with_authors(request):
    posts = Post.objects.select_related('author').all()  # 1 query with JOIN
    result = []
    for post in posts:
        result.append({
            'title': post.title,
            'author': post.author.name
        })
    return JsonResponse(result, safe=False)
```

**Performance gain**: 101 queries → 1 query (about 100x fewer database round trips for 100 posts)

**Reference**: Django QuerySet optimization
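
The query counts quoted above can be reproduced without Django. The following sketch is a framework-agnostic illustration using the standard-library `sqlite3` module; the table names, the `run` helper, and the manual counter are all hypothetical and stand in for what the ORM does behind `post.author` and `select_related('author')`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
""")
for i in range(1, 101):
    conn.execute("INSERT INTO author (id, name) VALUES (?, ?)", (i, f"author-{i}"))
    conn.execute(
        "INSERT INTO post (id, title, author_id) VALUES (?, ?, ?)",
        (i, f"post-{i}", i),
    )

stats = {"queries": 0}


def run(sql, params=()):
    """Execute one SELECT and count it."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# N+1 pattern: one query for the posts, then one query per post for its author
stats["queries"] = 0
n_plus_one = []
for title, author_id in run("SELECT title, author_id FROM post ORDER BY id"):
    (name,) = run("SELECT name FROM author WHERE id = ?", (author_id,))[0]
    n_plus_one.append({"title": title, "author": name})
n_plus_one_queries = stats["queries"]

# JOIN pattern: a single query returns each post together with its author
stats["queries"] = 0
joined = [
    {"title": title, "author": name}
    for title, name in run(
        "SELECT post.title, author.name FROM post "
        "JOIN author ON author.id = post.author_id ORDER BY post.id"
    )
]
join_queries = stats["queries"]

print(n_plus_one_queries, join_queries)  # 101 1
```

Both loops build the same result list; only the number of round trips differs.
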

---

## Example 3: Code Quality - Mutable Default Argument

### Before (Buggy Code)

```python
# utils.py:22
def add_item(item, items=[]):
    items.append(item)
    return items

# Usage that reveals the bug:
list1 = add_item('a')  # ['a']
list2 = add_item('b')  # ['a', 'b'] - UNEXPECTED!
```

### Review Comment

**Severity**: Important
**Category**: Code Quality
**File**: utils.py:22

Mutable default argument anti-pattern. The default list `[]` is created once when the function is defined, not each time it's called. All invocations that omit `items` share the same list object, causing unexpected state persistence.

### After (Fixed Code)

```python
# utils.py:22
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

# Now works correctly:
list1 = add_item('a')  # ['a']
list2 = add_item('b')  # ['b'] - CORRECT!
```

**Reference**: Common Python Gotchas
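
To see why the buggy version shares state, it helps to inspect where Python stores the default. This short sketch re-declares the buggy `add_item` from the "Before" block and looks at the function's `__defaults__` attribute, which holds the default values evaluated at definition time.

```python
def add_item(item, items=[]):  # buggy version, repeated here for inspection
    items.append(item)
    return items


# The default list is stored on the function object itself
print(add_item.__defaults__)  # ([],)

first = add_item('a')
print(add_item.__defaults__)  # (['a'],)

second = add_item('b')
# Both calls returned the very same list object
print(first is second)        # True
print(first)                  # ['a', 'b']
```
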

---

## Example 4: PEP 8 Compliance - Naming Conventions

### Before (Non-compliant Code)

```python
# data_processor.py:10
def CalculateUserAge(BirthDate):
    CurrentYear = 2025
    user_birth_year = BirthDate.year
    AGE = CurrentYear - user_birth_year
    return AGE
```

### Review Comment

**Severity**: Minor
**Category**: Style
**File**: data_processor.py:10-15

Multiple PEP 8 naming violations:
- Function name should be `snake_case`, not `PascalCase`
- Parameter name should be `snake_case`, not `PascalCase`
- Local variables should be lowercase, not mixed case or UPPERCASE
- UPPERCASE is reserved for constants

### After (Compliant Code)

```python
# data_processor.py:10
def calculate_user_age(birth_date):
    current_year = 2025
    user_birth_year = birth_date.year
    age = current_year - user_birth_year
    return age
```

**Reference**: PEP 8 - Naming Conventions

---

## Example 5: Best Practice - Context Manager for Resource Handling

### Before (Resource Leak Risk)

```python
# file_processor.py:30
def process_log_file(filepath):
    file = open(filepath, 'r')
    data = file.read()
    results = analyze(data)
    file.close()  # May not execute if analyze() raises exception
    return results
```

### Review Comment

**Severity**: Important
**Category**: Best Practices
**File**: file_processor.py:31

Missing context manager for file handling. If `analyze()` raises an exception, `file.close()` never executes, leaving the file handle open (resource leak).

### After (Safe Code)

```python
# file_processor.py:30
def process_log_file(filepath):
    with open(filepath, 'r') as file:
        data = file.read()
        results = analyze(data)
    # File automatically closed even if exception occurs
    return results
```

**Bonus improvement**:
```python
# Even better with pathlib
from pathlib import Path

def process_log_file(filepath):
    data = Path(filepath).read_text()
    return analyze(data)
```

**Reference**: PEP 343 - The "with" Statement

---

## Example 6: Security - Hardcoded Credentials

### Before (Security Risk)

```python
# config.py:5
DATABASE_CONFIG = {
    'host': 'prod-db.example.com',
    'user': 'admin',
    'password': 'SuperSecret123!',  # NEVER do this
    'database': 'production'
}
```

### Review Comment

**Severity**: Critical
**Category**: Security
**File**: config.py:8

Hardcoded credentials detected. Passwords in source code:
1. Are visible to anyone with repository access
2. Get committed to version control history
3. Can't be rotated without code changes
4. May be exposed in logs or error messages

### After (Secure Code)

```python
# config.py:5
import os

DATABASE_CONFIG = {
    'host': os.getenv('DB_HOST', 'localhost'),
    'user': os.getenv('DB_USER'),
    'password': os.getenv('DB_PASSWORD'),
    'database': os.getenv('DB_NAME', 'production')
}

# Validate required environment variables
required_vars = ['DB_USER', 'DB_PASSWORD']
missing = [var for var in required_vars if not os.getenv(var)]
if missing:
    raise RuntimeError(f"Missing required environment variables: {missing}")
```

**Additional security**:
```bash
# Use environment files (not committed to git)
echo "DB_PASSWORD=..." > .env
echo ".env" >> .gitignore
```

**Reference**: OWASP A07:2021 - Identification and Authentication Failures

---

## Example 7: Performance - Pandas Optimization

### Before (Inefficient Code)

```python
# data_analysis.py:50
import pandas as pd

def calculate_discounts(df):
    # Anti-pattern: Iterating over DataFrame rows
    discounts = []
    for index, row in df.iterrows():
        if row['total'] > 100:
            discount = row['total'] * 0.1
        else:
            discount = 0
        discounts.append(discount)
    df['discount'] = discounts
    return df
```

### Review Comment

**Severity**: Important
**Category**: Performance
**File**: data_analysis.py:53

Using `iterrows()` to loop over a DataFrame is one of the slowest ways to process rows in pandas. For 10,000 rows, this can be 100x slower than vectorized operations.

### After (Vectorized Code)

```python
# data_analysis.py:50
import pandas as pd

def calculate_discounts(df):
    # Vectorized operation - operates on entire column at once
    df['discount'] = (df['total'] * 0.1).where(df['total'] > 100, 0)
    return df

# Alternative using numpy where:
import numpy as np

def calculate_discounts(df):
    df['discount'] = np.where(df['total'] > 100, df['total'] * 0.1, 0)
    return df
```

**Performance**: Vectorized operations use optimized C code, achieving 50-100x speedup on large datasets.

**Reference**: Pandas Performance Optimization

---

## Example 8: Testing - Missing Edge Cases

### Before (Incomplete Tests)

```python
# test_validators.py:15
def test_email_validation():
    assert is_valid_email('user@example.com') == True
    assert is_valid_email('invalid-email') == False
```

### Review Comment

**Severity**: Important
**Category**: Testing
**File**: test_validators.py:15

Email validation tests are insufficient. Missing edge cases:
- Empty string
- None value
- Email with special characters
- Multiple @ symbols
- Missing domain
- Whitespace handling
- Maximum length validation

### After (Comprehensive Tests)

```python
# test_validators.py:15
import pytest

@pytest.mark.parametrize('email,expected', [
    # Valid emails
    ('user@example.com', True),
    ('first.last@example.co.uk', True),
    ('user+tag@example.com', True),

    # Invalid emails
    ('invalid-email', False),
    ('', False),
    ('user@', False),
    ('user@@example.com', False),
    ('@example.com', False),
    ('user @example.com', False),
    ('a' * 256 + '@example.com', False),  # Too long
])
def test_email_validation(email, expected):
    assert is_valid_email(email) == expected

def test_email_validation_with_none():
    with pytest.raises(TypeError):
        is_valid_email(None)
```

**Reference**: Testing Best Practices

---

## Example 9: Architecture - Separation of Concerns (FastAPI)

### Before (Tightly Coupled Code)

```python
# main.py:25
from fastapi import FastAPI
import psycopg2

app = FastAPI()

@app.get('/users/{user_id}')
def get_user(user_id: int):
    # Business logic mixed with data access and presentation
    conn = psycopg2.connect("dbname=mydb user=admin password=secret")
    cursor = conn.cursor()
    cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")
    user = cursor.fetchone()
    conn.close()

    if user:
        return {'id': user[0], 'name': user[1], 'email': user[2]}
    return {'error': 'User not found'}
```

### Review Comment

**Severity**: Important
**Category**: Architecture
**File**: main.py:25-38

Multiple issues, most of them rooted in poor separation of concerns:
1. Database connection logic in route handler
2. SQL injection vulnerability
3. Hardcoded credentials
4. No error handling
5. Manual dict construction
6. No dependency injection

### After (Layered Architecture)

```python
# schemas.py - API response schema
from pydantic import BaseModel, ConfigDict

class User(BaseModel):
    # Pydantic v2 syntax; on v1 use `class Config: orm_mode = True` instead
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str
    email: str

# database.py
import os

from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

SQLALCHEMY_DATABASE_URL = os.getenv('DATABASE_URL')
engine = create_engine(SQLALCHEMY_DATABASE_URL)
SessionLocal = sessionmaker(bind=engine)
Base = declarative_base()

def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

# models.py - ORM model mapped to the existing users table
from sqlalchemy import Column, Integer, String

from .database import Base

class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    email = Column(String)

# repositories.py
from sqlalchemy.orm import Session

from . import models

class UserRepository:
    def get_by_id(self, db: Session, user_id: int):
        return db.query(models.User).filter(models.User.id == user_id).first()

# main.py
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session

from . import database, repositories, schemas

app = FastAPI()
user_repo = repositories.UserRepository()

@app.get('/users/{user_id}', response_model=schemas.User)
def get_user(user_id: int, db: Session = Depends(database.get_db)):
    user = user_repo.get_by_id(db, user_id)
    if not user:
        raise HTTPException(status_code=404, detail='User not found')
    return user
```

**Benefits**:
- Clear separation of concerns
- Dependency injection
- Type safety with Pydantic
- SQL injection protection via ORM
- Reusable repository pattern
- Proper error handling

**Reference**: FastAPI Best Practices, Repository Pattern

---

## Summary of Common Issues

1. **Security**: SQL injection, XSS, hardcoded credentials, insecure cryptography
2. **Performance**: N+1 queries, inefficient loops, missing indexes, no caching
3. **Code Quality**: Mutable defaults, global state, poor naming, missing docstrings
4. **Style**: PEP 8 violations, inconsistent formatting, magic numbers
5. **Best Practices**: Missing context managers, no type hints, poor error handling
6. **Testing**: Insufficient coverage, missing edge cases, no integration tests
7. **Architecture**: Tight coupling, mixed concerns, no dependency injection

Use these examples as reference when conducting reviews. Adapt the feedback style and technical depth to the codebase context.
391
skills/python-code-review/templates/inline_comment_template.md
Normal file
@@ -0,0 +1,391 @@
# Inline Code Review Comments Template

This template provides examples of inline PR-style comments for different types of issues.

---

## Critical Issues

### Security Vulnerability

**File**: `auth.py:45`

```python
# Current code
user = db.execute(f"SELECT * FROM users WHERE username = '{username}'")
```

**Issue**: SQL Injection Vulnerability

**Severity**: 🔴 Critical

**Description**:
User input is directly interpolated into the SQL query, allowing attackers to execute arbitrary SQL commands.

**Attack Vector**:
```python
username = "admin' OR '1'='1"
# Results in: SELECT * FROM users WHERE username = 'admin' OR '1'='1'
```

**Fix**:
```python
# Use parameterized queries
user = db.execute("SELECT * FROM users WHERE username = %s", (username,))

# Or use ORM
user = User.query.filter_by(username=username).first()
```

**Reference**: OWASP A03:2021 - Injection

---

### Data Corruption Risk

**File**: `payment.py:123`

```python
# Current code
order.amount -= discount
order.save()
payment.process(order.amount)
```

**Issue**: Race Condition in Payment Processing

**Severity**: 🔴 Critical

**Description**:
If two requests process the same order simultaneously, the discount could be applied twice, leading to incorrect payment amounts.

**Fix**:
```python
from django.db import transaction

@transaction.atomic
def process_payment(order_id, discount):
    order = Order.objects.select_for_update().get(id=order_id)
    order.amount -= discount
    order.save()
    payment.process(order.amount)
```

---

## Important Issues

### Performance Bottleneck

**File**: `views.py:67`

```python
# Current code
posts = Post.objects.all()
for post in posts:
    print(post.author.name)  # N+1 query problem
```

**Issue**: N+1 Query Problem

**Severity**: 🟡 Important

**Impact**: For 100 posts, this executes 101 database queries instead of 1.

**Performance**: ~1000ms → ~10ms (100x improvement)

**Fix**:
```python
posts = Post.objects.select_related('author').all()
for post in posts:
    print(post.author.name)  # No additional queries
```

---

### Type Safety Issue

**File**: `utils.py:234`

```python
def calculate_total(prices):
    return sum(prices) * 1.1
```

**Issue**: Missing Type Hints

**Severity**: 🟡 Important

**Description**:
Function lacks type hints, making it unclear what types are expected and returned. Could lead to runtime errors.

**Fix**:
```python
def calculate_total(prices: list[float]) -> float:
    """Calculate total with 10% tax.

    Args:
        prices: List of item prices

    Returns:
        Total amount including tax
    """
    return sum(prices) * 1.1
```

---

### Architectural Concern

**File**: `api.py:89`

```python
@app.route('/process')
def process_data():
    # 150 lines of business logic mixed with HTTP handling
    data = request.get_json()
    # ... lots of processing ...
    return jsonify(result)
```

**Issue**: Fat Controller / Missing Service Layer

**Severity**: 🟡 Important

**Impact**:
- Hard to test business logic
- Violates Single Responsibility Principle
- Difficult to reuse logic elsewhere

**Recommendation**:
```python
# services/data_processor.py
class DataProcessor:
    def process(self, data: dict) -> dict:
        # Business logic here
        return result

# api.py
@app.route('/process')
def process_data():
    data = request.get_json()
    processor = DataProcessor()
    result = processor.process(data)
    return jsonify(result)
```

---

## Minor Issues

### Code Smell

**File**: `helpers.py:45`

```python
def append_to_list(item, items=[]):  # Mutable default argument!
    items.append(item)
    return items
```

**Issue**: Mutable Default Argument

**Severity**: 🔵 Minor

**Bug**: Default list is shared between all function calls, causing unexpected behavior.

**Example**:
```python
list1 = append_to_list('a')  # ['a']
list2 = append_to_list('b')  # ['a', 'b'] - UNEXPECTED!
```

**Fix**:
```python
def append_to_list(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
```

---

### Dead Code

**File**: `old_utils.py:123`

```python
def legacy_function():
    # This function is never called
    pass
```

**Issue**: Unused Code

**Severity**: 🔵 Minor

**Recommendation**: Remove to improve code maintainability and reduce cognitive load.

---

### Complexity
|
||||
|
||||
**File**: `calculator.py:56`
|
||||
|
||||
```python
|
||||
def complex_calculation(x, y, z, mode, options):
|
||||
# 50 lines with nested if/else
|
||||
# Cyclomatic complexity: 23 (Rank D)
|
||||
...
|
||||
```
|
||||
|
||||
**Issue**: High Cyclomatic Complexity
|
||||
|
||||
**Severity**: 🔵 Minor
|
||||
|
||||
**Impact**: Hard to understand, test, and maintain.
|
||||
|
||||
**Recommendation**: Refactor into smaller, focused functions:
|
||||
```python
def complex_calculation(x, y, z, mode, options):
    if mode == 'simple':
        return _simple_calc(x, y, z)
    elif mode == 'advanced':
        return _advanced_calc(x, y, z, options)
    else:
        return _default_calc(x, y)


def _simple_calc(x, y, z):
    ...


def _advanced_calc(x, y, z, options):
    ...


def _default_calc(x, y):
    ...
```
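To make the complexity figure less abstract, here is a rough sketch of how such a score can be estimated: one plus the number of decision points. This is an approximation for illustration, not the exact algorithm of any particular tool, and the letter thresholds assume the rank scale documented by radon (A: 1-5, B: 6-10, C: 11-20, D: 21-30, E: 31-40, F: 41+). The names `approx_complexity` and `rank` are hypothetical.

```python
import ast

# Node types counted as one decision point each.
BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def approx_complexity(source: str) -> int:
    """1 + number of decision points found in `source` (rough estimate)."""
    score = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def rank(score: int) -> str:
    """Map a score to a letter rank (thresholds assumed from radon's scale)."""
    for upper, letter in ((5, "A"), (10, "B"), (20, "C"), (30, "D"), (40, "E")):
        if score <= upper:
            return letter
    return "F"


sample = """
def f(x, mode):
    if mode == 'a' and x > 0:
        return 1
    elif mode == 'b':
        return 2
    for i in range(x):
        if i % 2:
            x += i
    return x
"""
score = approx_complexity(sample)
print(score, rank(score))  # 6 B
print(rank(23))            # D
```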
|
||||
|
||||
---
|
||||
|
||||
## Testing Issues
|
||||
|
||||
### Missing Test Coverage
|
||||
|
||||
**File**: `payment.py:200`
|
||||
|
||||
```python
def process_refund(order_id, amount):
    # Critical business logic with no tests!
    order = Order.objects.get(id=order_id)
    order.refund(amount)
    send_notification(order.user, f"Refunded ${amount}")
```
|
||||
|
||||
**Issue**: Missing Tests for Critical Path
|
||||
|
||||
**Severity**: 🟡 Important
|
||||
|
||||
**Recommendation**: Add comprehensive tests:
|
||||
```python
# tests/test_payment.py
import pytest

# process_refund, Order, create_test_order and assert_notification_sent
# come from the project and its test helpers.
pytestmark = pytest.mark.django_db  # assumes pytest-django


def test_process_refund_success():
    order = create_test_order(amount=100)
    process_refund(order.id, 50)
    order.refresh_from_db()  # process_refund loaded and changed its own instance
    assert order.amount == 50
    assert_notification_sent(order.user)


def test_process_refund_exceeds_amount():
    order = create_test_order(amount=100)
    with pytest.raises(ValueError):
        process_refund(order.id, 150)


def test_process_refund_invalid_order():
    with pytest.raises(Order.DoesNotExist):
        process_refund(99999, 50)
```
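The same cases can be checked without a database by passing the dependencies in. This is a sketch under stated assumptions: `FakeOrder` is a hypothetical in-memory stand-in for the model, its validation rule is invented for illustration, and `process_refund` here takes the order and a `notify` callable as arguments instead of looking them up itself.

```python
from unittest.mock import Mock


class FakeOrder:
    """Hypothetical in-memory stand-in for the ORM model."""

    def __init__(self, amount, user="alice"):
        self.amount = amount
        self.user = user

    def refund(self, amount):
        # Assumed rule: refunds must be positive and not exceed the balance.
        if amount <= 0 or amount > self.amount:
            raise ValueError("invalid refund amount")
        self.amount -= amount


def process_refund(order, amount, notify):
    # Same shape as the reviewed function, with dependencies injected.
    order.refund(amount)
    notify(order.user, f"Refunded ${amount}")


def test_refund_notifies_user():
    order, notify = FakeOrder(100), Mock()
    process_refund(order, 50, notify)
    assert order.amount == 50
    notify.assert_called_once_with("alice", "Refunded $50")


def test_failed_refund_sends_no_notification():
    order, notify = FakeOrder(100), Mock()
    try:
        process_refund(order, 150, notify)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    notify.assert_not_called()
    assert order.amount == 100
```

The second test covers a case the success-path test misses: a rejected refund must not trigger a notification.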
|
||||
|
||||
---
|
||||
|
||||
## Information / Suggestions
|
||||
|
||||
### Opportunity for Optimization
|
||||
|
||||
**File**: `data_processor.py:78`
|
||||
|
||||
```python
results = []
for item in large_dataset:
    results.append(transform(item))
```
|
||||
|
||||
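**Suggestion**: Use a list comprehension, which is usually somewhat faster than repeated `append` calls, or a generator expression, which avoids holding every result in memory at once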
|
||||
|
||||
```python
# List comprehension (if all results needed in memory)
results = [transform(item) for item in large_dataset]

# Generator (if processing one at a time)
results = (transform(item) for item in large_dataset)
```
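The two forms are not interchangeable, which matters when recommending one in a review. A small sketch, with `transform` and `large_dataset` as hypothetical stand-ins, showing the memory difference and the single-use nature of a generator:

```python
import sys


def transform(item: int) -> int:
    return item * 2  # hypothetical stand-in for real work


large_dataset = range(100_000)

as_list = [transform(item) for item in large_dataset]
as_gen = (transform(item) for item in large_dataset)

# The list stores every result; the generator stores only its iteration state.
print(sys.getsizeof(as_list) > sys.getsizeof(as_gen))  # True

# A generator can be consumed only once.
print(sum(as_gen) == sum(as_list))  # True
print(sum(as_gen))                  # 0, because it is already exhausted
```

If the results are indexed, measured with `len()`, or iterated more than once, the list comprehension is the safer suggestion.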
|
||||
|
||||
---
|
||||
|
||||
### Modern Python Pattern
|
||||
|
||||
**File**: `file_handler.py:34`
|
||||
|
||||
```python
f = open('data.txt', 'r')
data = f.read()
f.close()  # May not execute if read() raises exception
```
|
||||
|
||||
**Suggestion**: Use context manager
|
||||
|
||||
```python
with open('data.txt', 'r') as f:
    data = f.read()
# File automatically closed, even if exception occurs

# Or use pathlib
from pathlib import Path
data = Path('data.txt').read_text()
```
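The claim that the file is closed even when an exception occurs is easy to verify. A minimal sketch using a temporary file, where the `int()` call is just a convenient way to raise inside the `with` block:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.txt"
    path.write_text("not-a-number", encoding="utf-8")

    handle = None
    try:
        with open(path, "r", encoding="utf-8") as f:
            handle = f
            int(f.read())  # raises ValueError inside the with block
    except ValueError:
        pass

    print(handle.closed)  # True
```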
|
||||
|
||||
---
|
||||
|
||||
## Comment Format Guidelines
|
||||
|
||||
### Structure
|
||||
|
||||
```
**File**: path/to/file.py:line_number

[Code snippet if helpful]

**Issue**: Brief title

**Severity**: 🔴 Critical | 🟡 Important | 🔵 Minor | ⚪ Info

**Description**: Detailed explanation

**Impact/Why it matters**: Consequences

**Fix/Recommendation**: Concrete solution with code example

**Reference**: Links to docs, CVEs, etc. (if applicable)
```
|
||||
|
||||
### Severity Levels
|
||||
|
||||
- 🔴 **Critical**: Security vulnerabilities, data corruption, production failures
|
||||
- 🟡 **Important**: Performance issues, type safety, architectural problems, missing tests
|
||||
- 🔵 **Minor**: Code smells, complexity, dead code, minor bugs
|
||||
- ⚪ **Info**: Suggestions, optimizations, style (only if blocking automation)
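If comments are generated programmatically, the structure and severity levels above could be modeled as follows. This is a sketch only: the names `Severity`, `ReviewComment`, and `render` are hypothetical and not part of the skill, and only a subset of the fields is included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Severity(Enum):
    CRITICAL = "🔴 Critical"
    IMPORTANT = "🟡 Important"
    MINOR = "🔵 Minor"
    INFO = "⚪ Info"


@dataclass
class ReviewComment:
    file: str
    line: int
    issue: str
    severity: Severity
    description: str
    fix: str
    reference: Optional[str] = None

    def render(self) -> str:
        """Render the comment in the Markdown layout described above."""
        parts = [
            f"**File**: {self.file}:{self.line}",
            f"**Issue**: {self.issue}",
            f"**Severity**: {self.severity.value}",
            f"**Description**: {self.description}",
            f"**Fix/Recommendation**: {self.fix}",
        ]
        if self.reference:  # optional field, omitted when empty
            parts.append(f"**Reference**: {self.reference}")
        return "\n\n".join(parts)


comment = ReviewComment(
    file="helpers.py",
    line=45,
    issue="Mutable Default Argument",
    severity=Severity.MINOR,
    description="The default list is shared between calls.",
    fix="Use `items=None` and create the list inside the function.",
)
print(comment.render())
```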
|
||||
|
||||
### Tone
|
||||
|
||||
- Be specific and actionable
|
||||
- Explain the "why" not just the "what"
|
||||
- Provide code examples
|
||||
- Reference authoritative sources
|
||||
- Acknowledge good code when present
|
||||
- Be constructive rather than harsh; critique the code, not the author
|
||||
263
skills/python-code-review/templates/report_template.md
Normal file
@@ -0,0 +1,263 @@
|
||||
# Code Review Report: [Project Name]
|
||||
|
||||
**Date**: [YYYY-MM-DD]
|
||||
**Reviewer**: Claude Code
|
||||
**Scope**: [Brief description of what was reviewed]
|
||||
|
||||
---
|
||||
|
||||
## Executive Summary
|
||||
|
||||
**Overall Assessment**: [Excellent | Good | Fair | Needs Improvement | Critical Issues Found]
|
||||
|
||||
**Key Findings**:
|
||||
- Critical Issues: [N]
|
||||
- Important Issues: [N]
|
||||
- Performance Concerns: [N]
|
||||
- Security Vulnerabilities: [N]
|
||||
|
||||
**Recommendation**: [Summary recommendation - e.g., "Address critical security issues before deployment" or "Code is production-ready with minor improvements recommended"]
|
||||
|
||||
---
|
||||
|
||||
## Critical Issues
|
||||
|
||||
### 1. [Issue Title]
|
||||
|
||||
**Severity**: Critical
|
||||
**Category**: [Security | Data Corruption | Production Failure]
|
||||
**Location**: `file.py:123`
|
||||
|
||||
**Description**:
|
||||
[Detailed description of the issue]
|
||||
|
||||
**Impact**:
|
||||
[What could go wrong if not fixed]
|
||||
|
||||
**Recommendation**:
|
||||
```python
# Before (vulnerable)
[problematic code]

# After (fixed)
[corrected code]
```
|
||||
|
||||
**References**:
|
||||
- [CWE-XXX](link) or [OWASP](link) if applicable
|
||||
|
||||
---
|
||||
|
||||
### 2. [Next Critical Issue]
|
||||
...
|
||||
|
||||
---
|
||||
|
||||
## Important Issues
|
||||
|
||||
### Performance Bottleneck: [Description]
|
||||
|
||||
**Location**: `file.py:456`
|
||||
**Impact**: [e.g., "O(n²) complexity causes slowdown with large datasets"]
|
||||
|
||||
**Analysis**:
|
||||
[Explanation of the performance issue]
|
||||
|
||||
**Recommendation**:
|
||||
```python
# Current implementation (slow)
[current code]

# Optimized implementation
[improved code]
```
|
||||
|
||||
**Expected Improvement**: [e.g., "100x faster for 10,000 items"]
|
||||
|
||||
---
|
||||
|
||||
### Security Concern: [Description]
|
||||
|
||||
**Location**: `file.py:789`
|
||||
**Severity**: Important
|
||||
|
||||
**Details**:
|
||||
[Description of security concern]
|
||||
|
||||
**Fix**:
|
||||
```python
|
||||
[corrected code]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Architecture and Design
|
||||
|
||||
### Concerns
|
||||
|
||||
1. **Tight Coupling**: [Description]
|
||||
- Location: [files]
|
||||
- Recommendation: [architectural improvement]
|
||||
|
||||
2. **Missing Abstractions**: [Description]
|
||||
- Impact: [code duplication, hard to test, etc.]
|
||||
- Recommendation: [refactoring suggestion]
|
||||
|
||||
### Positive Patterns
|
||||
|
||||
- [Well-implemented pattern 1]
|
||||
- [Good design choice 2]
|
||||
|
||||
---
|
||||
|
||||
## Performance Analysis
|
||||
|
||||
### CPU Profiling Results
|
||||
|
||||
**Top Hotspots**:
|
||||
1. `function_name()` in `file.py`: [X]ms cumulative ([Y]% of total)
|
||||
2. [Next hotspot]
|
||||
|
||||
### Memory Usage
|
||||
|
||||
**Peak Memory**: [X] MB
|
||||
**Concerns**:
|
||||
- [Memory leak in function X]
|
||||
- [Inefficient data structure in Y]
|
||||
|
||||
### Recommendations
|
||||
|
||||
1. [Specific performance improvement 1]
|
||||
2. [Specific performance improvement 2]
|
||||
|
||||
---
|
||||
|
||||
## Code Quality
|
||||
|
||||
### Complexity Analysis
|
||||
|
||||
**High Complexity Functions**:
|
||||
- `function_name()` (file.py:123): Complexity 25 (Rank D)
|
||||
- Recommendation: Refactor into smaller functions
|
||||
|
||||
### Dead Code
|
||||
|
||||
**Unused Code Found**:
|
||||
- `unused_function()` in utils.py
|
||||
- Variable `UNUSED_CONSTANT` in config.py
|
||||
|
||||
**Recommendation**: Remove to improve maintainability
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
### Coverage Analysis
|
||||
|
||||
**Current Coverage**: [X]%
|
||||
|
||||
**Missing Critical Tests**:
|
||||
1. Edge case: [description]
|
||||
2. Error path: [description]
|
||||
3. Integration test: [description]
|
||||
|
||||
### Test Quality Issues
|
||||
|
||||
- [Issue with existing tests]
|
||||
- [Recommendation for improvement]
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
### Vulnerable Dependencies
|
||||
|
||||
| Package | Current | Vulnerability | Fix |
|---------|---------|---------------|-----|
| package-name | 1.0.0 | CVE-XXXX-XXXX | Upgrade to 1.1.0 |
|
||||
|
||||
### Outdated Dependencies
|
||||
|
||||
- [List of significantly outdated packages]
|
||||
|
||||
---
|
||||
|
||||
## Minor Issues and Suggestions
|
||||
|
||||
### Style and Conventions
|
||||
|
||||
**Note**: These should be handled by automated tools (ruff, isort, basedpyright) in CI/CD.
|
||||
|
||||
- [Only list if blocking automated tool adoption]
|
||||
|
||||
### Documentation
|
||||
|
||||
- Missing docstrings: [list key functions]
|
||||
- Unclear variable names: [examples]
|
||||
|
||||
---
|
||||
|
||||
## Positive Highlights
|
||||
|
||||
**Well-Implemented Features**:
|
||||
1. [Good pattern or implementation 1]
|
||||
2. [Good practice observed 2]
|
||||
3. [Security measure properly implemented]
|
||||
|
||||
---
|
||||
|
||||
## Recommendations Priority Matrix
|
||||
|
||||
### Immediate (Before Deployment)
|
||||
|
||||
1. [ ] Fix SQL injection vulnerability (file.py:123)
|
||||
2. [ ] Address race condition in payment processing (payment.py:456)
|
||||
3. [ ] Fix memory leak in upload handler (upload.py:789)
|
||||
|
||||
### High Priority (This Sprint)
|
||||
|
||||
1. [ ] Optimize N+1 query in user list (views.py:234)
|
||||
2. [ ] Add missing authentication check (api.py:567)
|
||||
3. [ ] Implement error handling in critical path (processor.py:890)
|
||||
|
||||
### Medium Priority (Next Sprint)
|
||||
|
||||
1. [ ] Refactor high complexity functions
|
||||
2. [ ] Add integration tests for payment flow
|
||||
3. [ ] Update vulnerable dependencies
|
||||
|
||||
### Low Priority (Backlog)
|
||||
|
||||
1. [ ] Remove dead code
|
||||
2. [ ] Improve documentation
|
||||
3. [ ] Consider architectural refactoring for module X
|
||||
|
||||
---
|
||||
|
||||
## Automated Tool Results Summary
|
||||
|
||||
- **Ruff**: [N] issues found
|
||||
- **Basedpyright**: [N] type errors
|
||||
- **Bandit**: [N] security issues
|
||||
- **Safety**: [N] vulnerable dependencies
|
||||
- **Performance Profiler**: [Summary of findings]
|
||||
|
||||
**Detailed reports**: See `review_results/` directory
|
||||
|
||||
---
|
||||
|
||||
## Conclusion
|
||||
|
||||
[Overall assessment paragraph summarizing the review, key takeaways, and next steps]
|
||||
|
||||
**Approval Status**: [Approved | Approved with Conditions | Requires Changes | Blocked]
|
||||
|
||||
**Next Steps**:
|
||||
1. [Action item 1]
|
||||
2. [Action item 2]
|
||||
3. [Action item 3]
|
||||
|
||||
---
|
||||
|
||||
**Review Conducted By**: Claude Code Python Review Skill
|
||||
**Tools Used**: ruff, basedpyright, isort, bandit, safety, performance_profiler
|
||||