Initial commit
347
skills/suggest-reviewers/SKILL.md
Normal file
@@ -0,0 +1,347 @@
---
name: Suggest Reviewers Helper
description: Git blame analysis helper for the suggest-reviewers command
---

# Suggest Reviewers Helper

This skill provides a Python helper script that analyzes git blame data for the `/git:suggest-reviewers` command. The script handles the complex task of identifying which lines were changed and who authored the original code.

## When to Use This Skill

Use this skill when implementing the `/git:suggest-reviewers` command. The helper script should be invoked during Step 3 of the command implementation (analyzing git blame for changed lines).

**DO NOT implement git blame analysis manually** - always use the provided `analyze_blame.py` script.

## Prerequisites

- Python 3.7 or higher (the script calls `subprocess.run` with `capture_output` and `text`, both added in Python 3.7)
- Git repository with commit history
- Git CLI available in PATH

## Helper Script: analyze_blame.py

The `analyze_blame.py` script automates the complex process of:
1. Parsing git diff output to identify specific line ranges that were modified
2. Running git blame on only the changed line ranges (not entire files)
3. Extracting and aggregating author information with statistics
4. Filtering out bot accounts and the current git user automatically

### Usage

**For uncommitted changes:**
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/suggest-reviewers/analyze_blame.py \
  --mode uncommitted \
  --file path/to/file1.go \
  --file path/to/file2.py \
  --output json
```

**For committed changes on a feature branch:**
```bash
python3 ${CLAUDE_PLUGIN_ROOT}/skills/suggest-reviewers/analyze_blame.py \
  --mode committed \
  --base-branch main \
  --file path/to/file1.go \
  --file path/to/file2.py \
  --output json
```

### Parameters

- `--mode`: Required. Either `uncommitted` or `committed`
  - `uncommitted`: Analyzes unstaged/staged changes against HEAD
  - `committed`: Analyzes committed changes against a base branch

- `--base-branch`: Required when mode is `committed`. The base branch to compare against (e.g., `main`, `master`)

- `--file`: Required at least once; can be specified multiple times. Each file to analyze for blame information. Only changed files should be passed.

- `--output`: Output format. Default is `json`. Options:
  - `json`: Machine-readable JSON output
  - `text`: Human-readable text output

### Output Format (JSON)

```json
{
  "Author Name": {
    "line_count": 45,
    "most_recent_date": "2024-10-15T14:23:10",
    "files": ["file1.go", "file2.go"],
    "email": "author@example.com"
  },
  "Another Author": {
    "line_count": 23,
    "most_recent_date": "2024-09-20T09:15:33",
    "files": ["file3.py"],
    "email": "another@example.com"
  }
}
```

### Output Fields

- `line_count`: Total number of blamed lines in the changed ranges authored by this person
- `most_recent_date`: ISO 8601 timestamp (local time, no UTC offset) of their most recent contribution to the changed code, or `null` if unknown
- `files`: Sorted array of files where this author has contributions in the changed lines
- `email`: Author's email address from git commits
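
As a rough illustration of consuming this structure, the sketch below loads the JSON into typed records. The `AuthorStats` class and `load_blame_json` function are hypothetical names introduced here for illustration; they are not part of `analyze_blame.py`.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class AuthorStats:
    """Hypothetical typed view of one entry in the JSON output."""
    line_count: int
    most_recent_date: Optional[datetime]
    files: List[str]
    email: Optional[str]


def load_blame_json(text: str) -> Dict[str, AuthorStats]:
    """Parse the script's JSON output into AuthorStats records keyed by author."""
    raw = json.loads(text)
    result = {}
    for name, stats in raw.items():
        date_str = stats.get("most_recent_date")
        result[name] = AuthorStats(
            line_count=stats["line_count"],
            # The timestamp may be null when no date was recorded.
            most_recent_date=datetime.fromisoformat(date_str) if date_str else None,
            files=list(stats["files"]),
            email=stats.get("email"),
        )
    return result


sample = '{"Author Name": {"line_count": 45, "most_recent_date": "2024-10-15T14:23:10", "files": ["file1.go"], "email": "author@example.com"}}'
authors = load_blame_json(sample)
print(authors["Author Name"].most_recent_date.year)
```

Converting the timestamp to `datetime` up front makes later recency comparisons straightforward.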

### Bot Filtering

The script automatically filters out common bot accounts:
- GitHub bots (e.g., `dependabot[bot]`, `renovate[bot]`)
- CI bots (e.g., `openshift-merge-robot`, `k8s-ci-robot`, `openshift-bot`)
- Generic bot patterns (any name containing `[bot]`)

Patterns are matched case-insensitively from the start of the author name. Lines authored by the current git user (`user.name` or `user.email`) are excluded as well.
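
A minimal sketch of this kind of filtering, assuming a small illustrative pattern list; the authoritative list lives in `analyze_blame.py` and may differ. The `drop_bots` helper is a hypothetical name used only here.

```python
import re

# Illustrative patterns only; the real list is defined in analyze_blame.py.
BOT_PATTERNS = [
    r".*\[bot\]",            # any name containing "[bot]"
    r"k8s-ci-robot",
    r"openshift-merge-robot",
    r"dependabot",
]


def is_bot(author: str) -> bool:
    """Return True if the author name matches any pattern, ignoring case."""
    return any(re.match(p, author, re.IGNORECASE) for p in BOT_PATTERNS)


def drop_bots(authors: dict) -> dict:
    """Return a copy of an author -> stats mapping without bot entries."""
    return {name: stats for name, stats in authors.items() if not is_bot(name)}


print(drop_bots({"renovate[bot]": {"line_count": 3}, "Bob": {"line_count": 1}}))
```

Because `re.match` anchors at the start of the string, a pattern such as `dependabot` also catches variants like `dependabot-preview`.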

## Implementation Steps

### Step 1: Collect changed files

Before invoking the script, collect the list of changed files based on the scenario:

**Uncommitted changes:**
```bash
# Get unstaged and staged files (deletions excluded), without duplicates
files=$( { git diff --name-only --diff-filter=d
           git diff --name-only --diff-filter=d --cached; } | sort -u )
```

**Committed changes:**
```bash
# Get files changed from base branch
files=$(git diff --name-only --diff-filter=d "${base_branch}...HEAD")
```

### Step 2: Invoke the script

Build the command with the appropriate mode and all changed files:

```bash
# Build the command as an array so paths with spaces survive intact
cmd=(python3 "${CLAUDE_PLUGIN_ROOT}/skills/suggest-reviewers/analyze_blame.py")

# Add mode
if [ "$has_uncommitted" = true ] || [ "$on_base_branch" = true ]; then
  cmd+=(--mode uncommitted)
else
  cmd+=(--mode committed --base-branch "$base_branch")
fi

# Add each file (one path per line in $files)
while IFS= read -r file; do
  [ -n "$file" ] && cmd+=(--file "$file")
done <<< "$files"

# Add output format
cmd+=(--output json)

# Execute and capture JSON output
blame_data=$("${cmd[@]}")
```
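
When the surrounding tooling is written in Python rather than shell, the same command can be assembled as an argument list. This is a sketch under that assumption; `build_blame_argv` and its parameters are hypothetical names, and only the flags documented above are used.

```python
from typing import List, Optional


def build_blame_argv(script_path: str, mode: str, files: List[str],
                     base_branch: Optional[str] = None) -> List[str]:
    """Assemble the analyze_blame.py command line as a list of arguments."""
    if mode == "committed" and not base_branch:
        raise ValueError("base_branch is required for committed mode")
    argv = ["python3", script_path, "--mode", mode]
    if mode == "committed":
        argv += ["--base-branch", base_branch]
    for path in files:
        # Each path is its own list element, so spaces need no quoting.
        argv += ["--file", path]
    argv += ["--output", "json"]
    return argv


argv = build_blame_argv("analyze_blame.py", "committed",
                        ["pkg/my file.go"], base_branch="main")
print(argv)
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True)` and the JSON read from `stdout`.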

### Step 3: Parse the output

The JSON output can be parsed using Python, jq, or any JSON parser:

```bash
# Example using jq to get top contributor
echo "$blame_data" | jq -r 'to_entries | sort_by(-.value.line_count) | .[0].key'

# Example using Python. The data is passed through the environment rather than
# interpolated into the source, so quotes in author names cannot break the script.
BLAME_DATA="$blame_data" python3 << 'EOF'
import json
import os

data = json.loads(os.environ["BLAME_DATA"])

# Sort by line count
sorted_authors = sorted(data.items(), key=lambda x: x[1]['line_count'], reverse=True)

for author, stats in sorted_authors:
    print(f"{author}: {stats['line_count']} lines, last modified {stats['most_recent_date']}")
EOF
```

### Step 4: Combine with OWNERS data

After getting blame data, merge it with OWNERS file information to produce the final ranked list of reviewers.
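
One possible way to combine the two sources is sketched below. The scoring rule (blamed line count plus a fixed bonus for OWNERS membership), the `rank_reviewers` function, and the `owner_bonus` value are all assumptions made for illustration; the actual ranking is defined by the `/git:suggest-reviewers` command. The sketch also assumes OWNERS entries have already been mapped to the same names that appear in the blame data.

```python
from typing import Dict, List, Set


def rank_reviewers(blame_data: Dict[str, dict], owners: Set[str],
                   owner_bonus: int = 50) -> List[str]:
    """Rank candidates by blamed line count, boosting OWNERS members.

    `owners` and `owner_bonus` are hypothetical inputs for illustration.
    """
    scores = {name: stats["line_count"] for name, stats in blame_data.items()}
    for name in owners:
        # OWNERS members with no blame hits still appear, scored by the bonus alone.
        scores[name] = scores.get(name, 0) + owner_bonus
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda name: (-scores[name], name))


blame = {"Alice Developer": {"line_count": 45}, "Bob Engineer": {"line_count": 12}}
ranking = rank_reviewers(blame, owners={"Bob Engineer", "Carol Approver"})
print(ranking)
```

With an empty `blame_data` mapping, the function degrades to an OWNERS-only ranking.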

## Error Handling

### No changed files

If no `--file` option is passed, `argparse` rejects the invocation with a usage message followed by:
```
analyze_blame.py: error: the following arguments are required: --file
```

**Resolution:** Ensure you've detected changed files correctly before invoking the script.

### Invalid mode

If an invalid mode is specified, `argparse` rejects it with a usage error (the exact wording varies slightly between Python versions):
```
analyze_blame.py: error: argument --mode: invalid choice: 'invalid' (choose from 'uncommitted', 'committed')
```

**Resolution:** Use either `--mode uncommitted` or `--mode committed`.

### Missing base branch in committed mode

If `--mode committed` is used without `--base-branch`, the script prints the following to stderr and exits with status 1:
```
Error: --base-branch required for 'committed' mode
```

**Resolution:** Provide the base branch: `--base-branch main`
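
### File not in repository

If a specified file is not tracked by git, `git diff` reports no changes for it, so it contributes no blame ranges and is skipped without a dedicated warning. If `git diff` itself fails in committed mode (for example, because the base branch is unknown), the script reports the failure on stderr and moves on to the next file:
```
Error running diff for path/to/file: <details from git>
```

**Resolution:** Skipped files can be safely ignored. For a diff error, check that the base branch exists locally.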

### No blame data found

If git blame returns no data for any files:
```json
{}
```

**Resolution:** This can happen if:
- All changed files are newly created (no blame history)
- All changes are in binary files
- Git blame is unable to run
- Every blamed line was authored by a bot or by the current git user

In this case, fall back to OWNERS-only suggestions.
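
The fallback decision can be sketched as follows. The function name `pick_strategy` and the two strategy labels are hypothetical, chosen only for this illustration.

```python
import json


def pick_strategy(script_stdout: str) -> str:
    """Return 'blame+owners' when blame data exists, else 'owners-only'."""
    text = script_stdout.strip()
    # Treat empty output the same as an empty JSON object.
    data = json.loads(text) if text else {}
    return "blame+owners" if data else "owners-only"


print(pick_strategy("{}"))
```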

## Examples

### Example 1: Analyze uncommitted changes

```bash
$ python3 analyze_blame.py --mode uncommitted --file src/main.go --file src/utils.go --output json
{
  "Alice Developer": {
    "line_count": 45,
    "most_recent_date": "2024-10-15T14:23:10",
    "files": ["src/main.go", "src/utils.go"],
    "email": "alice@example.com"
  },
  "Bob Engineer": {
    "line_count": 12,
    "most_recent_date": "2024-09-20T09:15:33",
    "files": ["src/main.go"],
    "email": "bob@example.com"
  }
}
```

### Example 2: Analyze committed changes on feature branch

```bash
$ python3 analyze_blame.py --mode committed --base-branch main --file pkg/controller/manager.go --output json
{
  "Charlie Contributor": {
    "line_count": 78,
    "most_recent_date": "2024-10-01T11:42:55",
    "files": ["pkg/controller/manager.go"],
    "email": "charlie@example.com"
  }
}
```

### Example 3: Text output format

```bash
$ python3 analyze_blame.py --mode uncommitted --file README.md --output text

Authors of modified code (2 found):

Alice Developer <alice@example.com>
  Lines: 23
  Most recent: 2024-10-15T14:23:10
  Files: README.md

Bob Engineer <bob@example.com>
  Lines: 5
  Most recent: 2024-08-12T16:30:21
  Files: README.md

```

### Example 4: Multiple files with mixed results

```bash
$ python3 analyze_blame.py --mode committed --base-branch release-4.15 \
    --file vendor/k8s.io/client-go/kubernetes/clientset.go \
    --file pkg/controller/node.go \
    --file docs/README.md \
    --output json
{
  "Diana Developer": {
    "line_count": 156,
    "most_recent_date": "2024-09-28T13:15:42",
    "files": ["vendor/k8s.io/client-go/kubernetes/clientset.go", "pkg/controller/node.go"],
    "email": "diana@example.com"
  },
  "Eve Technical Writer": {
    "line_count": 34,
    "most_recent_date": "2024-10-10T10:22:18",
    "files": ["docs/README.md"],
    "email": "eve@example.com"
  }
}
```

## Technical Details

### How the script works

1. **Determine diff range**: Based on mode, calculates what to compare:
   - `uncommitted`: Compares staged and unstaged changes against HEAD
   - `committed`: Compares HEAD against its merge base with the base branch (`<base>...HEAD`)

2. **Parse diff output**: Runs `git diff --unified=0` on each file and reads the hunk headers to identify:
   - Which line ranges of the old version were modified or deleted (these are the lines that have blame history)
   - For pure additions, up to five lines before the insertion point, since newly added lines have no history to blame

3. **Run git blame**: For each file and line range:
   - Runs `git blame -L start,end --line-porcelain <revision> -- file`
   - Parses porcelain format to extract author, email, and timestamp
   - Aggregates data across all changed lines

4. **Filter and aggregate**:
   - Removes bot accounts and the current git user
   - Groups by author name
   - Counts total lines per author
   - Tracks most recent contribution date
   - Lists all files each author contributed to

5. **Output results**: Formats as JSON or text based on `--output` parameter
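
The hunk-header parsing in step 2 can be sketched as below. This is a simplified illustration, not the script's exact code: `old_side_ranges` is a hypothetical name, and pure additions (old-side count of 0) are simply skipped here instead of being mapped to preceding context lines.

```python
import re
from typing import List, Tuple

# Matches "@@ -start[,count] +start[,count] @@" and captures the old side.
HUNK_RE = re.compile(r"^@@\s+-(\d+)(?:,(\d+))?\s+\+\d+(?:,\d+)?\s+@@")


def old_side_ranges(diff_text: str) -> List[Tuple[int, int]]:
    """Return (start, count) for the old side of every hunk with count > 0."""
    ranges = []
    for line in diff_text.splitlines():
        m = HUNK_RE.match(line)
        if not m:
            continue
        start = int(m.group(1))
        # An omitted count means the hunk covers exactly one line.
        count = int(m.group(2)) if m.group(2) is not None else 1
        if count > 0:
            ranges.append((start, count))
    return ranges


sample = "@@ -10,3 +10,4 @@ def f():\n-old\n@@ -42 +43,2 @@\n@@ -50,0 +53,2 @@\n"
print(old_side_ranges(sample))  # [(10, 3), (42, 1)]
```

Each `(start, count)` pair translates to a `-L start,end` argument with `end = start + count - 1`.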

### Performance considerations

- Only blames changed line ranges, not entire files (much faster for small changes to large files)
- Merges overlapping and adjacent ranges before blaming, so each line is blamed at most once
- Processes files sequentially, with one `git blame` call per merged range
- Skips binary files automatically: their diffs contain no hunk headers, so no blame call is made
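
Merging overlapping or adjacent ranges before blaming keeps the number of `git blame` calls down. A minimal sketch of such a merge, using `(start, count)` tuples; `merge_ranges` here is an illustrative stand-alone function, not the script's method.

```python
from typing import List, Tuple


def merge_ranges(ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge overlapping or adjacent (start, count) line ranges."""
    merged: List[List[int]] = []  # each item is [start, end], inclusive
    for start, count in sorted(ranges):
        end = start + count - 1
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e - s + 1) for s, e in merged]


print(merge_ranges([(10, 3), (12, 4), (20, 1), (16, 2)]))  # [(10, 8), (20, 1)]
```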

## Limitations

- Does not handle file renames/moves (treats as delete + add)
- Bot filtering is based on common patterns; custom bots may not be filtered
- Requires git history; newly initialized repos may not have useful data
- Does not consider commit message content or PR review history

## See Also

- Main command: `/git:suggest-reviewers` in `plugins/git/commands/suggest-reviewers.md`
- Git blame documentation: https://git-scm.com/docs/git-blame
- Git diff documentation: https://git-scm.com/docs/git-diff
380
skills/suggest-reviewers/analyze_blame.py
Normal file
@@ -0,0 +1,380 @@
#!/usr/bin/env python3
"""
Git Blame Analysis Helper for suggest-reviewers command.

This script helps identify the authors of code lines being modified in a PR,
aggregating git blame data to suggest the most relevant reviewers.

Usage:
    python analyze_blame.py --mode <uncommitted|committed> --file <filepath> [--base-branch <branch>]

Modes:
    uncommitted: Analyze uncommitted changes (compares against HEAD)
    committed: Analyze committed changes on feature branch (compares against base branch)
"""

import argparse
import json
import re
import subprocess
import sys
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Tuple, Optional


class BlameAnalyzer:
    """Analyzes git blame for changed lines in files."""

    # Bot patterns to filter out
    BOT_PATTERNS = [
        r'.*\[bot\]',
        r'openshift-bot',
        r'k8s-ci-robot',
        r'openshift-merge-robot',
        r'openshift-ci\[bot\]',
        r'dependabot',
        r'renovate\[bot\]',
    ]

    def __init__(self, mode: str, base_branch: Optional[str] = None):
        """
        Initialize the analyzer.

        Args:
            mode: 'uncommitted' or 'committed'
            base_branch: Base branch for committed mode (e.g., 'main')
        """
        self.mode = mode
        self.base_branch = base_branch
        self.authors = defaultdict(lambda: {
            'line_count': 0,
            'most_recent_date': None,
            'files': set(),
            'email': None
        })

        if mode == 'committed' and not base_branch:
            raise ValueError("base_branch required for 'committed' mode")

        # Get current user to exclude from suggestions
        self.current_user_name = self._get_git_config('user.name')
        self.current_user_email = self._get_git_config('user.email')

    def _get_git_config(self, key: str) -> Optional[str]:
        """Get a git config value."""
        try:
            result = subprocess.run(
                ['git', 'config', '--get', key],
                capture_output=True,
                text=True,
                check=False
            )
            if result.returncode == 0:
                return result.stdout.strip()
        except Exception:
            pass
        return None

    def is_bot(self, author: str) -> bool:
        """Check if an author name matches bot patterns."""
        for pattern in self.BOT_PATTERNS:
            if re.match(pattern, author, re.IGNORECASE):
                return True
        return False

    def is_current_user(self, author: str, email: Optional[str]) -> bool:
        """Check if the author is the current user."""
        if self.current_user_name and author == self.current_user_name:
            return True
        if self.current_user_email and email and email == self.current_user_email:
            return True
        return False

    def parse_diff_ranges(self, file_path: str) -> List[Tuple[int, int]]:
        """
        Parse git diff output to extract changed line ranges.

        Returns:
            List of (start_line, line_count) tuples for changed ranges
        """
        ranges = []

        try:
            if self.mode == 'uncommitted':
                # Staged changes (index vs HEAD)
                diff_cmd = ['git', 'diff', '--cached', '--unified=0', '--', file_path]
                result = subprocess.run(diff_cmd, capture_output=True, text=True, check=False)
                ranges.extend(self._extract_ranges_from_diff(result.stdout))

                # Staged and unstaged changes (working tree vs HEAD)
                diff_cmd = ['git', 'diff', 'HEAD', '--unified=0', '--', file_path]
                result = subprocess.run(diff_cmd, capture_output=True, text=True, check=False)
                ranges.extend(self._extract_ranges_from_diff(result.stdout))
            else:
                # Committed changes: compare against the merge base with the base branch
                diff_cmd = ['git', 'diff', f'{self.base_branch}...HEAD', '--unified=0', '--', file_path]
                result = subprocess.run(diff_cmd, capture_output=True, text=True, check=True)
                ranges.extend(self._extract_ranges_from_diff(result.stdout))

        except subprocess.CalledProcessError as e:
            print(f"Error running diff for {file_path}: {e}", file=sys.stderr)
            return []

        # Deduplicate and merge overlapping ranges
        return self._merge_ranges(ranges)

    def _extract_ranges_from_diff(self, diff_output: str) -> List[Tuple[int, int]]:
        """
        Extract line ranges from diff @@ markers.

        Diff format: @@ -old_start,old_count +new_start,new_count @@
        We want the 'old' ranges (lines being replaced/modified in the base)

        For pure additions (count=0), we analyze context lines before the insertion
        point to find relevant code owners.
        """
        ranges = []
        # Match @@ -start[,count] +start[,count] @@
        pattern = r'^@@\s+-(\d+)(?:,(\d+))?\s+\+\d+(?:,\d+)?\s+@@'

        for line in diff_output.split('\n'):
            match = re.match(pattern, line)
            if match:
                start = int(match.group(1))
                count = int(match.group(2)) if match.group(2) else 1

                if start > 0:
                    if count > 0:
                        # Regular modification/deletion
                        ranges.append((start, count))
                    else:
                        # Pure addition (count=0): 'start' is the old line right
                        # before the insertion point, so blame up to 5 lines
                        # ending at and including that line.
                        context_start = max(1, start - 4)
                        context_count = start - context_start + 1
                        ranges.append((context_start, context_count))

        return ranges

    def _merge_ranges(self, ranges: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
        """Merge overlapping line ranges."""
        if not ranges:
            return []

        # Sort by start line
        sorted_ranges = sorted(ranges, key=lambda x: x[0])
        merged = [sorted_ranges[0]]

        for start, count in sorted_ranges[1:]:
            last_start, last_count = merged[-1]
            last_end = last_start + last_count - 1
            current_end = start + count - 1

            # Check if ranges overlap or are adjacent
            if start <= last_end + 1:
                # Merge ranges
                new_end = max(last_end, current_end)
                new_count = new_end - last_start + 1
                merged[-1] = (last_start, new_count)
            else:
                merged.append((start, count))

        return merged

    def analyze_file(self, file_path: str) -> None:
        """
        Analyze git blame for a specific file.

        Args:
            file_path: Path to file relative to repo root
        """
        # Get changed line ranges
        ranges = self.parse_diff_ranges(file_path)

        if not ranges:
            return

        # Determine which revision to blame
        if self.mode == 'uncommitted':
            blame_target = 'HEAD'
        else:
            # The three-dot diff is taken against the merge base, so blame that
            # same commit; otherwise line numbers drift once the base branch advances.
            blame_target = self._merge_base() or self.base_branch

        # Run git blame on each range
        for start, count in ranges:
            end = start + count - 1
            self._blame_range(file_path, start, end, blame_target)

    def _merge_base(self) -> Optional[str]:
        """Return the merge base of the base branch and HEAD, or None on failure."""
        result = subprocess.run(
            ['git', 'merge-base', self.base_branch, 'HEAD'],
            capture_output=True, text=True, check=False
        )
        return result.stdout.strip() if result.returncode == 0 else None

    def _blame_range(self, file_path: str, start: int, end: int, revision: str) -> None:
        """
        Run git blame on a specific line range and extract author data.

        Args:
            file_path: File to blame
            start: Start line number
            end: End line number
            revision: Git revision to blame (e.g., 'HEAD', 'main')
        """
        try:
            # Use line-porcelain format: unlike --porcelain, it repeats the
            # author fields for every line, so each line can be counted.
            blame_cmd = [
                'git', 'blame',
                '--line-porcelain',
                '-L', f'{start},{end}',
                revision,
                '--',
                file_path
            ]

            result = subprocess.run(blame_cmd, capture_output=True, text=True, check=True)
            self._parse_blame_output(result.stdout, file_path)

        except subprocess.CalledProcessError as e:
            print(f"Error running blame on {file_path}:{start}-{end}: {e}", file=sys.stderr)

    def _parse_blame_output(self, blame_output: str, file_path: str) -> None:
        """
        Parse git blame porcelain output and aggregate author data.

        Porcelain format:
        <commit-hash> <original-line> <final-line> [<num-lines>]
        author <author-name>
        author-mail <email>
        author-time <unix-timestamp>
        ...
        \t<line-content>
        """
        lines = blame_output.split('\n')
        i = 0

        while i < len(lines):
            line = lines[i]

            # Check if this is a commit header line
            if line and not line.startswith('\t'):
                parts = line.split()
                # <num-lines> appears only on the first line of a group, so a
                # header has 3 or 4 fields; hashes are 40 (SHA-1) or 64 (SHA-256) chars.
                if len(parts) >= 3 and len(parts[0]) in (40, 64):
                    # Parse commit metadata
                    author = None
                    email = None
                    timestamp = None

                    # Look ahead for author info
                    j = i + 1
                    while j < len(lines) and not lines[j].startswith('\t'):
                        if lines[j].startswith('author '):
                            author = lines[j][7:]  # Remove 'author ' prefix
                        elif lines[j].startswith('author-mail '):
                            email = lines[j][12:].strip('<>')  # Remove 'author-mail ' and <>
                        elif lines[j].startswith('author-time '):
                            timestamp = int(lines[j][12:])
                        j += 1

                    # Update author data (exclude bots and current user)
                    if author and not self.is_bot(author) and not self.is_current_user(author, email):
                        author_date = datetime.fromtimestamp(timestamp) if timestamp is not None else None

                        self.authors[author]['line_count'] += 1
                        self.authors[author]['files'].add(file_path)
                        self.authors[author]['email'] = email

                        # Track most recent contribution
                        if author_date:
                            current_recent = self.authors[author]['most_recent_date']
                            if current_recent is None or author_date > current_recent:
                                self.authors[author]['most_recent_date'] = author_date

                    i = j
                    continue

            i += 1

    def get_results(self) -> Dict:
        """
        Get aggregated results as a dictionary.

        Returns:
            Dictionary mapping author names to their statistics
        """
        results = {}

        for author, data in self.authors.items():
            results[author] = {
                'line_count': data['line_count'],
                'most_recent_date': data['most_recent_date'].isoformat() if data['most_recent_date'] else None,
                'files': sorted(list(data['files'])),
                'email': data['email']
            }

        return results


def main():
    parser = argparse.ArgumentParser(
        description='Analyze git blame for changed lines to identify code authors'
    )
    parser.add_argument(
        '--mode',
        choices=['uncommitted', 'committed'],
        required=True,
        help='Analysis mode: uncommitted (vs HEAD) or committed (vs base branch)'
    )
    parser.add_argument(
        '--file',
        required=True,
        action='append',
        dest='files',
        help='File(s) to analyze (can be specified multiple times)'
    )
    parser.add_argument(
        '--base-branch',
        help='Base branch for committed mode (e.g., main, master)'
    )
    parser.add_argument(
        '--output',
        choices=['json', 'text'],
        default='json',
        help='Output format (default: json)'
    )

    args = parser.parse_args()

    # Validate arguments
    if args.mode == 'committed' and not args.base_branch:
        print("Error: --base-branch required for 'committed' mode", file=sys.stderr)
        sys.exit(1)

    # Analyze files
    analyzer = BlameAnalyzer(mode=args.mode, base_branch=args.base_branch)

    for file_path in args.files:
        analyzer.analyze_file(file_path)

    # Output results
    results = analyzer.get_results()

    if args.output == 'json':
        print(json.dumps(results, indent=2))
    else:
        # Text output
        print(f"\nAuthors of modified code ({len(results)} found):\n")

        # Sort by line count
        sorted_authors = sorted(
            results.items(),
            key=lambda x: x[1]['line_count'],
            reverse=True
        )

        for author, data in sorted_authors:
            print(f"{author} <{data['email']}>")
            print(f"  Lines: {data['line_count']}")
            print(f"  Most recent: {data['most_recent_date'] or 'unknown'}")
            print(f"  Files: {', '.join(data['files'])}")
            print()


if __name__ == '__main__':
    main()