207 lines
7.6 KiB
Markdown
Executable File
207 lines
7.6 KiB
Markdown
Executable File
---
|
|
model: claude-haiku-4-5-20251001
|
|
allowed-tools: Bash(git branch:*), Bash(git status:*), Bash(git log:*), Bash(git diff:*), mcp__*, mcp__ado__repo_get_repo_by_name_or_id, mcp__ado__repo_list_pull_requests_by_repo_or_project, mcp__ado__repo_get_pull_request_by_id, mcp__ado__repo_list_pull_request_threads, mcp__ado__repo_list_pull_request_thread_comments, mcp__ado__repo_create_pull_request_thread, mcp__ado__repo_reply_to_comment, mcp__ado__repo_update_pull_request, mcp__ado__repo_search_commits, mcp__ado__pipelines_get_builds, Read, Task
|
|
argument-hint: [PR_ID] (optional - if not provided, will list all open PRs)
|
|
# PR Review and Approval
|
|
---
|
|
## Task
|
|
Review open pull requests in the current repository and approve/complete them if they meet quality standards.
|
|
|
|
## Instructions
|
|
|
|
### 1. Get Repository Information
|
|
- Use `mcp__ado__repo_get_repo_by_name_or_id` with:
|
|
- Project: `Program Unify`
|
|
- Repository: `unify_2_1_dm_synapse_env_d10`
|
|
- Extract repository ID: `d3fa6f02-bfdf-428d-825c-7e7bd4e7f338`
|
|
|
|
### 2. List Open Pull Requests
|
|
- Use `mcp__ado__repo_list_pull_requests_by_repo_or_project` with:
|
|
- Repository ID: `d3fa6f02-bfdf-428d-825c-7e7bd4e7f338`
|
|
- Status: `Active`
|
|
- If `$ARGUMENTS` provided, filter to that specific PR ID
|
|
- Display all open PRs with key details (ID, title, source/target branches, author)
|
|
|
|
### 3. Review Each Pull Request
|
|
For each PR (or the specified PR):
|
|
|
|
#### 3.1 Get PR Details
|
|
- Use `mcp__ado__repo_get_pull_request_by_id` to get full PR details
|
|
- Check merge status - if conflicts exist, stop and report
|
|
|
|
#### 3.2 Get PR Changes
|
|
- Use `mcp__ado__repo_search_commits` to get commits in the PR
|
|
- Identify files changed and scope of changes
|
|
|
|
#### 3.3 Review Code Quality
|
|
Read changed files and evaluate:
|
|
1. **Code Quality & Maintainability**
|
|
- Proper use of type hints and descriptive variable names
|
|
- Maximum line length (240 chars) compliance
|
|
- No blank lines inside functions
|
|
- Proper import organization
|
|
- Use of `@synapse_error_print_handler` decorator
|
|
- Proper error handling with meaningful messages
|
|
|
|
2. **PySpark Best Practices**
|
|
- DataFrame operations over raw SQL
|
|
- Proper use of `TableUtilities` methods
|
|
- Correct logging with `NotebookLogger`
|
|
- Proper session management
|
|
|
|
3. **ETL Pattern Compliance**
|
|
- Follows ETL class pattern for Silver/Gold layers
|
|
- Proper extract/transform/load method structure
|
|
- Correct database and table naming conventions
|
|
|
|
4. **Standards Compliance**
|
|
- Follows project coding standards from `.claude/rules/python_rules.md`
|
|
- No missing docstrings (unless explicitly instructed to omit)
|
|
- Proper use of configuration from `configuration.yaml`
|
|
|
|
#### 3.4 Review DevOps Considerations
|
|
1. **CI/CD Integration**
|
|
- Changes compatible with existing pipeline
|
|
- No breaking changes to deployment process
|
|
|
|
2. **Configuration & Infrastructure**
|
|
- Proper environment detection pattern
|
|
- Azure integration handled correctly
|
|
- No hardcoded paths or credentials
|
|
|
|
3. **Testing & Quality Gates**
|
|
- Syntax validation would pass
|
|
- Linting compliance (ruff check)
|
|
- Test coverage for new functionality
|
|
|
|
#### 3.5 Deep PySpark Analysis (Conditional)
|
|
**Only execute if PR modifies PySpark ETL code**
|
|
|
|
Check if PR changes affect:
|
|
- `python_files/pipeline_operations/bronze_layer_deployment.py`
|
|
- `python_files/pipeline_operations/silver_dag_deployment.py`
|
|
- `python_files/pipeline_operations/gold_dag_deployment.py`
|
|
- Any files in `python_files/silver/`
|
|
- Any files in `python_files/gold/`
|
|
- `python_files/utilities/session_optimiser.py`
|
|
|
|
**If PySpark files are modified, use Task tool to launch pyspark-engineer agent:**
|
|
|
|
```
|
|
Task tool parameters:
|
|
- subagent_type: "pyspark-engineer"
|
|
- description: "Deep PySpark analysis for PR #[PR_ID]"
|
|
- prompt: "
|
|
Perform expert-level PySpark analysis for PR #[PR_ID]:
|
|
|
|
PR Details:
|
|
- Title: [PR_TITLE]
|
|
- Changed Files: [LIST_OF_CHANGED_FILES]
|
|
- Source Branch: [SOURCE_BRANCH]
|
|
- Target Branch: [TARGET_BRANCH]
|
|
|
|
Review Requirements:
|
|
1. Read all changed PySpark files
|
|
2. Analyze transformation logic for:
|
|
- Partitioning strategies and data skew
|
|
- Shuffle optimisation opportunities
|
|
- Broadcast join usage and optimisation
|
|
- Memory management and caching strategies
|
|
- DataFrame operation efficiency
|
|
3. Validate Medallion Architecture compliance:
|
|
- Bronze layer: Raw data preservation patterns
|
|
- Silver layer: Cleansing and standardization
|
|
- Gold layer: Business model optimisation
|
|
4. Check performance considerations:
|
|
- Identify potential bottlenecks
|
|
- Suggest optimisation opportunities
|
|
- Validate cost-efficiency patterns
|
|
5. Verify test coverage:
|
|
- Check for pytest test files
|
|
- Validate test completeness
|
|
- Suggest missing test scenarios
|
|
6. Review production readiness:
|
|
- Error handling for data pipeline failures
|
|
- Idempotent operation design
|
|
- Monitoring and logging completeness
|
|
|
|
Provide detailed findings in this format:
|
|
|
|
## PySpark Analysis Results
|
|
|
|
### Critical Issues (blocking)
|
|
- [List any critical performance or correctness issues]
|
|
|
|
### Performance Optimisations
|
|
- [Specific optimisation recommendations]
|
|
|
|
### Architecture Compliance
|
|
- [Medallion architecture adherence assessment]
|
|
|
|
### Test Coverage
|
|
- [Test completeness and gaps]
|
|
|
|
### Recommendations
|
|
- [Specific actionable improvements]
|
|
|
|
Return your analysis for integration into the PR review.
|
|
"
|
|
```
|
|
|
|
**Integration of PySpark Analysis:**
|
|
- If pyspark-engineer identifies critical issues → Add to review comments
|
|
- If optimisations suggested → Add as optional improvement comments
|
|
- If architecture violations found → Add as required changes
|
|
- Include all findings in final review summary
|
|
|
|
### 4. Provide Review Comments
|
|
- Use `mcp__ado__repo_list_pull_request_threads` to check existing review comments
|
|
- If issues found, use `mcp__ado__repo_create_pull_request_thread` to add:
|
|
- Specific file-level comments with line numbers
|
|
- Clear description of issues
|
|
- Suggested improvements
|
|
- Mark as `Active` status if changes required
|
|
|
|
### 5. Approve and Complete PR (if satisfied)
|
|
**Only proceed if ALL criteria met:**
|
|
- No merge conflicts
|
|
- Code quality standards met
|
|
- PySpark best practices followed
|
|
- ETL patterns correct
|
|
- No DevOps concerns
|
|
- Proper error handling and logging
|
|
- Standards compliant
|
|
- **PySpark analysis (if performed) shows no critical issues**
|
|
- **Performance optimisations either implemented or deferred with justification**
|
|
- **Medallion architecture compliance validated**
|
|
|
|
**If approved:**
|
|
1. Use `mcp__ado__repo_update_pull_request` with:
|
|
- Set `autoComplete: true`
|
|
- Set `mergeStrategy: "NoFastForward"` (or "Squash" if many small commits)
|
|
- Set `deleteSourceBranch: false` (preserve branch history)
|
|
- Set `transitionWorkItems: true`
|
|
- Add approval comment explaining what was reviewed
|
|
|
|
2. Confirm completion with summary:
|
|
- PR ID and title
|
|
- Number of commits reviewed
|
|
- Key changes identified
|
|
- Approval rationale
|
|
|
|
### 6. Report Results
|
|
Provide comprehensive summary:
|
|
- Total open PRs reviewed
|
|
- PRs approved and completed (with IDs)
|
|
- PRs requiring changes (with summary of issues)
|
|
- PRs blocked by merge conflicts
|
|
- **PySpark analysis findings (if performed)**
|
|
- **Performance optimisation recommendations**
|
|
|
|
## Important Notes
|
|
- **No deferrals**: All identified issues must be addressed before approval
|
|
- **Immediate action**: If improvements needed, request them now - no "future work" comments
|
|
- **Thorough review**: Check both code quality AND DevOps considerations
|
|
- **Professional objectivity**: Prioritize technical accuracy over validation
|
|
- **Merge conflicts**: Do NOT approve PRs with merge conflicts - report them for manual resolution
|