Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 17:54:49 +08:00
commit ee11345c5b
28 changed files with 6747 additions and 0 deletions

commands/evi.md Normal file

@@ -0,0 +1,669 @@
---
description: Evaluate GitHub issue quality and completeness for agent implementation
argument-hint: [issue number or URL]
---
# EVI - Evaluate Issue Quality
Evaluates a GitHub issue to ensure it contains everything needed for the agent implementation framework to succeed.
## Input
**Issue Number or URL:**
```
$ARGUMENTS
```
## Agent Framework Requirements
The issue must support all agents in the pipeline:
**issue-implementer** needs:
- Clear requirements and end state
- Files/components to create or modify
- Edge cases and error handling specs
- Testing expectations
**quality-checker** needs:
- Testable acceptance criteria matching automated checks
- Performance benchmarks (if applicable)
- Expected test outcomes
**security-checker** needs:
- Security considerations and requirements
- Authentication/authorization specs
- Data handling requirements
**doc-checker** needs:
- Documentation update requirements
- What needs README/wiki updates
- API documentation expectations
**review-orchestrator** needs:
- Clear pass/fail criteria
- Non-negotiable vs nice-to-have requirements
**issue-merger** needs:
- Unambiguous "done" definition
- Integration requirements
## Evaluation Criteria
### ✅ CRITICAL (Must Have)
**1. Clear Requirements**
- Exact end state specified
- Technical details: names, types, constraints, behavior
- Files/components to create or modify
- Validation rules and error handling
- Edge cases covered
**2. Testable Acceptance Criteria**
- Specific, measurable outcomes
- Aligns with automated checks (pytest, ESLint, TypeScript, build)
- Includes edge cases
- References quality checks: "All tests pass", "ESLint passes", "Build succeeds"
**3. Affected Components**
- Which files/modules to modify
- Which APIs/endpoints involved
- Which database tables affected
- Which UI components changed
**4. Testing Expectations**
- What tests need to be written
- What tests need to pass
- Performance benchmarks (if applicable)
- Integration test requirements
**5. Context & Why**
- Business value
- User impact
- Current limitation
- Why this matters now
### ⚠️ IMPORTANT (Should Have)
**6. Security Requirements**
- Authentication/authorization needs
- Data privacy considerations
- Input validation requirements
- Security vulnerabilities to avoid
**7. Documentation Requirements**
- What needs README updates
- What needs wiki/API docs
- Inline code comments expected
- FEATURE-LIST.md updates
**8. Error Handling**
- Expected error messages
- Error codes to return
- Fallback behavior
- User-facing error text
**9. Scope Boundaries**
- What IS included
- What is NOT included
- Out of scope items
- Future work references
### 💡 HELPFUL (Nice to Have)
**10. Performance Requirements** (OPTIONAL - only if specific concern)
- Response time limits
- Query performance expectations
- Scale requirements
- Load handling
- **Note:** Most issues don't need this. Build first, measure, optimize later.
**11. Related Issues**
- Dependencies (blocked by, depends on)
- Related work
- Follow-up issues planned
**12. Implementation Guidance**
- Problems agent needs to solve
- Existing patterns to follow
- Challenges to consider
- No prescriptive solutions (guides, doesn't prescribe)
### ❌ RED FLAGS (Must NOT Have)
**13. Prescriptive Implementation**
- Complete function/component implementations (full solutions)
- Large blocks of working code (> 10-15 lines)
- Complete SQL migration scripts
- Step-by-step implementation guide
- "Add this code to line X" with specific file/line fixes
**✅ OK to have:**
- Short code examples (< 10 lines): `{ error: 'message', code: 400 }`
- Type definitions: `{ email: string, category?: string }`
- Example API responses: `{ id: 5, status: 'active' }`
- Error message formats: `"Invalid email format"`
- Small syntax examples showing the shape/format
## Evaluation Process
### STEP 1: Fetch the Issue
```bash
ISSUE_NUM=$ARGUMENTS
# Fetch issue content
gh issue view $ISSUE_NUM --json title,body,labels > issue-data.json
TITLE=$(jq -r '.title' issue-data.json)
BODY=$(jq -r '.body' issue-data.json)
LABELS=$(jq -r '.labels[].name' issue-data.json)
echo "Evaluating Issue #${ISSUE_NUM}: ${TITLE}"
```
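A small guard like the following can be added if you want the command to fail fast when the issue cannot be fetched (a sketch, not part of the command above; it assumes `gh` is already authenticated):
```bash
# Optional guard (sketch): stop early if the issue cannot be fetched at all
if ! gh issue view "$ISSUE_NUM" --json title >/dev/null 2>&1; then
  echo "❌ Could not fetch issue #$ISSUE_NUM - check the number and gh auth status"
  exit 1
fi
```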
### STEP 2: Check Agent Framework Compatibility
Evaluate for each agent in the pipeline:
**For issue-implementer:**
```bash
# Check: Clear requirements present?
echo "$BODY" | grep -iE '(requirements|end state|target state)' || echo "⚠️ No clear requirements section"
# Check: Files/components specified?
echo "$BODY" | grep -iE '(files? (to )?(create|modify)|components? (to )?(create|modify|affected)|tables? (to )?(create|modify))' || echo "⚠️ No affected files/components specified"
# Check: Edge cases covered?
echo "$BODY" | grep -iE '(edge cases?|error handling|fallback|validation)' || echo "⚠️ No edge cases or error handling mentioned"
```
**For quality-checker:**
```bash
# Check: Acceptance criteria reference automated checks?
echo "$BODY" | grep -iE '(tests? pass|eslint|flake8|mypy|pytest|typescript|build succeeds|linting passes)' || echo "⚠️ Acceptance criteria don't reference automated quality checks"
# Note: Performance requirements are optional - don't check for them
```
**For security-checker:**
```bash
# Check: Security requirements present if handling sensitive data?
if echo "$BODY" | grep -iE '(password|token|secret|api.?key|auth|credential|email|private)'; then
echo "$BODY" | grep -iE '(security|authentication|authorization|encrypt|sanitize|validate|sql.?injection|xss)' || echo "⚠️ Handles sensitive data but no security requirements"
fi
```
**For doc-checker:**
```bash
# Check: Documentation requirements specified?
echo "$BODY" | grep -iE '(documentation|readme|wiki|api.?docs?|comments?|feature.?list)' || echo "⚠️ No documentation requirements specified"
```
**For review-orchestrator:**
```bash
# Check: Clear pass/fail criteria?
CRITERIA_COUNT=$(echo "$BODY" | grep -c -e '- \[.\]' || true)  # -e protects the leading dash; grep -c already prints 0 on no match
if [ "$CRITERIA_COUNT" -lt 3 ]; then
  echo "⚠️ Only $CRITERIA_COUNT acceptance criteria (need at least 3-5)"
fi
```
### STEP 3: Evaluate Testable Acceptance Criteria
Check if criteria align with automated checks (a quick grep sketch follows the lists below):
**Good criteria (matches quality-checker):**
- [x] "All pytest tests pass" → quality-checker runs pytest
- [x] "ESLint passes with no errors" → quality-checker runs ESLint
- [x] "TypeScript compilation succeeds" → quality-checker runs tsc
- [x] "Query completes in < 100ms" → measurable, testable
- [x] "Can insert 5 accounts per user" → specific, verifiable
**Bad criteria (vague/unmeasurable):**
- [ ] "Works correctly" → subjective
- [ ] "Code looks good" → subjective
- [ ] "Function created" → process-oriented, not outcome
- [ ] "Bug is fixed" → not specific enough
### STEP 4: Check for Affected Components
Does issue specify:
- Which files to create?
- Which files to modify?
- Which APIs/endpoints involved?
- Which database tables affected?
- Which UI components changed?
- Which tests need updating?
### STEP 5: Evaluate Testing Expectations
Does issue specify:
- What tests to write (unit, integration, e2e)?
- What existing tests need to pass?
- What test coverage is expected?
- Performance test requirements?
### STEP 6: Check Security Requirements
**Security (if applicable):**
- Authentication/authorization requirements?
- Input validation specs?
- Data privacy considerations?
- Known vulnerabilities to avoid?
**Note on Performance:**
Performance requirements are OPTIONAL and NOT evaluated as a critical check. Most issues don't need explicit performance requirements - build first, measure, optimize later. Only flag as missing if:
- Issue specifically mentions performance as a concern
- Feature handles large-scale data (millions of records)
- There's a user-facing latency requirement
Otherwise, absence of performance requirements is fine.
### STEP 7: Check Documentation Requirements
Does issue specify:
- README updates needed?
- Wiki/API docs updates?
- Inline comments expected?
- FEATURE-LIST.md updates?
### STEP 8: Scan for Red Flags
```bash
# RED FLAG: Large code blocks (> 15 lines)
# Count lines in code blocks
CODE_BLOCKS=$(grep -E '```(typescript|javascript|python|sql|java|go|rust)' <<< "$BODY")
if [ ! -z "$CODE_BLOCKS" ]; then
# Check if any code block has > 15 lines
# (This is a simplified check - full implementation would count lines per block)
echo "⚠️ Found code blocks - checking size..."
# Manual review needed: Are these short examples or full implementations?
fi
# RED FLAG: Complete function implementations
grep -iE '(function|def|class).*\{' <<< "$BODY" && echo "🚨 Contains complete function/class implementations"
# RED FLAG: Prescriptive instructions with specific fixes
grep -iE '(add (this|the following) code to line [0-9]+|here is the implementation|use this exact code)' <<< "$BODY" && echo "🚨 Contains prescriptive code placement instructions"
# RED FLAG: Specific file/line references for bug fixes
grep -E '(fix|change|modify|add).*(in|at|on) line [0-9]+' <<< "$BODY" && echo "🚨 Contains specific file/line fix locations"
# RED FLAG: Step-by-step implementation guide (not just planning)
grep -iE '(step [0-9]|first,.*second,.*third)' <<< "$BODY" | grep -iE '(write this code|add this function|implement as follows)' && echo "🚨 Contains step-by-step code implementation guide"
# OK: Short examples are fine
# These are NOT red flags:
# - Type definitions: { email: string }
# - Example responses: { id: 5, status: 'active' }
# - Error formats: "Invalid email"
# - Small syntax examples (< 5 lines)
```
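If a stricter size check is wanted, a minimal awk sketch along these lines counts the lines inside each fenced block (an illustration, not part of the check above; it assumes `$BODY` holds the issue body fetched in STEP 1):
```bash
# Sketch: count lines inside each fenced code block and flag blocks over 15 lines
echo "$BODY" | awk '
  /^```/ {
    if (in_block) {              # closing fence: report if the block was too large
      if (lines > 15) print "🚨 Code block with " lines " lines (> 15) - possible full implementation"
      in_block = 0
    } else {                     # opening fence: start counting
      in_block = 1; lines = 0
    }
    next
  }
  in_block { lines++ }
'
```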
## Output Format
```markdown
# Issue #${ISSUE_NUM} Evaluation Report
**Title:** ${TITLE}
## Agent Framework Compatibility: X/9 Critical Checks Passed
**Ready for implementation?** [YES / NEEDS_WORK / NO]
**Note:** Performance, Related Issues, and Implementation Guidance are optional - not counted in score.
### ✅ Strengths (What's Good)
- [Specific strength 1]
- [Specific strength 2]
- [Specific strength 3]
### ⚠️ Needs Improvement (Fix Before Implementation)
- [Missing element with specific impact on agent]
- [Missing element with specific impact on agent]
- [Missing element with specific impact on agent]
### ❌ Critical Issues (Blocks Agent Success)
- [Critical missing element / red flag]
- [Critical missing element / red flag]
### 🚨 Red Flags Found
- [Code snippet / prescriptive instruction found]
- [Specific file/line reference found]
## Agent-by-Agent Analysis
### issue-implementer Readiness: [READY / NEEDS_WORK / BLOCKED]
**Can the implementer succeed with this issue?**
**Has:**
- Clear requirements and end state
- Files/components to modify specified
- Edge cases covered
**Missing:**
- Error handling specifications
- Validation rules not detailed
**Impact:** [Description of how missing elements affect implementer]
### quality-checker Readiness: [READY / NEEDS_WORK / BLOCKED]
**Can the quality checker validate this?**
**Has:**
- Acceptance criteria reference "pytest passes"
- Performance benchmark: "< 100ms"
**Missing:**
- No reference to ESLint/TypeScript checks
- Test coverage expectations not specified
**Impact:** [Description of how missing elements affect quality validation]
### security-checker Readiness: [READY / NEEDS_WORK / N/A]
**Are security requirements clear?**
**Has:**
- Authentication requirements specified
- Input validation rules defined
**Missing:**
- No SQL injection prevention mentioned
- Data encryption requirements unclear
**Impact:** [Description of security gaps]
### doc-checker Readiness: [READY / NEEDS_WORK / BLOCKED]
**Are documentation expectations clear?**
**Has:**
- README update requirement specified
**Missing:**
- No API documentation mentioned
- FEATURE-LIST.md updates not specified
**Impact:** [Description of doc gaps]
### review-orchestrator Readiness: [READY / NEEDS_WORK / BLOCKED]
**Are pass/fail criteria clear?**
**Has:**
- 5 specific acceptance criteria
- Clear success conditions
**Missing:**
- Distinction between blocking vs non-blocking issues
- Performance criteria not measurable
**Impact:** [Description of review ambiguity]
## Recommendations
### High Priority (Fix Before Implementation)
1. [Specific actionable fix]
2. [Specific actionable fix]
3. [Specific actionable fix]
### Medium Priority (Improve Clarity)
1. [Specific suggestion]
2. [Specific suggestion]
### Low Priority (Nice to Have)
1. [Optional improvement]
2. [Optional improvement]
## Example Improvements
### Before (Current):
```markdown
[Quote problematic section from issue]
```
### After (Suggested):
```markdown
[Show how it should be written]
```
## Agent Framework Compatibility
**Will this issue work well with the agent framework?**
- ✅ YES - Issue is well-structured for agents
- ⚠️ MAYBE - Needs improvements but workable
- ❌ NO - Critical issues will confuse agents
**Specific concerns:**
- [Agent compatibility issue 1]
- [Agent compatibility issue 2]
## Quick Fixes
If you want to improve this issue, run:
```bash
# Add these sections
gh issue comment $ISSUE_NUM --body "## Implementation Guidance
To implement this, you will need to:
- [List problems to solve]
"
# Remove code snippets
# Edit the issue and remove code blocks showing implementation
gh issue edit $ISSUE_NUM
# Add acceptance criteria
gh issue comment $ISSUE_NUM --body "## Additional Acceptance Criteria
- [ ] [Specific testable criterion]
"
```
## Summary
**Ready for agent implementation?** [YES/NO/NEEDS_WORK]
**Confidence level:** [HIGH/MEDIUM/LOW]
**Estimated time to fix issues:** [X minutes]
```
## Agent Framework Compatibility Score
**13 checks in total (5 critical, 4 important, 3 helpful, 1 red-flag check):**
### ✅ CRITICAL (Must Pass) - 5 checks
1. **Clear Requirements** - End state specified with technical details
2. **Testable Acceptance Criteria** - Aligns with automated checks (pytest, ESLint, etc.)
3. **Affected Components** - Files/modules to create or modify listed
4. **Testing Expectations** - What tests to write/pass specified
5. **Context** - Why, who, business value explained
### ⚠️ IMPORTANT (Should Pass) - 4 checks
6. **Security Requirements** - If handling sensitive data/auth
7. **Documentation Requirements** - README/wiki/API docs expectations
8. **Error Handling** - Error messages, codes, fallback behavior
9. **Scope Boundaries** - What IS and ISN'T included
### 💡 HELPFUL (Nice to Pass) - 3 checks
10. **Performance Requirements** - OPTIONAL, only if specific concern mentioned
11. **Related Issues** - Dependencies, related work
12. **Implementation Guidance** - Problems to solve (without prescribing HOW)
### ❌ RED FLAG (Must NOT Have) - 1 check
13. **No Prescriptive Implementation** - No code snippets, algorithms, step-by-step
**Scoring (based on 9 critical/important checks):**
- **9/9 + No Red Flags**: Perfect - Ready for agents
- **7-8/9 + No Red Flags**: Excellent - Minor improvements
- **5-6/9 + No Red Flags**: Good - Some gaps but workable
- **3-4/9 + No Red Flags**: Needs work - Significant gaps
- **< 3/9 OR Red Flags Present**: Blocked - Must fix before implementation
**Note:** Performance, Related Issues, and Implementation Guidance are HELPFUL but not required.
**Ready for Implementation?**
- **YES**: 7+ checks passed, no red flags
- **NEEDS_WORK**: 5-6 checks passed OR minor red flags
- **NO**: < 5 checks passed OR major red flags (code snippets, step-by-step)
## Examples
### Example 1: Excellent Issue (9/9 Checks Passed)
```
Issue #42: "Support multiple Google accounts per user"
✅ Agent Framework Compatibility: 9/9
**Ready for implementation?** YES
Strengths:
- Clear database requirements (tables, columns, constraints)
- Acceptance criteria: "pytest passes", "Build succeeds"
- Files specified: user_google_accounts table, modify user_google_tokens
- Testing: "Write tests for multi-account insertion"
- Security: "Validate email uniqueness per user"
- Documentation: "Update README with multi-account setup"
- Error handling: "Return 409 if duplicate email"
- Scope: "Account switching UI is out of scope (Issue #43)"
- Context: Explains why users need personal + work accounts
Agent-by-Agent:
✅ implementer: Has everything needed
✅ quality-checker: Criteria match automated checks
✅ security-checker: Security requirements clear
✅ doc-checker: Documentation expectations specified
✅ review-orchestrator: Clear pass/fail criteria
Note: No performance requirements specified - this is fine. Build first, optimize later if needed.
```
### Example 2: Needs Work (6/9 Checks Passed)
```
Issue #55: "Add email search functionality"
⚠️ Agent Framework Compatibility: 6/9
**Ready for implementation?** NEEDS_WORK
Strengths:
- Clear requirements: search endpoint, query parameters
- Context explains user need
- Affected components specified
Issues:
❌ Missing testing expectations (what tests to write?)
❌ Missing documentation requirements
❌ Missing error handling specs
❌ Vague acceptance criteria: "Search works correctly"
Agent-by-Agent:
✅ implementer: Has requirements
⚠️ quality-checker: Criteria too vague ("works correctly")
⚠️ security-checker: No SQL injection prevention mentioned
⚠️ doc-checker: No docs requirements
⚠️ review-orchestrator: Criteria not specific enough
Recommendations:
- Add: "pytest tests pass for search functionality"
- Add: "Update API documentation"
- Add: "Error handling: Return 400 if invalid query, 500 if database error"
- Replace "works correctly" with specific outcomes: "Returns matching emails", "Handles empty results"
Note: Performance requirements are optional - don't need to add unless there's a specific concern.
```
### Example 3: Blocked (2/9, Red Flags Present)
```
Issue #67: "Fix email parsing bug"
❌ Agent Framework Compatibility: 2/9
**Ready for implementation?** NO
Critical Issues:
🚨 Contains 30-line complete function implementation
🚨 Prescriptive: "Add this exact function to inbox.ts line 45"
🚨 Step-by-step: "First, create parseEmail(). Second, add validation. Third, handle errors."
❌ No clear requirements (what should parsing do?)
❌ No acceptance criteria
❌ No testing expectations
Agent-by-Agent:
❌ implementer: Doesn't need to think - solution provided
❌ quality-checker: Can't validate (no criteria)
❌ security-checker: Missing validation specs
❌ doc-checker: No docs requirements
❌ review-orchestrator: No pass/fail criteria
Must fix:
1. Remove complete function - describe WHAT parsing should do instead
- ✅ OK to keep: "Output format: `{ local: string, domain: string }`"
- ✅ OK to keep: "Error message: 'Invalid email format'"
- ❌ Remove: 30-line function implementation
2. Add requirements: input format, output format, error handling
3. Add acceptance criteria: "Parses valid emails", "Rejects invalid formats"
4. Specify testing: "Add unit tests for edge cases"
5. Remove line 45 reference - describe behavior instead
6. Convert step-by-step to "Implementation Guidance" section
```
### Example 4: Good Use of Code Examples (9/9)
```
Issue #78: "Add user profile endpoint"
✅ Agent Framework Compatibility: 9/9
**Ready for implementation?** YES
Requirements:
- Endpoint: POST /api/profile
- Request body: `{ name: string, bio?: string }` ← Short example (OK!)
- Response: `{ id: number, name: string, bio: string | null }` ← Type def (OK!)
- Error: `{ error: "Name required", code: 400 }` ← Error format (OK!)
✅ Good use of code examples:
- Shows shape/format without providing implementation
- Type definitions help clarify requirements
- Error format specifies exact user-facing text
- No function implementations
Agent-by-Agent:
✅ implementer: Clear requirements, no prescribed solution
✅ quality-checker: Testable criteria
✅ security-checker: Input validation specified
✅ doc-checker: API docs requirement present
```
## Integration with GHI Command
This command complements GHI:
- **GHI**: Creates new issues following best practices
- **EVI**: Evaluates existing issues against same standards
Use EVI to:
- Review issues written by other agents
- Audit issues before sending to implementation
- Train team on issue quality standards
- Ensure agent framework compatibility
## Notes
- Run EVI before implementing any issue
- Use it to review issues written by other agents or teammates
- Consider making EVI a required check in your workflow
- **7+ checks passed**: Agent-ready, implement immediately
- **5-6 checks passed**: Needs minor improvements, but workable
- **< 5 checks passed**: Must revise before implementation
- Look for red flags even in high-scoring issues
### What About Code Snippets?
**✅ Short examples are GOOD:**
- Type definitions: `interface User { id: number, name: string }`
- API responses: `{ success: true, data: {...} }`
- Error formats: `{ error: "message", code: 400 }`
- Small syntax examples (< 10 lines)
**❌ Large implementations are BAD:**
- Complete functions (15+ lines)
- Full component implementations
- Complete SQL migration scripts
- Working solutions that agents can copy-paste
**The key distinction:** Are you showing the shape/format (good) or providing the solution (bad)?

commands/featureforge.md Normal file

@@ -0,0 +1,145 @@
---
description: Transform rough feature ideas into comprehensive implementation specifications
argument-hint: [feature idea]
---
# FeatureForge - Feature Specification Builder
Transform rough feature ideas into comprehensive, ready-to-implement specifications.
## Input
**Feature Idea:**
```
$ARGUMENTS
```
## Steps
1. **Understand the Feature**
- Identify core functionality and business value
- Determine complexity level (Simple/Medium/Complex)
- Choose appropriate detail level based on complexity
2. **Gather Context** (Ask 3-5 clarifying questions)
- What problem does this solve for users?
- Who will use this feature and when?
- What's the main user flow or interaction?
- Any technical constraints or integrations needed?
- How will you measure success?
3. **Build Specification**
- Use template below based on complexity
- Focus on clarity and actionable details
- Include acceptance criteria that are testable
4. **Output for Next Steps**
- Format as complete specification ready for `/cci` command
- Wrap in code block for easy copying
## Output Format
### For Simple Features
```markdown
# Feature: [Feature Name]
## Problem
[What user problem does this solve?]
## Solution
[How will this feature work?]
## User Story
As a [user type], I want to [action] so that [benefit].
## Acceptance Criteria
- [ ] [Testable criterion 1]
- [ ] [Testable criterion 2]
- [ ] [Testable criterion 3]
## Technical Notes
- [Key implementation detail]
- [Dependencies or constraints]
**Priority**: Medium | **Effort**: Small
```
### For Medium/Complex Features
```markdown
# Feature: [Feature Name]
## Problem Statement
[Detailed description of the problem and user impact]
## Proposed Solution
[Comprehensive description of how the feature works]
## User Stories
1. As a [user type], I want to [action] so that [benefit]
2. As a [user type], I want to [action] so that [benefit]
## Functional Requirements
- [Requirement 1 with details]
- [Requirement 2 with details]
- [Requirement 3 with details]
## Acceptance Criteria
- [ ] [Detailed testable criterion 1]
- [ ] [Detailed testable criterion 2]
- [ ] [Detailed testable criterion 3]
- [ ] [Detailed testable criterion 4]
## Technical Specifications
### Data Model
[What data needs to be stored/processed?]
### API/Integration Points
[New endpoints or external integrations needed]
### UI/UX Considerations
[Key user interface elements and flows]
## Edge Cases & Error Handling
- **[Edge case 1]**: [How to handle]
- **[Error scenario]**: [Expected behavior]
## Success Metrics
- [Metric 1]: [Target value]
- [Metric 2]: [Target value]
## Dependencies
- [External system or prerequisite feature]
**Priority**: High | **Effort**: Large | **Timeline**: 4-6 weeks
```
## Example Commands
**Simple:**
```
/featureforge Add a dark mode toggle in settings
```
**Complex:**
```
/featureforge Build a customer referral system with reward tracking, email notifications, and analytics dashboard
```
## Usage Flow
1. Run `/featureforge [your feature idea]`
2. Answer clarifying questions
3. Receive formatted specification
4. Copy the output
5. Use with `/cci` to create GitHub issue: `/cci [paste specification here]`
## Notes
- **Keep it actionable** - Focus on what to build, not how to build it
- **Be specific** - Vague requirements lead to unclear implementations
- **Think user-first** - Start with the problem, not the solution
- **Include metrics** - Define what success looks like
- For coding standards and patterns, refer to CLAUDE.md


@@ -0,0 +1,116 @@
---
description: Generate a new custom Claude Code slash command from a description
argument-hint: [command purpose]
---
# Create a Custom Claude Code Command
Create a new slash command in `.claude/commands/` for the requested task.
## Goal
#$ARGUMENTS
## Key Capabilities to Leverage
**File Operations:**
- Read, Edit, Write - modify files precisely
- Glob, Grep - search codebase
- MultiEdit - atomic multi-part changes
**Development:**
- Bash - run commands (git, tests, linters)
- Task - launch specialized agents for complex tasks
- TodoWrite - track progress with todo lists
**Web & APIs:**
- WebFetch, WebSearch - research documentation
- GitHub (gh cli) - PRs, issues, reviews
- Puppeteer - browser automation, screenshots
**Integrations:**
- AppSignal - logs and monitoring
- Context7 - framework docs
- Stripe, Todoist, Featurebase (if relevant)
## Best Practices
1. **Be specific and clear** - detailed instructions yield better results
2. **Break down complex tasks** - use step-by-step plans
3. **Use examples** - reference existing code patterns
4. **Include success criteria** - tests pass, linting clean, etc.
5. **Think first** - use "think hard" or "plan" keywords for complex problems
6. **Iterate** - guide the process step by step
## Structure Your Command
```markdown
# [Command Name]
[Brief description of what this command does]
## Steps
1. [First step with specific details]
- Include file paths, patterns, or constraints
- Reference existing code if applicable
2. [Second step]
- Use parallel tool calls when possible
- Check/verify results
3. [Final steps]
- Run tests
- Lint code
- Commit changes (if appropriate)
## Success Criteria
- [ ] Tests pass
- [ ] Code follows style guide
- [ ] Documentation updated (if needed)
```
## Tips for Effective Commands
- **Use $ARGUMENTS** placeholder for dynamic inputs
- **Reference CLAUDE.md** patterns and conventions
- **Include verification steps** - tests, linting, visual checks
- **Be explicit about constraints** - don't modify X, use pattern Y
- **Use XML tags** for structured prompts: `<task>`, `<requirements>`, `<constraints>`
## Example Pattern
```markdown
Implement #$ARGUMENTS following these steps:
1. Research existing patterns
- Search for similar code using Grep
- Read relevant files to understand approach
2. Plan the implementation
- Think through edge cases and requirements
- Consider test cases needed
3. Implement
- Follow existing code patterns (reference specific files)
- Write tests first if doing TDD
- Ensure code follows CLAUDE.md conventions
4. Verify
- Run tests:
- Rails: `bin/rails test` or `bundle exec rspec`
- TypeScript: `npm test` or `yarn test` (Jest/Vitest)
- Python: `pytest` or `python -m pytest`
- Run linter:
- Rails: `bundle exec standardrb` or `bundle exec rubocop`
- TypeScript: `npm run lint` or `eslint .`
- Python: `ruff check .` or `flake8`
- Check changes with git diff
5. Commit (optional)
- Stage changes
- Write clear commit message
```
Now create the command file at `.claude/commands/[name].md` with the structure above.

commands/ghi.md Normal file

File diff suppressed because it is too large

commands/implement-issue.md Normal file

@@ -0,0 +1,258 @@
---
description: "Complete 3-phase automation: implement → review → merge for GitHub issues"
argument-hint: [issue number]
---
# Implement GitHub Issue - Automated Workflow
You are orchestrating the complete implementation → review → merge pipeline for issue **$ARGUMENTS**.
## Your Mission
Execute this workflow **autonomously** without stopping between phases unless errors occur:
1. **Phase 1: Implementation** - Launch issue-implementer agent
2. **Phase 2: Review** - Launch review-orchestrator agent (3 parallel checks)
3. **Phase 3: Merge** - Launch issue-merger agent
Do **NOT** stop between phases to ask the user. Only stop if:
- An agent returns an error
- Review decision is "REQUEST_CHANGES" (report feedback to user)
- Merge fails due to conflicts (report to user)
---
## Execution Instructions
### Phase 1: Implementation
Launch the issue-implementer agent with issue number **$ARGUMENTS**:
```
Use the Task tool with:
- subagent_type: issue-implementer
- description: "Implement issue $ARGUMENTS"
- prompt: "Implement issue $ARGUMENTS following the 8-step workflow."
```
**If API returns 500 "Overloaded" error** (a retry sketch follows this list):
- Automatically retry up to 2 more times (3 attempts total)
- Wait 5 seconds between retries
- Log: "API overloaded, retrying in 5 seconds... (attempt X/3)"
- Only fail if all 3 attempts return 500 errors
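The retry happens at the agent-invocation level, but expressed as a shell-style sketch the policy looks roughly like this (`launch_implementer` is a hypothetical stand-in for however the issue-implementer agent is actually launched):
```bash
# Sketch of the retry policy described above (3 attempts, 5 seconds apart).
# launch_implementer is hypothetical - substitute the real agent invocation.
for attempt in 1 2 3; do
  launch_implementer "$ARGUMENTS" && break
  if [ "$attempt" -lt 3 ]; then
    echo "API overloaded, retrying in 5 seconds... (attempt $attempt/3)"
    sleep 5
  else
    echo "❌ All 3 attempts failed with 500 errors"
    exit 1
  fi
done
```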
**After agent completes, verify actual implementation**:
```bash
# Check manifest exists
if [ -f ".agent-state/issue-$ARGUMENTS-implementation.yaml" ]; then
STATUS=$(grep "status:" .agent-state/issue-$ARGUMENTS-implementation.yaml | cut -d: -f2 | tr -d ' ')
# Verify implementation artifacts
BRANCH=$(grep "branch:" .agent-state/issue-$ARGUMENTS-implementation.yaml | cut -d: -f2 | tr -d ' "')
# Check branch exists
if ! git rev-parse --verify "$BRANCH" >/dev/null 2>&1; then
echo "⚠️ Warning: Branch $BRANCH not found, but status is $STATUS"
echo "This may indicate implementation succeeded despite error messages"
fi
# For AWS operations, verify resources exist (if applicable)
# This will be handled in implementation verification section below
else
echo "❌ ERROR: Implementation manifest not found"
echo "Agent may have failed before completion"
exit 1
fi
```
**Verify Implementation Success**:
Check actual state, not just manifest:
- Branch exists: `git rev-parse --verify feature/issue-$ARGUMENTS`
- Commits pushed: `git log origin/feature/issue-$ARGUMENTS`
- Issue labeled: `gh issue view $ARGUMENTS --json labels | grep ready_for_review`
**Using the verification helper script**:
```bash
# Verify branch exists
~/.claude/scripts/verify-github-operation.sh branch-exists $ARGUMENTS feature/issue-$ARGUMENTS
# Verify label updated
~/.claude/scripts/verify-github-operation.sh label-updated $ARGUMENTS ready_for_review
```
The verification script provides color-coded output and clear success/failure indicators.
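The helper script ships with the setup; as a rough illustration of what its checks amount to (a sketch of assumed internals, not the actual script), the operations map onto plain `gh`/`git` calls:
```bash
# Illustrative sketch only - the real verify-github-operation.sh may differ.
# Usage: verify-github-operation.sh <check> <issue-number> [extra]
check="$1"; issue="$2"; extra="$3"
case "$check" in
  branch-exists)
    git ls-remote --exit-code --heads origin "$extra" >/dev/null \
      && echo "✅ branch $extra exists" || { echo "❌ branch $extra missing"; exit 1; } ;;
  label-updated)
    gh issue view "$issue" --json labels --jq '.labels[].name' | grep -qx "$extra" \
      && echo "✅ label $extra present" || { echo "❌ label $extra missing"; exit 1; } ;;
  pr-created)
    [ "$(gh pr list --search "$issue in:title" --json number --jq 'length')" -gt 0 ] \
      && echo "✅ PR found for issue #$issue" || { echo "❌ no PR found"; exit 1; } ;;
esac
```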
If status is "ready_for_review" and verification passes, print:
```
✅ Phase 1 Complete: Implementation
📋 Files changed: [list from manifest]
📊 Status: ready_for_review
⏭️ Continuing to review phase...
```
Then proceed to Phase 2 immediately.
---
### Phase 2: Review
Launch the review-orchestrator agent:
```
Use the Task tool with:
- subagent_type: review-orchestrator
- description: "Review issue $ARGUMENTS"
- prompt: "Orchestrate code review for issue $ARGUMENTS. Coordinate security, quality, and documentation checks in parallel. Create PR if approved."
```
**Wait for completion**, then check:
```bash
cat .agent-state/issue-$ARGUMENTS-review.yaml | grep "decision:"
```
**If decision is "APPROVE"**, verify and report:
**Verify Review Completion**:
```bash
# Verify PR was created
~/.claude/scripts/verify-github-operation.sh pr-created $ARGUMENTS
# Verify label updated to ready_for_merge
~/.claude/scripts/verify-github-operation.sh label-updated $ARGUMENTS ready_for_merge
```
If verification passes, print:
```
✅ Phase 2 Complete: Review
🔒 Security: PASS (0 issues)
⚡ Quality: PASS
📚 Documentation: PASS
🎯 Decision: APPROVE
📝 PR created and label updated
⏭️ Continuing to merge phase...
```
Then proceed to Phase 3 immediately.
**If decision is "REQUEST_CHANGES"**, print:
```
❌ Phase 2: Changes Requested
📝 Feedback: [show blocking issues from manifest]
🔄 Issue returned to "To Do" status
Workflow stopped. Address the feedback and re-run /implement-issue $ARGUMENTS
```
Stop here and report to user.
---
### Phase 3: Merge
Launch the issue-merger agent:
```
Use the Task tool with:
- subagent_type: issue-merger
- description: "Merge PR for issue $ARGUMENTS"
- prompt: "Merge the approved pull request for issue $ARGUMENTS. Handle conflict resolution if needed, update all tracking systems, and cleanup worktrees."
```
**Wait for completion**, then verify and report:
**Verify Merge Completion**:
```bash
# Verify PR was merged
~/.claude/scripts/verify-github-operation.sh pr-merged $ARGUMENTS
# Verify issue was closed
~/.claude/scripts/verify-github-operation.sh issue-closed $ARGUMENTS
```
If verification passes, print:
```
✅ Phase 3 Complete: Merge
🔗 PR merged and verified
📦 Commit: [SHA from merger output]
🏁 Issue #$ARGUMENTS closed and verified
✨ Workflow complete
```
---
## Error Handling
**Error Recovery Protocol**:
1. **Check actual state before failing**:
- Implementation phase: Verify branch exists, files changed, commits pushed
- Review phase: Verify PR created (if approved), issue labeled correctly
- Merge phase: Verify PR merged, issue closed, branch deleted
2. **For API 500 "Overloaded" errors specifically**:
- These are temporary overload, not failures
- Automatically retry up to 2 more times (3 attempts total)
- Wait 5 seconds between retries
- Only fail if all 3 attempts return errors
3. **For "operation failed" but state shows success**:
- Log warning but continue workflow
- Example: "Connection closed" error but AWS resource actually exists
- Report to user: "Operation completed successfully despite error message"
4. **For actual failures**:
- Stop workflow immediately
- Show clear error with phase name and details
- Provide specific recovery steps
- Preserve work (don't delete worktree/branch)
**Common errors:**
- "Issue not found" → Check issue number format (should be just number, not URL)
- "No implementation manifest" → Run implementer first
- "Review not approved" → Address review feedback and re-run workflow
- "Merge conflicts" → Resolve manually, then re-run merge phase
- "API Overloaded" → Automatic retry (up to 3 attempts)
- "Label not updated" → Manually update via: `gh issue edit NUMBER --add-label LABEL`
---
## Status Reporting
After each phase completion, use this format:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase [N]: [Phase Name]
✅ Status: [Complete/Failed]
📊 Details: [Key information]
⏭️ Next: [What happens next]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
---
## Final Report
After Phase 3 completes successfully, provide:
```
🎉 Issue #$ARGUMENTS Implementation Complete
📊 Summary:
• Implementation: [files changed]
• Review: [decision + check results]
• Merge: [PR number + commit SHA]
🔗 Links:
• Issue: https://github.com/Mygentic-AI/mygentic-personal-assistant/issues/$ARGUMENTS
• PR: [PR URL from merger output]
• Commit: [commit URL]
✅ All tracking systems updated
✅ Worktree cleaned up
✅ Branches deleted
```
---
**Start now with issue $ARGUMENTS. Do not ask for confirmation between phases.**


@@ -0,0 +1,202 @@
---
description: Implement multiple GitHub issues in parallel with dependency analysis
argument-hint: [space-separated issue numbers]
---
# Implement Multiple GitHub Issues in Parallel
Execute multiple GitHub issues concurrently, respecting dependencies between them.
## Usage
```
/implement-issues-parallel 5 12 18 23
```
## Workflow
### Phase 1: Analysis & Dependency Detection
1. **Fetch All Issues**
```bash
for issue in $ARGUMENTS; do
gh issue view $issue --json title,body,labels
done
```
2. **Analyze Dependencies**
For each issue, check:
- Does it reference other issues in the batch? (e.g., "Depends on #5", "Blocks #12")
- Does it modify files that other issues also modify?
- Does it require features from other issues?
3. **Build Dependency Graph (ASCII)**
```
┌─────────────────────────────────────────────────────────────┐
│ EXECUTION PLAN FOR ISSUES: 5, 12, 18, 23 │
└─────────────────────────────────────────────────────────────┘
┌──────────┐
│ Issue #5 │ (Database migration)
└────┬─────┘
├────────┐
│ │
▼ ▼
┌──────────┐ ┌──────────┐
│Issue #12 │ │Issue #18 │ (Both need DB changes from #5)
└────┬─────┘ └────┬─────┘
│ │
└────────┬───┘
┌──────────┐
│Issue #23 │ (Needs both #12 and #18)
└──────────┘
WAVE 1 (Parallel): #5
WAVE 2 (Parallel): #12, #18
WAVE 3 (Parallel): #23
Total Estimated Time: ~45-60 minutes
```
4. **User Confirmation**
Show the execution plan and ask:
```
Execute this plan? (yes/modify/cancel)
- yes: Start execution
- modify: Adjust dependencies or order
- cancel: Abort
```
### Phase 2: Parallel Execution
1. **Execute by Waves**
For each wave:
- Launch issue-implementer agents in parallel
- Wait for all agents in wave to complete
- Verify no conflicts before proceeding to next wave
Example for Wave 2:
```
# Launch in parallel
Task issue-implementer(12) &
Task issue-implementer(18) &
# Wait for both to complete
wait
```
2. **Conflict Detection Between Waves**
After each wave completes:
- Check for file conflicts between branches
- Verify no overlapping changes
- Report any conflicts to user before continuing
3. **Progress Tracking**
```
┌──────────────────────────────────────────────────────┐
│ PROGRESS: Wave 2/3 │
├──────────────────────────────────────────────────────┤
│ ✅ Wave 1: #5 (Complete) │
│ ⏳ Wave 2: #12 (in_progress) | #18 (in_progress) │
│ ⏸ Wave 3: #23 (pending) │
└──────────────────────────────────────────────────────┘
```
### Phase 3: Review & Merge Coordination
1. **Sequential Review**
Even though implementation was parallel, reviews happen sequentially to avoid reviewer confusion:
```
For each issue in dependency order (5, 12, 18, 23):
- Launch review-orchestrator(issue)
- Wait for approval
- If approved, continue to next
- If changes requested, pause pipeline
```
2. **Coordinated Merging**
After all reviews pass:
```
For each issue in dependency order:
- Launch issue-merger(issue)
- Verify merge successful
- Update remaining branches if needed
```
### Phase 4: Completion Report
```
┌────────────────────────────────────────────────────────────┐
│ PARALLEL EXECUTION COMPLETE │
├────────────────────────────────────────────────────────────┤
│ Issues Implemented: 4 │
│ Total Time: 52 minutes │
│ Time Saved vs Sequential: ~78 minutes (60% faster) │
│ │
│ Results: │
│ ✅ #5 - Merged (commit: abc123) │
│ ✅ #12 - Merged (commit: def456) │
│ ✅ #18 - Merged (commit: ghi789) │
│ ✅ #23 - Merged (commit: jkl012) │
└────────────────────────────────────────────────────────────┘
```
## Dependency Detection Rules
**Explicit Dependencies** (highest priority):
- Issue body contains: "Depends on #X", "Requires #X", "Blocked by #X"
- Issue references another issue with keywords: "After #X", "Needs #X"
**Implicit Dependencies** (inferred):
- Both modify the same database table (check for migration files)
- One adds a function, another uses it (check for new exports/imports)
- One modifies an API endpoint, another calls it
**File Conflict Analysis** (a sketch follows this list):
- Compare `files_changed` from each issue's implementation manifest
- If overlap > 30%, consider creating dependency
- Ask user to confirm inferred dependencies
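A minimal sketch of the overlap check above, assuming each manifest lives at `.agent-state/issue-<N>-implementation.yaml` with a `files_changed:` list (the path follows the convention used by the implement-issue workflow; adjust if your manifests differ):
```bash
# Sketch: estimate file overlap between two issues' implementation manifests
files_for() {
  awk '/^files_changed:/{f=1;next}
       f && /^[[:space:]]*-/{sub(/^[[:space:]]*-[[:space:]]*/,""); print; next}
       f && /^[^[:space:]]/{f=0}' \
    ".agent-state/issue-$1-implementation.yaml" | sort -u
}
A=$(files_for 12); B=$(files_for 18)
COMMON=$(comm -12 <(echo "$A") <(echo "$B") | wc -l)
TOTAL=$(echo "$A" | wc -l)
if [ "$TOTAL" -gt 0 ] && [ $((COMMON * 100 / TOTAL)) -gt 30 ]; then
  echo "⚠️ Issues #12 and #18 overlap on $COMMON of $TOTAL files - consider adding a dependency"
fi
```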
## Error Handling
**If an Issue Fails in a Wave:**
1. Continue other issues in current wave
2. Mark dependent issues as "blocked"
3. Report failure to user
4. Ask: retry failed issue / skip dependent issues / abort all
**If Review Requests Changes:**
1. Pause pipeline
2. Show which issues are affected
3. Ask: address feedback now / continue with others / abort
**If Merge Conflict:**
1. Attempt automatic resolution if simple
2. If complex, pause and show conflict
3. Ask user to resolve manually or abort
## Limitations
- Maximum 10 issues per batch (to avoid complexity)
- All issues must be from the same repository
- Cannot mix issues requiring different environments
- Review coordination may still take time despite parallel implementation
## Success Criteria
- All issues successfully merged OR clear failure report
- No merge conflicts left unresolved
- All tracking systems updated correctly
- Worktrees cleaned up properly

commands/plan.md Normal file

@@ -0,0 +1,374 @@
---
description: Research and plan comprehensive GitHub issues with parallel agent analysis
argument-hint: [feature/bug description]
---
# Create GitHub Issue (Plan)
## Introduction
Transform feature descriptions, bug reports, or improvement ideas into well-structured markdown issue files that follow project conventions and best practices. This command provides flexible detail levels to match your needs.
## Feature Description
<feature_description> #$ARGUMENTS </feature_description>
## Main Tasks
### 1. Repository Research & Context Gathering
<thinking>
First, I need to understand the project's conventions and existing patterns, leveraging all available resources and using parallel subagents to do this.
</thinking>
Run these three agents in parallel:
- Task repo-research-analyst(feature_description)
- Task best-practices-researcher(feature_description)
- Task framework-docs-researcher(feature_description)
**Reference Collection:**
- [ ] Document all research findings with specific file paths (e.g., `src/services/user_service.py:42` or `src/components/UserProfile.tsx:15`)
- [ ] Include URLs to external documentation and best practices guides
- [ ] Create a reference list of similar issues or PRs (e.g., `#123`, `#456`)
- [ ] Note any team conventions discovered in `CLAUDE.md` or team documentation
### 2. Issue Planning & Structure
<thinking>
Think like a product manager - what would make this issue clear and actionable? Consider multiple perspectives
</thinking>
**Title & Categorization:**
- [ ] Draft clear, searchable issue title using conventional format (e.g., `feat:`, `fix:`, `docs:`)
- [ ] Identify appropriate labels from repository's label set (`gh label list`)
- [ ] Determine issue type: enhancement, bug, refactor
**Stakeholder Analysis:**
- [ ] Identify who will be affected by this issue (end users, developers, operations)
- [ ] Consider implementation complexity and required expertise
**Content Planning:**
- [ ] Choose appropriate detail level based on issue complexity and audience
- [ ] List all necessary sections for the chosen template
- [ ] Gather supporting materials (error logs, screenshots, design mockups)
- [ ] Prepare code examples or reproduction steps if applicable, naming the mock filenames in the lists
### 3. Choose Implementation Detail Level
Select how comprehensive you want the issue to be:
#### 📄 MINIMAL (Quick Issue)
**Best for:** Simple bugs, small improvements, clear features
**Includes:**
- Problem statement or feature description
- Basic acceptance criteria
- Essential context only
**Structure:**
````markdown
[Brief problem/feature description]
## Acceptance Criteria
- [ ] Core requirement 1
- [ ] Core requirement 2
## Context
[Any critical information]
## MVP
### test.py
```python
class Test:
def __init__(self):
self.name = "test"
```
### test.ts
```typescript
class Test {
private name: string;
constructor() {
this.name = "test";
}
}
```
## References
- Related issue: #[issue_number]
- Documentation: [relevant_docs_url]
````
#### 📋 MORE (Standard Issue)
**Best for:** Most features, complex bugs, team collaboration
**Includes everything from MINIMAL plus:**
- Detailed background and motivation
- Technical considerations
- Success metrics
- Dependencies and risks
- Basic implementation suggestions
**Structure:**
```markdown
## Overview
[Comprehensive description]
## Problem Statement / Motivation
[Why this matters]
## Proposed Solution
[High-level approach]
## Technical Considerations
- Architecture impacts
- Performance implications
- Security considerations
## Acceptance Criteria
- [ ] Detailed requirement 1
- [ ] Detailed requirement 2
- [ ] Testing requirements
## Success Metrics
[How we measure success]
## Dependencies & Risks
[What could block or complicate this]
## References & Research
- Similar implementations: [file_path:line_number]
- Best practices: [documentation_url]
- Related PRs: #[pr_number]
```
#### 📚 A LOT (Comprehensive Issue)
**Best for:** Major features, architectural changes, complex integrations
**Includes everything from MORE plus:**
- Detailed implementation plan with phases
- Alternative approaches considered
- Extensive technical specifications
- Resource requirements and timeline
- Future considerations and extensibility
- Risk mitigation strategies
- Documentation requirements
**Structure:**
```markdown
## Overview
[Executive summary]
## Problem Statement
[Detailed problem analysis]
## Proposed Solution
[Comprehensive solution design]
## Technical Approach
### Architecture
[Detailed technical design]
### Implementation Phases
#### Phase 1: [Foundation]
- Tasks and deliverables
- Success criteria
- Estimated effort
#### Phase 2: [Core Implementation]
- Tasks and deliverables
- Success criteria
- Estimated effort
#### Phase 3: [Polish & Optimization]
- Tasks and deliverables
- Success criteria
- Estimated effort
## Alternative Approaches Considered
[Other solutions evaluated and why rejected]
## Acceptance Criteria
### Functional Requirements
- [ ] Detailed functional criteria
### Non-Functional Requirements
- [ ] Performance targets
- [ ] Security requirements
- [ ] Accessibility standards
### Quality Gates
- [ ] Test coverage requirements
- [ ] Documentation completeness
- [ ] Code review approval
## Success Metrics
[Detailed KPIs and measurement methods]
## Dependencies & Prerequisites
[Detailed dependency analysis]
## Risk Analysis & Mitigation
[Comprehensive risk assessment]
## Resource Requirements
[Team, time, infrastructure needs]
## Future Considerations
[Extensibility and long-term vision]
## Documentation Plan
[What docs need updating]
## References & Research
### Internal References
- Architecture decisions: [file_path:line_number]
- Similar features: [file_path:line_number]
- Configuration: [file_path:line_number]
### External References
- Framework documentation: [url]
- Best practices guide: [url]
- Industry standards: [url]
### Related Work
- Previous PRs: #[pr_numbers]
- Related issues: #[issue_numbers]
- Design documents: [links]
```
### 4. Issue Creation & Formatting
<thinking>
Apply best practices for clarity and actionability, making the issue easy to scan and understand
</thinking>
**Content Formatting:**
- [ ] Use clear, descriptive headings with proper hierarchy (##, ###)
- [ ] Include code examples in triple backticks with language syntax highlighting
- [ ] Add screenshots/mockups if UI-related (drag & drop or use image hosting)
- [ ] Use task lists (- [ ]) for trackable items that can be checked off
- [ ] Add collapsible sections for lengthy logs or optional details using `<details>` tags
- [ ] Apply appropriate emoji for visual scanning (🐛 bug, ✨ feature, 📚 docs, ♻️ refactor)
**Cross-Referencing:**
- [ ] Link to related issues/PRs using #number format
- [ ] Reference specific commits with SHA hashes when relevant
- [ ] Link to code using GitHub's permalink feature (press 'y' for permanent link)
- [ ] Mention relevant team members with @username if needed
- [ ] Add links to external resources with descriptive text
**Code & Examples:**
```markdown
# Good example with syntax highlighting and line references
\`\`\`ruby
# app/services/user_service.rb:42
def process_user(user)
  # Implementation here
end
\`\`\`

# Collapsible error logs
<details>
<summary>Full error stacktrace</summary>

\`\`\`
Error details here...
\`\`\`
</details>
```
**AI-Era Considerations:**
- [ ] Account for accelerated development with AI pair programming
- [ ] Include prompts or instructions that worked well during research
- [ ] Note which AI tools were used for initial exploration (Claude, Copilot, etc.)
- [ ] Emphasize comprehensive testing given rapid implementation
- [ ] Document any AI-generated code that needs human review
### 5. Final Review & Submission
**Pre-submission Checklist:**
- [ ] Title is searchable and descriptive
- [ ] Labels accurately categorize the issue
- [ ] All template sections are complete
- [ ] Links and references are working
- [ ] Acceptance criteria are measurable
- [ ] Add names of files in pseudo code examples and todo lists
- [ ] Add an ERD mermaid diagram if applicable for new model changes
## Output Format
Present the complete issue content within `<github_issue>` tags, ready for GitHub CLI:
```bash
gh issue create --title "[TITLE]" --body "[CONTENT]" --label "[LABELS]"
```
## Thinking Approaches
- **Analytical:** Break down complex features into manageable components
- **User-Centric:** Consider end-user impact and experience
- **Technical:** Evaluate implementation complexity and architecture fit
- **Strategic:** Align with project goals and roadmap

commands/tar.md Normal file

@@ -0,0 +1,387 @@
---
description: Research and recommend technical approaches for implementation challenges
argument-hint: [problem description or feature]
---
# TAR - Technical Approach Research
Research and recommend technical approaches for implementation challenges or new features.
## Purpose
TAR handles **two types of research**:
1. **Problem Diagnosis**: Research why something is broken/slow/buggy and propose fixes
- Example: "Our Lambda functions are timing out under load"
- Example: "OAuth token refresh is failing intermittently"
2. **Feature Architecture**: Research the best way to architect a new feature
- Example: "Build real-time voice chat for web and mobile"
- Example: "Add multi-account support to our authentication system"
**Primary Goal**: Thorough research and diagnosis. Understanding the problem deeply and recommending the right solution.
**Secondary Benefit**: Research output naturally contains information useful for creating GitHub issues later (if desired).
## Input
**Project/Feature Description:**
```
$ARGUMENTS
```
## Perplexity AI: Your Research Partner
**USE PERPLEXITY ITERATIVELY - Don't Stop at the First Answer**
Perplexity is your research partner. Have a conversation with it, dig deeper, ask follow-ups. Don't treat it like a search engine where you get one answer and move on.
### Research Pattern
**1. Start with `perplexity_ask` (5-20 sec)**
- Get initial understanding of the problem/approach
- Identify key concepts and technologies
**2. Dig Deeper with Follow-ups**
- Ask about edge cases: "What are common pitfalls with [approach]?"
- Ask about gotchas: "What breaks when [condition]?"
- Ask about real examples: "Who uses [approach] in production?"
- Ask about alternatives: "What about [alternative approach]?"
**3. Use `perplexity_reason` (30 sec - 2 min) for Implementation Details**
- "How do I handle [specific technical challenge]?"
- "What's the debugging strategy when [error occurs]?"
- "Why does [approach] fail under [condition]?"
**4. Use `perplexity_research` (3-10 min) ONLY for Deep Dives**
- Comprehensive architecture analysis
- Multiple competing approaches need full comparison
- Novel/cutting-edge technology with limited documentation
### Provide Comprehensive Context
**Every Perplexity query should include:**
- **Environment**: OS, versions, cloud provider, region
- **Current State**: What exists today, what's broken, error messages
- **Constraints**: Performance requirements, scale, budget, team skills
- **Architecture**: Related systems, dependencies, data flow
- **Goal**: What success looks like
**Bad Perplexity Query**:
```
"How do I fix Lambda timeouts?"
```
**Good Perplexity Query**:
```
"I have AWS Lambda functions in me-central-1 running Python 3.11 that call Aurora PostgreSQL via Data API. Under load (100+ concurrent requests), functions timeout after 30 seconds. Current config: 1024MB memory, no VPC, cold start ~2s, warm ~500ms. Database queries are simple SELECTs taking <50ms. What are the most likely causes and solutions? Looking for production-proven approaches."
```
### Iterative Research Example
**Round 1**: "What causes Lambda timeouts with Aurora Data API?"
→ Response mentions connection pooling, query optimization, memory allocation
**Round 2**: "Does Aurora Data API support connection pooling? What are the gotchas?"
→ Response explains Data API is HTTP-based, no traditional pooling needed
**Round 3**: "What are the common Aurora Data API errors under high load and how to handle them?"
→ Response lists throttling, transaction timeouts, network issues
**Round 4**: "How do production systems handle Aurora Data API throttling? Show real examples."
→ Response provides retry strategies, backoff algorithms, monitoring approaches
**Result**: Deep understanding of the problem and battle-tested solutions.
## Steps
### 1. Understand the Problem Space
**For Problem Diagnosis:**
- What's the symptom? (error messages, slow performance, unexpected behavior)
- When does it occur? (always, under load, specific conditions)
- What changed recently? (code, config, traffic, infrastructure)
- What's already been tried? (debugging steps, failed solutions)
**For Feature Architecture:**
- What's the user need? (what problem are we solving?)
- What are the constraints? (performance, scale, budget, timeline)
- What's the environment? (platforms, existing systems, team skills)
- What's the scope? (MVP vs full feature, what's out of scope)
### 2. Research with Perplexity (Iteratively!)
**Initial Research** (perplexity_ask):
- Problem diagnosis: "What causes [symptom] in [technology]?"
- Feature architecture: "How do companies implement [feature] for [scale/platform]?"
**Follow-up Questions** (keep digging):
- "What are the gotchas with [approach]?"
- "How does [company] handle [specific challenge]?"
- "What breaks when [condition occurs]?"
- "What's the debugging strategy for [error]?"
- "What are alternatives to [initial approach]?"
**Deep Dive** (perplexity_reason or perplexity_research if needed):
- Complex architectural decisions
- Multiple approaches need full comparison
- Novel technology with limited documentation
### 3. Examine Codebase (If Applicable)
If diagnosing a problem in existing code:
- Read relevant files to understand current implementation
- Check recent changes: `git log --oneline -20 path/to/file`
- Look for related issues: `gh issue list --search "keyword"`
- Check error logs, metrics, monitoring
### 4. Research Multiple Approaches
Find **3-4 viable approaches**:
- Look for production case studies (last 2 years preferred)
- Focus on real-world implementations from companies at scale
- Include performance data, not just descriptions
- Note which approach is most common (battle-tested)
### 5. Evaluate Trade-offs
For each approach, analyze:
- **Performance**: Latency, throughput, resource usage
- **Complexity**: Dev time, code complexity, operational overhead
- **Cost**: Infrastructure, licensing, ongoing maintenance
- **Scalability**: How it handles growth
- **Team Fit**: Required skills, learning curve
- **Risk**: Common pitfalls, anti-patterns, failure modes
### 6. Make Recommendation
- Pick the approach that best fits the constraints
- Explain WHY this is recommended (not just what)
- Be honest about trade-offs and limitations
- Provide real examples of successful implementations
- Include actionable next steps
### 7. Deliver Research Report
Use the output format below. Keep focus on **research quality**, not on formatting for GitHub issues.
## Output Format
```markdown
# Technical Research: [Problem/Feature Name]
## Research Type
[Problem Diagnosis | Feature Architecture]
## Executive Summary
[2-3 paragraphs covering:
- What's the problem or what are we building?
- What did the research reveal?
- What's recommended and why?
- Key trade-offs to be aware of]
## Problem Analysis (For Diagnosis) OR Requirements (For Architecture)
**For Problem Diagnosis:**
- **Symptom**: [Exact error, behavior, or performance issue]
- **When It Occurs**: [Conditions, frequency, patterns]
- **Current State**: [What's already in place, recent changes]
- **Root Cause**: [What the research revealed]
- **Impact**: [Who/what is affected]
**For Feature Architecture:**
- **Core Need**: [What problem we're solving for users]
- **Constraints**: [Performance, scale, budget, timeline, technical]
- **Platforms**: [Web, mobile, APIs, infrastructure]
- **Scope**: [What's included, what's explicitly out of scope]
- **Success Criteria**: [What "done" looks like]
## Research Findings
### Key Insights from Research
1. [Major insight from Perplexity research]
2. [Important gotcha or edge case discovered]
3. [Production pattern or anti-pattern found]
### Technologies/Patterns Investigated
- [Technology/Pattern 1]: [Brief description]
- [Technology/Pattern 2]: [Brief description]
- [Technology/Pattern 3]: [Brief description]
---
## Approach 1: [Name] ⭐ RECOMMENDED
**Overview**: [One paragraph: what it is, how it works, why it exists]
**Why This Approach**:
- [Key advantage 1 - specific, measurable]
- [Key advantage 2 - real-world proof]
- [Key advantage 3 - fits constraints]
**Trade-offs**:
- ⚠️ [Limitation or consideration 1]
- ⚠️ [Limitation or consideration 2]
**Real-World Examples**:
- **[Company/Project]**: [How they use it, scale, results]
- **[Company/Project]**: [How they use it, scale, results]
**Common Pitfalls**:
- [Pitfall 1 and how to avoid it]
- [Pitfall 2 and how to avoid it]
**Complexity**: Low/Medium/High
**Development Time**: [Estimate based on research]
**Operational Overhead**: Low/Medium/High
---
## Approach 2: [Alternative Name]
**Overview**: [One paragraph description]
**Pros**:
- [Advantage 1]
- [Advantage 2]
**Cons**:
- [Disadvantage 1]
- [Disadvantage 2]
**When to Use**: [Specific scenarios where this is better than recommended approach]
**Real-World Examples**:
- [Company/Project using this approach]
---
## Approach 3: [Alternative Name]
**Overview**: [One paragraph description]
**Pros**:
- [Advantage 1]
- [Advantage 2]
**Cons**:
- [Disadvantage 1]
- [Disadvantage 2]
**When to Use**: [Specific scenarios where this is better than recommended approach]
**Real-World Examples**:
- [Company/Project using this approach]
---
## Comparison Matrix
| Aspect | Approach 1 ⭐ | Approach 2 | Approach 3 |
|--------|-------------|-----------|-----------|
| Dev Speed | Fast/Medium/Slow | ... | ... |
| Performance | Good/Excellent | ... | ... |
| Complexity | Low/Medium/High | ... | ... |
| Scalability | Good/Excellent | ... | ... |
| Cost | Low/Medium/High | ... | ... |
| Team Fit | Good/Poor | ... | ... |
| Battle-Tested | Very/Somewhat/New | ... | ... |
## Implementation Roadmap
**For Recommended Approach:**
### Phase 1: Foundation (Week 1-2)
1. [Specific technical task]
2. [Specific technical task]
3. [Validation checkpoint]
### Phase 2: Core Development (Week 3-4)
1. [Specific technical task]
2. [Specific technical task]
3. [Testing/validation]
### Phase 3: Polish & Deploy (Week 5+)
1. [Specific technical task]
2. [Performance testing/optimization]
3. [Production rollout]
## Critical Risks & Mitigation
| Risk | Impact | Likelihood | Mitigation Strategy |
|------|--------|-----------|---------------------|
| [Technical risk 1] | High/Medium/Low | High/Medium/Low | [Specific strategy] |
| [Technical risk 2] | High/Medium/Low | High/Medium/Low | [Specific strategy] |
## Testing Strategy
**What to Test**:
- [Specific test type 1 and why]
- [Specific test type 2 and why]
**Performance Benchmarks**:
- [Metric 1]: Target [value], measured by [method]
- [Metric 2]: Target [value], measured by [method]
## Key Resources
**Documentation**:
- [Official docs with URL and specific sections to read]
**Real Examples**:
- [GitHub repo / blog post with URL and what's relevant]
- [Production case study with URL and key takeaways]
**Tools/Libraries**:
- [Tool name with URL and what it does]
---
## Optional: Issue Creation Summary
**If creating a GitHub issue from this research**, the following information would be relevant:
**Summary**: [2-3 sentence summary for issue title/description]
**Key Requirements**:
- [Specific requirement 1]
- [Specific requirement 2]
**Recommended Approach**: [One paragraph on what to implement]
**Implementation Guidance**:
- [Problem to solve 1]
- [Problem to solve 2]
**Testing Needs**:
- [Test requirement 1]
- [Test requirement 2]
**Documentation Needs**:
- [Doc requirement 1]
```
## Example Commands
**Problem Diagnosis**:
```
/tar Our Lambda functions are timing out when calling Aurora PostgreSQL via Data API under load. Functions are Python 3.11, 1024MB memory, no VPC, in me-central-1. Timeouts happen at 100+ concurrent requests.
```
**Feature Architecture**:
```
/tar Build a real-time voice chat feature for web and mobile apps with low latency (<200ms) and support for 10+ concurrent users per room. Budget conscious, team familiar with JavaScript/Python.
```
## Notes
- **Research is the goal** - Deep understanding matters more than perfect formatting
- **Use Perplexity iteratively** - Ask follow-ups, dig into gotchas, find real examples
- **Provide context** - Every Perplexity query should include environment, constraints, goals
- **Focus on practicality** - Real implementations beat theoretical perfection
- **Recency matters** - Prioritize solutions from last 2 years (technology evolves)
- **Show your work** - Include Perplexity queries, sources, reasoning
- **Be honest about trade-offs** - No approach is perfect
- **Real examples prove it works** - "Company X uses this at Y scale" is gold
- **Issue creation is optional** - Sometimes research is just research

View File

@@ -0,0 +1,190 @@
---
description: Generate comprehensive validation command for backend or frontend
---
# Generate Ultimate Validation Command
Analyze a codebase deeply and create `.claude/commands/validate.md` that comprehensively validates everything.
**Usage:**
```
/ultimate_validate_command [backend|frontend]
```
## Step 0: Discover Real User Workflows
**Before analyzing tooling, understand what users ACTUALLY do (a discovery sketch follows this list):**
1. Read workflow documentation:
- README.md - Look for "Usage", "Quickstart", "Examples" sections
- CLAUDE.md - Look for workflow patterns
- docs/ folder - User guides, tutorials, feature documentation
2. Identify external integrations:
- What CLIs does the app use? (Check Dockerfile, requirements.txt, package.json)
- What external APIs does it call? (Google Calendar, Gmail, Telegram, LiveKit, etc.)
- What services does it interact with? (AWS, Supabase, databases, etc.)
3. Extract complete user journeys from docs:
- Find examples like "User asks voice agent → processes calendar → sends email"
- Each workflow becomes an E2E test scenario
**Critical: Your E2E tests should mirror actual workflows from docs, not just test internal APIs.**
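A quick way to surface these workflows and integrations before reading anything in depth (a minimal sketch; the file names below are common conventions and may differ in this repo):
```bash
# Sketch: surface candidate user workflows and external integrations.
# File names are conventions, not guarantees; adjust to what actually exists.

# 1. Workflow documentation: usage / quickstart / example sections
grep -rniE 'usage|quickstart|example|workflow' README.md CLAUDE.md docs/ 2>/dev/null | head -50

# 2. External CLIs and services referenced by build and runtime setup
grep -hiE 'aws|gcloud|gh |livekit|telegram|google' Dockerfile docker-compose*.yml 2>/dev/null
grep -iE 'livekit|google|telegram|boto3|supabase' requirements.txt pyproject.toml package.json 2>/dev/null

# 3. Environment variables hint at integrations that will need test credentials
grep -oE '[A-Z][A-Z0-9_]{2,}' .env.example 2>/dev/null | sort -u
```
Each workflow or integration found here becomes a candidate E2E scenario in Phase 5.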
## Step 1: Deep Codebase Analysis
Navigate to the appropriate repo:
```bash
cd backend/ # or frontend/
```
Explore the codebase to understand (a tooling-inventory sketch follows these checklists):
**What validation tools already exist:**
- **Linting config:** `.eslintrc*`, `.pylintrc`, `ruff.toml`, `.flake8`, etc.
- **Type checking:** `tsconfig.json`, `mypy.ini`, etc.
- **Style/formatting:** `.prettierrc*`, `black`, `.editorconfig`
- **Unit tests:** `jest.config.*`, `pytest.ini`, test directories
- **Package manager scripts:** `package.json` scripts, `Makefile`, `pyproject.toml` tools
**What the application does:**
- **Frontend:** Routes, pages, components, user flows (Next.js app)
- **Backend:** API endpoints, voice agent, Google services integration, authentication
- **Database:** Schema, migrations, models
- **Infrastructure:** Docker services, dependencies, AWS resources
**How things are currently tested:**
- Existing test files and patterns
- CI/CD workflows (`.github/workflows/`, etc.)
- Test commands in package.json or pyproject.toml
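One way to take this inventory quickly (a rough sketch; it only locates config files and scripts, and the real analysis still means reading each hit):
```bash
# Sketch: locate existing validation tooling, then read each file found.
ls -a | grep -E '^\.(eslintrc|pylintrc|flake8|editorconfig|prettierrc)' 2>/dev/null
ls ruff.toml tsconfig.json mypy.ini jest.config.* pytest.ini 2>/dev/null

# Package scripts and [tool.*] sections usually encode the canonical commands
grep -A20 '"scripts"' package.json 2>/dev/null
grep -A5 -E '^\[tool\.(ruff|mypy|pytest|black)' pyproject.toml 2>/dev/null

# Existing tests and CI workflows show how validation is run today
find . -path ./node_modules -prune -o -name 'test_*.py' -print -o -name '*.test.ts' -print 2>/dev/null | head -20
ls .github/workflows/ 2>/dev/null
```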
## Step 2: Generate validate.md
Create `.claude/commands/validate.md` in the **appropriate repo** (backend/ or frontend/) with these phases:
**ONLY include phases that exist in the codebase.**
### Phase 1: Linting
Run the actual linter commands found in the project:
- Python: `ruff check .`, `flake8`, `pylint`
- JavaScript/TypeScript: `npm run lint`, `eslint`
### Phase 2: Type Checking
Run the actual type checker commands found:
- TypeScript: `npx tsc --noEmit`
- Python: `mypy src/`, `mypy .`
### Phase 3: Style Checking
Run the actual formatter check commands found:
- JavaScript/TypeScript: `npm run format:check`, `prettier --check`
- Python: `black --check .`
### Phase 4: Unit Testing
Run the actual test commands found:
- Python: `pytest`, `pytest tests/unit`
- JavaScript/TypeScript: `npm test`, `jest`
### Phase 5: End-to-End Testing (BE CREATIVE AND COMPREHENSIVE)
Test COMPLETE user workflows from documentation, not just internal APIs.
**The Three Levels of E2E Testing:**
1. **Internal APIs** (what you might naturally test):
- Test adapter endpoints work
- Database queries succeed
- Commands execute
2. **External Integrations** (what you MUST test):
- CLI operations (AWS CLI, gh, gcloud, etc.)
- Platform APIs (LiveKit, Google Calendar, Gmail, Telegram)
- Any external services the app depends on
3. **Complete User Journeys** (what gives 100% confidence):
- Follow workflows from docs start-to-finish
- **Backend Example:** "Voice agent receives request → processes calendar event → sends email → confirms to user"
- **Frontend Example:** "User logs in → views dashboard → schedules event → receives confirmation"
- Test like a user would actually use the application in production
**Examples of good vs. bad E2E tests:**
**Backend:**
- ❌ Bad: Tests that Google Calendar adapter returns data
- ✅ Good: Voice agent receives calendar request → fetches events → formats response → returns to user
- ✅ Great: Voice interaction → Calendar query → Gmail notification → LiveKit response confirmation
**Frontend:**
- ❌ Bad: Tests that login form submits data
- ✅ Good: User submits login → Gets redirected → Dashboard loads with data
- ✅ Great: Full auth flow → Create calendar event via UI → Verify in Google Calendar → See update in dashboard
**Approach** (a minimal harness sketch follows this list):
- Use Docker for isolated, reproducible testing
- Create test data/accounts as needed
- Verify outcomes in external systems (Google Calendar, database, LiveKit rooms)
- Clean up after tests
- Use environment variables for test credentials
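Under those constraints, a minimal harness might look like this (compose file, test paths, and environment variable names are hypothetical; real runs must also verify outcomes in the external systems, not just exit codes):
```bash
#!/usr/bin/env bash
# Hypothetical E2E harness: isolated stack, workflow-level tests, guaranteed cleanup.
set -euo pipefail

cleanup() {
  docker compose -f docker-compose.test.yml down -v   # always tear down the test stack
}
trap cleanup EXIT

# Bring up an isolated test stack (compose file name is an assumption)
docker compose -f docker-compose.test.yml up -d --wait

# Test credentials come from the environment, never from the repo (variable names are hypothetical)
: "${TEST_GOOGLE_ACCOUNT:?set TEST_GOOGLE_ACCOUNT}"
: "${LIVEKIT_TEST_URL:?set LIVEKIT_TEST_URL}"

# Run complete user-journey tests (directory name is an assumption)
pytest tests/e2e -x -q
```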
## Critical: Don't Stop Until Everything is Validated
**Your job is to create a validation command that leaves NO STONE UNTURNED.**
- Every user workflow from docs should be tested end-to-end
- Every external integration should be exercised (LiveKit, Google services, AWS, etc.)
- Every API endpoint should be hit
- Every error case should be verified
- Database integrity should be confirmed
- The validation should be so thorough that manual testing is completely unnecessary
If `/validate` passes, the user should have 100% confidence their application works correctly in production. Don't settle for partial coverage - make it comprehensive, creative, and complete.
## Mygentic Personal Assistant Specific Workflows
**Backend workflows to test:**
1. **Voice Agent Interaction:**
- LiveKit room connection
- Speech-to-text processing
- Intent recognition
- Response generation
- Text-to-speech output
2. **Google Services Integration:**
- OAuth authentication flow
- Calendar event creation/retrieval
- Gmail message sending
- Multiple account support
3. **Telegram Bot:**
- Message receiving
- Command processing
- Response sending
**Frontend workflows to test:**
1. **Authentication:**
- User registration
- Email verification
- Login/logout
- Session management
2. **Dashboard:**
- Calendar view
- Event management
- Settings configuration
3. **Voice Agent UI:**
- Connection to LiveKit
- Audio streaming
- Real-time transcription display
- Response visualization
## Output
Write the generated validation command to:
- `backend/.claude/commands/validate.md` for backend
- `frontend/.claude/commands/validate.md` for frontend
The command should be executable, practical, and give complete confidence in the codebase.
## Example Structure
See the example validation command for reference, but adapt it to the specific tools, frameworks, and workflows found in this codebase. Don't copy-paste - generate based on actual codebase analysis.