Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 17:54:49 +08:00
commit ee11345c5b
28 changed files with 6747 additions and 0 deletions

commands/evi.md Normal file

@@ -0,0 +1,669 @@
---
description: Evaluate GitHub issue quality and completeness for agent implementation
argument-hint: [issue number or URL]
---
# EVI - Evaluate Issue Quality
Evaluates a GitHub issue to ensure it contains everything needed for the agent implementation framework to succeed.
## Input
**Issue Number or URL:**
```
$ARGUMENTS
```
## Agent Framework Requirements
The issue must support all agents in the pipeline:
**issue-implementer** needs:
- Clear requirements and end state
- Files/components to create or modify
- Edge cases and error handling specs
- Testing expectations
**quality-checker** needs:
- Testable acceptance criteria matching automated checks
- Performance benchmarks (if applicable)
- Expected test outcomes
**security-checker** needs:
- Security considerations and requirements
- Authentication/authorization specs
- Data handling requirements
**doc-checker** needs:
- Documentation update requirements
- What needs README/wiki updates
- API documentation expectations
**review-orchestrator** needs:
- Clear pass/fail criteria
- Non-negotiable vs nice-to-have requirements
**issue-merger** needs:
- Unambiguous "done" definition
- Integration requirements
## Evaluation Criteria
### ✅ CRITICAL (Must Have)
**1. Clear Requirements**
- Exact end state specified
- Technical details: names, types, constraints, behavior
- Files/components to create or modify
- Validation rules and error handling
- Edge cases covered
**2. Testable Acceptance Criteria**
- Specific, measurable outcomes
- Aligns with automated checks (pytest, ESLint, TypeScript, build)
- Includes edge cases
- References quality checks: "All tests pass", "ESLint passes", "Build succeeds"
**3. Affected Components**
- Which files/modules to modify
- Which APIs/endpoints involved
- Which database tables affected
- Which UI components changed
**4. Testing Expectations**
- What tests need to be written
- What tests need to pass
- Performance benchmarks (if applicable)
- Integration test requirements
**5. Context & Why**
- Business value
- User impact
- Current limitation
- Why this matters now
### ⚠️ IMPORTANT (Should Have)
**6. Security Requirements**
- Authentication/authorization needs
- Data privacy considerations
- Input validation requirements
- Security vulnerabilities to avoid
**7. Documentation Requirements**
- What needs README updates
- What needs wiki/API docs
- Inline code comments expected
- FEATURE-LIST.md updates
**8. Error Handling**
- Expected error messages
- Error codes to return
- Fallback behavior
- User-facing error text
**9. Scope Boundaries**
- What IS included
- What is NOT included
- Out of scope items
- Future work references
### 💡 HELPFUL (Nice to Have)
**10. Performance Requirements** (OPTIONAL - only if specific concern)
- Response time limits
- Query performance expectations
- Scale requirements
- Load handling
- **Note:** Most issues don't need this. Build first, measure, optimize later.
**11. Related Issues**
- Dependencies (blocked by, depends on)
- Related work
- Follow-up issues planned
**12. Implementation Guidance**
- Problems agent needs to solve
- Existing patterns to follow
- Challenges to consider
- No prescriptive solutions (guides, doesn't prescribe)
### ❌ RED FLAGS (Must NOT Have)
**13. Prescriptive Implementation**
- Complete function/component implementations (full solutions)
- Large blocks of working code (> 10-15 lines)
- Complete SQL migration scripts
- Step-by-step implementation guide
- "Add this code to line X" with specific file/line fixes
**✅ OK to have:**
- Short code examples (< 10 lines): `{ error: 'message', code: 400 }`
- Type definitions: `{ email: string, category?: string }`
- Example API responses: `{ id: 5, status: 'active' }`
- Error message formats: `"Invalid email format"`
- Small syntax examples showing the shape/format
## Evaluation Process
### STEP 1: Fetch the Issue
```bash
ISSUE_NUM=$ARGUMENTS
# Fetch issue content
gh issue view $ISSUE_NUM --json title,body,labels > issue-data.json
TITLE=$(jq -r '.title' issue-data.json)
BODY=$(jq -r '.body' issue-data.json)
LABELS=$(jq -r '.labels[].name' issue-data.json)
echo "Evaluating Issue #${ISSUE_NUM}: ${TITLE}"
```
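A small guard like the following can be added if you want the command to fail fast when the issue cannot be fetched (a sketch, not part of the command above; it assumes `gh` is already authenticated):
```bash
# Optional guard (sketch): stop early if the issue cannot be fetched at all
if ! gh issue view "$ISSUE_NUM" --json title >/dev/null 2>&1; then
  echo "❌ Could not fetch issue #$ISSUE_NUM - check the number and gh auth status"
  exit 1
fi
```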
### STEP 2: Check Agent Framework Compatibility
Evaluate for each agent in the pipeline:
**For issue-implementer:**
```bash
# Check: Clear requirements present?
echo "$BODY" | grep -iE '(requirements|end state|target state)' || echo "⚠️ No clear requirements section"
# Check: Files/components specified?
echo "$BODY" | grep -iE '(files? (to )?(create|modify)|components? (to )?(create|modify|affected)|tables? (to )?(create|modify))' || echo "⚠️ No affected files/components specified"
# Check: Edge cases covered?
echo "$BODY" | grep -iE '(edge cases?|error handling|fallback|validation)' || echo "⚠️ No edge cases or error handling mentioned"
```
**For quality-checker:**
```bash
# Check: Acceptance criteria reference automated checks?
echo "$BODY" | grep -iE '(tests? pass|eslint|flake8|mypy|pytest|typescript|build succeeds|linting passes)' || echo "⚠️ Acceptance criteria don't reference automated quality checks"
# Note: Performance requirements are optional - don't check for them
```
**For security-checker:**
```bash
# Check: Security requirements present if handling sensitive data?
if echo "$BODY" | grep -iE '(password|token|secret|api.?key|auth|credential|email|private)'; then
echo "$BODY" | grep -iE '(security|authentication|authorization|encrypt|sanitize|validate|sql.?injection|xss)' || echo "⚠️ Handles sensitive data but no security requirements"
fi
```
**For doc-checker:**
```bash
# Check: Documentation requirements specified?
echo "$BODY" | grep -iE '(documentation|readme|wiki|api.?docs?|comments?|feature.?list)' || echo "⚠️ No documentation requirements specified"
```
**For review-orchestrator:**
```bash
# Check: Clear pass/fail criteria?
CRITERIA_COUNT=$(echo "$BODY" | grep -c -e '- \[.\]' || true)  # -e protects the leading dash; grep -c already prints 0 on no match
if [ "$CRITERIA_COUNT" -lt 3 ]; then
  echo "⚠️ Only $CRITERIA_COUNT acceptance criteria (need at least 3-5)"
fi
```
### STEP 3: Evaluate Testable Acceptance Criteria
Check if criteria align with automated checks (a quick grep sketch follows the lists below):
**Good criteria (matches quality-checker):**
- [x] "All pytest tests pass" → quality-checker runs pytest
- [x] "ESLint passes with no errors" → quality-checker runs ESLint
- [x] "TypeScript compilation succeeds" → quality-checker runs tsc
- [x] "Query completes in < 100ms" → measurable, testable
- [x] "Can insert 5 accounts per user" → specific, verifiable
**Bad criteria (vague/unmeasurable):**
- [ ] "Works correctly" → subjective
- [ ] "Code looks good" → subjective
- [ ] "Function created" → process-oriented, not outcome
- [ ] "Bug is fixed" → not specific enough
### STEP 4: Check for Affected Components
Does issue specify:
- Which files to create?
- Which files to modify?
- Which APIs/endpoints involved?
- Which database tables affected?
- Which UI components changed?
- Which tests need updating?
### STEP 5: Evaluate Testing Expectations
Does issue specify:
- What tests to write (unit, integration, e2e)?
- What existing tests need to pass?
- What test coverage is expected?
- Performance test requirements?
### STEP 6: Check Security Requirements
**Security (if applicable):**
- Authentication/authorization requirements?
- Input validation specs?
- Data privacy considerations?
- Known vulnerabilities to avoid?
**Note on Performance:**
Performance requirements are OPTIONAL and NOT evaluated as a critical check. Most issues don't need explicit performance requirements - build first, measure, optimize later. Only flag as missing if:
- Issue specifically mentions performance as a concern
- Feature handles large-scale data (millions of records)
- There's a user-facing latency requirement
Otherwise, absence of performance requirements is fine.
### STEP 7: Check Documentation Requirements
Does issue specify:
- README updates needed?
- Wiki/API docs updates?
- Inline comments expected?
- FEATURE-LIST.md updates?
### STEP 8: Scan for Red Flags
```bash
# RED FLAG: Large code blocks (> 15 lines)
# Count lines in code blocks
CODE_BLOCKS=$(grep -E '```(typescript|javascript|python|sql|java|go|rust)' <<< "$BODY")
if [ ! -z "$CODE_BLOCKS" ]; then
# Check if any code block has > 15 lines
# (This is a simplified check - full implementation would count lines per block)
echo "⚠️ Found code blocks - checking size..."
# Manual review needed: Are these short examples or full implementations?
fi
# RED FLAG: Complete function implementations
grep -iE '(function|def|class).*\{' <<< "$BODY" && echo "🚨 Contains complete function/class implementations"
# RED FLAG: Prescriptive instructions with specific fixes
grep -iE '(add (this|the following) code to line [0-9]+|here is the implementation|use this exact code)' <<< "$BODY" && echo "🚨 Contains prescriptive code placement instructions"
# RED FLAG: Specific file/line references for bug fixes
grep -E '(fix|change|modify|add).*(in|at|on) line [0-9]+' <<< "$BODY" && echo "🚨 Contains specific file/line fix locations"
# RED FLAG: Step-by-step implementation guide (not just planning)
grep -iE '(step [0-9]|first,.*second,.*third)' <<< "$BODY" | grep -iE '(write this code|add this function|implement as follows)' && echo "🚨 Contains step-by-step code implementation guide"
# OK: Short examples are fine
# These are NOT red flags:
# - Type definitions: { email: string }
# - Example responses: { id: 5, status: 'active' }
# - Error formats: "Invalid email"
# - Small syntax examples (< 5 lines)
```
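If a stricter size check is wanted, a minimal awk sketch along these lines counts the lines inside each fenced block (an illustration, not part of the check above; it assumes `$BODY` holds the issue body fetched in STEP 1):
```bash
# Sketch: count lines inside each fenced code block and flag blocks over 15 lines
echo "$BODY" | awk '
  /^```/ {
    if (in_block) {              # closing fence: report if the block was too large
      if (lines > 15) print "🚨 Code block with " lines " lines (> 15) - possible full implementation"
      in_block = 0
    } else {                     # opening fence: start counting
      in_block = 1; lines = 0
    }
    next
  }
  in_block { lines++ }
'
```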
## Output Format
```markdown
# Issue #${ISSUE_NUM} Evaluation Report
**Title:** ${TITLE}
## Agent Framework Compatibility: X/9 Critical Checks Passed
**Ready for implementation?** [YES / NEEDS_WORK / NO]
**Note:** Performance, Related Issues, and Implementation Guidance are optional - not counted in score.
### ✅ Strengths (What's Good)
- [Specific strength 1]
- [Specific strength 2]
- [Specific strength 3]
### ⚠️ Needs Improvement (Fix Before Implementation)
- [Missing element with specific impact on agent]
- [Missing element with specific impact on agent]
- [Missing element with specific impact on agent]
### ❌ Critical Issues (Blocks Agent Success)
- [Critical missing element / red flag]
- [Critical missing element / red flag]
### 🚨 Red Flags Found
- [Code snippet / prescriptive instruction found]
- [Specific file/line reference found]
## Agent-by-Agent Analysis
### issue-implementer Readiness: [READY / NEEDS_WORK / BLOCKED]
**Can the implementer succeed with this issue?**
**Has:**
- Clear requirements and end state
- Files/components to modify specified
- Edge cases covered
**Missing:**
- Error handling specifications
- Validation rules not detailed
**Impact:** [Description of how missing elements affect implementer]
### quality-checker Readiness: [READY / NEEDS_WORK / BLOCKED]
**Can the quality checker validate this?**
**Has:**
- Acceptance criteria reference "pytest passes"
- Performance benchmark: "< 100ms"
**Missing:**
- No reference to ESLint/TypeScript checks
- Test coverage expectations not specified
**Impact:** [Description of how missing elements affect quality validation]
### security-checker Readiness: [READY / NEEDS_WORK / N/A]
**Are security requirements clear?**
**Has:**
- Authentication requirements specified
- Input validation rules defined
**Missing:**
- No SQL injection prevention mentioned
- Data encryption requirements unclear
**Impact:** [Description of security gaps]
### doc-checker Readiness: [READY / NEEDS_WORK / BLOCKED]
**Are documentation expectations clear?**
**Has:**
- README update requirement specified
**Missing:**
- No API documentation mentioned
- FEATURE-LIST.md updates not specified
**Impact:** [Description of doc gaps]
### review-orchestrator Readiness: [READY / NEEDS_WORK / BLOCKED]
**Are pass/fail criteria clear?**
**Has:**
- 5 specific acceptance criteria
- Clear success conditions
**Missing:**
- Distinction between blocking vs non-blocking issues
- Performance criteria not measurable
**Impact:** [Description of review ambiguity]
## Recommendations
### High Priority (Fix Before Implementation)
1. [Specific actionable fix]
2. [Specific actionable fix]
3. [Specific actionable fix]
### Medium Priority (Improve Clarity)
1. [Specific suggestion]
2. [Specific suggestion]
### Low Priority (Nice to Have)
1. [Optional improvement]
2. [Optional improvement]
## Example Improvements
### Before (Current):
```markdown
[Quote problematic section from issue]
```
### After (Suggested):
```markdown
[Show how it should be written]
```
## Agent Framework Compatibility
**Will this issue work well with the agent framework?**
- ✅ YES - Issue is well-structured for agents
- ⚠️ MAYBE - Needs improvements but workable
- ❌ NO - Critical issues will confuse agents
**Specific concerns:**
- [Agent compatibility issue 1]
- [Agent compatibility issue 2]
## Quick Fixes
If you want to improve this issue, run:
```bash
# Add these sections
gh issue comment $ISSUE_NUM --body "## Implementation Guidance
To implement this, you will need to:
- [List problems to solve]
"
# Remove code snippets
# Edit the issue and remove code blocks showing implementation
gh issue edit $ISSUE_NUM
# Add acceptance criteria
gh issue comment $ISSUE_NUM --body "## Additional Acceptance Criteria
- [ ] [Specific testable criterion]
"
```
## Summary
**Ready for agent implementation?** [YES/NO/NEEDS_WORK]
**Confidence level:** [HIGH/MEDIUM/LOW]
**Estimated time to fix issues:** [X minutes]
```
## Agent Framework Compatibility Score
**13 checks in total (5 critical, 4 important, 3 helpful, 1 red-flag check):**
### ✅ CRITICAL (Must Pass) - 5 checks
1. **Clear Requirements** - End state specified with technical details
2. **Testable Acceptance Criteria** - Aligns with automated checks (pytest, ESLint, etc.)
3. **Affected Components** - Files/modules to create or modify listed
4. **Testing Expectations** - What tests to write/pass specified
5. **Context** - Why, who, business value explained
### ⚠️ IMPORTANT (Should Pass) - 4 checks
6. **Security Requirements** - If handling sensitive data/auth
7. **Documentation Requirements** - README/wiki/API docs expectations
8. **Error Handling** - Error messages, codes, fallback behavior
9. **Scope Boundaries** - What IS and ISN'T included
### 💡 HELPFUL (Nice to Pass) - 3 checks
10. **Performance Requirements** - OPTIONAL, only if specific concern mentioned
11. **Related Issues** - Dependencies, related work
12. **Implementation Guidance** - Problems to solve (without prescribing HOW)
### ❌ RED FLAG (Must NOT Have) - 1 check
13. **No Prescriptive Implementation** - No code snippets, algorithms, step-by-step
**Scoring (based on 9 critical/important checks):**
- **9/9 + No Red Flags**: Perfect - Ready for agents
- **7-8/9 + No Red Flags**: Excellent - Minor improvements
- **5-6/9 + No Red Flags**: Good - Some gaps but workable
- **3-4/9 + No Red Flags**: Needs work - Significant gaps
- **< 3/9 OR Red Flags Present**: Blocked - Must fix before implementation
**Note:** Performance, Related Issues, and Implementation Guidance are HELPFUL but not required.
**Ready for Implementation?**
- **YES**: 7+ checks passed, no red flags
- **NEEDS_WORK**: 5-6 checks passed OR minor red flags
- **NO**: < 5 checks passed OR major red flags (code snippets, step-by-step)
## Examples
### Example 1: Excellent Issue (9/9 Checks Passed)
```
Issue #42: "Support multiple Google accounts per user"
✅ Agent Framework Compatibility: 9/9
**Ready for implementation?** YES
Strengths:
- Clear database requirements (tables, columns, constraints)
- Acceptance criteria: "pytest passes", "Build succeeds"
- Files specified: user_google_accounts table, modify user_google_tokens
- Testing: "Write tests for multi-account insertion"
- Security: "Validate email uniqueness per user"
- Documentation: "Update README with multi-account setup"
- Error handling: "Return 409 if duplicate email"
- Scope: "Account switching UI is out of scope (Issue #43)"
- Context: Explains why users need personal + work accounts
Agent-by-Agent:
✅ implementer: Has everything needed
✅ quality-checker: Criteria match automated checks
✅ security-checker: Security requirements clear
✅ doc-checker: Documentation expectations specified
✅ review-orchestrator: Clear pass/fail criteria
Note: No performance requirements specified - this is fine. Build first, optimize later if needed.
```
### Example 2: Needs Work (6/9 Checks Passed)
```
Issue #55: "Add email search functionality"
⚠️ Agent Framework Compatibility: 6/9
**Ready for implementation?** NEEDS_WORK
Strengths:
- Clear requirements: search endpoint, query parameters
- Context explains user need
- Affected components specified
Issues:
❌ Missing testing expectations (what tests to write?)
❌ Missing documentation requirements
❌ Missing error handling specs
❌ Vague acceptance criteria: "Search works correctly"
Agent-by-Agent:
✅ implementer: Has requirements
⚠️ quality-checker: Criteria too vague ("works correctly")
⚠️ security-checker: No SQL injection prevention mentioned
⚠️ doc-checker: No docs requirements
⚠️ review-orchestrator: Criteria not specific enough
Recommendations:
- Add: "pytest tests pass for search functionality"
- Add: "Update API documentation"
- Add: "Error handling: Return 400 if invalid query, 500 if database error"
- Replace "works correctly" with specific outcomes: "Returns matching emails", "Handles empty results"
Note: Performance requirements are optional - don't need to add unless there's a specific concern.
```
### Example 3: Blocked (2/9, Red Flags Present)
```
Issue #67: "Fix email parsing bug"
❌ Agent Framework Compatibility: 2/9
**Ready for implementation?** NO
Critical Issues:
🚨 Contains 30-line complete function implementation
🚨 Prescriptive: "Add this exact function to inbox.ts line 45"
🚨 Step-by-step: "First, create parseEmail(). Second, add validation. Third, handle errors."
❌ No clear requirements (what should parsing do?)
❌ No acceptance criteria
❌ No testing expectations
Agent-by-Agent:
❌ implementer: Doesn't need to think - solution provided
❌ quality-checker: Can't validate (no criteria)
❌ security-checker: Missing validation specs
❌ doc-checker: No docs requirements
❌ review-orchestrator: No pass/fail criteria
Must fix:
1. Remove complete function - describe WHAT parsing should do instead
- ✅ OK to keep: "Output format: `{ local: string, domain: string }`"
- ✅ OK to keep: "Error message: 'Invalid email format'"
- ❌ Remove: 30-line function implementation
2. Add requirements: input format, output format, error handling
3. Add acceptance criteria: "Parses valid emails", "Rejects invalid formats"
4. Specify testing: "Add unit tests for edge cases"
5. Remove line 45 reference - describe behavior instead
6. Convert step-by-step to "Implementation Guidance" section
```
### Example 4: Good Use of Code Examples (9/9)
```
Issue #78: "Add user profile endpoint"
✅ Agent Framework Compatibility: 9/9
**Ready for implementation?** YES
Requirements:
- Endpoint: POST /api/profile
- Request body: `{ name: string, bio?: string }` ← Short example (OK!)
- Response: `{ id: number, name: string, bio: string | null }` ← Type def (OK!)
- Error: `{ error: "Name required", code: 400 }` ← Error format (OK!)
✅ Good use of code examples:
- Shows shape/format without providing implementation
- Type definitions help clarify requirements
- Error format specifies exact user-facing text
- No function implementations
Agent-by-Agent:
✅ implementer: Clear requirements, no prescribed solution
✅ quality-checker: Testable criteria
✅ security-checker: Input validation specified
✅ doc-checker: API docs requirement present
```
## Integration with GHI Command
This command complements GHI:
- **GHI**: Creates new issues following best practices
- **EVI**: Evaluates existing issues against same standards
Use EVI to:
- Review issues written by other agents
- Audit issues before sending to implementation
- Train team on issue quality standards
- Ensure agent framework compatibility
## Notes
- Run EVI before implementing any issue
- Use it to review issues written by other agents or teammates
- Consider making EVI a required check in your workflow
- **7+ checks passed**: Agent-ready, implement immediately
- **5-6 checks passed**: Needs minor improvements, but workable
- **< 5 checks passed**: Must revise before implementation
- Look for red flags even in high-scoring issues
### What About Code Snippets?
**✅ Short examples are GOOD:**
- Type definitions: `interface User { id: number, name: string }`
- API responses: `{ success: true, data: {...} }`
- Error formats: `{ error: "message", code: 400 }`
- Small syntax examples (< 10 lines)
**❌ Large implementations are BAD:**
- Complete functions (15+ lines)
- Full component implementations
- Complete SQL migration scripts
- Working solutions that agents can copy-paste
**The key distinction:** Are you showing the shape/format (good) or providing the solution (bad)?

commands/featureforge.md Normal file

@@ -0,0 +1,145 @@
---
description: Transform rough feature ideas into comprehensive implementation specifications
argument-hint: [feature idea]
---
# FeatureForge - Feature Specification Builder
Transform rough feature ideas into comprehensive, ready-to-implement specifications.
## Input
**Feature Idea:**
```
$ARGUMENTS
```
## Steps
1. **Understand the Feature**
- Identify core functionality and business value
- Determine complexity level (Simple/Medium/Complex)
- Choose appropriate detail level based on complexity
2. **Gather Context** (Ask 3-5 clarifying questions)
- What problem does this solve for users?
- Who will use this feature and when?
- What's the main user flow or interaction?
- Any technical constraints or integrations needed?
- How will you measure success?
3. **Build Specification**
- Use template below based on complexity
- Focus on clarity and actionable details
- Include acceptance criteria that are testable
4. **Output for Next Steps**
- Format as complete specification ready for `/cci` command
- Wrap in code block for easy copying
## Output Format
### For Simple Features
```markdown
# Feature: [Feature Name]
## Problem
[What user problem does this solve?]
## Solution
[How will this feature work?]
## User Story
As a [user type], I want to [action] so that [benefit].
## Acceptance Criteria
- [ ] [Testable criterion 1]
- [ ] [Testable criterion 2]
- [ ] [Testable criterion 3]
## Technical Notes
- [Key implementation detail]
- [Dependencies or constraints]
**Priority**: Medium | **Effort**: Small
```
### For Medium/Complex Features
```markdown
# Feature: [Feature Name]
## Problem Statement
[Detailed description of the problem and user impact]
## Proposed Solution
[Comprehensive description of how the feature works]
## User Stories
1. As a [user type], I want to [action] so that [benefit]
2. As a [user type], I want to [action] so that [benefit]
## Functional Requirements
- [Requirement 1 with details]
- [Requirement 2 with details]
- [Requirement 3 with details]
## Acceptance Criteria
- [ ] [Detailed testable criterion 1]
- [ ] [Detailed testable criterion 2]
- [ ] [Detailed testable criterion 3]
- [ ] [Detailed testable criterion 4]
## Technical Specifications
### Data Model
[What data needs to be stored/processed?]
### API/Integration Points
[New endpoints or external integrations needed]
### UI/UX Considerations
[Key user interface elements and flows]
## Edge Cases & Error Handling
- **[Edge case 1]**: [How to handle]
- **[Error scenario]**: [Expected behavior]
## Success Metrics
- [Metric 1]: [Target value]
- [Metric 2]: [Target value]
## Dependencies
- [External system or prerequisite feature]
**Priority**: High | **Effort**: Large | **Timeline**: 4-6 weeks
```
## Example Commands
**Simple:**
```
/featureforge Add a dark mode toggle in settings
```
**Complex:**
```
/featureforge Build a customer referral system with reward tracking, email notifications, and analytics dashboard
```
## Usage Flow
1. Run `/featureforge [your feature idea]`
2. Answer clarifying questions
3. Receive formatted specification
4. Copy the output
5. Use with `/cci` to create GitHub issue: `/cci [paste specification here]`
## Notes
- **Keep it actionable** - Focus on what to build, not how to build it
- **Be specific** - Vague requirements lead to unclear implementations
- **Think user-first** - Start with the problem, not the solution
- **Include metrics** - Define what success looks like
- For coding standards and patterns, refer to CLAUDE.md


@@ -0,0 +1,116 @@
---
description: Generate a new custom Claude Code slash command from a description
argument-hint: [command purpose]
---
# Create a Custom Claude Code Command
Create a new slash command in `.claude/commands/` for the requested task.
## Goal
#$ARGUMENTS
## Key Capabilities to Leverage
**File Operations:**
- Read, Edit, Write - modify files precisely
- Glob, Grep - search codebase
- MultiEdit - atomic multi-part changes
**Development:**
- Bash - run commands (git, tests, linters)
- Task - launch specialized agents for complex tasks
- TodoWrite - track progress with todo lists
**Web & APIs:**
- WebFetch, WebSearch - research documentation
- GitHub (gh cli) - PRs, issues, reviews
- Puppeteer - browser automation, screenshots
**Integrations:**
- AppSignal - logs and monitoring
- Context7 - framework docs
- Stripe, Todoist, Featurebase (if relevant)
## Best Practices
1. **Be specific and clear** - detailed instructions yield better results
2. **Break down complex tasks** - use step-by-step plans
3. **Use examples** - reference existing code patterns
4. **Include success criteria** - tests pass, linting clean, etc.
5. **Think first** - use "think hard" or "plan" keywords for complex problems
6. **Iterate** - guide the process step by step
## Structure Your Command
```markdown
# [Command Name]
[Brief description of what this command does]
## Steps
1. [First step with specific details]
- Include file paths, patterns, or constraints
- Reference existing code if applicable
2. [Second step]
- Use parallel tool calls when possible
- Check/verify results
3. [Final steps]
- Run tests
- Lint code
- Commit changes (if appropriate)
## Success Criteria
- [ ] Tests pass
- [ ] Code follows style guide
- [ ] Documentation updated (if needed)
```
## Tips for Effective Commands
- **Use $ARGUMENTS** placeholder for dynamic inputs
- **Reference CLAUDE.md** patterns and conventions
- **Include verification steps** - tests, linting, visual checks
- **Be explicit about constraints** - don't modify X, use pattern Y
- **Use XML tags** for structured prompts: `<task>`, `<requirements>`, `<constraints>`
## Example Pattern
```markdown
Implement #$ARGUMENTS following these steps:
1. Research existing patterns
- Search for similar code using Grep
- Read relevant files to understand approach
2. Plan the implementation
- Think through edge cases and requirements
- Consider test cases needed
3. Implement
- Follow existing code patterns (reference specific files)
- Write tests first if doing TDD
- Ensure code follows CLAUDE.md conventions
4. Verify
- Run tests:
- Rails: `bin/rails test` or `bundle exec rspec`
- TypeScript: `npm test` or `yarn test` (Jest/Vitest)
- Python: `pytest` or `python -m pytest`
- Run linter:
- Rails: `bundle exec standardrb` or `bundle exec rubocop`
- TypeScript: `npm run lint` or `eslint .`
- Python: `ruff check .` or `flake8`
- Check changes with git diff
5. Commit (optional)
- Stage changes
- Write clear commit message
```
Now create the command file at `.claude/commands/[name].md` with the structure above.

commands/ghi.md Normal file

File diff suppressed because it is too large

commands/implement-issue.md Normal file

@@ -0,0 +1,258 @@
---
description: "Complete 3-phase automation: implement → review → merge for GitHub issues"
argument-hint: [issue number]
---
# Implement GitHub Issue - Automated Workflow
You are orchestrating the complete implementation → review → merge pipeline for issue **$ARGUMENTS**.
## Your Mission
Execute this workflow **autonomously** without stopping between phases unless errors occur:
1. **Phase 1: Implementation** - Launch issue-implementer agent
2. **Phase 2: Review** - Launch review-orchestrator agent (3 parallel checks)
3. **Phase 3: Merge** - Launch issue-merger agent
Do **NOT** stop between phases to ask the user. Only stop if:
- An agent returns an error
- Review decision is "REQUEST_CHANGES" (report feedback to user)
- Merge fails due to conflicts (report to user)
---
## Execution Instructions
### Phase 1: Implementation
Launch the issue-implementer agent with issue number **$ARGUMENTS**:
```
Use the Task tool with:
- subagent_type: issue-implementer
- description: "Implement issue $ARGUMENTS"
- prompt: "Implement issue $ARGUMENTS following the 8-step workflow."
```
**If API returns 500 "Overloaded" error** (a retry sketch follows this list):
- Automatically retry up to 2 more times (3 attempts total)
- Wait 5 seconds between retries
- Log: "API overloaded, retrying in 5 seconds... (attempt X/3)"
- Only fail if all 3 attempts return 500 errors
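The retry happens at the agent-invocation level, but expressed as a shell-style sketch the policy looks roughly like this (`launch_implementer` is a hypothetical stand-in for however the issue-implementer agent is actually launched):
```bash
# Sketch of the retry policy described above (3 attempts, 5 seconds apart).
# launch_implementer is hypothetical - substitute the real agent invocation.
for attempt in 1 2 3; do
  launch_implementer "$ARGUMENTS" && break
  if [ "$attempt" -lt 3 ]; then
    echo "API overloaded, retrying in 5 seconds... (attempt $attempt/3)"
    sleep 5
  else
    echo "❌ All 3 attempts failed with 500 errors"
    exit 1
  fi
done
```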
**After agent completes, verify actual implementation**:
```bash
# Check manifest exists
if [ -f ".agent-state/issue-$ARGUMENTS-implementation.yaml" ]; then
STATUS=$(grep "status:" .agent-state/issue-$ARGUMENTS-implementation.yaml | cut -d: -f2 | tr -d ' ')
# Verify implementation artifacts
BRANCH=$(grep "branch:" .agent-state/issue-$ARGUMENTS-implementation.yaml | cut -d: -f2 | tr -d ' "')
# Check branch exists
if ! git rev-parse --verify "$BRANCH" >/dev/null 2>&1; then
echo "⚠️ Warning: Branch $BRANCH not found, but status is $STATUS"
echo "This may indicate implementation succeeded despite error messages"
fi
# For AWS operations, verify resources exist (if applicable)
# This will be handled in implementation verification section below
else
echo "❌ ERROR: Implementation manifest not found"
echo "Agent may have failed before completion"
exit 1
fi
```
**Verify Implementation Success**:
Check actual state, not just manifest:
- Branch exists: `git rev-parse --verify feature/issue-$ARGUMENTS`
- Commits pushed: `git log origin/feature/issue-$ARGUMENTS`
- Issue labeled: `gh issue view $ARGUMENTS --json labels | grep ready_for_review`
**Using the verification helper script**:
```bash
# Verify branch exists
~/.claude/scripts/verify-github-operation.sh branch-exists $ARGUMENTS feature/issue-$ARGUMENTS
# Verify label updated
~/.claude/scripts/verify-github-operation.sh label-updated $ARGUMENTS ready_for_review
```
The verification script provides color-coded output and clear success/failure indicators.
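The helper script ships with the setup; as a rough illustration of what its checks amount to (a sketch of assumed internals, not the actual script), the operations map onto plain `gh`/`git` calls:
```bash
# Illustrative sketch only - the real verify-github-operation.sh may differ.
# Usage: verify-github-operation.sh <check> <issue-number> [extra]
check="$1"; issue="$2"; extra="$3"
case "$check" in
  branch-exists)
    git ls-remote --exit-code --heads origin "$extra" >/dev/null \
      && echo "✅ branch $extra exists" || { echo "❌ branch $extra missing"; exit 1; } ;;
  label-updated)
    gh issue view "$issue" --json labels --jq '.labels[].name' | grep -qx "$extra" \
      && echo "✅ label $extra present" || { echo "❌ label $extra missing"; exit 1; } ;;
  pr-created)
    [ "$(gh pr list --search "$issue in:title" --json number --jq 'length')" -gt 0 ] \
      && echo "✅ PR found for issue #$issue" || { echo "❌ no PR found"; exit 1; } ;;
esac
```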
If status is "ready_for_review" and verification passes, print:
```
✅ Phase 1 Complete: Implementation
📋 Files changed: [list from manifest]
📊 Status: ready_for_review
⏭️ Continuing to review phase...
```
Then proceed to Phase 2 immediately.
---
### Phase 2: Review
Launch the review-orchestrator agent:
```
Use the Task tool with:
- subagent_type: review-orchestrator
- description: "Review issue $ARGUMENTS"
- prompt: "Orchestrate code review for issue $ARGUMENTS. Coordinate security, quality, and documentation checks in parallel. Create PR if approved."
```
**Wait for completion**, then check:
```bash
cat .agent-state/issue-$ARGUMENTS-review.yaml | grep "decision:"
```
**If decision is "APPROVE"**, verify and report:
**Verify Review Completion**:
```bash
# Verify PR was created
~/.claude/scripts/verify-github-operation.sh pr-created $ARGUMENTS
# Verify label updated to ready_for_merge
~/.claude/scripts/verify-github-operation.sh label-updated $ARGUMENTS ready_for_merge
```
If verification passes, print:
```
✅ Phase 2 Complete: Review
🔒 Security: PASS (0 issues)
⚡ Quality: PASS
📚 Documentation: PASS
🎯 Decision: APPROVE
📝 PR created and label updated
⏭️ Continuing to merge phase...
```
Then proceed to Phase 3 immediately.
**If decision is "REQUEST_CHANGES"**, print:
```
❌ Phase 2: Changes Requested
📝 Feedback: [show blocking issues from manifest]
🔄 Issue returned to "To Do" status
Workflow stopped. Address the feedback and re-run /implement-issue $ARGUMENTS
```
Stop here and report to user.
---
### Phase 3: Merge
Launch the issue-merger agent:
```
Use the Task tool with:
- subagent_type: issue-merger
- description: "Merge PR for issue $ARGUMENTS"
- prompt: "Merge the approved pull request for issue $ARGUMENTS. Handle conflict resolution if needed, update all tracking systems, and cleanup worktrees."
```
**Wait for completion**, then verify and report:
**Verify Merge Completion**:
```bash
# Verify PR was merged
~/.claude/scripts/verify-github-operation.sh pr-merged $ARGUMENTS
# Verify issue was closed
~/.claude/scripts/verify-github-operation.sh issue-closed $ARGUMENTS
```
If verification passes, print:
```
✅ Phase 3 Complete: Merge
🔗 PR merged and verified
📦 Commit: [SHA from merger output]
🏁 Issue #$ARGUMENTS closed and verified
✨ Workflow complete
```
---
## Error Handling
**Error Recovery Protocol**:
1. **Check actual state before failing**:
- Implementation phase: Verify branch exists, files changed, commits pushed
- Review phase: Verify PR created (if approved), issue labeled correctly
- Merge phase: Verify PR merged, issue closed, branch deleted
2. **For API 500 "Overloaded" errors specifically**:
- These are temporary overload, not failures
- Automatically retry up to 2 more times (3 attempts total)
- Wait 5 seconds between retries
- Only fail if all 3 attempts return errors
3. **For "operation failed" but state shows success**:
- Log warning but continue workflow
- Example: "Connection closed" error but AWS resource actually exists
- Report to user: "Operation completed successfully despite error message"
4. **For actual failures**:
- Stop workflow immediately
- Show clear error with phase name and details
- Provide specific recovery steps
- Preserve work (don't delete worktree/branch)
**Common errors:**
- "Issue not found" → Check issue number format (should be just number, not URL)
- "No implementation manifest" → Run implementer first
- "Review not approved" → Address review feedback and re-run workflow
- "Merge conflicts" → Resolve manually, then re-run merge phase
- "API Overloaded" → Automatic retry (up to 3 attempts)
- "Label not updated" → Manually update via: `gh issue edit NUMBER --add-label LABEL`
---
## Status Reporting
After each phase completion, use this format:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Phase [N]: [Phase Name]
✅ Status: [Complete/Failed]
📊 Details: [Key information]
⏭️ Next: [What happens next]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
---
## Final Report
After Phase 3 completes successfully, provide:
```
🎉 Issue #$ARGUMENTS Implementation Complete
📊 Summary:
• Implementation: [files changed]
• Review: [decision + check results]
• Merge: [PR number + commit SHA]
🔗 Links:
• Issue: https://github.com/Mygentic-AI/mygentic-personal-assistant/issues/$ARGUMENTS
• PR: [PR URL from merger output]
• Commit: [commit URL]
✅ All tracking systems updated
✅ Worktree cleaned up
✅ Branches deleted
```
---
**Start now with issue $ARGUMENTS. Do not ask for confirmation between phases.**


@@ -0,0 +1,202 @@
---
description: Implement multiple GitHub issues in parallel with dependency analysis
argument-hint: [space-separated issue numbers]
---
# Implement Multiple GitHub Issues in Parallel
Execute multiple GitHub issues concurrently, respecting dependencies between them.
## Usage
```
/implement-issues-parallel 5 12 18 23
```
## Workflow
### Phase 1: Analysis & Dependency Detection
1. **Fetch All Issues**
```bash
for issue in $ARGUMENTS; do
gh issue view $issue --json title,body,labels
done
```
2. **Analyze Dependencies**
For each issue, check:
- Does it reference other issues in the batch? (e.g., "Depends on #5", "Blocks #12")
- Does it modify files that other issues also modify?
- Does it require features from other issues?
3. **Build Dependency Graph (ASCII)**
```
┌─────────────────────────────────────────────────────────────┐
│ EXECUTION PLAN FOR ISSUES: 5, 12, 18, 23 │
└─────────────────────────────────────────────────────────────┘
┌──────────┐
│ Issue #5 │ (Database migration)
└────┬─────┘
├────────┐
│ │
▼ ▼
┌──────────┐ ┌──────────┐
│Issue #12 │ │Issue #18 │ (Both need DB changes from #5)
└────┬─────┘ └────┬─────┘
│ │
└────────┬───┘
┌──────────┐
│Issue #23 │ (Needs both #12 and #18)
└──────────┘
WAVE 1 (Parallel): #5
WAVE 2 (Parallel): #12, #18
WAVE 3 (Parallel): #23
Total Estimated Time: ~45-60 minutes
```
4. **User Confirmation**
Show the execution plan and ask:
```
Execute this plan? (yes/modify/cancel)
- yes: Start execution
- modify: Adjust dependencies or order
- cancel: Abort
```
### Phase 2: Parallel Execution
1. **Execute by Waves**
For each wave:
- Launch issue-implementer agents in parallel
- Wait for all agents in wave to complete
- Verify no conflicts before proceeding to next wave
Example for Wave 2:
```
# Launch in parallel
Task issue-implementer(12) &
Task issue-implementer(18) &
# Wait for both to complete
wait
```
2. **Conflict Detection Between Waves**
After each wave completes:
- Check for file conflicts between branches
- Verify no overlapping changes
- Report any conflicts to user before continuing
3. **Progress Tracking**
```
┌──────────────────────────────────────────────────────┐
│ PROGRESS: Wave 2/3 │
├──────────────────────────────────────────────────────┤
│ ✅ Wave 1: #5 (Complete) │
│ ⏳ Wave 2: #12 (in_progress) | #18 (in_progress) │
│ ⏸ Wave 3: #23 (pending) │
└──────────────────────────────────────────────────────┘
```
### Phase 3: Review & Merge Coordination
1. **Sequential Review**
Even though implementation was parallel, reviews happen sequentially to avoid reviewer confusion:
```
For each issue in dependency order (5, 12, 18, 23):
- Launch review-orchestrator(issue)
- Wait for approval
- If approved, continue to next
- If changes requested, pause pipeline
```
2. **Coordinated Merging**
After all reviews pass:
```
For each issue in dependency order:
- Launch issue-merger(issue)
- Verify merge successful
- Update remaining branches if needed
```
### Phase 4: Completion Report
```
┌────────────────────────────────────────────────────────────┐
│ PARALLEL EXECUTION COMPLETE │
├────────────────────────────────────────────────────────────┤
│ Issues Implemented: 4 │
│ Total Time: 52 minutes │
│ Time Saved vs Sequential: ~78 minutes (60% faster) │
│ │
│ Results: │
│ ✅ #5 - Merged (commit: abc123) │
│ ✅ #12 - Merged (commit: def456) │
│ ✅ #18 - Merged (commit: ghi789) │
│ ✅ #23 - Merged (commit: jkl012) │
└────────────────────────────────────────────────────────────┘
```
## Dependency Detection Rules
**Explicit Dependencies** (highest priority):
- Issue body contains: "Depends on #X", "Requires #X", "Blocked by #X"
- Issue references another issue with keywords: "After #X", "Needs #X"
**Implicit Dependencies** (inferred):
- Both modify the same database table (check for migration files)
- One adds a function, another uses it (check for new exports/imports)
- One modifies an API endpoint, another calls it
**File Conflict Analysis** (a sketch follows this list):
- Compare `files_changed` from each issue's implementation manifest
- If overlap > 30%, consider creating dependency
- Ask user to confirm inferred dependencies
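A minimal sketch of the overlap check above, assuming each manifest lives at `.agent-state/issue-<N>-implementation.yaml` with a `files_changed:` list (the path follows the convention used by the implement-issue workflow; adjust if your manifests differ):
```bash
# Sketch: estimate file overlap between two issues' implementation manifests
files_for() {
  awk '/^files_changed:/{f=1;next}
       f && /^[[:space:]]*-/{sub(/^[[:space:]]*-[[:space:]]*/,""); print; next}
       f && /^[^[:space:]]/{f=0}' \
    ".agent-state/issue-$1-implementation.yaml" | sort -u
}
A=$(files_for 12); B=$(files_for 18)
COMMON=$(comm -12 <(echo "$A") <(echo "$B") | wc -l)
TOTAL=$(echo "$A" | wc -l)
if [ "$TOTAL" -gt 0 ] && [ $((COMMON * 100 / TOTAL)) -gt 30 ]; then
  echo "⚠️ Issues #12 and #18 overlap on $COMMON of $TOTAL files - consider adding a dependency"
fi
```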
## Error Handling
**If an Issue Fails in a Wave:**
1. Continue other issues in current wave
2. Mark dependent issues as "blocked"
3. Report failure to user
4. Ask: retry failed issue / skip dependent issues / abort all
**If Review Requests Changes:**
1. Pause pipeline
2. Show which issues are affected
3. Ask: address feedback now / continue with others / abort
**If Merge Conflict:**
1. Attempt automatic resolution if simple
2. If complex, pause and show conflict
3. Ask user to resolve manually or abort
## Limitations
- Maximum 10 issues per batch (to avoid complexity)
- All issues must be from the same repository
- Cannot mix issues requiring different environments
- Review coordination may still take time despite parallel implementation
## Success Criteria
- All issues successfully merged OR clear failure report
- No merge conflicts left unresolved
- All tracking systems updated correctly
- Worktrees cleaned up properly

commands/plan.md Normal file

@@ -0,0 +1,374 @@
---
description: Research and plan comprehensive GitHub issues with parallel agent analysis
argument-hint: [feature/bug description]
---
# Create GitHub Issue (Plan)
## Introduction
Transform feature descriptions, bug reports, or improvement ideas into well-structured markdown issue files that follow project conventions and best practices. This command provides flexible detail levels to match your needs.
## Feature Description
<feature_description> #$ARGUMENTS </feature_description>
## Main Tasks
### 1. Repository Research & Context Gathering
<thinking>
First, I need to understand the project's conventions and existing patterns, leveraging all available resources and using parallel subagents to do this.
</thinking>
Run these three agents in parallel:
- Task repo-research-analyst(feature_description)
- Task best-practices-researcher(feature_description)
- Task framework-docs-researcher(feature_description)
**Reference Collection:**
- [ ] Document all research findings with specific file paths (e.g., `src/services/user_service.py:42` or `src/components/UserProfile.tsx:15`)
- [ ] Include URLs to external documentation and best practices guides
- [ ] Create a reference list of similar issues or PRs (e.g., `#123`, `#456`)
- [ ] Note any team conventions discovered in `CLAUDE.md` or team documentation
### 2. Issue Planning & Structure
<thinking>
Think like a product manager - what would make this issue clear and actionable? Consider multiple perspectives
</thinking>
**Title & Categorization:**
- [ ] Draft clear, searchable issue title using conventional format (e.g., `feat:`, `fix:`, `docs:`)
- [ ] Identify appropriate labels from repository's label set (`gh label list`)
- [ ] Determine issue type: enhancement, bug, refactor
**Stakeholder Analysis:**
- [ ] Identify who will be affected by this issue (end users, developers, operations)
- [ ] Consider implementation complexity and required expertise
**Content Planning:**
- [ ] Choose appropriate detail level based on issue complexity and audience
- [ ] List all necessary sections for the chosen template
- [ ] Gather supporting materials (error logs, screenshots, design mockups)
- [ ] Prepare code examples or reproduction steps if applicable, naming the mock filenames in the lists
### 3. Choose Implementation Detail Level
Select how comprehensive you want the issue to be:
#### 📄 MINIMAL (Quick Issue)
**Best for:** Simple bugs, small improvements, clear features
**Includes:**
- Problem statement or feature description
- Basic acceptance criteria
- Essential context only
**Structure:**
````markdown
[Brief problem/feature description]
## Acceptance Criteria
- [ ] Core requirement 1
- [ ] Core requirement 2
## Context
[Any critical information]
## MVP
### test.py
```python
class Test:
def __init__(self):
self.name = "test"
```
### test.ts
```typescript
class Test {
private name: string;
constructor() {
this.name = "test";
}
}
```
## References
- Related issue: #[issue_number]
- Documentation: [relevant_docs_url]
````
#### 📋 MORE (Standard Issue)
**Best for:** Most features, complex bugs, team collaboration
**Includes everything from MINIMAL plus:**
- Detailed background and motivation
- Technical considerations
- Success metrics
- Dependencies and risks
- Basic implementation suggestions
**Structure:**
```markdown
## Overview
[Comprehensive description]
## Problem Statement / Motivation
[Why this matters]
## Proposed Solution
[High-level approach]
## Technical Considerations
- Architecture impacts
- Performance implications
- Security considerations
## Acceptance Criteria
- [ ] Detailed requirement 1
- [ ] Detailed requirement 2
- [ ] Testing requirements
## Success Metrics
[How we measure success]
## Dependencies & Risks
[What could block or complicate this]
## References & Research
- Similar implementations: [file_path:line_number]
- Best practices: [documentation_url]
- Related PRs: #[pr_number]
```
#### 📚 A LOT (Comprehensive Issue)
**Best for:** Major features, architectural changes, complex integrations
**Includes everything from MORE plus:**
- Detailed implementation plan with phases
- Alternative approaches considered
- Extensive technical specifications
- Resource requirements and timeline
- Future considerations and extensibility
- Risk mitigation strategies
- Documentation requirements
**Structure:**
```markdown
## Overview
[Executive summary]
## Problem Statement
[Detailed problem analysis]
## Proposed Solution
[Comprehensive solution design]
## Technical Approach
### Architecture
[Detailed technical design]
### Implementation Phases
#### Phase 1: [Foundation]
- Tasks and deliverables
- Success criteria
- Estimated effort
#### Phase 2: [Core Implementation]
- Tasks and deliverables
- Success criteria
- Estimated effort
#### Phase 3: [Polish & Optimization]
- Tasks and deliverables
- Success criteria
- Estimated effort
## Alternative Approaches Considered
[Other solutions evaluated and why rejected]
## Acceptance Criteria
### Functional Requirements
- [ ] Detailed functional criteria
### Non-Functional Requirements
- [ ] Performance targets
- [ ] Security requirements
- [ ] Accessibility standards
### Quality Gates
- [ ] Test coverage requirements
- [ ] Documentation completeness
- [ ] Code review approval
## Success Metrics
[Detailed KPIs and measurement methods]
## Dependencies & Prerequisites
[Detailed dependency analysis]
## Risk Analysis & Mitigation
[Comprehensive risk assessment]
## Resource Requirements
[Team, time, infrastructure needs]
## Future Considerations
[Extensibility and long-term vision]
## Documentation Plan
[What docs need updating]
## References & Research
### Internal References
- Architecture decisions: [file_path:line_number]
- Similar features: [file_path:line_number]
- Configuration: [file_path:line_number]
### External References
- Framework documentation: [url]
- Best practices guide: [url]
- Industry standards: [url]
### Related Work
- Previous PRs: #[pr_numbers]
- Related issues: #[issue_numbers]
- Design documents: [links]
```
### 4. Issue Creation & Formatting
<thinking>
Apply best practices for clarity and actionability, making the issue easy to scan and understand
</thinking>
**Content Formatting:**
- [ ] Use clear, descriptive headings with proper hierarchy (##, ###)
- [ ] Include code examples in triple backticks with language syntax highlighting
- [ ] Add screenshots/mockups if UI-related (drag & drop or use image hosting)
- [ ] Use task lists (- [ ]) for trackable items that can be checked off
- [ ] Add collapsible sections for lengthy logs or optional details using `<details>` tags
- [ ] Apply appropriate emoji for visual scanning (🐛 bug, ✨ feature, 📚 docs, ♻️ refactor)
**Cross-Referencing:**
- [ ] Link to related issues/PRs using #number format
- [ ] Reference specific commits with SHA hashes when relevant
- [ ] Link to code using GitHub's permalink feature (press 'y' for permanent link)
- [ ] Mention relevant team members with @username if needed
- [ ] Add links to external resources with descriptive text
**Code & Examples:**
```markdown
# Good example with syntax highlighting and line references
\`\`\`ruby
# app/services/user_service.rb:42
def process_user(user)
  # Implementation here
end
\`\`\`

# Collapsible error logs
<details>
<summary>Full error stacktrace</summary>

\`\`\`
Error details here...
\`\`\`
</details>
```
**AI-Era Considerations:**
- [ ] Account for accelerated development with AI pair programming
- [ ] Include prompts or instructions that worked well during research
- [ ] Note which AI tools were used for initial exploration (Claude, Copilot, etc.)
- [ ] Emphasize comprehensive testing given rapid implementation
- [ ] Document any AI-generated code that needs human review
### 5. Final Review & Submission
**Pre-submission Checklist:**
- [ ] Title is searchable and descriptive
- [ ] Labels accurately categorize the issue
- [ ] All template sections are complete
- [ ] Links and references are working
- [ ] Acceptance criteria are measurable
- [ ] Add names of files in pseudo code examples and todo lists
- [ ] Add an ERD mermaid diagram if applicable for new model changes
## Output Format
Present the complete issue content within `<github_issue>` tags, ready for GitHub CLI:
```bash
gh issue create --title "[TITLE]" --body "[CONTENT]" --label "[LABELS]"
```
## Thinking Approaches
- **Analytical:** Break down complex features into manageable components
- **User-Centric:** Consider end-user impact and experience
- **Technical:** Evaluate implementation complexity and architecture fit
- **Strategic:** Align with project goals and roadmap

commands/tar.md Normal file

@@ -0,0 +1,387 @@
---
description: Research and recommend technical approaches for implementation challenges
argument-hint: [problem description or feature]
---
# TAR - Technical Approach Research
Research and recommend technical approaches for implementation challenges or new features.
## Purpose
TAR handles **two types of research**:
1. **Problem Diagnosis**: Research why something is broken/slow/buggy and propose fixes
- Example: "Our Lambda functions are timing out under load"
- Example: "OAuth token refresh is failing intermittently"
2. **Feature Architecture**: Research the best way to architect a new feature
- Example: "Build real-time voice chat for web and mobile"
- Example: "Add multi-account support to our authentication system"
**Primary Goal**: Thorough research and diagnosis. Understanding the problem deeply and recommending the right solution.
**Secondary Benefit**: Research output naturally contains information useful for creating GitHub issues later (if desired).
## Input
**Project/Feature Description:**
```
$ARGUMENTS
```
## Perplexity AI: Your Research Partner
**USE PERPLEXITY ITERATIVELY - Don't Stop at the First Answer**
Perplexity is your research partner. Have a conversation with it, dig deeper, ask follow-ups. Don't treat it like a search engine where you get one answer and move on.
### Research Pattern
**1. Start with `perplexity_ask` (5-20 sec)**
- Get initial understanding of the problem/approach
- Identify key concepts and technologies
**2. Dig Deeper with Follow-ups**
- Ask about edge cases: "What are common pitfalls with [approach]?"
- Ask about gotchas: "What breaks when [condition]?"
- Ask about real examples: "Who uses [approach] in production?"
- Ask about alternatives: "What about [alternative approach]?"
**3. Use `perplexity_reason` (30 sec - 2 min) for Implementation Details**
- "How do I handle [specific technical challenge]?"
- "What's the debugging strategy when [error occurs]?"
- "Why does [approach] fail under [condition]?"
**4. Use `perplexity_research` (3-10 min) ONLY for Deep Dives**
- Comprehensive architecture analysis
- Multiple competing approaches need full comparison
- Novel/cutting-edge technology with limited documentation
### Provide Comprehensive Context
**Every Perplexity query should include:**
- **Environment**: OS, versions, cloud provider, region
- **Current State**: What exists today, what's broken, error messages
- **Constraints**: Performance requirements, scale, budget, team skills
- **Architecture**: Related systems, dependencies, data flow
- **Goal**: What success looks like
**Bad Perplexity Query**:
```
"How do I fix Lambda timeouts?"
```
**Good Perplexity Query**:
```
"I have AWS Lambda functions in me-central-1 running Python 3.11 that call Aurora PostgreSQL via Data API. Under load (100+ concurrent requests), functions timeout after 30 seconds. Current config: 1024MB memory, no VPC, cold start ~2s, warm ~500ms. Database queries are simple SELECTs taking <50ms. What are the most likely causes and solutions? Looking for production-proven approaches."
```
### Iterative Research Example
**Round 1**: "What causes Lambda timeouts with Aurora Data API?"
→ Response mentions connection pooling, query optimization, memory allocation
**Round 2**: "Does Aurora Data API support connection pooling? What are the gotchas?"
→ Response explains Data API is HTTP-based, no traditional pooling needed
**Round 3**: "What are the common Aurora Data API errors under high load and how to handle them?"
→ Response lists throttling, transaction timeouts, network issues
**Round 4**: "How do production systems handle Aurora Data API throttling? Show real examples."
→ Response provides retry strategies, backoff algorithms, monitoring approaches
**Result**: Deep understanding of the problem and battle-tested solutions.
## Steps
### 1. Understand the Problem Space
**For Problem Diagnosis:**
- What's the symptom? (error messages, slow performance, unexpected behavior)
- When does it occur? (always, under load, specific conditions)
- What changed recently? (code, config, traffic, infrastructure)
- What's already been tried? (debugging steps, failed solutions)
**For Feature Architecture:**
- What's the user need? (what problem are we solving?)
- What are the constraints? (performance, scale, budget, timeline)
- What's the environment? (platforms, existing systems, team skills)
- What's the scope? (MVP vs full feature, what's out of scope)
### 2. Research with Perplexity (Iteratively!)
**Initial Research** (perplexity_ask):
- Problem diagnosis: "What causes [symptom] in [technology]?"
- Feature architecture: "How do companies implement [feature] for [scale/platform]?"
**Follow-up Questions** (keep digging):
- "What are the gotchas with [approach]?"
- "How does [company] handle [specific challenge]?"
- "What breaks when [condition occurs]?"
- "What's the debugging strategy for [error]?"
- "What are alternatives to [initial approach]?"
**Deep Dive** (perplexity_reason or perplexity_research if needed):
- Complex architectural decisions
- Multiple approaches need full comparison
- Novel technology with limited documentation
### 3. Examine Codebase (If Applicable)
If diagnosing a problem in existing code:
- Read relevant files to understand current implementation
- Check recent changes: `git log --oneline -20 path/to/file`
- Look for related issues: `gh issue list --search "keyword"`
- Check error logs, metrics, monitoring
### 4. Research Multiple Approaches
Find **3-4 viable approaches**:
- Look for production case studies (last 2 years preferred)
- Focus on real-world implementations from companies at scale
- Include performance data, not just descriptions
- Note which approach is most common (battle-tested)
### 5. Evaluate Trade-offs
For each approach, analyze:
- **Performance**: Latency, throughput, resource usage
- **Complexity**: Dev time, code complexity, operational overhead
- **Cost**: Infrastructure, licensing, ongoing maintenance
- **Scalability**: How it handles growth
- **Team Fit**: Required skills, learning curve
- **Risk**: Common pitfalls, anti-patterns, failure modes
### 6. Make Recommendation
- Pick the approach that best fits the constraints
- Explain WHY this is recommended (not just what)
- Be honest about trade-offs and limitations
- Provide real examples of successful implementations
- Include actionable next steps
### 7. Deliver Research Report
Use the output format below. Keep focus on **research quality**, not on formatting for GitHub issues.
## Output Format
```markdown
# Technical Research: [Problem/Feature Name]
## Research Type
[Problem Diagnosis | Feature Architecture]
## Executive Summary
[2-3 paragraphs covering:
- What's the problem or what are we building?
- What did the research reveal?
- What's recommended and why?
- Key trade-offs to be aware of]
## Problem Analysis (For Diagnosis) OR Requirements (For Architecture)
**For Problem Diagnosis:**
- **Symptom**: [Exact error, behavior, or performance issue]
- **When It Occurs**: [Conditions, frequency, patterns]
- **Current State**: [What's already in place, recent changes]
- **Root Cause**: [What the research revealed]
- **Impact**: [Who/what is affected]
**For Feature Architecture:**
- **Core Need**: [What problem we're solving for users]
- **Constraints**: [Performance, scale, budget, timeline, technical]
- **Platforms**: [Web, mobile, APIs, infrastructure]
- **Scope**: [What's included, what's explicitly out of scope]
- **Success Criteria**: [What "done" looks like]
## Research Findings
### Key Insights from Research
1. [Major insight from Perplexity research]
2. [Important gotcha or edge case discovered]
3. [Production pattern or anti-pattern found]
### Technologies/Patterns Investigated
- [Technology/Pattern 1]: [Brief description]
- [Technology/Pattern 2]: [Brief description]
- [Technology/Pattern 3]: [Brief description]
---
## Approach 1: [Name] ⭐ RECOMMENDED
**Overview**: [One paragraph: what it is, how it works, why it exists]
**Why This Approach**:
- [Key advantage 1 - specific, measurable]
- [Key advantage 2 - real-world proof]
- [Key advantage 3 - fits constraints]
**Trade-offs**:
- ⚠️ [Limitation or consideration 1]
- ⚠️ [Limitation or consideration 2]
**Real-World Examples**:
- **[Company/Project]**: [How they use it, scale, results]
- **[Company/Project]**: [How they use it, scale, results]
**Common Pitfalls**:
- [Pitfall 1 and how to avoid it]
- [Pitfall 2 and how to avoid it]
**Complexity**: Low/Medium/High
**Development Time**: [Estimate based on research]
**Operational Overhead**: Low/Medium/High
---
## Approach 2: [Alternative Name]
**Overview**: [One paragraph description]
**Pros**:
- [Advantage 1]
- [Advantage 2]
**Cons**:
- [Disadvantage 1]
- [Disadvantage 2]
**When to Use**: [Specific scenarios where this is better than recommended approach]
**Real-World Examples**:
- [Company/Project using this approach]
---
## Approach 3: [Alternative Name]
**Overview**: [One paragraph description]
**Pros**:
- [Advantage 1]
- [Advantage 2]
**Cons**:
- [Disadvantage 1]
- [Disadvantage 2]
**When to Use**: [Specific scenarios where this is better than recommended approach]
**Real-World Examples**:
- [Company/Project using this approach]
---
## Comparison Matrix
| Aspect | Approach 1 ⭐ | Approach 2 | Approach 3 |
|--------|-------------|-----------|-----------|
| Dev Speed | Fast/Medium/Slow | ... | ... |
| Performance | Good/Excellent | ... | ... |
| Complexity | Low/Medium/High | ... | ... |
| Scalability | Good/Excellent | ... | ... |
| Cost | Low/Medium/High | ... | ... |
| Team Fit | Good/Poor | ... | ... |
| Battle-Tested | Very/Somewhat/New | ... | ... |
## Implementation Roadmap
**For Recommended Approach:**
### Phase 1: Foundation (Week 1-2)
1. [Specific technical task]
2. [Specific technical task]
3. [Validation checkpoint]
### Phase 2: Core Development (Week 3-4)
1. [Specific technical task]
2. [Specific technical task]
3. [Testing/validation]
### Phase 3: Polish & Deploy (Week 5+)
1. [Specific technical task]
2. [Performance testing/optimization]
3. [Production rollout]
## Critical Risks & Mitigation
| Risk | Impact | Likelihood | Mitigation Strategy |
|------|--------|-----------|---------------------|
| [Technical risk 1] | High/Medium/Low | High/Medium/Low | [Specific strategy] |
| [Technical risk 2] | High/Medium/Low | High/Medium/Low | [Specific strategy] |
## Testing Strategy
**What to Test**:
- [Specific test type 1 and why]
- [Specific test type 2 and why]
**Performance Benchmarks**:
- [Metric 1]: Target [value], measured by [method]
- [Metric 2]: Target [value], measured by [method]
## Key Resources
**Documentation**:
- [Official docs with URL and specific sections to read]
**Real Examples**:
- [GitHub repo / blog post with URL and what's relevant]
- [Production case study with URL and key takeaways]
**Tools/Libraries**:
- [Tool name with URL and what it does]
---
## Optional: Issue Creation Summary
**If creating a GitHub issue from this research**, the following information would be relevant:
**Summary**: [2-3 sentence summary for issue title/description]
**Key Requirements**:
- [Specific requirement 1]
- [Specific requirement 2]
**Recommended Approach**: [One paragraph on what to implement]
**Implementation Guidance**:
- [Problem to solve 1]
- [Problem to solve 2]
**Testing Needs**:
- [Test requirement 1]
- [Test requirement 2]
**Documentation Needs**:
- [Doc requirement 1]
```
## Example Commands
**Problem Diagnosis**:
```
/tar Our Lambda functions are timing out when calling Aurora PostgreSQL via Data API under load. Functions are Python 3.11, 1024MB memory, no VPC, in me-central-1. Timeouts happen at 100+ concurrent requests.
```
**Feature Architecture**:
```
/tar Build a real-time voice chat feature for web and mobile apps with low latency (<200ms) and support for 10+ concurrent users per room. Budget conscious, team familiar with JavaScript/Python.
```
## Notes
- **Research is the goal** - Deep understanding matters more than perfect formatting
- **Use Perplexity iteratively** - Ask follow-ups, dig into gotchas, find real examples
- **Provide context** - Every Perplexity query should include environment, constraints, goals
- **Focus on practicality** - Real implementations beat theoretical perfection
- **Recency matters** - Prioritize solutions from last 2 years (technology evolves)
- **Show your work** - Include Perplexity queries, sources, reasoning
- **Be honest about trade-offs** - No approach is perfect
- **Real examples prove it works** - "Company X uses this at Y scale" is gold
- **Issue creation is optional** - Sometimes research is just research

View File

@@ -0,0 +1,190 @@
---
description: Generate comprehensive validation command for backend or frontend
---
# Generate Ultimate Validation Command
Analyze a codebase deeply and create `.claude/commands/validate.md` that comprehensively validates everything.
**Usage:**
```
/ultimate_validate_command [backend|frontend]
```
## Step 0: Discover Real User Workflows
**Before analyzing tooling, understand what users ACTUALLY do (a discovery sketch follows this list):**
1. Read workflow documentation:
- README.md - Look for "Usage", "Quickstart", "Examples" sections
- CLAUDE.md - Look for workflow patterns
- docs/ folder - User guides, tutorials, feature documentation
2. Identify external integrations:
- What CLIs does the app use? (Check Dockerfile, requirements.txt, package.json)
- What external APIs does it call? (Google Calendar, Gmail, Telegram, LiveKit, etc.)
- What services does it interact with? (AWS, Supabase, databases, etc.)
3. Extract complete user journeys from docs:
- Find examples like "User asks voice agent → processes calendar → sends email"
- Each workflow becomes an E2E test scenario
**Critical: Your E2E tests should mirror actual workflows from docs, not just test internal APIs.**
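A quick way to surface these workflows and integrations before reading anything in depth (a minimal sketch; the file names below are common conventions and may differ in this repo):
```bash
# Sketch: surface candidate user workflows and external integrations.
# File names are conventions, not guarantees; adjust to what actually exists.

# 1. Workflow documentation: usage / quickstart / example sections
grep -rniE 'usage|quickstart|example|workflow' README.md CLAUDE.md docs/ 2>/dev/null | head -50

# 2. External CLIs and services referenced by build and runtime setup
grep -hiE 'aws|gcloud|gh |livekit|telegram|google' Dockerfile docker-compose*.yml 2>/dev/null
grep -iE 'livekit|google|telegram|boto3|supabase' requirements.txt pyproject.toml package.json 2>/dev/null

# 3. Environment variables hint at integrations that will need test credentials
grep -oE '[A-Z][A-Z0-9_]{2,}' .env.example 2>/dev/null | sort -u
```
Each workflow or integration found here becomes a candidate E2E scenario in Phase 5.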
## Step 1: Deep Codebase Analysis
Navigate to the appropriate repo:
```bash
cd backend/ # or frontend/
```
Explore the codebase to understand (a tooling-inventory sketch follows these checklists):
**What validation tools already exist:**
- **Linting config:** `.eslintrc*`, `.pylintrc`, `ruff.toml`, `.flake8`, etc.
- **Type checking:** `tsconfig.json`, `mypy.ini`, etc.
- **Style/formatting:** `.prettierrc*`, `black`, `.editorconfig`
- **Unit tests:** `jest.config.*`, `pytest.ini`, test directories
- **Package manager scripts:** `package.json` scripts, `Makefile`, `pyproject.toml` tools
**What the application does:**
- **Frontend:** Routes, pages, components, user flows (Next.js app)
- **Backend:** API endpoints, voice agent, Google services integration, authentication
- **Database:** Schema, migrations, models
- **Infrastructure:** Docker services, dependencies, AWS resources
**How things are currently tested:**
- Existing test files and patterns
- CI/CD workflows (`.github/workflows/`, etc.)
- Test commands in package.json or pyproject.toml
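One way to take this inventory quickly (a rough sketch; it only locates config files and scripts, and the real analysis still means reading each hit):
```bash
# Sketch: locate existing validation tooling, then read each file found.
ls -a | grep -E '^\.(eslintrc|pylintrc|flake8|editorconfig|prettierrc)' 2>/dev/null
ls ruff.toml tsconfig.json mypy.ini jest.config.* pytest.ini 2>/dev/null

# Package scripts and [tool.*] sections usually encode the canonical commands
grep -A20 '"scripts"' package.json 2>/dev/null
grep -A5 -E '^\[tool\.(ruff|mypy|pytest|black)' pyproject.toml 2>/dev/null

# Existing tests and CI workflows show how validation is run today
find . -path ./node_modules -prune -o -name 'test_*.py' -print -o -name '*.test.ts' -print 2>/dev/null | head -20
ls .github/workflows/ 2>/dev/null
```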
## Step 2: Generate validate.md
Create `.claude/commands/validate.md` in the **appropriate repo** (backend/ or frontend/) with these phases:
**ONLY include phases that exist in the codebase.**
### Phase 1: Linting
Run the actual linter commands found in the project:
- Python: `ruff check .`, `flake8`, `pylint`
- JavaScript/TypeScript: `npm run lint`, `eslint`
### Phase 2: Type Checking
Run the actual type checker commands found:
- TypeScript: `npx tsc --noEmit`
- Python: `mypy src/`, `mypy .`
### Phase 3: Style Checking
Run the actual formatter check commands found:
- JavaScript/TypeScript: `npm run format:check`, `prettier --check`
- Python: `black --check .`
### Phase 4: Unit Testing
Run the actual test commands found:
- Python: `pytest`, `pytest tests/unit`
- JavaScript/TypeScript: `npm test`, `jest`
### Phase 5: End-to-End Testing (BE CREATIVE AND COMPREHENSIVE)
Test COMPLETE user workflows from documentation, not just internal APIs.
**The Three Levels of E2E Testing:**
1. **Internal APIs** (what you might naturally test):
- Test adapter endpoints work
- Database queries succeed
- Commands execute
2. **External Integrations** (what you MUST test):
- CLI operations (AWS CLI, gh, gcloud, etc.)
- Platform APIs (LiveKit, Google Calendar, Gmail, Telegram)
- Any external services the app depends on
3. **Complete User Journeys** (what gives 100% confidence):
- Follow workflows from docs start-to-finish
- **Backend Example:** "Voice agent receives request → processes calendar event → sends email → confirms to user"
- **Frontend Example:** "User logs in → views dashboard → schedules event → receives confirmation"
- Test like a user would actually use the application in production
**Examples of good vs. bad E2E tests:**
**Backend:**
- ❌ Bad: Tests that Google Calendar adapter returns data
- ✅ Good: Voice agent receives calendar request → fetches events → formats response → returns to user
- ✅ Great: Voice interaction → Calendar query → Gmail notification → LiveKit response confirmation
**Frontend:**
- ❌ Bad: Tests that login form submits data
- ✅ Good: User submits login → Gets redirected → Dashboard loads with data
- ✅ Great: Full auth flow → Create calendar event via UI → Verify in Google Calendar → See update in dashboard
**Approach** (a minimal harness sketch follows this list):
- Use Docker for isolated, reproducible testing
- Create test data/accounts as needed
- Verify outcomes in external systems (Google Calendar, database, LiveKit rooms)
- Clean up after tests
- Use environment variables for test credentials
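Under those constraints, a minimal harness might look like this (compose file, test paths, and environment variable names are hypothetical; real runs must also verify outcomes in the external systems, not just exit codes):
```bash
#!/usr/bin/env bash
# Hypothetical E2E harness: isolated stack, workflow-level tests, guaranteed cleanup.
set -euo pipefail

cleanup() {
  docker compose -f docker-compose.test.yml down -v   # always tear down the test stack
}
trap cleanup EXIT

# Bring up an isolated test stack (compose file name is an assumption)
docker compose -f docker-compose.test.yml up -d --wait

# Test credentials come from the environment, never from the repo (variable names are hypothetical)
: "${TEST_GOOGLE_ACCOUNT:?set TEST_GOOGLE_ACCOUNT}"
: "${LIVEKIT_TEST_URL:?set LIVEKIT_TEST_URL}"

# Run complete user-journey tests (directory name is an assumption)
pytest tests/e2e -x -q
```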
## Critical: Don't Stop Until Everything is Validated
**Your job is to create a validation command that leaves NO STONE UNTURNED.**
- Every user workflow from docs should be tested end-to-end
- Every external integration should be exercised (LiveKit, Google services, AWS, etc.)
- Every API endpoint should be hit
- Every error case should be verified
- Database integrity should be confirmed
- The validation should be so thorough that manual testing is completely unnecessary
If `/validate` passes, the user should have 100% confidence their application works correctly in production. Don't settle for partial coverage - make it comprehensive, creative, and complete.
## Mygentic Personal Assistant Specific Workflows
**Backend workflows to test:**
1. **Voice Agent Interaction:**
- LiveKit room connection
- Speech-to-text processing
- Intent recognition
- Response generation
- Text-to-speech output
2. **Google Services Integration:**
- OAuth authentication flow
- Calendar event creation/retrieval
- Gmail message sending
- Multiple account support
3. **Telegram Bot:**
- Message receiving
- Command processing
- Response sending
**Frontend workflows to test:**
1. **Authentication:**
- User registration
- Email verification
- Login/logout
- Session management
2. **Dashboard:**
- Calendar view
- Event management
- Settings configuration
3. **Voice Agent UI:**
- Connection to LiveKit
- Audio streaming
- Real-time transcription display
- Response visualization
## Output
Write the generated validation command to:
- `backend/.claude/commands/validate.md` for backend
- `frontend/.claude/commands/validate.md` for frontend
The command should be executable, practical, and give complete confidence in the codebase.
## Example Structure
See the example validation command for reference, but adapt it to the specific tools, frameworks, and workflows found in this codebase. Don't copy-paste - generate based on actual codebase analysis.