Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 09:03:11 +08:00
commit 4aff69d9a9
61 changed files with 7343 additions and 0 deletions

177
skills/auto-commit/SKILL.md Normal file
View File

@@ -0,0 +1,177 @@
# Auto-Commit Skill
## Purpose
Automatically create professional git commits after completing agent tasks with semantic commit messages.
## When to Use
- After completing a significant sub-task
- When changes are ready to be versioned
- Before handing off to another agent
- After passing quality gates
## Usage
Call the auto-commit script with:
```bash
./orchestra/mcp-servers/auto-commit.sh <prefix> <reason> <action> [agent_name]
```
### Parameters
1. **prefix** (required): Semantic commit type
- `feat` - New feature
- `fix` - Bug fix
- `docs` - Documentation changes
- `style` - Code formatting
- `refactor` - Code refactoring
- `perf` - Performance improvement
- `test` - Test additions/updates
- `chore` - Build/tool changes
2. **reason** (required): Why this change was made
- English: "to support voice notifications"
- Japanese: "通知をサポートする"
3. **action** (required): What was done
- English: "Add ElevenLabs TTS integration"
- Japanese: "ElevenLabs TTS統合を追加"
4. **agent_name** (optional): Agent making the commit
- "Eden", "Iris", "Mina", "Theo", "Blake"
### Examples
**Eden (QA Agent) - After running tests:**
```bash
# English
./orchestra/mcp-servers/auto-commit.sh \
"test" \
"to ensure code quality" \
"Add comprehensive unit tests for API endpoints" \
"Eden"
# Result: test: Add comprehensive unit tests for API endpoints (to ensure code quality)
# Japanese
./orchestra/mcp-servers/auto-commit.sh \
"test" \
"コード品質を確保するため" \
"APIエンドポイント用の包括的な単体テストを追加" \
"Eden"
# Result: test: APIエンドポイント用の包括的な単体テストを追加 (コード品質を確保するため)
```
**Iris (Security Agent) - After security scan:**
```bash
./orchestra/mcp-servers/auto-commit.sh \
"chore" \
"to validate deployment security" \
"Run TruffleHog and Grype security scans" \
"Iris"
# Result: chore: Run TruffleHog and Grype security scans (to validate deployment security)
```
**Blake (Release Manager) - After release preparation:**
```bash
./orchestra/mcp-servers/auto-commit.sh \
"chore" \
"to prepare for production release" \
"Update changelog and version tags for v2.0.0" \
"Blake"
# Result: chore: Update changelog and version tags for v2.0.0 (to prepare for production release)
```
**Mina (Integration Specialist) - After API integration:**
```bash
./orchestra/mcp-servers/auto-commit.sh \
"feat" \
"to enhance user experience" \
"Implement responsive navigation component" \
"Mina"
# Result: feat: Implement responsive navigation component (to enhance user experience)
```
**Theo (DevOps) - After deployment:**
```bash
./orchestra/mcp-servers/auto-commit.sh \
"chore" \
"to track deployment state" \
"Update production deployment configuration" \
"Theo"
# Result: chore: Update production deployment configuration (to track deployment state)
```
## Configuration
Set in `.env`:
```bash
# Enable/disable auto-commits
AUTO_COMMIT_ENABLED=true
# Commit message language
COMMIT_LANGUAGE=en # or 'ja' for Japanese
```
## Commit Message Format
**Format (both languages):**
```
prefix: <action> (<reason>)
Co-Authored-By: <Agent> <noreply@orchestra>
```
**English example:**
```
feat: Add voice notification feature (to support agent announcements)
Co-Authored-By: Eden <noreply@orchestra>
```
**Japanese example:**
```
feat: 音声通知機能を追加 (エージェントのアナウンスをサポートするため)
Co-Authored-By: Eden <noreply@orchestra>
```
## Best Practices
1. **Commit frequently**: After each meaningful sub-task
2. **Use semantic prefixes**: Choose the most appropriate prefix
3. **Be specific**: Clearly describe both reason and action
4. **Agent attribution**: Always include agent name for traceability
5. **Check AUTO_COMMIT_ENABLED**: Script automatically respects this flag
6. **Verify changes**: Script only commits if there are actual changes
## Integration with Hooks
Auto-commit is integrated into all hook stages:
- `before_task.sh` → Main Claude Code commits task planning analysis
- `before_pr.sh` → Eden commits QA validation results
- `before_merge.sh` → Eden commits integration test results
- `before_deploy.sh` → Iris commits security validation
- `after_deploy.sh` → Theo commits deployment verification
## Troubleshooting
**No changes to commit:**
```
No changes to commit.
```
→ Normal behavior, script exits gracefully
**Not a git repository:**
```
⚠️ Not a git repository. Skipping auto-commit.
```
→ Script only works in git repositories
**Invalid prefix:**
```
❌ Invalid prefix: xyz
Valid prefixes: feat fix docs style refactor perf test chore
```
→ Use one of the valid semantic prefixes
**Auto-commit disabled:**
Script exits silently when `AUTO_COMMIT_ENABLED=false`

70
skills/core/SKILL.md Normal file
View File

@@ -0,0 +1,70 @@
---
name: core
description: Core development principles and guidelines covering security, QA, performance, documentation, and coding standards. Used by all agents to ensure consistent quality across the Orchestra system.
---
# Core Development Skills
This skill provides essential development principles and checklists that all Orchestra agents follow to maintain high-quality standards across security, testing, documentation, performance, and code quality.
## Overview
The core skills provide:
- **Security principles** (security.yaml) - Secure coding practices and vulnerability prevention
- **QA guidelines** (qa.yaml) - Testing standards and quality assurance procedures
- **Release procedures** (release.yaml) - Deployment and release management
- **Performance standards** (performance.yaml) - Optimization and efficiency guidelines
- **Documentation standards** (documentation.yaml) - Technical writing and documentation best practices
- **Coding standards** (coding-standards.yaml) - Code style and structure conventions
- **Review checklist** (review-checklist.yaml) - Pre-merge code review requirements
- **Clarification guidelines** (clarify.yaml) - Requirements clarification procedures
- **Token efficiency** (token-efficiency.md) - Guidelines for minimizing token usage
## When to Use
Agents automatically reference these guidelines when:
- **Iris** - Applies security.yaml for security audits
- **Finn** - Uses qa.yaml and review-checklist.yaml for testing
- **Eden** - Follows documentation.yaml for technical writing
- **Kai** - References performance.yaml and coding-standards.yaml for architecture
- **Blake** - Uses release.yaml for deployment coordination
- **Riley** - Applies clarify.yaml for requirements clarification
- **All agents** - Follow token-efficiency.md to optimize responses
## Usage
Agents can reference specific guidelines:
```markdown
See `skills/core/security.yaml` for security best practices
See `skills/core/qa.yaml` for testing requirements
See `skills/core/token-efficiency.md` for response optimization
```
## File Structure
```
skills/core/
├── SKILL.md (this file)
├── security.yaml # Security principles
├── qa.yaml # QA and testing standards
├── release.yaml # Release management
├── performance.yaml # Performance optimization
├── documentation.yaml # Documentation standards
├── coding-standards.yaml # Code style conventions
├── review-checklist.yaml # Pre-merge checklist
├── clarify.yaml # Requirements clarification
└── token-efficiency.md # Token usage optimization
```
## Best Practices
1. **Consistency** - All agents follow the same core principles
2. **Reference, don't duplicate** - Agents link to guidelines rather than copying them
3. **Update centrally** - Guidelines are maintained in one place
4. **Agent-agnostic** - Principles apply across all specialized agents
5. **Token efficient** - Guidelines are concise and focused
## Integration
These core skills are automatically available to all Orchestra agents. No explicit invocation is needed - agents reference them as part of their standard operating procedures.

14
skills/core/clarify.yaml Normal file
View File

@@ -0,0 +1,14 @@
name: clarify
description: |
Skill for disambiguating unclear requests.
Used by Riley and Eden to ensure mutual understanding before work begins.
principles:
- Never assume intent — verify.
- Confirm context, motivation, outcomes, and constraints.
patterns:
- "Could you describe the current issue or goal more concretely?"
- "What would success look like for this task?"
- "Are there any specific risks, dependencies, or constraints?"
usage_examples:
- A vague feature request without measurable goals.
- A conflict between business and technical priorities.

View File

@@ -0,0 +1,12 @@
name: coding-standards
description: |
Skill for ensuring all code follows project conventions and best practices.
Used by Skye, Kai, Leo, Mina, and Iris.
principles:
- Code must be readable, consistent, and testable.
- Prefer explicitness over cleverness.
- Document assumptions inline when logic is non-trivial.
checks:
- Lint and type checks pass (ESLint, mypy, etc.)
- Code includes comments for complex logic.
- Sensitive data is never logged or exposed.

View File

@@ -0,0 +1,12 @@
name: documentation
description: |
Skill for maintaining clear and consistent documentation.
Used by Eden and Kai.
principles:
- Documentation must tell the "why" as well as the "how".
- Every feature or change must include updated docs or ADR references.
formats:
- README.md
- CHANGELOG.md
- ADR (Architecture Decision Record)
- Runbook

View File

@@ -0,0 +1,11 @@
name: performance
description: |
Skill for diagnosing and improving runtime performance.
Used by Nova, Leo, and Finn.
principles:
- Measure before optimizing.
- Define target metrics explicitly.
metrics:
- Time to First Byte (TTFB)
- Core Web Vitals (LCP, CLS, FID)
- API latency budgets

12
skills/core/qa.yaml Normal file
View File

@@ -0,0 +1,12 @@
name: qa
description: |
Skill for designing and executing automated quality assurance processes.
Used by Finn.
principles:
- Focus testing on risk areas and user flows.
- Prioritize fast feedback (CI-first testing).
checklist:
- [ ] Unit tests for critical logic
- [ ] Integration tests for data flow
- [ ] E2E tests for user-facing functionality
- [ ] Regression tests before release

11
skills/core/release.yaml Normal file
View File

@@ -0,0 +1,11 @@
name: release
description: |
Skill governing safe deployment and version management.
Used by Blake.
principles:
- Each release must be reversible.
- Document every change and communicate clearly.
steps:
- Tag version and update changelog
- Run smoke/E2E tests post-deploy
- Announce deployment status

View File

@@ -0,0 +1,12 @@
name: review-checklist
description: |
Skill for systematic review before merging code.
Used by Skye, Finn, and Iris.
principles:
- Reviews should focus on correctness, clarity, and maintainability.
- Do not approve PRs with missing documentation or untested code.
checklist:
- [ ] All tests pass and coverage is >80%
- [ ] Naming and structure follow project conventions
- [ ] Security-sensitive code reviewed by Iris
- [ ] Docs and changelogs updated

13
skills/core/security.yaml Normal file
View File

@@ -0,0 +1,13 @@
name: security
description: |
Skill enforcing secure practices throughout development.
Used by Iris, Mina, and Leo.
principles:
- Least privilege, defense in depth.
- Secrets never hard-coded or logged.
- Dependencies regularly scanned and updated.
checklist:
- [ ] No hardcoded credentials or tokens
- [ ] SBOM generated and verified
- [ ] OAuth scopes minimized
- [ ] Policies reviewed before merge

View File

@@ -0,0 +1,193 @@
# Token Efficiency Skill
**Purpose**: Minimize unnecessary token usage while maintaining code quality and thoroughness.
## Core Principles
### 1. Targeted Search, Not Exploration
**Always prefer specific over general:**
-`Glob("**/auth/login.ts")` - Exact file
-`Grep("function handleAuth", type="ts")` - Specific pattern with file type
-`Glob("**/*.ts")` - Too broad
- ❌ Reading entire directories without filtering
### 2. Read Only What You Need
**Before reading a file, ask:**
- Do I need the entire file, or just specific sections?
- Can I use grep to find the relevant parts first?
- Has this file already been read in the conversation?
**Use Read with limits:**
```
Read(file_path, offset=100, limit=50) # Only read 50 lines starting at line 100
```
### 3. Search Hierarchy (Most to Least Efficient)
1. **Known location**: Direct file read if path is known
2. **Pattern matching**: Glob with specific patterns
3. **Keyword search**: Grep for specific terms
4. **Contextual search**: Grep with -A/-B flags for context
5. **Agent search**: Task tool with Explore agent (last resort)
### 4. Incremental Discovery
**Don't gather everything upfront:**
```
Bad approach:
1. Read all files in src/
2. Read all test files
3. Read all config files
4. Then start implementation
Good approach:
1. Read the specific file to modify
2. Implement the change
3. Only read related files if needed during implementation
4. Read tests only when writing test updates
```
### 5. Scope Boundaries
**Set clear limits:**
- Maximum files to examine: 5-10 for most tasks
- Maximum file size to read fully: ~500 lines
- Large files: Use offset/limit or grep first
- Stop once you have sufficient information
### 6. Efficient Grep Usage
**Use appropriate output modes:**
- `files_with_matches`: Just need to know if pattern exists
- `count`: Just need to know how many matches
- `content`: Need to see actual matches (use head_limit)
**Add context only when needed:**
```
Grep("error handler", output_mode="content", -C=3, head_limit=10)
# Only 10 results with 3 lines of context each
```
### 7. Avoid Redundant Operations
**Check conversation history first:**
- Don't re-read files already examined
- Reference previous findings instead of re-searching
- Build on existing knowledge
### 8. Model Selection for Task Tool
**Choose the right model for the job:**
```
# Simple, well-defined task
Task(description="...", model="haiku", prompt="...")
# Default for most tasks (good balance)
Task(description="...", model="sonnet", prompt="...")
# Complex reasoning required (use sparingly)
Task(description="...", model="opus", prompt="...")
```
## Practical Examples
### Example 1: Adding a New Function
**Inefficient approach:**
```
1. Read entire src/ directory
2. Read all related files
3. Search for all similar patterns
4. Then implement
```
**Efficient approach:**
```
1. Grep for similar function names (files_with_matches)
2. Read ONE example file
3. Implement based on that pattern
4. Only read more if the first example is unclear
```
### Example 2: Bug Investigation
**Inefficient approach:**
```
1. Read all files that might be related
2. Search entire codebase for error messages
3. Check all test files
```
**Efficient approach:**
```
1. Grep for the specific error message
2. Read only the files containing that error
3. Check git blame if needed to understand context
4. Read tests only for files being modified
```
### Example 3: Code Review
**Inefficient approach:**
```
1. Read all files in the PR
2. Read all related test files
3. Read all documentation
4. Then provide feedback
```
**Efficient approach:**
```
1. Use git diff to see only changed lines
2. Read changed files with offset/limit to focus on modifications
3. Grep for test coverage of changed functions
4. Request summaries instead of reading full docs
```
## Token Budget Estimates
**Quick reference for delegation:**
| Task Type | Estimated Tokens | Recommended Model |
|-----------|------------------|-------------------|
| Simple bug fix | 2-5K | haiku |
| New feature (small) | 5-10K | haiku |
| New feature (medium) | 10-20K | sonnet |
| Architecture design | 20-30K | sonnet |
| Complex refactoring | 30-50K | sonnet |
| Full system review | Split into chunks | sonnet |
## Red Flags (Stop and Optimize)
**If you find yourself:**
- Reading more than 10 files for a single task
- Using Glob with `**/*` without file type filtering
- Reading entire large files (>1000 lines) without specific need
- Re-reading files already examined in the conversation
- Exploring code "just to understand better" without clear need
**Then:** Stop and refine your approach. Ask yourself:
1. What specific information do I need?
2. What's the minimum search required to get it?
3. Can I infer this from what I already know?
## Integration with Other Skills
This skill should be applied in conjunction with:
- **code-quality.md**: Maintain quality while being efficient
- **testing.md**: Test only what's necessary, focus on critical paths
- **security.md**: Targeted security checks, not exhaustive scans
- **documentation.md**: Document changes, not entire codebase
## Success Metrics
**You're doing it right when:**
- Tasks complete in <20K tokens for most features
- You find what you need in 1-3 search operations
- You reference previous reads instead of re-reading
- You can justify every file you read
- Subagents return focused, relevant results
**Review and improve** if token usage is consistently high without proportional complexity in the task.

171
skills/modes/SKILL.md Normal file
View File

@@ -0,0 +1,171 @@
---
name: modes
description: Domain-specific development mode guidelines for UI, API, database, integration, migration, and specialized workflows. Each mode provides tailored principles, checklists, and patterns for different types of development work.
---
# Development Mode Skills
This skill provides domain-specific guidelines that agents activate based on the type of work being performed. Each mode extends core principles with specialized requirements for different development contexts.
## Overview
The mode skills provide specialized guidance for:
- **UI Mode** (ui.yaml) - Frontend, accessibility, SEO, visual components
- **API Mode** (api.yaml) - REST/GraphQL endpoints, API design, versioning
- **Database Mode** (db.yaml) - Schema design, migrations, query optimization
- **Integration Mode** (integration.yaml) - External service integration, OAuth, webhooks
- **Migration Mode** (migration.yaml) - Data migration, version upgrades, rollback procedures
- **Performance Mode** (performance.yaml) - Optimization, caching, load testing
- **QA Mode** (qa.yaml) - Testing strategies, coverage requirements, test automation
- **Security Mode** (security.yaml) - Security audits, vulnerability scanning, penetration testing
- **Release Mode** (release.yaml) - Deployment procedures, release management, rollback
## When to Use
Modes are automatically activated based on work context:
### UI Mode
- **Used by**: Nova, Skye, Finn
- **Triggers**: Frontend changes, accessibility updates, SEO optimization
- **Focus**: Lighthouse scores, ARIA compliance, responsive design
### API Mode
- **Used by**: Skye, Kai, Mina
- **Triggers**: Endpoint creation, API versioning, integration work
- **Focus**: REST/GraphQL standards, documentation, versioning
### Database Mode
- **Used by**: Leo, Skye, Kai
- **Triggers**: Schema changes, migrations, query optimization
- **Focus**: Data integrity, indexing, rollback safety
### Integration Mode
- **Used by**: Mina, Iris, Kai
- **Triggers**: External service integration (Stripe, Shopify, AWS, etc.)
- **Focus**: OAuth flows, webhook handling, error resilience
### Migration Mode
- **Used by**: Blake, Leo, Kai
- **Triggers**: Database migrations, version upgrades, data transfers
- **Focus**: Rollback procedures, data validation, zero-downtime
### Performance Mode
- **Used by**: Kai, Nova, Theo
- **Triggers**: Optimization work, performance issues, load testing
- **Focus**: Caching strategies, bundle optimization, resource usage
### QA Mode
- **Used by**: Finn, Eden
- **Triggers**: Test creation, coverage validation, quality gates
- **Focus**: Unit/integration/E2E tests, coverage thresholds
### Security Mode
- **Used by**: Iris, Mina, Blake
- **Triggers**: Security audits, vulnerability scans, auth changes
- **Focus**: Secret management, SBOM generation, penetration testing
### Release Mode
- **Used by**: Blake, Eden, Theo
- **Triggers**: Deployment preparation, release coordination
- **Focus**: Changelog generation, deployment verification, rollback readiness
## Mode Structure
Each mode YAML file contains:
```yaml
name: mode-name
extends: [core-skills] # Inherited core principles
description: |
Mode-specific description
used_by: [Agent1, Agent2] # Which agents use this mode
triggers: # When to activate this mode
- trigger_condition_1
- trigger_condition_2
inputs_required: # Required context
- input_1
- input_2
outputs: # Expected deliverables
- output_1
- output_2
principles: # Mode-specific guidelines
- principle_1
- principle_2
checklist: # Validation requirements
- [ ] checklist_item_1
- [ ] checklist_item_2
patterns: # Common solutions
- "Pattern description"
hooks: # Integration points
- hook_name
```
## Usage
Agents reference specific modes based on work type:
```markdown
# Nova working on UI
See `skills/modes/ui.yaml` for accessibility and performance requirements
# Leo working on database
See `skills/modes/db.yaml` for migration and schema design guidelines
# Mina integrating Stripe
See `skills/modes/integration.yaml` for OAuth and webhook patterns
```
## File Structure
```
skills/modes/
├── SKILL.md (this file)
├── ui.yaml # Frontend/accessibility/SEO
├── api.yaml # REST/GraphQL endpoints
├── db.yaml # Database schema/migrations
├── integration.yaml # External service integration
├── migration.yaml # Data migration procedures
├── performance.yaml # Optimization strategies
├── qa.yaml # Testing requirements
├── security.yaml # Security audits
└── release.yaml # Deployment procedures
```
## Mode Inheritance
Modes extend core skills:
- **All modes** inherit from core principles
- **Specific modes** may extend additional core skills (e.g., ui.yaml extends performance, review-checklist, documentation)
- **Agents** apply both core and mode-specific guidelines
## Best Practices
1. **Context-aware activation** - Modes activate based on work type, not manual selection
2. **Layered guidance** - Core principles + mode-specific requirements
3. **Agent specialization** - Each agent knows which modes to apply
4. **Validation gates** - Each mode defines success criteria and checklists
5. **Pattern reuse** - Common solutions documented for consistency
## Integration with Agents
Agents automatically apply relevant modes:
- **Nova** (UI/UX) → ui.yaml, performance.yaml
- **Leo** (Database) → db.yaml, migration.yaml
- **Mina** (Integration) → integration.yaml, security.yaml
- **Blake** (Release) → release.yaml, qa.yaml
- **Iris** (Security) → security.yaml, integration.yaml
- **Finn** (QA) → qa.yaml, performance.yaml
- **Kai** (Architecture) → api.yaml, db.yaml, performance.yaml
- **Skye** (Implementation) → ui.yaml, api.yaml, db.yaml (context-dependent)
## Example Workflow
When Nova receives a UI task:
1. Activates **ui.yaml** mode
2. Inherits principles from **core/performance.yaml**, **core/review-checklist.yaml**
3. Applies Lighthouse A11y ≥ 95 requirement
4. Validates keyboard/screen-reader flows
5. Checks meta tags and OG/Twitter cards
6. Measures CLS < 0.1, LCP within budget
This ensures consistent, high-quality output across all UI work.

28
skills/modes/api.yaml Normal file
View File

@@ -0,0 +1,28 @@
name: api
extends: [coding-standards, qa, security, documentation]
description: |
Mode skill for API design/implementation and contract governance (REST/GraphQL/gRPC).
used_by: [Kai, Skye, Leo, Finn, Iris]
triggers:
- new_endpoint
- contract_change
- latency_budget_defined
inputs_required:
- api_style (REST/GraphQL/gRPC)
- contract_source (OpenAPI/SDL/IDL)
- non_functional (latency/error budgets, rate limits)
outputs:
- openapi.yaml
- api-change-log.md
principles:
- Contract-first: generate server/clients from spec when possible.
- Backward-compatibility by default; version when breaking.
- Validate and sanitize all inputs at boundaries.
checklist:
- [ ] OpenAPI/SDL updated and validated
- [ ] Error model consistent (problem+json or equivalent)
- [ ] AuthN/Z documented (scopes/claims)
- [ ] Load/perf smoke tests exist
hooks:
- before_pr
- before_merge

30
skills/modes/db.yaml Normal file
View File

@@ -0,0 +1,30 @@
name: db
extends: [security, documentation]
description: |
Mode skill for relational schemas, migrations, RLS/policies, and type contracts.
used_by: [Leo, Kai, Skye, Iris]
triggers:
- schema_change
- migration_needed
- rls_or_policy_change
inputs_required:
- migration_plan (up/down)
- data_backfill_strategy
- locking_risk_assessment
- rls_specs (who can read/write what)
outputs:
- migration.sql
- db-changes.md
- policy-review.md
principles:
- Small, reversible migrations with clear downtime expectations.
- Types drive code; generate types from DB where feasible.
- RLS least-privilege and audited.
checklist:
- [ ] Dry-run migration passed in staging snapshot
- [ ] Rollback (down) script tested
- [ ] RLS/Policies peer-reviewed (Iris)
- [ ] Data backfill verified and idempotent
hooks:
- before_pr
- before_merge

View File

@@ -0,0 +1,29 @@
name: integration
extends: [security, documentation, qa]
description: |
Mode skill for third-party/platform integrations (Shopify, Sanity, Supabase, AWS, Stripe, etc.).
used_by: [Mina, Kai, Skye, Iris, Finn]
triggers:
- includes_integrations
- needs_oauth_or_webhooks
- cross_service_data_flow
inputs_required:
- credentials_location (secret store path)
- oauth_scopes
- webhook_endpoints_and_retries
- rate_limit_and_backoff_strategy
outputs:
- integration_runbook.md
- healthcheck.spec.md
principles:
- Least-privilege credentials; rotate regularly.
- Retries with jitter; idempotency keys where applicable.
- Observability first: health checks and dashboards.
checklist:
- [ ] Secrets from vault (no .env commits)
- [ ] OAuth scopes minimized and documented
- [ ] Webhook signatures validated; replay protected
- [ ] Circuit-breakers / retry policies in place
hooks:
- before_pr
- before_merge

View File

@@ -0,0 +1,29 @@
name: migration
extends: [db, release, qa, documentation]
description: |
Mode skill for coordinated changes across code, schema, and data with safe rollout/rollback.
used_by: [Kai, Leo, Skye, Blake, Finn]
triggers:
- breaking_change
- multi_step_rollout
- data_backfill_required
inputs_required:
- phased_plan (T+0, T+1, T+2 steps)
- observability_checks (metrics/logs)
- rollback_switch (feature flag / traffic split)
outputs:
- migration-plan.md
- backfill-script.(py|ts|sql)
- rollback-plan.md
principles:
- Dark launch → dual-write/dual-read → cutover → cleanup.
- Reversible at every step; time-boxed checkpoints.
- Communicate windows and fallback.
checklist:
- [ ] Feature flags or traffic router configured
- [ ] Dual-read/write verified in staging
- [ ] Backfill idempotent with checkpoints
- [ ] Cutover + rollback rehearsed
hooks:
- before_merge
- before_deploy

View File

@@ -0,0 +1,27 @@
name: performance
extends: [performance] # from core; this file specializes targets/workflows
description: |
Mode specialization for setting budgets and running perf diagnostics across UI/API.
used_by: [Nova, Finn, Kai, Theo, Skye]
triggers:
- perf_budget_defined
- perf_regression_detected
- slo_or_sli_violation
inputs_required:
- target_metrics (TTFB/LCP/CLS/API P95 latency)
- baseline_report (Lighthouse/Profiler)
- test_scenarios (user journeys / endpoints)
outputs:
- perf-report.md
- lighthouse-report.json
- traces/
principles:
- Always measure on realistic data and device profiles.
- Track budgets in CI; block on critical regression.
checklist:
- [ ] Baseline captured and committed
- [ ] CI perf step green with budgets
- [ ] Bottleneck hypothesis & fix PR linked
hooks:
- before_merge
- after_deploy

27
skills/modes/qa.yaml Normal file
View File

@@ -0,0 +1,27 @@
name: qa
extends: [qa] # from core; specialize suites by change type
description: |
Mode specialization for selecting/maintaining the right automated suites.
used_by: [Finn, Blake, Theo]
triggers:
- pre_merge
- pre_deploy
- incident_repro
inputs_required:
- change_scope (ui/api/db/integration)
- critical_paths (top user journeys)
- perf_targets (if applicable)
outputs:
- test-plan.md
- e2e-specs/
- smoke-report.md
principles:
- Prioritize high-risk/high-impact paths.
- Keep suites fast; parallelize; quarantine flakies.
checklist:
- [ ] Unit + integration pass
- [ ] E2E covers critical paths
- [ ] Smoke tests green in staging
hooks:
- before_merge
- before_deploy

27
skills/modes/release.yaml Normal file
View File

@@ -0,0 +1,27 @@
name: release
extends: [release, qa, documentation, security]
description: |
Mode specialization for staging→prod promotion, canary, and rollback orchestration.
used_by: [Blake, Theo, Finn, Iris]
triggers:
- ready_for_release
- stage == 'pre-deploy'
- rollback_or_hotfix_needed
inputs_required:
- release_notes
- rollout_plan (regions/percentages)
- rollback_criteria (metrics/alerts)
outputs:
- changelog.md
- release-notes.md
- rollout-status.md
principles:
- Canary by default; fast rollback path.
- Communicate status; capture evidence artifacts.
checklist:
- [ ] All pre-deploy gates passed (QA/Sec)
- [ ] Canary + metrics watch window configured
- [ ] Rollback script/steps verified
hooks:
- before_deploy
- after_deploy

View File

@@ -0,0 +1,28 @@
name: security
extends: [security] # from core; specialize platform checks
description: |
Mode specialization for platform-aware checks (headers/CSP, IAM, SBOM, supply-chain).
used_by: [Iris, Mina, Leo, Blake]
triggers:
- deps_changed
- sbom_update_needed
- contains_secrets
- iam_or_policy_change
inputs_required:
- sbom_tool (syft/cyclonedx)
- scanning_tool (grype/trivy)
- policy_diff (IAM/RLS/CSP)
outputs:
- security-report.md
- sbom.json
principles:
- Shift-left: check early; block risky merges.
- Signed artifacts; pinned versions.
checklist:
- [ ] SBOM updated and scanned
- [ ] Secrets scans pass (no leak/noise triaged)
- [ ] CSP/headers validated in staging
- [ ] IAM/RLS diffs approved
hooks:
- before_pr
- before_merge

33
skills/modes/ui.yaml Normal file
View File

@@ -0,0 +1,33 @@
name: ui
extends: [performance, review-checklist, documentation] # from skills/core
description: |
Mode skill for user-facing UI work: layout, components, A11y, SEO, visual polish.
Applies to Next.js/React, Shopify theme (Liquid/JSON), and general frontend stacks.
used_by: [Nova, Skye, Finn]
triggers:
- touches_ui
- user_facing_change
- a11y_changes
- seo_changes
inputs_required:
- design_reference (Figma/URL/screenshot) optional
- target_devices (desktop/mobile)
- accessibility_budget (e.g., WCAG 2.1 AA)
- seo_targets (title/desc/canonical/open-graph)
outputs:
- ui-change-notes.md
- updated-components/
principles:
- Prefer accessible, semantic markup; ARIA only when necessary.
- Keep components pure and state minimal; co-locate styles/types.
- Enforce design tokens; no magic numbers.
checklist:
- [ ] Lighthouse A11y ≥ 95, Best Practices ≥ 95
- [ ] Keyboard and screen-reader flows validated
- [ ] Meta tags, canonical, OG/Twitter cards present
- [ ] CLS < 0.1, LCP within budget (desktop/mobile targets specified)
patterns:
- "Add an accessible label and role where semantics are insufficient."
- "Use CSS logical properties for RTL/i18n readiness."
hooks:
- before_merge

476
skills/web-browse/SKILL.md Normal file
View File

@@ -0,0 +1,476 @@
# Web Browse Skill
## Purpose
Safely navigate, interact with, and capture evidence from web pages using automated browser operations.
## Security Features
- **Rate Limiting**: Maximum operations per session (10 navigations, 50 clicks, 30 inputs)
- **Input Sanitization**: Blocks sensitive patterns (passwords, credit cards, SSN)
- **Operation Logging**: All actions are logged to `artifacts/browser/{session}/operations.log`
- **Safe Mode**: Dangerous JavaScript operations are blocked
- **Local Access Only**: Server binds to localhost:3030 (not accessible externally)
## When to Use
- Verify preview deployments (UI/UX checks)
- Capture screenshots for documentation
- Run Lighthouse performance audits
- Scrape public data from allowed domains
- Test user flows without full E2E suite
- Monitor production health with visual checks
## Agents Using This Skill
- **Nova** (Preview Verification): Screenshot + Lighthouse before merge
- **Theo** (Post-Deploy Monitoring): Synthetic monitoring with screenshots
- **Mina** (UI/UX Review): Visual comparison and accessibility checks
## Configuration
No configuration required! The browser server allows access to any valid URL and shows the browser GUI by default for easy debugging.
Default settings in `.env`:
```bash
# Server port (default: 9222 - Chrome DevTools Protocol port)
BROWSER_MCP_PORT=9222
# Show browser GUI (default: visible for debugging)
# Set to 'true' for headless/background mode (useful for CI)
BROWSER_HEADLESS=false
```
### Headless Mode (Background)
To run browser in the background without GUI:
```bash
# Option 1: Set environment variable
BROWSER_HEADLESS=true npm run browser
# Option 2: Update .env file
BROWSER_HEADLESS=true
```
By default (GUI mode):
- Browser window opens automatically
- You can see all navigation, clicks, and interactions in real-time
- Perfect for debugging and learning
- Screenshots still save to artifacts/
## API Endpoints
### Initialize Browser
```bash
curl -X POST http://localhost:3030/init
# Response: {"ok": true, "sessionId": "1234567890"}
```
### Navigate to URL
```bash
curl -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d '{"url": "https://myapp.vercel.app", "waitUntil": "networkidle"}'
# Response: {"ok": true, "url": "https://myapp.vercel.app", "title": "My App"}
```
### Click Element
```bash
curl -X POST http://localhost:3030/click \
-H 'Content-Type: application/json' \
-d '{"selector": "button.primary"}'
# Response: {"ok": true}
```
### Type Text
```bash
curl -X POST http://localhost:3030/type \
-H 'Content-Type: application/json' \
-d '{"selector": "input[name=search]", "text": "test query", "pressEnter": true}'
# Response: {"ok": true}
```
### Authenticate (Password-Protected Sites)
**Interactive Authentication Flow:**
1. **First attempt** - Try without password (checks environment variable):
```bash
curl -X POST http://localhost:3030/auth \
-H 'Content-Type: application/json' \
-d '{"type": "shopify-store", "submitSelector": "button[type=submit]"}'
```
2. **If password not in .env** - Server responds with password request:
```json
{
"needsPassword": true,
"envVarName": "SHOPIFY_STORE_PASSWORD",
"type": "shopify-store",
"prompt": "Please enter the password for shopify-store:"
}
```
3. **Provide password** - Send password in request:
```bash
curl -X POST http://localhost:3030/auth \
-H 'Content-Type: application/json' \
-d '{"type": "shopify-store", "password": "mypassword", "submitSelector": "button[type=submit]"}'
# Response: {"ok": true, "shouldSavePassword": true, "envVarName": "SHOPIFY_STORE_PASSWORD"}
```
4. **Save password** (optional) - Save to .env for future use:
```bash
curl -X POST http://localhost:3030/auth/save \
-H 'Content-Type: application/json' \
-d '{"envVarName": "SHOPIFY_STORE_PASSWORD", "password": "mypassword"}'
# Response: {"ok": true, "message": "Password saved to .env as SHOPIFY_STORE_PASSWORD"}
```
**Built-in auth types:**
- `shopify-store` - Uses `SHOPIFY_STORE_PASSWORD` environment variable
- `staging` - Uses `STAGING_PASSWORD` environment variable
- `preview` - Uses `PREVIEW_PASSWORD` environment variable
- **Custom types** - Any type converts to `{TYPE}_PASSWORD` (e.g., `my-site``MY_SITE_PASSWORD`)
**Parameters:**
- `type` (required): Auth type (built-in or custom)
- `passwordSelector` (optional): CSS selector for password field (default: `input[type="password"]`)
- `submitSelector` (optional): CSS selector for submit button (if provided, auto-submits form)
- `password` (optional): Password to use (if not in environment variable)
**Using browser-helper.sh** (handles interactive flow automatically):
```bash
# Will prompt for password if not in .env, then offer to save it
# Also handles 2FA automatically - waits for user to complete 2FA in browser
./browser-helper.sh auth shopify-store "input[type=password]" "button[type=submit]"
```
### Wait for 2FA Completion
If 2FA is detected after password authentication, the system will automatically wait for manual completion:
```bash
curl -X POST http://localhost:3030/auth/wait-2fa \
-H 'Content-Type: application/json' \
-d '{"timeout": 120000}'
# Response: {"ok": true, "completed": true, "message": "2FA completed successfully", "url": "..."}
```
**How it works:**
1. System detects 2FA page (looks for keywords in both English and Japanese):
- **English**: "2fa", "mfa", "verify", "two-factor", "authentication code", "verification code", "security code"
- **Japanese**: "二段階認証", "2段階認証", "認証コード", "確認コード", "セキュリティコード", "ワンタイムパスワード"
2. Returns `requires2FA: true` response
3. Browser window stays open (GUI mode) for manual 2FA input
4. System polls every 2 seconds to check if 2FA completed
5. Success detected when:
- URL changes from 2FA page
- 2FA indicators disappear from page content (checks both English and Japanese)
6. Default timeout: 2 minutes (configurable)
**Parameters:**
- `timeout` (optional): Maximum wait time in milliseconds (default: 120000 = 2 minutes)
- `expectedUrlPattern` (optional): Regex pattern for successful auth URL
### Wait for Element
```bash
curl -X POST http://localhost:3030/wait \
-H 'Content-Type: application/json' \
-d '{"selector": ".results", "timeout": 10000}'
# Response: {"ok": true}
```
### Scrape Text Content
```bash
curl -X POST http://localhost:3030/scrape \
-H 'Content-Type: application/json' \
-d '{"selector": "h2.product-title", "limit": 50}'
# Response: {"ok": true, "data": ["Product 1", "Product 2", ...]}
```
### Take Screenshot
```bash
curl -X POST http://localhost:3030/screenshot \
-H 'Content-Type: application/json' \
-d '{"filename": "homepage.png", "fullPage": true}'
# Response: {"ok": true, "path": "/path/to/artifacts/browser/.../homepage.png"}
```
### Get Page Content
```bash
curl -X POST http://localhost:3030/content
# Response: {"ok": true, "url": "...", "title": "...", "html": "..."}
```
### Evaluate JavaScript
```bash
curl -X POST http://localhost:3030/evaluate \
-H 'Content-Type: application/json' \
-d '{"expression": "document.querySelectorAll(\"img\").length"}'
# Response: {"ok": true, "result": 42}
```
### Close Browser
```bash
curl -X POST http://localhost:3030/close
# Response: {"ok": true}
```
### Health Check
```bash
curl http://localhost:3030/health
# Response: {"ok": true, "browser": true, "page": true, "sessionId": "...", "allowedDomains": [...]}
```
## Usage Examples
### Example 1: Screenshot Preview Deployment (Nova)
```bash
#!/usr/bin/env bash
PREVIEW_URL="https://myapp-pr-123.vercel.app"
# Initialize
curl -X POST http://localhost:3030/init
# Navigate
curl -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d "{\"url\":\"$PREVIEW_URL\"}"
# Wait for main content
curl -X POST http://localhost:3030/wait \
-H 'Content-Type: application/json' \
-d '{"selector":"main"}'
# Screenshot
curl -X POST http://localhost:3030/screenshot \
-H 'Content-Type: application/json' \
-d '{"filename":"preview-homepage.png","fullPage":true}'
# Close
curl -X POST http://localhost:3030/close
```
### Example 2: Scrape Product Titles (Competitive Research)
```bash
#!/usr/bin/env bash
# Initialize
curl -X POST http://localhost:3030/init
# Navigate to competitor site (must be in allowlist!)
curl -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d '{"url":"https://competitor.shopify.com/products"}'
# Scrape product titles
curl -X POST http://localhost:3030/scrape \
-H 'Content-Type: application/json' \
-d '{"selector":"h3.product-title","limit":20}' \
> artifacts/competitor-products.json
# Close
curl -X POST http://localhost:3030/close
```
### Example 3: Test Search Flow (Mina)
```bash
#!/usr/bin/env bash
PREVIEW_URL="https://myapp.vercel.app"
# Initialize
curl -X POST http://localhost:3030/init
# Navigate
curl -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d "{\"url\":\"$PREVIEW_URL\"}"
# Type in search box
curl -X POST http://localhost:3030/type \
-H 'Content-Type: application/json' \
-d '{"selector":"input[type=search]","text":"test product","pressEnter":true}'
# Wait for results
curl -X POST http://localhost:3030/wait \
-H 'Content-Type: application/json' \
-d '{"selector":".search-results"}'
# Screenshot results
curl -X POST http://localhost:3030/screenshot \
-H 'Content-Type: application/json' \
-d '{"filename":"search-results.png","fullPage":false}'
# Close
curl -X POST http://localhost:3030/close
```
### Example 4: Test Password-Protected Shopify Store (with 2FA support)
```bash
#!/usr/bin/env bash
STORE_URL="https://mystore.myshopify.com"
# Initialize
curl -X POST http://localhost:3030/init
# Navigate to password-protected store
curl -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d "{\"url\":\"$STORE_URL\"}"
# Wait for password form
curl -X POST http://localhost:3030/wait \
-H 'Content-Type: application/json' \
-d '{"selector":"input[type=password]"}'
# Authenticate using environment variable
AUTH_RESPONSE=$(curl -s -X POST http://localhost:3030/auth \
-H 'Content-Type: application/json' \
-d '{"type":"shopify-store","submitSelector":"button[type=submit]"}')
# Check if 2FA is required
REQUIRES_2FA=$(echo "$AUTH_RESPONSE" | jq -r '.requires2FA // false')
if [ "$REQUIRES_2FA" = "true" ]; then
echo "⏳ 2FA detected. Please complete authentication in the browser..."
# Wait for 2FA completion (2 minutes timeout)
curl -s -X POST http://localhost:3030/auth/wait-2fa \
-H 'Content-Type: application/json' \
-d '{"timeout":120000}' | jq
echo "✅ 2FA completed"
fi
# Wait for store to load
curl -X POST http://localhost:3030/wait \
-H 'Content-Type: application/json' \
-d '{"selector":"main"}'
# Screenshot the store
curl -X POST http://localhost:3030/screenshot \
-H 'Content-Type: application/json' \
-d '{"filename":"shopify-store.png","fullPage":true}'
# Close
curl -X POST http://localhost:3030/close
```
**Or simply use browser-helper.sh (handles everything automatically):**
```bash
./browser-helper.sh init
./browser-helper.sh navigate https://mystore.myshopify.com
./browser-helper.sh wait "input[type=password]"
./browser-helper.sh auth shopify-store "input[type=password]" "button[type=submit]"
# If 2FA detected, automatically prompts user and waits for completion
./browser-helper.sh wait "main"
./browser-helper.sh screenshot shopify-store.png true
./browser-helper.sh close
```
## Integration with Hooks
### before_merge.sh - Visual Regression Check
```bash
#!/usr/bin/env bash
if [ -n "$PREVIEW_URL" ]; then
echo "→ Capturing preview screenshot..."
curl -s -X POST http://localhost:3030/init > /dev/null
curl -s -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d "{\"url\":\"$PREVIEW_URL\"}" > /dev/null
curl -s -X POST http://localhost:3030/screenshot \
-H 'Content-Type: application/json' \
-d '{"filename":"preview.png","fullPage":true}' | jq -r '.path'
curl -s -X POST http://localhost:3030/close > /dev/null
echo "✅ Screenshot saved"
fi
```
### after_deploy.sh - Production Health Check
```bash
#!/usr/bin/env bash
PROD_URL="${PRODUCTION_URL:-https://app.example.com}"
echo "→ Running production health check..."
curl -s -X POST http://localhost:3030/init > /dev/null
curl -s -X POST http://localhost:3030/navigate \
-H 'Content-Type: application/json' \
-d "{\"url\":\"$PROD_URL\"}" > /dev/null
# Check if critical element exists
RESULT=$(curl -s -X POST http://localhost:3030/evaluate \
-H 'Content-Type: application/json' \
-d '{"expression":"document.querySelector(\".app-loaded\") !== null"}' | jq -r '.result')
if [ "$RESULT" = "true" ]; then
echo "✅ Production app loaded successfully"
else
echo "❌ Production app failed to load"
exit 1
fi
curl -s -X POST http://localhost:3030/close > /dev/null
```
## Best Practices
1. **Always Initialize**: Call `/init` before any browser operations
2. **Clean Up**: Call `/close` when done to free resources
3. **Wait for Elements**: Use `/wait` before interacting with dynamic content
4. **Rate Limit Awareness**: Monitor operation counts to avoid hitting limits
5. **Security First**: Never type credentials or PII (blocked by default)
6. **Evidence Collection**: Save screenshots and logs for audit trail
7. **Domain Approval**: Add domains to allowlist before accessing
## Troubleshooting
### Error: "Domain not allowed"
Add the domain to `BROWSER_ALLOWED_DOMAINS` in `.env`
### Error: "Browser not initialized"
Call `/init` endpoint first
### Error: "Navigation limit exceeded"
You've hit the 10 navigation limit per session. Close and reinitialize.
### Error: "No active page"
Navigate to a URL first using `/navigate`
### Screenshots not saving
Check that `artifacts/` directory has write permissions
## Artifacts
All browser operations save artifacts to:
```
artifacts/browser/{sessionId}/
├── operations.log # All operations with timestamps
├── screenshot.png # Screenshots (custom filenames)
├── preview.png
└── ...
```
## Rate Limits (Per Session)
- Navigations: 10
- Clicks: 50
- Type operations: 30
## Blocked Patterns
Input containing these patterns will be rejected:
- `password` (case-insensitive)
- `credit card`
- `ssn` / `social security`
## JavaScript Evaluation Restrictions
Blocked keywords in `/evaluate`:
- `delete`
- `drop`
- `remove`
- `cookie`
- `localStorage`