Initial commit

2025-11-30 09:03:11 +08:00
commit 4aff69d9a9
61 changed files with 7343 additions and 0 deletions
--- a/skills/auto-commit/SKILL.md
+++ b/skills/auto-commit/SKILL.md
@@ -0,0 +1,177 @@
+# Auto-Commit Skill
+
+## Purpose
+Automatically create git commits with semantic commit messages after agent tasks are completed.
+
+## When to Use
+- After completing a significant sub-task
+- When changes are ready to be versioned
+- Before handing off to another agent
+- After passing quality gates
+
+## Usage
+
+Call the auto-commit script with:
+```bash
+./orchestra/mcp-servers/auto-commit.sh <prefix> <reason> <action> [agent_name]
+```
+
+### Parameters
+1. **prefix** (required): Semantic commit type
+   - `feat` - New feature
+   - `fix` - Bug fix
+   - `docs` - Documentation changes
+   - `style` - Code formatting
+   - `refactor` - Code refactoring
+   - `perf` - Performance improvement
+   - `test` - Test additions/updates
+   - `chore` - Build/tool changes
+
+2. **reason** (required): Why this change was made
+   - English: "to support voice notifications"
+   - Japanese: "音声通知をサポートするため"
+
+3. **action** (required): What was done
+   - English: "Add ElevenLabs TTS integration"
+   - Japanese: "ElevenLabs TTS統合を追加"
+
+4. **agent_name** (optional): Agent making the commit
+   - "Eden", "Iris", "Mina", "Theo", "Blake"
+
+### Examples
+
+**Eden (QA Agent) - After running tests:**
+```bash
+# English
+./orchestra/mcp-servers/auto-commit.sh \
+  "test" \
+  "to ensure code quality" \
+  "Add comprehensive unit tests for API endpoints" \
+  "Eden"
+# Result: test: Add comprehensive unit tests for API endpoints (to ensure code quality)
+
+# Japanese
+./orchestra/mcp-servers/auto-commit.sh \
+  "test" \
+  "コード品質を確保するため" \
+  "APIエンドポイント用の包括的な単体テストを追加" \
+  "Eden"
+# Result: test: APIエンドポイント用の包括的な単体テストを追加 (コード品質を確保するため)
+```
+
+**Iris (Security Agent) - After security scan:**
+```bash
+./orchestra/mcp-servers/auto-commit.sh \
+  "chore" \
+  "to validate deployment security" \
+  "Run TruffleHog and Grype security scans" \
+  "Iris"
+# Result: chore: Run TruffleHog and Grype security scans (to validate deployment security)
+```
+
+**Blake (Release Manager) - After release preparation:**
+```bash
+./orchestra/mcp-servers/auto-commit.sh \
+  "chore" \
+  "to prepare for production release" \
+  "Update changelog and version tags for v2.0.0" \
+  "Blake"
+# Result: chore: Update changelog and version tags for v2.0.0 (to prepare for production release)
+```
+
+**Mina (Integration Specialist) - After API integration:**
+```bash
+./orchestra/mcp-servers/auto-commit.sh \
+  "feat" \
+  "to enhance user experience" \
+  "Implement responsive navigation component" \
+  "Mina"
+# Result: feat: Implement responsive navigation component (to enhance user experience)
+```
+
+**Theo (DevOps) - After deployment:**
+```bash
+./orchestra/mcp-servers/auto-commit.sh \
+  "chore" \
+  "to track deployment state" \
+  "Update production deployment configuration" \
+  "Theo"
+# Result: chore: Update production deployment configuration (to track deployment state)
+```
+
+## Configuration
+
+Set in `.env`:
+```bash
+# Enable/disable auto-commits
+AUTO_COMMIT_ENABLED=true
+
+# Commit message language
+COMMIT_LANGUAGE=en  # or 'ja' for Japanese
+```
+
+## Commit Message Format
+
+**Format (both languages):**
+```
+<prefix>: <action> (<reason>)
+
+Co-Authored-By: <Agent> <noreply@orchestra>
+```
+
+**English example:**
+```
+feat: Add voice notification feature (to support agent announcements)
+
+Co-Authored-By: Eden <noreply@orchestra>
+```
+
+**Japanese example:**
+```
+feat: 音声通知機能を追加 (エージェントのアナウンスをサポートするため)
+
+Co-Authored-By: Eden <noreply@orchestra>
+```
+
+## Best Practices
+
+1. **Commit frequently**: After each meaningful sub-task
+2. **Use semantic prefixes**: Choose the most appropriate prefix
+3. **Be specific**: Clearly describe both reason and action
+4. **Agent attribution**: Always include agent name for traceability
+5. **Check AUTO_COMMIT_ENABLED**: Script automatically respects this flag
+6. **Verify changes**: Script only commits if there are actual changes
+
+## Integration with Hooks
+
+Auto-commit is integrated into all hook stages:
+
+- `before_task.sh` → Main Claude Code commits task planning analysis
+- `before_pr.sh` → Eden commits QA validation results
+- `before_merge.sh` → Eden commits integration test results
+- `before_deploy.sh` → Iris commits security validation
+- `after_deploy.sh` → Theo commits deployment verification
+
+## Troubleshooting
+
+**No changes to commit:**
+```
+ℹ️  No changes to commit.
+```
+→ Normal behavior, script exits gracefully
+
+**Not a git repository:**
+```
+⚠️  Not a git repository. Skipping auto-commit.
+```
+→ Script only works in git repositories
+
+**Invalid prefix:**
+```
+❌ Invalid prefix: xyz
+   Valid prefixes: feat fix docs style refactor perf test chore
+```
+→ Use one of the valid semantic prefixes
+
+**Auto-commit disabled:**
+Script exits silently when `AUTO_COMMIT_ENABLED=false`
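
The message format and prefix validation described in `skills/auto-commit/SKILL.md` can be sketched as a small shell function. This is a minimal illustration, not the actual contents of `auto-commit.sh`: the function name `build_commit_message` and the `VALID_PREFIXES` variable are hypothetical, and the real script may differ in structure and output.

```bash
#!/usr/bin/env bash
# Hypothetical sketch; NOT the real auto-commit.sh implementation.

# Prefix list copied from the "Invalid prefix" troubleshooting message.
VALID_PREFIXES="feat fix docs style refactor perf test chore"

# build_commit_message <prefix> <reason> <action> [agent_name]
# Prints "<prefix>: <action> (<reason>)" and, if an agent is given,
# a blank line followed by a Co-Authored-By trailer.
# Returns 1 (and prints to stderr) when the prefix is not recognised.
build_commit_message() {
  local prefix="$1" reason="$2" action="$3" agent="${4:-}"
  local p valid=0

  for p in $VALID_PREFIXES; do
    if [ "$p" = "$prefix" ]; then
      valid=1
      break
    fi
  done
  if [ "$valid" -ne 1 ]; then
    echo "Invalid prefix: $prefix" >&2
    return 1
  fi

  printf '%s: %s (%s)\n' "$prefix" "$action" "$reason"
  if [ -n "$agent" ]; then
    printf '\nCo-Authored-By: %s <noreply@orchestra>\n' "$agent"
  fi
}
```

For example, `build_commit_message "test" "to ensure code quality" "Add comprehensive unit tests for API endpoints" "Eden"` prints the subject line shown in the Eden example, followed by the trailer. The agent name is left optional here because the parameter list marks it as optional.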
--- a/skills/core/SKILL.md
+++ b/skills/core/SKILL.md
@@ -0,0 +1,70 @@
+---
+name: core
+description: Core development principles and guidelines covering security, QA, performance, documentation, and coding standards. Used by all agents to ensure consistent quality across the Orchestra system.
+---
+
+# Core Development Skills
+
+This skill provides essential development principles and checklists that all Orchestra agents follow to maintain high-quality standards across security, testing, documentation, performance, and code quality.
+
+## Overview
+
+The core skills provide:
+- **Security principles** (security.yaml) - Secure coding practices and vulnerability prevention
+- **QA guidelines** (qa.yaml) - Testing standards and quality assurance procedures
+- **Release procedures** (release.yaml) - Deployment and release management
+- **Performance standards** (performance.yaml) - Optimization and efficiency guidelines
+- **Documentation standards** (documentation.yaml) - Technical writing and documentation best practices
+- **Coding standards** (coding-standards.yaml) - Code style and structure conventions
+- **Review checklist** (review-checklist.yaml) - Pre-merge code review requirements
+- **Clarification guidelines** (clarify.yaml) - Requirements clarification procedures
+- **Token efficiency** (token-efficiency.md) - Guidelines for minimizing token usage
+
+## When to Use
+
+Agents automatically reference these guidelines as follows:
+- **Iris** - Applies security.yaml for security audits
+- **Finn** - Uses qa.yaml and review-checklist.yaml for testing
+- **Eden** - Follows documentation.yaml for technical writing
+- **Kai** - References performance.yaml and coding-standards.yaml for architecture
+- **Blake** - Uses release.yaml for deployment coordination
+- **Riley** - Applies clarify.yaml for requirements clarification
+- **All agents** - Follow token-efficiency.md to optimize responses
+
+## Usage
+
+Agents can reference specific guidelines:
+
+```markdown
+See `skills/core/security.yaml` for security best practices
+See `skills/core/qa.yaml` for testing requirements
+See `skills/core/token-efficiency.md` for response optimization
+```
+
+## File Structure
+
+```
+skills/core/
+├── SKILL.md (this file)
+├── security.yaml          # Security principles
+├── qa.yaml                # QA and testing standards
+├── release.yaml           # Release management
+├── performance.yaml       # Performance optimization
+├── documentation.yaml     # Documentation standards
+├── coding-standards.yaml  # Code style conventions
+├── review-checklist.yaml  # Pre-merge checklist
+├── clarify.yaml           # Requirements clarification
+└── token-efficiency.md    # Token usage optimization
+```
+
+## Best Practices
+
+1. **Consistency** - All agents follow the same core principles
+2. **Reference, don't duplicate** - Agents link to guidelines rather than copying them
+3. **Update centrally** - Guidelines are maintained in one place
+4. **Agent-agnostic** - Principles apply across all specialized agents
+5. **Token efficient** - Guidelines are concise and focused
+
+## Integration
+
+These core skills are automatically available to all Orchestra agents. No explicit invocation is needed - agents reference them as part of their standard operating procedures.
--- a/skills/core/clarify.yaml
+++ b/skills/core/clarify.yaml
@@ -0,0 +1,14 @@
+name: clarify
+description: |
+  Skill for disambiguating unclear requests.
+  Used by Riley and Eden to ensure mutual understanding before work begins.
+principles:
+  - Never assume intent — verify.
+  - Confirm context, motivation, outcomes, and constraints.
+patterns:
+  - "Could you describe the current issue or goal more concretely?"
+  - "What would success look like for this task?"
+  - "Are there any specific risks, dependencies, or constraints?"
+usage_examples:
+  - A vague feature request without measurable goals.
+  - A conflict between business and technical priorities.
--- a/skills/core/coding-standards.yaml
+++ b/skills/core/coding-standards.yaml
@@ -0,0 +1,12 @@
+name: coding-standards
+description: |
+  Skill for ensuring all code follows project conventions and best practices.
+  Used by Skye, Kai, Leo, Mina, and Iris.
+principles:
+  - Code must be readable, consistent, and testable.
+  - Prefer explicitness over cleverness.
+  - Document assumptions inline when logic is non-trivial.
+checks:
+  - Lint and type checks pass (ESLint, mypy, etc.)
+  - Code includes comments for complex logic.
+  - Sensitive data is never logged or exposed.
--- a/skills/core/documentation.yaml
+++ b/skills/core/documentation.yaml
@@ -0,0 +1,12 @@
+name: documentation
+description: |
+  Skill for maintaining clear and consistent documentation.
+  Used by Eden and Kai.
+principles:
+  - Documentation must tell the "why" as well as the "how".
+  - Every feature or change must include updated docs or ADR references.
+formats:
+  - README.md
+  - CHANGELOG.md
+  - ADR (Architecture Decision Record)
+  - Runbook
--- a/skills/core/performance.yaml
+++ b/skills/core/performance.yaml
@@ -0,0 +1,11 @@
+name: performance
+description: |
+  Skill for diagnosing and improving runtime performance.
+  Used by Nova, Leo, and Finn.
+principles:
+  - Measure before optimizing.
+  - Define target metrics explicitly.
+metrics:
+  - Time to First Byte (TTFB)
+  - Core Web Vitals (LCP, CLS, INP)
+  - API latency budgets
--- a/skills/core/qa.yaml
+++ b/skills/core/qa.yaml
@@ -0,0 +1,12 @@
+name: qa
+description: |
+  Skill for designing and executing automated quality assurance processes.
+  Used by Finn.
+principles:
+  - Focus testing on risk areas and user flows.
+  - Prioritize fast feedback (CI-first testing).
+checklist:
+  - "[ ] Unit tests for critical logic"
+  - "[ ] Integration tests for data flow"
+  - "[ ] E2E tests for user-facing functionality"
+  - "[ ] Regression tests before release"
--- a/skills/core/release.yaml
+++ b/skills/core/release.yaml
@@ -0,0 +1,11 @@
+name: release
+description: |
+  Skill governing safe deployment and version management.
+  Used by Blake.
+principles:
+  - Each release must be reversible.
+  - Document every change and communicate clearly.
+steps:
+  - Tag version and update changelog
+  - Run smoke/E2E tests post-deploy
+  - Announce deployment status
--- a/skills/core/review-checklist.yaml
+++ b/skills/core/review-checklist.yaml
@@ -0,0 +1,12 @@
+name: review-checklist
+description: |
+  Skill for systematic review before merging code.
+  Used by Skye, Finn, and Iris.
+principles:
+  - Reviews should focus on correctness, clarity, and maintainability.
+  - Do not approve PRs with missing documentation or untested code.
+checklist:
+  - "[ ] All tests pass and coverage is >80%"
+  - "[ ] Naming and structure follow project conventions"
+  - "[ ] Security-sensitive code reviewed by Iris"
+  - "[ ] Docs and changelogs updated"
--- a/skills/core/security.yaml
+++ b/skills/core/security.yaml
@@ -0,0 +1,13 @@
+name: security
+description: |
+  Skill enforcing secure practices throughout development.
+  Used by Iris, Mina, and Leo.
+principles:
+  - Least privilege, defense in depth.
+  - Secrets never hard-coded or logged.
+  - Dependencies regularly scanned and updated.
+checklist:
+  - "[ ] No hardcoded credentials or tokens"
+  - "[ ] SBOM generated and verified"
+  - "[ ] OAuth scopes minimized"
+  - "[ ] Policies reviewed before merge"
--- a/skills/core/token-efficiency.md
+++ b/skills/core/token-efficiency.md
@@ -0,0 +1,193 @@
+# Token Efficiency Skill
+
+**Purpose**: Minimize unnecessary token usage while maintaining code quality and thoroughness.
+
+## Core Principles
+
+### 1. Targeted Search, Not Exploration
+
+**Always prefer specific over general:**
+- ✅ `Glob("**/auth/login.ts")` - Exact file
+- ✅ `Grep("function handleAuth", type="ts")` - Specific pattern with file type
+- ❌ `Glob("**/*.ts")` - Too broad
+- ❌ Reading entire directories without filtering
+
+### 2. Read Only What You Need
+
+**Before reading a file, ask:**
+- Do I need the entire file, or just specific sections?
+- Can I use grep to find the relevant parts first?
+- Has this file already been read in the conversation?
+
+**Use Read with limits:**
+```
+Read(file_path, offset=100, limit=50)  # Only read 50 lines starting at line 100
+```
+
+### 3. Search Hierarchy (Most to Least Efficient)
+
+1. **Known location**: Direct file read if path is known
+2. **Pattern matching**: Glob with specific patterns
+3. **Keyword search**: Grep for specific terms
+4. **Contextual search**: Grep with -A/-B flags for context
+5. **Agent search**: Task tool with Explore agent (last resort)
+
+### 4. Incremental Discovery
+
+**Don't gather everything upfront:**
+```
+Bad approach:
+1. Read all files in src/
+2. Read all test files
+3. Read all config files
+4. Then start implementation
+
+Good approach:
+1. Read the specific file to modify
+2. Implement the change
+3. Only read related files if needed during implementation
+4. Read tests only when writing test updates
+```
+
+### 5. Scope Boundaries
+
+**Set clear limits:**
+- Maximum files to examine: 5-10 for most tasks
+- Maximum file size to read fully: ~500 lines
+- Large files: Use offset/limit or grep first
+- Stop once you have sufficient information
+
+### 6. Efficient Grep Usage
+
+**Use appropriate output modes:**
+- `files_with_matches`: Just need to know if pattern exists
+- `count`: Just need to know how many matches
+- `content`: Need to see actual matches (use head_limit)
+
+**Add context only when needed:**
+```
+Grep("error handler", output_mode="content", -C=3, head_limit=10)
+# Only 10 results with 3 lines of context each
+```
+
+### 7. Avoid Redundant Operations
+
+**Check conversation history first:**
+- Don't re-read files already examined
+- Reference previous findings instead of re-searching
+- Build on existing knowledge
+
+### 8. Model Selection for Task Tool
+
+**Choose the right model for the job:**
+```
+# Simple, well-defined task
+Task(description="...", model="haiku", prompt="...")
+
+# Default for most tasks (good balance)
+Task(description="...", model="sonnet", prompt="...")
+
+# Complex reasoning required (use sparingly)
+Task(description="...", model="opus", prompt="...")
+```
+
+## Practical Examples
+
+### Example 1: Adding a New Function
+
+**Inefficient approach:**
+```
+1. Read entire src/ directory
+2. Read all related files
+3. Search for all similar patterns
+4. Then implement
+```
+
+**Efficient approach:**
+```
+1. Grep for similar function names (files_with_matches)
+2. Read ONE example file
+3. Implement based on that pattern
+4. Only read more if the first example is unclear
+```
+
+### Example 2: Bug Investigation
+
+**Inefficient approach:**
+```
+1. Read all files that might be related
+2. Search entire codebase for error messages
+3. Check all test files
+```
+
+**Efficient approach:**
+```
+1. Grep for the specific error message
+2. Read only the files containing that error
+3. Check git blame if needed to understand context
+4. Read tests only for files being modified
+```
+
+### Example 3: Code Review
+
+**Inefficient approach:**
+```
+1. Read all files in the PR
+2. Read all related test files
+3. Read all documentation
+4. Then provide feedback
+```
+
+**Efficient approach:**
+```
+1. Use git diff to see only changed lines
+2. Read changed files with offset/limit to focus on modifications
+3. Grep for test coverage of changed functions
+4. Request summaries instead of reading full docs
+```
+
+## Token Budget Estimates
+
+**Quick reference for delegation:**
+
+| Task Type | Estimated Tokens | Recommended Model |
+|-----------|------------------|-------------------|
+| Simple bug fix | 2-5K | haiku |
+| New feature (small) | 5-10K | haiku |
+| New feature (medium) | 10-20K | sonnet |
+| Architecture design | 20-30K | sonnet |
+| Complex refactoring | 30-50K | sonnet |
+| Full system review | Split into chunks | sonnet |
+
+## Red Flags (Stop and Optimize)
+
+**If you find yourself:**
+- Reading more than 10 files for a single task
+- Using Glob with `**/*` without file type filtering
+- Reading entire large files (>1000 lines) without specific need
+- Re-reading files already examined in the conversation
+- Exploring code "just to understand better" without clear need
+
+**Then:** Stop and refine your approach. Ask yourself:
+1. What specific information do I need?
+2. What's the minimum search required to get it?
+3. Can I infer this from what I already know?
+
+## Integration with Other Skills
+
+This skill should be applied in conjunction with:
+- **coding-standards.yaml**: Maintain quality while being efficient
+- **qa.yaml**: Test only what's necessary, focus on critical paths
+- **security.yaml**: Targeted security checks, not exhaustive scans
+- **documentation.yaml**: Document changes, not entire codebase
+
+## Success Metrics
+
+**You're doing it right when:**
+- Tasks complete in <20K tokens for most features
+- You find what you need in 1-3 search operations
+- You reference previous reads instead of re-reading
+- You can justify every file you read
+- Subagents return focused, relevant results
+
+**Review and improve** if token usage is consistently high without proportional complexity in the task.
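
The targeted-read and grep output modes in `token-efficiency.md` have rough command-line analogues, sketched below. The function names are hypothetical, and the mapping to the `Read` and `Grep` tools is an approximation rather than a description of how those tools are implemented. The `--include` option assumes GNU or BSD grep, and the `head` limit here counts output lines, not matches.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating "read only what you need".

# read_slice <file> <offset> <limit>
# Print <limit> lines starting at line <offset>, then stop reading,
# similar in spirit to Read(file_path, offset=100, limit=50).
read_slice() {
  local file="$1" offset="$2" limit="$3"
  local last=$((offset + limit - 1))
  sed -n "${offset},${last}p;${last}q" "$file"
}

# files_with_matches <pattern> <dir> <glob>
# List only the names of matching files (like output_mode="files_with_matches").
files_with_matches() {
  grep -rl --include="$3" -- "$1" "$2"
}

# match_count <pattern> <file>
# Print the number of matching lines (like output_mode="count").
match_count() {
  grep -c -- "$1" "$2"
}

# content_with_context <pattern> <file> <context_lines> <max_output_lines>
# Show matches with surrounding context, capped at a fixed number of lines.
content_with_context() {
  grep -n -C "$3" -- "$1" "$2" | head -n "$4"
}
```

A typical sequence would be `files_with_matches "function handleAuth" src "*.ts"` to locate candidates, then `read_slice` on a single hit instead of reading the whole file.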
--- a/skills/modes/SKILL.md
+++ b/skills/modes/SKILL.md
@@ -0,0 +1,171 @@
+---
+name: modes
+description: Domain-specific development mode guidelines for UI, API, database, integration, migration, and specialized workflows. Each mode provides tailored principles, checklists, and patterns for different types of development work.
+---
+
+# Development Mode Skills
+
+This skill provides domain-specific guidelines that agents activate based on the type of work being performed. Each mode extends core principles with specialized requirements for different development contexts.
+
+## Overview
+
+The mode skills provide specialized guidance for:
+- **UI Mode** (ui.yaml) - Frontend, accessibility, SEO, visual components
+- **API Mode** (api.yaml) - REST/GraphQL endpoints, API design, versioning
+- **Database Mode** (db.yaml) - Schema design, migrations, query optimization
+- **Integration Mode** (integration.yaml) - External service integration, OAuth, webhooks
+- **Migration Mode** (migration.yaml) - Data migration, version upgrades, rollback procedures
+- **Performance Mode** (performance.yaml) - Optimization, caching, load testing
+- **QA Mode** (qa.yaml) - Testing strategies, coverage requirements, test automation
+- **Security Mode** (security.yaml) - Security audits, vulnerability scanning, penetration testing
+- **Release Mode** (release.yaml) - Deployment procedures, release management, rollback
+
+## When to Use
+
+Modes are automatically activated based on work context:
+
+### UI Mode
+- **Used by**: Nova, Skye, Finn
+- **Triggers**: Frontend changes, accessibility updates, SEO optimization
+- **Focus**: Lighthouse scores, ARIA compliance, responsive design
+
+### API Mode
+- **Used by**: Kai, Skye, Leo, Finn, Iris
+- **Triggers**: Endpoint creation, API versioning, integration work
+- **Focus**: REST/GraphQL standards, documentation, versioning
+
+### Database Mode
+- **Used by**: Leo, Skye, Kai
+- **Triggers**: Schema changes, migrations, query optimization
+- **Focus**: Data integrity, indexing, rollback safety
+
+### Integration Mode
+- **Used by**: Mina, Iris, Kai
+- **Triggers**: External service integration (Stripe, Shopify, AWS, etc.)
+- **Focus**: OAuth flows, webhook handling, error resilience
+
+### Migration Mode
+- **Used by**: Blake, Leo, Kai
+- **Triggers**: Database migrations, version upgrades, data transfers
+- **Focus**: Rollback procedures, data validation, zero-downtime
+
+### Performance Mode
+- **Used by**: Kai, Nova, Theo
+- **Triggers**: Optimization work, performance issues, load testing
+- **Focus**: Caching strategies, bundle optimization, resource usage
+
+### QA Mode
+- **Used by**: Finn, Blake, Theo
+- **Triggers**: Test creation, coverage validation, quality gates
+- **Focus**: Unit/integration/E2E tests, coverage thresholds
+
+### Security Mode
+- **Used by**: Iris, Mina, Blake
+- **Triggers**: Security audits, vulnerability scans, auth changes
+- **Focus**: Secret management, SBOM generation, penetration testing
+
+### Release Mode
+- **Used by**: Blake, Theo, Finn, Iris
+- **Triggers**: Deployment preparation, release coordination
+- **Focus**: Changelog generation, deployment verification, rollback readiness
+
+## Mode Structure
+
+Each mode YAML file contains:
+
+```yaml
+name: mode-name
+extends: [core-skills]           # Inherited core principles
+description: |
+  Mode-specific description
+used_by: [Agent1, Agent2]        # Which agents use this mode
+triggers:                         # When to activate this mode
+  - trigger_condition_1
+  - trigger_condition_2
+inputs_required:                  # Required context
+  - input_1
+  - input_2
+outputs:                          # Expected deliverables
+  - output_1
+  - output_2
+principles:                       # Mode-specific guidelines
+  - principle_1
+  - principle_2
+checklist:                        # Validation requirements
+  - "[ ] checklist_item_1"
+  - "[ ] checklist_item_2"
+patterns:                         # Common solutions
+  - "Pattern description"
+hooks:                            # Integration points
+  - hook_name
+```
+
+## Usage
+
+Agents reference specific modes based on work type:
+
+```markdown
+# Nova working on UI
+See `skills/modes/ui.yaml` for accessibility and performance requirements
+
+# Leo working on database
+See `skills/modes/db.yaml` for migration and schema design guidelines
+
+# Mina integrating Stripe
+See `skills/modes/integration.yaml` for OAuth and webhook patterns
+```
+
+## File Structure
+
+```
+skills/modes/
+├── SKILL.md (this file)
+├── ui.yaml          # Frontend/accessibility/SEO
+├── api.yaml         # REST/GraphQL endpoints
+├── db.yaml          # Database schema/migrations
+├── integration.yaml # External service integration
+├── migration.yaml   # Data migration procedures
+├── performance.yaml # Optimization strategies
+├── qa.yaml          # Testing requirements
+├── security.yaml    # Security audits
+└── release.yaml     # Deployment procedures
+```
+
+## Mode Inheritance
+
+Modes extend core skills:
+- **All modes** inherit from core principles
+- **Specific modes** may extend additional core skills (e.g., ui.yaml extends performance, review-checklist, documentation)
+- **Agents** apply both core and mode-specific guidelines
+
+## Best Practices
+
+1. **Context-aware activation** - Modes activate based on work type, not manual selection
+2. **Layered guidance** - Core principles + mode-specific requirements
+3. **Agent specialization** - Each agent knows which modes to apply
+4. **Validation gates** - Each mode defines success criteria and checklists
+5. **Pattern reuse** - Common solutions documented for consistency
+
+## Integration with Agents
+
+Agents automatically apply relevant modes:
+- **Nova** (UI/UX) → ui.yaml, performance.yaml
+- **Leo** (Database) → db.yaml, migration.yaml
+- **Mina** (Integration) → integration.yaml, security.yaml
+- **Blake** (Release) → release.yaml, qa.yaml
+- **Iris** (Security) → security.yaml, integration.yaml
+- **Finn** (QA) → qa.yaml, performance.yaml
+- **Kai** (Architecture) → api.yaml, db.yaml, performance.yaml
+- **Skye** (Implementation) → ui.yaml, api.yaml, db.yaml (context-dependent)
+
+## Example Workflow
+
+When Nova receives a UI task:
+1. Activates **ui.yaml** mode
+2. Inherits principles from **core/performance.yaml**, **core/review-checklist.yaml**, **core/documentation.yaml**
+3. Applies Lighthouse A11y ≥ 95 requirement
+4. Validates keyboard/screen-reader flows
+5. Checks meta tags and OG/Twitter cards
+6. Measures CLS < 0.1, LCP within budget
+
+This ensures consistent, high-quality output across all UI work.
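
The "Mode Structure" template suggests a simple consistency check: every mode file should define the same top-level keys. The sketch below is one way to do that with `grep`. The function name and the key list are assumptions. `patterns` is left out of the required set because only some mode files in this commit define it. The check looks only for top-level keys and does not validate YAML syntax.

```bash
#!/usr/bin/env bash
# Hypothetical check; not part of the Orchestra tooling in this commit.

# Top-level keys taken from the "Mode Structure" template (minus `patterns`).
REQUIRED_MODE_KEYS="name extends description used_by triggers inputs_required outputs principles checklist hooks"

# missing_mode_keys <file>
# Prints each required top-level key that is absent from the file.
# Returns 0 when nothing is missing, 1 otherwise.
missing_mode_keys() {
  local file="$1" key missing=0
  for key in $REQUIRED_MODE_KEYS; do
    if ! grep -q "^${key}:" "$file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}
```

Run over a checkout, this could look like `for f in skills/modes/*.yaml; do missing_mode_keys "$f" || echo "(missing in $f)"; done`.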
--- a/skills/modes/api.yaml
+++ b/skills/modes/api.yaml
@@ -0,0 +1,28 @@
+name: api
+extends: [coding-standards, qa, security, documentation]
+description: |
+  Mode skill for API design/implementation and contract governance (REST/GraphQL/gRPC).
+used_by: [Kai, Skye, Leo, Finn, Iris]
+triggers:
+  - new_endpoint
+  - contract_change
+  - latency_budget_defined
+inputs_required:
+  - api_style (REST/GraphQL/gRPC)
+  - contract_source (OpenAPI/SDL/IDL)
+  - non_functional (latency/error budgets, rate limits)
+outputs:
+  - openapi.yaml
+  - api-change-log.md
+principles:
+  - "Contract-first: generate server/clients from spec when possible."
+  - Backward-compatibility by default; version when breaking.
+  - Validate and sanitize all inputs at boundaries.
+checklist:
+  - "[ ] OpenAPI/SDL updated and validated"
+  - "[ ] Error model consistent (problem+json or equivalent)"
+  - "[ ] AuthN/Z documented (scopes/claims)"
+  - "[ ] Load/perf smoke tests exist"
+hooks:
+  - before_pr
+  - before_merge
--- a/skills/modes/db.yaml
+++ b/skills/modes/db.yaml
@@ -0,0 +1,30 @@
+name: db
+extends: [security, documentation]
+description: |
+  Mode skill for relational schemas, migrations, RLS/policies, and type contracts.
+used_by: [Leo, Kai, Skye, Iris]
+triggers:
+  - schema_change
+  - migration_needed
+  - rls_or_policy_change
+inputs_required:
+  - migration_plan (up/down)
+  - data_backfill_strategy
+  - locking_risk_assessment
+  - rls_specs (who can read/write what)
+outputs:
+  - migration.sql
+  - db-changes.md
+  - policy-review.md
+principles:
+  - Small, reversible migrations with clear downtime expectations.
+  - Types drive code; generate types from DB where feasible.
+  - RLS least-privilege and audited.
+checklist:
+  - "[ ] Dry-run migration passed in staging snapshot"
+  - "[ ] Rollback (down) script tested"
+  - "[ ] RLS/Policies peer-reviewed (Iris)"
+  - "[ ] Data backfill verified and idempotent"
+hooks:
+  - before_pr
+  - before_merge
--- a/skills/modes/integration.yaml
+++ b/skills/modes/integration.yaml
@@ -0,0 +1,29 @@
+name: integration
+extends: [security, documentation, qa]
+description: |
+  Mode skill for third-party/platform integrations (Shopify, Sanity, Supabase, AWS, Stripe, etc.).
+used_by: [Mina, Kai, Skye, Iris, Finn]
+triggers:
+  - includes_integrations
+  - needs_oauth_or_webhooks
+  - cross_service_data_flow
+inputs_required:
+  - credentials_location (secret store path)
+  - oauth_scopes
+  - webhook_endpoints_and_retries
+  - rate_limit_and_backoff_strategy
+outputs:
+  - integration_runbook.md
+  - healthcheck.spec.md
+principles:
+  - Least-privilege credentials; rotate regularly.
+  - Retries with jitter; idempotency keys where applicable.
+  - "Observability first: health checks and dashboards."
+checklist:
+  - "[ ] Secrets from vault (no .env commits)"
+  - "[ ] OAuth scopes minimized and documented"
+  - "[ ] Webhook signatures validated; replay protected"
+  - "[ ] Circuit-breakers / retry policies in place"
+hooks:
+  - before_pr
+  - before_merge
--- a/skills/modes/migration.yaml
+++ b/skills/modes/migration.yaml
@@ -0,0 +1,29 @@
+name: migration
+extends: [db, release, qa, documentation]
+description: |
+  Mode skill for coordinated changes across code, schema, and data with safe rollout/rollback.
+used_by: [Kai, Leo, Skye, Blake, Finn]
+triggers:
+  - breaking_change
+  - multi_step_rollout
+  - data_backfill_required
+inputs_required:
+  - phased_plan (T+0, T+1, T+2 steps)
+  - observability_checks (metrics/logs)
+  - rollback_switch (feature flag / traffic split)
+outputs:
+  - migration-plan.md
+  - backfill-script.(py|ts|sql)
+  - rollback-plan.md
+principles:
+  - Dark launch → dual-write/dual-read → cutover → cleanup.
+  - Reversible at every step; time-boxed checkpoints.
+  - Communicate windows and fallback.
+checklist:
+  - "[ ] Feature flags or traffic router configured"
+  - "[ ] Dual-read/write verified in staging"
+  - "[ ] Backfill idempotent with checkpoints"
+  - "[ ] Cutover + rollback rehearsed"
+hooks:
+  - before_merge
+  - before_deploy
--- a/skills/modes/performance.yaml
+++ b/skills/modes/performance.yaml
@@ -0,0 +1,27 @@
+name: performance
+extends: [performance]   # from core; this file specializes targets/workflows
+description: |
+  Mode specialization for setting budgets and running perf diagnostics across UI/API.
+used_by: [Nova, Finn, Kai, Theo, Skye]
+triggers:
+  - perf_budget_defined
+  - perf_regression_detected
+  - slo_or_sli_violation
+inputs_required:
+  - target_metrics (TTFB/LCP/CLS/API P95 latency)
+  - baseline_report (Lighthouse/Profiler)
+  - test_scenarios (user journeys / endpoints)
+outputs:
+  - perf-report.md
+  - lighthouse-report.json
+  - traces/
+principles:
+  - Always measure on realistic data and device profiles.
+  - Track budgets in CI; block on critical regression.
+checklist:
+  - "[ ] Baseline captured and committed"
+  - "[ ] CI perf step green with budgets"
+  - "[ ] Bottleneck hypothesis & fix PR linked"
+hooks:
+  - before_merge
+  - after_deploy
--- a/skills/modes/qa.yaml
+++ b/skills/modes/qa.yaml
@@ -0,0 +1,27 @@
+name: qa
+extends: [qa]   # from core; specialize suites by change type
+description: |
+  Mode specialization for selecting/maintaining the right automated suites.
+used_by: [Finn, Blake, Theo]
+triggers:
+  - pre_merge
+  - pre_deploy
+  - incident_repro
+inputs_required:
+  - change_scope (ui/api/db/integration)
+  - critical_paths (top user journeys)
+  - perf_targets (if applicable)
+outputs:
+  - test-plan.md
+  - e2e-specs/
+  - smoke-report.md
+principles:
+  - Prioritize high-risk/high-impact paths.
+  - Keep suites fast; parallelize; quarantine flakies.
+checklist:
+  - "[ ] Unit + integration pass"
+  - "[ ] E2E covers critical paths"
+  - "[ ] Smoke tests green in staging"
+hooks:
+  - before_merge
+  - before_deploy
--- a/skills/modes/release.yaml
+++ b/skills/modes/release.yaml
@@ -0,0 +1,27 @@
+name: release
+extends: [release, qa, documentation, security]
+description: |
+  Mode specialization for staging→prod promotion, canary, and rollback orchestration.
+used_by: [Blake, Theo, Finn, Iris]
+triggers:
+  - ready_for_release
+  - stage == 'pre-deploy'
+  - rollback_or_hotfix_needed
+inputs_required:
+  - release_notes
+  - rollout_plan (regions/percentages)
+  - rollback_criteria (metrics/alerts)
+outputs:
+  - changelog.md
+  - release-notes.md
+  - rollout-status.md
+principles:
+  - Canary by default; fast rollback path.
+  - Communicate status; capture evidence artifacts.
+checklist:
+  - "[ ] All pre-deploy gates passed (QA/Sec)"
+  - "[ ] Canary + metrics watch window configured"
+  - "[ ] Rollback script/steps verified"
+hooks:
+  - before_deploy
+  - after_deploy
--- a/skills/modes/security.yaml
+++ b/skills/modes/security.yaml
@@ -0,0 +1,28 @@
+name: security
+extends: [security]   # from core; specialize platform checks
+description: |
+  Mode specialization for platform-aware checks (headers/CSP, IAM, SBOM, supply-chain).
+used_by: [Iris, Mina, Leo, Blake]
+triggers:
+  - deps_changed
+  - sbom_update_needed
+  - contains_secrets
+  - iam_or_policy_change
+inputs_required:
+  - sbom_tool (syft/cyclonedx)
+  - scanning_tool (grype/trivy)
+  - policy_diff (IAM/RLS/CSP)
+outputs:
+  - security-report.md
+  - sbom.json
+principles:
+  - "Shift-left: check early; block risky merges."
+  - Signed artifacts; pinned versions.
+checklist:
+  - "[ ] SBOM updated and scanned"
+  - "[ ] Secrets scans pass (no leaks; noise triaged)"
+  - "[ ] CSP/headers validated in staging"
+  - "[ ] IAM/RLS diffs approved"
+hooks:
+  - before_pr
+  - before_merge
--- a/skills/modes/ui.yaml
+++ b/skills/modes/ui.yaml
@@ -0,0 +1,33 @@
+name: ui
+extends: [performance, review-checklist, documentation]   # from skills/core
+description: |
+  Mode skill for user-facing UI work: layout, components, A11y, SEO, visual polish.
+  Applies to Next.js/React, Shopify theme (Liquid/JSON), and general frontend stacks.
+used_by: [Nova, Skye, Finn]
+triggers:
+  - touches_ui
+  - user_facing_change
+  - a11y_changes
+  - seo_changes
+inputs_required:
+  - design_reference (Figma/URL/screenshot) optional
+  - target_devices (desktop/mobile)
+  - accessibility_budget (e.g., WCAG 2.1 AA)
+  - seo_targets (title/desc/canonical/open-graph)
+outputs:
+  - ui-change-notes.md
+  - updated-components/
+principles:
+  - Prefer accessible, semantic markup; ARIA only when necessary.
+  - Keep components pure and state minimal; co-locate styles/types.
+  - Enforce design tokens; no magic numbers.
+checklist:
+  - "[ ] Lighthouse A11y ≥ 95, Best Practices ≥ 95"
+  - "[ ] Keyboard and screen-reader flows validated"
+  - "[ ] Meta tags, canonical, OG/Twitter cards present"
+  - "[ ] CLS < 0.1, LCP within budget (desktop/mobile targets specified)"
+patterns:
+  - "Add an accessible label and role where semantics are insufficient."
+  - "Use CSS logical properties for RTL/i18n readiness."
+hooks:
+  - before_merge
--- a/skills/web-browse/SKILL.md
+++ b/skills/web-browse/SKILL.md
@@ -0,0 +1,476 @@
+# Web Browse Skill
+
+## Purpose
+Safely navigate, interact with, and capture evidence from web pages using automated browser operations.
+
+## Security Features
+- **Rate Limiting**: Maximum operations per session (10 navigations, 50 clicks, 30 inputs)
+- **Input Sanitization**: Blocks sensitive patterns (passwords, credit cards, SSN)
+- **Operation Logging**: All actions are logged to `artifacts/browser/{session}/operations.log`
+- **Safe Mode**: Dangerous JavaScript operations are blocked
+- **Local Access Only**: Server binds to localhost:3030 (not accessible externally)
+
+## When to Use
+- Verify preview deployments (UI/UX checks)
+- Capture screenshots for documentation
+- Run Lighthouse performance audits
+- Scrape public data from allowed domains
+- Test user flows without full E2E suite
+- Monitor production health with visual checks
+
+## Agents Using This Skill
+- **Nova** (Preview Verification): Screenshot + Lighthouse before merge
+- **Theo** (Post-Deploy Monitoring): Synthetic monitoring with screenshots
+- **Mina** (UI/UX Review): Visual comparison and accessibility checks
+
+## Configuration
+
+No configuration required! The browser server allows access to any valid URL and shows the browser GUI by default for easy debugging.
+
+Default settings in `.env`:
+```bash
+# Server port (default: 9222 - Chrome DevTools Protocol port)
+BROWSER_MCP_PORT=9222
+
+# Show browser GUI (default: visible for debugging)
+# Set to 'true' for headless/background mode (useful for CI)
+BROWSER_HEADLESS=false
+```
+
+### Headless Mode (Background)
+
+To run the browser in the background without a GUI:
+
+```bash
+# Option 1: Set environment variable
+BROWSER_HEADLESS=true npm run browser
+
+# Option 2: Update .env file
+BROWSER_HEADLESS=true
+```
+
+By default (GUI mode):
+- Browser window opens automatically
+- You can see all navigation, clicks, and interactions in real-time
+- Perfect for debugging and learning
+- Screenshots still save to artifacts/
+
+## API Endpoints
+
+### Initialize Browser
+```bash
+curl -X POST http://localhost:3030/init
+# Response: {"ok": true, "sessionId": "1234567890"}
+```
+
+### Navigate to URL
+```bash
+curl -X POST http://localhost:3030/navigate \
+  -H 'Content-Type: application/json' \
+  -d '{"url": "https://myapp.vercel.app", "waitUntil": "networkidle"}'
+# Response: {"ok": true, "url": "https://myapp.vercel.app", "title": "My App"}
+```
+
+### Click Element
+```bash
+curl -X POST http://localhost:3030/click \
+  -H 'Content-Type: application/json' \
+  -d '{"selector": "button.primary"}'
+# Response: {"ok": true}
+```
+
+### Type Text
+```bash
+curl -X POST http://localhost:3030/type \
+  -H 'Content-Type: application/json' \
+  -d '{"selector": "input[name=search]", "text": "test query", "pressEnter": true}'
+# Response: {"ok": true}
+```
+
+### Authenticate (Password-Protected Sites)
+
+**Interactive Authentication Flow:**
+
+1. **First attempt** - Try without password (checks environment variable):
+```bash
+curl -X POST http://localhost:3030/auth \
+  -H 'Content-Type: application/json' \
+  -d '{"type": "shopify-store", "submitSelector": "button[type=submit]"}'
+```
+
+2. **If the password is not in .env** - Server responds with a password request:
+```json
+{
+  "needsPassword": true,
+  "envVarName": "SHOPIFY_STORE_PASSWORD",
+  "type": "shopify-store",
+  "prompt": "Please enter the password for shopify-store:"
+}
+```
+
+3. **Provide password** - Send password in request:
+```bash
+curl -X POST http://localhost:3030/auth \
+  -H 'Content-Type: application/json' \
+  -d '{"type": "shopify-store", "password": "mypassword", "submitSelector": "button[type=submit]"}'
+# Response: {"ok": true, "shouldSavePassword": true, "envVarName": "SHOPIFY_STORE_PASSWORD"}
+```
+
+4. **Save password** (optional) - Save to .env for future use:
+```bash
+curl -X POST http://localhost:3030/auth/save \
+  -H 'Content-Type: application/json' \
+  -d '{"envVarName": "SHOPIFY_STORE_PASSWORD", "password": "mypassword"}'
+# Response: {"ok": true, "message": "Password saved to .env as SHOPIFY_STORE_PASSWORD"}
+```
+
+**Built-in auth types:**
+- `shopify-store` - Uses `SHOPIFY_STORE_PASSWORD` environment variable
+- `staging` - Uses `STAGING_PASSWORD` environment variable
+- `preview` - Uses `PREVIEW_PASSWORD` environment variable
+- **Custom types** - Any type converts to `{TYPE}_PASSWORD` (e.g., `my-site` → `MY_SITE_PASSWORD`)
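
The naming rule above can be sketched as a small shell function. This is only an illustration of the documented `{TYPE}_PASSWORD` convention: the function name `auth_env_var_name` is hypothetical, and the sketch assumes that upper-casing plus replacing hyphens with underscores is the whole rule (how the server treats other punctuation is not stated here).

```bash
# Hypothetical helper: derive the environment variable name for an auth type.
# Assumption: the type is upper-cased and hyphens become underscores.
auth_env_var_name() {
  local upper
  upper=$(printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')
  printf '%s_PASSWORD\n' "$upper"
}
```

For example, `auth_env_var_name my-site` prints `MY_SITE_PASSWORD`, matching the custom-type example in the list.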
+
+**Parameters:**
+- `type` (required): Auth type (built-in or custom)
+- `passwordSelector` (optional): CSS selector for password field (default: `input[type="password"]`)
+- `submitSelector` (optional): CSS selector for submit button (if provided, auto-submits form)
+- `password` (optional): Password to use (if not in environment variable)
+
+**Using browser-helper.sh** (handles interactive flow automatically):
+```bash
+# Will prompt for password if not in .env, then offer to save it
+# Also handles 2FA automatically - waits for user to complete 2FA in browser
+./browser-helper.sh auth shopify-store "input[type=password]" "button[type=submit]"
+```
+
+### Wait for 2FA Completion
+
+If 2FA is detected after password authentication, the system will automatically wait for manual completion:
+
+```bash
+curl -X POST http://localhost:3030/auth/wait-2fa \
+  -H 'Content-Type: application/json' \
+  -d '{"timeout": 120000}'
+# Response: {"ok": true, "completed": true, "message": "2FA completed successfully", "url": "..."}
+```
+
+**How it works:**
+1. System detects 2FA page (looks for keywords in both English and Japanese):
+   - **English**: "2fa", "mfa", "verify", "two-factor", "authentication code", "verification code", "security code"
+   - **Japanese**: "二段階認証", "2段階認証", "認証コード", "確認コード", "セキュリティコード", "ワンタイムパスワード"
+2. Returns `requires2FA: true` response
+3. Browser window stays open (GUI mode) for manual 2FA input
+4. System polls every 2 seconds to check if 2FA completed
+5. Success detected when:
+   - URL changes from 2FA page
+   - 2FA indicators disappear from page content (checks both English and Japanese)
+6. Default timeout: 2 minutes (configurable)
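
The keyword check in step 1 can be approximated on the client side. The sketch below is not the server's implementation: `looks_like_2fa` is a hypothetical name, and the use of a case-insensitive fixed-string search is an assumption. Only the keyword list is taken from the steps above.

```bash
# Hypothetical helper: exit 0 if the given text contains a 2FA indicator.
# Keywords are copied from the list above; case-insensitive substring
# matching is an assumption about how detection works.
looks_like_2fa() {
  printf '%s\n' "$1" | grep -qiF \
    -e "2fa" -e "mfa" -e "verify" -e "two-factor" \
    -e "authentication code" -e "verification code" -e "security code" \
    -e "二段階認証" -e "2段階認証" -e "認証コード" \
    -e "確認コード" -e "セキュリティコード" -e "ワンタイムパスワード"
}
```

A broad keyword such as "verify" also matches unrelated text (for example an email verification notice), so a check like this is best treated as a hint rather than proof that a 2FA page is showing.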
+
+**Parameters:**
+- `timeout` (optional): Maximum wait time in milliseconds (default: 120000 = 2 minutes)
+- `expectedUrlPattern` (optional): Regex pattern for successful auth URL
+
+### Wait for Element
+```bash
+curl -X POST http://localhost:3030/wait \
+  -H 'Content-Type: application/json' \
+  -d '{"selector": ".results", "timeout": 10000}'
+# Response: {"ok": true}
+```
+
+### Scrape Text Content
+```bash
+curl -X POST http://localhost:3030/scrape \
+  -H 'Content-Type: application/json' \
+  -d '{"selector": "h2.product-title", "limit": 50}'
+# Response: {"ok": true, "data": ["Product 1", "Product 2", ...]}
+```
+
+### Take Screenshot
+```bash
+curl -X POST http://localhost:3030/screenshot \
+  -H 'Content-Type: application/json' \
+  -d '{"filename": "homepage.png", "fullPage": true}'
+# Response: {"ok": true, "path": "/path/to/artifacts/browser/.../homepage.png"}
+```
+
+### Get Page Content
+```bash
+curl -X POST http://localhost:3030/content
+# Response: {"ok": true, "url": "...", "title": "...", "html": "..."}
+```
+
+### Evaluate JavaScript
+```bash
+curl -X POST http://localhost:3030/evaluate \
+  -H 'Content-Type: application/json' \
+  -d '{"expression": "document.querySelectorAll(\"img\").length"}'
+# Response: {"ok": true, "result": 42}
+```
+
+### Close Browser
+```bash
+curl -X POST http://localhost:3030/close
+# Response: {"ok": true}
+```
+
+### Health Check
+```bash
+curl http://localhost:3030/health
+# Response: {"ok": true, "browser": true, "page": true, "sessionId": "...", "allowedDomains": [...]}
+```
+
+## Usage Examples
+
+### Example 1: Screenshot Preview Deployment (Nova)
+```bash
+#!/usr/bin/env bash
+PREVIEW_URL="https://myapp-pr-123.vercel.app"
+
+# Initialize
+curl -X POST http://localhost:3030/init
+
+# Navigate
+curl -X POST http://localhost:3030/navigate \
+  -H 'Content-Type: application/json' \
+  -d "{\"url\":\"$PREVIEW_URL\"}"
+
+# Wait for main content
+curl -X POST http://localhost:3030/wait \
+  -H 'Content-Type: application/json' \
+  -d '{"selector":"main"}'
+
+# Screenshot
+curl -X POST http://localhost:3030/screenshot \
+  -H 'Content-Type: application/json' \
+  -d '{"filename":"preview-homepage.png","fullPage":true}'
+
+# Close
+curl -X POST http://localhost:3030/close
+```
+
+### Example 2: Scrape Product Titles (Competitive Research)
+```bash
+#!/usr/bin/env bash
+# Initialize
+curl -X POST http://localhost:3030/init
+
+# Navigate to competitor site (must be in allowlist!)
+curl -X POST http://localhost:3030/navigate \
+  -H 'Content-Type: application/json' \
+  -d '{"url":"https://competitor.shopify.com/products"}'
+
+# Scrape product titles
+curl -X POST http://localhost:3030/scrape \
+  -H 'Content-Type: application/json' \
+  -d '{"selector":"h3.product-title","limit":20}' \
+  > artifacts/competitor-products.json
+
+# Close
+curl -X POST http://localhost:3030/close
+```
+
+### Example 3: Test Search Flow (Mina)
+```bash
+#!/usr/bin/env bash
+PREVIEW_URL="https://myapp.vercel.app"
+
+# Initialize
+curl -X POST http://localhost:3030/init
+
+# Navigate
+curl -X POST http://localhost:3030/navigate \
+  -H 'Content-Type: application/json' \
+  -d "{\"url\":\"$PREVIEW_URL\"}"
+
+# Type in search box
+curl -X POST http://localhost:3030/type \
+  -H 'Content-Type: application/json' \
+  -d '{"selector":"input[type=search]","text":"test product","pressEnter":true}'
+
+# Wait for results
+curl -X POST http://localhost:3030/wait \
+  -H 'Content-Type: application/json' \
+  -d '{"selector":".search-results"}'
+
+# Screenshot results
+curl -X POST http://localhost:3030/screenshot \
+  -H 'Content-Type: application/json' \
+  -d '{"filename":"search-results.png","fullPage":false}'
+
+# Close
+curl -X POST http://localhost:3030/close
+```
+
+### Example 4: Test Password-Protected Shopify Store (with 2FA support)
+```bash
+#!/usr/bin/env bash
+STORE_URL="https://mystore.myshopify.com"
+
+# Initialize
+curl -X POST http://localhost:3030/init
+
+# Navigate to password-protected store
+curl -X POST http://localhost:3030/navigate \
+  -H 'Content-Type: application/json' \
+  -d "{\"url\":\"$STORE_URL\"}"
+
+# Wait for password form
+curl -X POST http://localhost:3030/wait \
+  -H 'Content-Type: application/json' \
+  -d '{"selector":"input[type=password]"}'
+
+# Authenticate using environment variable
+AUTH_RESPONSE=$(curl -s -X POST http://localhost:3030/auth \
+  -H 'Content-Type: application/json' \
+  -d '{"type":"shopify-store","submitSelector":"button[type=submit]"}')
+
+# Check if 2FA is required
+REQUIRES_2FA=$(echo "$AUTH_RESPONSE" | jq -r '.requires2FA // false')
+
+if [ "$REQUIRES_2FA" = "true" ]; then
+  echo "⏳ 2FA detected. Please complete authentication in the browser..."
+
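+  # Wait for 2FA completion (2 minutes timeout)
+  TWO_FA_RESPONSE=$(curl -s -X POST http://localhost:3030/auth/wait-2fa \
+    -H 'Content-Type: application/json' \
+    -d '{"timeout":120000}')
+
+  # Stop if 2FA did not complete within the timeout
+  if [ "$(echo "$TWO_FA_RESPONSE" | jq -r '.completed // false')" != "true" ]; then
+    echo "❌ 2FA was not completed"
+    exit 1
+  fi
+
+  echo "✅ 2FA completed"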
+fi
+
+# Wait for store to load
+curl -X POST http://localhost:3030/wait \
+  -H 'Content-Type: application/json' \
+  -d '{"selector":"main"}'
+
+# Screenshot the store
+curl -X POST http://localhost:3030/screenshot \
+  -H 'Content-Type: application/json' \
+  -d '{"filename":"shopify-store.png","fullPage":true}'
+
+# Close
+curl -X POST http://localhost:3030/close
+```
+
+**Or simply use browser-helper.sh (handles everything automatically):**
+```bash
+./browser-helper.sh init
+./browser-helper.sh navigate https://mystore.myshopify.com
+./browser-helper.sh wait "input[type=password]"
+./browser-helper.sh auth shopify-store "input[type=password]" "button[type=submit]"
+# If 2FA detected, automatically prompts user and waits for completion
+./browser-helper.sh wait "main"
+./browser-helper.sh screenshot shopify-store.png true
+./browser-helper.sh close
+```
+
+## Integration with Hooks
+
+### before_merge.sh - Visual Regression Check
+```bash
+#!/usr/bin/env bash
+if [ -n "$PREVIEW_URL" ]; then
+  echo "→ Capturing preview screenshot..."
+
+  curl -s -X POST http://localhost:3030/init > /dev/null
+  curl -s -X POST http://localhost:3030/navigate \
+    -H 'Content-Type: application/json' \
+    -d "{\"url\":\"$PREVIEW_URL\"}" > /dev/null
+
+  curl -s -X POST http://localhost:3030/screenshot \
+    -H 'Content-Type: application/json' \
+    -d '{"filename":"preview.png","fullPage":true}' | jq -r '.path'
+
+  curl -s -X POST http://localhost:3030/close > /dev/null
+
+  echo "✅ Screenshot saved"
+fi
+```
+
+### after_deploy.sh - Production Health Check
+```bash
+#!/usr/bin/env bash
+PROD_URL="${PRODUCTION_URL:-https://app.example.com}"
+
+echo "→ Running production health check..."
+
+curl -s -X POST http://localhost:3030/init > /dev/null
+curl -s -X POST http://localhost:3030/navigate \
+  -H 'Content-Type: application/json' \
+  -d "{\"url\":\"$PROD_URL\"}" > /dev/null
+
+# Check if critical element exists
+RESULT=$(curl -s -X POST http://localhost:3030/evaluate \
+  -H 'Content-Type: application/json' \
+  -d '{"expression":"document.querySelector(\".app-loaded\") !== null"}' | jq -r '.result')
+
+# Close the browser before reporting, so it is released on failure too
+curl -s -X POST http://localhost:3030/close > /dev/null
+
+if [ "$RESULT" = "true" ]; then
+  echo "✅ Production app loaded successfully"
+else
+  echo "❌ Production app failed to load"
+  exit 1
+fi
+```
+
+## Best Practices
+
+1. **Always Initialize**: Call `/init` before any browser operations
+2. **Clean Up**: Call `/close` when done to free resources
+3. **Wait for Elements**: Use `/wait` before interacting with dynamic content
+4. **Rate Limit Awareness**: Monitor operation counts to avoid hitting limits
+5. **Security First**: Never type credentials or PII (blocked by default)
+6. **Evidence Collection**: Save screenshots and logs for audit trail
+7. **Domain Approval**: Add domains to allowlist before accessing
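
Practices 1 and 2 can be combined in a wrapper so that `/close` is attempted even when a step in between fails. This is a minimal sketch, assuming the server is reachable at `http://localhost:3030` as in the examples above. `BROWSER_URL`, `browser_post`, and `with_browser_session` are hypothetical names and not part of the skill.

```bash
#!/usr/bin/env bash
# Assumption: base URL matches the examples in this document; override if needed.
BROWSER_URL="${BROWSER_URL:-http://localhost:3030}"

# POST to an endpoint, optionally with a JSON body.
browser_post() {
  local endpoint="$1" body="${2:-}"
  if [ -n "$body" ]; then
    curl -sS -X POST "$BROWSER_URL/$endpoint" \
      -H 'Content-Type: application/json' -d "$body"
  else
    curl -sS -X POST "$BROWSER_URL/$endpoint"
  fi
}

# Run a command between /init and /close.
# /close is attempted even if the command fails, and the command's
# exit status is returned to the caller.
with_browser_session() {
  browser_post init > /dev/null || return 1
  local status=0
  "$@" || status=$?
  browser_post close > /dev/null || true
  return "$status"
}
```

To use it, define a function containing the steps (for example `browser_post navigate '{"url":"https://myapp.vercel.app"}'` followed by a screenshot call) and pass its name to `with_browser_session`.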
+
+## Troubleshooting
+
+### Error: "Domain not allowed"
+Add the domain to `BROWSER_ALLOWED_DOMAINS` in `.env`
+
+### Error: "Browser not initialized"
+Call `/init` endpoint first
+
+### Error: "Navigation limit exceeded"
+You've hit the limit of 10 navigations per session. Close the browser and reinitialize.
+
+### Error: "No active page"
+Navigate to a URL first using `/navigate`
+
+### Screenshots not saving
+Check that `artifacts/` directory has write permissions
+
+## Artifacts
+
+All browser operations save artifacts to:
+```
+artifacts/browser/{sessionId}/
+├── operations.log       # All operations with timestamps
+├── screenshot.png       # Screenshots (custom filenames)
+├── preview.png
+└── ...
+```
+
+## Rate Limits (Per Session)
+
+- Navigations: 10
+- Clicks: 50
+- Type operations: 30
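
A script that issues many operations can keep its own tally so it stops before the server rejects a request. The sketch below is client-side bookkeeping only: the variable and function names are hypothetical, and the server's own counters remain authoritative. The limits are the ones listed above.

```bash
# Per-session limits copied from the list above.
NAV_LIMIT=10
CLICK_LIMIT=50
TYPE_LIMIT=30

# Client-side counters (hypothetical; the server keeps its own counts).
nav_count=0
click_count=0
type_count=0

# Record one operation. Returns 1 without counting if the limit is
# already reached, and 2 for an unknown operation name.
track_op() {
  case "$1" in
    navigate)
      [ "$nav_count" -lt "$NAV_LIMIT" ] || return 1
      nav_count=$((nav_count + 1)) ;;
    click)
      [ "$click_count" -lt "$CLICK_LIMIT" ] || return 1
      click_count=$((click_count + 1)) ;;
    type)
      [ "$type_count" -lt "$TYPE_LIMIT" ] || return 1
      type_count=$((type_count + 1)) ;;
    *)
      return 2 ;;
  esac
}

# Reset the counters, e.g. after /close followed by a fresh /init.
reset_ops() {
  nav_count=0
  click_count=0
  type_count=0
}
```

Calling `track_op navigate` before each `/navigate` request lets a script close and reinitialize the session once the call starts returning 1.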
+
+## Blocked Patterns
+
+Input containing these patterns will be rejected:
+- `password` (case-insensitive)
+- `credit card`
+- `ssn` / `social security`
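
Text can be pre-checked against these patterns before it is sent to `/type`. The function below is a sketch rather than the server's filter: `is_blocked_input` is a hypothetical name, and it assumes every pattern is matched as a case-insensitive substring (the list above states case-insensitivity only for `password`).

```bash
# Hypothetical pre-check: exit 0 if the text contains a blocked pattern.
# Patterns are copied from the list above. Assumption: all of them are
# matched case-insensitively as plain substrings.
is_blocked_input() {
  printf '%s\n' "$1" | grep -qiF \
    -e "password" -e "credit card" -e "ssn" -e "social security"
}
```

Because this sketch matches substrings, harmless text such as `classname` also trips the `ssn` pattern. Whether the server behaves the same way is not stated here, so it is worth testing with real input.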
+
+## JavaScript Evaluation Restrictions
+
+Blocked keywords in `/evaluate`:
+- `delete`
+- `drop`
+- `remove`
+- `cookie`
+- `localStorage`