Initial commit

2025-11-29 18:50:53 +08:00
commit e8e7ede9e4
14 changed files with 2360 additions and 0 deletions
--- a/commands/ai-agent-create.md
+++ b/commands/ai-agent-create.md
@@ -0,0 +1,502 @@
+---
+name: ai-agent-create
+description: Create a new specialized AI agent with custom tools, handoff rules, and specific expertise for your multi-agent system
+model: sonnet
+---
+
+You are an expert in AI agent design and multi-agent system architecture.
+
+# Mission
+Create a new specialized agent file with:
+- Custom system prompt defining expertise
+- Optional tool definitions
+- Handoff rules to other agents
+- TypeScript type safety
+- Best practices for agent specialization
+
+# Usage
+
+User invokes: `/ai-agent-create [name] [specialization]`
+
+Examples:
+- `/ai-agent-create security-auditor "security vulnerability analysis"`
+- `/ai-agent-create api-designer "RESTful API design and OpenAPI specs"`
+- `/ai-agent-create data-analyst "data analysis and visualization"`
+- `/ai-agent-create frontend-optimizer "React performance optimization"`
+
+# Creation Process
+
+## 1. Parse Input
+
+Extract:
+- **Agent name** (kebab-case): `security-auditor`, `api-designer`, etc.
+- **Specialization** (description): What this agent is expert at
+
+If the name or specialization is missing, ask:
+```
+Please provide:
+1. Agent name (e.g., security-auditor)
+2. Specialization (e.g., "security vulnerability analysis")
+
+Example: /ai-agent-create security-auditor "security vulnerability analysis"
+```
+
+## 2. Determine Agent Category
+
+Based on specialization, classify agent type:
+
+**Code Quality Agents**:
+- `code-reviewer`, `security-auditor`, `performance-optimizer`, `refactoring-expert`
+- Focus: Code analysis, best practices, optimization
+
+**Implementation Agents**:
+- `backend-developer`, `frontend-developer`, `api-designer`, `database-architect`
+- Focus: Building features, writing code
+
+**Research Agents**:
+- `documentation-searcher`, `library-researcher`, `best-practices-finder`
+- Focus: Information gathering, analysis
+
+**Testing Agents**:
+- `test-writer`, `integration-tester`, `e2e-tester`, `qa-engineer`
+- Focus: Test creation, quality assurance
+
+**DevOps Agents**:
+- `deployment-specialist`, `ci-cd-expert`, `infrastructure-architect`
+- Focus: Deployment, infrastructure, automation
+
+**Domain Expert Agents**:
+- `ml-engineer`, `blockchain-expert`, `crypto-analyst`, `data-scientist`
+- Focus: Specialized domain knowledge
+
+## 3. Design Agent Architecture
+
+### System Prompt Template
+```
+You are a [SPECIALIZATION] expert. Your responsibilities:
+- [Primary responsibility 1]
+- [Primary responsibility 2]
+- [Primary responsibility 3]
+
+Expertise areas:
+- [Area 1]
+- [Area 2]
+- [Area 3]
+
+When you receive a task:
+1. [Step 1]
+2. [Step 2]
+3. [Step 3]
+4. Hand off to [next-agent] if [condition]
+
+Quality standards:
+- [Standard 1]
+- [Standard 2]
+- [Standard 3]
+```
+
+### Tools Design (if applicable)
+
+Decide whether the agent needs custom tools, based on its specialization:
+
+**Security Auditor** → needs:
+- `scanCode` - Static analysis
+- `checkDependencies` - Vulnerability scanning
+- `analyzeAuth` - Authentication review
+
+**API Designer** → needs:
+- `generateOpenAPI` - OpenAPI spec generation
+- `validateEndpoints` - API validation
+- `designRESTful` - REST best practices
+
+**Data Analyst** → needs:
+- `analyzeDataset` - Statistical analysis
+- `visualize` - Chart generation
+- `summarizeFindings` - Report creation
+
+### Handoff Rules
+
+Determine which agents this agent should hand off to:
+
+**Security Auditor** → hands off to:
+- `remediation-agent` (to fix vulnerabilities)
+- `coordinator` (when done)
+
+**API Designer** → hands off to:
+- `backend-developer` (to implement)
+- `test-writer` (to create tests)
+
+**Test Writer** → hands off to:
+- `reviewer` (to review tests)
+- `coordinator` (when done)
+
+## 4. Generate Agent File
+
+Create `agents/{agent-name}.ts`:
+
+### Example: Security Auditor Agent
+
+```typescript
+import { createAgent } from '@ai-sdk-tools/agents';
+import { anthropic } from '@ai-sdk/anthropic';
+import { z } from 'zod';
+
+export const securityAuditor = createAgent({
+  name: 'security-auditor',
+  model: anthropic('claude-3-5-sonnet-20241022'),
+
+  system: `You are a security vulnerability analysis expert. Your responsibilities:
+- Identify security vulnerabilities in code
+- Check for OWASP Top 10 issues
+- Analyze authentication and authorization flows
+- Review dependency security
+- Provide remediation recommendations
+
+Expertise areas:
+- SQL injection, XSS, CSRF prevention
+- Secure authentication (OAuth, JWT, sessions)
+- Authorization and access control
+- Secure data handling and encryption
+- Dependency vulnerability analysis
+
+When you receive code to audit:
+1. Scan for common vulnerabilities (OWASP Top 10)
+2. Check authentication/authorization implementation
+3. Review data handling and validation
+4. Check dependencies for known CVEs
+5. Provide severity ratings and remediation steps
+6. Hand off to remediation-agent if fixes needed
+
+Quality standards:
+- Zero high-severity vulnerabilities
+- All user input properly validated
+- Authentication follows best practices
+- Dependencies up-to-date and secure`,
+
+  tools: {
+    scanCode: {
+      description: 'Perform static security analysis on code',
+      parameters: z.object({
+        code: z.string().describe('Code to analyze'),
+        language: z.string().describe('Programming language'),
+        checkTypes: z.array(z.enum([
+          'sql-injection',
+          'xss',
+          'csrf',
+          'auth',
+          'data-exposure',
+          'input-validation'
+        ])).describe('Types of checks to perform')
+      }),
+      execute: async ({ code, language, checkTypes }) => {
+        // Implement security scanning logic
+        const findings = [];
+
+        // Example: Check for SQL injection
+        if (checkTypes.includes('sql-injection')) {
+          if (code.includes('execute(') && code.includes('req.body')) {
+            findings.push({
+              type: 'sql-injection',
+              severity: 'HIGH',
+              line: 'TBD',
+              description: 'Potential SQL injection via unsanitized user input',
+              remediation: 'Use parameterized queries or ORM'
+            });
+          }
+        }
+
+        // Example: Check for XSS
+        if (checkTypes.includes('xss')) {
+          if (code.includes('innerHTML') || code.includes('dangerouslySetInnerHTML')) {
+            findings.push({
+              type: 'xss',
+              severity: 'MEDIUM',
+              line: 'TBD',
+              description: 'Potential XSS via DOM manipulation',
+              remediation: 'Sanitize user input before rendering'
+            });
+          }
+        }
+
+        return {
+          findings,
+          summary: `Found ${findings.length} potential security issues`,
+          overallRisk: findings.some(f => f.severity === 'HIGH') ? 'HIGH' : findings.length > 0 ? 'MEDIUM' : 'LOW'
+        };
+      }
+    },
+
+    checkDependencies: {
+      description: 'Check dependencies for known vulnerabilities',
+      parameters: z.object({
+        packageFile: z.string().describe('package.json or requirements.txt content'),
+        ecosystem: z.enum(['npm', 'pypi', 'maven']).describe('Package ecosystem')
+      }),
+      execute: async ({ packageFile, ecosystem }) => {
+        // In real implementation, query vulnerability databases
+        return {
+          vulnerabilities: [],
+          outdatedPackages: [],
+          recommendations: []
+        };
+      }
+    },
+
+    analyzeAuth: {
+      description: 'Analyze authentication and authorization implementation',
+      parameters: z.object({
+        authCode: z.string().describe('Authentication/authorization code'),
+        authType: z.enum(['jwt', 'session', 'oauth', 'api-key']).describe('Auth type')
+      }),
+      execute: async ({ authCode, authType }) => {
+        const issues = [];
+
+        // Check for common auth issues
+        if (authType === 'jwt' && !authCode.includes('verify')) {
+          issues.push({
+            severity: 'HIGH',
+            issue: 'JWT tokens not verified',
+            remediation: 'Always verify JWT signatures'
+          });
+        }
+
+        return {
+          issues,
+          authStrength: issues.length === 0 ? 'STRONG' : 'WEAK',
+          recommendations: []
+        };
+      }
+    }
+  },
+
+  handoffTo: ['remediation-agent', 'coordinator']
+});
+```
+
+### Example: API Designer Agent
+
+```typescript
+import { createAgent } from '@ai-sdk-tools/agents';
+import { anthropic } from '@ai-sdk/anthropic';
+import { z } from 'zod';
+
+export const apiDesigner = createAgent({
+  name: 'api-designer',
+  model: anthropic('claude-3-5-sonnet-20241022'),
+
+  system: `You are a RESTful API design expert. Your responsibilities:
+- Design clean, RESTful API architectures
+- Create comprehensive OpenAPI/Swagger specifications
+- Ensure API best practices (versioning, pagination, error handling)
+- Design for scalability and maintainability
+
+Expertise areas:
+- REST principles and best practices
+- OpenAPI 3.0+ specification
+- API versioning strategies
+- Request/response design
+- Error handling and status codes
+- Authentication and rate limiting
+
+When you design an API:
+1. Understand the resource model and relationships
+2. Design resource URIs following REST principles
+3. Define HTTP methods and status codes
+4. Design request/response schemas
+5. Add authentication, pagination, filtering
+6. Generate OpenAPI specification
+7. Hand off to backend-developer for implementation
+
+Design principles:
+- Resources, not actions (GET /users, not GET /getUsers)
+- Proper HTTP status codes (200, 201, 400, 404, 500)
+- Consistent naming conventions (kebab-case or snake_case)
+- Comprehensive error messages
+- API versioning (v1, v2)`,
+
+  tools: {
+    generateOpenAPI: {
+      description: 'Generate OpenAPI 3.0 specification',
+      parameters: z.object({
+        apiName: z.string().describe('API name'),
+        version: z.string().describe('API version'),
+        resources: z.array(z.object({
+          name: z.string(),
+          methods: z.array(z.string()),
+          schema: z.any()
+        })).describe('API resources')
+      }),
+      execute: async ({ apiName, version, resources }) => {
+        const openapi = {
+          openapi: '3.0.0',
+          info: {
+            title: apiName,
+            version: version,
+            description: `${apiName} API`
+          },
+          paths: {} as Record<string, Record<string, unknown>>, // typed so string keys are allowed under "strict"
+          components: {
+            schemas: {}
+          }
+        };
+
+        // Generate paths and schemas for each resource
+        resources.forEach(resource => {
+          const path = `/${resource.name}`;
+          openapi.paths[path] = {};
+
+          resource.methods.forEach(method => {
+            openapi.paths[path][method.toLowerCase()] = {
+              summary: `${method} ${resource.name}`,
+              responses: {
+                '200': {
+                  description: 'Successful response'
+                }
+              }
+            };
+          });
+        });
+
+        return {
+          spec: openapi,
+          yaml: '# OpenAPI YAML would be here',
+          json: JSON.stringify(openapi, null, 2)
+        };
+      }
+    }
+  },
+
+  handoffTo: ['backend-developer', 'test-writer', 'coordinator']
+});
+```
+
+## 5. Register Agent
+
+Add to orchestration system in `index.ts`:
+
+```typescript
+import { [agentName] } from './agents/[agent-name]';
+
+const agents = [
+  coordinator,
+  // ... existing agents
+  [agentName]  // Add new agent
+];
+```
+
+## 6. Create Documentation
+
+Add agent documentation to README.md:
+
+````markdown
+### [Agent Name]
+
+**Specialization**: [Specialization description]
+
+**Responsibilities**:
+- [Responsibility 1]
+- [Responsibility 2]
+- [Responsibility 3]
+
+**Tools**:
+- `toolName` - Description
+
+**Handoffs**:
+- Hands off to [agent1] when [condition]
+- Hands off to [agent2] when [condition]
+
+**Example Usage**:
+```typescript
+// Through coordinator
+const result = await runMultiAgentTask(
+  'Audit this code for security vulnerabilities: [code]'
+);
+
+// Direct invocation
+const directResult = await [agentName].handle({
+  message: 'Task description',
+  context: {}
+});
+```
+````
+
+## 7. Create Test File
+
+Create `examples/test-[agent-name].ts`:
+
+```typescript
+import { [agentName] } from '../agents/[agent-name]';
+
+async function test() {
+  const result = await [agentName].handle({
+    message: 'Test task for agent',
+    context: {}
+  });
+
+  console.log('Result:', result);
+}
+
+test().catch(console.error);
+```
+
+# Output Format
+
+After creation, display:
+
+```
+✅ Agent created successfully!
+
+📁 Files created:
+   agents/[agent-name].ts
+   examples/test-[agent-name].ts
+
+🤖 Agent: [Agent Name]
+   Specialization: [Specialization]
+   Model: Claude 3.5 Sonnet
+   Tools: [X] custom tools
+   Handoffs: [agent1], [agent2]
+
+📝 Next steps:
+1. Review the agent in agents/[agent-name].ts
+2. Register in index.ts (agents array)
+3. Test with: npm run dev "Task for this agent"
+4. Or test directly: ts-node examples/test-[agent-name].ts
+
+💡 Integration:
+   Once registered in index.ts and listed in the coordinator's handoffTo,
+   the agent is available for routing. It can hand off tasks to: [agent1], [agent2]
+```
+
+# Agent Design Best Practices
+
+When creating agents, ensure:
+
+1. **Clear specialization** - Agent has one primary expertise
+2. **Well-defined responsibilities** - Specific, actionable tasks
+3. **Appropriate tools** - Tools match the agent's expertise
+4. **Smart handoffs** - Knows when to delegate to other agents
+5. **Quality standards** - Has measurable quality criteria
+6. **Error handling** - Gracefully handles edge cases
+7. **Context awareness** - Uses context from previous agents
+
+# Common Agent Patterns
+
+**Analyzer Pattern**:
+- Input: Raw data/code
+- Output: Analysis report
+- Handoff: To implementer or coordinator
+
+**Implementer Pattern**:
+- Input: Specifications
+- Output: Implementation
+- Handoff: To reviewer
+
+**Reviewer Pattern**:
+- Input: Implementation
+- Output: Review feedback
+- Handoff: Back to implementer or coordinator
+
+**Coordinator Pattern**:
+- Input: User request
+- Output: Routes to specialist
+- Handoff: To appropriate agent
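Review note on `commands/ai-agent-create.md`: steps 1, 4 and 5 depend on turning the kebab-case agent name into a file path (`agents/{agent-name}.ts`) and a camelCase export (such as `securityAuditor`). The following is a minimal sketch of that parsing, assuming the command arrives as a single raw string. `parseAgentCommand`, `AgentSpec` and `KEBAB_CASE` are hypothetical names used for illustration; they are not part of the command file or of `@ai-sdk-tools/agents`.

```typescript
// Hypothetical helper, for illustration only.
interface AgentSpec {
  name: string;           // kebab-case, e.g. "security-auditor"
  exportName: string;     // camelCase, e.g. "securityAuditor"
  filePath: string;       // e.g. "agents/security-auditor.ts"
  specialization: string; // free-text description
}

// Lowercase words separated by single hyphens; must start with a letter.
const KEBAB_CASE = /^[a-z][a-z0-9]*(-[a-z0-9]+)*$/;

function parseAgentCommand(input: string): AgentSpec | null {
  // Expected shape: /ai-agent-create <name> "<specialization>" (quotes optional)
  const match = input.trim().match(/^\/ai-agent-create\s+(\S+)\s+(.+)$/);
  if (!match) return null; // name or specialization missing: the caller should ask

  const name = match[1];
  if (!KEBAB_CASE.test(name)) return null;

  // Strip one pair of surrounding double quotes, if present.
  const specialization = match[2].trim().replace(/^"(.*)"$/, "$1").trim();
  if (specialization.length === 0) return null;

  // "security-auditor" becomes "securityAuditor"
  const exportName = name.replace(/-([a-z0-9])/g, (_, c: string) => c.toUpperCase());

  return { name, exportName, filePath: `agents/${name}.ts`, specialization };
}
```

Returning `null` rather than throwing keeps the "ask the user for the missing parts" branch from step 1 an ordinary control-flow path.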
--- a/commands/ai-agents-setup.md
+++ b/commands/ai-agents-setup.md
@@ -0,0 +1,430 @@
+---
+name: ai-agents-setup
+description: Initialize a multi-agent orchestration project with AI SDK v5 agents, complete with coordinator, specialized agents, and orchestration setup
+model: sonnet
+---
+
+You are an expert in multi-agent system architecture and AI SDK v5 orchestration.
+
+# Mission
+Set up a complete multi-agent orchestration project using @ai-sdk-tools/agents, including:
+- Project directory structure
+- Multiple specialized agents (coordinator, researcher, coder, reviewer)
+- Orchestration configuration
+- Environment setup for API keys
+- Example usage and testing scripts
+
+# Setup Process
+
+## 1. Check Dependencies
+First, verify the user has Node.js 18+ installed:
+```bash
+node --version
+```
+
+If not installed, guide them to install Node.js from https://nodejs.org/
+
+## 2. Create Project Structure
+```bash
+mkdir -p ai-agents-project
+cd ai-agents-project
+
+# Initialize npm project
+npm init -y
+
+# Install dependencies
+npm install @ai-sdk-tools/agents ai zod
+
+# Install AI provider SDKs (user chooses)
+npm install @ai-sdk/anthropic  # For Claude
+npm install @ai-sdk/openai     # For GPT-4
+npm install @ai-sdk/google     # For Gemini
+```
+
+## 3. Create Directory Structure
+```bash
+mkdir -p agents
+mkdir -p examples
+mkdir -p config
+```
+
+## 4. Create Agent Files
+
+### agents/coordinator.ts
+```typescript
+import { createAgent } from '@ai-sdk-tools/agents';
+import { anthropic } from '@ai-sdk/anthropic';
+
+export const coordinator = createAgent({
+  name: 'coordinator',
+  model: anthropic('claude-3-5-sonnet-20241022'),
+  system: `You are a coordinator agent responsible for:
+- Analyzing incoming requests
+- Routing to the most appropriate specialized agent
+- Managing handoffs between agents
+- Aggregating results from multiple agents
+- Returning cohesive final output
+
+Available agents:
+- researcher: Gathers information, searches documentation
+- coder: Implements code, follows specifications
+- reviewer: Reviews code quality, security, best practices
+
+When you receive a request:
+1. Analyze what's needed
+2. Route to the best agent
+3. Manage any necessary handoffs
+4. Return the final result`,
+
+  handoffTo: ['researcher', 'coder', 'reviewer']
+});
+```
+
+### agents/researcher.ts
+```typescript
+import { createAgent } from '@ai-sdk-tools/agents';
+import { anthropic } from '@ai-sdk/anthropic';
+import { z } from 'zod';
+
+export const researcher = createAgent({
+  name: 'researcher',
+  model: anthropic('claude-3-5-sonnet-20241022'),
+  system: `You are a research specialist. Your job is to:
+- Gather information from documentation
+- Search for best practices
+- Find relevant examples
+- Analyze technical requirements
+- Provide comprehensive research summaries
+
+Always provide sources and reasoning for your findings.`,
+
+  tools: {
+    search: {
+      description: 'Search for information',
+      parameters: z.object({
+        query: z.string().describe('Search query'),
+        sources: z.array(z.string()).optional().describe('Specific sources to search')
+      }),
+      execute: async ({ query, sources }) => {
+        // In real implementation, this would search docs, web, etc.
+        return {
+          results: `Research results for: ${query}`,
+          sources: sources || ['documentation', 'best practices']
+        };
+      }
+    }
+  },
+
+  handoffTo: ['coder', 'coordinator']
+});
+```
+
+### agents/coder.ts
+```typescript
+import { createAgent } from '@ai-sdk-tools/agents';
+import { anthropic } from '@ai-sdk/anthropic';
+
+export const coder = createAgent({
+  name: 'coder',
+  model: anthropic('claude-3-5-sonnet-20241022'),
+  system: `You are a code implementation specialist. Your job is to:
+- Write clean, production-ready code
+- Follow best practices and patterns
+- Implement features according to specifications
+- Write code that is testable and maintainable
+- Document your code appropriately
+
+When you complete implementation, hand off to reviewer for quality check.`,
+
+  handoffTo: ['reviewer', 'coordinator']
+});
+```
+
+### agents/reviewer.ts
+```typescript
+import { createAgent } from '@ai-sdk-tools/agents';
+import { anthropic } from '@ai-sdk/anthropic';
+
+export const reviewer = createAgent({
+  name: 'reviewer',
+  model: anthropic('claude-3-5-sonnet-20241022'),
+  system: `You are a code review specialist. Your job is to:
+- Review code quality and structure
+- Check for security vulnerabilities
+- Verify best practices are followed
+- Ensure code is testable and maintainable
+- Provide constructive feedback
+
+Provide a comprehensive review with:
+- What's good
+- What needs improvement
+- Security concerns (if any)
+- Overall quality score`
+});
+```
+
+## 5. Create Orchestration Setup
+
+### index.ts
+```typescript
+import { orchestrate } from '@ai-sdk-tools/agents';
+import { coordinator } from './agents/coordinator';
+import { researcher } from './agents/researcher';
+import { coder } from './agents/coder';
+import { reviewer } from './agents/reviewer';
+
+// Register all agents
+const agents = [coordinator, researcher, coder, reviewer];
+
+export async function runMultiAgentTask(task: string) {
+  console.log(`\n🤖 Starting multi-agent task: ${task}\n`);
+
+  const result = await orchestrate({
+    agents,
+    task,
+    coordinator, // Coordinator decides routing
+    maxDepth: 10, // Max handoff chain length
+    timeout: 300000, // 5 minutes
+
+    onHandoff: (event) => {
+      console.log(`\n🔄 Handoff: ${event.from} → ${event.to}`);
+      console.log(`   Reason: ${event.reason}\n`);
+    },
+
+    onComplete: (result) => {
+      console.log(`\n✅ Task complete!`);
+      console.log(`   Total handoffs: ${result.handoffCount}`);
+      console.log(`   Duration: ${result.duration}ms\n`);
+    }
+  });
+
+  return result;
+}
+
+// Example usage
+if (require.main === module) {
+  const task = process.argv[2] || 'Build a REST API with authentication';
+
+  runMultiAgentTask(task)
+    .then(result => {
+      console.log('\n📊 Final Result:\n');
+      console.log(result.output);
+    })
+    .catch(error => {
+      console.error('❌ Error:', error);
+      process.exit(1);
+    });
+}
+```
+
+## 6. Create Environment Setup
+
+### .env.example
+```bash
+# Choose your AI provider(s) and add the appropriate API keys
+
+# Anthropic (Claude)
+ANTHROPIC_API_KEY=your_anthropic_key_here
+
+# OpenAI (GPT-4)
+OPENAI_API_KEY=your_openai_key_here
+
+# Google (Gemini)
+GOOGLE_API_KEY=your_google_key_here
+```
+
+### .gitignore
+```
+node_modules/
+.env
+dist/
+*.log
+```
+
+## 7. Create Example Scripts
+
+### examples/code-generation.ts
+```typescript
+import { runMultiAgentTask } from '../index';
+
+async function example() {
+  const result = await runMultiAgentTask(
+    'Build a TypeScript REST API with user authentication, including tests and documentation'
+  );
+
+  console.log('Result:', result);
+}
+
+example().catch(console.error);
+```
+
+### examples/research-pipeline.ts
+```typescript
+import { runMultiAgentTask } from '../index';
+
+async function example() {
+  const result = await runMultiAgentTask(
+    'Research best practices for building scalable microservices with Node.js'
+  );
+
+  console.log('Result:', result);
+}
+
+example().catch(console.error);
+```
+
+## 8. Update package.json
+
+Add scripts to package.json:
+```json
+{
+  "scripts": {
+    "dev": "ts-node index.ts",
+    "example:code": "ts-node examples/code-generation.ts",
+    "example:research": "ts-node examples/research-pipeline.ts",
+    "build": "tsc",
+    "start": "node dist/index.js"
+  },
+  "devDependencies": {
+    "@types/node": "^20.0.0",
+    "ts-node": "^10.9.0",
+    "typescript": "^5.0.0"
+  }
+}
+```
+
+## 9. Create TypeScript Config
+
+### tsconfig.json
+```json
+{
+  "compilerOptions": {
+    "target": "ES2020",
+    "module": "commonjs",
+    "lib": ["ES2020"],
+    "outDir": "./dist",
+    "rootDir": "./",
+    "strict": true,
+    "esModuleInterop": true,
+    "skipLibCheck": true,
+    "forceConsistentCasingInFileNames": true,
+    "resolveJsonModule": true,
+    "declaration": true,
+    "declarationMap": true,
+    "sourceMap": true
+  },
+  "include": ["**/*.ts"],
+  "exclude": ["node_modules", "dist"]
+}
+```
+
+## 10. Create README
+
+### README.md
+````markdown
+# Multi-Agent Orchestration Project
+
+Built with AI SDK v5 and @ai-sdk-tools/agents
+
+## Setup
+
+1. Install dependencies:
+   ```bash
+   npm install
+   ```
+
+2. Configure API keys:
+   ```bash
+   cp .env.example .env
+   # Edit .env with your API keys
+   ```
+
+3. Run examples:
+   ```bash
+   npm run example:code
+   npm run example:research
+   ```
+
+## Available Agents
+
+- **coordinator** - Routes requests to specialized agents
+- **researcher** - Gathers information and best practices
+- **coder** - Implements features and writes code
+- **reviewer** - Reviews code quality and security
+
+## Usage
+
+```typescript
+import { runMultiAgentTask } from './index';
+
+const result = await runMultiAgentTask('Your task here');
+console.log(result.output);
+```
+
+## Architecture
+
+The system uses agent handoffs to coordinate complex tasks:
+1. Coordinator receives request
+2. Routes to appropriate specialist
+3. Specialists hand off to each other as needed
+4. Final result aggregated by coordinator
+````
+
+# Completion Steps
+
+After creating all files:
+
+1. **Install TypeScript tooling**:
+   ```bash
+   npm install -D typescript ts-node @types/node
+   ```
+
+2. **Create .env from example**:
+   ```bash
+   cp .env.example .env
+   echo "⚠️  Please edit .env and add your API keys"
+   ```
+
+3. **Test the setup**:
+   ```bash
+   npm run dev "Build a simple TODO API"
+   ```
+
+4. **Inform user**:
+   ```
+   ✅ Multi-agent project setup complete!
+
+   📁 Project structure:
+      agents/
+        ├── coordinator.ts
+        ├── researcher.ts
+        ├── coder.ts
+        └── reviewer.ts
+      examples/
+        ├── code-generation.ts
+        └── research-pipeline.ts
+      index.ts
+      .env.example
+      tsconfig.json
+      package.json
+      README.md
+
+   📝 Next steps:
+   1. Add your API keys to .env
+   2. Run: npm run dev "Your task here"
+   3. Try examples: npm run example:code
+
+   🤖 Your agents are ready to collaborate!
+   ```
+
+# Template Options
+
+Ask the user which template they want:
+
+1. **Basic** (default) - Coordinator + 3 specialists (researcher, coder, reviewer)
+2. **Research** - Research-focused agents (searcher, analyzer, synthesizer, reporter)
+3. **Content** - Content creation agents (researcher, writer, editor, SEO, publisher)
+4. **Support** - Customer support agents (triager, FAQ bot, technical, escalator)
+5. **DevOps** - DevOps agents (monitor, diagnoser, fixer, notifier)
+
+If user specifies a template, adjust the agents accordingly.
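Review note on `commands/ai-agents-setup.md`: every agent declares its `handoffTo` targets as plain strings, so a typo or an unregistered agent would only surface at run time. The following is a minimal sketch of a pre-flight check, assuming agent definitions can be described by just their `name` and `handoffTo` fields. `AgentLike`, `HandoffProblem` and `findHandoffProblems` are hypothetical names and not part of `@ai-sdk-tools/agents`; whether the objects returned by `createAgent` expose these fields is also an assumption.

```typescript
// Hypothetical shape: only the two fields used in the agent definitions are modelled.
interface AgentLike {
  name: string;
  handoffTo?: string[];
}

interface HandoffProblem {
  from: string;
  to: string;
  problem: "unknown-target" | "self-handoff";
}

// Report every handoff that points at the agent itself or at an unregistered name.
function findHandoffProblems(agents: AgentLike[]): HandoffProblem[] {
  const known = new Set(agents.map((a) => a.name));
  const problems: HandoffProblem[] = [];

  for (const agent of agents) {
    for (const target of agent.handoffTo ?? []) {
      if (target === agent.name) {
        problems.push({ from: agent.name, to: target, problem: "self-handoff" });
      } else if (!known.has(target)) {
        problems.push({ from: agent.name, to: target, problem: "unknown-target" });
      }
    }
  }
  return problems;
}

// The Basic template from the setup command, reduced to names and handoffs.
const basicTemplate: AgentLike[] = [
  { name: "coordinator", handoffTo: ["researcher", "coder", "reviewer"] },
  { name: "researcher", handoffTo: ["coder", "coordinator"] },
  { name: "coder", handoffTo: ["reviewer", "coordinator"] },
  { name: "reviewer" },
];
```

Run against `basicTemplate`, the check reports no problems. Adding the `security-auditor` example from `commands/ai-agent-create.md`, which hands off to `remediation-agent`, would report one `unknown-target` until a `remediation-agent` is also registered.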
--- a/commands/ai-agents-test.md
+++ b/commands/ai-agents-test.md
@@ -0,0 +1,530 @@
+---
+name: ai-agents-test
+description: Test your multi-agent system with a sample task, showing agent handoffs, routing decisions, and performance metrics
+model: sonnet
+---
+
+You are an expert in multi-agent system testing and observability.
+
+# Mission
+Test a multi-agent orchestration system by:
+- Running a sample task through the agent network
+- Showing real-time agent handoffs and routing
+- Displaying performance metrics (time, handoff count)
+- Validating agent coordination and output quality
+- Identifying bottlenecks or issues
+
+# Usage
+
+User invokes: `/ai-agents-test "Task description"`
+
+Examples:
+- `/ai-agents-test "Build a REST API with authentication"`
+- `/ai-agents-test "Research best practices for React performance"`
+- `/ai-agents-test "Debug this authentication error"`
+
+# Test Process
+
+## 1. Validate Setup
+
+First check if the multi-agent project exists:
+
+```bash
+# Check for required files
+if [ -f "index.ts" ] && [ -d "agents" ]; then
+  echo "✅ Multi-agent project found"
+else
+  echo "❌ Multi-agent project not found"
+  echo "💡 Run /ai-agents-setup first to create the project"
+  exit 1
+fi
+```
+
+## 2. Parse Test Query
+
+Extract the task from user input:
+- If provided: Use their task
+- If empty: Use default test task
+
+Default tasks by category:
+- **Code generation**: "Build a TODO API with CRUD operations"
+- **Research**: "Research microservices best practices"
+- **Debug**: "Why is my JWT authentication failing?"
+- **Review**: "Review this code for security issues"
+
+## 3. Start Test Execution
+
+Create a test runner script:
+
+### test-runner.ts
+```typescript
+import { runMultiAgentTask } from './index';
+
+interface TestMetrics {
+  startTime: number;
+  endTime?: number;
+  handoffs: Array<{
+    from: string;
+    to: string;
+    reason: string;
+    timestamp: number;
+  }>;
+  agentsInvolved: Set<string>;
+  totalDuration?: number;
+}
+
+async function testMultiAgentSystem(task: string) {
+  console.log('🚀 Multi-Agent System Test\n');
+  console.log('━'.repeat(60));
+  console.log(`📋 Task: ${task}`);
+  console.log('━'.repeat(60));
+  console.log('');
+
+  const metrics: TestMetrics = {
+    startTime: Date.now(),
+    handoffs: [],
+    agentsInvolved: new Set()
+  };
+
+  try {
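+    const result = await runMultiAgentTask(task);
+    metrics.endTime = Date.now();
+    metrics.totalDuration = metrics.endTime - metrics.startTime;
+    metrics.handoffs = result.metrics.handoffs; // recorded by the step 4 version of runMultiAgentTask
+    metrics.handoffs.forEach(h => metrics.agentsInvolved.add(h.from).add(h.to));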
+    // Display results
+    displayResults(result, metrics);
+
+    return { success: true, result, metrics };
+  } catch (error) {
+    console.error('❌ Test failed:', error);
+    return { success: false, error, metrics };
+  }
+}
+
+function displayResults(result: any, metrics: TestMetrics) {
+  console.log('\n' + '━'.repeat(60));
+  console.log('📊 Test Results');
+  console.log('━'.repeat(60));
+  console.log('');
+
+  // Success indicator
+  console.log('✅ Status: Task completed successfully\n');
+
+  // Metrics
+  console.log('⏱️  Performance Metrics:');
+  console.log(`   Total duration: ${metrics.totalDuration}ms (${(metrics.totalDuration! / 1000).toFixed(2)}s)`);
+  console.log(`   Handoff count: ${metrics.handoffs.length}`);
+  console.log(`   Agents involved: ${metrics.agentsInvolved.size}`);
+  console.log(`   Avg time per handoff: ${(metrics.totalDuration! / Math.max(metrics.handoffs.length, 1)).toFixed(0)}ms`);
+  console.log('');
+
+  // Agent flow
+  if (metrics.handoffs.length > 0) {
+    console.log('🔄 Agent Flow:');
+    const agentFlow = ['coordinator'];
+    metrics.handoffs.forEach(h => {
+      if (!agentFlow.includes(h.to)) {
+        agentFlow.push(h.to);
+      }
+    });
+    console.log(`   ${agentFlow.join(' → ')}`);
+    console.log('');
+  }
+
+  // Handoff details
+  if (metrics.handoffs.length > 0) {
+    console.log('🔀 Handoff Details:');
+    metrics.handoffs.forEach((handoff, i) => {
+      const duration = i < metrics.handoffs.length - 1
+        ? metrics.handoffs[i + 1].timestamp - handoff.timestamp
+        : metrics.endTime! - handoff.timestamp;
+
+      console.log(`   ${i + 1}. ${handoff.from} → ${handoff.to}`);
+      console.log(`      Reason: ${handoff.reason}`);
+      console.log(`      Duration: ${duration}ms`);
+      console.log('');
+    });
+  }
+
+  // Output summary
+  console.log('📝 Output Summary:');
+  const output = typeof result.output === 'string' ? result.output : (JSON.stringify(result.output, null, 2) ?? '');
+  const lines = output.split('\n');
+
+  if (lines.length > 20) {
+    console.log(lines.slice(0, 10).join('\n'));
+    console.log(`   ... (${lines.length - 20} more lines) ...`);
+    console.log(lines.slice(-10).join('\n'));
+  } else {
+    console.log(output);
+  }
+  console.log('');
+
+  // Quality assessment
+  console.log('🎯 Quality Assessment:');
+  const qualityScore = assessQuality(result, metrics);
+  console.log(`   Overall score: ${qualityScore.score}/100`);
+  console.log(`   Completeness: ${qualityScore.completeness}`);
+  console.log(`   Efficiency: ${qualityScore.efficiency}`);
+  console.log(`   Coordination: ${qualityScore.coordination}`);
+  console.log('');
+}
+
+function assessQuality(result: any, metrics: TestMetrics) {
+  let score = 100;
+  let completeness = '✅ Excellent';
+  let efficiency = '✅ Excellent';
+  let coordination = '✅ Excellent';
+
+  // Check completeness
+  const outputLength = (JSON.stringify(result.output) ?? '').length; // JSON.stringify(undefined) returns undefined
+  if (outputLength < 100) {
+    score -= 30;
+    completeness = '⚠️  Incomplete';
+  } else if (outputLength < 500) {
+    score -= 10;
+    completeness = '✅ Good';
+  }
+
+  // Check efficiency
+  const avgHandoffTime = metrics.totalDuration! / Math.max(metrics.handoffs.length, 1);
+  if (avgHandoffTime > 5000) {
+    score -= 20;
+    efficiency = '⚠️  Slow';
+  } else if (avgHandoffTime > 3000) {
+    score -= 10;
+    efficiency = '✅ Good';
+  }
+
+  // Check coordination
+  if (metrics.handoffs.length === 0) {
+    score -= 20;
+    coordination = '⚠️  No handoffs';
+  } else if (metrics.handoffs.length > 10) {
+    score -= 10;
+    coordination = '⚠️  Too many handoffs';
+  }
+
+  return {
+    score: Math.max(0, score),
+    completeness,
+    efficiency,
+    coordination
+  };
+}
+
+// CLI interface
+const task = process.argv[2];
+
+if (!task) {
+  console.error('❌ Error: Please provide a task to test');
+  console.log('');
+  console.log('Usage: ts-node test-runner.ts "Your task description"');
+  console.log('');
+  console.log('Examples:');
+  console.log('  ts-node test-runner.ts "Build a REST API with authentication"');
+  console.log('  ts-node test-runner.ts "Research React performance best practices"');
+  console.log('');
+  process.exit(1);
+}
+
+testMultiAgentSystem(task)
+  .then(({ success }) => {
+    process.exit(success ? 0 : 1);
+  })
+  .catch(error => {
+    console.error('Fatal error:', error);
+    process.exit(1);
+  });
+```
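+
+As a worked example of the efficiency rule in `assessQuality`: a 47823ms run with 4 handoffs averages about 11956ms per handoff, which is above the 5000ms threshold and therefore costs 20 points. The sketch below isolates that rule; `efficiencyDeduction` is an illustrative helper, not part of the test runner.
+
+```typescript
+// Mirrors the efficiency thresholds used by assessQuality.
+// Hypothetical helper for illustration only.
+function efficiencyDeduction(totalDurationMs: number, handoffCount: number): number {
+  // Guard against division by zero when no handoffs occurred.
+  const avg = totalDurationMs / Math.max(handoffCount, 1);
+  if (avg > 5000) return 20; // reported as "Slow"
+  if (avg > 3000) return 10; // reported as "Good"
+  return 0;                  // reported as "Excellent"
+}
+
+console.log(efficiencyDeduction(47823, 4));
+```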
+
+## 4. Enhanced Orchestration with Metrics
+
+Update `index.ts` to emit events for testing:
+
+```typescript
+export async function runMultiAgentTask(task: string, options?: {
+  onHandoff?: (event: HandoffEvent) => void;
+  onComplete?: (result: any) => void;
+  verbose?: boolean;
+}) {
+  const verbose = options?.verbose ?? true;
+
+  if (verbose) {
+    console.log(`\n🤖 Starting multi-agent task: ${task}\n`);
+  }
+
+  const handoffs: Array<{
+    from: string;
+    to: string;
+    reason: string;
+    timestamp: number;
+  }> = [];
+
+  const result = await orchestrate({
+    agents,
+    task,
+    coordinator,
+    maxDepth: 10,
+    timeout: 300000,
+
+    onHandoff: (event) => {
+      const handoffData = {
+        from: event.from,
+        to: event.to,
+        reason: event.reason,
+        timestamp: Date.now()
+      };
+
+      handoffs.push(handoffData);
+
+      if (verbose) {
+        console.log(`\n🔄 Handoff: ${event.from} → ${event.to}`);
+        console.log(`   Reason: ${event.reason}\n`);
+      }
+
+      options?.onHandoff?.(event);
+    },
+
+    // The parameter is named `finalResult` so it does not shadow the outer `result`.
+    onComplete: (finalResult) => {
+      if (verbose) {
+        console.log(`\n✅ Task complete!`);
+        console.log(`   Total handoffs: ${handoffs.length}`);
+        console.log(`   Agents: ${new Set(handoffs.flatMap(h => [h.from, h.to])).size}\n`);
+      }
+
+      options?.onComplete?.(finalResult);
+    }
+  });
+
+  return {
+    ...result,
+    metrics: {
+      handoffs,
+      agentCount: new Set(handoffs.flatMap(h => [h.from, h.to])).size
+    }
+  };
+}
+```
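+
+The `timestamp` stored with each handoff above can be turned into per-handoff durations and an agent flow line for a report. This is a minimal sketch, assuming a handoff's duration is the time elapsed since the previous handoff (or since the task started, for the first one). `HandoffRecord`, `handoffDurations`, `agentFlow`, and `startTime` are illustrative names, not part of `@ai-sdk-tools/agents`.
+
+```typescript
+// Shape of the entries collected in the `handoffs` array.
+interface HandoffRecord {
+  from: string;
+  to: string;
+  reason: string;
+  timestamp: number; // milliseconds since epoch, from Date.now()
+}
+
+// Assumption: duration = time since the previous handoff,
+// or since `startTime` for the first handoff.
+function handoffDurations(startTime: number, handoffs: HandoffRecord[]): number[] {
+  return handoffs.map((h, i) =>
+    h.timestamp - (i === 0 ? startTime : handoffs[i - 1].timestamp)
+  );
+}
+
+// Rebuilds a flow line such as "coordinator → researcher → coder".
+function agentFlow(handoffs: HandoffRecord[]): string {
+  if (handoffs.length === 0) return '';
+  return [handoffs[0].from, ...handoffs.map(h => h.to)].join(' → ');
+}
+```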
+
+## 5. Execute Test
+
+Run the test:
+
+```bash
+# Using ts-node
+ts-node test-runner.ts "Build a REST API with authentication"
+
+# Or using the npm script (arguments after `--` are passed to the script)
+npm run test:agents -- "Build a REST API with authentication"
+```
+
+## 6. Display Real-Time Progress
+
+Show live updates during execution:
+
+```
+🚀 Multi-Agent System Test
+
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📋 Task: Build a REST API with authentication
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+🔄 Handoff: coordinator → researcher
+   Reason: Need to research authentication best practices
+
+🔄 Handoff: researcher → coder
+   Reason: Research complete, ready to implement
+
+🔄 Handoff: coder → reviewer
+   Reason: Implementation complete, needs review
+
+🔄 Handoff: reviewer → coordinator
+   Reason: Review complete, all checks passed
+
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+📊 Test Results
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+
+✅ Status: Task completed successfully
+
+⏱️  Performance Metrics:
+   Total duration: 47823ms (47.82s)
+   Handoff count: 4
+   Agents involved: 4
+   Avg time per handoff: 11956ms
+
+🔄 Agent Flow:
+   coordinator → researcher → coder → reviewer → coordinator
+
+🔀 Handoff Details:
+   1. coordinator → researcher
+      Reason: Need to research authentication best practices
+      Duration: 8234ms
+
+   2. researcher → coder
+      Reason: Research complete, ready to implement
+      Duration: 23456ms
+
+   3. coder → reviewer
+      Reason: Implementation complete, needs review
+      Duration: 12389ms
+
+   4. reviewer → coordinator
+      Reason: Review complete, all checks passed
+      Duration: 3744ms
+
+📝 Output Summary:
+{
+  "api": "REST API with JWT authentication",
+  "features": [
+    "User registration",
+    "User login",
+    "JWT token generation",
+    "Protected routes",
+    "Token refresh"
+  ],
+  "security": {
+    "passwordHashing": "bcrypt",
+    "tokenExpiry": "1h",
+    "refreshToken": "7d"
+  },
+  "endpoints": [
+    "POST /api/auth/register",
+    "POST /api/auth/login",
+    "POST /api/auth/refresh",
+    "GET /api/users/me (protected)"
+  ],
+  "tests": "95% coverage"
+}
+
+🎯 Quality Assessment:
+   Overall score: 80/100
+   Completeness: ✅ Excellent
+   Efficiency: ⚠️  Slow
+   Coordination: ✅ Excellent
+```
+
+## 7. Add Test Script to package.json
+
+```json
+{
+  "scripts": {
+    "test:agents": "ts-node test-runner.ts"
+  }
+}
+```
+
+## 8. Create Pre-defined Test Scenarios
+
+Create `tests/scenarios.json`:
+
+```json
+{
+  "scenarios": [
+    {
+      "name": "Code Generation",
+      "task": "Build a REST API with authentication and CRUD operations",
+      "expectedAgents": ["coordinator", "researcher", "coder", "reviewer"],
+      "expectedHandoffs": 4,
+      "maxDuration": 60000
+    },
+    {
+      "name": "Research Task",
+      "task": "Research best practices for microservices architecture",
+      "expectedAgents": ["coordinator", "researcher"],
+      "expectedHandoffs": 2,
+      "maxDuration": 20000
+    },
+    {
+      "name": "Debug Task",
+      "task": "Debug JWT authentication failing with 401 errors",
+      "expectedAgents": ["coordinator", "researcher", "security-auditor"],
+      "expectedHandoffs": 3,
+      "maxDuration": 30000
+    },
+    {
+      "name": "Complex Pipeline",
+      "task": "Design, implement, test, and document a payment processing API",
+      "expectedAgents": ["coordinator", "api-designer", "coder", "test-writer", "reviewer"],
+      "expectedHandoffs": 6,
+      "maxDuration": 120000
+    }
+  ]
+}
+```
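+
+A run can be compared against one of these scenarios with a small checker. The sketch below assumes the fields shown in `tests/scenarios.json` and treats `expectedHandoffs` as an exact match; `RunSummary` and `checkScenario` are hypothetical names introduced for illustration.
+
+```typescript
+interface Scenario {
+  name: string;
+  task: string;
+  expectedAgents: string[];
+  expectedHandoffs: number;
+  maxDuration: number; // milliseconds
+}
+
+// Hypothetical summary of a finished run.
+interface RunSummary {
+  agents: string[];     // distinct agents that took part
+  handoffCount: number;
+  durationMs: number;
+}
+
+// Returns human-readable mismatches; an empty array means the run matched.
+function checkScenario(scenario: Scenario, run: RunSummary): string[] {
+  const problems: string[] = [];
+  const missing = scenario.expectedAgents.filter(a => !run.agents.includes(a));
+  if (missing.length > 0) {
+    problems.push(`missing agents: ${missing.join(', ')}`);
+  }
+  if (run.handoffCount !== scenario.expectedHandoffs) {
+    problems.push(`expected ${scenario.expectedHandoffs} handoffs, got ${run.handoffCount}`);
+  }
+  if (run.durationMs > scenario.maxDuration) {
+    problems.push(`took ${run.durationMs}ms, limit is ${scenario.maxDuration}ms`);
+  }
+  return problems;
+}
+```
+
+If exact handoff counts prove too strict in practice, the equality check could be relaxed to a range.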
+
+## 9. Troubleshooting
+
+If the test fails, check:
+
+```bash
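+# 1. Environment variables (this checks the current shell; a key defined only in .env is not visible here)
+if [ -z "$ANTHROPIC_API_KEY" ]; then
+  echo "❌ Error: ANTHROPIC_API_KEY not set"
+  echo "💡 Export it in your shell or add it to your .env file"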
+  exit 1
+fi
+
+# 2. Dependencies installed
+if [ ! -d "node_modules/@ai-sdk-tools/agents" ]; then
+  echo "❌ Error: Dependencies not installed"
+  echo "💡 Run: npm install"
+  exit 1
+fi
+
+# 3. Agents registered (only the researcher agent is checked here)
+if ! grep -q "researcher" index.ts; then
+  echo "⚠️  Warning: researcher agent not found in index.ts"
+fi
+```
+
+# Output Summary
+
+After test completion, show:
+
+```
+✅ Multi-agent test complete!
+
+📊 Results:
+   Status: Success
+   Duration: 47.8s
+   Agents: 4 (coordinator, researcher, coder, reviewer)
+   Handoffs: 4
+   Quality: 80/100
+
+🎯 Assessment:
+   ✅ All agents coordinated successfully
+   ✅ Task completed within expected time
+   ✅ Output quality meets standards
+
+💡 Recommendations:
+   - Average handoff time is slow (12.0s); try a faster model for test runs
+   - Consider adding more specialized agents for complex tasks
+
+📁 Full test output saved to: test-results-[timestamp].json
+```
+
+# Test Validation Criteria
+
+A successful test should have:
+- ✅ At least 2 agents involved (coordinator + 1 specialist)
+- ✅ Meaningful handoffs with clear reasons
+- ✅ Completion within timeout (5 minutes default)
+- ✅ Quality output (not just "task complete")
+- ✅ No errors or exceptions
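+
+The criteria above could be checked programmatically along these lines. This is a sketch under stated assumptions: "meaningful" handoffs are approximated by a non-empty reason, and "quality output" by a minimum of 100 characters (the same cut-off `assessQuality` uses for "Incomplete"). `TestRun` and `validateRun` are hypothetical names.
+
+```typescript
+// Hypothetical record of a finished test run.
+interface TestRun {
+  agents: string[];
+  handoffs: { from: string; to: string; reason: string }[];
+  durationMs: number;
+  output: string;
+  error?: string;
+}
+
+const DEFAULT_TIMEOUT_MS = 300000; // 5 minutes, matching the orchestrator timeout
+
+// Returns the list of failed criteria; an empty array means the run passed.
+function validateRun(run: TestRun, timeoutMs: number = DEFAULT_TIMEOUT_MS): string[] {
+  const failures: string[] = [];
+  if (new Set(run.agents).size < 2) {
+    failures.push('fewer than 2 agents involved');
+  }
+  // Assumption: a handoff is "meaningful" if it carries a non-empty reason.
+  if (run.handoffs.some(h => h.reason.trim() === '')) {
+    failures.push('handoff without a reason');
+  }
+  if (run.durationMs > timeoutMs) {
+    failures.push('exceeded timeout');
+  }
+  // Assumption: output shorter than 100 characters is not "quality output".
+  if (run.output.trim().length < 100) {
+    failures.push('output too short');
+  }
+  if (run.error) {
+    failures.push(`error: ${run.error}`);
+  }
+  return failures;
+}
+```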
+
+# Performance Benchmarks
+
+Expected performance ranges:
+- **Simple tasks** (research): 10-20 seconds, 2-3 handoffs
+- **Medium tasks** (code generation): 30-60 seconds, 3-5 handoffs
+- **Complex tasks** (full pipeline): 60-120 seconds, 5-8 handoffs
+
+If the actual duration is more than 2x the upper end of these ranges, investigate:
+- API rate limiting
+- Model selection (use faster models for testing)
+- Network latency
+- Agent prompt optimization
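+
+A run can be flagged for investigation with a small check against these ranges. The sketch assumes "exceeds by 2x" means more than twice the upper bound of the expected duration range; `TaskKind`, `EXPECTED_MAX_MS`, and `needsInvestigation` are illustrative names.
+
+```typescript
+type TaskKind = 'simple' | 'medium' | 'complex';
+
+// Upper bounds of the expected duration ranges, in milliseconds.
+const EXPECTED_MAX_MS: Record<TaskKind, number> = {
+  simple: 20000,   // 10-20 seconds
+  medium: 60000,   // 30-60 seconds
+  complex: 120000, // 60-120 seconds
+};
+
+// True when a run took more than twice the upper bound for its task kind.
+function needsInvestigation(kind: TaskKind, durationMs: number): boolean {
+  return durationMs > 2 * EXPECTED_MAX_MS[kind];
+}
+
+console.log(needsInvestigation('medium', 47823));
+console.log(needsInvestigation('simple', 47823));
+```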