Initial commit

Zhongwei Li
2025-11-29 18:48:58 +08:00
commit df092d8cd2
127 changed files with 62057 additions and 0 deletions


@@ -0,0 +1,571 @@
---
name: anthropic-architect
description: Determine the best Anthropic architecture for your project by analyzing requirements and recommending the optimal combination of Skills, Agents, Prompts, and SDK primitives.
---
# Anthropic Architect
Expert architectural guidance for Anthropic-based projects. Analyze your requirements and receive tailored recommendations on the optimal architecture using Skills, Agents, Subagents, Prompts, and SDK primitives.
## What This Skill Does
Helps you design the right Anthropic architecture for your project by:
- **Analyzing project requirements** - Understanding complexity, scope, and constraints
- **Recommending architectures** - Skills vs Agents vs Prompts vs SDK primitives
- **Applying decision rubrics** - Criteria-based architectural choices (complexity, reusability, context, security)
- **Following best practices** - 2025 Anthropic patterns and principles
- **Progressive disclosure design** - Efficient context management
- **Security considerations** - Safe, controllable AI systems
## Why Architecture Matters
**Without proper architecture:**
- Inefficient context usage and high costs
- Poor performance and slow responses
- Security vulnerabilities and risks
- Difficult to maintain and scale
- Agents reading entire skill contexts unnecessarily
- Mixed concerns and unclear boundaries
**With engineered architecture:**
- Optimal context utilization
- Fast, focused responses
- Secure, controlled operations
- Easy to maintain and extend
- Progressive disclosure of information
- Clear separation of concerns
- Scalable and reusable components
## Quick Start
### Analyze Your Project
```
Using the anthropic-architect skill, help me determine the best
architecture for: [describe your project]
Requirements:
- [List your key requirements]
- [Complexity level]
- [Reusability needs]
- [Security constraints]
```
### Get Architecture Recommendation
The skill will provide:
1. **Recommended architecture** - Specific primitives to use
2. **Decision reasoning** - Why this architecture fits
3. **Implementation guidance** - How to build it
4. **Best practices** - What to follow
5. **Example patterns** - Similar successful architectures
## The Four Anthropic Primitives
### 1. Skills (Prompt-Based Meta-Tools)
**What:** Organized folders of instructions, scripts, and resources that agents can discover and load dynamically.
**When to use:**
- ✅ Specialized domain knowledge needed
- ✅ Reusable across multiple projects
- ✅ Complex, multi-step workflows
- ✅ Reference materials required
- ✅ Progressive disclosure beneficial
**When NOT to use:**
- ❌ Simple, one-off tasks
- ❌ Project-specific logic only
- ❌ No need for reusability
**Example use cases:**
- Prompt engineering expertise
- Design system generation
- Code review guidelines
- Domain-specific knowledge (finance, medical, legal)
### 2. Agents/Subagents (Autonomous Task Handlers)
**What:** Specialized agents with independent system prompts, dedicated context windows, and specific tool permissions.
**When to use:**
- ✅ Complex, multi-step autonomous tasks
- ✅ Need for isolated context
- ✅ Different tool permissions required
- ✅ Parallel task execution
- ✅ Specialized expertise per task type
**When NOT to use:**
- ❌ Simple queries or lookups
- ❌ Shared context required
- ❌ Sequential dependencies
- ❌ Resource-constrained environments
**Example use cases:**
- Code exploration and analysis
- Test generation and execution
- Documentation generation
- Security audits
- Performance optimization
### 3. Direct Prompts (Simple Instructions)
**What:** Clear, explicit instructions passed directly to Claude without additional structure.
**When to use:**
- ✅ Simple, straightforward tasks
- ✅ One-time operations
- ✅ Quick questions or clarifications
- ✅ No need for specialization
- ✅ Minimal context required
**When NOT to use:**
- ❌ Complex, multi-step processes
- ❌ Need for reusability
- ❌ Requires domain expertise
- ❌ Multiple related operations
**Example use cases:**
- Code explanations
- Quick refactoring
- Simple bug fixes
- Documentation updates
- Direct questions
### 4. SDK Primitives (Custom Workflows)
**What:** Low-level building blocks from the Claude Agent SDK to create custom agent workflows.
**When to use:**
- ✅ Unique workflow requirements
- ✅ Custom tool integration needed
- ✅ Specific feedback loops required
- ✅ Integration with existing systems
- ✅ Fine-grained control needed
**When NOT to use:**
- ❌ Standard use cases covered by Skills/Agents
- ❌ Limited development resources
- ❌ Maintenance burden concern
- ❌ Faster time-to-market priority
**Example use cases:**
- Custom CI/CD integration
- Specialized code analysis pipelines
- Domain-specific automation
- Integration with proprietary systems
## Decision Rubric
Use this rubric to determine the right architecture:
### Task Complexity Analysis
**Low Complexity** → Direct Prompts
- Single operation
- Clear input/output
- No dependencies
- < 5 steps
**Medium Complexity** → Skills
- Multiple related operations
- Reusable patterns
- Reference materials helpful
- 5-20 steps
**High Complexity** → Agents/Subagents
- Multi-step autonomous workflow
- Needs isolated context
- Different tool permissions
- > 20 steps or parallel tasks
**Custom Complexity** → SDK Primitives
- Unique workflows
- System integration required
- Custom tools needed
- Specific feedback loops
### Reusability Assessment
**Single Use** → Direct Prompts
- One-time task
- Project-specific
- No future reuse
**Team Reuse** → Skills
- Multiple team members benefit
- Common workflows
- Shareable knowledge
**Organization Reuse** → Skills + Marketplace
- Cross-team benefit
- Standard patterns
- Company-wide knowledge
**Product Feature** → SDK Primitives
- End-user facing
- Production deployment
- Custom integration
### Context Management Needs
**Minimal Context** → Direct Prompts
- Self-contained task
- No external references
- Simple instructions
**Structured Context** → Skills
- Progressive disclosure needed
- Reference materials required
- Organized information
**Isolated Context** → Agents/Subagents
- Separate concerns
- Avoid context pollution
- Parallel execution
**Custom Context** → SDK Primitives
- Specific context handling
- Integration requirements
- Fine-grained control
### Security & Control Requirements
**Basic Safety** → Direct Prompts + Skills
- Standard guardrails
- No sensitive operations
- Read-only or low-risk
**Controlled Access** → Agents with Tool Restrictions
- Specific tool permissions
- Allowlist approach
- Confirmation required
**High Security** → SDK Primitives + Custom Controls
- Deny-all default
- Explicit confirmations
- Audit logging
- Custom security layers
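
The rubric above can be read as a small decision function. The following is a minimal sketch, not an official algorithm: the `ProjectProfile` fields and the order in which the checks are applied are assumptions made for illustration, while the step thresholds (fewer than 5, 5-20, more than 20) come from the Task Complexity Analysis.

```typescript
// Hypothetical types: these names are illustrative and not part of any SDK.
type Primitive = "direct-prompt" | "skill" | "agents" | "sdk";

interface ProjectProfile {
  steps: number;                   // estimated number of steps in the task
  reusedByOthers: boolean;         // will teammates or other projects reuse it?
  needsIsolatedContext: boolean;   // separate context or tool permissions needed
  needsCustomIntegration: boolean; // custom tools, feedback loops, product feature
}

function recommendPrimitive(p: ProjectProfile): Primitive {
  // Assumption: custom integration needs override the complexity tiers.
  if (p.needsCustomIntegration) return "sdk";
  // High complexity: more than 20 steps, or isolation required.
  if (p.steps > 20 || p.needsIsolatedContext) return "agents";
  // Medium complexity (5-20 steps), or anything worth reusing.
  if (p.steps >= 5 || p.reusedByOthers) return "skill";
  // Low complexity: fewer than 5 steps, single use.
  return "direct-prompt";
}
```

Treat the result as a starting point: a borderline project (for example, 6 steps but never reused) may still be better served by a direct prompt.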
## Architecture Patterns
### Pattern 1: Skills-First Architecture
**Use when:** Building reusable expertise and workflows
**Structure:**
```
Project
├── skills/
│   ├── domain-expert/
│   │   ├── SKILL.md
│   │   └── references/
│   │       ├── patterns.md
│   │       ├── best_practices.md
│   │       └── examples.md
│   └── workflow-automation/
│       ├── SKILL.md
│       └── scripts/
│           └── automate.sh
└── .claude/
    └── config
```
**Benefits:**
- Reusable across projects
- Progressive disclosure
- Easy to share and maintain
- Clear documentation
### Pattern 2: Agent-Based Architecture
**Use when:** Complex autonomous tasks with isolated concerns
**Structure:**
```
Main Agent (orchestrator)
├── Explore Agent (codebase analysis)
├── Plan Agent (task planning)
├── Code Agent (implementation)
└── Review Agent (validation)
```
**Benefits:**
- Parallel execution
- Isolated contexts
- Specialized expertise
- Clear responsibilities
### Pattern 3: Hybrid Architecture
**Use when:** Complex projects with varied requirements
**Structure:**
```
Main Conversation
├── Direct Prompts (simple tasks)
├── Skills (reusable expertise)
│ ├── code-review-skill
│ └── testing-skill
└── Subagents (complex workflows)
├── Explore Agent
└── Plan Agent
```
**Benefits:**
- Right tool for each task
- Optimal resource usage
- Flexible and scalable
- Best of all approaches
### Pattern 4: SDK Custom Architecture
**Use when:** Unique requirements or product features
**Structure:**
```
Custom Agent SDK Implementation
├── Custom Tools
├── Specialized Feedback Loops
├── System Integrations
└── Domain-Specific Workflows
```
**Benefits:**
- Full control
- Custom integration
- Unique workflows
- Production-ready
## Key Principles (2025)
### 1. Progressive Disclosure
**What:** Show only what's needed, when it's needed.
**Why:** Avoids context limits, reduces costs, improves performance.
**How:** Organize skills with task-based navigation, provide query tools, structure information hierarchically.
### 2. Context as Resource
**What:** Treat the context window as a precious, limited resource.
**Why:** Every token counts toward limits and costs.
**How:** Use progressive disclosure, prefer retrieval over dumping, compress aggressively, reset periodically.
### 3. Clear Instructions
**What:** Explicit, unambiguous directions.
**Why:** Claude 4.x responds best to clarity.
**How:** Be specific, define output format, provide examples, avoid vagueness.
### 4. Security by Design
**What:** Deny-all default, allowlist approach.
**Why:** Safe, controlled AI systems.
**How:** Limit tool access, require confirmations, audit operations, block dangerous commands.
### 5. Thinking Capabilities
**What:** Leverage Claude's extended thinking mode.
**Why:** Better results for complex reasoning.
**How:** Request step-by-step thinking, allow reflection after tool use, guide initial thinking.
### 6. Two-Message Pattern
**What:** Use meta messages for context without UI clutter.
**Why:** Clean UX while providing necessary context.
**How:** Set `isMeta: true` for system messages, use it for skill loading, and keep the UI focused.
## Reference Materials
All architectural patterns, decision frameworks, and examples are in the `references/` directory:
- **decision_rubric.md** - Comprehensive decision framework
- **architectural_patterns.md** - Detailed pattern catalog
- **best_practices.md** - 2025 Anthropic best practices
- **use_case_examples.md** - Real-world architecture examples
## Usage Examples
### Example 1: Determining Architecture for Content Generation
**Input:**
```
Using anthropic-architect, I need to build a system that:
- Generates blog posts from product features
- Ensures brand voice consistency
- Includes SEO optimization
- Reusable across marketing team
```
**Analysis:**
- Medium complexity (structured workflow)
- High reusability (team-wide)
- Domain expertise needed (content, SEO, brand)
- Progressive disclosure beneficial
**Recommendation:** **Skills-First Architecture**
- Create `content-generator` skill
- Include brand voice references
- SEO guidelines in references
- Example templates
- Progressive disclosure for different content types
### Example 2: Code Refactoring Tool
**Input:**
```
Using anthropic-architect, I want to:
- Analyze codebase for refactoring opportunities
- Generate refactoring plan
- Execute refactoring with tests
- Review and validate changes
```
**Analysis:**
- High complexity (multi-step, autonomous)
- Different contexts needed (explore, plan, code, review)
- Parallel execution beneficial
- Tool permissions vary by stage
**Recommendation:** **Agent-Based Architecture**
- Main orchestrator agent
- Explore subagent (read-only, codebase analysis)
- Plan subagent (planning, no execution)
- Code subagent (write permissions)
- Review subagent (validation, test execution)
### Example 3: Simple Code Review
**Input:**
```
Using anthropic-architect, I need to:
- Review this PR for bugs
- Check code style
- Suggest improvements
```
**Analysis:**
- Low complexity (single operation)
- One-time task
- No reusability needed
- Minimal context
**Recommendation:** **Direct Prompt**
- Simple, clear instructions
- No skill/agent overhead
- Fast execution
- Sufficient for task
### Example 4: Custom CI/CD Integration
**Input:**
```
Using anthropic-architect, I want to:
- Integrate Claude into CI pipeline
- Custom tool for deployment validation
- Specific workflow for our stack
- Production feature
```
**Analysis:**
- Custom complexity
- System integration required
- Production deployment
- Unique workflows
**Recommendation:** **SDK Primitives**
- Build custom agent with SDK
- Implement custom tools
- Create specialized feedback loops
- Integration with CI system
## Best Practices Checklist
When designing your architecture:
- [ ] Analyzed task complexity accurately
- [ ] Considered reusability requirements
- [ ] Evaluated context management needs
- [ ] Assessed security requirements
- [ ] Applied progressive disclosure where beneficial
- [ ] Chose simplest solution that works
- [ ] Documented architectural decisions
- [ ] Planned for maintenance and updates
- [ ] Considered cost implications
- [ ] Validated with prototype/POC
## Common Anti-Patterns
### Anti-Pattern 1: Over-Engineering
**Problem:** Using Agents/SDK for simple tasks
**Solution:** Start simple, scale complexity as needed
### Anti-Pattern 2: Context Dumping
**Problem:** Loading entire skills into context
**Solution:** Use progressive disclosure, query tools
### Anti-Pattern 3: Mixed Concerns
**Problem:** Single skill/agent doing too much
**Solution:** Separate concerns, use subagents or multiple skills
### Anti-Pattern 4: No Security Boundaries
**Problem:** Full tool access for all agents
**Solution:** Allowlist approach, minimal permissions
### Anti-Pattern 5: Ignoring Reusability
**Problem:** Recreating same prompts repeatedly
**Solution:** Extract to skills, share across projects
## Getting Started
### Step 1: Describe Your Project
Provide clear requirements, complexity level, and constraints.
### Step 2: Receive Recommendation
Get tailored architecture with reasoning.
### Step 3: Review Patterns
Explore similar successful architectures.
### Step 4: Implement
Follow implementation guidance.
### Step 5: Iterate
Refine based on results and feedback.
## Summary
The Anthropic Architect skill helps you:
- Choose the right primitives for your needs
- Design scalable, maintainable architectures
- Follow 2025 best practices
- Avoid common pitfalls
- Optimize for performance and cost
**Key Primitives:**
- **Skills** - Reusable domain expertise
- **Agents** - Autonomous complex workflows
- **Prompts** - Simple direct tasks
- **SDK** - Custom integrations
**Core Principles:**
- Progressive disclosure
- Context as resource
- Security by design
- Clear instructions
- Right tool for the job
---
**"The best architecture is the simplest one that meets your requirements."**


@@ -0,0 +1,965 @@
# Architectural Patterns
Comprehensive catalog of proven Anthropic architecture patterns for common use cases.
## Table of Contents
1. [Skills-First Patterns](#skills-first-patterns)
2. [Agent-Based Patterns](#agent-based-patterns)
3. [Hybrid Patterns](#hybrid-patterns)
4. [SDK Custom Patterns](#sdk-custom-patterns)
5. [Progressive Disclosure Patterns](#progressive-disclosure-patterns)
6. [Security Patterns](#security-patterns)
---
## Skills-First Patterns
### Pattern 1: Domain Expert Skill
**When to use:**
- Specialized domain knowledge needed
- Team-wide expertise sharing
- Consistent application of patterns
**Structure:**
```
domain-expert-skill/
├── SKILL.md
├── references/
│   ├── core_concepts.md
│   ├── patterns/
│   │   ├── pattern_1.md
│   │   ├── pattern_2.md
│   │   └── pattern_3.md
│   ├── best_practices.md
│   └── examples/
│       ├── example_1.md
│       └── example_2.md
└── scripts/
    └── validate.sh
```
**Example: Security Expert Skill**
```
security-expert/
├── SKILL.md
├── references/
│   ├── owasp_top_10.md
│   ├── patterns/
│   │   ├── authentication.md
│   │   ├── authorization.md
│   │   ├── encryption.md
│   │   └── input_validation.md
│   ├── threat_models.md
│   └── examples/
│       ├── secure_api.md
│       └── secure_storage.md
└── scripts/
    └── security_audit.sh
```
**Usage:**
```
Using the security-expert skill, review this authentication flow
for vulnerabilities following OWASP guidelines.
```
**Benefits:**
- Centralized expertise
- Consistent security reviews
- Team knowledge sharing
- Progressive disclosure of security patterns
---
### Pattern 2: Workflow Automation Skill
**When to use:**
- Repeatable multi-step workflows
- Team-wide process standardization
- Interactive script execution
**Structure:**
```
workflow-automation-skill/
├── SKILL.md
├── workflows/
│   ├── workflow_1.md
│   ├── workflow_2.md
│   └── workflow_3.md
├── scripts/
│   ├── step_1.sh
│   ├── step_2.sh
│   └── orchestrate.sh
└── templates/
    ├── template_1.md
    └── template_2.md
```
**Example: Release Management Skill**
```
release-management/
├── SKILL.md
├── workflows/
│   ├── version_bump.md
│   ├── changelog_generation.md
│   ├── deployment.md
│   └── rollback.md
├── scripts/
│   ├── bump_version.sh
│   ├── generate_changelog.sh
│   ├── deploy.sh
│   └── release.sh (orchestrator)
└── templates/
    ├── changelog_template.md
    └── release_notes.md
```
**Usage:**
```
Using the release-management skill, prepare a new release:
- Bump version to 2.1.0
- Generate changelog from commits
- Create release notes
```
**Benefits:**
- Standardized processes
- Reduced errors
- Faster execution
- Team consistency
---
### Pattern 3: Progressive Disclosure Skill
**When to use:**
- Large knowledge base
- Context limits are a concern
- Query-based information retrieval
**Structure:**
```
progressive-skill/
├── SKILL.md
├── query_tool.sh
├── index.md (high-level navigation)
├── expertise/
│   ├── by_task/
│   │   ├── task_1.md
│   │   └── task_2.md
│   ├── by_category/
│   │   ├── category_1.md
│   │   └── category_2.md
│   └── quick_reference/
│       └── cheat_sheet.md
└── examples/
    └── contextualized_examples.md
```
**Example: API Design Skill**
```
api-design-expert/
├── SKILL.md
├── query_expertise.sh
├── index.md
├── expertise/
│   ├── by_task/
│   │   ├── rest_api.md
│   │   ├── graphql.md
│   │   └── webhooks.md
│   ├── by_concern/
│   │   ├── authentication.md
│   │   ├── rate_limiting.md
│   │   ├── versioning.md
│   │   └── documentation.md
│   └── quick_reference/
│       ├── http_methods.md
│       └── status_codes.md
└── examples/
    └── real_world_apis.md
```
**Usage:**
```
Using api-design-expert, show me best practices for:
- RESTful resource design
- Authentication and authorization
- Rate limiting strategy
```
**Query Flow:**
1. Agent loads `index.md` (high-level navigation)
2. Agent identifies the relevant task file: `rest_api.md`
3. Skill provides only the REST API patterns
4. If authentication comes up, the agent loads `authentication.md`
5. Further files are loaded only when the task calls for them
**Benefits:**
- Minimal context usage
- Fast responses
- Scalable knowledge base
- Efficient token consumption
---
## Agent-Based Patterns
### Pattern 4: Multi-Phase Agent Pipeline
**When to use:**
- Complex workflows with distinct phases
- Different tool permissions per phase
- Isolated contexts beneficial
**Structure:**
```
Main Agent (orchestrator)
├── Phase 1: Explore Agent
│   Tools: [Read, Grep, Glob]
│   Context: Codebase exploration
│   Output: Insights, patterns, structure
├── Phase 2: Plan Agent
│   Tools: [TodoWrite]
│   Context: Insights from Phase 1
│   Output: Detailed plan
├── Phase 3: Execute Agent
│   Tools: [Read, Edit, Write]
│   Context: Plan + target files
│   Output: Code changes
└── Phase 4: Validate Agent
    Tools: [Bash, Read]
    Context: Changes + tests
    Output: Validation results
```
**Example: Feature Implementation**
```
Feature Implementation Pipeline
1. Explore Agent
   - Analyze existing codebase
   - Find related code patterns
   - Identify integration points
   → Output: Architecture analysis
2. Design Agent
   - Review analysis
   - Create detailed design
   - Plan file changes
   → Output: Implementation plan
3. Code Agent
   - Implement feature
   - Follow design plan
   - Write tests
   → Output: Code changes
4. Review Agent
   - Run tests
   - Check coverage
   - Validate functionality
   → Output: Pass/fail + feedback
5. Main Agent
   - Aggregate results
   - Report to user
   - Handle iterations
**Usage:**
```
Implement a new user authentication feature:
1. Analyze current auth system
2. Design JWT-based auth
3. Implement changes
4. Validate with tests
```
**Benefits:**
- Clear phase separation
- Minimal permissions per phase
- Isolated contexts (no pollution)
- Parallel execution possible for independent work within a phase (the phases themselves run in order)
- Easy to debug and iterate
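
The phase structure above can be sketched as data plus a small runner. This is a minimal illustration under stated assumptions: `Phase`, `PhaseRunner`, and `runPipeline` are hypothetical names, and the function that actually launches an agent is injected by the caller rather than taken from any SDK. The tool lists mirror the Structure diagram.

```typescript
// Hypothetical shapes: `Phase` and `PhaseRunner` are not SDK types.
interface Phase {
  name: string;
  tools: string[]; // allowlist that applies to this phase only
}

// The caller supplies the function that actually runs an agent for a phase.
type PhaseRunner = (phase: Phase, input: string) => Promise<string>;

const pipeline: Phase[] = [
  { name: "explore", tools: ["Read", "Grep", "Glob"] },
  { name: "plan", tools: ["TodoWrite"] },
  { name: "execute", tools: ["Read", "Edit", "Write"] },
  { name: "validate", tools: ["Bash", "Read"] },
];

// Runs phases in order; each phase receives only the previous phase's output,
// which keeps contexts isolated between phases.
async function runPipeline(
  phases: Phase[],
  task: string,
  run: PhaseRunner,
): Promise<string> {
  let carry = task;
  for (const phase of phases) {
    carry = await run(phase, carry);
  }
  return carry;
}
```

Passing only the previous output forward is a design choice: it keeps each context small, at the cost of losing detail that a later phase might want to revisit.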
---
### Pattern 5: Parallel Agent Execution
**When to use:**
- Independent tasks that can run simultaneously
- Large codebases or datasets
- Time-sensitive operations
**Structure:**
```
Coordinator Agent
├─┬─ Worker Agent 1: Task Subset 1
│ │
│ ├─ Worker Agent 2: Task Subset 2
│ │
│ ├─ Worker Agent 3: Task Subset 3
│ │
│ └─ Worker Agent 4: Task Subset 4
└── Aggregator Agent: Combine Results
```
**Example: Codebase Analysis**
```
Analysis Coordinator
├─┬─ Analyze Agent 1: /src/components/
│ │ → Component patterns
│ │
│ ├─ Analyze Agent 2: /src/services/
│ │ → Service patterns
│ │
│ ├─ Analyze Agent 3: /src/utils/
│ │ → Utility patterns
│ │
│ └─ Analyze Agent 4: /tests/
│ → Test coverage
└── Aggregator Agent
→ Combined analysis report
```
**Usage:**
```
Analyze entire codebase for:
- Code patterns
- Test coverage
- Performance issues
- Security vulnerabilities
Execute in parallel for speed.
```
**Benefits:**
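- Significant speedup (up to roughly 4x with 4 agents when tasks are independent and similar in size; coordination and aggregation overhead reduce the gain)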
- Independent execution
- Resource optimization
- Scalable to large codebases
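
A fan-out/aggregate coordinator along these lines might look as follows. This is a sketch under assumptions: `analyze` stands in for whatever launches a worker agent, and `analyzeInParallel` is a hypothetical helper, not an SDK function. A failing worker is recorded in the report instead of aborting the whole run.

```typescript
// Hypothetical helper type: `Analyze` stands in for launching one worker agent.
type Analyze = (dir: string) => Promise<string>;

async function analyzeInParallel(
  dirs: string[],
  analyze: Analyze,
): Promise<Record<string, string>> {
  // Fan out: start every worker before awaiting any of them.
  const entries = await Promise.all(
    dirs.map(async (dir): Promise<[string, string]> => {
      try {
        return [dir, await analyze(dir)];
      } catch (err) {
        // Record the failure so one bad worker does not sink the others.
        return [dir, `failed: ${String(err)}`];
      }
    }),
  );
  // Aggregate: combine per-directory results into one report.
  const report: Record<string, string> = {};
  for (const [dir, result] of entries) {
    report[dir] = result;
  }
  return report;
}
```

For the codebase analysis example, `dirs` would be `["/src/components/", "/src/services/", "/src/utils/", "/tests/"]`.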
---
### Pattern 6: Specialist Agent Team
**When to use:**
- Different expertise areas needed
- Collaborative task execution
- Review and validation workflows
**Structure:**
```
Project Lead Agent (orchestrator)
├── Frontend Specialist Agent
│   Expertise: UI/UX, components, accessibility
│   Tools: Frontend-specific
├── Backend Specialist Agent
│   Expertise: APIs, databases, services
│   Tools: Backend-specific
├── DevOps Specialist Agent
│   Expertise: CI/CD, deployment, infrastructure
│   Tools: DevOps-specific
└── QA Specialist Agent
    Expertise: Testing, validation, quality
    Tools: Testing-specific
```
**Example: Full-Stack Feature**
```
Feature: Add user profile page
Project Lead Agent
├── Frontend Agent
│   - Create profile component
│   - Add routing
│   - Implement responsive design
│   Load: frontend-designer skill
├── Backend Agent
│   - Create profile API endpoint
│   - Add database queries
│   - Implement validation
│   Load: api-design-expert skill
├── DevOps Agent
│   - Update deployment config
│   - Add environment variables
│   - Configure monitoring
│   Load: devops-expert skill
└── QA Agent
    - Write integration tests
    - Validate end-to-end flow
    - Check accessibility
    Load: qa-test-planner skill
```
**Usage:**
```
Implement user profile feature across the stack:
- Frontend: Profile page with edit capability
- Backend: CRUD API for profile data
- DevOps: Deploy to staging
- QA: Full test coverage
```
**Benefits:**
- Specialized expertise
- Parallel execution
- Skills loading per specialist
- Clear ownership
- Comprehensive coverage
---
## Hybrid Patterns
### Pattern 7: Agents + Skills Hybrid
**When to use:**
- Complex workflows needing domain expertise
- Reusable knowledge + autonomous execution
- Best of both approaches
**Structure:**
```
Agent Workflow
├── Explore Agent
│   Loads: codebase-analysis-skill
│   Expertise: Pattern recognition
│   Tools: [Read, Grep, Glob]
├── Plan Agent
│   Loads: architecture-patterns-skill
│   Expertise: Design patterns
│   Tools: [TodoWrite]
└── Execute Agent
    Loads: coding-standards-skill
    Expertise: Team conventions
    Tools: [Read, Edit, Write]
```
**Example: Refactoring Pipeline**
```
Refactoring Workflow
1. Analysis Agent
   Load: code-quality-skill
   - Identify code smells
   - Find duplication
   - Analyze complexity
   Use skill's: anti-patterns reference
2. Planning Agent
   Load: refactoring-patterns-skill
   - Choose refactoring patterns
   - Plan step-by-step changes
   - Estimate risk
   Use skill's: safe refactoring strategies
3. Execution Agent
   Load: team-coding-standards-skill
   - Apply refactorings
   - Follow team style
   - Maintain tests
   Use skill's: style guide and examples
4. Validation Agent
   Load: testing-strategies-skill
   - Run test suite
   - Check coverage
   - Verify behavior
   Use skill's: test validation patterns
```
**Usage:**
```
Refactor the UserService class:
- Load relevant skills for expertise
- Use agents for autonomous execution
- Progressive disclosure of skill knowledge
- Isolated contexts per phase
```
**Benefits:**
- Reusable expertise (skills)
- Autonomous execution (agents)
- Progressive disclosure
- Isolated contexts
- Best of both worlds
---
### Pattern 8: Prompts + Skills Fallback
**When to use:**
- Start simple, escalate complexity
- Most tasks simple, some complex
- Cost optimization
**Structure:**
```
Task Router
├── Simple task? → Direct Prompt
├── Needs expertise? → Load Skill
│   ├── Still simple? → Skill + Prompt
│   └── Complex? → Skill + Agent
└── Very complex? → Agents + Skills
```
**Example: Code Review System**
```
Code Review Router
1. Check complexity
   - Small PR (< 50 lines)? → Direct Prompt
     "Review this PR for basic issues"
   - Medium PR (50-500 lines)? → Load Skill
     Using code-review-skill, review this PR
     Skill provides: checklist, patterns
   - Large PR (> 500 lines)? → Agent + Skill
     Review Agent loads code-review-skill
     - Explore codebase context
     - Apply review checklist
     - Generate comprehensive review
```
**Benefits:**
- Cost-effective (simple → cheap)
- Scalable (complex → powerful)
- Flexible (adapts to task)
- Progressive enhancement
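
The code review router above reduces to a threshold check on PR size. A minimal sketch, assuming changed-line count is the only signal; `ReviewStrategy` and `routeReview` are hypothetical names, and the 50 and 500 line thresholds are the ones from the example.

```typescript
// Hypothetical names for illustration; thresholds come from the router example.
type ReviewStrategy = "direct-prompt" | "skill" | "agent-with-skill";

function routeReview(changedLines: number): ReviewStrategy {
  if (changedLines < 50) return "direct-prompt"; // small PR
  if (changedLines <= 500) return "skill";       // medium PR (50-500 lines)
  return "agent-with-skill";                     // large PR (> 500 lines)
}
```

In practice line count is a rough proxy: a 30-line change to authentication code may deserve the skill-backed review anyway.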
---
## SDK Custom Patterns
### Pattern 9: Custom Tool Integration
**When to use:**
- Integration with proprietary systems
- Custom tools needed
- Domain-specific operations
**Structure:**
```typescript
// Illustrative sketch: `Agent` and `Tool` are simplified stand-ins, and the
// published Claude Agent SDK may expose different names and signatures.
import { Agent, Tool } from 'claude-agent-sdk';

// Define custom tool
const customTool: Tool = {
  name: 'custom-tool',
  description: 'Interacts with proprietary system',
  execute: async (params) => {
    // Custom implementation (`proprietarySystem` is your own client)
    return await proprietarySystem.call(params);
  }
};

// Create agent with custom tool
const agent = new Agent({
  tools: [customTool, ...standardTools],
  systemPrompt: 'You are an expert with custom-system access'
});
```
**Example: CRM Integration Agent**
```typescript
// Custom CRM tools
const getCRMData: Tool = {
  name: 'get-crm-data',
  description: 'Fetch customer data from CRM',
  execute: async ({ customerId }) => {
    return await crmAPI.getCustomer(customerId);
  }
};

const updateCRMData: Tool = {
  name: 'update-crm-data',
  description: 'Update customer data in CRM',
  execute: async ({ customerId, data }) => {
    return await crmAPI.updateCustomer(customerId, data);
  }
};

// Agent with CRM access
const crmAgent = new Agent({
  tools: [getCRMData, updateCRMData],
  systemPrompt: `You are a CRM assistant with access to customer data.
Help users query and update customer information.`
});
```
**Usage:**
```typescript
const response = await crmAgent.run({
  task: 'Get customer data for customer ID 12345 and update their email'
});
```
**Benefits:**
- Direct system integration
- Custom business logic
- Proprietary data access
- Fine-grained control
---
### Pattern 10: Custom Feedback Loop
**When to use:**
- Specific workflow requirements
- Unique validation logic
- Custom iteration patterns
**Structure:**
```typescript
const customWorkflow = async (task: Task) => {
  let result;
  let iterations = 0;
  const maxIterations = 5;
  while (iterations < maxIterations) {
    // Step 1: Generate
    result = await agent.generate(task);
    // Step 2: Custom validation
    const validation = await customValidator(result);
    // Step 3: Custom decision
    if (validation.passed) {
      break; // Success
    }
    // Step 4: Custom feedback
    task = customFeedback(task, validation.issues);
    iterations++;
  }
  return result;
};
```
**Example: Code Generation with Custom Linter**
```typescript
const codeGenerationWorkflow = async (spec: Specification) => {
  let code;
  let passed = false;
  let attempt = 0;
  while (attempt < 3) {
    // Generate code
    code = await codeAgent.generate(spec);
    // Custom linter validation
    const lintResults = await customLinter.check(code);
    if (lintResults.errors.length === 0) {
      // Passed linting
      passed = true;
      break;
    }
    // Feed lint errors into the next attempt
    spec = addLintingFeedback(spec, lintResults.errors);
    attempt++;
  }
  // Do not return code that never passed linting
  if (!passed) {
    throw new Error('Code still fails linting after 3 attempts');
  }
  // Custom post-processing
  code = await customFormatter.format(code);
  return code;
};
```
**Benefits:**
- Custom validation logic
- Specific iteration patterns
- Business rule enforcement
- Unique workflows
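
The generate/validate/feedback loop can also be written without any agent dependency, which makes it easy to test with stubs. The sketch below is one possible shape, not the SDK's API: `refine`, `Validation`, and `LoopResult` are hypothetical names, and the generator and validator are injected by the caller. Unlike returning the last output unconditionally, it reports whether validation ever passed.

```typescript
// Hypothetical types for illustration; none of these come from an SDK.
interface Validation {
  passed: boolean;
  issues: string[];
}

interface LoopResult<T> {
  output: T;
  passed: boolean;  // false when attempts ran out before validation passed
  attempts: number;
}

async function refine<T>(
  generate: (feedback: string[]) => Promise<T>,
  validate: (output: T) => Promise<Validation>,
  maxAttempts: number,
): Promise<LoopResult<T>> {
  if (maxAttempts < 1) throw new Error("maxAttempts must be at least 1");
  // First attempt runs with no feedback.
  let output = await generate([]);
  let verdict = await validate(output);
  let attempts = 1;
  // Retry with the validator's issues as feedback until it passes or we run out.
  while (!verdict.passed && attempts < maxAttempts) {
    output = await generate(verdict.issues);
    verdict = await validate(output);
    attempts++;
  }
  return { output, passed: verdict.passed, attempts };
}
```

Callers can then decide what an exhausted loop means for them: throw, fall back to a human, or accept the best effort.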
---
## Progressive Disclosure Patterns
### Pattern 11: Query-Based Disclosure
**When to use:**
- Large knowledge bases
- Context optimization critical
- Task-specific information needed
**Structure:**
```
skill/
├── SKILL.md (high-level overview)
├── query.sh (interactive query tool)
├── index.md (navigation)
└── content/
    ├── topic_1/
    │   ├── overview.md (loaded first)
    │   ├── detailed.md (on demand)
    │   └── examples.md (when requested)
    └── topic_2/
        ├── overview.md
        ├── detailed.md
        └── examples.md
```
**Query Pattern:**
```
# User queries skill
"Using skill, how do I implement authentication?"
# Skill loading strategy
1. Load: index.md (< 500 tokens)
   → Find relevant topic: authentication
2. Load: authentication/overview.md (< 1000 tokens)
   → Provides high-level guidance
3. If user needs more:
   Load: authentication/detailed.md
   → In-depth patterns
4. If user wants examples:
   Load: authentication/examples.md
   → Real code samples
```
**Benefits:**
- Minimal initial context
- Load more as needed
- Efficient token usage
- Fast initial response
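
The loading strategy in the Query Pattern can be expressed as a small planner that lists which files to read for a given request. This is a sketch under assumptions: `DisclosureRequest` and `filesToLoad` are hypothetical names, and the file paths follow the Query Pattern above (`index.md`, then `<topic>/overview.md`, with deeper files only on request).

```typescript
// Hypothetical request shape for illustration.
interface DisclosureRequest {
  topic: string;
  wantsDetail?: boolean;   // user asked for more depth
  wantsExamples?: boolean; // user asked for code samples
}

// Returns the files to load, in order, starting from the cheapest.
function filesToLoad(req: DisclosureRequest): string[] {
  const files = ["index.md", `${req.topic}/overview.md`];
  if (req.wantsDetail) files.push(`${req.topic}/detailed.md`);
  if (req.wantsExamples) files.push(`${req.topic}/examples.md`);
  return files;
}
```

A plain question about authentication loads two files; the detailed patterns and examples stay out of context until someone asks for them.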
---
### Pattern 12: Hierarchical Disclosure
**When to use:**
- Complex topics with depth
- Progressive learning needed
- Multiple expertise levels
**Structure:**
```
skill/
├── level_1_basics/
│   └── (fundamental concepts)
├── level_2_intermediate/
│   └── (common patterns)
├── level_3_advanced/
│   └── (complex techniques)
└── level_4_expert/
    └── (edge cases, optimization)
```
**Disclosure Flow:**
```
User: "Help me with caching"
Skill responds:
├─ Level 1: Basic caching concepts
│ User: "I know basics, show me patterns"
├─ Level 2: Common caching patterns
│ User: "Show me advanced optimization"
├─ Level 3: Cache optimization techniques
│ User: "What about distributed caching?"
└─ Level 4: Distributed caching strategies
```
**Benefits:**
- Tailored to user expertise
- Avoids overwhelming the user
- Progressive depth
- Efficient learning
---
## Security Patterns
### Pattern 13: Allowlist Security Pattern
**When to use:**
- Sensitive operations
- Controlled tool access
- Security-critical applications
**Structure:**
```typescript
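// Illustrative configuration: `allowlistOnly`, `denylist`, and `confirmations`
// are hypothetical options shown to convey the pattern, not confirmed SDK fields.
const secureAgent = new Agent({
  tools: allowlistOnly([
    'Read', // Safe: read-only
    'Grep', // Safe: read-only
    'Glob', // Safe: read-only
  ]),
  denylist: [
    'rm -rf',
    'sudo',
    'curl', // Could leak data
    'wget', // Could leak data
  ],
  confirmations: [
    'git push',
    'deployment',
    'data deletion'
  ]
});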
```
**Example: Production Agent**
```typescript
const productionAgent = new Agent({
  name: 'production-agent',
  // Minimal permissions
  tools: [
    'Read', // View configs
    'Grep', // Search logs
  ],
  // Block dangerous operations
  denylist: [
    'rm',
    'delete',
    'drop',
    'truncate',
    'sudo',
    'chmod',
    'exec'
  ],
  // Require confirmation
  confirmations: [
    'restart service',
    'change config',
    'modify database'
  ],
  // Audit all operations
  audit: {
    enabled: true,
    logLevel: 'verbose',
    destination: 'security-log'
  }
});
```
**Benefits:**
- Deny-all default
- Explicit permissions
- Confirmations for sensitive ops
- Full audit trail
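
Independently of how an agent is configured, the deny-all default can be captured in a small decision function. A minimal sketch, assuming tool names and free-text request descriptions are the only inputs; `ToolPolicy`, `Decision`, and `decide` are hypothetical names rather than SDK types.

```typescript
// Hypothetical types for illustration.
type Decision = "allow" | "confirm" | "deny";

interface ToolPolicy {
  allowedTools: string[];   // anything not listed here is denied
  confirmPhrases: string[]; // phrases that require human approval
}

function decide(policy: ToolPolicy, tool: string, request: string): Decision {
  // Deny-all default: a tool missing from the allowlist is rejected outright.
  if (!policy.allowedTools.includes(tool)) return "deny";
  // Allowed tools may still need a human in the loop for sensitive requests.
  const needsApproval = policy.confirmPhrases.some((phrase) => request.includes(phrase));
  return needsApproval ? "confirm" : "allow";
}
```

Checking the allowlist first means a new, unreviewed tool is blocked by default instead of slipping through because no deny rule mentions it.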
---
### Pattern 14: Defense in Depth
**When to use:**
- High security requirements
- Multiple security layers needed
- Critical systems
**Structure:**
```
Security Layers:
Layer 1: Tool Allowlist
  → Only approved tools
Layer 2: Command Validation
  → Validate command safety
Layer 3: Confirmation Required
  → Human approval for sensitive ops
Layer 4: Sandbox Execution
  → Isolated environment
Layer 5: Audit Logging
  → Full operation trail
Layer 6: Rollback Capability
  → Undo mechanism
```
**Example: Financial System Agent**
```typescript
const financialAgent = new Agent({
  // Layer 1: Tool Allowlist
  tools: allowlistOnly(['Read', 'Grep']),
  // Layer 2: Command Validation
  preExecute: async (command) => {
    return await securityValidator.validate(command);
  },
  // Layer 3: Confirmation Required
  confirmations: 'all',
  // Layer 4: Sandbox
  sandbox: {
    enabled: true,
    isolated: true,
    networkBlocked: true
  },
  // Layer 5: Audit
  audit: {
    enabled: true,
    level: 'detailed',
    retention: '7years',
    immutable: true
  },
  // Layer 6: Rollback
  rollback: {
    enabled: true,
    autoSnapshot: true,
    quickRevert: true
  }
});
```
**Benefits:**
- Multiple security layers
- Defense against various threats
- Compliance ready
- Maximum security
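One way to picture the layering is as an ordered list of checks where the first failure stops the request. The sketch below covers only the first three layers and uses hypothetical names (`ToolRequest`, `Layer`, `evaluate`); sandboxing, audit, and rollback would wrap the execution step and are not shown.

```typescript
// Hypothetical request and layer types, introduced for illustration only.
interface ToolRequest { tool: string; command: string; approved: boolean; }
interface Layer { name: string; check: (req: ToolRequest) => string | null; }

const ALLOWED_TOOLS = new Set(['Read', 'Grep']);

// Each layer returns null to pass, or a rejection reason.
const layers: Layer[] = [
  { name: 'allowlist', check: (r) => (ALLOWED_TOOLS.has(r.tool) ? null : `tool ${r.tool} is not allowlisted`) },
  { name: 'validation', check: (r) => (/\b(sudo|rm)\b/.test(r.command) ? 'dangerous command' : null) },
  { name: 'confirmation', check: (r) => (r.approved ? null : 'human approval missing') },
];

// Run the layers in order; report the first one that rejects the request.
function evaluate(req: ToolRequest): { allowed: boolean; failedLayer?: string; reason?: string } {
  for (const layer of layers) {
    const reason = layer.check(req);
    if (reason !== null) return { allowed: false, failedLayer: layer.name, reason };
  }
  return { allowed: true };
}
```

Returning the name of the failing layer makes it easy to write that decision to the audit log.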
---
## Summary
Choose patterns based on your requirements:
**Simple Tasks:** Direct Prompts
**Reusable Expertise:** Skills (Patterns 1-3)
**Complex Workflows:** Agents (Patterns 4-6)
**Best of Both:** Hybrid (Patterns 7-8)
**Custom Needs:** SDK (Patterns 9-10)
**Large Knowledge:** Progressive Disclosure (Patterns 11-12)
**Security Critical:** Security Patterns (Patterns 13-14)
**Key Principle:** Start simple, add complexity only when needed.


@@ -0,0 +1,936 @@
# Anthropic Architecture Best Practices (2025)
Proven best practices for designing, building, and maintaining Anthropic-based systems.
## Table of Contents
1. [Core Design Principles](#core-design-principles)
2. [Progressive Disclosure](#progressive-disclosure)
3. [Context Management](#context-management)
4. [Security & Safety](#security--safety)
5. [Performance Optimization](#performance-optimization)
6. [Skill Design](#skill-design)
7. [Agent Design](#agent-design)
8. [Testing & Validation](#testing--validation)
9. [Maintenance & Evolution](#maintenance--evolution)
10. [Cost Optimization](#cost-optimization)
---
## Core Design Principles
### 1. Start Simple, Scale Complexity
**Principle:** Always begin with the simplest solution that meets requirements.
**Why:** Avoid over-engineering and unnecessary complexity.
**How:**
```
Level 1: Try Direct Prompt
└─ Works? → Done
└─ Too complex? → Continue
Level 2: Create Skill
└─ Works? → Done
└─ Needs isolation? → Continue
Level 3: Use Agents
└─ Works? → Done
└─ Need custom workflow? → Continue
Level 4: SDK Primitives
└─ Full control achieved
```
**Example:**
```
Task: Generate code review
❌ Wrong: Immediately build custom SDK agent
✅ Right: Try direct prompt first
"Review this PR for:
- Code quality issues
- Security vulnerabilities
- Performance concerns"
If inconsistent → Create code-review skill
If too complex → Use agent with multiple phases
```
---
### 2. Progressive Disclosure First
**Principle:** Show only what's needed, when it's needed.
**Why:** Optimize context usage, reduce costs, improve performance.
**How:**
```
Structure information hierarchically:
├── Index/Overview (always load)
├── Topic Overviews (load on demand)
├── Detailed Content (load when requested)
└── Examples (load if needed)
```
**Anti-Pattern:**
```markdown
❌ DON'T: Dump entire skill into context
skill/
└── SKILL.md (100KB of everything)
→ Context overload
→ Slow responses
→ High costs
```
**Best Practice:**
```markdown
✅ DO: Structure for progressive disclosure
skill/
├── SKILL.md (overview, 2KB)
├── index.md (navigation, 1KB)
└── topics/
    ├── topic_1/
    │   ├── overview.md (1KB)
    │   └── details.md (loaded on request)
    └── topic_2/
        ├── overview.md (1KB)
        └── details.md (loaded on request)
Total initial load: ~3KB (SKILL.md + index.md), plus ~1KB per topic overview, vs 100KB
```
---
### 3. Context as Precious Resource
**Principle:** Treat every token as valuable and limited.
**Why:** Context windows have limits and costs.
**Context Budget:**
```
Claude 4.x (standard context window): 200K tokens
- Reserve: 50K for responses
- Available: 150K for context
- Budget wisely!
```
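The budget above can be checked mechanically before loading more material. This is a rough sketch: the window and reserve figures come from the text, while the four-characters-per-token ratio and the function names are assumptions, not an exact tokenizer.

```typescript
// Figures from the budget above; CHARS_PER_TOKEN is a rough heuristic
// for English text and is an assumption, not a real tokenizer.
const CONTEXT_WINDOW = 200000;
const RESPONSE_RESERVE = 50000;
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Tokens still available for context after loading `documents`.
function remainingBudget(documents: string[]): number {
  const used = documents.reduce((sum, doc) => sum + estimateTokens(doc), 0);
  return CONTEXT_WINDOW - RESPONSE_RESERVE - used;
}
```

A negative result is a signal to summarize or drop material before sending the request.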
**Best Practices:**
- ✅ Load only necessary information
- ✅ Summarize large outputs
- ✅ Use progressive disclosure
- ✅ Reset context periodically
- ✅ Compress repeated information
- ❌ Don't dump raw logs
- ❌ Don't load unused references
- ❌ Don't repeat information
---
### 4. Clear, Explicit Instructions
**Principle:** Claude 4.x responds best to unambiguous direction.
**Why:** Reduces errors, improves consistency, and produces better results.
**Comparison:**
```
❌ Vague:
"Make the code better"
✅ Clear:
"Refactor this function to:
1. Extract magic numbers to constants
2. Add type annotations
3. Improve variable names for clarity
4. Add error handling for edge cases
Format: Return only the refactored code"
```
**Template:**
```
<instructions>
TASK: [Clear task definition]
REQUIREMENTS:
- [Specific requirement 1]
- [Specific requirement 2]
- [Specific requirement 3]
OUTPUT FORMAT:
[Exact format expected]
CONSTRAINTS:
- [Constraint 1]
- [Constraint 2]
</instructions>
<examples>
[2-3 examples showing desired pattern]
</examples>
```
---
### 5. Security by Design
**Principle:** Deny-all default, allowlist approach.
**Why:** Safe, controlled AI systems.
**Security Checklist:**
- [ ] Minimal tool permissions
- [ ] Allowlist approved tools only
- [ ] Deny dangerous commands
- [ ] Require confirmations for sensitive ops
- [ ] Audit all operations
- [ ] Implement rollback capability
- [ ] Validate all inputs
- [ ] Sandbox execution when possible
**Default Agent Security:**
```typescript
const secureAgent = new Agent({
// Deny-all default
tools: [],
// Explicitly allow minimal tools
allowlist: [
'Read', // Read-only
'Grep', // Search-only
'Glob' // Find-only
],
// Block dangerous operations
denylist: [
'rm -rf',
'sudo',
'exec',
'eval'
],
// Require confirmation
confirmations: [
'git push',
'deployment',
'data modification'
]
});
```
---
## Progressive Disclosure
### Pattern: Query-Based Disclosure
**Best Practice:**
```
skill/
├── SKILL.md
│ Content: High-level overview
│ Size: < 2KB
│ Purpose: Introduce skill capabilities
├── index.md
│ Content: Navigation/TOC
│ Size: < 1KB
│ Purpose: Guide to available topics
└── content/
└── topic/
├── overview.md (load first)
├── details.md (load on demand)
└── examples.md (load when requested)
```
**Loading Strategy:**
```
1. Initial load: SKILL.md + index.md (~3KB)
2. User asks about "authentication"
3. Load: authentication/overview.md (~1KB)
4. User needs details
5. Load: authentication/details.md (~3KB)
6. User wants examples
7. Load: authentication/examples.md (~2KB)
Total: 9KB loaded vs 50KB if dumped all at once
Savings: 82% context reduction
```
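The arithmetic in this walkthrough can be reproduced with a small tracker. `DisclosureTracker` is a hypothetical helper; the file names and sizes simply mirror the steps above.

```typescript
// Hypothetical helper that records which files were loaded and their sizes.
class DisclosureTracker {
  private loaded = new Map<string, number>(); // path -> size in KB

  load(path: string, sizeKb: number): void {
    if (!this.loaded.has(path)) this.loaded.set(path, sizeKb); // never count a file twice
  }

  totalKb(): number {
    let total = 0;
    this.loaded.forEach((size) => { total += size; });
    return total;
  }

  // Percentage saved relative to loading everything up front.
  savingsPercent(fullSizeKb: number): number {
    return Math.round((1 - this.totalKb() / fullSizeKb) * 100);
  }
}

const tracker = new DisclosureTracker();
tracker.load('SKILL.md + index.md', 3);
tracker.load('authentication/overview.md', 1);
tracker.load('authentication/details.md', 3);
tracker.load('authentication/examples.md', 2);
```

With these four loads, `tracker.totalKb()` is 9 and `tracker.savingsPercent(50)` is 82, matching the figures in the loading strategy.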
---
### Pattern: Hierarchical Expertise
**Best Practice:**
```
expertise/
├── by_task/
│ ├── authentication.md
│ ├── api_design.md
│ └── testing.md
├── by_language/
│ ├── typescript.md
│ ├── python.md
│ └── rust.md
├── by_pattern/
│ ├── repository.md
│ └── factory.md
└── quick_reference/
└── cheatsheet.md
```
**Query Examples:**
```
"How to implement auth?" → Load: by_task/authentication.md
"TypeScript style guide?" → Load: by_language/typescript.md
"Repository pattern?" → Load: by_pattern/repository.md
"Quick naming conventions?" → Load: quick_reference/cheatsheet.md
```
---
## Context Management
### Best Practice 1: Periodic Context Reset
**Why:** Long sessions accumulate irrelevant context.
**When to reset:**
- After completing major task
- Context feels "bloated"
- Responses become slower
- Approaching token limits
**How:**
```
Option 1: New conversation
- Start fresh conversation
- Provide summary of previous work
Option 2: Compact the conversation
- Summarize key points to retain
- Continue from the summary (e.g., /compact in Claude Code); asking Claude to "forget" does not free context tokens
Option 3: Use separate agents
- Different agents for different tasks
- Clean contexts per task
```
---
### Best Practice 2: Summarize, Don't Dump
**Anti-Pattern:**
```
❌ DON'T: Dump raw logs
"Here are the test results:"
[10,000 lines of test output]
```
**Best Practice:**
```
✅ DO: Summarize key information
"Test Results Summary:
- Total: 1,247 tests
- Passed: 1,245 (99.8%)
- Failed: 2
- test_auth_token_expiration (line 456)
- test_rate_limiting (line 789)
- Duration: 2m 34s"
```
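A summary like the one above is easy to generate from structured results instead of pasting raw output. The sketch below assumes a hypothetical `TestResult` shape; a real test runner's report format will differ.

```typescript
// Hypothetical result shape; adapt to whatever your test runner emits.
interface TestResult { name: string; passed: boolean; line?: number; }

// Collapse raw results into the few lines worth putting in context.
function summarize(results: TestResult[]): string {
  const failed = results.filter((r) => !r.passed);
  const passed = results.length - failed.length;
  const rate = results.length === 0 ? 0 : (passed / results.length) * 100;
  const lines = [
    `Total: ${results.length} tests`,
    `Passed: ${passed} (${rate.toFixed(1)}%)`,
    `Failed: ${failed.length}`,
    ...failed.map((r) => `  - ${r.name}` + (r.line !== undefined ? ` (line ${r.line})` : '')),
  ];
  return lines.join('\n');
}
```

Only the failures are listed by name, so the summary grows with the number of problems rather than with the size of the suite.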
---
### Best Practice 3: Compress Repeated Information
**Anti-Pattern:**
```
❌ DON'T: Repeat same information
Task 1: "Following these coding standards: [full standards]"
Task 2: "Following these coding standards: [full standards]"
Task 3: "Following these coding standards: [full standards]"
```
**Best Practice:**
```
✅ DO: Reference once, use skill
Task 1: Load: coding-standards-skill
Task 2: "Continue following loaded coding standards"
Task 3: "Continue following loaded coding standards"
```
---
## Security & Safety
### Best Practice 1: Minimal Permissions
**Principle:** Grant minimum tools needed for task.
**Example: Code Analysis**
```typescript
const analysisAgent = new Agent({
tools: [
'Read', // Read code
'Grep', // Search code
'Glob' // Find files
]
// NO Write, Edit, Bash, etc.
});
```
**Example: Code Modification**
```typescript
const codeAgent = new Agent({
tools: [
'Read', // Read existing code
'Edit' // Modify code
]
// NO Bash (can't execute)
// NO full Write (use Edit for safety)
});
```
---
### Best Practice 2: Confirmation for Sensitive Operations
**Always require confirmation:**
- git push
- Deployment commands
- Data deletion
- System modifications
- API calls to production
- Database changes
**Implementation:**
```typescript
const deployAgent = new Agent({
confirmations: [
'git push',
'npm publish',
'kubectl apply',
'terraform apply',
'aws',
'gcloud'
]
});
```
---
### Best Practice 3: Audit Logging
**Why:** Track all AI operations for security and debugging.
**What to log:**
- Tool usage
- Commands executed
- Files modified
- API calls made
- Errors encountered
- User confirmations
**Implementation:**
```typescript
const auditedAgent = new Agent({
audit: {
enabled: true,
level: 'verbose',
includeContext: true,
destination: './logs/agent-audit.log',
retention: '90days'
}
});
```
---
## Performance Optimization
### Best Practice 1: Parallel Execution
**When possible, parallelize:**
**Anti-Pattern:**
```
❌ Sequential (slow):
1. Analyze file1.ts → 10s
2. Analyze file2.ts → 10s
3. Analyze file3.ts → 10s
Total: 30s
```
**Best Practice:**
```
✅ Parallel (fast):
1. Analyze file1.ts ──┐
2. Analyze file2.ts ──┼→ All run simultaneously
3. Analyze file3.ts ──┘
Total: 10s (3x faster)
```
**Implementation:**
```
Launch 3 agents in parallel:
- Agent 1: file1.ts
- Agent 2: file2.ts
- Agent 3: file3.ts
Aggregate results
```
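In code, the fan-out and aggregate step maps naturally onto `Promise.all`. This is a sketch under assumptions: `analyzeFile` is a hypothetical stand-in for launching one agent per file and does no real work here.

```typescript
// Hypothetical stand-in for one agent run; a real implementation would
// await the agent call here instead of resolving immediately.
async function analyzeFile(path: string): Promise<string> {
  await Promise.resolve();
  return `analysis of ${path}`;
}

// Start every analysis at once, then wait for all of them.
// Results come back in the same order as the input paths.
async function analyzeInParallel(paths: string[]): Promise<string[]> {
  return Promise.all(paths.map((p) => analyzeFile(p)));
}
```

`Promise.all` rejects as soon as any one task fails; if partial results are acceptable, `Promise.allSettled` is the usual alternative.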
---
### Best Practice 2: Cache Frequent Queries
**Pattern:**
```
skill/
└── cache/
├── frequently_asked.md
└── common_patterns.md
```
**Example:**
```
Common query: "How to handle errors?"
Instead of processing each time:
1. Maintain: error_handling.md with comprehensive guide
2. Query → Immediately load cached response
3. Fast, consistent responses
```
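When answers are computed rather than stored as files, the same idea can be sketched as an in-memory cache. `AnswerCache` and its normalization rule are hypothetical, chosen only to show the lookup-before-compute flow.

```typescript
// Hypothetical in-memory cache keyed by normalized query text.
class AnswerCache {
  private entries = new Map<string, string>();
  hits = 0;

  // Treat queries that differ only in case or spacing as the same key.
  private static normalize(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, ' ');
  }

  // Return the cached answer, or compute and store it on a miss.
  getOrCompute(query: string, compute: () => string): string {
    const key = AnswerCache.normalize(query);
    const cached = this.entries.get(key);
    if (cached !== undefined) {
      this.hits += 1;
      return cached;
    }
    const answer = compute();
    this.entries.set(key, answer);
    return answer;
  }
}
```

Entries never expire in this sketch, so cached content has to be invalidated manually when the underlying guide changes.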
---
### Best Practice 3: Optimize Token Usage
**Token Optimization Checklist:**
- [ ] Use progressive disclosure
- [ ] Summarize large outputs
- [ ] Remove redundant information
- [ ] Compress repeated content
- [ ] Use shorter variable names in examples
- [ ] Remove unnecessary whitespace
- [ ] Reference external docs vs embedding
**Example:**
```
❌ High token usage:
const myVeryLongDescriptiveVariableName = 'value';
const anotherVeryLongDescriptiveVariableName = 'value';
✅ Optimized:
const user = 'value';
const data = 'value';
// Still clear, fewer tokens
```
---
## Skill Design
### Best Practice 1: Single Responsibility
**Principle:** Each skill should have one clear purpose.
**Anti-Pattern:**
```
❌ DON'T: Mega-skill doing everything
super-skill/
├── frontend/
├── backend/
├── database/
├── devops/
└── testing/
→ Too broad, context overload
```
**Best Practice:**
```
✅ DO: Focused skills
frontend-expert/
├── components/
├── styling/
└── accessibility/
backend-expert/
├── apis/
├── services/
└── databases/
```
---
### Best Practice 2: Clear Documentation
**Skill Documentation Template:**
```markdown
---
name: skill-name
description: One-sentence description
---
# Skill Name
## What This Skill Does
[2-3 sentences explaining purpose]
## When to Use
- ✅ Use case 1
- ✅ Use case 2
- ❌ Not for use case 3
## Quick Start
[Simple example]
## Reference Materials
- file1.md - Description
- file2.md - Description
## Examples
[2-3 concrete examples]
```
---
### Best Practice 3: Version Skills
**Why:** Track changes, enable rollback, communicate updates.
**Structure:**
```
skill/
├── VERSION (e.g., 2.1.0)
├── CHANGELOG.md
├── SKILL.md
└── references/
```
**CHANGELOG.md:**
```markdown
# Changelog
## [2.1.0] - 2025-01-15
### Added
- New pattern: async error handling
- Examples for TypeScript 5.x
### Changed
- Updated API guidelines for REST
### Fixed
- Corrected authentication example
## [2.0.0] - 2024-12-01
### Breaking Changes
- Restructured reference materials
```
---
## Agent Design
### Best Practice 1: Clear Agent Boundaries
**Principle:** Each agent should have clear, distinct responsibilities.
**Anti-Pattern:**
```
❌ DON'T: Monolithic agent doing everything
BigAgent
├── Explores codebase
├── Plans changes
├── Executes changes
├── Runs tests
├── Deploys
└── Monitors
→ Too much responsibility, hard to debug
```
**Best Practice:**
```
✅ DO: Specialized agents
Main Orchestrator
├── Explore Agent (read-only)
├── Plan Agent (planning)
├── Code Agent (implementation)
├── Test Agent (validation)
└── Report Agent (aggregation)
```
---
### Best Practice 2: Agent Communication Patterns
**Pattern: Parent-Child**
```
Main Agent
├─→ Subagent 1: Task
│ └─→ Returns: Result
├─→ Subagent 2: Task
│ └─→ Returns: Result
└─→ Main aggregates results
```
**Pattern: Pipeline**
```
Agent 1: Explore
└─→ Output: Analysis
└─→ Agent 2: Plan
└─→ Output: Plan
└─→ Agent 3: Execute
└─→ Output: Changes
```
**Pattern: Parallel Workers**
```
Coordinator
├─┬─ Worker 1 ──┐
│ ├─ Worker 2 ──┤
│ ├─ Worker 3 ──┼→ Aggregator → Result
│ └─ Worker 4 ──┘
```
---
### Best Practice 3: Error Handling in Agents
**Principle:** Graceful failure and recovery.
**Pattern:**
```typescript
const resilientAgent = async (task) => {
try {
const result = await agent.run(task);
return result;
} catch (error) {
// Log error
logger.error('Agent failed', error);
// Attempt recovery
if (isRecoverable(error)) {
return await retryWithBackoff(agent, task);
}
// Fallback strategy
return await fallbackStrategy(task);
}
};
```
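The pattern above leaves `retryWithBackoff` undefined. One possible shape is sketched below; note that it takes a generic `operation` callback rather than the `(agent, task)` pair used above, and the attempt count, base delay, and injectable `sleep` parameter are assumptions made for illustration and testability.

```typescript
type Sleep = (ms: number) => Promise<void>;

// Retry `operation` up to `maxAttempts` times, doubling the delay after each failure.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
  sleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final failure.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

With this signature, the call in the pattern would become something like `retryWithBackoff(() => agent.run(task))`.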
---
## Testing & Validation
### Best Practice 1: Test Skills
**What to test:**
- Skill loads correctly
- References are accessible
- Examples are valid
- Scripts execute successfully
**Example:**
```bash
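#!/bin/bash
# test_skill.sh - structural checks for a skill directory
# Usage: ./test_skill.sh path/to/skill
skill_dir="${1:?Usage: $0 <skill-directory>}"
echo "Testing skill: $skill_dir"
# Test 1: Skill file exists
if [ ! -f "$skill_dir/SKILL.md" ]; then
  echo "❌ SKILL.md not found"
  exit 1
fi
# Test 2: References are readable, non-empty files
for ref in "$skill_dir"/references/*.md; do
  [ -e "$ref" ] || continue  # glob matched nothing: no references to check
  if [ ! -r "$ref" ] || [ ! -s "$ref" ]; then
    echo "❌ Reference unreadable or empty: $ref"
    exit 1
  fi
done
# Test 3: Scripts are executable
for script in "$skill_dir"/scripts/*.sh; do
  [ -e "$script" ] || continue  # scripts/ is optional
  if [ ! -x "$script" ]; then
    echo "❌ Script not executable: $script"
    exit 1
  fi
done
echo "✅ All tests passed"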
```
---
### Best Practice 2: Validate Agent Output
**Pattern:**
```typescript
const validateAgentOutput = async (output) => {
// Schema validation
if (!matchesSchema(output)) {
throw new Error('Invalid output schema');
}
// Business logic validation
if (!meetsRequirements(output)) {
throw new Error('Output doesn\'t meet requirements');
}
// Safety checks
if (containsDangerousContent(output)) {
throw new Error('Output contains dangerous content');
}
return output;
};
```
---
## Maintenance & Evolution
### Best Practice 1: Regular Skill Updates
**Schedule:**
- Monthly: Review and update examples
- Quarterly: Major updates for new patterns
- Yearly: Comprehensive review and restructure
**Update Checklist:**
- [ ] New patterns added
- [ ] Deprecated patterns removed
- [ ] Examples updated for current versions
- [ ] Documentation improved
- [ ] User feedback incorporated
- [ ] Version bumped
- [ ] Changelog updated
---
### Best Practice 2: Deprecation Strategy
**When deprecating:**
```markdown
## [3.0.0] - 2025-06-01
### Deprecated
⚠️ OLD PATTERN (Deprecated, remove in 4.0.0):
[Old pattern example]
✅ NEW PATTERN (Use instead):
[New pattern example]
Migration guide: See MIGRATION.md
```
**Deprecation Timeline:**
1. Announce deprecation (version N)
2. Maintain both patterns (version N+1)
3. Remove old pattern (version N+2)
---
## Cost Optimization
### Best Practice 1: Token Efficiency
**Strategies:**
- Use progressive disclosure (load less)
- Summarize outputs (fewer tokens)
- Cache frequent queries (reuse)
- Compress repeated content (deduplicate)
- Choose smaller models when possible (Haiku vs Sonnet)
**Example:**
```
Task: Simple syntax error fix
❌ Expensive: Use Sonnet for everything
Cost: $X per request
✅ Optimized: Use Haiku for simple tasks
Cost: ~$X/5 per request (illustrative ratio; check current pricing)
Savings: ~80% under that assumption
```
---
### Best Practice 2: Model Selection
**Choose model based on complexity:**
**Haiku (Fast, Cheap):**
- Simple queries
- Straightforward tasks
- Well-defined operations
- Cost-sensitive applications
**Sonnet (Balanced):**
- Medium complexity
- Most general tasks
- Good balance of capability/cost
- Default choice
**Opus (Powerful, Expensive):**
- Complex reasoning
- Critical tasks
- High-stakes decisions
- Quality over cost
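The selection guidance above can be reduced to a small routing function. The thresholds below are illustrative assumptions on a 1-10 complexity scale, and the tier names are labels rather than actual model identifiers.

```typescript
type ModelTier = 'haiku' | 'sonnet' | 'opus';

// Thresholds are illustrative assumptions, not official guidance.
function selectModelTier(complexity: number, highStakes = false): ModelTier {
  if (highStakes || complexity >= 8) return 'opus'; // complex reasoning or critical tasks
  if (complexity <= 3) return 'haiku';              // simple, well-defined operations
  return 'sonnet';                                  // default for most general tasks
}
```

Routing by tier in one place also makes it straightforward to log which tier handled each request when monitoring cost.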
---
## Summary Checklist
**Design Phase:**
- [ ] Start with simplest solution
- [ ] Apply progressive disclosure
- [ ] Plan for context efficiency
- [ ] Design security boundaries
- [ ] Consider performance needs
**Implementation Phase:**
- [ ] Follow single responsibility
- [ ] Implement clear documentation
- [ ] Add version control
- [ ] Include error handling
- [ ] Add audit logging
**Testing Phase:**
- [ ] Test skill loading
- [ ] Validate agent outputs
- [ ] Check security controls
- [ ] Verify performance
- [ ] Test edge cases
**Maintenance Phase:**
- [ ] Regular updates
- [ ] Deprecation strategy
- [ ] User feedback loop
- [ ] Cost monitoring
- [ ] Performance optimization
---
**Remember:** Best practices evolve. Stay current with Anthropic updates and community patterns.


@@ -0,0 +1,821 @@
# Architectural Decision Rubric
Comprehensive framework for choosing the right Anthropic architecture for your project.
## Overview
This rubric evaluates seven key dimensions to recommend the optimal combination of Skills, Agents, Prompts, and SDK primitives.
## The Seven Dimensions
1. Task Complexity
2. Reusability Requirements
3. Context Management
4. Security & Control
5. Performance Needs
6. Maintenance Burden
7. Time to Market
---
## 1. Task Complexity Analysis
### Low Complexity (Score: 1-3)
**Characteristics:**
- Single operation or simple workflow
- Clear input → output mapping
- No dependencies or minimal dependencies
- Linear execution (no branching)
- 1-5 steps maximum
- Can be described in one sentence
**Recommendation:** **Direct Prompts**
**Examples:**
- "Explain this function"
- "Fix this typo"
- "Add a comment to this class"
- "Format this JSON"
- "Translate this error message"
**Implementation:**
```
Simple, clear instruction to Claude without additional structure
```
---
### Medium Complexity (Score: 4-6)
**Characteristics:**
- Multiple related operations
- Structured workflow with steps
- Some dependencies between steps
- Requires reference materials or examples
- 5-20 steps
- Benefits from organization and guidance
- Reusable pattern
**Recommendation:** **Skills**
**Examples:**
- Generate comprehensive code review
- Create design system component
- Write technical documentation with examples
- Analyze codebase for patterns
- Generate test suite following guidelines
**Implementation:**
```
skill/
├── SKILL.md (main documentation)
├── references/
│ ├── patterns.md
│ ├── examples.md
│ └── guidelines.md
└── scripts/ (optional)
└── helper.sh
```
---
### High Complexity (Score: 7-9)
**Characteristics:**
- Multi-step autonomous workflow
- Needs isolated context
- Parallel execution beneficial
- Different phases with different requirements
- 20+ steps or multiple parallel tracks
- Requires exploration, planning, execution, validation
- Different tool permissions per phase
**Recommendation:** **Agents/Subagents**
**Examples:**
- Full codebase refactoring
- Comprehensive security audit
- Multi-file feature implementation
- Documentation generation across entire project
- Performance optimization analysis and fixes
**Implementation:**
```
Main Agent (orchestrator)
├── Explore Subagent (read-only, analysis)
├── Plan Subagent (planning, no execution)
├── Execute Subagent (write permissions)
└── Validate Subagent (testing, verification)
```
---
### Custom Complexity (Score: 10)
**Characteristics:**
- Unique workflow requirements
- System integration needed
- Custom tools required
- Specific feedback loops
- Production deployment
- Fine-grained control necessary
**Recommendation:** **SDK Primitives**
**Examples:**
- Custom CI/CD integration
- Proprietary system automation
- Domain-specific code analysis
- Production AI features
- Specialized agent behaviors
**Implementation:**
```typescript
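// Illustrative pseudocode: the `Agent` class, the import path, and the option
// names below are hypothetical. Check the Claude Agent SDK reference for the
// actual package name and exports before using.
import { Agent, Tool } from 'claude-agent-sdk';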
const customAgent = new Agent({
tools: [customTool1, customTool2],
workflow: customFeedbackLoop,
integrations: [ciSystem, deploySystem]
});
```
---
## 2. Reusability Requirements
### Single Use (Score: 1-2)
**Characteristics:**
- One-time task
- Project-specific
- No future reuse expected
- Temporary need
**Recommendation:** **Direct Prompts**
**Examples:**
- Debug this specific bug
- Update this particular file
- Answer question about this code
---
### Personal Reuse (Score: 3-4)
**Characteristics:**
- You'll use it multiple times
- Personal workflow optimization
- Not shared with team
**Recommendation:** **Skills (Personal)**
**Examples:**
- Your personal code review checklist
- Your preferred refactoring patterns
- Your documentation template
**Storage:** Personal `~/.claude/skills/` directory (a project-level `.claude/skills/` directory is shared through the repository)
---
### Team Reuse (Score: 5-7)
**Characteristics:**
- Multiple team members benefit
- Team-wide patterns
- Shared knowledge
- Collaboration value
**Recommendation:** **Skills (Team Plugin)**
**Examples:**
- Team coding standards
- Project-specific patterns
- Shared workflows
- Team documentation templates
**Storage:** Team repository plugin
---
### Organization Reuse (Score: 8-9)
**Characteristics:**
- Cross-team benefit
- Company-wide standards
- Multiple projects
- Organization knowledge
**Recommendation:** **Skills (Organization Marketplace)**
**Examples:**
- Company coding standards
- Security review guidelines
- Architecture patterns
- Compliance requirements
**Distribution:** Internal marketplace
---
### Product Feature (Score: 10)
**Characteristics:**
- End-user facing
- Production deployment
- Product differentiation
- Revenue impact
**Recommendation:** **SDK Primitives**
**Examples:**
- AI-powered product feature
- Customer-facing automation
- Production workflow
- SaaS feature
**Implementation:** Custom SDK integration
---
## 3. Context Management Needs
### Minimal Context (Score: 1-3)
**Characteristics:**
- Self-contained task
- No external references
- All info in prompt
- < 1000 tokens
**Recommendation:** **Direct Prompts**
**Example:**
```
Explain this function:
[paste function]
```
---
### Structured Context (Score: 4-6)
**Characteristics:**
- Reference materials needed
- Organized information
- Progressive disclosure beneficial
- 1K-10K tokens
**Recommendation:** **Skills with Progressive Disclosure**
**Example:**
```
skill/
├── SKILL.md
└── references/
├── quick_reference.md (loaded first)
├── detailed_patterns.md (loaded on demand)
└── examples.md (loaded when needed)
```
**Pattern:**
- Start with minimal context
- Load more as needed
- Query-based retrieval
---
### Isolated Context (Score: 7-9)
**Characteristics:**
- Separate concerns
- Avoid context pollution
- Parallel execution
- Different contexts per phase
- 10K+ tokens per context
**Recommendation:** **Agents/Subagents**
**Example:**
```
Explore Agent: Codebase context (read-only)
Plan Agent: Planning context (insights from explore)
Code Agent: Implementation context (plan + target files)
Review Agent: Validation context (changes + tests)
```
**Benefits:**
- No context pollution
- Clear boundaries
- Parallel execution
- Optimal token usage
---
### Custom Context (Score: 10)
**Characteristics:**
- Specific context handling
- Integration requirements
- Custom context sources
- Dynamic context loading
**Recommendation:** **SDK Primitives**
**Example:**
```typescript
const context = await customContextLoader({
source: proprietarySystem,
filter: taskSpecific,
transform: domainFormat
});
```
---
## 4. Security & Control Requirements
### Basic Safety (Score: 1-3)
**Characteristics:**
- Read-only operations
- No sensitive data
- Standard guardrails sufficient
- Low risk
**Recommendation:** **Direct Prompts + Skills**
**Controls:**
- Standard Claude safety features
- No additional restrictions needed
---
### Controlled Access (Score: 4-6)
**Characteristics:**
- Write operations
- Specific tool permissions needed
- Some sensitive operations
- Medium risk
**Recommendation:** **Agents with Tool Restrictions**
**Controls:**
```typescript
// Tool allowlist per agent
const agentTools: Record<string, string[]> = {
  explore: ['Read', 'Grep', 'Glob'], // Read-only
  plan: ['TodoWrite'],               // Planning only
  code: ['Read', 'Edit', 'Write'],   // Code changes
  review: ['Bash', 'Read'],          // Testing
};
```
**Pattern:**
- Allowlist approach
- Minimal permissions
- Explicit grants
---
### High Security (Score: 7-9)
**Characteristics:**
- Sensitive operations
- Compliance requirements
- Audit logging needed
- High risk
**Recommendation:** **Agents with Confirmations**
**Controls:**
```typescript
// Illustrative configuration shape (hypothetical `Agent` API)
const highSecurityAgent = new Agent({
  tools: allowlistOnly,
  confirmations: [
    'git push',
    'deployment',
    'data deletion',
    'sensitive operations'
  ],
  audit: true,
  denylist: dangerousCommands
});
```
**Pattern:**
- Deny-all default
- Explicit confirmations
- Audit all operations
- Block dangerous commands
---
### Maximum Security (Score: 10)
**Characteristics:**
- Production systems
- Financial/medical data
- Regulatory compliance
- Critical infrastructure
**Recommendation:** **SDK Primitives + Custom Security**
**Controls:**
```typescript
const secureAgent = new Agent({
security: {
denyAll: true,
allowlist: minimalTools,
mfa: true,
audit: comprehensiveLogger,
encryption: true,
rateLimits: strict,
monitoring: realtime,
rollback: automatic
}
});
```
---
## 5. Performance Needs
### Standard Performance (Score: 1-3)
**Characteristics:**
- User can wait
- Not time-sensitive
- Occasional use
**Recommendation:** **Any approach**
---
### Fast Response (Score: 4-6)
**Characteristics:**
- Quick feedback expected
- Interactive use
- Multiple requests
**Recommendation:** **Skills with Progressive Disclosure**
**Optimization:**
- Load minimal context initially
- Query additional info on demand
- Cache frequent queries
---
### High Performance (Score: 7-9)
**Characteristics:**
- Real-time or near real-time
- Parallel execution beneficial
- Resource optimization critical
**Recommendation:** **Agents (Parallel Execution)**
**Optimization:**
```
Parallel Subagents:
├── Agent 1: File 1-10
├── Agent 2: File 11-20
├── Agent 3: File 21-30
└── Agent 4: Aggregation
Execution: All run simultaneously
```
---
### Maximum Performance (Score: 10)
**Characteristics:**
- Production SLA requirements
- Sub-second responses
- High throughput
- Resource limits
**Recommendation:** **SDK Primitives + Custom Optimization**
**Optimization:**
```typescript
const optimizedAgent = new Agent({
caching: aggressive,
parallelization: maximum,
contextCompression: true,
earlyTermination: true,
resourceLimits: optimized
});
```
---
## 6. Maintenance Burden
### Low Maintenance (Score: 1-3)
**Characteristics:**
- Set and forget
- Stable requirements
- Minimal updates
**Recommendation:** **Direct Prompts (no maintenance)**
---
### Medium Maintenance (Score: 4-6)
**Characteristics:**
- Periodic updates
- Evolving patterns
- Team contributions
**Recommendation:** **Skills (easy to update)**
**Maintenance:**
- Update reference docs
- Add new examples
- Version control friendly
- Clear documentation
---
### High Maintenance (Score: 7-9)
**Characteristics:**
- Regular updates
- Multiple contributors
- Evolving requirements
**Recommendation:** **Skills + Version Control**
**Maintenance:**
```
skill/
├── CHANGELOG.md
├── VERSION
├── SKILL.md
└── references/
└── (versioned docs)
```
---
### Custom Maintenance (Score: 10)
**Characteristics:**
- Custom codebase
- Breaking changes
- Integration updates
- Production support
**Recommendation:** **SDK Primitives (with CI/CD)**
**Maintenance:**
```typescript
// Automated testing
// Version management
// Deployment pipeline
// Monitoring and alerts
```
---
## 7. Time to Market
### Immediate (Score: 1-3)
**Characteristics:**
- Need it now
- No setup time
- Quick win
**Recommendation:** **Direct Prompts**
**Time:** Seconds to minutes
---
### Quick (Score: 4-6)
**Characteristics:**
- Hours to days
- Some setup acceptable
- Reusability valuable
**Recommendation:** **Skills**
**Time:** 1-4 hours to create
---
### Planned (Score: 7-9)
**Characteristics:**
- Days to weeks
- Proper planning
- Complex requirements
**Recommendation:** **Agents/Subagents**
**Time:** 1-3 days to design and implement
---
### Strategic (Score: 10)
**Characteristics:**
- Weeks to months
- Product feature
- Full development cycle
**Recommendation:** **SDK Primitives**
**Time:** 1+ weeks to build and deploy
---
## Decision Matrix
### Quick Reference Table
| Dimension | Prompts | Skills | Agents | SDK |
|-----------|---------|--------|--------|-----|
| **Complexity** | Low (1-3) | Medium (4-6) | High (7-9) | Custom (10) |
| **Reusability** | Single (1-2) | Personal to Org (3-9) | n/a | Product (10) |
| **Context** | Minimal (1-3) | Structured (4-6) | Isolated (7-9) | Custom (10) |
| **Security** | Basic (1-3) | Controlled (4-6) | High (7-9) | Max (10) |
| **Performance** | Standard (1-3) | Fast (4-6) | High (7-9) | Max (10) |
| **Maintenance** | Low (1-3) | Medium (4-6) | High (7-9) | Custom (10) |
| **Time to Market** | Immediate (1-3) | Quick (4-6) | Planned (7-9) | Strategic (10) |
---
## Scoring Your Project
### Step 1: Score Each Dimension
Rate your project on each of the 7 dimensions (1-10).
### Step 2: Calculate Weighted Average
Different dimensions may have different importance for your use case.
**Default Weights:**
- Task Complexity: 25%
- Reusability: 20%
- Context Management: 15%
- Security: 15%
- Performance: 10%
- Maintenance: 10%
- Time to Market: 5%
### Step 3: Interpret Score
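Round the weighted average to the nearest whole number, then match it to a band. When a single dimension scores far above the others, let it override the average or combine approaches (see Special Cases).
**Average Score 1-3:** Direct Prompts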
- Simple, clear instructions
- No additional structure
**Average Score 4-6:** Skills
- Organized expertise
- Progressive disclosure
- Reference materials
**Average Score 7-9:** Agents/Subagents
- Complex workflows
- Isolated contexts
- Parallel execution
**Average Score 10:** SDK Primitives
- Custom implementation
- Full control
- Production deployment
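Steps 1 through 3 can be sketched as a short calculation. The weights are the defaults listed in Step 2; rounding a fractional average to the nearest whole score before picking a band is an assumption, as are the names `weightedScore` and `recommend`.

```typescript
type Dimension =
  | 'complexity' | 'reusability' | 'context' | 'security'
  | 'performance' | 'maintenance' | 'timeToMarket';

// Default weights from Step 2 (they sum to 1.0).
const DEFAULT_WEIGHTS: Record<Dimension, number> = {
  complexity: 0.25, reusability: 0.20, context: 0.15, security: 0.15,
  performance: 0.10, maintenance: 0.10, timeToMarket: 0.05,
};

// Weighted average of the seven dimension scores (each 1-10).
function weightedScore(scores: Record<Dimension, number>, weights = DEFAULT_WEIGHTS): number {
  let total = 0;
  for (const dim of Object.keys(weights) as Dimension[]) {
    total += scores[dim] * weights[dim];
  }
  return total;
}

// Assumption: round to the nearest whole score, then apply the Step 3 bands.
function recommend(score: number): string {
  const rounded = Math.round(score);
  if (rounded <= 3) return 'Direct Prompts';
  if (rounded <= 6) return 'Skills';
  if (rounded <= 9) return 'Agents/Subagents';
  return 'SDK Primitives';
}
```

For instance, scores of 9, 3, 8, 7, 8, 3, 7 (in the order of the weight list) give a weighted result of about 6.55, which rounds into the Agents/Subagents band.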
---
## Special Cases
### Hybrid Architectures
**When:** Scores span multiple ranges
**Solution:** Combine approaches
- Direct Prompts for simple tasks
- Skills for reusable expertise
- Agents for complex workflows
**Example:**
```
Project with scores:
- Complexity: 7 (Agents)
- Reusability: 5 (Skills)
- Context: 4 (Skills)
- Security: 6 (Skills/Agents)
Recommendation: Agents + Skills
- Use Agents for complex workflows
- Load Skills for domain expertise
- Direct Prompts for simple operations
```
---
## Decision Tree
```
Start
├─ Is it a simple, one-time task?
│  ├─ YES → Direct Prompts
│  └─ NO → Continue
├─ Is it a production feature or unique workflow?
│  ├─ YES → SDK Primitives
│  └─ NO → Continue
├─ Do you need reusable expertise?
│  ├─ YES → Continue
│  └─ NO → Is it complex with multiple phases?
│           ├─ YES → Agents
│           └─ NO → Direct Prompts
├─ Is it complex with isolated contexts needed?
│  ├─ YES → Agents
│  └─ NO → Skills
└─ When unsure → Skills (best balance)
---
## Example Evaluations
### Example 1: Code Review Automation
**Scores:**
- Complexity: 5 (structured workflow)
- Reusability: 7 (team-wide)
- Context: 5 (reference materials)
- Security: 4 (read operations)
- Performance: 5 (interactive)
- Maintenance: 6 (evolving standards)
- Time to Market: 5 (hours to setup)
**Average:** 5.3 (unweighted; 5.35 with the default weights)
**Recommendation:** **Skills**
- Create code-review skill
- Include team standards
- Progressive disclosure of guidelines
- Reference materials for patterns
---
### Example 2: Codebase Migration
**Scores:**
- Complexity: 9 (multi-phase, autonomous)
- Reusability: 3 (one-time migration)
- Context: 8 (isolated per phase)
- Security: 7 (write operations)
- Performance: 8 (parallel execution)
- Maintenance: 3 (temporary)
- Time to Market: 7 (proper planning)
**Average:** 6.4 (unweighted; 6.55 with the default weights)
**Recommendation:** **Agents**
- Despite low reusability, complexity demands Agents
- Explore Agent: Analyze codebase
- Plan Agent: Create migration strategy
- Migrate Agent: Execute changes
- Validate Agent: Run tests
---
### Example 3: Quick Bug Fix
**Scores:**
- Complexity: 2 (single fix)
- Reusability: 1 (one-time)
- Context: 2 (minimal)
- Security: 3 (single file change)
- Performance: 2 (can wait)
- Maintenance: 1 (no maintenance)
- Time to Market: 1 (immediate)
**Average:** 1.7 (unweighted; 1.80 with the default weights)
**Recommendation:** **Direct Prompt**
- Simple instruction
- Fast execution
- No overhead
---
### Example 4: AI-Powered Product Feature
**Scores:**
- Complexity: 10 (custom workflow)
- Reusability: 10 (product feature)
- Context: 10 (custom handling)
- Security: 9 (production)
- Performance: 9 (SLA requirements)
- Maintenance: 10 (ongoing support)
- Time to Market: 10 (strategic)
**Average:** 9.7 (unweighted; 9.75 with the default weights)
**Recommendation:** **SDK Primitives**
- Custom agent implementation
- Production-grade security
- Full monitoring and controls
- Integration with product
---
## Summary
Use this rubric to:
1. **Score** your project on 7 dimensions
2. **Calculate** weighted average
3. **Interpret** score to get recommendation
4. **Validate** with decision tree
5. **Review** example evaluations
**Key Principle:** Start simple, scale complexity as needed.
**Remember:** The best architecture is the simplest one that meets your requirements.
