zhongwei/gh-jamesrochabrun-skills-all-skills

Zhongwei Li df092d8cd2 Initial commit

2025-11-29 18:48:58 +08:00

Anthropic Architecture Best Practices (2025)

Proven best practices for designing, building, and maintaining Anthropic-based systems.

Core Design Principles
Progressive Disclosure
Context Management
Security & Safety
Performance Optimization
Skill Design
Agent Design
Testing & Validation
Maintenance & Evolution
Cost Optimization

Core Design Principles

1. Start Simple, Scale Complexity

Principle: Always begin with the simplest solution that meets requirements.

Why: Avoid over-engineering and unnecessary complexity.

How:

Level 1: Try Direct Prompt
  └─ Works? → Done
  └─ Too complex? → Continue

Level 2: Create Skill
  └─ Works? → Done
  └─ Needs isolation? → Continue

Level 3: Use Agents
  └─ Works? → Done
  └─ Need custom workflow? → Continue

Level 4: SDK Primitives
  └─ Full control achieved

Example:

Task: Generate code review

❌ Wrong: Immediately build custom SDK agent
✅ Right: Try direct prompt first

"Review this PR for:
- Code quality issues
- Security vulnerabilities
- Performance concerns"

If inconsistent → Create code-review skill
If too complex → Use agent with multiple phases

2. Progressive Disclosure First

Principle: Show only what's needed, when it's needed.

Why: Optimize context usage, reduce costs, improve performance.

How:

Structure information hierarchically:
├── Index/Overview (always load)
├── Topic Overviews (load on demand)
├── Detailed Content (load when requested)
└── Examples (load if needed)

Anti-Pattern:

❌ DON'T: Dump entire skill into context

skill/
└── SKILL.md (100KB of everything)
    → Context overload
    → Slow responses
    → High costs

Best Practice:

✅ DO: Structure for progressive disclosure

skill/
├── SKILL.md (overview, 2KB)
├── index.md (navigation, 1KB)
└── topics/
    ├── topic_1/
    │   ├── overview.md (1KB)
    │   └── details.md (loaded on request)
    └── topic_2/
        ├── overview.md (1KB)
        └── details.md (loaded on request)

Total initial load: ~4KB (SKILL.md + index.md + one topic overview) vs 100KB

3. Context as Precious Resource

Principle: Treat every token as valuable and limited.

Why: Context windows have limits and costs.

Context Budget:

Claude 4.x: 200K-token context window (verify the limit for your model)
- Reserve: ~50K for responses (a planning margin, not a hard limit)
- Available: ~150K for context
- Budget wisely!
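
The budget arithmetic above can be checked before loading anything. The following is a minimal sketch: the `estimateTokens` heuristic (about 4 characters per token) is an assumption for illustration, not a real tokenizer, and `fitsInBudget` is a hypothetical helper name.

```typescript
// Figures from the budget above; adjust for the model actually in use.
const CONTEXT_WINDOW_TOKENS = 200_000;
const RESERVED_FOR_RESPONSE = 50_000;
const AVAILABLE_FOR_CONTEXT = CONTEXT_WINDOW_TOKENS - RESERVED_FOR_RESPONSE;

// Assumption: roughly 4 characters per token. A rough heuristic only.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Returns true if all documents together fit in the available budget.
function fitsInBudget(documents: string[]): boolean {
  const total = documents.reduce((sum, doc) => sum + estimateTokens(doc), 0);
  return total <= AVAILABLE_FOR_CONTEXT;
}

console.log(AVAILABLE_FOR_CONTEXT); // 150000
console.log(fitsInBudget(["a".repeat(400_000)])); // true (about 100K tokens)
console.log(fitsInBudget(["a".repeat(800_000)])); // false (about 200K tokens)
```

For real budgeting, a token count from the provider's own tooling would replace the character heuristic.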

Best Practices:

✅ Load only necessary information
✅ Summarize large outputs
✅ Use progressive disclosure
✅ Reset context periodically
✅ Compress repeated information
❌ Don't dump raw logs
❌ Don't load unused references
❌ Don't repeat information

4. Clear, Explicit Instructions

Principle: Claude 4.x responds best to unambiguous direction.

Why: Reduces errors, improves consistency, and produces better results.

Comparison:

❌ Vague:
"Make the code better"

✅ Clear:
"Refactor this function to:
1. Extract magic numbers to constants
2. Add type annotations
3. Improve variable names for clarity
4. Add error handling for edge cases
Format: Return only the refactored code"

Template:

<instructions>
TASK: [Clear task definition]

REQUIREMENTS:
- [Specific requirement 1]
- [Specific requirement 2]
- [Specific requirement 3]

OUTPUT FORMAT:
[Exact format expected]

CONSTRAINTS:
- [Constraint 1]
- [Constraint 2]
</instructions>

<examples>
[2-3 examples showing desired pattern]
</examples>
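
The template above can also be filled in programmatically so every prompt follows the same layout. This is a minimal sketch; `PromptSpec` and `buildPrompt` are hypothetical names introduced for illustration, and the sample values are placeholders.

```typescript
// Hypothetical shape for the fields of the template above.
interface PromptSpec {
  task: string;
  requirements: string[];
  outputFormat: string;
  constraints: string[];
  examples: string[];
}

// Render a PromptSpec into the <instructions>/<examples> layout.
function buildPrompt(spec: PromptSpec): string {
  const bullets = (items: string[]): string =>
    items.map((item) => `- ${item}`).join("\n");

  return [
    "<instructions>",
    `TASK: ${spec.task}`,
    "",
    "REQUIREMENTS:",
    bullets(spec.requirements),
    "",
    "OUTPUT FORMAT:",
    spec.outputFormat,
    "",
    "CONSTRAINTS:",
    bullets(spec.constraints),
    "</instructions>",
    "",
    "<examples>",
    spec.examples.join("\n\n"),
    "</examples>",
  ].join("\n");
}

// Sample values for illustration only.
const reviewPrompt = buildPrompt({
  task: "Refactor this function",
  requirements: ["Extract magic numbers to constants", "Add type annotations"],
  outputFormat: "Return only the refactored code",
  constraints: ["Keep the public signature unchanged"],
  examples: ["[example 1]", "[example 2]"],
});
console.log(reviewPrompt);
```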

5. Security by Design

Principle: Deny by default; allowlist only what is needed.

Why: Keeps AI systems safe and controlled.

Security Checklist:

Minimal tool permissions
Allowlist approved tools only
Deny dangerous commands
Require confirmations for sensitive ops
Audit all operations
Implement rollback capability
Validate all inputs
Sandbox execution when possible

Default Agent Security:

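// Illustrative pseudo-configuration: the `Agent` class and option names are placeholders; map them onto your agent framework.
const secureAgent = new Agent({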
  // Deny-all default
  tools: [],

  // Explicitly allow minimal tools
  allowlist: [
    'Read',   // Read-only
    'Grep',   // Search-only
    'Glob'    // Find-only
  ],

  // Block dangerous operations
  denylist: [
    'rm -rf',
    'sudo',
    'exec',
    'eval'
  ],

  // Require confirmation
  confirmations: [
    'git push',
    'deployment',
    'data modification'
  ]
});
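
How an allowlist, a denylist, and a confirmation list combine into one decision can be sketched as a small function. Everything here (`Policy`, `decide`, the field names) is hypothetical and not tied to any SDK; substring matching is used only to keep the example short and would be too weak for a real denylist.

```typescript
type Decision = "allow" | "deny" | "confirm";

// Hypothetical policy shape mirroring the configuration above.
interface Policy {
  allowedTools: string[];
  deniedPatterns: string[];
  confirmPatterns: string[];
}

// Deny rules win; anything not allowlisted is denied; sensitive input needs confirmation.
function decide(policy: Policy, tool: string, input: string): Decision {
  if (policy.deniedPatterns.some((pattern) => input.includes(pattern))) return "deny";
  if (policy.allowedTools.indexOf(tool) === -1) return "deny";
  if (policy.confirmPatterns.some((pattern) => input.includes(pattern))) return "confirm";
  return "allow";
}

const readOnlyPolicy: Policy = {
  allowedTools: ["Read", "Grep", "Glob"],
  deniedPatterns: ["rm -rf", "sudo"],
  confirmPatterns: ["git push"],
};

console.log(decide(readOnlyPolicy, "Read", "src/index.ts")); // allow
console.log(decide(readOnlyPolicy, "Write", "src/index.ts")); // deny
```

Checking the denylist first means a dangerous input is rejected even when the tool itself is allowlisted.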

Progressive Disclosure

Pattern: Query-Based Disclosure

Best Practice:

skill/
├── SKILL.md
│   Content: High-level overview
│   Size: < 2KB
│   Purpose: Introduce skill capabilities
│
├── index.md
│   Content: Navigation/TOC
│   Size: < 1KB
│   Purpose: Guide to available topics
│
└── content/
    └── topic/
        ├── overview.md (load first)
        ├── details.md (load on demand)
        └── examples.md (load when requested)

Loading Strategy:

1. Initial load: SKILL.md + index.md (~3KB)
2. User asks about "authentication"
3. Load: authentication/overview.md (~1KB)
4. User needs details
5. Load: authentication/details.md (~3KB)
6. User wants examples
7. Load: authentication/examples.md (~2KB)

Total: 9KB loaded vs 50KB if dumped all at once
Savings: 82% context reduction
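
The arithmetic in the loading strategy can be reproduced with a small tracker. This is a sketch under the sizes stated above; `DisclosureTracker` and `docSizesKb` are hypothetical names, and a real loader would read files rather than a size table.

```typescript
// Sizes in KB, taken from the loading strategy above.
const docSizesKb: Record<string, number> = {
  "SKILL.md": 2,
  "index.md": 1,
  "authentication/overview.md": 1,
  "authentication/details.md": 3,
  "authentication/examples.md": 2,
};
const FULL_DUMP_KB = 50;

class DisclosureTracker {
  private loaded = new Set<string>();

  // Mark a document as loaded; loading it twice costs nothing extra.
  load(path: string): void {
    if (!(path in docSizesKb)) throw new Error(`Unknown document: ${path}`);
    this.loaded.add(path);
  }

  totalKb(): number {
    let total = 0;
    this.loaded.forEach((path) => {
      total += docSizesKb[path] ?? 0;
    });
    return total;
  }

  // Percentage saved compared with dumping everything at once.
  savingsPercent(): number {
    return Math.round((1 - this.totalKb() / FULL_DUMP_KB) * 100);
  }
}

const tracker = new DisclosureTracker();
Object.keys(docSizesKb).forEach((path) => tracker.load(path));
console.log(`${tracker.totalKb()}KB loaded, ${tracker.savingsPercent()}% saved`);
```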

Pattern: Hierarchical Expertise

Best Practice:

expertise/
├── by_task/
│   ├── authentication.md
│   ├── api_design.md
│   └── testing.md
├── by_language/
│   ├── typescript.md
│   ├── python.md
│   └── rust.md
├── by_pattern/
│   ├── repository.md
│   └── factory.md
└── quick_reference/
    └── cheatsheet.md

Query Examples:

"How to implement auth?" → Load: by_task/authentication.md
"TypeScript style guide?" → Load: by_language/typescript.md
"Repository pattern?" → Load: by_pattern/repository.md
"Quick naming conventions?" → Load: quick_reference/cheatsheet.md

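The query examples above amount to a routing table. A minimal keyword-based sketch follows; `routes` and `routeQuery` are hypothetical, and simple substring matching stands in for whatever matching the model or framework actually performs.

```typescript
// Keyword-to-file routes, mirroring the query examples above.
const routes: Array<{ keywords: string[]; file: string }> = [
  { keywords: ["auth"], file: "by_task/authentication.md" },
  { keywords: ["typescript"], file: "by_language/typescript.md" },
  { keywords: ["repository"], file: "by_pattern/repository.md" },
  { keywords: ["naming", "quick"], file: "quick_reference/cheatsheet.md" },
];

// Return the first file whose keywords appear in the query, or null if none match.
function routeQuery(query: string): string | null {
  const normalized = query.toLowerCase();
  for (const route of routes) {
    if (route.keywords.some((keyword) => normalized.includes(keyword))) {
      return route.file;
    }
  }
  return null;
}

console.log(routeQuery("How to implement auth?")); // by_task/authentication.md
console.log(routeQuery("Quick naming conventions?")); // quick_reference/cheatsheet.md
```
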
Context Management

Best Practice 1: Periodic Context Reset

Why: Long sessions accumulate irrelevant context.

When to reset:

After completing major task
Context feels "bloated"
Responses become slower
Approaching token limits

How:

Option 1: New conversation
- Start fresh conversation
- Provide summary of previous work

Option 2: Explicit reset request
- Ask Claude to disregard context that is no longer relevant (this refocuses the model but does not free tokens already in the window)
- Summarize key points to retain

Option 3: Use separate agents
- Different agents for different tasks
- Clean contexts per task

Best Practice 2: Summarize, Don't Dump

Anti-Pattern:

❌ DON'T: Dump raw logs

"Here are the test results:"
[10,000 lines of test output]

Best Practice:

✅ DO: Summarize key information

"Test Results Summary:
- Total: 1,247 tests
- Passed: 1,245 (99.8%)
- Failed: 2
  - test_auth_token_expiration (line 456)
  - test_rate_limiting (line 789)
- Duration: 2m 34s"
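
Producing such a summary from raw output can be automated. The sketch below assumes a hypothetical line format of `PASS <name>` or `FAIL <name>`; real test runners differ, so the pattern would need adapting.

```typescript
interface TestSummary {
  total: number;
  passed: number;
  failed: string[];
}

// Assumption: each result line looks like "PASS <name>" or "FAIL <name>".
function summarize(rawOutput: string): TestSummary {
  const summary: TestSummary = { total: 0, passed: 0, failed: [] };
  for (const line of rawOutput.split("\n")) {
    const match = /^(PASS|FAIL)\s+(\S+)/.exec(line.trim());
    if (match === null) continue; // skip noise such as stack traces
    summary.total += 1;
    if (match[1] === "PASS") {
      summary.passed += 1;
    } else {
      summary.failed.push(match[2]);
    }
  }
  return summary;
}

// Format the summary in the compact style shown above.
function formatSummary(summary: TestSummary): string {
  const rate =
    summary.total === 0
      ? "n/a"
      : `${((summary.passed / summary.total) * 100).toFixed(1)}%`;
  return [
    "Test Results Summary:",
    `- Total: ${summary.total}`,
    `- Passed: ${summary.passed} (${rate})`,
    `- Failed: ${summary.failed.length}`,
    ...summary.failed.map((name) => `  - ${name}`),
  ].join("\n");
}

const rawLog = "PASS test_login\nFAIL test_rate_limiting\n    at some stack frame\nPASS test_logout";
console.log(formatSummary(summarize(rawLog)));
```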

Best Practice 3: Compress Repeated Information

Anti-Pattern:

❌ DON'T: Repeat same information

Task 1: "Following these coding standards: [full standards]"
Task 2: "Following these coding standards: [full standards]"
Task 3: "Following these coding standards: [full standards]"

Best Practice:

✅ DO: Reference once, use skill

Task 1: Load: coding-standards-skill
Task 2: "Continue following loaded coding standards"
Task 3: "Continue following loaded coding standards"

Security & Safety

Best Practice 1: Minimal Permissions

Principle: Grant minimum tools needed for task.

Example: Code Analysis

const analysisAgent = new Agent({
  tools: [
    'Read',   // Read code
    'Grep',   // Search code
    'Glob'    // Find files
  ]
  // NO Write, Edit, Bash, etc.
});

Example: Code Modification

const codeAgent = new Agent({
  tools: [
    'Read',   // Read existing code
    'Edit'    // Modify code
  ]
  // NO Bash (can't execute)
  // NO full Write (use Edit for safety)
});

Best Practice 2: Confirmation for Sensitive Operations

Always require confirmation:

git push
Deployment commands
Data deletion
System modifications
API calls to production
Database changes

Implementation:

const deployAgent = new Agent({
  confirmations: [
    'git push',
    'npm publish',
    'kubectl apply',
    'terraform apply',
    'aws',
    'gcloud'
  ]
});

Best Practice 3: Audit Logging

Why: Track all AI operations for security and debugging.

What to log:

Tool usage
Commands executed
Files modified
API calls made
Errors encountered
User confirmations

Implementation:

const auditedAgent = new Agent({
  audit: {
    enabled: true,
    level: 'verbose',
    includeContext: true,
    destination: './logs/agent-audit.log',
    retention: '90days'
  }
});
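
What an audit record might contain can be sketched with an in-memory log. `AuditLog`, `AuditEntry`, and the field names are hypothetical; writing to `./logs/agent-audit.log` and enforcing retention are left out.

```typescript
type Outcome = "ok" | "error" | "confirmed" | "denied";

interface AuditEntry {
  timestamp: string; // ISO 8601
  tool: string;
  detail: string;
  outcome: Outcome;
}

class AuditLog {
  private entries: AuditEntry[] = [];

  // Record one operation; `now` is injectable so tests can use a fixed time.
  record(tool: string, detail: string, outcome: Outcome, now: Date = new Date()): void {
    this.entries.push({ timestamp: now.toISOString(), tool, detail, outcome });
  }

  // One JSON object per line, ready to append to a log file.
  toJsonLines(): string {
    return this.entries.map((entry) => JSON.stringify(entry)).join("\n");
  }
}

const auditLog = new AuditLog();
auditLog.record("Edit", "src/app.ts", "ok");
auditLog.record("Bash", "git push origin main", "confirmed");
console.log(auditLog.toJsonLines());
```

JSON Lines keeps each operation on a single line, which makes the log easy to grep and to append to.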

Performance Optimization

Best Practice 1: Parallel Execution

When possible, parallelize:

Anti-Pattern:

❌ Sequential (slow):
1. Analyze file1.ts → 10s
2. Analyze file2.ts → 10s
3. Analyze file3.ts → 10s
Total: 30s

Best Practice:

✅ Parallel (fast):
1. Analyze file1.ts ──┐
2. Analyze file2.ts ──┼→ All run simultaneously
3. Analyze file3.ts ──┘
Total: 10s (3x faster)

Implementation:

Launch 3 agents in parallel:
- Agent 1: file1.ts
- Agent 2: file2.ts
- Agent 3: file3.ts
Aggregate results
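
In code, the fan-out and aggregation can be sketched with `Promise.all`. Here `analyzeFile` is a hypothetical stand-in for launching an agent on one file; it only simulates work with a short timer.

```typescript
// Hypothetical stand-in for running one agent on one file.
async function analyzeFile(path: string): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return `analysis of ${path}`;
}

// Start all analyses at once and collect the results in input order.
async function analyzeAll(paths: string[]): Promise<string[]> {
  return Promise.all(paths.map((path) => analyzeFile(path)));
}

analyzeAll(["file1.ts", "file2.ts", "file3.ts"]).then((results) => {
  console.log(results);
});
```

`Promise.all` rejects as soon as any task fails; `Promise.allSettled` is the alternative when partial results are still useful.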

Best Practice 2: Cache Frequent Queries

Pattern:

skill/
└── cache/
    ├── frequently_asked.md
    └── common_patterns.md

Example:

Common query: "How to handle errors?"

Instead of processing each time:
1. Maintain: error_handling.md with comprehensive guide
2. Query → Immediately load cached response
3. Fast, consistent responses
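
The same idea, expressed as an in-process cache, might look like the sketch below. `AnswerCache` is a hypothetical helper; the `compute` callback stands in for whatever expensive step produces the answer.

```typescript
class AnswerCache {
  private store = new Map<string, string>();
  private misses = 0;
  private compute: (query: string) => string;

  constructor(compute: (query: string) => string) {
    this.compute = compute;
  }

  // Normalize the query so trivially different phrasings share one entry.
  get(query: string): string {
    const key = query.trim().toLowerCase();
    const cached = this.store.get(key);
    if (cached !== undefined) return cached;
    this.misses += 1;
    const answer = this.compute(key);
    this.store.set(key, answer);
    return answer;
  }

  missCount(): number {
    return this.misses;
  }
}

const faqCache = new AnswerCache((query) => `guide for: ${query}`);
faqCache.get("How to handle errors?");
faqCache.get("  how to handle errors?  ");
console.log(faqCache.missCount()); // 1
```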

Best Practice 3: Optimize Token Usage

Token Optimization Checklist:

Use progressive disclosure
Summarize large outputs
Remove redundant information
Compress repeated content
Use shorter variable names in examples
Remove unnecessary whitespace
Reference external docs vs embedding

Example:

❌ High token usage:
const myVeryLongDescriptiveVariableName = 'value';
const anotherVeryLongDescriptiveVariableName = 'value';

✅ Optimized:
const user = 'value';
const data = 'value';
// Still clear, fewer tokens

Skill Design

Best Practice 1: Single Responsibility

Principle: Each skill should have one clear purpose.

Anti-Pattern:

❌ DON'T: Mega-skill doing everything

super-skill/
├── frontend/
├── backend/
├── database/
├── devops/
└── testing/
→ Too broad, context overload

Best Practice:

✅ DO: Focused skills

frontend-expert/
├── components/
├── styling/
└── accessibility/

backend-expert/
├── apis/
├── services/
└── databases/

Best Practice 2: Clear Documentation

Skill Documentation Template:

---
name: skill-name
description: One-sentence description
---

# Skill Name

## What This Skill Does
[2-3 sentences explaining purpose]

## When to Use
- ✅ Use case 1
- ✅ Use case 2
- ❌ Not for use case 3

## Quick Start
[Simple example]

## Reference Materials
- file1.md - Description
- file2.md - Description

## Examples
[2-3 concrete examples]

Best Practice 3: Version Skills

Why: Track changes, enable rollback, communicate updates.

Structure:

skill/
├── VERSION (e.g., 2.1.0)
├── CHANGELOG.md
├── SKILL.md
└── references/

CHANGELOG.md:

# Changelog

## [2.1.0] - 2025-01-15
### Added
- New pattern: async error handling
- Examples for TypeScript 5.x

### Changed
- Updated API guidelines for REST

### Fixed
- Corrected authentication example

## [2.0.0] - 2024-12-01
### Breaking Changes
- Restructured reference materials

Agent Design

Best Practice 1: Clear Agent Boundaries

Principle: Each agent should have clear, distinct responsibilities.

Anti-Pattern:

❌ DON'T: Monolithic agent doing everything

BigAgent
├── Explores codebase
├── Plans changes
├── Executes changes
├── Runs tests
├── Deploys
└── Monitors
→ Too much responsibility, hard to debug

Best Practice:

✅ DO: Specialized agents

Main Orchestrator
├── Explore Agent (read-only)
├── Plan Agent (planning)
├── Code Agent (implementation)
├── Test Agent (validation)
└── Report Agent (aggregation)

Best Practice 2: Agent Communication Patterns

Pattern: Parent-Child

Main Agent
│
├─→ Subagent 1: Task
│   └─→ Returns: Result
│
├─→ Subagent 2: Task
│   └─→ Returns: Result
│
└─→ Main aggregates results

Pattern: Pipeline

Agent 1: Explore
  └─→ Output: Analysis
      └─→ Agent 2: Plan
          └─→ Output: Plan
              └─→ Agent 3: Execute
                  └─→ Output: Changes

Pattern: Parallel Workers

Coordinator
├─┬─ Worker 1 ──┐
│ ├─ Worker 2 ──┤
│ ├─ Worker 3 ──┼→ Aggregator → Result
│ └─ Worker 4 ──┘

Best Practice 3: Error Handling in Agents

Principle: Graceful failure and recovery.

Pattern:

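// Assumes `agent`, `logger`, `isRecoverable`, `retryWithBackoff`, and `fallbackStrategy` are defined elsewhere.
const resilientAgent = async (task) => {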
  try {
    const result = await agent.run(task);
    return result;
  } catch (error) {
    // Log error
    logger.error('Agent failed', error);

    // Attempt recovery
    if (isRecoverable(error)) {
      return await retryWithBackoff(agent, task);
    }

    // Fallback strategy
    return await fallbackStrategy(task);
  }
};
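
The pattern above leaves `retryWithBackoff` undefined. One possible shape is sketched here, with exponential delays; note that this version takes an operation closure instead of the `(agent, task)` pair used above, and the default attempt count and delay are assumptions.

```typescript
// Retry an async operation, waiting baseDelayMs, then 2x, then 4x, between attempts.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        const delayMs = baseDelayMs * 2 ** attempt;
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Usage: an operation that fails twice, then succeeds.
let flakyCalls = 0;
const flakyOperation = async (): Promise<string> => {
  flakyCalls += 1;
  if (flakyCalls < 3) throw new Error("transient failure");
  return "ok";
};
retryWithBackoff(flakyOperation, 3, 10).then((value) => console.log(value));
```

With the closure form, the earlier call would become `retryWithBackoff(() => agent.run(task))`.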

Testing & Validation

Best Practice 1: Test Skills

What to test:

Skill loads correctly
References are accessible
Examples are valid
Scripts execute successfully

Example:

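#!/bin/bash
# test_skill.sh: basic structural checks for a skill directory
# Usage: ./test_skill.sh path/to/skill

# Fail with a usage message if no directory is given
skill_dir="${1:?Usage: $0 <skill-dir>}"

# Unmatched globs expand to nothing instead of the literal pattern,
# so a skill without references/ or scripts/ does not fail spuriously
shopt -s nullglob

echo "Testing skill: $skill_dir"

# Test 1: Skill file exists
if [ ! -f "$skill_dir/SKILL.md" ]; then
  echo "❌ SKILL.md not found"
  exit 1
fi

# Test 2: References are regular, non-empty files
for ref in "$skill_dir"/references/*.md; do
  if [ ! -f "$ref" ] || [ ! -s "$ref" ]; then
    echo "❌ Reference missing or empty: $ref"
    exit 1
  fi
done

# Test 3: Scripts are executable
for script in "$skill_dir"/scripts/*.sh; do
  if [ ! -x "$script" ]; then
    echo "❌ Script not executable: $script"
    exit 1
  fi
done

echo "✅ All tests passed"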

Best Practice 2: Validate Agent Output

Pattern:

const validateAgentOutput = async (output) => {
  // Schema validation
  if (!matchesSchema(output)) {
    throw new Error('Invalid output schema');
  }

  // Business logic validation
  if (!meetsRequirements(output)) {
    throw new Error('Output doesn\'t meet requirements');
  }

  // Safety checks
  if (containsDangerousContent(output)) {
    throw new Error('Output contains dangerous content');
  }

  return output;
};
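
The helpers in the pattern above (`matchesSchema`, `meetsRequirements`) are left undefined. A concrete sketch of the first two steps follows, assuming a hypothetical `ReviewOutput` shape returned as JSON; safety checks are domain-specific and omitted here.

```typescript
// Hypothetical output shape for a code-review agent.
interface ReviewOutput {
  summary: string;
  issues: string[];
}

// Schema validation as a type guard.
function isReviewOutput(value: unknown): value is ReviewOutput {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.summary === "string" &&
    Array.isArray(candidate.issues) &&
    candidate.issues.every((issue) => typeof issue === "string")
  );
}

// Parse raw agent output, then apply schema and requirement checks.
function validateReview(raw: string): ReviewOutput {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Output is not valid JSON");
  }
  if (!isReviewOutput(parsed)) {
    throw new Error("Invalid output schema");
  }
  if (parsed.summary.trim() === "") {
    throw new Error("Output doesn't meet requirements: empty summary");
  }
  return parsed;
}

const review = validateReview('{"summary":"Looks fine","issues":["unused import"]}');
console.log(review.issues.length); // 1
```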

Maintenance & Evolution

Best Practice 1: Regular Skill Updates

Schedule:

Monthly: Review and update examples
Quarterly: Major updates for new patterns
Yearly: Comprehensive review and restructure

Update Checklist:

New patterns added
Deprecated patterns removed
Examples updated for current versions
Documentation improved
User feedback incorporated
Version bumped
Changelog updated

Best Practice 2: Deprecation Strategy

When deprecating:

## [3.0.0] - 2025-06-01

### Deprecated
⚠️ OLD PATTERN (Deprecated, remove in 4.0.0):
[Old pattern example]

✅ NEW PATTERN (Use instead):
[New pattern example]

Migration guide: See MIGRATION.md

Deprecation Timeline:

Announce deprecation (version N)
Maintain both patterns (version N+1)
Remove old pattern (version N+2)

Cost Optimization

Best Practice 1: Token Efficiency

Strategies:

Use progressive disclosure (load less)
Summarize outputs (fewer tokens)
Cache frequent queries (reuse)
Compress repeated content (deduplicate)
Choose smaller models when possible (Haiku vs Sonnet)

Example:

Task: Simple syntax error fix

❌ Expensive: Use Sonnet for everything
Cost: $X per request

✅ Optimized: Use Haiku for simple tasks
Cost: ~$X/5 per request (illustrative ratio; check current pricing)
Savings: ~80%

Best Practice 2: Model Selection

Choose model based on complexity:

Haiku (Fast, Cheap):

Simple queries
Straightforward tasks
Well-defined operations
Cost-sensitive applications

Sonnet (Balanced):

Medium complexity
Most general tasks
Good balance of capability/cost
Default choice

Opus (Powerful, Expensive):

Complex reasoning
Critical tasks
High-stakes decisions
Quality over cost
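
The selection rules above can be written as a small function. This sketch returns tier names, not concrete model identifiers, and the `TaskProfile` fields are hypothetical simplifications of the criteria listed.

```typescript
type ModelTier = "haiku" | "sonnet" | "opus";

// Hypothetical task description; real systems may use richer signals.
interface TaskProfile {
  complexity: "low" | "medium" | "high";
  highStakes: boolean;
}

// Pick the cheapest tier that matches the guidance above.
function selectTier(task: TaskProfile): ModelTier {
  if (task.highStakes || task.complexity === "high") return "opus";
  if (task.complexity === "low") return "haiku";
  return "sonnet"; // default choice for most general tasks
}

console.log(selectTier({ complexity: "low", highStakes: false })); // haiku
console.log(selectTier({ complexity: "medium", highStakes: false })); // sonnet
console.log(selectTier({ complexity: "low", highStakes: true })); // opus
```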

Summary Checklist

Design Phase:

Start with simplest solution
Apply progressive disclosure
Plan for context efficiency
Design security boundaries
Consider performance needs

Implementation Phase:

Follow single responsibility
Implement clear documentation
Add version control
Include error handling
Add audit logging

Testing Phase:

Test skill loading
Validate agent outputs
Check security controls
Verify performance
Test edge cases

Maintenance Phase:

Regular updates
Deprecation strategy
User feedback loop
Cost monitoring
Performance optimization

Remember: Best practices evolve. Stay current with Anthropic updates and community patterns.
