Files
gh-cskiro-claudex-claude-co…/skills/claude-md-auditor/README.md
2025-11-29 18:16:51 +08:00

17 KiB
Raw Blame History

CLAUDE.md Auditor

Comprehensive validation and optimization tool for CLAUDE.md memory files in Claude Code

An Anthropic Skill that analyzes CLAUDE.md configuration files against official Anthropic documentation, community best practices, and academic research on LLM context optimization.

License Python Status


Quick Start

# 1. Copy skill to your skills directory
cp -r claude-md-auditor ~/.claude/skills/

# 2. Use in Claude Code
claude
> Audit my CLAUDE.md using the claude-md-auditor skill

Direct Script Usage

# Markdown audit report
python claude-md-auditor/scripts/analyzer.py ./CLAUDE.md

# JSON report (for CI/CD)
python claude-md-auditor/scripts/report_generator.py ./CLAUDE.md json > audit.json

# Generate refactored CLAUDE.md
python claude-md-auditor/scripts/report_generator.py ./CLAUDE.md refactored > CLAUDE_refactored.md

What Does It Do?

Validates Against Three Sources

Source Authority Examples
Official Anthropic Docs Highest Memory hierarchy, import syntax, "keep them lean"
💡 Community Best Practices Medium 100-300 line target, 80/20 rule, maintenance cadence
🔬 Academic Research Medium "Lost in the middle" positioning, token optimization

Detects Issues

  • 🚨 CRITICAL: Secrets (API keys, passwords), security vulnerabilities
  • ⚠️ HIGH: Generic content, excessive verbosity, vague instructions
  • 📋 MEDIUM: Outdated info, broken links, duplicate sections
  • LOW: Formatting issues, organizational improvements

Generates Output

  1. Markdown Report: Human-readable audit with detailed findings
  2. JSON Report: Machine-readable for CI/CD integration
  3. Refactored CLAUDE.md: Production-ready improved version

Features

🔒 Security Validation (CRITICAL)

Detects exposed secrets using pattern matching:

  • API keys (OpenAI, AWS, generic)
  • Tokens and passwords
  • Database connection strings
  • Private keys (PEM format)
  • Internal IP addresses

Why Critical: CLAUDE.md files are often committed to git. Exposed secrets can leak through history, PRs, logs, or backups.

Official Compliance

Validates against docs.claude.com guidance:

  • File length ("keep them lean")
  • Generic content (Claude already knows basic programming)
  • Import syntax (@path/to/import, max 5 hops)
  • Vague instructions (specific vs. ambiguous)
  • Proper markdown structure

💡 Best Practices

Evaluates community recommendations:

  • Optimal file length (100-300 lines)
  • Token usage (< 3,000 tokens / <2% of 200K context)
  • Organization (sections, headers, priority markers)
  • Maintenance (update dates, version info)
  • Duplicate or conflicting content

🔬 Research Optimization

Applies academic insights:

  • "Lost in the Middle" positioning (critical info at top/bottom)
  • Token efficiency and context utilization
  • Chunking and information architecture
  • Attention pattern optimization

Based On:

  • "Lost in the Middle" (Liu et al., 2023, TACL)
  • Claude-specific performance studies
  • Context awareness research (MIT/Google Cloud AI, 2024)

Installation

# Clone or copy to your skills directory
mkdir -p ~/.claude/skills
cp -r claude-md-auditor ~/.claude/skills/

# Verify installation
ls ~/.claude/skills/claude-md-auditor/SKILL.md

Option 2: Standalone Scripts

# Clone repository
git clone https://github.com/cskiro/annex.git
cd annex/claude-md-auditor

# Run directly (Python 3.8+ required, no dependencies)
python scripts/analyzer.py path/to/CLAUDE.md

Requirements

  • Python: 3.8 or higher
  • Dependencies: None (uses standard library only)
  • Claude Code: Any version with Skills support (optional)

Usage

Basic Audit

With Claude Code:

Audit my CLAUDE.md using the claude-md-auditor skill.

Direct:

python scripts/analyzer.py ./CLAUDE.md

Output:

============================================================
CLAUDE.md Audit Results: ./CLAUDE.md
============================================================

Overall Health Score: 78/100
Security Score: 100/100
Official Compliance Score: 75/100
Best Practices Score: 70/100
Research Optimization Score: 85/100

============================================================
Findings Summary:
  🚨 Critical: 0
  ⚠️  High: 2
  📋 Medium: 3
    Low: 5
============================================================

Generate JSON Report

For CI/CD Integration:

python scripts/report_generator.py ./CLAUDE.md json > audit.json

Example Output:

{
  "metadata": {
    "file": "./CLAUDE.md",
    "generated_at": "2025-10-26 10:30:00",
    "tier": "Project"
  },
  "scores": {
    "overall": 78,
    "security": 100,
    "official_compliance": 75,
    "critical_count": 0,
    "high_count": 2
  },
  "findings": [...]
}

Generate Refactored File

Create improved CLAUDE.md:

python scripts/report_generator.py ./CLAUDE.md refactored > CLAUDE_refactored.md

This generates a production-ready file with:

  • Optimal structure (critical at top, reference at bottom)
  • Research-based positioning ("lost in the middle" mitigation)
  • All security issues removed
  • Best practices applied
  • Inline comments for maintenance

Output Examples

Markdown Report Structure

# CLAUDE.md Audit Report

## Executive Summary
- Overall health: 78/100 (Good)
- Status: ⚠️ **HIGH PRIORITY** - Address this sprint
- Total findings: 10 (0 critical, 2 high, 3 medium, 5 low)

## Score Dashboard
| Category | Score | Status |
|----------|-------|--------|
| Security | 100/100 | ✅ Excellent |
| Official Compliance | 75/100 | 🟢 Good |
| Best Practices | 70/100 | 🟢 Good |

## Detailed Findings

### ⚠️ HIGH Priority

#### 1. Generic Programming Content Detected
**Category**: Official Compliance
**Source**: Official Guidance

**Description**: File contains generic React documentation

**Impact**: Wastes context window. Official guidance:
"Don't include basic programming concepts Claude already understands"

**Remediation**: Remove generic content. Focus on project-specific standards.

---

JSON Report Structure

{
  "metadata": {
    "file_path": "./CLAUDE.md",
    "line_count": 245,
    "token_estimate": 3240,
    "context_usage_200k": 1.62,
    "tier": "Project"
  },
  "scores": {
    "overall": 78,
    "security": 100,
    "official_compliance": 75,
    "best_practices": 70,
    "research_optimization": 85
  },
  "findings": [
    {
      "severity": "high",
      "category": "official_compliance",
      "title": "Generic Programming Content Detected",
      "description": "File contains generic React documentation",
      "line_number": 42,
      "source": "official",
      "remediation": "Remove generic content..."
    }
  ]
}

Reference Documentation

Complete Validation Criteria

All checks are documented in the reference/ directory:

Document Content Source
official_guidance.md Complete official Anthropic documentation docs.claude.com
best_practices.md Community recommendations and field experience Practitioners
research_insights.md Academic research on LLM context optimization Peer-reviewed papers
anti_patterns.md Catalog of common mistakes and violations Field observations

Key References


CI/CD Integration

GitHub Actions Example

name: CLAUDE.md Audit

on:
  pull_request:
    paths:
      - '**/CLAUDE.md'

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Run CLAUDE.md Audit
        run: |
          python claude-md-auditor/scripts/analyzer.py CLAUDE.md \
            --format json \
            --output audit.json

      - name: Check Critical Issues
        run: |
          CRITICAL=$(python -c "import json; print(json.load(open('audit.json'))['summary']['critical'])")

          if [ "$CRITICAL" -gt 0 ]; then
            echo "❌ Critical issues found in CLAUDE.md"
            exit 1
          fi

          echo "✅ CLAUDE.md validation passed"

Pre-Commit Hook

#!/bin/bash
# .git/hooks/pre-commit

if git diff --cached --name-only | grep -q "CLAUDE.md"; then
  echo "Validating CLAUDE.md..."

  python claude-md-auditor/scripts/analyzer.py CLAUDE.md > /tmp/audit.txt

  # Check exit code or parse output
  if grep -q "🚨 Critical: [1-9]" /tmp/audit.txt; then
    echo "❌ CLAUDE.md has critical issues"
    cat /tmp/audit.txt
    exit 1
  fi

  echo "✅ CLAUDE.md validation passed"
fi

VS Code Task

Add to .vscode/tasks.json:

{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Audit CLAUDE.md",
      "type": "shell",
      "command": "python",
      "args": [
        "${workspaceFolder}/claude-md-auditor/scripts/analyzer.py",
        "${workspaceFolder}/CLAUDE.md"
      ],
      "group": "test",
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    }
  ]
}

Understanding Scores

Overall Health Score (0-100)

Range Status Action
90-100 Excellent Minor optimizations only
75-89 🟢 Good Some improvements recommended
60-74 🟡 Fair Schedule improvements this quarter
40-59 🟠 Poor Significant issues to address
0-39 🔴 Critical Immediate action required

Severity Levels

  • 🚨 CRITICAL: Security risks (fix immediately, within 24 hours)
  • ⚠️ HIGH: Significant issues (fix this sprint, within 2 weeks)
  • 📋 MEDIUM: Moderate improvements (schedule for next quarter)
  • LOW: Minor optimizations (backlog)

Category Scores

  • Security: Should always be 100 (any security issue is critical)
  • Official Compliance: Aim for 80+ (follow Anthropic guidance)
  • Best Practices: 70+ is good (community recommendations are flexible)
  • Research Optimization: 60+ is acceptable (optimizations, not requirements)

Real-World Examples

Example 1: Security Violation

Before (Anti-Pattern):

# CLAUDE.md

## API Configuration
- API Key: sk-1234567890abcdefghijklmnop
- Database: postgres://admin:pass@10.0.1.42/db

Audit Finding:

🚨 CRITICAL: API Key Detected
Line: 4
Impact: Security breach risk. Secrets exposed in git history.
Remediation:
1. Remove the API key immediately
2. Rotate the compromised credential
3. Use environment variables (.env file)
4. Clean git history if committed

After (Fixed):

# CLAUDE.md

## API Configuration
- API keys: Stored in .env (see .env.example for template)
- Database: Use AWS Secrets Manager connection string
- Access: Contact team lead for credentials

Example 2: Generic Content

Before (Anti-Pattern):

## React Best Practices

React is a JavaScript library for building user interfaces.
It was created by Facebook in 2013. React uses a virtual
DOM for efficient updates...

[200 lines of React documentation]

Audit Finding:

⚠️ HIGH: Generic Programming Content Detected
Impact: Wastes context window. Claude already knows React basics.
Remediation: Remove generic content. Focus on project-specific patterns.

After (Fixed):

## React Standards (Project-Specific)

- Functional components only (no class components)
- Custom hooks location: /src/hooks
- Co-location pattern: Component + test + styles in same directory
- Props interface naming: [ComponentName]Props

Example 3: Optimal Structure

Generated Refactored File:

# MyApp

## 🚨 CRITICAL: Must-Follow Standards

<!-- Top position = highest attention -->

- Security: Never commit secrets to git
- TypeScript strict mode: No `any` types
- Testing: 80% coverage on all new code

## 📋 Project Overview

**Tech Stack**: React, TypeScript, Vite, PostgreSQL
**Architecture**: Feature-based modules, clean architecture
**Purpose**: Enterprise CRM system

## 🔧 Development Workflow

### Git
- Branches: `feature/{name}`, `bugfix/{name}`
- Commits: Conventional format required
- PRs: Tests + review + passing CI

## 📝 Code Standards

[Project-specific rules here]

## 📌 REFERENCE: Common Tasks

<!-- Bottom position = recency attention -->

```bash
npm run build        # Build production
npm test            # Run tests
npm run deploy      # Deploy to staging

Key Files

  • Config: /config/app.config.ts
  • Types: /src/types/index.ts

---

## FAQ

### Q: Will this skill automatically fix my CLAUDE.md?

**A**: No, but it can generate a refactored version. You need to review and apply changes manually to ensure they fit your project.

### Q: Are all recommendations mandatory?

**A**: No. Check the **source** field:
- **Official**: Follow Anthropic documentation (highest priority)
- **Community**: Recommended best practices (flexible)
- **Research**: Evidence-based optimizations (optional)

### Q: What if I disagree with a finding?

**A**: That's okay! Best practices are guidelines, not requirements. Official guidance should be followed, but community and research recommendations can be adapted to your context.

### Q: How often should I audit CLAUDE.md?

**A**:
- **On every change**: Before committing (use pre-commit hook)
- **Quarterly**: Regular maintenance audit
- **Before major releases**: Ensure standards are up-to-date
- **When onboarding**: Validate project configuration

### Q: Can I use this in my CI/CD pipeline?

**A**: Yes! Use JSON output mode and check for critical findings. Example provided in CI/CD Integration section.

### Q: Does this validate that Claude actually follows the standards?

**A**: No, this only validates the CLAUDE.md structure and content. To test effectiveness, start a new Claude session and verify standards are followed without re-stating them.

---

## Limitations

### What This Skill CANNOT Do

- ❌ Automatically fix security issues (manual remediation required)
- ❌ Test if Claude follows standards (behavioral testing needed)
- ❌ Validate imported files beyond path existence
- ❌ Detect circular imports (requires graph traversal)
- ❌ Verify standards match actual codebase
- ❌ Determine if standards are appropriate for your project

### Known Issues

- Import depth validation (max 5 hops) not yet implemented
- Circular import detection not yet implemented
- Cannot read contents of imported files for validation

---

## Roadmap

### v1.1 (Planned)

- [ ] Import graph traversal (detect circular imports)
- [ ] Import depth validation (max 5 hops)
- [ ] Content validation of imported files
- [ ] Interactive CLI for guided fixes
- [ ] HTML dashboard report format

### v1.2 (Planned)

- [ ] Effectiveness testing (test if Claude follows standards)
- [ ] Diff mode (compare before/after audits)
- [ ] Metrics tracking over time
- [ ] Custom rule definitions
- [ ] Integration with popular IDEs

---

## Contributing

This skill is based on three authoritative sources:

1. **Official Anthropic Documentation** (docs.claude.com)
2. **Peer-Reviewed Academic Research** (MIT, Google Cloud AI, TACL)
3. **Community Field Experience** (practitioner reports)

To propose changes:

1. Identify which category (official/community/research)
2. Provide source documentation or evidence
3. Explain rationale and expected impact
4. Update relevant reference documentation

---

## License

Apache 2.0 - Example skill for demonstration purposes

---

## Version

**Current Version**: 1.0.0
**Last Updated**: 2025-10-26
**Python**: 3.8+
**Status**: Stable

---

## Credits

**Developed By**: Connor (based on Anthropic Skills framework)

**Research Sources**:
- Liu et al. (2023) - "Lost in the Middle" (TACL/MIT Press)
- MIT/Google Cloud AI (2024) - Attention calibration research
- Anthropic Engineering (2023-2025) - Claude documentation and blog

**Special Thanks**: Anthropic team for Claude Code and Skills framework

---

**🔗 Links**:
- [Claude Code Documentation](https://docs.claude.com/en/docs/claude-code/overview)
- [Anthropic Skills Repository](https://github.com/anthropics/skills)
- [Memory Management Guide](https://docs.claude.com/en/docs/claude-code/memory)

---

*Generated by claude-md-auditor v1.0.0*