# CLAUDE.md Auditor

> Comprehensive validation and optimization tool for CLAUDE.md memory files in Claude Code

An Anthropic Skill that analyzes CLAUDE.md configuration files against **official Anthropic documentation**, **community best practices**, and **academic research** on LLM context optimization.

---

## Quick Start

### With Claude Code (Recommended)

```bash
# 1. Copy skill to your skills directory
cp -r claude-md-auditor ~/.claude/skills/

# 2. Use in Claude Code
claude
> Audit my CLAUDE.md using the claude-md-auditor skill
```

### Direct Script Usage

```bash
# Markdown audit report
python claude-md-auditor/scripts/analyzer.py ./CLAUDE.md

# JSON report (for CI/CD)
python claude-md-auditor/scripts/report_generator.py ./CLAUDE.md json > audit.json

# Generate refactored CLAUDE.md
python claude-md-auditor/scripts/report_generator.py ./CLAUDE.md refactored > CLAUDE_refactored.md
```

---

## What Does It Do?

### Validates Against Three Sources

| Source | Authority | Examples |
|--------|-----------|----------|
| **✅ Official Anthropic Docs** | Highest | Memory hierarchy, import syntax, "keep them lean" |
| **💡 Community Best Practices** | Medium | 100-300 line target, 80/20 rule, maintenance cadence |
| **🔬 Academic Research** | Medium | "Lost in the middle" positioning, token optimization |

### Detects Issues

- **🚨 CRITICAL**: Secrets (API keys, passwords), security vulnerabilities
- **⚠️ HIGH**: Generic content, excessive verbosity, vague instructions
- **📋 MEDIUM**: Outdated info, broken links, duplicate sections
- **ℹ️ LOW**: Formatting issues, organizational improvements

### Generates Output

1. **Markdown Report**: Human-readable audit with detailed findings
2. **JSON Report**: Machine-readable for CI/CD integration
3. **Refactored CLAUDE.md**: Production-ready improved version

---

## Features

### 🔒 Security Validation (CRITICAL)

Detects exposed secrets using pattern matching:

- API keys (OpenAI, AWS, generic)
- Tokens and passwords
- Database connection strings
- Private keys (PEM format)
- Internal IP addresses

**Why Critical**: CLAUDE.md files are often committed to git. Exposed secrets can leak through history, PRs, logs, or backups.

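
To make the pattern-matching idea concrete, here is a minimal sketch of a line-by-line secret scan. The pattern names and regular expressions are illustrative assumptions, not the rule set shipped in `scripts/analyzer.py`, and a real scanner would need broader coverage and false-positive handling.

```python
import re

# Hypothetical patterns for illustration; the auditor's real rules may differ.
SECRET_PATTERNS = {
    "openai_style_api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "db_connection_string": re.compile(
        r"\b(?:postgresql|postgres|mysql|mongodb)://[^\s:@/]+:[^\s@/]+@\S+"
    ),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "internal_ipv4": re.compile(
        r"\b(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}"
        r"|192\.168\.\d{1,3}\.\d{1,3}"
        r"|172\.(?:1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3})\b"
    ),
}


def find_secrets(text):
    """Return (line_number, pattern_name) pairs for every suspicious line."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


sample = "\n".join([
    "# CLAUDE.md",
    "",
    "## API Configuration",
    "- API Key: sk-1234567890abcdefghijklmnop",
    "- Database: postgres://admin:pass@10.0.1.42/db",
])

for line_number, name in find_secrets(sample):
    print(f"line {line_number}: {name}")
```

Reporting the line number alongside the pattern name keeps each finding actionable, since the remediation (remove, rotate, move to environment variables) applies to a specific line.
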
### ✅ Official Compliance

Validates against [docs.claude.com](https://docs.claude.com) guidance:

- File length ("keep them lean")
- Generic content (Claude already knows basic programming)
- Import syntax (`@path/to/import`; the documented 5-hop depth limit is not yet checked, see Known Issues)
- Vague instructions (specific vs. ambiguous)
- Proper markdown structure

### 💡 Best Practices

Evaluates community recommendations:

- Optimal file length (100-300 lines)
- Token usage (under 3,000 tokens, which is less than 2% of a 200K context)
- Organization (sections, headers, priority markers)
- Maintenance (update dates, version info)
- Duplicate or conflicting content

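
The length and token budgets above reduce to simple arithmetic. The sketch below assumes a rough four-characters-per-token heuristic (an assumption, not a real tokenizer and not necessarily what the auditor uses); the function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 200_000  # the 200K context size quoted above
CHARS_PER_TOKEN = 4              # assumption: rough average for English text


def estimate_tokens(text):
    """Very rough token estimate from character count (not a real tokenizer)."""
    return len(text) // CHARS_PER_TOKEN


def context_usage_percent(token_count, window=CONTEXT_WINDOW_TOKENS):
    """Share of the context window consumed, as a percentage."""
    return round(100 * token_count / window, 2)


def length_findings(text, max_lines=300, max_tokens=3_000):
    """Return human-readable findings for files over the community targets."""
    findings = []
    line_count = len(text.splitlines())
    tokens = estimate_tokens(text)
    if line_count > max_lines:
        findings.append(f"{line_count} lines exceeds the {max_lines}-line target")
    if tokens > max_tokens:
        findings.append(f"~{tokens} tokens exceeds the {max_tokens}-token budget")
    return findings


print(context_usage_percent(3_000))  # the 3,000-token budget as a share of 200K
print(length_findings("x\n" * 301))
```

For example, a 3,240-token file uses 3,240 / 200,000 = 1.62% of a 200K context, so a file can exceed the 3,000-token budget while still staying under the 2% mark.
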
### 🔬 Research Optimization

Applies academic insights:

- **"Lost in the Middle"** positioning (critical info at top/bottom)
- Token efficiency and context utilization
- Chunking and information architecture
- Attention pattern optimization

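
As an illustration of the positioning check, the sketch below flags critical-looking lines that sit in the middle of a file rather than near its top or bottom. The marker words and the 20% edge fraction are assumptions chosen for the example, not values taken from the auditor or from the cited research.

```python
CRITICAL_MARKERS = ("CRITICAL", "MUST", "NEVER")  # hypothetical marker words


def middle_zone_criticals(text, edge_fraction=0.2):
    """Return (line_number, line) for critical-looking lines outside the edges.

    The first and last `edge_fraction` of the lines count as the top and
    bottom zones; everything else is treated as the middle.
    """
    lines = text.splitlines()
    total = len(lines)
    edge = max(1, int(total * edge_fraction))
    flagged = []
    for index, line in enumerate(lines):
        in_top = index < edge
        in_bottom = index >= total - edge
        has_marker = any(marker in line.upper() for marker in CRITICAL_MARKERS)
        if has_marker and not (in_top or in_bottom):
            flagged.append((index + 1, line.strip()))
    return flagged


doc = "\n".join([
    "# App",
    "CRITICAL: use strict mode",
    "a", "b", "c",
    "Never commit secrets",
    "d", "e", "f", "g",
])
print(middle_zone_criticals(doc))
```

A flagged line is a candidate for moving into a top-of-file "must follow" section or a bottom-of-file reference section.
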
**Based On**:

- "Lost in the Middle" (Liu et al., 2023, TACL)
- Claude-specific performance studies
- Context awareness research (MIT/Google Cloud AI, 2024)

---

## Installation

### Option 1: Claude Code Skills (Recommended)

```bash
# Clone or copy to your skills directory
mkdir -p ~/.claude/skills
cp -r claude-md-auditor ~/.claude/skills/

# Verify installation
ls ~/.claude/skills/claude-md-auditor/SKILL.md
```

### Option 2: Standalone Scripts

```bash
# Clone repository
git clone https://github.com/cskiro/annex.git
cd annex/claude-md-auditor

# Run directly (Python 3.8+ required, no dependencies)
python scripts/analyzer.py path/to/CLAUDE.md
```

### Requirements

- **Python**: 3.8 or higher
- **Dependencies**: None (uses standard library only)
- **Claude Code**: Any version with Skills support (optional)

---

## Usage

### Basic Audit

**With Claude Code**:
```
Audit my CLAUDE.md using the claude-md-auditor skill.
```

**Direct**:
```bash
python scripts/analyzer.py ./CLAUDE.md
```

**Output**:
```
============================================================
CLAUDE.md Audit Results: ./CLAUDE.md
============================================================

Overall Health Score: 78/100
Security Score: 100/100
Official Compliance Score: 75/100
Best Practices Score: 70/100
Research Optimization Score: 85/100

============================================================
Findings Summary:
🚨 Critical: 0
⚠️ High: 2
📋 Medium: 3
ℹ️ Low: 5
============================================================
```

### Generate JSON Report

**For CI/CD Integration**:

```bash
python scripts/report_generator.py ./CLAUDE.md json > audit.json
```

**Example Output**:
```json
{
  "metadata": {
    "file": "./CLAUDE.md",
    "generated_at": "2025-10-26 10:30:00",
    "tier": "Project"
  },
  "scores": {
    "overall": 78,
    "security": 100,
    "official_compliance": 75,
    "critical_count": 0,
    "high_count": 2
  },
  "findings": [...]
}
```

### Generate Refactored File

**Create improved CLAUDE.md**:

```bash
python scripts/report_generator.py ./CLAUDE.md refactored > CLAUDE_refactored.md
```

This generates a production-ready file with:
- ✅ Optimal structure (critical at top, reference at bottom)
- ✅ Research-based positioning ("lost in the middle" mitigation)
- ✅ All security issues removed
- ✅ Best practices applied
- ✅ Inline comments for maintenance

---

## Output Examples

### Markdown Report Structure

```markdown
# CLAUDE.md Audit Report

## Executive Summary
- Overall health: 78/100 (Good)
- Status: ⚠️ **HIGH PRIORITY** - Address this sprint
- Total findings: 10 (0 critical, 2 high, 3 medium, 5 low)

## Score Dashboard
| Category | Score | Status |
|----------|-------|--------|
| Security | 100/100 | ✅ Excellent |
| Official Compliance | 75/100 | 🟢 Good |
| Best Practices | 70/100 | 🟢 Good |

## Detailed Findings

### ⚠️ HIGH Priority

#### 1. Generic Programming Content Detected
**Category**: Official Compliance
**Source**: Official Guidance

**Description**: File contains generic React documentation

**Impact**: Wastes context window. Official guidance:
"Don't include basic programming concepts Claude already understands"

**Remediation**: Remove generic content. Focus on project-specific standards.

---
```

### JSON Report Structure

```json
{
  "metadata": {
    "file_path": "./CLAUDE.md",
    "line_count": 245,
    "token_estimate": 3240,
    "context_usage_200k": 1.62,
    "tier": "Project"
  },
  "scores": {
    "overall": 78,
    "security": 100,
    "official_compliance": 75,
    "best_practices": 70,
    "research_optimization": 85
  },
  "findings": [
    {
      "severity": "high",
      "category": "official_compliance",
      "title": "Generic Programming Content Detected",
      "description": "File contains generic React documentation",
      "line_number": 42,
      "source": "official",
      "remediation": "Remove generic content..."
    }
  ]
}
```

---

## Reference Documentation

### Complete Validation Criteria

All checks are documented in the `reference/` directory:

| Document | Content | Source |
|----------|---------|--------|
| **official_guidance.md** | Complete official Anthropic documentation | docs.claude.com |
| **best_practices.md** | Community recommendations and field experience | Practitioners |
| **research_insights.md** | Academic research on LLM context optimization | Peer-reviewed papers |
| **anti_patterns.md** | Catalog of common mistakes and violations | Field observations |

### Key References

- [Memory Management](https://docs.claude.com/en/docs/claude-code/memory) - Official docs
- ["Lost in the Middle"](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00638/119630/) - Academic paper
- [Claude Code Best Practices](https://www.anthropic.com/engineering/claude-code-best-practices) - Anthropic blog

---

## CI/CD Integration

### GitHub Actions Example

```yaml
name: CLAUDE.md Audit

on:
  pull_request:
    paths:
      - '**/CLAUDE.md'

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Run CLAUDE.md Audit
        run: |
          python claude-md-auditor/scripts/analyzer.py CLAUDE.md \
            --format json \
            --output audit.json

      - name: Check Critical Issues
        run: |
          CRITICAL=$(python -c "import json; print(json.load(open('audit.json'))['summary']['critical'])")

          if [ "$CRITICAL" -gt 0 ]; then
            echo "❌ Critical issues found in CLAUDE.md"
            exit 1
          fi

          echo "✅ CLAUDE.md validation passed"
```
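
If the report keys differ from what a one-line check expects, a small gate script that counts findings by severity may be more robust. The sketch below assumes the `findings[].severity` layout shown under JSON Report Structure; the function names and the `fail_on` parameter are hypothetical, and the actual report schema should be confirmed against real output.

```python
import json

SEVERITY_ORDER = ("critical", "high", "medium", "low")


def count_by_severity(report):
    """Count findings per severity level in a parsed JSON report."""
    counts = {level: 0 for level in SEVERITY_ORDER}
    for finding in report.get("findings", []):
        level = str(finding.get("severity", "")).lower()
        if level in counts:
            counts[level] += 1
    return counts


def gate(report, fail_on=("critical",)):
    """Return a process exit code: 1 if any blocking finding exists, else 0."""
    counts = count_by_severity(report)
    return 1 if any(counts[level] > 0 for level in fail_on) else 0


# Assumed report shape, trimmed from the JSON Report Structure example.
example_report = json.loads("""
{
  "scores": {"overall": 78, "security": 100},
  "findings": [
    {"severity": "high", "title": "Generic Programming Content Detected"}
  ]
}
""")

print(count_by_severity(example_report))
print("exit code:", gate(example_report))
```

In a pipeline, the same functions could be applied to `json.load(open("audit.json"))` and the result passed to `sys.exit`, with `fail_on=("critical", "high")` for a stricter gate.
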

### Pre-Commit Hook

```bash
#!/bin/bash
# .git/hooks/pre-commit

if git diff --cached --name-only | grep -q "CLAUDE.md"; then
  echo "Validating CLAUDE.md..."

  python claude-md-auditor/scripts/analyzer.py CLAUDE.md > /tmp/audit.txt

  # Check exit code or parse output
  if grep -q "🚨 Critical: [1-9]" /tmp/audit.txt; then
    echo "❌ CLAUDE.md has critical issues"
    cat /tmp/audit.txt
    exit 1
  fi

  echo "✅ CLAUDE.md validation passed"
fi
```

### VS Code Task

Add to `.vscode/tasks.json`:

```json
{
  "version": "2.0.0",
  "tasks": [
    {
      "label": "Audit CLAUDE.md",
      "type": "shell",
      "command": "python",
      "args": [
        "${workspaceFolder}/claude-md-auditor/scripts/analyzer.py",
        "${workspaceFolder}/CLAUDE.md"
      ],
      "group": "test",
      "presentation": {
        "reveal": "always",
        "panel": "new"
      }
    }
  ]
}
```

---

## Understanding Scores

### Overall Health Score (0-100)

| Range | Status | Action |
|-------|--------|--------|
| 90-100 | ✅ Excellent | Minor optimizations only |
| 75-89 | 🟢 Good | Some improvements recommended |
| 60-74 | 🟡 Fair | Schedule improvements this quarter |
| 40-59 | 🟠 Poor | Significant issues to address |
| 0-39 | 🔴 Critical | Immediate action required |

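
The score bands in the table above can be expressed as a small lookup. This is a sketch of the mapping only; `health_status` is a hypothetical helper name, and how the auditor computes the score itself is not shown here.

```python
# Lower bound of each band, highest first, taken from the table above.
SCORE_BANDS = [
    (90, "Excellent"),
    (75, "Good"),
    (60, "Fair"),
    (40, "Poor"),
    (0, "Critical"),
]


def health_status(score):
    """Map a 0-100 health score to its status label."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, label in SCORE_BANDS:
        if score >= lower_bound:
            return label


for score in (100, 78, 74, 39):
    print(score, health_status(score))
```
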

### Severity Levels

- **🚨 CRITICAL**: Security risks (fix immediately, within 24 hours)
- **⚠️ HIGH**: Significant issues (fix this sprint, within 2 weeks)
- **📋 MEDIUM**: Moderate improvements (schedule for next quarter)
- **ℹ️ LOW**: Minor optimizations (backlog)

### Category Scores

- **Security**: Should always be 100 (any security issue is critical)
- **Official Compliance**: Aim for 80+ (follow Anthropic guidance)
- **Best Practices**: 70+ is good (community recommendations are flexible)
- **Research Optimization**: 60+ is acceptable (optimizations, not requirements)

---

## Real-World Examples

### Example 1: Security Violation

**Before** (Anti-Pattern):
```markdown
# CLAUDE.md

## API Configuration
- API Key: sk-1234567890abcdefghijklmnop
- Database: postgres://admin:pass@10.0.1.42/db
```

**Audit Finding**:
```
🚨 CRITICAL: API Key Detected
Line: 4
Impact: Security breach risk. Secrets exposed in git history.
Remediation:
1. Remove the API key immediately
2. Rotate the compromised credential
3. Use environment variables (.env file)
4. Clean git history if committed
```

**After** (Fixed):
```markdown
# CLAUDE.md

## API Configuration
- API keys: Stored in .env (see .env.example for template)
- Database: Use AWS Secrets Manager connection string
- Access: Contact team lead for credentials
```

### Example 2: Generic Content

**Before** (Anti-Pattern):
```markdown
## React Best Practices

React is a JavaScript library for building user interfaces.
It was created by Facebook in 2013. React uses a virtual
DOM for efficient updates...

[200 lines of React documentation]
```

**Audit Finding**:
```
⚠️ HIGH: Generic Programming Content Detected
Impact: Wastes context window. Claude already knows React basics.
Remediation: Remove generic content. Focus on project-specific patterns.
```

**After** (Fixed):
```markdown
## React Standards (Project-Specific)

- Functional components only (no class components)
- Custom hooks location: /src/hooks
- Co-location pattern: Component + test + styles in same directory
- Props interface naming: [ComponentName]Props
```

### Example 3: Optimal Structure

**Generated Refactored File**:
````markdown
# MyApp

## 🚨 CRITICAL: Must-Follow Standards

<!-- Top position = highest attention -->

- Security: Never commit secrets to git
- TypeScript strict mode: No `any` types
- Testing: 80% coverage on all new code

## 📋 Project Overview

**Tech Stack**: React, TypeScript, Vite, PostgreSQL
**Architecture**: Feature-based modules, clean architecture
**Purpose**: Enterprise CRM system

## 🔧 Development Workflow

### Git
- Branches: `feature/{name}`, `bugfix/{name}`
- Commits: Conventional format required
- PRs: Tests + review + passing CI

## 📝 Code Standards

[Project-specific rules here]

## 📌 REFERENCE: Common Tasks

<!-- Bottom position = recency attention -->

```bash
npm run build    # Build production
npm test         # Run tests
npm run deploy   # Deploy to staging
```

### Key Files
- Config: `/config/app.config.ts`
- Types: `/src/types/index.ts`
````

---

## FAQ

### Q: Will this skill automatically fix my CLAUDE.md?

**A**: No, but it can generate a refactored version. You need to review and apply changes manually to ensure they fit your project.

### Q: Are all recommendations mandatory?

**A**: No. Check the **source** field:
- **Official**: Follow Anthropic documentation (highest priority)
- **Community**: Recommended best practices (flexible)
- **Research**: Evidence-based optimizations (optional)

### Q: What if I disagree with a finding?

**A**: That's okay! Best practices are guidelines, not requirements. Official guidance should be followed, but community and research recommendations can be adapted to your context.

### Q: How often should I audit CLAUDE.md?

**A**:
- **On every change**: Before committing (use pre-commit hook)
- **Quarterly**: Regular maintenance audit
- **Before major releases**: Ensure standards are up-to-date
- **When onboarding**: Validate project configuration

### Q: Can I use this in my CI/CD pipeline?

**A**: Yes! Use JSON output mode and check for critical findings. An example is provided in the CI/CD Integration section.

### Q: Does this validate that Claude actually follows the standards?

**A**: No, this only validates the CLAUDE.md structure and content. To test effectiveness, start a new Claude session and verify that the standards are followed without restating them.

---

## Limitations

### What This Skill CANNOT Do

- ❌ Automatically fix security issues (manual remediation required)
- ❌ Test if Claude follows standards (behavioral testing needed)
- ❌ Validate imported files beyond path existence
- ❌ Detect circular imports (requires graph traversal)
- ❌ Verify standards match actual codebase
- ❌ Determine if standards are appropriate for your project

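
For readers wondering what the missing graph traversal would involve, here is a hedged sketch of a depth-first walk over `@path/to/import` references. It is not part of the skill: the regular expression, the in-memory `files` mapping, and the function names are assumptions made so the example is self-contained, and a real check would read files from disk and resolve relative paths.

```python
import re

# Hypothetical pattern: an "@" not preceded by a word character, then a path.
IMPORT_PATTERN = re.compile(r"(?<!\w)@([\w./~-]+)")
MAX_HOPS = 5  # the import depth limit mentioned in this README


def extract_imports(text):
    """Return import targets found in text, dropping trailing sentence dots."""
    return [path.rstrip(".") for path in IMPORT_PATTERN.findall(text)]


def check_imports(files, root, max_hops=MAX_HOPS):
    """Walk the import graph depth-first and collect problems.

    `files` maps a path to its text. `chain` holds the ancestors of the
    current file, so len(chain) is the number of hops taken from `root`.
    """
    problems = []

    def visit(path, chain):
        if path in chain:
            problems.append("circular import: " + " -> ".join(chain + [path]))
            return
        if len(chain) > max_hops:
            problems.append(f"import depth exceeds {max_hops} hops at {path}")
            return
        if path not in files:
            problems.append(f"missing import target: {path}")
            return
        for target in extract_imports(files[path]):
            visit(target, chain + [path])

    visit(root, [])
    return problems


files = {
    "CLAUDE.md": "Project rules live in @docs/rules.md",
    "docs/rules.md": "See also @CLAUDE.md",
}
print(check_imports(files, "CLAUDE.md"))
```

Tracking the ancestor chain rather than a global visited set means a file imported from two different places is not mistaken for a cycle.
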

### Known Issues

- Import depth validation (max 5 hops) not yet implemented
- Circular import detection not yet implemented
- Cannot read contents of imported files for validation

---

## Roadmap

### v1.1 (Planned)

- [ ] Import graph traversal (detect circular imports)
- [ ] Import depth validation (max 5 hops)
- [ ] Content validation of imported files
- [ ] Interactive CLI for guided fixes
- [ ] HTML dashboard report format

### v1.2 (Planned)

- [ ] Effectiveness testing (test if Claude follows standards)
- [ ] Diff mode (compare before/after audits)
- [ ] Metrics tracking over time
- [ ] Custom rule definitions
- [ ] Integration with popular IDEs

---

## Contributing

This skill is based on three authoritative sources:

1. **Official Anthropic Documentation** (docs.claude.com)
2. **Peer-Reviewed Academic Research** (MIT, Google Cloud AI, TACL)
3. **Community Field Experience** (practitioner reports)

To propose changes:

1. Identify which category (official/community/research)
2. Provide source documentation or evidence
3. Explain rationale and expected impact
4. Update relevant reference documentation

---

## License

Apache 2.0 - Example skill for demonstration purposes

---

## Version

**Current Version**: 1.0.0
**Last Updated**: 2025-10-26
**Python**: 3.8+
**Status**: Stable

---

## Credits

**Developed By**: Connor (based on Anthropic Skills framework)

**Research Sources**:
- Liu et al. (2023) - "Lost in the Middle" (TACL/MIT Press)
- MIT/Google Cloud AI (2024) - Attention calibration research
- Anthropic Engineering (2023-2025) - Claude documentation and blog

**Special Thanks**: Anthropic team for Claude Code and Skills framework

---

**🔗 Links**:
- [Claude Code Documentation](https://docs.claude.com/en/docs/claude-code/overview)
- [Anthropic Skills Repository](https://github.com/anthropics/skills)
- [Memory Management Guide](https://docs.claude.com/en/docs/claude-code/memory)

---

*Generated by claude-md-auditor v1.0.0*