Initial commit

Zhongwei Li
2025-11-29 18:16:40 +08:00
commit f125e90b9f
370 changed files with 67769 additions and 0 deletions


@@ -0,0 +1,15 @@
{
"name": "productivity-hooks",
"description": "Productivity enhancement hooks for automated insight extraction and learning",
"version": "0.0.0-2025.11.28",
"author": {
"name": "Connor",
"email": "noreply@claudex.dev"
},
"skills": [
"./skills"
],
"hooks": [
"./hooks"
]
}

3
README.md Normal file

@@ -0,0 +1,3 @@
# productivity-hooks
Productivity enhancement hooks for automated insight extraction and learning

18
hooks/hooks.json Normal file

@@ -0,0 +1,18 @@
{
"$schema": "https://anthropic.com/claude-code/hooks.schema.json",
"description": "Productivity enhancement hooks for automated insight extraction and learning",
"hooks": {
"Stop": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": "${CLAUDE_PLUGIN_ROOT}/hooks/extract-explanatory-insights.sh",
"description": "Extract ★ Insight blocks from Claude Code responses after agent stops"
}
]
}
]
}
}
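
The `extract-explanatory-insights.sh` script referenced by the Stop hook is not shown in this diff. As a hypothetical sketch of the logic it presumably implements — scanning response text for `★ Insight` blocks, where a block runs from the marker to the next blank line:

```python
def extract_insights(transcript: str) -> list[str]:
    """Collect blocks that start with a '★ Insight' marker.

    A block ends at the first blank line after the marker. The marker
    string and block boundaries are assumptions, not the script's spec.
    """
    insights, current, in_block = [], [], False
    for line in transcript.splitlines():
        if "★ Insight" in line:
            in_block = True
            current = [line]
        elif in_block:
            if line.strip():
                current.append(line)
            else:
                insights.append("\n".join(current))
                in_block = False
    if in_block:
        insights.append("\n".join(current))
    return insights
```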

1517
plugin.lock.json Normal file

File diff suppressed because it is too large


@@ -0,0 +1,16 @@
# Changelog
All notable changes to the accessibility-audit skill will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.1.0] - 2025-11-27
### Added
- Initial Claudex marketplace release
- Risk-based severity scoring (Impact x Likelihood model)
- MUI framework awareness to prevent false positives
- Multi-layer analysis workflow (static, runtime, manual)
- Comprehensive WCAG 2.2 Level AA criteria reference
- Gap report and remediation plan examples


@@ -0,0 +1,51 @@
# Accessibility Audit
WCAG 2.2 Level AA accessibility auditing with risk-based severity scoring for React/TypeScript applications.
## Overview
This skill provides comprehensive accessibility auditing that goes beyond simple WCAG conformance checking. It uses a **risk-based severity model** where Severity = Impact x Likelihood, meaning a Level A failure can be LOW severity while a Level AA failure can be CRITICAL.
## Key Features
- **Risk-based severity scoring** - Prioritizes issues by real user impact
- **MUI framework awareness** - Avoids false positives on built-in accessibility features
- **Multi-layer analysis** - Static (ESLint), runtime (jest-axe, Playwright), and manual validation
- **Actionable output** - Gap reports with remediation priorities
## Quick Start
```bash
# Install required tooling
npm install --save-dev eslint-plugin-jsx-a11y jest-axe @axe-core/playwright
# Run audit
# Use trigger: "Run accessibility audit on [component/page]"
```
## Trigger Phrases
- "accessibility audit"
- "WCAG compliance"
- "a11y review"
- "screen reader"
- "keyboard navigation"
- "color contrast"
## Severity Levels
| Severity | Impact | Examples |
|----------|--------|----------|
| Critical | Blocks access | Keyboard traps, missing alt on actions |
| High | Significantly degrades UX | Poor contrast on CTAs, no skip navigation |
| Medium | Minor usability impact | Missing autocomplete, unclear link text |
| Low | Best practice | Could add tooltips, more descriptive titles |
## Related Skills
- `codebase-auditor` - General code quality analysis
- `bulletproof-react-auditor` - React architecture review
## Documentation
See [SKILL.md](SKILL.md) for the complete workflow and reference materials.


@@ -0,0 +1,85 @@
---
name: accessibility-audit
version: 0.1.0
description: WCAG 2.2 Level AA accessibility auditing with risk-based severity scoring
author: Connor
triggers:
- accessibility audit
- WCAG compliance
- a11y review
- screen reader
- keyboard navigation
- color contrast
---
# Accessibility Audit Skill
Comprehensive WCAG 2.2 Level AA accessibility auditing for React/TypeScript applications with MUI framework awareness.
## Quick Reference
| Severity | Impact | Examples |
|----------|--------|----------|
| **Critical** | Blocks access completely | Keyboard traps, missing alt on actions, no focus visible |
| **High** | Significantly degrades UX | Poor contrast on CTAs, no skip navigation, small touch targets |
| **Medium** | Minor usability impact | Missing autocomplete, unclear link text |
| **Low** | Best practice enhancement | Could add tooltips, more descriptive titles |
## Key Principle
> **Severity = Impact x Likelihood**, NOT WCAG conformance level.
> A Level A failure can be LOW severity; a Level AA failure can be CRITICAL.
## Required Tooling
```bash
# Install required tools
npm install --save-dev eslint-plugin-jsx-a11y jest-axe @axe-core/playwright
# Configure ESLint
# Add to .eslintrc: extends: ['plugin:jsx-a11y/recommended']
```
## Workflow
| Phase | Description |
|-------|-------------|
| 1. Setup | Install tooling, create output directories |
| 2. Static Analysis | ESLint jsx-a11y scan |
| 3. Runtime Analysis | jest-axe and Playwright |
| 4. Manual Validation | Keyboard, screen reader, contrast |
| 5. Report Generation | JSON + Markdown outputs |
### Phase 1: Setup
See [workflow/setup.md](workflow/setup.md) for installation and configuration.
### Phase 4: Manual Validation
See [workflow/manual-validation.md](workflow/manual-validation.md) for keyboard and screen reader testing.
## Reference
- [Severity Rubric](reference/severity-rubric.md) - Impact x Likelihood calculation
- [MUI Framework Awareness](reference/mui-awareness.md) - Built-in accessibility features
## Common False Positives to Avoid
| Component | Built-in Behavior | Don't Flag |
|-----------|-------------------|------------|
| MUI `<SvgIcon>` | Auto `aria-hidden="true"` | Icons without titleAccess |
| MUI `<Alert>` | Default `role="alert"` | Missing role attribute |
| MUI `<Button>` | 36.5px min height | Target size < 44px |
| MUI `<TextField>` | Auto label association | Missing label |
| MUI `<Autocomplete>` | Manages ARIA attrs | Missing aria-expanded |
## Quick Audit Command
```
Run accessibility audit on [component/page] following WCAG 2.2 AA standards
```
## Related Skills
- `codebase-auditor` - General code quality analysis
- `bulletproof-react-auditor` - React architecture review


@@ -0,0 +1,83 @@
# MUI Framework Awareness
MUI components have built-in accessibility features. Audit rules MUST account for framework defaults to avoid false positives.
## Component Defaults
### SvgIcon
- **Behavior**: Automatically adds `aria-hidden="true"` when `titleAccess` prop is undefined
- **Source**: `node_modules/@mui/material/SvgIcon/SvgIcon.js`
**Rule**: Do NOT flag MUI icons as missing aria-hidden unless titleAccess is set
```tsx
// Only flag if titleAccess is set (should have aria-label or be visible):
<SvgIcon titleAccess="Icon name" />
// Do NOT flag (auto aria-hidden):
<SearchIcon />
<Timeline />
```
### Alert
- **Behavior**: Defaults to `role="alert"` (assertive live region)
**Rule**: Do NOT recommend adding `role="alert"` - it's already there
```tsx
// Only suggest role="status" if polite announcement is more appropriate:
<Alert severity="success" role="status">Item saved</Alert>
```
### Button
- **Behavior**: Has minimum 36.5px height by default
**Rule**: Usually meets 24x24px target size requirement
```tsx
// Only flag if size="small" or custom sx reduces below 24px:
<Button size="small" sx={{ minHeight: '20px' }} /> // Flag this
<Button>Normal</Button> // Don't flag
```
### TextField
- **Behavior**: Automatically associates label with input via id
**Rule**: Do NOT flag as missing label if `label` prop is provided
```tsx
// This is accessible (auto-associated):
<TextField label="Email" />
// Only flag if no label and no aria-label:
<TextField /> // Flag this
```
### Autocomplete
- **Behavior**: Manages `aria-expanded`, `aria-controls`, `aria-activedescendant`
**Rule**: Do NOT flag ARIA attributes - they're managed by the component
```tsx
// All ARIA is handled internally:
<Autocomplete options={options} renderInput={(params) => <TextField {...params} />} />
```
## False Positive Checklist
Before flagging a MUI component violation:
1. [ ] Check if MUI provides default accessibility behavior
2. [ ] Verify the violation exists in rendered output (use browser DevTools)
3. [ ] Test with actual screen reader to confirm issue
4. [ ] Check MUI documentation for accessibility notes
## Common False Positives
| Automated Finding | Why It's False | Reality |
|-------------------|----------------|---------|
| "Icon missing aria-hidden" | MUI adds it automatically | Check rendered HTML |
| "Alert missing role" | Default is role="alert" | Only change if polite needed |
| "Button too small" | 36.5px default height | Check actual rendered size |
| "Input missing label" | TextField manages labels | Check for label prop |
| "Missing aria-expanded" | Autocomplete manages it | Check rendered state |


@@ -0,0 +1,80 @@
# Severity Rubric
## Core Principle
**Severity = Impact x Likelihood**, NOT WCAG conformance level.
- Level A vs AA is a *conformance tier*, not a risk rating
- A Level A failure can be LOW severity (decorative image missing alt)
- A Level AA failure can be CRITICAL (focus outline removed)
## Severity Levels
### Critical
- **Description**: Completely blocks access for users with disabilities
- **Impact**: Prevents task completion
- **Examples**:
- Keyboard trap preventing navigation (2.1.2, Level A)
- Missing alt text on primary action image (1.1.1, Level A)
- Form submission inaccessible via keyboard (2.1.1, Level A)
- Focus outline removed from focusable elements (2.4.7, Level AA)
### High
- **Description**: Significantly degrades experience or blocks common workflows
- **Impact**: Makes tasks difficult or requires workarounds
- **Examples**:
- No skip navigation on complex site (2.4.1, Level A)
- Poor contrast on primary CTA button (1.4.3, Level AA)
- Missing error suggestions on required form (3.3.3, Level AA)
- Touch targets too small on mobile (2.5.8, Level AA)
### Medium
- **Description**: Minor usability impact, affects subset of users
- **Impact**: Causes confusion or requires extra effort
- **Examples**:
- Decorative icon not hidden but in acceptable context (1.1.1, Level A)
- Link text needs slight improvement for clarity (2.4.4, Level A)
- Missing autocomplete on optional field (1.3.5, Level AA)
### Low
- **Description**: Best practice enhancement, minimal user impact
- **Impact**: Nice-to-have improvement
- **Examples**:
- Could add tooltips for better UX (not required)
- Page title could be more descriptive (2.4.2, Level A - but functional)
## Calculation Guide
### Impact Assessment
| Level | Description | Severity Modifier |
|-------|-------------|-------------------|
| Blocker | Prevents access | Critical/High |
| Degraded | Makes difficult | High/Medium |
| Friction | Adds effort | Medium/Low |
| Minor | Barely noticeable | Low |
### Likelihood Assessment
| Level | Description | Severity Modifier |
|-------|-------------|-------------------|
| Core flow | All users hit it | Increase severity |
| Common | Many users hit it | Base severity |
| Edge case | Few users hit it | Decrease severity |
| Rare | Almost never | Low |
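The two tables above can be sketched as a small lookup. The numeric weights and thresholds here are illustrative assumptions chosen to reproduce the worked examples below, not part of the rubric itself:

```python
# Illustrative weights for the Impact and Likelihood tables above.
IMPACT = {"blocker": 4, "degraded": 3, "friction": 2, "minor": 1}
LIKELIHOOD = {"core_flow": 4, "common": 3, "edge_case": 2, "rare": 1}

def severity(impact: str, likelihood: str) -> str:
    """Map an Impact x Likelihood product onto the four severity buckets."""
    score = IMPACT[impact] * LIKELIHOOD[likelihood]
    if score >= 16:
        return "critical"
    if score >= 9:
        return "high"
    if score >= 4:
        return "medium"
    return "low"
```

For example, a hero image with missing alt text (blocker impact, core flow) scores 16 and lands in critical, while a decorative footer icon (minor, rare) scores 1 and lands in low.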
## Examples
### Same Criterion, Different Severity
**Missing alt text (1.1.1, Level A)**:
- Hero image: Impact=Blocker, Likelihood=All users → **CRITICAL**
- Decorative footer icon: Impact=Minor, Likelihood=Rare → **LOW**
**No skip link (2.4.1, Level A)**:
- 3-item navigation: Impact=Friction, Likelihood=Common → **MEDIUM**
- 50-item navigation: Impact=Degraded, Likelihood=All users → **HIGH**
**Poor contrast (1.4.3, Level AA)**:
- Primary CTA button: **CRITICAL**
- Body text: **HIGH**
- Footer link: **MEDIUM**
- Decorative text: **LOW**


@@ -0,0 +1,125 @@
# Phase 4: Manual Validation
These checks CANNOT be automated and require human judgment.
## 1. Color Contrast Validation
**Tool**: [WebAIM Contrast Checker](https://webaim.org/resources/contrastchecker/)
### Process
1. Extract all colors from theme configuration
2. Calculate contrast ratios for each text/background pair
3. Document results in gap report
### Requirements
| Element Type | Minimum Ratio |
|--------------|---------------|
| Normal text (< 18pt) | 4.5:1 |
| Large text (>= 18pt or 14pt bold) | 3:1 |
| UI components | 3:1 |
| Focus indicators | 3:1 |
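The ratios above come from the WCAG relative-luminance definition; a self-contained Python sketch of the standard formula, useful for scripting the theme-color extraction step:

```python
def _linear(channel: int) -> float:
    """sRGB channel (0-255) to linear-light value per the WCAG definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """WCAG contrast ratio between two RGB colors, in the range 1..21."""
    def luminance(rgb):
        r, g, b = (_linear(c) for c in rgb)
        return 0.2126 * r + 0.7152 * g + 0.0722 * b
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)
```

Black on white yields the maximum 21:1; a mid-gray like `#777777` on white comes out around 4.5:1, right at the normal-text threshold.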
### Severity by Element
- Primary CTA button low contrast = **CRITICAL**
- Body text low contrast = **HIGH**
- Footer link low contrast = **MEDIUM**
- Decorative text low contrast = **LOW**
## 2. Keyboard Navigation Testing
### Basic Test
1. Start at top of page
2. Press Tab repeatedly through all interactive elements
3. Verify:
- [ ] Logical order (left-to-right, top-to-bottom)
- [ ] No keyboard traps (can always Tab away)
- [ ] All functionality accessible
- [ ] Focus indicator visible on every element
### Key Combinations to Test
| Key | Expected Behavior |
|-----|-------------------|
| Tab | Move to next focusable element |
| Shift+Tab | Move to previous focusable element |
| Enter | Activate buttons/links |
| Space | Activate buttons, toggle checkboxes |
| Escape | Close modals/menus |
| Arrow keys | Navigate within components (menus, tabs) |
### Common Keyboard Traps
- Modal dialogs without Escape handling
- Date pickers without keyboard support
- Custom dropdowns that don't cycle
## 3. Screen Reader Testing
### Recommended Tools
- **Mac**: VoiceOver (built-in, Cmd+F5)
- **Windows**: NVDA (free), JAWS (paid)
- **iOS**: VoiceOver (built-in)
- **Android**: TalkBack (built-in)
### What to Test
1. **Landmarks**: Header, nav, main, footer announced
2. **Headings**: Logical hierarchy (h1 → h2 → h3)
3. **Forms**: Labels announced, errors read
4. **Dynamic content**: Status messages announced
5. **Images**: Alt text read appropriately
### VoiceOver Commands (Mac)
| Command | Action |
|---------|--------|
| VO + Right Arrow | Next element |
| VO + Left Arrow | Previous element |
| VO + U | Open rotor (landmarks, headings, links) |
| VO + Space | Activate |
## 4. Zoom and Reflow Testing
### 200% Zoom Test
1. Browser zoom to 200%
2. Verify:
- [ ] No horizontal scrolling
- [ ] No text truncation
- [ ] No overlapping elements
- [ ] All functionality accessible
### 320px Width Test (Mobile Reflow)
1. Resize browser to 320px width
2. Verify:
- [ ] Content reflows to single column
- [ ] No horizontal scroll
- [ ] Touch targets still accessible
- [ ] Text remains readable
## 5. WCAG Interpretation Decisions
Some criteria require human judgment:
### 2.4.5 Multiple Ways
- **Question**: Is this a "set of Web pages"?
- **If < 3 pages**: Likely exempt
- **If >= 3 pages**: Need 2+ navigation methods
### 3.2.6 Consistent Help
- **Question**: Does a help mechanism exist?
- **If no help exists**: Compliant (no requirement)
- **If help exists**: Must be consistently placed
### 1.3.5 Identify Input Purpose
- **Question**: Does this input collect user data matching one of the 53 specified input purposes?
- Search inputs: **NOT** in scope
- User email/phone: **IN** scope
## Checklist
- [ ] All color combinations checked against contrast requirements
- [ ] Full keyboard navigation test completed
- [ ] Screen reader testing with at least one tool
- [ ] 200% zoom test passed
- [ ] 320px reflow test passed
- [ ] Applicability decisions documented
## Next Step
Proceed with Report Generation (JSON + Markdown outputs).


@@ -0,0 +1,89 @@
# Phase 1: Setup & Preparation
## Required Tooling Installation
### Static Analysis (Required)
```bash
npm install --save-dev eslint-plugin-jsx-a11y
```
Configure `.eslintrc.js`:
```javascript
module.exports = {
extends: ['plugin:jsx-a11y/recommended'],
// ... other config
};
```
### Runtime Analysis (Required)
```bash
npm install --save-dev jest-axe @axe-core/react
```
### E2E Analysis (Required)
```bash
npm install --save-dev @axe-core/playwright
```
### Optional Tools
```bash
npm install --save-dev @storybook/addon-a11y pa11y-ci
```
## Verification Commands
```bash
# Verify installations
npm list eslint-plugin-jsx-a11y jest-axe @axe-core/playwright
# Check ESLint config
grep -l "jsx-a11y" .eslintrc* 2>/dev/null || echo "jsx-a11y not configured"
```
## Output Directory Setup
```bash
mkdir -p docs/accessibility
```
## Prepare Report Templates
### Gap Report JSON Structure
```json
{
"meta": {
"project": "PROJECT_NAME",
"auditDate": "YYYY-MM-DD",
"auditor": "Claude Code",
"protocolVersion": "2.0.0",
"wcagVersion": "2.2",
"wcagLevel": "AA"
},
"summary": {
"totalCriteria": 60,
"passing": 0,
"failing": 0,
"notApplicable": 0,
"compliancePercentage": 0,
"severityBreakdown": {
"critical": 0,
"high": 0,
"medium": 0,
"low": 0
}
},
"findings": []
}
```
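The template leaves `compliancePercentage` at 0; a hedged sketch of how it might be derived from the other summary fields (the exact formula is an assumption — here, passing criteria as a share of applicable criteria):

```python
def compliance_percentage(summary: dict) -> float:
    """Percent of applicable criteria that pass (assumed formula)."""
    applicable = summary["totalCriteria"] - summary["notApplicable"]
    if applicable == 0:
        return 100.0  # nothing applies, so nothing can fail
    return round(100 * summary["passing"] / applicable, 1)
```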
## Pre-Audit Checklist
- [ ] eslint-plugin-jsx-a11y installed and configured
- [ ] jest-axe available for component tests
- [ ] @axe-core/playwright available for E2E tests
- [ ] docs/accessibility/ directory exists
- [ ] Project uses React + TypeScript (protocol optimized for this stack)
## Next Step
Continue with Static Analysis (ESLint jsx-a11y scan).


@@ -0,0 +1,13 @@
# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Extracted detailed content to workflow/, reference/, examples/ directories
## 0.1.0
- Initial skill release
- React codebase auditing against Bulletproof React architecture
- Anti-pattern detection and migration planning


@@ -0,0 +1,388 @@
# Bulletproof React Auditor Skill
> Comprehensive audit tool for React/TypeScript codebases based on Bulletproof React architecture principles
An Anthropic Skill that analyzes React applications for architectural issues, component anti-patterns, state management problems, and generates prioritized migration plans for adopting Bulletproof React patterns.
## Features
- **Progressive Disclosure**: Three-phase analysis (Discovery → Deep Analysis → Migration Plan)
- **React-Specific**: Tailored for React 16.8+ (hooks-based applications)
- **Comprehensive Analysis**:
- Project Structure (feature-based vs flat)
- Component Architecture (colocation, composition, size)
- State Management (appropriate tools for each state type)
- API Layer (centralized, type-safe patterns)
- Testing Strategy (testing trophy compliance)
- Styling Patterns (component libraries, utility CSS)
- Error Handling (boundaries, interceptors, tracking)
- Performance (code splitting, memoization, optimization)
- Security (authentication, authorization, XSS prevention)
- Standards Compliance (ESLint, TypeScript, naming conventions)
- **Multiple Report Formats**: Markdown, JSON, migration roadmaps
- **Prioritized Migration Plans**: P0-P3 severity with effort estimates
- **ASCII Structure Diagrams**: Visual before/after comparisons
- **Industry Standards**: Based on Bulletproof React best practices
## Installation
### Option 1: Claude Code (Recommended)
1. Clone or copy the `bulletproof-react-auditor` directory to your Claude skills directory
2. Ensure Python 3.8+ is installed
3. No additional dependencies required (uses Python standard library)
### Option 2: Manual Installation
```bash
cd ~/.claude/skills
git clone https://github.com/your-org/bulletproof-react-auditor.git
```
## Usage with Claude Code
### Basic Audit
```
Audit this React codebase using the bulletproof-react-auditor skill.
```
### Structure-Focused Audit
```
Run a structure audit on this React app against Bulletproof React patterns.
```
### Generate Migration Plan
```
Audit this React app and generate a migration plan to Bulletproof React architecture.
```
### Custom Scope
```
Audit this React codebase focusing on:
- Project structure and feature organization
- Component architecture patterns
- State management approach
```
## Direct Script Usage
```bash
# Full audit with Markdown report
python scripts/audit_engine.py /path/to/react-app --output audit.md
# Structure-focused audit
python scripts/audit_engine.py /path/to/react-app --scope structure,components --output report.md
# Generate migration plan
python scripts/audit_engine.py /path/to/react-app --migration-plan --output migration.md
# JSON output for CI/CD integration
python scripts/audit_engine.py /path/to/react-app --format json --output audit.json
# Quick health check only (Phase 1)
python scripts/audit_engine.py /path/to/react-app --phase quick
```
## Output Formats
### Markdown (Default)
Human-readable report with:
- ASCII structure diagrams (before/after)
- Detailed findings with code examples
- Step-by-step migration guidance
- Suitable for PRs, documentation, team reviews
### JSON
Machine-readable format for CI/CD integration:
```json
{
"summary": {
"compliance_score": 72,
"grade": "C",
"critical_issues": 3,
"migration_effort_days": 15
},
"findings": [...],
"metrics": {...},
"migration_plan": [...]
}
```
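For CI gating, the same report can be consumed with a few lines of Python. The field names are taken from the sample above; the threshold of 70 is an illustrative default:

```python
import json

def check_compliance(report_path: str, threshold: int = 70) -> bool:
    """Return True when the audit's compliance_score meets the threshold."""
    with open(report_path) as f:
        report = json.load(f)
    score = report["summary"]["compliance_score"]
    if score < threshold:
        print(f"Compliance score {score} below threshold ({threshold})")
        return False
    return True
```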
### Migration Plan
Prioritized roadmap with:
- P0-P3 priority levels
- Effort estimates per task
- Dependency chains
- Before/after code examples
- ADR templates
## Audit Criteria
The skill audits based on 10 Bulletproof React categories:
### 1. Project Structure
- Feature-based organization (80%+ code in features/)
- Unidirectional dependencies (shared → features → app)
- No cross-feature imports
- Proper feature boundaries
### 2. Component Architecture
- Component colocation (near usage)
- Limited props (< 7-10 per component)
- No large components (< 300 LOC)
- No nested render functions
- Proper abstraction (identify repetition first)
### 3. State Management
- Appropriate tool for each state type
- Local state preferred over global
- Server cache separated (React Query/SWR)
- Form state managed (React Hook Form)
- URL state utilized
### 4. API Layer
- Centralized API client
- Type-safe request declarations
- Colocated in features/
- Data fetching hooks
- Error handling
### 5. Testing Strategy
- Testing trophy (70% integration, 20% unit, 10% E2E)
- Semantic queries (getByRole preferred)
- User behavior testing (not implementation)
- 80%+ coverage on critical paths
### 6. Styling Patterns
- Consistent approach (component library or utility CSS)
- Colocated styles
- Design system usage
### 7. Error Handling
- API error interceptors
- Multiple error boundaries
- Error tracking service
- User-friendly messages
### 8. Performance
- Code splitting at routes
- Memoization patterns
- State localization
- Image optimization
- Bundle size monitoring
### 9. Security
- JWT with HttpOnly cookies
- Authorization (RBAC/PBAC)
- Input sanitization
- XSS prevention
### 10. Standards Compliance
- ESLint configured
- TypeScript strict mode
- Prettier setup
- Git hooks (Husky)
- Absolute imports
- Kebab-case naming
See [`reference/audit_criteria.md`](reference/audit_criteria.md) for complete checklist.
## Severity Levels
- **Critical (P0)**: Fix immediately (within 24 hours)
- Security vulnerabilities, breaking architectural patterns
- **High (P1)**: Fix this sprint (within 2 weeks)
- Major architectural violations, significant refactoring needed
- **Medium (P2)**: Fix next quarter (within 3 months)
- Component design issues, state management improvements
- **Low (P3)**: Backlog
- Styling consistency, minor optimizations
See [`reference/severity_matrix.md`](reference/severity_matrix.md) for detailed criteria.
## Migration Approach
### Phase 1: Foundation (Week 1-2)
1. Create feature folders structure
2. Move shared utilities to proper locations
3. Set up absolute imports
4. Configure ESLint for architecture rules
### Phase 2: Feature Extraction (Week 3-6)
1. Identify feature boundaries
2. Move components to features/
3. Colocate API calls with features
4. Extract feature-specific state
### Phase 3: Refinement (Week 7-10)
1. Refactor large components
2. Implement proper state management
3. Add error boundaries
4. Optimize performance
### Phase 4: Polish (Week 11-12)
1. Improve test coverage
2. Add documentation
3. Implement remaining patterns
4. Final review
## Examples
See the [`examples/`](examples/) directory for:
- Sample audit report (React app before Bulletproof)
- Complete migration plan with timeline
- Before/after structure comparisons
- Code transformation examples
## Architecture
```
bulletproof-react-auditor/
├── SKILL.md # Skill definition (Claude loads this)
├── README.md # This file
├── scripts/
│ ├── audit_engine.py # Core orchestrator
│ ├── analyzers/ # Specialized analyzers
│ │ ├── project_structure.py # Folder organization
│ │ ├── component_architecture.py # Component patterns
│ │ ├── state_management.py # State analysis
│ │ ├── api_layer.py # API patterns
│ │ ├── testing_strategy.py # Test quality
│ │ ├── styling_patterns.py # Styling approach
│ │ ├── error_handling.py # Error boundaries
│ │ ├── performance_patterns.py # React performance
│ │ ├── security_practices.py # React security
│ │ └── standards_compliance.py # ESLint, TS, Prettier
│ ├── report_generator.py # Multi-format reports
│ └── migration_planner.py # Prioritized roadmaps
├── reference/
│ ├── bulletproof_principles.md # Complete BR guide
│ ├── audit_criteria.md # Full checklist
│ ├── severity_matrix.md # Issue prioritization
│ └── migration_patterns.md # Common refactorings
└── examples/
├── sample_audit_report.md
├── migration_plan.md
└── before_after_structure.md
```
## Extending the Skill
### Adding a New Analyzer
1. Create `scripts/analyzers/your_analyzer.py`
2. Implement `analyze(codebase_path, metadata)` function
3. Add to `ANALYZERS` dict in `audit_engine.py`
Example:
```python
from pathlib import Path
from typing import Dict, List

def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze specific Bulletproof React pattern."""
findings = []
# Your analysis logic here
findings.append({
'severity': 'high',
'category': 'your_category',
'title': 'Issue title',
'current_state': 'What exists now',
'target_state': 'Bulletproof recommendation',
'migration_steps': ['Step 1', 'Step 2'],
'effort': 'medium',
})
return findings
```
## CI/CD Integration
### GitHub Actions Example
```yaml
name: Bulletproof React Audit
on: [pull_request]
jobs:
audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run Bulletproof Audit
run: |
python bulletproof-react-auditor/scripts/audit_engine.py . \
--format json \
--output audit-report.json
- name: Check Compliance Score
run: |
SCORE=$(jq '.summary.compliance_score' audit-report.json)
if [ "$SCORE" -lt 70 ]; then
echo "❌ Compliance score $SCORE below threshold (70)"
exit 1
fi
```
## Best Practices
1. **Audit Before Major Refactoring**: Establish baseline before starting
2. **Incremental Migration**: Don't refactor everything at once
3. **Feature-by-Feature**: Migrate one feature at a time
4. **Test Coverage First**: Ensure tests before restructuring
5. **Team Alignment**: Share Bulletproof React principles with team
6. **Document Decisions**: Create ADRs for architectural changes
7. **Track Progress**: Re-run audits weekly to measure improvement
## Connor's Standards Integration
This skill enforces Connor's specific requirements:
- **TypeScript Strict Mode**: No `any` types allowed
- **Test Coverage**: 80%+ minimum on all code
- **Testing Trophy**: 70% integration, 20% unit, 10% E2E
- **Modern Testing**: Semantic queries (getByRole) preferred
- **No Brittle Tests**: Avoid testing implementation details
- **Code Quality**: No console.log, no `var`, strict equality
- **Git Standards**: Conventional commits, proper branch naming
## Limitations
- Static analysis only (no runtime profiling)
- React 16.8+ required (hooks-based)
- Best suited for SPA/SSG patterns
- Next.js apps may have additional patterns
- Large codebases may need scoped analysis
- Does not execute tests (analyzes test files)
## Version
**1.0.0** - Initial release
## Standards Compliance
Based on:
- Bulletproof React Official Guide
- Kent C. Dodds Testing Trophy
- React Best Practices 2024-25
- TypeScript Strict Mode Guidelines
- Connor's Development Standards
## License
Apache 2.0 (example skill for demonstration)
---
**Built with**: Python 3.8+
**Anthropic Skill Version**: 1.0
**Last Updated**: 2024-10-25
**Bulletproof React Version**: Based on v2024 guidelines


@@ -0,0 +1,130 @@
---
name: bulletproof-react-auditor
description: Use PROACTIVELY when users ask about React project structure, Bulletproof React patterns, or need architecture guidance. Covers structure setup, codebase auditing, anti-pattern detection, and feature-based migration planning. Triggers on "bulletproof react", "React structure help", "organize React app", or "audit my architecture".
---
# Bulletproof React Auditor
Audits React/TypeScript codebases against Bulletproof React architecture with migration planning.
## When to Use
**Natural Language Triggers** (semantic matching, not keywords):
- Questions about React project structure or organization
- Mentions of "bulletproof react" or feature-based architecture
- Requests to audit, review, or improve React codebase
- Planning migrations or refactoring React applications
- Seeking guidance on component patterns or folder structure
**Use Cases**:
- Setting up new React project structure
- Reorganizing existing flat codebase
- Auditing architecture against Bulletproof standards
- Planning migration to feature-based patterns
- Code review for structural anti-patterns
- Generating refactoring guidance and ADRs
## Bulletproof Structure Target
```
src/
├── app/ # Routes, providers
├── components/ # Shared components ONLY
├── config/ # Global config
├── features/ # Feature modules (most code)
│ └── feature/
│ ├── api/
│ ├── components/
│ ├── hooks/
│ ├── stores/
│ └── types/
├── hooks/ # Shared hooks
├── lib/ # Third-party configs
├── stores/ # Global state
├── testing/ # Test utilities
├── types/ # Shared types
└── utils/ # Shared utilities
```
## Audit Categories
| Category | Key Checks |
|----------|------------|
| Structure | Feature folders, cross-feature imports, boundaries |
| Components | Size (<300 LOC), props (<10), composition |
| State | Appropriate categories, localization, server cache |
| API Layer | Centralized client, types, React Query/SWR |
| Testing | Trophy (70/20/10), semantic queries, behavior |
| Styling | Consistent approach, component library |
| Errors | Boundaries, interceptors, tracking |
| Performance | Code splitting, memoization, bundle size |
| Security | JWT cookies, RBAC, XSS prevention |
| Standards | ESLint, Prettier, TS strict, Husky |
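One of the simplest of these checks, the component-size rule (<300 LOC), can be sketched in a few lines of Python. This is illustrative only — the real analyzers live in the skill's scripts, and the file-extension filter and severity label here are assumptions:

```python
from pathlib import Path

def oversized_components(root: str, max_loc: int = 300) -> list:
    """Flag .tsx/.jsx files whose non-blank line count exceeds max_loc."""
    findings = []
    for path in Path(root).rglob("*"):
        if path.suffix not in {".tsx", ".jsx"}:
            continue
        loc = sum(1 for line in path.read_text().splitlines() if line.strip())
        if loc > max_loc:
            findings.append({"file": str(path), "loc": loc, "severity": "medium"})
    return findings
```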
## Usage Examples
```
# Basic audit
Audit this React codebase using bulletproof-react-auditor.
# Structure focus
Run structure audit against Bulletproof React patterns.
# Migration plan
Generate migration plan to Bulletproof architecture.
# Custom scope
Audit focusing on structure, components, and state management.
```
## Output Formats
1. **Markdown Report** - ASCII diagrams, code examples
2. **JSON Report** - Machine-readable for CI/CD
3. **Migration Plan** - Roadmap with effort estimates
## Priority Levels
| Priority | Examples | Timeline |
|----------|----------|----------|
| P0 Critical | Security vulns, breaking issues | Immediate |
| P1 High | Feature folder creation, reorg | This sprint |
| P2 Medium | State refactor, API layer | Next quarter |
| P3 Low | Styling, docs, polish | Backlog |
## Connor's Standards Enforced
- TypeScript strict mode (no `any`)
- 80%+ test coverage
- Testing trophy: 70% integration, 20% unit, 10% E2E
- No console.log in production
- Semantic queries (getByRole preferred)
## Best Practices
1. Fix folder organization before component refactoring
2. Extract features before other changes
3. Maintain test coverage during migration
4. Incremental migration, not all at once
5. Document decisions with ADRs
## Limitations
- Static analysis only
- Requires React 16.8+ (hooks)
- Best for SPA/SSG (Next.js differs)
- Large codebases need scoped analysis
## Resources
- [Bulletproof React Guide](https://github.com/alan2207/bulletproof-react)
- [Project Structure](https://github.com/alan2207/bulletproof-react/blob/master/docs/project-structure.md)
- [Sample App](https://github.com/alan2207/bulletproof-react/tree/master/apps/react-vite)
## References
See `reference/` for:
- Complete Bulletproof principles guide
- Detailed audit criteria checklist
- Migration patterns and examples
- ADR templates


@@ -0,0 +1,353 @@
# Bulletproof React Audit Report
**Generated**: 2024-10-25 15:30:00
**Codebase**: `/Users/developer/projects/my-react-app`
**Tech Stack**: React, TypeScript, Vite, Redux, Jest
**Structure Type**: Flat
**Total Files**: 287
**Lines of Code**: 18,420
---
## Executive Summary
### Overall Bulletproof Compliance: **62/100** (Grade: D)
### Category Scores
- **Structure**: 45/100 ⚠️ (Needs major refactoring)
- **Components**: 68/100 ⚠️ (Some improvements needed)
- **State Management**: 55/100 ⚠️ (Missing server cache)
- **API Layer**: 50/100 ⚠️ (Scattered fetch calls)
- **Testing**: 72/100 ⚠️ (Below 80% coverage)
- **Styling**: 80/100 ✅ (Good - using Tailwind)
- **Error Handling**: 40/100 ⚠️ (Missing error boundaries)
- **Performance**: 65/100 ⚠️ (No code splitting)
- **Security**: 58/100 ⚠️ (Tokens in localStorage)
- **Standards**: 85/100 ✅ (Good compliance)
### Issue Summary
- **Critical Issues**: 3
- **High Issues**: 12
- **Medium Issues**: 24
- **Low Issues**: 8
- **Total Issues**: 47
**Estimated Migration Effort**: 18.5 person-days (~4 weeks for 1 developer)
---
## Detailed Findings
### 🚨 CRITICAL (3 issues)
#### 1. Tokens stored in localStorage (Security)
**Current State**: Authentication tokens stored in localStorage in 3 files
**Target State**: Use HttpOnly cookies for JWT storage
**Files Affected**:
- `src/utils/auth.ts`
- `src/hooks/useAuth.ts`
- `src/api/client.ts`
**Impact**: localStorage is vulnerable to XSS attacks. If attacker injects JavaScript, they can steal authentication tokens.
**Migration Steps**:
1. Configure API backend to set JWT in HttpOnly cookie
2. Remove `localStorage.setItem('token', ...)` calls
3. Use `credentials: 'include'` in fetch requests
4. Implement CSRF protection
5. Test authentication flow
**Effort**: MEDIUM
---
#### 2. No features/ directory - flat structure (Structure)
**Current State**: All 287 files in flat src/ directory structure
**Target State**: 80%+ code organized in feature-based modules
**Impact**:
- Difficult to scale beyond current size
- No clear feature boundaries
- High coupling between unrelated code
- Difficult to test in isolation
- New developers struggle to find code
**Migration Steps**:
1. Create `src/features/` directory
2. Identify distinct features (e.g., authentication, dashboard, profile, settings)
3. Create directories for each feature
4. Move feature-specific code to respective features/
5. Organize each feature with api/, components/, hooks/, stores/ subdirectories
6. Update all import paths
7. Test thoroughly after each feature migration
**Effort**: HIGH (plan for 2 weeks)
---
#### 3. No Testing Library detected (Testing)
**Current State**: Jest is installed, but @testing-library/react is not
**Target State**: Use Testing Library for user-centric React testing
**Impact**:
- Component tests must assert implementation details rather than user behavior
- Tests are brittle and break on refactoring
- Cannot follow the testing trophy distribution
- Poor test quality
**Migration Steps**:
1. Install @testing-library/react
2. Install @testing-library/jest-dom
3. Configure test setup file
4. Write example tests using Testing Library patterns
5. Train team on Testing Library principles
**Effort**: LOW
---
### ⚠️ HIGH (12 issues - showing top 5)
#### 4. No data fetching library (State Management)
**Current State**: Manual API state management with Redux
**Target State**: Use React Query or SWR for server cache state
**Migration Steps**:
1. Install @tanstack/react-query
2. Wrap app with QueryClientProvider
3. Convert Redux API slices to React Query hooks
4. Remove manual loading/error state management
5. Configure caching strategies
**Effort**: MEDIUM
---
#### 5. Test coverage at 65.3% (Testing)
**Current State**: Line coverage: 65.3%, Branch coverage: 58.2%
**Target State**: Maintain 80%+ test coverage
**Critical Untested Paths**:
- Authentication flow
- Payment processing
- User profile updates
**Migration Steps**:
1. Generate coverage report with uncovered files
2. Prioritize critical paths (authentication, payments)
3. Write integration tests first (70% of tests)
4. Add unit tests for business logic
5. Configure coverage thresholds in jest.config.js
**Effort**: HIGH
---
#### 6. Large component: UserDashboard.tsx (468 LOC) (Components)
**Current State**: `src/components/UserDashboard.tsx` has 468 lines
**Target State**: Components should be < 300 lines
**Migration Steps**:
1. Identify distinct UI sections in dashboard
2. Extract sections to separate components (DashboardHeader, DashboardStats, DashboardActivity)
3. Move business logic to custom hooks (useDashboardData)
4. Extract complex calculations to utility functions
5. Update tests to test new components independently
**Effort**: MEDIUM
---
#### 7. Cross-feature imports detected (Structure)
**Current State**: 8 files import from other features
**Violations**:
- `features/dashboard → features/profile`
- `features/settings → features/authentication`
**Target State**: Features should be independent. Shared code belongs in src/components/ or src/utils/
**Migration Steps**:
1. Identify shared code being imported across features
2. Move truly shared components to src/components/
3. Move shared utilities to src/utils/
4. If code is feature-specific, duplicate it or refactor feature boundaries
**Effort**: MEDIUM
---
#### 8. No error boundaries detected (Error Handling)
**Current State**: No ErrorBoundary components found
**Target State**: Multiple error boundaries at route and feature levels
**Migration Steps**:
1. Create src/components/ErrorBoundary.tsx
2. Wrap each route with ErrorBoundary
3. Add feature-level error boundaries
4. Display user-friendly error messages
5. Log errors to Sentry
**Effort**: LOW
---
### 📊 MEDIUM (24 issues - showing top 3)
#### 9. Too many shared components (Structure)
**Current State**: 62.3% of components in src/components/ (shared)
**Target State**: Most components should be feature-specific
**Migration Steps**:
1. Review each shared component
2. Identify components used by only one feature
3. Move feature-specific components to their features
4. Keep only truly shared components in src/components/
**Effort**: MEDIUM
---
#### 10. Component with 12 props: UserProfileForm (Components)
**Current State**: `UserProfileForm` accepts 12 props
**Target State**: Components should accept < 7-10 props
**Migration Steps**:
1. Group related props into configuration object
2. Use composition (children) instead of render props
3. Extract sub-components with their own props
4. Consider Context for deeply shared state
**Effort**: LOW
---
#### 11. No code splitting detected (Performance)
**Current State**: No React.lazy() usage found
**Target State**: Use code splitting for routes and large components
**Migration Steps**:
1. Wrap route components with React.lazy()
2. Add Suspense boundaries with loading states
3. Split large features into separate chunks
4. Analyze bundle size with vite-bundle-analyzer
**Effort**: LOW
---
## Recommendations
### Immediate Action Required (This Week)
1. **Security**: Move tokens from localStorage to HttpOnly cookies
2. **Structure**: Create features/ directory and plan migration
3. **Testing**: Install Testing Library and write example tests
### This Sprint (Next 2 Weeks)
4. **Structure**: Begin feature extraction (start with 1-2 features)
5. **State**: Add React Query for API calls
6. **Testing**: Increase coverage to 70%+
7. **Components**: Refactor largest components (> 400 LOC)
8. **Errors**: Add error boundaries
### Next Quarter (3 Months)
9. **Structure**: Complete feature-based migration
10. **Testing**: Achieve 80%+ coverage
11. **Performance**: Implement code splitting
12. **State**: Evaluate Redux necessity (might not need with React Query)
### Backlog
13. **Standards**: Add git hooks (Husky) for pre-commit checks
14. **Components**: Improve component colocation
15. **Styling**: Document design system
16. **Naming**: Enforce kebab-case file naming
---
## Migration Priority Roadmap
### Week 1-2: Foundation
- [ ] Fix security issues (localStorage tokens)
- [ ] Create features/ structure
- [ ] Install Testing Library
- [ ] Add error boundaries
- [ ] Configure React Query
### Week 3-4: Feature Extraction Phase 1
- [ ] Extract authentication feature
- [ ] Extract dashboard feature
- [ ] Update imports and test
- [ ] Improve test coverage to 70%
### Week 5-8: Feature Extraction Phase 2
- [ ] Extract remaining features
- [ ] Refactor large components
- [ ] Add comprehensive error handling
- [ ] Achieve 80%+ test coverage
### Week 9-12: Optimization
- [ ] Implement code splitting
- [ ] Performance optimizations
- [ ] Security hardening
- [ ] Documentation updates
---
## Architecture Comparison
### Current Structure
```
src/
├── components/ (180 components - too many!)
├── hooks/ (12 hooks)
├── utils/ (15 utility files)
├── store/ (Redux slices)
├── api/ (API calls)
└── pages/ (Route components)
```
### Target Bulletproof Structure
```
src/
├── app/
│ ├── routes/
│ ├── app.tsx
│ └── provider.tsx
├── features/
│ ├── authentication/
│ │ ├── api/
│ │ ├── components/
│ │ ├── hooks/
│ │ └── stores/
│ ├── dashboard/
│ │ └── ...
│ └── profile/
│ └── ...
├── components/ (Only truly shared - ~20 components)
├── hooks/ (Shared hooks)
├── lib/ (API client, configs)
├── utils/ (Shared utilities)
└── types/ (Shared types)
```
---
*Report generated by Bulletproof React Auditor Skill v1.0*
*Based on Bulletproof React principles and Connor's development standards*


@@ -0,0 +1,250 @@
# Bulletproof React Audit Criteria
Complete checklist for auditing React/TypeScript applications against Bulletproof React architecture principles.
## 1. Project Structure
### Feature-Based Organization
- [ ] 80%+ of code organized in src/features/
- [ ] Each feature has its own directory
- [ ] Features are independent (no cross-feature imports)
- [ ] Feature subdirectories: api/, components/, hooks/, stores/, types/, utils/
### Top-Level Directories
- [ ] src/app/ exists (application layer)
- [ ] src/features/ exists and contains features
- [ ] src/components/ for truly shared components only
- [ ] src/hooks/ for shared custom hooks
- [ ] src/lib/ for third-party configurations
- [ ] src/utils/ for shared utilities
- [ ] src/types/ for shared TypeScript types
- [ ] src/stores/ for global state (if needed)
### Unidirectional Dependencies
- [ ] No cross-feature imports
- [ ] Shared code imported into features (not vice versa)
- [ ] App layer imports from features
- [ ] Clean dependency flow: shared → features → app
## 2. Component Architecture
### Component Design
- [ ] Components < 300 lines of code
- [ ] No large components (> 500 LOC)
- [ ] Components accept < 7-10 props
- [ ] No nested render functions
- [ ] Component colocation (near where used)
- [ ] Proper use of composition over excessive props
### File Organization
- [ ] Kebab-case file naming
- [ ] Components colocated with tests
- [ ] Styles colocated with components
- [ ] Feature-specific components in features/
- [ ] Only truly shared components in src/components/
### Abstraction
- [ ] No premature abstractions
- [ ] Repetition identified before creating abstractions
- [ ] Components are focused and single-purpose
## 3. State Management
### State Categories
- [ ] Component state with useState/useReducer
- [ ] Global state with Context, Zustand, or Jotai
- [ ] Server cache state with React Query or SWR
- [ ] Form state with React Hook Form or Formik
- [ ] URL state with React Router
### State Localization
- [ ] State as local as possible
- [ ] Global state only when necessary
- [ ] No single massive global state object
- [ ] Context split into multiple focused providers
### Server State
- [ ] React Query or SWR for API data
- [ ] Proper caching configuration
- [ ] No manual loading/error state for API calls
- [ ] Optimistic updates where appropriate
## 4. API Layer
### Centralized Configuration
- [ ] Single API client instance
- [ ] Configured in src/lib/
- [ ] Base URL configuration
- [ ] Request/response interceptors
- [ ] Error handling interceptors
### Request Organization
- [ ] API calls colocated in features/*/api/
- [ ] Type-safe request declarations
- [ ] Custom hooks for each endpoint
- [ ] Validation schemas with types
- [ ] Proper error handling
## 5. Testing Strategy
### Coverage
- [ ] 80%+ line coverage
- [ ] 75%+ branch coverage
- [ ] 100% coverage on critical paths
- [ ] Coverage reports generated
### Testing Trophy Distribution
- [ ] ~70% integration tests
- [ ] ~20% unit tests
- [ ] ~10% E2E tests
### Test Quality
- [ ] Tests named "should X when Y"
- [ ] Semantic queries (getByRole, getByLabelText)
- [ ] Testing user behavior, not implementation
- [ ] No brittle tests (exact counts, element ordering)
- [ ] Tests isolated and independent
- [ ] No flaky tests
### Testing Tools
- [ ] Vitest or Jest configured
- [ ] @testing-library/react installed
- [ ] @testing-library/jest-dom for assertions
- [ ] Playwright or Cypress for E2E (optional)
## 6. Styling Patterns
### Styling Approach
- [ ] Consistent styling method chosen
- [ ] Component library (Chakra, Radix, MUI) OR
- [ ] Utility CSS (Tailwind) OR
- [ ] CSS-in-JS (Emotion, styled-components)
- [ ] Styles colocated with components
### Design System
- [ ] Design tokens defined
- [ ] Color palette established
- [ ] Typography scale defined
- [ ] Spacing system consistent
## 7. Error Handling
### Error Boundaries
- [ ] Multiple error boundaries at strategic locations
- [ ] Route-level error boundaries
- [ ] Feature-level error boundaries
- [ ] User-friendly error messages
- [ ] Error recovery mechanisms
### API Errors
- [ ] API error interceptors configured
- [ ] User notifications for errors
- [ ] Automatic retry logic where appropriate
- [ ] Unauthorized user logout
### Error Tracking
- [ ] Sentry or similar service configured
- [ ] User context added to errors
- [ ] Environment-specific error handling
- [ ] Source maps configured for production
## 8. Performance
### Code Splitting
- [ ] React.lazy() for route components
- [ ] Suspense boundaries with loading states
- [ ] Large features split into chunks
- [ ] Bundle size monitored and optimized
### React Performance
- [ ] State localized to prevent re-renders
- [ ] React.memo for expensive components
- [ ] useMemo for expensive calculations
- [ ] useCallback for stable function references
- [ ] Children prop optimization patterns
### Asset Optimization
- [ ] Images lazy loaded
- [ ] Images in modern formats (WebP)
- [ ] Responsive images with srcset
- [ ] Images < 500KB
- [ ] Videos lazy loaded or streamed
## 9. Security
### Authentication
- [ ] JWT stored in HttpOnly cookies (not localStorage)
- [ ] Secure session management
- [ ] Token refresh logic
- [ ] Logout functionality
### Authorization
- [ ] RBAC or PBAC implemented
- [ ] Protected routes
- [ ] Permission checks on actions
- [ ] API-level authorization
### XSS Prevention
- [ ] Input sanitization (DOMPurify)
- [ ] No dangerouslySetInnerHTML without sanitization
- [ ] Output encoding
- [ ] Content Security Policy
### CSRF Protection
- [ ] CSRF tokens for state-changing requests
- [ ] SameSite cookie attribute
- [ ] Verify origin headers
## 10. Standards Compliance
### ESLint
- [ ] .eslintrc or eslint.config.js configured
- [ ] React rules enabled
- [ ] TypeScript rules enabled
- [ ] Accessibility rules (jsx-a11y)
- [ ] Architectural rules (import restrictions)
### TypeScript
- [ ] strict: true in tsconfig.json
- [ ] No `any` types
- [ ] Explicit return types
- [ ] Type definitions for third-party libraries
- [ ] Types colocated with features
### Prettier
- [ ] Prettier configured
- [ ] Format on save enabled
- [ ] Consistent code style
- [ ] .prettierrc configuration
### Git Hooks
- [ ] Husky configured
- [ ] Pre-commit linting
- [ ] Pre-commit type checking
- [ ] Pre-commit tests (optional)
### File Naming
- [ ] Kebab-case for files and directories
- [ ] Consistent naming conventions
- [ ] ESLint rule to enforce naming
### Absolute Imports
- [ ] TypeScript paths configured (@/ prefix)
- [ ] Imports use @/ instead of relative paths
- [ ] Easier refactoring and moving files
## Compliance Scoring
### Grade Scale
- **A (90-100)**: Excellent Bulletproof React compliance
- **B (80-89)**: Good compliance, minor improvements needed
- **C (70-79)**: Moderate compliance, significant refactoring recommended
- **D (60-69)**: Poor compliance, major architectural changes needed
- **F (<60)**: Non-compliant, complete restructuring required
### Category Weights
All categories weighted equally for overall score.
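As a sketch, the equal-weighted scoring and grade scale above can be expressed like this (function names are illustrative, not part of the skill's actual code):

```python
def letter_grade(score: float) -> str:
    """Map a 0-100 compliance score to the grade scale above."""
    if score >= 90:
        return 'A'
    if score >= 80:
        return 'B'
    if score >= 70:
        return 'C'
    if score >= 60:
        return 'D'
    return 'F'

def overall_score(category_scores: dict) -> float:
    """Equal-weighted mean of all category scores."""
    return sum(category_scores.values()) / len(category_scores)

# Category scores from the example audit report
scores = {
    'structure': 45, 'components': 68, 'state': 55, 'api': 50,
    'testing': 72, 'styling': 80, 'errors': 40, 'performance': 65,
    'security': 58, 'standards': 85,
}
score = overall_score(scores)  # 61.8, reported as 62/100
grade = letter_grade(score)    # 'D'
```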
---
**Note**: This checklist represents the ideal Bulletproof React architecture. Adapt based on your project's specific needs and constraints while maintaining the core principles.


@@ -0,0 +1,248 @@
# Severity Matrix
Priority levels and response times for Bulletproof React audit findings.
## Severity Levels
### Critical (P0)
**Fix immediately (within 24 hours)**
#### Criteria
- Security vulnerabilities (tokens in localStorage, XSS risks)
- Breaking architectural violations that prevent scalability
- No testing framework in production app
- TypeScript strict mode disabled with widespread `any` usage
#### Examples
- Authentication tokens stored in localStorage
- No error boundaries in production app
- Zero test coverage on critical paths
- Multiple cross-feature dependencies creating circular imports
#### Impact
- Security breaches possible
- Application instability
- Cannot safely refactor or add features
- Technical debt compounds rapidly
---
### High (P1)
**Fix this sprint (within 2 weeks)**
#### Criteria
- Major architectural misalignment with Bulletproof React
- No data fetching library (manual API state management)
- Test coverage < 80%
- Large components (> 400 LOC) with multiple responsibilities
- No features/ directory with >50 components
#### Examples
- Flat structure instead of feature-based
- Scattered fetch calls throughout components
- No React Query/SWR for server state
- Components with 15+ props
- No error tracking service (Sentry)
#### Impact
- Difficult to maintain and extend
- Poor developer experience
- Slow feature development
- Bugs hard to track and fix
- Testing becomes increasingly difficult
---
### Medium (P2)
**Fix next quarter (within 3 months)**
#### Criteria
- Component design anti-patterns
- State management could be improved
- Missing recommended directories
- Some cross-feature imports
- No code splitting
- Inconsistent styling approaches
#### Examples
- Components 200-400 LOC
- Context with 5+ state values
- Too many shared components (should be feature-specific)
- Nested render functions instead of components
- Multiple styling systems in use
- Large images not optimized
#### Impact
- Code is maintainable but could be better
- Some technical debt accumulating
- Refactoring is more difficult than it should be
- Performance could be better
- Developer onboarding takes longer
---
### Low (P3)
**Backlog (schedule when convenient)**
#### Criteria
- Minor deviations from Bulletproof React patterns
- Stylistic improvements
- Missing nice-to-have features
- Small optimizations
#### Examples
- Files not using kebab-case naming
- No Prettier configured
- No git hooks (Husky)
- Missing some recommended directories
- Test naming doesn't follow "should X when Y"
- Some components could be better colocated
#### Impact
- Minimal impact on development
- Minor inconsistencies
- Small developer experience improvements possible
- Low-priority technical debt
---
## Effort Estimation
### Low Effort (< 1 day)
- Installing dependencies
- Creating configuration files
- Renaming files
- Adding error boundaries
- Setting up Prettier/ESLint
- Configuring git hooks
### Medium Effort (1-5 days)
- Creating features/ structure
- Organizing existing code into features
- Refactoring large components
- Adding React Query/SWR
- Setting up comprehensive error handling
- Improving test coverage to 80%
### High Effort (1-3 weeks)
- Complete architecture restructuring
- Migrating from flat to feature-based structure
- Comprehensive security improvements
- Building out full test suite
- Large-scale refactoring
- Multiple concurrent improvements
---
## Priority Decision Matrix
| Severity | Effort Low | Effort Medium | Effort High |
|----------|------------|---------------|-------------|
| **Critical** | P0 - Do Now | P0 - Do Now | P0 - Plan & Start |
| **High** | P1 - This Sprint | P1 - This Sprint | P1 - This Quarter |
| **Medium** | P2 - Next Sprint | P2 - Next Quarter | P2 - This Year |
| **Low** | P3 - Backlog | P3 - Backlog | P3 - Nice to Have |
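For tooling that sorts findings automatically, the matrix above can be written as a lookup table (a sketch only; the analyzers emit these same severity/effort strings in their finding dicts):

```python
# Keys are (severity, effort); values are the scheduling guidance from the table.
PRIORITY_MATRIX = {
    ('critical', 'low'): 'P0 - Do Now',
    ('critical', 'medium'): 'P0 - Do Now',
    ('critical', 'high'): 'P0 - Plan & Start',
    ('high', 'low'): 'P1 - This Sprint',
    ('high', 'medium'): 'P1 - This Sprint',
    ('high', 'high'): 'P1 - This Quarter',
    ('medium', 'low'): 'P2 - Next Sprint',
    ('medium', 'medium'): 'P2 - Next Quarter',
    ('medium', 'high'): 'P2 - This Year',
    ('low', 'low'): 'P3 - Backlog',
    ('low', 'medium'): 'P3 - Backlog',
    ('low', 'high'): 'P3 - Nice to Have',
}

def schedule(severity: str, effort: str) -> str:
    """Look up the priority bucket for a finding."""
    return PRIORITY_MATRIX[(severity, effort)]
```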
---
## Response Time Guidelines
### Critical (P0)
- **Notification**: Immediate (Slack/email alert)
- **Acknowledgment**: Within 1 hour
- **Plan**: Within 4 hours
- **Fix**: Within 24 hours
- **Verification**: Immediately after fix
- **Documentation**: ADR created
### High (P1)
- **Notification**: Within 1 day
- **Acknowledgment**: Within 1 day
- **Plan**: Within 2 days
- **Fix**: Within current sprint (2 weeks)
- **Verification**: Before sprint end
- **Documentation**: Updated in sprint retrospective
### Medium (P2)
- **Notification**: Within 1 week
- **Acknowledgment**: Within 1 week
- **Plan**: Within sprint planning
- **Fix**: Within quarter (3 months)
- **Verification**: Quarterly review
- **Documentation**: Included in quarterly planning
### Low (P3)
- **Notification**: Added to backlog
- **Acknowledgment**: During backlog refinement
- **Plan**: When capacity available
- **Fix**: Opportunistic
- **Verification**: As completed
- **Documentation**: Optional
---
## Category-Specific Severity Guidelines
### Structure Issues
- **Critical**: No features/, flat structure with 100+ components
- **High**: Missing features/, cross-feature dependencies
- **Medium**: Some organizational issues
- **Low**: Minor folder organization improvements
### Component Issues
- **Critical**: Components > 1000 LOC, widespread violations
- **High**: Many components > 400 LOC, 15+ props
- **Medium**: Some large components, nested renders
- **Low**: Minor design improvements needed
### State Management
- **Critical**: No proper state management in complex app
- **High**: No data fetching library, manual API state
- **Medium**: State could be better localized
- **Low**: Could use better state management tool
### Testing Issues
- **Critical**: No testing framework, 0% coverage
- **High**: Coverage < 50%, wrong test distribution
- **Medium**: Coverage 50-79%, some brittle tests
- **Low**: Coverage > 80%, minor test improvements
### Security Issues
- **Critical**: Tokens in localStorage, XSS vulnerabilities
- **High**: No error tracking, missing CSRF protection
- **Medium**: Minor security improvements needed
- **Low**: Security best practices could be better
---
## Migration Planning
### Phase 1: Critical (Week 1)
1. Fix all P0 security issues
2. Establish basic architecture (features/)
3. Set up testing framework
4. Configure error tracking
### Phase 2: High Priority (Weeks 2-6)
1. Migrate to feature-based structure
2. Add React Query/SWR
3. Improve test coverage to 80%
4. Refactor large components
5. Add error boundaries
### Phase 3: Medium Priority (Months 2-3)
1. Optimize component architecture
2. Implement code splitting
3. Improve state management
4. Add comprehensive testing
5. Performance optimizations
### Phase 4: Low Priority (Ongoing)
1. Stylistic improvements
2. Developer experience enhancements
3. Documentation updates
4. Minor refactoring
---
**Note**: These guidelines should be adapted based on your team size, release cadence, and business priorities. Always balance technical debt reduction with feature development.


@@ -0,0 +1,5 @@
"""
Bulletproof React Analyzers
Specialized analyzers for different aspects of Bulletproof React compliance.
"""


@@ -0,0 +1,72 @@
"""
API Layer Analyzer
Analyzes API organization against Bulletproof React patterns:
- Centralized API client
- Type-safe request declarations
- Colocated in features/
- Data fetching hooks
"""
from pathlib import Path
from typing import Dict, List
import re
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze API layer architecture."""
findings = []
src_dir = codebase_path / 'src'
if not src_dir.exists():
return findings
# Check for centralized API client
has_api_config = (src_dir / 'lib').exists() or any(src_dir.rglob('**/api-client.*'))
if not has_api_config:
findings.append({
'severity': 'medium',
'category': 'api',
'title': 'No centralized API client detected',
'current_state': 'No api-client configuration found in src/lib/',
'target_state': 'Create single configured API client instance',
'migration_steps': [
'Create src/lib/api-client.ts with axios or fetch wrapper',
'Configure base URL, headers, interceptors',
'Export configured client',
'Use in all API calls'
],
'effort': 'low',
})
# Check for scattered fetch calls
scattered_fetches = []
for file in src_dir.rglob('*.{ts,tsx,js,jsx}'):
if 'test' in str(file) or 'spec' in str(file):
continue
try:
with open(file, 'r') as f:
content = f.read()
if re.search(r'\bfetch\s*\(', content) and 'api' not in str(file).lower():
scattered_fetches.append(str(file.relative_to(src_dir)))
except:
pass
if len(scattered_fetches) > 3:
findings.append({
'severity': 'high',
'category': 'api',
'title': f'Scattered fetch calls in {len(scattered_fetches)} files',
'current_state': 'fetch() calls throughout components',
'target_state': 'Centralize API calls in feature api/ directories',
'migration_steps': [
'Create api/ directory in each feature',
'Move API calls to dedicated functions',
'Create custom hooks wrapping API calls',
'Use React Query or SWR for data fetching'
],
'effort': 'high',
'affected_files': scattered_fetches[:5],
})
return findings
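# Sanity check (illustrative, not part of the analyzer): the fetch-detection
# regex above flags direct fetch() calls but skips identifiers that merely
# contain the word 'fetch'.
import re

FETCH_CALL = re.compile(r'\bfetch\s*\(')

assert FETCH_CALL.search("const res = await fetch('/api/users')")
assert FETCH_CALL.search("fetch (url)")  # whitespace before the paren still matches
assert not FETCH_CALL.search("refetchQueries()")  # no word boundary before 'fetch'
assert not FETCH_CALL.search("const fetchUsers = () => {}")  # identifier, not a call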


@@ -0,0 +1,323 @@
"""
Component Architecture Analyzer
Analyzes React component design against Bulletproof React principles:
- Component colocation (near where they're used)
- Limited props (< 7-10)
- Reasonable component size (< 300 LOC)
- No nested render functions
- Proper composition over excessive props
- Consistent naming (kebab-case files)
"""
import re
from pathlib import Path
from typing import Dict, List, Tuple
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze component architecture for Bulletproof React compliance.
Args:
codebase_path: Path to React codebase
metadata: Project metadata from discovery phase
Returns:
List of findings with severity and migration guidance
"""
findings = []
src_dir = codebase_path / 'src'
if not src_dir.exists():
return findings
# Analyze all React component files
findings.extend(check_component_sizes(src_dir))
findings.extend(check_component_props(src_dir))
findings.extend(check_nested_render_functions(src_dir))
findings.extend(check_file_naming_conventions(src_dir))
findings.extend(check_component_colocation(src_dir))
return findings
def check_component_sizes(src_dir: Path) -> List[Dict]:
    """Check for overly large components."""
    findings = []
    exclude_dirs = {'node_modules', 'dist', 'build', '.next', 'coverage'}
    large_components = []
    # pathlib globbing has no brace expansion, so iterate extensions explicitly
    component_files = (f for ext in ('*.tsx', '*.jsx') for f in src_dir.rglob(ext))
    for component_file in component_files:
        if any(excluded in component_file.parts for excluded in exclude_dirs):
            continue
        try:
            with open(component_file, 'r', encoding='utf-8', errors='ignore') as f:
                lines = f.readlines()
        except OSError:
            continue
        loc = len([line for line in lines if line.strip() and not line.strip().startswith('//')])
        if loc > 300:
            large_components.append({
                'file': str(component_file.relative_to(src_dir)),
                'lines': loc,
                'severity': 'critical' if loc > 500 else 'high' if loc > 400 else 'medium',
            })

    if large_components:
        # Report the worst offenders first
        large_components.sort(key=lambda x: x['lines'], reverse=True)
        for comp in large_components[:10]:  # Top 10 largest
            findings.append({
                'severity': comp['severity'],
                'category': 'components',
                'title': f'Large component ({comp["lines"]} LOC)',
                'current_state': f'{comp["file"]} has {comp["lines"]} lines',
                'target_state': 'Components should be < 300 lines. Large components are hard to understand and test.',
                'migration_steps': [
                    'Identify distinct responsibilities in the component',
                    'Extract smaller components for each UI section',
                    'Move business logic to custom hooks',
                    'Extract complex rendering logic to separate components',
                    'Consider splitting into multiple feature components',
                ],
                'effort': 'high' if comp['lines'] > 400 else 'medium',
                'file': comp['file'],
            })
    return findings
def check_component_props(src_dir: Path) -> List[Dict]:
    """Check for components with excessive props."""
    findings = []
    exclude_dirs = {'node_modules', 'dist', 'build', '.next', 'coverage'}
    components_with_many_props = []
    # Find component definitions with destructured props.
    # Pattern matches: function Component({ prop1, prop2, ... })
    # and: const Component = ({ prop1, prop2, ... }) =>
    props_pattern = re.compile(
        r'(?:function|const)\s+(\w+)\s*(?:=\s*)?\(\s*\{([^}]+)\}',
        re.MULTILINE
    )
    # pathlib globbing has no brace expansion, so iterate extensions explicitly
    component_files = (f for ext in ('*.tsx', '*.jsx') for f in src_dir.rglob(ext))
    for component_file in component_files:
        if any(excluded in component_file.parts for excluded in exclude_dirs):
            continue
        try:
            with open(component_file, 'r', encoding='utf-8', errors='ignore') as f:
                content = f.read()
        except OSError:
            continue
        for component_name, props_str in props_pattern.findall(content):
            # Count props (split by comma), excluding rest/spread entries
            props = [p.strip() for p in props_str.split(',') if p.strip()]
            actual_props = [p for p in props if not p.startswith('...')]
            prop_count = len(actual_props)
            if prop_count > 10:
                components_with_many_props.append({
                    'file': str(component_file.relative_to(src_dir)),
                    'component': component_name,
                    'prop_count': prop_count,
                })

    for comp in components_with_many_props:
        findings.append({
            'severity': 'critical' if comp['prop_count'] > 15 else 'high',
            'category': 'components',
            'title': f'Component with {comp["prop_count"]} props: {comp["component"]}',
            'current_state': f'{comp["file"]} has {comp["prop_count"]} props',
            'target_state': 'Components should accept < 7-10 props. Too many props indicates insufficient composition.',
            'migration_steps': [
                'Group related props into configuration objects',
                'Use composition (children prop) instead of render props',
                'Extract sub-components with their own props',
                'Consider using Context for deeply shared state',
                'Use compound component pattern for complex UIs',
            ],
            'effort': 'medium',
            'file': comp['file'],
        })
    return findings
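# Quick illustration of the props pattern above (component and prop names here
# are examples only): a destructured parameter list is captured, split on
# commas, and spread entries are excluded before counting.
import re

_PROPS = re.compile(r'(?:function|const)\s+(\w+)\s*(?:=\s*)?\(\s*\{([^}]+)\}', re.MULTILINE)

sample = "const UserCard = ({ name, email, avatar, onSelect, ...rest }) =>"
name, props_str = _PROPS.findall(sample)[0]
props = [p.strip() for p in props_str.split(',') if p.strip()]
actual = [p for p in props if not p.startswith('...')]
assert name == 'UserCard'
assert len(actual) == 4  # the '...rest' spread entry is excluded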
def check_nested_render_functions(src_dir: Path) -> List[Dict]:
"""Check for nested render functions inside components."""
findings = []
exclude_dirs = {'node_modules', 'dist', 'build', '.next', 'coverage'}
nested_render_functions = []
for component_file in src_dir.rglob('*.{tsx,jsx}'):
if any(excluded in component_file.parts for excluded in exclude_dirs):
continue
try:
with open(component_file, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
# Look for patterns like: const renderSomething = () => { ... }
# or: function renderSomething() { ... }
nested_render_pattern = re.compile(r'(?:const|function)\s+(render\w+)\s*[=:]?\s*\([^)]*\)\s*(?:=>)?\s*\{')
for line_num, line in enumerate(lines, start=1):
if nested_render_pattern.search(line):
nested_render_functions.append({
'file': str(component_file.relative_to(src_dir)),
'line': line_num,
})
except:
pass
if nested_render_functions:
# Group by file
files_with_nested = {}
for item in nested_render_functions:
file = item['file']
if file not in files_with_nested:
files_with_nested[file] = []
files_with_nested[file].append(item['line'])
for file, lines in files_with_nested.items():
findings.append({
'severity': 'medium',
'category': 'components',
'title': f'Nested render functions detected ({len(lines)} instances)',
'current_state': f'{file} contains render functions inside component',
'target_state': 'Extract nested render functions into separate components for better reusability and testing.',
'migration_steps': [
'Identify each render function and its dependencies',
'Extract to separate component file',
'Pass necessary props to new component',
'Update tests to test new component in isolation',
'Remove render function from parent component'
],
'effort': 'low',
'file': file,
'affected_lines': lines[:5], # Show first 5
})
return findings
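The nested-render heuristic hinges on the regex above; a small sketch showing what it does and does not flag (the sample lines are hypothetical):

```python
import re

# Same pattern as check_nested_render_functions: a const/function
# declaration whose name starts with "render".
NESTED_RENDER_RE = re.compile(
    r'(?:const|function)\s+(render\w+)\s*[=:]?\s*\([^)]*\)\s*(?:=>)?\s*\{'
)

# Hypothetical source lines for illustration.
samples = [
    'const renderHeader = (props) => {',   # flagged
    'function renderRow(item) {',          # flagged
    'const handleClick = () => {',         # not a render* name
]
matches = [bool(NESTED_RENDER_RE.search(s)) for s in samples]
```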
def check_file_naming_conventions(src_dir: Path) -> List[Dict]:
"""Check for consistent kebab-case file naming."""
findings = []
exclude_dirs = {'node_modules', 'dist', 'build', '.next', 'coverage'}
non_kebab_files = []
for file_path in (p for p in src_dir.rglob('*') if p.suffix in {'.ts', '.tsx', '.js', '.jsx'}):
if any(excluded in file_path.parts for excluded in exclude_dirs):
continue
filename = file_path.stem # filename without extension
# Check if filename is kebab-case (lowercase with hyphens)
# Allow: kebab-case.tsx, lowercase.tsx
# Disallow: PascalCase.tsx, camelCase.tsx, snake_case.tsx
is_kebab_or_lowercase = re.match(r'^[a-z][a-z0-9]*(-[a-z0-9]+)*$', filename)
if not is_kebab_or_lowercase and filename not in ['index', 'App']: # Allow common exceptions
non_kebab_files.append(str(file_path.relative_to(src_dir)))
if len(non_kebab_files) > 5: # Only report if it's a pattern (>5 files)
findings.append({
'severity': 'low',
'category': 'components',
'title': f'Inconsistent file naming ({len(non_kebab_files)} files)',
'current_state': f'{len(non_kebab_files)} files not using kebab-case naming',
'target_state': 'Bulletproof React recommends kebab-case for all files (e.g., user-profile.tsx)',
'migration_steps': [
'Rename files to kebab-case format',
'Update all import statements',
'Run tests to ensure nothing broke',
'Add ESLint rule to enforce kebab-case (unicorn/filename-case)'
],
'effort': 'low',
'affected_files': non_kebab_files[:10], # Show first 10
})
return findings
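The kebab-case check reduces to a single regex; a sketch of the same rule in isolation:

```python
import re

# Same rule as check_file_naming_conventions: lowercase start, digits
# allowed, hyphen-separated segments (kebab-case or plain lowercase).
KEBAB_RE = re.compile(r'^[a-z][a-z0-9]*(-[a-z0-9]+)*$')

def is_kebab_case(stem: str) -> bool:
    """Return True if a filename stem is kebab-case or plain lowercase."""
    return bool(KEBAB_RE.match(stem))
```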
def check_component_colocation(src_dir: Path) -> List[Dict]:
"""Check if components are colocated near where they're used."""
findings = []
components_dir = src_dir / 'components'
if not components_dir.exists():
return findings
# Find components in shared components/ that are only used once
single_use_components = []
for component_file in (p for p in components_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'}):
try:
component_name = component_file.stem
# Search for imports of this component across codebase
import_pattern = re.compile(rf'import\s+.*\b{re.escape(component_name)}\b.*\s+from\s+[\'"]')
usage_count = 0
used_in_feature = None
for search_file in (p for p in src_dir.rglob('*') if p.suffix in {'.ts', '.tsx', '.js', '.jsx'}):
if search_file == component_file:
continue
try:
with open(search_file, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
if import_pattern.search(content):
usage_count += 1
# Check if used in a feature
if 'features' in search_file.parts:
features_index = search_file.parts.index('features')
if features_index + 1 < len(search_file.parts):
feature_name = search_file.parts[features_index + 1]
if used_in_feature is None:
used_in_feature = feature_name
elif used_in_feature != feature_name:
used_in_feature = 'multiple'
except:
pass
# If used only in one feature, it should be colocated there
if usage_count == 1 and used_in_feature and used_in_feature != 'multiple':
single_use_components.append({
'file': str(component_file.relative_to(src_dir)),
'component': component_name,
'feature': used_in_feature,
})
except:
pass
if single_use_components:
for comp in single_use_components[:5]: # Top 5
findings.append({
'severity': 'low',
'category': 'components',
'title': f'Component used in only one feature: {comp["component"]}',
'current_state': f'{comp["file"]} is in shared components/ but only used in {comp["feature"]} feature',
'target_state': 'Components used by only one feature should be colocated in that feature directory.',
'migration_steps': [
f'Move {comp["file"]} to src/features/{comp["feature"]}/components/',
'Update import in the feature',
'Run tests to verify',
'Remove from shared components/'
],
'effort': 'low',
'file': comp['file'],
})
return findings

View File

@@ -0,0 +1,62 @@
"""
Error Handling Analyzer
Analyzes error handling patterns:
- Error boundaries present
- API error interceptors
- Error tracking (Sentry)
"""
from pathlib import Path
from typing import Dict, List
import re
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze error handling patterns."""
findings = []
src_dir = codebase_path / 'src'
tech_stack = metadata.get('tech_stack', {})
if not src_dir.exists():
return findings
# Check for error boundaries
error_boundaries = list(src_dir.rglob('**/error-boundary.*')) + \
list(src_dir.rglob('**/ErrorBoundary.*'))
if not error_boundaries:
findings.append({
'severity': 'high',
'category': 'errors',
'title': 'No error boundaries detected',
'current_state': 'No ErrorBoundary components found',
'target_state': 'Implement multiple error boundaries at strategic locations',
'migration_steps': [
'Create ErrorBoundary component with componentDidCatch',
'Wrap route components with ErrorBoundary',
'Add feature-level error boundaries',
'Display user-friendly error messages'
],
'effort': 'low',
})
# Check for error tracking
if not tech_stack.get('sentry'):
findings.append({
'severity': 'medium',
'category': 'errors',
'title': 'No error tracking service detected',
'current_state': 'No Sentry or similar error tracking',
'target_state': 'Use Sentry for production error monitoring',
'migration_steps': [
'Sign up for Sentry',
'Install @sentry/react',
'Configure Sentry.init() in app entry',
'Add user context and tags',
'Set up error alerts'
],
'effort': 'low',
})
return findings

View File

@@ -0,0 +1,76 @@
"""
Performance Patterns Analyzer
Analyzes React performance optimizations:
- Code splitting at routes
- Memoization patterns
- Image optimization
"""
from pathlib import Path
from typing import Dict, List
import re
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze performance patterns."""
findings = []
src_dir = codebase_path / 'src'
if not src_dir.exists():
return findings
# Check for lazy loading
has_lazy_loading = False
for file in (p for p in src_dir.rglob('*') if p.suffix in {'.ts', '.tsx', '.js', '.jsx'}):
try:
with open(file, 'r') as f:
content = f.read()
if 'React.lazy' in content or 'lazy(' in content:
has_lazy_loading = True
break
except:
pass
if not has_lazy_loading:
findings.append({
'severity': 'medium',
'category': 'performance',
'title': 'No code splitting detected',
'current_state': 'No React.lazy() usage found',
'target_state': 'Use code splitting for routes and large components',
'migration_steps': [
'Wrap route components with React.lazy()',
'Add Suspense boundaries with loading states',
'Split large features into separate chunks',
'Analyze bundle size with build tools'
],
'effort': 'low',
})
# Check for large images
assets_dir = codebase_path / 'public' / 'assets'
if assets_dir.exists():
large_images = []
for img in (p for p in assets_dir.rglob('*') if p.suffix in {'.jpg', '.jpeg', '.png', '.gif'}):
size_mb = img.stat().st_size / (1024 * 1024)
if size_mb > 0.5: # Larger than 500KB
large_images.append((str(img.name), size_mb))
if large_images:
findings.append({
'severity': 'medium',
'category': 'performance',
'title': f'{len(large_images)} large images detected',
'current_state': f'{len(large_images)} image(s) larger than 500 KB',
'target_state': 'Optimize images with modern formats and lazy loading',
'migration_steps': [
'Convert to WebP format',
'Add lazy loading with loading="lazy"',
'Use srcset for responsive images',
'Compress images with tools like sharp'
],
'effort': 'low',
})
return findings
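The large-image check is plain byte arithmetic; a minimal sketch mirroring the 500 KB threshold (the function name is illustrative):

```python
from pathlib import Path

# Mirrors the performance analyzer's threshold: flag images over 0.5 MB.
SIZE_LIMIT_MB = 0.5

def is_large_image(path: Path) -> bool:
    """Return True if the file at `path` exceeds the size threshold."""
    size_mb = path.stat().st_size / (1024 * 1024)
    return size_mb > SIZE_LIMIT_MB
```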

View File

@@ -0,0 +1,369 @@
"""
Project Structure Analyzer
Analyzes React project structure against Bulletproof React patterns:
- Feature-based organization (src/features/)
- Unidirectional dependencies (shared → features → app)
- No cross-feature imports
- Proper folder hierarchy
"""
import re
from pathlib import Path
from typing import Dict, List, Set
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze project structure for Bulletproof React compliance.
Args:
codebase_path: Path to React codebase
metadata: Project metadata from discovery phase
Returns:
List of findings with severity and migration guidance
"""
findings = []
src_dir = codebase_path / 'src'
if not src_dir.exists():
findings.append({
'severity': 'critical',
'category': 'structure',
'title': 'Missing src/ directory',
'current_state': 'No src/ directory found',
'target_state': 'All source code should be in src/ directory',
'migration_steps': [
'Create src/ directory',
'Move all source files to src/',
'Update import paths',
'Update build configuration'
],
'effort': 'medium',
})
return findings
# Check for Bulletproof structure
findings.extend(check_bulletproof_structure(src_dir))
# Check for cross-feature imports
findings.extend(check_cross_feature_imports(src_dir))
# Analyze features/ organization
findings.extend(analyze_features_directory(src_dir))
# Check shared code organization
findings.extend(check_shared_code_organization(src_dir))
# Check for architectural violations
findings.extend(check_architectural_violations(src_dir))
return findings
def check_bulletproof_structure(src_dir: Path) -> List[Dict]:
"""Check for presence of Bulletproof React folder structure."""
findings = []
# Required top-level directories for Bulletproof React
bulletproof_dirs = {
'app': 'Application layer (routes, app.tsx, provider.tsx, router.tsx)',
'features': 'Feature modules (80%+ of code should be here)',
}
# Recommended directories
recommended_dirs = {
'components': 'Shared components used across multiple features',
'hooks': 'Shared custom hooks',
'lib': 'Third-party library configurations',
'utils': 'Shared utility functions',
'types': 'Shared TypeScript types',
}
# Check required directories
for dir_name, description in bulletproof_dirs.items():
dir_path = src_dir / dir_name
if not dir_path.exists():
findings.append({
'severity': 'critical' if dir_name == 'features' else 'high',
'category': 'structure',
'title': f'Missing {dir_name}/ directory',
'current_state': f'No {dir_name}/ directory found',
'target_state': f'{dir_name}/ directory should exist: {description}',
'migration_steps': [
f'Create src/{dir_name}/ directory',
f'Organize code according to Bulletproof React {dir_name} pattern',
'Update imports to use new structure'
],
'effort': 'high' if dir_name == 'features' else 'medium',
})
# Check recommended directories (lower severity)
missing_recommended = []
for dir_name, description in recommended_dirs.items():
dir_path = src_dir / dir_name
if not dir_path.exists():
missing_recommended.append(f'{dir_name}/ ({description})')
if missing_recommended:
findings.append({
'severity': 'medium',
'category': 'structure',
'title': 'Missing recommended directories',
'current_state': f'Missing: {", ".join([d.split("/")[0] for d in missing_recommended])}',
'target_state': 'Bulletproof React recommends these directories for shared code',
'migration_steps': [
'Create missing directories as needed',
'Move shared code to appropriate directories',
'Ensure proper separation between shared and feature-specific code'
],
'effort': 'low',
})
return findings
def check_cross_feature_imports(src_dir: Path) -> List[Dict]:
"""Detect cross-feature imports (architectural violation)."""
findings = []
features_dir = src_dir / 'features'
if not features_dir.exists():
return findings
# Get all feature directories
feature_dirs = [d for d in features_dir.iterdir() if d.is_dir() and not d.name.startswith('.')]
violations = []
for feature_dir in feature_dirs:
# Find all TypeScript/JavaScript files in this feature
for file_path in (p for p in feature_dir.rglob('*') if p.suffix in {'.ts', '.tsx', '.js', '.jsx'}):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
# Check for imports from other features
import_pattern = re.compile(r'from\s+[\'"]([^\'\"]+)[\'"]')
imports = import_pattern.findall(content)
for imp in imports:
# Check if importing from another feature
if imp.startswith('../') or imp.startswith('@/features/'):
# Extract feature name from import path
if '@/features/' in imp:
imported_feature = imp.split('@/features/')[1].split('/')[0]
elif '../' in imp:
# Handle relative imports
parts = imp.split('/')
if 'features' in parts:
idx = parts.index('features')
if idx + 1 < len(parts):
imported_feature = parts[idx + 1]
else:
continue
else:
continue
else:
continue
# Check if importing from different feature
current_feature = feature_dir.name
if imported_feature != current_feature and imported_feature in [f.name for f in feature_dirs]:
violations.append({
'file': str(file_path.relative_to(src_dir)),
'from_feature': current_feature,
'to_feature': imported_feature,
'import': imp
})
except:
pass
if violations:
# Group violations by feature
grouped = {}
for v in violations:
key = f"{v['from_feature']} → {v['to_feature']}"
if key not in grouped:
grouped[key] = []
grouped[key].append(v['file'])
for import_path, files in grouped.items():
findings.append({
'severity': 'high',
'category': 'structure',
'title': f'Cross-feature import: {import_path}',
'current_state': f'{len(files)} file(s) import from another feature',
'target_state': 'Features should be independent. Shared code belongs in src/components/, src/hooks/, or src/utils/',
'migration_steps': [
'Identify what code is being shared between features',
'Move truly shared code to src/components/, src/hooks/, or src/utils/',
'If code is feature-specific, duplicate it or refactor feature boundaries',
'Update imports to use shared code location'
],
'effort': 'medium',
'affected_files': files[:5], # Show first 5 files
})
return findings
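The feature-name extraction above can be isolated into a helper; a sketch mirroring the same parsing rules (`imported_feature` is a hypothetical name, not part of the codebase):

```python
import re
from typing import Optional

# Same import-extraction regex as check_cross_feature_imports.
IMPORT_RE = re.compile(r'from\s+[\'"]([^\'"]+)[\'"]')

def imported_feature(import_path: str) -> Optional[str]:
    """Return the feature a path imports from, if any.

    Handles the '@/features/<name>/…' alias and relative paths that
    pass through a 'features' segment, mirroring the analyzer above.
    """
    if '@/features/' in import_path:
        return import_path.split('@/features/')[1].split('/')[0]
    parts = import_path.split('/')
    if 'features' in parts:
        idx = parts.index('features')
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None
```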
def analyze_features_directory(src_dir: Path) -> List[Dict]:
"""Analyze features/ directory structure."""
findings = []
features_dir = src_dir / 'features'
if not features_dir.exists():
return findings
feature_dirs = [d for d in features_dir.iterdir() if d.is_dir() and not d.name.startswith('.')]
if len(feature_dirs) == 0:
findings.append({
'severity': 'high',
'category': 'structure',
'title': 'Empty features/ directory',
'current_state': 'features/ directory exists but contains no features',
'target_state': '80%+ of application code should be organized in feature modules',
'migration_steps': [
'Identify distinct features in your application',
'Create a directory for each feature in src/features/',
'Move feature-specific code to appropriate feature directories',
'Organize each feature with api/, components/, hooks/, stores/, types/, utils/ as needed'
],
'effort': 'high',
})
return findings
# Check each feature for proper internal structure
for feature_dir in feature_dirs:
feature_name = feature_dir.name
# Recommended feature subdirectories
feature_subdirs = ['api', 'components', 'hooks', 'stores', 'types', 'utils']
has_subdirs = any((feature_dir / subdir).exists() for subdir in feature_subdirs)
# Count files in feature root
root_files = [f for f in feature_dir.iterdir() if f.is_file() and f.suffix in {'.ts', '.tsx', '.js', '.jsx'}]
if len(root_files) > 5 and not has_subdirs:
findings.append({
'severity': 'medium',
'category': 'structure',
'title': f'Feature "{feature_name}" lacks internal organization',
'current_state': f'{len(root_files)} files in feature root without subdirectories',
'target_state': 'Features should be organized with api/, components/, hooks/, stores/, types/, utils/ subdirectories',
'migration_steps': [
f'Create subdirectories in src/features/{feature_name}/',
'Move API calls to api/',
'Move components to components/',
'Move hooks to hooks/',
'Move types to types/',
'Move utilities to utils/'
],
'effort': 'low',
})
return findings
def check_shared_code_organization(src_dir: Path) -> List[Dict]:
"""Check if shared code is properly organized."""
findings = []
components_dir = src_dir / 'components'
features_dir = src_dir / 'features'
if not components_dir.exists():
return findings
# Count components
shared_components = [p for p in components_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'}]
shared_count = len(shared_components)
# Count feature components
feature_count = 0
if features_dir.exists():
feature_count = len([p for p in features_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'} and 'components' in p.relative_to(features_dir).parts])
total_components = shared_count + feature_count
if total_components > 0:
shared_percentage = (shared_count / total_components) * 100
# Bulletproof React recommends 80%+ code in features
if shared_percentage > 40:
findings.append({
'severity': 'medium',
'category': 'structure',
'title': 'Too many shared components',
'current_state': f'{shared_percentage:.1f}% of components are in src/components/ (shared)',
'target_state': 'Most components should be feature-specific. Only truly shared components belong in src/components/',
'migration_steps': [
'Review each component in src/components/',
'Identify components used by only one feature',
'Move feature-specific components to their feature directories',
'Keep only truly shared components in src/components/'
],
'effort': 'medium',
})
return findings
def check_architectural_violations(src_dir: Path) -> List[Dict]:
"""Check for common architectural violations."""
findings = []
# Check for business logic in components/
components_dir = src_dir / 'components'
if components_dir.exists():
large_components = []
for component_file in (p for p in components_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'}):
try:
with open(component_file, 'r', encoding='utf-8', errors='ignore') as f:
lines = len(f.readlines())
if lines > 200:
large_components.append((str(component_file.relative_to(src_dir)), lines))
except:
pass
if large_components:
findings.append({
'severity': 'medium',
'category': 'structure',
'title': 'Large components in shared components/',
'current_state': f'{len(large_components)} component(s) > 200 lines in src/components/',
'target_state': 'Shared components should be simple and reusable. Complex components likely belong in features/',
'migration_steps': [
'Review large shared components',
'Extract business logic to feature-specific hooks or utilities',
'Consider moving complex components to features/ if feature-specific',
'Keep shared components simple and focused'
],
'effort': 'medium',
'affected_files': [f[0] for f in large_components[:5]],
})
# Check for proper app/ structure
app_dir = src_dir / 'app'
if app_dir.exists():
has_routing = any((app_dir / f).exists() for f in ['router.tsx', 'routes.tsx']) or (app_dir / 'routes').exists()
if not has_routing:
findings.append({
'severity': 'low',
'category': 'structure',
'title': 'Missing routing configuration in app/',
'current_state': 'No router.tsx or routes/ found in src/app/',
'target_state': 'Bulletproof React recommends centralizing routing in src/app/router.tsx or src/app/routes/',
'migration_steps': [
'Create src/app/router.tsx or src/app/routes/',
'Define all application routes in one place',
'Use code splitting for route-level lazy loading'
],
'effort': 'low',
})
return findings

View File

@@ -0,0 +1,79 @@
"""
Security Practices Analyzer
Analyzes React security patterns:
- JWT with HttpOnly cookies
- Input sanitization
- XSS prevention
"""
from pathlib import Path
from typing import Dict, List
import re
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze security practices."""
findings = []
src_dir = codebase_path / 'src'
if not src_dir.exists():
return findings
# Check for localStorage token storage (security risk)
localstorage_auth = []
for file in (p for p in src_dir.rglob('*') if p.suffix in {'.ts', '.tsx', '.js', '.jsx'}):
try:
with open(file, 'r') as f:
content = f.read()
if re.search(r'localStorage\.(get|set)Item\s*\(\s*[\'"].*token.*[\'"]\s*\)', content, re.IGNORECASE):
localstorage_auth.append(str(file.relative_to(src_dir)))
except:
pass
if localstorage_auth:
findings.append({
'severity': 'high',
'category': 'security',
'title': f'Tokens stored in localStorage ({len(localstorage_auth)} files)',
'current_state': 'Authentication tokens in localStorage (XSS vulnerable)',
'target_state': 'Use HttpOnly cookies for JWT storage',
'migration_steps': [
'Configure API to set tokens in HttpOnly cookies',
'Remove localStorage token storage',
'Use credentials: "include" in fetch requests',
'Implement CSRF protection'
],
'effort': 'medium',
'affected_files': localstorage_auth[:3],
})
# Check for dangerouslySetInnerHTML
dangerous_html = []
for file in (p for p in src_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'}):
try:
with open(file, 'r') as f:
content = f.read()
if 'dangerouslySetInnerHTML' in content:
dangerous_html.append(str(file.relative_to(src_dir)))
except:
pass
if dangerous_html:
findings.append({
'severity': 'high',
'category': 'security',
'title': f'dangerouslySetInnerHTML usage ({len(dangerous_html)} files)',
'current_state': 'Using dangerouslySetInnerHTML (XSS risk)',
'target_state': 'Sanitize HTML input with DOMPurify',
'migration_steps': [
'Install dompurify',
'Sanitize HTML before rendering',
'Prefer safe alternatives when possible',
'Add security review for HTML rendering'
],
'effort': 'low',
'affected_files': dangerous_html[:3],
})
return findings
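The localStorage-token heuristic is a single regex; a sketch exercising it against hypothetical source lines:

```python
import re

# Same pattern as the security analyzer: a localStorage get/set whose
# quoted argument mentions "token" (case-insensitive).
TOKEN_RE = re.compile(
    r'localStorage\.(get|set)Item\s*\(\s*[\'"].*token.*[\'"]\s*\)',
    re.IGNORECASE,
)

# Hypothetical source lines for illustration.
samples = [
    "const token = localStorage.getItem('access_token');",  # flagged
    "localStorage.setItem('theme', 'dark');",               # no token key
    "localStorage.getItem('JWT_TOKEN')",                    # case-insensitive
]
matches = [bool(TOKEN_RE.search(s)) for s in samples]
```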

View File

@@ -0,0 +1,105 @@
"""
Standards Compliance Analyzer
Analyzes project standards:
- ESLint configuration
- TypeScript strict mode
- Prettier setup
- Git hooks (Husky)
- Naming conventions
"""
from pathlib import Path
from typing import Dict, List
import json
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze standards compliance."""
findings = []
tech_stack = metadata.get('tech_stack', {})
# Check ESLint
eslint_config = any([
(codebase_path / '.eslintrc.js').exists(),
(codebase_path / '.eslintrc.cjs').exists(),
(codebase_path / '.eslintrc.json').exists(),
(codebase_path / 'eslint.config.js').exists(),
(codebase_path / 'eslint.config.mjs').exists(),
])
if not eslint_config:
findings.append({
'severity': 'high',
'category': 'standards',
'title': 'No ESLint configuration',
'current_state': 'No .eslintrc or eslint.config found',
'target_state': 'Configure ESLint with React and TypeScript rules',
'migration_steps': [
'Install eslint and plugins',
'Create .eslintrc.js configuration',
'Add recommended rules for React and TS',
'Add lint script to package.json',
'Fix existing violations'
],
'effort': 'low',
})
# Check TypeScript strict mode
tsconfig = codebase_path / 'tsconfig.json'
if tsconfig.exists():
try:
with open(tsconfig, 'r') as f:
config = json.load(f)
strict = config.get('compilerOptions', {}).get('strict', False)
if not strict:
findings.append({
'severity': 'high',
'category': 'standards',
'title': 'TypeScript strict mode disabled',
'current_state': 'strict: false in tsconfig.json',
'target_state': 'Enable strict mode for better type safety',
'migration_steps': [
'Set "strict": true in compilerOptions',
'Fix type errors incrementally',
'Add explicit return types',
'Remove any types'
],
'effort': 'high',
})
except:
pass
# Check Prettier
if not tech_stack.get('prettier'):
findings.append({
'severity': 'low',
'category': 'standards',
'title': 'No Prettier detected',
'current_state': 'Prettier not in dependencies',
'target_state': 'Use Prettier for consistent code formatting',
'migration_steps': [
'Install prettier',
'Create .prettierrc configuration',
'Enable "format on save" in IDE',
'Run prettier on all files'
],
'effort': 'low',
})
# Check Husky
if not tech_stack.get('husky'):
findings.append({
'severity': 'low',
'category': 'standards',
'title': 'No git hooks (Husky) detected',
'current_state': 'No pre-commit hooks',
'target_state': 'Use Husky for pre-commit linting and testing',
'migration_steps': [
'Install husky and lint-staged',
'Set up pre-commit hook',
'Run lint and type-check before commits',
'Prevent bad code from entering repo'
],
'effort': 'low',
})
return findings
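The strict-mode check parses tsconfig.json with plain `json`; note that tsconfig is JSONC in practice (comments and trailing commas are legal), which `json.loads` rejects — the analyzer's try/except silently skips such files. A sketch of the same logic (the function name is illustrative):

```python
import json

def strict_mode_enabled(tsconfig_text: str) -> bool:
    """Mirror the standards analyzer's strict-mode check.

    tsconfig.json is JSONC in practice (comments, trailing commas),
    which plain json rejects; such files fall through to False here,
    just as the analyzer's try/except swallows them.
    """
    try:
        config = json.loads(tsconfig_text)
    except json.JSONDecodeError:
        return False
    return bool(config.get('compilerOptions', {}).get('strict', False))
```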

View File

@@ -0,0 +1,199 @@
"""
State Management Analyzer
Analyzes React state management against Bulletproof React principles:
- Appropriate tool for each state type (component, app, server, form, URL)
- State localized when possible
- Server cache separated (React Query/SWR)
- No global state overuse
"""
import json
import re
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze state management patterns.
Args:
codebase_path: Path to React codebase
metadata: Project metadata from discovery phase
Returns:
List of findings with severity and migration guidance
"""
findings = []
tech_stack = metadata.get('tech_stack', {})
src_dir = codebase_path / 'src'
if not src_dir.exists():
return findings
# Check for appropriate state management tools
findings.extend(check_state_management_tools(tech_stack))
# Check for data fetching library (server cache state)
findings.extend(check_data_fetching_library(tech_stack))
# Check for form state management
findings.extend(check_form_state_management(src_dir, tech_stack))
# Check for potential state management issues
findings.extend(check_state_patterns(src_dir))
return findings
def check_state_management_tools(tech_stack: Dict) -> List[Dict]:
"""Check for presence of appropriate state management tools."""
findings = []
# Check if any global state management is present
has_state_mgmt = any([
tech_stack.get('redux'),
tech_stack.get('zustand'),
tech_stack.get('jotai'),
tech_stack.get('mobx')
])
# If app has many features but no state management, might need it
# (This is a heuristic - could be Context-based which is fine)
if not has_state_mgmt:
findings.append({
'severity': 'low',
'category': 'state',
'title': 'No explicit global state management detected',
'current_state': 'No Redux, Zustand, Jotai, or MobX found',
'target_state': 'Consider Zustand or Jotai for global state if Context becomes complex. Start with Context + hooks.',
'migration_steps': [
'Evaluate if Context API is sufficient for your needs',
'If Context becomes complex, consider Zustand (simple) or Jotai (atomic)',
'Avoid Redux unless you need its ecosystem (Redux Toolkit simplifies it)',
'Keep state as local as possible before going global'
],
'effort': 'low',
})
return findings
def check_data_fetching_library(tech_stack: Dict) -> List[Dict]:
"""Check for React Query, SWR, or similar for server state."""
findings = []
has_data_fetching = any([
tech_stack.get('react-query'),
tech_stack.get('swr'),
tech_stack.get('apollo'),
tech_stack.get('rtk-query')
])
if not has_data_fetching:
findings.append({
'severity': 'high',
'category': 'state',
'title': 'No data fetching library detected',
'current_state': 'No React Query, SWR, Apollo Client, or RTK Query found',
'target_state': 'Use React Query or SWR for server state management (caching, refetching, optimistic updates)',
'migration_steps': [
'Install React Query (@tanstack/react-query) or SWR',
'Wrap app with QueryClientProvider (React Query) or SWRConfig (SWR)',
'Convert fetch calls to useQuery hooks',
'Replace manual loading/error states with library patterns',
'Add staleTime, cacheTime configurations as needed'
],
'effort': 'medium',
})
return findings
def check_form_state_management(src_dir: Path, tech_stack: Dict) -> List[Dict]:
"""Check for form state management."""
findings = []
has_form_lib = any([
tech_stack.get('react-hook-form'),
tech_stack.get('formik')
])
# Look for form components without form library
if not has_form_lib:
form_files = []
for file_path in (p for p in src_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'}):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
# Look for <form> tags
if re.search(r'<form[>\s]', content, re.IGNORECASE):
form_files.append(str(file_path.relative_to(src_dir)))
except:
pass
if len(form_files) > 3: # More than 3 forms suggests need for form library
findings.append({
'severity': 'medium',
'category': 'state',
'title': f'No form library but {len(form_files)} forms detected',
'current_state': f'{len(form_files)} form components without React Hook Form or Formik',
'target_state': 'Use React Hook Form for performant form state management',
'migration_steps': [
'Install react-hook-form',
'Replace controlled form state with useForm() hook',
'Use register() for input registration',
'Handle validation with yup or zod schemas',
'Reduce re-renders with uncontrolled inputs'
],
'effort': 'medium',
'affected_files': form_files[:5],
})
return findings
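The form-detection regex `<form[>\s]` deliberately requires `>` or whitespace after the tag name, so component names like `FormControl` are not flagged; a quick sketch with hypothetical source lines:

```python
import re

# Same pattern as check_form_state_management: "<form" must be followed
# by '>' or whitespace, so <FormControl> and similar do not match.
FORM_RE = re.compile(r'<form[>\s]', re.IGNORECASE)

# Hypothetical source lines for illustration.
samples = [
    '<form onSubmit={handleSubmit}>',      # flagged
    '<FormControl fullWidth>',             # component name, not a form tag
    'const formatted = formatDate(value)', # no tag at all
]
matches = [bool(FORM_RE.search(s)) for s in samples]
```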
def check_state_patterns(src_dir: Path) -> List[Dict]:
"""Check for common state management anti-patterns."""
findings = []
# Look for large Context providers (potential performance issue)
large_contexts = []
for file_path in (p for p in src_dir.rglob('*') if p.suffix in {'.tsx', '.jsx'}):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
# Look for Context creation with many values
if 'createContext' in content:
# Count useState hooks in the provider
state_count = len(re.findall(r'useState\s*\(', content))
if state_count > 5:
large_contexts.append({
'file': str(file_path.relative_to(src_dir)),
'state_count': state_count
})
except:
pass
if large_contexts:
for ctx in large_contexts:
findings.append({
'severity': 'medium',
'category': 'state',
'title': f'Large Context with {ctx["state_count"]} state values',
'current_state': f'{ctx["file"]} has many state values in one Context',
'target_state': 'Split large Contexts into smaller, focused Contexts to prevent unnecessary re-renders',
'migration_steps': [
'Identify which state values change together',
'Create separate Contexts for independent state',
'Use Context composition for related state',
'Consider Zustand/Jotai for complex global state'
],
'effort': 'medium',
'file': ctx['file'],
})
return findings
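The Context-size heuristic simply counts `useState(` call sites; a sketch of the same counting rule (the function name is illustrative):

```python
import re

# Same heuristic as check_state_patterns: count useState( occurrences.
# The literal '(' after optional whitespace keeps identifiers such as
# useStateMachine from being counted.
USE_STATE_RE = re.compile(r'useState\s*\(')

def count_state_hooks(source: str) -> int:
    """Count useState hook calls in a component source string."""
    return len(USE_STATE_RE.findall(source))
```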

View File

@@ -0,0 +1,59 @@
"""
Styling Patterns Analyzer
Analyzes styling approach against Bulletproof React:
- Consistent styling method
- Component library usage
- Colocated styles
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze styling patterns."""
findings = []
tech_stack = metadata.get('tech_stack', {})
# Check for styling approach
styling_tools = []
if tech_stack.get('tailwind'): styling_tools.append('Tailwind CSS')
if tech_stack.get('styled-components'): styling_tools.append('styled-components')
if tech_stack.get('emotion'): styling_tools.append('Emotion')
if tech_stack.get('chakra-ui'): styling_tools.append('Chakra UI')
if tech_stack.get('mui'): styling_tools.append('Material UI')
if tech_stack.get('radix-ui'): styling_tools.append('Radix UI')
if not styling_tools:
findings.append({
'severity': 'low',
'category': 'styling',
'title': 'No component library or utility CSS detected',
'current_state': 'No Tailwind, Chakra UI, Radix UI, or other styling system',
'target_state': 'Use component library (Chakra, Radix) or utility CSS (Tailwind)',
'migration_steps': [
'Choose styling approach based on needs',
'Install Tailwind CSS (utility-first) or Chakra UI (component library)',
'Configure theme and design tokens',
'Migrate components gradually'
],
'effort': 'medium',
})
elif len(styling_tools) > 2:
findings.append({
'severity': 'medium',
'category': 'styling',
'title': f'Multiple styling approaches ({len(styling_tools)})',
'current_state': f'Using: {", ".join(styling_tools)}',
'target_state': 'Standardize on single styling approach',
'migration_steps': [
'Choose primary styling system',
'Create migration plan',
'Update style guide',
'Refactor components incrementally'
],
'effort': 'high',
})
return findings

View File

@@ -0,0 +1,313 @@
"""
Testing Strategy Analyzer
Analyzes React testing against Bulletproof React and Connor's standards:
- Testing trophy distribution (70% integration, 20% unit, 10% E2E)
- 80%+ coverage requirement
- Semantic queries (getByRole preferred)
- User behavior testing (not implementation details)
- Test naming ("should X when Y")
"""
import json
import re
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze testing strategy and quality.
Args:
codebase_path: Path to React codebase
metadata: Project metadata from discovery phase
Returns:
List of findings with severity and migration guidance
"""
findings = []
tech_stack = metadata.get('tech_stack', {})
src_dir = codebase_path / 'src'
if not src_dir.exists():
return findings
# Check for testing framework
findings.extend(check_testing_framework(tech_stack))
# Check test coverage
findings.extend(check_test_coverage(codebase_path))
# Analyze test distribution (unit vs integration vs E2E)
findings.extend(analyze_test_distribution(codebase_path))
# Check test quality patterns
findings.extend(check_test_quality(codebase_path))
return findings
def check_testing_framework(tech_stack: Dict) -> List[Dict]:
"""Check for modern testing setup."""
findings = []
has_test_framework = tech_stack.get('vitest') or tech_stack.get('jest')
has_testing_library = tech_stack.get('testing-library')
if not has_test_framework:
findings.append({
'severity': 'critical',
'category': 'testing',
'title': 'No testing framework detected',
'current_state': 'No Vitest or Jest found',
'target_state': 'Use Vitest (modern, fast) or Jest for testing',
'migration_steps': [
'Install Vitest (recommended for Vite) or Jest',
'Install @testing-library/react',
'Configure test setup file',
'Add test scripts to package.json',
'Set up coverage reporting'
],
'effort': 'medium',
})
if not has_testing_library:
findings.append({
'severity': 'high',
'category': 'testing',
'title': 'Testing Library not found',
'current_state': 'No @testing-library/react detected',
'target_state': 'Use Testing Library for user-centric testing',
'migration_steps': [
'Install @testing-library/react',
'Install @testing-library/jest-dom for assertions',
'Use render() and semantic queries (getByRole)',
'Follow testing-library principles (test behavior, not implementation)'
],
'effort': 'low',
})
return findings
def check_test_coverage(codebase_path: Path) -> List[Dict]:
"""Check test coverage if available."""
findings = []
# Look for coverage reports
coverage_file = codebase_path / 'coverage' / 'coverage-summary.json'
if coverage_file.exists():
try:
with open(coverage_file, 'r') as f:
coverage_data = json.load(f)
total_coverage = coverage_data.get('total', {})
line_coverage = total_coverage.get('lines', {}).get('pct', 0)
branch_coverage = total_coverage.get('branches', {}).get('pct', 0)
if line_coverage < 80:
findings.append({
'severity': 'high',
'category': 'testing',
'title': f'Test coverage below 80% ({line_coverage:.1f}%)',
'current_state': f'Line coverage: {line_coverage:.1f}%, Branch coverage: {branch_coverage:.1f}%',
'target_state': 'Maintain 80%+ test coverage on all code',
'migration_steps': [
'Identify untested files and functions',
'Prioritize testing critical paths (authentication, payment, data processing)',
'Write integration tests first (70% of tests)',
'Add unit tests for complex business logic',
'Configure coverage thresholds in test config'
],
'effort': 'high',
})
elif line_coverage < 90:
findings.append({
'severity': 'medium',
'category': 'testing',
'title': f'Test coverage at {line_coverage:.1f}%',
'current_state': f'Coverage is good but could be excellent (current: {line_coverage:.1f}%)',
'target_state': 'Aim for 90%+ coverage for production-ready code',
'migration_steps': [
'Identify remaining untested code paths',
'Focus on edge cases and error handling',
'Ensure all critical features have 100% coverage'
],
'effort': 'medium',
})
except (OSError, json.JSONDecodeError):
pass
else:
findings.append({
'severity': 'high',
'category': 'testing',
'title': 'No coverage report found',
'current_state': 'Cannot find coverage/coverage-summary.json',
'target_state': 'Generate coverage reports to track test coverage',
'migration_steps': [
'Configure coverage in vitest.config.ts or jest.config.js',
'Add --coverage flag to test script',
'Set coverage thresholds (lines: 80, branches: 75)',
'Add coverage/ to .gitignore',
'Review coverage reports regularly'
],
'effort': 'low',
})
return findings
def analyze_test_distribution(codebase_path: Path) -> List[Dict]:
"""Analyze testing trophy distribution."""
findings = []
# Count test files by type
unit_tests = 0
integration_tests = 0
e2e_tests = 0
test_patterns = {
'e2e': ['e2e/', '.e2e.test.', '.e2e.spec.', 'playwright/', 'cypress/'],
'integration': ['.test.tsx', '.test.jsx', '.spec.tsx', '.spec.jsx'], # Component tests
'unit': ['.test.ts', '.test.js', '.spec.ts', '.spec.js'], # Logic tests
}
# pathlib's rglob has no brace expansion, so match test files by name instead
for test_file in (p for p in codebase_path.rglob('*') if re.search(r'\.(test|spec)\.(ts|tsx|js|jsx)$', p.name)):
test_path_str = str(test_file)
# E2E tests
if any(pattern in test_path_str for pattern in test_patterns['e2e']):
e2e_tests += 1
# Integration tests (component tests with TSX/JSX)
elif any(pattern in test_path_str for pattern in test_patterns['integration']):
integration_tests += 1
# Unit tests (pure logic, no JSX)
else:
unit_tests += 1
total_tests = unit_tests + integration_tests + e2e_tests
if total_tests > 0:
int_pct = (integration_tests / total_tests) * 100
unit_pct = (unit_tests / total_tests) * 100
e2e_pct = (e2e_tests / total_tests) * 100
# Testing Trophy: 70% integration, 20% unit, 10% E2E
if int_pct < 50: # Should be ~70%
findings.append({
'severity': 'medium',
'category': 'testing',
'title': 'Testing pyramid instead of testing trophy',
'current_state': f'Distribution: {int_pct:.0f}% integration, {unit_pct:.0f}% unit, {e2e_pct:.0f}% E2E',
'target_state': 'Testing Trophy: 70% integration, 20% unit, 10% E2E',
'migration_steps': [
'Write more integration tests (component + hooks + context)',
'Test user workflows, not implementation details',
'Reduce excessive unit tests of simple functions',
'Keep E2E tests for critical user journeys only',
'Use Testing Library for integration tests'
],
'effort': 'medium',
})
if unit_pct > 40: # Should be ~20%
findings.append({
'severity': 'low',
'category': 'testing',
'title': 'Too many unit tests',
'current_state': f'{unit_pct:.0f}% unit tests (target: ~20%)',
'target_state': 'Focus on integration tests that provide more confidence',
'migration_steps': [
'Review unit tests - many could be integration tests',
'Combine related unit tests into integration tests',
'Keep unit tests only for complex business logic',
'Test components with their hooks and context'
],
'effort': 'low',
})
return findings
def check_test_quality(codebase_path: Path) -> List[Dict]:
"""Check for test quality anti-patterns."""
findings = []
brittle_test_patterns = []
bad_query_usage = []
bad_naming = []
# pathlib's rglob has no brace expansion, so match test files by name instead
for test_file in (p for p in codebase_path.rglob('*') if re.search(r'\.(test|spec)\.(ts|tsx|js|jsx)$', p.name)):
try:
with open(test_file, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
# Check for brittle tests (testing implementation)
if 'getByTestId' in content:
bad_query_usage.append(str(test_file))
# Check for testing exact counts (brittle)
if re.search(r'expect\([^)]+\)\.toHaveLength\(\d+\)', content):
brittle_test_patterns.append(str(test_file))
# Check test naming ("should X when Y")
test_names = re.findall(r'(?:it|test)\s*\(\s*[\'"]([^\'"]+)[\'"]', content)
for name in test_names:
if not (name.startswith('should ') or 'when' in name.lower()):
bad_naming.append((str(test_file), name))
except OSError:
pass
if bad_query_usage:
findings.append({
'severity': 'medium',
'category': 'testing',
'title': f'Using getByTestId in {len(bad_query_usage)} test files',
'current_state': 'Tests use getByTestId instead of semantic queries',
'target_state': 'Use semantic queries: getByRole, getByLabelText, getByText',
'migration_steps': [
'Replace getByTestId with getByRole (most preferred)',
'Use getByLabelText for form inputs',
'Use getByText for user-visible content',
'Only use getByTestId as last resort',
'Add eslint-plugin-testing-library for enforcement'
],
'effort': 'medium',
'affected_files': bad_query_usage[:5],
})
if brittle_test_patterns:
findings.append({
'severity': 'low',
'category': 'testing',
'title': f'Brittle test patterns in {len(brittle_test_patterns)} files',
'current_state': 'Tests check exact counts and DOM structure',
'target_state': 'Test user behavior and outcomes, not exact DOM structure',
'migration_steps': [
'Avoid testing exact element counts',
'Focus on user-visible behavior',
'Test functionality, not implementation',
'Allow flexibility in DOM structure'
],
'effort': 'low',
})
if len(bad_naming) > 5: # More than 5 tests with poor naming
findings.append({
'severity': 'low',
'category': 'testing',
'title': f'{len(bad_naming)} tests with unclear naming',
'current_state': 'Test names don\'t follow "should X when Y" pattern',
'target_state': 'Use descriptive names: "should display error when API fails"',
'migration_steps': [
'Rename tests to describe expected behavior',
'Use pattern: "should [expected behavior] when [condition]"',
'Make tests self-documenting',
'Tests should read like requirements'
],
'effort': 'low',
})
return findings


@@ -0,0 +1,503 @@
#!/usr/bin/env python3
"""
Bulletproof React Audit Engine
Orchestrates comprehensive React/TypeScript codebase analysis against Bulletproof
React architecture principles. Generates detailed audit reports and migration plans.
Usage:
python audit_engine.py /path/to/react-app --output report.md
python audit_engine.py /path/to/react-app --format json --output report.json
python audit_engine.py /path/to/react-app --migration-plan --output migration.md
"""
import argparse
import json
import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional
import importlib.util
# Bulletproof React specific analyzers
ANALYZERS = {
'structure': 'analyzers.project_structure',
'components': 'analyzers.component_architecture',
'state': 'analyzers.state_management',
'api': 'analyzers.api_layer',
'testing': 'analyzers.testing_strategy',
'styling': 'analyzers.styling_patterns',
'errors': 'analyzers.error_handling',
'performance': 'analyzers.performance_patterns',
'security': 'analyzers.security_practices',
'standards': 'analyzers.standards_compliance',
}
class BulletproofAuditEngine:
"""
Core audit engine for Bulletproof React compliance analysis.
Uses progressive disclosure: loads only necessary analyzers based on scope.
"""
def __init__(self, codebase_path: Path, scope: Optional[List[str]] = None):
"""
Initialize Bulletproof React audit engine.
Args:
codebase_path: Path to the React codebase to audit
scope: Optional list of analysis categories to run
If None, runs all analyzers.
"""
self.codebase_path = Path(codebase_path).resolve()
self.scope = scope or list(ANALYZERS.keys())
self.findings: Dict[str, List[Dict]] = {}
self.metadata: Dict = {}
if not self.codebase_path.exists():
raise FileNotFoundError(f"Codebase path does not exist: {self.codebase_path}")
def discover_project(self) -> Dict:
"""
Phase 1: Initial React project discovery (lightweight scan).
Returns:
Dictionary containing React project metadata
"""
print("🔍 Phase 1: Discovering React project structure...")
metadata = {
'path': str(self.codebase_path),
'scan_time': datetime.now().isoformat(),
'is_react': self._detect_react(),
'tech_stack': self._detect_tech_stack(),
'structure_type': self._detect_structure_type(),
'total_files': self._count_files(),
'total_lines': self._count_lines(),
'git_info': self._get_git_info(),
}
if not metadata['is_react']:
print("⚠️ Warning: This does not appear to be a React project!")
print(" Bulletproof React audit is designed for React applications.")
self.metadata = metadata
return metadata
def _detect_react(self) -> bool:
"""Check if this is a React project."""
pkg_json = self.codebase_path / 'package.json'
if not pkg_json.exists():
return False
try:
with open(pkg_json, 'r') as f:
pkg = json.load(f)
deps = {**pkg.get('dependencies', {}), **pkg.get('devDependencies', {})}
return 'react' in deps or 'react-dom' in deps
except (OSError, json.JSONDecodeError):
return False
def _detect_tech_stack(self) -> Dict[str, bool]:
"""Detect React ecosystem tools and libraries."""
pkg_json = self.codebase_path / 'package.json'
tech_stack = {}
if pkg_json.exists():
try:
with open(pkg_json, 'r') as f:
pkg = json.load(f)
deps = {**pkg.get('dependencies', {}), **pkg.get('devDependencies', {})}
# Core
tech_stack['react'] = 'react' in deps
tech_stack['typescript'] = 'typescript' in deps or (self.codebase_path / 'tsconfig.json').exists()
# Build tools
tech_stack['vite'] = 'vite' in deps
tech_stack['create-react-app'] = 'react-scripts' in deps
tech_stack['next'] = 'next' in deps
# State management
tech_stack['redux'] = 'redux' in deps or '@reduxjs/toolkit' in deps
tech_stack['zustand'] = 'zustand' in deps
tech_stack['jotai'] = 'jotai' in deps
tech_stack['mobx'] = 'mobx' in deps
# Data fetching
tech_stack['react-query'] = '@tanstack/react-query' in deps or 'react-query' in deps
tech_stack['swr'] = 'swr' in deps
tech_stack['apollo'] = '@apollo/client' in deps
tech_stack['rtk-query'] = '@reduxjs/toolkit' in deps
# Forms
tech_stack['react-hook-form'] = 'react-hook-form' in deps
tech_stack['formik'] = 'formik' in deps
# Styling
tech_stack['tailwind'] = 'tailwindcss' in deps or (self.codebase_path / 'tailwind.config.js').exists()
tech_stack['styled-components'] = 'styled-components' in deps
tech_stack['emotion'] = '@emotion/react' in deps
tech_stack['chakra-ui'] = '@chakra-ui/react' in deps
tech_stack['mui'] = '@mui/material' in deps
tech_stack['radix-ui'] = any('@radix-ui' in dep for dep in deps.keys())
# Testing
tech_stack['vitest'] = 'vitest' in deps
tech_stack['jest'] = 'jest' in deps
tech_stack['testing-library'] = '@testing-library/react' in deps
tech_stack['playwright'] = '@playwright/test' in deps
tech_stack['cypress'] = 'cypress' in deps
# Routing
tech_stack['react-router'] = 'react-router-dom' in deps
# Error tracking
tech_stack['sentry'] = '@sentry/react' in deps
# Code quality
tech_stack['eslint'] = 'eslint' in deps
tech_stack['prettier'] = 'prettier' in deps
tech_stack['husky'] = 'husky' in deps
except (OSError, json.JSONDecodeError):
pass
return {k: v for k, v in tech_stack.items() if v}
def _detect_structure_type(self) -> str:
"""Determine project structure pattern (feature-based vs flat)."""
src_dir = self.codebase_path / 'src'
if not src_dir.exists():
return 'no_src_directory'
features_dir = src_dir / 'features'
components_dir = src_dir / 'components'
app_dir = src_dir / 'app'
# Count files in different locations
# pathlib globbing has no brace expansion, so count by suffix instead
code_exts = {'.js', '.jsx', '.ts', '.tsx'}
features_files = sum(1 for p in features_dir.rglob('*') if p.suffix in code_exts) if features_dir.exists() else 0
components_files = sum(1 for p in components_dir.rglob('*') if p.suffix in code_exts) if components_dir.exists() else 0
if features_dir.exists() and app_dir.exists():
if features_files > components_files * 2:
return 'feature_based'
else:
return 'mixed'
elif features_dir.exists():
return 'partial_feature_based'
else:
return 'flat'
def _count_files(self) -> int:
"""Count total files in React codebase."""
exclude_dirs = {'.git', 'node_modules', 'dist', 'build', '.next', 'out', 'coverage'}
count = 0
for path in self.codebase_path.rglob('*'):
if path.is_file() and not any(excluded in path.parts for excluded in exclude_dirs):
count += 1
return count
def _count_lines(self) -> int:
"""Count total lines of code in React files."""
exclude_dirs = {'.git', 'node_modules', 'dist', 'build', '.next', 'out', 'coverage'}
code_extensions = {'.js', '.jsx', '.ts', '.tsx'}
total_lines = 0
for path in self.codebase_path.rglob('*'):
if (path.is_file() and
path.suffix in code_extensions and
not any(excluded in path.parts for excluded in exclude_dirs)):
try:
with open(path, 'r', encoding='utf-8', errors='ignore') as f:
total_lines += sum(1 for line in f if line.strip() and not line.strip().startswith(('//', '#', '/*', '*')))
except OSError:
pass
return total_lines
def _get_git_info(self) -> Optional[Dict]:
"""Get git repository information."""
git_dir = self.codebase_path / '.git'
if not git_dir.exists():
return None
try:
import subprocess
result = subprocess.run(
['git', '-C', str(self.codebase_path), 'log', '--oneline', '-10'],
capture_output=True,
text=True,
timeout=5
)
commit_count = subprocess.run(
['git', '-C', str(self.codebase_path), 'rev-list', '--count', 'HEAD'],
capture_output=True,
text=True,
timeout=5
)
return {
'is_git_repo': True,
'recent_commits': result.stdout.strip().split('\n') if result.returncode == 0 else [],
'total_commits': int(commit_count.stdout.strip()) if commit_count.returncode == 0 else 0,
}
except Exception:
return {'is_git_repo': True, 'error': 'Could not read git info'}
def run_analysis(self, phase: str = 'full') -> Dict:
"""
Phase 2: Deep Bulletproof React analysis using specialized analyzers.
Args:
phase: 'quick' for lightweight scan, 'full' for comprehensive analysis
Returns:
Dictionary containing all findings
"""
print(f"🔬 Phase 2: Running {phase} Bulletproof React analysis...")
for category in self.scope:
if category not in ANALYZERS:
print(f"⚠️ Unknown analyzer category: {category}, skipping...")
continue
print(f" Analyzing {category}...")
analyzer_findings = self._run_analyzer(category)
if analyzer_findings:
self.findings[category] = analyzer_findings
return self.findings
def _run_analyzer(self, category: str) -> List[Dict]:
"""
Run a specific Bulletproof React analyzer module.
Args:
category: Analyzer category name
Returns:
List of findings from the analyzer
"""
module_path = ANALYZERS.get(category)
if not module_path:
return []
try:
# Import analyzer module dynamically
analyzer_file = Path(__file__).parent / f"{module_path.replace('.', '/')}.py"
if not analyzer_file.exists():
print(f" ⚠️ Analyzer not yet implemented: {category}")
return []
spec = importlib.util.spec_from_file_location(module_path, analyzer_file)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
# Each analyzer should have an analyze() function
if hasattr(module, 'analyze'):
return module.analyze(self.codebase_path, self.metadata)
else:
print(f" ⚠️ Analyzer missing analyze() function: {category}")
return []
except Exception as e:
print(f" ❌ Error running analyzer {category}: {e}")
return []
def calculate_scores(self) -> Dict[str, float]:
"""
Calculate Bulletproof React compliance scores for each category.
Returns:
Dictionary of scores (0-100 scale)
"""
scores = {}
# Calculate score for each category based on findings severity
for category, findings in self.findings.items():
if not findings:
scores[category] = 100.0
continue
# Weighted scoring based on severity
severity_weights = {'critical': 15, 'high': 8, 'medium': 3, 'low': 1}
total_weight = sum(severity_weights.get(f.get('severity', 'low'), 1) for f in findings)
# Score decreases based on weighted issues
penalty = min(total_weight * 2, 100) # Each point = 2% penalty
scores[category] = max(0, 100 - penalty)
# Overall score is the unweighted mean of the category scores
if scores:
scores['overall'] = sum(scores.values()) / len(scores)
else:
scores['overall'] = 100.0
return scores
def calculate_grade(self, score: float) -> str:
"""Convert score to letter grade."""
if score >= 90: return 'A'
if score >= 80: return 'B'
if score >= 70: return 'C'
if score >= 60: return 'D'
return 'F'
def generate_summary(self) -> Dict:
"""
Generate executive summary of Bulletproof React audit results.
Returns:
Summary dictionary
"""
critical_count = sum(
1 for findings in self.findings.values()
for f in findings
if f.get('severity') == 'critical'
)
high_count = sum(
1 for findings in self.findings.values()
for f in findings
if f.get('severity') == 'high'
)
scores = self.calculate_scores()
overall_score = scores.get('overall', 0)
# Estimate migration effort in person-days
effort_map = {'low': 0.5, 'medium': 2, 'high': 5}
total_effort = sum(
effort_map.get(f.get('effort', 'medium'), 2)
for findings in self.findings.values()
for f in findings
)
return {
'compliance_score': round(overall_score, 1),
'grade': self.calculate_grade(overall_score),
'category_scores': {k: round(v, 1) for k, v in scores.items() if k != 'overall'},
'critical_issues': critical_count,
'high_issues': high_count,
'total_issues': sum(len(findings) for findings in self.findings.values()),
'migration_effort_days': round(total_effort, 1),
'structure_type': self.metadata.get('structure_type', 'unknown'),
'metadata': self.metadata,
}
def main():
"""Main entry point for CLI usage."""
parser = argparse.ArgumentParser(
description='Bulletproof React audit tool for React/TypeScript applications',
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
'codebase',
type=str,
help='Path to the React codebase to audit'
)
parser.add_argument(
'--scope',
type=str,
help='Comma-separated list of analysis categories (structure,components,state,api,testing,styling,errors,performance,security,standards)',
default=None
)
parser.add_argument(
'--phase',
type=str,
choices=['quick', 'full'],
default='full',
help='Analysis depth: quick (Phase 1 only) or full (Phase 1 + 2)'
)
parser.add_argument(
'--format',
type=str,
choices=['markdown', 'json', 'html'],
default='markdown',
help='Output format for the report'
)
parser.add_argument(
'--output',
type=str,
help='Output file path (default: stdout)',
default=None
)
parser.add_argument(
'--migration-plan',
action='store_true',
help='Generate migration plan in addition to audit report'
)
args = parser.parse_args()
# Parse scope
scope = args.scope.split(',') if args.scope else None
# Initialize engine
try:
engine = BulletproofAuditEngine(args.codebase, scope=scope)
except FileNotFoundError as e:
print(f"❌ Error: {e}", file=sys.stderr)
sys.exit(1)
# Run audit
print("🚀 Starting Bulletproof React audit...")
print(f" Codebase: {args.codebase}")
print(f" Scope: {scope or 'all'}")
print(f" Phase: {args.phase}")
print()
# Phase 1: Discovery
metadata = engine.discover_project()
if metadata['is_react']:
print(f" React detected: ✅")
print(f" TypeScript: {'✅' if metadata['tech_stack'].get('typescript') else '❌'}")
print(f" Structure type: {metadata['structure_type']}")
print(f" Files: {metadata['total_files']}")
print(f" Lines of code: {metadata['total_lines']:,}")
else:
print(f" React detected: ❌")
print(" Continuing audit anyway...")
print()
# Phase 2: Analysis (if not quick mode)
if args.phase == 'full':
findings = engine.run_analysis()
# Generate summary
summary = engine.generate_summary()
# Output results
print()
print("📊 Bulletproof React Audit Complete!")
print(f" Compliance score: {summary['compliance_score']}/100 (Grade: {summary['grade']})")
print(f" Critical issues: {summary['critical_issues']}")
print(f" High issues: {summary['high_issues']}")
print(f" Total issues: {summary['total_issues']}")
print(f" Estimated migration effort: {summary['migration_effort_days']} person-days")
print()
# Generate report (to be implemented in report_generator.py)
if args.output:
print(f"📝 Report generation will be implemented in report_generator.py")
print(f" Format: {args.format}")
print(f" Output: {args.output}")
if args.migration_plan:
print(f" Migration plan: {args.output.replace('.md', '_migration.md')}")
if __name__ == '__main__':
main()


@@ -0,0 +1,13 @@
# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Extracted detailed content to reference/ and examples/ directories
## 0.1.0
- Initial skill release
- Comprehensive codebase analysis against 2024-25 SDLC standards
- OWASP, WCAG, and DORA metrics evaluation


@@ -0,0 +1,253 @@
# Codebase Auditor Skill
> Comprehensive codebase audit tool based on modern SDLC best practices (2024-25 standards)
An Anthropic Skill that analyzes codebases for quality issues, security vulnerabilities, technical debt, and generates prioritized remediation plans.
## Features
- **Progressive Disclosure**: Three-phase analysis (Discovery → Deep Analysis → Report)
- **Multi-Language Support**: JavaScript, TypeScript, Python (extensible)
- **Comprehensive Analysis**:
- Code Quality (complexity, duplication, code smells)
- Security (secrets detection, OWASP Top 10, dependency vulnerabilities)
- Testing (coverage analysis, testing trophy distribution)
- Technical Debt (SQALE rating, remediation estimates)
- **Multiple Report Formats**: Markdown, JSON, HTML dashboard
- **Prioritized Remediation Plans**: P0-P3 severity with effort estimates
- **Industry Standards**: Based on 2024-25 SDLC best practices
## Installation
1. Copy the `codebase-auditor` directory to your Claude skills directory
2. Ensure Python 3.8+ is installed
3. No additional dependencies required (uses Python standard library)
## Usage with Claude Code
### Basic Audit
```
Audit this codebase using the codebase-auditor skill.
```
### Focused Audit
```
Run a security-focused audit on this codebase.
```
### Quick Health Check
```
Give me a quick health check of this codebase (Phase 1 only).
```
### Custom Scope
```
Audit this codebase focusing on:
- Test coverage and quality
- Security vulnerabilities
- Code complexity
```
## Direct Script Usage
```bash
# Full audit with Markdown report
python scripts/audit_engine.py /path/to/codebase --output report.md
# Security-focused audit
python scripts/audit_engine.py /path/to/codebase --scope security --output security-report.md
# JSON output for CI/CD integration
python scripts/audit_engine.py /path/to/codebase --format json --output report.json
# Quick health check only (Phase 1)
python scripts/audit_engine.py /path/to/codebase --phase quick
```
## Output Formats
### Markdown (Default)
Human-readable report with detailed findings and recommendations. Suitable for:
- Pull request comments
- Documentation
- Team reviews
### JSON
Machine-readable format for CI/CD integration. Includes:
- Structured findings
- Metrics and scores
- Full metadata
### HTML
Interactive dashboard with:
- Visual metrics
- Filterable findings
- Color-coded severity levels
## Audit Criteria
The skill audits based on 10 key categories:
1. **Code Quality**: Complexity, duplication, code smells, file/function length
2. **Testing**: Coverage, test quality, testing trophy distribution
3. **Security**: Secrets detection, OWASP Top 10, dependency vulnerabilities
4. **Architecture**: SOLID principles, design patterns, modularity
5. **Performance**: Build times, bundle size, runtime efficiency
6. **Documentation**: Code docs, README, architecture docs
7. **DevOps & CI/CD**: Pipeline maturity, deployment frequency, DORA metrics
8. **Dependencies**: Outdated packages, license compliance, CVEs
9. **Accessibility**: WCAG 2.1 AA compliance
10. **TypeScript Strict Mode**: Type safety, strict mode violations
See [`reference/audit_criteria.md`](reference/audit_criteria.md) for complete checklist.
## Severity Levels
- **Critical (P0)**: Fix immediately (within 24 hours)
- Security vulnerabilities, secrets exposure, production-breaking bugs
- **High (P1)**: Fix this sprint (within 2 weeks)
- Significant quality/security issues, critical path test gaps
- **Medium (P2)**: Fix next quarter (within 3 months)
- Code smells, documentation gaps, moderate technical debt
- **Low (P3)**: Backlog
- Stylistic issues, minor optimizations
See [`reference/severity_matrix.md`](reference/severity_matrix.md) for detailed criteria.
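The timelines above can be captured as a simple lookup for tooling; a minimal sketch (the day counts are assumptions read off the ranges listed, not values the skill ships with):

```python
# Hypothetical SLA lookup derived from the severity timelines above.
# None means "backlog" (no fixed deadline).
SEVERITY_SLA_DAYS = {
    'critical': 1,    # P0: within 24 hours
    'high': 14,       # P1: within 2 weeks
    'medium': 90,     # P2: within 3 months
    'low': None,      # P3: backlog
}

def deadline_days(severity: str) -> "int | None":
    """Return the remediation window in days for a finding's severity."""
    return SEVERITY_SLA_DAYS.get(severity.lower())
```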
## Examples
See the [`examples/`](examples/) directory for:
- Sample audit report
- Sample remediation plan
## Architecture
```
codebase-auditor/
├── SKILL.md # Skill definition (Claude loads this)
├── README.md # This file
├── scripts/
│ ├── audit_engine.py # Core orchestrator
│ ├── analyzers/ # Specialized analyzers
│ │ ├── code_quality.py # Complexity, duplication, smells
│ │ ├── test_coverage.py # Coverage analysis
│ │ ├── security_scan.py # Security vulnerabilities
│ │ ├── dependencies.py # Dependency health
│ │ ├── performance.py # Performance analysis
│ │ └── technical_debt.py # SQALE rating
│ ├── report_generator.py # Multi-format reports
│ └── remediation_planner.py # Prioritized action plans
├── reference/
│ ├── audit_criteria.md # Complete audit checklist
│ ├── severity_matrix.md # Issue prioritization
│ └── best_practices_2025.md # SDLC standards
└── examples/
├── sample_report.md
└── remediation_plan.md
```
## Extending the Skill
### Adding a New Analyzer
1. Create `scripts/analyzers/your_analyzer.py`
2. Implement `analyze(codebase_path, metadata)` function that returns findings list
3. Add to `ANALYZERS` dict in `audit_engine.py`
Example:
```python
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
findings = []
# Your analysis logic here
findings.append({
'severity': 'high',
'category': 'your_category',
'subcategory': 'specific_issue',
'title': 'Issue title',
'description': 'What was found',
'file': 'path/to/file.js',
'line': 42,
'code_snippet': 'problematic code',
'impact': 'Why it matters',
'remediation': 'How to fix it',
'effort': 'low|medium|high',
})
return findings
```
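Step 3 registers the module with the engine; a sketch of the `ANALYZERS` entry, where `your_category` and `analyzers.your_analyzer` are the hypothetical names from steps 1-2:

```python
# Excerpt of the registry in scripts/audit_engine.py mapping
# analysis categories to analyzer module paths.
ANALYZERS = {
    'code_quality': 'analyzers.code_quality',
    'security': 'analyzers.security_scan',
    # New analyzer registered here; the engine imports the module
    # dynamically and calls its analyze() function.
    'your_category': 'analyzers.your_analyzer',
}
```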
## CI/CD Integration
### GitHub Actions Example
```yaml
name: Code Audit
on: [pull_request]
jobs:
audit:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Run Codebase Audit
run: |
python codebase-auditor/scripts/audit_engine.py . \
--format json \
--output audit-report.json
- name: Check for Critical Issues
run: |
CRITICAL=$(jq '.summary.critical_issues' audit-report.json)
if [ "$CRITICAL" -gt 0 ]; then
echo "❌ Found $CRITICAL critical issues"
exit 1
fi
```
## Best Practices
1. **Run Incrementally**: For large codebases, use progressive disclosure
2. **Focus on Critical Paths**: Audit authentication, payment, data processing first
3. **Baseline Before Releases**: Establish quality gates before major releases
4. **Track Over Time**: Compare audits to measure improvement
5. **Integrate with CI/CD**: Automate for continuous monitoring
6. **Customize Thresholds**: Adjust severity based on project maturity
## Limitations
- Static analysis only (no runtime profiling)
- Requires source code access
- Dependency data requires internet access (for vulnerability databases)
- Large codebases may need chunked analysis
## Version
**1.0.0** - Initial release
## Standards Compliance
Based on:
- DORA State of DevOps Report 2024
- OWASP Top 10 (2024 Edition)
- WCAG 2.1 Guidelines
- Kent C. Dodds Testing Trophy
- SonarQube Quality Gates
## License
Apache 2.0 (example skill for demonstration)
---
**Built with**: Python 3.8+
**Anthropic Skill Version**: 1.0
**Last Updated**: 2024-10-21



@@ -0,0 +1,112 @@
---
name: codebase-auditor
description: Use PROACTIVELY when evaluating code quality, assessing technical debt, or preparing for production deployment. Comprehensive audit tool analyzing software engineering practices, security vulnerabilities (OWASP Top 10), and technical debt using modern SDLC best practices (2024-25 standards). Generates prioritized remediation plans with effort estimates. Not for runtime profiling or real-time monitoring.
---
# Codebase Auditor
Comprehensive codebase audits using modern software engineering standards with actionable remediation plans.
## When to Use
- Audit codebase for quality, security, maintainability
- Assess technical debt and estimate remediation
- Prepare production readiness report
- Evaluate legacy codebase for modernization
## Audit Phases
### Phase 1: Initial Assessment
- Project discovery (tech stack, frameworks, tools)
- Quick health check (LOC, docs, git practices)
- Red flag detection (secrets, massive files)
### Phase 2: Deep Analysis
Load on demand based on Phase 1 findings.
### Phase 3: Report Generation
Comprehensive report with scores and priorities.
### Phase 4: Remediation Planning
Prioritized action plan with effort estimates.
## Analysis Categories
| Category | Key Checks |
|----------|------------|
| Code Quality | Complexity, duplication, code smells |
| Testing | Coverage (80% min), trophy distribution, quality |
| Security | OWASP Top 10, dependencies, secrets |
| Architecture | SOLID, patterns, modularity |
| Performance | Build time, bundle size, runtime |
| Documentation | JSDoc, README, ADRs |
| DevOps | CI/CD maturity, DORA metrics |
| Accessibility | WCAG 2.1 AA compliance |
## Technical Debt Rating (SQALE)
| Grade | Remediation Effort |
|-------|-------------------|
| A | <= 5% of dev time |
| B | 6-10% |
| C | 11-20% |
| D | 21-50% |
| E | > 50% |
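The grade boundaries above are a function of the debt ratio (remediation effort as a share of development time); a minimal sketch, assuming the ratio is already expressed as a percentage:

```python
def sqale_grade(debt_ratio_pct: float) -> str:
    """Map a technical-debt ratio (% of dev time) to a SQALE letter grade."""
    if debt_ratio_pct <= 5:
        return 'A'
    if debt_ratio_pct <= 10:
        return 'B'
    if debt_ratio_pct <= 20:
        return 'C'
    if debt_ratio_pct <= 50:
        return 'D'
    return 'E'
```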
## Usage Examples
```
# Basic audit
Audit this codebase using the codebase-auditor skill.
# Security focused
Run a security-focused audit on this codebase.
# Quick health check
Give me a quick health check (Phase 1 only).
# Custom scope
Audit focusing on test coverage and security.
```
## Output Formats
1. **Markdown Report** - Human-readable for PR comments
2. **JSON Report** - Machine-readable for CI/CD
3. **HTML Dashboard** - Interactive visualization
4. **Remediation Plan** - Prioritized action items
## Priority Levels
| Priority | Examples | Timeline |
|----------|----------|----------|
| P0 Critical | Security vulns, data loss risks | Immediate |
| P1 High | Coverage gaps, performance issues | This sprint |
| P2 Medium | Code smells, doc gaps | Next quarter |
| P3 Low | Stylistic, minor optimizations | Backlog |
## Best Practices
1. Run incrementally for large codebases
2. Focus on critical paths first
3. Baseline before major releases
4. Track metrics over time
5. Integrate with CI/CD
## Integrations
Complements: SonarQube, ESLint, Jest/Vitest, npm audit, Lighthouse, GitHub Actions
## Limitations
- Static analysis only (no runtime profiling)
- Requires source code access
- Internet needed for CVE data
- Large codebases need chunked analysis
## References
See `reference/` for:
- Complete audit criteria checklist
- Severity matrix and scoring rubric
- 2024-25 SDLC best practices guide

# Codebase Remediation Plan
**Generated**: 2024-10-21 14:30:00
**Codebase**: `/Users/connor/projects/example-app`
---
## Priority 0: Critical Issues (Fix Immediately ⚡)
**Timeline**: Within 24 hours
**Impact**: Security vulnerabilities, production-breaking bugs, data loss risks
### 1. Potential API key found in code
**Category**: Security
**Location**: `src/utils/api.ts`
**Effort**: LOW
**Issue**: Found potential secret on line 12
**Impact**: Exposed secrets can lead to unauthorized access and data breaches
**Action**: Remove secret from code and use environment variables or secret management tools
---
### 2. Use of eval() is dangerous
**Category**: Security
**Location**: `src/legacy/parser.js`
**Effort**: MEDIUM
**Issue**: Found on line 45
**Impact**: eval() can execute arbitrary code and is a security risk
**Action**: Remove eval() entirely; parse structured input with JSON.parse or a vetted expression-parser library (the Function constructor executes arbitrary code too and is not a safe substitute)
---
## Priority 1: High Issues (Fix This Sprint 📅)
**Timeline**: Within current sprint (2 weeks)
**Impact**: Significant quality, security, or user experience issues
### 1. High cyclomatic complexity (28)
**Category**: Code Quality
**Effort**: HIGH
**Action**: Refactor into smaller functions, extract complex conditions
### 2. Line coverage below target (65.3%)
**Category**: Testing
**Effort**: HIGH
**Action**: Add tests to increase coverage by 14.7%
### 3. Long function (127 lines)
**Category**: Code Quality
**Effort**: MEDIUM
**Action**: Extract smaller functions for distinct responsibilities
### 4. Console statement in production code
**Category**: Code Quality
**Effort**: LOW
**Action**: Remove console statement or replace with proper logging framework
### 5. Large file (843 lines)
**Category**: Code Quality
**Effort**: HIGH
**Action**: Split into multiple smaller, focused modules
---
## Priority 2: Medium Issues (Fix Next Quarter 📆)
**Timeline**: Within 3 months
**Impact**: Code maintainability, developer productivity
**Total Issues**: 25
**Grouped by Type**:
- TypeScript Strict Mode: 8 issues
- Modern JavaScript: 5 issues
- Code Smells: 7 issues
- Function Length: 5 issues
---
## Priority 3: Low Issues (Backlog 📋)
**Timeline**: When time permits
**Impact**: Minor improvements, stylistic issues
**Total Issues**: 12
*Address during dedicated tech debt sprints or slow periods*
---
## Suggested Timeline
- **2024-10-22**: All P0 issues resolved
- **2024-11-04**: P1 issues addressed (end of sprint)
- **2025-01-20**: P2 issues resolved (end of quarter)
## Effort Summary
**Total Estimated Effort**: 32.5 person-days
- Critical/High: 18.5 days
- Medium: 10.0 days
- Low: 4.0 days
## Team Assignment Suggestions
- **Security Team**: All P0 security issues, P1 vulnerabilities
- **QA/Testing**: Test coverage improvements, test quality issues
- **Infrastructure**: CI/CD improvements, build performance
- **Development Team**: Code quality refactoring, complexity reduction
---
*Remediation plan generated by Codebase Auditor Skill*
*Priority scoring based on: Impact × 10 + Frequency × 5 - Effort × 2*

# Codebase Audit Report
**Generated**: 2024-10-21 14:30:00
**Codebase**: `/Users/connor/projects/example-app`
**Tech Stack**: javascript, typescript, react, node
**Total Files**: 342
**Lines of Code**: 15,420
---
## Executive Summary
### Overall Health Score: **72/100**
#### Category Scores
- **Quality**: 68/100 ⚠️
- **Testing**: 65/100 ⚠️
- **Security**: 85/100 ✅
- **Technical Debt**: 70/100 ⚠️
#### Issue Summary
- **Critical Issues**: 2
- **High Issues**: 8
- **Total Issues**: 47
---
## Detailed Findings
### 🚨 CRITICAL (2 issues)
#### Potential API key found in code
**Category**: Security
**Subcategory**: secrets
**Location**: `src/utils/api.ts:12`
Found potential secret on line 12
```typescript
const API_KEY = "sk_live_1234567890abcdef1234567890abcdef";
```
**Impact**: Exposed secrets can lead to unauthorized access and data breaches
**Remediation**: Remove secret from code and use environment variables or secret management tools
**Effort**: LOW
---
#### Use of eval() is dangerous
**Category**: Security
**Subcategory**: code_security
**Location**: `src/legacy/parser.js:45`
Found on line 45
```javascript
const result = eval(userInput);
```
**Impact**: eval() can execute arbitrary code and is a security risk
**Remediation**: Remove eval() entirely; parse structured input with JSON.parse or a vetted expression-parser library (the Function constructor executes arbitrary code too and is not a safe substitute)
**Effort**: MEDIUM
---
### ⚠️ HIGH (8 issues)
#### High cyclomatic complexity (28)
**Category**: Code Quality
**Subcategory**: complexity
**Location**: `src/services/checkout.ts:156`
Function has complexity of 28
**Impact**: High complexity makes code difficult to understand, test, and maintain
**Remediation**: Refactor into smaller functions, extract complex conditions
**Effort**: HIGH
---
#### Line coverage below target (65.3%)
**Category**: Testing
**Subcategory**: test_coverage
**Location**: `coverage/coverage-summary.json`
Current coverage is 65.3%, target is 80%
**Impact**: Low coverage means untested code paths and higher bug risk
**Remediation**: Add tests to increase coverage by 14.7%
**Effort**: HIGH
---
## Recommendations
1. **Immediate Action Required**: Address all 2 critical security and quality issues before deploying to production.
2. **Sprint Focus**: Prioritize fixing the 8 high-severity issues in the next sprint. These significantly impact code quality and maintainability.
3. **Testing Improvements**: Increase test coverage to meet the 80% minimum threshold. Focus on critical paths first (authentication, payment, data processing).
4. **Security Review**: Conduct a thorough security review and penetration testing given the security issues found.
---
*Report generated by Codebase Auditor Skill (2024-25 Standards)*

# Codebase Audit Criteria Checklist
This document provides a comprehensive checklist for auditing codebases based on modern software engineering best practices (2024-25).
## 1. Code Quality
### Complexity Metrics
- [ ] Cyclomatic complexity measured for all functions/methods
- [ ] Functions with complexity > 10 flagged as warnings
- [ ] Functions with complexity > 20 flagged as critical
- [ ] Cognitive complexity analyzed
- [ ] Maximum nesting depth < 4 levels
- [ ] Function/method length < 50 LOC (recommendation)
- [ ] File length < 500 LOC (recommendation)
### Code Duplication
- [ ] Duplication analysis performed (minimum 6-line blocks)
- [ ] Overall duplication < 5%
- [ ] Duplicate blocks identified with locations
- [ ] Opportunities for abstraction documented
### Code Smells
- [ ] God objects/classes identified (> 10 public methods)
- [ ] Feature envy detected (high coupling to other classes)
- [ ] Dead code identified (unused imports, variables, functions)
- [ ] Magic numbers replaced with named constants
- [ ] Hard-coded values moved to configuration
- [ ] Naming conventions consistent
- [ ] Error handling comprehensive
- [ ] No console.log in production code
- [ ] No commented-out code blocks
### Language-Specific (TypeScript/JavaScript)
- [ ] No use of `any` type (strict mode)
- [ ] No use of `var` keyword
- [ ] Strict equality (`===`) used consistently
- [ ] Return type annotations present for functions
- [ ] Non-null assertions justified with comments
- [ ] Async/await preferred over Promise chains
- [ ] No implicit any returns
## 2. Testing & Coverage
### Coverage Metrics
- [ ] Line coverage >= 80%
- [ ] Branch coverage >= 75%
- [ ] Function coverage >= 90%
- [ ] Critical paths have 100% coverage (auth, payment, data processing)
- [ ] Coverage reports generated and accessible
### Testing Trophy Distribution
- [ ] Integration tests: ~70% of total tests
- [ ] Unit tests: ~20% of total tests
- [ ] E2E tests: ~10% of total tests
- [ ] Actual distribution documented
### Test Quality
- [ ] Tests follow "should X when Y" naming pattern
- [ ] Tests are isolated and independent
- [ ] No tests of implementation details (brittle tests)
- [ ] Single assertion per test (or grouped related assertions)
- [ ] Edge cases covered
- [ ] No flaky tests
- [ ] Tests use semantic queries (getByRole, getByLabelText)
- [ ] Avoid testing emoji presence, exact DOM counts, element ordering
### Test Performance
- [ ] Tests complete in < 30 seconds (unit/integration)
- [ ] CPU usage monitored (use `npm run test:low -- --run`)
- [ ] No runaway test processes
- [ ] Tests run in parallel where possible
- [ ] Max threads limited to prevent CPU overload
## 3. Security
### Dependency Vulnerabilities
- [ ] No critical CVEs in dependencies
- [ ] No high-severity CVEs in dependencies
- [ ] All dependencies using supported versions
- [ ] No dependencies unmaintained for > 2 years
- [ ] License compliance verified
- [ ] No dependency confusion risks
### OWASP Top 10 (2024)
- [ ] Access control properly implemented
- [ ] Sensitive data encrypted at rest and in transit
- [ ] Input validation prevents injection attacks
- [ ] Security design patterns followed
- [ ] Security configuration reviewed (no defaults)
- [ ] All components up-to-date
- [ ] Authentication robust (MFA, rate limiting)
- [ ] Software integrity verified (SRI, signatures)
- [ ] Security logging and monitoring enabled
- [ ] SSRF protections in place
### Secrets Management
- [ ] No API keys in code
- [ ] No tokens in code
- [ ] No passwords in code
- [ ] No private keys committed
- [ ] Environment variables properly used
- [ ] No secrets in client-side code
- [ ] .env files in .gitignore
- [ ] Git history clean of secrets
### Security Best Practices
- [ ] Input validation on all user inputs
- [ ] Output encoding prevents XSS
- [ ] CSRF tokens implemented
- [ ] Secure session management
- [ ] HTTPS enforced
- [ ] CSP headers configured
- [ ] Rate limiting on APIs
- [ ] SQL prepared statements used
## 4. Architecture & Design
### SOLID Principles
- [ ] Single Responsibility: Classes/modules have one reason to change
- [ ] Open/Closed: Open for extension, closed for modification
- [ ] Liskov Substitution: Subtypes are substitutable for base types
- [ ] Interface Segregation: Clients not forced to depend on unused methods
- [ ] Dependency Inversion: Depend on abstractions, not concretions
### Design Patterns
- [ ] Appropriate patterns used (Factory, Strategy, Observer, etc.)
- [ ] No anti-patterns (Singleton abuse, God Object, etc.)
- [ ] Not over-engineered
- [ ] Not under-engineered
### Modularity
- [ ] Low coupling between modules
- [ ] High cohesion within modules
- [ ] No circular dependencies
- [ ] Proper separation of concerns
- [ ] Clean public APIs
- [ ] Internal implementation details hidden
## 5. Performance
### Build Performance
- [ ] Build time < 2 minutes for typical project
- [ ] Bundle size documented and optimized
- [ ] Code splitting implemented
- [ ] Tree-shaking enabled
- [ ] Source maps configured correctly
- [ ] Production build optimized
### Runtime Performance
- [ ] No memory leaks
- [ ] Algorithms efficient (avoid O(n²) where possible)
- [ ] No excessive re-renders (React/Vue)
- [ ] Computations memoized where appropriate
- [ ] Images optimized (< 200KB)
- [ ] Videos optimized or lazy-loaded
- [ ] Lazy loading for large components
### CI/CD Performance
- [ ] Pipeline runs in < 10 minutes
- [ ] Deployment frequency documented
- [ ] Test execution time < 5 minutes
- [ ] Docker images < 500MB (if applicable)
## 6. Documentation
### Code Documentation
- [ ] Public APIs documented (JSDoc/TSDoc)
- [ ] Complex logic has inline comments
- [ ] README.md comprehensive
- [ ] Architecture Decision Records (ADRs) present
- [ ] API documentation available
- [ ] CONTRIBUTING.md exists
- [ ] CODE_OF_CONDUCT.md exists
### Documentation Maintenance
- [ ] No outdated documentation
- [ ] No broken links
- [ ] All sections complete
- [ ] Code examples work correctly
- [ ] Changelog maintained
## 7. DevOps & CI/CD
### CI/CD Maturity
- [ ] Automated testing in pipeline
- [ ] Automated deployment configured
- [ ] Development/staging/production environments
- [ ] Rollback capability exists
- [ ] Feature flags used for risky changes
- [ ] Blue-green or canary deployments
### DORA 4 Metrics
- [ ] Deployment frequency measured
- Elite: Multiple times per day
- High: Once per day to once per week
- Medium: Once per week to once per month
- Low: Less than once per month
- [ ] Lead time for changes measured
- Elite: Less than 1 hour
- High: 1 day to 1 week
- Medium: 1 week to 1 month
- Low: More than 1 month
- [ ] Change failure rate measured
- Elite: < 1%
- High: 1-5%
- Medium: 5-15%
- Low: > 15%
- [ ] Time to restore service measured
- Elite: < 1 hour
- High: < 1 day
- Medium: 1 day to 1 week
- Low: > 1 week
### Infrastructure as Code
- [ ] Configuration managed as code
- [ ] Infrastructure versioned
- [ ] Secrets managed securely (Vault, AWS Secrets Manager)
- [ ] Environment variables documented
## 8. Accessibility (WCAG 2.1 AA)
### Semantic HTML
- [ ] Proper heading hierarchy (h1 → h2 → h3)
- [ ] ARIA labels where needed
- [ ] Form labels associated with inputs
- [ ] Landmark regions defined (header, nav, main, footer)
### Keyboard Navigation
- [ ] All interactive elements keyboard accessible
- [ ] Focus management implemented
- [ ] Tab order logical
- [ ] Focus indicators visible
### Screen Reader Support
- [ ] Images have alt text
- [ ] ARIA live regions for dynamic content
- [ ] Links have descriptive text
- [ ] Form errors announced
### Color & Contrast
- [ ] Text contrast >= 4.5:1 (normal text)
- [ ] Text contrast >= 3:1 (large text 18pt+)
- [ ] UI components contrast >= 3:1
- [ ] Color not sole means of conveying information
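The contrast thresholds above come from the WCAG relative-luminance formula, which can be checked programmatically. A minimal sketch; the function names are illustrative:

```typescript
// WCAG 2.x contrast ratio between two sRGB colors (channels 0-255).
function relativeLuminance(r: number, g: number, b: number): number {
  // Linearize each sRGB channel, then apply the standard luminance weights.
  const channel = (v: number): number => {
    const s = v / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

function contrastRatio(
  fg: [number, number, number],
  bg: [number, number, number]
): number {
  const l1 = relativeLuminance(...fg);
  const l2 = relativeLuminance(...bg);
  const [lighter, darker] = l1 >= l2 ? [l1, l2] : [l2, l1];
  return (lighter + 0.05) / (darker + 0.05);
}
```

Black on white yields 21:1, the maximum possible ratio; normal text passes AA at 4.5:1 or better.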
## 9. Technical Debt
### SQALE Rating
- [ ] Technical debt quantified in person-days
- [ ] Rating assigned (A-E)
- A: <= 5% of development time
- B: 6-10%
- C: 11-20%
- D: 21-50%
- E: > 50%
### Debt Categories
- [ ] Code smell debt identified
- [ ] Test debt quantified
- [ ] Documentation debt listed
- [ ] Security debt prioritized
- [ ] Performance debt noted
- [ ] Architecture debt evaluated
## 10. Project-Specific Standards
### Connor's Global Standards
- [ ] TypeScript strict mode enabled
- [ ] No `any` types
- [ ] Explicit return types
- [ ] Comprehensive error handling
- [ ] 80%+ test coverage
- [ ] No console.log statements
- [ ] No `var` keyword
- [ ] No loose equality (`==`)
- [ ] Conventional commits format
- [ ] Branch naming follows pattern: (feature|bugfix|chore)/{component-name}
## Audit Completion
### Final Checks
- [ ] All critical issues identified
- [ ] All high-severity issues documented
- [ ] Severity assigned to each finding
- [ ] Remediation effort estimated
- [ ] Report generated
- [ ] Remediation plan created
- [ ] Stakeholders notified
---
**Note**: This checklist is based on industry best practices as of 2024-25. Adjust severity thresholds and criteria based on your project's maturity stage and business context.

# Modern SDLC Best Practices (2024-25)
This document outlines industry-standard software development lifecycle best practices based on 2024-25 research and modern engineering standards.
## Table of Contents
1. [Development Workflow](#development-workflow)
2. [Testing Strategy](#testing-strategy)
3. [Security (DevSecOps)](#security-devsecops)
4. [Code Quality](#code-quality)
5. [Performance](#performance)
6. [Documentation](#documentation)
7. [DevOps & CI/CD](#devops--cicd)
8. [DORA Metrics](#dora-metrics)
9. [Developer Experience](#developer-experience)
10. [Accessibility](#accessibility)
11. [Modern Trends (2024-25)](#modern-trends-2024-25)
12. [Industry Benchmarks (2024-25)](#industry-benchmarks-2024-25)
---
## Development Workflow
### Version Control (Git)
**Branching Strategy**:
- Main/master branch is always deployable
- Feature branches for new work: `feature/{component-name}`
- Bugfix branches: `bugfix/{issue-number}`
- Release branches for production releases
- No direct commits to main (use pull requests)
**Commit Messages**:
- Follow Conventional Commits format
- Structure: `type(scope): description`
- Types: feat, fix, docs, style, refactor, test, chore
- Example: `feat(auth): add OAuth2 social login`
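The format above is easy to enforce in a commit-msg hook. A minimal validator sketch; `isConventionalCommit` is a hypothetical helper, and the regex covers the types listed here plus the optional `!` breaking-change marker from the Conventional Commits spec:

```typescript
// Accepts subjects like "feat(auth): add OAuth2 social login".
const CONVENTIONAL_COMMIT =
  /^(feat|fix|docs|style|refactor|test|chore)(\([a-z0-9-]+\))?!?: .+/;

function isConventionalCommit(subject: string): boolean {
  return CONVENTIONAL_COMMIT.test(subject);
}
```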
**Code Review**:
- All changes require peer review
- Use pull request templates
- Automated checks must pass before merge
- Review within 24 hours for team velocity
- Focus on logic, security, and maintainability
### Test-Driven Development (TDD)
**RED-GREEN-REFACTOR Cycle**:
1. **RED**: Write failing test first
2. **GREEN**: Write minimum code to pass
3. **REFACTOR**: Improve code quality while tests pass
**Benefits**:
- Better design through testability
- Documentation through tests
- Confidence to refactor
- Fewer regression bugs
---
## Testing Strategy
### Testing Trophy (Kent C. Dodds)
**Philosophy**: "Write tests. Not too many. Mostly integration."
**Distribution**:
- **Integration Tests (70%)**: User workflows and component interaction
- Test real user behavior
- Test multiple units working together
- Higher confidence than unit tests
- Example: User registration flow end-to-end
- **Unit Tests (20%)**: Complex business logic only
- Pure functions
- Complex algorithms
- Edge cases and error handling
- Example: Tax calculation logic
- **E2E Tests (10%)**: Critical user journeys
- Full stack, production-like environment
- Happy path scenarios
- Critical business flows
- Example: Complete purchase flow
### What NOT to Test (Brittle Patterns)
**Avoid**:
- Emoji presence in UI elements
- Exact number of DOM elements
- Specific element ordering (unless critical)
- API call counts (unless performance critical)
- CSS class names and styling
- Implementation details over user behavior
- Private methods/functions
- Third-party library internals
### What to Prioritize (User-Focused)
**Prioritize**:
- User workflows and interactions
- Business logic and calculations
- Data accuracy and processing
- Error handling and edge cases
- Performance within acceptable limits
- Accessibility compliance (WCAG 2.1 AA)
- Security boundaries
### Semantic Queries (React Testing Library)
**Priority Order**:
1. `getByRole()` - Most preferred (accessibility-first)
2. `getByLabelText()` - Form elements
3. `getByPlaceholderText()` - Inputs without labels
4. `getByText()` - User-visible content
5. `getByDisplayValue()` - Form current values
6. `getByAltText()` - Images
7. `getByTitle()` - Title attributes
8. `getByTestId()` - Last resort only
### Coverage Targets
**Minimum Requirements**:
- Overall coverage: **80%**
- Critical paths: **100%** (auth, payment, data processing)
- Branch coverage: **75%**
- Function coverage: **90%**
**Tools**:
- Jest/Vitest for unit & integration tests
- Cypress/Playwright for E2E tests
- Istanbul/c8 for coverage reporting
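These floors can be enforced at test time rather than by convention. A minimal Vitest configuration sketch, assuming Vitest's `coverage.thresholds` options; verify the option names against the Vitest version in use:

```typescript
// vitest.config.ts (sketch): fail the test run when coverage drops
// below the targets listed above.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      thresholds: {
        lines: 80,
        branches: 75,
        functions: 90,
      },
    },
  },
});
```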
---
## Security (DevSecOps)
### Shift-Left Security
**Principle**: Integrate security into every development stage, not as an afterthought.
**Cost Multiplier**:
- Fix in **design**: 1x cost
- Fix in **development**: 5x cost
- Fix in **testing**: 10x cost
- Fix in **production**: 30x cost
### OWASP Top 10 (2024)
1. **Broken Access Control**: Enforce authorization checks on every request
2. **Cryptographic Failures**: Use TLS, encrypt PII, avoid weak algorithms
3. **Injection**: Validate input, use prepared statements, sanitize output
4. **Insecure Design**: Threat modeling, secure design patterns
5. **Security Misconfiguration**: Harden defaults, disable unnecessary features
6. **Vulnerable Components**: Keep dependencies updated, scan for CVEs
7. **Authentication Failures**: MFA, rate limiting, secure session management
8. **Software Integrity Failures**: Verify integrity with signatures, SRI
9. **Security Logging**: Log security events, monitor for anomalies
10. **SSRF**: Validate URLs, whitelist allowed domains
### Dependency Management
**Best Practices**:
- Run `npm audit` / `yarn audit` weekly
- Update dependencies monthly
- Use Dependabot/Renovate for automated updates
- Pin dependency versions in production
- Check licenses for compliance
- Monitor CVE databases
### Secrets Management
**Rules**:
- NEVER commit secrets to version control
- Use environment variables for configuration
- Use secret management tools (Vault, AWS Secrets Manager)
- Rotate secrets regularly
- Scan git history for leaked secrets
- Use `.env.example` for documentation, not `.env`
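A first-pass scan for violations of these rules can be scripted. An illustrative sketch; the patterns shown are examples only, not an exhaustive ruleset:

```typescript
// Flags lines that look like hard-coded credentials and returns their
// 1-based line numbers.
const SECRET_PATTERNS: RegExp[] = [
  /sk_live_[A-Za-z0-9]{16,}/, // Stripe-style live keys
  /AKIA[0-9A-Z]{16}/, // AWS access key IDs
  /(password|secret|token)\s*[:=]\s*["'][^"']{8,}["']/i, // generic assignments
];

function findSecretLines(source: string): number[] {
  return source
    .split("\n")
    .flatMap((line, i) =>
      SECRET_PATTERNS.some((p) => p.test(line)) ? [i + 1] : []
    );
}
```

Dedicated tools (gitleaks, truffleHog) cover far more patterns and should scan git history as well, not just the working tree.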
---
## Code Quality
### Complexity Metrics
**Cyclomatic Complexity**:
- **1-10**: Simple, easy to test
- **11-20**: Moderate, consider refactoring
- **21-50**: High, should refactor
- **50+**: Very high, must refactor
**Tool**: ESLint `complexity` rule, SonarQube
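These thresholds map directly onto ESLint core rules. A flat-config sketch (the rule names are standard ESLint core rules; a single `max` value warns at the lower boundary, since one rule cannot express both a warn and an error threshold):

```typescript
// eslint.config (sketch): flag functions that exceed the complexity,
// nesting, and length limits discussed above.
export default [
  {
    rules: {
      complexity: ["warn", { max: 10 }],
      "max-depth": ["warn", 4],
      "max-lines-per-function": ["warn", { max: 50 }],
    },
  },
];
```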
### Code Duplication
**Thresholds**:
- **< 5%**: Excellent
- **5-10%**: Acceptable
- **10-20%**: Needs attention
- **> 20%**: Critical issue
**DRY Principle**: Don't Repeat Yourself
- Extract common code into functions/modules
- Use design patterns (Template Method, Strategy)
- Balance DRY with readability
### Code Smells
**Common Smells**:
- **God Object**: Too many responsibilities
- **Feature Envy**: Too much coupling to other classes
- **Long Method**: > 50 lines
- **Long Parameter List**: > 4 parameters
- **Dead Code**: Unused code
- **Magic Numbers**: Hard-coded values
- **Primitive Obsession**: Overuse of primitives vs objects
**Refactoring Techniques**:
- Extract Method
- Extract Class
- Introduce Parameter Object
- Replace Magic Number with Constant
- Remove Dead Code
### Static Analysis
**Tools**:
- **SonarQube**: Comprehensive code quality platform
- **ESLint**: JavaScript/TypeScript linting
- **Prettier**: Code formatting
- **TypeScript**: Type checking in strict mode
- **Checkmarx**: Security-focused analysis
---
## Performance
### Build Performance
**Targets**:
- Build time: < 2 minutes
- Hot reload: < 200ms
- First build: < 5 minutes
**Optimization**:
- Use build caching
- Parallelize builds
- Tree-shaking
- Code splitting
- Lazy loading
### Runtime Performance
**Web Vitals (Core)**:
- **LCP (Largest Contentful Paint)**: < 2.5s
- **INP (Interaction to Next Paint)**: < 200ms (replaced FID as a Core Web Vital in 2024)
- **CLS (Cumulative Layout Shift)**: < 0.1
**API Performance**:
- **P50**: < 100ms
- **P95**: < 500ms
- **P99**: < 1000ms
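Percentile targets like these are computed over collected response times; nearest-rank is the simplest method. A minimal sketch (`percentile` is an illustrative helper):

```typescript
// Nearest-rank percentile over a set of latency samples (ms).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}
```

In a CI gate, `percentile(latencies, 95) <= 500` would then express the P95 target above.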
**Optimization Techniques**:
- Caching (Redis, CDN)
- Database indexing
- Query optimization
- Compression (gzip, Brotli)
- Image optimization (WebP, lazy loading)
- Code splitting and lazy loading
### Bundle Size
**Targets**:
- Initial bundle: < 200KB (gzipped)
- Total JavaScript: < 500KB (gzipped)
- Images optimized: < 200KB each
**Tools**:
- webpack-bundle-analyzer
- Lighthouse
- Chrome DevTools Performance tab
---
## Documentation
### Code Documentation
**JSDoc/TSDoc**:
- Document all public APIs
- Include examples for complex functions
- Document parameters, return types, exceptions
**Example**:
```typescript
/**
* Calculates the total price including tax and discounts.
*
* @param items - Array of cart items
* @param taxRate - Tax rate as decimal (e.g., 0.08 for 8%)
* @param discountCode - Optional discount code
* @returns Total price with tax and discounts applied
* @throws {InvalidDiscountError} If discount code is invalid
*
* @example
* const total = calculateTotal(items, 0.08, 'SUMMER20');
*/
function calculateTotal(items: CartItem[], taxRate: number, discountCode?: string): number {
// ...
}
```
### Project Documentation
**Essential Files**:
- **README.md**: Project overview, setup instructions, quick start
- **CONTRIBUTING.md**: How to contribute, coding standards, PR process
- **CODE_OF_CONDUCT.md**: Community guidelines
- **CHANGELOG.md**: Version history and changes
- **LICENSE**: Legal license information
- **ARCHITECTURE.md**: High-level architecture overview
- **ADRs** (Architecture Decision Records): Document important decisions
---
## DevOps & CI/CD
### Continuous Integration
**Requirements**:
- Automated testing on every commit
- Build verification
- Code quality checks (linting, formatting)
- Security scanning
- Fast feedback (< 10 minutes)
**Pipeline Stages**:
1. Lint & Format Check
2. Unit Tests
3. Integration Tests
4. Security Scan
5. Build Artifacts
6. Deploy to Staging
7. E2E Tests
8. Deploy to Production (with approval)
### Continuous Deployment
**Strategies**:
- **Blue-Green**: Two identical environments, switch traffic
- **Canary**: Gradual rollout to subset of users
- **Rolling**: Update instances incrementally
- **Feature Flags**: Control feature visibility without deployment
**Rollback**:
- Automated rollback on failure detection
- Keep last 3-5 versions deployable
- Database migrations reversible
- Monitor key metrics post-deployment
### Infrastructure as Code
**Tools**:
- Terraform, CloudFormation, Pulumi
- Ansible, Chef, Puppet
- Docker, Kubernetes
**Benefits**:
- Version-controlled infrastructure
- Reproducible environments
- Disaster recovery
- Automated provisioning
---
## DORA Metrics
**Four Key Metrics** (DevOps Research and Assessment):
### 1. Deployment Frequency
**How often code is deployed to production**
- **Elite**: Multiple times per day
- **High**: Once per day to once per week
- **Medium**: Once per week to once per month
- **Low**: Less than once per month
### 2. Lead Time for Changes
**Time from commit to production**
- **Elite**: Less than 1 hour
- **High**: 1 day to 1 week
- **Medium**: 1 week to 1 month
- **Low**: More than 1 month
### 3. Change Failure Rate
**Percentage of deployments causing failures**
- **Elite**: < 1%
- **High**: 1-5%
- **Medium**: 5-15%
- **Low**: > 15%
### 4. Time to Restore Service
**Time to recover from production incident**
- **Elite**: < 1 hour
- **High**: < 1 day
- **Medium**: 1 day to 1 week
- **Low**: > 1 week
**Tracking**: Use CI/CD tools, APM (Application Performance Monitoring), incident management systems
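Of the four metrics, change failure rate has the cleanest numeric bands and is easy to classify automatically. A sketch, with tier names and thresholds taken from the bands above:

```typescript
// Classifies a change failure rate (percent of deployments causing
// failures) into the DORA performance tiers listed above.
type DoraTier = "Elite" | "High" | "Medium" | "Low";

function changeFailureRateTier(percent: number): DoraTier {
  if (percent < 1) return "Elite";
  if (percent <= 5) return "High";
  if (percent <= 15) return "Medium";
  return "Low";
}
```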
---
## Developer Experience
### Why It Matters
**Statistics**:
- 83% of engineers experience burnout
- Developer experience is the strongest predictor of delivery capability
- Happy developers are 2x more productive
### Key Factors
**Fast Feedback Loops**:
- Quick build times
- Fast test execution
- Immediate linting/formatting feedback
- Hot module reloading
**Good Tooling**:
- Modern IDE with autocomplete
- Debuggers and profilers
- Automated code reviews
- Documentation generators
**Clear Standards**:
- Coding style guides
- Architecture documentation
- Onboarding guides
- Runbooks for common tasks
**Psychological Safety**:
- Blameless post-mortems
- Encourage experimentation
- Celebrate learning from failure
- Mentorship programs
---
## Accessibility
### WCAG 2.1 Level AA Compliance
**Four Principles (POUR)**:
1. **Perceivable**: Information must be presentable to users
- Alt text for images
- Captions for videos
- Color contrast ratios
2. **Operable**: UI components must be operable
- Keyboard navigation
- Sufficient time to read content
- No seizure-inducing content
3. **Understandable**: Information must be understandable
- Readable text
- Predictable behavior
- Input assistance (error messages)
4. **Robust**: Content must be robust across technologies
- Valid HTML
- ARIA attributes
- Cross-browser compatibility
### Testing Tools
**Automated**:
- axe DevTools
- Lighthouse
- WAVE
- Pa11y
**Manual**:
- Keyboard navigation testing
- Screen reader testing (NVDA, JAWS, VoiceOver)
- Color contrast checkers
- Zoom testing (200%+)
---
## Modern Trends (2024-25)
### AI-Assisted Development
**Tools**:
- GitHub Copilot
- ChatGPT / Claude
- Tabnine
- Amazon CodeWhisperer
**Best Practices**:
- Review all AI-generated code
- Write tests for AI code
- Understand before committing
- Train team on effective prompting
### Platform Engineering
**Concept**: Internal developer platforms to improve developer experience
**Components**:
- Self-service infrastructure
- Golden paths (templates)
- Developer portals
- Observability dashboards
### Observability (vs Monitoring)
**Three Pillars**:
1. **Logs**: What happened
2. **Metrics**: Quantitative data
3. **Traces**: Request flow through system
**Tools**:
- Datadog, New Relic, Grafana
- OpenTelemetry for standardization
- Distributed tracing (Jaeger, Zipkin)
---
## Industry Benchmarks (2024-25)
### Code Quality
- Tech debt ratio: < 5%
- Duplication: < 5%
- Test coverage: > 80%
- Build time: < 2 minutes
### Security
- CVE remediation: < 30 days
- Security training: Quarterly
- Penetration testing: Annually
### Performance
- Page load: < 3 seconds
- API response: P95 < 500ms
- Uptime: 99.9%+
### Team Metrics
- Pull request review time: < 24 hours
- Deployment frequency: Daily+
- Incident MTTR: < 1 hour
- Developer onboarding: < 1 week
---
**References**:
- DORA State of DevOps Report 2024
- OWASP Top 10 (2024 Edition)
- WCAG 2.1 Guidelines
- Kent C. Dodds Testing Trophy
- SonarQube Quality Gates
- Google Web Vitals
**Last Updated**: 2024-25
**Version**: 1.0

# Severity Matrix & Issue Prioritization
This document defines how to categorize and prioritize issues found during codebase audits.
## Severity Levels
### Critical (P0) - Fix Immediately
**Definition**: Issues that pose immediate risk to security, data integrity, or production stability.
**Characteristics**:
- Security vulnerabilities with known exploits (CVE scores >= 9.0)
- Secrets or credentials exposed in code
- Data loss or corruption risks
- Production-breaking bugs
- Authentication/authorization bypasses
- SQL injection or XSS vulnerabilities
- Compliance violations (GDPR, HIPAA, etc.)
**Timeline**: Must be fixed within 24 hours
**Effort vs Impact**: Fix immediately regardless of effort
**Deployment**: Requires immediate hotfix release
**Examples**:
- API key committed to repository
- SQL injection vulnerability in production endpoint
- Authentication bypass allowing unauthorized access
- Critical CVE in production dependency (e.g., log4shell)
- Unencrypted PII being transmitted over HTTP
- Memory leak causing production crashes
---
### High (P1) - Fix This Sprint
**Definition**: Significant issues that impact quality, security, or user experience but don't pose immediate production risk.
**Characteristics**:
- Medium-severity security vulnerabilities (CVE scores 7.0-8.9)
- Critical path missing test coverage
- Performance bottlenecks affecting user experience
- WCAG AA accessibility violations
- TypeScript strict mode violations in critical code
- High cyclomatic complexity (> 20) in business logic
- Missing error handling in critical operations
**Timeline**: Fix within current sprint (2 weeks)
**Effort vs Impact**: Prioritize high-impact, low-effort fixes first
**Deployment**: Include in next regular release
**Examples**:
- Payment processing code with 0% test coverage
- Page load time > 3 seconds
- Form inaccessible to screen readers
- 500+ line function with complexity of 45
- Unhandled promise rejections in checkout flow
- Dependency with high-severity CVE (7.5 score)
---
### Medium (P2) - Fix Next Quarter
**Definition**: Issues that reduce code maintainability, developer productivity, or future scalability but don't immediately impact users.
**Characteristics**:
- Code smells and duplication
- Low-severity security issues (CVE scores 4.0-6.9)
- Test coverage between 60-80%
- Documentation gaps
- Minor performance optimizations
- Outdated dependencies (no CVEs)
- Moderate complexity (10-20)
- Technical debt accumulation
**Timeline**: Fix within next quarter (3 months)
**Effort vs Impact**: Plan during sprint planning, batch similar fixes
**Deployment**: Include in planned refactoring releases
**Examples**:
- 15% code duplication across services
- Missing JSDoc for public API
- God class with 25 public methods
- Build time of 5 minutes
- Test suite takes 10 minutes to run
- Dependency 2 major versions behind (stable)
---
### Low (P3) - Backlog
**Definition**: Minor improvements, stylistic issues, or optimizations that have minimal impact on functionality or quality.
**Characteristics**:
- Stylistic inconsistencies
- Minor code smells
- Documentation improvements
- Nice-to-have features
- Long-term architectural improvements
- Code coverage 80-90% (already meets minimum)
- Low complexity optimizations (< 10)
**Timeline**: Address when time permits or during dedicated tech debt sprints
**Effort vs Impact**: Only fix if effort is minimal or during slow periods
**Deployment**: Bundle with feature releases
**Examples**:
- Inconsistent variable naming (camelCase vs snake_case)
- Missing comments on simple functions
- Single-character variable names in non-critical code
- Console.log in development-only code
- README could be more detailed
- Opportunity to refactor small utility function
---
## Scoring Rubric
Use this matrix to assign severity levels:
| Impact | Effort Low | Effort Medium | Effort High |
|--------|------------|---------------|-------------|
| **Critical** | P0 | P0 | P0 |
| **High** | P1 | P1 | P1 |
| **Medium** | P1 | P2 | P2 |
| **Low** | P2 | P3 | P3 |
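The matrix can be expressed as a small lookup table. This is a minimal sketch; the `PRIORITY_MATRIX` and `assign_priority` names are illustrative, not part of any shipped API:

```python
# Hypothetical lookup implementing the Impact x Effort matrix above.
PRIORITY_MATRIX = {
    'critical': {'low': 'P0', 'medium': 'P0', 'high': 'P0'},
    'high':     {'low': 'P1', 'medium': 'P1', 'high': 'P1'},
    'medium':   {'low': 'P1', 'medium': 'P2', 'high': 'P2'},
    'low':      {'low': 'P2', 'medium': 'P3', 'high': 'P3'},
}

def assign_priority(impact: str, effort: str) -> str:
    """Return the P0-P3 bucket for an impact/effort pair, per the rubric table."""
    return PRIORITY_MATRIX[impact.lower()][effort.lower()]
```

For example, `assign_priority('Medium', 'Low')` returns `'P1'`, matching the third row of the matrix.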
### Impact Assessment
**Critical Impact**:
- Security breach
- Data loss/corruption
- Production outage
- Legal/compliance violation
**High Impact**:
- User experience degraded
- Performance issues
- Accessibility barriers
- Development velocity reduced significantly
**Medium Impact**:
- Code maintainability reduced
- Technical debt accumulating
- Future changes more difficult
- Developer productivity slightly reduced
**Low Impact**:
- Minimal user/developer effect
- Cosmetic issues
- Future-proofing
- Best practice deviations
### Effort Estimation
**Low Effort**: < 4 hours
- Simple configuration change
- One-line fix
- Update dependency version
**Medium Effort**: 4 hours - 2 days
- Refactor single module
- Add test coverage for feature
- Implement security fix with tests
**High Effort**: > 2 days
- Architectural changes
- Major refactoring
- Migration to new framework/library
- Comprehensive security overhaul
---
## Category-Specific Severity Guidelines
### Security Issues
| Finding | Severity |
|---------|----------|
| Known exploit in production | Critical |
| Secrets in code | Critical |
| Authentication bypass | Critical |
| SQL injection | Critical |
| XSS vulnerability | High |
| CSRF vulnerability | High |
| Outdated dependency (CVE 7-9) | High |
| Outdated dependency (CVE 4-7) | Medium |
| Missing security headers | Medium |
| Weak encryption algorithm | Medium |
### Code Quality Issues
| Finding | Severity |
|---------|----------|
| Complexity > 50 | High |
| Complexity 20-50 | Medium |
| Complexity 10-20 | Low |
| Duplication > 20% | High |
| Duplication 10-20% | Medium |
| Duplication 5-10% | Low |
| File > 1000 LOC | Medium |
| File > 500 LOC | Low |
| Dead code (unused for > 6 months) | Low |
### Test Coverage Issues
| Finding | Severity |
|---------|----------|
| Critical path untested | High |
| Coverage < 50% | High |
| Coverage 50-80% | Medium |
| Coverage 80-90% | Low |
| Flaky tests | Medium |
| Slow tests (> 10 min) | Medium |
| No E2E tests | Medium |
| Missing edge case tests | Low |
### Performance Issues
| Finding | Severity |
|---------|----------|
| Page load > 5s | High |
| Page load 3-5s | Medium |
| Memory leak | High |
| O(n²) in hot path | High |
| Bundle size > 5MB | Medium |
| Build time > 10 min | Medium |
| Unoptimized images | Low |
### Accessibility Issues
| Finding | Severity |
|---------|----------|
| No keyboard navigation | High |
| Contrast ratio < 3:1 | High |
| Missing ARIA labels | High |
| Heading hierarchy broken | Medium |
| Missing alt text | Medium |
| Focus indicators absent | Medium |
| Color-only information | Low |
---
## Remediation Priority Formula
Use this formula to calculate a priority score:
```
Priority Score = (Impact × 10) + (Frequency × 5) - (Effort × 2)
```
Where:
- **Impact**: 1-10 (10 = critical)
- **Frequency**: 1-10 (10 = affects all users/code)
- **Effort**: 1-10 (10 = requires months of work)
Sort issues by priority score (highest first) to create your remediation plan.
### Example Calculations
**Example 1**: SQL Injection in Login
- Impact: 10 (critical security issue)
- Frequency: 10 (affects all users)
- Effort: 3 (straightforward fix with prepared statements)
- Score: (10 × 10) + (10 × 5) - (3 × 2) = **144** → **P0**
**Example 2**: Missing Tests on Helper Utility
- Impact: 4 (low risk, helper function)
- Frequency: 2 (rarely used)
- Effort: 2 (quick to test)
- Score: (4 × 10) + (2 × 5) - (2 × 2) = **46** → **P3**
**Example 3**: Performance Bottleneck in Search
- Impact: 7 (user experience degraded)
- Frequency: 8 (common feature)
- Effort: 6 (requires algorithm optimization)
- Score: (7 × 10) + (8 × 5) - (6 × 2) = **98** → **P1**
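The three worked examples can be reproduced with a short helper. This is a sketch; the finding dictionaries and field names are chosen for illustration:

```python
def priority_score(impact: int, frequency: int, effort: int) -> int:
    """Priority Score = (Impact x 10) + (Frequency x 5) - (Effort x 2)."""
    return impact * 10 + frequency * 5 - effort * 2

# Illustrative findings mirroring the three worked examples
findings = [
    {'title': 'Missing tests on helper utility', 'impact': 4, 'frequency': 2, 'effort': 2},
    {'title': 'SQL injection in login', 'impact': 10, 'frequency': 10, 'effort': 3},
    {'title': 'Performance bottleneck in search', 'impact': 7, 'frequency': 8, 'effort': 6},
]

# Highest score first gives the remediation order
ranked = sorted(
    findings,
    key=lambda f: priority_score(f['impact'], f['frequency'], f['effort']),
    reverse=True,
)
# ranked order: SQL injection (144), search bottleneck (98), missing tests (46)
```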
---
## Escalation Criteria
Escalate to leadership when:
- 5+ Critical issues found
- 10+ High issues in production code
- SQALE rating of D or E
- Security issues require disclosure
- Compliance violations detected
- Technical debt > 50% of development capacity
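The quantitative thresholds above can be checked mechanically. A sketch under assumed conventions (findings are dicts with a `'severity'` key; the disclosure and compliance criteria are judgment calls and are left out):

```python
def needs_escalation(findings, sqale_rating, debt_share_pct):
    """Return True when any numeric leadership-escalation threshold is hit.

    `debt_share_pct` is the share of development capacity consumed by
    technical debt, as a percentage.
    """
    critical = sum(1 for f in findings if f['severity'] == 'critical')
    high = sum(1 for f in findings if f['severity'] == 'high')
    return (
        critical >= 5
        or high >= 10
        or sqale_rating in ('D', 'E')
        or debt_share_pct > 50
    )
```

Note that the "High issues in production code" criterion is approximated here by counting all high findings; filtering by file location would require project-specific path conventions.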
---
## Review Cycles
Recommended audit frequency based on project type:
| Project Type | Audit Frequency | Focus Areas |
|-------------|-----------------|-------------|
| Production SaaS | Monthly | Security, Performance, Uptime |
| Enterprise Software | Quarterly | Compliance, Security, Quality |
| Internal Tools | Semi-annually | Technical Debt, Maintainability |
| Open Source | Per major release | Security, Documentation, API stability |
| Startup MVP | Before funding rounds | Security, Scalability, Technical Debt |
---
**Last Updated**: 2024-25 Standards
**Version**: 1.0


@@ -0,0 +1,8 @@
"""
Analyzer modules for codebase auditing.
Each analyzer implements an analyze(codebase_path, metadata) function
that returns a list of findings.
"""
__version__ = '1.0.0'


@@ -0,0 +1,411 @@
"""
Code Quality Analyzer
Analyzes code for:
- Cyclomatic complexity
- Code duplication
- Code smells
- File/function length
- Language-specific issues (TypeScript/JavaScript)
"""
import re
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze codebase for code quality issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata from discovery phase
Returns:
List of findings with severity, location, and remediation info
"""
findings = []
# Determine which languages to analyze
tech_stack = metadata.get('tech_stack', {})
if tech_stack.get('javascript') or tech_stack.get('typescript'):
findings.extend(analyze_javascript_typescript(codebase_path))
if tech_stack.get('python'):
findings.extend(analyze_python(codebase_path))
# General analysis (language-agnostic)
findings.extend(analyze_file_sizes(codebase_path))
findings.extend(analyze_dead_code(codebase_path, tech_stack))
return findings
def analyze_javascript_typescript(codebase_path: Path) -> List[Dict]:
"""Analyze JavaScript/TypeScript specific quality issues."""
findings = []
extensions = {'.js', '.jsx', '.ts', '.tsx'}
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '.next', 'coverage'}
for file_path in codebase_path.rglob('*'):
if (file_path.suffix in extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
# Check for TypeScript 'any' type
if file_path.suffix in {'.ts', '.tsx'}:
findings.extend(check_any_usage(file_path, content, lines))
# Check for 'var' keyword
findings.extend(check_var_usage(file_path, content, lines))
# Check for console.log statements
findings.extend(check_console_log(file_path, content, lines))
# Check for loose equality
findings.extend(check_loose_equality(file_path, content, lines))
# Check cyclomatic complexity (simplified)
findings.extend(check_complexity(file_path, content, lines))
# Check function length
findings.extend(check_function_length(file_path, content, lines))
            except Exception:
                # Skip files that can't be read
                pass
return findings
def check_any_usage(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for TypeScript 'any' type usage."""
findings = []
# Pattern to match 'any' type (excluding comments)
any_pattern = re.compile(r':\s*any\b|<any>|Array<any>|\bany\[\]')
for line_num, line in enumerate(lines, start=1):
# Skip comments
if line.strip().startswith('//') or line.strip().startswith('/*') or line.strip().startswith('*'):
continue
if any_pattern.search(line):
findings.append({
'severity': 'medium',
'category': 'code_quality',
'subcategory': 'typescript_strict_mode',
'title': "Use of 'any' type violates TypeScript strict mode",
'description': f"Found 'any' type on line {line_num}",
                'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Reduces type safety and defeats the purpose of TypeScript',
'remediation': 'Replace "any" with specific types or use "unknown" with type guards',
'effort': 'low',
})
return findings
def check_var_usage(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for 'var' keyword usage."""
findings = []
var_pattern = re.compile(r'\bvar\s+\w+')
for line_num, line in enumerate(lines, start=1):
if line.strip().startswith('//') or line.strip().startswith('/*'):
continue
if var_pattern.search(line):
findings.append({
'severity': 'low',
'category': 'code_quality',
'subcategory': 'modern_javascript',
'title': "Use of 'var' keyword is deprecated",
'description': f"Found 'var' keyword on line {line_num}",
                'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Function-scoped variables can lead to bugs; block-scoped (let/const) is preferred',
'remediation': "Replace 'var' with 'const' (for values that don't change) or 'let' (for values that change)",
'effort': 'low',
})
return findings
def check_console_log(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for console.log statements in production code."""
findings = []
# Skip if it's in a test file
if 'test' in file_path.name or 'spec' in file_path.name or '__tests__' in str(file_path):
return findings
console_pattern = re.compile(r'\bconsole\.(log|debug|info|warn|error)\(')
for line_num, line in enumerate(lines, start=1):
if line.strip().startswith('//'):
continue
if console_pattern.search(line):
findings.append({
'severity': 'medium',
'category': 'code_quality',
'subcategory': 'production_code',
'title': 'Console statement in production code',
'description': f"Found console statement on line {line_num}",
                'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Console statements should not be in production code; use proper logging',
'remediation': 'Remove console statement or replace with proper logging framework',
'effort': 'low',
})
return findings
def check_loose_equality(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for loose equality operators (== instead of ===)."""
findings = []
    # Exclude '=' before '==' / '!=' so the strict operators '===' and '!==' don't match
    loose_eq_pattern = re.compile(r'[^=!<>]==[^=]|[^=!<>]!=[^=]')
for line_num, line in enumerate(lines, start=1):
if line.strip().startswith('//') or line.strip().startswith('/*'):
continue
if loose_eq_pattern.search(line):
findings.append({
'severity': 'low',
'category': 'code_quality',
'subcategory': 'code_smell',
'title': 'Loose equality operator used',
'description': f"Found '==' or '!=' on line {line_num}, should use '===' or '!=='",
                'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Loose equality can lead to unexpected type coercion bugs',
'remediation': "Replace '==' with '===' and '!=' with '!=='",
'effort': 'low',
})
return findings
def check_complexity(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""
Check cyclomatic complexity (simplified).
Counts decision points: if, else, while, for, case, catch, &&, ||, ?
"""
findings = []
# Find function declarations
    # Skip control-flow keywords that would otherwise match as `name(...) {`
    func_pattern = re.compile(
        r'(function\s+\w+'
        r'|const\s+\w+\s*=\s*\([^)]*\)\s*=>'
        r'|\b(?!if\b|for\b|while\b|switch\b|catch\b|return\b)\w+\s*\([^)]*\)\s*\{)'
    )
current_function = None
current_function_line = 0
brace_depth = 0
complexity = 0
for line_num, line in enumerate(lines, start=1):
stripped = line.strip()
# Track braces to find function boundaries
brace_depth += stripped.count('{') - stripped.count('}')
# New function started
if func_pattern.search(line) and brace_depth >= 1:
# Save previous function if exists
if current_function and complexity > 10:
severity = 'critical' if complexity > 20 else 'high' if complexity > 15 else 'medium'
findings.append({
'severity': severity,
'category': 'code_quality',
'subcategory': 'complexity',
'title': f'High cyclomatic complexity ({complexity})',
'description': f'Function has complexity of {complexity}',
                    'file': str(file_path),
'line': current_function_line,
'code_snippet': current_function,
'impact': 'High complexity makes code difficult to understand, test, and maintain',
'remediation': 'Refactor into smaller functions, extract complex conditions',
'effort': 'medium' if complexity < 20 else 'high',
})
# Start new function
current_function = stripped
current_function_line = line_num
complexity = 1 # Base complexity
# Count complexity contributors
if current_function:
complexity += stripped.count('if ')
complexity += stripped.count('else if')
complexity += stripped.count('while ')
complexity += stripped.count('for ')
complexity += stripped.count('case ')
complexity += stripped.count('catch ')
complexity += stripped.count('&&')
complexity += stripped.count('||')
complexity += stripped.count('?')
    # Flush the last function in the file; the loop above only reports a
    # function when the next one begins
    if current_function and complexity > 10:
        severity = 'critical' if complexity > 20 else 'high' if complexity > 15 else 'medium'
        findings.append({
            'severity': severity,
            'category': 'code_quality',
            'subcategory': 'complexity',
            'title': f'High cyclomatic complexity ({complexity})',
            'description': f'Function has complexity of {complexity}',
            'file': str(file_path),
            'line': current_function_line,
            'code_snippet': current_function,
            'impact': 'High complexity makes code difficult to understand, test, and maintain',
            'remediation': 'Refactor into smaller functions, extract complex conditions',
            'effort': 'medium' if complexity < 20 else 'high',
        })
    return findings
def check_function_length(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for overly long functions."""
findings = []
    # Skip control-flow keywords that would otherwise match as `name(...) {`
    func_pattern = re.compile(
        r'(function\s+\w+'
        r'|const\s+\w+\s*=\s*\([^)]*\)\s*=>'
        r'|\b(?!if\b|for\b|while\b|switch\b|catch\b|return\b)\w+\s*\([^)]*\)\s*\{)'
    )
current_function = None
current_function_line = 0
function_lines = 0
brace_depth = 0
for line_num, line in enumerate(lines, start=1):
stripped = line.strip()
if func_pattern.search(line):
# Check previous function
if current_function and function_lines > 50:
severity = 'high' if function_lines > 100 else 'medium'
findings.append({
'severity': severity,
'category': 'code_quality',
'subcategory': 'function_length',
'title': f'Long function ({function_lines} lines)',
'description': f'Function is {function_lines} lines long (recommended: < 50)',
                    'file': str(file_path),
'line': current_function_line,
'code_snippet': current_function,
'impact': 'Long functions are harder to understand, test, and maintain',
'remediation': 'Extract smaller functions for distinct responsibilities',
'effort': 'medium',
})
current_function = stripped
current_function_line = line_num
function_lines = 0
brace_depth = 0
if current_function:
function_lines += 1
brace_depth += stripped.count('{') - stripped.count('}')
            if brace_depth == 0 and function_lines > 1:
                # Function ended: report it here, since the check at the top of
                # the loop never fires once current_function has been reset
                if function_lines > 50:
                    severity = 'high' if function_lines > 100 else 'medium'
                    findings.append({
                        'severity': severity,
                        'category': 'code_quality',
                        'subcategory': 'function_length',
                        'title': f'Long function ({function_lines} lines)',
                        'description': f'Function is {function_lines} lines long (recommended: < 50)',
                        'file': str(file_path),
                        'line': current_function_line,
                        'code_snippet': current_function,
                        'impact': 'Long functions are harder to understand, test, and maintain',
                        'remediation': 'Extract smaller functions for distinct responsibilities',
                        'effort': 'medium',
                    })
                current_function = None
return findings
def analyze_python(codebase_path: Path) -> List[Dict]:
"""Analyze Python-specific quality issues."""
findings = []
# Python analysis to be implemented
# Would check: PEP 8 violations, complexity, type hints, etc.
return findings
def analyze_file_sizes(codebase_path: Path) -> List[Dict]:
"""Check for overly large files."""
findings = []
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '__pycache__'}
code_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py', '.java', '.go', '.rs'}
for file_path in codebase_path.rglob('*'):
if (file_path.is_file() and
file_path.suffix in code_extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
lines = len(f.readlines())
if lines > 500:
severity = 'high' if lines > 1000 else 'medium'
findings.append({
'severity': severity,
'category': 'code_quality',
'subcategory': 'file_length',
'title': f'Large file ({lines} lines)',
'description': f'File has {lines} lines (recommended: < 500)',
                        'file': str(file_path.relative_to(codebase_path)),
'line': 1,
'code_snippet': None,
'impact': 'Large files are difficult to navigate and understand',
'remediation': 'Split into multiple smaller, focused modules',
'effort': 'high',
})
            except Exception:
pass
return findings
def analyze_dead_code(codebase_path: Path, tech_stack: Dict) -> List[Dict]:
"""Detect potential dead code (commented-out code blocks)."""
findings = []
exclude_dirs = {'node_modules', '.git', 'dist', 'build'}
extensions = set()
if tech_stack.get('javascript') or tech_stack.get('typescript'):
extensions.update({'.js', '.jsx', '.ts', '.tsx'})
if tech_stack.get('python'):
extensions.add('.py')
for file_path in codebase_path.rglob('*'):
if (file_path.suffix in extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
lines = f.readlines()
# Count consecutive commented lines with code-like content
comment_block_size = 0
block_start_line = 0
for line_num, line in enumerate(lines, start=1):
stripped = line.strip()
# Check if line is commented code
if (stripped.startswith('//') and
any(keyword in stripped for keyword in ['function', 'const', 'let', 'var', 'if', 'for', 'while', '{', '}', ';'])):
if comment_block_size == 0:
block_start_line = line_num
comment_block_size += 1
else:
# End of comment block
if comment_block_size >= 5: # 5+ lines of commented code
findings.append({
'severity': 'low',
'category': 'code_quality',
'subcategory': 'dead_code',
'title': f'Commented-out code block ({comment_block_size} lines)',
'description': f'Found {comment_block_size} lines of commented code',
                                'file': str(file_path.relative_to(codebase_path)),
'line': block_start_line,
'code_snippet': None,
'impact': 'Commented code clutters codebase and reduces readability',
'remediation': 'Remove commented code (it\'s in version control if needed)',
'effort': 'low',
})
comment_block_size = 0
            except Exception:
pass
return findings


@@ -0,0 +1,31 @@
"""
Dependencies Analyzer
Analyzes:
- Outdated dependencies
- Vulnerable dependencies
- License compliance
- Dependency health
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze dependencies for issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of dependency-related findings
"""
findings = []
# Placeholder implementation
# In production, this would integrate with npm audit, pip-audit, etc.
return findings


@@ -0,0 +1,30 @@
"""
Performance Analyzer
Analyzes:
- Bundle sizes
- Build times
- Runtime performance indicators
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze performance issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of performance-related findings
"""
findings = []
# Placeholder implementation
# In production, this would analyze bundle sizes, check build configs, etc.
return findings


@@ -0,0 +1,235 @@
"""
Security Scanner
Analyzes codebase for:
- Secrets in code (API keys, tokens, passwords)
- Dependency vulnerabilities
- Common security anti-patterns
- OWASP Top 10 issues
"""
import re
import json
from pathlib import Path
from typing import Dict, List
# Common patterns for secrets
SECRET_PATTERNS = {
'api_key': re.compile(r'(api[_-]?key|apikey)\s*[=:]\s*["\']([a-zA-Z0-9_-]{20,})["\']', re.IGNORECASE),
'aws_key': re.compile(r'AKIA[0-9A-Z]{16}'),
'generic_secret': re.compile(r'(secret|password|passwd|pwd)\s*[=:]\s*["\']([^"\'\s]{8,})["\']', re.IGNORECASE),
'private_key': re.compile(r'-----BEGIN (RSA |)PRIVATE KEY-----'),
'jwt': re.compile(r'eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+'),
'github_token': re.compile(r'gh[pousr]_[A-Za-z0-9_]{36}'),
'slack_token': re.compile(r'xox[baprs]-[0-9]{10,12}-[0-9]{10,12}-[a-zA-Z0-9]{24,32}'),
}
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze codebase for security issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata from discovery phase
Returns:
List of security findings
"""
findings = []
# Scan for secrets
findings.extend(scan_for_secrets(codebase_path))
# Scan dependencies for vulnerabilities
if metadata.get('tech_stack', {}).get('javascript'):
findings.extend(scan_npm_dependencies(codebase_path))
# Check for common security anti-patterns
findings.extend(scan_security_antipatterns(codebase_path, metadata))
return findings
def scan_for_secrets(codebase_path: Path) -> List[Dict]:
"""Scan for hardcoded secrets in code."""
findings = []
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '__pycache__', '.venv', 'venv'}
exclude_files = {'.env.example', 'package-lock.json', 'yarn.lock'}
# File extensions to scan
code_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py', '.java', '.go', '.rb', '.php', '.yml', '.yaml', '.json', '.env'}
for file_path in codebase_path.rglob('*'):
        if (file_path.is_file() and
                # Path('.env').suffix is '', so dotfiles need an explicit name check
                (file_path.suffix in code_extensions or file_path.name == '.env') and
                file_path.name not in exclude_files and
                not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
for pattern_name, pattern in SECRET_PATTERNS.items():
matches = pattern.finditer(content)
for match in matches:
# Find line number
line_num = content[:match.start()].count('\n') + 1
# Skip if it's clearly a placeholder or example
matched_text = match.group(0)
if is_placeholder(matched_text):
continue
findings.append({
'severity': 'critical',
'category': 'security',
'subcategory': 'secrets',
'title': f'Potential {pattern_name.replace("_", " ")} found in code',
'description': f'Found potential secret on line {line_num}',
'file': str(file_path.relative_to(codebase_path)),
'line': line_num,
'code_snippet': lines[line_num - 1].strip() if line_num <= len(lines) else '',
'impact': 'Exposed secrets can lead to unauthorized access and data breaches',
'remediation': 'Remove secret from code and use environment variables or secret management tools',
'effort': 'low',
})
            except Exception:
pass
return findings
def is_placeholder(text: str) -> bool:
"""Check if a potential secret is actually a placeholder."""
placeholders = [
'your_api_key', 'your_secret', 'example', 'placeholder', 'test',
'dummy', 'sample', 'xxx', '000', 'abc123', 'changeme', 'replace_me',
'my_api_key', 'your_key_here', 'insert_key_here'
]
text_lower = text.lower()
return any(placeholder in text_lower for placeholder in placeholders)
def scan_npm_dependencies(codebase_path: Path) -> List[Dict]:
"""Scan npm dependencies for known vulnerabilities."""
findings = []
package_json = codebase_path / 'package.json'
if not package_json.exists():
return findings
try:
with open(package_json, 'r') as f:
pkg = json.load(f)
deps = {**pkg.get('dependencies', {}), **pkg.get('devDependencies', {})}
# Check for commonly vulnerable packages (simplified - in production use npm audit)
vulnerable_packages = {
'lodash': ('< 4.17.21', 'Prototype pollution vulnerability'),
'axios': ('< 0.21.1', 'SSRF vulnerability'),
'node-fetch': ('< 2.6.7', 'Information exposure vulnerability'),
}
for pkg_name, (vulnerable_version, description) in vulnerable_packages.items():
if pkg_name in deps:
findings.append({
'severity': 'high',
'category': 'security',
'subcategory': 'dependencies',
'title': f'Potentially vulnerable dependency: {pkg_name}',
'description': f'{description} (version: {deps[pkg_name]})',
'file': 'package.json',
'line': None,
'code_snippet': f'"{pkg_name}": "{deps[pkg_name]}"',
'impact': 'Vulnerable dependencies can be exploited by attackers',
'remediation': f'Update {pkg_name} to version {vulnerable_version.replace("< ", ">= ")} or later',
'effort': 'low',
})
    except Exception:
pass
return findings
def scan_security_antipatterns(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Scan for common security anti-patterns."""
findings = []
if metadata.get('tech_stack', {}).get('javascript') or metadata.get('tech_stack', {}).get('typescript'):
findings.extend(scan_js_security_issues(codebase_path))
return findings
def scan_js_security_issues(codebase_path: Path) -> List[Dict]:
"""Scan JavaScript/TypeScript for security anti-patterns."""
findings = []
extensions = {'.js', '.jsx', '.ts', '.tsx'}
exclude_dirs = {'node_modules', '.git', 'dist', 'build'}
# Dangerous patterns
patterns = {
'eval': (
re.compile(r'\beval\s*\('),
'Use of eval() is dangerous',
'eval() can execute arbitrary code and is a security risk',
'Refactor to avoid eval(), use safer alternatives like Function constructor with specific scope'
),
'dangerouslySetInnerHTML': (
re.compile(r'dangerouslySetInnerHTML'),
'Use of dangerouslySetInnerHTML without sanitization',
'Can lead to XSS attacks if not properly sanitized',
'Sanitize HTML content or use safer alternatives'
),
'innerHTML': (
re.compile(r'\.innerHTML\s*='),
'Direct assignment to innerHTML',
'Can lead to XSS attacks if content is not sanitized',
'Use textContent for text or sanitize HTML before assigning'
),
'document.write': (
re.compile(r'document\.write\s*\('),
'Use of document.write()',
'Can be exploited for XSS and causes page reflow',
'Use DOM manipulation methods instead'
),
}
for file_path in codebase_path.rglob('*'):
if (file_path.suffix in extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
for pattern_name, (pattern, title, impact, remediation) in patterns.items():
for line_num, line in enumerate(lines, start=1):
if pattern.search(line):
findings.append({
'severity': 'high',
'category': 'security',
'subcategory': 'code_security',
'title': title,
'description': f'Found on line {line_num}',
'file': str(file_path.relative_to(codebase_path)),
'line': line_num,
'code_snippet': line.strip(),
'impact': impact,
'remediation': remediation,
'effort': 'medium',
})
            except Exception:
pass
return findings


@@ -0,0 +1,76 @@
"""
Technical Debt Calculator
Calculates:
- SQALE rating (A-E)
- Remediation effort estimates
- Debt categorization
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Calculate technical debt metrics.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of technical debt findings
"""
findings = []
# Placeholder implementation
# In production, this would calculate SQALE rating based on all findings
return findings
def calculate_sqale_rating(all_findings: List[Dict], total_loc: int) -> str:
"""
Calculate SQALE rating (A-E) based on findings.
Args:
all_findings: All findings from all analyzers
total_loc: Total lines of code
Returns:
SQALE rating (A, B, C, D, or E)
"""
# Estimate remediation time in hours
severity_hours = {
'critical': 8,
'high': 4,
'medium': 2,
'low': 0.5
}
total_remediation_hours = sum(
severity_hours.get(finding.get('severity', 'low'), 0.5)
for finding in all_findings
)
# Estimate development time (1 hour per 50 LOC is conservative)
development_hours = total_loc / 50
# Calculate debt ratio
if development_hours == 0:
debt_ratio = 0
else:
debt_ratio = (total_remediation_hours / development_hours) * 100
# Assign SQALE rating
if debt_ratio <= 5:
return 'A'
elif debt_ratio <= 10:
return 'B'
elif debt_ratio <= 20:
return 'C'
elif debt_ratio <= 50:
return 'D'
else:
return 'E'


@@ -0,0 +1,184 @@
"""
Test Coverage Analyzer
Analyzes:
- Test coverage percentage
- Testing Trophy distribution
- Test quality
- Untested critical paths
"""
import json
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze test coverage and quality.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of testing-related findings
"""
findings = []
# Check for test files existence
test_stats = analyze_test_presence(codebase_path, metadata)
if test_stats:
findings.extend(test_stats)
# Analyze coverage if coverage reports exist
coverage_findings = analyze_coverage_reports(codebase_path, metadata)
if coverage_findings:
findings.extend(coverage_findings)
return findings
def analyze_test_presence(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Check for test file presence and basic test hygiene."""
findings = []
# Count test files
test_extensions = {'.test.js', '.test.ts', '.test.jsx', '.test.tsx', '.spec.js', '.spec.ts'}
test_dirs = {'__tests__', 'tests', 'test', 'spec'}
test_file_count = 0
source_file_count = 0
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '__pycache__'}
source_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py'}
for file_path in codebase_path.rglob('*'):
if file_path.is_file() and not any(excluded in file_path.parts for excluded in exclude_dirs):
# Check if it's a test file
is_test = (
any(file_path.name.endswith(ext) for ext in test_extensions) or
any(test_dir in file_path.parts for test_dir in test_dirs)
)
if is_test:
test_file_count += 1
elif file_path.suffix in source_extensions:
source_file_count += 1
# Calculate test ratio
if source_file_count > 0:
test_ratio = (test_file_count / source_file_count) * 100
if test_ratio < 20:
findings.append({
'severity': 'high',
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Low test file ratio ({test_ratio:.1f}%)',
'description': f'Only {test_file_count} test files for {source_file_count} source files',
'file': None,
'line': None,
'code_snippet': None,
'impact': 'Insufficient testing leads to bugs and difficult refactoring',
'remediation': 'Add tests for untested modules, aim for at least 80% coverage',
'effort': 'high',
})
elif test_ratio < 50:
findings.append({
'severity': 'medium',
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Moderate test file ratio ({test_ratio:.1f}%)',
'description': f'{test_file_count} test files for {source_file_count} source files',
'file': None,
'line': None,
'code_snippet': None,
'impact': 'More tests needed to achieve recommended 80% coverage',
'remediation': 'Continue adding tests, focus on critical paths first',
'effort': 'medium',
})
return findings
def analyze_coverage_reports(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze coverage reports if they exist."""
findings = []
# Look for coverage reports (Istanbul/c8 format)
coverage_files = [
codebase_path / 'coverage' / 'coverage-summary.json',
codebase_path / 'coverage' / 'coverage-final.json',
codebase_path / '.nyc_output' / 'coverage-summary.json',
]
for coverage_file in coverage_files:
if coverage_file.exists():
try:
with open(coverage_file, 'r') as f:
coverage_data = json.load(f)
# Extract total coverage
total = coverage_data.get('total', {})
line_coverage = total.get('lines', {}).get('pct', 0)
branch_coverage = total.get('branches', {}).get('pct', 0)
function_coverage = total.get('functions', {}).get('pct', 0)
statement_coverage = total.get('statements', {}).get('pct', 0)
# Check against 80% threshold
if line_coverage < 80:
severity = 'high' if line_coverage < 50 else 'medium'
findings.append({
'severity': severity,
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Line coverage below target ({line_coverage:.1f}%)',
'description': f'Current coverage is {line_coverage:.1f}%, target is 80%',
'file': 'coverage/coverage-summary.json',
'line': None,
'code_snippet': None,
'impact': 'Low coverage means untested code paths and higher bug risk',
'remediation': f'Add tests to increase coverage by {80 - line_coverage:.1f}%',
'effort': 'high',
})
if branch_coverage < 75:
findings.append({
'severity': 'medium',
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Branch coverage below target ({branch_coverage:.1f}%)',
'description': f'Current branch coverage is {branch_coverage:.1f}%, target is 75%',
'file': 'coverage/coverage-summary.json',
'line': None,
'code_snippet': None,
'impact': 'Untested branches can hide bugs in conditional logic',
'remediation': 'Add tests for edge cases and conditional branches',
'effort': 'medium',
})
break # Found coverage, don't check other files
except (json.JSONDecodeError, OSError):
pass  # unreadable or malformed coverage report; try the next candidate
# If no coverage report found
if not findings:
findings.append({
'severity': 'medium',
'category': 'testing',
'subcategory': 'test_infrastructure',
'title': 'No coverage report found',
'description': 'Could not find coverage-summary.json',
'file': None,
'line': None,
'code_snippet': None,
'impact': 'Cannot measure test effectiveness without coverage reports',
'remediation': 'Configure test runner to generate coverage reports (Jest: --coverage, Vitest: --coverage)',
'effort': 'low',
})
return findings


@@ -0,0 +1,408 @@
#!/usr/bin/env python3
"""
Codebase Audit Engine
Orchestrates comprehensive codebase analysis using multiple specialized analyzers.
Generates detailed audit reports and remediation plans based on modern SDLC best practices.
Usage:
python audit_engine.py /path/to/codebase --output report.md
python audit_engine.py /path/to/codebase --format json --output report.json
python audit_engine.py /path/to/codebase --scope security,quality
"""
import argparse
import json
import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional
import importlib.util
# Import analyzers dynamically to support progressive loading
ANALYZERS = {
'quality': 'analyzers.code_quality',
'testing': 'analyzers.test_coverage',
'security': 'analyzers.security_scan',
'dependencies': 'analyzers.dependencies',
'performance': 'analyzers.performance',
'technical_debt': 'analyzers.technical_debt',
}
class AuditEngine:
"""
Core audit engine that orchestrates codebase analysis.
Uses progressive disclosure: loads only necessary analyzers based on scope.
"""
def __init__(self, codebase_path: Path, scope: Optional[List[str]] = None):
"""
Initialize audit engine.
Args:
codebase_path: Path to the codebase to audit
scope: Optional list of analysis categories to run (e.g., ['security', 'quality'])
If None, runs all analyzers.
"""
self.codebase_path = Path(codebase_path).resolve()
self.scope = scope or list(ANALYZERS.keys())
self.findings: Dict[str, List[Dict]] = {}
self.metadata: Dict = {}
if not self.codebase_path.exists():
raise FileNotFoundError(f"Codebase path does not exist: {self.codebase_path}")
def discover_project(self) -> Dict:
"""
Phase 1: Initial project discovery (lightweight scan).
Returns:
Dictionary containing project metadata
"""
print("🔍 Phase 1: Discovering project structure...")
metadata = {
'path': str(self.codebase_path),
'scan_time': datetime.now().isoformat(),
'tech_stack': self._detect_tech_stack(),
'project_type': self._detect_project_type(),
'total_files': self._count_files(),
'total_lines': self._count_lines(),
'git_info': self._get_git_info(),
}
self.metadata = metadata
return metadata
def _detect_tech_stack(self) -> Dict[str, bool]:
"""Detect languages and frameworks used in the project."""
tech_stack = {
'javascript': (self.codebase_path / 'package.json').exists(),
'typescript': self._file_exists_with_extension('.ts') or self._file_exists_with_extension('.tsx'),
'python': (self.codebase_path / 'setup.py').exists() or
(self.codebase_path / 'pyproject.toml').exists() or
self._file_exists_with_extension('.py'),
'react': self._check_dependency('react'),
'vue': self._check_dependency('vue'),
'angular': self._check_dependency('@angular/core'),
'node': (self.codebase_path / 'package.json').exists(),
'docker': (self.codebase_path / 'Dockerfile').exists(),
}
return {k: v for k, v in tech_stack.items() if v}
def _detect_project_type(self) -> str:
"""Determine project type (web app, library, CLI, etc.)."""
if (self.codebase_path / 'package.json').exists():
try:
with open(self.codebase_path / 'package.json', 'r') as f:
pkg = json.load(f)
if pkg.get('private') is False:
return 'library'
if 'bin' in pkg:
return 'cli'
return 'web_app'
except (json.JSONDecodeError, OSError):
pass  # unreadable package.json; fall through to other checks
if (self.codebase_path / 'setup.py').exists():
return 'python_package'
return 'unknown'
def _count_files(self) -> int:
"""Count total files in codebase (excluding common ignore patterns)."""
exclude_dirs = {'.git', 'node_modules', '__pycache__', '.venv', 'venv', 'dist', 'build'}
count = 0
for path in self.codebase_path.rglob('*'):
if path.is_file() and not any(excluded in path.parts for excluded in exclude_dirs):
count += 1
return count
def _count_lines(self) -> int:
"""Count total lines of code (excluding empty lines and comments)."""
exclude_dirs = {'.git', 'node_modules', '__pycache__', '.venv', 'venv', 'dist', 'build'}
code_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py', '.java', '.go', '.rs', '.rb'}
total_lines = 0
for path in self.codebase_path.rglob('*'):
if (path.is_file() and
path.suffix in code_extensions and
not any(excluded in path.parts for excluded in exclude_dirs)):
try:
with open(path, 'r', encoding='utf-8', errors='ignore') as f:
total_lines += sum(1 for line in f if line.strip() and not line.strip().startswith(('//', '#', '/*', '*')))
except OSError:
pass  # skip files that cannot be read
return total_lines
def _get_git_info(self) -> Optional[Dict]:
"""Get git repository information."""
git_dir = self.codebase_path / '.git'
if not git_dir.exists():
return None
try:
import subprocess
result = subprocess.run(
['git', '-C', str(self.codebase_path), 'log', '--oneline', '-10'],
capture_output=True,
text=True,
timeout=5
)
commit_count = subprocess.run(
['git', '-C', str(self.codebase_path), 'rev-list', '--count', 'HEAD'],
capture_output=True,
text=True,
timeout=5
)
return {
'is_git_repo': True,
'recent_commits': result.stdout.strip().split('\n') if result.returncode == 0 else [],
'total_commits': int(commit_count.stdout.strip()) if commit_count.returncode == 0 else 0,
}
except Exception:  # subprocess, timeout, or parse failures all degrade gracefully
return {'is_git_repo': True, 'error': 'Could not read git info'}
def _file_exists_with_extension(self, extension: str) -> bool:
"""Check if any file with given extension exists."""
return any(self.codebase_path.rglob(f'*{extension}'))
def _check_dependency(self, dep_name: str) -> bool:
"""Check if a dependency exists in package.json."""
pkg_json = self.codebase_path / 'package.json'
if not pkg_json.exists():
return False
try:
with open(pkg_json, 'r') as f:
pkg = json.load(f)
deps = {**pkg.get('dependencies', {}), **pkg.get('devDependencies', {})}
return dep_name in deps
except (json.JSONDecodeError, OSError):
return False
def run_analysis(self, phase: str = 'full') -> Dict:
"""
Phase 2: Deep analysis using specialized analyzers.
Args:
phase: 'quick' for lightweight scan, 'full' for comprehensive analysis
Returns:
Dictionary containing all findings
"""
print(f"🔬 Phase 2: Running {phase} analysis...")
for category in self.scope:
if category not in ANALYZERS:
print(f"⚠️ Unknown analyzer category: {category}, skipping...")
continue
print(f" Analyzing {category}...")
analyzer_findings = self._run_analyzer(category)
if analyzer_findings:
self.findings[category] = analyzer_findings
return self.findings
def _run_analyzer(self, category: str) -> List[Dict]:
"""
Run a specific analyzer module.
Args:
category: Analyzer category name
Returns:
List of findings from the analyzer
"""
module_path = ANALYZERS.get(category)
if not module_path:
return []
try:
# Import analyzer module dynamically
analyzer_file = Path(__file__).parent / f"{module_path.replace('.', '/')}.py"
if not analyzer_file.exists():
print(f" ⚠️ Analyzer not yet implemented: {category}")
return []
spec = importlib.util.spec_from_file_location(module_path, analyzer_file)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
# Each analyzer should have an analyze() function
if hasattr(module, 'analyze'):
return module.analyze(self.codebase_path, self.metadata)
else:
print(f" ⚠️ Analyzer missing analyze() function: {category}")
return []
except Exception as e:
print(f" ❌ Error running analyzer {category}: {e}")
return []
def calculate_scores(self) -> Dict[str, float]:
"""
Calculate health scores for each category and overall.
Returns:
Dictionary of scores (0-100 scale)
"""
scores = {}
# Calculate score for each category based on findings severity
for category, findings in self.findings.items():
if not findings:
scores[category] = 100.0
continue
# Weighted scoring based on severity
severity_weights = {'critical': 10, 'high': 5, 'medium': 2, 'low': 1}
total_weight = sum(severity_weights.get(f.get('severity', 'low'), 1) for f in findings)
# Score decreases by the severity-weighted issue total,
# capped at 100 so the category score never goes below zero
penalty = min(total_weight, 100)
scores[category] = max(0, 100 - penalty)
# Overall score is weighted average
if scores:
scores['overall'] = sum(scores.values()) / len(scores)
else:
scores['overall'] = 100.0
return scores
def generate_summary(self) -> Dict:
"""
Generate executive summary of audit results.
Returns:
Summary dictionary
"""
critical_count = sum(
1 for findings in self.findings.values()
for f in findings
if f.get('severity') == 'critical'
)
high_count = sum(
1 for findings in self.findings.values()
for f in findings
if f.get('severity') == 'high'
)
scores = self.calculate_scores()
return {
'overall_score': round(scores.get('overall', 0), 1),
'category_scores': {k: round(v, 1) for k, v in scores.items() if k != 'overall'},
'critical_issues': critical_count,
'high_issues': high_count,
'total_issues': sum(len(findings) for findings in self.findings.values()),
'metadata': self.metadata,
}
def main():
"""Main entry point for CLI usage."""
parser = argparse.ArgumentParser(
description='Comprehensive codebase auditor based on modern SDLC best practices (2024-25)',
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
'codebase',
type=str,
help='Path to the codebase to audit'
)
parser.add_argument(
'--scope',
type=str,
help='Comma-separated list of analysis categories (quality,testing,security,dependencies,performance,technical_debt)',
default=None
)
parser.add_argument(
'--phase',
type=str,
choices=['quick', 'full'],
default='full',
help='Analysis depth: quick (Phase 1 only) or full (Phase 1 + 2)'
)
parser.add_argument(
'--format',
type=str,
choices=['markdown', 'json', 'html'],
default='markdown',
help='Output format for the report'
)
parser.add_argument(
'--output',
type=str,
help='Output file path (default: stdout)',
default=None
)
args = parser.parse_args()
# Parse scope
scope = args.scope.split(',') if args.scope else None
# Initialize engine
try:
engine = AuditEngine(args.codebase, scope=scope)
except FileNotFoundError as e:
print(f"❌ Error: {e}", file=sys.stderr)
sys.exit(1)
# Run audit
print("🚀 Starting codebase audit...")
print(f" Codebase: {args.codebase}")
print(f" Scope: {scope or 'all'}")
print(f" Phase: {args.phase}")
print()
# Phase 1: Discovery
metadata = engine.discover_project()
print(f" Detected: {', '.join(metadata['tech_stack'].keys())}")
print(f" Files: {metadata['total_files']}")
print(f" Lines of code: {metadata['total_lines']:,}")
print()
# Phase 2: Analysis (if not quick mode)
if args.phase == 'full':
findings = engine.run_analysis()
# Generate summary
summary = engine.generate_summary()
# Output results
print()
print("📊 Audit complete!")
print(f" Overall score: {summary['overall_score']}/100")
print(f" Critical issues: {summary['critical_issues']}")
print(f" High issues: {summary['high_issues']}")
print(f" Total issues: {summary['total_issues']}")
print()
# Generate report (to be implemented in report_generator.py)
if args.output:
print("📝 Report writing is not wired up here yet (see report_generator.py)")
print(f" Format: {args.format}")
print(f" Output: {args.output}")
if __name__ == '__main__':
main()


@@ -0,0 +1,241 @@
#!/usr/bin/env python3
"""
Remediation Planner
Generates prioritized action plans based on audit findings.
Uses severity, impact, frequency, and effort to prioritize issues.
"""
from typing import Dict, List
from datetime import datetime, timedelta
def generate_remediation_plan(findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate a prioritized remediation plan.
Args:
findings: All findings organized by category
metadata: Project metadata
Returns:
Markdown-formatted remediation plan
"""
plan = []
# Header
plan.append("# Codebase Remediation Plan")
plan.append(f"\n**Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
plan.append(f"**Codebase**: `{metadata.get('path', 'Unknown')}`")
plan.append("\n---\n")
# Flatten and prioritize all findings
all_findings = []
for category, category_findings in findings.items():
for finding in category_findings:
finding['category'] = category
all_findings.append(finding)
# Calculate priority scores
for finding in all_findings:
finding['priority_score'] = calculate_priority_score(finding)
# Sort by priority score (highest first)
all_findings.sort(key=lambda x: x['priority_score'], reverse=True)
# Group by priority level
p0_issues = [f for f in all_findings if f['severity'] == 'critical']
p1_issues = [f for f in all_findings if f['severity'] == 'high']
p2_issues = [f for f in all_findings if f['severity'] == 'medium']
p3_issues = [f for f in all_findings if f['severity'] == 'low']
# Priority 0: Critical Issues (Fix Immediately)
if p0_issues:
plan.append("## Priority 0: Critical Issues (Fix Immediately ⚡)")
plan.append("\n**Timeline**: Within 24 hours")
plan.append("**Impact**: Security vulnerabilities, production-breaking bugs, data loss risks\n")
for i, finding in enumerate(p0_issues, 1):
plan.append(f"### {i}. {finding.get('title', 'Untitled')}")
plan.append(f"**Category**: {finding.get('category', 'Unknown').replace('_', ' ').title()}")
plan.append(f"**Location**: `{finding.get('file', 'Unknown')}`")
plan.append(f"**Effort**: {finding.get('effort', 'unknown').upper()}")
plan.append(f"\n**Issue**: {finding.get('description', 'No description')}")
plan.append(f"\n**Impact**: {finding.get('impact', 'Unknown impact')}")
plan.append(f"\n**Action**: {finding.get('remediation', 'No remediation suggested')}\n")
plan.append("---\n")
# Priority 1: High Issues (Fix This Sprint)
if p1_issues:
plan.append("## Priority 1: High Issues (Fix This Sprint 📅)")
plan.append("\n**Timeline**: Within current sprint (2 weeks)")
plan.append("**Impact**: Significant quality, security, or user experience issues\n")
for i, finding in enumerate(p1_issues[:10], 1): # Top 10
plan.append(f"### {i}. {finding.get('title', 'Untitled')}")
plan.append(f"**Category**: {finding.get('category', 'Unknown').replace('_', ' ').title()}")
plan.append(f"**Effort**: {finding.get('effort', 'unknown').upper()}")
plan.append(f"\n**Action**: {finding.get('remediation', 'No remediation suggested')}\n")
if len(p1_issues) > 10:
plan.append(f"\n*...and {len(p1_issues) - 10} more high-priority issues*\n")
plan.append("---\n")
# Priority 2: Medium Issues (Fix Next Quarter)
if p2_issues:
plan.append("## Priority 2: Medium Issues (Fix Next Quarter 📆)")
plan.append("\n**Timeline**: Within 3 months")
plan.append("**Impact**: Code maintainability, developer productivity\n")
plan.append(f"**Total Issues**: {len(p2_issues)}\n")
# Group by subcategory
subcategories = {}
for finding in p2_issues:
subcat = finding.get('subcategory', 'Other')
if subcat not in subcategories:
subcategories[subcat] = []
subcategories[subcat].append(finding)
plan.append("**Grouped by Type**:\n")
for subcat, subcat_findings in subcategories.items():
plan.append(f"- {subcat.replace('_', ' ').title()}: {len(subcat_findings)} issues")
plan.append("\n---\n")
# Priority 3: Low Issues (Backlog)
if p3_issues:
plan.append("## Priority 3: Low Issues (Backlog 📋)")
plan.append("\n**Timeline**: When time permits")
plan.append("**Impact**: Minor improvements, stylistic issues\n")
plan.append(f"**Total Issues**: {len(p3_issues)}\n")
plan.append("*Address during dedicated tech debt sprints or slow periods*\n")
plan.append("---\n")
# Implementation Timeline
plan.append("## Suggested Timeline\n")
today = datetime.now()
if p0_issues:
deadline = today + timedelta(days=1)
plan.append(f"- **{deadline.strftime('%Y-%m-%d')}**: All P0 issues resolved")
if p1_issues:
deadline = today + timedelta(weeks=2)
plan.append(f"- **{deadline.strftime('%Y-%m-%d')}**: P1 issues addressed (end of sprint)")
if p2_issues:
deadline = today + timedelta(weeks=12)
plan.append(f"- **{deadline.strftime('%Y-%m-%d')}**: P2 issues resolved (end of quarter)")
# Effort Summary
plan.append("\n## Effort Summary\n")
effort_estimates = calculate_effort_summary(all_findings)
plan.append(f"**Total Estimated Effort**: {effort_estimates['total']} person-days")
plan.append(f"- Critical/High: {effort_estimates['critical_high']} days")
plan.append(f"- Medium: {effort_estimates['medium']} days")
plan.append(f"- Low: {effort_estimates['low']} days")
# Team Assignment Suggestions
plan.append("\n## Team Assignment Suggestions\n")
plan.append("- **Security Team**: All P0 security issues, P1 vulnerabilities")
plan.append("- **QA/Testing**: Test coverage improvements, test quality issues")
plan.append("- **Infrastructure**: CI/CD improvements, build performance")
plan.append("- **Development Team**: Code quality refactoring, complexity reduction")
# Footer
plan.append("\n---\n")
plan.append("*Remediation plan generated by Codebase Auditor Skill*")
plan.append("\n*Priority scoring based on: Impact × 10 + Frequency × 5 - Effort × 2*")
return '\n'.join(plan)
def calculate_priority_score(finding: Dict) -> int:
"""
Calculate priority score for a finding.
Formula: (Impact × 10) + (Frequency × 5) - (Effort × 2)
Args:
finding: Individual finding
Returns:
Priority score (higher = more urgent)
"""
# Map severity to impact (1-10)
severity_impact = {
'critical': 10,
'high': 7,
'medium': 4,
'low': 2,
}
impact = severity_impact.get(finding.get('severity', 'low'), 1)
# Estimate frequency (1-10) based on category
# Security/testing issues affect everything
category = finding.get('category', '')
if category in ['security', 'testing']:
frequency = 10
elif category in ['quality', 'performance']:
frequency = 6
else:
frequency = 3
# Map effort to numeric value (1-10)
effort_values = {
'low': 2,
'medium': 5,
'high': 8,
}
effort = effort_values.get(finding.get('effort', 'medium'), 5)
# Calculate score
score = (impact * 10) + (frequency * 5) - (effort * 2)
return max(0, score) # Never negative
def calculate_effort_summary(findings: List[Dict]) -> Dict[str, int]:
"""
Calculate total effort estimates.
Args:
findings: All findings
Returns:
Dictionary with effort estimates in person-days
"""
# Map effort levels to days
effort_days = {
'low': 0.5,
'medium': 2,
'high': 5,
}
critical_high_days = sum(
effort_days.get(f.get('effort', 'medium'), 2)
for f in findings
if f.get('severity') in ['critical', 'high']
)
medium_days = sum(
effort_days.get(f.get('effort', 'medium'), 2)
for f in findings
if f.get('severity') == 'medium'
)
low_days = sum(
effort_days.get(f.get('effort', 'medium'), 2)
for f in findings
if f.get('severity') == 'low'
)
return {
'critical_high': round(critical_high_days, 1),
'medium': round(medium_days, 1),
'low': round(low_days, 1),
'total': round(critical_high_days + medium_days + low_days, 1),
}


@@ -0,0 +1,345 @@
#!/usr/bin/env python3
"""
Report Generator
Generates audit reports in multiple formats:
- Markdown (default, human-readable)
- JSON (machine-readable, CI/CD integration)
- HTML (interactive dashboard)
"""
import json
from datetime import datetime
from pathlib import Path
from typing import Dict, List
def generate_markdown_report(summary: Dict, findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate a Markdown-formatted audit report.
Args:
summary: Executive summary data
findings: All findings organized by category
metadata: Project metadata
Returns:
Markdown report as string
"""
report = []
# Header
report.append("# Codebase Audit Report")
report.append(f"\n**Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
report.append(f"**Codebase**: `{metadata.get('path', 'Unknown')}`")
report.append(f"**Tech Stack**: {', '.join(metadata.get('tech_stack', {}).keys())}")
report.append(f"**Total Files**: {metadata.get('total_files', 0):,}")
report.append(f"**Lines of Code**: {metadata.get('total_lines', 0):,}")
report.append("\n---\n")
# Executive Summary
report.append("## Executive Summary")
report.append(f"\n### Overall Health Score: **{summary.get('overall_score', 0)}/100**\n")
# Score breakdown
report.append("#### Category Scores\n")
for category, score in summary.get('category_scores', {}).items():
emoji = score_to_emoji(score)
report.append(f"- **{category.replace('_', ' ').title()}**: {score}/100 {emoji}")
# Issue summary
report.append("\n#### Issue Summary\n")
report.append(f"- **Critical Issues**: {summary.get('critical_issues', 0)}")
report.append(f"- **High Issues**: {summary.get('high_issues', 0)}")
report.append(f"- **Total Issues**: {summary.get('total_issues', 0)}")
report.append("\n---\n")
# Detailed Findings
report.append("## Detailed Findings\n")
severity_order = ['critical', 'high', 'medium', 'low']
for severity in severity_order:
severity_findings = []
for category, category_findings in findings.items():
for finding in category_findings:
if finding.get('severity') == severity:
severity_findings.append((category, finding))
if severity_findings:
severity_emoji = severity_to_emoji(severity)
report.append(f"### {severity_emoji} {severity.upper()} ({len(severity_findings)} issues)\n")
for category, finding in severity_findings:
report.append(f"#### {finding.get('title', 'Untitled Issue')}")
report.append(f"\n**Category**: {category.replace('_', ' ').title()}")
report.append(f"**Subcategory**: {finding.get('subcategory', 'N/A')}")
if finding.get('file'):
file_ref = f"{finding['file']}"
if finding.get('line'):
file_ref += f":{finding['line']}"
report.append(f"**Location**: `{file_ref}`")
report.append(f"\n{finding.get('description', 'No description')}")
if finding.get('code_snippet'):
report.append(f"\n```\n{finding['code_snippet']}\n```")
report.append(f"\n**Impact**: {finding.get('impact', 'Unknown impact')}")
report.append(f"\n**Remediation**: {finding.get('remediation', 'No remediation suggested')}")
report.append(f"\n**Effort**: {finding.get('effort', 'Unknown').upper()}\n")
report.append("---\n")
# Recommendations
report.append("## Recommendations\n")
report.append(generate_recommendations(summary, findings))
# Footer
report.append("\n---\n")
report.append("*Report generated by Codebase Auditor Skill (2024-25 Standards)*")
return '\n'.join(report)
def generate_json_report(summary: Dict, findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate a JSON-formatted audit report.
Args:
summary: Executive summary data
findings: All findings organized by category
metadata: Project metadata
Returns:
JSON report as string
"""
report = {
'generated_at': datetime.now().isoformat(),
'metadata': metadata,
'summary': summary,
'findings': findings,
'schema_version': '1.0.0',
}
return json.dumps(report, indent=2)
def generate_html_report(summary: Dict, findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate an HTML dashboard report.
Args:
summary: Executive summary data
findings: All findings organized by category
metadata: Project metadata
Returns:
HTML report as string
"""
# Simplified HTML template
html = f"""<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Codebase Audit Report</title>
<style>
body {{
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
line-height: 1.6;
max-width: 1200px;
margin: 0 auto;
padding: 20px;
background: #f5f5f5;
}}
.header {{
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 30px;
border-radius: 10px;
margin-bottom: 20px;
}}
.score {{
font-size: 48px;
font-weight: bold;
margin: 20px 0;
}}
.metrics {{
display: grid;
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
gap: 20px;
margin: 20px 0;
}}
.metric {{
background: white;
padding: 20px;
border-radius: 8px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}}
.metric-title {{
font-size: 14px;
color: #666;
text-transform: uppercase;
}}
.metric-value {{
font-size: 32px;
font-weight: bold;
margin: 10px 0;
}}
.finding {{
background: white;
padding: 20px;
margin: 10px 0;
border-radius: 8px;
border-left: 4px solid #ddd;
}}
.finding.critical {{ border-left-color: #e53e3e; }}
.finding.high {{ border-left-color: #dd6b20; }}
.finding.medium {{ border-left-color: #d69e2e; }}
.finding.low {{ border-left-color: #38a169; }}
.badge {{
display: inline-block;
padding: 4px 12px;
border-radius: 12px;
font-size: 12px;
font-weight: bold;
text-transform: uppercase;
}}
.badge.critical {{ background: #fed7d7; color: #742a2a; }}
.badge.high {{ background: #feebc8; color: #7c2d12; }}
.badge.medium {{ background: #fefcbf; color: #744210; }}
.badge.low {{ background: #c6f6d5; color: #22543d; }}
code {{
background: #f7fafc;
padding: 2px 6px;
border-radius: 3px;
font-family: 'Courier New', monospace;
}}
pre {{
background: #2d3748;
color: #e2e8f0;
padding: 15px;
border-radius: 5px;
overflow-x: auto;
}}
</style>
</head>
<body>
<div class="header">
<h1>🔍 Codebase Audit Report</h1>
<p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>
<p><strong>Codebase:</strong> {metadata.get('path', 'Unknown')}</p>
<div class="score">Overall Score: {summary.get('overall_score', 0)}/100</div>
</div>
<div class="metrics">
<div class="metric">
<div class="metric-title">Critical Issues</div>
<div class="metric-value" style="color: #e53e3e;">{summary.get('critical_issues', 0)}</div>
</div>
<div class="metric">
<div class="metric-title">High Issues</div>
<div class="metric-value" style="color: #dd6b20;">{summary.get('high_issues', 0)}</div>
</div>
<div class="metric">
<div class="metric-title">Total Issues</div>
<div class="metric-value">{summary.get('total_issues', 0)}</div>
</div>
<div class="metric">
<div class="metric-title">Lines of Code</div>
<div class="metric-value">{metadata.get('total_lines', 0):,}</div>
</div>
</div>
<h2>Findings</h2>
"""
# Add findings
severity_order = ['critical', 'high', 'medium', 'low']
for severity in severity_order:
for category, category_findings in findings.items():
for finding in category_findings:
if finding.get('severity') == severity:
html += f"""
<div class="finding {severity}">
<div>
<span class="badge {severity}">{severity}</span>
<strong>{finding.get('title', 'Untitled')}</strong>
</div>
<p>{finding.get('description', 'No description')}</p>
"""
if finding.get('file'):
html += f"<p><strong>Location:</strong> <code>{finding['file']}"
if finding.get('line'):
html += f":{finding['line']}"
html += "</code></p>"
if finding.get('code_snippet'):
html += f"<pre><code>{finding['code_snippet']}</code></pre>"
html += f"""
<p><strong>Impact:</strong> {finding.get('impact', 'Unknown')}</p>
<p><strong>Remediation:</strong> {finding.get('remediation', 'No suggestion')}</p>
</div>
"""
html += """
</body>
</html>
"""
return html
def score_to_emoji(score: float) -> str:
"""Convert score to emoji."""
if score >= 90:
return "✅"
elif score >= 70:
return "⚠️"
else:
return "❌"
def severity_to_emoji(severity: str) -> str:
"""Convert severity to emoji."""
severity_map = {
'critical': '🚨',
'high': '⚠️',
'medium': '🟡',
'low': '🟢',
}
return severity_map.get(severity, '')
def generate_recommendations(summary: Dict, findings: Dict) -> str:
"""Generate recommendations based on findings."""
recommendations = []
critical_count = summary.get('critical_issues', 0)
high_count = summary.get('high_issues', 0)
overall_score = summary.get('overall_score', 0)
if critical_count > 0:
recommendations.append(f"**Immediate Action Required**: Address all {critical_count} critical security and quality issues before deploying to production.")
if high_count > 5:
recommendations.append(f"**Sprint Focus**: Prioritize fixing the {high_count} high-severity issues in the next sprint. These significantly impact code quality and maintainability.")
if overall_score < 70:
recommendations.append("**Technical Debt Sprint**: Schedule a dedicated sprint to address accumulated technical debt and improve code quality metrics.")
if 'testing' in findings and len(findings['testing']) > 0:
recommendations.append("**Testing Improvements**: Increase test coverage to meet the 80% minimum threshold. Focus on critical paths first (authentication, payment, data processing).")
if 'security' in findings and len(findings['security']) > 0:
recommendations.append("**Security Review**: Conduct a thorough security review and penetration testing given the security issues found.")
if not recommendations:
recommendations.append("**Maintain Standards**: Continue following best practices and maintain current quality levels.")
recommendations.append("**Continuous Improvement**: Consider implementing automated code quality checks in CI/CD pipeline.")
# Number at the end so the list stays sequential even when some conditions skip items
return '\n'.join(f"{i}. {text}" for i, text in enumerate(recommendations, 1))


@@ -0,0 +1,16 @@
# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Removed version/author/category/tags from frontmatter
## 0.1.0
- Initial release of JSON Outputs Implementer skill
- Complete workflow covering schema design, SDK integration, testing, and production
- Pydantic and Zod integration patterns
- Error handling for refusals, token limits, and validation failures
- Performance optimization guidance (grammar caching, token management)
- Contact and invoice extraction examples


@@ -0,0 +1,90 @@
# JSON Outputs Implementer
Specialized skill for implementing JSON outputs mode with guaranteed schema compliance.
## Purpose
This skill handles **end-to-end implementation** of JSON outputs mode (`output_format`), ensuring Claude's responses strictly match your JSON schema. Covers schema design, SDK integration, testing, and production deployment.
## Use Cases
- **Data Extraction**: Pull structured info from text/images
- **Classification**: Categorize content with guaranteed output format
- **API Formatting**: Generate API-ready JSON responses
- **Report Generation**: Create structured reports
- **Database Operations**: Ensure type-safe inserts/updates
## Prerequisites
- Routed here by `structured-outputs-advisor`
- Model: Claude Sonnet 4.5 or Opus 4.1
- Beta header: `structured-outputs-2025-11-13`
## Quick Start
**Python with Pydantic:**
```python
from pydantic import BaseModel
from anthropic import Anthropic
class Contact(BaseModel):
name: str
email: str
client = Anthropic()
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
messages=[{"role": "user", "content": "Extract contact..."}],
output_format=Contact,
)
contact = response.parsed_output # Guaranteed valid
```
**TypeScript with Zod:**
```typescript
import { z } from 'zod';
import Anthropic from '@anthropic-ai/sdk';
// Import path assumed from the SDK's zod helpers; check your SDK version
import { betaZodOutputFormat } from '@anthropic-ai/sdk/helpers/zod';

const client = new Anthropic();
const ContactSchema = z.object({
name: z.string(),
email: z.string().email(),
});
const response = await client.beta.messages.parse({
model: "claude-sonnet-4-5",
betas: ["structured-outputs-2025-11-13"],
output_format: betaZodOutputFormat(ContactSchema),
messages: [...]
});
```
## What You'll Learn
1. **Schema Design** - Respecting JSON Schema limitations
2. **SDK Integration** - Pydantic/Zod helpers
3. **Error Handling** - Refusals, token limits, validation
4. **Production Optimization** - Caching, monitoring, cost tracking
5. **Testing** - Comprehensive validation strategies
## Examples
- [contact-extraction.py](./examples/contact-extraction.py) - Extract contact info
- [invoice-extraction.py](./examples/invoice-extraction.py) - Complex nested schemas
## Related Skills
- [`structured-outputs-advisor`](../structured-outputs-advisor/) - Choose the right mode
- [`strict-tool-implementer`](../strict-tool-implementer/) - For tool validation
## Reference Materials
- [JSON Schema Limitations](../reference/json-schema-limitations.md)
- [Best Practices](../reference/best-practices.md)
- [API Compatibility](../reference/api-compatibility.md)
## Version
Current version: 0.1.0
See [CHANGELOG.md](./CHANGELOG.md) for version history.


@@ -0,0 +1,93 @@
---
name: json-outputs-implementer
description: >-
Use PROACTIVELY when extracting structured data from text/images, classifying content, or formatting API responses with guaranteed schema compliance.
Implements Anthropic's JSON outputs mode with Pydantic/Zod SDK integration.
Covers schema design, validation, testing, and production optimization.
Not for tool parameter validation or agentic workflows (use strict-tool-implementer instead).
---
# JSON Outputs Implementer
## Overview
This skill implements Anthropic's JSON outputs mode for guaranteed schema compliance. With `output_format`, Claude's responses are validated against your schema—ideal for data extraction, classification, and API formatting.
**What This Skill Provides:**
- Production-ready JSON schema design
- SDK integration (Pydantic for Python, Zod for TypeScript)
- Validation and error handling patterns
- Performance optimization strategies
- Complete implementation examples
**Prerequisites:**
- Decision made via `structured-outputs-advisor`
- Model: Claude Sonnet 4.5 or Opus 4.1
- Beta header: `structured-outputs-2025-11-13`
## When to Use This Skill
**Use for:**
- Extracting structured data from text/images
- Classification tasks with guaranteed categories
- Generating API-ready responses
- Formatting reports with fixed structure
- Database inserts requiring type safety
**NOT for:**
- Validating tool inputs → `strict-tool-implementer`
- Agentic workflows → `strict-tool-implementer`
## Response Style
- **Schema-first**: Design schema before implementation
- **SDK-friendly**: Leverage Pydantic/Zod when available
- **Production-ready**: Consider performance, caching, errors
- **Example-driven**: Provide complete working code
- **Limitation-aware**: Respect JSON Schema constraints
## Workflow
| Phase | Description | Details |
|-------|-------------|---------|
| 1 | Schema Design | → [workflow/phase-1-schema-design.md](workflow/phase-1-schema-design.md) |
| 2 | SDK Integration | → [workflow/phase-2-sdk-integration.md](workflow/phase-2-sdk-integration.md) |
| 3 | Error Handling | → [workflow/phase-3-error-handling.md](workflow/phase-3-error-handling.md) |
| 4 | Testing | → [workflow/phase-4-testing.md](workflow/phase-4-testing.md) |
| 5 | Production Optimization | → [workflow/phase-5-production.md](workflow/phase-5-production.md) |
## Quick Reference
### Python Template
```python
from pydantic import BaseModel
from anthropic import Anthropic
class MySchema(BaseModel):
field: str
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
messages=[...],
output_format=MySchema,
)
result = response.parsed_output # Validated!
```
### Supported Schema Features
✅ Basic types, enums, format strings, nested objects/arrays, required fields
❌ Recursive schemas, min/max constraints, string length, complex regex
## Reference Materials
- [Common Use Cases](reference/use-cases.md)
- [Schema Limitations](reference/schema-limitations.md)
## Related Skills
- `structured-outputs-advisor` - Choose the right mode
- `strict-tool-implementer` - For tool validation use cases


@@ -0,0 +1,138 @@
"""
Contact Information Extraction Example
Extracts structured contact information from unstructured text (emails, messages, etc.)
using JSON outputs mode with Pydantic schema validation.
"""
from pydantic import BaseModel, Field, EmailStr  # EmailStr requires: pip install "pydantic[email]"
from typing import Optional, List
from anthropic import Anthropic
import os
# Initialize client
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
# Define schema with Pydantic
class ContactInfo(BaseModel):
"""Structured contact information extracted from text."""
name: str = Field(description="Full name of the contact person")
email: EmailStr = Field(description="Email address")
phone: Optional[str] = Field(
None, description="Phone number in any format"
)
company: Optional[str] = Field(
None, description="Company or organization name"
)
plan_interest: Optional[str] = Field(
None, description="Product plan or tier they're interested in"
)
demo_requested: bool = Field(
False, description="Whether they requested a product demo"
)
tags: List[str] = Field(
default_factory=list,
description="Relevant tags or categories"
)
def extract_contact(text: str) -> Optional[ContactInfo]:
"""
Extract contact information from unstructured text.
Args:
text: Unstructured text containing contact information
Returns:
ContactInfo object with extracted data, or None if request refused
"""
try:
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
max_tokens=1024,
betas=["structured-outputs-2025-11-13"],
messages=[{
"role": "user",
"content": f"Extract contact information from the following text:\n\n{text}"
}],
output_format=ContactInfo,
)
# Handle different stop reasons
if response.stop_reason == "refusal":
print(f"⚠️ Request refused for safety reasons")
return None
if response.stop_reason == "max_tokens":
print(f"⚠️ Response truncated - increase max_tokens")
return None
# Return validated contact info
return response.parsed_output
except Exception as e:
print(f"❌ Error extracting contact: {e}")
raise
def main():
"""Run contact extraction examples."""
examples = [
# Example 1: Complete contact info
"""
Hi, I'm John Smith from Acme Corp. You can reach me at john.smith@acme.com
or call me at (555) 123-4567. I'm interested in your Enterprise plan and
would love to schedule a demo next week.
""",
# Example 2: Minimal info
"""
Contact: jane.doe@example.com
""",
# Example 3: Informal message
"""
Hey! Bob here. Email me at bob@startup.io if you want to chat about
the Pro plan. Thanks!
""",
# Example 4: Multiple contacts (extracts first/primary)
"""
From: alice@company.com
CC: support@company.com
Hi, I'm Alice Johnson, VP of Engineering at TechCo.
We're evaluating your platform for our team of 50 developers.
""",
]
print("=" * 70)
print("Contact Extraction Examples")
print("=" * 70)
for i, text in enumerate(examples, 1):
print(f"\n📧 Example {i}:")
print(f"Input: {text.strip()[:100]}...")
contact = extract_contact(text)
if contact:
print(f"\n✅ Extracted Contact:")
print(f" Name: {contact.name}")
print(f" Email: {contact.email}")
print(f" Phone: {contact.phone or 'N/A'}")
print(f" Company: {contact.company or 'N/A'}")
print(f" Plan Interest: {contact.plan_interest or 'N/A'}")
print(f" Demo Requested: {contact.demo_requested}")
print(f" Tags: {', '.join(contact.tags) if contact.tags else 'None'}")
else:
print(f"\n❌ No contact extracted")
print("-" * 70)
if __name__ == "__main__":
main()


@@ -0,0 +1,160 @@
"""
Invoice Data Extraction Example
Extracts structured invoice data from text using JSON outputs with nested schemas.
Demonstrates handling complex nested structures (line items, tax breakdown).
"""
from pydantic import BaseModel, Field
from typing import List, Optional
from anthropic import Anthropic
import os
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
# Nested schema for line items
class LineItem(BaseModel):
"""Individual line item on an invoice."""
description: str = Field(description="Item description")
quantity: int = Field(description="Quantity ordered")
unit_price: float = Field(description="Price per unit in USD")
total: float = Field(description="Total for this line (quantity * unit_price)")
class Invoice(BaseModel):
"""Complete invoice structure."""
invoice_number: str = Field(description="Invoice ID (format: INV-XXXXX)")
date: str = Field(description="Invoice date in YYYY-MM-DD format")
due_date: str = Field(description="Payment due date in YYYY-MM-DD format")
customer_name: str = Field(description="Customer or company name")
customer_email: str = Field(description="Customer email address")
line_items: List[LineItem] = Field(
description="List of items on the invoice"
)
subtotal: float = Field(description="Subtotal before tax in USD")
tax_rate: float = Field(description="Tax rate as decimal (e.g., 0.08 for 8%)")
tax_amount: float = Field(description="Tax amount in USD")
total_amount: float = Field(description="Final total amount in USD")
notes: str = Field(
default="",
description="Additional notes or payment instructions"
)
def extract_invoice(invoice_text: str) -> Optional[Invoice]:
"""Extract structured invoice data."""
try:
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
max_tokens=2048, # Higher for complex nested structures
betas=["structured-outputs-2025-11-13"],
messages=[{
"role": "user",
"content": f"Extract all invoice data from:\n\n{invoice_text}"
}],
output_format=Invoice,
)
if response.stop_reason != "end_turn":
print(f"⚠️ Unexpected stop reason: {response.stop_reason}")
return None
return response.parsed_output
except Exception as e:
print(f"❌ Error: {e}")
raise
def main():
"""Run invoice extraction example."""
invoice_text = """
INVOICE
Invoice Number: INV-2024-00123
Date: 2024-01-15
Due Date: 2024-02-15
Bill To:
Acme Corporation
John Smith, CFO
john.smith@acme.com
ITEMS:
1. Cloud Hosting - Pro Plan (x3 servers)
Quantity: 3
Unit Price: $299.00
Total: $897.00
2. Database Storage - 500GB
Quantity: 500
Unit Price: $0.50
Total: $250.00
3. API Calls - Premium Tier
Quantity: 1,000,000
Unit Price: $0.001
Total: $1,000.00
4. Support - Enterprise Level
Quantity: 1
Unit Price: $500.00
Total: $500.00
Subtotal: $2,647.00
Tax (8.5%): $224.99
TOTAL: $2,871.99
Payment Terms: Net 30
Please remit payment to accounts@cloudprovider.com
"""
print("=" * 70)
print("Invoice Extraction Example")
print("=" * 70)
invoice = extract_invoice(invoice_text)
if invoice:
print(f"\n✅ Invoice Extracted Successfully\n")
print(f"Invoice #: {invoice.invoice_number}")
print(f"Customer: {invoice.customer_name} ({invoice.customer_email})")
print(f"Date: {invoice.date}")
print(f"Due: {invoice.due_date}")
print(f"\nLine Items:")
for i, item in enumerate(invoice.line_items, 1):
print(f" {i}. {item.description}")
print(f" Qty: {item.quantity} × ${item.unit_price:.2f} = ${item.total:.2f}")
print(f"\nSubtotal: ${invoice.subtotal:.2f}")
print(f"Tax ({invoice.tax_rate * 100:.1f}%): ${invoice.tax_amount:.2f}")
print(f"TOTAL: ${invoice.total_amount:.2f}")
if invoice.notes:
print(f"\nNotes: {invoice.notes}")
# Validation checks
print(f"\n🔍 Validation:")
calculated_subtotal = sum(item.total for item in invoice.line_items)
print(f" Subtotal matches: {abs(calculated_subtotal - invoice.subtotal) < 0.01}")
calculated_tax = invoice.subtotal * invoice.tax_rate
print(f" Tax calculation matches: {abs(calculated_tax - invoice.tax_amount) < 0.01}")
calculated_total = invoice.subtotal + invoice.tax_amount
print(f" Total matches: {abs(calculated_total - invoice.total_amount) < 0.01}")
else:
print("❌ Failed to extract invoice")
if __name__ == "__main__":
    main()


@@ -0,0 +1,47 @@
# JSON Schema Limitations Reference
## Supported Features
- ✅ All basic types (object, array, string, integer, number, boolean, null)
-`enum` (primitives only)
-`const`, `anyOf`, `allOf`
-`$ref`, `$def`, `definitions` (local)
-`required`, `additionalProperties: false`
- ✅ String formats: date-time, time, date, email, uri, uuid, ipv4, ipv6
-`minItems: 0` or `minItems: 1` for arrays
## NOT Supported
- ❌ Recursive schemas
- ❌ Numerical constraints (minimum, maximum, multipleOf)
- ❌ String constraints (minLength, maxLength, pattern with complex regex)
- ❌ Array constraints (beyond minItems 0/1)
- ❌ External `$ref`
- ❌ Complex types in enums
## SDK Transformation
Python and TypeScript SDKs automatically remove unsupported constraints and add them to descriptions.
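This transformation can be sketched as follows. The sketch is an illustrative stand-in for what the SDK helpers do, not their actual code; `strip_unsupported` and `UNSUPPORTED` are our own names:

```python
# Illustrative sketch of the SDK transformation: unsupported constraint
# keywords are removed from the property and folded into its description.
UNSUPPORTED = ("minimum", "maximum", "minLength", "maxLength", "pattern", "multipleOf")

def strip_unsupported(prop: dict) -> dict:
    prop = dict(prop)  # don't mutate the caller's schema
    hints = [f"{k}: {prop.pop(k)}" for k in UNSUPPORTED if k in prop]
    if hints:
        base = prop.get("description", "")
        prop["description"] = "; ".join(filter(None, [base, *hints]))
    return prop

age = {"type": "integer", "minimum": 0, "maximum": 120, "description": "Age in years"}
print(strip_unsupported(age))
# {'type': 'integer', 'description': 'Age in years; minimum: 0; maximum: 120'}
```

Claude then sees the bounds as guidance in the description, even though the grammar cannot enforce them.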
## Success Criteria
- [ ] Schema designed with all required fields
- [ ] JSON Schema limitations respected
- [ ] SDK helper integrated (Pydantic/Zod)
- [ ] Beta header included in requests
- [ ] Error handling for refusals and token limits
- [ ] Tested with representative examples
- [ ] Edge cases covered (missing fields, invalid data)
- [ ] Production optimization considered (caching, tokens)
- [ ] Monitoring in place (latency, costs)
- [ ] Documentation provided
## Important Reminders
1. **Use SDK helpers** - `client.beta.messages.parse()` auto-validates
2. **Respect limitations** - No recursive schemas, no min/max constraints
3. **Add descriptions** - Helps Claude understand what to extract
4. **Handle refusals** - Don't retry safety refusals
5. **Monitor performance** - Watch for cache misses and high latency
6. **Set `additionalProperties: false`** - Required for all objects
7. **Test thoroughly** - Edge cases often reveal schema issues
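
Reminder 6 is easy to miss in deeply nested schemas. A small lint helper like the following (our own utility, not part of any SDK) can flag open objects in a raw schema before you ship it:

```python
# Hypothetical helper: walk a raw JSON schema and report object nodes that
# are missing `additionalProperties: false` (required for structured outputs).
def find_open_objects(schema: dict, path: str = "$") -> list:
    bad = []
    if schema.get("type") == "object":
        if schema.get("additionalProperties") is not False:
            bad.append(path)
        for name, sub in schema.get("properties", {}).items():
            bad += find_open_objects(sub, f"{path}.{name}")
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        bad += find_open_objects(schema["items"], f"{path}[]")
    return bad

schema = {
    "type": "object",
    "properties": {
        "items": {"type": "array", "items": {"type": "object", "properties": {}}},
    },
    "additionalProperties": False,
}
print(find_open_objects(schema))  # ['$.items[]']: the nested object is open
```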


@@ -0,0 +1,86 @@
# Common Use Cases
## Use Case 1: Data Extraction
**Scenario**: Extract invoice data from text/images
```python
from pydantic import BaseModel
from typing import List
class LineItem(BaseModel):
description: str
quantity: int
unit_price: float
total: float
class Invoice(BaseModel):
invoice_number: str
date: str
customer_name: str
line_items: List[LineItem]
subtotal: float
tax: float
total_amount: float
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
max_tokens=2048,
messages=[{"role": "user", "content": f"Extract invoice:\n{invoice_text}"}],
output_format=Invoice,
)
invoice = response.parsed_output
# Insert into database with guaranteed types
db.insert_invoice(invoice.model_dump())
```
## Use Case 2: Classification
**Scenario**: Classify support tickets
```python
from pydantic import BaseModel
from typing import List, Optional

class TicketClassification(BaseModel):
category: str # "billing", "technical", "sales"
priority: str # "low", "medium", "high", "critical"
confidence: float
requires_human: bool
suggested_assignee: Optional[str] = None
tags: List[str]
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
messages=[{"role": "user", "content": f"Classify:\n{ticket}"}],
output_format=TicketClassification,
)
classification = response.parsed_output
if classification.requires_human or classification.confidence < 0.7:
route_to_human(ticket)
else:
auto_assign(ticket, classification.category)
```
## Use Case 3: API Response Formatting
**Scenario**: Generate API-ready responses
```python
from pydantic import BaseModel
from typing import List, Optional

class APIResponse(BaseModel):
status: str # "success" or "error"
data: dict
errors: Optional[List[dict]] = None
metadata: dict
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
messages=[{"role": "user", "content": f"Process: {request}"}],
output_format=APIResponse,
)
# Directly return as JSON API response
return jsonify(response.parsed_output.model_dump())
```


@@ -0,0 +1,92 @@
# Phase 1: Schema Design
**Objective**: Create a production-ready JSON schema respecting all limitations
## Steps
### 1. Define Output Structure
Ask the user:
- "What fields do you need in the output?"
- "Which fields are required vs. optional?"
- "What are the data types for each field?"
- "Are there nested objects or arrays?"
### 2. Choose Schema Approach
**Option A: Pydantic (Python) - Recommended**
```python
from pydantic import BaseModel, Field
from typing import List, Optional

class ContactInfo(BaseModel):
    name: str
    email: str
    plan_interest: Optional[str] = None
    demo_requested: bool = False
    tags: List[str] = Field(default_factory=list)
```
**Option B: Zod (TypeScript) - Recommended**
```typescript
import { z } from 'zod';
const ContactInfoSchema = z.object({
name: z.string(),
email: z.string().email(),
plan_interest: z.string().optional(),
demo_requested: z.boolean().default(false),
tags: z.array(z.string()).default([]),
});
```
**Option C: Raw JSON Schema**
```json
{
"type": "object",
"properties": {
"name": {"type": "string", "description": "Full name"},
"email": {"type": "string", "description": "Email address"},
"plan_interest": {"type": "string", "description": "Interested plan"},
"demo_requested": {"type": "boolean"},
"tags": {"type": "array", "items": {"type": "string"}}
},
"required": ["name", "email", "demo_requested"],
"additionalProperties": false
}
```
### 3. Apply JSON Schema Limitations
**✅ Supported Features:**
- All basic types: object, array, string, integer, number, boolean, null
- `enum` (strings, numbers, bools, nulls only)
- `const`
- `anyOf` and `allOf` (limited)
- `$ref`, `$defs`, `definitions` (local only)
- `required` and `additionalProperties: false`
- String formats: date-time, time, date, email, uri, uuid, ipv4, ipv6
- Array `minItems` (0 or 1 only)
**❌ NOT Supported (SDK helpers strip these and fold them into field descriptions):**
- Recursive schemas
- Numerical constraints (minimum, maximum)
- String constraints (minLength, maxLength)
- Complex array constraints
- External `$ref`
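
Recursive structures (comment threads, file trees) therefore need to be unrolled to a fixed depth. A sketch, using a hypothetical comment-thread schema of our own:

```python
# Recursive schemas are rejected, so unroll the self-reference to a fixed depth.
def comment_schema(depth: int) -> dict:
    node = {
        "type": "object",
        "properties": {
            "author": {"type": "string"},
            "text": {"type": "string"},
        },
        "required": ["author", "text"],
        "additionalProperties": False,
    }
    if depth > 1:  # nest replies only while depth remains
        node["properties"]["replies"] = {
            "type": "array",
            "items": comment_schema(depth - 1),
        }
        node["required"].append("replies")
    return node

schema = comment_schema(3)  # top-level comment, replies, replies-to-replies
```

Pick the smallest depth your data realistically needs; deeper nesting grows the grammar and the token overhead.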
### 4. Add AI-Friendly Descriptions
```python
from pydantic import BaseModel, Field
from typing import List

class Invoice(BaseModel):
    invoice_number: str = Field(description="Invoice ID, format: INV-XXXXX")
    date: str = Field(description="Invoice date in YYYY-MM-DD format")
    total: float = Field(description="Total amount in USD")
    items: List[LineItem] = Field(description="Line items on the invoice")  # LineItem: nested model defined separately
```
Good descriptions help Claude understand what to extract.
## Output
Production-ready schema following Anthropic's limitations.


@@ -0,0 +1,100 @@
# Phase 2: SDK Integration
**Objective**: Implement using SDK helpers for automatic validation
## Python Implementation
**Recommended: Use `client.beta.messages.parse()`**
```python
from pydantic import BaseModel, Field
from typing import List, Optional
from anthropic import Anthropic
class ContactInfo(BaseModel):
name: str = Field(description="Full name of the contact")
email: str = Field(description="Email address")
plan_interest: Optional[str] = Field(
None, description="Plan tier they're interested in"
)
demo_requested: bool = Field(
False, description="Whether they requested a demo"
)
client = Anthropic()
def extract_contact(text: str) -> ContactInfo:
"""Extract contact information from text."""
response = client.beta.messages.parse(
model="claude-sonnet-4-5",
max_tokens=1024,
betas=["structured-outputs-2025-11-13"],
messages=[{
"role": "user",
"content": f"Extract contact information from: {text}"
}],
output_format=ContactInfo,
)
# Handle edge cases
if response.stop_reason == "refusal":
raise ValueError("Claude refused the request")
if response.stop_reason == "max_tokens":
raise ValueError("Response truncated - increase max_tokens")
# Automatically validated
return response.parsed_output
# Usage
contact = extract_contact("John Smith (john@example.com) wants Enterprise plan")
print(contact.name, contact.email) # Type-safe access
```
## TypeScript Implementation
```typescript
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';
import { betaZodOutputFormat } from '@anthropic-ai/sdk/helpers/beta/zod';
const ContactInfoSchema = z.object({
name: z.string().describe("Full name of the contact"),
email: z.string().email().describe("Email address"),
plan_interest: z.string().optional().describe("Plan tier interested in"),
demo_requested: z.boolean().default(false).describe("Demo requested"),
});
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
async function extractContact(text: string) {
const response = await client.beta.messages.parse({
model: "claude-sonnet-4-5",
max_tokens: 1024,
betas: ["structured-outputs-2025-11-13"],
messages: [{
role: "user",
content: `Extract contact information from: ${text}`
}],
output_format: betaZodOutputFormat(ContactInfoSchema),
});
if (response.stop_reason === "refusal") {
throw new Error("Claude refused the request");
}
if (response.stop_reason === "max_tokens") {
throw new Error("Response truncated - increase max_tokens");
}
return response.parsed_output;
}
// Usage
const contact = await extractContact("John Smith (john@example.com)...");
console.log(contact.name, contact.email); // Fully typed
```
## Output
Working implementation with SDK validation.


@@ -0,0 +1,53 @@
# Phase 3: Error Handling
**Objective**: Handle refusals, token limits, and validation errors
## Key Error Scenarios
### 1. Safety Refusals (`stop_reason: "refusal"`)
```python
if response.stop_reason == "refusal":
logger.warning(f"Request refused: {input_text}")
# Don't retry - respect safety boundaries
return None # or raise exception
```
### 2. Token Limit Reached (`stop_reason: "max_tokens"`)
```python
if response.stop_reason == "max_tokens":
    # Retry once with a larger budget (extract_with_higher_limit is your own wrapper)
    return extract_with_higher_limit(text, int(max_tokens * 1.5))
```
### 3. Schema Validation Errors (SDK raises exception)
```python
from pydantic import ValidationError
try:
result = response.parsed_output
except ValidationError as e:
logger.error(f"Schema validation failed: {e}")
# Should be rare - indicates schema mismatch
raise
```
### 4. API Errors (400 - schema too complex)
```python
from anthropic import BadRequestError
try:
response = client.beta.messages.parse(...)
except BadRequestError as e:
if "too complex" in str(e).lower():
# Simplify schema
logger.error("Schema too complex, simplifying...")
raise
```
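
The scenarios above can be combined into a single wrapper. A minimal sketch (`parse_with_retry` is our own name, not an SDK helper); it retries truncation once and never retries refusals:

```python
def parse_with_retry(client, max_tokens=1024, retries=1, **kwargs):
    """Wrap parse(): retry once with a larger budget on truncation,
    return None on a safety refusal, otherwise the validated output."""
    while True:
        response = client.beta.messages.parse(max_tokens=max_tokens, **kwargs)
        if response.stop_reason == "refusal":
            return None  # never retry safety refusals
        if response.stop_reason == "max_tokens" and retries > 0:
            retries -= 1
            max_tokens = int(max_tokens * 1.5)  # grow the budget and try again
            continue
        return response.parsed_output
```

Schema-validation and 400-level errors still propagate as exceptions, which is usually what you want in production.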
## Output
Robust error handling for production deployments.


@@ -0,0 +1,56 @@
# Phase 4: Testing
**Objective**: Validate schema works with representative data
## Test Coverage
```python
import pytest
@pytest.fixture
def extractor():
    # ContactExtractor is assumed to wrap the extract_contact() helper from Phase 2
    return ContactExtractor()
def test_complete_contact(extractor):
"""Test with all fields present."""
text = "John Smith (john@example.com) interested in Enterprise plan, wants demo"
result = extractor.extract(text)
assert result.name == "John Smith"
assert result.email == "john@example.com"
assert result.plan_interest == "Enterprise"
assert result.demo_requested is True
def test_minimal_contact(extractor):
"""Test with only required fields."""
text = "Contact: jane@example.com"
result = extractor.extract(text)
assert result.email == "jane@example.com"
assert result.name is not None # Claude should infer or extract
assert result.plan_interest is None # Optional field
assert result.demo_requested is False # Default
def test_invalid_input(extractor):
"""Test with insufficient data."""
text = "This has no contact information"
# Depending on requirements, might raise or return partial data
result = extractor.extract(text)
# Define expected behavior
def test_refusal_scenario(extractor):
"""Test that refusals are handled."""
# Test with potentially unsafe content
# Verify graceful handling without crash
pass
def test_token_limit(extractor):
"""Test with very long input."""
text = "..." * 10000 # Very long text
# Verify either succeeds or raises appropriate error
pass
```
## Output
Comprehensive test suite covering edge cases.


@@ -0,0 +1,78 @@
# Phase 5: Production Optimization
**Objective**: Optimize for performance, cost, and reliability
## 1. Grammar Caching Strategy
The first request compiles a grammar from your schema, adding extra latency. Subsequent requests reuse the cached grammar (24-hour TTL).
**Cache Invalidation Triggers:**
- Schema structure changes
- Tool set changes (if using tools + JSON outputs together)
- 24 hours of non-use
**Best Practices:**
```python
# ✅ Good: Finalize schema before production
CONTACT_SCHEMA = ContactInfo # Reuse same schema
# ❌ Bad: Dynamic schema generation
def get_schema(include_phone: bool): # Different schemas = cache misses
if include_phone:
class Contact(BaseModel):
phone: str
...
...
```
## 2. Token Cost Management
Structured outputs add tokens via system prompt:
```python
# Monitor token usage
response = client.beta.messages.parse(...)
print(f"Input tokens: {response.usage.input_tokens}")
print(f"Output tokens: {response.usage.output_tokens}")
# Optimize descriptions for token efficiency
# ✅ Good: Concise but clear
name: str = Field(description="Full name")
# ❌ Excessive: Too verbose
name: str = Field(description="The complete full name of the person including first name, middle name if available, and last name")
```
## 3. Monitoring
```python
import time
from dataclasses import dataclass
@dataclass
class StructuredOutputMetrics:
latency_ms: float
input_tokens: int
output_tokens: int
cache_hit: bool # Infer from latency
stop_reason: str
def track_metrics(response, start_time) -> StructuredOutputMetrics:
latency = (time.time() - start_time) * 1000
return StructuredOutputMetrics(
latency_ms=latency,
input_tokens=response.usage.input_tokens,
output_tokens=response.usage.output_tokens,
cache_hit=latency < 500, # Heuristic: fast = cache hit
stop_reason=response.stop_reason,
)
# Track in production
metrics = track_metrics(response, start_time)
if metrics.latency_ms > 1000:
logger.warning(f"Slow structured output: {metrics.latency_ms}ms")
```
## Output
Production-optimized implementation with caching and monitoring.


@@ -0,0 +1,16 @@
# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Removed version/author/category/tags from frontmatter
## 0.1.0
- Initial release of Strict Tool Implementer skill
- Complete workflow covering tool schema design, multi-tool agents, and production deployment
- Multi-tool agent implementation patterns
- Error handling for tool failures and refusals
- Testing strategies for agentic workflows
- Travel booking agent example (multi-tool workflow)


@@ -0,0 +1,81 @@
# Strict Tool Implementer
Specialized skill for implementing strict tool use mode with guaranteed parameter validation.
## Purpose
This skill handles **end-to-end implementation** of strict tool use mode (`strict: true`), ensuring tool input parameters strictly follow your schema. Essential for building reliable agentic workflows with type-safe tool execution.
## Use Cases
- **Multi-Tool Agents**: Travel booking, research assistants, etc.
- **Validated Function Calls**: Ensure parameters match expected types
- **Complex Tool Schemas**: Tools with nested properties
- **Critical Operations**: Financial transactions, booking systems
- **Tool Composition**: Sequential and parallel tool execution
## Prerequisites
- Routed here by `structured-outputs-advisor`
- Model: Claude Sonnet 4.5 or Opus 4.1
- Beta header: `structured-outputs-2025-11-13`
## Quick Start
**Python:**
```python
response = client.beta.messages.create(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
messages=[{"role": "user", "content": "Search for flights..."}],
tools=[{
"name": "search_flights",
"description": "Search for available flights",
"strict": True, # Enable strict validation
"input_schema": {
"type": "object",
"properties": {
"origin": {"type": "string"},
"destination": {"type": "string"},
"travelers": {"type": "integer", "enum": [1, 2, 3, 4, 5, 6]}
},
"required": ["origin", "destination", "travelers"],
"additionalProperties": False
}
}]
)
# Tool inputs GUARANTEED to match schema
for block in response.content:
if block.type == "tool_use":
execute_tool(block.name, block.input) # Type-safe!
```
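
The snippet above stops after the first round of tool calls. A full agent feeds tool results back until Claude answers without requesting more tools. A minimal sketch, assuming you supply your own `run_tool` dispatcher:

```python
def agent_loop(client, messages, tools, run_tool, model="claude-sonnet-4-5"):
    """Run requested tools and feed results back until no more tool calls."""
    while True:
        response = client.beta.messages.create(
            model=model,
            max_tokens=1024,
            betas=["structured-outputs-2025-11-13"],
            messages=messages,
            tools=tools,
        )
        tool_uses = [b for b in response.content if b.type == "tool_use"]
        if not tool_uses:
            return response  # final answer; no more tool calls
        messages.append({"role": "assistant", "content": response.content})
        messages.append({
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": b.id,
                    "content": run_tool(b.name, b.input),  # input already schema-valid
                }
                for b in tool_uses
            ],
        })
```

Because `strict: True` guarantees `b.input` matches the schema, `run_tool` needs no defensive parameter validation.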
## What You'll Learn
1. **Tool Schema Design** - With `strict: true` and proper validation
2. **Multi-Tool Workflows** - Sequential and parallel tool execution
3. **Agent Patterns** - Stateful agents, retry logic, validation
4. **Error Handling** - Tool failures, refusals, edge cases
5. **Production Deployment** - Monitoring, testing, reliability
## Examples
- [travel-booking-agent.py](./examples/travel-booking-agent.py) - Multi-tool agent workflow
## Related Skills
- [`structured-outputs-advisor`](../structured-outputs-advisor/) - Choose the right mode
- [`json-outputs-implementer`](../json-outputs-implementer/) - For data extraction
## Reference Materials
- [JSON Schema Limitations](../reference/json-schema-limitations.md)
- [Best Practices](../reference/best-practices.md)
- [API Compatibility](../reference/api-compatibility.md)
## Version
Current version: 0.1.0
See [CHANGELOG.md](./CHANGELOG.md) for version history.


@@ -0,0 +1,92 @@
---
name: strict-tool-implementer
description: >-
Use PROACTIVELY when building multi-step agentic workflows with validated tool parameters.
Implements Anthropic's strict tool use mode for guaranteed schema compliance.
Covers tool schema design, multi-tool agent workflows, error handling, testing, and production patterns.
Not for data extraction or classification tasks (use json-outputs-implementer instead).
---
# Strict Tool Implementer
## Overview
This skill implements Anthropic's strict tool use mode for reliable agentic systems. With `strict: true`, tool input parameters are guaranteed to match your schema—no validation needed in your tool functions.
**What This Skill Provides:**
- Production-ready tool schema design
- Multi-tool workflow patterns
- Agentic system architecture
- Validation and error handling
- Complete agent implementation examples
**Prerequisites:**
- Decision made via `structured-outputs-advisor`
- Model: Claude Sonnet 4.5 or Opus 4.1
- Beta header: `structured-outputs-2025-11-13`
## When to Use This Skill
**Use for:**
- Building multi-step agentic workflows
- Validating function call parameters
- Ensuring type-safe tool execution
- Complex tools with nested properties
- Critical operations requiring guaranteed types
**NOT for:**
- Extracting data from text/images → `json-outputs-implementer`
- Formatting API responses → `json-outputs-implementer`
- Classification tasks → `json-outputs-implementer`
## Response Style
- **Tool-focused**: Design tools with clear, validated schemas
- **Agent-aware**: Consider multi-tool workflows and composition
- **Type-safe**: Guarantee parameter types for downstream functions
- **Production-ready**: Handle errors, retries, and monitoring
- **Example-driven**: Provide complete agent implementations
## Workflow
| Phase | Description | Details |
|-------|-------------|---------|
| 1 | Tool Schema Design | → [workflow/phase-1-schema-design.md](workflow/phase-1-schema-design.md) |
| 2 | Multi-Tool Agent Implementation | → [workflow/phase-2-implementation.md](workflow/phase-2-implementation.md) |
| 3 | Error Handling & Validation | → [workflow/phase-3-error-handling.md](workflow/phase-3-error-handling.md) |
| 4 | Testing Agent Workflows | → [workflow/phase-4-testing.md](workflow/phase-4-testing.md) |
| 5 | Production Agent Patterns | → [workflow/phase-5-production.md](workflow/phase-5-production.md) |
## Quick Reference
### Schema Template
```python
{
"name": "tool_name",
"description": "Clear description",
"strict": True,
"input_schema": {
"type": "object",
"properties": {...},
"required": [...],
"additionalProperties": False
}
}
```
### Supported Schema Features
✅ Basic types, enums, format strings, nested objects/arrays, required fields
❌ Recursive schemas, min/max constraints, string length, complex regex
## Reference Materials
- [Common Agentic Patterns](reference/common-patterns.md)
- [Success Criteria](reference/success-criteria.md)
## Related Skills
- `structured-outputs-advisor` - Choose the right mode
- `json-outputs-implementer` - For data extraction use cases


@@ -0,0 +1,289 @@
"""
Travel Booking Agent Example
Multi-tool agent using strict tool use mode for guaranteed parameter validation.
Demonstrates validated tool inputs in agentic workflows.
"""
from anthropic import Anthropic
from typing import Dict, Any, List
import json
import os
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
# Define tools with strict mode
TOOLS = [
{
"name": "search_flights",
"description": "Search for available flights between two cities",
"strict": True, # Enable strict parameter validation
"input_schema": {
"type": "object",
"properties": {
"origin": {
"type": "string",
"description": "Departure city (e.g., 'San Francisco, CA')"
},
"destination": {
"type": "string",
"description": "Arrival city (e.g., 'Paris, France')"
},
"departure_date": {
"type": "string",
"format": "date",
"description": "Departure date in YYYY-MM-DD format"
},
"return_date": {
"type": "string",
"format": "date",
"description": "Return date in YYYY-MM-DD format (optional for one-way)"
},
"travelers": {
"type": "integer",
"enum": [1, 2, 3, 4, 5, 6],
"description": "Number of travelers"
}
},
"required": ["origin", "destination", "departure_date", "travelers"],
"additionalProperties": False # Required for strict mode
}
},
{
"name": "book_flight",
"description": "Book a selected flight",
"strict": True,
"input_schema": {
"type": "object",
"properties": {
"flight_id": {
"type": "string",
"description": "Flight identifier from search results"
},
"passenger_names": {
"type": "array",
"items": {"type": "string"},
"description": "Full names of all passengers"
},
"contact_email": {
"type": "string",
"format": "email",
"description": "Contact email for booking confirmation"
}
},
"required": ["flight_id", "passenger_names", "contact_email"],
"additionalProperties": False
}
},
{
"name": "search_hotels",
"description": "Search for hotels in a city",
"strict": True,
"input_schema": {
"type": "object",
"properties": {
"city": {
"type": "string",
"description": "City name"
},
"check_in": {
"type": "string",
"format": "date",
"description": "Check-in date in YYYY-MM-DD format"
},
"check_out": {
"type": "string",
"format": "date",
"description": "Check-out date in YYYY-MM-DD format"
},
"guests": {
"type": "integer",
"enum": [1, 2, 3, 4],
"description": "Number of guests"
}
},
"required": ["city", "check_in", "check_out", "guests"],
"additionalProperties": False
}
}
]
# Mock tool implementations
def search_flights(origin: str, destination: str, departure_date: str,
travelers: int, return_date: str = None) -> Dict:
"""Mock flight search - would call real API."""
    print(f"🔍 Searching flights: {origin} → {destination}")
print(f" Departure: {departure_date}, Travelers: {travelers}")
return {
"flights": [
{
"id": "FL123",
"airline": "Air France",
"departure": f"{departure_date} 10:00",
"arrival": f"{departure_date} 23:00",
"price": 850.00,
"class": "Economy"
},
{
"id": "FL456",
"airline": "United",
"departure": f"{departure_date} 14:30",
"arrival": f"{departure_date} 03:30+1",
"price": 920.00,
"class": "Economy"
}
]
}
def book_flight(flight_id: str, passenger_names: List[str],
contact_email: str) -> Dict:
"""Mock flight booking - would call real API."""
print(f"✈️ Booking flight {flight_id}")
print(f" Passengers: {', '.join(passenger_names)}")
print(f" Email: {contact_email}")
return {
"confirmation": "ABC123XYZ",
"status": "confirmed",
"total_price": 850.00 * len(passenger_names)
}
def search_hotels(city: str, check_in: str, check_out: str, guests: int) -> Dict:
"""Mock hotel search - would call real API."""
print(f"🏨 Searching hotels in {city}")
print(f" Check-in: {check_in}, Check-out: {check_out}, Guests: {guests}")
return {
"hotels": [
{
"id": "HTL789",
"name": "Grand Hotel Paris",
"rating": 4.5,
"price_per_night": 200.00,
"amenities": ["WiFi", "Breakfast", "Gym"]
},
{
"id": "HTL101",
"name": "Budget Inn",
"rating": 3.5,
"price_per_night": 80.00,
"amenities": ["WiFi"]
}
]
}
# Tool registry
TOOL_FUNCTIONS = {
"search_flights": search_flights,
"book_flight": book_flight,
"search_hotels": search_hotels,
}
def run_travel_agent(user_request: str, max_turns: int = 10):
"""
Run travel booking agent with strict tool validation.
With strict mode enabled, all tool inputs are GUARANTEED to match
the schema - no validation needed in tool functions!
"""
messages = [{"role": "user", "content": user_request}]
print("=" * 70)
print(f"User Request: {user_request}")
print("=" * 70)
for turn in range(max_turns):
print(f"\n🤖 Agent Turn {turn + 1}")
response = client.beta.messages.create(
model="claude-sonnet-4-5",
max_tokens=2048,
betas=["structured-outputs-2025-11-13"],
messages=messages,
tools=TOOLS,
)
# Check stop reason
if response.stop_reason == "end_turn":
# Agent finished
final_text = ""
for block in response.content:
if hasattr(block, "text"):
final_text += block.text
print(f"\n✅ Agent Complete:")
print(f"{final_text}")
return final_text
if response.stop_reason == "tool_use":
# Execute tools
tool_results = []
for block in response.content:
if block.type == "text":
print(f"\n💭 Agent: {block.text}")
elif block.type == "tool_use":
tool_name = block.name
tool_input = block.input # GUARANTEED to match schema!
print(f"\n🔧 Tool Call: {tool_name}")
print(f" Input: {json.dumps(tool_input, indent=2)}")
# Execute tool with validated inputs
tool_function = TOOL_FUNCTIONS[tool_name]
result = tool_function(**tool_input) # Type-safe!
print(f" Result: {json.dumps(result, indent=2)}")
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": json.dumps(result)
})
# Add to conversation
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": tool_results})
else:
print(f"⚠️ Unexpected stop reason: {response.stop_reason}")
break
print("\n⚠️ Max turns reached without completion")
return None
def main():
"""Run travel agent examples."""
examples = [
"Book a round trip from San Francisco to Paris for 2 people, "
"departing May 15, 2024 and returning May 22, 2024. "
"Passengers are John Smith and Jane Doe. "
"Email confirmation to john.smith@example.com. "
"Also find a hotel in Paris for those dates.",
"Find flights from New York to London for 1 traveler on June 1, 2024.",
"Search for hotels in Tokyo for 2 guests, checking in July 10 "
"and checking out July 15.",
]
for i, request in enumerate(examples, 1):
print(f"\n\n{'='*70}")
print(f"EXAMPLE {i}")
print(f"{'='*70}")
run_travel_agent(request)
if __name__ == "__main__":
main()

# Common Agentic Patterns
## Pattern 1: Sequential Workflow
Tools execute in sequence (search → book → confirm):
```python
# User: "Book a flight to Paris"
# Agent executes:
1. search_flights(origin="SF", destination="Paris", ...)
2. book_flight(flight_id="F1", passengers=[...])
3. send_confirmation(confirmation_id="ABC123")
```
## Pattern 2: Parallel Tool Execution
Multiple independent tools (flights + hotels):
```python
# User: "Find flights and hotels for Paris trip"
# Agent can call in parallel (if your implementation supports it):
1. search_flights(destination="Paris", ...)
2. search_hotels(city="Paris", ...)
```
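Claude may emit several `tool_use` blocks in a single response; when the tools are independent, the calls can be fanned out. A minimal sketch, assuming the tool functions are thread-safe (the registry below is a stand-in for real implementations):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in registry; real implementations would call external APIs.
TOOL_FUNCTIONS = {
    "search_flights": lambda **kwargs: {"flights": ["FL123"]},
    "search_hotels": lambda **kwargs: {"hotels": ["HTL789"]},
}

def execute_in_parallel(tool_calls):
    """tool_calls: list of (tool_use_id, name, input_dict) from one response.

    Returns tool_result blocks in the same order the calls were issued,
    which keeps the conversation history deterministic."""
    def run_one(call):
        tool_use_id, name, tool_input = call
        result = TOOL_FUNCTIONS[name](**tool_input)
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": str(result),
        }

    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_one, tool_calls))
```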
## Pattern 3: Conditional Branching
Tool selection based on context:
```python
# User: "Plan my trip"
# Agent decides which tools to call based on conversation:
# "class" is a reserved word in Python, so the parameter is named cabin_class
if budget_conscious:
    search_flights(cabin_class="economy")
else:
    search_flights(cabin_class="business")
```
## Important Reminders
1. **Always set `strict: true`** - This enables validation
2. **Require `additionalProperties: false`** - Enforced by strict mode
3. **Use enums for constrained values** - Better than free text
4. **Clear descriptions matter** - Claude uses these to decide when to call tools
5. **Tool inputs are guaranteed valid** - No validation needed in tool functions
6. **Handle tool execution failures** - External APIs can fail
7. **Test multi-step workflows** - Edge cases appear in tool composition
8. **Monitor agent behavior** - Track tool usage patterns and failures

# Success Criteria
## Implementation Checklist
- [ ] Tool schemas designed with `strict: true`
- [ ] All tools have `additionalProperties: false`
- [ ] Clear descriptions for tools and parameters
- [ ] Required fields properly specified
- [ ] Multi-tool workflow implemented
- [ ] Error handling for tool failures
- [ ] Refusal scenarios handled
- [ ] Agent tested with realistic scenarios
- [ ] Production patterns applied (retry, validation)
- [ ] Monitoring in place
## Official Documentation
https://docs.anthropic.com/en/docs/build-with-claude/structured-outputs
## Related Skills
- `structured-outputs-advisor` - Choose the right mode
- `json-outputs-implementer` - For data extraction use cases

# Phase 1: Tool Schema Design
**Objective**: Design validated tool schemas for your agent
## Steps
### 1. Identify Required Tools
Ask the user:
- "What actions should the agent be able to perform?"
- "What external systems will the agent interact with?"
- "What parameters does each tool need?"
**Example agent**: Travel booking
- `search_flights` - Find available flights
- `book_flight` - Reserve a flight
- `search_hotels` - Find hotels
- `book_hotel` - Reserve accommodation
### 2. Design Tool Schema with `strict: true`
**Template:**
```python
{
"name": "tool_name",
"description": "Clear description of what this tool does",
"strict": True, # ← Enables strict mode
"input_schema": {
"type": "object",
"properties": {
"param_name": {
"type": "string",
"description": "Clear parameter description"
}
},
"required": ["param_name"],
"additionalProperties": False # ← Required
}
}
```
**Example: Flight Search Tool**
```python
{
"name": "search_flights",
"description": "Search for available flights between two cities",
"strict": True,
"input_schema": {
"type": "object",
"properties": {
"origin": {
"type": "string",
"description": "Departure city (e.g., 'San Francisco, CA')"
},
"destination": {
"type": "string",
"description": "Arrival city (e.g., 'Paris, France')"
},
"departure_date": {
"type": "string",
"format": "date",
"description": "Departure date in YYYY-MM-DD format"
},
"return_date": {
"type": "string",
"format": "date",
"description": "Return date in YYYY-MM-DD format (optional)"
},
"travelers": {
"type": "integer",
"enum": [1, 2, 3, 4, 5, 6],
"description": "Number of travelers"
},
            "cabin_class": {  # "class" is a Python keyword; renaming avoids errors when unpacking **tool_input
"type": "string",
"enum": ["economy", "premium", "business", "first"],
"description": "Flight class preference"
}
},
"required": ["origin", "destination", "departure_date", "travelers"],
"additionalProperties": False
}
}
```
### 3. Apply JSON Schema Limitations
**✅ Supported:**
- All basic types (object, array, string, integer, number, boolean)
- `enum` for constrained values
- `format` for strings (date, email, uri, uuid, etc.)
- Nested objects and arrays
- `required` fields
- `additionalProperties: false` (required!)
**❌ NOT Supported:**
- Recursive schemas
- Numerical constraints (minimum, maximum)
- String length constraints
- Complex regex patterns
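A common workaround is to enforce such constraints inside the tool function itself and surface violations as a tool error the agent can react to. A sketch (`max_price` is an illustrative parameter, not part of the schemas above):

```python
def search_flights(origin: str, destination: str, max_price: float) -> dict:
    """Illustrative tool function with app-level checks for constraints
    the strict schema cannot express."""
    # Strict mode guarantees `max_price` is a number, but JSON Schema
    # keywords like "minimum" are not supported -- check here instead.
    if max_price <= 0:
        return {"error": "max_price must be positive"}
    # ... a real search would go here; return an empty result for the sketch
    return {"flights": [], "max_price": max_price}
```

Returning the violation as data (rather than raising) lets the agent correct its input on the next turn.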
### 4. Add Clear Descriptions
Good descriptions help Claude:
- Understand when to call the tool
- Know what values to provide
- Format parameters correctly
```python
# ✅ Good: Clear and specific
"origin": {
"type": "string",
"description": "Departure city and state/country (e.g., 'San Francisco, CA')"
}
# ❌ Vague: Not helpful
"origin": {
"type": "string",
"description": "Origin"
}
```
## Output
Well-designed tool schemas ready for implementation.

# Phase 2: Multi-Tool Agent Implementation
**Objective**: Implement agent with multiple validated tools
## Python Implementation
```python
from anthropic import Anthropic
from typing import Dict, Any, List
client = Anthropic()
# Define tools
TOOLS = [
{
"name": "search_flights",
"description": "Search for available flights",
"strict": True,
"input_schema": {
"type": "object",
"properties": {
"origin": {"type": "string", "description": "Departure city"},
"destination": {"type": "string", "description": "Arrival city"},
"departure_date": {"type": "string", "format": "date"},
"travelers": {"type": "integer", "enum": [1, 2, 3, 4, 5, 6]}
},
"required": ["origin", "destination", "departure_date", "travelers"],
"additionalProperties": False
}
},
{
"name": "book_flight",
"description": "Book a selected flight",
"strict": True,
"input_schema": {
"type": "object",
"properties": {
"flight_id": {"type": "string", "description": "Flight identifier"},
"passengers": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"passport": {"type": "string"}
},
"required": ["name", "passport"],
"additionalProperties": False
}
}
},
"required": ["flight_id", "passengers"],
"additionalProperties": False
}
},
{
"name": "search_hotels",
"description": "Search for hotels in a city",
"strict": True,
"input_schema": {
"type": "object",
"properties": {
"city": {"type": "string", "description": "City name"},
"check_in": {"type": "string", "format": "date"},
"check_out": {"type": "string", "format": "date"},
"guests": {"type": "integer", "enum": [1, 2, 3, 4]}
},
"required": ["city", "check_in", "check_out", "guests"],
"additionalProperties": False
}
}
]
# Tool execution functions
def search_flights(origin: str, destination: str, departure_date: str, travelers: int) -> Dict:
"""Execute flight search - calls actual API."""
# Implementation here
return {"flights": [...]}
def book_flight(flight_id: str, passengers: List[Dict]) -> Dict:
"""Book the flight - calls actual API."""
# Implementation here
return {"confirmation": "ABC123", "status": "confirmed"}
def search_hotels(city: str, check_in: str, check_out: str, guests: int) -> Dict:
"""Search hotels - calls actual API."""
# Implementation here
return {"hotels": [...]}
# Tool registry
TOOL_FUNCTIONS = {
"search_flights": search_flights,
"book_flight": book_flight,
"search_hotels": search_hotels,
}
# Agent loop
def run_agent(user_request: str, max_turns: int = 10):
"""Run agent with tool validation."""
messages = [{"role": "user", "content": user_request}]
for turn in range(max_turns):
response = client.beta.messages.create(
model="claude-sonnet-4-5",
max_tokens=2048,
betas=["structured-outputs-2025-11-13"],
messages=messages,
tools=TOOLS,
)
# Process response
if response.stop_reason == "end_turn":
# Agent finished
return extract_final_answer(response)
if response.stop_reason == "tool_use":
# Execute tools
tool_results = []
for block in response.content:
if block.type == "tool_use":
# Tool input is GUARANTEED to match schema
tool_name = block.name
tool_input = block.input # Already validated!
# Execute tool
tool_function = TOOL_FUNCTIONS[tool_name]
result = tool_function(**tool_input) # Type-safe!
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": str(result)
})
# Add assistant response and tool results to conversation
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": tool_results})
else:
raise Exception(f"Unexpected stop reason: {response.stop_reason}")
raise Exception("Max turns reached")
# Usage
result = run_agent("Book a flight from SF to Paris for 2 people, departing May 15")
print(result)
```
## TypeScript Implementation
```typescript
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const TOOLS: Anthropic.Tool[] = [
{
name: "search_flights",
description: "Search for available flights",
strict: true,
input_schema: {
type: "object",
properties: {
origin: { type: "string", description: "Departure city" },
destination: { type: "string", description: "Arrival city" },
departure_date: { type: "string", format: "date" },
travelers: { type: "integer", enum: [1, 2, 3, 4, 5, 6] }
},
required: ["origin", "destination", "departure_date", "travelers"],
additionalProperties: false
}
},
// ... other tools
];
async function runAgent(userRequest: string, maxTurns: number = 10) {
const messages: Anthropic.MessageParam[] = [
{ role: "user", content: userRequest }
];
for (let turn = 0; turn < maxTurns; turn++) {
const response = await client.beta.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 2048,
betas: ["structured-outputs-2025-11-13"],
messages,
tools: TOOLS,
});
if (response.stop_reason === "end_turn") {
return extractFinalAnswer(response);
}
if (response.stop_reason === "tool_use") {
const toolResults: Anthropic.ToolResultBlockParam[] = [];
for (const block of response.content) {
if (block.type === "tool_use") {
// Input guaranteed to match schema!
const result = await executeTool(block.name, block.input);
toolResults.push({
type: "tool_result",
tool_use_id: block.id,
content: JSON.stringify(result)
});
}
}
messages.push({ role: "assistant", content: response.content });
messages.push({ role: "user", content: toolResults });
}
}
throw new Error("Max turns reached");
}
```
## Output
Working multi-tool agent with validated tool schemas.

# Phase 3: Error Handling & Validation
**Objective**: Handle errors and edge cases in agent workflows
## Key Error Scenarios
### 1. Tool Execution Failures
```python
import logging
from typing import Dict

logger = logging.getLogger(__name__)

def execute_tool_safely(tool_name: str, tool_input: Dict) -> Dict:
"""Execute tool with error handling."""
try:
tool_function = TOOL_FUNCTIONS[tool_name]
result = tool_function(**tool_input)
return {"success": True, "data": result}
except Exception as e:
logger.error(f"Tool {tool_name} failed: {e}")
return {
"success": False,
"error": str(e),
"message": "Tool execution failed. Please try again."
}
```
### 2. Safety Refusals
```python
if response.stop_reason == "refusal":
logger.warning("Agent refused request")
# Don't retry - respect safety boundaries
return {"error": "Request cannot be completed"}
```
### 3. Max Turns Exceeded
```python
if turn >= max_turns:
logger.warning("Agent exceeded max turns")
return {
"error": "Task too complex",
"partial_progress": extract_progress(messages)
}
```
### 4. Invalid Tool Name
```python
# With strict mode, tool names are guaranteed valid
# But external factors can cause issues
if tool_name not in TOOL_FUNCTIONS:
logger.error(f"Unknown tool: {tool_name}")
return {"error": f"Tool {tool_name} not implemented"}
```
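The scenarios above can be folded into a single dispatcher that always returns a `tool_result` block, marking failures with `is_error` so the model can recover on the next turn. A sketch (`TOOL_FUNCTIONS` stands in for the real registry):

```python
import json
import logging

logger = logging.getLogger(__name__)

TOOL_FUNCTIONS = {}  # populated with real tool functions in the agent

def dispatch_tool(tool_use_id: str, tool_name: str, tool_input: dict) -> dict:
    """Run one tool call and wrap the outcome as a tool_result block."""
    if tool_name not in TOOL_FUNCTIONS:
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"Tool {tool_name} not implemented", "is_error": True}
    try:
        result = TOOL_FUNCTIONS[tool_name](**tool_input)
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": json.dumps(result)}
    except Exception as exc:
        logger.error("Tool %s failed: %s", tool_name, exc)
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"Tool execution failed: {exc}", "is_error": True}
```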
## Output
Robust error handling for production agent workflows.

# Phase 4: Testing Agent Workflows
**Objective**: Validate agent behavior with realistic scenarios
## Test Strategy
```python
import pytest
from unittest.mock import Mock, patch
@pytest.fixture
def mock_tool_functions():
"""Mock external tool functions."""
return {
"search_flights": Mock(return_value={"flights": [{"id": "F1", "price": 500}]}),
"book_flight": Mock(return_value={"confirmation": "ABC123"}),
"search_hotels": Mock(return_value={"hotels": [{"id": "H1", "price": 150}]}),
}
def test_simple_flight_search(mock_tool_functions):
"""Test agent handles simple flight search."""
with patch.dict('agent.TOOL_FUNCTIONS', mock_tool_functions):
result = run_agent("Find flights from SF to LA on May 15 for 2 people")
# Verify search_flights was called
mock_tool_functions["search_flights"].assert_called_once()
call_args = mock_tool_functions["search_flights"].call_args[1]
        # Strict mode guarantees the types; the exact strings depend on how
        # the model normalizes city names and dates, so prefer loose checks
        assert "San Francisco" in call_args["origin"]
        assert "Los Angeles" in call_args["destination"]
        assert call_args["travelers"] == 2
        assert "05-15" in call_args["departure_date"]
def test_multi_step_booking(mock_tool_functions):
"""Test agent completes multi-step booking."""
with patch.dict('agent.TOOL_FUNCTIONS', mock_tool_functions):
result = run_agent(
"Book a round trip from SF to Paris for 2 people, "
"May 15-22, and find a hotel"
)
# Verify correct tool sequence
assert mock_tool_functions["search_flights"].called
assert mock_tool_functions["book_flight"].called
assert mock_tool_functions["search_hotels"].called
def test_tool_failure_handling(mock_tool_functions):
"""Test agent handles tool failures gracefully."""
mock_tool_functions["search_flights"].side_effect = Exception("API down")
with patch.dict('agent.TOOL_FUNCTIONS', mock_tool_functions):
result = run_agent("Find flights to Paris")
# Should handle error gracefully
assert "error" in result or "failed" in str(result).lower()
def test_parameter_validation():
"""Test that strict mode guarantees valid parameters."""
# With strict mode, parameters are guaranteed to match schema
# This test verifies the guarantee holds
response = client.beta.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
betas=["structured-outputs-2025-11-13"],
messages=[{"role": "user", "content": "Search flights for 2 people"}],
tools=TOOLS,
)
for block in response.content:
if block.type == "tool_use":
# These assertions should NEVER fail with strict mode
assert isinstance(block.input, dict)
assert "travelers" in block.input
assert isinstance(block.input["travelers"], int)
assert block.input["travelers"] in [1, 2, 3, 4, 5, 6]
```
## Output
Comprehensive test coverage for agent workflows.

# Phase 5: Production Agent Patterns
**Objective**: Production-ready agent architectures
## Pattern 1: Stateful Agent with Memory
```python
class StatefulTravelAgent:
"""Agent that maintains state across interactions."""
def __init__(self):
self.conversation_history: List[Dict] = []
self.booking_state: Dict[str, Any] = {}
def chat(self, user_message: str) -> str:
"""Process user message and return response."""
self.conversation_history.append({
"role": "user",
"content": user_message
})
response = client.beta.messages.create(
model="claude-sonnet-4-5",
betas=["structured-outputs-2025-11-13"],
max_tokens=2048,
messages=self.conversation_history,
tools=TOOLS,
)
# Process tools and update state
final_response = self._process_response(response)
self.conversation_history.append({
"role": "assistant",
"content": final_response
})
return final_response
def _process_response(self, response) -> str:
"""Process tool calls and maintain state."""
# Implementation...
pass
# Usage
agent = StatefulTravelAgent()
print(agent.chat("I want to go to Paris"))
print(agent.chat("For 2 people")) # Remembers context
print(agent.chat("May 15 to May 22")) # Continues booking
```
## Pattern 2: Tool Retry Logic
```python
import logging
import time
from typing import Any, Dict

logger = logging.getLogger(__name__)

def execute_tool_with_retry(
    tool_name: str,
    tool_input: Dict[str, Any],
    max_retries: int = 3
) -> Dict[str, Any]:
    """Execute tool with exponential backoff retry."""
for attempt in range(max_retries):
try:
tool_func = TOOL_FUNCTIONS[tool_name]
result = tool_func(**tool_input)
return {"success": True, "data": result}
except Exception as e:
if attempt == max_retries - 1:
return {"success": False, "error": str(e)}
wait_time = 2 ** attempt # Exponential backoff
logger.warning(f"Tool {tool_name} failed, retrying in {wait_time}s")
time.sleep(wait_time)
```
## Pattern 3: Tool Result Validation
```python
def validate_tool_result(tool_name: str, result: Any) -> bool:
"""Validate tool execution result."""
validators = {
"search_flights": lambda r: "flights" in r and len(r["flights"]) > 0,
"book_flight": lambda r: "confirmation" in r,
"search_hotels": lambda r: "hotels" in r,
}
validator = validators.get(tool_name)
if validator:
return validator(result)
return True # No validator = assume valid
```
## Output
Production-ready agent patterns with state management, retry logic, and validation.

# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Removed version/author/category/tags from frontmatter
## 0.1.0
- Initial release of Structured Outputs Advisor skill
- Requirements gathering workflow
- Mode selection logic (JSON outputs vs strict tool use)
- Decision matrix for common scenarios
- Delegation patterns to specialized skills
- Mode selection examples covering 8 common scenarios

# Structured Outputs Advisor
Expert advisor skill for choosing between JSON outputs and strict tool use modes in Anthropic's structured outputs feature.
## Purpose
This skill serves as the **entry point** for implementing structured outputs. It analyzes your requirements and recommends the right mode:
- **JSON Outputs** (`output_format`) - For data extraction, classification, API formatting
- **Strict Tool Use** (`strict: true`) - For agentic workflows, validated tool parameters
Then delegates to specialized implementation skills.
## When to Use
Invoke this skill when you need:
- Guaranteed JSON schema compliance
- Validated tool input parameters
- Structured data extraction
- Type-safe API responses
- Reliable agentic workflows
## Quick Start
**Trigger phrases:**
- "implement structured outputs"
- "need guaranteed JSON schema"
- "extract structured data from..."
- "build reliable agent with validated tools"
The advisor will ask questions to understand your use case and recommend the appropriate mode.
## Workflow
1. **Requirements gathering** - Understand what you're building
2. **Mode selection** - JSON outputs vs strict tool use
3. **Delegation** - Hand off to specialized skill for implementation
## Related Skills
- [`json-outputs-implementer`](../json-outputs-implementer/) - Implements JSON outputs mode
- [`strict-tool-implementer`](../strict-tool-implementer/) - Implements strict tool use mode
## Examples
See [mode-selection-examples.md](./examples/mode-selection-examples.md) for detailed scenarios.
## Documentation
- [Official Structured Outputs Docs](https://docs.anthropic.com/en/docs/build-with-claude/structured-outputs)
- [JSON Schema Limitations](../reference/json-schema-limitations.md)
- [Best Practices](../reference/best-practices.md)
- [API Compatibility](../reference/api-compatibility.md)
## Version
Current version: 0.2.0
See [CHANGELOG.md](./CHANGELOG.md) for version history.

---
name: structured-outputs-advisor
description: Use PROACTIVELY when users need guaranteed schema compliance or validated tool inputs from Anthropic's structured outputs feature. Expert advisor for choosing between JSON outputs (data extraction/formatting) and strict tool use (agentic workflows). Analyzes requirements, explains trade-offs, and delegates to specialized implementation skills. Not for simple text responses or unstructured outputs.
---
# Structured Outputs Advisor
## Overview
This skill serves as the entry point for implementing Anthropic's structured outputs feature. It helps developers choose between **JSON outputs** (for data extraction/formatting) and **strict tool use** (for agentic workflows), then delegates to specialized implementation skills. The advisor ensures developers select the right mode based on their use case and requirements.
**Two Modes Available:**
1. **JSON Outputs** (`output_format`) - Guaranteed JSON schema compliance for responses
2. **Strict Tool Use** (`strict: true`) - Validated tool parameters for function calls
**Specialized Implementation Skills:**
- `json-outputs-implementer` - For data extraction, classification, API formatting
- `strict-tool-implementer` - For agentic workflows, validated function calls
## When to Use This Skill
**Trigger Phrases:**
- "implement structured outputs"
- "need guaranteed JSON schema"
- "extract structured data from [source]"
- "validate tool inputs"
- "build reliable agentic workflow"
- "ensure type-safe responses"
- "help me with structured outputs"
**Use Cases:**
- Data extraction from text/images
- Classification with guaranteed output format
- API response formatting
- Agentic workflows with validated tools
- Type-safe database operations
- Complex tool parameter validation
## Response Style
- **Consultative**: Ask questions to understand requirements
- **Educational**: Explain both modes and when to use each
- **Decisive**: Recommend the right mode based on use case
- **Delegating**: Hand off to specialized skills for implementation
- **Concise**: Keep mode selection phase quick (<5 questions)
## Core Workflow
### Phase 1: Understand Requirements
**Questions to Ask:**
1. **What's your goal?**
- "What kind of output do you need Claude to produce?"
- Examples: Extract invoice data, validate function parameters, classify tickets
2. **What's the data source?**
- Text, images, API calls, user input, etc.
3. **What consumes the output?**
- Database, API endpoint, function call, agent workflow, etc.
4. **How critical is schema compliance?**
- Must be guaranteed vs. generally reliable
### Phase 2: Mode Selection
**Use JSON Outputs (`output_format`) when:**
- ✅ You need Claude's **response** in a specific format
- ✅ Extracting structured data from unstructured sources
- ✅ Generating reports, classifications, or API responses
- ✅ Formatting output for downstream processing
- ✅ Single-step operations
**Examples:**
- Extract contact info from emails → CRM database
- Classify support tickets → routing system
- Generate structured reports → API endpoint
- Parse invoices → accounting software
**Use Strict Tool Use (`strict: true`) when:**
- ✅ You need validated **tool input parameters**
- ✅ Building multi-step agentic workflows
- ✅ Ensuring type-safe function calls
- ✅ Complex tools with many/nested properties
- ✅ Critical operations requiring guaranteed types
**Examples:**
- Travel booking agent (flights + hotels + activities)
- Database operations with strict type requirements
- API orchestration with validated parameters
- Complex workflow automation
### Phase 3: Delegation
**After determining the mode, delegate to the specialized skill:**
**For JSON Outputs:**
```
I recommend using JSON outputs for your [use case].
I'm going to invoke the json-outputs-implementer skill to help you:
1. Design a production-ready JSON schema
2. Implement with SDK helpers (Pydantic/Zod)
3. Add validation and error handling
4. Optimize for production
[Launch json-outputs-implementer skill]
```
**For Strict Tool Use:**
```
I recommend using strict tool use for your [use case].
I'm going to invoke the strict-tool-implementer skill to help you:
1. Design validated tool schemas
2. Implement strict mode correctly
3. Build reliable agent workflows
4. Test and validate tool calls
[Launch strict-tool-implementer skill]
```
**For Both Modes (Hybrid):**
```
Your use case requires both modes:
- JSON outputs for [specific use case]
- Strict tool use for [specific use case]
I'll help you implement both, starting with [primary mode].
[Launch appropriate skill first, then the second one]
```
## Decision Matrix
| Requirement | JSON Outputs | Strict Tool Use |
|-------------|--------------|-----------------|
| Extract structured data | ✅ Primary use case | ❌ Not designed for this |
| Validate function parameters | ❌ Not designed for this | ✅ Primary use case |
| Multi-step agent workflows | ⚠️ Possible but not ideal | ✅ Designed for this |
| API response formatting | ✅ Ideal | ❌ Unnecessary |
| Database inserts (type safety) | ✅ Good fit | ⚠️ If via tool calls |
| Complex nested schemas | ✅ Supports this | ✅ Supports this |
| Classification tasks | ✅ Perfect fit | ❌ Overkill |
| Tool composition/chaining | ❌ Not applicable | ✅ Excellent |
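As a toy illustration, the matrix collapses into a small triage helper (the use-case keywords below are invented for this sketch):

```python
def recommend_mode(use_case: str) -> str:
    """Map a use-case keyword to the recommended mode, per the matrix above."""
    json_output_cases = {"extraction", "classification", "api_formatting", "reporting"}
    strict_tool_cases = {"agent_workflow", "tool_validation", "tool_chaining"}
    if use_case in json_output_cases:
        return "json_outputs"
    if use_case in strict_tool_cases:
        return "strict_tool_use"
    return "ask_more_questions"
```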
## Feature Availability
**Models Supported:**
- ✅ Claude Sonnet 4.5 (`claude-sonnet-4-5`)
- ✅ Claude Opus 4.1 (`claude-opus-4-1`)
**Beta Header Required:**
```
anthropic-beta: structured-outputs-2025-11-13
```
**Incompatible Features:**
- ❌ Citations (with JSON outputs)
- ❌ Message Prefilling (with JSON outputs)
**Compatible Features:**
- ✅ Batch Processing (50% discount)
- ✅ Token Counting
- ✅ Streaming
- ✅ Both modes together in same request
## Common Scenarios
### Scenario 1: "I need to extract invoice data"
**Analysis**: Data extraction from unstructured text
**Mode**: JSON Outputs
**Delegation**: `json-outputs-implementer`
**Reason**: Single-step extraction with structured output format
### Scenario 2: "Building a travel booking agent"
**Analysis**: Multi-tool workflow (flights, hotels, activities)
**Mode**: Strict Tool Use
**Delegation**: `strict-tool-implementer`
**Reason**: Multiple validated tools in agent workflow
### Scenario 3: "Classify customer support tickets"
**Analysis**: Classification with guaranteed categories
**Mode**: JSON Outputs
**Delegation**: `json-outputs-implementer`
**Reason**: Single classification result, structured response
### Scenario 4: "Validate database insert parameters"
**Analysis**: Type-safe database operations
**Mode**: JSON Outputs (if direct) OR Strict Tool Use (if via tool)
**Delegation**: Depends on architecture
**Reason**: Both work - choose based on system architecture
### Scenario 5: "Generate API-ready responses"
**Analysis**: Format responses for API consumption
**Mode**: JSON Outputs
**Delegation**: `json-outputs-implementer`
**Reason**: Output formatting is primary goal
## Quick Start Examples
### JSON Outputs Example
```python
# Extract contact information
from pydantic import BaseModel

from anthropic import Anthropic


class Contact(BaseModel):
    name: str
    email: str
    plan: str


client = Anthropic()

response = client.beta.messages.parse(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    betas=["structured-outputs-2025-11-13"],
    messages=[{"role": "user", "content": "Extract contact info..."}],
    output_format=Contact,
)

contact = response.parsed_output  # Guaranteed schema match
```
### Strict Tool Use Example
```python
# Validated tool for agent workflow
response = client.beta.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    betas=["structured-outputs-2025-11-13"],
    messages=[{"role": "user", "content": "Book a flight..."}],
    tools=[{
        "name": "book_flight",
        "strict": True,  # Guarantees schema compliance
        "input_schema": {
            "type": "object",
            "properties": {
                "destination": {"type": "string"},
                "passengers": {"type": "integer"}
            },
            "required": ["destination"],
            "additionalProperties": False
        }
    }]
)
# Tool inputs guaranteed to match schema
```
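Once the call returns, the validated inputs arrive as `tool_use` content blocks in the response. A minimal sketch of pulling them out; the content list below is simulated as plain dicts so the logic is runnable standalone (in real code it would be `response.content` from the SDK call above):

```python
# Extract (tool_name, validated_input) pairs from a message's content blocks.
def extract_tool_calls(content):
    return [
        (block["name"], block["input"])
        for block in content
        if block.get("type") == "tool_use"
    ]


# Simulated response content; block shapes mirror the API's text/tool_use blocks.
simulated_content = [
    {"type": "text", "text": "Booking your flight now."},
    {"type": "tool_use", "id": "toolu_01", "name": "book_flight",
     "input": {"destination": "Tokyo", "passengers": 2}},
]

calls = extract_tool_calls(simulated_content)
# With strict=True, each `input` is guaranteed to match the declared schema,
# so it can be dispatched to the real booking function without re-validation.
```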
## Success Criteria
- [ ] Requirements clearly understood
- [ ] Data source identified
- [ ] Output consumer identified
- [ ] Correct mode selected (JSON outputs vs strict tool use)
- [ ] Reasoning for mode selection explained
- [ ] Appropriate specialized skill invoked
- [ ] User understands next steps
## Important Reminders
1. **Ask before assuming** - Don't guess the mode, understand requirements first
2. **One mode is usually enough** - Most use cases need only one mode
3. **Delegate quickly** - Keep advisor phase short, let specialists handle implementation
4. **Both modes work together** - Can use both in same request if needed
5. **Model availability** - Confirm Sonnet 4.5 or Opus 4.1 is available
6. **Beta feature** - Requires beta header in API requests
## Next Steps After Mode Selection
Once mode is selected and you've delegated to the specialized skill, that skill will handle:
- ✅ Schema design (respecting JSON Schema limitations)
- ✅ SDK integration (Pydantic/Zod helpers)
- ✅ Implementation with error handling
- ✅ Testing and validation
- ✅ Production optimization
- ✅ Complete examples and documentation
---
**Official Documentation**: https://docs.anthropic.com/en/docs/build-with-claude/structured-outputs
**Related Skills**:
- `json-outputs-implementer` - Implement JSON outputs mode
- `strict-tool-implementer` - Implement strict tool use mode

View File

@@ -0,0 +1,234 @@
# Mode Selection Examples
Real-world scenarios showing how the advisor helps choose between JSON outputs and strict tool use.
## Example 1: Invoice Data Extraction
**User Request:**
> "I need to extract invoice data from PDF documents and store it in our PostgreSQL database. The invoices contain line items, tax information, and customer details."
**Advisor Analysis:**
- **Goal:** Extract structured data from documents
- **Source:** PDF documents (unstructured)
- **Consumer:** PostgreSQL database (needs type-safe inserts)
- **Complexity:** Single-step extraction
**Recommended Mode:** JSON Outputs
**Reasoning:**
- This is a data extraction task (primary use case for JSON outputs)
- Single-step operation (extract → database)
- No multi-step agent workflow needed
- Schema compliance ensures database insert succeeds
**Next Step:**
Delegate to `json-outputs-implementer` to design invoice schema with line items, tax calculations, and customer info fields.
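To make that delegation concrete, the target schema can be roughed out up front as a raw JSON Schema dict. A minimal sketch, where every field name is illustrative rather than a fixed spec:

```python
# Illustrative invoice schema for JSON outputs mode; adapt fields to your invoices.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "customer": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
            },
            "required": ["name"],
            "additionalProperties": False,
        },
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity": {"type": "integer"},
                    "unit_price": {"type": "number"},
                },
                "required": ["description", "quantity", "unit_price"],
                "additionalProperties": False,
            },
        },
        "tax_total": {"type": "number"},
    },
    "required": ["invoice_number", "line_items"],
    "additionalProperties": False,
}
```

The implementer skill would typically express this as a Pydantic model; the raw dict form is shown here so the nesting for line items and customer details is explicit.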
---
## Example 2: Travel Booking Agent
**User Request:**
> "Build an agent that can help users book complete travel itineraries. It should search for flights, compare options, book the chosen flight, find hotels near their destination, and book accommodation."
**Advisor Analysis:**
- **Goal:** Multi-step booking workflow
- **Source:** User conversation
- **Consumer:** Multiple external APIs (flights, hotels, booking systems)
- **Complexity:** Multi-tool agent workflow with sequential dependencies
**Recommended Mode:** Strict Tool Use
**Reasoning:**
- Multi-step workflow (search → compare → book → search → book)
- Multiple tools that need validated parameters
- Tool composition (flight booking influences hotel search location)
- Type-safe API calls are critical (booking with wrong parameters could charge cards incorrectly)
**Next Step:**
Delegate to `strict-tool-implementer` to design tool schemas for `search_flights`, `book_flight`, `search_hotels`, `book_hotel` with strict parameter validation.
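A rough sketch of two of those tool definitions; the tool and field names here are assumptions to be refined during implementation, not a fixed API:

```python
# Illustrative strict tool definitions for the booking agent.
travel_tools = [
    {
        "name": "search_flights",
        "strict": True,
        "input_schema": {
            "type": "object",
            "properties": {
                "origin": {"type": "string"},
                "destination": {"type": "string"},
                "date": {"type": "string"},
            },
            "required": ["origin", "destination", "date"],
            "additionalProperties": False,
        },
    },
    {
        "name": "book_flight",
        "strict": True,
        "input_schema": {
            "type": "object",
            "properties": {
                "flight_id": {"type": "string"},
                "passengers": {"type": "integer"},
            },
            "required": ["flight_id", "passengers"],
            "additionalProperties": False,
        },
    },
]
```

Marking every tool `strict` with `additionalProperties: False` is what makes the downstream API calls safe: the agent cannot invent parameters the booking backend does not expect.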
---
## Example 3: Support Ticket Classification
**User Request:**
> "We receive thousands of support tickets daily. I need to automatically classify them by category (billing, technical, sales), priority level, and route them to the right team."
**Advisor Analysis:**
- **Goal:** Classification with routing
- **Source:** Support ticket text
- **Consumer:** Routing system + metrics dashboard
- **Complexity:** Single classification operation
**Recommended Mode:** JSON Outputs
**Reasoning:**
- Classification task (perfect for JSON outputs)
- Fixed output schema (category, priority, team, confidence)
- Single-step operation
- No tool execution needed (just classification output)
**Next Step:**
Delegate to `json-outputs-implementer` to design classification schema with enums for category/priority, confidence scoring, and routing metadata.
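Enums are the key design tool here, since they guarantee the classifier can only emit known categories. A hedged sketch, with category and priority values as placeholders for your own taxonomy:

```python
# Illustrative classification schema; enum values are placeholders.
ticket_schema = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "technical", "sales"]},
        "priority": {"type": "string", "enum": ["low", "medium", "high", "urgent"]},
        "team": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["category", "priority", "team"],
    "additionalProperties": False,
}
```

Because the routing system only ever sees values from the enum lists, no defensive string matching is needed downstream.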
---
## Example 4: Database Query Agent
**User Request:**
> "I want an agent that can answer questions about our sales data. It should translate natural language questions into SQL, execute the queries safely, and return formatted results."
**Advisor Analysis:**
- **Goal:** Natural language → SQL query execution
- **Source:** User questions in natural language
- **Consumer:** Database + user (formatted results)
- **Complexity:** Tool execution with type-safe parameters + structured output
**Recommended Mode:** Both (Hybrid Approach)
**Reasoning:**
- Tool use for SQL execution: Need an `execute_sql` tool with validated parameters (schema validation constrains the call shape; pair it with parameterized queries or a read-only database role to actually guard against SQL injection)
- JSON outputs for response: Want structured results formatted consistently
- Two distinct phases: query generation/execution → result formatting
**Next Step:**
1. First: Delegate to `strict-tool-implementer` for `execute_sql` tool with strict validation
2. Then: Delegate to `json-outputs-implementer` for result formatting schema
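The two phases could be sketched as one strict tool plus one output schema. Names and fields are illustrative, and the `read_only` flag is a hypothetical safety guard rather than part of any real API:

```python
# Phase 1: strict tool for query execution.
execute_sql_tool = {
    "name": "execute_sql",
    "strict": True,
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "read_only": {"type": "boolean"},  # hypothetical safety flag
        },
        "required": ["query", "read_only"],
        "additionalProperties": False,
    },
}

# Phase 2: JSON outputs schema for the formatted result returned to the user.
result_schema = {
    "type": "object",
    "properties": {
        "columns": {"type": "array", "items": {"type": "string"}},
        "rows": {"type": "array", "items": {"type": "array"}},
        "summary": {"type": "string"},
    },
    "required": ["columns", "rows", "summary"],
    "additionalProperties": False,
}
```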
---
## Example 5: Resume Parser
**User Request:**
> "Parse resumes in various formats (PDF, DOCX, plain text) and extract structured information: personal details, work experience, education, skills. Store in our ATS database."
**Advisor Analysis:**
- **Goal:** Extract structured data from documents
- **Source:** Resume documents (various formats)
- **Consumer:** ATS (Applicant Tracking System) database
- **Complexity:** Single extraction operation
**Recommended Mode:** JSON Outputs
**Reasoning:**
- Data extraction from unstructured documents
- Well-defined output schema (resume has standard sections)
- No tool execution needed
- Database insertion requires type-safe data
**Next Step:**
Delegate to `json-outputs-implementer` to design resume schema with nested objects for work experience, education, and skills arrays.
---
## Example 6: API Response Formatter
**User Request:**
> "Our API needs to return consistent JSON responses. Sometimes Claude generates the response data, and I need it formatted exactly to our API spec with status, data, errors, and metadata fields."
**Advisor Analysis:**
- **Goal:** Format API responses consistently
- **Source:** Claude-generated content
- **Consumer:** API clients (web/mobile apps)
- **Complexity:** Response formatting
**Recommended Mode:** JSON Outputs
**Reasoning:**
- Response formatting task
- Fixed API schema that must be followed exactly
- No tool execution
- Consistency is critical for API clients
**Next Step:**
Delegate to `json-outputs-implementer` to design API response schema matching the spec, with proper error handling structure.
---
## Example 7: Research Assistant Agent
**User Request:**
> "Build an agent that researches topics by searching the web, reading articles, extracting key facts, cross-referencing sources, and generating a comprehensive research report."
**Advisor Analysis:**
- **Goal:** Multi-step research workflow
- **Source:** Web (via search tools, article fetchers)
- **Consumer:** User (research report)
- **Complexity:** Multi-tool workflow with sequential and parallel steps + structured output
**Recommended Mode:** Both (Hybrid Approach)
**Reasoning:**
- Research phase: Need tools (`search_web`, `fetch_article`, `extract_facts`) with strict validation
- Report phase: Need structured report output (JSON outputs)
- Complex workflow with multiple stages
**Next Step:**
1. First: Delegate to `strict-tool-implementer` for research tools
2. Then: Delegate to `json-outputs-implementer` for final report schema
---
## Example 8: Form Data Extraction
**User Request:**
> "Users upload scanned forms (insurance claims, applications, etc.). Extract all form fields into a structured format for processing."
**Advisor Analysis:**
- **Goal:** Extract form data
- **Source:** Scanned form images
- **Consumer:** Processing system
- **Complexity:** Single extraction
**Recommended Mode:** JSON Outputs
**Reasoning:**
- Image data extraction
- Form has known structure (predefined fields)
- No tool execution
- Type-safe data needed for downstream processing
**Next Step:**
Delegate to `json-outputs-implementer` to design form schema matching the expected fields with proper types.
---
## Decision Patterns Summary
| Scenario Type | Recommended Mode | Key Indicator |
|---------------|------------------|---------------|
| Data extraction | JSON Outputs | "Extract X from Y" |
| Classification | JSON Outputs | "Classify/categorize X" |
| API formatting | JSON Outputs | "Format response as X" |
| Report generation | JSON Outputs | "Generate report with X structure" |
| Multi-tool workflow | Strict Tool Use | "Search, then book, then..." |
| Agent with tools | Strict Tool Use | "Agent that can call X, Y, Z" |
| Type-safe function calls | Strict Tool Use | "Validate parameters for X" |
| Complex agents | Both | "Research then report" / "Query then format" |
---
## Common Misconceptions
### ❌ "I need reliable JSON, so I should use strict tool use"
**Correction:** Use JSON outputs for reliable JSON responses. Strict tool use is for tool **parameters**, not Claude's response format.
### ❌ "My agent just needs one tool, so I should use JSON outputs"
**Correction:** Even a single-tool agent benefits from strict tool use if the tool needs parameter validation. Mode choice is about **what** you're validating, not **how many** tools.
### ❌ "I can use both modes for the same thing"
**Correction:** Each mode has a specific purpose:
- JSON outputs: Claude's response format
- Strict tool use: Tool input validation
They solve different problems and can be combined when you need both.
---
**See Also:**
- [JSON Outputs Implementer Examples](../../json-outputs-implementer/examples/)
- [Strict Tool Implementer Examples](../../strict-tool-implementer/examples/)

View File

@@ -0,0 +1,17 @@
# Changelog
All notable changes to the html-diagram-creator skill will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.1.0] - 2025-11-27
### Added
- Initial release with publication-quality HTML diagram generation
- Academic styling based on HELM, BetterBench, EleutherAI papers
- Color-coded component stages (Data/Execution/Analysis)
- Three copy-paste templates (Linear, Branching, Comparison)
- Arrow SVG snippets for horizontal, vertical, and bidirectional flows
- Integration guidance with html-to-png-converter and markdown-to-pdf-converter
- Mandatory completion checklist for quality assurance

View File

@@ -0,0 +1,48 @@
# HTML Diagram Creator
Publication-quality architecture diagrams as HTML files following academic paper conventions.
## Overview
This skill generates HTML-based diagrams styled after major ML benchmark papers (HELM, BetterBench, EleutherAI Evaluation Harness). Output can be exported to PNG for embedding in research papers, documentation, and presentations.
## Key Features
- **Academic styling**: Rounded corners, subtle shadows, figure numbering
- **Color-coded stages**: Data Preparation (blue), Execution (green), Analysis (orange)
- **TikZ-inspired design**: Follows LaTeX academic paper conventions
- **Export-ready**: HTML viewable in browser, exportable to PNG via Playwright
## Quick Start
```bash
# Trigger the skill
"Create an architecture diagram for [your pipeline]"
# Export to PNG (after HTML is generated)
npx playwright screenshot --full-page --device-scale-factor=2 "file://$(pwd)/diagram.html" diagram@2x.png
```
## Trigger Phrases
- "create an architecture diagram"
- "make a pipeline diagram for my paper"
- "publication-ready figure"
- "academic diagram"
- "benchmark visualization"
## Templates Available
- **Linear Pipeline**: 3-box horizontal flow
- **Branching Architecture**: Y-split parallel paths
- **Comparison**: Before/After side-by-side
## Related Skills
- `html-to-png-converter` - Export HTML diagrams to PNG
- `markdown-to-pdf-converter` - Embed PNG in professional PDFs
- `ascii-diagram-creator` - Terminal-compatible text diagrams
## Documentation
See [SKILL.md](SKILL.md) for complete templates, CSS reference, and workflow details.

View File

@@ -0,0 +1,146 @@
---
name: html-diagram-creator
version: 0.1.0
description: Use PROACTIVELY when user needs publication-quality architecture diagrams for research papers, documentation, or presentations. Triggers on "architecture diagram", "pipeline diagram", "figure for paper", "academic diagram", "benchmark visualization", or "publication-ready figure". Generates HTML diagrams following academic paper conventions (HELM, BetterBench, EleutherAI) with proper color coding, rounded corners, figure numbering, and export to PNG. Not for ASCII diagrams or flowcharts.
---
# HTML Diagram Creator
## Overview
Generates **publication-quality architecture diagrams** as HTML files for research papers, documentation, and presentations. Follows academic conventions from HELM, BetterBench, and EleutherAI papers.
**Key Capabilities**:
- Academic styling (rounded corners, shadows, figure numbering)
- Color-coded components (Data/Execution/Analysis stages)
- Browser-based with PNG export via Playwright
- Stage grouping with labels and legends
## When to Use This Skill
**Trigger Phrases**:
- "create an architecture diagram"
- "make a pipeline diagram for my paper"
- "publication-ready figure"
- "academic diagram"
- "benchmark visualization"
**Use PROACTIVELY when**:
- User is writing research documentation or papers
- User mentions "publication", "paper", "academic"
- User requests "PNG diagram" or "exportable diagram"
**Do NOT use when**:
- User wants ASCII diagrams (use ascii-diagram-creator)
- User needs interactive flowcharts (use Mermaid)
- User wants UML diagrams
## Quick Reference
### Color Palette
| Stage | Fill | Border | Usage |
|-------|------|--------|-------|
| Data Preparation | `#E3F2FD` | `#1976D2` | Input processing, loaders |
| Execution | `#E8F5E9` | `#388E3C` | API calls, inference |
| Analysis | `#FFF3E0` | `#F57C00` | Evaluation, scoring |
| Terminals | `#FF6B6B` | `#FF6B6B` | Input/Output markers |
### Visual Standards
| Element | Implementation |
|---------|----------------|
| Corners | `border-radius: 6px` |
| Shadows | `box-shadow: 0 2px 4px rgba(0,0,0,0.08)` |
| Arrows | Dark slate `#546E7A` with triangular heads |
| Figure label | "Figure N" format above title |
## Workflow
| Phase | Description | Details |
|-------|-------------|---------|
| 1 | Requirements Gathering | Identify components, flow type, stage categories |
| 2 | HTML Generation | Create standalone HTML with academic CSS |
| 3 | Component Layout | Structure with flexbox alignment |
| 4 | Export | PNG via Playwright or screenshot |
### Phase 1: Requirements
1. **Identify components**: What boxes/stages need to be shown?
2. **Determine flow**: Linear pipeline? Branching? Multi-path?
3. **Categorize stages**: Data (blue), Execution (green), Analysis (orange)
### Phase 2: HTML Structure
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>[Diagram Title]</title>
  <style>/* Academic styling */</style>
</head>
<body>
  <div class="diagram-container">
    <div class="figure-label">Figure [N]</div>
    <h2 class="diagram-title">[Title]</h2>
    <!-- Pipeline components -->
    <p class="figure-caption">[Caption]</p>
  </div>
</body>
</html>
```
### Phase 3: Component Layout
```html
<div class="pipeline">
  <div class="component-box data-prep">
    <span class="component-name">[Name]</span>
    <span class="component-tech">[Tech]</span>
  </div>
  <div class="arrow"></div>
  <!-- More components -->
</div>
```
### Phase 4: Export
```bash
# Retina quality (recommended)
npx playwright screenshot --full-page --device-scale-factor=2 \
"file://$(pwd)/diagram.html" diagram@2x.png
# Standard resolution
npx playwright screenshot --full-page "file://$(pwd)/diagram.html" diagram.png
```
## Templates
Copy-paste ready templates available in [reference/html-templates.md](reference/html-templates.md):
- **Linear Pipeline**: 3-box horizontal flow
- **Branching Architecture**: Y-split parallel paths
- **Comparison**: Before/After side-by-side
- **Arrow Snippets**: Horizontal, vertical, bidirectional
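The templates can also be filled programmatically instead of edited by hand. A minimal stdlib-only sketch that assembles the pipeline markup from component tuples; the class names follow the linear template above, while the surrounding page boilerplate is elided:

```python
# Build the pipeline markup for the linear template from (name, tech, stage) tuples.
ARROW = '<div class="arrow"></div>'


def pipeline_html(components):
    boxes = [
        f'<div class="component-box {stage}">'
        f'<span class="component-name">{name}</span>'
        f'<span class="component-tech">{tech}</span>'
        f"</div>"
        for name, tech, stage in components
    ]
    # Join boxes with arrow dividers, matching the template's flex layout.
    return '<div class="pipeline">' + ARROW.join(boxes) + "</div>"


html = pipeline_html([
    ("Data Loader", "JSON/CSV", "data-prep"),
    ("Processor", "Node.js", "execution"),
    ("Analyzer", "Statistics", "analysis"),
])
```

The resulting fragment drops into the `diagram-container` of any template that defines the `data-prep`, `execution`, and `analysis` classes.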
## Reference
- [HTML Templates](reference/html-templates.md) - Copy-paste ready diagrams
- [CSS Components](reference/css-components.md) - Complete class definitions
- [Additional Templates](reference/templates.md) - More pipeline variants
## Completion Checklist
- [ ] HTML file generated with academic styling
- [ ] Figure numbering applied
- [ ] Color-coded by stage
- [ ] Rounded corners (6px) and shadows
- [ ] Export method explained
## Related Skills
- **html-to-png-converter**: Export HTML to PNG with retina support
- **markdown-to-pdf-converter**: Embed diagrams in PDFs
- **ascii-diagram-creator**: Terminal-compatible text diagrams
**Pipeline**: Create diagram → Export to PNG → Embed in markdown → Generate PDF

View File

@@ -0,0 +1,276 @@
# CSS Components Reference
Complete CSS class definitions for academic-style diagrams.
## Base Container
```css
body {
  font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
  background: #ffffff;
  min-height: 100vh;
  display: flex;
  justify-content: center;
  align-items: center;
  padding: 40px;
}

.diagram-container {
  background: #fafbfc;
  border: 1px solid #e1e4e8;
  border-radius: 8px;
  padding: 40px 50px;
  max-width: 1200px;
}
```
## Typography
```css
.figure-label {
  font-size: 12px;
  color: #57606a;
  margin-bottom: 8px;
  font-weight: 500;
}

.diagram-title {
  font-size: 18px;
  font-weight: 600;
  color: #24292f;
  text-align: center;
  margin-bottom: 30px;
}

.figure-caption {
  text-align: center;
  margin-top: 25px;
  font-size: 12px;
  color: #57606a;
  font-style: italic;
}
```
## Stage Labels
```css
.stage-labels {
  display: flex;
  justify-content: space-between;
  padding: 0 30px;
  margin-bottom: 15px;
}

.stage-label {
  font-size: 10px;
  font-weight: 600;
  text-transform: uppercase;
  letter-spacing: 0.5px;
  color: #6e7781;
  text-align: center;
}
```
## Component Boxes
```css
.component {
  display: flex;
  flex-direction: column;
  align-items: center;
}

.component-box {
  width: 130px;
  height: 72px;
  border-radius: 6px;
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;
  text-align: center;
  padding: 10px;
  box-shadow: 0 2px 4px rgba(0,0,0,0.08);
  border: 1px solid;
}

.component-name {
  font-size: 12px;
  font-weight: 600;
  color: #24292f;
  line-height: 1.3;
}

.component-tech {
  font-size: 9px;
  color: #57606a;
  margin-top: 4px;
  font-style: italic;
}
```
## Color-Coded Stages
```css
/* Data Preparation - Blue */
.component-box.data-prep {
  background: #e3f2fd;
  border-color: #1976d2;
}

/* Execution - Green */
.component-box.execution {
  background: #e8f5e9;
  border-color: #388e3c;
}

/* Analysis - Orange */
.component-box.analysis {
  background: #fff3e0;
  border-color: #f57c00;
}
```
## Data Labels
```css
.data-label {
  font-size: 9px;
  color: #57606a;
  font-family: 'SF Mono', Monaco, 'Courier New', monospace;
  margin-bottom: 6px;
  white-space: nowrap;
  height: 14px;
}
```
## Terminal Circles
```css
.terminal {
  width: 44px;
  height: 44px;
  border-radius: 50%;
  display: flex;
  align-items: center;
  justify-content: center;
  font-size: 11px;
  font-weight: 600;
  flex-shrink: 0;
}

.terminal.input {
  background: #fff;
  border: 3px solid #ff6b6b;
  color: #ff6b6b;
}

.terminal.output {
  background: #ff6b6b;
  border: 3px solid #ff6b6b;
  color: #fff;
}
```
## Arrows
```css
.arrow {
  display: flex;
  align-items: center;
  justify-content: center;
  width: 50px;
  flex-shrink: 0;
  padding-top: 20px;
}

.arrow svg {
  width: 40px;
  height: 16px;
}

.arrow path {
  fill: none;
  stroke: #546e7a;
  stroke-width: 2;
}

.arrow polygon {
  fill: #546e7a;
}
```
**Arrow SVG Template**:
```html
<svg viewBox="0 0 40 16">
  <path d="M0,8 L30,8" />
  <polygon points="30,4 38,8 30,12" />
</svg>
```
## Stage Brackets
```css
.stage-brackets {
  display: flex;
  justify-content: center;
  padding: 0 30px;
  margin-top: 5px;
}

.bracket {
  height: 12px;
  border-left: 2px solid #d0d7de;
  border-right: 2px solid #d0d7de;
  border-bottom: 2px solid #d0d7de;
  border-radius: 0 0 4px 4px;
}
```
## Legend
```css
.legend {
  display: flex;
  justify-content: center;
  gap: 24px;
  margin-top: 20px;
  padding-top: 15px;
  border-top: 1px solid #e1e4e8;
}

.legend-item {
  display: flex;
  align-items: center;
  gap: 6px;
  font-size: 10px;
  color: #57606a;
}

.legend-swatch {
  width: 16px;
  height: 16px;
  border-radius: 3px;
  border: 1px solid;
}
```
## Alternative Color Schemes
### Monochrome (Print-Friendly)
```css
.component-box.data-prep { background: #f5f5f5; border-color: #9e9e9e; }
.component-box.execution { background: #e0e0e0; border-color: #757575; }
.component-box.analysis { background: #bdbdbd; border-color: #616161; }
```
### Dark Theme
```css
.diagram-container { background: #1e1e1e; border-color: #333; }
.diagram-title { color: #e0e0e0; }
.component-box.data-prep { background: #1e3a5f; border-color: #42a5f5; }
.component-box.execution { background: #1b5e20; border-color: #66bb6a; }
.component-box.analysis { background: #e65100; border-color: #ffa726; }
```

View File

@@ -0,0 +1,185 @@
# HTML Diagram Templates
Copy-paste ready HTML templates for common diagram types.
## Linear Pipeline (3 boxes)
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Pipeline Diagram</title>
  <style>
    body { font-family: -apple-system, sans-serif; padding: 2em; background: #f8f9fa; }
    .diagram-container { max-width: 900px; margin: 0 auto; background: white; padding: 2em; border-radius: 8px; }
    .figure-label { font-size: 12px; color: #57606a; margin-bottom: 0.5em; }
    .diagram-title { font-size: 18px; font-weight: 600; color: #24292f; margin: 0 0 1.5em 0; }
    .pipeline { display: flex; align-items: center; justify-content: center; gap: 0; }
    .component-box { padding: 1em 1.5em; border-radius: 6px; text-align: center; box-shadow: 0 2px 4px rgba(0,0,0,0.08); min-width: 120px; }
    .component-name { font-size: 12px; font-weight: 600; display: block; }
    .component-tech { font-size: 9px; font-style: italic; color: #57606a; }
    .data-prep { background: #E3F2FD; border: 1px solid #1976D2; }
    .execution { background: #E8F5E9; border: 1px solid #388E3C; }
    .analysis { background: #FFF3E0; border: 1px solid #F57C00; }
    .arrow { width: 40px; height: 2px; background: #546E7A; position: relative; margin: 0 -1px; }
    .arrow::after { content: ''; position: absolute; right: -6px; top: -4px; border: 5px solid transparent; border-left: 6px solid #546E7A; }
    .figure-caption { font-style: italic; color: #57606a; text-align: center; margin-top: 1.5em; font-size: 14px; }
  </style>
</head>
<body>
  <div class="diagram-container">
    <div class="figure-label">Figure 1</div>
    <h2 class="diagram-title">Pipeline Architecture</h2>
    <div class="pipeline">
      <div class="component-box data-prep">
        <span class="component-name">Data Loader</span>
        <span class="component-tech">JSON/CSV</span>
      </div>
      <div class="arrow"></div>
      <div class="component-box execution">
        <span class="component-name">Processor</span>
        <span class="component-tech">Node.js</span>
      </div>
      <div class="arrow"></div>
      <div class="component-box analysis">
        <span class="component-name">Analyzer</span>
        <span class="component-tech">Statistics</span>
      </div>
    </div>
    <p class="figure-caption">Data flows through preparation, execution, and analysis stages.</p>
  </div>
</body>
</html>
```
## Branching Architecture (Y-split)
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Branching Diagram</title>
  <style>
    body { font-family: -apple-system, sans-serif; padding: 2em; background: #f8f9fa; }
    .diagram-container { max-width: 700px; margin: 0 auto; background: white; padding: 2em; border-radius: 8px; }
    .figure-label { font-size: 12px; color: #57606a; }
    .diagram-title { font-size: 18px; font-weight: 600; color: #24292f; margin: 0.5em 0 1.5em 0; }
    .component-box { padding: 1em 1.5em; border-radius: 6px; text-align: center; box-shadow: 0 2px 4px rgba(0,0,0,0.08); }
    .component-name { font-size: 12px; font-weight: 600; display: block; }
    .input { background: #E3F2FD; border: 1px solid #1976D2; width: fit-content; margin: 0 auto 1em auto; }
    .branch-a { background: #E8F5E9; border: 1px solid #388E3C; }
    .branch-b { background: #FFF3E0; border: 1px solid #F57C00; }
    .branches { display: flex; justify-content: center; gap: 3em; margin-top: 1em; }
    .branch { text-align: center; }
    .vertical-arrow { width: 2px; height: 30px; background: #546E7A; margin: 0 auto; position: relative; }
    .vertical-arrow::after { content: ''; position: absolute; bottom: -6px; left: -4px; border: 5px solid transparent; border-top: 6px solid #546E7A; }
    .figure-caption { font-style: italic; color: #57606a; text-align: center; margin-top: 1.5em; font-size: 14px; }
  </style>
</head>
<body>
  <div class="diagram-container">
    <div class="figure-label">Figure 2</div>
    <h2 class="diagram-title">Branching Architecture</h2>
    <div class="component-box input"><span class="component-name">Input Router</span></div>
    <div class="branches">
      <div class="branch">
        <div class="vertical-arrow"></div>
        <div class="component-box branch-a"><span class="component-name">Path A</span></div>
      </div>
      <div class="branch">
        <div class="vertical-arrow"></div>
        <div class="component-box branch-b"><span class="component-name">Path B</span></div>
      </div>
    </div>
    <p class="figure-caption">Input is routed to parallel processing paths.</p>
  </div>
</body>
</html>
```
## Comparison (Before/After)
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Comparison Diagram</title>
  <style>
    body { font-family: -apple-system, sans-serif; padding: 2em; background: #f8f9fa; }
    .diagram-container { max-width: 800px; margin: 0 auto; background: white; padding: 2em; border-radius: 8px; }
    .figure-label { font-size: 12px; color: #57606a; }
    .diagram-title { font-size: 18px; font-weight: 600; color: #24292f; margin: 0.5em 0 1.5em 0; }
    .comparison { display: flex; gap: 2em; justify-content: center; }
    .side { flex: 1; max-width: 300px; }
    .side-label { font-size: 14px; font-weight: 600; text-align: center; margin-bottom: 1em; padding: 0.5em; border-radius: 4px; }
    .before-label { background: #ffebee; color: #c62828; }
    .after-label { background: #e8f5e9; color: #2e7d32; }
    .component-box { padding: 0.8em 1em; border-radius: 6px; text-align: center; margin-bottom: 0.5em; font-size: 12px; }
    .old { background: #fafafa; border: 1px solid #ccc; color: #666; }
    .new { background: #E3F2FD; border: 1px solid #1976D2; }
    .figure-caption { font-style: italic; color: #57606a; text-align: center; margin-top: 1.5em; font-size: 14px; }
  </style>
</head>
<body>
  <div class="diagram-container">
    <div class="figure-label">Figure 3</div>
    <h2 class="diagram-title">Architecture Comparison</h2>
    <div class="comparison">
      <div class="side">
        <div class="side-label before-label">Before</div>
        <div class="component-box old">Legacy Component A</div>
        <div class="component-box old">Legacy Component B</div>
        <div class="component-box old">Legacy Component C</div>
      </div>
      <div class="side">
        <div class="side-label after-label">After</div>
        <div class="component-box new">Modern Service</div>
        <div class="component-box new">Unified API</div>
      </div>
    </div>
    <p class="figure-caption">Migration consolidates three legacy components into two modern services.</p>
  </div>
</body>
</html>
```
## Arrow SVG Snippets
### Horizontal Arrow (right)
```html
<div class="arrow" style="width: 40px; height: 2px; background: #546E7A; position: relative;">
<div style="position: absolute; right: -6px; top: -4px; border: 5px solid transparent; border-left: 6px solid #546E7A;"></div>
</div>
```
### Horizontal Arrow (left)
```html
<div class="arrow" style="width: 40px; height: 2px; background: #546E7A; position: relative;">
<div style="position: absolute; left: -6px; top: -4px; border: 5px solid transparent; border-right: 6px solid #546E7A;"></div>
</div>
```
### Vertical Arrow (down)
```html
<div style="width: 2px; height: 30px; background: #546E7A; margin: 0 auto; position: relative;">
<div style="position: absolute; bottom: -6px; left: -4px; border: 5px solid transparent; border-top: 6px solid #546E7A;"></div>
</div>
```
### Vertical Arrow (up)
```html
<div style="width: 2px; height: 30px; background: #546E7A; margin: 0 auto; position: relative;">
<div style="position: absolute; top: -6px; left: -4px; border: 5px solid transparent; border-bottom: 6px solid #546E7A;"></div>
</div>
```
### Bidirectional Arrow (horizontal)
```html
<div style="width: 40px; height: 2px; background: #546E7A; position: relative; margin: 0 8px;">
<div style="position: absolute; left: -6px; top: -4px; border: 5px solid transparent; border-right: 6px solid #546E7A;"></div>
<div style="position: absolute; right: -6px; top: -4px; border: 5px solid transparent; border-left: 6px solid #546E7A;"></div>
</div>
```

View File

@@ -0,0 +1,225 @@
# Diagram Templates
Ready-to-use HTML templates for common diagram patterns.
## Linear Pipeline Template
Standard left-to-right data flow (most common for benchmarks).
```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>[TITLE]</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
background: #fff;
min-height: 100vh;
display: flex;
justify-content: center;
align-items: center;
padding: 40px;
}
.diagram-container {
background: #fafbfc;
border: 1px solid #e1e4e8;
border-radius: 8px;
padding: 40px 50px;
max-width: 1200px;
}
.figure-label { font-size: 12px; color: #57606a; margin-bottom: 8px; font-weight: 500; }
.diagram-title { font-size: 18px; font-weight: 600; color: #24292f; text-align: center; margin-bottom: 30px; }
.pipeline { display: flex; align-items: center; justify-content: center; }
.component { display: flex; flex-direction: column; align-items: center; }
.component-box {
width: 130px; height: 72px; border-radius: 6px;
display: flex; flex-direction: column; align-items: center; justify-content: center;
text-align: center; padding: 10px;
box-shadow: 0 2px 4px rgba(0,0,0,0.08); border: 1px solid;
}
.component-box.data-prep { background: #e3f2fd; border-color: #1976d2; }
.component-box.execution { background: #e8f5e9; border-color: #388e3c; }
.component-box.analysis { background: #fff3e0; border-color: #f57c00; }
.component-name { font-size: 12px; font-weight: 600; color: #24292f; line-height: 1.3; }
.component-tech { font-size: 9px; color: #57606a; margin-top: 4px; font-style: italic; }
.data-label { font-size: 9px; color: #57606a; font-family: monospace; margin-bottom: 6px; height: 14px; }
.terminal {
width: 44px; height: 44px; border-radius: 50%;
display: flex; align-items: center; justify-content: center;
font-size: 11px; font-weight: 600;
}
.terminal.input { background: #fff; border: 3px solid #ff6b6b; color: #ff6b6b; }
.terminal.output { background: #ff6b6b; border: 3px solid #ff6b6b; color: #fff; }
.arrow { display: flex; align-items: center; justify-content: center; width: 50px; padding-top: 20px; }
.arrow svg { width: 40px; height: 16px; }
.arrow path { fill: none; stroke: #546e7a; stroke-width: 2; }
.arrow polygon { fill: #546e7a; }
.figure-caption { text-align: center; margin-top: 25px; font-size: 12px; color: #57606a; font-style: italic; }
.legend { display: flex; justify-content: center; gap: 24px; margin-top: 20px; padding-top: 15px; border-top: 1px solid #e1e4e8; }
.legend-item { display: flex; align-items: center; gap: 6px; font-size: 10px; color: #57606a; }
.legend-swatch { width: 16px; height: 16px; border-radius: 3px; border: 1px solid; }
.legend-swatch.data-prep { background: #e3f2fd; border-color: #1976d2; }
.legend-swatch.execution { background: #e8f5e9; border-color: #388e3c; }
.legend-swatch.analysis { background: #fff3e0; border-color: #f57c00; }
</style>
</head>
<body>
<div class="diagram-container">
<div class="figure-label">Figure [N]</div>
<h2 class="diagram-title">[TITLE]</h2>
<div class="pipeline">
<!-- Input Terminal -->
<div class="component">
<span class="data-label">&nbsp;</span>
<div class="terminal input">In</div>
</div>
<div class="arrow"><svg viewBox="0 0 40 16"><path d="M0,8 L30,8"/><polygon points="30,4 38,8 30,12"/></svg></div>
<!-- Component 1 -->
<div class="component">
<span class="data-label">[data_type_1]</span>
<div class="component-box data-prep">
<span class="component-name">[Name 1]</span>
<span class="component-tech">[Technology]</span>
</div>
</div>
<div class="arrow"><svg viewBox="0 0 40 16"><path d="M0,8 L30,8"/><polygon points="30,4 38,8 30,12"/></svg></div>
<!-- Component 2 -->
<div class="component">
<span class="data-label">[data_type_2]</span>
<div class="component-box execution">
<span class="component-name">[Name 2]</span>
<span class="component-tech">[Technology]</span>
</div>
</div>
<div class="arrow"><svg viewBox="0 0 40 16"><path d="M0,8 L30,8"/><polygon points="30,4 38,8 30,12"/></svg></div>
<!-- Component 3 -->
<div class="component">
<span class="data-label">[data_type_3]</span>
<div class="component-box analysis">
<span class="component-name">[Name 3]</span>
<span class="component-tech">[Technology]</span>
</div>
</div>
<div class="arrow"><svg viewBox="0 0 40 16"><path d="M0,8 L30,8"/><polygon points="30,4 38,8 30,12"/></svg></div>
<!-- Output Terminal -->
<div class="component">
<span class="data-label">Results</span>
<div class="terminal output">Out</div>
</div>
</div>
<p class="figure-caption">[CAPTION - Describe what the diagram shows]</p>
<div class="legend">
<div class="legend-item"><div class="legend-swatch data-prep"></div><span>Data Preparation</span></div>
<div class="legend-item"><div class="legend-swatch execution"></div><span>Execution</span></div>
<div class="legend-item"><div class="legend-swatch analysis"></div><span>Analysis</span></div>
</div>
</div>
</body>
</html>
```
## Branching Architecture Template
For systems with parallel processing or multiple output paths.
```html
<!-- Add to body after pipeline -->
<div class="branch-container">
<div class="branch-label">Branch A</div>
<div class="pipeline branch">
<!-- Branch A components -->
</div>
<div class="branch-label">Branch B</div>
<div class="pipeline branch">
<!-- Branch B components -->
</div>
</div>
<style>
.branch-container { margin-top: 20px; }
.branch-label {
font-size: 10px; font-weight: 600;
color: #57606a; margin: 10px 0 5px 50px;
}
.pipeline.branch {
padding-left: 50px;
border-left: 2px solid #d0d7de;
}
</style>
```
## Multi-Stage with Grouping Template
For showing distinct phases with visual grouping.
```html
<!-- Add stage labels above pipeline -->
<div class="stage-labels">
<span class="stage-label" style="width: 280px;">Data Preparation</span>
<span class="stage-label" style="width: 140px;">Execution</span>
<span class="stage-label" style="width: 280px;">Analysis</span>
</div>
<!-- Add brackets below pipeline -->
<div class="stage-brackets">
<div class="bracket" style="width: 340px; margin-right: 10px;"></div>
<div class="bracket" style="width: 130px; margin-right: 10px;"></div>
<div class="bracket" style="width: 340px;"></div>
</div>
<style>
.stage-labels {
display: flex; justify-content: space-between;
padding: 0 30px; margin-bottom: 15px;
}
.stage-label {
font-size: 10px; font-weight: 600;
text-transform: uppercase; letter-spacing: 0.5px;
color: #6e7781; text-align: center;
}
.stage-brackets {
display: flex; justify-content: center;
padding: 0 30px; margin-top: 5px;
}
.bracket {
height: 12px;
border-left: 2px solid #d0d7de;
border-right: 2px solid #d0d7de;
border-bottom: 2px solid #d0d7de;
border-radius: 0 0 4px 4px;
}
</style>
```
## Vertical Stack Template
For systems with top-to-bottom flow.
```html
<style>
.pipeline.vertical {
flex-direction: column;
gap: 0;
}
.arrow.vertical {
transform: rotate(90deg);
height: 50px;
width: auto;
padding: 0 20px;
}
</style>
```
## Usage Notes
1. **Replace placeholders**: `[TITLE]`, `[N]`, `[Name]`, `[Technology]`, `[data_type]`, `[CAPTION]`
2. **Adjust widths**: Modify `.stage-label` and `.bracket` widths to match component count
3. **Add/remove components**: Copy component blocks as needed
4. **Change stages**: Use appropriate class (`data-prep`, `execution`, `analysis`)
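Usage note 1 can be scripted for repeated use. A minimal sed sketch, assuming the linear-pipeline template has been saved as `template.html` (file names and placeholder values here are illustrative):

```shell
# Minimal stand-in for a saved template (illustrative; in practice this is
# the full HTML template above saved as template.html).
printf '<h2>[TITLE]</h2>\n<div class="figure-label">Figure [N]</div>\n' > template.html

# Fill placeholders with sed, writing a new file so the template stays reusable.
sed -e 's/\[TITLE\]/Benchmark Pipeline/' \
    -e 's/\[N\]/1/' \
    template.html > pipeline.html
```

The same pattern extends to the `[Name]`, `[Technology]`, and `[data_type]` placeholders.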


@@ -0,0 +1,17 @@
# Changelog
All notable changes to this skill will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.1.0] - 2025-11-27
### Added
- Initial skill implementation
- Playwright CLI screenshot automation
- Full-page capture support
- File protocol URL handling
- Batch conversion workflow
- Troubleshooting reference
- Example commands for common use cases


@@ -0,0 +1,64 @@
# HTML to PNG Converter
A Claude Code skill for converting HTML files to PNG images using Playwright's CLI screenshot functionality.
## Purpose
Automates the conversion of HTML diagrams, charts, and documents to PNG images for use in:
- Academic papers and research publications
- Technical documentation
- Presentations and slide decks
- README files and project docs
## Prerequisites
- Node.js >= 16
- Playwright (`npm install -g playwright` or use via `npx`)
## Quick Start
```bash
# Install Playwright (if not already installed)
npm install -g playwright
playwright install chromium
# Convert HTML to PNG
playwright screenshot --full-page "file://$(pwd)/diagram.html" diagram.png
```
## Usage
### Single File
```bash
playwright screenshot --full-page "file:///absolute/path/to/file.html" output.png
```
### Batch Conversion
```bash
for f in docs/*.html; do
playwright screenshot --full-page "file://$(pwd)/$f" "${f%.html}.png"
done
```
## Key Options
| Option | Description |
|--------|-------------|
| `--full-page` | Capture entire document (not just viewport) |
| `--viewport-size=W,H` | Set viewport dimensions (e.g., `1920,1080`) |
| `--wait-for-timeout=ms` | Wait before screenshot (for dynamic content) |
## Troubleshooting
See `reference/troubleshooting.md` for common issues and solutions.
## Related Skills
- **png-diagram-creator**: Creates HTML diagrams with academic styling
- **ascii-diagram-creator**: Terminal-compatible text diagrams
## License
MIT


@@ -0,0 +1,160 @@
---
name: html-to-png-converter
description: Use PROACTIVELY when user needs to convert HTML diagrams, charts, or documents to PNG images for papers, presentations, or documentation. Automates Playwright's screenshot command with proper file:// protocol handling, full-page capture, and output organization. Triggers on "convert HTML to PNG", "export diagram to image", "screenshot HTML file", or "make PNG from HTML". Not for live website screenshots, PDF generation, or image format conversions.
version: 0.1.0
author: Connor
category: documentation
---
# HTML to PNG Converter
## Overview
This skill automates HTML-to-PNG conversion using Playwright's CLI screenshot functionality. It handles file protocol URLs, full-page capture, and output path management for academic papers, documentation, and presentations.
**Key Capabilities**:
- **Script-free**: Uses the Playwright CLI directly (no Node script required)
- **Full-page capture**: Captures entire document, not just viewport
- **File protocol handling**: Properly constructs `file://` URLs from paths
- **Batch conversion**: Convert multiple HTML files in one operation
- **Output organization**: Consistent naming and directory structure
## When to Use This Skill
**Trigger Phrases**:
- "convert this HTML to PNG"
- "export the diagram as an image"
- "screenshot the HTML file"
- "make a PNG from the HTML"
- "turn diagram.html into diagram.png"
**Use Cases**:
- Converting HTML architecture diagrams to PNG for papers
- Exporting HTML charts/visualizations for presentations
- Creating static images from HTML reports
- Batch converting multiple HTML files to images
- Generating figures for academic publications
**NOT for**:
- Capturing live websites (use browser or dedicated tools)
- PDF generation (use Print to PDF or wkhtmltopdf)
- Image format conversions (use ImageMagick)
- Animated/interactive content capture
- Screenshots of running web applications
## Quick Reference
### Single File Conversion
```bash
# Basic usage (npx for portability)
npx playwright screenshot --full-page "file://$(pwd)/path/to/diagram.html" output.png
# Retina/high-DPI output (2x resolution for publications)
npx playwright screenshot --full-page --scale=2 "file://$(pwd)/diagram.html" output@2x.png
# Custom viewport size (default is 1280x720)
npx playwright screenshot --full-page --viewport-size=1920,1080 "file://$(pwd)/diagram.html" output.png
# With absolute path
npx playwright screenshot --full-page "file:///Users/you/project/diagram.html" diagram.png
```
### Batch Conversion
```bash
# Convert all HTML files in a directory
for f in docs/*.html; do
npx playwright screenshot --full-page "file://$(pwd)/$f" "${f%.html}.png"
done
# Batch with retina quality
for f in docs/*.html; do
npx playwright screenshot --full-page --scale=2 "file://$(pwd)/$f" "${f%.html}@2x.png"
done
```
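Browser startup dominates batch time, so overlapping conversions with `xargs -P` can help. Shown as a dry run that only prints the planned commands (swap `echo` for `npx playwright screenshot --full-page` once a single file converts cleanly; the file names are illustrative):

```shell
# Dry-run of a parallel batch: up to 4 jobs at a time (tune -P).
# Each input file becomes one sh invocation with the path available as $0.
printf '%s\n' docs/a.html docs/b.html |
  xargs -n 1 -P 4 sh -c 'echo "file://$PWD/$0 -> ${0%.html}.png"' > conversions.txt
cat conversions.txt
```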
### Common Viewport Sizes
| Size | Use Case |
|------|----------|
| `1280,720` | Default, standard diagrams |
| `1920,1080` | Full HD presentations |
| `800,600` | Compact figures |
| `2560,1440` | Large architecture diagrams |
## Workflow
### Phase 1: Prerequisite Check
Verify Playwright is installed with browsers.
**Details**: `workflow/phase-1-prerequisites.md`
### Phase 2: Path Resolution
Construct proper file:// URLs from relative/absolute paths.
**Details**: `workflow/phase-2-paths.md`
### Phase 3: Screenshot Capture
Execute Playwright screenshot command with options.
**Details**: `workflow/phase-3-capture.md`
### Phase 4: Output Verification
Verify PNG was created and check dimensions/quality.
**Details**: `workflow/phase-4-verification.md`
## Important Reminders
1. **Always use file:// protocol** - Playwright requires full URLs
2. **Use --full-page flag** - Without it, only captures viewport (1280x720)
3. **Absolute paths are safer** - Use `$(pwd)` or full paths to avoid issues
4. **Check browser installation** - Run `playwright install` if needed
5. **HTML must be self-contained** - External resources need absolute paths
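Reminder 5 can be checked mechanically before converting. A grep heuristic sketch that flags `src`/`href` values which are neither absolute URLs, `file://` paths, `data:` URIs, nor fragments (the sample HTML is illustrative):

```shell
# Any line printed here points at an asset that may not load under file://.
printf '<img src="images/logo.png"><a href="https://example.com">ok</a>\n' > diagram.html
grep -oE '(src|href)="[^"]+"' diagram.html |
  grep -vE '"(https?:|file:|data:|#)' > relative-refs.txt || true
cat relative-refs.txt
```

It is only a heuristic (it misses `url(...)` references in CSS, for instance), but it catches the common case cheaply.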
## Troubleshooting Quick Reference
| Issue | Cause | Solution |
|-------|-------|----------|
| "Browser not found" | Browsers not installed | `npx playwright install` |
| Blank/white image | File path wrong | Check file:// URL format |
| Partial capture | Missing --full-page | Add `--full-page` flag |
| Missing images/CSS | Relative paths in HTML | Use absolute paths or embed |
| Command not found | Playwright not in PATH | Use `npx playwright screenshot` |
| Image too small/blurry | Standard resolution | Add `--scale=2` for retina |
| Wrong dimensions | Default viewport | Use `--viewport-size=WIDTH,HEIGHT` |
**Full troubleshooting**: `reference/troubleshooting.md`
## Success Criteria
- [ ] Playwright installed and accessible
- [ ] HTML file path correctly resolved
- [ ] file:// URL properly constructed
- [ ] Screenshot command executed successfully
- [ ] PNG file created at expected location
- [ ] Image dimensions match content (not the default 1280x720 viewport)
- [ ] All visual elements rendered correctly
## Limitations
- Requires Node.js and Playwright installed
- First run downloads browsers (~500MB)
- Cannot capture dynamic/animated content
- External resources in HTML may not load correctly
- Very large HTML files may take longer to render
## Related Skills
- **html-diagram-creator**: Create HTML diagrams for conversion to PNG
- **markdown-to-pdf-converter**: Full document pipeline (diagrams embedded in PDFs)
## Reference Materials
| Resource | Purpose |
|----------|---------|
| `workflow/*.md` | Detailed phase instructions |
| `reference/troubleshooting.md` | Common issues and fixes |
| `reference/playwright-cli.md` | Full CLI options reference |
| `examples/` | Sample conversion commands |
---
**Total time**: ~5 seconds per conversion (after initial setup)


@@ -0,0 +1,120 @@
# Conversion Examples
## Example 1: Academic Diagram
Converting an architecture diagram for a research paper.
**Context**: You have `docs/architecture_diagram.html` and need `docs/architecture_diagram.png`.
```bash
# Navigate to project root
cd /path/to/project
# Convert with full-page capture
playwright screenshot --full-page "file://$(pwd)/docs/architecture_diagram.html" docs/architecture_diagram.png
# Verify dimensions
sips -g pixelHeight -g pixelWidth docs/architecture_diagram.png
```
## Example 2: Bar Chart
Converting a bar chart visualization.
**Context**: You have `docs/loophole_rate_diagram.html` showing experimental results.
```bash
# High-resolution output for publication
playwright screenshot --full-page --scale=2 "file://$(pwd)/docs/loophole_rate_diagram.html" docs/loophole_rate_diagram.png
```
## Example 3: Batch Conversion
Converting all HTML files in a directory.
**Context**: Multiple diagrams in `docs/figures/`.
```bash
# Create output directory
mkdir -p docs/figures/png
# Batch convert all HTML files
for f in docs/figures/*.html; do
filename=$(basename "$f" .html)
playwright screenshot --full-page "file://$(pwd)/$f" "docs/figures/png/${filename}.png"
echo "Converted: $f"
done
```
## Example 4: Dark Mode Variant
Creating light and dark mode versions of a diagram.
```bash
# Light mode (default)
playwright screenshot --full-page "file://$(pwd)/diagram.html" diagram-light.png
# Dark mode
playwright screenshot --full-page --color-scheme=dark "file://$(pwd)/diagram.html" diagram-dark.png
```
## Example 5: From Empathy Experiment
Real example from paralleLLM project:
```bash
# Convert the loophole rate bar chart
playwright screenshot --full-page "file://$(pwd)/docs/loophole_rate_diagram.html" docs/loophole_rate_diagram.png
# Convert architecture pipeline diagram
playwright screenshot --full-page "file://$(pwd)/docs/architecture_diagram_v2.html" docs/architecture_diagram_v2.png
```
## Example 6: With Wait for Animations
When HTML has CSS transitions or animations:
```bash
# Wait 1 second for animations to complete
playwright screenshot --full-page --wait-for-timeout=1000 "file://$(pwd)/animated-diagram.html" output.png
```
## Shell Script for Project Integration
Create a reusable script `scripts/html-to-png.sh`:
```bash
#!/bin/bash
# html-to-png.sh - Convert HTML diagrams to PNG
# Usage: ./scripts/html-to-png.sh <input.html> [output.png]
set -e
INPUT="$1"
OUTPUT="${2:-${INPUT%.html}.png}"
if [ -z "$INPUT" ]; then
echo "Usage: $0 <input.html> [output.png]"
exit 1
fi
if [ ! -f "$INPUT" ]; then
echo "Error: File not found: $INPUT"
exit 1
fi
echo "Converting: $INPUT -> $OUTPUT"
playwright screenshot --full-page "file://$(pwd)/$INPUT" "$OUTPUT"
echo "Done! Dimensions:"
sips -g pixelHeight -g pixelWidth "$OUTPUT"
```
Make executable:
```bash
chmod +x scripts/html-to-png.sh
```
Use:
```bash
./scripts/html-to-png.sh docs/diagram.html docs/diagram.png
```


@@ -0,0 +1,109 @@
# Playwright CLI Reference
## Screenshot Command
```bash
playwright screenshot [options] <url> <output>
```
## Options
| Option | Description | Default |
|--------|-------------|---------|
| `--full-page` | Capture full scrollable page | Off (viewport only) |
| `--viewport-size=W,H` | Set viewport size | 1280,720 |
| `--scale=N` | Device scale factor | 1 |
| `--wait-for-timeout=ms` | Wait before screenshot | 0 |
| `--wait-for-selector=sel` | Wait for element | None |
| `--color-scheme=mode` | `light` or `dark` | System |
| `--device=name` | Emulate device | None |
| `--timeout=ms` | Navigation timeout | 30000 |
| `--browser=name` | Browser to use | chromium |
## Browser Options
```bash
# Use specific browser
playwright screenshot --browser=firefox "file://$(pwd)/diagram.html" output.png
playwright screenshot --browser=webkit "file://$(pwd)/diagram.html" output.png
playwright screenshot --browser=chromium "file://$(pwd)/diagram.html" output.png
```
## Device Emulation
```bash
# Emulate specific device
playwright screenshot --device="iPhone 12" "file://$(pwd)/diagram.html" output.png
playwright screenshot --device="iPad Pro" "file://$(pwd)/diagram.html" output.png
```
List all devices:
```bash
playwright devices
```
## Examples
### Basic Full Page
```bash
playwright screenshot --full-page "file://$(pwd)/diagram.html" output.png
```
### High Resolution (2x)
```bash
playwright screenshot --full-page --scale=2 "file://$(pwd)/diagram.html" output@2x.png
```
### Dark Mode
```bash
playwright screenshot --full-page --color-scheme=dark "file://$(pwd)/diagram.html" dark.png
```
### Custom Viewport
```bash
playwright screenshot --viewport-size=1920,1080 "file://$(pwd)/diagram.html" output.png
```
### Wait for Content
```bash
# Wait 2 seconds for dynamic content
playwright screenshot --full-page --wait-for-timeout=2000 "file://$(pwd)/diagram.html" output.png
# Wait for specific element
playwright screenshot --full-page --wait-for-selector=".loaded" "file://$(pwd)/diagram.html" output.png
```
## PDF Generation (Alternative)
For PDF output instead of PNG:
```bash
playwright pdf "file://$(pwd)/document.html" output.pdf
```
PDF-specific options:
- `--format=Letter|A4|...`
- `--landscape`
- `--margin=top,right,bottom,left`
- `--print-background`
## Installation Commands
```bash
# Install Playwright
npm install -g playwright
# Install browsers
playwright install # All browsers
playwright install chromium # Chromium only
playwright install firefox # Firefox only
playwright install webkit # WebKit only
# Check version
playwright --version
```


@@ -0,0 +1,144 @@
# Troubleshooting Guide
## Common Issues and Solutions
### "Browser not found" / "Executable doesn't exist"
**Cause**: Playwright browsers not installed.
**Solution**:
```bash
# Install browsers
playwright install chromium
# Or install all browsers
playwright install
```
### "Command not found: playwright"
**Cause**: Playwright not installed globally.
**Solutions**:
```bash
# Option 1: Install globally
npm install -g playwright
# Option 2: Use via npx (no install needed)
npx playwright screenshot --full-page "file://$(pwd)/diagram.html" output.png
```
### Blank or White Image
**Cause**: File path not resolving correctly.
**Debug**:
```bash
# Check the file exists
ls -la diagram.html
# Check URL format (should be file://...)
echo "file://$(pwd)/diagram.html"
```
**Solution**: Ensure using `file://` protocol with absolute path.
### Image is 800x600 (Viewport Only)
**Cause**: Missing `--full-page` flag.
**Solution**:
```bash
# Add --full-page flag
playwright screenshot --full-page "file://$(pwd)/diagram.html" output.png
```
### Missing Images/CSS in Output
**Cause**: HTML uses relative paths that don't resolve in file:// context.
**Solutions**:
1. **Use absolute paths in HTML**:
```html
<!-- Instead of -->
<img src="images/logo.png">
<!-- Use -->
<img src="file:///Users/you/project/images/logo.png">
```
2. **Embed images as base64**:
```html
<img src="data:image/png;base64,iVBORw0KGgoAAAANS...">
```
3. **Inline CSS**:
```html
<style>
/* CSS inline instead of <link> */
</style>
```
### Fonts Not Rendering
**Cause**: Web fonts not loading in file:// context.
**Solutions**:
1. Use system fonts:
```css
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
```
2. Embed fonts as base64 in CSS.
### Slow Conversion
**Cause**: Browser startup overhead or large content.
**Solutions**:
```bash
# For batch operations, reuse browser (requires script)
# For single operations, ~3-5 seconds is normal
# If content is dynamic, reduce wait time
playwright screenshot --full-page --wait-for-timeout=500 "file://$(pwd)/diagram.html" output.png
```
### Permission Denied
**Cause**: Cannot write to output directory.
**Solution**:
```bash
# Check directory permissions
ls -la $(dirname output.png)
# Create directory if needed
mkdir -p docs/images
```
### Fuzzy/Blurry Text
**Cause**: Low DPI capture.
**Solution**:
```bash
# Use 2x scale for retina-quality output
playwright screenshot --full-page --scale=2 "file://$(pwd)/diagram.html" output.png
```
## Debug Mode
For detailed troubleshooting:
```bash
# Enable debug output
DEBUG=pw:api playwright screenshot --full-page "file://$(pwd)/diagram.html" output.png
```
## Getting Help
1. Check Playwright docs: https://playwright.dev/docs/cli
2. Verify HTML renders in browser: `open diagram.html`
3. Test with simple HTML first to isolate issue


@@ -0,0 +1,53 @@
# Phase 1: Prerequisites Check
## Overview
Before converting HTML to PNG, verify Playwright is installed and browsers are available.
## Steps
### 1.1 Check Playwright Installation
```bash
# Check if playwright is available
playwright --version
```
**If not found**:
```bash
# Install globally
npm install -g playwright
# Or use via npx (no global install)
npx playwright --version
```
### 1.2 Install Browsers
```bash
# Install all browsers
playwright install
# Or just Chromium (smallest, ~150MB)
playwright install chromium
```
### 1.3 Verify Setup
```bash
# Quick test - should create a PNG of example.com
playwright screenshot https://example.com test.png && rm test.png
echo "Playwright setup verified!"
```
## Common Issues
| Issue | Solution |
|-------|----------|
| `command not found: playwright` | `npm install -g playwright` or use `npx playwright` |
| Browser not found | Run `playwright install chromium` |
| Permission denied | Use `sudo npm install -g playwright` or fix npm permissions |
## Next Phase
Once prerequisites are verified, proceed to **Phase 2: Path Resolution**.


@@ -0,0 +1,67 @@
# Phase 2: Path Resolution
## Overview
Playwright's screenshot command requires properly formatted `file://` URLs. This phase covers path construction patterns.
## URL Format
Playwright requires URLs, not bare file paths:
```
file:// + absolute_path = valid URL
```
## Path Patterns
### From Current Directory
```bash
# Using $(pwd) for absolute path
playwright screenshot --full-page "file://$(pwd)/diagram.html" output.png
```
### From Absolute Path
```bash
# Direct absolute path
playwright screenshot --full-page "file:///Users/connor/project/diagram.html" output.png
```
### Handling Spaces in Paths
```bash
# Quote the entire URL
playwright screenshot --full-page "file://$(pwd)/my diagram.html" output.png
```
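Quoting keeps the shell happy, but a strictly valid file URL percent-encodes the space. When in doubt, Python's stdlib (assumed available as `python3`) builds a fully encoded URL from any path:

```shell
# Turn a path (spaces, unicode and all) into a percent-encoded file:// URL.
touch "my diagram.html"   # illustrative input
python3 -c 'import pathlib, sys; print(pathlib.Path(sys.argv[1]).resolve().as_uri())' \
  "my diagram.html" > url.txt
cat url.txt
```

The resulting URL (ending in `my%20diagram.html`) can be passed straight to `playwright screenshot`.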
### Relative to Project Root
```bash
# Navigate from any subdirectory
playwright screenshot --full-page "file://$(git rev-parse --show-toplevel)/docs/diagram.html" output.png
```
## Path Validation
Before running conversion, verify the file exists:
```bash
# Check file exists
test -f "diagram.html" && echo "File found" || echo "File not found"
# List HTML files in current directory
ls -la *.html
```
## Anti-Patterns
| Wrong | Correct |
|-------|---------|
| `playwright screenshot diagram.html` | `playwright screenshot "file://$(pwd)/diagram.html"` |
| `playwright screenshot ./diagram.html` | `playwright screenshot "file://$(pwd)/diagram.html"` |
| `playwright screenshot file:diagram.html` | `playwright screenshot "file://$(pwd)/diagram.html"` |
## Next Phase
Once path is resolved, proceed to **Phase 3: Screenshot Capture**.


@@ -0,0 +1,71 @@
# Phase 3: Screenshot Capture
## Overview
Execute the Playwright screenshot command with appropriate options for your use case.
## Basic Command
```bash
playwright screenshot --full-page "file://$(pwd)/diagram.html" output.png
```
## Command Options
| Option | Purpose | Example |
|--------|---------|---------|
| `--full-page` | Capture entire document | Required for most diagrams |
| `--viewport-size=W,H` | Set initial viewport | `--viewport-size=1920,1080` |
| `--wait-for-timeout=ms` | Wait before capture | `--wait-for-timeout=1000` |
| `--device=name` | Emulate device | `--device="iPhone 12"` |
| `--color-scheme=mode` | Light/dark mode | `--color-scheme=dark` |
## Common Patterns
### Academic Diagrams (Most Common)
```bash
# Full page capture - lets content determine size
playwright screenshot --full-page "file://$(pwd)/docs/architecture.html" docs/architecture.png
```
### Fixed Width Output
```bash
# Set specific width, full page height
playwright screenshot --full-page --viewport-size=1200,800 "file://$(pwd)/diagram.html" output.png
```
### Wait for Dynamic Content
```bash
# Wait 2 seconds for animations/rendering
playwright screenshot --full-page --wait-for-timeout=2000 "file://$(pwd)/diagram.html" output.png
```
### Dark Mode Diagram
```bash
playwright screenshot --full-page --color-scheme=dark "file://$(pwd)/diagram.html" output-dark.png
```
## Batch Conversion
```bash
# Convert all HTML files in docs/ to PNG
for f in docs/*.html; do
output="${f%.html}.png"
playwright screenshot --full-page "file://$(pwd)/$f" "$output"
echo "Converted: $f -> $output"
done
```
## Output Location
- **Same directory**: Just use filename (`output.png`)
- **Different directory**: Use path (`docs/images/output.png`)
- **Ensure directory exists**: `mkdir -p docs/images` before running
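A small guard folds the directory check into the capture step. A sketch with an illustrative output path (the playwright line is commented out so the guard pattern can be seen in isolation):

```shell
# Ensure the output directory exists before capturing into it.
out="docs/images/architecture.png"   # illustrative path
mkdir -p "$(dirname "$out")"
# npx playwright screenshot --full-page "file://$(pwd)/docs/architecture.html" "$out"
echo "output dir ready: $(dirname "$out")"
```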
## Next Phase
Proceed to **Phase 4: Output Verification** to validate the result.


@@ -0,0 +1,84 @@
# Phase 4: Output Verification
## Overview
After conversion, verify the PNG was created correctly with expected dimensions and content.
## Verification Steps
### 4.1 Check File Exists
```bash
# Verify file was created
test -f output.png && echo "PNG created successfully" || echo "ERROR: PNG not found"
```
### 4.2 Check File Size
```bash
# Non-zero file size indicates content
ls -lh output.png
# If file is very small (<1KB), something may be wrong
```
### 4.3 Check Dimensions (macOS)
```bash
# Using sips (built into macOS)
sips -g pixelHeight -g pixelWidth output.png
```
**Expected output**:
```
pixelHeight: 1200
pixelWidth: 800
```
**Warning signs**:
- 1280x720 (default viewport) = viewport-only capture (missing `--full-page`)
- Very small dimensions = content not rendering
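`sips` is macOS-only; the same numbers can be read portably from the PNG header itself with stdlib Python. A sketch (the truncated one-pixel fixture stands in for a real output.png, since only the IHDR width/height bytes are read):

```shell
# Report WIDTHxHEIGHT by parsing bytes 16-24 of the PNG header (IHDR).
png_size() {
  python3 -c 'import struct, sys; b = open(sys.argv[1], "rb").read(24); print("%dx%d" % struct.unpack(">II", b[16:24]))' "$1"
}

# Fixture: PNG signature + IHDR length/type + 1x1 dimensions (header bytes only).
printf '\211PNG\r\n\032\n\000\000\000\015IHDR\000\000\000\001\000\000\000\001' > output.png
png_size output.png > dims.txt
cat dims.txt
```

On a real capture, `png_size output.png` replaces the sips line; ImageMagick's `identify output.png` also works where installed.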
### 4.4 Visual Inspection
```bash
# Open in default image viewer (macOS)
open output.png
# Or in Preview specifically
open -a Preview output.png
```
## Common Issues and Fixes
| Symptom | Cause | Fix |
|---------|-------|-----|
| 1280x720 dimensions | No `--full-page` | Add `--full-page` flag |
| Blank/white image | Wrong file path | Check `file://` URL |
| Missing images | Relative paths in HTML | Use absolute paths or embed base64 |
| Cut off content | Viewport too small | Use `--full-page` or increase viewport |
| Fuzzy text | Low DPI | Add `--scale=2` for retina |
## Quality Checks
- [ ] File exists and has reasonable size (>10KB for diagrams)
- [ ] Dimensions match content (not the 1280x720 default viewport)
- [ ] All visual elements rendered (text, colors, borders)
- [ ] No blank areas or missing components
- [ ] Text is readable and sharp
## Retina/High-DPI Output
For sharper images (publications):
```bash
# 2x scale for retina
playwright screenshot --full-page --scale=2 "file://$(pwd)/diagram.html" diagram@2x.png
```
## Cleanup
```bash
# Remove test files if any
rm -f test.png
```


@@ -0,0 +1,18 @@
# Changelog
All notable changes to this skill will be documented in this file.
## [0.1.0] - 2025-11-27
### Added
- Initial release of markdown-to-pdf-converter skill
- Academic-style CSS template (pdf-style.css)
- Playwright diagram capture script
- Figure centering patterns that work with weasyprint
- Manual page break documentation
- Common issues troubleshooting guide
### Based On
- paralleLLM empathy-experiment-v1.0.pdf formatting standards
- pandoc + weasyprint toolchain
- Playwright for HTML → PNG capture


@@ -0,0 +1,71 @@
# Markdown to PDF Converter
A Claude Code skill for converting markdown documents to professional, print-ready PDFs using pandoc and weasyprint with academic styling.
## Overview
This skill automates the markdown-to-PDF pipeline with:
- Academic-style CSS (system fonts, proper tables, page breaks)
- HTML diagram capture via Playwright at retina quality
- Iterative refinement workflow for complex documents
## Prerequisites
```bash
# Required
brew install pandoc
pip install weasyprint
# Optional (for diagram capture)
npm install playwright
npx playwright install chromium
```
## Usage
Trigger the skill with phrases like:
- "convert this markdown to PDF"
- "generate a PDF from this document"
- "create a professional PDF report"
## Key Features
### Academic Table Styling
Tables use traditional academic formatting with top/bottom borders on headers and clean cell spacing.
### Smart Page Breaks
- Headings stay with following content
- Tables and figures don't split across pages
- Manual page breaks via `<div style="page-break-before: always;"></div>`
### Figure Centering
Proper figure centering that works in weasyprint (not all CSS properties are supported).
### Retina-Quality Diagrams
Playwright captures HTML diagrams at 2x resolution for crisp print output.
## File Structure
```
markdown-to-pdf-converter/
├── SKILL.md # Main skill instructions
├── README.md # This file
├── CHANGELOG.md # Version history
├── templates/
│ ├── pdf-style.css # Academic CSS stylesheet
│ └── capture-diagrams.js # Playwright screenshot script
├── examples/
│ └── report-template.md # Example markdown structure
├── reference/
│ └── weasyprint-notes.md # CSS compatibility notes
└── workflow/
└── iterative-refinement.md # Page break tuning process
```
## Version
0.1.0 - Initial release based on paralleLLM empathy-experiment-v1.0.pdf
## Author
Connor Skiro


@@ -0,0 +1,221 @@
---
name: markdown-to-pdf-converter
description: "Use PROACTIVELY when converting markdown documents to professional PDFs. Automates the pandoc + weasyprint pipeline with academic-style CSS, proper page breaks, and HTML diagram capture via Playwright. Supports reports, papers, and technical documentation. Not for slides or complex layouts requiring InDesign."
version: "0.1.0"
author: "Connor Skiro"
---
# Markdown to PDF Converter
Converts markdown documents to professional, print-ready PDFs using pandoc and weasyprint with academic styling.
## Overview
This skill provides a complete pipeline for converting markdown to publication-quality PDFs:
1. **Markdown → HTML**: pandoc with standalone CSS
2. **HTML → PDF**: weasyprint with academic styling
3. **HTML → PNG**: Playwright for diagram capture (optional)
Key features: academic table borders, proper page breaks, figure centering, retina-quality diagram export.
## When to Use
**Trigger Phrases**:
- "convert this markdown to PDF"
- "generate a PDF from this document"
- "create a professional PDF report"
- "export markdown as PDF"
**Use Cases**:
- Technical reports and whitepapers
- Research papers and academic documents
- Project documentation
- Experiment analysis reports
**NOT for**:
- Presentation slides (use Marp or reveal.js)
- Complex multi-column layouts
- Documents requiring precise InDesign-level control
## Quick Start
```bash
# Prerequisites
brew install pandoc
pip install weasyprint
npm install playwright # For diagram capture
npx playwright install chromium # Download the browser Playwright drives
# Verify installation
which pandoc weasyprint # Both should return paths
# Basic conversion (two-step)
pandoc document.md -o document.html --standalone --css=pdf-style.css
weasyprint document.html document.pdf
# One-liner (pipe pandoc to weasyprint)
pandoc document.md --standalone --css=pdf-style.css -t html | weasyprint - document.pdf
```
## Workflow Modes
| Mode | Use Case | Process |
|------|----------|---------|
| Quick Convert | Simple docs | Markdown → HTML → PDF |
| Academic Report | Papers with figures | + CSS styling + diagram capture |
| Iterative | Complex layout | Review PDF, adjust page breaks, regenerate |
## Academic PDF Style Standards
### Typography
```css
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif;
line-height: 1.6;
max-width: 800px;
margin: 0 auto;
padding: 2em;
}
```
### Tables (Academic Style)
- Top border: 2px solid on header
- Bottom border: 2px solid on header AND last row
- Cell padding: 0.5em 0.75em
- Page break avoidance: `page-break-inside: avoid`
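In CSS, these rules look like the following (condensed from `templates/pdf-style.css`):

```css
table {
  border-collapse: collapse;
  page-break-inside: avoid;
}
table th, table td {
  padding: 0.5em 0.75em;
}
/* 2px rules above and below the header, and below the last row */
table thead th {
  border-top: 2px solid #000;
  border-bottom: 2px solid #000;
}
table tbody tr:last-child td {
  border-bottom: 2px solid #000;
}
```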
### Page Control
| Element | Rule |
|---------|------|
| Page margins | 2cm |
| Headings | `page-break-after: avoid` |
| Figures | `page-break-inside: avoid` |
| Tables | `page-break-inside: avoid` |
| Orphans/widows | 3 lines minimum |
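These rules translate directly to CSS; the same declarations appear in `templates/pdf-style.css`:

```css
@page { margin: 2cm; }                      /* page margins */
h2, h3, h4 { page-break-after: avoid; }     /* keep headings with content */
figure, table { page-break-inside: avoid; } /* don't split figures/tables */
p { orphans: 3; widows: 3; }                /* 3-line minimum at breaks */
```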
### Figure Centering (Critical)
```html
<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="diagram.png" alt="Description" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure 1: Caption text
</figcaption>
</figure>
```
### Manual Page Breaks
```html
<div style="page-break-before: always;"></div>
```
## Diagram Capture with Playwright
For HTML diagrams that need PNG export:
```javascript
const { chromium } = require('playwright');
async function captureDiagram(htmlPath, pngPath) {
const browser = await chromium.launch();
const context = await browser.newContext({ deviceScaleFactor: 2 }); // Retina quality
const page = await context.newPage();
await page.goto(`file://${htmlPath}`);
const element = page.locator('.diagram-container');
await element.screenshot({ path: pngPath, type: 'png' });
await browser.close();
}
```
**Key settings**:
- `deviceScaleFactor: 2` for retina-quality PNGs
- Target `.diagram-container` selector for clean capture
- Use `max-width: 100%` in CSS, NOT `min-width`
## CSS Template Location
See `templates/pdf-style.css` for full academic stylesheet.
## Markdown Structure for Reports
```markdown
# Title
## Subtitle (optional)
**Metadata** (date, author, etc.)
---
## Abstract
Summary paragraph...
---
## 1. Section Title
### 1.1 Subsection
Content with tables, figures...
---
## Appendix A: Title
Supporting materials...
```
## Success Criteria
- [ ] PDF renders without weasyprint errors
- [ ] All images display correctly
- [ ] Tables don't split across pages
- [ ] Figures are centered with captions
- [ ] No orphaned headings at page bottoms
- [ ] Manual page breaks work as expected
- [ ] Text is readable (not cut off)
## Common Issues
| Issue | Solution |
|-------|----------|
| Image cut off | Remove `min-width`, use `max-width: 100%` |
| Image off-center | Add `margin: auto; text-align: center` to figure |
| Table split across pages | Add `page-break-inside: avoid` |
| Heading orphaned | CSS already handles with `page-break-after: avoid` |
| Too much whitespace | Remove unnecessary `<div style="page-break-before: always;">` |
## WeasyPrint CSS Compatibility
WeasyPrint does not support all CSS properties. The following will generate warnings (safe to ignore, but they can be removed for cleaner output):
| Unsupported Property | Alternative |
|---------------------|-------------|
| `gap` | Use `margin` on child elements |
| `overflow-x` | Not needed for print |
| `user-select` | Not needed for print |
| `backdrop-filter` | Not supported in print |
| `scroll-behavior` | Not needed for print |
**Clean CSS template tip**: Remove these properties from your stylesheet to avoid warning messages during conversion.
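As a sketch of that tip, a small Node helper (hypothetical — not one of this skill's templates) can strip those declarations from a stylesheet before conversion:

```javascript
// Properties WeasyPrint warns about (from the table above).
const UNSUPPORTED = [
  'gap', 'overflow-x', 'overflow-y',
  'user-select', 'backdrop-filter', 'scroll-behavior',
];

// Remove whole declarations whose property WeasyPrint would ignore.
function stripUnsupported(css) {
  const re = new RegExp(
    `^\\s*(?:${UNSUPPORTED.join('|')})\\s*:[^;]*;.*$\\n?`, 'gm'
  );
  return css.replace(re, '');
}
```

Run it over a copy of `pdf-style.css` before conversion to silence the warnings without changing the rendered output.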
## Reference Files
- `templates/pdf-style.css` - Full CSS stylesheet
- `templates/capture-diagrams.js` - Playwright capture script
- `examples/report-template.md` - Example markdown structure
- `workflow/iterative-refinement.md` - Page break tuning process
## Related Skills
- **html-diagram-creator**: Create publication-quality HTML diagrams
- **html-to-png-converter**: Convert HTML diagrams to PNG for embedding
**Documentation Pipeline**: Create diagrams (html-diagram-creator) → Convert to PNG (html-to-png-converter) → Embed in markdown → Export to PDF (this skill)
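The pipeline above can be sketched as a single shell function — a sketch assuming the tools from Quick Start are installed; the function name and file layout are illustrative:

```shell
# build_report — end-to-end documentation pipeline (illustrative)
build_report() {
  local md="$1"                              # e.g. report.md
  # 1. Re-capture any HTML diagrams as retina PNGs
  node capture-diagrams.js .
  # 2. Markdown → HTML → PDF in one pass
  pandoc "$md" --standalone --css=pdf-style.css -t html \
    | weasyprint - "${md%.md}.pdf"
}
```

Invoked as `build_report report.md`, this leaves `report.pdf` next to the source file.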
---
**Based on**: paralleLLM empathy-experiment-v1.0.pdf

# Report Title
## Subtitle or Description
**Date:** YYYY-MM-DD
**Author:** Name
**Version:** 1.0
---
## Abstract
Brief summary of the document (1-2 paragraphs). State the key findings or purpose upfront.
---
## Executive Summary
| Metric | Result |
|--------|--------|
| Key Finding 1 | Brief description |
| Key Finding 2 | Brief description |
| Sample Size | n = X |
---
## 1. Introduction
### 1.1 Background
Context and motivation for this work...
### 1.2 Objectives
1. **Objective 1**: Description
2. **Objective 2**: Description
3. **Objective 3**: Description
---
## 2. Methodology
### 2.1 Approach
Description of the approach taken...
### 2.2 Variables
**Independent Variable**: Description
| Level | Description | Example |
|-------|-------------|---------|
| Level 1 | Description | Example |
| Level 2 | Description | Example |
<div style="page-break-before: always;"></div>
**Dependent Variables**:
| Variable | Type | Measurement |
|----------|------|-------------|
| Variable 1 | Type | How measured |
| Variable 2 | Type | How measured |
### 2.3 Infrastructure
<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="architecture_diagram.png" alt="System Architecture" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure 1: System architecture diagram
</figcaption>
</figure>
---
## 3. Results
### 3.1 Summary Statistics
| Category | N | Mean | Std Dev | Key Metric |
|----------|---|------|---------|------------|
| Category A | 100 | 0.5 | 0.1 | 50% |
| Category B | 100 | 0.6 | 0.2 | 60% |
### 3.2 Key Findings
**Finding 1: Title**
Description of the finding with supporting data...
**Finding 2: Title**
Description of the finding with supporting data...
---
## 4. Discussion
### 4.1 Interpretation
Analysis of what the results mean...
### 4.2 Implications
| Scenario | Risk Level | Recommendation |
|----------|------------|----------------|
| Scenario A | **Low** | Safe to proceed |
| Scenario B | **High** | Exercise caution |
---
## 5. Limitations
1. **Limitation 1**: Description and impact
2. **Limitation 2**: Description and impact
---
## 6. Future Work
1. **Direction 1**: Description
2. **Direction 2**: Description
---
## 7. Conclusion
Summary of key findings and their significance...
**Bottom line**: One-sentence takeaway.
---
## Appendix A: Supporting Materials
### A.1 Sample Data
> Example content or data samples shown in blockquote format
### A.2 Additional Figures
<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="appendix_figure.png" alt="Additional Figure" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure A.1: Additional supporting figure
</figcaption>
</figure>
---
*Report generated by [Author Name]*

# WeasyPrint CSS Compatibility Notes
WeasyPrint doesn't support all CSS properties. This reference documents what works and what doesn't.
## Supported (Works)
### Layout
- `max-width`, `min-width` (but avoid min-width on images)
- `margin`, `padding`
- `display: block`, `display: inline-block`
- `text-align`
- `width`, `height` (with units)
### Typography
- `font-family`, `font-size`, `font-weight`, `font-style`
- `line-height`
- `color`
### Tables
- `border-collapse`
- `border` properties
- `padding` on cells
### Print/Page
- `@page { margin: ... }`
- `page-break-before`, `page-break-after`, `page-break-inside`
- `orphans`, `widows`
### Backgrounds
- `background-color`
- `background` (simple)
## NOT Supported (Ignored)
### Modern CSS
- `gap` (use margin instead)
- `overflow-x`, `overflow-y`
- CSS Grid layout
- Flexbox (limited support)
- CSS variables (`--custom-property`)
- `min()`, `max()`, `clamp()` functions
### Advanced Selectors
- `:has()` (limited)
- Complex pseudo-selectors
## Common Warnings
```
WARNING: Ignored `gap: min(4vw, 1.5em)` at X:Y, invalid value.
WARNING: Ignored `overflow-x: auto` at X:Y, unknown property.
```
These warnings are informational: WeasyPrint simply skips the declarations it does not understand, and the output is unaffected.
## Image Centering Pattern
WeasyPrint-compatible centering:
```html
<!-- This works -->
<figure style="margin: 2em auto; text-align: center;">
<img style="max-width: 100%; display: inline-block;">
</figure>
<!-- This does NOT work reliably -->
<figure style="display: flex; justify-content: center;">
<img>
</figure>
```
## Page Break Pattern
```html
<!-- Explicit page break -->
<div style="page-break-before: always;"></div>
<!-- Keep together -->
<div style="page-break-inside: avoid;">
Content that should stay together
</div>
```

/**
* Diagram Capture Script
* Converts HTML diagrams to high-resolution PNGs using Playwright
*
* Usage:
* node capture-diagrams.js [html-file] [output-png]
* node capture-diagrams.js # Captures all diagrams in current directory
*
* Prerequisites:
* npm install playwright
* npx playwright install chromium
*/
const { chromium } = require('playwright');
const path = require('path');
const fs = require('fs');
// Configuration
const CONFIG = {
deviceScaleFactor: 2, // 2x for retina quality
selector: '.diagram-container', // Default container selector
};
/**
* Capture a single HTML file to PNG
*/
async function captureScreenshot(htmlPath, pngPath, selector = CONFIG.selector) {
const browser = await chromium.launch();
const context = await browser.newContext({
deviceScaleFactor: CONFIG.deviceScaleFactor,
});
const page = await context.newPage();
const absoluteHtmlPath = path.resolve(htmlPath);
console.log(`Capturing ${htmlPath}...`);
await page.goto(`file://${absoluteHtmlPath}`);
const element = page.locator(selector);
await element.screenshot({
path: pngPath,
type: 'png',
});
console.log(` → Saved to ${pngPath}`);
await browser.close();
}
/**
* Capture all HTML diagrams in a directory
*/
async function captureAllDiagrams(directory = '.') {
const browser = await chromium.launch();
const context = await browser.newContext({
deviceScaleFactor: CONFIG.deviceScaleFactor,
});
const page = await context.newPage();
// Find all *_diagram*.html files
const files = fs.readdirSync(directory)
.filter(f => f.endsWith('.html') && f.includes('diagram'));
if (files.length === 0) {
console.log('No diagram HTML files found in directory');
await browser.close();
return;
}
let failures = 0;
for (const htmlFile of files) {
const htmlPath = path.join(directory, htmlFile);
const pngPath = htmlPath.replace('.html', '.png');
console.log(`Capturing ${htmlFile}...`);
await page.goto(`file://${path.resolve(htmlPath)}`);
try {
const element = page.locator(CONFIG.selector);
await element.screenshot({ path: pngPath, type: 'png' });
console.log(`  → Saved to ${path.basename(pngPath)}`);
} catch (error) {
failures += 1;
console.log(`  ✗ Failed: ${error.message}`);
}
}
await browser.close();
console.log(failures === 0
? '\nAll diagrams captured successfully!'
: `\n${failures} diagram(s) failed to capture.`);
}
// Main execution
async function main() {
const args = process.argv.slice(2);
if (args.length === 2) {
// Single file mode: node capture-diagrams.js input.html output.png
await captureScreenshot(args[0], args[1]);
} else if (args.length === 1) {
// Directory mode: node capture-diagrams.js ./docs
await captureAllDiagrams(args[0]);
} else {
// Default: capture all diagrams in current directory
await captureAllDiagrams('.');
}
}
main().catch(console.error);

/* Academic PDF Style Template
* For use with pandoc + weasyprint
* Based on: paralleLLM empathy-experiment-v1.0.pdf
*/
/* ==========================================================================
Base Typography
========================================================================== */
body {
font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif;
line-height: 1.6;
max-width: 800px;
margin: 0 auto;
padding: 2em;
}
h1, h2, h3, h4 {
margin-top: 1.5em;
margin-bottom: 0.5em;
}
h2 {
margin-top: 2em;
}
h3 {
margin-top: 1.5em;
}
/* ==========================================================================
Tables (Academic Style)
========================================================================== */
table {
width: 100%;
border-collapse: collapse;
margin: 1em 0;
page-break-inside: avoid;
}
table th, table td {
padding: 0.5em 0.75em;
text-align: left;
vertical-align: top;
}
/* Academic-style borders: top/bottom on header, bottom on last row */
table thead th {
border-top: 2px solid #000;
border-bottom: 2px solid #000;
font-weight: bold;
}
table tbody td {
border-bottom: 1px solid #ddd;
}
table tbody tr:last-child td {
border-bottom: 2px solid #000;
}
/* ==========================================================================
Block Elements
========================================================================== */
blockquote {
border-left: 4px solid #ddd;
margin: 1em 0;
padding-left: 1em;
color: #555;
page-break-inside: avoid;
}
code {
background: #f5f5f5;
padding: 0.2em 0.4em;
border-radius: 3px;
font-size: 0.9em;
}
pre {
background: #f5f5f5;
padding: 1em;
border-radius: 5px;
page-break-inside: avoid;
}
pre code {
background: none;
padding: 0;
}
hr {
border: none;
border-top: 1px solid #ddd;
margin: 2em 0;
}
/* ==========================================================================
Figures and Images
========================================================================== */
figure {
page-break-inside: avoid;
margin: 1.5em 0;
}
figure img {
max-width: 100%;
height: auto;
display: block;
}
figcaption {
text-align: center;
font-style: italic;
margin-top: 0.5em;
font-size: 0.9em;
}
/* ==========================================================================
Page Control (Print/PDF)
========================================================================== */
@page {
margin: 2cm;
}
/* Keep headings with following content */
h2, h3, h4 {
page-break-after: avoid;
}
/* Prevent orphan paragraphs */
p {
orphans: 3;
widows: 3;
}
/* Keep lists together when possible */
ul, ol {
page-break-inside: avoid;
}
/* ==========================================================================
Utility Classes
========================================================================== */
/* For centered figures in weasyprint */
.figure-centered {
margin: 2em auto;
text-align: center;
}
.figure-centered img {
display: inline-block;
max-width: 100%;
}
/* Small text for appendix tables */
.small-text {
font-size: 0.85em;
}
