Initial commit

This commit is contained in:
Zhongwei Li
2025-11-29 18:16:43 +08:00
commit 5fce0eef2b
46 changed files with 8067 additions and 0 deletions

View File

@@ -0,0 +1,13 @@
# Changelog
## 0.2.0
- Refactored to Anthropic progressive disclosure pattern
- Updated description with "Use PROACTIVELY when..." format
- Extracted detailed content to reference/ and examples/ directories
## 0.1.0
- Initial skill release
- Comprehensive codebase analysis against 2024-25 SDLC standards
- OWASP, WCAG, and DORA metrics evaluation

View File

@@ -0,0 +1,253 @@
# Codebase Auditor Skill
> Comprehensive codebase audit tool based on modern SDLC best practices (2024-25 standards)
An Anthropic Skill that analyzes codebases for quality issues, security vulnerabilities, technical debt, and generates prioritized remediation plans.
## Features
- **Progressive Disclosure**: Three-phase analysis (Discovery → Deep Analysis → Report)
- **Multi-Language Support**: JavaScript, TypeScript, Python (extensible)
- **Comprehensive Analysis**:
- Code Quality (complexity, duplication, code smells)
- Security (secrets detection, OWASP Top 10, dependency vulnerabilities)
- Testing (coverage analysis, testing trophy distribution)
- Technical Debt (SQALE rating, remediation estimates)
- **Multiple Report Formats**: Markdown, JSON, HTML dashboard
- **Prioritized Remediation Plans**: P0-P3 severity with effort estimates
- **Industry Standards**: Based on 2024-25 SDLC best practices
## Installation
1. Copy the `codebase-auditor` directory to your Claude skills directory
2. Ensure Python 3.8+ is installed
3. No additional dependencies required (uses Python standard library)
## Usage with Claude Code
### Basic Audit
```
Audit this codebase using the codebase-auditor skill.
```
### Focused Audit
```
Run a security-focused audit on this codebase.
```
### Quick Health Check
```
Give me a quick health check of this codebase (Phase 1 only).
```
### Custom Scope
```
Audit this codebase focusing on:
- Test coverage and quality
- Security vulnerabilities
- Code complexity
```
## Direct Script Usage
```bash
# Full audit with Markdown report
python scripts/audit_engine.py /path/to/codebase --output report.md
# Security-focused audit
python scripts/audit_engine.py /path/to/codebase --scope security --output security-report.md
# JSON output for CI/CD integration
python scripts/audit_engine.py /path/to/codebase --format json --output report.json
# Quick health check only (Phase 1)
python scripts/audit_engine.py /path/to/codebase --phase quick
```
## Output Formats
### Markdown (Default)
Human-readable report with detailed findings and recommendations. Suitable for:
- Pull request comments
- Documentation
- Team reviews
### JSON
Machine-readable format for CI/CD integration. Includes:
- Structured findings
- Metrics and scores
- Full metadata
### HTML
Interactive dashboard with:
- Visual metrics
- Filterable findings
- Color-coded severity levels
## Audit Criteria
The skill audits based on 10 key categories:
1. **Code Quality**: Complexity, duplication, code smells, file/function length
2. **Testing**: Coverage, test quality, testing trophy distribution
3. **Security**: Secrets detection, OWASP Top 10, dependency vulnerabilities
4. **Architecture**: SOLID principles, design patterns, modularity
5. **Performance**: Build times, bundle size, runtime efficiency
6. **Documentation**: Code docs, README, architecture docs
7. **DevOps & CI/CD**: Pipeline maturity, deployment frequency, DORA metrics
8. **Dependencies**: Outdated packages, license compliance, CVEs
9. **Accessibility**: WCAG 2.1 AA compliance
10. **TypeScript Strict Mode**: Type safety, strict mode violations
See [`reference/audit_criteria.md`](reference/audit_criteria.md) for complete checklist.
## Severity Levels
- **Critical (P0)**: Fix immediately (within 24 hours)
- Security vulnerabilities, secrets exposure, production-breaking bugs
- **High (P1)**: Fix this sprint (within 2 weeks)
- Significant quality/security issues, critical path test gaps
- **Medium (P2)**: Fix next quarter (within 3 months)
- Code smells, documentation gaps, moderate technical debt
- **Low (P3)**: Backlog
- Stylistic issues, minor optimizations
See [`reference/severity_matrix.md`](reference/severity_matrix.md) for detailed criteria.
## Examples
See the [`examples/`](examples/) directory for:
- Sample audit report
- Sample remediation plan
## Architecture
```
codebase-auditor/
├── SKILL.md # Skill definition (Claude loads this)
├── README.md # This file
├── scripts/
│ ├── audit_engine.py # Core orchestrator
│ ├── analyzers/ # Specialized analyzers
│ │ ├── code_quality.py # Complexity, duplication, smells
│ │ ├── test_coverage.py # Coverage analysis
│ │ ├── security_scan.py # Security vulnerabilities
│ │ ├── dependencies.py # Dependency health
│ │ ├── performance.py # Performance analysis
│ │ └── technical_debt.py # SQALE rating
│ ├── report_generator.py # Multi-format reports
│ └── remediation_planner.py # Prioritized action plans
├── reference/
│ ├── audit_criteria.md # Complete audit checklist
│ ├── severity_matrix.md # Issue prioritization
│ └── best_practices_2025.md # SDLC standards
└── examples/
├── sample_report.md
└── remediation_plan.md
```
## Extending the Skill
### Adding a New Analyzer
1. Create `scripts/analyzers/your_analyzer.py`
2. Implement `analyze(codebase_path, metadata)` function that returns findings list
3. Add to `ANALYZERS` dict in `audit_engine.py`
Example:
```python
from pathlib import Path
from typing import Dict, List

def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
    findings = []
    # Your analysis logic here
    findings.append({
        'severity': 'high',
        'category': 'your_category',
        'subcategory': 'specific_issue',
        'title': 'Issue title',
        'description': 'What was found',
        'file': 'path/to/file.js',
        'line': 42,
        'code_snippet': 'problematic code',
        'impact': 'Why it matters',
        'remediation': 'How to fix it',
        'effort': 'low|medium|high',
    })
    return findings
```
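A minimal sketch of the registration step. Since `audit_engine.py` is not shown here, the registry shape below is a hypothetical illustration:

```python
# Hypothetical sketch of the registry in scripts/audit_engine.py;
# the existing entries and import style may differ.
from analyzers import your_analyzer

ANALYZERS = {
    # ... existing analyzers ...
    'your_category': your_analyzer.analyze,
}
```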
## CI/CD Integration
### GitHub Actions Example
```yaml
name: Code Audit
on: [pull_request]
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run Codebase Audit
        run: |
          python codebase-auditor/scripts/audit_engine.py . \
            --format json \
            --output audit-report.json
      - name: Check for Critical Issues
        run: |
          CRITICAL=$(jq '.summary.critical_issues' audit-report.json)
          if [ "$CRITICAL" -gt 0 ]; then
            echo "❌ Found $CRITICAL critical issues"
            exit 1
          fi
```
## Best Practices
1. **Run Incrementally**: For large codebases, use progressive disclosure
2. **Focus on Critical Paths**: Audit authentication, payment, data processing first
3. **Baseline Before Releases**: Establish quality gates before major releases
4. **Track Over Time**: Compare audits to measure improvement
5. **Integrate with CI/CD**: Automate for continuous monitoring
6. **Customize Thresholds**: Adjust severity based on project maturity
## Limitations
- Static analysis only (no runtime profiling)
- Requires source code access
- Dependency data requires internet access (for vulnerability databases)
- Large codebases may need chunked analysis
## Version
**1.0.0** - Initial release
## Standards Compliance
Based on:
- DORA State of DevOps Report 2024
- OWASP Top 10 (2024 Edition)
- WCAG 2.1 Guidelines
- Kent C. Dodds Testing Trophy
- SonarQube Quality Gates
## License
Apache 2.0 (example skill for demonstration)
---
**Built with**: Python 3.8+
**Anthropic Skill Version**: 1.0
**Last Updated**: 2024-10-21

View File

@@ -0,0 +1,112 @@
---
name: codebase-auditor
description: Use PROACTIVELY when evaluating code quality, assessing technical debt, or preparing for production deployment. Comprehensive audit tool analyzing software engineering practices, security vulnerabilities (OWASP Top 10), and technical debt using modern SDLC best practices (2024-25 standards). Generates prioritized remediation plans with effort estimates. Not for runtime profiling or real-time monitoring.
---
# Codebase Auditor
Comprehensive codebase audits using modern software engineering standards with actionable remediation plans.
## When to Use
- Audit codebase for quality, security, maintainability
- Assess technical debt and estimate remediation
- Prepare production readiness report
- Evaluate legacy codebase for modernization
## Audit Phases
### Phase 1: Initial Assessment
- Project discovery (tech stack, frameworks, tools)
- Quick health check (LOC, docs, git practices)
- Red flag detection (secrets, massive files)
### Phase 2: Deep Analysis
Load on demand based on Phase 1 findings.
### Phase 3: Report Generation
Comprehensive report with scores and priorities.
### Phase 4: Remediation Planning
Prioritized action plan with effort estimates.
## Analysis Categories
| Category | Key Checks |
|----------|------------|
| Code Quality | Complexity, duplication, code smells |
| Testing | Coverage (80% min), trophy distribution, quality |
| Security | OWASP Top 10, dependencies, secrets |
| Architecture | SOLID, patterns, modularity |
| Performance | Build time, bundle size, runtime |
| Documentation | JSDoc, README, ADRs |
| DevOps | CI/CD maturity, DORA metrics |
| Accessibility | WCAG 2.1 AA compliance |
## Technical Debt Rating (SQALE)
| Grade | Remediation Effort |
|-------|-------------------|
| A | <= 5% of dev time |
| B | 6-10% |
| C | 11-20% |
| D | 21-50% |
| E | > 50% |
## Usage Examples
```
# Basic audit
Audit this codebase using the codebase-auditor skill.
# Security focused
Run a security-focused audit on this codebase.
# Quick health check
Give me a quick health check (Phase 1 only).
# Custom scope
Audit focusing on test coverage and security.
```
## Output Formats
1. **Markdown Report** - Human-readable for PR comments
2. **JSON Report** - Machine-readable for CI/CD
3. **HTML Dashboard** - Interactive visualization
4. **Remediation Plan** - Prioritized action items
## Priority Levels
| Priority | Examples | Timeline |
|----------|----------|----------|
| P0 Critical | Security vulns, data loss risks | Immediate |
| P1 High | Coverage gaps, performance issues | This sprint |
| P2 Medium | Code smells, doc gaps | Next quarter |
| P3 Low | Stylistic, minor optimizations | Backlog |
## Best Practices
1. Run incrementally for large codebases
2. Focus on critical paths first
3. Baseline before major releases
4. Track metrics over time
5. Integrate with CI/CD
## Integrations
Complements: SonarQube, ESLint, Jest/Vitest, npm audit, Lighthouse, GitHub Actions
## Limitations
- Static analysis only (no runtime profiling)
- Requires source code access
- Internet needed for CVE data
- Large codebases need chunked analysis
## References
See `reference/` for:
- Complete audit criteria checklist
- Severity matrix and scoring rubric
- 2024-25 SDLC best practices guide

View File

@@ -0,0 +1,126 @@
# Codebase Remediation Plan
**Generated**: 2024-10-21 14:30:00
**Codebase**: `/Users/connor/projects/example-app`
---
## Priority 0: Critical Issues (Fix Immediately ⚡)
**Timeline**: Within 24 hours
**Impact**: Security vulnerabilities, production-breaking bugs, data loss risks
### 1. Potential API key found in code
**Category**: Security
**Location**: `src/utils/api.ts`
**Effort**: LOW
**Issue**: Found potential secret on line 12
**Impact**: Exposed secrets can lead to unauthorized access and data breaches
**Action**: Remove secret from code and use environment variables or secret management tools
---
### 2. Use of eval() is dangerous
**Category**: Security
**Location**: `src/legacy/parser.js`
**Effort**: MEDIUM
**Issue**: Found on line 45
**Impact**: eval() can execute arbitrary code and is a security risk
**Action**: Refactor to avoid eval(), use safer alternatives like Function constructor with specific scope
---
## Priority 1: High Issues (Fix This Sprint 📅)
**Timeline**: Within current sprint (2 weeks)
**Impact**: Significant quality, security, or user experience issues
### 1. High cyclomatic complexity (28)
**Category**: Code Quality
**Effort**: HIGH
**Action**: Refactor into smaller functions, extract complex conditions
### 2. Line coverage below target (65.3%)
**Category**: Testing
**Effort**: HIGH
**Action**: Add tests to increase coverage by 14.7%
### 3. Long function (127 lines)
**Category**: Code Quality
**Effort**: MEDIUM
**Action**: Extract smaller functions for distinct responsibilities
### 4. Console statement in production code
**Category**: Code Quality
**Effort**: LOW
**Action**: Remove console statement or replace with proper logging framework
### 5. Large file (843 lines)
**Category**: Code Quality
**Effort**: HIGH
**Action**: Split into multiple smaller, focused modules
---
## Priority 2: Medium Issues (Fix Next Quarter 📆)
**Timeline**: Within 3 months
**Impact**: Code maintainability, developer productivity
**Total Issues**: 25
**Grouped by Type**:
- TypeScript Strict Mode: 8 issues
- Modern JavaScript: 5 issues
- Code Smell: 7 issues
- Function Length: 5 issues
---
## Priority 3: Low Issues (Backlog 📋)
**Timeline**: When time permits
**Impact**: Minor improvements, stylistic issues
**Total Issues**: 12
*Address during dedicated tech debt sprints or slow periods*
---
## Suggested Timeline
- **2024-10-22**: All P0 issues resolved
- **2024-11-04**: P1 issues addressed (end of sprint)
- **2025-01-20**: P2 issues resolved (end of quarter)
## Effort Summary
**Total Estimated Effort**: 32.5 person-days
- Critical/High: 18.5 days
- Medium: 10.0 days
- Low: 4.0 days
## Team Assignment Suggestions
- **Security Team**: All P0 security issues, P1 vulnerabilities
- **QA/Testing**: Test coverage improvements, test quality issues
- **Infrastructure**: CI/CD improvements, build performance
- **Development Team**: Code quality refactoring, complexity reduction
---
*Remediation plan generated by Codebase Auditor Skill*
*Priority scoring based on: Impact × 10 + Frequency × 5 - Effort × 2*

View File

@@ -0,0 +1,117 @@
# Codebase Audit Report
**Generated**: 2024-10-21 14:30:00
**Codebase**: `/Users/connor/projects/example-app`
**Tech Stack**: javascript, typescript, react, node
**Total Files**: 342
**Lines of Code**: 15,420
---
## Executive Summary
### Overall Health Score: **72/100**
#### Category Scores
- **Quality**: 68/100 ⚠️
- **Testing**: 65/100 ⚠️
- **Security**: 85/100 ✅
- **Technical Debt**: 70/100 ⚠️
#### Issue Summary
- **Critical Issues**: 2
- **High Issues**: 8
- **Total Issues**: 47
---
## Detailed Findings
### 🚨 CRITICAL (2 issues)
#### Potential API key found in code
**Category**: Security
**Subcategory**: secrets
**Location**: `src/utils/api.ts:12`
Found potential secret on line 12
```typescript
const API_KEY = "sk_live_1234567890abcdef1234567890abcdef";
```
**Impact**: Exposed secrets can lead to unauthorized access and data breaches
**Remediation**: Remove secret from code and use environment variables or secret management tools
**Effort**: LOW
---
#### Use of eval() is dangerous
**Category**: Security
**Subcategory**: code_security
**Location**: `src/legacy/parser.js:45`
Found on line 45
```javascript
const result = eval(userInput);
```
**Impact**: eval() can execute arbitrary code and is a security risk
**Remediation**: Refactor to avoid eval(), use safer alternatives like Function constructor with specific scope
**Effort**: MEDIUM
---
### ⚠️ HIGH (8 issues)
#### High cyclomatic complexity (28)
**Category**: Code Quality
**Subcategory**: complexity
**Location**: `src/services/checkout.ts:156`
Function has complexity of 28
**Impact**: High complexity makes code difficult to understand, test, and maintain
**Remediation**: Refactor into smaller functions, extract complex conditions
**Effort**: HIGH
---
#### Line coverage below target (65.3%)
**Category**: Testing
**Subcategory**: test_coverage
**Location**: `coverage/coverage-summary.json`
Current coverage is 65.3%, target is 80%
**Impact**: Low coverage means untested code paths and higher bug risk
**Remediation**: Add tests to increase coverage by 14.7%
**Effort**: HIGH
---
## Recommendations
1. **Immediate Action Required**: Address all 2 critical security and quality issues before deploying to production.
2. **Sprint Focus**: Prioritize fixing the 8 high-severity issues in the next sprint. These significantly impact code quality and maintainability.
3. **Testing Improvements**: Increase test coverage to meet the 80% minimum threshold. Focus on critical paths first (authentication, payment, data processing).
4. **Security Review**: Conduct a thorough security review and penetration testing given the security issues found.
---
*Report generated by Codebase Auditor Skill (2024-25 Standards)*

View File

@@ -0,0 +1,292 @@
# Codebase Audit Criteria Checklist
This document provides a comprehensive checklist for auditing codebases based on modern software engineering best practices (2024-25).
## 1. Code Quality
### Complexity Metrics
- [ ] Cyclomatic complexity measured for all functions/methods
- [ ] Functions with complexity > 10 flagged as warnings
- [ ] Functions with complexity > 20 flagged as critical
- [ ] Cognitive complexity analyzed
- [ ] Maximum nesting depth < 4 levels
- [ ] Function/method length < 50 LOC (recommendation)
- [ ] File length < 500 LOC (recommendation)
### Code Duplication
- [ ] Duplication analysis performed (minimum 6-line blocks)
- [ ] Overall duplication < 5%
- [ ] Duplicate blocks identified with locations
- [ ] Opportunities for abstraction documented
### Code Smells
- [ ] God objects/classes identified (> 10 public methods)
- [ ] Feature envy detected (high coupling to other classes)
- [ ] Dead code identified (unused imports, variables, functions)
- [ ] Magic numbers replaced with named constants
- [ ] Hard-coded values moved to configuration
- [ ] Naming conventions consistent
- [ ] Error handling comprehensive
- [ ] No console.log in production code
- [ ] No commented-out code blocks
### Language-Specific (TypeScript/JavaScript)
- [ ] No use of `any` type (strict mode)
- [ ] No use of `var` keyword
- [ ] Strict equality (`===`) used consistently
- [ ] Return type annotations present for functions
- [ ] Non-null assertions justified with comments
- [ ] Async/await preferred over Promise chains
- [ ] No implicit any returns
## 2. Testing & Coverage
### Coverage Metrics
- [ ] Line coverage >= 80%
- [ ] Branch coverage >= 75%
- [ ] Function coverage >= 90%
- [ ] Critical paths have 100% coverage (auth, payment, data processing)
- [ ] Coverage reports generated and accessible
### Testing Trophy Distribution
- [ ] Integration tests: ~70% of total tests
- [ ] Unit tests: ~20% of total tests
- [ ] E2E tests: ~10% of total tests
- [ ] Actual distribution documented
### Test Quality
- [ ] Tests follow "should X when Y" naming pattern
- [ ] Tests are isolated and independent
- [ ] No tests of implementation details (brittle tests)
- [ ] Single assertion per test (or grouped related assertions)
- [ ] Edge cases covered
- [ ] No flaky tests
- [ ] Tests use semantic queries (getByRole, getByLabelText)
- [ ] Avoid testing emoji presence, exact DOM counts, element ordering
### Test Performance
- [ ] Tests complete in < 30 seconds (unit/integration)
- [ ] CPU usage monitored (use `npm run test:low -- --run`)
- [ ] No runaway test processes
- [ ] Tests run in parallel where possible
- [ ] Max threads limited to prevent CPU overload
## 3. Security
### Dependency Vulnerabilities
- [ ] No critical CVEs in dependencies
- [ ] No high-severity CVEs in dependencies
- [ ] All dependencies using supported versions
- [ ] No dependencies unmaintained for > 2 years
- [ ] License compliance verified
- [ ] No dependency confusion risks
### OWASP Top 10 (2024)
- [ ] Access control properly implemented
- [ ] Sensitive data encrypted at rest and in transit
- [ ] Input validation prevents injection attacks
- [ ] Security design patterns followed
- [ ] Security configuration reviewed (no defaults)
- [ ] All components up-to-date
- [ ] Authentication robust (MFA, rate limiting)
- [ ] Software integrity verified (SRI, signatures)
- [ ] Security logging and monitoring enabled
- [ ] SSRF protections in place
### Secrets Management
- [ ] No API keys in code
- [ ] No tokens in code
- [ ] No passwords in code
- [ ] No private keys committed
- [ ] Environment variables properly used
- [ ] No secrets in client-side code
- [ ] .env files in .gitignore
- [ ] Git history clean of secrets
### Security Best Practices
- [ ] Input validation on all user inputs
- [ ] Output encoding prevents XSS
- [ ] CSRF tokens implemented
- [ ] Secure session management
- [ ] HTTPS enforced
- [ ] CSP headers configured
- [ ] Rate limiting on APIs
- [ ] SQL prepared statements used
## 4. Architecture & Design
### SOLID Principles
- [ ] Single Responsibility: Classes/modules have one reason to change
- [ ] Open/Closed: Open for extension, closed for modification
- [ ] Liskov Substitution: Subtypes are substitutable for base types
- [ ] Interface Segregation: Clients not forced to depend on unused methods
- [ ] Dependency Inversion: Depend on abstractions, not concretions
### Design Patterns
- [ ] Appropriate patterns used (Factory, Strategy, Observer, etc.)
- [ ] No anti-patterns (Singleton abuse, God Object, etc.)
- [ ] Not over-engineered
- [ ] Not under-engineered
### Modularity
- [ ] Low coupling between modules
- [ ] High cohesion within modules
- [ ] No circular dependencies
- [ ] Proper separation of concerns
- [ ] Clean public APIs
- [ ] Internal implementation details hidden
## 5. Performance
### Build Performance
- [ ] Build time < 2 minutes for typical project
- [ ] Bundle size documented and optimized
- [ ] Code splitting implemented
- [ ] Tree-shaking enabled
- [ ] Source maps configured correctly
- [ ] Production build optimized
### Runtime Performance
- [ ] No memory leaks
- [ ] Algorithms efficient (avoid O(n²) where possible)
- [ ] No excessive re-renders (React/Vue)
- [ ] Computations memoized where appropriate
- [ ] Images optimized (< 200KB)
- [ ] Videos optimized or lazy-loaded
- [ ] Lazy loading for large components
### CI/CD Performance
- [ ] Pipeline runs in < 10 minutes
- [ ] Deployment frequency documented
- [ ] Test execution time < 5 minutes
- [ ] Docker images < 500MB (if applicable)
## 6. Documentation
### Code Documentation
- [ ] Public APIs documented (JSDoc/TSDoc)
- [ ] Complex logic has inline comments
- [ ] README.md comprehensive
- [ ] Architecture Decision Records (ADRs) present
- [ ] API documentation available
- [ ] CONTRIBUTING.md exists
- [ ] CODE_OF_CONDUCT.md exists
### Documentation Maintenance
- [ ] No outdated documentation
- [ ] No broken links
- [ ] All sections complete
- [ ] Code examples work correctly
- [ ] Changelog maintained
## 7. DevOps & CI/CD
### CI/CD Maturity
- [ ] Automated testing in pipeline
- [ ] Automated deployment configured
- [ ] Development/staging/production environments
- [ ] Rollback capability exists
- [ ] Feature flags used for risky changes
- [ ] Blue-green or canary deployments
### DORA 4 Metrics
- [ ] Deployment frequency measured
- Elite: Multiple times per day
- High: Once per day to once per week
- Medium: Once per week to once per month
- Low: Less than once per month
- [ ] Lead time for changes measured
- Elite: Less than 1 hour
- High: 1 day to 1 week
- Medium: 1 week to 1 month
- Low: More than 1 month
- [ ] Change failure rate measured
- Elite: < 1%
- High: 1-5%
- Medium: 5-15%
- Low: > 15%
- [ ] Time to restore service measured
- Elite: < 1 hour
- High: < 1 day
- Medium: 1 day to 1 week
- Low: > 1 week
### Infrastructure as Code
- [ ] Configuration managed as code
- [ ] Infrastructure versioned
- [ ] Secrets managed securely (Vault, AWS Secrets Manager)
- [ ] Environment variables documented
## 8. Accessibility (WCAG 2.1 AA)
### Semantic HTML
- [ ] Proper heading hierarchy (h1 → h2 → h3)
- [ ] ARIA labels where needed
- [ ] Form labels associated with inputs
- [ ] Landmark regions defined (header, nav, main, footer)
### Keyboard Navigation
- [ ] All interactive elements keyboard accessible
- [ ] Focus management implemented
- [ ] Tab order logical
- [ ] Focus indicators visible
### Screen Reader Support
- [ ] Images have alt text
- [ ] ARIA live regions for dynamic content
- [ ] Links have descriptive text
- [ ] Form errors announced
### Color & Contrast
- [ ] Text contrast >= 4.5:1 (normal text)
- [ ] Text contrast >= 3:1 (large text 18pt+)
- [ ] UI components contrast >= 3:1
- [ ] Color not sole means of conveying information
## 9. Technical Debt
### SQALE Rating
- [ ] Technical debt quantified in person-days
- [ ] Rating assigned (A-E)
- A: <= 5% of development time
- B: 6-10%
- C: 11-20%
- D: 21-50%
- E: > 50%
### Debt Categories
- [ ] Code smell debt identified
- [ ] Test debt quantified
- [ ] Documentation debt listed
- [ ] Security debt prioritized
- [ ] Performance debt noted
- [ ] Architecture debt evaluated
## 10. Project-Specific Standards
### Connor's Global Standards
- [ ] TypeScript strict mode enabled
- [ ] No `any` types
- [ ] Explicit return types
- [ ] Comprehensive error handling
- [ ] 80%+ test coverage
- [ ] No console.log statements
- [ ] No `var` keyword
- [ ] No loose equality (`==`)
- [ ] Conventional commits format
- [ ] Branch naming follows pattern: (feature|bugfix|chore)/{component-name}
## Audit Completion
### Final Checks
- [ ] All critical issues identified
- [ ] All high-severity issues documented
- [ ] Severity assigned to each finding
- [ ] Remediation effort estimated
- [ ] Report generated
- [ ] Remediation plan created
- [ ] Stakeholders notified
---
**Note**: This checklist is based on industry best practices as of 2024-25. Adjust severity thresholds and criteria based on your project's maturity stage and business context.

View File

@@ -0,0 +1,573 @@
# Modern SDLC Best Practices (2024-25)
This document outlines industry-standard software development lifecycle best practices based on 2024-25 research and modern engineering standards.
## Table of Contents
1. [Development Workflow](#development-workflow)
2. [Testing Strategy](#testing-strategy)
3. [Security (DevSecOps)](#security-devsecops)
4. [Code Quality](#code-quality)
5. [Performance](#performance)
6. [Documentation](#documentation)
7. [DevOps & CI/CD](#devops--cicd)
8. [DORA Metrics](#dora-metrics)
9. [Developer Experience](#developer-experience)
10. [Accessibility](#accessibility)
---
## Development Workflow
### Version Control (Git)
**Branching Strategy**:
- Main/master branch is always deployable
- Feature branches for new work: `feature/{component-name}`
- Bugfix branches: `bugfix/{issue-number}`
- Release branches for production releases
- No direct commits to main (use pull requests)
**Commit Messages**:
- Follow Conventional Commits format
- Structure: `type(scope): description`
- Types: feat, fix, docs, style, refactor, test, chore
- Example: `feat(auth): add OAuth2 social login`
**Code Review**:
- All changes require peer review
- Use pull request templates
- Automated checks must pass before merge
- Review within 24 hours for team velocity
- Focus on logic, security, and maintainability
### Test-Driven Development (TDD)
**RED-GREEN-REFACTOR Cycle** (sketched below):
1. **RED**: Write failing test first
2. **GREEN**: Write minimum code to pass
3. **REFACTOR**: Improve code quality while tests pass
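A minimal sketch of the cycle in Python (run under pytest; `calculate_tax` is a hypothetical helper used only for illustration):

```python
# RED: write the failing test first; it fails until calculate_tax exists.
def test_calculate_tax_adds_eight_percent():
    assert calculate_tax(amount_cents=10_000, rate_percent=8) == 800

# GREEN: write the minimum code that makes the test pass.
def calculate_tax(amount_cents: int, rate_percent: int) -> int:
    return amount_cents * rate_percent // 100

# REFACTOR: improve naming, validate inputs, extract helpers, keeping the test green.
```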
**Benefits**:
- Better design through testability
- Documentation through tests
- Confidence to refactor
- Fewer regression bugs
---
## Testing Strategy
### Testing Trophy (Kent C. Dodds)
**Philosophy**: "Write tests. Not too many. Mostly integration."
**Distribution**:
- **Integration Tests (70%)**: User workflows and component interaction
- Test real user behavior
- Test multiple units working together
- Higher confidence than unit tests
- Example: User registration flow end-to-end
- **Unit Tests (20%)**: Complex business logic only
- Pure functions
- Complex algorithms
- Edge cases and error handling
- Example: Tax calculation logic
- **E2E Tests (10%)**: Critical user journeys
- Full stack, production-like environment
- Happy path scenarios
- Critical business flows
- Example: Complete purchase flow
### What NOT to Test (Brittle Patterns)
**Avoid**:
- Emoji presence in UI elements
- Exact number of DOM elements
- Specific element ordering (unless critical)
- API call counts (unless performance critical)
- CSS class names and styling
- Implementation details over user behavior
- Private methods/functions
- Third-party library internals
### What to Prioritize (User-Focused)
**Prioritize**:
- User workflows and interactions
- Business logic and calculations
- Data accuracy and processing
- Error handling and edge cases
- Performance within acceptable limits
- Accessibility compliance (WCAG 2.1 AA)
- Security boundaries
### Semantic Queries (React Testing Library)
**Priority Order**:
1. `getByRole()` - Most preferred (accessibility-first)
2. `getByLabelText()` - Form elements
3. `getByPlaceholderText()` - Inputs without labels
4. `getByText()` - User-visible content
5. `getByDisplayValue()` - Form current values
6. `getByAltText()` - Images
7. `getByTitle()` - Title attributes
8. `getByTestId()` - Last resort only
### Coverage Targets
**Minimum Requirements**:
- Overall coverage: **80%**
- Critical paths: **100%** (auth, payment, data processing)
- Branch coverage: **75%**
- Function coverage: **90%**
**Tools**:
- Jest/Vitest for unit & integration tests
- Cypress/Playwright for E2E tests
- Istanbul/c8 for coverage reporting
---
## Security (DevSecOps)
### Shift-Left Security
**Principle**: Integrate security into every development stage, not as an afterthought.
**Cost Multiplier**:
- Fix in **design**: 1x cost
- Fix in **development**: 5x cost
- Fix in **testing**: 10x cost
- Fix in **production**: 30x cost
### OWASP Top 10 (2024)
1. **Broken Access Control**: Enforce authorization checks on every request
2. **Cryptographic Failures**: Use TLS, encrypt PII, avoid weak algorithms
3. **Injection**: Validate input, use prepared statements, sanitize output (see the sketch after this list)
4. **Insecure Design**: Threat modeling, secure design patterns
5. **Security Misconfiguration**: Harden defaults, disable unnecessary features
6. **Vulnerable Components**: Keep dependencies updated, scan for CVEs
7. **Authentication Failures**: MFA, rate limiting, secure session management
8. **Software Integrity Failures**: Verify integrity with signatures, SRI
9. **Security Logging**: Log security events, monitor for anomalies
10. **SSRF**: Validate URLs, whitelist allowed domains
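For item 3, a minimal sketch of the prepared-statement approach using Python's built-in `sqlite3` (the table and data are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))

# Vulnerable: user input concatenated into the SQL string (injection risk).
# query = f"SELECT id, email FROM users WHERE email = '{user_input}'"

# Safe: a parameterized query keeps user input as data, not executable SQL.
def find_user(email: str):
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchone()

print(find_user("alice@example.com"))  # (1, 'alice@example.com')
```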
### Dependency Management
**Best Practices**:
- Run `npm audit` / `yarn audit` weekly
- Update dependencies monthly
- Use Dependabot/Renovate for automated updates
- Pin dependency versions in production
- Check licenses for compliance
- Monitor CVE databases
### Secrets Management
**Rules**:
- NEVER commit secrets to version control
- Use environment variables for configuration (see the sketch after this list)
- Use secret management tools (Vault, AWS Secrets Manager)
- Rotate secrets regularly
- Scan git history for leaked secrets
- Use `.env.example` for documentation, not `.env`
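A minimal sketch of the environment-variable approach (the variable name `STRIPE_API_KEY` is illustrative):

```python
import os

# Hard-coded secret (never do this):
# API_KEY = "sk_live_..."

# Read the secret from the environment instead, and fail fast if it is missing.
API_KEY = os.environ.get("STRIPE_API_KEY")
if API_KEY is None:
    raise RuntimeError("STRIPE_API_KEY is not set; see .env.example for required variables")
```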
---
## Code Quality
### Complexity Metrics
**Cyclomatic Complexity**:
- **1-10**: Simple, easy to test
- **11-20**: Moderate, consider refactoring
- **21-50**: High, should refactor
- **50+**: Very high, must refactor
**Tool**: ESLint `complexity` rule, SonarQube
### Code Duplication
**Thresholds**:
- **< 5%**: Excellent
- **5-10%**: Acceptable
- **10-20%**: Needs attention
- **> 20%**: Critical issue
**DRY Principle**: Don't Repeat Yourself
- Extract common code into functions/modules
- Use design patterns (Template Method, Strategy)
- Balance DRY with readability
### Code Smells
**Common Smells**:
- **God Object**: Too many responsibilities
- **Feature Envy**: Too much coupling to other classes
- **Long Method**: > 50 lines
- **Long Parameter List**: > 4 parameters
- **Dead Code**: Unused code
- **Magic Numbers**: Hard-coded values
- **Primitive Obsession**: Overuse of primitives vs objects
**Refactoring Techniques**:
- Extract Method
- Extract Class
- Introduce Parameter Object
- Replace Magic Number with Constant (sketched below)
- Remove Dead Code
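A minimal before/after sketch of one of these refactorings, Replace Magic Number with Constant (names are illustrative):

```python
# Before: the meaning of 0.15 is unclear and would be duplicated wherever it is used.
def restocking_fee(price: float) -> float:
    return price * 0.15

# After: the magic number becomes a named constant with a single source of truth.
RESTOCKING_FEE_RATE = 0.15

def restocking_fee(price: float) -> float:
    return price * RESTOCKING_FEE_RATE
```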
### Static Analysis
**Tools**:
- **SonarQube**: Comprehensive code quality platform
- **ESLint**: JavaScript/TypeScript linting
- **Prettier**: Code formatting
- **TypeScript**: Type checking in strict mode
- **Checkmarx**: Security-focused analysis
---
## Performance
### Build Performance
**Targets**:
- Build time: < 2 minutes
- Hot reload: < 200ms
- First build: < 5 minutes
**Optimization**:
- Use build caching
- Parallelize builds
- Tree-shaking
- Code splitting
- Lazy loading
### Runtime Performance
**Web Vitals (Core)**:
- **LCP (Largest Contentful Paint)**: < 2.5s
- **FID (First Input Delay)**: < 100ms
- **CLS (Cumulative Layout Shift)**: < 0.1
**API Performance**:
- **P50**: < 100ms
- **P95**: < 500ms
- **P99**: < 1000ms
**Optimization Techniques**:
- Caching (Redis, CDN)
- Database indexing
- Query optimization
- Compression (gzip, Brotli)
- Image optimization (WebP, lazy loading)
- Code splitting and lazy loading
### Bundle Size
**Targets**:
- Initial bundle: < 200KB (gzipped)
- Total JavaScript: < 500KB (gzipped)
- Images optimized: < 200KB each
**Tools**:
- webpack-bundle-analyzer
- Lighthouse
- Chrome DevTools Performance tab
---
## Documentation
### Code Documentation
**JSDoc/TSDoc**:
- Document all public APIs
- Include examples for complex functions
- Document parameters, return types, exceptions
**Example**:
```typescript
/**
 * Calculates the total price including tax and discounts.
 *
 * @param items - Array of cart items
 * @param taxRate - Tax rate as decimal (e.g., 0.08 for 8%)
 * @param discountCode - Optional discount code
 * @returns Total price with tax and discounts applied
 * @throws {InvalidDiscountError} If discount code is invalid
 *
 * @example
 * const total = calculateTotal(items, 0.08, 'SUMMER20');
 */
function calculateTotal(items: CartItem[], taxRate: number, discountCode?: string): number {
  // ...
}
```
### Project Documentation
**Essential Files**:
- **README.md**: Project overview, setup instructions, quick start
- **CONTRIBUTING.md**: How to contribute, coding standards, PR process
- **CODE_OF_CONDUCT.md**: Community guidelines
- **CHANGELOG.md**: Version history and changes
- **LICENSE**: Legal license information
- **ARCHITECTURE.md**: High-level architecture overview
- **ADRs** (Architecture Decision Records): Document important decisions
---
## DevOps & CI/CD
### Continuous Integration
**Requirements**:
- Automated testing on every commit
- Build verification
- Code quality checks (linting, formatting)
- Security scanning
- Fast feedback (< 10 minutes)
**Pipeline Stages**:
1. Lint & Format Check
2. Unit Tests
3. Integration Tests
4. Security Scan
5. Build Artifacts
6. Deploy to Staging
7. E2E Tests
8. Deploy to Production (with approval)
### Continuous Deployment
**Strategies**:
- **Blue-Green**: Two identical environments, switch traffic
- **Canary**: Gradual rollout to subset of users
- **Rolling**: Update instances incrementally
- **Feature Flags**: Control feature visibility without deployment
**Rollback**:
- Automated rollback on failure detection
- Keep last 3-5 versions deployable
- Database migrations reversible
- Monitor key metrics post-deployment
### Infrastructure as Code
**Tools**:
- Terraform, CloudFormation, Pulumi
- Ansible, Chef, Puppet
- Docker, Kubernetes
**Benefits**:
- Version-controlled infrastructure
- Reproducible environments
- Disaster recovery
- Automated provisioning
---
## DORA Metrics
**Four Key Metrics** (DevOps Research and Assessment):
### 1. Deployment Frequency
**How often code is deployed to production**
- **Elite**: Multiple times per day
- **High**: Once per day to once per week
- **Medium**: Once per week to once per month
- **Low**: Less than once per month
### 2. Lead Time for Changes
**Time from commit to production**
- **Elite**: Less than 1 hour
- **High**: 1 day to 1 week
- **Medium**: 1 week to 1 month
- **Low**: More than 1 month
### 3. Change Failure Rate
**Percentage of deployments causing failures**
- **Elite**: < 1%
- **High**: 1-5%
- **Medium**: 5-15%
- **Low**: > 15%
### 4. Time to Restore Service
**Time to recover from production incident**
- **Elite**: < 1 hour
- **High**: < 1 day
- **Medium**: 1 day to 1 week
- **Low**: > 1 week
**Tracking**: Use CI/CD tools, APM (Application Performance Monitoring), incident management systems
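A hedged sketch of turning one of these metrics into a band, using approximate numeric cut-offs for deployment frequency measured in deploys per month:

```python
def classify_deployment_frequency(deploys_per_month: float) -> str:
    """Map deploys per month onto the DORA bands above (cut-offs are approximate)."""
    if deploys_per_month >= 60:   # multiple deploys per day
        return "Elite"
    if deploys_per_month >= 4:    # roughly once per day to once per week
        return "High"
    if deploys_per_month >= 1:    # once per week to once per month
        return "Medium"
    return "Low"                  # less than once per month

print(classify_deployment_frequency(45))  # High
```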
---
## Developer Experience
### Why It Matters
**Statistics**:
- 83% of engineers experience burnout
- Developer experience is the strongest predictor of delivery capability
- Happy developers are 2x more productive
### Key Factors
**Fast Feedback Loops**:
- Quick build times
- Fast test execution
- Immediate linting/formatting feedback
- Hot module reloading
**Good Tooling**:
- Modern IDE with autocomplete
- Debuggers and profilers
- Automated code reviews
- Documentation generators
**Clear Standards**:
- Coding style guides
- Architecture documentation
- Onboarding guides
- Runbooks for common tasks
**Psychological Safety**:
- Blameless post-mortems
- Encourage experimentation
- Celebrate learning from failure
- Mentorship programs
---
## Accessibility
### WCAG 2.1 Level AA Compliance
**Four Principles (POUR)**:
1. **Perceivable**: Information must be presentable to users
- Alt text for images
- Captions for videos
- Color contrast ratios
2. **Operable**: UI components must be operable
- Keyboard navigation
- Sufficient time to read content
- No seizure-inducing content
3. **Understandable**: Information must be understandable
- Readable text
- Predictable behavior
- Input assistance (error messages)
4. **Robust**: Content must be robust across technologies
- Valid HTML
- ARIA attributes
- Cross-browser compatibility
### Testing Tools
**Automated**:
- axe DevTools
- Lighthouse
- WAVE
- Pa11y
**Manual**:
- Keyboard navigation testing
- Screen reader testing (NVDA, JAWS, VoiceOver)
- Color contrast checkers
- Zoom testing (200%+)
---
## Modern Trends (2024-25)
### AI-Assisted Development
**Tools**:
- GitHub Copilot
- ChatGPT / Claude
- Tabnine
- Amazon CodeWhisperer
**Best Practices**:
- Review all AI-generated code
- Write tests for AI code
- Understand before committing
- Train team on effective prompting
### Platform Engineering
**Concept**: Internal developer platforms to improve developer experience
**Components**:
- Self-service infrastructure
- Golden paths (templates)
- Developer portals
- Observability dashboards
### Observability (vs Monitoring)
**Three Pillars**:
1. **Logs**: What happened
2. **Metrics**: Quantitative data
3. **Traces**: Request flow through system
**Tools**:
- Datadog, New Relic, Grafana
- OpenTelemetry for standardization
- Distributed tracing (Jaeger, Zipkin)
---
## Industry Benchmarks (2024-25)
### Code Quality
- Tech debt ratio: < 5%
- Duplication: < 5%
- Test coverage: > 80%
- Build time: < 2 minutes
### Security
- CVE remediation: < 30 days
- Security training: Quarterly
- Penetration testing: Annually
### Performance
- Page load: < 3 seconds
- API response: P95 < 500ms
- Uptime: 99.9%+
### Team Metrics
- Pull request review time: < 24 hours
- Deployment frequency: Daily+
- Incident MTTR: < 1 hour
- Developer onboarding: < 1 week
---
**References**:
- DORA State of DevOps Report 2024
- OWASP Top 10 (2024 Edition)
- WCAG 2.1 Guidelines
- Kent C. Dodds Testing Trophy
- SonarQube Quality Gates
- Google Web Vitals
**Last Updated**: 2024-25
**Version**: 1.0

View File

@@ -0,0 +1,307 @@
# Severity Matrix & Issue Prioritization
This document defines how to categorize and prioritize issues found during codebase audits.
## Severity Levels
### Critical (P0) - Fix Immediately
**Definition**: Issues that pose immediate risk to security, data integrity, or production stability.
**Characteristics**:
- Security vulnerabilities with known exploits (CVE scores >= 9.0)
- Secrets or credentials exposed in code
- Data loss or corruption risks
- Production-breaking bugs
- Authentication/authorization bypasses
- SQL injection or XSS vulnerabilities
- Compliance violations (GDPR, HIPAA, etc.)
**Timeline**: Must be fixed within 24 hours
**Effort vs Impact**: Fix immediately regardless of effort
**Deployment**: Requires immediate hotfix release
**Examples**:
- API key committed to repository
- SQL injection vulnerability in production endpoint
- Authentication bypass allowing unauthorized access
- Critical CVE in production dependency (e.g., log4shell)
- Unencrypted PII being transmitted over HTTP
- Memory leak causing production crashes
---
### High (P1) - Fix This Sprint
**Definition**: Significant issues that impact quality, security, or user experience but don't pose immediate production risk.
**Characteristics**:
- Medium-severity security vulnerabilities (CVE scores 7.0-8.9)
- Critical path missing test coverage
- Performance bottlenecks affecting user experience
- WCAG AA accessibility violations
- TypeScript strict mode violations in critical code
- High cyclomatic complexity (> 20) in business logic
- Missing error handling in critical operations
**Timeline**: Fix within current sprint (2 weeks)
**Effort vs Impact**: Prioritize high-impact, low-effort fixes first
**Deployment**: Include in next regular release
**Examples**:
- Payment processing code with 0% test coverage
- Page load time > 3 seconds
- Form inaccessible to screen readers
- 500+ line function with complexity of 45
- Unhandled promise rejections in checkout flow
- Dependency with moderate CVE (6.5 score)
---
### Medium (P2) - Fix Next Quarter
**Definition**: Issues that reduce code maintainability, developer productivity, or future scalability but don't immediately impact users.
**Characteristics**:
- Code smells and duplication
- Low-severity security issues (CVE scores 4.0-6.9)
- Test coverage between 60-80%
- Documentation gaps
- Minor performance optimizations
- Outdated dependencies (no CVEs)
- Moderate complexity (10-20)
- Technical debt accumulation
**Timeline**: Fix within next quarter (3 months)
**Effort vs Impact**: Plan during sprint planning, batch similar fixes
**Deployment**: Include in planned refactoring releases
**Examples**:
- 15% code duplication across services
- Missing JSDoc for public API
- God class with 25 public methods
- Build time of 5 minutes
- Test suite takes 10 minutes to run
- Dependency 2 major versions behind (stable)
---
### Low (P3) - Backlog
**Definition**: Minor improvements, stylistic issues, or optimizations that have minimal impact on functionality or quality.
**Characteristics**:
- Stylistic inconsistencies
- Minor code smells
- Documentation improvements
- Nice-to-have features
- Long-term architectural improvements
- Code coverage 80-90% (already meets minimum)
- Low complexity optimizations (< 10)
**Timeline**: Address when time permits or during dedicated tech debt sprints
**Effort vs Impact**: Only fix if effort is minimal or during slow periods
**Deployment**: Bundle with feature releases
**Examples**:
- Inconsistent variable naming (camelCase vs snake_case)
- Missing comments on simple functions
- Single-character variable names in non-critical code
- Console.log in development-only code
- README could be more detailed
- Opportunity to refactor small utility function
---
## Scoring Rubric
Use this matrix to assign severity levels:
| Impact | Effort Low | Effort Medium | Effort High |
|--------|------------|---------------|-------------|
| **Critical** | P0 | P0 | P0 |
| **High** | P1 | P1 | P1 |
| **Medium** | P1 | P2 | P2 |
| **Low** | P2 | P3 | P3 |
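A minimal sketch of this matrix as a lookup table, assuming impact and effort have already been assessed using the labels above:

```python
# Maps (impact, effort) to a priority level per the matrix above.
SEVERITY_MATRIX = {
    ("critical", "low"): "P0", ("critical", "medium"): "P0", ("critical", "high"): "P0",
    ("high", "low"): "P1", ("high", "medium"): "P1", ("high", "high"): "P1",
    ("medium", "low"): "P1", ("medium", "medium"): "P2", ("medium", "high"): "P2",
    ("low", "low"): "P2", ("low", "medium"): "P3", ("low", "high"): "P3",
}

def assign_priority(impact: str, effort: str) -> str:
    return SEVERITY_MATRIX[(impact.lower(), effort.lower())]

print(assign_priority("Medium", "Low"))  # P1
```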
### Impact Assessment
**Critical Impact**:
- Security breach
- Data loss/corruption
- Production outage
- Legal/compliance violation
**High Impact**:
- User experience degraded
- Performance issues
- Accessibility barriers
- Development velocity reduced significantly
**Medium Impact**:
- Code maintainability reduced
- Technical debt accumulating
- Future changes more difficult
- Developer productivity slightly reduced
**Low Impact**:
- Minimal user/developer effect
- Cosmetic issues
- Future-proofing
- Best practice deviations
### Effort Estimation
**Low Effort**: < 4 hours
- Simple configuration change
- One-line fix
- Update dependency version
**Medium Effort**: 4 hours - 2 days
- Refactor single module
- Add test coverage for feature
- Implement security fix with tests
**High Effort**: > 2 days
- Architectural changes
- Major refactoring
- Migration to new framework/library
- Comprehensive security overhaul
---
## Category-Specific Severity Guidelines
### Security Issues
| Finding | Severity |
|---------|----------|
| Known exploit in production | Critical |
| Secrets in code | Critical |
| Authentication bypass | Critical |
| SQL injection | Critical |
| XSS vulnerability | High |
| CSRF vulnerability | High |
| Outdated dependency (CVE 7-9) | High |
| Outdated dependency (CVE 4-7) | Medium |
| Missing security headers | Medium |
| Weak encryption algorithm | Medium |
### Code Quality Issues
| Finding | Severity |
|---------|----------|
| Complexity > 50 | High |
| Complexity 20-50 | Medium |
| Complexity 10-20 | Low |
| Duplication > 20% | High |
| Duplication 10-20% | Medium |
| Duplication 5-10% | Low |
| File > 1000 LOC | Medium |
| File > 500 LOC | Low |
| Dead code (unused for > 6 months) | Low |
### Test Coverage Issues
| Finding | Severity |
|---------|----------|
| Critical path untested | High |
| Coverage < 50% | High |
| Coverage 50-80% | Medium |
| Coverage 80-90% | Low |
| Flaky tests | Medium |
| Slow tests (> 10 min) | Medium |
| No E2E tests | Medium |
| Missing edge case tests | Low |
### Performance Issues
| Finding | Severity |
|---------|----------|
| Page load > 5s | High |
| Page load 3-5s | Medium |
| Memory leak | High |
| O(n²) in hot path | High |
| Bundle size > 5MB | Medium |
| Build time > 10 min | Medium |
| Unoptimized images | Low |
### Accessibility Issues
| Finding | Severity |
|---------|----------|
| No keyboard navigation | High |
| Contrast ratio < 3:1 | High |
| Missing ARIA labels | High |
| Heading hierarchy broken | Medium |
| Missing alt text | Medium |
| Focus indicators absent | Medium |
| Color-only information | Low |
---
## Remediation Priority Formula
Use this formula to calculate a priority score:
```
Priority Score = (Impact × 10) + (Frequency × 5) - (Effort × 2)
```
Where:
- **Impact**: 1-10 (10 = critical)
- **Frequency**: 1-10 (10 = affects all users/code)
- **Effort**: 1-10 (10 = requires months of work)
Sort issues by priority score (highest first) to create your remediation plan.
### Example Calculations
**Example 1**: SQL Injection in Login
- Impact: 10 (critical security issue)
- Frequency: 10 (affects all users)
- Effort: 3 (straightforward fix with prepared statements)
- Score: (10 × 10) + (10 × 5) - (3 × 2) = **144** → **P0**
**Example 2**: Missing Tests on Helper Utility
- Impact: 4 (low risk, helper function)
- Frequency: 2 (rarely used)
- Effort: 2 (quick to test)
- Score: (4 × 10) + (2 × 5) - (2 × 2) = **46** → **P3**
**Example 3**: Performance Bottleneck in Search
- Impact: 7 (user experience degraded)
- Frequency: 8 (common feature)
- Effort: 6 (requires algorithm optimization)
- Score: (7 × 10) + (8 × 5) - (6 × 2) = **98** → **P1**
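A minimal sketch of applying the formula and sorting, assuming each finding carries numeric impact, frequency, and effort ratings (1-10) as defined above:

```python
def priority_score(finding: dict) -> int:
    """Priority Score = (Impact x 10) + (Frequency x 5) - (Effort x 2)."""
    return finding["impact"] * 10 + finding["frequency"] * 5 - finding["effort"] * 2

findings = [
    {"title": "SQL injection in login", "impact": 10, "frequency": 10, "effort": 3},
    {"title": "Missing tests on helper utility", "impact": 4, "frequency": 2, "effort": 2},
    {"title": "Performance bottleneck in search", "impact": 7, "frequency": 8, "effort": 6},
]

for finding in sorted(findings, key=priority_score, reverse=True):
    print(f"{priority_score(finding):>3}  {finding['title']}")
# 144  SQL injection in login
#  98  Performance bottleneck in search
#  46  Missing tests on helper utility
```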
---
## Escalation Criteria
Escalate to leadership when:
- 5+ Critical issues found
- 10+ High issues in production code
- SQALE rating of D or E
- Security issues require disclosure
- Compliance violations detected
- Technical debt > 50% of development capacity
---
## Review Cycles
Recommended audit frequency based on project type:
| Project Type | Audit Frequency | Focus Areas |
|-------------|-----------------|-------------|
| Production SaaS | Monthly | Security, Performance, Uptime |
| Enterprise Software | Quarterly | Compliance, Security, Quality |
| Internal Tools | Semi-annually | Technical Debt, Maintainability |
| Open Source | Per major release | Security, Documentation, API stability |
| Startup MVP | Before funding rounds | Security, Scalability, Technical Debt |
---
**Last Updated**: 2024-25 Standards
**Version**: 1.0

View File

@@ -0,0 +1,8 @@
"""
Analyzer modules for codebase auditing.
Each analyzer implements an analyze(codebase_path, metadata) function
that returns a list of findings.
"""
__version__ = '1.0.0'

View File

@@ -0,0 +1,411 @@
"""
Code Quality Analyzer
Analyzes code for:
- Cyclomatic complexity
- Code duplication
- Code smells
- File/function length
- Language-specific issues (TypeScript/JavaScript)
"""
import re
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze codebase for code quality issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata from discovery phase
Returns:
List of findings with severity, location, and remediation info
"""
findings = []
# Determine which languages to analyze
tech_stack = metadata.get('tech_stack', {})
if tech_stack.get('javascript') or tech_stack.get('typescript'):
findings.extend(analyze_javascript_typescript(codebase_path))
if tech_stack.get('python'):
findings.extend(analyze_python(codebase_path))
# General analysis (language-agnostic)
findings.extend(analyze_file_sizes(codebase_path))
findings.extend(analyze_dead_code(codebase_path, tech_stack))
return findings
def analyze_javascript_typescript(codebase_path: Path) -> List[Dict]:
"""Analyze JavaScript/TypeScript specific quality issues."""
findings = []
extensions = {'.js', '.jsx', '.ts', '.tsx'}
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '.next', 'coverage'}
for file_path in codebase_path.rglob('*'):
if (file_path.suffix in extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
# Check for TypeScript 'any' type
if file_path.suffix in {'.ts', '.tsx'}:
findings.extend(check_any_usage(file_path, content, lines))
# Check for 'var' keyword
findings.extend(check_var_usage(file_path, content, lines))
# Check for console.log statements
findings.extend(check_console_log(file_path, content, lines))
# Check for loose equality
findings.extend(check_loose_equality(file_path, content, lines))
# Check cyclomatic complexity (simplified)
findings.extend(check_complexity(file_path, content, lines))
# Check function length
findings.extend(check_function_length(file_path, content, lines))
except Exception:
# Skip files that can't be read
pass
return findings
def check_any_usage(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for TypeScript 'any' type usage."""
findings = []
# Pattern to match 'any' type (excluding comments)
any_pattern = re.compile(r':\s*any\b|<any>|Array<any>|\bany\[\]')
for line_num, line in enumerate(lines, start=1):
# Skip comments
if line.strip().startswith('//') or line.strip().startswith('/*') or line.strip().startswith('*'):
continue
if any_pattern.search(line):
findings.append({
'severity': 'medium',
'category': 'code_quality',
'subcategory': 'typescript_strict_mode',
'title': "Use of 'any' type violates TypeScript strict mode",
'description': f"Found 'any' type on line {line_num}",
'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Reduces type safety and defeats the purpose of TypeScript',
'remediation': 'Replace "any" with specific types or use "unknown" with type guards',
'effort': 'low',
})
return findings
def check_var_usage(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for 'var' keyword usage."""
findings = []
var_pattern = re.compile(r'\bvar\s+\w+')
for line_num, line in enumerate(lines, start=1):
if line.strip().startswith('//') or line.strip().startswith('/*'):
continue
if var_pattern.search(line):
findings.append({
'severity': 'low',
'category': 'code_quality',
'subcategory': 'modern_javascript',
'title': "Use of 'var' keyword is deprecated",
'description': f"Found 'var' keyword on line {line_num}",
'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Function-scoped variables can lead to bugs; block-scoped (let/const) is preferred',
'remediation': "Replace 'var' with 'const' (for values that don't change) or 'let' (for values that change)",
'effort': 'low',
})
return findings
def check_console_log(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for console.log statements in production code."""
findings = []
# Skip if it's in a test file
if 'test' in file_path.name or 'spec' in file_path.name or '__tests__' in str(file_path):
return findings
console_pattern = re.compile(r'\bconsole\.(log|debug|info|warn|error)\(')
for line_num, line in enumerate(lines, start=1):
if line.strip().startswith('//'):
continue
if console_pattern.search(line):
findings.append({
'severity': 'medium',
'category': 'code_quality',
'subcategory': 'production_code',
'title': 'Console statement in production code',
'description': f"Found console statement on line {line_num}",
'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Console statements should not be in production code; use proper logging',
'remediation': 'Remove console statement or replace with proper logging framework',
'effort': 'low',
})
return findings
def check_loose_equality(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for loose equality operators (== instead of ===)."""
findings = []
# Match == / != while excluding ===, !==, <=, >= comparisons
loose_eq_pattern = re.compile(r'[^=!<>]==[^=]|[^=!<>]!=[^=]')
for line_num, line in enumerate(lines, start=1):
if line.strip().startswith('//') or line.strip().startswith('/*'):
continue
if loose_eq_pattern.search(line):
findings.append({
'severity': 'low',
'category': 'code_quality',
'subcategory': 'code_smell',
'title': 'Loose equality operator used',
'description': f"Found '==' or '!=' on line {line_num}, should use '===' or '!=='",
'file': str(file_path),
'line': line_num,
'code_snippet': line.strip(),
'impact': 'Loose equality can lead to unexpected type coercion bugs',
'remediation': "Replace '==' with '===' and '!=' with '!=='",
'effort': 'low',
})
return findings
def check_complexity(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""
Check cyclomatic complexity (simplified).
Counts decision points: if, else, while, for, case, catch, &&, ||, ?
"""
findings = []
# Find function declarations
func_pattern = re.compile(r'(function\s+\w+|const\s+\w+\s*=\s*\([^)]*\)\s*=>|\w+\s*\([^)]*\)\s*{)')
current_function = None
current_function_line = 0
brace_depth = 0
complexity = 0
for line_num, line in enumerate(lines, start=1):
stripped = line.strip()
# Track braces to find function boundaries
brace_depth += stripped.count('{') - stripped.count('}')
# New function started
if func_pattern.search(line) and brace_depth >= 1:
# Save previous function if exists
if current_function and complexity > 10:
severity = 'critical' if complexity > 20 else 'high' if complexity > 15 else 'medium'
findings.append({
'severity': severity,
'category': 'code_quality',
'subcategory': 'complexity',
'title': f'High cyclomatic complexity ({complexity})',
'description': f'Function has complexity of {complexity}',
'file': str(file_path),
'line': current_function_line,
'code_snippet': current_function,
'impact': 'High complexity makes code difficult to understand, test, and maintain',
'remediation': 'Refactor into smaller functions, extract complex conditions',
'effort': 'medium' if complexity < 20 else 'high',
})
# Start new function
current_function = stripped
current_function_line = line_num
complexity = 1 # Base complexity
# Count complexity contributors
if current_function:
complexity += stripped.count('if ')
complexity += stripped.count('else if')
complexity += stripped.count('while ')
complexity += stripped.count('for ')
complexity += stripped.count('case ')
complexity += stripped.count('catch ')
complexity += stripped.count('&&')
complexity += stripped.count('||')
complexity += stripped.count('?')
return findings
def check_function_length(file_path: Path, content: str, lines: List[str]) -> List[Dict]:
"""Check for overly long functions."""
findings = []
func_pattern = re.compile(r'(function\s+\w+|const\s+\w+\s*=\s*\([^)]*\)\s*=>|\w+\s*\([^)]*\)\s*{)')
current_function = None
current_function_line = 0
function_lines = 0
brace_depth = 0
for line_num, line in enumerate(lines, start=1):
stripped = line.strip()
if func_pattern.search(line):
# Check previous function
if current_function and function_lines > 50:
severity = 'high' if function_lines > 100 else 'medium'
findings.append({
'severity': severity,
'category': 'code_quality',
'subcategory': 'function_length',
'title': f'Long function ({function_lines} lines)',
'description': f'Function is {function_lines} lines long (recommended: < 50)',
'file': str(file_path),
'line': current_function_line,
'code_snippet': current_function,
'impact': 'Long functions are harder to understand, test, and maintain',
'remediation': 'Extract smaller functions for distinct responsibilities',
'effort': 'medium',
})
current_function = stripped
current_function_line = line_num
function_lines = 0
brace_depth = 0
if current_function:
function_lines += 1
brace_depth += stripped.count('{') - stripped.count('}')
if brace_depth == 0 and function_lines > 1:
# Function ended
current_function = None
return findings
def analyze_python(codebase_path: Path) -> List[Dict]:
"""Analyze Python-specific quality issues."""
findings = []
# Python analysis to be implemented
# Would check: PEP 8 violations, complexity, type hints, etc.
return findings
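# A minimal sketch of one possible first Python check (an assumption about where to
# start, not the planned implementation): flag bare `except:` handlers with the
# standard-library ast module, reusing the finding schema from the checks above.
import ast

def check_bare_except(file_path: Path, content: str) -> List[Dict]:
    """Flag bare `except:` clauses, which silently swallow unexpected errors."""
    findings = []
    try:
        tree = ast.parse(content)
    except SyntaxError:
        return findings
    for node in ast.walk(tree):
        # ExceptHandler.type is None exactly when the handler is a bare `except:`
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append({
                'severity': 'low',
                'category': 'code_quality',
                'subcategory': 'code_smell',
                'title': 'Bare except clause',
                'description': f'Bare except on line {node.lineno}',
                'file': str(file_path),
                'line': node.lineno,
                'code_snippet': None,
                'impact': 'Bare except hides unexpected errors and masks bugs',
                'remediation': 'Catch specific exception types instead',
                'effort': 'low',
            })
    return findings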
def analyze_file_sizes(codebase_path: Path) -> List[Dict]:
"""Check for overly large files."""
findings = []
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '__pycache__'}
code_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py', '.java', '.go', '.rs'}
for file_path in codebase_path.rglob('*'):
if (file_path.is_file() and
file_path.suffix in code_extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
lines = len(f.readlines())
if lines > 500:
severity = 'high' if lines > 1000 else 'medium'
findings.append({
'severity': severity,
'category': 'code_quality',
'subcategory': 'file_length',
'title': f'Large file ({lines} lines)',
'description': f'File has {lines} lines (recommended: < 500)',
'file': str(file_path.relative_to(codebase_path)),
'line': 1,
'code_snippet': None,
'impact': 'Large files are difficult to navigate and understand',
'remediation': 'Split into multiple smaller, focused modules',
'effort': 'high',
})
except OSError:
pass
return findings
def analyze_dead_code(codebase_path: Path, tech_stack: Dict) -> List[Dict]:
"""Detect potential dead code (commented-out code blocks)."""
findings = []
exclude_dirs = {'node_modules', '.git', 'dist', 'build'}
extensions = set()
if tech_stack.get('javascript') or tech_stack.get('typescript'):
extensions.update({'.js', '.jsx', '.ts', '.tsx'})
if tech_stack.get('python'):
extensions.add('.py')
for file_path in codebase_path.rglob('*'):
if (file_path.suffix in extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
lines = f.readlines()
# Count consecutive commented lines with code-like content
comment_block_size = 0
block_start_line = 0
for line_num, line in enumerate(lines, start=1):
stripped = line.strip()
# Check if line is commented code
if (stripped.startswith('//') and
any(keyword in stripped for keyword in ['function', 'const', 'let', 'var', 'if', 'for', 'while', '{', '}', ';'])):
if comment_block_size == 0:
block_start_line = line_num
comment_block_size += 1
else:
# End of comment block
if comment_block_size >= 5: # 5+ lines of commented code
findings.append({
'severity': 'low',
'category': 'code_quality',
'subcategory': 'dead_code',
'title': f'Commented-out code block ({comment_block_size} lines)',
'description': f'Found {comment_block_size} lines of commented code',
'file': str(file_path.relative_to(codebase_path)),
'line': block_start_line,
'code_snippet': None,
'impact': 'Commented code clutters codebase and reduces readability',
'remediation': 'Remove commented code (it\'s in version control if needed)',
'effort': 'low',
})
comment_block_size = 0
except OSError:
pass
return findings

View File

@@ -0,0 +1,31 @@
"""
Dependencies Analyzer
Analyzes:
- Outdated dependencies
- Vulnerable dependencies
- License compliance
- Dependency health
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze dependencies for issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of dependency-related findings
"""
findings = []
# Placeholder implementation
# In production, this would integrate with npm audit, pip-audit, etc.
return findings
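# One possible concrete implementation for the npm side (a sketch; it assumes npm is
# on PATH and that `npm audit --json` emits a top-level "vulnerabilities" object keyed
# by package name, as recent npm versions do -- verify against the npm version in use):
def scan_npm_audit(codebase_path: Path) -> List[Dict]:
    """Convert `npm audit --json` output into findings (best-effort)."""
    import json
    import subprocess
    findings = []
    severity_map = {'info': 'low', 'low': 'low', 'moderate': 'medium',
                    'high': 'high', 'critical': 'critical'}
    try:
        result = subprocess.run(
            ['npm', 'audit', '--json'],
            cwd=str(codebase_path), capture_output=True, text=True, timeout=120,
        )
        data = json.loads(result.stdout or '{}')
    except (OSError, subprocess.SubprocessError, json.JSONDecodeError):
        return findings
    for name, vuln in data.get('vulnerabilities', {}).items():
        findings.append({
            'severity': severity_map.get(str(vuln.get('severity', '')).lower(), 'medium'),
            'category': 'dependencies',
            'subcategory': 'vulnerability',
            'title': f'Vulnerable dependency: {name}',
            'description': f"npm audit severity: {vuln.get('severity', 'unknown')}",
            'file': 'package.json',
            'line': None,
            'code_snippet': None,
            'impact': 'Known vulnerabilities in dependencies can be exploited',
            'remediation': f'Run `npm audit fix` or upgrade {name} to a patched release',
            'effort': 'low',
        })
    return findings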

View File

@@ -0,0 +1,30 @@
"""
Performance Analyzer
Analyzes:
- Bundle sizes
- Build times
- Runtime performance indicators
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze performance issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of performance-related findings
"""
findings = []
# Placeholder implementation
# In production, this would analyze bundle sizes, check build configs, etc.
return findings
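# A lightweight, tool-free starting point (sketch): flag suspiciously large JavaScript
# bundles in common build output directories. The 500 KB threshold and directory names
# are assumptions, not established budgets for any particular project.
def check_bundle_sizes(codebase_path: Path, max_kb: int = 500) -> List[Dict]:
    """Flag built JS bundles larger than max_kb kilobytes."""
    findings = []
    for build_dir in ('dist', 'build', 'out'):
        directory = codebase_path / build_dir
        if not directory.is_dir():
            continue
        for bundle in directory.rglob('*.js'):
            size_kb = bundle.stat().st_size / 1024
            if size_kb > max_kb:
                findings.append({
                    'severity': 'medium',
                    'category': 'performance',
                    'subcategory': 'bundle_size',
                    'title': f'Large bundle: {bundle.name} ({size_kb:.0f} KB)',
                    'description': f'Bundle exceeds the {max_kb} KB budget',
                    'file': str(bundle.relative_to(codebase_path)),
                    'line': None,
                    'code_snippet': None,
                    'impact': 'Large bundles slow page loads, especially on mobile networks',
                    'remediation': 'Enable code splitting, tree shaking, and dependency pruning',
                    'effort': 'medium',
                })
    return findings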

View File

@@ -0,0 +1,235 @@
"""
Security Scanner
Analyzes codebase for:
- Secrets in code (API keys, tokens, passwords)
- Dependency vulnerabilities
- Common security anti-patterns
- OWASP Top 10 issues
"""
import re
import json
from pathlib import Path
from typing import Dict, List
# Common patterns for secrets
SECRET_PATTERNS = {
'api_key': re.compile(r'(api[_-]?key|apikey)\s*[=:]\s*["\']([a-zA-Z0-9_-]{20,})["\']', re.IGNORECASE),
'aws_key': re.compile(r'AKIA[0-9A-Z]{16}'),
'generic_secret': re.compile(r'(secret|password|passwd|pwd)\s*[=:]\s*["\']([^"\'\s]{8,})["\']', re.IGNORECASE),
'private_key': re.compile(r'-----BEGIN (RSA |)PRIVATE KEY-----'),
'jwt': re.compile(r'eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+'),
'github_token': re.compile(r'gh[pousr]_[A-Za-z0-9_]{36}'),
'slack_token': re.compile(r'xox[baprs]-[0-9]{10,12}-[0-9]{10,12}-[a-zA-Z0-9]{24,32}'),
}
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze codebase for security issues.
Args:
codebase_path: Path to codebase
metadata: Project metadata from discovery phase
Returns:
List of security findings
"""
findings = []
# Scan for secrets
findings.extend(scan_for_secrets(codebase_path))
# Scan dependencies for vulnerabilities
if metadata.get('tech_stack', {}).get('javascript'):
findings.extend(scan_npm_dependencies(codebase_path))
# Check for common security anti-patterns
findings.extend(scan_security_antipatterns(codebase_path, metadata))
return findings
def scan_for_secrets(codebase_path: Path) -> List[Dict]:
"""Scan for hardcoded secrets in code."""
findings = []
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '__pycache__', '.venv', 'venv'}
exclude_files = {'.env.example', 'package-lock.json', 'yarn.lock'}
# File extensions to scan
code_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py', '.java', '.go', '.rb', '.php', '.yml', '.yaml', '.json', '.env'}
for file_path in codebase_path.rglob('*'):
if (file_path.is_file() and
(file_path.suffix in code_extensions or file_path.name == '.env') and  # Path('.env').suffix is '', so match dotfiles by name
file_path.name not in exclude_files and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
for pattern_name, pattern in SECRET_PATTERNS.items():
matches = pattern.finditer(content)
for match in matches:
# Find line number
line_num = content[:match.start()].count('\n') + 1
# Skip if it's clearly a placeholder or example
matched_text = match.group(0)
if is_placeholder(matched_text):
continue
findings.append({
'severity': 'critical',
'category': 'security',
'subcategory': 'secrets',
'title': f'Potential {pattern_name.replace("_", " ")} found in code',
'description': f'Found potential secret on line {line_num}',
'file': str(file_path.relative_to(codebase_path)),
'line': line_num,
'code_snippet': lines[line_num - 1].strip() if line_num <= len(lines) else '',
'impact': 'Exposed secrets can lead to unauthorized access and data breaches',
'remediation': 'Remove secret from code and use environment variables or secret management tools',
'effort': 'low',
})
except OSError:
pass
return findings
def is_placeholder(text: str) -> bool:
"""Check if a potential secret is actually a placeholder."""
placeholders = [
'your_api_key', 'your_secret', 'example', 'placeholder', 'test',
'dummy', 'sample', 'xxx', '000', 'abc123', 'changeme', 'replace_me',
'my_api_key', 'your_key_here', 'insert_key_here'
]
text_lower = text.lower()
return any(placeholder in text_lower for placeholder in placeholders)
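# Illustrative behaviour of the placeholder filter above (hypothetical values):
#   is_placeholder('api_key = "your_api_key_here"')    -> True  (skipped)
#   is_placeholder('api_key = "f3b9c2e8d1a7455e9c0b"') -> False (reported)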
def scan_npm_dependencies(codebase_path: Path) -> List[Dict]:
"""Scan npm dependencies for known vulnerabilities."""
findings = []
package_json = codebase_path / 'package.json'
if not package_json.exists():
return findings
try:
with open(package_json, 'r') as f:
pkg = json.load(f)
deps = {**pkg.get('dependencies', {}), **pkg.get('devDependencies', {})}
# Check for commonly vulnerable packages (simplified - in production use npm audit)
vulnerable_packages = {
'lodash': ('< 4.17.21', 'Prototype pollution vulnerability'),
'axios': ('< 0.21.1', 'SSRF vulnerability'),
'node-fetch': ('< 2.6.7', 'Information exposure vulnerability'),
}
for pkg_name, (vulnerable_version, description) in vulnerable_packages.items():
if pkg_name in deps:
findings.append({
'severity': 'high',
'category': 'security',
'subcategory': 'dependencies',
'title': f'Potentially vulnerable dependency: {pkg_name}',
'description': f'{description} (version: {deps[pkg_name]})',
'file': 'package.json',
'line': None,
'code_snippet': f'"{pkg_name}": "{deps[pkg_name]}"',
'impact': 'Vulnerable dependencies can be exploited by attackers',
'remediation': f'Update {pkg_name} to version {vulnerable_version.replace("< ", ">= ")} or later',
'effort': 'low',
})
except (OSError, json.JSONDecodeError):
pass
return findings
def scan_security_antipatterns(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Scan for common security anti-patterns."""
findings = []
if metadata.get('tech_stack', {}).get('javascript') or metadata.get('tech_stack', {}).get('typescript'):
findings.extend(scan_js_security_issues(codebase_path))
return findings
def scan_js_security_issues(codebase_path: Path) -> List[Dict]:
"""Scan JavaScript/TypeScript for security anti-patterns."""
findings = []
extensions = {'.js', '.jsx', '.ts', '.tsx'}
exclude_dirs = {'node_modules', '.git', 'dist', 'build'}
# Dangerous patterns
patterns = {
'eval': (
re.compile(r'\beval\s*\('),
'Use of eval() is dangerous',
'eval() can execute arbitrary code and is a security risk',
'Refactor to avoid eval(); use JSON.parse for data and explicit lookup/dispatch logic instead of executing strings'
),
'dangerouslySetInnerHTML': (
re.compile(r'dangerouslySetInnerHTML'),
'Use of dangerouslySetInnerHTML without sanitization',
'Can lead to XSS attacks if not properly sanitized',
'Sanitize HTML content or use safer alternatives'
),
'innerHTML': (
re.compile(r'\.innerHTML\s*='),
'Direct assignment to innerHTML',
'Can lead to XSS attacks if content is not sanitized',
'Use textContent for text or sanitize HTML before assigning'
),
'document.write': (
re.compile(r'document\.write\s*\('),
'Use of document.write()',
'Can be exploited for XSS and causes page reflow',
'Use DOM manipulation methods instead'
),
}
for file_path in codebase_path.rglob('*'):
if (file_path.suffix in extensions and
not any(excluded in file_path.parts for excluded in exclude_dirs)):
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as f:
content = f.read()
lines = content.split('\n')
for pattern_name, (pattern, title, impact, remediation) in patterns.items():
for line_num, line in enumerate(lines, start=1):
if pattern.search(line):
findings.append({
'severity': 'high',
'category': 'security',
'subcategory': 'code_security',
'title': title,
'description': f'Found on line {line_num}',
'file': str(file_path.relative_to(codebase_path)),
'line': line_num,
'code_snippet': line.strip(),
'impact': impact,
'remediation': remediation,
'effort': 'medium',
})
except OSError:
pass
return findings

View File

@@ -0,0 +1,76 @@
"""
Technical Debt Calculator
Calculates:
- SQALE rating (A-E)
- Remediation effort estimates
- Debt categorization
"""
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Calculate technical debt metrics.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of technical debt findings
"""
findings = []
# Placeholder implementation
# In production, this would calculate SQALE rating based on all findings
return findings
def calculate_sqale_rating(all_findings: List[Dict], total_loc: int) -> str:
"""
Calculate SQALE rating (A-E) based on findings.
Args:
all_findings: All findings from all analyzers
total_loc: Total lines of code
Returns:
SQALE rating (A, B, C, D, or E)
"""
# Estimate remediation time in hours
severity_hours = {
'critical': 8,
'high': 4,
'medium': 2,
'low': 0.5
}
total_remediation_hours = sum(
severity_hours.get(finding.get('severity', 'low'), 0.5)
for finding in all_findings
)
# Estimate development time (1 hour per 50 LOC is conservative)
development_hours = total_loc / 50
# Calculate debt ratio
if development_hours == 0:
debt_ratio = 0
else:
debt_ratio = (total_remediation_hours / development_hours) * 100
# Assign SQALE rating
if debt_ratio <= 5:
return 'A'
elif debt_ratio <= 10:
return 'B'
elif debt_ratio <= 20:
return 'C'
elif debt_ratio <= 50:
return 'D'
else:
return 'E'
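# Usage sketch (hypothetical numbers): 12 medium findings at 2 h each (24 h of
# remediation) in a 60,000-LOC project (~1,200 h of estimated development) give a
# debt ratio of 2%, which maps to an 'A' rating.
if __name__ == '__main__':
    sample_findings = [{'severity': 'medium'}] * 12
    assert calculate_sqale_rating(sample_findings, total_loc=60_000) == 'A'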

View File

@@ -0,0 +1,184 @@
"""
Test Coverage Analyzer
Analyzes:
- Test coverage percentage
- Testing Trophy distribution
- Test quality
- Untested critical paths
"""
import json
from pathlib import Path
from typing import Dict, List
def analyze(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""
Analyze test coverage and quality.
Args:
codebase_path: Path to codebase
metadata: Project metadata
Returns:
List of testing-related findings
"""
findings = []
# Check for test files existence
test_stats = analyze_test_presence(codebase_path, metadata)
if test_stats:
findings.extend(test_stats)
# Analyze coverage if coverage reports exist
coverage_findings = analyze_coverage_reports(codebase_path, metadata)
if coverage_findings:
findings.extend(coverage_findings)
return findings
def analyze_test_presence(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Check for test file presence and basic test hygiene."""
findings = []
# Count test files
test_extensions = {'.test.js', '.test.ts', '.test.jsx', '.test.tsx', '.spec.js', '.spec.ts'}
test_dirs = {'__tests__', 'tests', 'test', 'spec'}
test_file_count = 0
source_file_count = 0
exclude_dirs = {'node_modules', '.git', 'dist', 'build', '__pycache__'}
source_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py'}
for file_path in codebase_path.rglob('*'):
if file_path.is_file() and not any(excluded in file_path.parts for excluded in exclude_dirs):
# Check if it's a test file
is_test = (
any(file_path.name.endswith(ext) for ext in test_extensions) or
any(test_dir in file_path.parts for test_dir in test_dirs)
)
if is_test:
test_file_count += 1
elif file_path.suffix in source_extensions:
source_file_count += 1
# Calculate test ratio
if source_file_count > 0:
test_ratio = (test_file_count / source_file_count) * 100
if test_ratio < 20:
findings.append({
'severity': 'high',
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Low test file ratio ({test_ratio:.1f}%)',
'description': f'Only {test_file_count} test files for {source_file_count} source files',
'file': None,
'line': None,
'code_snippet': None,
'impact': 'Insufficient testing leads to bugs and difficult refactoring',
'remediation': 'Add tests for untested modules, aim for at least 80% coverage',
'effort': 'high',
})
elif test_ratio < 50:
findings.append({
'severity': 'medium',
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Moderate test file ratio ({test_ratio:.1f}%)',
'description': f'{test_file_count} test files for {source_file_count} source files',
'file': None,
'line': None,
'code_snippet': None,
'impact': 'More tests needed to achieve recommended 80% coverage',
'remediation': 'Continue adding tests, focus on critical paths first',
'effort': 'medium',
})
return findings
def analyze_coverage_reports(codebase_path: Path, metadata: Dict) -> List[Dict]:
"""Analyze coverage reports if they exist."""
findings = []
# Look for coverage reports (Istanbul/c8 format)
coverage_files = [
codebase_path / 'coverage' / 'coverage-summary.json',
codebase_path / 'coverage' / 'coverage-final.json',
codebase_path / '.nyc_output' / 'coverage-summary.json',
]
coverage_found = False
for coverage_file in coverage_files:
if coverage_file.exists():
try:
with open(coverage_file, 'r') as f:
coverage_data = json.load(f)
# Extract total coverage
total = coverage_data.get('total', {})
if not total:
    continue  # coverage-final.json holds per-file data, not a summary block
coverage_found = True
line_coverage = total.get('lines', {}).get('pct', 0)
branch_coverage = total.get('branches', {}).get('pct', 0)
function_coverage = total.get('functions', {}).get('pct', 0)
statement_coverage = total.get('statements', {}).get('pct', 0)
# Check against 80% threshold
if line_coverage < 80:
severity = 'high' if line_coverage < 50 else 'medium'
findings.append({
'severity': severity,
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Line coverage below target ({line_coverage:.1f}%)',
'description': f'Current coverage is {line_coverage:.1f}%, target is 80%',
'file': 'coverage/coverage-summary.json',
'line': None,
'code_snippet': None,
'impact': 'Low coverage means untested code paths and higher bug risk',
'remediation': f'Add tests to increase coverage by {80 - line_coverage:.1f}%',
'effort': 'high',
})
if branch_coverage < 75:
findings.append({
'severity': 'medium',
'category': 'testing',
'subcategory': 'test_coverage',
'title': f'Branch coverage below target ({branch_coverage:.1f}%)',
'description': f'Current branch coverage is {branch_coverage:.1f}%, target is 75%',
'file': 'coverage/coverage-summary.json',
'line': None,
'code_snippet': None,
'impact': 'Untested branches can hide bugs in conditional logic',
'remediation': 'Add tests for edge cases and conditional branches',
'effort': 'medium',
})
break # Found coverage, don't check other files
except (OSError, json.JSONDecodeError):
pass
# If no usable coverage report was found
if not coverage_found:
findings.append({
'severity': 'medium',
'category': 'testing',
'subcategory': 'test_infrastructure',
'title': 'No coverage report found',
'description': 'No readable coverage-summary.json was found',
'file': None,
'line': None,
'code_snippet': None,
'impact': 'Cannot measure test effectiveness without coverage reports',
'remediation': 'Configure test runner to generate coverage reports (Jest: --coverage, Vitest: --coverage)',
'effort': 'low',
})
return findings
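# For reference, the Istanbul/c8 "json-summary" file this analyzer reads is shaped
# roughly as follows (numbers illustrative; verify against your test runner's output):
#
#   {
#     "total": {
#       "lines":      {"total": 820, "covered": 574, "skipped": 0, "pct": 70.0},
#       "statements": {"total": 900, "covered": 630, "skipped": 0, "pct": 70.0},
#       "functions":  {"total": 120, "covered": 96,  "skipped": 0, "pct": 80.0},
#       "branches":   {"total": 300, "covered": 210, "skipped": 0, "pct": 70.0}
#     },
#     "src/example.ts": { "...": "per-file entries with the same shape" }
#   }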

View File

@@ -0,0 +1,408 @@
#!/usr/bin/env python3
"""
Codebase Audit Engine
Orchestrates comprehensive codebase analysis using multiple specialized analyzers.
Generates detailed audit reports and remediation plans based on modern SDLC best practices.
Usage:
python audit_engine.py /path/to/codebase --output report.md
python audit_engine.py /path/to/codebase --format json --output report.json
python audit_engine.py /path/to/codebase --scope security,quality
"""
import argparse
import json
import sys
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional
import importlib.util
# Import analyzers dynamically to support progressive loading
ANALYZERS = {
'quality': 'analyzers.code_quality',
'testing': 'analyzers.test_coverage',
'security': 'analyzers.security_scan',
'dependencies': 'analyzers.dependencies',
'performance': 'analyzers.performance',
'technical_debt': 'analyzers.technical_debt',
}
class AuditEngine:
"""
Core audit engine that orchestrates codebase analysis.
Uses progressive disclosure: loads only necessary analyzers based on scope.
"""
def __init__(self, codebase_path: Path, scope: Optional[List[str]] = None):
"""
Initialize audit engine.
Args:
codebase_path: Path to the codebase to audit
scope: Optional list of analysis categories to run (e.g., ['security', 'quality'])
If None, runs all analyzers.
"""
self.codebase_path = Path(codebase_path).resolve()
self.scope = scope or list(ANALYZERS.keys())
self.findings: Dict[str, List[Dict]] = {}
self.metadata: Dict = {}
if not self.codebase_path.exists():
raise FileNotFoundError(f"Codebase path does not exist: {self.codebase_path}")
def discover_project(self) -> Dict:
"""
Phase 1: Initial project discovery (lightweight scan).
Returns:
Dictionary containing project metadata
"""
print("🔍 Phase 1: Discovering project structure...")
metadata = {
'path': str(self.codebase_path),
'scan_time': datetime.now().isoformat(),
'tech_stack': self._detect_tech_stack(),
'project_type': self._detect_project_type(),
'total_files': self._count_files(),
'total_lines': self._count_lines(),
'git_info': self._get_git_info(),
}
self.metadata = metadata
return metadata
def _detect_tech_stack(self) -> Dict[str, bool]:
"""Detect languages and frameworks used in the project."""
tech_stack = {
'javascript': (self.codebase_path / 'package.json').exists(),
'typescript': self._file_exists_with_extension('.ts') or self._file_exists_with_extension('.tsx'),
'python': (self.codebase_path / 'setup.py').exists() or
(self.codebase_path / 'pyproject.toml').exists() or
self._file_exists_with_extension('.py'),
'react': self._check_dependency('react'),
'vue': self._check_dependency('vue'),
'angular': self._check_dependency('@angular/core'),
'node': (self.codebase_path / 'package.json').exists(),
'docker': (self.codebase_path / 'Dockerfile').exists(),
}
return {k: v for k, v in tech_stack.items() if v}
def _detect_project_type(self) -> str:
"""Determine project type (web app, library, CLI, etc.)."""
if (self.codebase_path / 'package.json').exists():
try:
with open(self.codebase_path / 'package.json', 'r') as f:
pkg = json.load(f)
if pkg.get('private') is False:
return 'library'
if 'bin' in pkg:
return 'cli'
return 'web_app'
except (OSError, json.JSONDecodeError):
pass
if (self.codebase_path / 'setup.py').exists():
return 'python_package'
return 'unknown'
def _count_files(self) -> int:
"""Count total files in codebase (excluding common ignore patterns)."""
exclude_dirs = {'.git', 'node_modules', '__pycache__', '.venv', 'venv', 'dist', 'build'}
count = 0
for path in self.codebase_path.rglob('*'):
if path.is_file() and not any(excluded in path.parts for excluded in exclude_dirs):
count += 1
return count
def _count_lines(self) -> int:
"""Count total lines of code (excluding empty lines and comments)."""
exclude_dirs = {'.git', 'node_modules', '__pycache__', '.venv', 'venv', 'dist', 'build'}
code_extensions = {'.js', '.jsx', '.ts', '.tsx', '.py', '.java', '.go', '.rs', '.rb'}
total_lines = 0
for path in self.codebase_path.rglob('*'):
if (path.is_file() and
path.suffix in code_extensions and
not any(excluded in path.parts for excluded in exclude_dirs)):
try:
with open(path, 'r', encoding='utf-8', errors='ignore') as f:
total_lines += sum(1 for line in f if line.strip() and not line.strip().startswith(('//', '#', '/*', '*')))
except OSError:
pass
return total_lines
def _get_git_info(self) -> Optional[Dict]:
"""Get git repository information."""
git_dir = self.codebase_path / '.git'
if not git_dir.exists():
return None
try:
import subprocess
result = subprocess.run(
['git', '-C', str(self.codebase_path), 'log', '--oneline', '-10'],
capture_output=True,
text=True,
timeout=5
)
commit_count = subprocess.run(
['git', '-C', str(self.codebase_path), 'rev-list', '--count', 'HEAD'],
capture_output=True,
text=True,
timeout=5
)
return {
'is_git_repo': True,
'recent_commits': result.stdout.strip().split('\n') if result.returncode == 0 else [],
'total_commits': int(commit_count.stdout.strip()) if commit_count.returncode == 0 else 0,
}
except Exception:
return {'is_git_repo': True, 'error': 'Could not read git info'}
def _file_exists_with_extension(self, extension: str) -> bool:
"""Check if any file with given extension exists."""
return any(self.codebase_path.rglob(f'*{extension}'))
def _check_dependency(self, dep_name: str) -> bool:
"""Check if a dependency exists in package.json."""
pkg_json = self.codebase_path / 'package.json'
if not pkg_json.exists():
return False
try:
with open(pkg_json, 'r') as f:
pkg = json.load(f)
deps = {**pkg.get('dependencies', {}), **pkg.get('devDependencies', {})}
return dep_name in deps
except (OSError, json.JSONDecodeError):
return False
def run_analysis(self, phase: str = 'full') -> Dict:
"""
Phase 2: Deep analysis using specialized analyzers.
Args:
phase: 'quick' for lightweight scan, 'full' for comprehensive analysis
Returns:
Dictionary containing all findings
"""
print(f"🔬 Phase 2: Running {phase} analysis...")
for category in self.scope:
if category not in ANALYZERS:
print(f"⚠️ Unknown analyzer category: {category}, skipping...")
continue
print(f" Analyzing {category}...")
analyzer_findings = self._run_analyzer(category)
if analyzer_findings:
self.findings[category] = analyzer_findings
return self.findings
def _run_analyzer(self, category: str) -> List[Dict]:
"""
Run a specific analyzer module.
Args:
category: Analyzer category name
Returns:
List of findings from the analyzer
"""
module_path = ANALYZERS.get(category)
if not module_path:
return []
try:
# Import analyzer module dynamically
analyzer_file = Path(__file__).parent / f"{module_path.replace('.', '/')}.py"
if not analyzer_file.exists():
print(f" ⚠️ Analyzer not yet implemented: {category}")
return []
spec = importlib.util.spec_from_file_location(module_path, analyzer_file)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
# Each analyzer should have an analyze() function
if hasattr(module, 'analyze'):
return module.analyze(self.codebase_path, self.metadata)
else:
print(f" ⚠️ Analyzer missing analyze() function: {category}")
return []
except Exception as e:
print(f" ❌ Error running analyzer {category}: {e}")
return []
def calculate_scores(self) -> Dict[str, float]:
"""
Calculate health scores for each category and overall.
Returns:
Dictionary of scores (0-100 scale)
"""
scores = {}
# Calculate score for each category based on findings severity
for category, findings in self.findings.items():
if not findings:
scores[category] = 100.0
continue
# Weighted scoring based on severity
severity_weights = {'critical': 10, 'high': 5, 'medium': 2, 'low': 1}
total_weight = sum(severity_weights.get(f.get('severity', 'low'), 1) for f in findings)
# Score decreases based on weighted issues
# Each finding subtracts its severity weight from 100, capped at a floor of 0
penalty = min(total_weight, 100)
scores[category] = max(0, 100 - penalty)
# Overall score is the simple average of category scores
if scores:
scores['overall'] = sum(scores.values()) / len(scores)
else:
scores['overall'] = 100.0
return scores
def generate_summary(self) -> Dict:
"""
Generate executive summary of audit results.
Returns:
Summary dictionary
"""
critical_count = sum(
1 for findings in self.findings.values()
for f in findings
if f.get('severity') == 'critical'
)
high_count = sum(
1 for findings in self.findings.values()
for f in findings
if f.get('severity') == 'high'
)
scores = self.calculate_scores()
return {
'overall_score': round(scores.get('overall', 0), 1),
'category_scores': {k: round(v, 1) for k, v in scores.items() if k != 'overall'},
'critical_issues': critical_count,
'high_issues': high_count,
'total_issues': sum(len(findings) for findings in self.findings.values()),
'metadata': self.metadata,
}
def main():
"""Main entry point for CLI usage."""
parser = argparse.ArgumentParser(
description='Comprehensive codebase auditor based on modern SDLC best practices (2024-25)',
formatter_class=argparse.RawDescriptionHelpFormatter,
)
parser.add_argument(
'codebase',
type=str,
help='Path to the codebase to audit'
)
parser.add_argument(
'--scope',
type=str,
help='Comma-separated list of analysis categories (quality,testing,security,dependencies,performance,technical_debt)',
default=None
)
parser.add_argument(
'--phase',
type=str,
choices=['quick', 'full'],
default='full',
help='Analysis depth: quick (Phase 1 only) or full (Phase 1 + 2)'
)
parser.add_argument(
'--format',
type=str,
choices=['markdown', 'json', 'html'],
default='markdown',
help='Output format for the report'
)
parser.add_argument(
'--output',
type=str,
help='Output file path (default: stdout)',
default=None
)
args = parser.parse_args()
# Parse scope
scope = args.scope.split(',') if args.scope else None
# Initialize engine
try:
engine = AuditEngine(args.codebase, scope=scope)
except FileNotFoundError as e:
print(f"❌ Error: {e}", file=sys.stderr)
sys.exit(1)
# Run audit
print("🚀 Starting codebase audit...")
print(f" Codebase: {args.codebase}")
print(f" Scope: {scope or 'all'}")
print(f" Phase: {args.phase}")
print()
# Phase 1: Discovery
metadata = engine.discover_project()
print(f" Detected: {', '.join(metadata['tech_stack'].keys())}")
print(f" Files: {metadata['total_files']}")
print(f" Lines of code: {metadata['total_lines']:,}")
print()
# Phase 2: Analysis (if not quick mode)
if args.phase == 'full':
findings = engine.run_analysis()
# Generate summary
summary = engine.generate_summary()
# Output results
print()
print("📊 Audit complete!")
print(f" Overall score: {summary['overall_score']}/100")
print(f" Critical issues: {summary['critical_issues']}")
print(f" High issues: {summary['high_issues']}")
print(f" Total issues: {summary['total_issues']}")
print()
# Generate report (to be implemented in report_generator.py)
if args.output:
print(f"📝 Report generation will be implemented in report_generator.py")
print(f" Format: {args.format}")
print(f" Output: {args.output}")
if __name__ == '__main__':
main()

View File

@@ -0,0 +1,241 @@
#!/usr/bin/env python3
"""
Remediation Planner
Generates prioritized action plans based on audit findings.
Uses severity, impact, frequency, and effort to prioritize issues.
"""
from typing import Dict, List
from datetime import datetime, timedelta
def generate_remediation_plan(findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate a prioritized remediation plan.
Args:
findings: All findings organized by category
metadata: Project metadata
Returns:
Markdown-formatted remediation plan
"""
plan = []
# Header
plan.append("# Codebase Remediation Plan")
plan.append(f"\n**Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
plan.append(f"**Codebase**: `{metadata.get('path', 'Unknown')}`")
plan.append("\n---\n")
# Flatten and prioritize all findings
all_findings = []
for category, category_findings in findings.items():
for finding in category_findings:
finding['category'] = category
all_findings.append(finding)
# Calculate priority scores
for finding in all_findings:
finding['priority_score'] = calculate_priority_score(finding)
# Sort by priority score (highest first)
all_findings.sort(key=lambda x: x['priority_score'], reverse=True)
# Group by priority level
p0_issues = [f for f in all_findings if f['severity'] == 'critical']
p1_issues = [f for f in all_findings if f['severity'] == 'high']
p2_issues = [f for f in all_findings if f['severity'] == 'medium']
p3_issues = [f for f in all_findings if f['severity'] == 'low']
# Priority 0: Critical Issues (Fix Immediately)
if p0_issues:
plan.append("## Priority 0: Critical Issues (Fix Immediately ⚡)")
plan.append("\n**Timeline**: Within 24 hours")
plan.append("**Impact**: Security vulnerabilities, production-breaking bugs, data loss risks\n")
for i, finding in enumerate(p0_issues, 1):
plan.append(f"### {i}. {finding.get('title', 'Untitled')}")
plan.append(f"**Category**: {finding.get('category', 'Unknown').replace('_', ' ').title()}")
plan.append(f"**Location**: `{finding.get('file', 'Unknown')}`")
plan.append(f"**Effort**: {finding.get('effort', 'unknown').upper()}")
plan.append(f"\n**Issue**: {finding.get('description', 'No description')}")
plan.append(f"\n**Impact**: {finding.get('impact', 'Unknown impact')}")
plan.append(f"\n**Action**: {finding.get('remediation', 'No remediation suggested')}\n")
plan.append("---\n")
# Priority 1: High Issues (Fix This Sprint)
if p1_issues:
plan.append("## Priority 1: High Issues (Fix This Sprint 📅)")
plan.append("\n**Timeline**: Within current sprint (2 weeks)")
plan.append("**Impact**: Significant quality, security, or user experience issues\n")
for i, finding in enumerate(p1_issues[:10], 1): # Top 10
plan.append(f"### {i}. {finding.get('title', 'Untitled')}")
plan.append(f"**Category**: {finding.get('category', 'Unknown').replace('_', ' ').title()}")
plan.append(f"**Effort**: {finding.get('effort', 'unknown').upper()}")
plan.append(f"\n**Action**: {finding.get('remediation', 'No remediation suggested')}\n")
if len(p1_issues) > 10:
plan.append(f"\n*...and {len(p1_issues) - 10} more high-priority issues*\n")
plan.append("---\n")
# Priority 2: Medium Issues (Fix Next Quarter)
if p2_issues:
plan.append("## Priority 2: Medium Issues (Fix Next Quarter 📆)")
plan.append("\n**Timeline**: Within 3 months")
plan.append("**Impact**: Code maintainability, developer productivity\n")
plan.append(f"**Total Issues**: {len(p2_issues)}\n")
# Group by subcategory
subcategories = {}
for finding in p2_issues:
subcat = finding.get('subcategory', 'Other')
if subcat not in subcategories:
subcategories[subcat] = []
subcategories[subcat].append(finding)
plan.append("**Grouped by Type**:\n")
for subcat, subcat_findings in subcategories.items():
plan.append(f"- {subcat.replace('_', ' ').title()}: {len(subcat_findings)} issues")
plan.append("\n---\n")
# Priority 3: Low Issues (Backlog)
if p3_issues:
plan.append("## Priority 3: Low Issues (Backlog 📋)")
plan.append("\n**Timeline**: When time permits")
plan.append("**Impact**: Minor improvements, stylistic issues\n")
plan.append(f"**Total Issues**: {len(p3_issues)}\n")
plan.append("*Address during dedicated tech debt sprints or slow periods*\n")
plan.append("---\n")
# Implementation Timeline
plan.append("## Suggested Timeline\n")
today = datetime.now()
if p0_issues:
deadline = today + timedelta(days=1)
plan.append(f"- **{deadline.strftime('%Y-%m-%d')}**: All P0 issues resolved")
if p1_issues:
deadline = today + timedelta(weeks=2)
plan.append(f"- **{deadline.strftime('%Y-%m-%d')}**: P1 issues addressed (end of sprint)")
if p2_issues:
deadline = today + timedelta(weeks=12)
plan.append(f"- **{deadline.strftime('%Y-%m-%d')}**: P2 issues resolved (end of quarter)")
# Effort Summary
plan.append("\n## Effort Summary\n")
effort_estimates = calculate_effort_summary(all_findings)
plan.append(f"**Total Estimated Effort**: {effort_estimates['total']} person-days")
plan.append(f"- Critical/High: {effort_estimates['critical_high']} days")
plan.append(f"- Medium: {effort_estimates['medium']} days")
plan.append(f"- Low: {effort_estimates['low']} days")
# Team Assignment Suggestions
plan.append("\n## Team Assignment Suggestions\n")
plan.append("- **Security Team**: All P0 security issues, P1 vulnerabilities")
plan.append("- **QA/Testing**: Test coverage improvements, test quality issues")
plan.append("- **Infrastructure**: CI/CD improvements, build performance")
plan.append("- **Development Team**: Code quality refactoring, complexity reduction")
# Footer
plan.append("\n---\n")
plan.append("*Remediation plan generated by Codebase Auditor Skill*")
plan.append("\n*Priority scoring based on: Impact × 10 + Frequency × 5 - Effort × 2*")
return '\n'.join(plan)
def calculate_priority_score(finding: Dict) -> int:
"""
Calculate priority score for a finding.
Formula: (Impact × 10) + (Frequency × 5) - (Effort × 2)
Args:
finding: Individual finding
Returns:
Priority score (higher = more urgent)
"""
# Map severity to impact (1-10)
severity_impact = {
'critical': 10,
'high': 7,
'medium': 4,
'low': 2,
}
impact = severity_impact.get(finding.get('severity', 'low'), 1)
# Estimate frequency (1-10) based on category
# Security/testing issues affect everything
category = finding.get('category', '')
if category in ['security', 'testing']:
frequency = 10
elif category in ['quality', 'performance']:
frequency = 6
else:
frequency = 3
# Map effort to numeric value (1-10)
effort_values = {
'low': 2,
'medium': 5,
'high': 8,
}
effort = effort_values.get(finding.get('effort', 'medium'), 5)
# Calculate score
score = (impact * 10) + (frequency * 5) - (effort * 2)
return max(0, score) # Never negative
def calculate_effort_summary(findings: List[Dict]) -> Dict[str, int]:
"""
Calculate total effort estimates.
Args:
findings: All findings
Returns:
Dictionary with effort estimates in person-days
"""
# Map effort levels to days
effort_days = {
'low': 0.5,
'medium': 2,
'high': 5,
}
critical_high_days = sum(
effort_days.get(f.get('effort', 'medium'), 2)
for f in findings
if f.get('severity') in ['critical', 'high']
)
medium_days = sum(
effort_days.get(f.get('effort', 'medium'), 2)
for f in findings
if f.get('severity') == 'medium'
)
low_days = sum(
effort_days.get(f.get('effort', 'medium'), 2)
for f in findings
if f.get('severity') == 'low'
)
return {
'critical_high': round(critical_high_days, 1),
'medium': round(medium_days, 1),
'low': round(low_days, 1),
'total': round(critical_high_days + medium_days + low_days, 1),
}
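# A small self-check / usage sketch with hypothetical findings, illustrating the
# scoring and effort formulas above:
#   critical + security + low effort -> (10*10) + (10*5) - (2*2) = 146
#   low + quality + medium effort    -> (2*10)  + (6*5)  - (5*2) = 40
if __name__ == '__main__':
    examples = [
        {'severity': 'critical', 'category': 'security', 'effort': 'low'},
        {'severity': 'low', 'category': 'quality', 'effort': 'medium'},
    ]
    assert calculate_priority_score(examples[0]) == 146
    assert calculate_priority_score(examples[1]) == 40
    assert calculate_effort_summary(examples) == {
        'critical_high': 0.5, 'medium': 0.0, 'low': 2.0, 'total': 2.5
    }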

View File

@@ -0,0 +1,345 @@
#!/usr/bin/env python3
"""
Report Generator
Generates audit reports in multiple formats:
- Markdown (default, human-readable)
- JSON (machine-readable, CI/CD integration)
- HTML (interactive dashboard)
"""
import json
from html import escape
from datetime import datetime
from pathlib import Path
from typing import Dict, List
def generate_markdown_report(summary: Dict, findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate a Markdown-formatted audit report.
Args:
summary: Executive summary data
findings: All findings organized by category
metadata: Project metadata
Returns:
Markdown report as string
"""
report = []
# Header
report.append("# Codebase Audit Report")
report.append(f"\n**Generated**: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
report.append(f"**Codebase**: `{metadata.get('path', 'Unknown')}`")
report.append(f"**Tech Stack**: {', '.join(metadata.get('tech_stack', {}).keys())}")
report.append(f"**Total Files**: {metadata.get('total_files', 0):,}")
report.append(f"**Lines of Code**: {metadata.get('total_lines', 0):,}")
report.append("\n---\n")
# Executive Summary
report.append("## Executive Summary")
report.append(f"\n### Overall Health Score: **{summary.get('overall_score', 0)}/100**\n")
# Score breakdown
report.append("#### Category Scores\n")
for category, score in summary.get('category_scores', {}).items():
emoji = score_to_emoji(score)
report.append(f"- **{category.replace('_', ' ').title()}**: {score}/100 {emoji}")
# Issue summary
report.append("\n#### Issue Summary\n")
report.append(f"- **Critical Issues**: {summary.get('critical_issues', 0)}")
report.append(f"- **High Issues**: {summary.get('high_issues', 0)}")
report.append(f"- **Total Issues**: {summary.get('total_issues', 0)}")
report.append("\n---\n")
# Detailed Findings
report.append("## Detailed Findings\n")
severity_order = ['critical', 'high', 'medium', 'low']
for severity in severity_order:
severity_findings = []
for category, category_findings in findings.items():
for finding in category_findings:
if finding.get('severity') == severity:
severity_findings.append((category, finding))
if severity_findings:
severity_emoji = severity_to_emoji(severity)
report.append(f"### {severity_emoji} {severity.upper()} ({len(severity_findings)} issues)\n")
for category, finding in severity_findings:
report.append(f"#### {finding.get('title', 'Untitled Issue')}")
report.append(f"\n**Category**: {category.replace('_', ' ').title()}")
report.append(f"**Subcategory**: {finding.get('subcategory', 'N/A')}")
if finding.get('file'):
file_ref = f"{finding['file']}"
if finding.get('line'):
file_ref += f":{finding['line']}"
report.append(f"**Location**: `{file_ref}`")
report.append(f"\n{finding.get('description', 'No description')}")
if finding.get('code_snippet'):
report.append(f"\n```\n{finding['code_snippet']}\n```")
report.append(f"\n**Impact**: {finding.get('impact', 'Unknown impact')}")
report.append(f"\n**Remediation**: {finding.get('remediation', 'No remediation suggested')}")
report.append(f"\n**Effort**: {finding.get('effort', 'Unknown').upper()}\n")
report.append("---\n")
# Recommendations
report.append("## Recommendations\n")
report.append(generate_recommendations(summary, findings))
# Footer
report.append("\n---\n")
report.append("*Report generated by Codebase Auditor Skill (2024-25 Standards)*")
return '\n'.join(report)
def generate_json_report(summary: Dict, findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate a JSON-formatted audit report.
Args:
summary: Executive summary data
findings: All findings organized by category
metadata: Project metadata
Returns:
JSON report as string
"""
report = {
'generated_at': datetime.now().isoformat(),
'metadata': metadata,
'summary': summary,
'findings': findings,
'schema_version': '1.0.0',
}
return json.dumps(report, indent=2)
def generate_html_report(summary: Dict, findings: Dict[str, List[Dict]], metadata: Dict) -> str:
"""
Generate an HTML dashboard report.
Args:
summary: Executive summary data
findings: All findings organized by category
metadata: Project metadata
Returns:
HTML report as string
"""
# Simplified HTML template
html = f"""<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Codebase Audit Report</title>
<style>
body {{
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, sans-serif;
line-height: 1.6;
max-width: 1200px;
margin: 0 auto;
padding: 20px;
background: #f5f5f5;
}}
.header {{
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 30px;
border-radius: 10px;
margin-bottom: 20px;
}}
.score {{
font-size: 48px;
font-weight: bold;
margin: 20px 0;
}}
.metrics {{
display: grid;
grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
gap: 20px;
margin: 20px 0;
}}
.metric {{
background: white;
padding: 20px;
border-radius: 8px;
box-shadow: 0 2px 4px rgba(0,0,0,0.1);
}}
.metric-title {{
font-size: 14px;
color: #666;
text-transform: uppercase;
}}
.metric-value {{
font-size: 32px;
font-weight: bold;
margin: 10px 0;
}}
.finding {{
background: white;
padding: 20px;
margin: 10px 0;
border-radius: 8px;
border-left: 4px solid #ddd;
}}
.finding.critical {{ border-left-color: #e53e3e; }}
.finding.high {{ border-left-color: #dd6b20; }}
.finding.medium {{ border-left-color: #d69e2e; }}
.finding.low {{ border-left-color: #38a169; }}
.badge {{
display: inline-block;
padding: 4px 12px;
border-radius: 12px;
font-size: 12px;
font-weight: bold;
text-transform: uppercase;
}}
.badge.critical {{ background: #fed7d7; color: #742a2a; }}
.badge.high {{ background: #feebc8; color: #7c2d12; }}
.badge.medium {{ background: #fefcbf; color: #744210; }}
.badge.low {{ background: #c6f6d5; color: #22543d; }}
code {{
background: #f7fafc;
padding: 2px 6px;
border-radius: 3px;
font-family: 'Courier New', monospace;
}}
pre {{
background: #2d3748;
color: #e2e8f0;
padding: 15px;
border-radius: 5px;
overflow-x: auto;
}}
</style>
</head>
<body>
<div class="header">
<h1>🔍 Codebase Audit Report</h1>
<p><strong>Generated:</strong> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>
<p><strong>Codebase:</strong> {metadata.get('path', 'Unknown')}</p>
<div class="score">Overall Score: {summary.get('overall_score', 0)}/100</div>
</div>
<div class="metrics">
<div class="metric">
<div class="metric-title">Critical Issues</div>
<div class="metric-value" style="color: #e53e3e;">{summary.get('critical_issues', 0)}</div>
</div>
<div class="metric">
<div class="metric-title">High Issues</div>
<div class="metric-value" style="color: #dd6b20;">{summary.get('high_issues', 0)}</div>
</div>
<div class="metric">
<div class="metric-title">Total Issues</div>
<div class="metric-value">{summary.get('total_issues', 0)}</div>
</div>
<div class="metric">
<div class="metric-title">Lines of Code</div>
<div class="metric-value">{metadata.get('total_lines', 0):,}</div>
</div>
</div>
<h2>Findings</h2>
"""
# Add findings
severity_order = ['critical', 'high', 'medium', 'low']
for severity in severity_order:
for category, category_findings in findings.items():
for finding in category_findings:
if finding.get('severity') == severity:
html += f"""
<div class="finding {severity}">
<div>
<span class="badge {severity}">{severity}</span>
<strong>{finding.get('title', 'Untitled')}</strong>
</div>
<p>{finding.get('description', 'No description')}</p>
"""
if finding.get('file'):
html += f"<p><strong>Location:</strong> <code>{finding['file']}"
if finding.get('line'):
html += f":{finding['line']}"
html += "</code></p>"
if finding.get('code_snippet'):
html += f"<pre><code>{finding['code_snippet']}</code></pre>"
html += f"""
<p><strong>Impact:</strong> {finding.get('impact', 'Unknown')}</p>
<p><strong>Remediation:</strong> {finding.get('remediation', 'No suggestion')}</p>
</div>
"""
html += """
</body>
</html>
"""
return html
def score_to_emoji(score: float) -> str:
"""Convert score to emoji."""
if score >= 90:
return ""
elif score >= 70:
return "⚠️"
else:
return ""
def severity_to_emoji(severity: str) -> str:
"""Convert severity to emoji."""
severity_map = {
'critical': '🚨',
'high': '⚠️',
'medium': '🟡',
'low': '🟢',
}
return severity_map.get(severity, '')
def generate_recommendations(summary: Dict, findings: Dict) -> str:
"""Generate recommendations based on findings."""
recommendations = []
critical_count = summary.get('critical_issues', 0)
high_count = summary.get('high_issues', 0)
overall_score = summary.get('overall_score', 0)
if critical_count > 0:
recommendations.append(f"1. **Immediate Action Required**: Address all {critical_count} critical security and quality issues before deploying to production.")
if high_count > 5:
recommendations.append(f"2. **Sprint Focus**: Prioritize fixing the {high_count} high-severity issues in the next sprint. These significantly impact code quality and maintainability.")
if overall_score < 70:
recommendations.append("3. **Technical Debt Sprint**: Schedule a dedicated sprint to address accumulated technical debt and improve code quality metrics.")
if 'testing' in findings and len(findings['testing']) > 0:
recommendations.append("4. **Testing Improvements**: Increase test coverage to meet the 80% minimum threshold. Focus on critical paths first (authentication, payment, data processing).")
if 'security' in findings and len(findings['security']) > 0:
recommendations.append("5. **Security Review**: Conduct a thorough security review and penetration testing given the security issues found.")
if not recommendations:
recommendations.append("1. **Maintain Standards**: Continue following best practices and maintain current quality levels.")
recommendations.append("2. **Continuous Improvement**: Consider implementing automated code quality checks in CI/CD pipeline.")
return '\n'.join(recommendations)