434 lines
13 KiB
Markdown
434 lines
13 KiB
Markdown
|
|
# Security Architecture Review
|
|
|
|
## Overview
|
|
|
|
Systematically review designs for security issues. Core principle: **Checklist-driven review finds issues that intuition misses**.
|
|
|
|
**Key insight**: Ad-hoc review finds obvious issues. Systematic checklist finds subtle gaps.
|
|
|
|
## When to Use
|
|
|
|
Load this skill when:
|
|
- Reviewing architecture designs pre-implementation
|
|
- Conducting security audits of existing systems
|
|
- Evaluating third-party integrations
|
|
- Pre-deployment security validation
|
|
|
|
**Symptoms you need this**:
|
|
- "Is this design secure?"
|
|
- Reviewing microservices, APIs, data pipelines
|
|
- Pre-launch security check
|
|
- Compliance audit preparation
|
|
|
|
**Don't use for**:
|
|
- Threat modeling new systems (use `ordis/security-architect/threat-modeling`)
|
|
- Designing controls (use `ordis/security-architect/security-controls-design`)
|
|
- Secure patterns (use `ordis/security-architect/secure-by-design-patterns`)
|
|
|
|
## Review Process
|
|
|
|
### Step 1: Understand System
|
|
|
|
Before checklists, understand:
|
|
- **Components**: Services, databases, queues, external APIs
|
|
- **Data flows**: Where data enters/exits, trust boundaries
|
|
- **Users/actors**: Who accesses what (end users, admins, services)
|
|
- **Deployment**: Cloud/on-prem, network topology
|
|
|
|
### Step 2: Apply Checklists
|
|
|
|
For EACH area, go through checklist systematically. Check = verify present and secure.
|
|
|
|
### Step 3: Document Findings
|
|
|
|
For each issue found:
|
|
- **Description**: What's wrong
|
|
- **Severity**: Critical/High/Medium/Low
|
|
- **Impact**: What can attacker do
|
|
- **Recommendation**: How to fix
|
|
|
|
|
|
## Checklist 1: Authentication
|
|
|
|
### Credential Storage
|
|
- [ ] Passwords hashed with strong algorithm (bcrypt, Argon2, scrypt)
|
|
- [ ] NOT hashed with weak algorithms (MD5, SHA1, plain SHA256)
|
|
- [ ] Salt used per-password (not global salt or no salt)
|
|
- [ ] Key derivation with sufficient iterations (bcrypt cost ≥12, Argon2 with recommended params)
|
|
|
|
**Common Issues**:
|
|
- ❌ MD5/SHA1 hashing → CRITICAL
|
|
- ❌ No salt → HIGH
|
|
- ❌ Global salt → MEDIUM
|
|
- ❌ Low iteration count → MEDIUM
|
|
|
|
|
|
### Multi-Factor Authentication
|
|
- [ ] MFA available for all users (or at least admins)
|
|
- [ ] TOTP/hardware token support (not just SMS)
|
|
- [ ] Backup codes for account recovery
|
|
- [ ] Cannot bypass MFA via alternate login paths
|
|
|
|
**Common Issues**:
|
|
- ❌ No MFA → HIGH (for privileged accounts)
|
|
- ❌ SMS-only 2FA → MEDIUM (SIM swapping risk)
|
|
- ❌ MFA bypass path exists → HIGH
|
|
|
|
|
|
### Session Management
|
|
- [ ] Session tokens have expiration (not indefinite)
|
|
- [ ] Token expiry reasonable (hours for web, days max for mobile)
|
|
- [ ] Token rotation on sensitive actions (password change, permission change)
|
|
- [ ] Logout invalidates tokens (server-side revocation)
|
|
- [ ] Tokens stored securely (HttpOnly, Secure cookies or secure storage)
|
|
|
|
**Common Issues**:
|
|
- ❌ Non-expiring tokens → HIGH
|
|
- ❌ No logout/revocation → MEDIUM
|
|
- ❌ Tokens in localStorage (XSS vulnerable) → MEDIUM
|
|
|
|
|
|
### Password Policies
|
|
- [ ] Minimum length enforced (≥12 characters recommended)
|
|
- [ ] No maximum length (or very high max, ≥128 chars)
|
|
- [ ] Password history (prevent reuse of last N passwords)
|
|
- [ ] Account lockout after failed attempts (5-10 tries, temporary lockout)
|
|
- [ ] No common password blacklist (check against known-weak passwords)
|
|
|
|
**Common Issues**:
|
|
- ❌ No password policy → MEDIUM
|
|
- ❌ Short max length (<20 chars) → LOW
|
|
- ❌ No lockout → MEDIUM (brute-force vulnerability)
|
|
|
|
|
|
## Checklist 2: Authorization
|
|
|
|
### Access Control Model
|
|
- [ ] Clear access control model (RBAC, ABAC, MAC documented)
|
|
- [ ] Permissions defined per resource type
|
|
- [ ] Authorization checks at API layer AND data layer
|
|
- [ ] Consistent enforcement across all endpoints
|
|
|
|
**Common Issues**:
|
|
- ❌ No defined model (ad-hoc checks) → HIGH
|
|
- ❌ Authorization only at API (not data layer) → MEDIUM
|
|
- ❌ Inconsistent (some endpoints skip checks) → HIGH
|
|
|
|
|
|
### Privilege Escalation Prevention
|
|
- [ ] Users cannot modify their own roles/permissions
|
|
- [ ] API doesn't trust client-provided role claims
|
|
- [ ] Admin actions require separate authentication/approval
|
|
- [ ] No hidden admin endpoints without authorization
|
|
|
|
**Common Issues**:
|
|
- ❌ User can edit own role in profile → CRITICAL
|
|
- ❌ Trusting `X-User-Role` header from client → CRITICAL
|
|
- ❌ Hidden `/admin` path without authz → HIGH
|
|
|
|
|
|
### Resource-Level Authorization
|
|
- [ ] Authorization checks "Can THIS user access THIS resource?"
|
|
- [ ] Not just "Is user logged in?" or "Is user admin?"
|
|
- [ ] Users can only access their own data (unless explicitly shared)
|
|
- [ ] Object-level permission checks (IDOR prevention)
|
|
|
|
**Common Issues**:
|
|
- ❌ Check login only, not resource ownership → HIGH (IDOR vulnerability)
|
|
- ❌ `/api/users/{user_id}` allows any authenticated user → HIGH
|
|
|
|
|
|
### Default-Deny Principle
|
|
- [ ] All endpoints require authentication by default
|
|
- [ ] Explicit allow-list for public endpoints
|
|
- [ ] Authorization failures return 403 Forbidden (not 404)
|
|
- [ ] No "development mode" backdoors in production
|
|
|
|
**Common Issues**:
|
|
- ❌ Default-allow (must explicitly mark secure) → HIGH
|
|
- ❌ Returning 404 instead of 403 → LOW (information disclosure)
|
|
- ❌ Debug/dev endpoints enabled in production → CRITICAL
|
|
|
|
|
|
## Checklist 3: Secrets Management
|
|
|
|
### Secrets Storage
|
|
- [ ] No secrets in source code
|
|
- [ ] No secrets in config files committed to Git
|
|
- [ ] No secrets in environment variables visible in UI
|
|
- [ ] Secrets stored in secrets manager (Vault, AWS Secrets Manager, etc.)
|
|
- [ ] Secrets encrypted at rest
|
|
|
|
**Common Issues**:
|
|
- ❌ Secrets in Git history → CRITICAL
|
|
- ❌ Secrets in config.yaml → CRITICAL
|
|
- ❌ Database password in Dockerfile → HIGH
|
|
|
|
|
|
### Secrets Rotation
|
|
- [ ] Secrets have rotation schedule (monthly/quarterly)
|
|
- [ ] Rotation is automated or documented
|
|
- [ ] Application supports zero-downtime rotation
|
|
- [ ] Old secrets revoked after rotation
|
|
|
|
**Common Issues**:
|
|
- ❌ No rotation policy → MEDIUM
|
|
- ❌ Manual rotation without documentation → LOW
|
|
- ❌ Rotation requires downtime → LOW
|
|
|
|
|
|
### Access Control to Secrets
|
|
- [ ] Secrets access restricted by service identity
|
|
- [ ] Least privilege (service accesses only its secrets)
|
|
- [ ] Audit log of secrets access
|
|
- [ ] No shared secrets across services
|
|
|
|
**Common Issues**:
|
|
- ❌ All services can access all secrets → MEDIUM
|
|
- ❌ No audit log → LOW
|
|
- ❌ Shared API key for multiple services → MEDIUM
|
|
|
|
|
|
## Checklist 4: Data Flow
|
|
|
|
### Trust Boundary Identification
|
|
- [ ] Trust boundaries documented (external → API → services → database)
|
|
- [ ] Each boundary has validation rules
|
|
- [ ] No implicit trust based on network location
|
|
|
|
**Common Issues**:
|
|
- ❌ No documented boundaries → MEDIUM
|
|
- ❌ "Internal network = trusted" assumption → HIGH
|
|
|
|
|
|
### Input Validation at Boundaries
|
|
- [ ] All external input validated (type, format, range)
|
|
- [ ] Validation at EVERY trust boundary (not just entry point)
|
|
- [ ] Allow-list validation (define what's allowed, reject rest)
|
|
- [ ] Input validation errors logged
|
|
|
|
**Common Issues**:
|
|
- ❌ No validation at internal boundaries → MEDIUM
|
|
- ❌ Deny-list only (blacklist can be bypassed) → MEDIUM
|
|
- ❌ Validation at UI only (not API) → HIGH
|
|
|
|
|
|
### Output Encoding for Context
|
|
- [ ] Output encoded for destination context (HTML, SQL, shell, etc.)
|
|
- [ ] SQL parameterized queries (no string concatenation)
|
|
- [ ] HTML escaped when rendering user data
|
|
- [ ] JSON encoded when returning API responses
|
|
|
|
**Common Issues**:
|
|
- ❌ String concatenation for SQL → CRITICAL (SQL injection)
|
|
- ❌ Unescaped HTML → HIGH (XSS)
|
|
- ❌ Unescaped shell commands → CRITICAL (command injection)
|
|
|
|
|
|
### Data Classification and Handling
|
|
- [ ] Sensitive data identified (PII, secrets, financial)
|
|
- [ ] Sensitive data encrypted in transit
|
|
- [ ] Sensitive data encrypted at rest
|
|
- [ ] Sensitive data not logged
|
|
- [ ] Data retention policy defined
|
|
|
|
**Common Issues**:
|
|
- ❌ PII in plaintext logs → HIGH
|
|
- ❌ No encryption for sensitive data → HIGH
|
|
- ❌ No data retention policy → MEDIUM
|
|
|
|
|
|
## Checklist 5: Network Security
|
|
|
|
### TLS/Encryption in Transit
|
|
- [ ] All external traffic uses TLS (HTTPS, not HTTP)
|
|
- [ ] TLS 1.2 or 1.3 only (TLS 1.0/1.1 disabled)
|
|
- [ ] Strong cipher suites (no RC4, 3DES, MD5)
|
|
- [ ] HSTS header present (HTTP Strict Transport Security)
|
|
|
|
**Common Issues**:
|
|
- ❌ HTTP allowed → CRITICAL
|
|
- ❌ TLS 1.0 enabled → HIGH
|
|
- ❌ Weak ciphers → MEDIUM
|
|
- ❌ No HSTS → LOW
|
|
|
|
|
|
### Certificate Validation
|
|
- [ ] TLS certificates validated (not self-signed in production)
|
|
- [ ] Certificate expiry monitored
|
|
- [ ] Certificate revocation checking (OCSP/CRL)
|
|
- [ ] No certificate validation bypass in code
|
|
|
|
**Common Issues**:
|
|
- ❌ Self-signed certs in production → HIGH
|
|
- ❌ `verify=False` in code → CRITICAL
|
|
- ❌ No expiry monitoring → MEDIUM
|
|
|
|
|
|
### Network Segmentation
|
|
- [ ] Database not directly accessible from internet
|
|
- [ ] Services in separate subnets/VPCs
|
|
- [ ] Admin interfaces on separate network
|
|
- [ ] Internal services use mTLS or VPN
|
|
|
|
**Common Issues**:
|
|
- ❌ Database exposed to internet → CRITICAL
|
|
- ❌ All services in same flat network → MEDIUM
|
|
- ❌ Admin panel on public internet → HIGH
|
|
|
|
|
|
### Firewall Rules and Least-Necessary Access
|
|
- [ ] Firewall rules documented
|
|
- [ ] Default-deny firewall policy
|
|
- [ ] Only necessary ports open
|
|
- [ ] Source IP restrictions for sensitive endpoints
|
|
|
|
**Common Issues**:
|
|
- ❌ Default-allow firewall → HIGH
|
|
- ❌ All ports open → MEDIUM
|
|
- ❌ No source restrictions → MEDIUM
|
|
|
|
|
|
## Severity Guidelines
|
|
|
|
Use these guidelines to assign severity:
|
|
|
|
### Critical
|
|
- Direct path to data breach or system compromise
|
|
- **Examples**: SQL injection, secrets in Git, authentication bypass
|
|
|
|
### High
|
|
- Significant security impact, requires immediate attention
|
|
- **Examples**: Weak password hashing, no MFA for admins, IDOR vulnerability
|
|
|
|
### Medium
|
|
- Security weakness that should be addressed
|
|
- **Examples**: No password policy, inconsistent authorization, no audit logging
|
|
|
|
### Low
|
|
- Minor issue or hardening opportunity
|
|
- **Examples**: Information disclosure via error messages, missing HSTS header
|
|
|
|
|
|
## Review Report Template
|
|
|
|
```markdown
|
|
# Security Architecture Review
|
|
|
|
**System**: [System Name]
|
|
**Review Date**: [Date]
|
|
**Reviewer**: [Name]
|
|
|
|
## Executive Summary
|
|
|
|
[1-2 paragraph overview: Critical/High count, key findings, overall risk level]
|
|
|
|
## Findings
|
|
|
|
### Critical (Count: X)
|
|
|
|
#### 1. [Finding Title]
|
|
- **Area**: Authentication / Authorization / Secrets / Data Flow / Network
|
|
- **Description**: [What's wrong]
|
|
- **Impact**: [What attacker can do]
|
|
- **Recommendation**: [How to fix]
|
|
- **Affected Components**: [API Service, Database, etc.]
|
|
|
|
### High (Count: X)
|
|
|
|
[Same format as Critical]
|
|
|
|
### Medium (Count: X)
|
|
|
|
[Same format]
|
|
|
|
### Low (Count: X)
|
|
|
|
[Same format]
|
|
|
|
## Summary Table
|
|
|
|
| Finding | Severity | Area | Status |
|
|
|---------|----------|------|--------|
|
|
| SQL injection in /api/users | Critical | Data Flow | Open |
|
|
| Secrets in Git | Critical | Secrets | Open |
|
|
| No MFA for admins | High | Authentication | Open |
|
|
|
|
## Next Steps
|
|
|
|
1. Address Critical findings immediately (timeline: 1 week)
|
|
2. Address High findings (timeline: 2-4 weeks)
|
|
3. Schedule Medium/Low findings (timeline: 1-3 months)
|
|
```
|
|
|
|
|
|
## Quick Reference: Checklist Summary
|
|
|
|
| Area | Key Checks |
|
|
|------|------------|
|
|
| **Authentication** | Strong hashing (bcrypt), MFA, token expiry, lockout policy |
|
|
| **Authorization** | RBAC/ABAC model, resource-level checks, default-deny, no privilege escalation |
|
|
| **Secrets** | Not in Git, secrets manager, rotation policy, access control |
|
|
| **Data Flow** | Trust boundaries, input validation, output encoding, data classification |
|
|
| **Network** | TLS 1.2+, certificate validation, segmentation, firewall rules |
|
|
|
|
|
|
## Common Mistakes
|
|
|
|
### ❌ Ad-Hoc Review Without Checklist
|
|
|
|
**Wrong**: Review design intuitively, find obvious issues, call it done
|
|
|
|
**Right**: Systematically go through all 5 checklists, check every item
|
|
|
|
**Why**: Intuition finds 50% of issues. Checklist finds 90%.
|
|
|
|
|
|
### ❌ Stopping After Finding First Issues
|
|
|
|
**Wrong**: Find SQL injection, report it, stop review
|
|
|
|
**Right**: Complete full checklist review even after finding critical issues
|
|
|
|
**Why**: Systems often have multiple unrelated vulnerabilities.
|
|
|
|
|
|
### ❌ Vague Recommendations
|
|
|
|
**Wrong**: "Improve authentication security"
|
|
|
|
**Right**: "Replace MD5 with bcrypt (cost factor 12), add salt per-password"
|
|
|
|
**Why**: Specific recommendations are actionable. Vague ones are ignored.
|
|
|
|
|
|
### ❌ Missing Severity Assignment
|
|
|
|
**Wrong**: List issues without severity
|
|
|
|
**Right**: Assign Critical/High/Medium/Low to prioritize fixes
|
|
|
|
**Why**: Teams need to know what to fix first.
|
|
|
|
|
|
## Cross-References
|
|
|
|
**Use BEFORE this skill**:
|
|
- `ordis/security-architect/threat-modeling` - Understand threats, then review if design addresses them
|
|
|
|
**Use WITH this skill**:
|
|
- `muna/technical-writer/documentation-structure` - Document review as ADR or report
|
|
|
|
**Use AFTER this skill**:
|
|
- `ordis/security-architect/security-controls-design` - Design fixes for identified issues
|
|
|
|
## Real-World Impact
|
|
|
|
**Systematic reviews using these checklists:**
|
|
- **Pre-launch review caught SQL injection** in `/api/users/{id}` endpoint (would have been CRITICAL production vulnerability)
|
|
- **Checklist found secrets in Git history** across 3 repositories (developers missed it in manual review)
|
|
- **Authorization review found 12 IDOR vulnerabilities** where `/api/orders/{order_id}` didn't check ownership (intuitive review found only 2)
|
|
|
|
**Key lesson**: **Systematic checklist review finds 3-5x more issues than ad-hoc intuitive review.**
|