# Layered Reasoning: Methodology
Advanced techniques for multi-level reasoning, layer design, consistency validation, emergence detection, and bidirectional change propagation.
---
## 1. Layer Design Principles
### Choosing the Right Number of Layers
**Rule of thumb**: 3-5 layers optimal for most domains
**Too few layers** (1-2):
- **Problem**: Jumps across abstraction levels too quickly, losing nuance
- **Example**: "Vision: Scale globally" → "Code: Deploy to AWS regions" (missing tactical layer: multi-region strategy, data sovereignty)
- **Symptom**: Strategic and operational teams can't communicate; implementation doesn't align with vision
**Too many layers** (6+):
- **Problem**: Excessive overhead, confusion about which layer to use
- **Example**: Vision → Strategy → Goals → Objectives → Tactics → Tasks → Subtasks → Actions (8 layers, redundant)
- **Symptom**: People debate which layer something belongs to; layers blur together
**Optimal**:
- **3 layers**: Strategic (why) → Tactical (what) → Operational (how)
- **4 layers**: Vision (purpose) → Strategy (approach) → Tactics (methods) → Operations (execution)
- **5 layers**: Vision → Strategy → Programs → Projects → Tasks (for large organizations)
**Test**: Can you clearly name each layer and explain its role? If not, simplify.
### Layer Independence (Modularity)
**Principle**: Each layer should be independently useful without requiring other layers
**Good layering** (modular):
- **Strategic**: "Customer privacy first" (guides decisions even without seeing code)
- **Tactical**: "Zero-trust architecture" (understandable without knowing AWS KMS details)
- **Operational**: "Implement AES-256 encryption" (executable without re-deriving strategy)
**Bad layering** (coupled):
- **Strategic**: "Use AES-256" (too operational for strategy)
- **Tactical**: "Deploy to AWS" (missing why: scalability, compliance, etc.)
- **Operational**: "Implement privacy" (too vague without tactical guidance)
**Test**: Show each layer independently to different stakeholders. Strategic layer to CEO → makes sense alone. Operational layer to engineer → executable alone.
### Layer Abstraction Gaps
**Principle**: Each layer should be roughly one abstraction level apart
**Optimal gap**: Each element translates to 3-10 elements at the layer below
**Example**: Good abstraction gap
- **Strategic**: "High availability" (1 principle)
- **Tactical**: "Multi-region, auto-scaling, circuit breakers" (3 tactics)
- **Operational**:
  - Multi-region = "Deploy AWS us-east-1 + eu-west-1 + ap-southeast-1" (3 regions)
  - Auto-scaling = "ECS target tracking on CPU 70%" (1 config)
  - Circuit breakers = "Istio circuit breaker on 5xx >50%" (1 config)
  - → Total: 5 operational items from 3 tactical approaches
**Too large gap**:
- **Strategic**: "High availability" →
- **Operational**: "Deploy us-east-1, eu-west-1, ap-southeast-1, configure ECS target tracking CPU 70%, configure Istio..." (10+ items, no intermediate)
- **Problem**: Can't understand how strategy maps to operations; no tactical layer to adjust
**Test**: Can you translate each strategic element to 3-10 tactical elements? Can you translate each tactical element to 3-10 operational steps?
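This fan-out test can be scripted once the layer map is explicit. A minimal sketch in Python, assuming each element simply lists the elements it translates to at the layer below (names are taken from the example above):
```python
# Minimal sketch of the abstraction-gap test. Each key is an element;
# its value lists the elements it translates to at the layer below.
# Names are taken from the example above.
layer_map = {
    "High availability": ["Multi-region", "Auto-scaling", "Circuit breakers"],
    "Multi-region": ["us-east-1", "eu-west-1", "ap-southeast-1"],
    "Auto-scaling": ["ECS target tracking CPU 70%"],
}

def check_gap(layer_map, low=3, high=10):
    """Advisory check: fan-out far outside the 3-10 band hints at a missing
    intermediate layer (too high) or a possibly redundant one (too low)."""
    for parent, children in layer_map.items():
        n = len(children)
        if n > high:
            print(f"{parent!r} expands to {n} items: consider an intermediate layer")
        elif n < low:
            print(f"{parent!r} expands to {n} item(s): fine for leaf configs, "
                  "otherwise the layer may be redundant")

check_gap(layer_map)
```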
---
## 2. Consistency Validation Techniques
### Upward Consistency (Bottom-Up)
**Question**: Do lower layers implement upper layers?
**Validation method**:
1. **Trace operational procedure** → Which tactical approach does it implement?
2. **Aggregate tactical approaches** → Which strategic principle do they achieve?
3. **Check coverage**: Does every operation map to a tactic? Does every tactic map to strategy?
**Example**:
- **Ops**: "Encrypt with AES-256" → Tactical: "End-to-end encryption" → Strategic: "Customer privacy"
- **Ops**: "Deploy Istio mTLS" → Tactical: "Zero-trust architecture" → Strategic: "Customer privacy"
- **Coverage check**: All ops map to tactics ✓, all tactics map to privacy principle ✓
**Gap detection**:
- **Orphan operation**: Operation that doesn't implement any tactic (e.g., "Cache user data unencrypted" contradicts zero-trust)
- **Orphan tactic**: Tactic that doesn't achieve any strategic principle (e.g., "Use GraphQL" doesn't map to "privacy" or "scale")
**Fix**: Remove orphan operations, add missing tactics if operations reveal implicit strategy.
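This trace-and-coverage check is mechanical enough to script. A minimal sketch, assuming each element records the upper-layer element it implements (mappings mirror the examples above; `None` marks a missing link):
```python
# Sketch of the upward-coverage check: every operation must map to a tactic,
# and every tactic to a strategic principle. Mappings mirror the examples
# above; None marks a missing link.
op_to_tactic = {
    "Encrypt with AES-256": "End-to-end encryption",
    "Deploy Istio mTLS": "Zero-trust architecture",
    "Cache user data unencrypted": None,  # orphan operation
}
tactic_to_strategy = {
    "End-to-end encryption": "Customer privacy",
    "Zero-trust architecture": "Customer privacy",
    "Use GraphQL": None,  # orphan tactic
}

orphan_ops = [op for op, tactic in op_to_tactic.items() if tactic is None]
orphan_tactics = [t for t, strategy in tactic_to_strategy.items() if strategy is None]
print("Orphan operations:", orphan_ops)   # remove, or add the missing tactic
print("Orphan tactics:", orphan_tactics)  # remove, or surface the implicit strategy
```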
### Downward Consistency (Top-Down)
**Question**: Can upper layers actually be executed through the lower layers?
**Validation method**:
1. **For each strategic principle** → List tactics that would achieve it
2. **For each tactic** → List operations required to implement it
3. **Check feasibility**: Can we actually execute these operations given constraints (budget, time, team skills)?
**Example**:
- **Strategy**: "HIPAA compliance" →
- **Tactics needed**: "Encryption, audit logs, access control" →
- **Operations needed**: "Deploy AWS KMS, enable CloudTrail, implement IAM policies" →
- **Feasibility**: Team has AWS expertise ✓, budget allows ✓, timeline feasible ✓
**Gap detection**:
- **Infeasible tactic**: Tactic can't be implemented operationally (e.g., "Real-time fraud detection" but team lacks ML expertise)
- **Missing tactic**: Strategic principle without sufficient tactics (e.g., "Privacy" but no encryption tactics)
**Fix**: Add missing tactics, adjust strategy if tactics infeasible, hire/train if skill gaps.
### Lateral Consistency (Same-Layer)
**Question**: Do parallel choices at the same layer contradict?
**Validation method**:
1. **List all choices at each layer** (e.g., all tactical approaches)
2. **Pairwise comparison**: For each pair, do they conflict?
3. **Check resource conflicts**: Do they compete for same resources (budget, team, time)?
**Example**: Tactical layer lateral check
- **Tactic A**: "Microservices for scale" vs **Tactic B**: "Monorepo for simplicity"
- **Conflict?** No, microservices (runtime) + monorepo (code organization) compatible
- **Tactic A**: "Multi-region deployment" vs **Tactic C**: "Single database"
- **Conflict?** Yes, multi-region requires distributed database (latency, sync)
**Resolution**:
- **Compatible**: Keep both (e.g., microservices + monorepo)
- **Incompatible**: Choose one based on strategic priority (e.g., multi-region wins if "availability" > "simplicity")
- **Refine**: Adjust to make compatible (e.g., "Multi-region with regional databases + eventual consistency")
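The pairwise comparison is small enough to enumerate exhaustively for a typical tactic set. A minimal sketch, with the incompatibility judgments (which remain human calls) encoded as an explicit set of known-conflicting pairs from the example:
```python
from itertools import combinations

# Sketch of the lateral check: compare every pair of same-layer choices.
# The incompatibility set encodes human judgment; pairs mirror the example.
tactics = ["Microservices", "Monorepo", "Multi-region deployment", "Single database"]
incompatible = {frozenset({"Multi-region deployment", "Single database"})}

for a, b in combinations(tactics, 2):
    if frozenset({a, b}) in incompatible:
        print(f"CONFLICT: {a} vs {b} -> choose by strategic priority, or refine")
```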
### Formal Consistency Checking
**Dependency graph approach**:
1. **Build dependency graph**:
- Nodes = elements at all layers (strategic principles, tactical approaches, operational procedures)
- Edges = "implements" (upward) or "requires" (downward) relationships
2. **Check properties**:
- **No orphans**: Every node has at least one edge (connects to another layer)
- **No cycles**: Strategic A → Tactical B → Operational C → Tactical D → Strategic A (circular dependency = contradiction)
- **Full path**: Every strategic principle has path to at least one operational procedure
3. **Identify inconsistencies**:
- **Orphan node**: E.g., tactical approach not implementing any strategy
- **Cycle**: E.g., "We need X to implement Y, but Y is required for X"
- **Dead end**: Strategy with no path to operations (can't be executed)
**Example graph**:
```
Strategic: Privacy (S1)
Tactical: Encryption (T1), Access Control (T2)
Operational: AES-256 (O1), IAM policies (O2)
```
- **Check**: S1 → T1 → O1 (complete path ✓), S1 → T2 → O2 (complete path ✓)
- **No orphans** ✓, **No cycles** ✓, **Full coverage** ✓
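All three properties can be checked automatically once the graph is written down. A minimal sketch over the example graph above, using an `S`/`T`/`O` name-prefix convention (an assumption of this sketch) to mark layers:
```python
# Sketch of the formal check on the example graph. Edges point downward
# ("requires"); the S/T/O name prefix marks the layer (a convention of
# this sketch). Every node appears as a key, even leaves.
edges = {
    "S1:Privacy": ["T1:Encryption", "T2:Access Control"],
    "T1:Encryption": ["O1:AES-256"],
    "T2:Access Control": ["O2:IAM policies"],
    "O1:AES-256": [],
    "O2:IAM policies": [],
}

def has_cycle(edges):
    """Standard white/gray/black DFS cycle detection."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in edges}
    def visit(n):
        color[n] = GRAY
        for m in edges[n]:
            if color[m] == GRAY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False
    return any(color[n] == WHITE and visit(n) for n in edges)

def orphans(edges):
    """Nodes with no edge in either direction (not connected to any layer)."""
    targets = {m for ms in edges.values() for m in ms}
    return [n for n in edges if not edges[n] and n not in targets]

def dead_end_strategies(edges):
    """Strategic nodes with no path down to an operational node.
    Assumes the graph is acyclic (checked above)."""
    def reaches_op(n):
        return n.startswith("O") or any(reaches_op(m) for m in edges[n])
    return [n for n in edges if n.startswith("S") and not reaches_op(n)]

print("Cycle:", has_cycle(edges))                          # False
print("Orphans:", orphans(edges))                          # []
print("Dead-end strategies:", dead_end_strategies(edges))  # []
```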
---
## 3. Emergence Detection
### Bottom-Up Pattern Recognition
**Definition**: Lower-layer interactions create unexpected upper-layer behavior that was never explicitly designed
**Process**:
1. **Observe operational behavior** (metrics, incidents, user feedback)
2. **Identify patterns** that occur repeatedly (not one-offs)
3. **Aggregate to tactical layer**: What systemic issue causes this pattern?
4. **Recognize strategic implication**: Does this invalidate strategic assumptions?
**Example 1**: Conway's Law emergence
- **Ops observation**: Cross-team features take 3× longer than single-team features
- **Tactical pattern**: Microservices owned by different teams require extensive coordination
- **Strategic implication**: "Org structure determines architecture; current structure slows key features" → Realign teams to product streams, not services
**Example 2**: Performance vs. security tradeoff
- **Ops observation**: Encryption adds 50ms latency, users complain about slowness
- **Tactical pattern**: Security measures consistently hurt performance
- **Strategic implication**: Original strategy "Security + speed" incompatible → Refine: "Security first, optimize critical paths to <100ms"
### Leading Indicators for Emergence
**Monitor these signals** to catch emergence early:
1. **Increasing complexity at operational layer**: More workarounds, special cases, exceptions
- **Meaning**: Tactics may not fit reality; strategic assumptions may be wrong
2. **Frequent tactical adjustments**: Changing approaches every sprint
- **Meaning**: Strategy unclear or infeasible; need strategic clarity
3. **Misalignment between metrics**: Strategic KPI improving but operational satisfaction dropping
- **Example**: Revenue up (strategic) but engineer productivity down (operational) → Hidden cost emerging
4. **Repeated failures of same type**: Same class of incident/bug over and over
- **Meaning**: Tactical approach has systematic flaw; may require strategic shift
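Of these, the third signal (diverging metrics) is the easiest to watch mechanically: track a strategic KPI and an operational health metric over the same window and flag opposite trends. A minimal sketch, with made-up series:
```python
# Sketch: flag possible emergence when a strategic KPI improves while an
# operational metric degrades over the same window. Series are made up.
revenue = [100, 105, 112, 120, 131]            # strategic KPI (improving)
eng_satisfaction = [8.1, 7.8, 7.2, 6.5, 5.9]   # operational metric (dropping)

def trend(series):
    return series[-1] - series[0]

if trend(revenue) > 0 and trend(eng_satisfaction) < 0:
    print("Diverging metrics: hidden cost may be emerging at the operational layer")
```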
### Emergence vs. Noise
**Emergence** (systematic pattern):
- Repeats across multiple contexts
- Persists over time (not transient)
- Has clear causal mechanism at lower layer
**Noise** (random variance):
- One-off occurrence
- Transient (disappears quickly)
- No clear causal mechanism
**Test**: Can you explain the mechanism? ("Microservices cause coordination overhead because teams must sync on interfaces" = emergence). If you can't name a mechanism, it's likely noise.
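The three criteria translate into a simple triage function. A minimal sketch; the thresholds (2 contexts, 30 days) are arbitrary placeholders, not part of the method:
```python
# Sketch: triage a pattern as emergence vs. noise using the three criteria.
# Thresholds (2 contexts, 30 days) are arbitrary placeholders.
def classify(contexts_seen, days_persisted, mechanism):
    """mechanism: a causal explanation string, or None if unknown."""
    if contexts_seen >= 2 and days_persisted >= 30 and mechanism:
        return f"emergence: {mechanism}"
    return "likely noise: keep observing"

print(classify(4, 90, "teams must sync on service interfaces"))
print(classify(1, 2, None))
```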
---
## 4. Bidirectional Change Propagation
### Top-Down Propagation (Strategy Changes)
**Scenario**: Strategic shift (market change, new regulation, company pivot)
**Process**:
1. **Document strategic change**: What changed and why?
2. **Identify affected tactics**: Which tactical approaches depended on old strategy?
3. **Re-evaluate tactics**: Do current tactics still achieve new strategy? If not, generate alternatives.
4. **Cascade to operations**: Update operational procedures to implement new tactics.
5. **Validate consistency**: Check upward/downward/lateral consistency with new strategy.
**Example**: Strategic shift from "Speed" to "Trust"
- **Strategic change**: "Move fast and break things" → "Build trust through reliability"
- **Tactical impact**:
- OLD: "Deploy daily, fix issues post-deploy" → NEW: "Staged rollouts, canary testing, rollback plans"
- OLD: "Ship MVP, iterate" → NEW: "Comprehensive testing, beta programs, polish before GA"
- **Operational impact**:
- Update CI/CD: Add pre-deploy tests, canary stages
- Update sprint process: Add QA phase, user acceptance testing
- Update monitoring: Add error budgets, SLO tracking
**Propagation timeline**:
- **Week 1**: Communicate strategic change, get alignment
- **Week 2-3**: Re-evaluate tactics, design new approaches
- **Week 4-8**: Update operational procedures, train teams
- **Ongoing**: Monitor consistency, adjust as needed
### Bottom-Up Propagation (Operational Constraints)
**Scenario**: Operational constraint discovered (technical limitation, resource shortage, performance issue)
**Process**:
1. **Document operational constraint**: What's infeasible and why?
2. **Evaluate tactical impact**: Can we adjust tactics to work around constraint?
3. **If no tactical workaround**: Clarify or adjust strategy to acknowledge constraint.
4. **Communicate upward**: Ensure stakeholders understand strategic implications of operational reality.
**Example**: Performance constraint discovered
- **Operational constraint**: "Encryption adds 50ms latency, pushing total response time past the <100ms SLA"
- **Tactical re-evaluation**:
- Option A: Optimize encryption (caching, hardware acceleration) → Still 20ms overhead
- Option B: Selective encryption (only sensitive fields) → Violates "encrypt everything" tactic
- Option C: Lighter encryption (AES-128 instead of AES-256) → Security tradeoff
- **Strategic clarification** (if needed): Original strategy: "<100ms latency for all APIs"
- **Refined strategy**: "<100ms for user-facing APIs, <200ms for internal APIs where security is critical"
- **Rationale**: Accept latency cost for security on sensitive paths, optimize user-facing
**Escalation decision tree**:
1. **Can tactical adjustment solve?** (e.g., optimize) → YES: Tactical change only
2. **Tactical adjustment insufficient?** → Escalate to strategic layer
3. **Strategic constraint absolute?** (e.g., compliance non-negotiable) → Accept operational cost or change tactics
4. **Strategic constraint negotiable?** → Refine strategy to acknowledge operational reality
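This decision tree reads directly as a small function. A minimal sketch; the boolean inputs stand in for the human judgments made at each step:
```python
# Sketch of the escalation decision tree. Each boolean stands in for the
# human judgment made at that step.
def escalate(tactical_fix_works, strategic_constraint_absolute):
    if tactical_fix_works:
        return "Tactical change only (e.g., optimize)"
    if strategic_constraint_absolute:
        return "Accept the operational cost or change tactics"
    return "Refine strategy to acknowledge operational reality"

# Encryption-latency example: no tactical fix fully closes the gap, and the
# latency target is negotiable, so the strategy gets refined.
print(escalate(tactical_fix_works=False, strategic_constraint_absolute=False))
```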
### Change Impact Analysis
**Before propagating any change**, analyze impact:
**Impact dimensions**:
1. **Scope**: How many layers affected? (1 layer = local change, 3 layers = systemic)
2. **Magnitude**: How big are changes at each layer? (minor adjustment vs. complete redesign)
3. **Timeline**: How long to propagate changes fully? (1 week vs. 6 months)
4. **Risk**: What breaks if change executed poorly? (downtime, customer trust, team morale)
**Example**: Impact analysis of "Strategic shift to privacy-first"
- **Scope**: All 3 layers (strategic, tactical, operational)
- **Magnitude**: High (major tactical changes: add encryption, access control; major operational changes: new infrastructure)
- **Timeline**: 6 months (implement encryption Q1, access control Q2, audit Q3)
- **Risk**: High (customer data at risk if done wrong, compliance penalties if incomplete)
- **Decision**: Phased rollout with validation gates at each phase
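The four dimensions fit naturally into a structured record that gates the propagation decision. A minimal sketch mirroring the privacy-first example; the rollout rule (high magnitude + high risk → validation gates) is an illustrative assumption:
```python
from dataclasses import dataclass

# Sketch: a structured change-impact record. The rollout rule
# (high magnitude + high risk -> validation gates) is an illustrative assumption.
@dataclass
class ChangeImpact:
    layers_affected: int   # scope: 1 = local change, 3 = systemic
    magnitude: str         # "low" | "medium" | "high"
    timeline_months: int
    risk: str              # "low" | "medium" | "high"

    def rollout_plan(self):
        if self.magnitude == "high" and self.risk == "high":
            return "Phased rollout with validation gates at each phase"
        return "Direct rollout with standard review"

privacy_first = ChangeImpact(layers_affected=3, magnitude="high",
                             timeline_months=6, risk="high")
print(privacy_first.rollout_plan())
```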
---
## 5. Advanced Topics
### Layer Invariants and Contracts
**Concept**: Each layer establishes "contracts" that lower layers must satisfy
**Strategic layer invariants**:
- Non-negotiable constraints (e.g., "HIPAA compliance", "zero downtime")
- These are **invariants**: never violated regardless of tactical/operational choices
**Tactical layer contracts**:
- Promises to strategic layer (e.g., "Encryption ensures privacy")
- Requirements for operational layer (e.g., "Operations must implement AES-256")
**Operational layer contracts**:
- Guarantees to tactical layer (e.g., "KMS provides AES-256 encryption")
**Validation**: If operational layer can't satisfy tactical contract, either change operations or change tactics (which may require strategic clarification).
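Contracts become checkable once written as explicit requires/guarantees pairs. A minimal sketch; the contract strings are illustrative:
```python
# Sketch: write each layer's contract as requires/guarantees sets and check
# that every tactical requirement is met by an operational guarantee.
# Contract strings mirror the examples above.
tactical_requires = {"AES-256 encryption"}       # what tactics demand of operations
operational_guarantees = {"AES-256 encryption"}  # what operations promise (via KMS)

unmet = tactical_requires - operational_guarantees
if unmet:
    print(f"Contract violation: operations cannot satisfy {unmet}; "
          "change operations or renegotiate tactics")
else:
    print("Operational layer satisfies the tactical contract")
```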
### Cross-Cutting Concerns
**Problem**: Some concerns span all layers (logging, security, monitoring)
**Approaches**:
**Option 1: Separate layers for cross-cutting concern**
- **Strategic (Security)**: "Defense in depth"
- **Tactical (Security)**: "WAF, encryption, access control"
- **Operational (Security)**: "Deploy WAF rules, implement RBAC"
- **Pro**: Clear security focus
- **Con**: Parallel structure, coordination overhead
**Option 2: Integrate into each layer**
- **Strategic**: "Privacy-first product" (security embedded)
- **Tactical**: "Zero-trust architecture" (security included in tactics)
- **Operational**: "Implement encryption" (security in operations)
- **Pro**: Unified structure
- **Con**: Security may get diluted across layers
**Recommendation**: Use Option 1 for critical cross-cutting concerns (security, compliance), Option 2 for less critical (logging, monitoring).
### Abstraction Hierarchies vs. Layered Reasoning
**Abstraction hierarchy** (programming):
- Layers hide implementation details (API → library → OS → hardware)
- Lower layers serve upper layers (hardware serves OS, OS serves library)
- **Focus**: Information hiding, modularity
**Layered reasoning** (thinking framework):
- Layers represent abstraction levels (strategy → tactics → operations)
- Lower layers implement upper layers; upper layers constrain lower layers
- **Focus**: Consistency, alignment, translation
**Key difference**: Abstraction hierarchy is **unidirectional** (call downward, hide details upward). Layered reasoning is **bidirectional** (implement downward, feedback upward).
### Formal Specifications at Each Layer
**Strategic layer**: Natural language principles + constraints
- "System must be HIPAA compliant"
- "Support 10× traffic growth"
**Tactical layer**: Architecture diagrams, decision records, policies
- ADR: "We will use microservices for scalability"
- Policy: "All services must implement health checks"
**Operational layer**: Code, runbooks, configuration
- Code: `encrypt(data, key)`
- Runbook: "If service fails health check, restart pod"
**Validation**: Can you trace from code (operational) → policy (tactical) → principle (strategic)?
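This trace can be made concrete by recording, for each operational artifact, the tactical and strategic element it implements. A minimal sketch; the policy and principle labels are illustrative:
```python
# Sketch: follow the implements-chain from operational code up to strategy.
# Artifact, policy, and principle labels are illustrative.
trace = {
    "encrypt(data, key)": "Policy: all data encrypted at rest",
    "Policy: all data encrypted at rest": "Principle: HIPAA compliance",
}

def trace_up(artifact):
    chain = [artifact]
    while chain[-1] in trace:
        chain.append(trace[chain[-1]])
    return " -> ".join(chain)

print(trace_up("encrypt(data, key)"))
# encrypt(data, key) -> Policy: all data encrypted at rest -> Principle: HIPAA compliance
```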
---
## Common Mistakes and Solutions
| Mistake | Symptom | Solution |
|---------|---------|----------|
| **Skipping layers** | Jump from strategy to code without tactics | Insert tactical layer; design approaches before coding |
| **Layer coupling** | Can't understand one layer without others | Make each layer independently useful with clear contracts |
| **Too many layers** | Confusion about which layer to use, redundancy | Consolidate to 3-5 layers; eliminate redundant levels |
| **Ignoring emergence** | Surprised by unintended consequences | Monitor operational behavior; recognize emerging tactical patterns |
| **One-way propagation** | Strategy changes but operations don't update | Use bidirectional propagation; cascade changes downward |
| **No consistency checks** | Misalignment between layers discovered late | Regular upward/downward/lateral consistency validation |
| **Implicit assumptions** | Assumptions change, layers break | Document assumptions explicitly at each layer |
| **Orphan elements** | Operations/tactics not implementing strategy | Build dependency graph; ensure every element maps upward |