Files
2025-11-30 08:38:26 +08:00

14 KiB
Raw Permalink Blame History

Decomposition & Reconstruction: Advanced Methodology

Workflow

Copy this checklist for complex decomposition scenarios:

Advanced Decomposition Progress:
- [ ] Step 1: Apply hierarchical decomposition techniques
- [ ] Step 2: Build and analyze dependency graphs
- [ ] Step 3: Perform critical path analysis
- [ ] Step 4: Use advanced property measurement
- [ ] Step 5: Apply optimization algorithms

Step 1: Apply hierarchical decomposition techniques - Multi-level decomposition with consistent abstraction levels. See 1. Hierarchical Decomposition.

Step 2: Build and analyze dependency graphs - Visualize and analyze component relationships. See 2. Dependency Graph Analysis.

Step 3: Perform critical path analysis - Identify bottlenecks using PERT/CPM. See 3. Critical Path Analysis.

Step 4: Use advanced property measurement - Rigorous measurement and statistical analysis. See 4. Advanced Property Measurement.

Step 5: Apply optimization algorithms - Systematic reconstruction approaches. See 5. Optimization Algorithms.


1. Hierarchical Decomposition

Multi-Level Decomposition Strategy

Break into levels: L0 (System) → L1 (3-7 subsystems) → L2 (3-7 components each) → L3+ (only if needed). Stop when component is atomic or further breakdown doesn't help goal.

Abstraction consistency: All components at same level should be at same abstraction type (e.g., all architectural components, not mixing "API Service" with "user login function").

Template:

System → Subsystem A → Component A.1, A.2, A.3
      → Subsystem B → Component B.1, B.2
      → Subsystem C → Component C.1 (atomic)

Document WHY decomposed to this level and WHY stopped.


2. Dependency Graph Analysis

Building Dependency Graphs

Nodes: Components (from decomposition) Edges: Relationships (dependency, data flow, control flow, etc.) Direction: Arrow shows dependency direction (A → B means A depends on B)

Example:

Frontend → API Service → Database
               ↓
            Cache
               ↓
          Message Queue

Graph Properties

Strongly Connected Components (SCCs): Circular dependencies (A → B → C → A). Problematic for isolation. Use Tarjan's algorithm.

Topological Ordering: Linear order where edges point forward (only if acyclic). Reveals safe build/deploy order.

Critical Path: Longest weighted path, determines minimum completion time. Bottleneck for optimization.

Dependency Analysis

Forward: "If I change X, what breaks?" (BFS from X outgoing) Backward: "What must work for X to function?" (BFS from X incoming) Transitive Reduction: Remove redundant edges to simplify visualization.


3. Critical Path Analysis

PERT/CPM (Program Evaluation and Review Technique / Critical Path Method)

Use case: System with sequential stages, need to identify time bottlenecks

Inputs:

  • Components with estimated duration
  • Dependencies between components

Process:

Step 1: Build dependency graph with durations

A (3h) → B (5h) → D (2h)
A (3h) → C (4h) → D (2h)

Step 2: Calculate earliest start time (EST) for each component

EST(node) = max(EST(predecessor) + duration(predecessor)) for all predecessors

Example:

  • EST(A) = 0
  • EST(B) = EST(A) + duration(A) = 0 + 3 = 3h
  • EST(C) = EST(A) + duration(A) = 0 + 3 = 3h
  • EST(D) = max(EST(B) + duration(B), EST(C) + duration(C)) = max(3+5, 3+4) = 8h

Step 3: Calculate latest finish time (LFT) working backwards

LFT(node) = min(LFT(successor) - duration(node)) for all successors

Example (working backwards from D):

  • LFT(D) = project deadline (say 10h)
  • LFT(B) = LFT(D) - duration(B) = 10 - 5 = 5h
  • LFT(C) = LFT(D) - duration(C) = 10 - 4 = 6h
  • LFT(A) = min(LFT(B) - duration(A), LFT(C) - duration(A)) = min(5-3, 6-3) = 2h

Step 4: Calculate slack (float)

Slack(node) = LFT(node) - EST(node) - duration(node)

Example:

  • Slack(A) = 2 - 0 - 3 = -1h (on critical path, negative slack means delay)
  • Slack(B) = 5 - 3 - 5 = -3h (critical)
  • Slack(C) = 6 - 3 - 4 = -1h (has some float)
  • Slack(D) = 10 - 8 - 2 = 0 (critical)

Step 5: Identify critical path

Components with zero (or minimum) slack form the critical path.

Critical path: A → B → D (total 10h)

Optimization insight: Only optimizing B will reduce total time. Optimizing C (non-critical) won't help.

Handling Uncertainty (PERT Estimates)

When durations are uncertain, use three-point estimates:

  • Optimistic (O): Best case
  • Most Likely (M): Expected case
  • Pessimistic (P): Worst case

Expected duration: E = (O + 4M + P) / 6

Standard deviation: σ = (P - O) / 6

Example:

  • Component A: O=2h, M=3h, P=8h
  • Expected: E = (2 + 4×3 + 8) / 6 = 3.67h
  • Std dev: σ = (8 - 2) / 6 = 1h

Use expected durations for critical path analysis, report confidence intervals


4. Advanced Property Measurement

Quantitative vs Qualitative Properties

Quantitative (measurable):

  • Latency (ms), throughput (req/s), cost ($/month), lines of code, error rate (%)
  • Measurement: Use APM tools, profilers, logs, benchmarks
  • Reporting: Mean, median, p95, p99, min, max, std dev

Qualitative (subjective):

  • Code readability, maintainability, user experience, team morale
  • Measurement: Use rating scales (1-10), comparative ranking, surveys
  • Reporting: Mode, distribution, outliers

Statistical Rigor

For quantitative measurements:

1. Multiple samples: Don't rely on single measurement

  • Run benchmark 10+ times, report distribution
  • Example: Latency = 250ms ± 50ms (mean ± std dev, n=20)

2. Control for confounds: Isolate what you're measuring

  • Example: Measure DB query time with same dataset, same load, same hardware

3. Statistical significance: Determine if difference is real or noise

  • Use t-test or ANOVA to compare means
  • Report p-value (p < 0.05 typically considered significant)

For qualitative measurements:

1. Multiple raters: Reduce individual bias

  • Have 3+ people rate complexity independently, average scores

2. Calibration: Define rating scale clearly

  • Example: Complexity 1="< 50 LOC, no dependencies", 10=">1000 LOC, 20+ dependencies"

3. Inter-rater reliability: Check if raters agree

  • Calculate Cronbach's alpha or correlation coefficient

Performance Profiling Techniques

CPU Profiling:

  • Identify which components consume most CPU time
  • Tools: perf, gprof, Chrome DevTools, Xcode Instruments

Memory Profiling:

  • Identify which components allocate most memory or leak
  • Tools: valgrind, heaptrack, Chrome DevTools, Instruments

I/O Profiling:

  • Identify which components perform most disk/network I/O
  • Tools: iotop, iostat, Network tab in DevTools

Tracing:

  • Track execution flow through distributed systems
  • Tools: OpenTelemetry, Jaeger, Zipkin, AWS X-Ray

Result: Component-level resource consumption data for bottleneck analysis


5. Optimization Algorithms

Greedy Optimization

Approach: Optimize components in order of highest impact first

Algorithm:

  1. Measure impact of optimizing each component (reduction in latency, cost, etc.)
  2. Sort components by impact (descending)
  3. Optimize highest-impact component
  4. Re-measure, repeat until goal achieved or diminishing returns

Example (latency optimization):

  • Components: A (100ms), B (500ms), C (50ms)
  • Sort by impact: B (500ms), A (100ms), C (50ms)
  • Optimize B first → Reduce to 200ms → Total latency improved by 300ms
  • Re-measure, continue

Advantage: Fast, often gets 80% of benefit with 20% of effort Limitation: May miss global optimum (e.g., removing B entirely better than optimizing B)

Dynamic Programming Approach

Approach: Find optimal decomposition/reconstruction by exploring combinations

Use case: When multiple components interact, greedy may not find best solution

Example (budget allocation):

  • Budget: $1000/month
  • Components: A (improves UX, costs $400), B (reduces latency, costs $600), C (adds feature, costs $500)
  • Constraint: Total cost ≤ $1000
  • Goal: Maximize value

Algorithm:

  1. Enumerate all feasible combinations: {A}, {B}, {C}, {A+B}, {A+C}, {B+C}
  2. Calculate value and cost for each
  3. Select combination with max value under budget constraint

Result: Optimal combination (may not be greedy choice)

Constraint Satisfaction

Approach: Find reconstruction that satisfies all hard constraints

Use case: Multiple constraints (latency < 500ms AND cost < $500/month AND reliability > 99%)

Formulation:

  • Variables: Component choices (use component A or B? Parallelize or serialize?)
  • Domains: Possible values for each choice
  • Constraints: Rules that must be satisfied

Algorithm: Backtracking search, constraint propagation Tools: CSP solvers (Z3, MiniZinc)

Sensitivity Analysis

Goal: Understand how sensitive reconstruction is to property estimates

Process:

  1. Build reconstruction based on measured/estimated properties
  2. Vary each property by ±X% (e.g., ±20%)
  3. Re-run reconstruction
  4. Identify which properties most affect outcome

Example:

  • Baseline: Component A latency = 100ms → Optimize B
  • Sensitivity: If A latency = 150ms → Optimize A instead
  • Conclusion: Decision is sensitive to A's latency estimate, need better measurement

6. Advanced Reconstruction Patterns

Caching & Memoization

Pattern: Add caching layer for frequently accessed components

When: Component is slow, accessed repeatedly, output deterministic

Example: Database query repeated 1000x/sec → Add Redis cache → 95% cache hit rate → 20× latency reduction

Trade-offs: Memory cost, cache invalidation complexity, eventual consistency

Batch Processing

Pattern: Process items in batches instead of one-at-a-time

When: Per-item overhead is high, latency not critical

Example: Send 1000 individual emails (1s each, total 1000s) → Batch into groups of 100 → Send via batch API (10s per batch, total 100s)

Trade-offs: Increased latency for individual items, complexity in failure handling

Asynchronous Processing

Pattern: Decouple components using message queues

When: Component is slow but result not needed immediately

Example: User uploads video → Process synchronously (60s wait) → User unhappy Reconstruction: User uploads → Queue processing → User sees "processing" → Email when done

Trade-offs: Complexity (need queue infrastructure), eventual consistency, harder to debug

Load Balancing & Sharding

Pattern: Distribute load across multiple instances of a component

When: Component is bottleneck, can be parallelized, load is high

Example: Single DB handles 10K req/s, saturated → Shard by user ID → 10 DBs each handle 1K req/s

Trade-offs: Operational complexity, cross-shard queries expensive, rebalancing cost

Circuit Breaker

Pattern: Fail fast when dependent component is down

When: Component depends on unreliable external service

Example: API calls external service → Service is down → API waits 30s per request → API becomes slow Reconstruction: Add circuit breaker → Detect failures → Stop calling for 60s → Fail fast (< 1ms)

Trade-offs: Reduced functionality during outage, tuning thresholds (false positives vs negatives)


7. Failure Mode & Effects Analysis (FMEA)

FMEA Process

Goal: Identify weaknesses and single points of failure in decomposed system

Process:

Step 1: List all components

Step 2: For each component, identify failure modes

  • How can this component fail? (crash, slow, wrong output, security breach)

Step 3: For each failure mode, assess:

  • Severity (S): Impact if failure occurs (1-10, 10 = catastrophic)
  • Occurrence (O): Likelihood of failure (1-10, 10 = very likely)
  • Detection (D): Ability to detect before impact (1-10, 10 = undetectable)

Step 4: Calculate Risk Priority Number (RPN) RPN = S × O × D

Step 5: Prioritize failures by RPN, design mitigations

Example

Component Failure Mode S O D RPN Mitigation
Database Crashes 9 2 1 18 Add replica, automatic failover
Cache Stale data 5 6 8 240 Reduce TTL, add invalidation
API DDoS attack 8 4 3 96 Add rate limiting, WAF

Highest RPN = 240 (Cache stale data) → Address this first

Mitigation Strategies

Redundancy: Multiple instances, failover Monitoring: Early detection, alerting Graceful degradation: Degrade functionality instead of total failure Rate limiting: Prevent overload Input validation: Prevent bad data cascading Circuit breakers: Fail fast when dependencies down


8. Case Study Approach

Comparative Analysis

Compare reconstruction alternatives in table format (Latency, Cost, Time, Risk, Maintainability). Make recommendation with rationale based on trade-offs.

Iterative Refinement

If initial decomposition doesn't reveal insights, refine: go deeper in critical areas, switch decomposition strategy, add missing relationships. Re-run analysis. Stop when further refinement doesn't change recommendations.


9. Tool-Assisted Decomposition

Static analysis: CLOC, SonarQube (dependency graphs, complexity metrics) Dynamic analysis: Flame graphs, perf, Chrome DevTools (CPU/memory/I/O), Jaeger/Zipkin (distributed tracing)

Workflow: Static analysis → Dynamic measurement → Manual validation → Combine quantitative + qualitative

Caution: Tools miss runtime dependencies, overestimate coupling, produce overwhelming detail. Use as guide, not truth.


10. Communication & Visualization

Diagrams: Hierarchy trees, dependency graphs (color-code critical path), property heatmaps, before/after comparisons

Stakeholder views:

  • Executives: 1-page summary, key findings, business impact
  • Engineers: Detailed breakdown, technical rationale, implementation
  • Product/Business: UX impact, cost-benefit, timeline

Adapt depth to audience expertise.