Stocks and Flows Modeling
When to Use This Skill
Use stocks-and-flows modeling when:
- Predicting future states: "How many customers will we have in 6 months?"
- Finding equilibrium: "At what backlog size does the queue stabilize?"
- Analyzing delays: "Why does auto-scaling overshoot?"
- Quantifying accumulation: "How fast does technical debt grow?"
- Validating intuition: "Will doubling capacity solve this?"
- Making decisions with cost of error: Production incidents, capacity planning, resource allocation
Skip quantitative modeling when:
- System is very simple (single stock, obvious dynamics)
- Exploratory thinking (just brainstorming archetypes)
- No one will act on precise numbers
- Parameters are completely unknown (no way to estimate)
Key insight: Most management mistakes come from confusing stocks with flows. This skill provides frameworks to avoid that trap.
Fundamentals: Stocks vs Flows
Definition
Stock: A quantity that accumulates over time. You can measure it at a single instant.
- Examples: Bug count, cache entries, customers, technical debt, memory used, inventory
- Units: Things (customers, bugs, GB, etc.)
- Test: "How many X do we have RIGHT NOW?" → If answerable, it's a stock
Flow: A rate of change per unit time. It's an action happening continuously.
- Examples: Bug arrival rate, churn rate, requests/sec, memory leak rate
- Units: Things per time (customers/month, bugs/week, MB/sec)
- Test: "How fast is X changing?" → If that's the question, it's a flow
Derived metric: Neither stock nor flow, but calculated from them.
- Examples: Cache hit rate (hits/requests), utilization (used/capacity), velocity (story points/sprint)
- These are ratios or percentages, not accumulations
The Bathtub Metaphor
     INFLOW (faucet)
           ↓
┌─────────────────────┐
│                     │
│   ~~~~~~~~~~~~~~~   │ ← STOCK (water level)
│                     │
└──────────┬──────────┘
           ↓
     OUTFLOW (drain)
Stock changes by: Inflow - Outflow
- If Inflow > Outflow: Stock rises
- If Inflow < Outflow: Stock falls
- If Inflow = Outflow: Equilibrium (stock constant)
Why this matters: You can't change the stock level instantly. You can only adjust the faucets and drains. The stock responds with a delay determined by flow rates.
Units Discipline
Iron rule: Check dimensional consistency in every equation.
CORRECT:
ΔCustomers = (150 customers/month) - (0.05 × customers × 1/month)
Units: customers = customers/month × month ✓
WRONG:
ΔRevenue = Customers + Churn
Units: $/month ≠ customers + customers/month ✗
Practice: Write units next to every number. If units don't match across an equation, you've made a conceptual error.
Formal Notation
Basic Stock-Flow Equation
Discrete time (month-by-month, day-by-day):
S(t+1) = S(t) + Δt × (Inflow - Outflow)
Where:
S(t) = Stock at time t
Inflow = Rate coming in (units/time)
Outflow = Rate going out (units/time)
Δt = Time step (usually 1 if you match units)
Example - Bug Backlog:
Backlog(tomorrow) = Backlog(today) + (Bugs reported) - (Bugs fixed)
B(t+1) = B(t) + R - F
If R = 40 bugs/day, F = 25 bugs/day, B(0) = 100:
B(1) = 100 + 40 - 25 = 115 bugs
B(2) = 115 + 40 - 25 = 130 bugs
B(3) = 130 + 40 - 25 = 145 bugs
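A minimal sketch of this discrete update in Python (plain arithmetic, no libraries); the function name is illustrative and the numbers are the example above:

```python
# Discrete stock update: B(t+1) = B(t) + R - F
def simulate_backlog(b0, arrival_rate, fix_rate, days):
    """Iterate the bug backlog day by day with constant in/out flows."""
    backlog = b0
    history = [backlog]
    for _ in range(days):
        backlog = backlog + arrival_rate - fix_rate  # inflow - outflow
        history.append(backlog)
    return history

# R = 40 bugs/day, F = 25 bugs/day, B(0) = 100
print(simulate_backlog(100, 40, 25, 3))  # [100, 115, 130, 145]
```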
Flows Depending on Stocks
Often flows aren't constant—they depend on stock levels:
Outflow = Rate × Stock
Examples:
Churn = 0.05/month × Customers
Cache evictions = New entries (only when cache is full)
Bug fix rate = Engineers × Bugs fixed per engineer-day (while the backlog is large enough to keep everyone busy)
Bug backlog with stock-dependent fixing:
F = min(Team_capacity, 0.5 × B) ← More bugs → faster fixing (to a limit)
If B is small: Team isn't working at capacity
If B is large: Team is saturated at max throughput
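A sketch of the same update with a stock-dependent outflow, assuming a team capacity of 50 bugs/day (the figure used in the equilibrium example later); the backlog drifts toward the level where the fix rate matches the 40 bugs/day inflow:

```python
def simulate_backlog_capped(b0, arrival_rate, team_capacity, days):
    """Backlog where the fix rate rises with the backlog, up to team capacity."""
    backlog = b0
    for _ in range(days):
        fix_rate = min(team_capacity, 0.5 * backlog)  # stock-dependent outflow
        backlog = backlog + arrival_rate - fix_rate
    return backlog

# Starting at 100 bugs with 40/day arriving, the backlog settles near 80
print(round(simulate_backlog_capped(100, 40, 50, 30), 1))  # ~80.0
```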
Multi-Stock Systems
When stocks transfer between states:
BASIC CUSTOMERS (B):
ΔB = +Acquisitions - Upgrades + Downgrades - Churn_B
PREMIUM CUSTOMERS (P):
ΔP = +Upgrades - Downgrades - Churn_P
Note: Upgrades leave B and enter P (transfer flow)
Acquisitions only enter B (source flow)
Churn leaves system entirely (sink flow)
Template for multi-stock:
Stock_A(t+1) = Stock_A(t) + Sources_A + Transfers_to_A - Transfers_from_A - Sinks_A
Stock_B(t+1) = Stock_B(t) + Sources_B + Transfers_to_B - Transfers_from_B - Sinks_B
Stock vs Flow Identification
Decision tree:
1. Can you measure it at a single instant without reference to time?
   - YES → It's a stock (or derived metric)
   - NO → It's a flow
2. If YES, does it accumulate based on past activity?
   - YES → Stock (customers accumulate from past acquisitions)
   - NO → Derived metric (hit rate = hits/requests right now)
3. What are the units?
   - Things (GB, customers, bugs) → Stock
   - Things/time (GB/sec, customers/month) → Flow
   - Dimensionless (%, ratio) → Derived metric
Common ambiguities:
| Concept | Stock or Flow? | Why |
|---|---|---|
| Technical debt | Stock | Accumulates over time, measured in "story points of debt" |
| Debt accumulation | Flow | Rate at which debt is added (points/sprint) |
| Velocity | Derived metric | Story points/sprint (ratio of two flows) |
| Morale | Stock | Current team morale level (1-10 scale at instant) |
| Morale erosion | Flow | Rate of morale decline (points/month) |
| Cache hit rate | Derived metric | Hits/Requests (ratio, not accumulation) |
| Response time | Derived metric | Total time / Requests (average at instant) |
| Bug count | Stock | Number of open bugs right now |
| Bug arrival rate | Flow | New bugs per week |
Red flag: If you're tempted to say "we need more velocity", stop. You can't "have" velocity—it's a measurement of throughput. You need more throughput capacity (stock: engineer hours) or better process efficiency (affects flow rate).
When to Model Quantitatively
Decision Criteria
Build a quantitative model when:
1. Equilibrium is non-obvious
   - "Will the queue ever stabilize?"
   - Multi-stock systems with transfers (churn + upgrades + downgrades)
   - Need to know: "At what size?"
2. Delays are significant
   - Delay > 50% of desired response time → Danger zone
   - Auto-scaling with 4-minute cold start for 5-minute traffic spike
   - Information travels slower than problem evolves
3. Non-linear relationships
   - Performance cliffs (CPU 80% → 95% causes 10× slowdown)
   - Network effects (value per user increases with user count)
   - Saturation (hiring more doesn't help past some point)
4. Cost of error is high
   - Production capacity planning
   - Financial projections
   - SLA compliance decisions
   - Cost: "If we're wrong, we lose $X or reputation"
5. Intuition conflicts
   - Team disagrees on what will happen
   - "Common sense" says one thing, someone suspects otherwise
   - Model adjudicates
6. Validation needed
   - Need to convince stakeholders with numbers
   - Compliance or audit requirement
   - Building confidence before expensive commitment
Stay qualitative when:
- Brainstorming phase (exploring problem space)
- System is trivial (one stock, constant flows, obvious outcome)
- Parameters are completely unknown (garbage in, garbage out)
- Decision won't change regardless of numbers
- Time to model > time to just try it
Rule of thumb: If you're about to make a decision that takes >1 week to reverse and costs >$10K if wrong, spend 30 minutes building a spreadsheet model.
Equilibrium Analysis
Finding Steady States
Equilibrium = Stock levels where nothing changes (ΔS = 0)
Method:
- Write stock-flow equations
- Set ΔS = 0 (no change)
- Solve for stock levels algebraically
Example - Bug Backlog Equilibrium:
ΔB = R - F
Set ΔB = 0:
0 = R - F
F = R
If R = 40 bugs/day:
Equilibrium when F = 40 bugs/day
If fixing rate depends on backlog: F = min(50, 0.5 × B)
0 = 40 - 0.5 × B
B = 80 bugs ← Equilibrium backlog
Interpretation: System will settle at 80-bug backlog where team fixes 40/day.
Multi-Stock Equilibrium
SaaS customer example:
ΔB = 150 - 0.15×B + 0.08×P = 0 ... (1)
ΔP = 0.10×B - 0.13×P = 0 ... (2)
From (2): P = (0.10/0.13) × B = 0.769 × B
Substitute into (1):
150 - 0.15×B + 0.08×(0.769×B) = 0
150 = 0.15×B - 0.0615×B
150 = 0.0885×B
B = 1,695 customers
P = 1,304 customers
Total equilibrium = 2,999 customers
Validation:
- Check: inflows to Basic (150 acquisitions + 0.08 × 1,304 ≈ 104 downgrades) match outflows (0.15 × 1,695 ≈ 254) ✓
- Sanity: Total grows from 1,000 → ~3,000 over ~18 months ✓
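A sketch that iterates the two equations month by month as a numerical cross-check of the algebra; the starting split (all 1,000 initial customers on Basic) is an assumption for illustration:

```python
def simulate_saas(basic=1000.0, premium=0.0, tol=0.01):
    """Iterate dB = 150 - 0.15*B + 0.08*P and dP = 0.10*B - 0.13*P until changes are tiny."""
    months = 0
    while True:
        d_basic = 150 - 0.15 * basic + 0.08 * premium
        d_premium = 0.10 * basic - 0.13 * premium
        basic += d_basic
        premium += d_premium
        months += 1
        if abs(d_basic) < tol and abs(d_premium) < tol:
            return basic, premium, months

b, p, months = simulate_saas()
# Settles near the ~1,695 Basic / ~1,304 Premium found algebraically
# (small differences come from rounding 0.10/0.13 above)
print(round(b), round(p), months)
```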
Stable vs Unstable Equilibria
Stable: Perturbations decay back to equilibrium
- Bug backlog with stock-dependent fixing
- Customer base with constant churn %
- Cache at capacity (every new entry evicts old)
Unstable: Small perturbations grow exponentially
- Bug backlog where fixing gets SLOWER as backlog grows (team overwhelmed)
- Product with negative word-of-mouth (more users → worse experience → churn accelerates)
- Memory leak (usage grows unbounded)
Test:
- Increase stock slightly above equilibrium
- Do flows push it back down? → Stable
- Do flows push it further up? → Unstable (runaway)
No equilibrium:
- ΔS = constant > 0 → Unbounded growth (venture-backed startup in growth mode)
- ΔS = constant < 0 → Runaway collapse (company in death spiral)
- These systems don't have steady states, only trajectories
Time Constants and Dynamics
How Fast to Equilibrium?
Time constant (τ): Characteristic time for system to respond
For simple balancing loop:
τ = Stock_equilibrium / Outflow_rate
Example - Filling cache:
Capacity: 1,000 entries
Miss rate: 8,000 unique requests/hour (when mostly empty)
τ = 1,000 / 8,000 = 0.125 hours = 7.5 minutes
Exponential approach: Stock approaches equilibrium like:
S(t) = S_eq - (S_eq - S_0) × e^(-t/τ)
Where:
S_eq = Equilibrium level
S_0 = Starting level
τ = Time constant
Useful milestones:
- After 1τ: 63% of the way to equilibrium
- After 2τ: 86% there
- After 3τ: 95% there
- After 5τ: 99% there (effectively "done")
Practical: "90% there" ≈ 2.3 × τ
Example - Customer growth:
Current: 1,000 customers
Equilibrium: 3,000 customers
Time constant: τ = 8 months (calculated from acquisition/churn rates)
When will we hit 2,700 customers (90% of growth)?
t = 2.3 × 8 = 18.4 months
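A small sketch of the exponential-approach formula; `time_to_fraction` generalizes the 2.3 × τ rule of thumb to any "X% of the way there":

```python
import math

def stock_at(t, s_eq, s_0, tau):
    """S(t) = S_eq - (S_eq - S_0) * e^(-t/tau)"""
    return s_eq - (s_eq - s_0) * math.exp(-t / tau)

def time_to_fraction(fraction, tau):
    """Time to close `fraction` of the gap to equilibrium (0 < fraction < 1)."""
    return -tau * math.log(1 - fraction)

# Customer growth example: 1,000 -> 3,000 with tau = 8 months
print(round(stock_at(8, 3000, 1000, 8)))    # after 1 tau: ~63% of the way (~2,264)
print(round(time_to_fraction(0.90, 8), 1))  # ~18.4 months to 90% of the gap
```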
Multi-Stock Time Constants
Different stocks approach equilibrium at different rates:
SaaS example:
- Basic customer base: τ_B ≈ 10 months (slow growth due to upgrades)
- Premium customer base: τ_P ≈ 5 months (faster growth from upgrade flow)
- MRR: Tracks premium customers, so τ_MRR ≈ 5 months
System reaches overall equilibrium when the SLOWEST stock stabilizes.
Implication: Revenue growth will plateau before customer count does (because premium customers equilibrate faster, and they drive revenue).
Modeling Delays
Types of Delays
Information delay: Time between event and awareness
- Monitoring lag: 5 minutes to detect CPU spike
- Reporting lag: Bug discovered 2 weeks after code shipped
- Metric delay: Dashboard updates every hour
Material delay: Time between decision and physical result
- Provisioning: 4 minutes to start new instance
- Hiring: 3 months to recruit and onboard engineer
- Training: 6 months for new team member to be fully productive
Pipeline delay: Work in progress
- Deployment pipeline: 20 minutes CI/CD
- Manufacturing: Parts in assembly
- Support tickets: Acknowledged but not resolved
Delay Notation
Event → [Information Delay] → Detection → [Decision Time] → Action → [Material Delay] → Effect
Example - Auto-scaling:
CPU spike → [5 min monitoring] → Alert → [instant] → Add instances → [4 min startup] → Capacity
Total delay: 9 minutes from problem to solution
Delay-Induced Failure Modes
1. Prolonged degradation: Solution arrives too late
Problem at t=0
Solution effective at t=9
If problem only lasts 5 minutes → Wasted scaling
If problem lasts 15 minutes → 60% of duration in pain
2. Overshoot: Multiple decisions made during delay
t=0: CPU spikes to 95%
t=5: Decision #1: Add 10 instances (not aware of in-flight)
t=9: Decision #1 takes effect, CPU drops to 60%
t=10: Decision #2: Add 10 more (based on stale data at t=5)
t=14: Decision #2 takes effect, CPU at 30%, massive overcapacity
3. Oscillation: System bounces around equilibrium
Undercapacity → Scale up → [delay] → Overcapacity → Scale down → [delay] → Undercapacity → ...
Delay Analysis Framework
Question 1: What is the delay magnitude (D)?
- Sum information + decision + material delays
Question 2: What is the desired response time (R)?
- How fast does the problem evolve?
- How quickly do we need the solution?
Question 3: What is the delay ratio (D/R)?
Rules of thumb:
- D/R < 0.2: Delay negligible, can treat as instant
- 0.2 < D/R < 0.5: Delay noticeable, may cause slight overshoot
- 0.5 < D/R < 1.0: Danger zone, significant overshoot/oscillation risk
- D/R > 1.0: Solution arrives after problem evolved, high risk of wrong action
Auto-scaling example:
- D = 9 minutes (5 + 4)
- R = 5 minutes (traffic spike duration)
- D/R = 1.8 → HIGH RISK
Implications:
- Need faster provisioning (reduce D)
- Need earlier warning (increase R by predicting)
- Need feedforward control (preemptive scaling)
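A sketch of the D/R check; the thresholds are the rules of thumb listed above:

```python
def delay_ratio_risk(info_delay, decision_delay, material_delay, response_time):
    """Classify delay risk from the D/R ratio."""
    d = info_delay + decision_delay + material_delay  # total delay D
    ratio = d / response_time                         # D/R
    if ratio < 0.2:
        risk = "negligible - treat as instant"
    elif ratio < 0.5:
        risk = "noticeable - slight overshoot possible"
    elif ratio < 1.0:
        risk = "danger zone - overshoot/oscillation likely"
    else:
        risk = "high risk - solution arrives after the problem has evolved"
    return ratio, risk

# Auto-scaling: 5 min monitoring + instant decision + 4 min startup, 5 min spike
print(delay_ratio_risk(5, 0, 4, 5))  # (1.8, 'high risk - ...')
```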
Addressing Delays: Leverage Points
Level 12 (weakest): Tune parameters
- Adjust scaling thresholds (70% vs 80% CPU)
- Helps marginally, doesn't eliminate delay
Level 11: Add buffers
- Keep warm pool of pre-started instances
- Reduces material delay, still has information delay
Level 10: Change system structure
- Scheduled scaling for known patterns
- Feedforward control (bypass feedback loop entirely)
Level 6 (strongest of these): Change information flow
- Predictive auto-scaling (ML forecasting)
- Eliminates information delay by anticipating
Key insight: Delays in balancing loops create most of the problem. Fixing delays is high-leverage.
Non-Linear Dynamics
When Linear Intuition Fails
Linear thinking: "Double the input, double the output"
- Works for: Simple arithmetic, direct proportions
- Fails for: Real systems with constraints, thresholds, interactions
Signs of non-linearity:
- Diminishing returns: Adding more stops helping (hiring past team size 50)
- Accelerating returns: More begets more (network effects)
- Thresholds/cliffs: Small change causes regime shift (cache 95% → 100% full)
- Saturation: Can't grow past ceiling (CPU can't exceed 100%)
Common Non-Linear Patterns
1. S-Curve (Logistic Growth):
Slow start → Exponential growth → Saturation
Example: Product adoption
Early: Few users, slow growth (no network effects yet)
Middle: Rapid growth (word of mouth kicks in)
Late: Market saturated, growth slows
Formula:
S(t) = K / (1 + e^(-r(t - t0)))
Where:
K = Carrying capacity (max possible)
r = Growth rate
t0 = Inflection point
2. Performance Cliffs:
CPU Utilization vs Response Time (typical web server):
0-70%: 50ms (constant)
70-85%: 80ms (slight increase)
85-95%: 200ms (degraded)
95-98%: 800ms (severe degradation)
98%+: 5000ms (collapse)
Why: Queuing theory—small increases in utilization cause exponential increases in wait time near saturation.
Implication: "We're at 90% CPU, let's add 20% capacity" → Only brings you to 75%, still in degraded zone. Need 2× capacity to get to safe 45%.
3. Tipping Points:
Small change crosses threshold → Large regime shift
Examples:
- Technical debt reaches point where all time spent fixing, no features
- Team morale drops below threshold → Attrition spiral
- Cache eviction rate exceeds insertion rate → Thrashing
Modeling: Need to identify the threshold and model behavior on each side separately.
4. Reinforcing Loops (Exponential):
Compound growth: S(t) = S(0) × (1 + r)^t
Examples:
- Viral growth: Each user brings k friends (k > 1)
- Technical debt: Slows development → More shortcuts → More debt
- Attrition: People leave → Remaining overworked → More leave
Danger: Exponentials seem slow at first, then explode. By the time you notice, system is in crisis.
Identifying Non-Linearities
Method 1: Plot the relationship
- Graph flow vs stock (e.g., fix rate vs backlog)
- Linear: Straight line
- Non-linear: Curve, bend, cliff
Method 2: Test extremes
- What happens at stock = 0?
- What happens at stock = very large?
- If behavior changes qualitatively, it's non-linear
Method 3: Look for limits
- Physical limits (100% CPU, 24 hours/day)
- Economic limits (budget constraints)
- Social limits (team coordination breaks down past 50 people)
Method 4: Check for interactions
- Does flow depend on MULTIPLE stocks?
- Does one stock's growth affect another's?
- Interactions create non-linearities
Modeling Non-Linear Systems
Piecewise linear:
Fix_rate =
if B < 50: 25 bugs/day (constant)
if B >= 50: 0.5 × B bugs/day (linear in B)
if B > 100: 50 bugs/day (saturated)
Lookup tables:
CPU% | Response_ms
-----|------------
60 | 50
70 | 60
80 | 90
90 | 200
95 | 800
98 | 5000
Interpolate between values for model.
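A sketch of the lookup-table approach with linear interpolation (standard-library `bisect` only); the rows are the table above:

```python
from bisect import bisect_left

CPU = [60, 70, 80, 90, 95, 98]           # CPU utilization (%)
RESPONSE = [50, 60, 90, 200, 800, 5000]  # response time (ms)

def response_ms(cpu):
    """Linearly interpolate response time between table rows."""
    if cpu <= CPU[0]:
        return RESPONSE[0]
    if cpu >= CPU[-1]:
        return RESPONSE[-1]
    i = bisect_left(CPU, cpu)
    x0, x1 = CPU[i - 1], CPU[i]
    y0, y1 = RESPONSE[i - 1], RESPONSE[i]
    return y0 + (y1 - y0) * (cpu - x0) / (x1 - x0)

print(response_ms(85))  # 145.0 ms, halfway between the 80% and 90% rows
```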
Functional forms:
- Exponential saturation: F = F_max × (1 - e^(-k×S))
- Power law: F = a × S^b
- Logistic: F = K / (1 + e^(-r×S))
Practical advice: Start simple (linear), add non-linearity only where it matters for the question you're answering.
Visualization Techniques
Bathtub Diagrams
Purpose: Communicate stock-flow structure to non-technical audiences
Format:
    Acquisitions
     150/month
         ↓
┌──────────────────┐
│                  │
│    CUSTOMERS     │ ← Stock (current: 1,000)
│                  │
└────────┬─────────┘
         ↓
       Churn
  5% × Customers
    = 50/month
When to use: Explaining accumulation dynamics to executives, stakeholders, non-engineers
Key: Label flows with rates, stock with current level and units
Stock-Flow Diagrams
Purpose: Technical analysis, show equations visually
Notation:
- Rectangle = Stock
- Valve = Flow
- Cloud = Source/Sink (outside system boundary)
- Arrow = Information link (affects flow)
Example:
☁ → [Acquisition] → |BASIC| → [Upgrade] → |PREMIUM| → [Churn] → ☁
                       ↑                      ↓
                       └──── [Downgrade] ─────┘
[Flow] affects rate
|Stock| accumulates
☁ = External source/sink
When to use: Detailed analysis, documenting model structure, team discussion
Behavior Over Time (BOT) Graphs
Purpose: Show how stocks and flows change dynamically
Format: Time series plots
Customers
│ ┌─────── Equilibrium (3,000)
3000│ /
│ /
2000│ /
│ /
1000├/───────────────────
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─
0 3 6 9 12 15 18 Months
When to use:
- Demonstrating "what happens over time"
- Comparing scenarios ("with churn reduction vs without")
- Showing approach to equilibrium
Best practice: Plot both stocks and key flows on same graph with dual y-axes if needed
Phase Diagrams (Advanced)
Purpose: Visualize multi-stock systems
Format: Plot Stock A vs Stock B
Premium
   │
   │           /  ← Equilibrium point (1,695 B, 1,304 P)
   │          /
   │         /    ← Trajectory from start
   │        /
   │       ●
   └──────────────── Basic
Arrow shows direction of movement over time
When to use: Complex systems with 2-3 interacting stocks
Choosing Visualization
| Audience | Purpose | Best Visualization |
|---|---|---|
| Executive | Explain problem | Bathtub diagram |
| Engineer | Analyze dynamics | Stock-flow diagram + BOT graph |
| Stakeholder | Compare options | Multiple BOT graphs (scenarios) |
| Team | Build shared model | Whiteboard stock-flow diagram |
| Self | Understand system | All of the above iteratively |
Model Validation
Units Check (Dimensional Analysis)
Every equation must have consistent units on both sides.
Process:
- Write units next to every variable
- Check each term in equation has same units
- If units don't match, you've made a conceptual error
Example:
WRONG:
MRR = Basic_customers + Premium_revenue
[$/month] ≠ [customers] + [$/month] ✗
RIGHT:
MRR = (Basic_customers × $100/month) + (Premium_customers × $150/month)
[$/month] = [customers × $/month] + [customers × $/month] ✓
Common errors caught by units:
- Adding stock to flow
- Multiplying when you should divide
- Forgetting time scale (monthly vs annual rates)
Boundary Testing
Test extreme values to catch nonsensical model behavior:
What if stock = 0?
Bug backlog = 0 bugs
Fix rate = 0.5 × 0 = 0 bugs/day ✓ (Can't fix non-existent bugs)
What if flow = 0?
Churn = 0%
Equilibrium customers = ∞ ✗ (Unbounded growth is unrealistic)
Insight: Need to add market saturation limit
What if stock = very large?
Backlog = 10,000 bugs
Fix rate = 0.5 × 10,000 = 5,000 bugs/day ✗ (Team of 5 can't fix 5,000/day)
Insight: Need to cap fix rate at team capacity
What if flow is negative?
Acquisition rate = -50 customers/month ✗ (Negative acquisition is nonsense)
Insight: Model might produce negative flows in edge cases, need floor at 0
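A sketch of baking those boundary fixes into the flow definitions themselves (cap outflow at team capacity, floor stocks and flows at zero), reusing the bug-backlog numbers:

```python
def fix_rate(backlog, team_capacity=50):
    """Outflow can't exceed team capacity and can't go negative."""
    return max(0.0, min(team_capacity, 0.5 * backlog))

def step(backlog, arrivals, team_capacity=50):
    """One day of backlog dynamics; the stock itself is floored at zero."""
    new_backlog = backlog + arrivals - fix_rate(backlog, team_capacity)
    return max(0.0, new_backlog)

print(fix_rate(0))       # 0.0  - can't fix non-existent bugs
print(fix_rate(10_000))  # 50.0 - capped at team capacity, not 5,000/day
print(step(5, 0))        # 2.5  - never goes negative as the backlog drains
```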
Assumptions Documentation
State every assumption explicitly:
Example - Cache model assumptions:
- Request distribution is stable (20/80 hot/cold)
- FIFO eviction (not LRU or LFU)
- Cache lookup time is negligible
- No cache invalidation (entries only evicted, not deleted)
- Hot resources are accessed frequently enough to never evict
Why this matters:
- Identify where model breaks if reality differs
- Communicate limitations to stakeholders
- Know where to improve model if predictions fail
Template:
## Model Assumptions
1. [Physical]: What are we assuming about the system?
2. [Behavioral]: What are we assuming about users/actors?
3. [Parameter]: What values are we guessing?
4. [Scope]: What are we deliberately ignoring?
Sensitivity Analysis
Question: How robust is the conclusion to parameter uncertainty?
Method: Vary parameters ±20% or ±50%, see if conclusion changes
Example - Churn reduction ROI:
Base case: 5% → 3% churn = +$98K MRR at 12 months
Sensitivity:
Acquisition rate ±20%: +$85K to +$112K (Conclusion robust ✓)
Upgrade rate ±20%: +$92K to +$104K (Conclusion robust ✓)
Initial customers ±20%: +$88K to +$108K (Conclusion robust ✓)
If conclusion changes sign (e.g., ROI goes negative), the model is sensitive to that parameter. You need better data for that parameter or acknowledge high uncertainty.
Traffic light test:
- Green: Conclusion unchanged across plausible range
- Yellow: Magnitude changes but direction same
- Red: Conclusion flips (positive to negative)
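A sketch of the sweep mechanics. It uses the simple single-stock customer model (150 acquisitions/month, 5% churn) rather than the full churn-reduction model above, which isn't spelled out here; the structure (vary one parameter ±20%, compare to the base case) is the point:

```python
def customers_after(months, c0=1000, acquisition=150, churn=0.05):
    """Single-stock customer model: C(t+1) = C + acquisition - churn * C."""
    c = c0
    for _ in range(months):
        c = c + acquisition - churn * c
    return c

base = customers_after(12)
scenarios = [
    ("acquisition -20% / +20%", {"acquisition": 120}, {"acquisition": 180}),
    ("churn +20% / -20%",       {"churn": 0.06},      {"churn": 0.04}),
]
for name, low_kwargs, high_kwargs in scenarios:
    low = customers_after(12, **low_kwargs)
    high = customers_after(12, **high_kwargs)
    print(f"{name}: {low:.0f} .. {high:.0f} (base {base:.0f})")
    # If the low..high range stays on the same side of your decision threshold,
    # the conclusion is robust (green); if it flips, you need better data (red).
```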
Calibration: Simple to Complex
Start simple:
- Constant flows
- Linear relationships
- Single stock
Add complexity only if:
- Simple model predictions don't match reality
- Non-linearity matters for your question
- Stakeholders won't accept simple model
Iterative refinement:
- Build simplest model
- Compare to real data (if available)
- Identify largest discrepancy
- Add ONE complexity to address it
- Repeat
Warning: Complex models have more parameters → More ways to be wrong. Prefer simple models that are "approximately right" over complex models that are "precisely wrong."
Common Patterns in Software
1. Technical Debt Accumulation
STOCK: Technical Debt (story points)
INFLOWS:
- Shortcuts taken: 5 points/sprint (pressure to ship)
- Dependencies decaying: 2 points/sprint (libraries age)
OUTFLOWS:
- Refactoring: 3 points/sprint (allocated capacity)
ΔDebt = 5 + 2 - 3 = +4 points/sprint
Equilibrium: Never (unbounded growth)
Time to crisis: When debt > team capacity to understand codebase
Interventions:
- Level 12: Increase refactoring allocation (3 → 5 points/sprint)
- Level 8: Change process to prevent shortcuts (balancing loop)
- Level 3: Change goal from "ship fast" to "ship sustainable"
2. Queue Dynamics
STOCK: Backlog (tickets, bugs, support requests)
INFLOW: Arrival rate (requests/day)
OUTFLOW: Service rate (resolved/day)
Special cases:
- Arrivals > Service: Queue grows unbounded (hire more or reduce demand)
- Arrivals < Service: Queue drains (over-capacity)
- Arrivals = Service: Equilibrium, but queue length depends on variability
Note: Even at equilibrium, queue has non-zero size due to randomness (queuing theory)
3. Resource Depletion
STOCK: Available Resources (DB connections, memory, file handles)
INFLOWS:
- Release: Connections closed, memory freed
OUTFLOWS:
- Allocation: Connections opened, memory allocated
Leak: Outflow > Inflow (allocate but don't release)
→ Stock depletes to 0
→ System fails
Time to failure: Initial_stock / Net_outflow
4. Capacity Planning
STOCK: Capacity (servers, bandwidth, storage)
DEMAND: Usage (request rate, data size)
Key question: When does demand exceed capacity?
Model demand growth:
D(t) = D(0) × (1 + growth_rate)^t
Solve for t when D(t) = Capacity:
t = log(Capacity / D(0)) / log(1 + growth_rate)
Example:
Current: 1,000 req/sec, Capacity: 2,000 req/sec
Growth: 5%/month
t = log(2) / log(1.05) = 14.2 months until saturation
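A sketch of the same calculation; any logarithm base works as long as numerator and denominator use the same one:

```python
import math

def months_until_saturation(current_load, capacity, monthly_growth):
    """Solve D(0) * (1 + g)^t = capacity for t."""
    return math.log(capacity / current_load) / math.log(1 + monthly_growth)

# 1,000 req/sec today, 2,000 req/sec capacity, 5%/month demand growth
print(round(months_until_saturation(1000, 2000, 0.05), 1))  # ~14.2 months
```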
5. Customer Dynamics
STOCK: Active Customers
INFLOWS:
- Acquisition: Marketing spend → New customers
- Reactivation: Win-back campaigns
OUTFLOWS:
- Churn: % leaving per month
- Downgrades: Moving to free tier (if that's outside system boundary)
Equilibrium: Acquisition = Churn
A = c × C (where c = churn rate)
C_eq = A / c
If A = 150/month, c = 5%:
C_eq = 150 / 0.05 = 3,000 customers
6. Cache Behavior
STOCK: Cache Entries (current: E, max: E_max)
INFLOWS:
- Cache misses for new resources
OUTFLOWS:
- Evictions (when cache is full)
Phases:
1. Fill (E < E_max): Inflow > 0, Outflow = 0
2. Equilibrium (E = E_max): Inflow = Outflow (every new entry evicts one)
Hit rate at equilibrium:
Depends on request distribution vs cache size
- Perfect: Hot set < E_max → 100% hit rate
- Reality: Long tail → Partial hit rate
Integration with Other Skills
Stock-Flow + Archetypes
Archetypes are patterns of stock-flow structure:
Fixes that Fail:
STOCK: Problem Symptom
Quick fix reduces symptom (outflow) but adds to root cause (inflow to different stock)
Result: Symptom returns worse
Example:
Stock 1: Bug Backlog
Stock 2: Technical Debt
Quick fix: Hack patches (reduces backlog, increases debt)
Debt → Slower development → More bugs → Backlog returns
Use stock-flow to quantify archetypes:
- How fast does the symptom return?
- What's the equilibrium after fix?
- How much worse is long-term state?
Stock-Flow + Leverage Points
Map leverage points to stock-flow structure:
- Level 12 (Parameters): Change flow rates (increase acquisition budget)
- Level 11 (Buffers): Change stock capacity (bigger cache, more servers)
- Level 10 (Structure): Add/remove stocks or flows (new customer tier)
- Level 8 (Balancing loops): Change outflow relationships (reduce churn)
- Level 7 (Reinforcing loops): Change inflow relationships (viral growth)
- Level 6 (Information): Change what affects flows (predictive scaling)
- Level 3 (Goals): Change target equilibrium (growth vs profitability)
Quantitative modeling helps evaluate leverage:
- Calculate impact of 20% parameter change (Level 12)
- Compare to impact of structural change (Level 10)
- See that structural change is often 5-10× more effective
Stock-Flow + Causal Loops
Causal loops show feedback structure:
Customers → Revenue → Marketing → Customers (reinforcing)
Stock-flow quantifies the loops:
C(t+1) = C(t) + M(t) - 0.05×C(t) (customers)
R(t) = $100 × C(t) (revenue)
M(t) = 0.10 × R(t) / $500 (marketing converts revenue to customers)
Use stock-flow to:
- Calculate loop strength (how fast does reinforcing loop accelerate growth?)
- Find equilibrium (where do balancing loops stabilize system?)
- Identify delays (how long before marketing investment shows up in customers?)
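A sketch that iterates the three equations above. Note that with these particular parameters the reinforcing loop adds only 0.02 × C customers/month (0.10 × $100 × C / $500), which is weaker than the 0.05 × C churn outflow, so the loop-strength calculation says the base shrinks about 3%/month rather than growing:

```python
def simulate_loop(customers=1000.0, months=12):
    """Customers -> Revenue -> Marketing -> Customers, with 5%/month churn."""
    for _ in range(months):
        revenue = 100.0 * customers              # R(t) = $100 * C(t)
        marketing_adds = 0.10 * revenue / 500.0  # M(t) = 0.10 * R(t) / $500
        customers = customers + marketing_adds - 0.05 * customers
    return customers

print(round(simulate_loop()))  # ~694 after a year: churn currently beats the loop
```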
Decision Framework: Which Skill When?
Start with Archetypes when:
- Problem seems familiar ("we've seen this before")
- Need quick pattern matching
- Communicating to non-technical audience
Add Stock-Flow when:
- Need to quantify ("how fast?", "how much?", "when?")
- Archetype diagnosis unclear (need to map structure first)
- Validating intuition with numbers
Use Leverage Points when:
- Evaluating interventions (which fix is highest impact?)
- Communicating strategy (where should we focus?)
- Already have stock-flow model, need to decide what to change
Typical workflow:
- Sketch causal loops (quick structure)
- Identify archetype (pattern matching)
- Build stock-flow model (quantify)
- Evaluate interventions with leverage points (decide)
Common Mistakes
1. Confusing Stocks with Flows
Mistake: "We need more velocity"
- Velocity is a flow (story points/sprint), not a stock you can "have"
Correct: "We need more capacity" (engineer hours, a stock) or "We need better process efficiency" (affects velocity, a flow rate)
Test: Can you measure it at a single instant without time reference?
2. Forgetting Delays
Mistake: "Just add more servers, problem solved"
- Ignores 4-minute cold start
- Ignores 5-minute detection lag
- By the time servers are online, spike is over
Correct: "9-minute total delay means we'll be overloaded for most of the spike. Need faster provisioning or predictive scaling."
Test: What is delay / response_time? If >0.5, delay dominates.
3. Linear Thinking in Non-Linear Systems
Mistake: "We're at 90% CPU, add 20% more servers → 72% CPU"
- Queuing theory: Response time is non-linear near saturation
- 90% → 72% keeps you in degraded performance zone
Correct: "Need to get below 70% CPU to escape performance cliff. Requires 2× capacity, not 1.2×."
Test: Plot performance vs utilization. If it curves, it's non-linear.
4. Ignoring Units
Mistake:
Total_cost = Customers + (Revenue × 0.3)
[units?] = [customers] + [$/month × dimensionless] ✗
Correct: Write units, check consistency
Total_cost [$/month] = (Customers [count] × $100/customer/month) + ...
5. Over-Modeling
Mistake: Building 500-line Python simulation for simple question
- "How many customers at equilibrium?"
- Could solve with 2-line algebra
Correct: Start simple. Add complexity only if simple model fails.
Test: Can you answer the question with envelope math? If yes, do that first.
6. Under-Modeling
Mistake: Guessing at capacity needs for $100K infrastructure investment
- "Seems like we need 50 servers"
- No model, no calculation
Correct: 30 minutes in Excel to model growth, calculate breakpoint, sensitivity test
Test: Cost of error >$10K and decision takes >1 week to reverse? Build a model.
7. Snapshot Thinking
Mistake: "We have 100 bugs right now, that's manageable"
- Ignores accumulation: 40/day in, 25/day out
- In 30 days: 100 + (40-25)×30 = 550 bugs
Correct: "Backlog is growing 15 bugs/day. At this rate, we'll have 550 bugs in a month. Need to increase fix rate or reduce inflow."
Test: Are flows balanced? If not, stock will change dramatically.
8. Equilibrium Blindness
Mistake: "Let's hire our way out of tech debt"
- More engineers → More code → More debt
- Doesn't change debt/code ratio (the equilibrium structure)
Correct: "Hiring changes throughput but not debt accumulation rate. Need to change development process (reduce debt inflow) or allocate refactoring time (increase debt outflow)."
Test: Does the intervention change the equilibrium, or just the time to get there?
9. Ignoring Delays in Feedback Loops
Mistake: "We shipped the performance fix, why are users still complaining?"
- Fix deployed today
- Users notice over next 2 weeks
- Reviews/sentiment update over next month
- Information delay is 30+ days
Correct: "Fix will take 4-6 weeks to show up in sentiment metrics. Don't panic if next week's NPS is still low."
10. Treating Symptoms vs Stocks
Mistake: "Add more servers every time we get slow"
- Symptom: Slow response
- Stock: Request rate growth
- Treating symptom (capacity) not root cause (demand)
Correct: "Why is request rate growing? Can we cache, optimize queries, or rate-limit to reduce inflow? Then add capacity if structural changes aren't enough."
Red Flags: Rationalizations to Resist
When you're tempted to skip quantitative modeling, watch for these rationalizations:
"This is too simple to model"
Reality: Simple systems often have non-obvious equilibria.
- Bug backlog seems simple, but when does it stabilize?
- Customer churn seems obvious, but what's equilibrium size?
Counter: If it's simple, the model takes 5 minutes. If it's not simple, you NEED the model.
Test: Can you predict the equilibrium and time constant in your head? If not, it's not simple.
"We don't have time for spreadsheets"
Reality: 30 minutes modeling vs 3 months living with wrong decision.
Counter:
- Production incident? Model delay dynamics in 10 minutes to pick right intervention.
- Capacity planning? 1 hour in Excel saves $50K in overprovisioning.
Test: Time to model vs time to reverse decision. If model_time < 0.01 × reversal_time, model it.
"I can estimate this in my head"
Reality: Human intuition fails on:
- Exponential growth (seems slow then explodes)
- Delays (underestimate overshoot)
- Non-linearities (performance cliffs)
- Multi-stock systems (competing flows)
Counter: Write down your mental estimate, build model, compare. You'll be surprised how often your intuition is 2-5× off.
Test: If you're confident, the model will be quick confirmation. If you're uncertain, you need the model.
"We don't have data for parameters"
Reality: You know more than you think.
- "Churn is somewhere between 3% and 7%" is enough for sensitivity analysis
- Rough estimates reveal qualitative insights (growing vs shrinking)
Counter: Build model with plausible ranges, test sensitivity. If conclusion is robust across range, you don't need exact data. If it's sensitive, THEN invest in measurement.
Test: Can you bound parameters to ±50%? If yes, model it and check sensitivity.
"Math is overkill for this decision"
Reality:
- "Add 20% capacity" seems like common sense
- Model reveals: Need 2× due to performance cliff
- Math just prevented $40K waste
Counter: Engineering decisions deserve engineering rigor. You wouldn't deploy code without testing; don't make capacity decisions without modeling.
Test: Cost of error >$5K? Use math.
"The system is too complex to model"
Reality: All models are simplifications. That's the point.
- Don't need to model every detail
- Model the parts that matter for your decision
Counter: Start with simplest model that addresses your question. Three stocks and five flows captures 80% of systems.
Test: What's the ONE question you need to answer? Build minimal model for that question only.
"We'll just monitor and adjust"
Reality: By the time you see the problem, it may be too late.
- Delays mean problem is bigger than it appears
- Exponential growth hides until crisis
- Prevention is easier than cure
Counter: Model predicts WHEN you'll hit the wall. "Monitor and adjust" becomes "monitor for predicted warning signs and execute prepared plan."
Test: What's the delay between problem and solution? If >50% of problem duration, you need prediction, not reaction.
"This is a special case, stock-flow doesn't apply"
Reality: If something accumulates or depletes, it's a stock-flow system.
- Queues (tickets, requests, bugs)
- Resources (memory, connections, capacity)
- People (customers, users, employees)
- Intangibles (morale, technical debt, knowledge)
Counter: Describe the system. If you can identify what's accumulating and what's flowing, stock-flow applies.
Test: Is there something that can grow or shrink? That's a stock. What changes it? Those are flows.
Summary
Stocks and flows modeling is the quantitative backbone of systems thinking:
- Stocks accumulate (measurable at an instant)
- Flows change stocks (rates per unit time)
- Equilibrium = where flows balance (ΔS = 0)
- Delays create overshoot, oscillation, and failure
- Non-linearities break linear intuition (cliffs, S-curves, exponentials)
- Validation = units check, boundary test, sensitivity analysis
When to use:
- Predicting future states
- Finding equilibrium
- Quantifying delays
- Validating intuition
- Making high-stakes decisions
Key techniques:
- Formal notation: S(t+1) = S(t) + (Inflow - Outflow)
- Equilibrium: Set ΔS = 0, solve algebraically
- Time constants: τ = Stock / Flow
- Delay analysis: D/R ratio (danger when >0.5)
- Visualization: Bathtub diagrams, stock-flow diagrams, BOT graphs
Integration:
- Archetypes = patterns of stock-flow structure
- Leverage points = where to intervene in stock-flow system
- Causal loops = qualitative preview of stock-flow dynamics
Resist rationalizations:
- "Too simple" → Simple models take 5 minutes
- "No time" → 30 min modeling vs 3 months of wrong decision
- "I can estimate" → Intuition fails on delays, exponentials, non-linearities
- "No data" → Sensitivity analysis works with ranges
- "Too complex" → Start simple, add complexity only if needed
The discipline: Check units, test boundaries, state assumptions, validate with sensitivity analysis.
The payoff: Predict system behavior, avoid crises, choose high-leverage interventions, make decisions with confidence instead of guessing.