Initial commit

This commit is contained in:
Zhongwei Li
2025-11-30 08:38:26 +08:00
commit 41d9f6b189
304 changed files with 98322 additions and 0 deletions

View File

@@ -0,0 +1,350 @@
# Advanced Visualization Methodology
This document covers advanced techniques for complex visualization scenarios: dashboards, multivariate data, interactive charts, specialized domains, and sophisticated narrative structures.
---
## 1. Dashboard Design Principles
### Layout Patterns
**F-Pattern Layout:** Users scan top-left → top-right → down-left side. Place most important KPIs top-left.
**Inverted Pyramid:** Summary → Details → Deep Dive
- **Level 1 (top):** Key metrics (3-5 big numbers with trend indicators)
- **Level 2 (middle):** Supporting charts (2-4 visualizations showing drivers)
- **Level 3 (bottom):** Detailed tables/drill-downs (for exploration)
**Small Multiples Grid:** Same chart type repeated for each category with consistent scales
- Enables quick comparison across categories
- Example: 6 line charts showing MRR trend for each product line, same Y-axis scale
**Dashboard Sizing:**
- **Executive dashboard:** 1 screen, no scrolling, 5-8 total elements
- **Analyst dashboard:** 2-3 screens, deep drill-downs, 10-15 elements
- **Monitoring dashboard:** Real-time, auto-refresh, 6-12 key metrics
### Dashboard Elements
**Scorecard (Big Number):**
```
+----------------------+
| MRR: $2.6M |
| ↑ 15% vs target |
| ▲ 30% YoY |
+----------------------+
```
- One metric, large font
- Trend arrow (↑↓) and % change
- Color: green (good), red (bad), yellow (caution)
**Bullet Chart:** Performance vs target
```
Revenue: ▓▓▓▓▓▓▓▓▓░ $2.6M (target: $2.25M)
├──────┼──────┤
Poor Good Excellent
```
- Actual (dark bar), target (line), range bands (poor/good/excellent)
**Traffic Light Indicators:**
| Metric | Status | Value | Trend |
|--------|--------|-------|-------|
| MRR | 🟢 | $2.6M | ↑ 30% |
| Churn | 🔴 | 8% | ↑ 2pp |
| CAC | 🟡 | $450 | ↔ 0% |
### Dashboard Best Practices
**Consistent color scheme:** One palette throughout (e.g., blue for primary metric, gray for secondary)
**Alignment:** Grid-based layout, elements aligned to invisible grid
**White space:** Don't cram; use spacing to group related elements
**Update timestamp:** "Last updated: 2024-11-14 10:30 AM" visible
**Interactivity (if web):** Filters (date range, segment), drill-downs, tooltips
**Too many colors:** Confusing, no hierarchy
**Misaligned elements:** Looks unprofessional
**No context:** "$2.6M" alone (vs what?)
**Stale data:** No timestamp, user doesn't know if current
---
## 2. Advanced Chart Types
**Small Multiples:** Same chart repeated in grid with consistent scales. Best for comparing metric across >4 categories. Max 12 charts; use consistent Y-scale. Example: Revenue trend for 12 product lines in 3x4 grid.
**Sparklines:** Tiny inline charts in tables (no axes). Shows trend shape at a glance. Example: Table with "Trend" column showing ▁▂▃▅▆▇█ for each product.
**Horizon Chart:** Space-efficient time series using color intensity layers instead of Y-height. For 20+ metrics in limited space.
**Connected Scatter:** Scatter plot with points connected in time order. Shows X-Y relationship evolving. Example: Revenue vs Profit by quarter (Q1→Q2→Q3→Q4).
**Hexbin:** Dense scatter (1000s+ points) using hexagon grid colored by density. Avoids overlapping dots.
**Alluvial Diagram:** Flow between states over time. Bands show entity movement. Example: User tier transitions (Free→Pro→Enterprise) across quarters.
---
## 3. Multivariate Visualization Techniques
**Scatter Plot Matrix (SPLOM):** N×N grid of scatter plots for 3-5 numerical variables. Each cell = relationship between row/column variable. Example: 4 variables (MRR, Churn, CAC, LTV) = 4×4 grid.
**Parallel Coordinates:** Vertical axes for each variable, entities as lines connecting values. Compare 20+ entities across 5-15 dimensions. Brush/filter one axis to highlight lines.
**Heatmap Matrix:** Rows × Columns = Categories, cell color = metric. Example: Features × Segments, color = usage %. Use sequential (light→dark) or diverging (blue→white→red) scales. Sort by similarity to reveal patterns.
**Bubble Chart:** 4D (X, Y, size, color). Example: Products (X: revenue, Y: margin, size: customers, color: category). Limit to <20 bubbles; label them.
---
## 4. Statistical Overlays
**Regression Lines:** Linear/log/polynomial trend in scatter. Annotate R²: "R² = 0.85 (strong correlation)". Distinct color from points.
**Confidence Intervals:** Shaded band (forecast uncertainty) or error bars (mean ± SE). Example: 95% CI band around forecast line.
**Distribution Overlays:** Histogram + normal curve (actual vs expected), Box plot + strip plot (quartiles + individual points).
---
## 5. Geographic Visualization
**Choropleth:** Filled regions (states/countries) colored by metric. Sequential (light→dark) or diverging (blue→white→red) scales. Pitfall: Large areas dominate; fix with cartogram or bubble map.
**Bubble Map:** Precise locations with size = metric, color = category. Limit <100 bubbles; use clustering for density.
**Flow Map:** Origin-destination lines, width = volume. For shipping, migration, traffic flows.
---
## 6. Hierarchical & Network Visualization
**Treemap:** Nested rectangles, size = metric, nesting = hierarchy levels. Click to drill down. Example: Revenue by category → product.
**Sunburst:** Radial treemap (center = root, rings = levels). More compact for deep hierarchies (4+ levels).
**Dendrogram:** Tree diagram for clustering/hierarchy. Example: Customer segmentation tree.
**Network Graph:** Nodes & edges for relationships. Force-directed (organic clustering) or hierarchical (directed A→B→C) layout. Limit <100 nodes; node size = importance, edge width = strength.
---
## 7. Color Theory & Accessibility
### Color Scales
**Sequential (Single Hue):** Light blue → Dark blue
- For: One metric, low to high
- Examples: Revenue, count, usage
**Diverging (Two Hues):** Blue → White → Red
- For: Metric with meaningful midpoint (zero, average, neutral)
- Examples: Profit/loss, vs target, sentiment
**Categorical (Distinct Hues):** Blue, Orange, Green, Purple
- For: Discrete categories with no order
- Limit: 5-7 colors (more requires legend lookup)
### Colorblind-Safe Palettes
**Common types:**
- Red-green colorblindness (8% of men)
- Blue-yellow colorblindness (rare)
**Safe combinations:**
- Blue + Orange (most common alternative)
- Blue + Red (okay)
- Avoid: Red + Green alone
**Tools:** Use simulators (Color Oracle) to test designs
### Accessibility Checklist
- [ ] Color contrast ≥4.5:1 for text (WCAG AA)
- [ ] Don't rely on color alone (add patterns, labels, shapes)
- [ ] Alt text describes insight ("Revenue grew 30%, driven by Enterprise")
- [ ] Interactive charts keyboard-navigable (tab, arrow keys)
- [ ] Legends positioned near data (reduce eye movement)
---
## 8. Interactive Visualization Patterns
**Filtering:** Dropdown (select one), multi-select (check multiple), date slider (range), cross-filter (click element filters other charts).
**Drill-Down:** Click element to see breakdown. Breadcrumb navigation (Revenue > Product A > Feature X).
**Tooltip:** Hover detail (exact value, context, metadata). Position near cursor, contrasting background, 2-4 lines max.
**Brushing & Linking:** Select range on one chart updates others. Reveals cross-chart patterns.
---
## 9. Animation & Temporal Visualization
### Animated Transitions
**When:** Show change over time (especially for presentations)
**Example:** Bar chart race (ranks change month-by-month)
**Best practices:**
- Pause controls (don't force auto-play through)
- Slow enough to follow (1-2 seconds per frame)
- Label current time period prominently
### Before/After Comparison
**Slope chart:** Show change for each entity
- Left: Before values
- Right: After values
- Lines connect (slope = change)
**Dumbbell chart:** Like slope but horizontal
- Good for long category names
---
## 10. Domain-Specific Patterns
### SaaS Metrics Dashboard
**Key charts:**
- MRR trend (line chart)
- MRR by source (stacked area: new, expansion, churn)
- Cohort retention (heatmap: cohort × month, color = retention %)
- Funnel (inverted pyramid: leads → trials → paid)
### Financial Reporting
**P&L Waterfall:**
- Start: Revenue (bar)
- Subtract: COGS, OpEx (negative bars)
- End: Net Income (bar)
- Shows cumulative effect
**Variance Analysis:**
- Grouped bar: Actual vs Budget vs Last Year
- Or diverging bar: (Actual - Budget), color by +/-
### A/B Test Results
**Forest plot (Confidence Intervals):**
- Y-axis: Metrics
- X-axis: Effect size (treatment vs control)
- Points: Estimate
- Error bars: 95% CI
- Vertical line at zero (no effect)
**Statistical annotation:**
- "Conversion: +2.5% (95% CI: +1.2% to +3.8%), p<0.01"
### Operational Monitoring
**Status timeline:**
- X-axis: Time
- Y-axis: System/service
- Color: Status (green, yellow, red)
- Shows uptime/downtime patterns
**Percentile charts:**
- Line chart: P50, P90, P99 response times over time
- Shows not just average but tail latency
---
## 11. Advanced Narrative Techniques
### Multi-Chart Storytelling
**Progression:** Question → Evidence → Conclusion
- Chart 1: "Revenue growing, but is it sustainable?"
- Chart 2: "New customer acquisition slowing (trend down)"
- Chart 3: "But expansion revenue from existing customers up 40%"
- Conclusion: "Growth shifting from acquisition to expansion; prioritize customer success"
**Guided annotations:**
- Progressive reveal: Show chart 1, then annotate with insight, then show chart 2
- Highlight sequence: Circle region A → zoom in → annotate → circle region B
### Scenario Comparison
**Pattern:** Base case vs Alternative scenarios on same chart
- Line chart: Actual (solid) + Forecast scenarios (dashed: optimistic, base, pessimistic)
- Annotate assumptions for each scenario
**Fan chart:** Uncertainty grows over time
- Shaded bands widen into future (50% CI, 90% CI)
### Insight Layering
**Layer 1 (Surface):** "Revenue up 30%"
**Layer 2 (Decomposition):** "Driven by Enterprise (+120%), SMB declined (-10%)"
**Layer 3 (Root cause):** "Enterprise: new product launched Q2. SMB: pricing too high for segment"
**Layer 4 (Action):** "Double Enterprise sales hiring; test SMB annual plans to reduce churn"
---
## 12. Tools & Implementation
**Python:** Matplotlib (basic, full control), Seaborn (statistical, better defaults), Plotly (interactive), Altair (declarative, concise).
**BI Tools:** Tableau (drag-and-drop, dashboards), Power BI (Microsoft, Excel integration), Looker (SQL, data governance), Metabase (open-source).
**Presentation:** Excel/Sheets (quick), Slides/PowerPoint (static), Observable (interactive D3.js notebooks).
---
## 13. Quality Assurance Checklist
Before publishing any visualization:
**Accuracy**
- [ ] Data source is credible and recent
- [ ] Calculations are correct (spot-check numbers)
- [ ] No misleading scales (Y-axis starts at zero for bar charts)
- [ ] Outliers investigated (real or data error?)
**Clarity**
- [ ] Chart type matches question (trend→line, comparison→bar, etc.)
- [ ] Title is insight-first headline
- [ ] Axes labeled with units
- [ ] Legend clear (or direct labels used)
- [ ] Annotations explain key patterns
**Aesthetic**
- [ ] Colorblind-safe palette
- [ ] Sufficient contrast
- [ ] No chart junk (3D, gradients, heavy gridlines)
- [ ] Aligned elements (grid-based layout)
- [ ] White space used effectively
**Actionability**
- [ ] Narrative interprets pattern (not just describes)
- [ ] Context provided (vs benchmark/target/history)
- [ ] Actions recommended (specific, feasible, assigned)
**Accessibility**
- [ ] Alt text describes insight
- [ ] Keyboard navigable (if interactive)
- [ ] Readable in black & white (test print)
---
## 14. Further Reading
**Books:**
- "Storytelling with Data" by Cole Nussbaumer Knaflic (chart choice, decluttering, narrative)
- "The Visual Display of Quantitative Information" by Edward Tufte (principles, data-ink ratio)
- "Show Me the Numbers" by Stephen Few (dashboard design, perceptual principles)
- "The Truthful Art" by Alberto Cairo (accuracy, ethics, statistical graphics)
**Online Resources:**
- Flowing Data (blog on visualization techniques)
- Information is Beautiful (examples of creative visualizations)
- PolicyViz (public policy and data visualization)
- D3.js Gallery (interactive web visualization examples)
**Color Tools:**
- ColorBrewer (cartography color schemes, colorblind-safe)
- Color Oracle (colorblind simulator)
- Coolors (palette generator)