Files
gh-lyndonkl-claude/skills/cognitive-design/resources/data-visualization.md
2025-11-30 08:38:26 +08:00

15 KiB

Data Visualization

This resource provides cognitive design principles specifically for charts, dashboards, and data-driven presentations.

Covered topics:

  1. Chart selection via task-encoding alignment
  2. Dashboard hierarchy and grouping
  3. Progressive disclosure for interactive exploration
  4. Narrative data visualization

Why Data Visualization Needs Cognitive Design

WHY This Matters

Core insight: Data visualization is cognitive communication - the goal is insight transfer from data to human understanding. Success requires matching visual encodings to human perceptual strengths.

Common problems:

  • Wrong chart type chosen (pie when bar would be clearer)
  • Dashboards overwhelming users with too many metrics
  • Interactive tools causing anxiety (users fear getting lost)
  • Data stories lacking clear narrative guidance

How cognitive principles help:

  • Choose encodings that exploit perceptual accuracy (Cleveland & McGill hierarchy)
  • Design dashboards within working memory limits (chunking, hierarchy)
  • Structure exploration to reduce cognitive load (progressive disclosure, visible state)
  • Guide interpretation through annotations and narrative flow

Mental model: Data visualization is a translation problem - translating abstract data into visual form that human perception can process efficiently.


What You'll Learn

Four key areas:

  1. Chart Selection: Match visualization type to user task and perceptual capabilities
  2. Dashboard Design: Organize multiple visualizations for at-a-glance comprehension
  3. Interactive Exploration: Enable data investigation without overwhelming or losing users
  4. Narrative Visualization: Tell data stories with clear flow and guided interpretation

Why Chart Selection Matters

WHY This Matters

Core insight: Not all visual encodings serve all tasks equally well - position/length enable 5-10x more accurate comparison than angle/area.

Cleveland & McGill's Perceptual Hierarchy (most to least accurate):

  1. Position along common scale
  2. Position along non-aligned scales
  3. Length
  4. Angle/Slope
  5. Area
  6. Volume
  7. Color hue
  8. Color saturation

Implication: Chart type choice directly determines user accuracy in extracting insights.

Mental model: Different charts are tools for different jobs - bar chart is like precision calipers (accurate comparison), pie chart is like eyeballing (rough proportion sense).


WHAT to Apply

Task-to-Chart Mapping

User Task: Compare Values

Best choice: Bar chart (horizontal or vertical)

  • Why: Position/length encoding (top of hierarchy)
  • Enables: Precise magnitude comparison, easy ranking
  • Example: Comparing sales across 6 regions

Avoid: Pie chart

  • Why: Angle/area encoding (lower accuracy)
  • Problem: Difficult to judge which slice is larger when differences are subtle

Application rule:

For precise comparison → bar chart with common baseline
Sort descending for easy ranking
Add data labels for exact values (dual coding)

User Task: Show Trend Over Time

Best choice: Line chart

  • Why: Position along time axis shows progression naturally
  • Enables: Pattern detection (rising, falling, cyclical), slope perception
  • Example: Monthly revenue over 2 years

Avoid: Bar chart for trends (acceptable but line is clearer), stacked area (hard to compare non-bottom series)

Application rule:

Time always on x-axis (left to right)
Use line for continuous data, bar for discrete periods
Limit to 5-7 lines max (more becomes tangled mess)
Annotate significant events ("Product launch Q2")

User Task: Understand Distribution

Best choice: Histogram or box plot

  • Why: Shows shape (normal, skewed, bimodal), spread, outliers
  • Enables: Pattern recognition (is data normally distributed?), outlier identification
  • Example: Distribution of customer order values

Avoid: Pie charts (can't show distribution), bar charts of individual values (too granular)

Application rule:

Histogram: Choose appropriate bin width (too narrow = noise, too wide = hide patterns)
Box plot: Shows median, quartiles, outliers clearly
Include sample size in label for context

User Task: Show Part-to-Whole Relationship

Acceptable choice: Pie chart (BUT only if ≤5 slices and differences clear)

  • Why: Circular metaphor intuitive for "whole"
  • Limitations: Hard to judge angles, tiny slices unreadable

Better choice: Stacked bar chart or treemap

  • Why: Position/length still more accurate than angle
  • Enables: Easier comparison of parts

Application rule:

If using pie: ≤5 slices, sort descending, label percentages directly
Better: Stacked bar (compare lengths) or treemap (hierarchical parts)
Always ask: "Do users need precise comparison?" → If yes, avoid pie

User Task: Find Outliers or Clusters

Best choice: Scatterplot

  • Why: 2D position enables pattern detection, outliers pop out preattentively
  • Enables: Correlation visualization, cluster identification
  • Example: Relationship between marketing spend and revenue by product

Application rule:

Add trend line if correlation expected
Use color/shape for categorical third variable (max 5-7 categories)
Label outliers with annotations
Consider log scale if data spans orders of magnitude

User Task: Compare Multiple Variables

Best choice: Small multiples (repeated chart structure)

  • Why: Consistent structure enables pattern comparison across conditions
  • Enables: Seeing how patterns change by category
  • Example: Monthly sales trends for each product category (separate line chart per category)

Application rule:

Keep axes consistent across multiples for fair comparison
Arrange in logical order (by average, alphabetically, geographically)
Limit to 6-12 multiples (more requires pagination)
Highlight one for focus if needed

Why Dashboard Design Matters

WHY This Matters

Core insight: Dashboards combine many visualizations competing for limited attention - without clear hierarchy and grouping, users experience cognitive overload.

Problems dashboards solve:

  • At-a-glance status monitoring
  • Quick decision-making
  • Pattern detection across metrics

Cognitive challenges:

  • Information density overwhelming working memory
  • No clear entry point (where to look first?)
  • Related metrics not visually grouped
  • Excessive scrolling breaks mental model

Mental model: Dashboard is like airplane cockpit - instruments grouped by system, critical alerts preattentively salient, most important gauges in pilot's direct view.


WHAT to Apply

Visual Hierarchy for Dashboards

Principle: Establish clear focal point with size, position, and contrast

Application pattern:

Primary KPIs (3-4 max):

- Position: Top-left (F-pattern start)
- Size: Large numbers (36-48px)
- Style: Bold, high contrast
- Purpose: "What's the current status?" answered in 5 seconds

Secondary Metrics (5-10):

- Position: Below or right of primary KPIs
- Size: Medium (18-24px)
- Grouping: Clustered by relationship (all conversion metrics together)
- Purpose: Supporting detail for primary KPIs

Tertiary/Details:

- Position: Lower sections or drill-down
- Size: Smaller (12-16px)
- Access: Details on demand (click to expand)
- Purpose: Deep investigation, not monitoring

Example: E-commerce dashboard - Primary KPIs top-left (revenue, conversion, users), secondary metrics mid (channels, products), tertiary bottom (details, history).


Gestalt Grouping for Clarity

Principle: Use proximity, similarity, and whitespace to show relationships

Proximity: Related metrics close, unrelated separated by whitespace Similarity: Consistent styling (same charts styled same way, color encoding consistent) Example: Traffic panel (sessions, pageviews, bounce), Conversion panel (rate, revenue, AOV) separated by whitespace, red for errors throughout


Chunking for Working Memory

Principle: Limit concurrent visualizations to 4±1 major groups

Application rule: Group >15 metrics into 3-5 categories, each with 3-5 metrics max = 9-25 metrics organized into 4±1 chunks (fits working memory).


Preattentive Salience for Alerts

Principle: Use red color ONLY for threshold violations requiring immediate action

Application rule: Normal = gray, alert = red (critical only). Avoid red for all negatives (causes alert fatigue). Example: Revenue down 2% = gray, down 15% = red threshold violation.


Data-Ink Ratio Maximization

Principle: Remove decorative elements; maximize ink showing actual data

Remove: Heavy gridlines, 3D effects, gradients, excessive decimals, decorative icons, chart borders Keep: Data (points/lines/bars), minimal axes, direct labels, meaningful annotations


Why Progressive Disclosure Matters

WHY This Matters

Core insight: Users fear getting lost in complex data exploration - progressive disclosure (overview first, zoom/filter, then details) provides safety net and reduces cognitive load.

Shneiderman's Mantra: "Overview first, zoom and filter, then details on demand"

Benefits:

  • Reduces initial cognitive load (don't show everything at once)
  • Provides context before detail (prevents disorientation)
  • Enables personalized investigation (users drill into what they care about)
  • Lowers anxiety (visible state shows "where am I", undo provides safety)

Mental model: Like Google Maps - start with full map view (overview), zoom to neighborhood (filter), click building for details (details on demand).


WHAT to Apply

Overview Level

Purpose: Provide context and entry point

Design principles:

Show: Overall pattern, main trends, big picture
Hide: Granular details, outliers (show in drill-down)
Enable: Quick understanding of "what's the story here?"

Example:

Sales Dashboard Overview:
- Total sales number (aggregated)
- Trend line (last 90 days)
- Top 3 performing regions (summary)
User understands: "Sales are up 12% this quarter, West region leading"

Filter/Zoom Level

Purpose: Focus on subset of interest

Design principles:

Show: Filtered results clearly
Externalize: Active filters as visible chips/tags (recognition, not recall)
Enable: Easy modification (click chip to remove filter)
Provide: Clear "Reset all filters" option

Example:

User clicks "West region" from overview:
- Dashboard updates to show only West region data
- Visible chip at top: "Region: West" with X to remove
- Breadcrumb: "All Regions > West"
- "Reset" button returns to overview
User sees: "I'm viewing West region data, can easily get back"

Details on Demand Level

Purpose: Reveal specifics only when needed

Design principles:

Trigger: Hover, click, or expand action (not visible by default)
Show: Granular data, additional metrics, raw numbers
Position: Tooltip, modal, or expandable section (doesn't disrupt main view)

Example:

User hovers over data point on trend line:
- Tooltip appears: "June 15, 2024: $45,234 sales, 234 orders, $193 AOV"
- User moves mouse away: Tooltip disappears
- Main chart unchanged (not cluttered with all this detail by default)

State Visibility

Principle: Always show current navigation state - don't make users remember

Critical state to externalize:

✓ Active filters (as removable chips)
✓ Current zoom level or time range
✓ Drilled-down path (breadcrumbs)
✓ Sorting/grouping applied
✓ Comparison baseline (if comparing to previous period)

Example implementation:

Dashboard header always shows:
- Time range: "Last 30 days" (clickable to change)
- Filters: [Region: West] [Product: Widget] (each with X to remove)
- Comparison: "vs. previous 30 days" (toggle on/off)
- Breadcrumb: "Overview > West > Widget Sales"

User never wonders: "What am I looking at? How did I get here?"

Why Narrative Visualization Matters

WHY This Matters

Core insight: People naturally seek stories with cause-effect and chronology - structuring data as narrative chunks information meaningfully and aids retention.

Benefits of narrative structure:

  • Easier to process (story arc vs heap of facts)
  • Better retention (narrative = memorable)
  • Guided interpretation (annotations prevent misunderstanding)
  • Emotional engagement (storytelling activates empathy)

Mental model: Data journalism isn't a data dump - it's a guided tour where the designer leads readers to insights.


WHAT to Apply

Narrative Structure

Classic arc: Context → Problem/Question → Data/Evidence → Resolution/Insight

Application to visualization:

Context (What's the situation?):

- Title that sets stage: "U.S. Solar Energy Adoption 2010-2024"
- Subtitle: "How policy changes drove growth"
- Brief intro text or annotation providing background

Problem/Question (What are we investigating?):

- Visual question posed: "Did tax incentives accelerate adoption?"
- Annotation highlighting the question point on chart

Data/Evidence (What does data show?):

- Chart showing trend with clear inflection point
- Annotation: "Tax credit introduced 2015" with arrow to spike
- Supporting data: % increase after policy vs before

Resolution/Insight (What did we learn?):

- Conclusion annotation: "Adoption rate tripled post-incentive"
- Implications: "States with stronger incentives saw faster growth"
- Call-to-action or next question (optional)

Annotation Strategy

Principle: Guide attention to key insights; prevent misinterpretation

Types of annotations:

Callout boxes:

Purpose: Highlight main insight
Position: Near relevant data point
Style: Contrasting color or background
Content: 1-2 sentences max ("Sales spiked 45% after campaign launch")

Arrows/Lines:

Purpose: Connect explanation to data
Use: Point from text to specific data element
Example: Arrow from "Product launch" text to spike in line chart

Shaded regions:

Purpose: Mark time periods or ranges
Use: Highlight recession periods, policy changes, events
Example: Gray shaded region "COVID-19 lockdown Mar-May 2020"

Direct labels:

Purpose: Replace legend lookups
Use: Label lines/bars directly instead of requiring legend
Example: Line chart with "North region" text next to the line (not legend box)

Application rule:

Annotate: Main insight (what users should notice)
Annotate: Unusual events (explain outliers/anomalies)
Don't annotate: Obvious patterns (if trend is clear, don't state "going up")
Test: Can user grasp message without annotations? If not, add guidance

Scrollytelling

Definition: Visualization that updates or highlights aspects as user scrolls

Benefits:

  • Maintains context (chart stays visible while story progresses)
  • Reveals complexity gradually (progressive disclosure in narrative form)
  • Engaging (interactive storytelling)

Application pattern:

Example progression: Start with full context (0%) → Highlight specific periods as user scrolls (33%, 66%) → End with full picture + projection (100%). Chart stays visible, smooth transitions, user can scroll back, provide skip option.