Information Architecture: Advanced Methodology

This document covers advanced techniques for card sorting analysis, taxonomy design, navigation optimization, and findability improvement.

Table of Contents

  1. Card Sorting Analysis
  2. Taxonomy Design Principles
  3. Navigation Depth & Breadth Optimization
  4. Information Scent & Findability
  5. Advanced Topics

1. Card Sorting Analysis

Analyzing Card Sort Results

Goal: Extract meaningful patterns from user groupings

Similarity Matrix

What it is: Shows how often users grouped two cards together

How to calculate:

  • For each pair of cards, count how many users put them in the same group
  • Express as percentage: (# users who grouped together) / (total users)

Example:

              Sign Up   First Login   Quick Start   Reports   Dashboards
Sign Up          -          85%           90%         15%        10%
First Login     85%          -            88%         12%         8%
Quick Start     90%         88%            -          10%        12%
Reports         15%         12%           10%          -         75%
Dashboards      10%          8%           12%         75%         -

Interpretation:

  • Strong clustering (>70%): "Sign Up", "First Login", "Quick Start" belong together → "Getting Started" category
  • Strong clustering (75%): "Reports" and "Dashboards" belong together → "Analytics" category
  • Weak links (<20%): "Getting Started" and "Analytics" are distinct categories
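
The percentages above can be computed directly from raw sort data. A minimal Python sketch follows, assuming each participant's sort is stored as a dictionary mapping card name to the group label they chose (the sample data and function name are illustrative):

from itertools import combinations

# One dictionary per participant: card name -> group label (illustrative data)
sorts = [
    {"Sign Up": "Onboarding", "First Login": "Onboarding", "Reports": "Analytics"},
    {"Sign Up": "Start here", "First Login": "Start here", "Reports": "Data"},
]

def similarity_matrix(sorts):
    """Return {(card_a, card_b): share of participants who put both cards in the same group}."""
    cards = sorted({card for s in sorts for card in s})
    matrix = {}
    for a, b in combinations(cards, 2):
        # (# users who grouped the pair together) / (total users)
        together = sum(1 for s in sorts if s.get(a) is not None and s.get(a) == s.get(b))
        matrix[(a, b)] = together / len(sorts)
    return matrix

for pair, score in sorted(similarity_matrix(sorts).items(), key=lambda kv: -kv[1]):
    print(f"{pair[0]} + {pair[1]}: {score:.0%}")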

Dendrogram (Hierarchical Clustering)

What it is: Tree diagram showing hierarchical relationships

How to create:

  1. Start with each card as its own cluster
  2. Iteratively merge closest clusters (highest similarity)
  3. Continue until all cards in one cluster

Interpreting dendrograms:

  • Short branches: High agreement (merge early)
  • Long branches: Low agreement (merge late)
  • Clusters: Cut tree at appropriate height to identify categories

Example:

                        All Cards
                            |
        ____________________+_____________________
        |                                         |
    Getting Started                          Features
        |                                         |
    ____+____                              _____+_____
    |        |                            |           |
  Sign Up  First Login              Analytics    Settings
                                        |
                                    ____+____
                                    |        |
                                 Reports  Dashboards

Insight: Users see clear distinction between "Getting Started" (onboarding tasks) and "Features" (ongoing use).
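
Given the similarity matrix as a NumPy array, SciPy's hierarchical clustering can generate the dendrogram rather than drawing it by hand. A minimal sketch, assuming similarities are expressed as fractions between 0 and 1 (the card labels match the example above):

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform
import matplotlib.pyplot as plt

cards = ["Sign Up", "First Login", "Quick Start", "Reports", "Dashboards"]
similarity = np.array([
    [1.00, 0.85, 0.90, 0.15, 0.10],
    [0.85, 1.00, 0.88, 0.12, 0.08],
    [0.90, 0.88, 1.00, 0.10, 0.12],
    [0.15, 0.12, 0.10, 1.00, 0.75],
    [0.10, 0.08, 0.12, 0.75, 1.00],
])

# Convert similarity to distance, condense the square matrix, then cluster.
distance = 1.0 - similarity
condensed = squareform(distance, checks=False)
tree = linkage(condensed, method="average")

dendrogram(tree, labels=cards)
plt.ylabel("Distance (1 - similarity)")
plt.show()

Average linkage is one reasonable choice here; for clusters as distinct as these, complete or Ward linkage would produce a similar tree.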

Agreement Score (Consensus)

What it is: How much users agree on groupings

Calculation methods:

  1. Category agreement: % of users who created a similar category

    • Example: 18/20 users (90%) created a "Getting Started" category
  2. Pairwise agreement: Average similarity across all card pairs

    • Formula: Sum(all pairwise similarities) / Number of pairs
    • High score (>70%) = strong consensus
    • Low score (<50%) = weak consensus; the cards or categories need refinement
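
Using the pairwise similarity matrix from Section 1, the pairwise agreement score is simply the mean of all pairwise values. A minimal sketch (the function name is illustrative):

def pairwise_agreement(matrix):
    """Average similarity across all card pairs; matrix is {(card_a, card_b): similarity}."""
    return sum(matrix.values()) / len(matrix) if matrix else 0.0

# e.g. 0.72 indicates strong consensus; anything below 0.50 suggests refinement is needed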

When consensus is low:

  • Cards may be ambiguous (clarify labels)
  • Users have different mental models (consider multiple navigation paths)
  • Category is too broad (split into subcategories)

Outlier Cards

What they are: Cards that don't fit anywhere consistently

How to identify: Low similarity with all other cards (<30% with any card)

Common reasons:

  • Card label is unclear → Rewrite card
  • Content doesn't belong in product → Remove
  • Content is unique → Create standalone category or utility link

Example: "Billing" card — 15 users put it in "Settings", 3 in "Account", 2 didn't categorize it

  • Action: Clarify if "Billing" is settings (configuration) or account (transactions)
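
Outlier detection can also be scripted against the pairwise similarity matrix from Section 1: flag any card whose best similarity with any other card stays under the threshold. A minimal sketch, using the 30% guideline above:

def outlier_cards(matrix, threshold=0.30):
    """Return cards whose maximum similarity with any other card is below the threshold."""
    best = {}
    for (a, b), score in matrix.items():
        best[a] = max(best.get(a, 0.0), score)
        best[b] = max(best.get(b, 0.0), score)
    return [card for card, top_score in best.items() if top_score < threshold]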

2. Taxonomy Design Principles

Mutually Exclusive, Collectively Exhaustive (MECE)

Principle: Categories don't overlap AND cover all content

Mutually exclusive: Each item belongs to exactly ONE category

  • Bad: "Products" and "Best Sellers" (best sellers are also products — overlap)
  • Good: "Products" (all) and "Featured" (separate facet or tag)

Collectively exhaustive: Every item has a category

  • Bad: Categories: "Electronics", "Clothing" — but you also sell "Books" (gap)
  • Good: Add "Books" OR create "Other" catch-all

Testing MECE:

  1. List all content items
  2. Try to categorize each
  3. If item fits >1 category → not mutually exclusive
  4. If item fits 0 categories → not collectively exhaustive
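
When content assignments live in a spreadsheet or CMS export, this test is easy to automate. A minimal sketch, assuming each item maps to the set of categories it was placed in (the data and function name are illustrative):

def check_mece(item_categories):
    """item_categories: {item: set of categories}. Print and return MECE violations."""
    overlapping = {i: cats for i, cats in item_categories.items() if len(cats) > 1}
    uncategorized = [i for i, cats in item_categories.items() if not cats]
    if overlapping:
        print("Not mutually exclusive:", overlapping)
    if uncategorized:
        print("Not collectively exhaustive:", uncategorized)
    return not overlapping and not uncategorized

check_mece({
    "iPhone case": {"Electronics"},
    "Gift card": set(),
    "Best-selling widget": {"Products", "Best Sellers"},
})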

Polyhierarchy vs. Faceted Classification

Polyhierarchy: Item can live in multiple places in hierarchy

  • Example: "iPhone case" could be in:
    • Electronics > Accessories > Phone Accessories
    • Gifts > Under $50 > Tech Gifts
  • Pro: Matches multiple user mental models
  • Con: Confusing (where is "canonical" location?), hard to maintain

Faceted classification: Item has ONE location, multiple orthogonal attributes

  • Example: "iPhone case" is in Electronics (primary category)
    • Facet 1: Category = Electronics
    • Facet 2: Price = Under $50
    • Facet 3: Use Case = Gifts
  • Pro: Clear, flexible filtering, scalable
  • Con: Requires good facet design
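
In practice, faceted classification usually means one primary category field plus independent attribute fields on each item. A minimal sketch of that data shape (field and facet names are illustrative):

from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    category: str                                 # single canonical location
    facets: dict = field(default_factory=dict)    # orthogonal attributes

iphone_case = Item(
    name="iPhone case",
    category="Electronics",
    facets={"price_band": "Under $50", "use_case": "Gifts"},
)

# Filtering by a facet never moves the item; cross-cutting views are built on demand.
catalog = [iphone_case]
gift_ideas = [i for i in catalog if i.facets.get("use_case") == "Gifts"]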

When to use each:

  • Polyhierarchy: Small content sets (<500 items), clear user need for multiple paths
  • Faceted: Large content sets (>500 items), many attributes, users need flexible filtering

Controlled Vocabulary vs. Folksonomy

Controlled vocabulary: Preset tags, curated by admins

  • Example: "Authentication", "API", "Database" (exact tags, no variations)
  • Pro: Consistency, findability, no duplication ("Auth" vs "Authentication")
  • Con: Requires maintenance, may miss user terminology

Folksonomy: User-generated tags, anyone can create

  • Example: Users tag articles with whatever terms they want
  • Pro: Emergent, captures user language, low maintenance
  • Con: Inconsistent, duplicates, noise ("Auth", "Authentication", "auth", "Authn")

Hybrid approach (recommended):

  • Controlled vocabulary for core categories and facets
  • Folksonomy for supplementary tags (with moderation)
  • Periodically review folksonomy tags → promote common ones to controlled vocabulary

Tag moderation:

  • Merge synonyms: "Auth" → "Authentication"
  • Remove noise: "asdf", "test"
  • Suggest tags: When user types "auth", suggest "Authentication"
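
Synonym merging and tag suggestion can start as a simple lookup table maintained by the taxonomy owner. A minimal sketch (the synonym map and function names are illustrative):

CANONICAL = {"auth": "Authentication", "authn": "Authentication", "db": "Database"}

def normalize_tag(raw_tag):
    """Map a user-entered tag to its controlled-vocabulary form if one exists."""
    cleaned = raw_tag.strip().lower()
    return CANONICAL.get(cleaned, raw_tag.strip())

def suggest_tags(prefix, vocabulary):
    """Suggest controlled-vocabulary tags matching what the user has typed so far."""
    prefix = prefix.strip().lower()
    return [t for t in vocabulary if t.lower().startswith(prefix)]

print(normalize_tag("Auth"))                                        # Authentication
print(suggest_tags("auth", ["Authentication", "API", "Database"]))  # ['Authentication']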

Category Size & Balance

Guideline: Aim for balanced category sizes (no one category dominates)

Red flags:

  • One huge category: "Other" with 60% of items → need better taxonomy
  • Many tiny categories: 20 categories, each with 2-5 items → over-categorization, consolidate
  • Unbalanced tree: One branch 5 levels deep, others 2 levels → inconsistent complexity

Target distribution:

  • Top-level categories: 5-9 categories
  • Each category: Roughly equal # of items (within 2× of each other)
  • If one category much larger: Split into subcategories

Example: E-commerce with 1000 products

  • Bad: Electronics (600), Clothing (300), Books (80), Other (20)
  • Good: Electronics (250), Clothing (250), Books (250), Home & Garden (250)
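
These checks can run automatically against a content inventory. A small sketch, assuming a mapping of category name to item count, using the 2× spread rule from this section and the 5% "Other" ceiling from the governance guidance in Section 5:

def balance_report(category_counts, other_label="Other"):
    """Flag imbalance: a dominant category, a wide size spread, or an oversized catch-all."""
    total = sum(category_counts.values())
    core = {k: v for k, v in category_counts.items() if k != other_label}
    largest, smallest = max(core.values()), min(core.values())
    if largest > 2 * smallest:
        print(f"Largest category ({largest}) is more than 2x the smallest ({smallest})")
    other_share = category_counts.get(other_label, 0) / total
    if other_share > 0.05:
        print(f"'{other_label}' holds {other_share:.0%} of items - taxonomy gap")

balance_report({"Electronics": 600, "Clothing": 300, "Books": 80, "Other": 20})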

Taxonomy Evolution

Principle: Taxonomies grow and change — design for evolution

Strategies:

  1. Leave room for growth: Don't create 10 top-level categories if you'll need 15 next year
  2. Use "Other" temporarily: New category emerging but not big enough yet? Use "Other" until critical mass
  3. Versioning: Date taxonomy versions, track changes over time
  4. Deprecation: Don't delete categories immediately — mark "deprecated", redirect users, then remove after transition period

Example: Software product adding ML features

  • Today: 20 ML-related articles scattered across "Advanced", "API", "Tutorials"
  • Transition: Create "Machine Learning" subcategory under "Advanced"
  • Future: 100 ML articles → Promote "Machine Learning" to top-level category

3. Navigation Depth & Breadth Optimization

Hick's Law & Choice Overload

Hick's Law: Decision time increases logarithmically with number of choices

Formula: Time = a + b × log₂(n + 1)

  • More choices → longer time to decide
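
A quick way to build intuition for the formula is to plug in a few menu sizes. A minimal sketch; the constants a and b are placeholders that would normally be fit from user testing:

import math

def hicks_law(n_choices, a=0.5, b=0.2):
    """Estimated decision time in seconds for one menu level (a, b are illustrative constants)."""
    return a + b * math.log2(n_choices + 1)

for n in [3, 7, 12, 50]:
    print(n, "choices ->", round(hicks_law(n), 2), "s")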

Implications for IA:

  • 5-9 items per level: Sweet spot (Miller's "7±2")
  • >12 items: Users feel overwhelmed, scan inefficiently
  • <3 items: Feels unnecessarily nested

Example:

  • 100 items, flat (1 level, 100 choices): Overwhelming
  • 100 items, 2 levels (10 × 10): Manageable
  • 100 items, 4 levels (3 × 3 × 3 × 4): Too many clicks

Optimal for 100 items: 3 levels (5 × 5 × 4) or (7 × 7 × 2)

The "3-Click Rule" Myth

Myth: Users abandon if content requires >3 clicks

Reality: Users tolerate clicks if:

  1. Progress is clear: Breadcrumbs, page titles show "getting closer"
  2. Information scent is strong: Each click brings them closer to goal (see Section 4)
  3. No dead ends: Every click leads somewhere useful

Research (UIE study): Users successfully completed tasks requiring 5-12 clicks when navigation was clear

Guideline: Minimize clicks, but prioritize clarity over the absolute number of clicks

  • Good: 5 clear, purposeful clicks
  • Bad: 2 clicks but confusing labels, users backtrack

Breadth-First vs. Depth-First Navigation

Breadth-first (shallow, many top-level options):

  • Structure: 10-15 top-level categories, 2-3 levels deep
  • Best for: Browsing, exploration, users know general area but not exact item
  • Example: News sites, e-commerce homepages

Depth-first (narrow, few top-level but deep):

  • Structure: 3-5 top-level categories, 4-6 levels deep
  • Best for: Specific lookup, expert users, hierarchical domains
  • Example: Technical documentation, academic libraries

Hybrid (recommended for most):

  • Structure: 5-7 top-level categories, 3-4 levels deep
  • Supplement with: Search, filters, related links to "shortcut" across hierarchy

Progressive Disclosure

Principle: Start simple, reveal complexity on-demand

Techniques:

  1. Hub-and-spoke: Overview page → Detailed pages

    • Hub: "Getting Started" with 5 clear entry points
    • Spokes: Detailed guides linked from hub
  2. Accordion/Collapse: Hide detail until user expands

    • Navigation: Show categories, hide subcategories until expanded
    • Content: Show summary, expand for full text
  3. Tiered navigation: Primary nav (always visible) + secondary nav (contextual)

    • Primary: "Products", "Support", "About"
    • Secondary (when in "Products"): "Electronics", "Clothing", "Books"
  4. "More..." links: Show top N items, hide rest until "Show more" clicked

    • Navigation: Top 5 categories visible, "+3 more" link expands

Anti-pattern: Mega-menus showing everything at once (overwhelming)


4. Information Scent & Findability

Information Scent

Definition: Cues that indicate whether a path will lead to desired information

Strong scent: Clear labels, descriptive headings, users click confidently
Weak scent: Vague labels, users guess, backtrack often

Example:

  • Weak scent: "Solutions" → What's in there? (generic)
  • Strong scent: "Developer API Documentation" → Clear what's inside

Optimizing information scent:

  1. Specific labels (not generic):

    • Bad: "Resources" → Too vague
    • Good: "Code Samples", "Video Tutorials", "White Papers" → Specific
  2. Trigger words (match user vocabulary):

    • Card sort reveals users say "How do I..." → Label category "How-To Guides"
    • Users search "pricing" → Ensure "Pricing" in nav, not "Plans" or "Subscription"
  3. Descriptive breadcrumbs:

    • Bad: "Home > Section 1 > Page 3" → No meaning
    • Good: "Home > Developer Docs > API Reference" → Clear path
  4. Preview text: Show snippet of content under link

    • Navigation item: "API Reference" + "Complete list of endpoints and parameters"

Findability Metrics

Key metrics to track:

  1. Time to find: How long to locate content?

    • Target: <30 sec for simple tasks, <2 min for complex
    • Measurement: Task completion time in usability tests
  2. Success rate: What % of users find the content?

    • Target: ≥70% (tree test), ≥80% (live site with search)
    • Measurement: Tree test results, task success in usability tests
  3. Search vs. browse: Do users search or navigate?

    • Good: 40-60% browse, 40-60% search (both work)
    • Bad: 90% search (navigation broken), 90% browse (search broken)
    • Measurement: Analytics (search usage %, nav click-through)
  4. Search refinement rate: What % of searches are refined?

    • Target: <30% (users find on first search)
    • Bad: >50% (users search, refine, search again → poor results)
    • Measurement: Analytics (queries per session)
  5. Bounce rate by entry point: What % of visitors leave immediately?

    • Target: <40% for landing pages
    • Bad: >60% (users don't find what they expected)
    • Measurement: Analytics (bounce rate by page)
  6. Navigation abandonment: What % of users start navigating, then leave?

    • Target: <20%
    • Bad: >40% (users get lost, give up)
    • Measurement: Analytics (drop-off in navigation funnels)
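
Several of these metrics fall out of ordinary session logs. A minimal sketch computing the search refinement rate and the search vs. browse split, assuming each session records its search queries and navigation clicks (the session structure is illustrative):

# Illustrative session records: queries issued and navigation clicks per session
sessions = [
    {"queries": ["pricing"], "nav_clicks": 0},
    {"queries": ["api auth", "api authentication"], "nav_clicks": 2},
    {"queries": [], "nav_clicks": 5},
]

search_sessions = [s for s in sessions if s["queries"]]
refined = [s for s in search_sessions if len(s["queries"]) > 1]

refinement_rate = len(refined) / len(search_sessions) if search_sessions else 0.0
search_share = len(search_sessions) / len(sessions)

print(f"Search refinement rate: {refinement_rate:.0%}  (target < 30%)")
print(f"Sessions using search:  {search_share:.0%}  (healthy: roughly 40-60%)")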

Search vs. Navigation Trade-offs

When search is preferred:

  • Large content sets (>5000 items)
  • Users know exactly what they want ("lookup" mode)
  • Diverse content types (hard to categorize consistently)

When navigation is preferred:

  • Smaller content sets (<500 items)
  • Users browsing, exploring ("discovery" mode)
  • Hierarchical domains (clear parent-child relationships)

Best practice: Offer BOTH

  • Navigation for discovery, context, exploration
  • Search for lookup, speed, known-item finding

Optimizing search:

  • Autocomplete: Suggest as user types
  • Filters: Narrow results by category, date, type
  • Best bets: Featured results for common queries
  • Zero-results page: Suggest alternatives, show popular content

Optimizing navigation:

  • Clear labels: Match user vocabulary (card sort insights)
  • Faceted filters: Browse + filter combination
  • Related links: Help users discover adjacent content
  • Breadcrumbs: Show path, enable backtracking

5. Advanced Topics

Mental Models & User Research

Mental model: User's internal representation of how system works

Why it matters: Navigation should match user's mental model, not company's org chart

Researching mental models:

  1. Card sorting: Reveals how users group/label content
  2. User interviews: Ask "How would you organize this?" "What would you call this?"
  3. Tree testing: Validates if proposed structure matches mental model
  4. First-click testing: Where do users expect to find X?

Common mismatches:

  • Company thinks: "Features" (technical view)
  • Users think: "What can I do?" (task view)
  • Solution: Rename to task-based labels ("Create Report", "Share Dashboard")

Example: SaaS product

  • Internal (wrong): "Modules" → "Synergistic Solutions" → "Widget Management"
  • User mental model (right): "Features" → "Reporting" → "Custom Reports"

Cross-Cultural IA

Challenge: Different cultures have different categorization preferences

Examples:

  • Alphabetical: Works for Latin scripts, not ideographic (Chinese, Japanese)
  • Color coding: Red = danger (Western), Red = luck (Chinese)
  • Icons: Mailbox icon = email (US), doesn't translate (many countries have different mailbox designs)

Strategies:

  1. Localization testing: Card sort with target culture users
  2. Avoid culturally-specific metaphors: "Home run", "touchdown" (US sports)
  3. Simple, universal labels: "Home", "Search", "Help" (widely understood)
  4. Icons + text: Don't rely on icons alone

IA Governance

Problem: Taxonomy degrades over time without maintenance

Governance framework:

  1. Roles:

    • Content owner: Publishes content, assigns categories/tags
    • Taxonomy owner: Maintains category structure, adds/removes categories
    • IA steward: Monitors usage, recommends improvements
  2. Processes:

    • Quarterly review: Check taxonomy usage, identify issues
    • Change request: How to propose new categories or restructure
    • Deprecation: Process for removing outdated categories
    • Tag moderation: Review user-generated tags, merge synonyms
  3. Metrics to monitor:

    • % content in "Other" or "Uncategorized" (should be <5%)
    • Empty categories (no content) — remove or consolidate
    • Oversized categories (>50% of content) — split into subcategories
  4. Tools:

    • CMS with taxonomy management
    • Analytics to track usage
    • Automated alerts (e.g., "Category X has no content")

Personalization & Dynamic IA

Concept: Navigation adapts to user

Approaches:

  1. Audience-based: Show different nav for different user types

    • "For Developers", "For Marketers", "For Executives"
  2. History-based: Prioritize recently visited or frequently used

    • "Recently Viewed", "Your Favorites"
  3. Context-based: Show nav relevant to current task

    • "Related Articles", "Next Steps"
  4. Adaptive search: Results ranked by user's past behavior

Caution: Don't over-personalize

  • Users need consistency to build mental model
  • Personalization should augment, not replace, standard navigation

IA for Voice & AI Interfaces

Challenge: Traditional visual hierarchy doesn't work for voice

Strategies:

  1. Flat structure: No deep nesting (a voice interface can't display a menu to scan)
  2. Natural language categories: "Where can I find information about X?" vs. "Navigate to Category > Subcategory"
  3. Conversational: "What would you like to do?" vs. "Select option 1, 2, or 3"
  4. Context-aware: Remember user's previous question, continue conversation

Example:

  • Web: Home > Products > Electronics > Phones
  • Voice: "Show me phones" → "Here are our top phone options..."

Summary

Card sorting reveals user mental models through similarity matrices, dendrograms, and consensus scores. Outliers indicate unclear content.

Taxonomy design follows MECE principle (mutually exclusive, collectively exhaustive). Use faceted classification for scale, controlled vocabulary for consistency, and plan for evolution.

Navigation optimization balances breadth (many choices) vs. depth (many clicks). Optimal: 5-9 items per level, 3-4 levels deep. Progressive disclosure reduces initial complexity.

Information scent guides users with clear labels, trigger words, and descriptive breadcrumbs. Track findability metrics: time to find (<30 sec), success rate (≥70%), search vs. browse balance (40-60% each).

Advanced techniques include mental model research (card sort, interviews), cross-cultural adaptation, governance frameworks, personalization, and voice interface design.

The goal: Users can predict where information lives and find it quickly, regardless of access method.